DEPARTMENT OF EDUCATION, COMMUNICATION AND LEARNING

Fine-tuning for Lesson Planning
Comparing Teachers’ Perceptions and Use of a fine-tuned AI Assistant for Lesson Planning

Sophia Helena Koenig

Thesis: 30 credits
Program and/or course: International Master’s Programme in IT and Learning
Level: Second Cycle
Term/year: Spring term 2024
Supervisor: Thomas Hillman
Examiner: Sylvana Sofkova Hashemi

Abstract

Keywords: Generative AI, fine-tuned AI, lesson planning, teachers, intervention

Purpose: Although AI has been in the public eye since the last century, the increasing popularity of generative AI tools has brought about renewed interest in exploring how these technologies can enhance educational practices, including lesson planning. Therefore, this thesis aims to investigate teachers’ perceptions and use of a fine-tuned AI assistant for lesson planning. To uncover potential advantages and shortcomings of fine-tuned AI assistants for teachers, the thesis reports on a study that contrasts the use of a general-purpose AI model with a fine-tuned version for lesson planning.

Theory: This study employs the Technology Acceptance Model (TAM) and the Technological Pedagogical Content Knowledge (TPACK) framework. TAM is used as a framework to analyse how teachers perceive a fine-tuned AI assistant for lesson planning in terms of ease of use and perceived usefulness, and how these factors influence the future integration of the tool. TPACK allows for an extensive analysis of the teachers’ evaluation of the lesson plans and materials generated by the fine-tuned AI assistant.
Method: To answer the research questions, the study adopted an interpretative stance while following a crossover intervention study design. The participants were tasked with creating a lesson plan and worksheet with the general-purpose AI model ChatGPT-4 and with the fine-tuned AI assistant for lesson planning. Subsequently, nine teachers were interviewed regarding their perceptions and use of a fine-tuned AI assistant for lesson planning. The interview transcripts were analysed with a qualitative content analysis as suggested by Kuckartz (2018).

Results: Overall, the study strengthens the idea that the implementation of fine-tuned AI assistants can relieve novice teachers of their workload and scaffold the lesson preparation process for more experienced teachers by providing inspiration and differentiating content. These findings also complement those of other studies that indicate that fine-tuned LLMs outperform general-purpose LLMs in particular tasks. As the demands of lesson planning can be broken down into certain tasks, such as creating detailed lesson plans and worksheets, these tasks can inform the LLM’s system prompt. Nevertheless, in order to be generally applicable to all teachers of all subjects, the level of fine-tuning will still leave leeway for hallucinated answers. Thus, critical evaluation and working iteratively with the AI-assisted tool remain of great importance.

Foreword

First and foremost, I would like to express my heartfelt gratitude to my supervisor, Thomas Hillman, for his guidance and encouragement throughout the entire process of writing this thesis. Your support and invaluable insights have been instrumental in the completion of this thesis, and I am grateful to have had such a dedicated supervisor who patiently answered and discussed each of my (many, many) questions. Your supervision made this process so much more enjoyable.
I would like to give my warmest and most sincere thanks for the opportunity to write this thesis in collaboration with fobizz. I am deeply grateful to the fobizz team, especially Dr. Diana Knodel, Frederik Dietz and Marie-Lene Armingeon, for their immense expertise, endless support and patience with this project. It has been such a pleasure to do this project with you and has allowed me to learn more than I can put into words.

A very special thank you goes to Elizabeth Olsson, whose feedback and discussions helped me grow as a writer throughout this master’s programme, and whose guidance supported me tremendously in navigating this thesis. Your enthusiasm for this topic was contagious.

I would also like to thank the participants in this study. Thank you for giving me your valuable time and sharing your experiences with me. Your reflections and detailed accounts added so much depth to this thesis.

My acknowledgements would not be complete without thanking my wonderful friends, especially Jana, Natalie, Pia and Teresa. Thank you for all the study sessions in the library and during our writing retreat. Your constant emotional support, proof-reading, and invaluable advice meant the world to me. I look forward to crossing the finish line together.

Last but not least, I would like to thank my family and partner for their encouragement throughout the writing of this thesis. My heartfelt thanks go to my mother for being the best and most detailed proof-reader one could wish for, and to Nico for providing me with the tastiest meals and always being an unfailing source of advice and reassurance.

Table of Contents

Abstract
Foreword
Table of Contents
1. Introduction
    1.1 Research Aim
    1.2 Research Questions
2. Literature Review
    2.1 Lesson Planning
        2.1.1 Lesson Planning and general-purpose AI models
    2.2 Fine-tuned AI models
    2.3 Impact of AI on the Teacher Profession
        2.3.1 Prompt Literacy
3. Theoretical Framework
    3.1 Technology Acceptance Model (TAM)
    3.2 Technological Pedagogical Content Knowledge (TPACK)
4. Method
    4.1 Intervention Study
        4.1.1 Materials and Instruments
        4.1.2 Participants
        4.1.3 Data collection
        4.1.4 Analysis
        4.1.5 Ethics
5. Findings
    5.1 Perceptions of AI-assisted tools for lesson planning
        5.1.1 Perceived Ease of Use
        5.1.2 Perceived Usefulness
    5.2 Evaluation of lesson plans and materials
        5.2.1 Content Knowledge
        5.2.2 Pedagogical Knowledge
        5.2.3 Pedagogical Content Knowledge
6. Discussion
    6.1 Perceived Ease of Use and Usefulness
    6.2 Evaluation of Accuracy and Approaches
    6.3 Fine-tuned or General-purpose
    6.4 Implications for Future Integration
    6.5 Limitations and Further Research
7. Conclusion
References
Appendix 1 – Informed Consent Form
Appendix 2 – Interview Guide
Appendix 3 – Original Quotes in German
Appendix 4 – System Prompt AI Assistant

1. Introduction

Lesson planning is an essential part of teacher education (Kang, 2016) as it is critical for the effectiveness of teachers’ instruction and students’ learning (Li et al., 2009).
Despite its importance, many teachers struggle with the demands of lesson planning, such as creating tasks (Ainley, 2012), setting learning objectives (Liyanage & Bartlett, 2010) or starting the lesson planning process (Schmidt, 2005). Recently, artificial intelligence (AI) has become an increasingly central topic in discussions about the role of technology in education. Although AI has been in the public eye since the last century, the popularity of generative AI tools has brought about renewed interest in exploring how these technologies can enhance educational practices, including lesson planning. Therefore, this thesis investigates teachers’ perceptions of AI-assisted tools fine-tuned for lesson planning.

In general, AI has been characterised as notoriously hard to define. In the 1950s, the term AI was introduced to describe computers that appeared ‘intelligent’ because they were able to translate text or conduct a dialogue (Russell & Norvig, 2021). According to Schmid et al. (2021), it has so far proven difficult to provide a satisfactory, generally accepted definition of AI that includes all relevant aspects. Nowadays, the term is used collectively to refer to technologies such as machine learning as well as to systems and applications such as voice assistants and chatbots. Nevertheless, an important distinction can be made between early forms of AI, referred to as predictive AI, and generative AI (Gupta et al., 2024). Systems based on predictive AI are rule-based and designed to perform a specific task, such as a computer playing chess (Kanbach et al., 2024). While predictive AI excels at pattern recognition and decision-making, generative AI models, which this thesis is concerned with, are trained on large datasets and focus on generating new content (Hadi Mogavi et al., 2024).
This development has been made possible by the significant advances that large language models (LLMs) have made in the field of natural language processing and machine learning. The introduction of the transformer architecture by Vaswani et al. (2017) and the possibility of pre-training language models on significantly larger datasets improved the ability of LLMs to produce human-like text and to respond to complex questions with high fluency, based on text inputs that are referred to as prompts (Kasneci et al., 2023).

The recent popularity of AI was triggered by the release of the AI-assisted chatbot ChatGPT by the US company OpenAI in November 2022 (Goodman et al., 2024). Amongst students in Germany, this general-purpose AI chatbot is the most popular and most widely used AI tool, according to a study by Franke et al. (2024). Its popularity is raising questions regarding its possible use in education and has set off a debate about potential misuse and ethical implications. On the one hand, Kasneci et al. (2023) identify ethical challenges of generative AI in education, including copyright issues of AI-generated materials, algorithmic bias, sustainable usage, and lack of adaptability. In addition, concerned teachers and parents fear that students might turn to generative AI to complete assignments such as essays and to find answers to homework questions (Coy, 2023). On the other hand, educators and companies see the potential of AI-assisted tools to transform and improve education altogether. Instead of being afraid that students will use the technology to cheat on exams, proponents of generative AI advocate adjusting the nature of the assignments and working together with the new technology, as reported by Coy (2023) in the New York Times.
Arguably, AI-assisted tools have the potential to support teachers as well, by assisting with administrative tasks, translating educational materials, creating tasks and assessments, and differentiating learning materials (Hashem et al., 2023).

As mentioned previously, issues that persist when implementing general-purpose AI systems are algorithmic bias (Kasneci et al., 2023) and the production of false information (Sanderson, 2023). Moreover, LLMs are ‘few-shot’ learners, meaning that they require examples before being able to perform a new task (Brown et al., 2020). To address these issues, researchers have investigated instruction-tuning LLMs for a specific task (Ouyang et al., 2022). In comparison to general-purpose AI systems, a fine-tuned AI system is based on an instructional prompt that guides its function and role (Yun et al., 2023). This provides the LLM with relevant background information and contextual data for a specific task and aims to enhance its response accuracy. Ultimately, the fine-tuning of LLMs aims to improve their ‘zero-shot’ performance, i.e., their ability to perform certain tasks without prior examples (Sanh et al., 2021). In educational settings, this capability allows the LLM to be tailored for specific tasks, whether serving as a personalized tutor for students or supporting teachers with administrative duties and the demands of lesson planning, such as creating lesson plans and materials.

Although research has shown that teachers rarely plan according to guidelines or frameworks, lesson planning entails certain demands that are expected of every teacher (König et al., 2021). Recker and Sumner (2018, p. 268) explain that professional development that is focused on preparing teachers to further customize their instructional preparation process can enhance their ‘pedagogical design capacity’. Consequently, the authors (2018) advocate technology that supports teachers during instructional planning.
The emergence of AI models that are fine-tuned for lesson planning offers new possibilities to do so.

1.1 Research Aim

As the fine-tuning of LLMs for specific contexts is still a recent phenomenon, teachers’ perceptions and use of fine-tuned AI systems have not yet been extensively investigated. Therefore, the aim of this thesis is to investigate teachers’ perceptions and use of fine-tuned AI assistants for lesson planning. More specifically, the thesis will analyse how teachers from German schools at primary, secondary, and vocational levels perceive and use a fine-tuned text-generating LLM through a chat-based interface. Further, the thesis aims to uncover potential advantages and shortcomings of fine-tuned AI assistants for teachers. To achieve this aim, the thesis reports on a study that contrasts the use of a general-purpose AI model with a fine-tuned version for lesson planning.

1.2 Research Questions

The research aim will be achieved through the investigation of the following research questions:

1) How do teachers perceive the ease of use and usefulness of a fine-tuned AI assistant for lesson planning?
2) How do teachers evaluate lesson plans and materials generated by a fine-tuned AI assistant for lesson planning in terms of content accuracy and proposed pedagogical approaches?

Following this introduction, the remainder of the thesis is organised into six chapters: a literature review, an overview of the theoretical framework employed in this study, the methodological approach, the findings and, lastly, a discussion and a conclusion. The first two of these chapters give a brief overview of previous research on lesson planning, fine-tuned AI assistants, and AI’s impact on the teacher profession, and lay out the theoretical dimensions of the research. The third is concerned with the methodology employed, and the subsequent chapter presents the findings.
The final chapter includes a discussion of the implications of the findings for further research.

2. Literature Review

For this thesis, a comprehensive narrative literature review was conducted on the topics of lesson planning, fine-tuned AI models and the impact of AI on the teacher profession. A narrative literature review provides a broad synthesis of the existing research on a particular subject (Green et al., 2006). Unlike a systematic review, a narrative review does not follow a predefined methodology, but rather aims to critically analyse and summarize the current state of knowledge in a narrative format (Green et al., 2006). The initial search for articles was conducted on the ‘Scopus’ and ‘LearnTechLib’ databases with the keywords ‘lesson planning’, ‘generative AI’ and ‘fine-tuned AI models’. The purpose of this chapter is to review previous literature and research on the topic to identify its potential limitations. Additionally, this chapter aims to show how this research, which focuses on teachers’ perceptions and use of a fine-tuned AI assistant for lesson planning, will contribute to the discourse.

2.1 Lesson Planning

The following section contextualises the research aim by providing background information on the topic of lesson planning. The first part addresses the significance of lesson planning for teachers’ daily practice and introduces teachers’ perceptions of lesson planning. Thereafter, the demands of lesson planning are discussed to illustrate the practice. The second part provides an overview of previous research on lesson planning supported by general-purpose AI models such as ChatGPT.

As previously stated in the introduction, lesson planning is fundamental to teachers’ daily practice. Sardo-Brown (1996, p. 519) characterises lesson planning as ‘the instructional decisions made prior to the execution of plans during teaching’.
The lesson plan itself can be described as a teaching aid that delineates the course instruction (Janssen & Lazonder, 2015). There are different models of lesson plans. Typically, however, a lesson plan outlines the students’ learning goals, the learning activities and teaching approaches through which these goals will be pursued, and the resources that will be employed (Janssen & Lazonder, 2015; Milkova, 2012). Whether it occurs deliberately or subconsciously, lesson planning entails the interplay of several processes such as decision-making, professional judgment, problem-solving, and pedagogical reasoning (Enow & Goodwyn, 2018). Research can help to make these processes tangible (Enow & Goodwyn, 2018). Moreover, Mutton et al. (2011) highlight how research on lesson planning can support learning about teaching as a practice and develop an understanding of how to improve it. Their research investigated how beginning teachers develop expertise and concluded that lesson planning is a knowledge-based process that must allow for flexibility (Mutton et al., 2011).

Over time, many lesson plan models have been developed. To illustrate the process of planning, the following paragraphs present the demands of lesson planning in further detail. König et al. (2021) propose the CODE-PLAN model (Cognitive Demands of lesson planning) to empirically describe and investigate teachers’ planning competence. As this thesis aims to empirically analyse teachers’ perceptions and use of a fine-tuned AI assistant for lesson planning, this framework fits the research aim. The model defines content transformation, task creation, adaptation to student learning, clarity of learning objectives, unit contextualisation and phasing as cognitive demands teachers must meet before teaching.

The first cognitive demand, content transformation, is based on Shulman’s (1986) notion of transforming subject matter for teaching.
Shulman characterises this transformation as pedagogical reasoning. In the German tradition of didactics, teachers perform a didactic analysis (Klafki, 1995) to align the lesson topic with the curriculum and transform the topical content into learning content. Shulman’s (1986) definition of this demand is essential to lesson planning and will be addressed in this thesis as a component of the theoretical framework and analysis.

The second demand, task creation, is defined as the selection and creation of tasks as part of student activities (König et al., 2021). Küchemann et al. (2023) define the objective of tasks as the organisation of student learning and the observation of students’ progress. The next cognitive demand, adaptation to student learning dispositions, is part of Shulman’s (1986) content transformation. Here, the teacher is expected to differentiate content to match the learner’s level and guide them into the ‘zone of proximal development’ (Vygotsky, 1980, p. 84). Moreover, setting learning objectives is central for teachers and students. Learning objectives should usually be achieved within a single lesson. They support teachers in structuring lesson content and subsequently guide the students to understand what is expected of them. The demand of unit contextualisation focuses on situating a lesson within the broader context of unit planning (König et al., 2021). A single lesson should refer to previous and subsequent lessons, and when planning a lesson, a teacher should reflect on how an individual lesson builds upon the unit. Lastly, the demand of phasing refers to clearly structuring a lesson to reduce interruptions and assist classroom organisation (König et al., 2021).

When examining previous research on lesson planning, it is apparent that there are often differences between how teachers with varying levels of teaching experience plan their lessons.
For those new to teaching, the process of planning is of significant importance (Mutton et al., 2011) and is described as a visible process (Enow & Goodwyn, 2018). Their teaching style is more recipe-like and less adaptive in comparison to that of expert teachers (Chizhik & Chizhik, 2018). With more experience, it is argued, lesson planning turns into a less visible process that integrates instructional approaches with students’ learning dispositions (Berliner, 2004; Enow & Goodwyn, 2018).

In summary, previous research on lesson planning demonstrates that lesson planning is of great significance for teachers and entails several demands. König et al. (2021) highlight the content transformation that teachers carry out to transform subject matter for teaching. Moreover, lesson planning differs between novice and experienced teachers as it shifts to a less visible process with increased teaching experience (Enow & Goodwyn, 2018). Although lesson planning benefits teachers and students, as mentioned in the introduction, many teachers struggle with setting learning objectives (Liyanage & Bartlett, 2010), designing effective tasks (Ainley, 2012) or beginning the lesson planning process (Schmidt, 2005). This thesis examines a possible partial solution to these challenges by analysing teachers’ perceptions and use of a fine-tuned AI assistant for lesson planning and investigating how an AI-assisted tool influences the lesson planning process.

2.1.1 Lesson Planning and general-purpose AI models

The following section presents previous research on lesson planning supported by general-purpose AI models. It focuses on the discussion around the potential benefits and shortcomings of general-purpose AI models in assisting teachers in preparing lesson plans and materials. As general-purpose AI models such as ChatGPT were only released to the public in November 2022, this area of research is still relatively new, limiting the number of studies available for this review.
The following section provides a brief overview of recent findings on lesson planning and general-purpose AI models, and aims to trace different, and in some cases conflicting, views on the topic.

Many studies, including Kasneci et al. (2023), see the potential of generative AI in the creation of inclusive lesson plans and exercises. Kasneci et al. (2023) propose that, when the model is provided with background information on the prospective courses, generative AI has the potential to support teachers in designing course syllabi, generating questions, and fostering critical thinking. Furthermore, generative AI can assist teachers in creating personalized and differentiated course materials such as problems and quizzes (ElSayary, 2024). By doing so, generative AI is assumed to relieve teacher workload and provide teacher support (Hashem et al., 2023; van den Berg & du Plessis, 2023). Hashem et al. (2023) point out that teachers have higher burnout levels than other professionals in the human service field. By supporting the teacher in administrative and repetitive tasks such as lesson planning and material creation, generative AI has the potential to act as a tool for workload relief and burnout prevention (Hashem et al., 2023).

Other studies highlight that AI-based tools for lesson planning still have significant limitations and shortcomings in terms of content accuracy and pedagogical approaches. For example, generative AI, such as ChatGPT, is not able to produce an adequate lesson plan on the first attempt (Goodman et al., 2024; Hashem et al., 2023; van den Berg & du Plessis, 2023). As noted by van den Berg and du Plessis (2023), ChatGPT was able to produce a basic lesson plan that was acceptable for novice teachers but lacked detail and creativity for more experienced teachers. Similarly, Goodman et al. (2024) criticised the lack of detail in the lesson plans generated by ChatGPT, describing them as impractical to implement in the classroom.
For example, the generated lesson plan included suggestions for materials and activities but lacked detail regarding the specific materials to be used or topics to be discussed. In addition, the general-purpose AI model estimated how long an activity would take, but did not include the time needed to set up the activity (Goodman et al., 2024).

To improve the quality of the generated lesson plans, it is recommended to engage with the tool in an iterative manner, providing it with well-crafted prompts and specific contextual information (Goodman et al., 2024; Hashem et al., 2023). By working iteratively, the teacher can further instruct the tool and request alterations to the proposed lesson plan to tailor it to their expectations. As noted by Goodman et al. (2024), the strength of ChatGPT is its ability to modify its response based on new requests. In addition to iterative engagement with generative AI, previous research has highlighted the importance of detailed and well-crafted prompts to produce high-quality outcomes for teachers (Goodman et al., 2024; Hashem et al., 2023). According to Goodman et al. (2024), the prompts used to generate lesson plans had a significant impact on the effectiveness of ChatGPT; the authors identify thoughtfully designed and task-specific prompts as a key pillar of the successful implementation of ChatGPT for lesson planning. Another factor that enables generative AI to produce better results is context (Goodman et al., 2024; Hashem et al., 2023; van den Berg & du Plessis, 2023). Hashem et al. (2023) and Küchemann et al. (2023) attribute the poor results of generative AI to a lack of appropriate context, consisting of an understanding of relevant theoretical concepts and domain-specific knowledge.
If generative AI has access to a specialised context for instructional planning, it has the potential to improve its performance (van den Berg & du Plessis, 2023).

Furthermore, previous studies highlight the importance of a critical evaluation of the generated materials and lesson plans by teachers (Goodman et al., 2024; Hashem et al., 2023; van den Berg & du Plessis, 2023). As the content generated by generative AI may contain factual errors and invented knowledge, Goodman et al. (2024) advise against working with generative AI unless teachers engage with the tool iteratively and reflectively. If generative AI cannot base its response on a concrete source, it will make up a response, a behaviour of large language models (LLMs) that is commonly referred to as hallucination (Sanderson, 2023). In the same vein, van den Berg and du Plessis (2023) emphasise that teachers need to critically evaluate generated materials for accuracy and context before implementing them in the classroom. Consequently, the need for teacher training has been established and advocated to help teachers develop the skills required to work effectively with generative AI, such as designing prompts, working iteratively, and evaluating the responses critically (Goodman et al., 2024; Hashem et al., 2023; Küchemann et al., 2023).

When looking at the discussion around generative AI’s effectiveness for lesson planning, an important point of discussion is the necessary level of teacher experience. Goodman et al. (2024) deem the generated lesson plans unfit for novice teachers, as the generated responses may include erroneous and false information and novice teachers may have difficulty identifying hallucinations. Moreover, hallucinations do not occur with each response, which is why Goodman et al. (2024) underline the necessity for teachers to critically evaluate each response.
In contrast, other research characterises ChatGPT as a supportive tool for novice teachers when developing science tasks. Küchemann et al. (2023) compared tasks created with generative AI with tasks created with the help of physics textbooks. The findings indicate that the tasks were overall comparable in their strengths and shortcomings. Consequently, the authors conclude that a general-purpose AI model can effectively support teachers in task development, as there were no significant differences between the examined methods (Küchemann et al., 2023). Regarding how more experienced teachers view lesson plans produced by generative AI, van den Berg and du Plessis (2023) contemplate whether more experienced teachers may have higher demands regarding the level of lesson plans. The authors allude to their finding that generated lesson plans were deemed acceptable for novice teachers but lacked creativity concerning the proposed activities and methods for more experienced teachers (van den Berg & du Plessis, 2023).

Together, these studies indicate that generative AI can serve as a supportive tool for teachers if used with well-designed prompts (Goodman et al., 2024; Hashem et al., 2023). The studies also outline a crucial role for the teacher's critical evaluation of the generated content and highlight the need for teacher training in the design of effective prompts (Hashem et al., 2023; Küchemann et al., 2023; van den Berg & du Plessis, 2023). As previous studies focused on examining the generated lesson plans through the lens of the researchers (Hashem et al., 2023; van den Berg & du Plessis, 2023) and on quantitatively evaluating the generated tasks through a survey (Küchemann et al., 2023), this review has highlighted the lack of detailed, qualitative analysis of teachers' perceptions and use of AI-based tools for lesson planning.
Thus, this thesis aims to contribute to this growing area of research by providing a detailed, qualitative analysis of teachers' perceptions and use of a fine-tuned AI assistant for lesson planning, in order to improve our understanding of the potential benefits and shortcomings of AI-assisted lesson planning tools.

2.2 Fine-tuned AI models

The renewed attention on AI due to the growing popularity of generative AI models has prompted numerous studies to examine how the performance of LLMs can be improved. As mentioned previously, general-purpose AI models are prone to algorithmic bias and hallucination (Kasneci et al., 2023). To address these issues, numerous researchers have investigated instruction-tuning LLMs for a specific task or application to enhance their response accuracy (Yun et al., 2023). Massimo Airoldi (2022) refers to the fine-tuning of LLMs as a secondary machine socialisation:

Once the initial training of a supervised machine learning system is complete, the secondary machine socialization usually consists in a fine-tuning stage, when the general cultural dispositions of the machine habitus adjust to the local data contexts of real-life applications and feedback loops [...] (p. 64)

The following section presents research on fine-tuned LLMs to uncover previously investigated areas of application and their potential advantages and shortcomings. This is done in order to identify potential gaps that the present study of teachers' perceptions of a fine-tuned AI assistant for lesson planning could address. Similarly to lesson planning supported by general AI-assisted tools, fine-tuned LLMs are an emerging area in which the body of published work remains relatively limited.

Several studies have indicated that fine-tuned LLMs tend to outperform general-purpose ones for specific tasks.
As fine-tuned LLMs are instructed for a specific task, they were found to be capable of producing longer and more coherent text with fewer errors and hallucinations (Steinert et al., 2023). In a working paper, Steinert et al. (2023) investigated the implementation of a fine-tuned LLM for providing feedback to students. The teacher provides the underlying LLM, either GPT-3.5 Turbo or GPT-4, with a specific task and the correct answer and instructs the LLM on how to present the task and address the student. The study concludes that fine-tuning an LLM has a notable benefit in reducing errors in its output (Steinert et al., 2023). This is attributed to the fact that the LLM is provided with the correct solution to the task, negating the need for it to generate an answer through guesswork (Steinert et al., 2023). Furthermore, the student is less likely to become distracted, as the LLM is instructed to remain focused on the task and to redirect the student if they attempt to deviate from it (Steinert et al., 2023).

Another advantage of fine-tuned generative AI models is that they require less extensive prompt formulation. The system receives instructions on its task through pre-prompts, so that minimal input is required from the user for it to perform its specialised task. This feature is regarded as particularly beneficial for users with limited experience in working with generative AI, as it lowers the threshold for them to begin working effectively with LLMs (Steinert et al., 2023).

Nevertheless, numerous studies have demonstrated that fine-tuned LLMs are constrained by their limited memory capacity and the inherent limitations of LLMs. For example, Moore et al. (2022) argue that the fine-tuned LLM GPT-3 is suitable for conducting an initial evaluation and classification of student-generated questions. However, the extensive automatic evaluation was insufficient, as the model overestimated the quality of the generated questions and misclassified them.
To address this issue, the authors propose fine-tuning with a more comprehensive dataset (Moore et al., 2022). Similarly, fine-tuned LLMs have encountered challenges in adhering to their designated roles. Koyuturk et al. (2023) fine-tuned GPT-3 to create an educational chatbot that could assist students when the teacher is not available. The study concludes that a fine-tuned LLM was successful in fulfilling the role of the teacher with the appropriate pre-prompts. However, the model did not effectively maintain the teacher role and frequently diverged from the defined parameters (Koyuturk et al., 2023). The authors of the study hypothesise that this behaviour may be attributed to the limitations of current LLMs in terms of memory or context window size (Koyuturk et al., 2023).

It is important to note that LLMs have significantly improved in terms of context window size and amount of training data since the publication of these studies. An increased context window size mitigates the memory constraints of previous models, as criticised by Moore et al. (2022) and Koyuturk et al. (2023) in their research with GPT-3. Niszczota and Abbas (2023) conducted a direct comparison of the LLMs and assessed the performance of GPT-3.5 and GPT-4 on a financial literacy test. The results showed that GPT-3.5 achieved a score of 65-66%, while ChatGPT based on the GPT-4 model achieved a near-perfect score of 99%. Steinert et al. (2023) also note that their study of fine-tuned LLMs could have yielded more accurate results and an improved user experience if it had been conducted with the LLM GPT-4 instead of its predecessor GPT-3.5. Along the same lines, Singer et al. (2024) compared the performance of GPT-4 wi