Enhancing Python Programming Education with an AI-Powered Code Helper: Design, Implementation, and Impact

Sayed Mahbub Hasan Amiri*, Md Mainul Islam

doi:10.11648/j.se.20251101.11

Research Article | Peer-Reviewed

Published in Software Engineering (Volume 11, Issue 1)

Received: 23 March 2025; Accepted: 31 March 2025; Published: 28 April 2025

Abstract

This study presents an AI-powered chatbot that helps students learn Python programming by guiding them through debugging errors, fixing syntax problems, and turning abstract theoretical concepts into practical implementations. Traditional coding tools such as Integrated Development Environments (IDEs) and static analyzers do not offer learner-oriented guidance, while AI-driven code assistants such as GitHub Copilot focus on productivity rather than learning. To close this gap, our chatbot combines static code analysis, dynamic execution tracing, and large language models (LLMs) to give students relevant, practical advice that supports the learning process. The chatbot’s hybrid architecture employs CodeLlama for code embedding, GPT-4 for natural language interactions, and Docker-based sandboxing for secure execution. Evaluated through a mixed-methods approach involving 1,500 student submissions, the system demonstrated an 85% error resolution success rate, outperforming standalone tools such as pylint (62%) and GPT-4 (73%). Quantitative results revealed a 59.3% reduction in debugging time among users, with pre- and post-test assessments showing a 34% improvement in coding proficiency, particularly in recursion and exception handling. Qualitative feedback from 120 students highlighted the chatbot’s clarity, accessibility, and confidence-building impact, though critiques included occasional latency and restrictive code sanitization. On the ethical side, bias mitigation reduced gendered language by 83%, and GDPR-compliant anonymization procedures were followed. The chatbot's effectiveness points to its potential to make coding education widely accessible and to give 24/7 support to students in under-resourced schools.
Future work will expand multilingual support through localized datasets and culturally adapted examples, integrate gamification to enhance engagement, and develop collaborative learning features. By balancing technical innovation with pedagogical empathy, this research provides a blueprint for AI tools that prioritize educational equity and long-term skill retention over mere code completion. The chatbot exemplifies how AI can augment human instruction, fostering deeper conceptual understanding in programming education.
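
To make the combination of static analysis and dynamic execution tracing described in the abstract more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the function names (static_check, dynamic_check, diagnose) are hypothetical, the LLM explanation step is omitted, and exec() is used only for illustration where the paper describes Docker-based sandboxing.

```python
import ast
import traceback


def static_check(source: str) -> list[str]:
    """Static pass: parse the code without running it and report syntax errors."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return [f"SyntaxError on line {exc.lineno}: {exc.msg}"]
    return []


def dynamic_check(source: str) -> list[str]:
    """Dynamic pass: run the code and report the student line that raised.

    Assumption: exec() is NOT a sandbox. The paper describes Docker-based
    isolation; this call only stands in for that step.
    """
    try:
        exec(compile(source, "<student>", "exec"), {"__name__": "__student__"})
    except Exception as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        # Innermost frame that belongs to the student's code, if any.
        line = next(
            (f.lineno for f in reversed(frames) if f.filename == "<student>"),
            None,
        )
        return [f"{type(exc).__name__} on line {line}: {exc}"]
    return []


def diagnose(source: str) -> list[str]:
    """Run the static pass first; fall back to the dynamic pass if it is clean."""
    findings = static_check(source)
    if findings:
        return findings
    return dynamic_check(source)
```

In a pipeline like the one the abstract outlines, findings of this kind would presumably be passed to a language model to phrase an explanation for the student; that step is left out here because the abstract does not specify how it is done.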

Page(s)	1-17
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2025. Published by Science Publishing Group

Keywords

Python Programming, Intelligent Tutoring Systems, Code Debugging, Ethical AI, AI in Education

Cite This Article

APA Style

Amiri, S. M. H., Islam, M. M. (2025). Enhancing Python Programming Education with an AI-Powered Code Helper: Design, Implementation, and Impact. Software Engineering, 11(1), 1-17. https://doi.org/10.11648/j.se.20251101.11

ACS Style

Amiri, S. M. H.; Islam, M. M. Enhancing Python Programming Education with an AI-Powered Code Helper: Design, Implementation, and Impact. Softw. Eng. 2025, 11(1), 1-17. doi: 10.11648/j.se.20251101.11

AMA Style

Amiri SMH, Islam MM. Enhancing Python Programming Education with an AI-Powered Code Helper: Design, Implementation, and Impact. Softw Eng. 2025;11(1):1-17. doi: 10.11648/j.se.20251101.11

@article{10.11648/j.se.20251101.11,
  author = {Sayed Mahbub Hasan Amiri and Md Mainul Islam},
  title = {Enhancing Python Programming Education with an AI-Powered Code Helper: Design, Implementation, and Impact},
  journal = {Software Engineering},
  volume = {11},
  number = {1},
  pages = {1-17},
  doi = {10.11648/j.se.20251101.11},
  url = {https://doi.org/10.11648/j.se.20251101.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.se.20251101.11},
 year = {2025}
}

TY  - JOUR
T1  - Enhancing Python Programming Education with an AI-Powered Code Helper: Design, Implementation, and Impact
AU  - Sayed Mahbub Hasan Amiri
AU  - Md Mainul Islam
Y1  - 2025/04/28
PY  - 2025
N1  - https://doi.org/10.11648/j.se.20251101.11
DO  - 10.11648/j.se.20251101.11
T2  - Software Engineering
JF  - Software Engineering
JO  - Software Engineering
SP  - 1
EP  - 17
PB  - Science Publishing Group
SN  - 2376-8037
UR  - https://doi.org/10.11648/j.se.20251101.11
VL  - 11
IS  - 1
ER  -

Author Information

Sayed Mahbub Hasan Amiri

Department of ICT, Dhaka Residential Model College, Dhaka, Bangladesh

http://orcid.org/0000-0003-2349-2143

Md Mainul Islam

Department of ICT, Dhaka Residential Model College, Dhaka, Bangladesh

http://orcid.org/0009-0001-6093-1638
