BERT-Based Grammatical Error Analysis in Indonesia Senior High School Essays
DOI:
https://doi.org/10.33394/jollt.v14i2.18551
Keywords:
BERT-based learning model, Grammar error analysis, Automated writing evaluation, Writing skills, Natural language processing
Abstract
Automated grammatical error detection has advanced rapidly for high-resource languages, but comparable tools for Bahasa Indonesia remain scarce, especially in secondary school settings. Although Indonesian senior high school students commonly struggle with spelling, morphology, syntax, and diction, AI-assisted feedback systems designed specifically for Indonesian writing are still in their infancy. This study examines the use of IndoBERT-base for grammatical error analysis in 82 senior high school student essays totaling 10,911 words. Manual annotation by two expert raters identified 1,872 grammatical errors across four categories. Before analysis with a fine-tuned IndoBERT-base model, the essays were pre-processed through tokenization, normalization, and alignment with the gold-standard annotations. Model performance was assessed with accuracy, precision, recall, and F1-score, calculated by comparing predicted labels with teacher-validated error tags. The model correctly identified 1,594 of the 1,872 annotated errors, a detection rate of 85.1%, and showed good agreement (80%) with human raters. Detection was strongest for spelling and morphology errors and weaker for syntax and diction errors, which are more contextually and semantically complex. These results suggest that transformer-based models can effectively support automated grammatical analysis of Indonesian student writing. Nonetheless, the model's limitations in handling discourse-level dependencies underscore the continuing importance of human assessment. The study supports the adoption of hybrid human–AI feedback systems to improve classroom writing instruction and contributes to the development of AI-assisted grammar tools for Indonesian education.
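The evaluation measures named in the abstract can be illustrated with a short sketch. The detection rate below uses the two counts reported in the abstract; everything else (the label names `O`, `SPELL`, `MORPH`, `SYNTAX`, `DICTION`, the toy label sequences, and the helper functions) is hypothetical and is not taken from the study's data or code.

```python
# Counts reported in the abstract.
TOTAL_ANNOTATED_ERRORS = 1872
CORRECTLY_IDENTIFIED = 1594


def detection_rate(found, total):
    """Share of human-annotated errors that the model also identified."""
    return found / total


def accuracy(gold, pred):
    """Fraction of positions where the predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def precision_recall_f1(gold, pred, category):
    """Precision, recall and F1 for one error category over aligned labels."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == category and p == category)
    fp = sum(1 for g, p in pairs if g != category and p == category)
    fn = sum(1 for g, p in pairs if g == category and p != category)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical token-level labels, for illustration only ("O" = no error).
gold = ["O", "SPELL", "O", "MORPH", "SYNTAX", "O", "DICTION", "SPELL"]
pred = ["O", "SPELL", "O", "MORPH", "O", "O", "SYNTAX", "SPELL"]

rate_percent = round(100 * detection_rate(CORRECTLY_IDENTIFIED, TOTAL_ANNOTATED_ERRORS), 1)
toy_accuracy = accuracy(gold, pred)
spell_scores = precision_recall_f1(gold, pred, "SPELL")
syntax_scores = precision_recall_f1(gold, pred, "SYNTAX")

print(f"Detection rate: {rate_percent}%")
print(f"Toy accuracy: {toy_accuracy}")
print(f"SPELL  P/R/F1: {spell_scores}")
print(f"SYNTAX P/R/F1: {syntax_scores}")
```

In the toy sequences, the surface-level category is recovered exactly while the context-dependent one is both missed and wrongly predicted, which mirrors the direction of the reported results without reproducing them.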
License
Copyright (c) 2026 Syarifuddin Tundreng, Heri Alfian, Parsya Kartika, Azka Airin Nisa

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
License and Publishing Agreement
In submitting the manuscript to the journal, the authors certify that:
- They are authorized by their co-authors to enter into these arrangements.
- The work described has not been formally published before, except in the form of an abstract or as part of a published lecture, review, thesis, or overlay journal.
- The work is not under consideration for publication elsewhere.
- Its publication has been approved by all the author(s) and, tacitly or explicitly, by the responsible authorities of the institutes where the work has been carried out.
- They secure the right to reproduce any material that has already been published or copyrighted elsewhere.
- They agree to the following license and publishing agreement.
Copyright
Authors who publish with JOLLT (Journal of Languages and Language Teaching) agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
Licensing for Data Publication
- Open Data Commons Attribution License, http://www.opendatacommons.org/licenses/by/1.0/ (default)