Enhancing Contextual Arabic Handwritten Characters Recognition with CNNs: A Comparative Study on Augmentation Strategies and Dataset Scaling
DOI:
https://doi.org/10.11113/elektrika.v24n3.739
Keywords:
Arabic Handwritten Character Recognition, Contextual Forms, Dataset Correction, Semantic Dataset
Abstract
Arabic handwritten character recognition (AHCR) faces significant challenges due to the cursive nature of the script, positional variations of characters, and inconsistencies in existing datasets that hinder robust model training and generalization. Existing AHCR systems often rely on datasets with unreliable annotations, limited character sets, and inconsistent forms, restricting their real-world applicability. This study presents a comparative evaluation of four convolutional neural network (CNN) experiments developed to enhance contextual Arabic handwritten character recognition. Beginning with a baseline model trained on the HMBD dataset, we progressively examine the impact of data augmentation and dataset scaling by correcting mislabeled samples and incorporating additional positional forms, resulting in an extended dataset with 114 contextually diverse classes. While prior work, such as the CNN-5 model, reported 91.96% accuracy on the original HMBD dataset, our final model, trained on an enhanced dataset with semantic and structural improvements, achieved a higher test accuracy of 92.24%, along with precision and F1 scores of 92.48% and 92.24%, respectively, outperforming CNN-5 despite the increased complexity of the class structure. The results underscore the importance of both data integrity and model architecture and offer a robust framework for developing scalable and reliable handwritten Arabic OCR systems. This study benchmarks state-of-the-art CNNs and provides a reproducible pipeline that bridges performance with real-world applicability.
License
Copyright of articles that appear in Elektrika belongs exclusively to Penerbit Universiti Teknologi Malaysia (Penerbit UTM Press). This copyright covers the rights to reproduce the article, including reprints, electronic reproductions, or any other reproductions of similar nature.













