Integrating YOLOv8, EasyOCR, and GTTS for text detection in assistive technology for the visually impaired

Maria Bestarina Laili; Kartika Kartika; Muhammad Zaki Abdulah; Syuraih Amiruddin; Egi Sunardi; Jelita Permatasari

doi:10.31603/bistycs.185

Authors

Maria Bestarina Laili, Universitas Singaperbangsa Karawang
Kartika Kartika, Universitas Singaperbangsa Karawang
Muhammad Zaki Abdulah, Universitas Singaperbangsa Karawang
Syuraih Amiruddin, Universitas Singaperbangsa Karawang
Egi Sunardi, Universitas Singaperbangsa Karawang
Jelita Permatasari, Universitas Singaperbangsa Karawang

Keywords:

Text detection, Assistive technology, Visually impaired

Abstract

Assistive technology for visually impaired individuals has advanced, but accessing text-based information remains challenging. A usable solution must detect text accurately, recognize it reliably, and convert it into clear speech. This study develops a system that integrates YOLOv8 for text detection, EasyOCR for text recognition, and Google Text-to-Speech (GTTS) for text-to-speech conversion, with a focus on improving accessibility for the visually impaired. The system operates in three stages. First, YOLOv8 detects text regions in images in real time. Next, EasyOCR extracts the text from the detected regions. Finally, GTTS converts the recognized text into speech. A diverse dataset of text images was used to train and test the detection model, and user testing was conducted to assess the system's usability and effectiveness. The developed system detects and reads text with high accuracy and converts it into clear speech. The evaluation showed improved information accessibility for visually impaired users, who responded positively to the system's speed, accuracy, and ease of use. Integrating YOLOv8, EasyOCR, and GTTS into a single solution offers an innovative approach to text detection, recognition, and speech conversion for visually impaired individuals. The system shows significant potential to enhance independence and quality of life by improving access to text-based information. The study contributes to assistive technology development and opens the way for further research into practical applications and system refinement.
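The three stages described in the abstract (detect, recognize, speak) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `read_text_aloud` and `build_backends`, the weights path argument, the default output file name, and the top-to-bottom, left-to-right ordering of regions are all assumptions made for illustration. The sketch assumes the `ultralytics`, `easyocr`, and `gtts` packages and a YOLOv8 model already trained to detect text regions.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def read_text_aloud(
    image,
    detect: Callable[[object], List[Box]],
    recognize: Callable[[object, Box], str],
    speak: Callable[[str, str], None],
    out_path: str = "speech.mp3",  # hypothetical default file name
) -> str:
    """Run detection, recognition, and speech synthesis on one image."""
    # Stage 1: find text regions; sort top-to-bottom, then left-to-right
    # (a simple reading-order assumption, not taken from the paper).
    boxes = sorted(detect(image), key=lambda b: (b[1], b[0]))
    # Stage 2: recognize the text inside each region.
    pieces = [recognize(image, box).strip() for box in boxes]
    text = " ".join(p for p in pieces if p)
    # Stage 3: synthesize speech only when some text was recognized.
    if text:
        speak(text, out_path)
    return text


def build_backends(weights_path: str, languages=("en",), speech_lang: str = "en"):
    """Wire the stages to YOLOv8, EasyOCR, and gTTS (imported lazily)."""
    from ultralytics import YOLO
    import easyocr
    from gtts import gTTS

    model = YOLO(weights_path)  # assumed: weights trained on text regions
    reader = easyocr.Reader(list(languages))

    def detect(image) -> List[Box]:
        result = model(image)[0]  # results for the single input image
        return [tuple(int(v) for v in xyxy) for xyxy in result.boxes.xyxy.tolist()]

    def recognize(image, box: Box) -> str:
        x1, y1, x2, y2 = box
        # detail=0 makes EasyOCR return plain strings for the cropped region.
        return " ".join(reader.readtext(image[y1:y2, x1:x2], detail=0))

    def speak(text: str, out_path: str) -> None:
        gTTS(text=text, lang=speech_lang).save(out_path)

    return detect, recognize, speak
```

With an image loaded as a NumPy array (for example with OpenCV), a call might look like `read_text_aloud(image, *build_backends("text_detector.pt"))`, where `text_detector.pt` is a placeholder for whatever weights file is available. Passing the three stages in as callables keeps the pipeline logic testable without the heavy dependencies.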

