A Decoder for Sign Boards of Indian Languages to English using Tesseract and seq2seq Model

Basavesha D, Piyush Kumar Pareek, Suhas G K.

Abstract


Building a translator from Indian languages to English is difficult because the scripts used for languages such as Tamil, Kannada, and Hindi contain a very large number of characters. In this work we propose a system that captures text written in Indian scripts and translates it into English. We describe the methods and machine learning models used to build the system, which achieves an accuracy of 87 percent. The article also includes the project itself: a webOS sensor framework with a generic design, and a centralised daemon for controlling and accessing all of a webOS device's sensors.

