
Real-Time Translation of Speech to Indian Sign Language to Assist People with Hearing Impairment

Devika M, Aravind SK, Aleena B, Anagha Mary Philip, Muhammed Azharuddin Sahib

Abstract


Sign language is a visual language used by deaf individuals as their primary means of communication. Unlike spoken languages, it conveys ideas and thoughts through hand gestures, body movements, and other forms of manual communication. It can also be used by people who have difficulty speaking or cannot speak, and by hearing individuals who wish to communicate with deaf individuals. Access to sign language is crucial for the social, emotional, and linguistic development of deaf individuals. Our project aims to bridge the communication gap between deaf individuals and the general population by leveraging advances in web applications, machine learning, and natural language processing. The primary objective is to develop an interface that converts audio/voice input into the corresponding sign language, rendered through the simultaneous combination of hand shapes, orientations, and movements of the hands, arms, and body. The interface operates in two phases: first, audio is converted to text using a speech-to-text service (such as Python speech-recognition modules or the Google Speech API); second, the text is represented as a parse tree and processed with natural language processing tools (specifically, NLTK) for the lexical analysis of sign language grammar. The system adheres to the rules and grammar guidelines of Indian Sign Language (ISL).
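As a rough illustration of the two phases described in the abstract, the sketch below wires a speech-recognition step to a simple NLTK post-processing step. It assumes the third-party SpeechRecognition package (with PyAudio for microphone access, using the Google Web Speech API) and NLTK with its punkt, stopwords, and wordnet data downloaded. The function names and the gloss rules here (stopword removal plus lemmatization as a stand-in for full parse-tree restructuring) are illustrative assumptions, not the authors' implementation.

# Minimal sketch of the two-phase pipeline (illustrative, not the paper's code).
# Requires: pip install SpeechRecognition pyaudio nltk
# One-time NLTK data setup:
#   nltk.download("punkt"); nltk.download("stopwords"); nltk.download("wordnet")
import speech_recognition as sr
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

def speech_to_text() -> str:
    """Phase 1: capture microphone audio and transcribe it via the Google Web Speech API."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate for background noise
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio, language="en-IN")

def text_to_isl_gloss(sentence: str) -> list:
    """Phase 2 (simplified): reduce an English sentence to an ISL-style gloss.

    ISL grammar omits articles and auxiliaries and uses uninflected signs,
    so this stand-in strips stopwords and lemmatizes the remaining tokens;
    the full system additionally reorders constituents using parse trees.
    """
    lemmatizer = WordNetLemmatizer()
    stop = set(stopwords.words("english"))
    tokens = word_tokenize(sentence.lower())
    content = [t for t in tokens if t.isalpha() and t not in stop]
    return [lemmatizer.lemmatize(t) for t in content]

if __name__ == "__main__":
    text = speech_to_text()
    print("Recognized:", text)
    print("ISL gloss:", text_to_isl_gloss(text))

In the full system, the resulting gloss tokens would index a dictionary of ISL sign animations or videos, and parse-tree analysis would reorder constituents into ISL's subject-object-verb order before display.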




