SignifyX: A Multi-Lingual Fingerspelling Sign Language Recognition System Using Artificial Intelligence
Abstract
In recent years, sign language recognition technology has gained significant attention due to its potential to bridge communication gaps between the hearing-impaired community and the broader population. Fingerspelling, a fundamental aspect of sign language, allows individuals to spell out words letter by letter using hand gestures, making it a critical tool for effective communication. This paper introduces SignifyX, a Kotlin-based mobile application designed to recognize fingerspelling in three sign languages—American Sign Language (ASL), French Sign Language (FSL), and Arabic Sign Language (ArSL)—using MediaPipe's gesture recognition framework. The app aims to enhance communication for individuals with hearing and speech impairments by providing an intuitive interface that recognizes hand gestures and converts them into text, facilitating real-time communication and improving accessibility.
SignifyX offers two core functionalities:
1. Live Stream Gesture Recognition: This feature captures hand gestures in real time, recognizing individual letters and assembling them into words (see the letter-commit sketch after this list). It provides immediate feedback to the user by displaying the recognized letters on the screen as they are spelled out.
2. Video-based Gesture Recognition: This feature enables users to upload or record videos for gesture recognition. The system analyzes the video frame by frame, identifies the letters being spelled out, and presents the resulting word or sentence. Additionally, the system supports gesture recognition from static images, offering versatility in its application.
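A practical question in the live-stream path is when a held handshape should be committed as a letter. The following is a minimal Kotlin sketch of one plausible strategy, not a detail taken from the paper: a letter is accepted only after it has been the top prediction for a fixed number of consecutive frames, which filters out the transitional handshapes between letters.

```kotlin
// Hypothetical letter-to-word assembler: a letter is committed only after it
// has been the top prediction for `stableFrames` consecutive frames.
class WordAssembler(private val stableFrames: Int = 5) {
    private val word = StringBuilder()
    private var candidate: String? = null
    private var streak = 0

    // Call once per frame with the top predicted letter (or null if no hand).
    fun onFrame(letter: String?): String {
        if (letter == candidate) {
            streak++
        } else {
            candidate = letter
            streak = 1
        }
        // Commit exactly once per hold, when the streak first reaches the threshold.
        if (letter != null && streak == stableFrames) word.append(letter)
        return word.toString() // current partial word for on-screen feedback
    }

    // Call when the user signals end-of-word (e.g. a pause between gestures).
    fun finishWord(): String = word.toString().also { word.clear() }
}
```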
A major challenge in fingerspelling recognition is the accuracy of the words being formed. Variations in hand position, incomplete gestures, and the inherent ambiguity of certain letters in sign language can lead to incorrect word formation. To address this issue, SignifyX integrates a BK-Tree (Burkhard-Keller tree) for spell correction. The BK-Tree is a data structure that supports efficient approximate string matching under the Levenshtein distance, which measures the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one word into another. By storing a predefined dictionary of valid words in the BK-Tree, the system can suggest corrections for inaccurately spelled words based on their edit distance to valid entries in the tree. This approach not only improves recognition accuracy but also enhances the user experience by offering real-time spelling correction and ensuring that the output is meaningful and contextually relevant.
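To make the mechanism concrete, here is a minimal, self-contained BK-Tree sketch in Kotlin (an illustration, not the app's actual implementation). Insertion labels each parent-child edge with the Levenshtein distance between the two words; search prunes any subtree whose edge label falls outside [d - maxDist, d + maxDist], which follows from the triangle inequality of the metric.

```kotlin
// Minimal BK-Tree for spell correction, assuming a plain word list as the dictionary.
class BkTree(private val distance: (String, String) -> Int) {
    private class Node(val word: String) {
        val children = mutableMapOf<Int, Node>() // edge label = distance to child
    }
    private var root: Node? = null

    fun add(word: String) {
        var node = root ?: run { root = Node(word); return }
        while (true) {
            val d = distance(word, node.word)
            if (d == 0) return // word already present
            val child = node.children[d] ?: run { node.children[d] = Node(word); return }
            node = child
        }
    }

    // Return all dictionary words within `maxDist` edits of `query`.
    fun search(query: String, maxDist: Int): List<String> {
        val results = mutableListOf<String>()
        val stack = ArrayDeque<Node>()
        root?.let { stack.addLast(it) }
        while (stack.isNotEmpty()) {
            val node = stack.removeLast()
            val d = distance(query, node.word)
            if (d <= maxDist) results += node.word
            // Triangle inequality: only children whose edge label lies in
            // [d - maxDist, d + maxDist] can contain a match.
            for (k in (d - maxDist)..(d + maxDist)) {
                node.children[k]?.let { stack.addLast(it) }
            }
        }
        return results
    }
}

// Standard dynamic-programming Levenshtein distance, two rows of memory.
fun levenshtein(a: String, b: String): Int {
    val prev = IntArray(b.length + 1) { it }
    val curr = IntArray(b.length + 1)
    for (i in 1..a.length) {
        curr[0] = i
        for (j in 1..b.length) {
            val cost = if (a[i - 1] == b[j - 1]) 0 else 1
            curr[j] = minOf(curr[j - 1] + 1, prev[j] + 1, prev[j - 1] + cost)
        }
        curr.copyInto(prev)
    }
    return prev[b.length]
}
```

For example, with a dictionary containing "hello" and "help", `search("helo", 1)` returns both candidates, and the system can then rank suggestions by distance before presenting a correction.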
The integration of MediaPipe's gesture recognition framework plays a pivotal role in the accuracy and speed of SignifyX. MediaPipe, a cross-platform machine learning framework, is optimized for real-time hand tracking and gesture detection, making it ideal for an application that requires high performance in both live and video-based recognition scenarios. The framework allows SignifyX to track 21 distinct hand landmarks, enabling precise detection of complex hand shapes and movements required for fingerspelling. By utilizing MediaPipe, the app is able to deliver a fast, responsive, and accurate recognition system that runs efficiently on mobile devices, even those with limited hardware resources.
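A hedged sketch of how such a recognizer might be wired up with MediaPipe's Tasks API in Kotlin is shown below; the model asset name, confidence threshold, and callback body are illustrative assumptions rather than details from the paper.

```kotlin
import android.content.Context
import com.google.mediapipe.framework.image.MPImage
import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.vision.core.RunningMode
import com.google.mediapipe.tasks.vision.gesturerecognizer.GestureRecognizer
import com.google.mediapipe.tasks.vision.gesturerecognizer.GestureRecognizerResult

fun buildRecognizer(context: Context, onLetter: (String) -> Unit): GestureRecognizer {
    val options = GestureRecognizer.GestureRecognizerOptions.builder()
        .setBaseOptions(
            // Hypothetical asset name; a per-language model bundle is assumed.
            BaseOptions.builder().setModelAssetPath("asl_fingerspelling.task").build()
        )
        .setRunningMode(RunningMode.LIVE_STREAM) // one async result per camera frame
        .setResultListener { result: GestureRecognizerResult, _: MPImage ->
            // Each inner list holds the ranked gestures for one detected hand.
            result.gestures().firstOrNull()?.firstOrNull()?.let { top ->
                if (top.score() > 0.8f) onLetter(top.categoryName())
            }
        }
        .build()
    return GestureRecognizer.createFromOptions(context, options)
}

// Per camera frame: recognizer.recognizeAsync(mpImage, frameTimestampMs)
```

In live-stream mode the recognizer runs asynchronously, so the result listener is a natural place to feed the letter stream into the word-assembly and spell-correction stages described above.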
Another key feature of SignifyX is its multilingual support. The app is designed to work across three distinct sign languages—ASL, FSL, and ArSL—each of which has unique gestures for representing different letters. The system dynamically adjusts to the selected language, ensuring that the appropriate gesture set is used for recognition and that the spell-checking process is aligned with the linguistic structure of the chosen language. This multilingual capability not only broadens the app's potential user base but also ensures that it can be deployed in diverse linguistic and cultural contexts, furthering its accessibility goals.
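One plausible way to organize this per-language switching (the asset names below are hypothetical) is a configuration that pairs each sign language's gesture model with the dictionary used to seed its BK-Tree:

```kotlin
// Hypothetical per-language configuration: each sign language pairs a gesture
// model with the dictionary that seeds the BK-Tree spell checker.
enum class SignLanguage(val modelAsset: String, val dictionaryAsset: String) {
    ASL("asl_fingerspelling.task", "dict_en.txt"),
    FSL("fsl_fingerspelling.task", "dict_fr.txt"),
    ARSL("arsl_fingerspelling.task", "dict_ar.txt")
}
```

Switching languages then amounts to rebuilding the recognizer with the selected model and reloading the matching dictionary, keeping gesture recognition and spell correction consistently aligned.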
SignifyX was developed with the objective of creating a tool that empowers individuals with communication challenges by making it easier for them to interact with others in both real-time and asynchronous communication settings. By leveraging the latest advancements in gesture recognition, efficient data structures, and mobile computing, the app contributes to ongoing efforts to make communication more inclusive and accessible for all.
In conclusion, SignifyX represents a significant step forward in sign language recognition technology by offering a practical, user-friendly solution that combines real-time and video-based gesture recognition with advanced spell-checking capabilities. Future research will focus on expanding the app's functionality to include more sign languages, improving the precision of gesture recognition, and integrating additional assistive technologies to further enhance its usability.