Sign Language Translator

A real-time hand gesture recognition system that translates sign language into text and speech

Real-time hand gesture detection and translation interface in action

Project Overview

The Sign Language Translator is an assistive technology project that bridges the communication gap between sign language users and non-signers. Using computer vision and machine learning, the system recognizes hand gestures in real time and translates them into both written text and spoken audio, enabling seamless communication.

This project demonstrates the practical application of artificial intelligence in solving real-world accessibility challenges. By combining OpenCV for image processing, a decision-tree-based machine learning model for gesture recognition, and Google Text-to-Speech for audio output, the system forms a complete translation pipeline that operates in real time.
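
To make the pipeline concrete, here is a minimal sketch of the inference loop, assuming the trained classifier is stored with pickle as model.pkl and that the feature vector is the flattened (x, y) coordinates of MediaPipe's 21 hand landmarks; both details are illustrative assumptions rather than the project's exact implementation.

```python
# Minimal sketch of the real-time pipeline: webcam frame -> MediaPipe hand
# landmarks -> trained classifier -> predicted sign shown as text.
# "model.pkl" and the 42-value (x, y per landmark) feature layout are assumptions.
import pickle

import cv2
import mediapipe as mp

with open("model.pkl", "rb") as f:          # hypothetical path to the trained model
    model = pickle.load(f)

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV captures BGR frames.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        features = [coord for p in lm for coord in (p.x, p.y)]  # 21 landmarks -> 42 features
        label = model.predict([features])[0]
        cv2.putText(frame, str(label), (30, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
    cv2.imshow("Sign Language Translator", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

The same loop is the natural place to hang the speech output and latency optimizations described under Challenges & Solutions below.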

The translator is designed to be accessible and user-friendly, requiring only a standard webcam for operation. It showcases proficiency in computer vision, machine learning model selection and optimization, and the integration of multiple technologies to create a cohesive solution that can make a meaningful impact on people's lives.

Challenges & Solutions

Model Training & Accuracy
Initial attempts using neural networks for hand gesture recognition resulted in poor accuracy and long training times, and the model struggled to generalize across different lighting conditions and hand positions. Solved by switching to a decision-tree-based model (a Random Forest), which achieved better accuracy with significantly less training time and fewer computational resources.
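
As a rough illustration of that switch, the sketch below trains a scikit-learn RandomForestClassifier on pre-extracted landmark features; the dataset file name and its dictionary layout are assumptions made for the example.

```python
# Hedged sketch of the model described above: a Random Forest trained on
# flattened hand-landmark coordinates. "landmarks.pkl" and its structure
# ({"data": [...], "labels": [...]}) are assumptions, not the project's files.
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

with open("landmarks.pkl", "rb") as f:       # hypothetical pre-extracted dataset
    dataset = pickle.load(f)

X_train, X_test, y_train, y_test = train_test_split(
    dataset["data"], dataset["labels"], test_size=0.2, stratify=dataset["labels"])

model = RandomForestClassifier(n_estimators=100)  # fast to train, no GPU required
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

with open("model.pkl", "wb") as f:           # consumed by the inference loop above
    pickle.dump(model, f)
```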
Text-to-Speech Implementation
The initial text-to-speech implementation using native libraries was unreliable and had poor voice quality across different operating systems; audio output was inconsistent and often failed entirely. Resolved by switching to Google Text-to-Speech (gTTS), which provides consistent, high-quality voice output and works reliably across platforms.
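
A minimal sketch of that approach, assuming gTTS writes to a temporary MP3 that is played by a platform-specific command on a background thread so speech never blocks the video loop; the file name and playback command are assumptions.

```python
# Sketch of the gTTS-based speech output. Playback runs on a daemon thread so
# recognition keeps running while audio plays; the playback command is platform
# dependent (macOS "afplay" shown here only as an assumption).
import os
import threading

from gtts import gTTS

def speak(text: str) -> None:
    tts = gTTS(text=text, lang="en")
    tts.save("speech.mp3")                 # hypothetical temporary file name
    os.system("afplay speech.mp3")         # assumption: swap in your OS's audio player

# Fire-and-forget: translation continues while the phrase is spoken.
threading.Thread(target=speak, args=("hello",), daemon=True).start()
```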
Real-Time Video Processing Latency
Hand gesture detection experienced significant lag during real-time video capture, making the system impractical for continuous translation; the frame processing rate was too slow for a smooth user experience. Fixed by implementing frame skipping and optimizing the image preprocessing pipeline, reducing resolution only where necessary while maintaining detection accuracy. This cut latency from roughly 500 ms to under 100 ms per prediction.
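
The core of the frame-skipping idea can be sketched as follows; the skip interval, the downscaled resolution, and the predict_sign() helper are illustrative assumptions rather than the project's exact values.

```python
# Sketch of frame skipping: run the expensive landmark/prediction step only
# every Nth frame and reuse the last result in between, so display stays smooth.
import cv2

def predict_sign(img):
    """Placeholder for the landmark-extraction + classifier step (see pipeline sketch above)."""
    return "A"

SKIP = 3                      # assumption: process every 3rd frame
cap = cv2.VideoCapture(0)
frame_idx = 0
last_label = ""

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % SKIP == 0:
        small = cv2.resize(frame, (320, 240))   # downscale only for detection
        last_label = predict_sign(small)
    cv2.putText(frame, last_label, (30, 40),
                cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
    cv2.imshow("Sign Language Translator", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
    frame_idx += 1

cap.release()
cv2.destroyAllWindows()
```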

Technologies Used

Python
MediaPipe
NumPy
gTTS
pickle
OpenCV (cv2)
os
threading
scikit-learn

Project Demonstration

Coming Soon...