Computer vision is a fast-growing field with many novel applications. I wanted to create an app that could process live video and translate the word a user is pointing at. I took many German classes at Northwestern, and it always took me forever to look up individual words that I didn't know. For this project I hoped to speed up that lookup process using computer vision.
There are four major components to this application. First, the application creates a skin color histogram. Second, it uses this histogram to detect skin regions in the video. Third, it finds a fingertip within those skin regions. Fourth, it detects the word at the fingertip. That word is then translated and displayed on screen. I used the OpenCV library with Python for this project, and Tesseract for optical character recognition.
Result of skin color detection:
The red point marks the detected fingertip:
Video of application in use: