Scientists have managed to build a computer program that can read people’s lips with more accuracy than trained experts.
Researchers at the University of Oxford built the software, called LipNet, which can read lips correctly 93.4% of the time.
In the research paper, the team claim their program is the first to perform sentence-level sequence prediction. Previous lip-reading work, they say, focused on classifying single words instead.
The goal is to automate lip-reading and the team’s list of potential uses includes “improved hearing aids, silent dictation in public spaces, covert conversations, speech recognition in noisy environments, biometric identification, and silent-movie processing.”
The computer was trained on around 30,000 three-second videos of men and women speaking short sentences. For each video, the machine extracted a mouth-centred crop and learned to match the different movements of the lips to the words being spoken.
The program was then compared with three hearing-impaired people who can lip-read. They averaged a word error rate of 47.7%, which works out to about 52.3% accuracy and is far below the machine’s score.
The software learned to lip-read from a dataset known as GRID, in which every sentence follows a set grammar: “command + colour + preposition + letter + digit + adverb”. More research is therefore needed on everyday speech, which doesn’t naturally follow this format, before the software can be released to the world.
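To make the fixed sentence format concrete, here is a minimal Python sketch that generates and checks sentences with the six-slot structure quoted above. The word lists are the ones commonly described for the GRID corpus and should be treated as an assumption, and the names `GRID_SLOTS`, `random_grid_sentence` and `follows_grid_grammar` are hypothetical; none of this comes from the LipNet software itself.

```python
import random
import string

# Six slots in the order quoted in the article.
# The vocabularies are an assumption based on common descriptions of GRID.
GRID_SLOTS = [
    ("command", ["bin", "lay", "place", "set"]),
    ("colour", ["blue", "green", "red", "white"]),
    ("preposition", ["at", "by", "in", "with"]),
    # Assumed: every letter except "w".
    ("letter", [c for c in string.ascii_lowercase if c != "w"]),
    ("digit", ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]),
    ("adverb", ["again", "now", "please", "soon"]),
]


def random_grid_sentence(rng):
    """Pick one word per slot, in order, and join them into a sentence."""
    return " ".join(rng.choice(words) for _, words in GRID_SLOTS)


def follows_grid_grammar(sentence):
    """Return True if the sentence has exactly one allowed word per slot."""
    words = sentence.lower().split()
    if len(words) != len(GRID_SLOTS):
        return False
    return all(word in allowed
               for word, (_, allowed) in zip(words, GRID_SLOTS))


if __name__ == "__main__":
    rng = random.Random(0)
    print(random_grid_sentence(rng))
    print(follows_grid_grammar("place blue at f two now"))
    print(follows_grid_grammar("how are you today"))
```

A sentence such as “place blue at f two now” fits the format, while ordinary conversation does not, which is the gap the researchers still have to close.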