[Disclosure: NVIDIA is a client of the author.]
There is a growing consensus in the analyst community that we will be moving from PC-based keyboard input to smartphone/digital assistant voice interfaces in 3-5 years. But if you’ve ever been in an office where people are always on the phone, you know the nightmare of sound that can result in common cubicle or even open-plan offices.
If you add folks talking rather than typing to this existing level of ambient noise, it suddenly looks like we’ll either be trying more aggressively to escape and work from home or living with ever-better active noise-cancelling headphones permanently attached to our heads.
Given we are also moving to head-mounted displays and toward wearable computers, that suggests a future world where we are largely isolated from those around us by our own personal cones of silence. This will be particularly interesting for those who want to work on planes, which have, up until now, successfully avoided the move to in-flight phone calls (you may recall that a few decades back some planes actually had pay phones in the seats, which clearly didn’t stand the test of time).
Well, the University of California, San Francisco (UCSF) – using NVIDIA’s AI technology – appears to have come up with a fix: an electrocorticographic link to your brain that allows for silent voice input. Talk about a game-changer!
Eliminating voice sound at scale
If you can eliminate a sound at its source rather than after the fact, it’s both far easier and potentially far cheaper than the alternatives. For instance, if you were to place a microphone in front of your mouth followed by a speaker aimed at your mouth and tied both into an active noise cancellation system you could effectively eliminate most of the noise.
That’s far easier than the array of microphones that are typically required to do the same thing for sounds coming at your ears – because sound waves come from multiple directions, but largely go out in one direction. Even if we use old technology, a good muffler on a car can generally outperform any external noise cancellation technology for a fraction of the cost.
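To make the source-cancellation idea concrete, here is a minimal sketch of the principle, not a model of any real product. It treats the voice as a single pure tone and the speaker's output as an inverted copy of it; the tone frequency, the sample rate, and the `gain` and `delay` imperfections are all assumptions chosen purely for illustration.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)
TONE_HZ = 200       # stand-in for a voice: one pure tone (assumed)


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def cancel(source, gain=1.0, delay=0):
    """Mix the source with a phase-inverted copy of itself.

    gain and delay are hypothetical imperfections in the speaker path:
    gain=1.0 and delay=0 describe a perfectly matched anti-noise signal.
    """
    residual = []
    for i, sample in enumerate(source):
        j = i - delay
        anti_noise = -gain * source[j] if j >= 0 else 0.0
        residual.append(sample + anti_noise)
    return residual


def reduction_db(before, after):
    """Level drop in decibels; infinite when nothing is left."""
    after_rms = rms(after)
    if after_rms == 0.0:
        return math.inf
    return 20 * math.log10(rms(before) / after_rms)


# One second of the stand-in "voice".
voice = [
    math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE)
]

perfect = cancel(voice)                       # ideal anti-noise
imperfect = cancel(voice, gain=0.9, delay=1)  # slightly mismatched anti-noise

print("perfect reduction (dB):", reduction_db(voice, perfect))
print("imperfect reduction (dB):", reduction_db(voice, imperfect))
```

In this toy setup the perfectly matched copy leaves nothing behind, while even a small mismatch in level or timing leaves an audible residue. Real voices are far messier than one tone, which is part of why practical systems rely on adaptive filtering rather than a fixed inverted copy.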
So, what if you could talk without making a sound? This is what the neurological work at UCSF was able to demonstrate. By monitoring neural activity and feeding it through an NVIDIA AI engine, the researchers can convert electrocorticographic signals from the brain to sound or text. Granted, you would probably lose some voice inflection…but if you can translate to sound, then any good speech-to-text converter can translate the result to text.
Current performance is only 10 words a minute – roughly 15 times slower than the 150 words a minute we typically speak at – but it shows that the technology is possible. With better sensors and improved AI performance there’s no doubt they can significantly increase the system output over time.
The result wouldn’t just benefit PC use: think of what the tech could do to eliminate noise from phone calls. If you no longer have to make sound you could have the system create a voice, even mimic your own, and have silent conversations over the phone that only the person at the other end of the call could hear.
Think of being able to make or take calls in noisy environments, in conferences or during meetings. There’s no ambient noise because the system isn’t capturing sound, so the other side gets a clean voice. And because the call starts out as digital information, voice-to-text is a natural outcome, so you could automatically get a text record of the call. You could also dynamically switch between voice and text if the listener is in an excessively noisy environment. With a head-mounted display, you could even have a conversation at a rock concert from the front row, because sound is no longer a requirement or a detriment.
A whole new world
Being able to speak without sound is a huge game-changer. It makes head-mounted computers truly viable as a general use platform, because they suck with keyboards and mice. It potentially allows for cell phone use on planes without the downside of sound. And, even for regular PC use, it provides a voice option that really isn’t an option in today’s cubicle-based or open-plan offices.
The tech would also be a boon in areas that require documented communications (government, stock-trading, some litigation). For instance, it could eventually result in a better solution than what is currently used by most court reporters.
Finally, this will likely be a huge help even in its current form for those who have certain disabilities, providing many who can’t speak (or type) with a voice. While the ability to speak without speaking sounds like a Zen thing, it could fundamentally change how we interact with computers and, in many cases, each other.
This article is published as part of the IDG Contributor Network.