Audio and Video

Video: This headphone AI can block ambient noise to improve focus

29 May 2024

Researchers at the University of Washington have developed an artificial intelligence (AI) system that allows headphone users to cancel all other sounds and focus on the voice of a target person in a noisy environment.

Called Target Speech Hearing, the AI system lets a user wearing headphones look at a person speaking for three to five seconds to enroll them. The system then plays the enrolled speaker’s voice through the headphones in real time, even as the listener moves around in noisy places and no longer faces the speaker.

While noise-canceling headphones have become ubiquitous in the audio marketplace, and many automatically adjust sound levels for wearers, isolating a single conversation from surrounding noise has remained harder to master.

AI is being used across multiple sectors for a variety of tasks, but this development could usher in a new feature for popular wireless headphones like AirPods and other earbuds.

“We tend to think of AI now as web-based chatbots that answer questions,” said Shyam Gollakota, a UW professor in the Paul G. Allen School of Computer Science & Engineering. “But in this project, we develop AI to modify the auditory perception of anyone wearing headphones, given their preferences. With our devices you can now hear a single speaker clearly even if you are in a noisy environment with lots of other people talking.”

How it works

A person wearing headphones fitted with microphones taps a button while directing their head at someone talking. Because the wearer is facing the speaker, the sound waves from that speaker’s voice should reach the microphones on both sides of the headset simultaneously, which is how the Target Speech Hearing system identifies the target voice.
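The article does not publish the alignment code, but the basic check it describes is a time-difference-of-arrival test: if the wearer is facing the speaker, the left and right microphone signals line up. A minimal sketch in Python, assuming 16 kHz stereo capture; the function names, mic spacing, and constants are illustrative, not from the UW paper:

```python
import numpy as np

SAMPLE_RATE = 16_000          # Hz, assumed capture rate
MIC_SPACING = 0.18            # m between ear-level mics (illustrative)
SPEED_OF_SOUND = 343.0        # m/s

def estimate_itd(left: np.ndarray, right: np.ndarray) -> float:
    """Estimate the inter-aural time difference in seconds via cross-correlation."""
    corr = np.correlate(left, right, mode="full")
    lag_samples = np.argmax(corr) - (len(right) - 1)
    return lag_samples / SAMPLE_RATE

def is_facing_speaker(left: np.ndarray, right: np.ndarray,
                      margin_deg: float = 16.0) -> bool:
    """True if the dominant voice arrives within margin_deg of straight ahead."""
    # A source at angle theta from straight ahead gives ITD ~ (d / c) * sin(theta).
    max_itd = (MIC_SPACING / SPEED_OF_SOUND) * np.sin(np.radians(margin_deg))
    return abs(estimate_itd(left, right)) <= max_itd
```

The default tolerance reflects the 16° margin of error reported below.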

UW said this alignment has a 16° margin of error. The headphones send the captured signal to an on-board embedded computer, where the AI software learns the desired speaker’s vocal patterns. The system then latches onto that speaker’s voice and continues to play it back to the listener, even as the pair moves around. And as the speaker keeps talking, the system gathers more training data on the voice, improving its ability to isolate it.
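Hypothetically, the "enroll, then latch on" loop the article describes could look like the sketch below. The toy embed_voice and extract_target functions stand in for the neural speaker encoder and separation network running on the embedded computer; they, the 512-sample framing, and the running-average update are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

FRAME = 512  # samples per audio frame (illustrative)

def embed_voice(audio: np.ndarray) -> np.ndarray:
    """Toy speaker 'embedding': the average magnitude spectrum of the audio.
    Stands in for the neural speaker encoder in the real system."""
    usable = audio[: len(audio) // FRAME * FRAME].reshape(-1, FRAME)
    return np.abs(np.fft.rfft(usable, axis=1)).mean(axis=0)

def extract_target(frame: np.ndarray, embedding: np.ndarray) -> np.ndarray:
    """Toy stand-in for the separation network: softly emphasize
    frequency bins that dominate the enrolled voice's spectrum."""
    mask = embedding / (embedding.max() + 1e-9)
    return np.fft.irfft(np.fft.rfft(frame) * mask, n=FRAME)

def run_headset(mic_frames, enrollment_clip, update_rate=0.05):
    """Enroll once, then keep extracting (and re-learning) the target voice."""
    embedding = embed_voice(enrollment_clip)        # the 3-5 s look-at step
    for frame in mic_frames:                        # real-time mixture frames
        target = extract_target(frame, embedding)
        # As the speaker keeps talking, fold the newly extracted speech
        # back into the embedding (an assumed running-average scheme).
        embedding = (1 - update_rate) * embedding + update_rate * embed_voice(target)
        yield target                                # played back to the listener
```

The yield-per-frame structure mirrors the real-time constraint: each frame must be processed and played back before the next one arrives.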

UW tested the AI system on 21 subjects, who on average rated the enrolled speaker’s voice as nearly twice as clear as the unfiltered audio. If sound quality remains poor, the wearer can run another enrollment on the speaker to improve clarity.
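The article does not state which clarity metric underlies "nearly twice as clear." In speech-separation work, such gains are often quantified with a power-ratio measure like the scale-invariant signal-to-distortion ratio (SI-SDR); the sketch below shows that standard metric as an illustrative assumption, not UW's evaluation code. Doubling a power ratio corresponds to a gain of about 3 dB.

```python
import numpy as np

def si_sdr(estimate: np.ndarray, reference: np.ndarray) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (standard definition)."""
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference          # projection onto the clean reference
    noise = estimate - target           # everything else counts as distortion
    return 10 * np.log10(np.dot(target, target) / np.dot(noise, noise))

# A "twice as clear" power ratio corresponds to 10 * log10(2) ~ 3 dB.
```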

The next steps are to expand the system to earbuds and hearing aids, as well as to enable enrolling more than one speaker at a time.

The full research can be found in the proceedings of the ACM CHI '24 conference.

To contact the author of this article, email PBrown@globalspec.com

