
Researchers Develop Human Speech Recognition Model With Deep Neural Networks

Updated on December 9, 2022

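A group of researchers from Germany is exploring a new human speech recognition model based on machine learning and deep neural networks. The model is designed to predict how well listeners, including those with hearing loss, recognize speech in noise, and it could ultimately help improve human speech recognition.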

Hearing aid algorithms are commonly used to improve human speech recognition. They are evaluated through experiments that determine the signal-to-noise ratio at which a certain number of words are recognized, but these experiments are often time-consuming and expensive.
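To make the evaluation criterion concrete, here is a minimal sketch of how a threshold signal-to-noise ratio could be read off a set of listening-test results. It assumes a 50 percent word-recognition target and simple linear interpolation between measurement points; the article does not specify either choice, and the function name `snr_at_target` and the sample numbers are hypothetical, not data from the study.

```python
def snr_at_target(snrs_db, scores, target=0.5):
    """Linearly interpolate the SNR (in dB) at which the score reaches `target`.

    snrs_db: measured signal-to-noise ratios, sorted in ascending order.
    scores:  fraction of words recognized at each SNR (0.0 to 1.0).
    target:  assumed recognition criterion (50 percent here).
    """
    points = list(zip(snrs_db, scores))
    # Walk through neighbouring measurement pairs and find the one
    # whose scores bracket the target.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 != y0:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target score is not bracketed by the measurements")


# Hypothetical measurements for illustration only.
snrs = [-12.0, -9.0, -6.0, -3.0, 0.0]
scores = [0.10, 0.30, 0.60, 0.85, 0.95]
print(snr_at_target(snrs, scores))
```

Each point in such a curve requires a listener to sit through many test sentences, which is why a model that predicts the threshold directly would save time and cost.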

The new model was detailed in research published in The Journal of the Acoustical Society of America.

Predictions for Hearing-Impaired Listeners

Jana Roßbach, one of the study's authors, is from Carl von Ossietzky University.

“The novelty of our model is that it provides good predictions for hearing-impaired listeners for noise types with very different complexity and shows both low errors and high correlations with the measured data,” said Roßbach.
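As a rough illustration of what "low errors and high correlations with the measured data" can mean in practice, the sketch below compares predicted and measured values using root-mean-square error and the Pearson correlation coefficient. These two metrics are an assumption chosen for illustration; the article does not say which measures the researchers used, and the sample values are hypothetical.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured values."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical threshold SNRs in dB, not data from the study.
measured = [-7.0, -5.5, -3.0, -1.0, 2.0]
predicted = [-6.5, -5.0, -3.5, -0.5, 1.5]

print("RMSE (dB):", rmse(predicted, measured))
print("Pearson r:", pearson(predicted, measured))
```

A good predictive model would show a small error alongside a correlation close to 1 across all listener groups and noise types.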

The team used automatic speech recognition (ASR) to calculate how many words per sentence a listener could understand. ASR is widely available and underpins voice assistants such as Alexa and Siri.

The Study and Results

The study involved eight normal-hearing and 20 hearing-impaired listeners. They were exposed to a variety of complex noises that masked the speech, and the hearing-impaired listeners were sorted into three groups according to their level of age-related hearing loss.

With the new model, the researchers could predict the human speech recognition performance of hearing-impaired listeners with differing degrees of hearing loss. They were able to make these predictions for noise maskers that varied in the complexity of their temporal modulation and in their similarity to real speech. This made it possible to observe and analyze each person individually with regard to possible hearing loss.

“We were most surprised that the predictions worked well for all noise types. We expected the model to have problems when using a single competing talker. However, that was not the case,” said Roßbach.

Because the model focuses on single-ear hearing, the team now plans to create a binaural model for two-ear hearing. They also say the new model could be used to predict listening effort or speech quality.


Alex McFarland

Alex McFarland is an AI journalist and writer exploring the latest developments in artificial intelligence. He has collaborated with numerous AI startups and publications worldwide.