OpenAI researchers trained and released Whisper, a neural network that approaches human levels of robustness and accuracy in English speech recognition.
What is Whisper?
Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual, multitask supervised data collected from the web. The researchers show that using such a large and diverse dataset makes the system more robust to accents, background noise, and technical language. Whisper can also transcribe speech in multiple languages and translate from those languages into English. OpenAI is releasing the models and inference code publicly so that others can build useful applications and further the research on robust speech processing.
Architecture
The Whisper architecture is a simple end-to-end approach implemented as an encoder-decoder Transformer. Incoming audio is split into 30-second chunks, converted into a log-Mel spectrogram, and passed to an encoder.
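The chunking step can be sketched as follows. This is a minimal illustration, not Whisper's actual preprocessing code: the official implementation resamples audio to 16 kHz and computes an 80-channel log-Mel spectrogram, while the `split_into_chunks` helper below only shows how a waveform is cut into fixed 30-second pieces, zero-padding the final piece.

```python
import numpy as np

SAMPLE_RATE = 16_000                          # Whisper operates on 16 kHz audio
CHUNK_SECONDS = 30                            # each chunk covers 30 seconds
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS   # 480,000 samples per chunk

def split_into_chunks(audio: np.ndarray) -> list:
    """Split a mono waveform into 30-second chunks, zero-padding
    the last chunk so every chunk has the same length."""
    chunks = []
    for start in range(0, len(audio), CHUNK_SAMPLES):
        chunk = audio[start:start + CHUNK_SAMPLES]
        if len(chunk) < CHUNK_SAMPLES:
            chunk = np.pad(chunk, (0, CHUNK_SAMPLES - len(chunk)))
        chunks.append(chunk)
    return chunks

# 70 seconds of audio yields three equal-length chunks (the last one padded)
audio = np.zeros(SAMPLE_RATE * 70, dtype=np.float32)
chunks = split_into_chunks(audio)
print(len(chunks))          # 3
print(len(chunks[-1]))      # 480000
```

Each padded chunk would then be converted to a log-Mel spectrogram before being fed to the encoder.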
A decoder is trained to predict the corresponding text caption, intermixed with special tokens that direct the single model to perform tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and speech-to-English translation.
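The way a single model is steered between tasks can be sketched with the special tokens described in the Whisper paper. The token names below (`<|startoftranscript|>`, `<|transcribe|>`, `<|translate|>`, `<|notimestamps|>`) come from that paper; the `build_decoder_prompt` helper itself is a hypothetical illustration, not the library's API.

```python
def build_decoder_prompt(language: str, task: str, timestamps: bool = True) -> list:
    """Assemble the sequence of special tokens that tells the single
    decoder which language it is hearing and which task to perform."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    prompt = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        prompt.append("<|notimestamps|>")   # suppress phrase-level timestamps
    return prompt

# Transcribe French audio into French text, with timestamps:
print(build_decoder_prompt("fr", "transcribe"))
# Translate Spanish audio into English text, without timestamps:
print(build_decoder_prompt("es", "translate", timestamps=False))
```

Because the task is encoded in the prompt rather than in the architecture, the same weights handle transcription and translation.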
Existing approaches
Other approaches commonly use smaller, more closely paired audio-text training datasets, or broad but unsupervised audio pretraining. Because Whisper was trained on a large and diverse dataset rather than fine-tuned to any specific one, it does not outperform models that specialize in LibriSpeech, a notoriously competitive speech recognition benchmark. Yet when the researchers measure Whisper's zero-shot performance across many different datasets, they find it is much more robust, making 50% fewer errors than those models.
Conclusion
Approximately one-third of Whisper's audio dataset is non-English, and the model is alternately tasked with transcribing in the original language or translating into English. According to the researchers, this approach is an effective way to learn speech-to-text translation, and it outperforms the supervised state of the art on CoVoST2-to-English translation in a zero-shot setting.
Furthermore, the OpenAI researchers hope that Whisper’s high accuracy and ease of use will enable developers to incorporate voice interfaces into a broader range of applications.
Source: indiaai.gov.in