OpenAI Launches Whisper API for Third-Party Developers to Integrate Speech-to-Text at Reduced Cost

10 Jun 2023

OpenAI has launched a Whisper API that lets third-party developers add speech-to-text to their apps and services at low cost. It was announced alongside a ChatGPT API that is priced significantly below the company's existing language models. The Whisper API is a hosted version of the open-source Whisper speech-to-text model, which the company released in September 2022. It is an automatic speech recognition system that costs $0.006 per minute and supports transcription in multiple languages, accepting file formats such as M4A, MP3, MP4, MPEG, MPGA, WAV, and WEBM.
Although competitors such as Google, Amazon, and Meta offer speech recognition systems of their own, Whisper stands out because it was trained on 680,000 hours of multilingual and "multitask" data collected from the web. This improves its recognition of unique accents, background noise, and technical jargon.
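
To make the pricing and format details concrete, here is a minimal Python sketch that checks whether a file has one of the listed extensions and estimates what a transcription would cost. The function names are hypothetical, the price and format list are taken from the figures quoted above, and the sketch assumes billing is simply proportional to audio length. It does not call the API itself.

```python
from pathlib import Path

# Figures as quoted in this article; check OpenAI's current pricing before relying on them.
PRICE_PER_MINUTE_USD = 0.006
SUPPORTED_EXTENSIONS = {"m4a", "mp3", "mp4", "mpeg", "mpga", "wav", "webm"}


def is_supported_format(filename: str) -> bool:
    """Return True if the file extension is one of the formats listed above."""
    return Path(filename).suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate the cost, assuming billing is proportional to audio duration."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds / 60 * PRICE_PER_MINUTE_USD


if __name__ == "__main__":
    # A hypothetical one-hour recording.
    print(is_supported_format("interview.MP3"))
    print(f"${estimate_cost_usd(3600):.2f}")
```

At the quoted rate, an hour of audio works out to about $0.36.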

OpenAI's president and chairman, Greg Brockman, explained that the Whisper API is an optimized version of the same large model that is available as open source, and that it is much faster and more convenient to use. According to a 2020 Statista survey, the main barriers to enterprise adoption of voice transcription technology are accuracy, accent- or dialect-related recognition issues, and cost.
"Our picture is that we really want to be this universal intelligence," Brockman said. "We really want to, very flexibly, be able to take in whatever kind of data you have and whatever kind of task you want to accomplish and be a force multiplier on that attention."


Whisper has limitations, particularly around "next-word" prediction, which stem from the enormous amount of data the system was trained on. OpenAI cautions that Whisper might include words in its transcriptions that weren't actually spoken, possibly because it is trying both to predict the next word in the audio and to transcribe the recording itself. Whisper's performance also varies by language, with a higher error rate for speakers of languages that are less well represented in the training set.
OpenAI expects Whisper's transcription capabilities to be used to improve existing software, services, tools, and solutions. The AI-powered language learning app Speak is already using the Whisper API to power a new in-app virtual speaking companion. OpenAI's entry into the speech-to-text market could also prove profitable: one estimate puts the potential market value at $5.4 billion by 2026, up from $2.2 billion in 2021.
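
For a sense of scale, the market estimate above implies a compound annual growth rate. The short Python sketch below works it out. The helper name is hypothetical, and the calculation assumes steady year-over-year growth between the two quoted figures.

```python
def implied_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, an end value, and a span in years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # $2.2 billion in 2021 growing to $5.4 billion by 2026, as quoted above.
    rate = implied_annual_growth(2.2, 5.4, 2026 - 2021)
    print(f"{rate:.1%}")
```

Under that assumption, the estimate corresponds to growth of a little under 20% a year.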
