OpenAI Launches Whisper API for Third-Party Developers to Integrate Speech-to-Text at Low Cost

25 Apr 2023

OpenAI has launched a Whisper API, announced alongside its ChatGPT API, that lets third-party developers add speech-to-text to their apps and services at low cost. The Whisper API is a hosted version of the open-source Whisper speech-to-text model, which the company released in September 2022. Whisper is an automatic speech recognition system; the API costs $0.006 per minute and supports transcription in multiple languages, accepting file formats including M4A, MP3, MP4, MPEG, MPGA, WAV, and WEBM.
Although it competes with offerings from Google, Amazon, and Meta, OpenAI says Whisper stands out because it was trained on 680,000 hours of multilingual and "multitask" data collected from the web, which improves its recognition of unique accents, background noise, and technical jargon.
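
The pricing and file formats described above can be sketched in a few lines of Python. This is a minimal illustration, not OpenAI's own tooling: the helper names are hypothetical, the cost function assumes the $0.006 per-minute price is prorated linearly by the second, and the `transcribe` function assumes the `openai` Python package (v1 or later client), the `whisper-1` model name, and an `OPENAI_API_KEY` environment variable, none of which are stated in the article.

```python
from decimal import Decimal
from pathlib import Path

# Price and formats are the figures quoted in the article.
PRICE_PER_MINUTE_USD = Decimal("0.006")
SUPPORTED_EXTENSIONS = {".m4a", ".mp3", ".mp4", ".mpeg", ".mpga", ".wav", ".webm"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def estimate_cost_usd(duration_seconds: float) -> Decimal:
    """Estimate the cost of a clip (assumption: price is prorated linearly)."""
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return PRICE_PER_MINUTE_USD * Decimal(str(duration_seconds)) / Decimal(60)


def transcribe(path: str) -> str:
    """Send one audio file to the hosted model and return the transcript.

    Not called in this sketch: it needs network access, the `openai`
    package, and an OPENAI_API_KEY environment variable.
    """
    if not is_supported(path):
        raise ValueError(f"unsupported audio format: {path}")
    from openai import OpenAI  # imported lazily so the helpers work without it

    client = OpenAI()
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio)
    return result.text


if __name__ == "__main__":
    print(is_supported("interview.MP3"))  # extension check is case-insensitive
    print(estimate_cost_usd(3600))        # one hour of audio
```

At the quoted rate, an hour of audio works out to about $0.36, which is why a quick cost check before submitting a large batch of recordings is cheap insurance.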

Greg Brockman, OpenAI's president and chairman, explained that the Whisper API is an optimized version of the same large model that is available as open source, and that it is much faster and more convenient to use. According to a 2020 Statista survey, the main barriers to enterprise adoption of voice transcription technology are accuracy, accent- or dialect-related recognition issues, and cost.
"Our picture is that we really want to be this universal intelligence," Brockman said. "We really want to, very flexibly, be able to take in whatever kind of data you have and whatever kind of task you want to accomplish and be a force multiplier on that attention."


One limitation of Whisper lies in "next-word" prediction, a consequence of the enormous amount of data the system was trained on. OpenAI cautions that Whisper's transcriptions might include words that weren't spoken, possibly because the model is both trying to predict the next word in the audio and trying to transcribe the recording itself. Whisper's performance also varies by language, with higher error rates for speakers of languages that are less well represented in the training set.
OpenAI expects Whisper's transcription capabilities to be used to enhance existing software, services, tools, and solutions. The AI-powered language learning app Speak is already using the Whisper API to power a new in-app virtual speaking companion. OpenAI's entry into the speech-to-text market could also prove profitable: one estimate puts the market's potential value at $5.4 billion by 2026, up from $2.2 billion in 2021.
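
As a rough check on that market estimate, the two figures imply a compound annual growth rate that can be worked out directly. The sketch below assumes steady compounding over the five years from 2021 to 2026, which the estimate itself does not state; the function name is illustrative.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the article, in billions of US dollars.
rate = implied_cagr(2.2, 5.4, 2026 - 2021)
print(f"{rate:.1%}")
```

Under that assumption, the market would need to grow by just under 20 percent a year to reach the projected figure.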
