OpenAI's ChatGPT is moving beyond its text-centric origins. On September 25, OpenAI announced that it would add voice and image capabilities to ChatGPT.
Since its introduction roughly nine months prior, ChatGPT has become a technological sensation, enabling users to produce essays, poems, and summaries using text prompts. The latest update will make ChatGPT even more dynamic, allowing users to engage in voice interactions with the AI assistant.
The announcement came on the same day that Amazon said it would invest up to $4 billion in Anthropic, a competitor to OpenAI. The investment reflects escalating competition in the generative AI sector, where tech giants such as Google, Meta, and Microsoft are vying for dominance.
OpenAI's recent move signifies a pivotal moment for generative AI, as it merges voice assistant technology with its robust large language models (LLMs). For example, users can verbally prompt ChatGPT to craft an impromptu bedtime story or pose questions and receive spoken responses.
Additionally, ChatGPT will now support image-based queries, allowing users to upload photos and seek explanations or instructions related to the image.
The voice feature is powered by a new text-to-speech model that can produce lifelike audio from text and a few seconds of sample speech. OpenAI worked with professional voice actors to create five distinct voices. In the other direction, it uses Whisper, its open-source speech recognition system, to transcribe the user's spoken words into text.
Spotify is also using the technology in a pilot feature that lets podcasters translate their shows from English into languages such as Spanish, French, or German while preserving their original voice. The rollout is cautious, however: the pilot is limited to a select group of podcasters, including Dax Shepard and Lex Fridman.
In a blog post, OpenAI highlighted the potential of this voice technology for creative and accessibility-focused applications. The company also warned of potential misuse, such as impersonation or fraud.
These features will roll out to Plus and Enterprise subscribers in the coming weeks. To enable voice, users open the app's "Settings," select "New Features," and opt in to voice conversations. Voice will initially be available as an opt-in beta in the ChatGPT Android and iOS apps, while image input will be available on all platforms.
In the rapidly advancing world of AI-driven tools like ChatGPT, it's essential to have a reliable partner to navigate the digital landscape. At Band of Coders, we're dedicated to offering tailored tech solutions that align with such advancements. If you're looking to harness the power of AI or require guidance in the digital domain, our team is here to assist. Reach out to us for a streamlined digital journey.