ChatGPT, OpenAI's large language model chatbot, has received an upgrade in the form of voice and image capabilities.
OpenAI confirms that the language model offers a new and more intuitive type of interface that allows users to have a voice conversation.
Users can now use voice to engage in a conversation with the ChatGPT assistant, whether that means speaking to the bot on the go or requesting a bedtime story for the family. The tweak even allows family dinner debates to be settled.
Voice and image
Users will now gain the ability to snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it.
When you’re home, you can simply take pictures of your fridge and pantry to figure out what’s for dinner, and prompt follow-up questions such as a step-by-step recipe.
Users can also help a child with a math problem by taking a photo and circling the problem set.
This allows the model to share hints about the math challenge with both the child and the parent.
OpenAI confirms it is rolling out the voice and image features to Plus and Enterprise users over the next two weeks.
The voice feature will be coming to iOS and Android soon.
According to OpenAI:
To get started with voice, head to Settings → New Features on the mobile app and opt into voice conversations. Then, tap the headphone button located in the top-right corner of the home screen and choose your preferred voice out of five different voices.
The new voice capability is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech. We collaborated with professional voice actors to create each of the voices. We also use Whisper, our open-source speech recognition system, to transcribe your spoken words into text.
Listen to voice samples
Users can choose between samples of a recipe, a speech, a poem, and an explanation.
The image feature also lends itself to practical tasks, such as troubleshooting a bicycle issue.
Here are some samples in the pictures below.