Everyone’s favorite chatbot can now see, hear, and speak. On Monday, OpenAI announced new multimodal capabilities for ChatGPT. Users can now have voice conversations or share images with ChatGPT in real time.
Audio and multimodal capabilities have become the next frontier in the fierce generative AI competition. Meta recently launched AudioCraft for generating music with artificial intelligence, and Google Bard and Microsoft Bing have rolled out multimodal features in their chat experiences. Just last week, Amazon unveiled a revamped version of Alexa powered by its own LLM (large language model), and even Apple is experimenting with AI-generated voice through its Personal Voice feature.
Voice features will be available on iOS and Android. As with Alexa or Siri, you can tap to talk to ChatGPT, and it will respond aloud in one of five voice options. Unlike current voice assistants, ChatGPT is powered by a more advanced LLM, so you’ll hear the same kind of conversational and creative responses that OpenAI’s GPT-4 and GPT-3.5 can produce in text. One example OpenAI shared in the announcement is generating a bedtime story from a voice prompt, so exhausted parents at the end of a long day can outsource their creativity to ChatGPT.
Multimodal recognition has been anticipated for some time, and it is now rolling out in a user-friendly way in ChatGPT. When GPT-4 was released last March, OpenAI demonstrated its ability to understand and interpret images and handwritten text. Now it will be part of everyday ChatGPT use. Users can upload an image of something and ask ChatGPT about it, such as identifying a cloud or putting together a meal plan from a photo of the contents of their fridge. The multimodal feature will be available on all platforms.
As with any development in generative AI, there are serious ethics and privacy issues to consider. To mitigate the risk of audio deepfakes, OpenAI says it only uses its voice recognition technology for the specific use case of voice chat. In addition, the voice options were created with voice actors the company says it “worked directly” with. That said, the announcement didn’t mention whether users’ voices can be used to train the model when they opt in to voice chat. Regarding ChatGPT’s multimodal capabilities, OpenAI says it has “taken technical measures to significantly limit ChatGPT’s ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy.” But the true test against nefarious uses won’t come until the features are released into the wild.
Voice chat and images will roll out to ChatGPT Plus and Enterprise users within the next two weeks, and to all users “shortly thereafter.”