Meta takes some big AI swings at Meta Connect 2024
Screenshot by David Gewirtz/ZDNET
Mark Zuckerberg took the stage at Meta Connect 2024 and came out strong in the categories of VR/AR and AI. There’s a lot of mixing of these technologies, particularly in the Meta glasses line discussed elsewhere on ZDNET.
In this article, though, we’ll dig into several powerful and impressive announcements related to the company’s AI efforts.
Multimodal large language model
Zuckerberg announced the availability of Llama 3.2, which adds multimodal capabilities. In particular, the model can understand images.
He compared Meta’s Llama 3.2 large language models with other LLMs, saying Meta “differentiates itself in this category by offering not only state of the art models, but unlimited access to those models for free, and integrated easily into our different products and apps.”
Meta AI is Meta’s AI assistant, now based on Llama 3.2. Zuckerberg stated that Meta AI is on track to be the most used AI assistant globally, with almost 500 million monthly active users.
To demonstrate the model’s understanding of images, Zuckerberg opened an image on a mobile device using the company’s image-edit capability. Meta AI was able to change the image, modifying a shirt to tie-dye or adding a helmet, all in response to simple text prompts.
Meta AI with voice
Meta’s AI assistant is now able to hold voice conversations with you from within Meta’s apps. I’ve been using a similar feature in ChatGPT and found it useful when two or more people need to hear the answer to a question.
Zuckerberg claims that AI voice interaction will be bigger than text chatbots, and I agree — with one caveat. Getting to the voice interaction has to be easy. For example, to ask Alexa a question, you simply speak into the room. But to ask ChatGPT a question on the iPhone, you have to unlock the phone, go into the ChatGPT app, and then enable the feature.
Until Meta has devices that just naturally listen for speech, I fear even the most capable voice assistants will be constrained by inconvenience.
You can also give your AI assistant a celebrity voice. Choose from John Cena, Judi Dench, Kristen Bell, Keegan-Michael Key, and Awkwafina. Natural voice conversation will be available in Instagram, WhatsApp, Messenger, and Facebook, and is rolling out today.
Meta AI Studio
Next up are some features Meta has added to its AI Studio chatbot creation tool. AI Studio lets you create a character (either an AI based on your interests or an AI that “is an extension of you”). Essentially, you can create a chatbot that mirrors your conversational style.
But now Meta is diving into the realm of uncanny valley deepfakes.
Until this announcement, AI Studio offered only a text-based interface. But Meta is releasing a version that is “more natural, embodied, interactive.” And when it comes to “embodied”, they’re not kidding around.
In the demo, Zuckerberg interacted with a chatbot modeled on creator Don Allen Stevenson III. This interaction appeared to be a “live” video of Stevenson, fully tracking head motion and lip animations. Basically, he could ask Robot Don a question and it looked like the real guy was answering.
Powerful, freaky, and unnerving. Plus, creating malicious chatbots using other folks’ faces seems a distinct possibility.
AI translation
Meta seems to have artificial lip-sync and facial movements nailed down. They’ve reached a point where they can make a real person’s face move and speak generated words.
Meta has extended this capability to translation. They now offer automatic video dubbing on Reels, in English and Spanish. That feature means you can record a Reel in Spanish, and the social network will play it back in English, and it will look like you’re speaking English. Or you can record in English and it will play back in Spanish, as if you’re speaking in Spanish.
In the demo, creator Ivan Acuña spoke in Spanish, but the dub came back in English. As with the previous example, the video was nearly perfect and it looked like Acuña had been recorded speaking English originally.
Llama 3.2
Zuckerberg came back for another dip into the Llama 3.2 model. He said the multimodal nature of the model has increased the parameter count considerably.
Another interesting part of the announcement was the much smaller 1B and 3B models optimized to work on-device. This effort will allow developers to create more secure and specialized models for custom apps that live right in the app.
Both of these models are open source, and Zuckerberg was touting the idea that Llama is becoming “the Linux of the AI industry”.
Finally, a bunch more AI features were announced for Meta’s AI glasses. We have another article that goes into those features in detail.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.