Amazon is trying to transplant Alexa’s “brain” with generative AI



Amazon is preparing to relaunch its voice-controlled digital assistant Alexa as an artificial intelligence “agent” capable of handling practical tasks, as the tech giant works through the challenges that have complicated the system’s AI overhaul.

The $2.4 trillion company has spent the past two years trying to redesign Alexa, its conversational system built into 500 million consumer devices worldwide, transplanting the software’s “brains” with generative AI.

Rohit Prasad, head of the artificial general intelligence (AGI) team at Amazon, told the Financial Times that the voice assistant still has to overcome several technical hurdles before it can be rolled out.

These include solving the problem of “hallucinations” or fabricated answers, response speed or “latency,” and reliability. “Hallucinations must be close to zero,” Prasad said. “It is still an open issue in the industry, but we are working hard on it.”

Amazon executives’ vision is to transform Alexa, currently used for a limited number of simple tasks like playing music and setting alarms, into an “agentic” product that acts as a personalized concierge. This can include everything from restaurant suggestions to configuring bedroom lighting based on a person’s sleep cycles.

The redesign of Alexa has been in the works since the launch of OpenAI’s Microsoft-backed ChatGPT in late 2022. While Microsoft, Google, Meta and others have quickly embedded generative AI into their computing platforms and improved their software services, critics have questioned whether Amazon can resolve its technical and organizational difficulties in time to compete with its rivals.

According to several employees who have worked on Amazon’s voice assistant teams in recent years, the effort has been beset by complications and follows years of AI research and development.

Several former employees said the long wait for a rollout was largely due to the unexpected difficulty of replacing and combining the simpler, predefined algorithms that underpin Alexa with more powerful but unpredictable large language models.

In response, Amazon said it is “working hard to enable even more proactive and expert support” from its voice assistant. It added that a technical implementation of this scale into a live service and a range of devices used by customers around the world was unprecedented and not as simple as overlaying an LLM onto the existing Alexa service.

Prasad, Alexa’s former chief architect, said the release of Amazon’s in-house Nova models last month – led by his AGI team – was motivated in part by the specific speed, cost and reliability requirements of AI-powered applications like Alexa, to help them “reach the last mile, which is really difficult.”

To function as an agent, Alexa’s “brain” must be able to call hundreds of third-party software and services, Prasad said.

“Sometimes we underestimate how many services are built into Alexa, and it’s a huge number. These applications receive billions of requests every week. So if you are trying to implement reliable actions quickly. . . you need to be able to do this in a very cost-effective way,” he added.
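As a rough illustration of the “agentic” pattern Prasad describes, a model choosing which of many registered services to invoke while the runtime validates that choice before executing it, the Python sketch below uses entirely hypothetical names and stand-in logic; it is not Amazon’s implementation.

```python
# Minimal sketch of an "agentic" dispatch loop: a planner picks one of many
# registered services and the runtime invokes it. All names and services
# here are hypothetical stand-ins, not Amazon's actual interfaces.
from typing import Any, Callable, Dict

# Registry of third-party "skills"/services the agent may call.
TOOLS: Dict[str, Callable[..., Any]] = {}

def tool(name: str):
    """Decorator that registers a callable under a tool name."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return register

@tool("set_alarm")
def set_alarm(time: str) -> str:
    return f"Alarm set for {time}"

@tool("book_table")
def book_table(restaurant: str, guests: int) -> str:
    return f"Requested a table for {guests} at {restaurant}"

def plan(utterance: str) -> dict:
    """Stand-in for the LLM planner: in a real system this would be a model
    inference returning structured output (tool name plus arguments)."""
    if "alarm" in utterance:
        return {"tool": "set_alarm", "args": {"time": "7:00"}}
    return {"tool": "book_table", "args": {"restaurant": "Trattoria", "guests": 2}}

def dispatch(utterance: str) -> str:
    """Validate the planned call before executing it; at billions of requests
    a week, unchecked model output is too risky to run blindly."""
    call = plan(utterance)
    fn = TOOLS.get(call["tool"])
    if fn is None:
        return "Sorry, I can't do that yet."   # planner picked a non-existent tool
    try:
        return fn(**call["args"])
    except TypeError:
        return "Sorry, something went wrong."  # planner produced bad arguments

print(dispatch("wake me up with an alarm"))
```

The validation step before execution reflects the reliability and cost concerns Prasad raises: the cheaper option is to reject a bad plan up front rather than let an unpredictable call fail downstream.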

The complexity arises because Alexa users expect quick answers as well as an extremely high level of accuracy. Such qualities are at odds with the inherent probabilistic nature of today’s generative AI, statistical software that predicts words based on speech and language patterns.
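A toy example of that probabilistic behaviour: a generative model assigns probabilities to candidate next words and samples from them, so the same prompt can yield different answers on different runs. The probability table below is purely illustrative.

```python
# Toy illustration of why generative answers are probabilistic: the model
# assigns probabilities to candidate next words and samples from them, so
# identical prompts can produce different (occasionally wrong) answers.
# The words and weights are invented for illustration only.
import random

next_word_probs = {
    "7 a.m.": 0.6,        # plausible continuation
    "8 a.m.": 0.3,        # also plausible
    "next Tuesday": 0.1,  # unlikely but possible: a fabricated answer
}

prompt = "Your alarm is set for"
for _ in range(3):
    word = random.choices(
        list(next_word_probs), weights=list(next_word_probs.values())
    )[0]
    print(prompt, word)
```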

Some former employees also point to the difficulty of preserving the assistant’s original characteristics, including its consistency and functionality, while giving it new generative capabilities such as creativity and free-flowing dialogue.

Because of the more personalized and conversational nature of LLMs, the company also plans to hire experts to shape the AI’s personality, voice and diction so that it remains familiar to Alexa users, according to a person familiar with the matter.

A former senior member of the Alexa team said that while LLMs were highly capable, they carried risks, such as producing answers that were “sometimes completely made up.”

“At the scale at which Amazon operates, this could happen many times a day,” they said, damaging its brand and reputation.

In June, Mihail Eric, a former machine learning scientist at Alexa and founding member of its Conversational Modeling Team, said publicly that Amazon had “dropped the ball” on becoming the “clear leader in conversational AI” with Alexa.

Eric said that despite its strong scientific talent and “huge” financial resources, the company was “plagued with technical and bureaucratic problems,” suggesting that “the data was poorly annotated” and “the documentation was either non-existent or outdated.”

According to two former employees who worked on Alexa-related AI, the legacy technology underlying the voice assistant was inflexible and difficult to change quickly, burdened by a clunky and disorganized code base and an engineering team “spread too thin.”

The original Alexa software, built on technology acquired from British start-up Evi in 2012, was a question-and-answer machine that searched a defined universe of facts to find the right answer, such as the day’s weather or a specific song in your music library.

The new Alexa uses a range of different AI models to recognize, translate and generate responses to voice requests, as well as to detect policy violations such as inappropriate responses and hallucinations. Developing software to translate between the legacy systems and the new AI models has been a major obstacle to the Alexa-LLM integration.
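A minimal sketch of that kind of pipeline, speech recognition, LLM response generation, a policy check, and an adapter translating free-form output back into a legacy intent format, might look as follows. All class and function names are assumptions for illustration, not Amazon’s actual interfaces.

```python
# Rough sketch of the pipeline described above: recognize speech, generate a
# response with an LLM, screen it for policy violations, and adapt the result
# back into a legacy intent format. Every name here is hypothetical.
from dataclasses import dataclass

@dataclass
class LegacyIntent:
    """Assumed shape of a response in the older, rule-based stack."""
    intent: str
    slots: dict
    speech: str

def recognize_speech(audio: bytes) -> str:
    return "what's the weather today"            # ASR stand-in

def generate_response(text: str) -> str:
    return "It looks sunny with a high of 18C."  # LLM stand-in

def violates_policy(text: str) -> bool:
    """Stand-in policy check for inappropriate or fabricated content."""
    banned = {"made-up fact", "inappropriate"}
    return any(term in text.lower() for term in banned)

def to_legacy(text: str) -> LegacyIntent:
    """Adapter layer: wrap free-form LLM output so downstream legacy
    components (device rendering, skills routing) can still consume it."""
    return LegacyIntent(intent="GeneralAnswer", slots={}, speech=text)

def handle(audio: bytes) -> LegacyIntent:
    text = recognize_speech(audio)
    reply = generate_response(text)
    if violates_policy(reply):
        reply = "Sorry, I can't help with that."
    return to_legacy(reply)

print(handle(b"...").speech)
```

The adapter is the interesting part: it is one way to bridge a free-form generative model and older components that expect structured intents, which is the sort of translation layer the article identifies as a major obstacle.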

The models include Amazon’s in-house software, including the latest Nova models, as well as Claude, the AI model from start-up Anthropic, in which Amazon has invested $8 billion over the past 18 months.

“The biggest challenge with AI agents is ensuring they are safe, reliable and predictable,” Anthropic CEO Dario Amodei told the FT last year.

Agent-like AI software needs to get to the point “where. . . people can actually have confidence in the system,” he added. “Once we get to that point, we will release those systems.”

A current employee said more steps are needed, such as adding child safety filters and testing custom integrations with Alexa, such as smart lights and the Ring doorbell.

“The problem is reliability – it’s supposed to work almost 100 percent of the time,” the employee added. “That’s why you see us. . . or Apple or Google ship slowly and incrementally.”

Numerous third parties developing “skills” or features for Alexa said they were unsure when the new generative AI-enabled device would come to market and how they might create new features for it.

“We are waiting for the details and understanding,” said Thomas Lindgren, co-founder of Swedish content developer Wanderword. “When we started working with them, they were much more open. . . then they changed over time.”

Another partner said that after an initial period of “pressure” Amazon put on developers to prepare for the next generation of Alexa, things had quieted down.

An ongoing challenge for Amazon’s Alexa team – which was hit by major layoffs in 2023 – is making money. Figuring out how to make the assistants “cheap enough to work at scale” will be a big task, said Jared Roesch, co-founder of generative AI group OctoAI.

Options being discussed include creating a new Alexa subscription service or taking a cut of sales of goods and services, a former Alexa employee said.

Prasad said Amazon’s goal is to develop a variety of AI models that could serve as “building blocks” for a variety of applications beyond Alexa.

“We are always oriented towards customers and practical AI. We don’t do science for the sake of science,” Prasad said. “We do this. . . to deliver customer value and impact, which is more important than ever in the age of generative AI because customers want to see a return on investment.”


