How Meta uses generative AI to understand user intent

Meta – parent company of Facebook, Instagram, WhatsApp, Threads and more – operates one of the largest recommendation systems in the world.

In two recent papers, researchers demonstrated how generative models can be used to better understand and respond to user intent.

By framing recommendations as a generative problem, these systems can approach the task in richer and more efficient ways than traditional methods. The approach can be useful for any application that needs to retrieve documents, products, or other types of items.

Dense vs. generative retrieval

The standard approach to building recommendation systems is to compute, store, and retrieve dense representations of documents. For example, to recommend items to users, an application trains a model that computes embeddings for both users and items, then builds a large store of item embeddings.

At inference time, the recommendation system tries to understand the user’s intent by finding one or more items whose embeddings are similar to the user’s. This approach requires ever more storage and compute as the number of items grows, because every item embedding must be stored and every recommendation request involves comparing the user embedding against the entire item store.
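To make the trade-off concrete, here is a minimal sketch of dense retrieval in Python. The encoders are random stand-ins rather than trained models, and the function names are hypothetical; the point is that every catalog item keeps a stored embedding, and every request scores the user embedding against all of them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained item encoder: real systems store embeddings
# produced by a two-tower model. The store grows linearly with the catalog.
NUM_ITEMS, DIM = 10_000, 64
item_embeddings = rng.normal(size=(NUM_ITEMS, DIM))
item_embeddings /= np.linalg.norm(item_embeddings, axis=1, keepdims=True)

def embed_user(interacted_item_ids: list[int]) -> np.ndarray:
    """Toy user encoder: average of the items the user interacted with."""
    vec = item_embeddings[interacted_item_ids].mean(axis=0)
    return vec / np.linalg.norm(vec)

def recommend_dense(interacted_item_ids: list[int], k: int = 10) -> np.ndarray:
    """Score every catalog item against the user embedding, return the top-k."""
    user_vec = embed_user(interacted_item_ids)
    scores = item_embeddings @ user_vec   # one comparison per stored item
    return np.argsort(-scores)[:k]

print(recommend_dense([3, 42, 7]))
```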

Dense retrieval (Source: arXiv)

Generative retrieval is a newer approach that tries to understand user intent and make recommendations by predicting the next item in a sequence rather than searching a database. Generative retrieval does not require storing item embeddings, and its inference and storage costs remain constant as the list of items grows.

The key to making generative retrieval work is computing “semantic IDs” (SIDs), which capture the contextual information of each item. Generative retrieval systems such as TIGER work in two phases. First, an encoder model is trained to create a unique embedding value for each item based on its description and properties. These embedding values become the item’s SID and are stored with it.
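The article does not detail how an embedding becomes a SID, so the sketch below uses a simplified residual-style quantizer with hypothetical codebooks, purely for illustration, that turns an item embedding into a short tuple of discrete tokens usable as a SID.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the content encoder: in practice a trained model maps an
# item's description and attributes to a dense vector.
def encode_item_content(description_features: np.ndarray) -> np.ndarray:
    return description_features / np.linalg.norm(description_features)

# Hypothetical codebooks: each level maps part of the embedding to a discrete
# code, so an item's SID becomes a short tuple of tokens, e.g. (12, 5, 40).
NUM_LEVELS, CODES_PER_LEVEL, DIM = 3, 64, 32
codebooks = rng.normal(size=(NUM_LEVELS, CODES_PER_LEVEL, DIM))

def semantic_id(embedding: np.ndarray) -> tuple[int, ...]:
    """Residual-style quantization: pick the closest code at each level."""
    residual, sid = embedding, []
    for level in range(NUM_LEVELS):
        codes = codebooks[level]
        idx = int(np.argmin(np.linalg.norm(codes - residual, axis=1)))
        sid.append(idx)
        residual = residual - codes[idx]
    return tuple(sid)

item_embedding = encode_item_content(rng.normal(size=DIM))
print(semantic_id(item_embedding))   # short discrete code stored with the item
```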

Generative retrieval (Source: arXiv)

In the second phase, a transformer model is trained to predict the next SID in an input sequence. The list of input SIDs represents the user’s interactions with previous items, and the model’s prediction is the SID of the item to recommend. Generative retrieval removes the need to store and search individual item embeddings. It also improves the ability to capture deeper semantic relationships within the data and provides other benefits of generative models, such as changing the sampling temperature to adjust the variety of recommendations.
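As a rough illustration of this second phase, the toy sketch below samples the next SID token from a stand-in model and shows how temperature controls the variety of what gets recommended. The logits are random here; a real system would decode complete SIDs from a trained transformer.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 64 * 3   # toy SID-token vocabulary (codes from every codebook level)

# Toy stand-in for the trained transformer: in practice it returns logits over
# SID tokens conditioned on the user's interaction history encoded as SIDs.
def next_sid_logits(sid_history: list[int]) -> np.ndarray:
    return rng.normal(size=VOCAB)

def sample_next_sid_token(sid_history: list[int], temperature: float = 1.0) -> int:
    """Higher temperature flattens the distribution -> more varied picks."""
    logits = next_sid_logits(sid_history) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(VOCAB, p=probs))

history = [17, 3, 52, 9, 41, 8]   # SID tokens of previously consumed items
print(sample_next_sid_token(history, temperature=0.7))   # more focused
print(sample_next_sid_token(history, temperature=1.5))   # more exploratory
```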

Advanced generative retrieval

Despite the lower storage and inference costs, generative retrieval has some limitations. For example, it tends to overfit to the items it saw during training, which means it struggles with items added to the catalog after the model has been trained. In recommendation systems, this is often referred to as the “cold start problem,” which concerns users and items that are new and have no interaction history.

To address these shortcomings, Meta has developed a hybrid recommendation system called LIGER that combines the computational and storage efficiency of generative retrieval with the robust embedding quality and ranking capabilities of dense retrieval.

During training, LIGER uses both a similarity score and next-token objectives to improve the model’s recommendations. During inference, LIGER selects several candidates with the generative mechanism and supplements them with a few cold-start items, which are then ranked using the embeddings of the generated candidates.
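The inference flow described above can be sketched roughly as follows. All the pieces are toy stand-ins (random embeddings, a random candidate generator, hypothetical item IDs) meant only to show generative candidate selection, cold-start augmentation, and embedding-based ranking chained together, not LIGER’s actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 32

# Toy catalog: warm items 0-99 plus two cold-start items the generator never saw.
catalog_embeddings = {i: rng.normal(size=DIM) for i in range(100)}
cold_start_items = [100, 101]
catalog_embeddings.update({i: rng.normal(size=DIM) for i in cold_start_items})

def generate_candidates(sid_history: list[int], n: int = 8) -> list[int]:
    """Toy stand-in for generative retrieval over SIDs (random picks here)."""
    return [int(i) for i in rng.choice(100, size=n, replace=False)]

def recommend_hybrid(sid_history: list[int], user_embedding: np.ndarray, k: int = 5):
    # 1) Generative step: draw candidates by decoding SIDs.
    candidates = generate_candidates(sid_history)
    # 2) Supplement with cold-start items the generator cannot produce.
    candidates += cold_start_items
    # 3) Dense step: rank the pooled candidates by embedding similarity.
    scores = {c: float(catalog_embeddings[c] @ user_embedding) for c in candidates}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend_hybrid([17, 3, 52], rng.normal(size=DIM)))
```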

LIGER combines generative and dense retrieval (Source: arXiv)

The researchers note that “merging dense and generative retrieval methods holds enormous potential for the advancement of recommender systems,” and as models evolve, “they will become increasingly practical for real-world applications, enabling more personalized and responsive user experiences.”

In a separate paper, the researchers introduce Mender (multimodal preference discerner), a multimodal generative retrieval method that enables generative models to capture implicit preferences from users’ interactions with different items. Mender builds on SID-based generative retrieval and adds components that can enrich recommendations with user preferences.

Mender uses a large language model (LLM) to translate user interactions into specific preferences. For example, if the user praised or complained about a particular item in a review, the model summarizes this into a preference for that product category.

The main recommendation model is trained to condition on both the sequence of user interactions and the user’s preferences when predicting the next semantic ID in the input sequence. This gives the recommendation model the ability to generalize, perform in-context learning, and adapt to user preferences without being explicitly trained on them.
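A rough sketch of that idea is below, with toy stand-ins for both the LLM preference extractor and the preference-conditioned retriever; the preference strings, scoring logic, and function names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the LLM step: in practice a language model summarizes
# organic signals (reviews, clicks) into natural-language preferences.
def extract_preference(review_text: str) -> str:
    if "love" in review_text.lower():
        return "user prefers lightweight trail-running shoes"   # hypothetical output
    return "no strong preference detected"

# Toy stand-in for the preference-conditioned retriever: it scores next-SID
# tokens given both the preference text and the SID history. The "conditioning"
# here is a deterministic bias, used only to show the preference changes output.
def preference_conditioned_logits(preference: str, sid_history: list[int],
                                  vocab: int = 192) -> np.ndarray:
    bias = len(preference) % 7
    return rng.normal(size=vocab) + (np.arange(vocab) % 7 == bias) * 2.0

def recommend_next_sid(review_text: str, sid_history: list[int]) -> int:
    preference = extract_preference(review_text)
    logits = preference_conditioned_logits(preference, sid_history)
    return int(np.argmax(logits))

print(recommend_next_sid("I love how light these shoes are", [17, 3, 52]))
```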

“Our contributions pave the way for a new class of generative retrieval models that open the possibility of using organic data to drive recommendations via textual user preferences,” the researchers write.

Mender recommendation framework (Source: arXiv)

Impact on enterprise applications

The efficiencies offered by generative retrieval systems can have important implications for enterprise applications. These advances translate into immediate practical benefits, including lower infrastructure costs and faster inference. The technology’s ability to keep storage and inference costs constant regardless of catalog size makes it particularly valuable for growing companies.

The benefits extend across industries, from e-commerce to enterprise search. Generative retrieval is still in its early stages, and we can expect new applications and frameworks to emerge as it matures.


