Despite the intense AI arms race, a multi-model future lies ahead



Every week – sometimes every day – a new state-of-the-art AI model is born into the world. Heading into 2025, the pace at which new models come to market is dizzying, if not exhausting. The roller coaster's curve keeps steepening, and fatigue and wonder have become constant companions. Each release claims to show why one particular model is better than all the others, and endless collections of benchmarks and bar charts fill our feeds as we struggle to keep up.

The number of major foundation models released each year has exploded since 2020
Charlie Giattino, Edouard Mathieu, Veronika Samborska and Max Roser (2023) – “Artificial Intelligence” Published online at OurWorldinData.org.

Eighteen months ago, the vast majority of developers and companies were using a single AI model. Today, the opposite is true. It is rare for a company of meaningful size to confine itself to the capabilities of a single model. Companies fear vendor lock-in, especially for a technology that has quickly become central to both long-term corporate strategy and short-term revenue, and it is increasingly risky for teams to bet exclusively on a single large language model (LLM).

Yet despite this fragmentation, many model providers still maintain that AI will be a winner-take-all market. They claim that the expertise and compute required to train best-in-class models are scarce, defensible, and self-reinforcing. From their perspective, the bubble of AI model creation will eventually collapse, leaving behind a single, massive artificial general intelligence (AGI) model that can do anything and everything. To solely own such a model would be to be the most powerful company in the world. The size of this prize has sparked an arms race for ever more GPUs, with a new zero added to the parameter count of training runs every few months.

Deep Thought, the monolithic AGI from The Hitchhiker’s Guide to the Galaxy
BBC, The Hitchhiker’s Guide to the Galaxy, TV series (1981). Still image used for commentary purposes.

We believe this view is wrong. No single model will rule the universe, not next year and not in the next decade. Instead, the future of AI will be multi-model.

Language models are fuzzy commodities

The Oxford Dictionary of Economics defines a commodity as “a standardized good that is bought and sold on a large scale and whose units are interchangeable.” Language models are commodities in two senses:

  1. The models themselves are becoming increasingly interchangeable for a wider range of tasks.
  2. The research expertise needed to create these models is becoming increasingly distributed and accessible, with frontier labs barely edging one another out and independent researchers in the open-source community close on their heels.
Commodities describing commodities (Credit: Not Diamond)

But as language models commoditize, they do so unevenly. There is a large core of capability for which every model, from GPT-4 to Mistral Small, is perfectly suited. At the same time, the further we move toward the edges and edge cases, the greater the differentiation we see, with some model providers explicitly specializing in code generation, reasoning, retrieval-augmented generation (RAG) or mathematics. This leads to endless hand-wringing, Reddit searching, evaluating, and fine-tuning to find the right model for each job.

AI models commoditize around core capabilities and specialize at the margins. Credit: Not Diamond

So although language models are commodities, they are more accurately described as fuzzy commodities. For many use cases, AI models will be nearly interchangeable, with metrics such as price and latency determining which model to use. But at the frontier of performance, the opposite will happen: models will continue to specialize and differentiate. As an example, DeepSeek-V2.5 is stronger than GPT-4o at coding in C#, despite being a fraction of the size and 50 times cheaper.

These two dynamics – commoditization and specialization – undermine the idea that a single model will be best suited for every possible use case. Rather, they point to an increasingly fragmented AI landscape.

Multi-model orchestration and routing

There is an apt analogy for the market dynamics of language models: the human brain. The structure of the human brain has remained unchanged for 100,000 years, and brains are far more alike than they are different. For most of our time on Earth, most people learned the same things and had similar skills.

But then something changed. We developed the ability to communicate in language – first verbally, then in writing. Communication protocols enable networks, and as humans began networking with one another, we also began to specialize more and more. We were freed from the burden of being generalists in every domain, of being self-sufficient islands. Paradoxically, the collective wealth created by specialization has also made the average person today a far stronger generalist than any of our ancestors.

On a sufficiently wide input space, the universe always trends toward specialization. This holds from molecular chemistry to biology to human society. Given sufficient diversity, distributed systems will always be more computationally efficient than monoliths. We believe the same will hold for AI. The more we can draw on the strengths of multiple models rather than relying on just one, the more those models can specialize, expanding the frontier of capabilities.

Multi-model systems can enable greater specialization, performance and efficiency. Source: Not Diamond

An increasingly important pattern for leveraging the strengths of different models is routing: dynamically sending each query to the model best suited to it, and falling back to cheaper, faster models when doing so does not compromise quality. Routing lets us capture the benefits of specialization – higher accuracy at lower cost and latency – without sacrificing the robustness of generalization.
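The routing idea above can be sketched in a few lines. Everything here is illustrative: the model names, prices, and the crude keyword classifier are assumptions standing in for a real catalog and a learned router, not any actual provider's API.

```python
# Minimal sketch of a model router: classify a query, then dispatch it to
# the cheapest model expected to handle that kind of task well.
# All model names, prices, and keywords are hypothetical.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float   # illustrative pricing, not real
    strengths: frozenset        # task categories the model excels at

CATALOG = [
    Model("small-general", 0.10, frozenset({"chat", "summarize"})),
    Model("code-specialist", 0.30, frozenset({"code"})),
    Model("large-general", 1.00, frozenset({"chat", "summarize", "code", "math"})),
]

def classify(query: str) -> str:
    """Crude keyword classifier, a stand-in for a learned router."""
    lowered = query.lower()
    if any(k in lowered for k in ("def ", "function", "compile", "bug")):
        return "code"
    if any(k in lowered for k in ("integral", "solve", "equation")):
        return "math"
    if "summarize" in lowered:
        return "summarize"
    return "chat"

def route(query: str) -> Model:
    """Pick the cheapest model whose strengths cover the task."""
    task = classify(query)
    candidates = [m for m in CATALOG if task in m.strengths]
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)
```

The design choice is the one the paragraph describes: a specialist wins when the task matches its strength, and the cheap generalist handles everything it is good enough for, so the expensive frontier model is reserved for queries nothing else covers.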

A simple proof of the power of routing is the fact that most of the world’s top models are themselves routers: they are built with mixture-of-experts architectures that route each next-token generation to a handful of expert submodels. If it is true that LLMs are exponentially proliferating fuzzy commodities, then routing is bound to become an essential part of every AI stack.
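A toy forward pass makes the mixture-of-experts idea concrete: a gating network scores every expert for an input, only the top-k experts actually compute, and their outputs are mixed by the gate weights. The dimensions, the random linear "experts", and the top-2 choice are illustrative assumptions, not the internals of any production model.

```python
# Toy mixture-of-experts forward pass. A gate scores all experts for the
# input; only the top-k run, and their outputs are combined by the
# (renormalized) gate weights. Real MoE layers do this per token inside a
# transformer block; here everything is a small random linear map.

import math
import random

random.seed(0)  # deterministic toy weights

DIM, NUM_EXPERTS, TOP_K = 4, 8, 2

# Each "expert" is a DIM x DIM linear map; the gate has one scorer per expert.
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
gate = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

def moe_forward(x):
    # Score every expert, keep only the top-k, renormalize via softmax.
    scores = [sum(g * xi for g, xi in zip(row, x)) for row in gate]
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    exp_scores = [math.exp(scores[i]) for i in top]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    # Only the selected experts compute; outputs are mixed by gate weight.
    out = [0.0] * DIM
    for w, i in zip(weights, top):
        y = matvec(experts[i], x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top
```

The efficiency argument is visible in the loop: only TOP_K of the NUM_EXPERTS submodels run per input, so compute grows with the number of *active* experts while capacity grows with the total.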

There is a view that LLMs will plateau as they reach human-level intelligence, and that once we have fully exploited their capabilities, we will consolidate around a single general model, much as we did with AWS or the iPhone. None of these platforms (or their competitors) has increased its capabilities tenfold in the last few years – so we might as well get comfortable in their ecosystems. We believe, however, that AI will not stop at human-level intelligence; it will go far beyond any limits we can imagine. As it does, like every other natural system, it will become increasingly fragmented and specialized.

We cannot emphasize enough that this fragmentation of AI models is a very good thing. Fragmented markets are efficient markets: they empower buyers, maximize innovation, and minimize costs. And to the extent that we can rely on networks of smaller, more specialized models rather than pushing everything through the internals of a single giant model, we move toward a far safer, more interpretable, and more steerable future for AI.

The greatest inventions have no owner. Ben Franklin’s heirs do not own electricity. Turing’s estate does not own all computers. AI is undoubtedly one of humanity’s greatest inventions; we believe its future will be, and should be, multi-model.

Zack Kass is the former head of go-to-market at OpenAI.

Tomás Hernando Kofman is co-founder and CEO of Not Diamond.
