Less is more: how "chain of draft" cuts AI costs by 90% while improving performance



A research team at Zoom Communications has developed a breakthrough technique that could dramatically reduce the cost and computing resources required for AI systems to solve complex reasoning problems, potentially changing how enterprises deploy AI at scale.

The method, called chain of draft (CoD), enables large language models (LLMs) to solve problems with minimal words, using as little as 7.6% of the text required by current methods while maintaining or even improving accuracy. The findings were published in a paper last week on the research repository arXiv.

"By reducing verbosity and focusing on critical insights, CoD matches or surpasses CoT (chain of thought) in accuracy while using as little as 7.6% of the tokens, significantly reducing cost and latency across various reasoning tasks," the authors write.

Chain of draft (red) maintains or exceeds the accuracy of chain of thought (yellow) while using dramatically fewer tokens across four reasoning tasks, showing how concise AI reasoning can cut costs without sacrificing performance. (Credit: arxiv.org)

How "less is more" transforms AI reasoning without sacrificing accuracy

CoD is inspired by how humans solve complex problems. Rather than articulating every detail when working through a math problem or a logic puzzle, people typically jot down only the essential information in abbreviated form.

"When solving complex tasks, whether mathematical problems, drafting essays or coding, we often jot down only the critical pieces of information that help us progress," the researchers explain. "By emulating this behavior, LLMs can focus on advancing toward solutions without the overhead of verbose reasoning."

The team tested its approach on numerous benchmarks, including arithmetic reasoning (GSM8K), commonsense reasoning (date understanding and sports understanding) and symbolic reasoning (coin-flip tasks).

In one striking example, Claude 3.5 Sonnet processing sports-related questions cut its average output from 189.4 tokens to just 14.3 tokens, a 92.4% reduction, while simultaneously improving accuracy from 93.2% to 97.3%.

Slashing enterprise costs

"For an enterprise processing 1 million reasoning queries monthly, CoD could cut costs from $3,800 (CoT) to $760, saving over $3,000 per month," writes AI researcher Ajith Vallath Prabhakar in an analysis of the paper.
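The arithmetic behind savings like these is straightforward: monthly output-token cost scales directly with average response length. The sketch below uses the per-query token counts from the paper's sports-reasoning example; the per-token price is an illustrative assumption, not a rate quoted by Prabhakar, so the dollar figures differ from his.

```python
# Back-of-the-envelope comparison of monthly output-token spend for CoT vs. CoD.
# Token counts come from the article's sports-reasoning example (189.4 vs. 14.3
# tokens per query); the price per million tokens is an assumed placeholder.

def monthly_output_cost(queries: int, avg_output_tokens: float,
                        price_per_million_tokens: float) -> float:
    """Dollar cost of output tokens for a month of reasoning queries."""
    return queries * avg_output_tokens * price_per_million_tokens / 1_000_000

QUERIES = 1_000_000          # queries per month, as in Prabhakar's scenario
PRICE = 15.0                 # $ per million output tokens (assumption)

cot_cost = monthly_output_cost(QUERIES, 189.4, PRICE)  # verbose chain of thought
cod_cost = monthly_output_cost(QUERIES, 14.3, PRICE)   # concise chain of draft

print(f"CoT: ${cot_cost:,.0f}/mo  CoD: ${cod_cost:,.0f}/mo")
```

Whatever the actual per-token price, the ratio between the two bills tracks the roughly 92% reduction in output tokens.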

The research comes at a critical moment for enterprise AI deployment. As companies increasingly integrate sophisticated AI systems into their operations, computing costs and response times have emerged as significant barriers to widespread adoption.

Current state-of-the-art reasoning techniques such as chain of thought (CoT), introduced in 2022, dramatically improved AI's ability to solve complex problems by breaking them down into step-by-step reasoning. But this approach generates lengthy explanations that consume substantial computing resources and increase response latency.

"The verbose nature of CoT prompting results in substantial computational overhead, increased latency and higher operational costs," writes Prabhakar.

What makes CoD particularly notable for enterprises is its simplicity of implementation. Unlike many AI advances that require expensive model retraining or architectural changes, CoD can be deployed immediately with existing models through a simple prompt modification.

"Organizations already using CoT can switch to CoD with a simple prompt modification," Prabhakar explains.
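In practice, that switch amounts to swapping one system instruction for another. The sketch below shows what such a prompt swap could look like; the exact instruction wording and the `build_messages` helper are illustrative assumptions in the spirit of the paper, not the authors' verbatim prompts.

```python
# Minimal sketch of switching a chat request from chain-of-thought to
# chain-of-draft prompting. The instruction texts below are illustrative
# approximations of the styles described in the article, not official prompts.

COT_INSTRUCTION = (
    "Think step by step to answer the following question. "
    "Return the final answer after the separator ####."
)

COD_INSTRUCTION = (
    "Think step by step, but keep only a minimum draft for each thinking step, "
    "with five words at most. "
    "Return the final answer after the separator ####."
)

def build_messages(question: str, concise: bool = True) -> list[dict]:
    """Assemble a chat-style request; only the system prompt changes between styles."""
    system = COD_INSTRUCTION if concise else COT_INSTRUCTION
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# The model, weights and serving stack stay the same; the application code
# changes by one flag.
verbose_request = build_messages("If a coin starts heads-up and is flipped 3 times, "
                                "is it heads-up?", concise=False)
draft_request = build_messages("If a coin starts heads-up and is flipped 3 times, "
                               "is it heads-up?")
```

Because nothing about the model or infrastructure changes, teams can A/B test the two instructions on live traffic and compare token usage directly.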

The technique could prove especially valuable for latency-sensitive applications such as real-time customer support, mobile AI, educational tools and financial services, where even small delays can significantly affect the user experience.

Industry experts suggest, however, that the implications extend beyond cost savings. By making sophisticated reasoning more affordable, CoD could bring advanced AI capabilities within reach of smaller organizations and resource-constrained environments.

As AI systems continue to evolve, techniques like CoD highlight a growing emphasis on efficiency alongside raw capability. For companies navigating the rapidly changing AI landscape, such optimizations could prove as valuable as improvements in the underlying models themselves.

"As AI models evolve, optimizing reasoning efficiency will be as critical as improving their raw capabilities," Prabhakar concluded.

The research code and data have been made publicly available on GitHub, allowing organizations to implement and test the approach with their own AI systems.


