Anthropic launches the world’s first “hybrid reasoning” AI model

February 25, 2025

The difference between a conventional model and a reasoning model is similar to the two types of thinking that the Nobel Prize-winning economist Daniel Kahneman described in his 2011 book Thinking, Fast and Slow: fast and instinctive System 1 thinking and slower, more deliberative System 2 thinking.

The kind of model that made ChatGPT possible, known as a large language model or LLM, produces instant answers to a prompt by querying a large neural network. These outputs can be strikingly clever and coherent, but may fail on questions that require step-by-step reasoning, including simple arithmetic.

An LLM can be made to mimic deliberative reasoning if it is instructed to come up with a plan that it must then follow. However, this trick is not always reliable, and models typically struggle with problems that require extensive, careful planning. OpenAI, Google, and now Anthropic are all using a machine learning method known as reinforcement learning to teach their latest models to generate reasoning that points toward correct answers. This requires collecting additional training data from people solving specific problems.
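The plan-then-follow trick described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any vendor implements reasoning: the `Model` type, the prompt wording, and the helper names (`build_plan_prompt`, `build_answer_prompt`, `plan_then_answer`, `make_recording_stub`) are all hypothetical, and the "model" here is a stub that returns canned replies rather than a real LLM.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to a reply string.
Model = Callable[[str], str]


def build_plan_prompt(question: str) -> str:
    """First prompt: ask only for a numbered plan, not the answer."""
    return (
        "Write a short numbered plan for solving the problem below. "
        "Do not solve it yet.\n\n"
        f"Problem: {question}"
    )


def build_answer_prompt(question: str, plan: str) -> str:
    """Second prompt: ask the model to follow the plan it just wrote."""
    return (
        "Follow the plan step by step, showing the work for each step, "
        "then give the final answer on the last line.\n\n"
        f"Problem: {question}\n\nPlan:\n{plan}"
    )


def plan_then_answer(question: str, model: Model) -> str:
    """Two calls: one to draft a plan, one to carry it out."""
    plan = model(build_plan_prompt(question))
    return model(build_answer_prompt(question, plan))


def make_recording_stub(replies: List[str], log: List[str]) -> Model:
    """Stand-in model for demonstration: records prompts, returns canned replies."""
    remaining = iter(replies)

    def stub(prompt: str) -> str:
        log.append(prompt)
        return next(remaining)

    return stub


if __name__ == "__main__":
    prompts_seen: List[str] = []
    stub = make_recording_stub(
        ["1. Multiply 17 by 3.\n2. Add 4.", "17 * 3 = 51\n51 + 4 = 55\n55"],
        prompts_seen,
    )
    print(plan_then_answer("What is 17 * 3 + 4?", stub))
```

In practice the stub would be replaced by a call to a real model. Nothing in this sketch guarantees that the model actually sticks to its plan, which is the unreliability the article points to.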

According to Penn, Claude’s reasoning mode received additional data on business applications, including writing and fixing code, using computers, and answering complex legal questions. “The things that we made improvements on are … technical subjects or subjects which require long reasoning,” says Penn. “What we have from our customers is a lot of interest in using our models in their actual workloads.”

According to Anthropic, Claude 3.7 is particularly good at solving coding problems that require step-by-step reasoning, and it outscores OpenAI’s o1 on some benchmarks, such as SWE-bench. The company is today releasing a new tool, called Claude Code, that was designed specifically for this kind of AI-assisted coding.

“The model is already good at coding,” says Penn. “However, additional thinking would be good for cases that may require very complex planning, for example when you are looking at an extremely large code base for a company.”
