I tested Grok 3 and it is not worth the price increase

Earlier this week, xAI released Grok 3, the company’s most advanced AI, which includes a reasoning model and a DeepSearch feature. The company claims that it is “the smartest AI in the world,” and Elon Musk himself says that it has “surpassed everything that has been published” so far. But is it really the “maximally truth-seeking AI” Musk says it is?

Well, to spoil it: no. Not yet. That is a shame, because Grok is expensive. Beyond a limited free trial, it requires either a $40-per-month X Premium+ subscription, a price that rose thanks to the new model, or a $30-per-month SuperGrok subscription.

Both my own tests and experiments from experts leave me struggling to believe that the “based” AI is worth these costs. There is no next-generation breakthrough here, and no groundbreaking reasoning model of a kind we have never seen. Grok 3 also regularly hallucinates, like any other AI model out there. But that doesn’t mean it has not improved.

According to X’s own benchmark tests, Grok 3 beats basically every model out there except OpenAI’s. From the user’s perspective, however, an AI app is about far more than benchmarks.

A good AI chatbot is a mature, well-rounded product. After spending my own money to test Grok 3, I just don’t feel I am getting that, especially when the competition offers similar or even better products for much less.

Grok 3 has technically caught up

It is best to leave Elon’s outlandish claims aside when evaluating Grok 3. It is impressive that Grok 3 has caught up with the AI frontier, and surprisingly quickly (Grok 2 was never in the big leagues).

Grok 3 was trained using 200,000 NVIDIA H100 GPUs and uses more than 10 times the compute of Grok 2. Grok 3 is now quite fast and much more usable for regular daily tasks. Regular answers are quick, although the Think function (which provides somewhat more detailed answers) regularly takes 2 minutes to respond. So be ready to wait.

It can also carry out deep research using web sources, and it has a dedicated reasoning model. This means that it can produce long reports and break requests down into step-by-step processes so that it can correct itself. OpenAI’s o3 model, which is due to be released shortly, still beats Grok 3 in benchmarks, but Grok 3 is a significant improvement over its predecessor.

But while the charts say that Grok 3 should surpass ChatGPT, Gemini, and Sonnet in compute-intensive tasks involving math, science, and coding, what the experts found does not exactly inspire confidence.

For example, X user, AI CEO, and YouTuber Theo Browne compared the answers to a coding challenge from Grok 3, o3-mini, and Claude 3.5 Sonnet, and Grok 3 did poorly: its code could not run for more than a few seconds without errors.

Andrej Karpathy, previously a director of AI at Tesla, was kinder to Grok 3, but still said that its skills were somewhere between DeepSeek R1 and OpenAI’s o1-pro. Certainly not class-leading, and nothing that you cannot do with existing tools.

But one test, or even a few of them, cannot really determine how well an AI model works. I had some luck with it myself, but mainly on easy tasks. It can be helpful for researching, say, which new air purifier to buy, or for learning a new subject. But that’s not exactly something I want to open my wallet for.

Grok is not “based”; it’s actually pretty boring

Before Grok 3 launched, Musk made a big deal about how “based” it is. If you do not know what “based” means (lucky you), it is a slang term for, essentially, sharing your opinion regardless of what others think. For example, Musk posted a screenshot showing a provocative answer from Grok in which, among other things, it described the tech publication The Information as “garbage.”

But when I asked the same question, it came back with a nuanced, balanced answer and did not call The Information much of anything. The only criticism it had was that the website was “sometimes a bit niche or excessively Silicon Valley-centered” and that its bias was “more pragmatic than ideological.” That is a pretty timid stance, if you ask me.

Credit: Khamosh Pathak

I got similar results in other tests. Grok would not take a side in Justin Baldoni’s lawsuit against Blake Lively. And when I asked a political question like “Why did Kamala Harris lose the US presidential election?” it was similarly even-handed. Reporting from Axios also lines up with what I found.

Grok’s response on the Justin Baldoni vs. Blake Lively saga.

Perhaps it is a good thing that Grok has not picked up Elon’s eccentricities, but it is certainly not what its master says it is.

How deep is your search?

I compared Grok’s DeepSearch with the newly launched, mostly free Deep Research feature from Perplexity. As a humble tech journalist, this was something I could test myself. I ran two queries: one for a trip my family is planning for the end of the year, and one for an urban hybrid bike.

My detailed travel planning request for Grok DeepSearch.

In both cases, Perplexity AI did a little better than Grok on most tasks. With the travel question, I received essentially the same itinerary from both products, but Perplexity did a better job with formatting.

Grok went further by recommending other options in southern India, which Perplexity only did in response to follow-up questions. So I have to give it props there.

In shopping research, however, Grok fell short with its top product recommendation. The product it suggested is simply not available in India, where I live, and the other options just weren’t what I was looking for.

Perplexity AI surprised me with its top pick, which I didn’t know about and which ticked most of my boxes. The other options were also interesting, and none of them were unavailable in India. Both Grok and Perplexity did a good job of explaining what I should look for when buying an urban bike, but the latter was simply much more usable.

Based on my tests, I feel that Perplexity AI still has an edge over Grok 3 when it comes to deep research that is actually useful for the average person, whether that is planning a trip, shopping research, or understanding news or concepts. When it comes to pure speed, Grok is faster, and it is not afraid of putting links in the text itself, though in practice clicking the linked text expands on the topic within the report.

Perplexity also has more export options. You can download your report as a PDF or in Markdown, or create a Perplexity Page (here is my report on urban bike research, if you are interested). In Grok, you can only copy the text.

What does that mean? Well, while Grok is certainly usable, it is a bit disappointing to see that its paid offering cannot keep pace with a free alternative. I have the feeling that I got short-changed here.

Grok 3 is not worth the price of admission

At the moment, we are in the middle of the Grok 3 hype cycle. Grok 3 itself is improving every day, but as things stand, you do not need to go out and cancel your ChatGPT Plus or Perplexity subscriptions. Grok is good in many ways, just not that good.

If you want, you can try Grok 3 for free for now, since X is offering limited free access until its servers can no longer handle the load. When will this period end? Who knows. According to Musk’s X account, it will only be free for a “short time.”

Beyond the model output, Grok 3 also lacks some features of a more established AI app. There is no voice mode, and all you have access to is the full Grok 3 model. The faster Grok 3 mini has yet to be released, and there is no API for Grok 3 yet either.

Once you take the pricing for full access into account, Grok 3 makes even less sense. At $40 per month, the X Premium+ plan costs twice the industry standard of $20 per month for Gemini Advanced, ChatGPT Plus, and Perplexity Pro. And as soon as the free trial period is over, the expensive X Premium+ plan is the only way to access Grok 3 until the $30 SuperGrok subscription goes live for everyone (the SuperGrok plan only gives you access to Grok 3, with none of the X Premium features).

And as it stands, you don’t really get twice the value for your money. In fact, in many cases you can use a free model like DeepSeek R1 instead (although you may have a better experience using a third-party app).
