AnyAi.fyi - Discover ANY AI to make more online for less.

Grok 4 edges out GPT-5 in complex reasoning benchmark ARC-AGI

In the ARC-AGI-2 benchmark, which is designed to measure a language model's general reasoning skills, GPT-5 (High) scored 9.9 percent at a cost of $0.73 per task, according to ARC Prize.
The article "Grok 4 edges out GPT-5 in complex reasoning benchmark ARC-AGI" appeared first on THE DECODER.

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat

Grok 4.1 Fast's compelling dev access and Agent Tools API overshadowed

Elon Musk's frontier generative AI startup xAI formally opened developer access to its Grok 4.1 Fast models (https://x.ai/news/grok-4-1-fast) last n [...]

Match Score: 316.66

venturebeat

xAI launches Grok 4.3 at an aggressively low price and a new, fast, powerfu

While Elon Musk faces off against his former colleague and OpenAI co-founder Sam Altman in [...]

Match Score: 207.34

venturebeat

Musk's xAI launches Grok 4.1 with lower hallucination rate on the web

In what appeared to be a bid to soak up some of Google's limelight prior to the [...]

Match Score: 192.67

venturebeat

Microsoft built Phi-4-reasoning-vision-15B to know when to think — and wh

Microsoft (https://www.microsoft.com/en-us) on Tuesday released [...]

Match Score: 175.68

venturebeat

OpenAI launches GPT-5.4 with native computer use mode, financial plugins fo

The AI updates aren't slowing down. Literally two days after OpenAI launched a new underlying AI model for ChatGPT called [...]

Match Score: 168.01

venturebeat

SpaceX's Grok 4.5 launches at half the price of rivals — here's

Elon Musk's SpaceX (https://www.spacex.com/) released Grok 4.5 (https://x.ai/news/grok-4-5) on Wednesday, the firs [...]

Match Score: 158.71

venturebeat

AI IQ is here: a new site scores frontier AI models on the human IQ scale.

For decades, the IQ test has been one of the most familiar, and most contested, yardsticks for human intelligence. Now, a startup project called [...]

Match Score: 147.73

venturebeat

Musk's xAI launches Grok Business and Enterprise with compelling vault

xAI has launched Grok Business and Grok Enterprise (https://x.ai/news/grok-business), positioning its flagship AI assistant as a secure, team-ready platform [...]

Match Score: 144.61

venturebeat

Samsung AI researcher's new, open reasoning model TRM outperforms mode

The trend of AI researchers developing new, [...]

Match Score: 133.09