The AI Model Wars Are a Mess: What GPT-5.4 vs Gemini 2.5 Actually Tells Us
OpenAI released GPT-5.4 on March 5. Google's Gemini 2.5 Pro counters on context and price. Benchmarks are split. The real story is that the model wars have become marketing wars — and the velocity is creating more uncertainty than innovation.
Key Points
• OpenAI released GPT-5.4 on March 5 at $2.50 per million input tokens and $10 per million output tokens, with a 1-million-token context window. Google's Gemini 2.5 Pro counters with a 2-million-token context window and faster inference speeds. Benchmarks are split: GPT-5.4 leads on reasoning, coding, and mathematical tasks, while Gemini 2.5 Pro wins on context length, multimodal processing, and raw throughput. Neither model is a clear winner across the board. [1][2]
• The deeper problem isn't which model is better; it's that benchmarks have become marketing tools rather than useful measures. Every new release tops the leaderboard on something because every company cherry-picks the metrics that favor their architecture. [2][3]
• Pricing is diverging in telling ways. OpenAI's GPT-5.4 costs roughly 2x what Gemini 2.5 Pro charges for equivalent token volumes, but OpenAI is betting that superior reasoning quality justifies the premium. Google is pursuing volume: cheaper tokens, longer contexts, and tighter integration with its cloud ecosystem. [1][2]
• The release cycle itself has become unsustainable. GPT-5.4 launched March 5. Gemini 2.5 Pro has been iterating since late February. Claude's Opus 4 shipped weeks ago. The gap between the frontier model and last month's model has collapsed to weeks. For developers building products on these APIs, this velocity creates more uncertainty than innovation. [1][4]
The scoreboard nobody can read
OpenAI dropped GPT-5.4 on March 5, and the reaction from the AI community was immediate, predictable, and completely contradictory.
Half the internet declared it the most powerful language model ever built. The other half pointed out that Google's Gemini 2.5 Pro beats it on several benchmarks and costs less. Both halves were right. And that's exactly the problem. [1][2]
Here's what we know about the raw numbers. GPT-5.4 comes in at $2.50 per million input tokens and $10 per million output tokens. It handles a 1-million-token context window — enough to process an entire novel or codebase in a single prompt. On reasoning tasks, mathematical problem-solving, and code generation, it consistently outperforms every other publicly available model. OpenAI's internal benchmarks show it acing graduate-level science questions and complex multi-step logic puzzles that trip up competitors.
Google's Gemini 2.5 Pro fires back with a 2-million-token context window — double GPT-5.4's — and noticeably faster inference speeds. On benchmarks measuring long-context retrieval, multimodal understanding (text plus images plus code), and sustained coherence across massive documents, Gemini wins. Its pricing is also more aggressive, running roughly half of what OpenAI charges per token. [2]
So who wins? It depends entirely on which benchmarks you look at, which tasks you care about, and — increasingly — which company's blog post you read last.
This is the part the press releases don't mention: AI benchmarks have become functionally useless as comparative tools.
Every major model release in 2026 has been accompanied by a carefully curated set of benchmark results showing the new model leading in key areas. OpenAI publishes results on MMLU, HumanEval, MATH, and its own internal reasoning suites. Google publishes results on MMLU, BIG-Bench, long-context needle-in-a-haystack tests, and multimodal challenges. Anthropic publishes results on safety evaluations, instruction-following, and nuanced reasoning tasks. [3][4]
The overlap between these benchmark sets is surprisingly small. And where they do overlap, the margins are often within measurement noise — a point or two on a 100-point scale. The result is that every company can truthfully claim their model is best at something, because they've each optimized for slightly different slices of the evaluation space.
This isn't necessarily dishonest. GPT-5.4 really is better at complex multi-step reasoning. Gemini 2.5 Pro really does handle longer contexts more gracefully. These are real differences that matter for real workloads. But the way benchmarks are presented — as definitive scorecards with clear winners — obscures the more important question: what do actual developers and businesses choose, and why?
The independent AI evaluation community has been sounding this alarm for months. Leaderboards like Chatbot Arena, which rank models based on blind human preferences rather than automated benchmarks, show much tighter races. When real users compare model outputs side by side without knowing which model produced which response, the differences between GPT-5.4, Gemini 2.5 Pro, and Claude Opus 4 shrink dramatically. [4]
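For anyone who hasn't looked under the hood of those leaderboards, the mechanics are simple: collect a large number of blind A-versus-B votes and fit a rating to each model. The sketch below uses a plain Elo update to illustrate the idea; the model names and votes are invented, and Chatbot Arena's real pipeline fits a more careful statistical model over vastly more comparisons.

```python
# Illustrative Elo-style ratings from blind pairwise votes (all data invented).

K = 32  # how far a single vote can move a rating

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def record_vote(ratings: dict, winner: str, loser: str) -> None:
    """Nudge both ratings toward the outcome of one blind comparison."""
    e_win = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += K * (1.0 - e_win)
    ratings[loser] -= K * (1.0 - e_win)

ratings = {"model-a": 1000.0, "model-b": 1000.0, "model-c": 1000.0}
votes = [("model-a", "model-b"), ("model-b", "model-c"),
         ("model-a", "model-c"), ("model-b", "model-a")]

for winner, loser in votes:
    record_vote(ratings, winner, loser)

print(sorted(ratings.items(), key=lambda kv: -kv[1]))
```

The takeaway is that ratings a few points apart are within the noise of this process, which is why blind side-by-side races look so much tighter than the vendors' own scorecards suggest.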
The truth that nobody wants to say out loud: for 90% of production use cases — chatbots, summarization, content generation, basic code assistance — the top five models are functionally interchangeable. The differences that matter live at the edges: the last 5% of reasoning quality, the ability to handle 500-page documents without losing coherence, the speed at which tokens stream back to a user's screen.
AI benchmarks have become marketing tools — every company publishes results on the metrics that favor their architecture.
The pricing war tells the real story
Forget the benchmarks for a moment and look at the money. That's where the real strategic divergence becomes clear.
OpenAI is pricing GPT-5.4 as a premium product. At $2.50/$10 per million tokens, it's roughly twice as expensive as Gemini 2.5 Pro for comparable workloads. OpenAI's bet is that customers who need the best reasoning quality — law firms analyzing contracts, financial institutions modeling risk, enterprise software companies building AI-powered features — will pay the premium because the quality delta matters in their use cases. [1][2]
Google is making the opposite bet. Cheaper tokens, longer context windows, tighter integration with Google Cloud services, and the implicit promise that Gemini's pricing will keep dropping as Google's custom TPU hardware gets more efficient. Google isn't trying to win the best-model race on every benchmark. It's trying to win the default-model race: the model that becomes the infrastructure layer everyone builds on because it's good enough, cheap enough, and already integrated into the stack they use.
These strategies mirror the broader history of enterprise technology. Microsoft didn't win the enterprise by building the best product in every category; it won by shipping good-enough software that was already installed on every computer in the office. Google didn't win search by being marginally better than Yahoo; it won by being faster, simpler, and eventually embedded in everything.
If OpenAI is building the BMW of AI models — premium, performant, expensive — Google is building the Toyota: reliable, affordable, and everywhere. The question is whether the AI market develops like the car market (where both can thrive) or like the search market (where the default platform takes 90% of the volume and everyone else fights over the rest).
The velocity problem
Here's what should genuinely concern anyone building products on top of these models: the release cycle has become absurdly fast.
GPT-5.4 launched March 5. The previous version, GPT-5, shipped in January. Google's Gemini 2.5 Pro has been iterating since late February, with point releases every few weeks. Gemini 3.1 Preview is already showing up on evaluation leaderboards, suggesting a full release within weeks. Anthropic shipped Claude Opus 4 in February and Claude Sonnet 4 shortly after. [1][4]
The time between "this is the frontier model" and "this is yesterday's model" has collapsed from years to months to, in some cases, weeks. For researchers pushing the boundaries of AI capability, this is thrilling. For developers building production applications, it's a nightmare.
Consider what happens when you're building a customer-facing application on GPT-5. You spend three months fine-tuning prompts, testing edge cases, calibrating your product's behavior. You ship. Two months later, GPT-5.4 drops with different characteristics — slightly better reasoning but different output formatting, different sensitivity to certain prompt patterns, different failure modes. Do you upgrade? Do you retest everything? Do you maintain support for both versions?
Now multiply that by three providers (OpenAI, Google, Anthropic), each releasing major updates on overlapping timelines. The combinatorial complexity for any team trying to stay on the frontier becomes unmanageable.
This is why a growing number of enterprise AI teams are doing something that sounds paradoxical: they're deliberately choosing not to use the latest model. They're pinning to a specific version that's good enough, locking their prompts and evaluation suites, and treating model upgrades like database migrations — infrequent, carefully planned, and thoroughly tested.
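What that pinning looks like varies by team, but the shape is usually the same: the model identifier lives in version-controlled configuration, and no code path ever asks a provider for whatever is newest. A minimal sketch, with hypothetical model and evaluation identifiers, might look like this:

```python
# Sketch of pinning a model like a dependency (identifiers are hypothetical).

from dataclasses import dataclass

@dataclass(frozen=True)
class ModelPin:
    provider: str
    model_id: str          # an exact, dated snapshot, never an alias like "latest"
    max_output_tokens: int
    approved_by_eval: str  # tag of the evaluation run that signed off on this pin

# Changing this pin should require code review plus a fresh evaluation run.
PINNED = ModelPin(
    provider="openai",
    model_id="gpt-5-2026-01-15",
    max_output_tokens=2048,
    approved_by_eval="regression-suite-2026-02-10",
)

def completion_request(prompt: str) -> dict:
    """Build request parameters from the pinned config rather than any default."""
    return {
        "model": PINNED.model_id,
        "max_tokens": PINNED.max_output_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
```

An upgrade to GPT-5.4, or to anything else, then becomes a single reviewed change to the pin plus a re-run of the evaluation suite, which is exactly the database-migration discipline described above.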
The AI model wars are producing genuine improvements in capability. Nobody disputes that. But the marketing war around those improvements, with its breathless leaderboard updates, "we're number one" blog posts, and benchmark cherry-picking, is creating noise that obscures signal. And for the people actually building things with these models, noise is the enemy.
Enterprise AI teams are increasingly pinning to older model versions rather than chasing the latest release.
What actually matters for choosing a model
Strip away the marketing and the benchmarks. Here's what's actually driving model selection in March 2026:
Cost per task, not cost per token. A model that costs 2x per token but completes a task in one pass instead of three is cheaper in practice. GPT-5.4's superior reasoning means fewer retries on complex tasks. Gemini's lower token price means cheaper high-volume simple tasks. The right answer depends on your workload, not the price sheet; the sketch after this list walks through the arithmetic.
Context window in practice, not in theory. Gemini's 2-million-token window sounds impressive, but most production applications rarely need more than 100,000 tokens of context. Where long context does matter — legal document analysis, codebase understanding, research synthesis — Gemini has a genuine edge. But for the chatbot handling customer support tickets? The difference between 1 million and 2 million tokens is academic.
Ecosystem lock-in. If you're already deep in Google Cloud, Gemini's integrations are seamless. If you're building on Azure, OpenAI's models are the path of least resistance. If you're independent and multi-cloud, Anthropic's Claude offers a neutral option. The model quality differences are smaller than the switching costs.
Reliability and uptime. The differentiator nobody talks about: the model that doesn't go down. OpenAI has had several high-profile outages in 2026. Google's Vertex AI platform has been more stable but occasionally throttles heavy users. For production workloads, 99.9% uptime matters more than a 2% improvement on MMLU.
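To make the cost-per-task point from the first item concrete, here is the back-of-the-envelope arithmetic with the price points quoted earlier. The token counts and retry rates are invented for illustration; substitute your own workload's numbers.

```python
# Cost per completed task, not cost per token (workload numbers are assumptions).

def cost_per_task(price_in_per_m: float, price_out_per_m: float,
                  tokens_in: int, tokens_out: int, avg_attempts: float) -> float:
    """Expected dollar cost to finish one task, counting retried attempts."""
    one_pass = (tokens_in / 1e6) * price_in_per_m + (tokens_out / 1e6) * price_out_per_m
    return one_pass * avg_attempts

# Premium pricing ($2.50 / $10 per million tokens), assumed to succeed in one pass.
premium = cost_per_task(2.50, 10.00, tokens_in=8_000, tokens_out=1_500, avg_attempts=1.0)

# Half-price tokens, assumed to need three passes on the same complex task.
cheaper = cost_per_task(1.25, 5.00, tokens_in=8_000, tokens_out=1_500, avg_attempts=3.0)

print(f"premium model: ${premium:.4f} per task")   # $0.0350
print(f"cheaper model: ${cheaper:.4f} per task")   # $0.0525
```

Under these assumptions the twice-as-expensive model is cheaper per completed task; flip the retry assumptions for a high-volume, low-difficulty workload and the ranking flips with them. The workload decides, not the price sheet.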
The real competition isn't models — it's platforms
The GPT-5.4 versus Gemini 2.5 debate is a useful intellectual exercise, but it misses the bigger game. The competition that will determine who wins the AI era isn't about which model scores highest on benchmarks. It's about which company builds the platform that becomes the default infrastructure for AI-powered applications.
OpenAI is building that platform with its API, GPT Store, and enterprise partnerships. Google is building it through Vertex AI, Workspace integrations, and Android. Anthropic is building it through Claude's API and a growing enterprise sales operation. Each is trying to make their model the foundation layer that other companies build on — the way AWS became the foundation layer for cloud computing.
When AWS won the cloud wars, it wasn't because EC2 was the best virtual machine. It was because AWS was the most complete platform, with the most services, the most documentation, and the most developers who already knew how to use it. The same dynamic is playing out in AI. The model is important, but it's becoming the least differentiated part of the stack.
GPT-5.4 is genuinely impressive. Gemini 2.5 Pro is genuinely impressive. In six months, both will be eclipsed by their successors. The question worth asking isn't "which model is better?" It's "which platform are you building your future on?" And that question has far less to do with benchmark scores than anyone in this industry wants to admit.