The Line Between Smart AI And AGI Just Got Blurry
In early 2024, at our Total Wealth Symposium at the Ritz Carlton in Orlando, I gave a talk on the roadmap for AI over the next few years.
I predicted that before we reach artificial general intelligence (AGI) and then artificial superintelligence (ASI), we’ll ascend through levels of artificial capable intelligence.
This is the stage when AI begins solving complex, multi-step problems, and it will eventually lead to a superintelligence that can improve itself faster than any human could.
I’ve also been saying for a while now that the next big leap in AI won’t just be about bigger models or cheaper tokens.
It will come from smarter reasoning.
The ability to think across disciplines… to form original ideas… to solve problems that don’t have obvious answers.
That’s when we’re going to see real advances in AI.
And now it’s happening.
On Monday, OpenAI rolled out a wave of new models that aren’t just faster or cheaper than its previous models.
GPT-4.1 and its smaller siblings (GPT-4.1 mini and nano) are also smarter.
Like DeepSeek’s R1, these models represent a pivot away from brute-force compute and toward models that do more with less.
For example, GPT-4.1 mini is around 83% cheaper to use than GPT-4o, all while outperforming it on key coding and reasoning tasks.
These models offer faster interactions, and they can handle bigger problems and deliver better results across the board.
But as exciting as the 4.1 line is, I’m even more excited about what OpenAI announced just two days later.
Because it appears to be the next evolution of reasoning machines…
And our first major step toward artificial general intelligence.
PUTTING THE “O” IN REASONING
I’m talking about OpenAI’s “o-series” models, the newest versions of which were released on Wednesday.
For once, OpenAI’s CEO might be underselling what his company just put out.
Because unlike the general-purpose GPT-4 family, the o-series is specifically engineered for reasoning.
Think of these models as purpose-built engines for hard problems. They excel at science, coding, math and multi-step problem-solving.
And they crush the performance of OpenAI’s first reasoning model, o1.
Source: OpenAI
In fact, OpenAI says that o3 is the company’s most powerful reasoning model yet.
But it doesn’t just spit out plausible answers. It actually demonstrates the ability to abstract, to generalize and even to connect ideas across domains.
In other words, it’s doing the kind of cognitive heavy lifting we’ve always imagined when we talk about artificial general intelligence.
As a reminder, AGI is when a machine can match or surpass human capabilities across most cognitive tasks.
And that’s exactly what’s happening here.
On the ARC-AGI benchmark, which is a notoriously difficult test designed to measure general intelligence by emphasizing human-like reasoning over brute memorization, OpenAI’s o1 model struggled to even crack 32%.
But the o3 scored 88%.
That’s not just a good result. It’s above baseline human-level performance.
For context, most STEM grads score in the 90s.
One website that administers weekly IQ-style quizzes to 20 text-based and 6 vision-capable AI models suggests that only about 1% of humans are smarter than o3.
And although IQ isn’t the best measure of a machine’s intelligence, these scores mark a real step change in AI capability.
They show that these machines are already starting to think more like we do.
OpenAI’s o3 has crossed a threshold: it isn’t just solving pre-defined problems, it’s beginning to work out how to approach problems in the first place.
And that’s why I’m so excited about these new models. Because they don’t just regurgitate facts…
They connect them.
This ability is what will elevate AI from being a useful tool to becoming a genuine reasoning partner.
At Argonne National Laboratory, scientists have already used early versions of the o3 model to design complex experiments in hours instead of days, a strong sign that o3 can be a genuine productivity multiplier.
And it has massive implications for a host of industries.
In pharmaceutical R&D, where time is literally money, an AI that can propose new compounds and simulate reaction pathways overnight could accelerate drug discovery by months.
In climate modeling, imagine feeding years of satellite data, topographical maps and atmospheric readings into a reasoning model that can propose new hypotheses about regional climate shifts.
This same AI could then control a simulator to test these hypotheses before a human ever sees them.
In education, tutoring platforms could shift from answering “What’s the derivative of this function?” to exploring “Why does this solution strategy work, and what are its limitations?”
That’s the kind of deeper reasoning students need. And with advanced AI reasoning models, we’ll be able to deliver this level of tutoring at scale.
And it’s going to be a true game-changer for software developers.
These new models can suggest entire system architectures. They can explain why certain trade-offs make sense, and they can even spot edge cases in code that developers might not notice until production.
But advanced reasoning capabilities come with a hefty price tag.
The output of o3 is priced at $40 per one million tokens.
Compare that to GPT-4.1 nano, which costs just 40 cents per one million tokens.
Rumors suggest OpenAI plans to charge up to $20,000 per month for enterprise-grade access to these advanced reasoning tools.
That’s about 1,000 times the price of a standard $20-per-month ChatGPT subscription, an even steeper markup than the 100-fold gap in per-token output costs I just shared.
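To put those numbers in perspective, here’s a rough back-of-the-envelope sketch in Python. The per-token prices, the $20-per-month subscription and the rumored $20,000 enterprise figure come from above; the 50-million-token monthly volume is a made-up assumption purely for illustration.

# Back-of-the-envelope cost comparison using the prices cited above.
# The monthly output-token volume is a hypothetical figure for illustration only.

O3_OUTPUT_PRICE_PER_M = 40.00     # o3 output: $40 per 1M tokens
NANO_OUTPUT_PRICE_PER_M = 0.40    # GPT-4.1 nano output: $0.40 per 1M tokens
CHATGPT_PLUS_MONTHLY = 20.00      # standard ChatGPT subscription price
ENTERPRISE_MONTHLY = 20_000.00    # rumored enterprise-grade access price

monthly_output_tokens_m = 50      # assumption: 50 million output tokens per month

o3_cost = monthly_output_tokens_m * O3_OUTPUT_PRICE_PER_M        # $2,000
nano_cost = monthly_output_tokens_m * NANO_OUTPUT_PRICE_PER_M    # $20

print(f"o3 output cost:       ${o3_cost:,.2f}/month")
print(f"GPT-4.1 nano cost:    ${nano_cost:,.2f}/month")
print(f"Per-token premium:    {O3_OUTPUT_PRICE_PER_M / NANO_OUTPUT_PRICE_PER_M:,.0f}x")   # 100x
print(f"Subscription premium: {ENTERPRISE_MONTHLY / CHATGPT_PLUS_MONTHLY:,.0f}x")         # 1,000x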
Still, it’s a drop in the bucket for companies doing high-stakes research or building mission-critical infrastructure.
Especially if these new reasoning models can do work that previously took entire teams to accomplish.
HERE’S MY TAKE
The speed at which we’re racing toward artificial superintelligence (ASI) is both exciting and a little unsettling.
AGI is the first step. And we’re a lot closer to AGI today than we were last week.
We’ve talked about how 2025 is going to be the year of AI agents. These o-series reasoning models will help make this a reality.
After all, they are already outperforming most humans on graduate-level STEM benchmarks.
And when you combine them with long context windows of up to one million tokens and the ability to manage real-world tools, it seems like we’re about to experience a fundamental change in how knowledge work gets done.
And if this pace of progress keeps up, we could be looking at a very different world in just a year.
Next spring I could be telling you about AI reasoning models that are helping to both plan and run experiments.
We could see hybrid systems where a reasoning model proposes a new material, simulates it and then directs a robot to synthesize it.
Academic publishing could shift from months-long peer review to days of AI-assisted vetting.
Small startups with the right AI models could even out-reason giant R&D teams.
And it could all happen without much human intervention.
I know this sounds like science fiction. But think about where we were only a year ago with AI.
You can see the improvement for yourself in this AI-generated video of Will Smith eating pasta…
I’m simply projecting the logical outcome of what we’re already seeing.
Sometimes the future comes at you fast.
To me, this feels like one of those moments.