OpenAI launches GPT-5.4 in a bid to win back user trust
OpenAI just dropped GPT-5.4, the latest version of its artificial intelligence model, and the timing of this release says a lot about the company’s strategy. After going through some serious turbulence, including the loss of roughly 1.5 million users from ChatGPT over a controversial partnership with the United States Department of Defense, Sam Altman’s company needs a strong response to regain ground.
The new model arrives promising to be, in the company’s own words, the most capable and efficient frontier model for professional work ever created by the organization. According to OpenAI, GPT-5.4 brings together advances in reasoning, coding, and agentic workflows into a single unified model. In practical terms, that means 33% fewer errors in individual responses and an overall mistake rate that is 18% lower compared to GPT-5.2, plus a previously unseen ability to operate computers autonomously. The big question hanging in the air, though, is whether purely technical improvements in performance can resolve a crisis that, for many users, has its roots in ethical and trust-related concerns.
What actually changes with GPT-5.4
The main differentiator OpenAI highlights in this update is the drastic reduction in hallucination rates — those moments when the model makes up information or delivers completely wrong answers with unshakable confidence. With GPT-5.4, the company says individual responses are 33% less likely to contain errors compared to GPT-5.2, while the overall probability of making mistakes dropped by 18%.
This kind of improvement might seem incremental at first glance, but it makes a huge difference in the daily lives of people who rely on the tool for work. Professionals who use ChatGPT for tasks like research, technical writing, data analysis, and programming will notice much greater reliability in the responses. When an artificial intelligence model makes fewer mistakes, the need for manual verification decreases, workflows speed up, and user trust grows organically over time.
Autonomous computer use: the big agentic leap
Another point that caught the tech community’s attention is GPT-5.4’s ability to operate computers autonomously. According to OpenAI, this is the first general-purpose model from the company with native computer-use capabilities. The model can write code to operate and execute tasks on computers, as well as issue keyboard and mouse commands to navigate the operating system.
This means GPT-5.4 can interact with graphical interfaces, click buttons, fill out forms, and switch between applications without the need for constant human intervention. This feature puts ChatGPT on a whole different level, moving beyond being just a text generation tool to becoming something closer to a full-fledged digital assistant. Imagine delegating repetitive computer tasks to an artificial intelligence that understands the context of what needs to be done and executes with precision. It is still too early to say how this will work in real-world scenarios with the endless variables that come with everyday operating system use, but the potential is undeniable.
Benchmarks and results that put the model on top
To back up the advances on the agentic front, OpenAI pointed to results from some highly relevant benchmarks. According to the company, GPT-5.4 claimed the top spot in three important reference tests:
- APEX-Agents by Mercor — a benchmark that evaluates model performance on professional services tasks
- OSWorld-Verified — focused on measuring the computer-use capabilities of AI models
- WebArena Verified — which measures agent performance on realistic web-based tasks
These results show that OpenAI is not just talking about improvements but putting concrete numbers on the table to back up its claims. In a market that is becoming increasingly competitive, delivering superior performance backed by independent benchmarks is practically a requirement for anyone who wants to stay relevant. GPT-5.4 positions itself competitively against rival models like Anthropic’s Claude and Google’s Gemini, something the company clearly needed right now.
Availability and access plans
GPT-5.4 is already rolling out across ChatGPT, Codex, and the OpenAI API. The GPT-5.4 Thinking version, which includes enhanced reasoning capabilities, will be accessible to subscribers on the Plus, Teams, and Pro plans.
The GPT-5.4 Pro version will be available through the API and also for ChatGPT Enterprise and ChatGPT Edu subscribers. This segmentation shows that OpenAI is prioritizing the professional and corporate audience with this update, which makes sense given the model’s positioning as the most advanced option for work.
For developers and companies that integrate OpenAI models into their own products, API access is especially relevant. The ability to use computer-use capabilities programmatically opens up a massive range of automations that previously required traditional RPA tools or custom scripts. Now, with a language model capable of interpreting context and executing actions on graphical interfaces, the potential for intelligent automation grows significantly.
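For a rough sense of what that programmatic access might look like, here is a speculative sketch built on how OpenAI has exposed earlier computer-use models through its Responses API. To be clear, the model name `gpt-5.4`, its support for the `computer_use_preview` tool, and the helper function below are all assumptions for illustration, not details confirmed in the announcement.

```python
# Speculative sketch of a GPT-5.4 computer-use request via the OpenAI
# Responses API. The model name "gpt-5.4" and its support for the
# "computer_use_preview" tool are assumptions, not confirmed details.

def build_computer_use_request(task: str) -> dict:
    """Assemble request parameters for a hypothetical GPT-5.4 computer-use call."""
    return {
        "model": "gpt-5.4",  # hypothetical model identifier
        "tools": [{
            "type": "computer_use_preview",  # tool type used by earlier OpenAI computer-use models
            "display_width": 1280,
            "display_height": 800,
            "environment": "browser",
        }],
        "input": task,
        "truncation": "auto",  # the Responses API requires auto truncation for computer use
    }

params = build_computer_use_request("Open the spreadsheet and fill in the Q3 totals.")

# With the official SDK installed and an API key configured, the call would be:
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.responses.create(**params)
# The response would then contain computer_call items (clicks, keystrokes)
# that a driver loop executes, screenshotting the result back to the model.
print(params["model"], params["tools"][0]["type"])
```

In the established pattern, the application, not the model, performs each click and keystroke, then returns a screenshot so the model can decide the next action; presumably GPT-5.4 would follow the same loop.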
The trust crisis and the challenge that goes beyond technology
As impressive as the technical improvements of GPT-5.4 may be, it is impossible to ignore the elephant in the room. The loss of 1.5 million users from ChatGPT did not happen because the previous model was bad or because the competition offered something technically superior. The exodus was driven by an institutional decision from OpenAI that deeply upset a significant portion of its user base.
The partnership with the United States Department of Defense sparked an intense debate about the ethical boundaries of artificial intelligence use and called into question the image the company had always tried to build as an organization committed to safe and beneficial technology development. This move became even more dramatic because it came on the heels of Anthropic’s public refusal to loosen its safeguards to meet Pentagon demands. The contrast between the two stances made OpenAI’s decision look even more controversial in the public eye.
Internal problems also weigh in
The turbulence was not limited to external perception. Within OpenAI itself, employees openly expressed their opposition to working with the Department of Defense. According to reports published by the Wall Street Journal, Sam Altman went so far as to defend the decision in internal meetings with the team, calling the negative reaction "really painful." When a tech company's own staff publicly questions leadership decisions, the warning signs are clear.
For users who left the platform over ethical concerns, a model that makes fewer mistakes or that can click buttons on its own may not be a compelling enough argument to come back. Trust is an asset that takes time to build and can be destroyed with a single poorly communicated or poorly evaluated decision.
The AI market has matured and the rules have changed
This scenario puts OpenAI in a delicate and quite revealing position about the current state of the artificial intelligence market. For a long time, the race was purely technical, and whoever delivered the most capable model won user preference almost automatically. Now, with multiple companies offering models of comparable quality, factors like ethical positioning, corporate transparency, and alignment with audience values have become a significant part of the decision about which platform to use.
ChatGPT is still the most popular generative AI tool in the world, but the margin of advantage is shrinking, and user loyalty has already proven to be more fragile than many had imagined. OpenAI needs to understand that technical performance and public perception are two equally important fronts in this battle. Ignoring either one is a risk that could prove costly in the long run.
What to expect going forward
The launch of GPT-5.4 is, without a doubt, an important step for OpenAI to demonstrate that it remains at the forefront of artificial intelligence development. The improvements are real, measurable, and relevant for anyone who uses ChatGPT on a daily basis. The reduction in errors, the autonomous computer-use capability, and the results on reference benchmarks all show that the company's technical team continues to deliver cutting-edge innovation.
However, the company is at an inflection point where technological innovation alone may not be enough to solve all of its problems. The market has matured, users have become more demanding and more attentive to corporate decisions, and the competition has never been closer. GPT-5.4 has the potential to be the best model OpenAI has ever produced, but winning back the trust of those who left is going to require much more than impressive numbers on performance benchmarks.
It is going to require consistency between words and actions, something that no language model, no matter how advanced, can deliver on its own. As someone aptly pointed out, fewer errors in ChatGPT probably won’t guarantee fewer errors in Sam Altman’s judgment. And that, perhaps, is the most urgent update OpenAI needs to ship. 🤖
