OpenAI has launched GPT-5, marking a major advancement in artificial intelligence capabilities with enhanced reasoning, reduced hallucinations, and improved performance across multiple domains. The new model functions as a unified system that combines a smart, efficient model for quick responses with a deeper reasoning model (GPT-5 thinking) for complex problems, all managed by a real-time router that automatically selects the appropriate approach.
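The routing idea can be illustrated with a toy sketch. The article does not say how the real router decides, so everything below is an assumption made for illustration: the `Request` type, the `route` function, the cue phrases, and the word-count threshold are all hypothetical and do not describe OpenAI's implementation.

```python
from dataclasses import dataclass

# Hypothetical labels; the article does not name internal model identifiers.
FAST = "fast"
THINKING = "thinking"

# Assumed cue phrases that signal an explicit request for deeper reasoning.
REASONING_CUES = ("think hard", "step by step", "prove", "debug")


@dataclass
class Request:
    prompt: str
    needs_tools: bool = False


def route(request: Request, max_fast_words: int = 200) -> str:
    """Pick a path for a request using toy heuristics (illustrative only)."""
    text = request.prompt.lower()
    if request.needs_tools:
        return THINKING
    if any(cue in text for cue in REASONING_CUES):
        return THINKING
    if len(text.split()) > max_fast_words:
        return THINKING
    return FAST
```

For example, `route(Request("What is the capital of France?"))` takes the fast path, while a prompt containing "think hard" or a request flagged with `needs_tools=True` is sent to the reasoning path. A production router would presumably rely on learned signals rather than keyword matching.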
Key Improvements Across Core Areas
Enhanced Coding Capabilities
GPT-5 represents OpenAI's strongest coding model to date, excelling particularly in complex front-end generation and debugging large codebases. The model can create sophisticated websites, applications, and games with a single prompt, demonstrating improved aesthetic sensibility and design choices. It achieves 74.9% on SWE-bench Verified and 88% on Aider Polyglot benchmarks.
Advanced Writing and Creative Expression
The model serves as OpenAI's most capable writing collaborator, transforming rough ideas into compelling prose with literary depth. GPT-5 handles writing that involves structural ambiguity more reliably than its predecessors, for example sustaining unrhymed iambic pentameter or free verse that flows naturally. The improvements extend to everyday tasks like drafting emails, reports, and memos.
Healthcare Applications
GPT-5 achieves the highest performance on health-related queries among OpenAI's models, scoring significantly higher than previous models on HealthBench evaluations. The model proactively identifies potential concerns and adapts responses to users' context, knowledge level, and geography. While it does not replace medical professionals, it helps users understand results and prepare questions for healthcare providers.
Performance Benchmarks
- Mathematics: 94.6% on AIME 2025 without tools
- Multimodal Understanding: 84.2% on MMMU
- Health: 46.2% on HealthBench Hard
- Scientific Reasoning: 88.4% on GPQA with GPT-5 Pro
Efficiency and Speed Improvements
GPT-5 delivers superior results with 50-80% fewer output tokens compared to OpenAI o3 across various capabilities including visual reasoning, agentic coding, and graduate-level problem solving.
Enhanced Reliability and Safety
Reduced Hallucinations
With web search enabled, GPT-5's responses contain approximately 45% fewer factual errors than GPT-4o. With thinking enabled, the error rate is about 80% lower than that of OpenAI o3. On open-ended fact-seeking prompts, the model produces roughly one-sixth as many hallucinations as o3.
Improved Honesty
GPT-5 better communicates its limitations and capabilities, particularly for impossible or underspecified tasks. Testing shows the model gives confident answers about non-existent images only 9% of the time compared to 86.7% for OpenAI o3.
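The figures in this section mix absolute rates and relative changes, so it can help to put them on one scale. The short sketch below converts the confident-answer rates (86.7% versus 9%) and the roughly sixfold drop in hallucinations into relative reductions. The helper names are hypothetical, and the calculation uses only numbers quoted in the text.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after` (0.5 means a 50% reduction)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def fold_to_reduction(fold: float) -> float:
    """Convert an N-fold decrease into a fractional reduction."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return 1 - 1 / fold


# Figures quoted in the text above.
confident_wrong = relative_reduction(86.7, 9.0)  # o3 vs GPT-5, non-existent images
sixfold = fold_to_reduction(6)                   # sixfold drop in hallucinations

print(f"{confident_wrong:.1%}")  # 89.6%
print(f"{sixfold:.1%}")          # 83.3%
```

On this scale, the drop in confident answers about non-existent images is a relative reduction of about 90%, and a sixfold decrease corresponds to about 83%.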
Advanced Safety Training
OpenAI introduced "safe completions" training for GPT-5, teaching the model to provide the most helpful answers possible while maintaining safety boundaries. This approach enables better handling of dual-use questions and reduces unnecessary refusals.
Reduced Sycophancy
The model is less excessively agreeable, uses fewer unnecessary emojis, and provides more thoughtful responses. Targeted evaluations show sycophantic replies decreased from 14.5% to less than 6%.
New Customization Features
GPT-5 significantly improves instruction following and custom instruction adherence. OpenAI is launching a research preview of four preset personalities (Cynic, Robot, Listener, and Nerd) that allow users to adjust ChatGPT's interaction style without writing custom prompts.
GPT-5 Pro
For the most complex tasks, OpenAI released GPT-5 Pro, which uses extended parallel test-time compute to provide comprehensive answers. External experts preferred GPT-5 Pro over standard GPT-5 thinking 67.8% of the time, with the Pro version making 22% fewer major errors.
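The article does not explain how GPT-5 Pro uses parallel test-time compute. As a loose illustration of the general idea only, the sketch below runs several independent attempts in parallel and keeps the most common answer, a technique often called majority voting or self-consistency. This is a swapped-in, generic technique and not a description of OpenAI's method; `toy_sampler` is a hypothetical stand-in for a single model attempt.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def majority_vote(sample: Callable[[int], str], n: int = 5) -> str:
    """Run `sample` n times in parallel and return the most common answer."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        answers = list(pool.map(sample, range(n)))
    # Ties are broken in favor of the answer that appears first in `answers`.
    return Counter(answers).most_common(1)[0][0]


def toy_sampler(seed: int) -> str:
    """Hypothetical single attempt: wrong on seeds 1 and 3, right otherwise."""
    return "41" if seed in (1, 3) else "42"


print(majority_vote(toy_sampler, n=5))  # 42
```

The trade-off is the usual one for test-time compute: more attempts cost more, but occasional errors in individual attempts are outvoted.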
Availability and Implementation
GPT-5 becomes the default model in ChatGPT, replacing GPT-4o and other previous models for signed-in users. The rollout includes:
- Immediate access for Plus, Pro, Team, and Free users
- Next week for Enterprise and Education users
- Pro subscribers receive unlimited GPT-5 access and GPT-5 Pro
- Plus users get significantly higher usage than free users
- Free users transition to GPT-5 mini after reaching usage limits
Pro, Plus, and Team users can also use GPT-5 in the Codex CLI by signing in with their ChatGPT account.
Technical Infrastructure
The model was trained on Microsoft Azure AI supercomputers. OpenAI implemented comprehensive safeguards for biological risk mitigation under its Preparedness Framework, and the system underwent 5,000 hours of red-teaming with partners including CAISI and UK AISI.