OpenAI Unveils GPT-4.5 as Research Preview Model

OpenAI has released GPT-4.5 as a research preview, representing their most advanced chat model with improved world knowledge, reduced hallucinations, and enhanced emotional intelligence. The model, available to Pro users and developers, demonstrates significant improvements in factual accuracy, creative tasks, and human collaboration compared to previous versions.

openai Feb 27, 2025

OpenAI has launched a research preview of GPT-4.5, described as their most advanced and capable chat model to date. The model represents progress in scaling both pre-training and post-training methodologies, building upon unsupervised learning techniques.

Key Advancements

Enhanced Unsupervised Learning

GPT-4.5 advances capabilities through two complementary intelligence paradigms:

  1. Unsupervised learning - Enhances world model precision and intuitive understanding, the foundation of models like GPT-3.5, GPT-4, and now GPT-4.5
  2. Reasoning scaling - Enables models to develop thought chains before responding, crucial for complex STEM and logical challenges (exemplified by models like OpenAI o1 and o3-mini)

The model was developed using Microsoft Azure AI supercomputing infrastructure, incorporating architectural innovations and optimization improvements alongside expanded compute and data resources.

Performance Improvements

Factual Accuracy

  • SimpleQA Accuracy: GPT-4.5 achieves 62.5%, surpassing GPT-4o (38.2%), OpenAI o1 (47%), and o3-mini (15%)
  • Reduced Hallucinations: 37.1% hallucination rate compared to GPT-4o (61.8%), o1 (44%), and o3-mini (80.3%)

Human Preference Testing

GPT-4.5 demonstrated superior performance across multiple domains:

  • Creative intelligence: 56.8% preference over GPT-4o
  • Professional queries: 63.2% preference
  • Everyday tasks: 57.0% preference

Enhanced Human Collaboration

OpenAI has implemented novel scalable techniques allowing larger model training using data from smaller models. These advancements improve:

  • Model steerability and control
  • Nuanced understanding of context
  • Natural conversational flow
  • Emotional intelligence and social awareness
  • Creative and aesthetic capabilities

Benchmark Performance

GPT-4.5 shows significant improvements across academic benchmarks:

  • GPQA (science): 71.4% (vs. GPT-4o: 53.6%)
  • AIME '24 (math): 36.7% (vs. GPT-4o: 9.3%)
  • MMMLU (multilingual): 85.1% (vs. GPT-4o: 81.5%)
  • MMMU (multimodal): 74.4% (vs. GPT-4o: 69.1%)
  • SWE-Bench Verified: 38.0% (vs. GPT-4o: 30.7%)

Availability and Access

ChatGPT Integration

  • Immediate: Available to ChatGPT Pro subscribers
  • Next week: Rollout to Plus and Team users
  • Following week: Enterprise and Education users

The model includes support for:

  • Web search with current information
  • File and image processing
  • Canvas functionality for writing and coding tasks

Current limitations include lack of Voice Mode, video, and screensharing capabilities.

API Access

Developers on all paid tiers can access GPT-4.5 through:

  • Chat Completions API
  • Assistants API
  • Batch API

Supported features include:

  • Function calling capabilities
  • Structured Outputs
  • Real-time streaming
  • System message configuration
  • Vision capabilities via image inputs

Safety Measures

OpenAI has implemented enhanced safety protocols including:

  • Novel supervision techniques integrated with traditional supervised fine-tuning (SFT)
  • Reinforcement learning from human feedback (RLHF) methodologies
  • Comprehensive safety testing aligned with OpenAI's Preparedness Framework
  • Detailed evaluation results published in an accompanying system card

Future Direction

While GPT-4.5 excels at pattern recognition and creative insights without explicit reasoning, OpenAI views the integration of reasoning capabilities as essential for future models. The organization believes that combining pre-training advances with reasoning capabilities will produce more capable foundation models for reasoning and tool-using agents.

Due to the model's substantial computational requirements and associated costs, OpenAI is evaluating its long-term API availability based on user feedback and demonstrated value for specific applications.