GPT-5.5 Features: Full Breakdown of OpenAI's "Spud" Model

GPT-5.5 (Spud) launched April 23, 2026. Explore every major feature: agentic coding, 1M context window, GPT-5.5 Pro, computer use, scientific research, and pricing.

by Framia

OpenAI released GPT-5.5, internally codenamed "Spud," on April 23, 2026. OpenAI describes it as "a new class of intelligence for real work" and positions it as the company's most capable and production-ready model yet. This guide covers each major feature and capability.

1. Agentic Coding — The Flagship Capability

GPT-5.5's most pronounced improvement over GPT-5.4 is in agentic coding — the ability to take on complex, long-horizon software engineering tasks autonomously.

Benchmark results:

  • Terminal-Bench 2.0: 82.7% (vs 75.1% for GPT-5.4) — state-of-the-art, beats Claude Opus 4.7 at 69.4%
  • Expert-SWE (Internal): 73.1% — tasks with a median estimated human completion time of 20 hours
  • SWE-Bench Pro: 58.6%

In practice, GPT-5.5 is better at:

  • Understanding why a system is failing and where the fix needs to land
  • Holding context across large, multi-file systems
  • Making changes that propagate correctly through the surrounding codebase
  • Debugging complex, ambiguous failures without repeated user prompting

Dan Shipper, CEO of Every, called it "the first coding model I've used that has serious conceptual clarity."

2. 1M Token Context Window

API context window: 1,000,000 tokens
Codex context window: 400,000 tokens

This is one of GPT-5.5's most significant practical improvements. On long-context benchmarks, the gap over GPT-5.4 widens as the context grows:

Context Range    GPT-5.5    GPT-5.4
256K–512K        81.5%      57.5%
512K–1M          74.0%      36.6%

At 512K–1M, GPT-5.5 scores more than double GPT-5.4's accuracy. This makes full-codebase analysis, lengthy legal document review, and multi-chapter research synthesis genuinely practical without chunking.
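
As a rough illustration of what "without chunking" means in practice, the sketch below estimates whether a set of documents fits in the two context windows quoted above. The 4-characters-per-token ratio and the output reserve are assumptions for illustration only; a real tokenizer will produce different counts, and the function names are hypothetical.

```python
# Context limits quoted in this article, in tokens.
CONTEXT_LIMITS = {"api": 1_000_000, "codex": 400_000}

# Assumption: ~4 characters per token is a common rule of thumb for English
# text and code. It is NOT the model's tokenizer; treat results as estimates.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts, surface="api", reserve_for_output=50_000):
    """Return (fits, estimated_tokens) for a list of documents.

    reserve_for_output is a hypothetical budget kept free for the reply.
    """
    total = sum(estimate_tokens(t) for t in texts)
    budget = CONTEXT_LIMITS[surface] - reserve_for_output
    return total <= budget, total


if __name__ == "__main__":
    # A 2,000,000-character corpus is about 500K tokens under the heuristic.
    docs = ["x" * 2_000_000]
    print(fits_in_context(docs, "api"))    # within the 1M API window
    print(fits_in_context(docs, "codex"))  # over the 400K Codex window
```

Under these assumptions, the same corpus that fits in a single API request would still need to be split for Codex.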

3. Multiple GPT-5.5 Variants

GPT-5.5 (Base)

Standard model for ChatGPT (Plus/Pro/Business/Enterprise) and Codex.

GPT-5.5 Pro

Higher-accuracy variant with stronger performance on demanding tasks:

  • BrowseComp: 90.1% vs 84.4% (base)
  • FrontierMath Tier 4: 39.6% vs 35.4% (base)
  • GeneBench: 33.2% vs 25.0% (base)

Available to Pro, Business, and Enterprise users in ChatGPT; in the API at $30 input / $180 output per 1M tokens.

GPT-5.5 Thinking

Delivered in ChatGPT, this mode produces "smarter and more concise answers" for harder problems using extended chain-of-thought reasoning.

GPT-5.5 Fast Mode (Codex)

1.5× faster token generation at 2.5× the standard cost — for latency-sensitive agentic workflows.
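
To make the Fast Mode trade-off concrete, here is a small sketch that applies the two multipliers quoted above to a task. The baseline duration and cost in the example are made-up inputs, and the calculation assumes generation time dominates the task, which may not hold for workflows that spend most of their time in tool calls.

```python
SPEEDUP = 1.5          # Fast Mode token generation speed relative to standard
COST_MULTIPLIER = 2.5  # Fast Mode cost relative to standard


def fast_mode_tradeoff(standard_seconds: float, standard_cost: float) -> dict:
    """Compare a task's time and cost in standard mode vs Fast Mode."""
    fast_seconds = standard_seconds / SPEEDUP
    fast_cost = standard_cost * COST_MULTIPLIER
    return {
        "fast_seconds": fast_seconds,
        "seconds_saved": standard_seconds - fast_seconds,
        "fast_cost": fast_cost,
        "extra_cost": fast_cost - standard_cost,
    }


if __name__ == "__main__":
    # Hypothetical task: 10 minutes of generation costing 2.00 in standard mode.
    print(fast_mode_tradeoff(600.0, 2.00))
```

In other words, Fast Mode cuts generation time by about a third while multiplying cost by 2.5, so it suits cases where waiting is more expensive than tokens.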

4. Computer Use

GPT-5.5 can operate software autonomously — navigating interfaces, clicking, typing, and moving across tools to complete tasks. It reaches 78.7% on OSWorld-Verified, which measures whether models can operate real computer environments independently.

This brings GPT-5.5 closer to functioning as a true AI agent that can operate alongside a human on a computer — not just respond to prompts.

5. Knowledge Work

GPT-5.5 delivers state-of-the-art performance on professional knowledge tasks:

  • GDPval: 84.9% — tests agents across 44 occupations for knowledge work quality
  • Tau2-bench Telecom: 98.0% — complex customer-service workflows, without prompt tuning
  • OfficeQA Pro: 54.1% (vs Claude's 43.6%, Gemini's 18.1%)
  • Investment Banking Modeling: 88.5% (internal benchmark)

Real-world uses reported by OpenAI teams: automated business report generation (saving 5–10 hours/week), processing 24,771 tax forms in an accelerated timeline, and building automated routing systems for communications.

6. Scientific Research

GPT-5.5 improves on GPT-5.4 across the scientific benchmarks reported at launch:

  • GeneBench: 25.0% (GPT-5.4: 19.0%) — multi-stage genetics and quantitative biology analysis
  • BixBench: 80.5% (GPT-5.4: 74.0%) — real-world bioinformatics data analysis
  • FrontierMath Tier 4: 35.4% (GPT-5.4: 27.1%)
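
The size of these gains is easier to compare when expressed both in percentage points and relative to the GPT-5.4 baseline. The sketch below does that arithmetic using only the scores listed above; the function name is illustrative.

```python
# (GPT-5.5 score, GPT-5.4 score) in percent, as listed in this section.
SCORES = {
    "GeneBench": (25.0, 19.0),
    "BixBench": (80.5, 74.0),
    "FrontierMath Tier 4": (35.4, 27.1),
}


def gains(new: float, old: float) -> tuple:
    """Return (percentage-point gain, relative gain in percent), rounded."""
    points = round(new - old, 1)
    relative = round((new - old) / old * 100, 1)
    return points, relative


if __name__ == "__main__":
    for name, (new, old) in SCORES.items():
        points, relative = gains(new, old)
        print(f"{name}: +{points} points ({relative}% relative)")
```

The relative gains are largest on GeneBench and FrontierMath Tier 4, where the GPT-5.4 baseline was low, and smallest on BixBench, where it was already high.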

Notably, an internal GPT-5.5 variant helped discover a new proof about Ramsey numbers, a result in combinatorics that was verified in the Lean proof assistant.

7. Inference Efficiency

GPT-5.5 matches GPT-5.4's per-token latency despite being significantly more capable. Key engineering details:

  • Co-designed for NVIDIA GB200/GB300 NVL72 systems
  • Improved load balancing heuristics (developed with Codex) boosted token generation by 20%+
  • Uses fewer tokens to complete the same Codex tasks compared to GPT-5.4

For cost-conscious teams: while GPT-5.5 has a higher price per token, its token efficiency often results in comparable or lower total cost.
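
The break-even point behind that claim can be sketched as follows. Total cost is tokens times price, so the newer model is cheaper overall whenever it uses less than old_price / new_price of the older model's tokens. This article does not give GPT-5.4's pricing, so the old price below is a placeholder chosen only to show the arithmetic, not a real figure.

```python
def total_cost(tokens: int, price_per_million: float) -> float:
    """Cost of a token count at a given price per 1M tokens."""
    return tokens / 1_000_000 * price_per_million


def break_even_token_ratio(old_price: float, new_price: float) -> float:
    """Fraction of the old model's tokens at which total costs are equal."""
    return old_price / new_price


if __name__ == "__main__":
    new_price = 30.0  # gpt-5.5 output price per 1M tokens, from this article
    old_price = 15.0  # HYPOTHETICAL placeholder, not GPT-5.4's actual price

    ratio = break_even_token_ratio(old_price, new_price)
    print(ratio)
    # At the break-even ratio, both totals match.
    print(total_cost(1_000_000, old_price), total_cost(500_000, new_price))
```

With real prices substituted in, the ratio tells you how much more token-efficient GPT-5.5 has to be on your workload before the higher per-token price pays for itself.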

8. Cybersecurity Capabilities

GPT-5.5 is OpenAI's most capable cybersecurity model:

  • CyberGym: 81.8% (vs Claude Opus 4.7's 73.1%)
  • Capture-the-Flags (Internal): 88.1%

OpenAI classified these capabilities as "High" under its Preparedness Framework and deployed tighter controls around high-risk cyber workflows. A Trusted Access for Cyber program gives verified defenders expanded access with fewer restrictions.

9. Pricing and Availability

ChatGPT access: Plus, Pro, Business, Enterprise (free tier excluded at launch)
Codex access: Plus, Pro, Business, Enterprise, Edu, Go plans

API pricing:

Model          Input               Output
gpt-5.5        $5 / 1M tokens      $30 / 1M tokens
gpt-5.5-pro    $30 / 1M tokens     $180 / 1M tokens

Batch/Flex: 50% of standard. Priority: 2.5× standard.
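
The prices and multipliers above translate into a simple per-request estimate. The sketch below is a minimal calculator under two assumptions: the tier multiplier applies equally to input and output tokens, and nothing else (such as caching discounts) affects the bill. The function and dictionary names are illustrative, not part of any SDK.

```python
# Standard prices per 1M tokens (input, output), from the table above.
PRICES = {
    "gpt-5.5": (5.00, 30.00),
    "gpt-5.5-pro": (30.00, 180.00),
}

# Multipliers relative to standard pricing, from the line above.
TIER_MULTIPLIER = {"standard": 1.0, "batch": 0.5, "flex": 0.5, "priority": 2.5}


def estimate_cost(model, input_tokens, output_tokens, tier="standard"):
    """Estimate the USD cost of one request.

    Assumes the tier multiplier applies uniformly to input and output.
    """
    in_price, out_price = PRICES[model]
    base = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return round(base * TIER_MULTIPLIER[tier], 4)


if __name__ == "__main__":
    # Example request: 200K input tokens, 10K output tokens.
    for tier in ("standard", "batch", "priority"):
        print(tier, estimate_cost("gpt-5.5", 200_000, 10_000, tier))
    print("pro", estimate_cost("gpt-5.5-pro", 200_000, 10_000))
```

For this example request, gpt-5.5 comes to $1.30 at standard rates, $0.65 on Batch/Flex, and $3.25 on Priority, while gpt-5.5-pro comes to $7.80 at standard rates.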

10. Accessing GPT-5.5 via Platforms

Beyond OpenAI's native interfaces, Framia.pro provides ready-built AI workflows powered by GPT-5.5, covering content creation, business automation, and research tasks. It is one way to put GPT-5.5's capabilities to work without any API configuration.

Summary of Key Features

Feature                    Detail
Release date               April 23, 2026
Codename                   Spud
Context window             1M tokens (API), 400K (Codex)
Top coding benchmark       Terminal-Bench 2.0: 82.7%
Top knowledge benchmark    Tau2-bench Telecom: 98.0%
Abstract reasoning         ARC-AGI-2: 85.0%
API price                  $5/$30 per 1M tokens
Pro API price              $30/$180 per 1M tokens
Variants                   Base, Pro, Thinking, Fast Mode