OpenAI Agents SDK Update: Sandbox Execution, Enhanced Harness, and File-Level Agent Capabilities

OpenAI is introducing new capabilities to the Agents SDK that provide developers with standardized infrastructure designed to work optimally with OpenAI's models. The update includes a model-native harness enabling agents to work across files and tools on a computer, along with native sandbox execution for running tasks safely.

Developers can now give an agent a controlled workspace, explicit instructions, and the tools it needs to inspect evidence. For instance, an agent can be configured with a manifest of local files, pointed at a specific model like GPT-5.4, and run inside a sandbox environment to analyze data and produce results.

OpenAI acknowledges that developers need more than just top-tier models to build useful agents: they need systems that support how agents inspect files, run commands, write code, and persist across many steps. Existing solutions come with tradeoffs: model-agnostic frameworks don't fully leverage frontier model capabilities, model-provider SDKs often lack harness visibility, and managed agent APIs can constrain where agents run and how they access sensitive data.

Several early testers, including Oscar Health, Actively, LexisNexis, FurtherAI, Thomson Reuters, Zoom, and Tomoro AI, have shared positive feedback on the updated SDK.

A More Capable Harness for the Agent Loop

With this release, the Agents SDK harness becomes significantly more capable for agents working with documents, files, and systems. It now features configurable memory, sandbox-aware orchestration, Codex-like filesystem tools, and standardized integrations with primitives that are becoming common in frontier agent systems.

These primitives include tool use via MCP, progressive disclosure via skills, custom instructions via AGENTS.md, code execution using the shell tool, file edits using the apply patch tool, and more. The harness will continue incorporating new agentic patterns over time, allowing developers to focus on domain-specific logic rather than core infrastructure.

The harness also helps developers unlock more of a frontier model's capability by aligning execution with the way those models perform best, keeping agents closer to the model's natural operating pattern and improving reliability on complex, long-running, or multi-tool tasks.

OpenAI designed the Agents SDK to accommodate the diversity of developer products. The harness is turnkey yet flexible, so developers can adapt it to their own stack, including their choice of tools, memory, and sandbox environment.

Native Sandbox Execution

The updated Agents SDK supports sandbox execution natively, enabling agents to run in controlled computer environments with the files, tools, and dependencies required for a task.

Many useful agents need a workspace where they can read and write files, install dependencies, run code, and use tools safely. Native sandbox support provides that execution layer out of the box, rather than requiring developers to assemble it themselves.

Developers can bring their own sandbox or use built-in support for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel.

To make environments portable across providers, the SDK introduces a Manifest abstraction for describing the agent's workspace. Developers can mount local files, define output directories, and pull in data from storage providers including AWS S3, Google Cloud Storage, Azure Blob Storage, and Cloudflare R2.

This provides a consistent way to shape the agent's environment from local prototype to production deployment, and gives the model a predictable workspace for locating inputs, writing outputs, and staying organized across long-running tasks.
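
To make the idea of a workspace description concrete, here is a minimal sketch in plain Python. It is not the SDK's Manifest class: the names `Mount`, `WorkspaceManifest`, `validate`, and `to_dict`, the default output path, and the example bucket URI are all hypothetical, chosen only to show how mounts and an output directory might be declared once and checked before any sandbox starts.

```python
from dataclasses import asdict, dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Mount:
    """Hypothetical mount entry; not part of the Agents SDK."""
    source: str  # local path or storage URI, e.g. "s3://example-bucket/reports"
    target: str  # absolute path inside the sandbox workspace


@dataclass
class WorkspaceManifest:
    """Hypothetical, provider-neutral description of an agent workspace."""
    mounts: list = field(default_factory=list)
    output_dir: str = "/workspace/out"

    def validate(self) -> None:
        # Reject relative targets and targets that collide with each other
        # or with the output directory.
        seen = set()
        for mount in self.mounts:
            target = PurePosixPath(mount.target)
            if not target.is_absolute():
                raise ValueError(f"mount target must be absolute: {mount.target}")
            if target in seen:
                raise ValueError(f"duplicate mount target: {mount.target}")
            seen.add(target)
        if PurePosixPath(self.output_dir) in seen:
            raise ValueError("output_dir collides with a mount target")

    def to_dict(self) -> dict:
        # Validate first so only well-formed manifests are serialized.
        self.validate()
        return asdict(self)


manifest = WorkspaceManifest(
    mounts=[
        Mount(source="./data", target="/workspace/data"),
        Mount(source="s3://example-bucket/reports", target="/workspace/reports"),
    ]
)
print(manifest.to_dict())
```

Because the description is plain data, the same dictionary could in principle be handed to a local prototype or to a hosted sandbox provider without changing the agent's view of where inputs and outputs live.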

Separating Harness from Compute for Security, Durability, and Scale

Agent systems should be designed with the assumption that prompt-injection and exfiltration attempts will occur. Separating harness and compute helps keep credentials out of environments where model-generated code executes.
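
One small part of this separation can be illustrated with the Python standard library alone. The sketch below is an assumption-laden stand-in, not how the SDK implements it: a child process plays the role of the sandbox, and the hypothetical helper `run_untrusted` starts it with an allowlisted environment so that secrets held by the harness process are not inherited. A real sandbox provides far stronger isolation than a subprocess.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: only these variables are passed to the child process.
ALLOWED_ENV = ("PATH", "SYSTEMROOT", "LANG", "LD_LIBRARY_PATH")


def run_untrusted(code: str, timeout: float = 10.0) -> str:
    """Run a Python snippet in a child process that cannot see harness secrets."""
    env = {key: os.environ[key] for key in ALLOWED_ENV if key in os.environ}
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env,              # replaces, rather than extends, the parent environment
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return proc.stdout.strip()


# The harness holds a credential; the child process looks for it and fails to find it.
os.environ["DEMO_API_KEY"] = "secret-held-by-harness"
probe = "import os; print(os.environ.get('DEMO_API_KEY', 'absent'))"
print(run_untrusted(probe))
```

The design choice worth noting is the allowlist: the child receives only what is explicitly named, so a credential added to the harness later is excluded by default.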

This separation also enables durable execution. When the agent's state is externalized, losing a sandbox container does not mean losing the run. With built-in snapshotting and rehydration, the Agents SDK can restore an agent's state in a fresh container and continue from the last checkpoint if the original environment fails or expires.
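
The checkpoint-and-resume pattern described above can be sketched as follows. Everything here is hypothetical and simplified: `RunState`, `FlakySandbox`, `SandboxLost`, and `run_with_recovery` are illustrative names, the snapshot is a JSON string held in memory rather than in durable storage, and the SDK's built-in snapshotting may work quite differently.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Optional


class SandboxLost(Exception):
    """Raised when the (simulated) sandbox container disappears."""


@dataclass
class RunState:
    next_step: int = 0
    outputs: list = field(default_factory=list)

    def snapshot(self) -> str:
        # Serialize the state so it can live outside the sandbox.
        return json.dumps({"next_step": self.next_step, "outputs": self.outputs})

    @classmethod
    def rehydrate(cls, blob: str) -> "RunState":
        data = json.loads(blob)
        return cls(next_step=data["next_step"], outputs=list(data["outputs"]))


class FlakySandbox:
    """Stand-in container that fails after a fixed number of steps."""

    def __init__(self, fail_after: Optional[int] = None):
        self.fail_after = fail_after
        self.executed = 0

    def run(self, step: str) -> str:
        if self.fail_after is not None and self.executed >= self.fail_after:
            raise SandboxLost(step)
        self.executed += 1
        return f"done:{step}"


def run_with_recovery(steps, sandbox_factory: Callable[[], FlakySandbox], max_restarts: int = 3):
    checkpoint = RunState().snapshot()  # kept by the harness, not the sandbox
    sandbox = sandbox_factory()
    restarts = 0
    while True:
        state = RunState.rehydrate(checkpoint)
        if state.next_step >= len(steps):
            return state, restarts
        try:
            result = sandbox.run(steps[state.next_step])
        except SandboxLost:
            restarts += 1
            if restarts > max_restarts:
                raise
            sandbox = sandbox_factory()  # fresh container, same checkpoint
            continue
        state.outputs.append(result)
        state.next_step += 1
        checkpoint = state.snapshot()  # commit progress only after a step succeeds


sandboxes = iter([FlakySandbox(fail_after=2), FlakySandbox()])
final_state, restarts = run_with_recovery(["a", "b", "c", "d"], lambda: next(sandboxes))
print(final_state.outputs, restarts)
```

In this run the first sandbox is lost before the third step, and the second sandbox picks up from the last committed checkpoint instead of starting over.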

Additionally, it makes agents more scalable. Agent runs can use one sandbox or many, invoke sandboxes only when needed, route subagents to isolated environments, and parallelize work across containers for faster execution.
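
As a rough illustration of fanning work out to isolated environments, the sketch below uses only the Python standard library. A temporary directory stands in for each sandbox and a thread pool stands in for the orchestration layer; `run_in_isolated_workspace` and `fan_out` are hypothetical names, and the uppercase transformation is a placeholder for real subagent work.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_in_isolated_workspace(task: str) -> dict:
    """Run one subtask in its own throwaway directory (a stand-in for a sandbox)."""
    with tempfile.TemporaryDirectory(prefix="sandbox-") as workdir:
        output_file = Path(workdir) / "result.txt"
        output_file.write_text(task.upper())  # placeholder for real subagent work
        return {"task": task, "result": output_file.read_text()}


def fan_out(tasks, max_workers: int = 4) -> list:
    # Each task gets its own workspace; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_in_isolated_workspace, tasks))


results = fan_out(["summarize", "extract", "verify"])
print([r["result"] for r in results])
```

Since each subtask writes only inside its own workspace, the tasks cannot overwrite each other's files, which is the property that makes parallel execution safe to reason about.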

Pricing and Availability

These new Agents SDK capabilities are generally available to all customers via the API and use standard API pricing, based on tokens and tool use.

What's Next

OpenAI plans to keep expanding what developers can build with the Agents SDK, making it easier to bring more capable agents into production with less custom infrastructure while preserving flexibility and control.

The new harness and sandbox capabilities launch first in Python, with TypeScript support planned for a future release. OpenAI is also working to bring additional agent capabilities, including code mode and subagents, to both Python and TypeScript.

Furthermore, OpenAI aims to help bring the broader agent ecosystem together over time, with support for more sandbox providers, more integrations, and more ways for developers to plug the SDK into the tools and systems they already use.
