Introducing GPT-5.5

Summary (EN)

OpenAI announced GPT-5.5 as a new flagship model positioned around practical computer use and long-horizon task execution rather than chatbot fluency alone. The company says the model is better at writing and debugging code, researching online, analyzing data, creating documents and spreadsheets, operating software, and moving across tools until a job is complete. OpenAI emphasizes that GPT-5.5 can handle messy, multi-part assignments with less step-by-step supervision, which is the core commercial promise behind agentic workflows. In OpenAI’s framing, the model improves most in agentic coding, computer use, knowledge work, and early scientific research while maintaining GPT-5.4-level per-token latency. The release note also claims better token efficiency on Codex tasks, suggesting lower operating cost for some applied workloads. On benchmarks cited by OpenAI, GPT-5.5 improves over GPT-5.4 on Terminal-Bench 2.0, coding evaluations, browsing-heavy tasks, cyber evaluations, and frontier math tiers. The rollout begins in ChatGPT and Codex for paid tiers, with API access following under additional safeguards. OpenAI also highlights stronger safety work around the release, including broader preparedness testing, targeted cybersecurity and biology evaluations, external red teaming, and feedback from nearly 200 early-access partners. The announcement matters because it packages model progress directly around software execution and workplace delegation, making the commercial argument less about conversational quality and more about whether a model can reliably complete real tasks across tools, interfaces, and ambiguous operating conditions.

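Summary (ZH, translated)

OpenAI has released GPT-5.5, positioning it as a flagship model geared toward actually carrying out work on a computer rather than simply a chat model that converses better. According to OpenAI, GPT-5.5 shows clear gains in writing code, debugging, online research, data analysis, generating documents and spreadsheets, operating software, and pushing a task forward across tools until it is finished. Compared with earlier usage that required users to break work into fine-grained steps, GPT-5.5 is described as better suited to messy, multi-stage real-world workflows with complex goals, which is the core capability that commercial “agentic” applications demand. OpenAI specifically stresses that the model advances most in agentic coding, computer use, knowledge work, and early-stage scientific research assistance, while keeping per-token latency close to that of GPT-5.4 and using fewer tokens on some Codex tasks, meaning that in some real deployments it may combine higher capability with lower cost. According to figures published by OpenAI, GPT-5.5 improves over GPT-5.4 on terminal tasks, coding evaluations, browsing tasks, cybersecurity evaluations, and advanced math evaluations. The initial rollout covers paid ChatGPT and Codex users, with the API to open once additional safety measures are in place. OpenAI also stresses that it carried out stronger safety preparation, including a more complete preparedness evaluation, targeted testing of cybersecurity and biology capabilities, external red-team validation, and real-world usage feedback from nearly 200 early-access partners. The significance of the release is that OpenAI moves the focus of model competition further from “how well does it answer” to “can it finish real work across tools.”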

Source

https://openai.com/index/introducing-gpt-5-5/