OpenAI is working on a new generative model, dubbed GPT-5.4, according to a person with knowledge of the development. The next iteration is expected to expand how much text or data the model can consider at once by offering a context window that surpasses 1 million tokens.
This increase would more than double the 400,000-token capacity of the current GPT-5.2 release, allowing the model to process queries containing substantially larger bodies of text or data. The planned expansion would bring the model's context capacity in line with other large models that already support around 1 million tokens, and it would restore a 1-million-token window that earlier OpenAI models once offered but that was absent from version 5.2.
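Those token budgets are easiest to grasp with a quick calculation. Below is a minimal sketch using OpenAI's publicly available tiktoken tokenizer; the o200k_base encoding is a stand-in, since no GPT-5.4 tokenizer has been published, and the window sizes simply mirror the figures reported above.

```python
# Sketch: checking whether a body of text fits in a given context window.
# The o200k_base encoding and the window sizes are illustrative
# assumptions; no tokenizer or official limits for GPT-5.4 have been
# published.
import tiktoken

REPORTED_WINDOWS = {
    "GPT-5.2 (reported)": 400_000,
    "GPT-5.4 (reported)": 1_000_000,
}

def token_count(text: str, encoding: str = "o200k_base") -> int:
    """Count tokens the way OpenAI's public tokenizers do."""
    return len(tiktoken.get_encoding(encoding).encode(text))

document = "Example input. " * 50_000  # stand-in for a large document
n = token_count(document)
for name, window in REPORTED_WINDOWS.items():
    verdict = "fits" if n <= window else "exceeds the window"
    print(f"{name}: {n:,} tokens -> {verdict}")
```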
Beyond raw context size, the person said GPT-5.4 is being developed to handle multi-hour tasks with improved reliability. The model is expected to retain details of user requests and operating parameters more effectively across multi-step processes while producing fewer errors. These attributes are most relevant to tools that automate extended or intricate workflows - OpenAI's Codex coding assistant was cited as an example of an application that involves long-running, complex tasks.
GPT-5.4 is also reported to introduce what the person described as an extreme reasoning mode. In that mode the model would be allowed to devote substantially more time and computational resources to particularly difficult questions - a design intended to improve performance on hard reasoning problems by expanding the resource envelope for individual queries.
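OpenAI has not said how such a mode would be exposed. If it resembled the reasoning-effort control the company's API already offers for its reasoning models, a request might look like the sketch below; the model name "gpt-5.4" and the "extreme" effort level are hypothetical placeholders, not announced parameters.

```python
# Hypothetical sketch of opting a single query into extra compute,
# modeled on the reasoning-effort parameter OpenAI's API already
# exposes for its reasoning models. The model name "gpt-5.4" and the
# "extreme" effort value are placeholders; neither has been announced.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5.4",                  # hypothetical model name
    reasoning={"effort": "extreme"},  # hypothetical effort level
    input="Find a closed form for the sum of k^3 from k=1 to n, with proof.",
)
print(response.output_text)
```

In the current API, higher effort levels trade latency and cost for deeper deliberation, which matches the trade-off the person described.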
The details in this account come from someone familiar with the project rather than an official company announcement. As described, the upgrade focuses on three technical areas - a much larger context window, stronger multi-step memory and reduced error rates for lengthy tasks - plus a dedicated mode to tackle hard reasoning problems by increasing computation and time per query.
Those capabilities, if delivered as reported, would represent a restoration and extension of long-context functionality and a targeted push to improve the model's behavior on extended, compute-intensive tasks.