OpenAI is working on a new generative model, dubbed GPT-5.4, according to a person with knowledge of the development. The next iteration is expected to expand how much text or data the model can consider at once by offering a context window that surpasses 1 million tokens.
This increase would more than double the 400,000-token capacity of the current GPT-5.2 release, allowing the model to process queries containing substantially larger bodies of text or data. The planned expansion would bring the model's context capacity in line with other large models that already support around 1 million tokens, and it would restore a 1-million-token window that earlier OpenAI models once offered but that was absent from version 5.2.
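Those token budgets are easiest to grasp with a quick calculation. Below is a minimal sketch using OpenAI's publicly available tiktoken tokenizer; the o200k_base encoding is a stand-in, since no GPT-5.4 tokenizer has been published, and the window sizes simply mirror the figures reported above.

```python
# Sketch: checking whether a body of text fits in a given context window.
# The o200k_base encoding and the window sizes are illustrative
# assumptions; no tokenizer or official limits for GPT-5.4 have been
# published.
import tiktoken

REPORTED_WINDOWS = {
    "GPT-5.2 (reported)": 400_000,
    "GPT-5.4 (reported)": 1_000_000,
}

def token_count(text: str, encoding: str = "o200k_base") -> int:
    """Count tokens the way OpenAI's public tokenizers do."""
    return len(tiktoken.get_encoding(encoding).encode(text))

document = "Example input. " * 50_000  # stand-in for a large document
n = token_count(document)
for name, window in REPORTED_WINDOWS.items():
    verdict = "fits" if n <= window else "exceeds the window"
    print(f"{name}: {n:,} tokens -> {verdict}")
```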
Beyond raw context size, the person said GPT-5.4 is being developed to handle multi-hour tasks with improved reliability. The model is expected to retain details of user requests and operating parameters more effectively across multi-step processes while producing fewer errors. These attributes are most relevant to tools that automate extended or intricate workflows - OpenAI's Codex coding assistant was cited as an example of an application that involves long-running, complex tasks.
GPT-5.4 is also reported to introduce what the person described as an extreme reasoning mode. In that mode the model would be allowed to devote substantially more time and computational resources to particularly difficult questions - a design intended to improve performance on hard reasoning problems by expanding the resource envelope for individual queries.
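OpenAI has not said how such a mode would be exposed. If it resembled the reasoning-effort control the company's API already offers for its reasoning models, a request might look like the sketch below; the model name "gpt-5.4" and the "extreme" effort level are hypothetical placeholders, not announced parameters.

```python
# Hypothetical sketch of opting a single query into extra compute,
# modeled on the reasoning-effort parameter OpenAI's API already
# exposes for its reasoning models. The model name "gpt-5.4" and the
# "extreme" effort value are placeholders; neither has been announced.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5.4",                  # hypothetical model name
    reasoning={"effort": "extreme"},  # hypothetical effort level
    input="Find a closed form for the sum of k^3 from k=1 to n, with proof.",
)
print(response.output_text)
```

In the current API, higher effort levels trade latency and cost for deeper deliberation, which matches the trade-off the person described.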
The details in this account come from someone familiar with the project rather than an official company announcement. As described, the upgrade focuses on three technical areas - a much larger context window, stronger multi-step memory and reduced error rates for lengthy tasks - plus a dedicated mode to tackle hard reasoning problems by increasing computation and time per query.
Those capabilities, if delivered as reported, would represent a restoration and extension of long-context functionality and a targeted push to improve the model's behavior on extended, compute-intensive tasks.