What is Extended Thinking?
TL;DR
A 2026-standard reasoning feature where the model thinks at length internally before producing a final answer.
Extended Thinking: Definition & Explanation
Extended Thinking is a feature, first introduced with Anthropic's Claude 3.7 and now common across Claude Opus 4.7, Sonnet 4.6, Haiku 4.5, GPT-5, Gemini 3, and Grok 4, that lets the model run an internal chain of thought for an extended period before returning a final answer. Accuracy improves dramatically on tasks that require deep reasoning: competition mathematics, code design, strategic decisions, and long-document analysis.

The feature builds on the test-time compute scaling thesis (more inference compute yields better results) popularized by OpenAI's o1/o3 series. As of 2026, the available controls (reasoning_effort, thinking_budget, etc.) typically scale thinking time from seconds to minutes.

Benefits: higher accuracy on hard problems (math olympiads, SWE-bench), fewer hallucinations, and more logically consistent answers. Trade-offs: longer latency (seconds to minutes) and higher API cost, since thinking tokens are billed. Best practice is to disable extended thinking for simple prompts and enable it for complex analysis.
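The best practice above can be sketched in code: enable thinking only when the task warrants it, and cap the budget so thinking-token cost stays bounded. This is a minimal illustration that builds a request payload in the style of Anthropic's Messages API (a "thinking" object with "budget_tokens"); the model id is a placeholder, and other providers expose different knobs such as reasoning_effort, so treat the parameter names as assumptions rather than a universal interface.

```python
def build_request(prompt: str, complex_task: bool) -> dict:
    """Assemble a Messages-API-style payload, enabling extended
    thinking only for complex prompts to save latency and cost."""
    request = {
        "model": "claude-sonnet-latest",  # placeholder model id
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
    }
    if complex_task:
        # Thinking tokens are billed, so cap them with a budget.
        request["thinking"] = {"type": "enabled", "budget_tokens": 8000}
    return request


simple = build_request("What is 2 + 2?", complex_task=False)
hard = build_request("Design a sharded cache invalidation scheme.",
                     complex_task=True)
```

Routing by task complexity this way keeps fast, cheap responses for trivial prompts while reserving the longer latency and extra token spend for the deep-reasoning cases where extended thinking actually pays off.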