What is Test-Time Compute?
TL;DR
An approach where models use additional computational resources during inference to improve answer accuracy.
Test-Time Compute: Definition & Explanation
Test-Time Compute is an approach where AI models consume additional computational resources during inference (test time) to generate more accurate outputs. While standard LLMs respond instantly using their trained parameters, test-time compute allows models to 'take time to think,' exploring multiple reasoning paths and performing self-verification to improve response quality. OpenAI's o1 and o3, as well as DeepSeek-R1, are representative reasoning models that employ this technique. They internally expand Chain-of-Thought reasoning extensively, significantly outperforming conventional models on mathematical reasoning and complex coding problems. Since there is a trade-off between computational cost and response speed, it is important to selectively apply test-time compute based on task difficulty.