A daily-style snapshot of the frontier: the five strongest general models, their headline benchmarks, and what they actually cost per million tokens, sorted so the best value rises to the top.

| # | Model | GPQA Diamond | SWE-bench | Input $/M | Output $/M | Blended* | Context | Verdict |
|---|---|---|---|---|---|---|---|---|