arc-agi, benchmark, chollet
Every Frontier AI Model Scored Under 1% on ARC-AGI-3. Humans Got 100%.
Chollet's new benchmark drops the same week Jensen Huang declared AGI. GPT-5.4 scored 0.26%. Claude Opus 4.6 scored 0.25%. The gap with humans is more than 99 percentage points.
9 min