Artificial intelligence (AI) encompasses diverse technologies that allow machines to perform tasks such as reasoning, learning and problem-solving. Because these tasks are typically associated with human intelligence, it is tempting to call AI systems intelligent. But are these models actually smarter than the average human?
Tracking AI sought to find out by ranking AI models based on their performance on the Mensa Norway IQ test, a widely recognized and difficult IQ exam used to evaluate human intelligence. For comparison, the average human IQ score ranges from 90 to 110, while a score above 130 is typically considered genius-level.
OpenAI’s o3 model lands in the genius category with a score of 133. As part of ChatGPT, it’s also among the world’s most popular AI tools. Anthropic’s Claude-4 Sonnet and Google’s Gemini 2.0 Flash Thinking follow closely with IQ scores of 127 and 126, respectively.
One distinction stands out: the top-scoring models on the chart are text-only models that cannot read or process images, while those at the bottom of the list are all multimodal models that can.
