Google DeepMind Unveils Framework for Measuring AGI Evolution
Cognitive science-based staged evaluation system charts path to artificial general intelligence

- Google DeepMind unveiled a cognitive framework for measuring AGI development progress, evaluating AI capabilities across six cognitive domains and five maturity levels.
- Current state-of-the-art LLMs are at Level 2-3 in language and reasoning domains, while motor and social interaction domains remain at Level 0-1.
- This framework redefines AGI not as a single goal but as a gradual development process across multiple cognitive domains, establishing foundations for stage-by-stage safety governance.
A New Standard for Measuring AI 'Intelligence'
Google DeepMind has unveiled a cognitive framework designed to measure progress toward Artificial General Intelligence (AGI) objectively. The research goes beyond defining "what AGI is" and presents a system for evaluating, stage by stage, how far current AI systems have come toward human-level general intelligence.
Grounded in cognitive science research, the framework aims to assess AI capabilities along multiple dimensions and to show "where we are now."
Why We Need AGI Measurement Systems Now
Although the AI industry has long cited AGI as its "ultimate goal," there is little consensus on what AGI actually is or how to measure it. OpenAI defines it as "systems that outperform humans at most economically valuable work," while other researchers define it as "at or above human level across all cognitive tasks." Interpretations vary widely.
Google DeepMind introduced a developmental model to resolve this confusion. Rather than judging AGI as simply "achieved" or "not achieved," this approach tracks which stage of capability a system has reached in each cognitive domain.
The importance of this approach is twofold:
- Setting research direction: By clearly identifying current AI strengths and weaknesses, it can guide what research is needed to advance to the next stage.
- Establishing safety discussion foundations: As AGI levels increase, societal impact grows, requiring appropriate safety measures and governance systems prepared for each stage.
Core Structure of the Cognitive Framework
DeepMind's framework consists of six cognitive domains and five capability maturity levels.
Six Cognitive Domains
| Domain | Description | Evaluation Examples |
|---|---|---|
| Perception | Ability to process sensory information like vision and hearing | Image recognition, speech understanding |
| Motor Skills | Ability to perform physical actions | Robot control, object manipulation |
| Language | Natural language understanding and generation ability | Conversation, translation, writing |
| Reasoning | Logical thinking and problem-solving ability | Math problem solving, strategy formulation |
| Learning | Ability to acquire and apply new information | Few-shot learning, transfer learning |
| Social Interaction | Ability to cooperate and communicate with others | Teamwork, emotion recognition |
Five Maturity Stages
The framework divides the capability levels AI can reach in each cognitive domain into five stages:
Level 0 — Non-Human: Below human level, performs only basic tasks
Level 1 — Emerging: Can perform simple tasks but inconsistently
Level 2 — Competent: Performs tasks at typical adult human level
Level 3 — Expert: Equals top human experts in the field
Level 4 — Superhuman: Surpasses humanity's best experts
For example, current large language models (LLMs) can be evaluated at approximately Level 2-3 in the language domain and Level 1-2 in the reasoning domain. Meanwhile, the motor domain remains at Level 0-1, and social interaction is also limited.
What Stage Is Current AI At?
DeepMind mapped the latest AI systems according to this framework and discovered the following patterns:
- Latest LLMs like GPT-4, Gemini 2.0, Claude 3.5: Level 2-3 in language and reasoning domains. While approaching expert level on specific benchmarks (MMLU, HumanEval), generalization capabilities remain weak.
- AlphaGo, AlphaFold: Achieved Level 4 (superhuman) in specific domains (Go, protein structure prediction). However, they are not classified as AGI due to lack of generality.
- Robotic AI systems: Level 0-1 in perception and motor domains. Real-time environmental adaptation capabilities are limited.
In short, current AI sits between two extremes: "high performance in narrow domains" and "low generalization across broad domains." Under this framework, achieving AGI requires Level 2 or higher in all six cognitive domains, a goal that remains distant.
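The "Level 2 or higher across all six domains" criterion can be sketched in a few lines of Python. This is an illustration only, not DeepMind's implementation: the `Level` enum, the `DOMAINS` keys, the function names, and the example profile are assumptions introduced here. The profile takes single points from the approximate ranges cited above, and its perception and learning values are placeholders rather than figures from the article.

```python
from enum import IntEnum


class Level(IntEnum):
    """The five maturity levels named in the article."""
    NON_HUMAN = 0
    EMERGING = 1
    COMPETENT = 2
    EXPERT = 3
    SUPERHUMAN = 4


# The six cognitive domains from the table (short keys are our own naming).
DOMAINS = ("perception", "motor", "language", "reasoning", "learning", "social")


def lagging_domains(profile, threshold=Level.COMPETENT):
    """Return the domains, in table order, that fall below the threshold."""
    missing = [d for d in DOMAINS if d not in profile]
    if missing:
        raise ValueError(f"profile is missing domains: {missing}")
    return [d for d in DOMAINS if profile[d] < threshold]


def meets_generality_threshold(profile, threshold=Level.COMPETENT):
    """True only if every one of the six domains is at or above the threshold."""
    return not lagging_domains(profile, threshold)


# Hypothetical LLM-style profile for illustration; not measured values.
llm_profile = {
    "perception": Level.COMPETENT,   # placeholder, not stated in the article
    "motor": Level.NON_HUMAN,
    "language": Level.COMPETENT,
    "reasoning": Level.COMPETENT,
    "learning": Level.EMERGING,      # placeholder, not stated in the article
    "social": Level.EMERGING,
}

# A hypothetical system that is exactly Competent everywhere.
uniform_profile = {d: Level.COMPETENT for d in DOMAINS}

print(meets_generality_threshold(llm_profile))      # False
print(lagging_domains(llm_profile))                 # ['motor', 'learning', 'social']
print(meets_generality_threshold(uniform_profile))  # True
```

Using the weakest domain as the deciding factor reflects the article's point that strong results in one area do not compensate for gaps in another.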
[AI Analysis] Future Path of AGI Development
The framework's central implication is clear: AGI is not a single breakthrough but the convergence of gradual progress across multiple cognitive domains.
Short-term Outlook (2026-2028)
- Accelerated multimodal integration: Models that integrate the language, perception, and reasoning domains more tightly are likely to emerge. Current systems such as Gemini 2.0 and the anticipated GPT-5 are already moving in this direction.
- Rise of robotic AI: To move the motor domain from Level 0 to Level 1, Google, Tesla, Figure AI and others are expected to deploy, at scale, robotic systems that learn in real environments.
Medium-term Outlook (2029-2032)
- Expansion of expert-level domains: AI achieving Level 3-4 in specific areas (coding, medicine, law) will be commercialized, and "human + AI collaboration" models will likely become standard.
- Surge in social interaction research: Research investment is expected to concentrate on areas like emotion recognition, ethical judgment, and teamwork.
Long-term Questions (Post-2033)
Predictions about AGI achievement timing remain controversial. However, DeepMind's framework suggests that "which domain reaches Level 4 first" is a more important question than "when AGI will be achieved." Some domains may reach superhuman levels while others remain at Level 1.
From a safety perspective, this framework also has important implications. Staged governance becomes possible, such as pre-assessing potential risks at each stage and applying enhanced oversight systems when entering Level 3 or above.
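As a rough sketch of how such a staged rule might be encoded, the Python below flags domains that have reached Level 3 or above and assigns an oversight tier accordingly. The tier names `"standard"` and `"enhanced"`, the function names, and the example profile are hypothetical and chosen for illustration; the article specifies only the Level 3 trigger.

```python
# Threshold taken from the article: enhanced oversight at Level 3 or above.
ENHANCED_OVERSIGHT_LEVEL = 3


def domains_needing_enhanced_oversight(profile):
    """Return, alphabetically, the domains at or above the oversight threshold."""
    return sorted(d for d, level in profile.items()
                  if level >= ENHANCED_OVERSIGHT_LEVEL)


def oversight_tier(profile):
    """Map a capability profile to a (hypothetical) oversight tier name."""
    if domains_needing_enhanced_oversight(profile):
        return "enhanced"
    return "standard"


# Hypothetical profile: expert-level language, weaker elsewhere.
example_profile = {"language": 3, "reasoning": 2, "motor": 0}

print(domains_needing_enhanced_oversight(example_profile))  # ['language']
print(oversight_tier(example_profile))                      # enhanced
```

Keying the tier to the single highest domain, rather than an average, means one expert-level capability is enough to trigger closer review.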
A New Starting Point for AGI Discussion
Google DeepMind's research is significant in transforming AGI from "a distant philosophical concept" to "a measurable engineering goal." AI researchers can now quantitatively evaluate "how intelligent are the systems we've built" and chart roadmaps for advancing to the next stage.
However, the framework is not perfect. It reduces the complexity of human intelligence to six domains and does not address abstract concepts such as creativity or consciousness. Whether it will establish itself as an academic and industry standard, or whether new measurement methodologies will emerge, remains to be seen.