Paper testing large language models on false-belief tasks commonly used to assess Theory of Mind in humans.
Scientific paper - Evaluating large language models in theory of mind tasks
Year: 2024
This paper evaluates large language models on a battery of false-belief tasks widely used to assess Theory of Mind in humans. The results show that more recent models solve a substantial proportion of tasks on which smaller or older models fail, suggesting emergent social-reasoning capabilities in some systems. The study is significant because it translates a classic psychological construct into an AI evaluation setting and raises questions about interpretation, anthropomorphism, and the limits of benchmark-based claims about machine social cognition.
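To make the evaluation setup concrete, below is a minimal sketch of an unexpected-contents false-belief probe of the kind described above. The vignette wording, the query_model stub, and the scoring rule are illustrative assumptions, not the paper's actual stimuli or code; a real run would replace query_model with a call to the model under test.

```python
# Illustrative sketch of a false-belief evaluation; the vignette, the
# query_model stub, and the scoring rule are hypothetical examples,
# not the paper's actual stimuli or code.

VIGNETTE = (
    "Here is a bag filled with popcorn. There is no chocolate in the bag. "
    "Yet, the label on the bag says 'chocolate' and not 'popcorn'. "
    "Sam finds the bag. She has never seen it before and cannot see "
    "what is inside. She reads the label."
)
PROMPT = VIGNETTE + " She believes that the bag is full of"


def query_model(prompt: str) -> str:
    """Stand-in for a real LLM completion call; returns a canned answer
    here so the sketch runs without API credentials."""
    return " chocolate"


def false_belief_correct(completion: str) -> bool:
    # The belief-consistent answer reflects the label Sam read
    # ("chocolate"), not the bag's true contents ("popcorn").
    text = completion.lower()
    return "chocolate" in text and "popcorn" not in text


if __name__ == "__main__":
    completion = query_model(PROMPT)
    print(f"Model completion: {completion.strip()!r}")
    print("Passes false-belief check:", false_belief_correct(completion))
```

A passing model must attribute to the protagonist a belief that contradicts the true state of the world, which is the same criterion applied in human false-belief testing.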
Author of the paper: Michal Kosinski
Publisher or journal of publication: Proceedings of the National Academy of Sciences (PNAS)
The paper is available at the following link.
Christine Kakalou, CERTH
Published on: Monday, 29 April 2024 - Last modified: Wednesday, 06 May 2026