Tool of evaluation - T4D (Thinking for Doing) dataset / AHEAD - Technological evolution / AHEAD OBSERVATORY / AI Legal Atlas / Biodiritto


Tool of evaluation - T4D (Thinking for Doing) dataset
Year 2023

Dataset derived from ToMi for evaluating whether LLMs can use Theory-of-Mind inferences to choose appropriate actions.

The T4D dataset is a conversion of the ToMi dataset, following the paper “How Far Are Large Language Models from Agents with Theory-of-Mind?”. It is designed to test whether models can move beyond answering explicit Theory-of-Mind questions and instead use inferred mental states to decide on socially appropriate actions. The dataset contains 564 rows and is distributed through Hugging Face under an Apache 2.0 license. A linked GitHub repository provides the conversion code. The resource is suited to benchmarking situated Theory-of-Mind reasoning and action selection.
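As a rough illustration of what benchmarking action selection on such a dataset could look like, the sketch below scores model predictions against gold answers. The `T4DItem` record layout, its field names, and the example story are hypothetical and are not taken from the dataset card; the actual column names and answer format should be checked on Hugging Face before adapting this.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class T4DItem:
    """Hypothetical record layout; the real dataset columns may differ."""
    story: str
    question: str
    choices: List[str]
    answer: str  # gold choice for the action question


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not counted as errors."""
    return " ".join(text.lower().split())


def action_accuracy(items: List[T4DItem], predictions: List[str]) -> float:
    """Return the fraction of items whose predicted choice matches the gold choice."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(
        normalize(pred) == normalize(item.answer)
        for item, pred in zip(items, predictions)
    )
    return correct / len(items)


if __name__ == "__main__":
    # Invented example for illustration only; not a row from the dataset.
    demo = [
        T4DItem(
            story="Sally puts the apple in the basket and leaves. "
                  "Anne moves the apple to the box. Sally plans to use the apple soon.",
            question="Who would benefit from receiving helpful information?",
            choices=["Sally", "Anne", "Neither"],
            answer="Sally",
        )
    ]
    print(action_accuracy(demo, ["Sally"]))
```

Exact-match scoring after normalization is the simplest choice here; a real evaluation might instead map free-form model output onto one of the listed choices before comparing.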

This dataset can be used during BSC’s secondment to CERTH-Psy in April 2026.

Developer of the tool of evaluation: sachithgunasekara / Hugging Face dataset card based on Zhou et al.

The tool of evaluation is available at the following link.

Christine Kakalou, CERTH


Published: Wednesday, 04 October 2023 - Last modified: Wednesday, 06 May 2026
