
Scientific paper - Generative language models exhibit social identity biases
Year: 2024

A study showing that many generative language models reproduce social identity biases similar to those observed in humans.

This paper investigates whether large language models exhibit social identity biases, including ingroup favoritism and outgroup hostility. Across a large set of models, the authors find that many foundational and some instruction-tuned models produce biased associations resembling those documented in human social psychology. The article also discusses mitigation through training-data curation and instruction fine-tuning. The findings are relevant for fairness evaluation, human–AI interaction, and the potential reinforcement of social biases through conversational systems.

Authors: Tiancheng Hu, Yara Kyrychenko, Steve Rathje, Nigel Collier, Sander van der Linden, Jon Roozenbeek

Journal: Nature Computational Science

The paper is available on the journal's website.

Christine Kakalou, CERTH
Published: Thursday, 12 December 2024 - Last modified: Wednesday, 6 May 2026