Study of gender bias in automatically generated French clinical cases, showing over-generation of male cases and mismatch with real disorder prevalence.
Scientific paper - “Women do not have heart attacks!” Gender Biases in Automatically Generated Clinical Cases in French
Year 2025
This paper examines gender bias in the automatic generation of synthetic clinical cases in French. Using seven language models fine-tuned for clinical case generation and an automatic tool for detecting the linguistic gender of the patient, the authors evaluate 21,000 generated cases covering ten disorders. The study finds that the models systematically over-generate cases with male patients, even when this contradicts the documented prevalence of the disorder. The paper discusses the risks of using biased synthetic clinical text in healthcare settings and recommends mitigation strategies, including the explicit inclusion of demographic information in prompts. The work is particularly relevant for evaluating representational harms, fairness, and the safe use of synthetic text in clinical and educational contexts.
Authors of the paper: Fanny Ducel, Nicolas Hiebel, Olivier Ferret, Karën Fort, Aurélie Névéol
Published in: Findings of the Association for Computational Linguistics: NAACL 2025 (Association for Computational Linguistics)
The paper is available at the following link.