Publications

You can also find my articles on my Google Scholar profile.

Journal Articles


Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data

Published in Science Advances, 2023

We find that using factual labels to train models intended for normative judgments introduces notable measurement error. Models trained on factual labels yield significantly different judgments than those trained on normative labels, and the impact of this effect on model performance can exceed that of other factors (e.g., dataset size) that routinely attract attention from ML researchers and practitioners.

Recommended citation: Aparna Balagopalan, David Madras, David H Yang, Dylan Hadfield-Menell, Gillian K Hadfield, Marzyeh Ghassemi. Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data. Sci. Adv. 9, eabq0701 (2023). DOI:10.1126/sciadv.abq0701
Download Paper