Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data

Published in Science Advances, 2023

Recommended citation: Aparna Balagopalan, David Madras, David H Yang, Dylan Hadfield-Menell, Gillian K Hadfield, Marzyeh Ghassemi. Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data. Sci. Adv. 9, eabq0701 (2023). DOI:10.1126/sciadv.abq0701