NeurIPS award-winning paper on diverse feedback for LLMs with the University of Oxford

We’re excited to share that our paper on collecting diverse human feedback on LLMs earned recognition at NeurIPS 2024, one of the world’s premier machine learning and computational research conferences. The PRISM Alignment Project, led by Hannah Rose Kirk at the University of Oxford, received the best paper award in the datasets and benchmarks track at the conference. Dr. Scott A. Hale, Meedan’s Director of Research, was a major contributor to the paper.

Human feedback plays an important role in teaching large language models what responses, content, and behaviors are expected of them. Our paper’s findings demonstrate that it matters who is asked to give feedback on LLMs. For an LLM to learn what sorts of responses humans “prefer,” it is necessary to seek feedback from a diverse population.

As the project’s webpage notes:

In the early days of human feedback learning in AI systems, data was collected from a narrow and unrepresentative set of crowdworkers. This raises concerns about the potential impact of limited voices steering language models that are now used by hundreds of millions of people around the world.

To combat these concerns, we've collected diverse and disaggregated feedback from 1,500 participants born in 75 countries, including census-representative samples from the UK and the US. Our participants converse with over 20 LLMs in real-time, giving rich signals on each response.

With this data, we aim to provide insights into how humans differ in their interactions with large language models across different sociocultural contexts.

Here’s the abstract:

Human feedback is central to the alignment of Large Language Models (LLMs). However, open questions remain about methods (how), domains (where), people (who) and objectives (to what end) of feedback processes. To navigate these questions, we introduce PRISM, a dataset that maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries, to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 LLMs. With PRISM, we contribute (i) wider geographic and demographic participation in feedback; (ii) census-representative samples for two countries (UK, US); and (iii) individualised ratings that link to detailed participant profiles, permitting personalisation and attribution of sample artefacts. We target subjective and multicultural perspectives on value-laden and controversial issues, where we expect interpersonal and cross-cultural disagreement. We use PRISM in three case studies to demonstrate the need for careful consideration of which humans provide what alignment data.

We congratulate all of the authors — Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew Bean, Katerina Margatina, Juan Ciro, Rafael Mosquera, Max Bartolo, Adina Williams, He He, Bertie Vidgen, and Scott A. Hale — for their hard work on this important issue.

Read the paper

Download the data
