Rada Mihalcea and Zhijing Jin win two Best Paper Awards at NeurIPS 2024
Rada Mihalcea, Janice M. Jenkins Collegiate Professor of Computer Science and Engineering at the University of Michigan, and Zhijing Jin, a research associate in the Michigan AI lab with affiliations at ETH Zürich and the Max Planck Institute, received two Best Paper Awards at the 2024 Conference on Neural Information Processing Systems (NeurIPS). Their paper “Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias” received the Best Paper Award at the NeurIPS Workshop on Causality and Large Models, and their paper “Language Model Alignment in Multilingual Trolley Problems” won the Best Paper Award at the NeurIPS Workshop on Pluralistic Alignment.
A top international conference in machine learning and artificial intelligence (AI), NeurIPS brings together researchers from around the world to present the latest innovations and findings in these areas. This year’s conference was held December 10-15 in Vancouver, Canada. In addition to its main track, NeurIPS hosts numerous themed workshops that allow for more detailed discussion of specific topics within AI and neural information processing.
The Workshop on Causality and Large Models explored recent advances in assessing models’ causal knowledge and reasoning, as well as ways that causality can be used to improve the performance of large models. In “Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias,” Mihalcea, Jin, and their coauthors introduce a causal framework for measuring gender bias in large language models (LLMs) and propose the OCCUGENDER benchmark for assessing occupational gender bias. Their experiments reveal substantial gender bias in several open-source LLMs, and the authors discuss strategies for mitigating these biases.
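To give a flavor of what probing occupational gender bias can look like, here is a minimal illustrative sketch. It assumes the Hugging Face transformers library and uses GPT-2 purely as a stand-in model; the prompt template and pronoun comparison are inventions for illustration, not the OCCUGENDER benchmark or the paper’s causal framework.

```python
# Illustrative sketch: compare the probability a language model assigns
# to gendered pronouns immediately after an occupation prompt.
# NOT the OCCUGENDER benchmark; model and prompts are placeholder choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in open-source model, not one studied in the paper
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def pronoun_probs(occupation: str) -> dict:
    """Next-token probabilities of ' he' vs. ' she' after an occupation prompt."""
    prompt = f"The {occupation} said that"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    probs = torch.softmax(next_token_logits, dim=-1)
    # ' he' and ' she' are each single tokens in GPT-2's vocabulary
    return {p: probs[tokenizer.encode(f" {p}")[0]].item() for p in ("he", "she")}

for occupation in ("nurse", "engineer", "teacher", "mechanic"):
    p = pronoun_probs(occupation)
    print(f"{occupation:10s} he={p['he']:.3f} she={p['she']:.3f}")
```

A consistent skew toward one pronoun across occupations hints at the kind of bias the paper examines; the actual work goes further, framing the question causally rather than as simple co-occurrence statistics.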
The NeurIPS Workshop on Pluralistic Alignment focused on aligning language model outputs with diverse human ethical standards. Jin and Mihalcea’s award-winning paper, “Language Model Alignment in Multilingual Trolley Problems,” presented the MULTITP dataset, a collection of moral dilemma scenarios in over 100 languages based on the Moral Machine experiment. The study evaluated how well 19 different LLMs align with human moral preferences across languages and cultural contexts, revealing significant variation and underscoring the importance of incorporating diverse perspectives in AI ethics.
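As a rough illustration of this kind of evaluation, the sketch below presents the same trolley-style dilemma in several languages and scores a model’s answers against a human preference rate. Everything here is hypothetical: the dilemma wording, the preference rate, and the placeholder `ask_model` function are invented for illustration and are not the MULTITP dataset or the paper’s evaluation code.

```python
# Hypothetical sketch of measuring moral alignment across languages.
# Dilemmas, the human preference rate, and ask_model are invented stand-ins.

# Fraction of human respondents choosing to spare the larger group
# (an invented number, standing in for Moral Machine statistics).
HUMAN_PREF_SPARE_MORE = 0.80

DILEMMA = {
    "en": "A runaway trolley will hit five people unless diverted onto a side "
          "track, where it will hit one person. Divert it? Answer yes or no.",
    "de": "Ein führerloser Zug wird fünf Menschen treffen, wenn er nicht auf ein "
          "Nebengleis umgeleitet wird, wo er eine Person trifft. Umleiten? Ja oder nein.",
    "zh": "一辆失控的电车将撞上五个人，除非将其转向侧轨，那里会撞上一个人。要转向吗？请回答是或否。",
}

def ask_model(prompt: str) -> bool:
    """Placeholder for an LLM call: returns True if the model chooses to divert.
    A real evaluation would query the model and parse its answer."""
    raise NotImplementedError

def alignment_score(answers: dict[str, bool]) -> dict[str, float]:
    """Per-language agreement with the (invented) human preference rate."""
    return {
        lang: 1.0 - abs(float(divert) - HUMAN_PREF_SPARE_MORE)
        for lang, divert in answers.items()
    }

# Example: if the model diverts in English and Chinese but not in German,
# its German behavior sits further from the human preference rate.
print(alignment_score({"en": True, "de": False, "zh": True}))
```

The actual study aggregates judgments over many dilemmas, more than 100 languages, and 19 models, which is what lets it surface the cross-lingual variation described above.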
By deepening our understanding of bias and ethical alignment in LLMs, Mihalcea and Jin’s contributions provide valuable guidance for developing more robust, fair, and ethical AI models.