[LOGOS] Fri Jan 17, 2pm, Yue Dong, Exploring Political, Social, and Cultural Biases in Pretrained Language Models

Emiliano De Cristofaro emilianodc at cs.ucr.edu
Fri Jan 10 18:04:18 PST 2025


Dear LOGOrithms,

First of all, Happy New Year! To start the year strong, we will have a talk
by Yue on Friday, January 17th, at 2pm in WCH 203. Please see details below.

If you can't join in person, here's the Zoom link
<https://ucr.zoom.us/j/98664053204?pwd=quPIPmylgJjHap4VkzPnaaVKk1ndi6.1>.

Cheers,
Emiliano


TITLE
Exploring Political, Social, and Cultural Biases in Pretrained Language
Models

ABSTRACT
In this talk, I will discuss two interesting papers in NLP that examine the
political, social, and cultural biases inherent in pretrained large
language models. These biases raise important questions about potential
harms and safety implications, which I hope to brainstorm with the group.
The focus will be on the following works:
1. "From Pretraining Data to Language Models to Downstream Tasks: Tracking
the Trails of Political Biases Leading to Unfair NLP Models" (ACL 2023 Best
Paper)
2. "Whose Opinions Do Language Models Reflect?"  (Santurkar, Shibani, Esin
Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, and Tatsunori Hashimoto.
ICML 2023)

BIO
Yue Dong (yuedong.us) is an Assistant Professor in the Department of
Computer Science and Engineering at the University of California,
Riverside. Her primary research interests lie in trustworthy NLP, with a
focus on LLM safety, hallucination reduction, and AI detection in
conditional text generation tasks such as summarization. Her recent work
focuses on red-teaming Large Language Models (LLMs), studying adversarial
attacks, safety alignment, and in-context vulnerabilities across generative
models including LLMs, Vision-Language Models (VLMs), and Stable Diffusion.
Her recent research on VLM adversarial attacks earned the Best Paper Award
at the SoCal NLP Symposium and was spotlighted at ICLR 2024. Additionally,
she has served as Senior Area Chair for top-tier NLP conferences such as
NAACL 2024 & 2025 and EMNLP 2024, and as Area Chair for ICLR 2025, ACL
2023 & 2024, and EMNLP 2022 & 2023. She has also co-organized workshops
and tutorials at prestigious conferences, including EMNLP 2021 and 2023
(Summarization), NeurIPS 2021–2024 (LLM Efficiency), and NAACL 2022
(Text Editing).