9/26 | Shomik Jain (MIT IDSS) | As an AI Language Model, “Yes I Would Recommend Calling the Police”: Norm Inconsistency in LLM Decision-Making | |
| Aurora Zhang (MIT IDSS) | Structural Interventions and the Dynamics of Inequality | |
10/3 | Dr. Nasim Sonboli (Brown CS/DSI/CNTR) | The trade-off between data minimization and fairness in collaborative filtering | |
10/10 | Stephen Casper (MIT EECS) Stephen Casper is a PhD student at MIT advised by Dylan Hadfield-Menell. His research focuses on red-teaming and robustness in AI systems. He holds a BA in statistics from Harvard College. | Technical and sociotechnical evaluations of LLMs | Evaluations of large language model (LLM) capabilities are increasingly incorporated into AI safety and governance frameworks. However, a long list of technical, sociotechnical, and political challenges stands in the way of meaningful oversight from evaluations. This talk will focus on current challenges around access, tooling, and politics for evals. |
10/17 | Rui-Jie Yew (Brown CS/DSI/CNTR) Rui-Jie Yew is a PhD student at Brown advised by Suresh Venkatasubramanian. Her research lies at the intersection of computer science and law. She holds an SM from MIT and a joint BA in computer science and mathematics from Scripps College. | You Still See Me: How Data Protection Supports the Architecture of AI Surveillance | Data forms the backbone of artificial intelligence (AI). Privacy and data protection laws thus have a strong bearing on AI systems. Shielded by the rhetoric of compliance with data protection and privacy regulations, privacy-preserving techniques have enabled the extraction of more and new forms of data. We illustrate how the application of privacy-preserving techniques in the development of AI systems--from private set intersection as part of dataset curation to homomorphic encryption and federated learning as part of model computation--can further support surveillance infrastructure under the guise of regulatory permissibility. Finally, we propose technology and policy strategies to evaluate privacy-preserving techniques in light of the protections they actually confer. We conclude by highlighting the role that technologists could play in devising policies that combat surveillance AI technologies. |
10/24 | Dora Zhao (Stanford CS) Dora Zhao is a PhD student at Stanford co-advised by Michael Bernstein and Diyi Yang. Her research lies at the intersection of human-computer interaction and machine learning fairness. She holds an AB and MSE in computer science from Princeton University. | Encoding Human Values in Social Media Feed Ranking Algorithms | While social media feed rankings are primarily driven by engagement signals rather than any explicit value system, the resulting algorithmic feeds are not value-neutral: engagement may prioritize specific individualistic values. This paper presents an approach aiming to be more intentional about the values that feeds encode. We adopt Schwartz’s theory of Basic Human Values---a complete set of human values that articulates complementary and opposing values that form the basic building blocks of many cultures---and we implement an algorithmic approach that models and then ranks feeds by expressions of Schwartz’s values in social media posts. Our ranking approach enables controls where end users can express weights on their desired values, then combines these weights and post value expressions into a ranking that respects the users’ articulated trade-offs. Through controlled experiments (N=209 and N=352), we demonstrate that users can use these controls to architect feeds that reflect their desired values. (A minimal value-weighted ranking sketch appears after the schedule.) |
10/31 | Palak Jain (BU CS) Palak Jain is a PhD student at Boston University advised by Adam Smith. Their research uses the lenses of cryptography and differential privacy to design privacy-respecting systems and understand the downstream effects of those technologies on the individuals they intend to protect. | Enforcing Demographic Coherence: A Harms Aware Framework for Reasoning about Private Data Release | In our work, we introduce a new framework for reasoning about the privacy of large data releases; our framework is designed intentionally with socio-technical usability in mind. This talk will present our approach, which characterises the adversary as a predictive model and introduces the notion of "incoherent predictions" to capture potentially harmful inferences. Finally, I’ll explain what it means for a data curation algorithm to be "coherence enforcing" and briefly touch upon how some existing privacy tools can be used to achieve this notion. Based on joint work with Mark Bun, Marco Carmisino, Gabe Kaptchuk, and Satchit Sivakumar. |
11/7 | David Liu (Northeastern CS) David Liu is a CS PhD student at Northeastern advised by Tina Eliassi-Rad. His research interests lie at the intersection of graph machine learning, algorithmic fairness, and the societal impact of AI. He received a B.S.E. in computer science from Princeton University. | When Collaborative Filtering is not Collaborative: Unfairness of PCA for Recommendations | Collaborative-filtering recommender systems leverage low-rank approximations of high-dimensional user data to recommend relevant items to users. The low-rank approximation encodes latent group structure among the items, such as genres in the case of music and cuisines in the case of restaurants. Given that items often vary widely in their popularity, I will present work that addresses the question: do collaborative-filtering recommender systems disproportionately discard revealed preferences for low-popularity items, and if so, to what extent are recommendations for low-popularity items harmed? I will show that in the case of PCA, on common benchmark datasets, two unfairness mechanisms arise. First, the trailing, discarded PCA components characterize interest in less popular items. Second, the leading, preserved components specialize in individual popular items instead of capturing latent groupings. To address these limitations, I will then introduce Item-Weighted PCA, an algorithm for identifying principal components that re-weights less-popular items while preserving convexity. On benchmark datasets, Item-Weighted PCA improves the characterization of both popular and unpopular items. I will conclude the talk with a discussion about future work on fairness notions that focus less on equalizing performance among groups and more on inducing models that capture what makes groups different from each other. (An illustrative item-reweighting sketch appears after the schedule.) |
11/14 | Edgar Ramirez Sanchez (MIT Civil and Environmental Engineering) Edgar Ramirez Sanchez is a PhD student in Civil and Environmental Engineering at MIT advised by Cathy Wu. He holds an S.M. in Technology and Policy from MIT and a B.S. in Engineering Physics from the Tecnológico de Monterrey in Mexico. | A data-driven traffic reconstruction framework for identifying stop-and-go congestion on highways | Identifying stop-and-go events (SAGs) in traffic flow presents an important avenue for advancing data-driven research for climate change mitigation and sustainability, owing to their substantial impact on carbon emissions, travel time, fuel consumption, and roadway safety. In fact, SAGs are estimated to account for 33-50% of highway driving externalities. However, insufficient attention has been paid to precisely quantifying where, when, and to what extent these SAGs take place, which is necessary for downstream decision making, such as intervention design and policy analysis. A key challenge is that the data available to researchers and governments are typically sparse and aggregated to a granularity that obscures SAGs. To overcome such data limitations, this study explores the use of traffic reconstruction techniques for SAG identification. In particular, we introduce a kernel-based method for identifying spatio-temporal features in traffic and leverage bootstrapping to quantify the uncertainty of the reconstruction process. Experimental results on California highway data demonstrate the promise of the method for capturing SAGs. This work contributes to a foundation for data-driven decision making to advance the sustainability of traffic systems. (An illustrative kernel-reconstruction sketch appears after the schedule.) |
11/21 | Minseok Jung (MIT TPP) Minseok “Mason” Jung is a graduate student at MIT. He was selected as a Social and Ethical Responsibility of Computing (SERC) Scholar at the Schwarzman College of Computing and has conducted research on AI fairness, policy, and HCI. Mason works with Dr. Paul Pu Liang and Dr. Lalana Kagal. | Quantitative Insights into Language Model Usage and Trust in Academia: An Empirical Study | Language models (LMs) are revolutionizing knowledge retrieval and processing in academia. However, concerns regarding their misuse and erroneous outputs, such as hallucinations and fabrications, are reasons for distrust in LMs within academic communities. Consequently, there is a pressing need to deepen the understanding of how actual practitioners use and trust these models. Notable gaps remain in the quantitative evidence regarding the extent of LM usage, user trust in their outputs, and the issues to prioritize for real-world development. This study addresses these gaps by providing data and analysis of LM usage and trust. Specifically, our study surveyed 125 individuals at a private school and secured 88 data points after pre-processing. Through both quantitative analysis and qualitative evidence, we found significant variation in trust levels, which is strongly related to usage time and frequency. Additionally, we found through a polling process that fact-checking is the most critical issue limiting usage. These findings inform several actionable insights: distrust can be overcome by providing exposure to the models, policies should be developed that prioritize fact-checking, and user trust can be enhanced by increasing engagement. By addressing these critical gaps, this research not only adds to the understanding of user experiences and trust in LMs but also informs the development of more effective LMs. |
11/28 | Thanksgiving Recess | | |
12/5 | Princess Sampson (UPenn) | TBD | |
12/12 | Zoë Bell (UC Berkeley); Holiday Lunch / Fin | TBD | |
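
For the 10/24 talk on encoding human values in feed ranking, the snippet below is a minimal sketch of the general idea, not the authors' implementation. It assumes each post already carries scores for how strongly it expresses each Schwartz value (produced by some upstream classifier, which is an assumption here) and that the user supplies weights over those values; the feed is then ordered by the weighted sum.

```python
# Minimal sketch of value-weighted feed ranking (not the authors' implementation).
# Assumes each post has scores in [-1, 1] for how strongly it expresses each
# Schwartz value, and that the user supplies weights over those values.

from dataclasses import dataclass, field

SCHWARTZ_VALUES = [
    "self-direction", "stimulation", "hedonism", "achievement", "power",
    "security", "conformity", "tradition", "benevolence", "universalism",
]

@dataclass
class Post:
    post_id: str
    text: str
    value_scores: dict = field(default_factory=dict)  # value name -> expression score

def rank_feed(posts, user_weights):
    """Order posts by the weighted sum of their value-expression scores."""
    def score(post):
        return sum(
            user_weights.get(v, 0.0) * post.value_scores.get(v, 0.0)
            for v in SCHWARTZ_VALUES
        )
    return sorted(posts, key=score, reverse=True)

# Toy usage: the user upweights benevolence and downweights power.
posts = [
    Post("p1", "Volunteer day recap", {"benevolence": 0.9, "achievement": 0.1}),
    Post("p2", "My new sports car", {"power": 0.8, "hedonism": 0.6}),
]
weights = {"benevolence": 1.0, "power": -0.5}
for p in rank_feed(posts, weights):
    print(p.post_id)
```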
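For the 11/7 talk on Item-Weighted PCA, the snippet below is not the algorithm from the talk (which has its own convex formulation); it only illustrates the underlying intuition of reweighting item columns by inverse popularity before computing principal components with a plain SVD. The weighting scheme and the `alpha` parameter are assumptions for illustration.

```python
# Illustrative item-reweighting before PCA (not the Item-Weighted PCA algorithm
# from the talk). Less-popular item columns are upweighted before a truncated
# SVD, so the leading components are less dominated by a few popular items.

import numpy as np

def popularity_weights(ratings, alpha=0.5):
    """Weights that grow as item popularity (number of nonzero ratings) shrinks."""
    popularity = (ratings != 0).sum(axis=0).astype(float)
    return 1.0 / np.power(np.maximum(popularity, 1.0), alpha)

def reweighted_pca(ratings, n_components=10, alpha=0.5):
    """PCA on a user-by-item matrix with item columns rescaled by popularity."""
    w = popularity_weights(ratings, alpha)
    centered = ratings - ratings.mean(axis=0)
    X = centered * w                       # scale each item column
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt[:n_components]         # item-space principal directions
    return components, w

# Toy example: 6 users x 5 items with random integer ratings.
rng = np.random.default_rng(0)
R = rng.integers(0, 6, size=(6, 5)).astype(float)
components, w = reweighted_pca(R, n_components=2)
print(components.shape, w.round(2))
```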
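For the 11/14 talk on traffic reconstruction, the snippet below is a minimal sketch in the spirit of kernel-based reconstruction with bootstrapped uncertainty, not the method presented in the talk. Sparse speed measurements (position, time, speed) are smoothed onto a space-time grid with a Gaussian kernel, and the measurements are resampled with replacement for a rough per-cell uncertainty estimate; the kernel bandwidths and the synthetic data are assumptions.

```python
# Sketch of kernel-based traffic reconstruction with bootstrap uncertainty
# (an illustration, not the method from the talk). Speeds are smoothed onto a
# space-time grid; resampling gives a per-cell standard deviation.

import numpy as np

def reconstruct(xs, ts, vs, x_grid, t_grid, sigma_x=0.3, sigma_t=1.0):
    """Nadaraya-Watson estimate of speed on a space-time grid."""
    X, T = np.meshgrid(x_grid, t_grid, indexing="ij")
    field = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            w = np.exp(-((xs - X[i, j]) ** 2) / (2 * sigma_x**2)
                       - ((ts - T[i, j]) ** 2) / (2 * sigma_t**2))
            field[i, j] = np.sum(w * vs) / max(np.sum(w), 1e-9)
    return field

def bootstrap_std(xs, ts, vs, x_grid, t_grid, n_boot=20, seed=0):
    """Standard deviation of the reconstruction across bootstrap resamples."""
    rng = np.random.default_rng(seed)
    fields = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(vs), size=len(vs))
        fields.append(reconstruct(xs[idx], ts[idx], vs[idx], x_grid, t_grid))
    return np.std(fields, axis=0)

# Toy example: 200 noisy probe readings over 5 km and 30 min, slow region near 2.5 km.
rng = np.random.default_rng(1)
xs = rng.uniform(0, 5, 200); ts = rng.uniform(0, 30, 200)
vs = 90 - 40 * np.exp(-((xs - 2.5) ** 2) / 0.5) + rng.normal(0, 5, 200)
x_grid = np.linspace(0, 5, 25); t_grid = np.linspace(0, 30, 30)
speeds = reconstruct(xs, ts, vs, x_grid, t_grid)
uncertainty = bootstrap_std(xs, ts, vs, x_grid, t_grid)
print(speeds.shape, float(uncertainty.mean()))
```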