Jifan Zhang
I am a Ph.D. candidate in computer science at the University of Wisconsin, working with Robert Nowak. I obtained my M.S. and B.S. degrees in computer science from the University of Washington, where I was fortunate to be advised by Kevin Jamieson, Lalit Jain, Tanner Schmidt, Dieter Fox, and Zachary Tatlock. My research spans both the applied and theoretical sides of machine learning, primarily along the following axes.
Efficient Distillation of Black Box Intelligence
Many of the most capable sources of intelligence, including humans, closed-source large language models (LLMs), and domain-specific tools protected by proprietary IP, remain black boxes today. Accessing these sources of intelligence at scale can be prohibitively expensive. My research on Label-Efficient Learning explores the most efficient strategies for querying these expensive intelligence sources and selecting the most informative data to train language and vision models that effectively mimic the queried answers.
By focusing on Active Learning, Semi-Supervised Learning, and Transfer Learning, my research aims to democratize access to powerful AI systems, enabling more people to leverage state-of-the-art intelligence without incurring exorbitant costs. For an overview of the latest Label-Efficient Learning research from my collaborators and me, check out LabelTrain.ai.
Humor Generation and Alignment of LLMs
Humor, a complex and subjective human trait, poses significant challenges for today's large language models (LLMs). While state-of-the-art LLMs have made notable progress, my research has shown that professional humor writers still clearly prefer the best human-written content over AI-generated humor. This preference highlights the intricate nature of human creativity, where the most effective creative outputs are not merely divergent thoughts but a series of out-of-the-box reasoning steps.
My research aims to help LLMs reach expert-level ability in humor generation on multiple fronts. This includes gamified crowdsourcing of humor data to capture human humor preferences and developing novel algorithms for aligning multi-step reasoning.
[Curriculum Vitae] [Twitter] [Google Scholar] [GitHub]
News and Talks
- [October 2024] Talk at Foundation of AI Seminar at Georgia Tech about distillation of black-box intelligences.
- [October 2024] Lecture at CS 8803 at Georgia Tech about humor in AI and the New Yorker Caption Contest.
- [October 2024] Talk at Machine Learning Lunch Meeting (MLLM) at UW-Madison about humor in AI and the New Yorker Caption Contest.
- [September 2024] Steve Mussmann, Rob Nowak, and I are teaching a brand-new course on Data-Centric Machine Learning in parallel at UW-Madison and Georgia Tech.
- [July 2024] Talk at University of Washington about distillation of black-box intelligences.
- [July 2024] Talk at Medtronic about LabelBench and Imbalanced Active Learning.
- [June 2024] Talk at Summer SILO about distillation of black-box intelligences.
- [March 2024] Check out LabelTrain.ai for our effort on Label-Efficient Learning research including LabelBench, label-efficient SFT of LLMs, TAILOR and DIRECT.
- [November 2023] Talk at IFDS@University of Washington about LabelBench and benchmark overfitting in active learning research.
- [November 2023] Talk at ETH Zurich about LabelBench and benchmark overfitting in active learning research.
- [November 2023] Talk at Meta about LabelBench.
- [October 2023] Talk at MLOPT Idea Seminar.
- [August 2022] My internship project (among two other projects) received an internal shoutout from Mark Zuckerberg at Meta.
- [June 2022] Talk at Meta about GALAXY.
- [April 2022] Talk at IFDS about GALAXY.
- [September 2021] I will be joining the University of Wisconsin–Madison as a Computer Science Ph.D. student! Google search is now very confused about the word “UW”.