Projects

Personalized surveys and experience sampling

How can we personalize survey question selection in real time? Large user behavior models need high-quality human data, including surveys. Built ML-based methods to personalize longitudinal survey data collection, improving accuracy while minimizing user burden. Conducted two large-scale studies: first, used mixed-effects models to predict non-response biases in experience-sampling surveys; second, used Bayesian networks and ML to personalize survey question selection in real time for each user, choosing questions by expected information gain (reduction in entropy).
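
The selection step can be illustrated with a minimal sketch: given a discrete posterior over a latent user state, ask the candidate question whose answer is expected to reduce posterior entropy the most. This is an illustration under assumed answer likelihoods, not the study's implementation; the state space, probability tables, and names are hypothetical.

```python
# Minimal sketch of entropy-based question selection. Assumes a discrete
# latent user state and known answer likelihoods P(answer | state, question);
# all tables below are hypothetical toy values.
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_info_gain(prior, likelihood):
    """likelihood[a, s] = P(answer a | state s) for one candidate question."""
    gain = entropy(prior)
    for a in range(likelihood.shape[0]):
        p_a = likelihood[a] @ prior                   # P(answer a)
        if p_a > 0:
            posterior = likelihood[a] * prior / p_a   # Bayes update
            gain -= p_a * entropy(posterior)          # expected residual entropy
    return gain

# Toy example: 3 latent states, 2 candidate yes/no questions.
prior = np.array([0.5, 0.3, 0.2])
questions = {
    "q_mood":  np.array([[0.9, 0.2, 0.5], [0.1, 0.8, 0.5]]),
    "q_sleep": np.array([[0.6, 0.5, 0.4], [0.4, 0.5, 0.6]]),
}
best = max(questions, key=lambda q: expected_info_gain(prior, questions[q]))
print("ask next:", best)
```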

Modeling developer preferences

What do developers care about the most? Built discrete choice models to understand developer preferences for MongoDB Atlas via large-scale MaxDiff surveys. These surveys captured trade-offs between cost and performance features, revealing distinct preferences between novice and experienced developers. Insights guided the prioritization of A/B experiments for personalized developer experiences, resulting in a ~XX% improvement in KPIs.
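
A common first-pass analysis for MaxDiff data is best-worst counting, sketched below on hypothetical choice tasks: each item is scored by (times chosen best minus times chosen worst) over times shown. The project itself fit discrete choice models, which this simple tally only approximates; the feature names are made up.

```python
# Minimal sketch of a MaxDiff "counting" analysis on hypothetical choice tasks.
from collections import Counter

tasks = [  # each task: items shown, item picked best, item picked worst
    {"shown": ["cost", "latency", "backup", "security"], "best": "latency", "worst": "backup"},
    {"shown": ["cost", "latency", "scaling", "security"], "best": "security", "worst": "cost"},
    {"shown": ["cost", "backup", "scaling", "security"], "best": "scaling", "worst": "backup"},
]

best, worst, shown = Counter(), Counter(), Counter()
for t in tasks:
    shown.update(t["shown"])
    best[t["best"]] += 1
    worst[t["worst"]] += 1

# Score in [-1, 1]: +1 = always picked best when shown, -1 = always worst.
scores = {item: (best[item] - worst[item]) / shown[item] for item in shown}
for item, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{item:10s} {s:+.2f}")
```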

UX metrics for developer experiences

How do we measure the success of design changes? Created original metrics to assess the MongoDB Atlas experience at scale, including developer sentiment, engagement, and task success. These metrics were used to evaluate design changes across observability, identity & access management, and billing interfaces, establishing a scalable, data-driven framework for cross-functional teams.
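
As one illustration, a session-level task success metric can be computed directly from event logs. The sketch below assumes hypothetical "task_started" / "task_completed" events; the actual Atlas metric definitions are not reproduced here.

```python
# Minimal sketch of a task success metric from hypothetical event logs.
from collections import defaultdict

events = [  # (session_id, event)
    ("s1", "task_started"), ("s1", "task_completed"),
    ("s2", "task_started"),
    ("s3", "task_started"), ("s3", "task_completed"),
]

started, completed = defaultdict(int), defaultdict(int)
for session, event in events:
    if event == "task_started":
        started[session] += 1
    elif event == "task_completed":
        completed[session] += 1

success_rate = sum(completed.values()) / sum(started.values())
print(f"task success rate: {success_rate:.0%}")  # 2 of 3 tasks completed
```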

User modeling of new music listening

How do users stream new music? Developed ML models predicting user interactions with new music recommendations, including genre success forecasting, taste classification, and user segmentation. Key findings: genre popularity drives repeat listens; users have distinct tastes for new vs. older music; new-music listening diets stay consistent over time; and certain user segments are especially receptive to new recommendations. These insights informed A/B tests on Spotify Home, achieving ~XX% improvements in music discovery. Read more
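
The segmentation step can be sketched as clustering users on per-user listening features. The features below (new-release share, repeat-listen rate, genre diversity) and the synthetic data are hypothetical stand-ins, not the production feature set.

```python
# Minimal sketch of receptiveness segmentation with k-means on hypothetical
# per-user new-music features.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# columns: [new_release_share, repeat_listen_rate, genre_diversity]
X = rng.random((500, 3))

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
for k, center in enumerate(kmeans.cluster_centers_):
    size = int(np.sum(kmeans.labels_ == k))
    print(f"segment {k}: size={size}, center={center.round(2)}")
```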

Trending topics for podcast growth

How can smaller podcasters grow their audience? Developed a novel algorithm that identifies trending topics from user behavior logs (Spotify searches and Wikipedia page views) to amplify long-tail podcasts. Demonstrated that LLM text embeddings beat zero-shot prompting on speed, accuracy, and cost when matching trending topics to the most relevant podcast episodes. Observational analysis confirmed that trending topics drove podcaster growth and user engagement.
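
The embedding-based matching reduces to a nearest-neighbor search: embed topics and episode text, then rank episodes by cosine similarity. In the sketch below, `embed` is a stand-in for any sentence-embedding model (here just deterministic random vectors, so the printed ranking is arbitrary); the titles are invented.

```python
# Minimal sketch of matching a trending topic to podcast episodes by
# text-embedding cosine similarity. `embed` is a hypothetical placeholder.
import numpy as np

def embed(texts, dim=8, seed=0):
    # Placeholder: deterministic random vectors instead of a real model.
    rng = np.random.default_rng(seed)
    return {t: rng.random(dim) for t in texts}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

episodes = ["Deep-sea mining explained", "History of the Oscars", "Fermentation at home"]
vecs = embed(episodes + ["ocean mining debate"])
topic = vecs["ocean mining debate"]

ranked = sorted(episodes, key=lambda e: cosine(vecs[e], topic), reverse=True)
print("most relevant episodes:", ranked)
```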

Podcast discovery via goal-setting

How does user input improve discovery? Built an interactive AI prototype (inspired by Strava/Duolingo) where users set goals to receive personalized podcast recommendations. Used large-scale event-triggered surveys and search queries to identify user needs, revealing a gap between long-term goals (e.g., learning) and short-term entertainment needs. Real-world evaluation showed that goal-setting helped users discover new podcasts. Read more

Measuring search user frustration

Can we predict user frustration with search? Developed a novel UX metric for Spotify search frustration by collecting large-scale session-level labels via event-triggered surveys. Built an ML model predicting frustration using search interactions, revealing two key insights: frustration is session-dependent rather than user-history-based, and query edit distance is the strongest predictor. These findings enabled real-time frustration measurement in online experiments to improve the search experience.
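
Since query edit distance was the strongest predictor, the core feature can be sketched directly: Levenshtein distance between consecutive queries in a session. The session data below is hypothetical, and the production model combined this signal with other session-level interaction features.

```python
# Minimal sketch of the query edit-distance signal within a search session.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

session = ["beatels", "beatles", "the beatles abbey road"]
edits = [levenshtein(q1, q2) for q1, q2 in zip(session, session[1:])]
print("per-reformulation edit distances:", edits)  # small edits suggest typo fixes
```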

MicroEMA: Surveys with smartwatch microinteractions

How can we enable high-density survey data collection? Developed a smartwatch-based experience sampling method that replaces traditional surveys with microinteractions (glance-like responses to a single question with a binary answer set) to collect high-frequency, real-world data. Compared with burdensome multi-question surveys, MicroEMA's quick interactions achieved twice the response rate in longitudinal field experiments while reducing user burden, making it well suited to large-scale human data collection for training pattern-recognition algorithms. Read more

Signaligner Pro: AI-assisted sensor data labeling

How can we make data labeling seamless? Created Signaligner Pro, an open-source tool for AI-assisted labeling of wearable sensor data, enabling efficient annotation of activities like walking and running. It combines manual precision with AI automation to handle large datasets, addressing the need for high-quality labels in activity-recognition algorithms. Now adopted by multiple NIH-funded studies, the tool accelerates the development of robust AI models through human-AI collaboration. Try it here
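
The human-AI division of labor can be sketched as a confidence-gated loop: a model proposes labels for sensor windows, and only low-confidence windows are routed to a human. The classifier, threshold, and data below are hypothetical stand-ins, not Signaligner's implementation.

```python
# Minimal sketch of AI-assisted labeling with a confidence gate.
import numpy as np

def model_predict(window):
    # Placeholder classifier: returns (label, confidence in [0, 1]).
    score = float(np.mean(window))
    return ("walking" if score > 0.5 else "running", abs(score - 0.5) * 2)

rng = np.random.default_rng(0)
windows = [rng.random(50) for _ in range(5)]  # fake accelerometer windows

for i, w in enumerate(windows):
    label, conf = model_predict(w)
    if conf >= 0.05:  # hypothetical threshold
        print(f"window {i}: auto-labeled '{label}' (confidence {conf:.2f})")
    else:
        print(f"window {i}: flagged for manual review (confidence {conf:.2f})")
```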

MixWILD: Interactive mixed-effects modeling

What if we could create statistical models without code? Created an open-source tool for mixed-effects modeling of intensive longitudinal data (e.g., real-world experience sampling surveys). It provides a no-code interface, enabling researchers without programming expertise to perform complex multilevel analyses. Now used in conference workshops and tutorials, MixWILD makes advanced statistical modeling more accessible. Try it here
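
For readers who do code, the kind of analysis MixWILD exposes can be sketched as a two-level model with a random intercept per subject. The sketch below uses statsmodels on synthetic experience-sampling data; the variable names and effect sizes are hypothetical.

```python
# Minimal sketch of a random-intercept mixed-effects model on synthetic
# experience-sampling data (repeated "mood" reports nested within subjects).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subjects, n_obs = 30, 20
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subjects), n_obs),
    "stress": rng.random(n_subjects * n_obs),
})
subject_effect = rng.normal(0, 1, n_subjects)[df["subject"]]
df["mood"] = 2.0 - 1.5 * df["stress"] + subject_effect + rng.normal(0, 0.5, len(df))

model = smf.mixedlm("mood ~ stress", df, groups=df["subject"]).fit()
print(model.summary())  # fixed effect of stress plus between-subject variance
```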

Video games for human data labeling

Can video games help train AI? Created video games to crowdsource sensor data labeling for AI activity recognition, addressing the tedium of manual annotation. Tested whether casual players without AI expertise could generate high-quality labels through gameplay. Experiments revealed that puzzle-based games outperformed endless-runner games in both label accuracy and player engagement. VentureBeat coverage