Projects

Personalized surveys and experience sampling
How can we personalize survey questions? Personalized user models require high-quality human data from real-world surveys. Built ML-based methods to optimize longitudinal survey data collection, improving accuracy while minimizing user burden. Conducted two large-scale studies: the first used mixed-effects models to predict non-response biases in experience sampling; the second deployed Bayesian networks and ML to dynamically adapt questions, personalizing surveys for each user.

Modeling developer preferences
What do developers care about the most? Built discrete choice models to understand developer preferences for MongoDB Atlas via large-scale MaxDiff surveys. These surveys captured trade-offs between cost and performance features, revealing distinct preferences between novice and experienced developers. Insights guided the prioritization of A/B experiments for personalized developer experiences, resulting in a ~XX% improvement in KPIs.
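As a minimal sketch of how MaxDiff survey responses can be turned into a preference ranking, the snippet below computes simple best-worst counting scores. The actual analysis used discrete choice models (e.g., multinomial logit), and the feature names and responses here are hypothetical:

```python
from collections import Counter

# Hypothetical MaxDiff trials: each shows a set of features and records which
# one the respondent picked as most ("best") and least ("worst") important.
responses = [
    {"shown": ["low cost", "query speed", "autoscaling", "backups"],
     "best": "query speed", "worst": "backups"},
    {"shown": ["low cost", "query speed", "monitoring", "backups"],
     "best": "low cost", "worst": "monitoring"},
    {"shown": ["autoscaling", "query speed", "monitoring", "low cost"],
     "best": "query speed", "worst": "monitoring"},
]

best, worst, shown = Counter(), Counter(), Counter()
for r in responses:
    best[r["best"]] += 1
    worst[r["worst"]] += 1
    shown.update(r["shown"])

# Best-worst score: (times chosen best - times chosen worst) / times shown.
scores = {f: (best[f] - worst[f]) / shown[f] for f in shown}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0])  # the feature respondents traded off most strongly in favor of
```

Counting scores like these correlate closely with full choice-model utilities on balanced designs, which makes them a useful sanity check before fitting the logit model.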

UX metrics for developer experiences
How do we measure the success of design changes? Created original metrics to assess MongoDB Atlas experience at scale, including developer sentiment, engagement, and task success. These metrics evaluated design changes across observability, identity & access management, and billing interfaces, establishing a scalable data-driven framework for cross-functional teams.

User modeling of new music listening
How do users stream new music? Developed ML models predicting user interactions with new music recommendations, including genre success forecasting, taste classification, and user segmentation. Key findings: genre popularity drives repeat listens, users have distinct tastes for new vs. older music, new-music listening diets stay consistent over time, and certain user segments are especially receptive to new recommendations. These insights informed A/B tests on Spotify Home, achieving ~XX% improvements in music discovery. Read more

Trending topics for podcast growth
How can smaller podcasters grow their audience? Developed a novel algorithm identifying trending topics via user behavior logs from Spotify searches and Wikipedia page views to amplify long-tail podcasts. Demonstrated LLM text-embeddings’ superiority over zero-shot prompting in terms of speed, accuracy, and cost when finding the most relevant podcast episodes for trending topics. Observational analysis confirmed that trending topics drove podcaster growth and user engagement.
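To illustrate the embedding-based matching step, here is a small sketch that ranks episodes by cosine similarity between a trending-topic embedding and episode-description embeddings. The vectors and episode names are toy stand-ins; in practice they would come from an LLM embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for LLM text embeddings (real vectors are high-dimensional).
topic_vec = [0.9, 0.1, 0.3]
episodes = {
    "ep_ai_breakthrough": [0.8, 0.2, 0.4],
    "ep_cooking_basics":  [0.1, 0.9, 0.2],
    "ep_ml_hardware":     [0.7, 0.1, 0.5],
}

# Rank episodes by similarity to the trending-topic embedding.
ranked = sorted(episodes, key=lambda e: cosine(topic_vec, episodes[e]),
                reverse=True)
print(ranked[0])
```

Precomputing episode embeddings once and scoring new topics against them is what makes this approach faster and cheaper than prompting an LLM per topic-episode pair.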

Podcast discovery via goal-setting
How does user input improve discovery? Created an interactive podcast recommender (inspired by Strava/Duolingo) where users set goals to receive personalized podcast recommendations. Used large-scale event-triggered surveys and search queries to identify user needs, revealing a gap between long-term goals (e.g., learning) and short-term entertainment needs. Real-world evaluation showed that goal-setting helped users discover new podcasts. Read more

Measuring search user frustration
Can we measure search user experience beyond success? Developed a novel UX metric for Spotify search frustration by collecting large-scale session-level labels via event-triggered surveys. Built an ML model predicting frustration using search interactions, revealing two key insights: frustration is session-dependent rather than user-history-based, and query edit distance is the strongest predictor. These findings enabled real-time frustration measurement in online experiments to improve the search experience.
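Since query edit distance was the strongest predictor, here is a minimal sketch of that feature: Levenshtein distance between consecutive queries in a session. The session queries are hypothetical, and this shows only the feature computation, not the full frustration model:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# Hypothetical search session: successive query reformulations.
session = ["beatlez", "beatles", "the beatles abbey road"]

# Feature: edit distance between each pair of consecutive queries.
edits = [levenshtein(q1, q2) for q1, q2 in zip(session, session[1:])]
print(edits)
```

A small distance suggests a typo fix, while a large one suggests the user is rewriting the query from scratch, a plausible reason the feature separates frustrated from satisfied sessions.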

MicroEMA: Surveys with smartwatch microinteractions
How can we sustain user participation in large-scale repeated surveys? Built a smartwatch-based experience sampling method replacing traditional surveys with microinteractions (e.g., glance-like responses with a single question and a binary response set) to collect high-frequency, real-world data. Unlike burdensome multi-question surveys, MicroEMA’s quick interactions achieved 2X response rates in longitudinal field experiments while reducing user burden, especially in large-scale human data collection to train pattern recognition algorithms. Read more

Signaligner Pro: AI-assisted sensor data labeling
How can we make data labeling seamless? Created Signaligner Pro, an open-source tool for AI-assisted labeling of wearable sensor data, enabling efficient annotation of activities like walking/running. It combines manual precision with AI automation to handle large datasets, addressing the need for high-quality labels in activity-recognition algorithms. Now adopted by multiple NIH-funded studies, the tool accelerates the development of robust AI models through human-AI collaboration. Try it here

MixWILD: Interactive mixed-effects modeling
What if we could create models without code? Created an open-source tool for mixed-effects modeling of intensive longitudinal data (e.g., real-world experience sampling surveys). It provides a no-code interface, enabling researchers without programming expertise to perform complex multilevel analyses. Now adopted in conferences and workshops, MixWILD makes advanced statistical modeling more accessible. Try it here

Video games for human data labeling
Can video games help train AI? Created video games to crowdsource sensor data labeling for AI activity recognition, addressing tedious manual annotation. Tested how casual players without AI expertise could generate high-quality labels through gameplay. Experiments revealed that puzzle-based games outperformed endless runner-type games in label accuracy and gamer engagement. VentureBeat coverage