AudioBoost: Enhancing Audiobook Discovery in Spotify Search via LLM-Generated Synthetic Queries

Research on using Large Language Models to generate synthetic queries for improving audiobook retrievability in Spotify's search system, addressing cold-start challenges through query auto-completion and retrieval enhancement.

1. Introduction

Spotify's introduction of audiobooks created a significant cold-start problem where new content suffers from low retrievability compared to established music and podcast offerings. The AudioBoost system addresses this challenge by leveraging Large Language Models to generate synthetic queries that enhance both query formulation and retrieval capabilities.

Key Performance Metrics

  • Audiobook Impressions: +0.7%
  • Audiobook Clicks: +1.22%
  • Exploratory Query Completions: +1.82%

2. Methodology

2.1 Synthetic Query Generation

AudioBoost uses LLMs conditioned on audiobook metadata to generate diverse exploratory queries covering topics, genres, story tropes, and decades. The generation process follows a structured prompt engineering approach to ensure query quality and relevance.

2.2 Query Auto-Completion Integration

Synthetic queries are integrated into Spotify's Query Auto-Completion system to inspire users to type more exploratory queries, addressing the vocabulary mismatch between user search behavior and audiobook content.
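
As a minimal sketch of how synthetic queries might be blended into auto-completion candidates, the snippet below does simple prefix matching over organic and synthetic query pools. The function name `complete`, the example queries and scores, and the `synthetic_weight` down-weighting are illustrative assumptions, not details of Spotify's QAC system.

```python
from typing import Dict, List


def complete(prefix: str,
             organic: Dict[str, float],
             synthetic: Dict[str, float],
             k: int = 5,
             synthetic_weight: float = 0.5) -> List[str]:
    """Return up to k completions for prefix from organic and synthetic pools."""
    prefix = prefix.lower().strip()
    scores: Dict[str, float] = {}
    for query, score in organic.items():
        q = query.lower()
        if q.startswith(prefix):
            scores[q] = score
    for query, score in synthetic.items():
        q = query.lower()
        if q.startswith(prefix):
            # Assumption: synthetic candidates are down-weighted so they
            # complement, rather than displace, popular organic queries.
            scores[q] = max(scores.get(q, 0.0), synthetic_weight * score)
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda q: (-scores[q], q))
    return ranked[:k]


# Hypothetical query pools with made-up popularity scores.
organic = {"harry potter": 0.9, "happy songs": 0.8}
synthetic = {"haunted house audiobooks": 0.7, "hard sci-fi from the 1980s": 0.6}

suggestions = complete("ha", organic, synthetic)
print(suggestions)
```

In this toy setup the exploratory synthetic queries appear below the organic ones but still reach the suggestion list, which is the effect the integration is meant to achieve.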

2.3 Retrieval System Enhancement

The generated queries are indexed in Spotify's search retrieval engine, creating additional pathways for audiobooks to be discovered through broader, topic-based searches rather than just exact title matches.
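
The idea can be sketched with a toy inverted index in which each audiobook is indexed under its title plus its synthetic queries. Everything here (the `build_index` and `search` helpers, the book title, and the queries) is hypothetical and far simpler than a production retrieval engine.

```python
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def build_index(docs: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each token to the ids of documents whose indexed texts contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, texts in docs.items():
        for text in texts:
            for token in tokenize(text):
                index[token].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return ids of documents that match every query token (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical audiobook and synthetic queries.
titles = {"book_1": ["The Glass Orchard"]}
synthetic = {"book_1": ["psychological thriller with a twist",
                        "unreliable narrator mystery"]}

title_only = build_index(titles)
augmented = build_index(
    {doc: titles[doc] + synthetic.get(doc, []) for doc in titles}
)

print(search(title_only, "psychological thriller"))  # set()
print(search(augmented, "psychological thriller"))   # {'book_1'}
```

With the title alone, a topic-based search finds nothing; once the synthetic queries are indexed alongside it, the same search reaches the audiobook.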

3. Technical Implementation

3.1 Mathematical Framework

The retrievability improvement can be modeled as a softmax over similarity scores: $P(r|q,d) = \frac{\exp(\text{sim}(q,d))}{\sum_{d' \in D} \exp(\text{sim}(q,d'))}$, where $q$ is a query, $d$ is a document, $D$ is the set of indexed documents, $r$ denotes retrieval of $d$, and $\text{sim}$ is the similarity function. Under this model, synthetic query generation aims to maximize $\sum_{q \in Q_{\text{syn}}} P(r|q,d_{\text{audiobook}})$, where $Q_{\text{syn}}$ is the set of synthetic queries.
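
To make the formula concrete, the following sketch evaluates the softmax for a toy corpus of three documents and sums the resulting probabilities over two queries. The similarity scores and query strings are made up for illustration, and the helper names are assumptions.

```python
import math
from typing import Dict


def retrieval_probabilities(sims: Dict[str, float]) -> Dict[str, float]:
    """Softmax over sim(q, d') for every document d' in D, for one query q."""
    max_sim = max(sims.values())  # subtract the max for numerical stability
    exps = {d: math.exp(s - max_sim) for d, s in sims.items()}
    total = sum(exps.values())
    return {d: e / total for d, e in exps.items()}


def retrievability(doc_id: str, sims_per_query: Dict[str, Dict[str, float]]) -> float:
    """Sum of P(r | q, d) over a set of queries for a single document d."""
    return sum(
        retrieval_probabilities(sims)[doc_id] for sims in sims_per_query.values()
    )


# Made-up similarity scores sim(q, d) for two synthetic queries.
sims_per_query = {
    "cozy mystery audiobooks": {"audiobook": 2.0, "podcast": 1.0, "track": 0.0},
    "1920s detective stories": {"audiobook": 1.5, "podcast": 1.5, "track": 0.5},
}

probs = retrieval_probabilities(sims_per_query["cozy mystery audiobooks"])
score = retrievability("audiobook", sims_per_query)
print(probs, score)
```

Each additional synthetic query that is similar to the audiobook adds a positive term to the sum, which is why a larger and better-targeted $Q_{\text{syn}}$ raises the objective.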

3.2 Code Implementation

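class AudioBoostQueryGenerator:
    """Illustrative sketch: generates synthetic queries from audiobook metadata."""

    def __init__(self, llm_model, metadata_fields):
        # llm_model is assumed to expose generate(prompt, max_tokens,
        # num_return_sequences) and to return a list of strings.
        self.llm = llm_model
        self.fields = metadata_fields

    def generate_queries(self, audiobook_data, num_queries=10):
        prompt = self._construct_prompt(audiobook_data)
        synthetic_queries = self.llm.generate(
            prompt=prompt,
            max_tokens=50,
            num_return_sequences=num_queries,
        )
        return self._filter_queries(synthetic_queries)

    def _construct_prompt(self, data):
        # Join lines explicitly so source indentation does not leak into the prompt.
        return "\n".join([
            "Generate diverse search queries for audiobook:",
            f"Title: {data['title']}",
            f"Author: {data['author']}",
            f"Genre: {data['genre']}",
            f"Themes: {data['themes']}",
            "Generate exploratory queries about topics, similar books, mood:",
        ])

    def _filter_queries(self, queries):
        # Assumed minimal filter (not specified in the source): normalize
        # whitespace and case, then drop empty strings and duplicates.
        seen, kept = set(), []
        for query in queries:
            normalized = " ".join(query.split()).lower()
            if normalized and normalized not in seen:
                seen.add(normalized)
                kept.append(normalized)
        return kept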

4. Experimental Results

4.1 Offline Evaluation

The offline evaluation demonstrated significant improvements in audiobook retrievability metrics. The synthetic queries increased coverage by 35% compared to organic queries alone, with quality scores exceeding 0.85 on human evaluation scales.
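
The summary does not define how coverage was measured. One plausible reading, sketched below purely as an assumption, is the share of catalog items returned by at least one query; the catalog ids, queries, and result lists are invented toy data.

```python
from typing import Dict, List, Set


def coverage(results_per_query: Dict[str, List[str]], catalog: Set[str]) -> float:
    """Fraction of catalog items returned by at least one query."""
    if not catalog:
        return 0.0
    retrieved: Set[str] = set()
    for results in results_per_query.values():
        retrieved.update(results)
    return len(retrieved & catalog) / len(catalog)


# Toy catalog of four audiobooks and hypothetical search results.
catalog = {"a1", "a2", "a3", "a4"}
organic_results = {"exact title one": ["a1"], "exact title two": ["a2"]}
synthetic_results = {"space opera audiobooks": ["a2", "a3"]}

organic_cov = coverage(organic_results, catalog)
combined_cov = coverage({**organic_results, **synthetic_results}, catalog)
relative_gain = (combined_cov - organic_cov) / organic_cov
print(organic_cov, combined_cov, relative_gain)
```

Under this definition, a relative gain is reported against the organic-only baseline, so the toy numbers here are unrelated to the 35% figure quoted above.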

4.2 Online A/B Testing

The online A/B test involving millions of users showed statistically significant improvements: +0.7% in audiobook impressions, +1.22% in audiobook clicks, and +1.82% in exploratory query completions, validating the effectiveness of the AudioBoost approach.
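
For readers unfamiliar with how such lifts are typically computed, here is a minimal sketch of a relative lift and a two-proportion z-test. The user and click counts are invented so that the lift equals 1.22%; they are not the experiment's data, and the summary does not state which statistical test was actually used.

```python
import math


def relative_lift(control_rate: float, treatment_rate: float) -> float:
    """Relative change of the treatment rate over the control rate."""
    return (treatment_rate - control_rate) / control_rate


def two_proportion_z(successes_c: int, n_c: int,
                     successes_t: int, n_t: int) -> float:
    """z statistic for the difference of two proportions (pooled variance)."""
    p_c = successes_c / n_c
    p_t = successes_t / n_t
    pooled = (successes_c + successes_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    return (p_t - p_c) / se


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: users per arm and users with an audiobook click.
n_control, clicks_control = 1_000_000, 50_000
n_treatment, clicks_treatment = 1_000_000, 50_610

lift = relative_lift(clicks_control / n_control, clicks_treatment / n_treatment)
z = two_proportion_z(clicks_control, n_control, clicks_treatment, n_treatment)
p_value = two_sided_p_value(z)
print(f"lift={lift:.4f} z={z:.3f} p={p_value:.4f}")
```

The sketch also illustrates why lifts of around 1% need very large samples: with smaller arms, the same relative difference would not be distinguishable from noise.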

5. Future Applications

The AudioBoost methodology can be extended to other cold-start scenarios in content platforms, including new podcast shows, emerging music genres, and video content. Future work includes personalizing synthetic queries based on user listening history and integrating multimodal content understanding.

Expert Analysis: The Cold-Start Conundrum in Content Discovery

AudioBoost represents a pragmatic solution to one of the most persistent problems in recommendation systems: the cold-start dilemma. The approach cleverly bridges the gap between limited user interactions and comprehensive content discovery by leveraging LLMs as synthetic user proxies. This methodology aligns with similar techniques in computer vision, where CycleGAN-style domain translation has been used to generate training data for underrepresented classes [Zhu et al., 2017].

The technical implementation demonstrates sophisticated understanding of search ecosystem dynamics. By targeting both query formulation (through QAC) and retrieval simultaneously, AudioBoost creates a virtuous cycle where improved suggestions lead to better queries, which in turn improve retrieval performance. This dual approach is reminiscent of reinforcement learning systems where action and observation spaces are optimized concurrently [Sutton & Barto, 2018].

However, the paper's most significant contribution may be its demonstration of practical LLM deployment in production systems. While much LLM research focuses on benchmark performance, AudioBoost shows how these models can drive concrete business metrics in real-world applications. The +1.82% increase in exploratory queries suggests that the system successfully nudges user behavior toward more discovery-oriented search patterns, addressing the fundamental cold-start challenge.

The approach could be further enhanced by incorporating user-specific factors into query generation, similar to how modern recommender systems personalize content based on individual preferences [Ricci et al., 2011]. Additionally, the integration of audio content analysis could provide another dimension for query generation, moving beyond metadata to actual content understanding.

6. References

  1. Zhu, J.-Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision.
  2. Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
  3. Ricci, F., Rokach, L., & Shapira, B. (2011). Introduction to recommender systems handbook. In Recommender Systems Handbook. Springer.
  4. Palumbo, E., et al. (2025). AudioBoost: Increasing audiobook retrievability in Spotify search with synthetic query generation. EARL Workshop@RecSys.

Industry Analyst Perspective

Cutting to the chase: AudioBoost isn't just another AI experiment; it's a surgical strike against the cold-start problem that has plagued content platforms for decades. Spotify is using LLMs not as chatbots, but as strategic weapons to reshape user behavior and content discovery economics.

Logical chain: The causal chain is brilliantly engineered. Limited audiobook interactions → synthetic query generation → improved QAC suggestions → user behavior modification → increased exploratory queries → enhanced audiobook retrievability → business metric improvements. This creates a self-reinforcing discovery loop that fundamentally alters the content exposure landscape.

Highlights and drawbacks: The standout innovation is the dual deployment in both query suggestion and retrieval systems; most companies would stop at one or the other. The 1.82% lift in exploratory queries demonstrates actual behavior change, not just algorithmic optimization. However, the approach risks creating an artificial query ecosystem detached from genuine user intent, and the paper doesn't address potential query quality degradation over time.

Actionable takeaways: For product leaders, this demonstrates that LLM applications should focus on ecosystem-level interventions rather than point solutions. For engineers, the real lesson is in productionizing academic techniques: notice how they used established metrics rather than chasing novel evaluation frameworks. The next frontier will be personalizing these synthetic queries while maintaining discovery diversity.