| 1 | +.. _arch-center-is-text-to-audio: |
| 2 | + |
| 3 | +================================================ |
| 4 | +Text-To-Audio News Conversion With Generative AI |
| 5 | +================================================ |
| 6 | + |
| 7 | +.. facet:: |
| 8 | + :name: genre |
| 9 | + :values: tutorial |
| 10 | + |
| 11 | +.. meta:: |
| 12 | + :keywords: media, videos, news, gen AI |
| 13 | + :description: Automate news broadcasting by combining generative AI with MongoDB. |
| 14 | + |
| 15 | +.. contents:: On this page |
| 16 | + :local: |
| 17 | + :backlinks: none |
| 18 | + :depth: 1 |
| 19 | + :class: singlecol |
| 20 | + |
| 21 | +Combine generative AI for podcast creation and MongoDB for data storage |
| 22 | +to automate and scale news broadcasting. |
| 23 | + |
| 24 | +**Use cases:** `Gen AI |
| 25 | +<https://www.mongodb.com/use-cases/artificial-intelligence>`__ |
| 26 | + |
| 27 | +**Industries:** Media |
| 28 | + |
| 29 | +**Products:** `MongoDB Atlas <http://mongodb.com/atlas>`__, |
| 30 | +`MongoDB Aggregation Framework |
| 31 | +<https://www.mongodb.com/docs/manual/core/aggregation-pipeline/>`__, |
| 32 | +`MongoDB Atlas Vector Search |
| 33 | +<https://www.mongodb.com/products/platform/atlas-vector-search>`__ |
| 34 | + |
| 35 | +**Partners:** `Google NotebookLM <https://notebooklm.google/>`__ |
| 36 | + |
| 37 | +Solution Overview |
| 38 | +----------------- |
| 39 | + |
| 40 | +The surge in demand for audio content has prompted news organizations to |
| 41 | +seek efficient ways to deliver daily summaries. For example, podcasts |
| 42 | +have `9 million listeners per year |
| 43 | +<https://advertise.acast.com/news-and-insights/future-of-podcast-advertising-in-2025-industry-leaders-predictions>`__ |
| 44 | +in the U.S. alone. However, automating audio production is challenging |
| 45 | +because it involves managing dynamic article data and converting it into |
| 46 | +high-quality audio experiences. |
| 47 | + |
| 48 | +With MongoDB and generative AI, you can build a news automation solution |
| 49 | +to streamline and scale podcast creation. MongoDB serves as the core |
| 50 | +data layer for the system, efficiently managing news articles as |
| 51 | +flexible, schema-less documents within a single collection. These |
| 52 | +documents capture both static information—such as title, content, and |
| 53 | +publication date—and dynamic metrics that monitor article performance |
| 54 | +and popularity over time, such as the number of qualified reads. You can |
| 55 | +also store derived insights, such as sentiment analysis and key |
| 56 | +entities, in your MongoDB collection and enrich them with a generative |
| 57 | +AI pipeline. |
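As a rough illustration, an article document of the kind described above might look like the following Python dictionary. The field names (``read_count``, ``insights``, and so on) and the sample values are hypothetical and chosen only for this sketch; the ist.media demo may use a different schema.

```python
from datetime import datetime, timezone

# Hypothetical article document; field names and values are illustrative
# assumptions, not the schema used by the ist.media demo.
article = {
    # Static information captured at publication time.
    "title": "City council approves new transit plan",
    "content": "Placeholder body text for the article.",
    "published": datetime(2025, 1, 15, 8, 30, tzinfo=timezone.utc),
    # Dynamic metric updated as readers engage with the article.
    "read_count": 0,
    # Derived insights attached later by a generative AI pipeline.
    "insights": {"sentiment": "neutral", "entities": ["City council"]},
}


def is_enriched(doc: dict) -> bool:
    """Return True if the AI pipeline has already attached insights."""
    return bool(doc.get("insights"))
```

Because static fields, metrics, and insights live in one document, a single read returns everything a downstream podcast pipeline needs for that article.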
| 58 | + |
| 59 | +This adaptable structure provides a robust framework to query and |
| 60 | +extract the latest news and metadata. You can then transform this |
| 61 | +information into audio podcasts by integrating advanced language models. |
| 62 | +With this foundation in place, you can unlock AI-driven business |
| 63 | +opportunities, attract new customers, and grow revenue. |
| 64 | + |
| 65 | +Reference Architectures |
| 66 | +----------------------- |
| 67 | + |
| 68 | +To implement this framework, you need MongoDB for data storage and |
| 69 | +AI-powered speech synthesis for audio creation. You can use Google’s |
| 70 | +NotebookLM model to refine news text with accurate intonation and |
| 71 | +pacing. The diagram below outlines the workflow for converting news |
| 72 | +summaries into audio: |
| 73 | + |
| 74 | +.. figure:: /includes/images/industry-solutions/text-to-audio-architecture.svg |
| 75 | + :figwidth: 1200px |
| 76 | + :alt: visualization for text-to-audio conversion |
| 77 | + |
| 78 | + Figure 1. AI-based text-to-audio conversion architecture |
| 79 | + |
| 80 | +1. **Retrieve Articles:** Use :ref:`aggregation <aggregation>` and |
| 81 | + :ref:`Atlas Vector Search <avs-overview>` to fetch relevant news |
| 82 | + articles from the database. |
| 83 | + |
| 84 | +#. **Generate Podcast Script:** Pass the articles through an AI pipeline to |
| 85 | + create a structured, multi-voice podcast script. |
| 86 | + |
| 87 | +#. **Convert to Audio:** Use advanced text-to-speech models to transform the |
| 88 | + script into high-quality audio, stored as a ``.wav`` file. |
| 89 | + |
| 90 | +#. **Optimize Delivery:** Cache the generated podcast to ensure seamless, |
| 91 | + on-demand playback for users. |
| 92 | + |
| 93 | +This framework delivers high-quality, human-like narration, |
| 94 | +providing users with a professional and engaging listening |
| 95 | +experience. |
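The retrieval step could be sketched as an aggregation pipeline that starts with a ``$vectorSearch`` stage. This is a minimal sketch, not the demo's actual query: the index name ``vector_index`` and the field names ``embedding``, ``published``, ``title``, and ``content`` are assumptions, and the ``published`` field would need to be indexed as a filter field in the vector search index.

```python
from datetime import datetime, timedelta


def build_retrieval_pipeline(query_vector: list, day_start: datetime, limit: int = 5) -> list:
    """Build a pipeline that finds the most relevant articles for one day.

    The index name and all field names are assumptions made for this sketch.
    """
    day_end = day_start + timedelta(days=1)
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",
                "path": "embedding",
                "queryVector": query_vector,
                # Consider more candidates than results to improve recall.
                "numCandidates": limit * 20,
                "limit": limit,
                # Pre-filter to articles published on the requested day.
                "filter": {
                    "$and": [
                        {"published": {"$gte": day_start}},
                        {"published": {"$lt": day_end}},
                    ]
                },
            }
        },
        {
            # Keep only what the script-generation step needs.
            "$project": {
                "_id": 0,
                "title": 1,
                "content": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, you could pass the returned list to ``collection.aggregate()`` and hand the resulting articles to the script-generation step.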
| 96 | + |
| 97 | +Build the Solution |
| 98 | +------------------ |
| 99 | + |
| 100 | +Follow these steps to build a text-to-audio solution using the |
| 101 | +MongoDB `ist.media <https://github.com/mongodb-industry-solutions/ist.media/tree/main>`__ |
| 102 | +GitHub repository. You can use this framework as inspiration to |
| 103 | +build your own customized text-to-audio pipeline. |
| 104 | + |
| 105 | +.. procedure:: |
| 106 | + :style: normal |
| 107 | + |
| 108 | + .. step:: Deploy the ist.media demo |
| 109 | + |
| 110 | + Clone the `ist.media GitHub repository |
| 111 | + <https://github.com/mongodb-industry-solutions/ist.media/tree/main>`__ |
| 112 | + and follow the ``README`` instructions to deploy the demo. |
| 113 | + |
| 114 | + .. step:: Create a feed for news |
| 115 | + |
| 116 | + Run the demo and verify that the ``/feed`` endpoint provides the |
| 117 | + news feed for the current day. Alternatively, if you prefer not to |
| 118 | + use the ist.media news collection mechanisms, you can supply your |
| 119 | + own data, which is served statically by the endpoint in the same |
| 120 | + format. |
| 121 | + |
| 122 | + .. step:: Generate text-to-audio conversion |
| 123 | + |
| 124 | + Run the `podcast.py |
| 125 | + <https://github.com/mongodb-industry-solutions/ist.media/blob/main/scripts/podcast.py>`__ |
| 126 | + script in the ist.media demo. This script uses the `AutoContent API |
| 127 | + <https://autocontentapi.com/>`__ to generate the podcast. It then |
| 128 | + downloads the audio file and saves it with the date (day/month/year) |
| 129 | + in the filename. |
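The date-based filename also makes the caching step from the architecture straightforward: if a file for the requested day already exists, serve it instead of generating a new one. The sketch below illustrates that idea only. The filename pattern and the ``generate`` callable are hypothetical stand-ins; it does not reproduce the logic of ``podcast.py`` or call the AutoContent API.

```python
from datetime import date
from pathlib import Path
from typing import Callable


def podcast_path(cache_dir: Path, day: date) -> Path:
    """Return the cache path for a day's podcast (day-month-year in the name).

    The "podcast-DD-MM-YYYY.wav" pattern is an assumption for this sketch.
    """
    return cache_dir / f"podcast-{day.strftime('%d-%m-%Y')}.wav"


def get_or_create_podcast(
    cache_dir: Path, day: date, generate: Callable[[date], bytes]
) -> Path:
    """Return the cached audio file, calling generate() only on a cache miss."""
    path = podcast_path(cache_dir, day)
    if not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        # generate() stands in for the (slow) podcast generation and download.
        path.write_bytes(generate(day))
    return path
```

Keying the cache on the date means repeated requests for the same day's podcast never trigger a second generation run.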
| 130 | + |
| 131 | +Key Learnings |
| 132 | +------------- |
| 133 | + |
| 134 | +To create a media solution that converts news data into audio content, |
| 135 | +you need a system that is flexible, fast, and able to scale easily. |
| 136 | +MongoDB makes this possible through these core strengths: |
| 137 | + |
| 138 | +- **The document model handles diverse attributes:** News data combines |
| 139 | + various attributes, including static fields such as ID, title, date, and |
| 140 | + body, dynamic metadata such as read count, AI-generated insights such |
| 141 | + as keywords and article sentiment, and embeddings for semantic search. |
| 142 | + The document model stores all these elements in one record, so the |
| 143 | + system can evolve without schema migrations. |
| 144 | + |
| 145 | +- **Speed ensures operational efficiency:** By processing complete, |
| 146 | + self-contained documents, MongoDB avoids multi-table joins, enabling |
| 147 | + faster analysis and near real-time transformation of articles into |
| 148 | + audio content. |
| 149 | + |
| 150 | +- **Scalable systems enable growth:** MongoDB Atlas scales from small |
| 151 | + workloads to large data volumes, maintaining high performance |
| 152 | + and reliability as your media application grows. |
| 153 | + |
| 154 | +- **Flexible systems empower developers:** Without fixed schemas, |
| 155 | + developers can easily add new information, like AI insights, audience |
| 156 | + metrics, or editorial updates. This makes it simple to adapt and |
| 157 | + respond to evolving news consumption. |
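To make the flexibility point concrete, the following sketch builds a MongoDB update document that increments a read counter and attaches new AI insights to an existing article. The field names ``read_count`` and ``insights`` are assumptions for illustration, not the demo's schema.

```python
from typing import Optional


def build_article_update(new_reads: int = 0, insights: Optional[dict] = None) -> dict:
    """Build a MongoDB update document for one article.

    Field names ("read_count", "insights") are assumptions for this sketch.
    """
    update: dict = {}
    if new_reads:
        # $inc adjusts the counter atomically on the server.
        update["$inc"] = {"read_count": new_reads}
    if insights:
        # $set adds or overwrites fields; no schema change is needed first.
        update["$set"] = {f"insights.{key}": value for key, value in insights.items()}
    return update
```

With PyMongo, you could pass the result as the second argument to ``collection.update_one()``, skipping the call when the function returns an empty dictionary.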
| 158 | + |
| 159 | +Authors |
| 160 | +------- |
| 161 | + |
| 162 | +- Benjamin Lorenz, MongoDB |
| 163 | +- Diego Canales, MongoDB |
| 164 | + |
| 165 | +Learn More |
| 166 | +---------- |
| 167 | + |
| 168 | +- :ref:`arch-center-is-ai-media-personalization` |
| 169 | + |
| 170 | +- :ref:`arch-center-is-telco-ops` |
| 171 | + |
| 172 | +- :ref:`arch-center-is-Gen-AI-powered-video-summarization-solution` |