Commit f98ac6d

New article to solutions library - Media, text to audio conversion (#274)
* updated text-to-audio solution * updated text-to-audio solution * updated solution * updated solution * updated solution * updated reference architectures * added learn more * added learn more * resolved suggestions to document * resolved suggestions to document * resolved suggestions * resolved suggestions * resolved suggestions * implemented suggestions * updated recomendations sl * resolved recommendations * updated changes
1 parent 7b679ca commit f98ac6d

4 files changed

+184
-1
lines changed

source/includes/images/industry-solutions/text-to-audio-architecture.svg

Lines changed: 1 addition & 0 deletions

source/solutions-library.txt

Lines changed: 9 additions & 0 deletions
@@ -421,6 +421,15 @@ kick-start their projects.
421421
Build a YouTube transcription and summarization service with
422422
LLMs and semantic search.
423423

424+
.. card::
425+
:headline: Text-To-Audio News Conversion
426+
:url: /solutions-library/text-to-audio-conversion/
427+
:icon: mdb_vector_search
428+
:icon-alt: Atlas mdb_vector_search icon
429+
430+
Automate news broadcasting by combining generative AI with
431+
MongoDB.
432+
424433
.. App-Driven Analytics
425434
.. --------------------
426435

source/solutions-library/media-gen-ai.txt

Lines changed: 2 additions & 1 deletion
@@ -6,4 +6,5 @@ Media Gen-AI
66
:titlesonly:
77

88
AI-Driven Media Personalization <solutions-library/media-personalization>
9-
Gen AI-Powered Video Summarization <solutions-library/ai-powered-video-summarization>
9+
Gen AI-Powered Video Summarization <solutions-library/ai-powered-video-summarization>
10+
Text-To-Audio News Conversion <solutions-library/text-to-audio-conversion>
Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
1+
.. _arch-center-is-text-to-audio:
2+
3+
================================================
4+
Text-To-Audio News Conversion With Generative AI
5+
================================================
6+
7+
.. facet::
8+
:name: genre
9+
:values: tutorial
10+
11+
.. meta::
12+
:keywords: media, videos, news, gen AI
13+
:description: Automate news broadcasting by combining generative AI with MongoDB.
14+
15+
.. contents:: On this page
16+
:local:
17+
:backlinks: none
18+
:depth: 1
19+
:class: singlecol
20+
21+
Combine generative AI for podcast creation and MongoDB for data storage
22+
to automate and scale news broadcasting.
23+
24+
**Use cases:** `Gen AI
25+
<https://www.mongodb.com/use-cases/artificial-intelligence>`__
26+
27+
**Industries:** Media
28+
29+
**Products:** `MongoDB Atlas <http://mongodb.com/atlas>`__,
30+
`MongoDB Aggregation Framework
31+
<https://www.mongodb.com/docs/manual/core/aggregation-pipeline/>`__,
32+
`MongoDB Atlas Vector Search
33+
<https://www.mongodb.com/products/platform/atlas-vector-search>`__
34+
35+
**Partners:** `Google NotebookLM <https://notebooklm.google/>`__
36+
37+
Solution Overview
38+
-----------------
39+
40+
The surge in demand for audio content has prompted news organizations to
41+
seek efficient ways to deliver daily summaries. For example, podcasts
42+
have `9 million listeners per year
43+
<https://advertise.acast.com/news-and-insights/future-of-podcast-advertising-in-2025-industry-leaders-predictions>`__
44+
in the U.S. alone. However, automating this process is challenging
45+
because it involves managing dynamic article data and converting it into
46+
high-quality audio experiences.
47+
48+
With MongoDB and generative AI, you can build a news automation solution
49+
to streamline and scale podcast creation. MongoDB serves as the core
50+
data layer for the system, efficiently managing news articles as
51+
flexible, schema-less documents within a single collection. These
52+
documents capture both static information—such as title, content, and
53+
publication date—and dynamic metrics that monitor article performance
54+
and popularity over time, such as the number of qualified reads. You can
55+
also store derived insights, such as sentiment analysis and key
56+
entities, in your MongoDB collection and enrich them with a generative
57+
AI pipeline.
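
As a rough illustration of the document shape described above, the following Python sketch builds a single article document. All field names (``published_at``, ``metrics``, ``insights``) and values are hypothetical and are not taken from the ist.media schema.

```python
from datetime import datetime, timezone

# Hypothetical article document. Field names and values are illustrative
# assumptions, not the schema used by the ist.media demo.
article = {
    # Static information
    "title": "City council approves new transit plan",
    "content": "The council voted on Tuesday to expand the bus network.",
    "published_at": datetime(2025, 1, 15, 8, 30, tzinfo=timezone.utc),
    # Dynamic metrics that change over time
    "metrics": {"qualified_reads": 1284},
    # Derived insights that an enrichment pipeline could add later
    "insights": {
        "sentiment": "neutral",
        "entities": ["city council", "bus network"],
    },
}
```

With PyMongo, a dictionary like this can be stored with ``collection.insert_one(article)``. Static, dynamic, and derived fields live in the same document, so no separate tables or joins are needed.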
58+
59+
This adaptable structure provides a robust framework to query and
60+
extract the latest news and metadata. You can then transform this
61+
information into audio podcasts by integrating advanced language models.
62+
With this foundation in place, you can unlock AI-driven business
63+
opportunities, attract new customers, and increase revenue streams.
64+
65+
Reference Architectures
66+
-----------------------
67+
68+
To implement this framework, you need MongoDB for data storage and
69+
AI-powered speech synthesis for audio creation. You can use Google’s
70+
NotebookLM model to refine news text with accurate intonation and
71+
pacing. The diagram below outlines the workflow for converting news
72+
summaries into audio:
73+
74+
.. figure:: /includes/images/industry-solutions/text-to-audio-architecture.svg
75+
:figwidth: 1200px
76+
:alt: visualization for text-to-audio conversion
77+
78+
Figure 1. AI-based text-to-audio conversion architecture
79+
80+
1. **Retrieve Articles:** Use :ref:`aggregation <aggregation>` and
81+
:ref:`Atlas Vector Search <avs-overview>` to fetch relevant news
82+
articles from the database.
83+
84+
#. **Generate Podcast Script:** Pass the articles through an AI pipeline to
85+
create a structured, multi-voice podcast script.
86+
87+
#. **Convert to Audio:** Use advanced text-to-speech models to transform the
88+
script into high-quality audio, stored as a ``.wav`` file.
89+
90+
#. **Optimize Delivery:** Cache the generated podcast to ensure seamless,
91+
on-demand playback for users.
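
The retrieval step in the workflow above could be sketched as an aggregation pipeline that starts with a ``$vectorSearch`` stage. This is a minimal sketch under assumptions: the index name ``article_vector_index``, the ``embedding`` field, the projected field names, and the helper ``build_retrieval_pipeline`` are all hypothetical, and the query vector would come from an embedding model of your choice.

```python
from typing import Any


def build_retrieval_pipeline(
    query_vector: list[float], limit: int = 5
) -> list[dict[str, Any]]:
    """Build a pipeline that returns the articles closest to a query vector."""
    return [
        {
            # $vectorSearch must be the first stage of the pipeline.
            "$vectorSearch": {
                "index": "article_vector_index",  # hypothetical index name
                "path": "embedding",  # hypothetical embedding field
                "queryVector": query_vector,
                # Consider more candidates than are returned to improve recall.
                "numCandidates": limit * 20,
                "limit": limit,
            }
        },
        {
            # Keep only the fields a script generator would need.
            "$project": {
                "_id": 0,
                "title": 1,
                "content": 1,
                "published_at": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, the result can be passed to ``collection.aggregate(pipeline)``. Building the pipeline in a plain function keeps it easy to inspect and adjust before it reaches the database.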
92+
93+
This framework delivers high-quality, human-like narration in ``.wav``
94+
format, providing users with a professional and engaging listening
95+
experience.
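
The caching step could look like the following file-based sketch, which generates the audio only when no cached copy exists. It is one possible approach, not the demo's implementation: the helper ``get_or_create_audio``, the cache directory, and the ``generate`` callback are hypothetical.

```python
from pathlib import Path
from typing import Callable


def get_or_create_audio(
    cache_dir: Path, filename: str, generate: Callable[[], bytes]
) -> Path:
    """Return the cached audio file, generating it only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    audio_path = cache_dir / filename
    if not audio_path.exists():
        # Cache miss: run the (potentially slow) generation step once.
        audio_path.write_bytes(generate())
    return audio_path
```

Repeated requests for the same file then read from disk instead of calling the speech model again.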
96+
97+
Build the Solution
98+
------------------
99+
100+
Follow these steps to build a text-to-audio solution using the
101+
MongoDB `ist.media <https://github.com/mongodb-industry-solutions/ist.media/tree/main>`__
102+
GitHub repository. You can use this framework as inspiration to
103+
build your own customized text-to-audio pipeline.
104+
105+
.. procedure::
106+
:style: normal
107+
108+
.. step:: Deploy the ist.media demo
109+
110+
Clone the `ist.media GitHub repository
111+
<https://github.com/mongodb-industry-solutions/ist.media/tree/main>`__
112+
and follow the ``README`` instructions to deploy the demo.
113+
114+
.. step:: Create a feed for news
115+
116+
Run the demo and verify that the ``/feed`` endpoint provides the
117+
news feed for the current day. Alternatively, if you prefer not to
118+
use the ist.media news collection mechanisms, you can supply your
119+
own data, which is served statically by the endpoint in the same
120+
format.
121+
122+
.. step:: Generate text-to-audio conversion
123+
124+
Run the `podcast.py
125+
<https://github.com/mongodb-industry-solutions/ist.media/blob/main/scripts/podcast.py>`__
126+
script in the ist.media demo. This script uses the `AutoContent API
127+
<https://autocontentapi.com/>`__ to generate the podcast. It then
128+
downloads and saves it with the date (day/month/year) in the
129+
filename.
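
The date-based file naming mentioned in the last step might be implemented along these lines. The helper ``podcast_filename``, the ``podcast-`` prefix, the hyphen separator, and the default extension are assumptions for illustration. The actual naming logic lives in ``podcast.py``.

```python
from datetime import date


def podcast_filename(day: date, extension: str = "wav") -> str:
    """Build a filename with the date in day-month-year order."""
    # Hyphens are used because slashes are not valid in filenames.
    return f"podcast-{day.strftime('%d-%m-%Y')}.{extension}"


print(podcast_filename(date(2025, 3, 7)))  # prints podcast-07-03-2025.wav
```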
130+
131+
Key Learnings
132+
-------------
133+
134+
To create a media solution that converts news data into audio content,
135+
you need a system that is flexible, fast, and able to scale easily.
136+
MongoDB makes this possible through these core strengths:
137+
138+
- **The document model handles diverse attributes:** News data combines
139+
various attributes, including static fields such as ID, title, date, and
140+
body, dynamic metadata such as read count, AI-generated insights such
141+
as keywords and article sentiment, and embeddings for semantic search.
142+
The document model supports all these elements, removing database
143+
limitations and allowing the system to evolve smoothly.
144+
145+
- **Speed ensures operational efficiency:** By processing complete,
146+
self-contained documents, MongoDB avoids complex operations, enabling
147+
faster analysis and near real-time transformation of articles into
148+
audio content.
149+
150+
- **Scalable systems enable growth:** MongoDB Atlas handles both small
151+
changes and large amounts of data smoothly, ensuring high performance
152+
and reliability as your media application grows.
153+
154+
- **Flexible systems empower developers:** Without fixed schemas,
155+
developers can easily add new information, like AI insights, audience
156+
metrics, or editorial updates. This makes it simple to adapt and
157+
respond to evolving news consumption.
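
To make the flexibility point concrete, the sketch below builds update documents that add AI insights and increment a read counter on an existing article, without any schema migration. The helper names and the ``insights`` and ``metrics.qualified_reads`` fields are hypothetical. ``$set`` and ``$inc`` are standard MongoDB update operators.

```python
from typing import Any


def build_insight_update(sentiment: str, keywords: list[str]) -> dict[str, Any]:
    """Build an update that attaches AI-generated insights to an article."""
    return {
        "$set": {
            # Dot notation creates the nested fields if they do not exist yet.
            "insights.sentiment": sentiment,
            "insights.keywords": keywords,
        }
    }


def build_read_update(reads: int = 1) -> dict[str, Any]:
    """Build an update that increments the article's read counter."""
    return {"$inc": {"metrics.qualified_reads": reads}}
```

With PyMongo, either update can be applied with ``collection.update_one({"_id": article_id}, update)``, where ``article_id`` identifies the target article.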
158+
159+
Authors
160+
-------
161+
162+
- Benjamin Lorenz, MongoDB
163+
- Diego Canales, MongoDB
164+
165+
Learn More
166+
----------
167+
168+
- :ref:`arch-center-is-ai-media-personalization`
169+
170+
- :ref:`arch-center-is-telco-ops`
171+
172+
- :ref:`arch-center-is-Gen-AI-powered-video-summarization-solution`
