
Turn Newsletters Into a Value-Only Private Podcast Feed With n8n and AI

  • Writer: Harshal
  • 1 day ago
  • 4 min read

Build notes on extracting the useful parts of emails and blogs, converting them to audio, and listening as a podcast.

I made an automation in n8n to extract the value from newsletters and blogs and convert them to a private podcast feed to save time and enable me to learn more. Here are my build notes and demo.

You need about 3 minutes to read this.

Newsletters to Private Podcast Feed flowchart


Problem Context

I want to stay informed about the latest news and best practices in product management, entrepreneurship, and AI. Even though I know the right newsletters to follow, I don't want to strain my eyes reading them one by one. I used to use text-to-speech software on my phone or laptop to listen to them while walking or doing some other menial task.

Solution via n8n automation

Here's my automation that converts newsletters and blog posts into a private podcast feed I can listen to later.

The different components of the newsletter to private podcast feed workflow

My n8n workflow does 3 things:

  1. It extracts only the value (removing headers, footers, and advertisements).

  2. It converts that text into a voice version.

  3. It creates a private podcast feed that I can listen to later.


Input of Newsletter emails

The automation starts in one of two ways:

  • First, the n8n automation processes labeled emails. Some newsletters arrive with labels like Product Hunt, Substack, or other news sources, and n8n reads them directly.

  • In another flow, I forward an email to a dedicated address (like name+n8n+automation+newsletter@... instead of my plain name@... address). n8n reads emails sent to this address and processes them.
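The second route relies on plus-addressing: mail to the tagged address lands in the same inbox, and the tag tells the workflow what to do with it. As a minimal sketch (the tag names and address are illustrative, not my exact setup):

```python
def route_forwarded_email(to_address: str) -> bool:
    """Return True if the address carries the plus-tags that mark it
    for the newsletter automation (hypothetical tag names)."""
    local_part = to_address.split("@", 1)[0]
    tags = local_part.split("+")[1:]  # everything after the first '+'
    return "n8n" in tags and "newsletter" in tags

# A forwarded newsletter lands in the tagged inbox...
print(route_forwarded_email("name+n8n+automation+newsletter@example.com"))  # True
# ...while regular mail to the plain address is ignored.
print(route_forwarded_email("name@example.com"))  # False
```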

If any step fails, the workflow emails me.

Input from Telegram or n8n chat

Second, I handle "I want to read this later" links from Telegram or n8n ChatHub.

On Telegram, I paste a URL and press enter. Within about 3 minutes, the workflow extracts the content and adds it to the podcast feed.
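In my workflow the AI agent decides what counts as "value," but the core idea of stripping boilerplate containers from a page can be sketched with the standard library alone (tag list and sample page are illustrative):

```python
from html.parser import HTMLParser

# Tags whose contents are boilerplate for listening purposes.
SKIP_TAGS = {"header", "footer", "nav", "aside", "script", "style"}

class ValueExtractor(HTMLParser):
    """Collects visible text while skipping boilerplate containers."""
    def __init__(self):
        super().__init__()
        self.depth = 0        # how many skip-tags we are nested inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())

def extract_value(html: str) -> str:
    parser = ValueExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

page = """
<html><body>
  <header>My Newsletter - subscribe!</header>
  <article><p>The actual insight worth listening to.</p></article>
  <footer>Unsubscribe | Sponsored links</footer>
</body></html>
"""
print(extract_value(page))  # The actual insight worth listening to.
```

A rule-based pass like this misses ads embedded in body text, which is why the real extraction is done by an AI agent with keep/remove examples in its prompt.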

Pipeline

  • Once the workflow receives content (email or link), it extracts the value components out of it. It removes the parts I do not want to listen to, like headers, footers, and ads from blog posts or newsletter emails.

  • It converts the cleaned content into a long-form audio podcast.

  • The audio gets stored in Google Cloud and added to the private podcast feed one by one.
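A private podcast feed is just an RSS 2.0 file where each episode is an `<item>` with an audio `<enclosure>`. A minimal sketch of rendering one feed item (the bucket URL and episode name are made up):

```python
from email.utils import formatdate
from xml.sax.saxutils import escape, quoteattr

def feed_item(title: str, audio_url: str, size_bytes: int) -> str:
    # One RSS 2.0 <item> with an audio enclosure: the minimum a
    # podcast app needs to list and play an episode.
    return (
        "<item>"
        f"<title>{escape(title)}</title>"
        f"<enclosure url={quoteattr(audio_url)} "
        f"length={quoteattr(str(size_bytes))} type=\"audio/mpeg\"/>"
        f"<pubDate>{formatdate(usegmt=True)}</pubDate>"
        "</item>"
    )

item = feed_item(
    "Why roadmaps fail",
    "https://storage.googleapis.com/my-bucket/ep42.mp3",
    1234567,
)
print(item)
```

The n8n workflow appends items like this to the feed XML in the public bucket, so any podcast app pointed at the feed URL picks up new episodes.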

The n8n workflow of the newsletter to private podcast feed workflow

Observability

To make the automation improve over time, I save two kinds of information:

  • First, I store telemetry in Supabase to track what is happening across runs. I store the input URL or email subject, the length of input, output, latency, and more.

  • Second, I added an observability loop for the AI extraction decisions. When the AI decides to remove some text from the input, I save these rejected blocks of text. Later, I evaluate whether those decisions are right. If they are wrong, I enhance the AI Agent context and provide more examples of what is acceptable to remove and what should be kept.
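Each run writes one telemetry row to Supabase. As a sketch of the record shape (the column names here are my guesses for illustration, not the exact schema):

```python
import json
import time

def telemetry_row(source: str, input_len: int, output_len: int,
                  started: float, finished: float,
                  error: "str | None" = None) -> dict:
    """Build one telemetry record per run; in the real workflow this
    dict is inserted into a Supabase table."""
    return {
        "source": source,            # input URL or email subject
        "input_chars": input_len,
        "output_chars": output_len,
        "latency_s": round(finished - started, 2),
        "error": error,
    }

t0 = time.monotonic()
row = telemetry_row("Product Hunt Daily", 18000, 9500, t0, t0 + 42.5)
print(json.dumps(row))
```

Keeping the rejected text blocks alongside rows like this is what makes the extraction reviewable later.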

Tech Stack

  • n8n for the automation.

  • Supabase for telemetry.

  • Google TTS for audio conversion.

  • Google Cloud Storage for audio storage.

  • AI Agent using OpenAI for extraction and conversion.

  • Telegram for input and output.

  • n8n ChatHub for input and output.

Tech stack visualized: n8n, OpenAI, Google TTS, and more

Learnings

  • I tested multiple text-to-speech options and picked the one that handled long newsletters reliably. OpenAI's text-to-speech had a character limit. ElevenLabs was expensive, and its n8n node failed where a plain HTTP request node worked. Google's long-audio TTS service (beta) handled long newsletter and blog content.
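Before settling on Google's long-audio service, the character limit meant splitting text before synthesis. A sketch of chunking on sentence boundaries under a hypothetical limit, so audio never cuts mid-sentence:

```python
def chunk_for_tts(text: str, limit: int = 4096) -> "list[str]":
    """Split text into chunks under the provider's character limit,
    breaking on sentence boundaries."""
    sentences = text.replace("\n", " ").split(". ")
    chunks, current = [], ""
    for s in sentences:
        piece = s if s.endswith(".") else s + "."
        if current and len(current) + len(piece) + 1 > limit:
            chunks.append(current)   # flush the full chunk
            current = piece
        else:
            current = (current + " " + piece).strip()
    if current:
        chunks.append(current)
    return chunks

parts = chunk_for_tts("First point. Second point. Third point.", limit=30)
print(parts)  # ['First point. Second point.', 'Third point.']
```

Chunking also means stitching the audio segments back together afterwards, which is the extra complexity the long-audio service removed.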

  • I compared AWS S3, GitHub Pages, Google Drive, and Google Cloud Storage for podcast files. I chose Google Cloud Storage because audio generation already ran there, and I needed a public bucket instead of a private Google Drive link.

  • I added clear keep and remove examples in the AI agent system prompt. Those examples improved text extraction quality.

  • I started telemetry in n8n data tables, then moved it to Supabase so external AIs could review the telemetry, and because I needed to migrate my personal n8n instance.

  • I built explicit error handling for every critical step. When the AI agent, TTS, or feed update failed, telemetry captured the failure or the workflow alerted me by email.

  • Google Cloud Storage has a delay when reading recently updated files. If I read a file again within a minute, I may get an old version back. I plan to split the automation into two asynchronous workflows. I also explored RabbitMQ and other queueing options.
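Until the split into two workflows happens, one stopgap for stale reads is to poll until the object version changes. A hedged sketch (the `fetch` callable stands in for a GCS read returning a version marker such as an object generation, plus the data):

```python
import time

def read_when_fresh(fetch, expected_version: str,
                    attempts: int = 5, delay: float = 2.0):
    """Poll fetch() until it returns the expected version, backing off
    between tries; raise if the read stays stale."""
    for i in range(attempts):
        version, data = fetch()
        if version == expected_version:
            return data
        time.sleep(delay * (2 ** i))  # exponential backoff
    raise TimeoutError("still reading a stale version after retries")

# Simulate a read that serves a stale copy once before the fresh one.
responses = iter([("gen-1", "old feed"), ("gen-2", "new feed")])
print(read_when_fresh(lambda: next(responses), "gen-2", delay=0.01))  # new feed
```

A queue (RabbitMQ or similar) would solve this more cleanly by decoupling the writer from the reader, which is why it is on the list above.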

  • You can build a self-improving AI coding loop by giving it full visibility into telemetry, examples, and database state.

  • GCP setup is a pain.

What's Next

This setup adds infrastructure work, but it buys back attention. I spend less time scanning inbox noise and more time consuming the parts that matter.

Ideas:

  • Enhance this to also let me submit PDFs.

  • Do a multi-step processing of podcasts themselves by filtering out their advertisements.

  • Another workflow on my radar is to search for AI news sources. Once I build that, I can connect it as an input to this automation.


