What's the best way to transcribe podcasts on Mac?

For free local transcription, use Hapi — drag your podcast audio file into the app and get an accurate transcript with speaker detection. For batch processing multiple episodes, Whisper.cpp provides command-line automation. Both run entirely on your Mac.

Can I transcribe podcasts locally without uploading audio?

Yes. Hapi and Whisper.cpp both transcribe podcast episodes entirely on your Mac using local AI models. Your audio never leaves your device, which is ideal for unreleased episodes or content with proprietary information.

How long does it take to transcribe a podcast episode?

On Apple Silicon Macs, local transcription typically runs 2-5x faster than real-time. A 60-minute podcast episode transcribes in 12-30 minutes depending on your Mac model and the transcription settings used.
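
The arithmetic behind that estimate can be sketched as a small shell helper. The function name `estimate_minutes` is hypothetical and only illustrates the calculation: episode length divided by the speed factor, rounded up to whole minutes.

```shell
# Hypothetical helper: estimate transcription time in whole minutes.
# $1 = episode length in minutes, $2 = speed factor relative to real time
estimate_minutes() {
    # Integer division, rounded up
    echo $(( ($1 + $2 - 1) / $2 ))
}

estimate_minutes 60 2   # slower end of the 2-5x range
estimate_minutes 60 5   # faster end of the 2-5x range
```

For a 60-minute episode the two calls correspond to the 30-minute and 12-minute ends of the range quoted above.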

Do podcast transcription tools detect different speakers?

Some do. Hapi includes speaker detection that labels different voices (Speaker 1, Speaker 2, etc.). You can then replace these labels with actual names. Basic transcription tools like Whisper.cpp output text without speaker separation.

2026 · 02 · 06

How to Transcribe Podcasts on Mac (Free Local Methods)

Learn how to transcribe podcast episodes on Mac for free using local AI tools. Get accurate transcripts with speaker labels without uploading audio to the cloud.

How to Transcribe Podcasts on Mac

Podcast transcription serves multiple purposes: accessibility, SEO, content repurposing, and show notes. On Mac, you can transcribe episodes locally without uploading audio to cloud services.

This guide covers the best methods for transcribing podcasts on Mac.

Why Transcribe Podcasts?

Accessibility

Transcripts make your podcast accessible to:

Deaf and hard-of-hearing audiences
Non-native speakers who prefer reading
People in sound-sensitive environments
Screen reader users

SEO Benefits

Search engines can't listen to audio, but they can index text:

Episode pages rank for spoken keywords
Long-form content improves domain authority
Transcripts provide internal linking opportunities

Content Repurposing

One transcript enables:

Blog post summaries
Social media quotes
Newsletter content
YouTube video captions
Audiogram clips

Show Notes

Detailed transcripts make better show notes:

Accurate timestamps
Complete quotes
Name spellings verified
Links mentioned

Method 1: Hapi — Best for Individual Episodes

Hapi provides simple drag-and-drop transcription for podcast files.

How to Transcribe a Podcast with Hapi

Open Hapi (download from speakhapi.com)
Drag your audio file (mp3, m4a, wav) into the Hapi window
Wait for transcription: typically 2-5x faster than real time
Review and edit: fix names and technical terms
Export: TXT, SRT, VTT, or Markdown

Features for Podcast Transcription

Speaker detection: Hapi identifies different voices and labels them (Speaker 1, Speaker 2). After transcription, replace labels with actual names.

Smart formatting: Automatic punctuation and paragraph breaks based on pauses, so there is no need to add them manually.

Multiple formats: Export as plain text for blog posts, SRT/VTT for video platforms, or Markdown for static site generators.

Local processing: Your unreleased episodes never leave your Mac.

Method 2: Whisper.cpp — Batch Processing

Use Whisper.cpp when you need to transcribe multiple episodes or automate your workflow.

Setup

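# Clone and build
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
make
# Note: newer whisper.cpp releases build with CMake and name the binary
# ./build/bin/whisper-cli. If ./main is missing or prints a deprecation
# notice, substitute that path in the commands below.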

# Download model (large recommended for podcasts)
./models/download-ggml-model.sh large-v3

Transcribe a Single Episode

# Convert to WAV if needed (Whisper.cpp prefers WAV)
ffmpeg -i episode.mp3 -ar 16000 -ac 1 episode.wav

# Transcribe
./main -m models/ggml-large-v3.bin -f episode.wav -otxt

Batch Transcribe Multiple Episodes

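#!/bin/bash
# Save as transcribe-episodes.sh and run it from the whisper.cpp directory,
# with the .mp3 files in the same folder

for file in *.mp3; do
    # Skip the literal pattern when no .mp3 files are present
    [ -e "$file" ] || continue
    wav="${file%.mp3}.wav"

    # Convert to 16 kHz mono WAV; skip this episode if conversion fails
    ffmpeg -y -i "$file" -ar 16000 -ac 1 "$wav" || continue

    # Transcribe (the transcript is written alongside the WAV file)
    ./main -m models/ggml-large-v3.bin -f "$wav" -otxt

    # Clean up WAV
    rm "$wav"

    echo "Completed: $file"
done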

Output Formats

# Plain text
./main -m models/ggml-large-v3.bin -f episode.wav -otxt

# SRT subtitles
./main -m models/ggml-large-v3.bin -f episode.wav -osrt

# VTT subtitles
./main -m models/ggml-large-v3.bin -f episode.wav -ovtt

# JSON with timestamps
./main -m models/ggml-large-v3.bin -f episode.wav -ojson
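
If you need several formats for the same episode, the output flags can be combined so the audio is processed only once. This is a minimal sketch. The wrapper name `transcribe_all_formats` and the `WHISPER_BIN` and `WHISPER_MODEL` variables are hypothetical conveniences, not part of whisper.cpp.

```shell
# Hypothetical wrapper: write TXT, SRT, VTT, and JSON in a single run.
# WHISPER_BIN and WHISPER_MODEL are illustrative variables, not whisper.cpp
# settings; they default to the paths used in this guide.
transcribe_all_formats() {
    "${WHISPER_BIN:-./main}" \
        -m "${WHISPER_MODEL:-models/ggml-large-v3.bin}" \
        -f "$1" -otxt -osrt -ovtt -ojson
}
```

Calling `transcribe_all_formats episode.wav` should produce one output file per format for that episode.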

Limitations

No GUI — command line only
No speaker detection (single text block)
Requires some technical setup
Slower than Hapi on Apple Silicon (the default build doesn't use the Neural Engine)

Best for: Podcasters who want automated batch processing or CI/CD integration.

Method 3: MacWhisper — GUI Alternative

MacWhisper is a Mac app (free tier, paid upgrades) that wraps Whisper for file transcription.

Features

Drag-and-drop interface
Multiple model options
Export formats
Batch processing

Pricing

Free tier: Limited features
Pro: $30 one-time
Pro+: $60 one-time

When to Choose MacWhisper

Want a GUI but don't need speaker detection
Primarily transcribe files (not real-time recording)
Willing to pay for a focused tool

Method 4: Cloud Services

Cloud services are an option when local processing doesn't meet your needs.

Descript

Full editing suite
Overdub for corrections
Video editing included
$15-30/month

Best for: Podcasters who also edit in the same tool.

Otter.ai

Good accuracy
Speaker labels
Searchable archive
$16.99/month

Best for: Teams who need collaboration features.

Rev

Human + AI options
High accuracy
Pay per minute
$0.25-1.50/minute

Best for: One-off transcription of important episodes.

Comparison: Podcast Transcription Options

Method	Price	Speaker Labels	Batch	Processing
Hapi	Free	Yes	No	Local
Whisper.cpp	Free	No	Yes	Local
MacWhisper	$30-60	No	Yes	Local
Descript	$15-30/mo	Yes	Yes	Cloud
Otter.ai	$16.99/mo	Yes	Yes	Cloud
Rev	$0.25+/min	Yes	Yes	Cloud

Optimizing Transcription Quality

Recording Quality Matters

Transcription accuracy depends heavily on audio quality:

Use quality microphones
Record in quiet environments
Minimize background music
Avoid heavy compression in editing

Pre-Processing Tips

If accuracy is low, try:

Noise reduction: Use Audacity or your DAW to reduce background noise before transcription.

Leveling: Normalize audio so all speakers are similar volume.

Remove music: Intro/outro music can confuse transcription. Consider removing or transcribing those segments separately.
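
If you prefer the command line to Audacity or a DAW, the noise-reduction and leveling steps can be approximated with ffmpeg's built-in audio filters. This is a sketch under assumptions: the function name `clean_audio` and the `FFMPEG` override variable are hypothetical, and the filter settings are untuned starting points. `loudnorm` normalizes overall loudness, so it may not fully even out speakers recorded at very different volumes.

```shell
# Hypothetical helper: denoise and level an episode before transcription.
# FFMPEG is an illustrative override variable; the filter settings are
# starting points, not tuned values.
clean_audio() {
    # highpass: cut low-frequency rumble below 80 Hz
    # afftdn:   FFT-based noise reduction with default settings
    # loudnorm: loudness normalization (EBU R128)
    "${FFMPEG:-ffmpeg}" -y -i "$1" \
        -af "highpass=f=80,afftdn,loudnorm" \
        -ar 16000 -ac 1 "$2"
}
```

For example, `clean_audio episode.mp3 episode-clean.wav` writes a 16 kHz mono WAV that can be passed to the transcription step.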

Model Selection

Whisper model sizes for podcasts:

Model	Size	Speed	Best For
base	150MB	Fastest	Clear audio, single speaker
small	500MB	Fast	Good audio, 2-3 speakers
medium	1.5GB	Moderate	Variable audio quality
large-v3	3GB	Slowest	Maximum accuracy

Recommendation: Use large-v3 for podcasts. It is the slowest model, but for most episodes the accuracy gain is worth the extra processing time.

Post-Transcription Editing

Always review transcripts for:

Names: People, companies, products
Technical terms: Industry jargon, acronyms
Numbers: Dates, statistics, prices
Homophones: "their/there/they're" type errors
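
Part of this review can be narrowed down with a quick search. The sketch below uses a hypothetical function, `review_candidates`, to print transcript lines (with line numbers) that contain numbers or the example homophones. Names and jargon still need a manual read-through, and the word list is only an illustration.

```shell
# Hypothetical helper: list transcript lines worth a second look.
# Flags whole-word matches of the example homophones and any token that
# starts with a digit (dates, statistics, prices).
review_candidates() {
    grep -n -i -w -E "their|there|they're|[0-9][[:alnum:]]*" "$1"
}
```

Running `review_candidates episode.txt` prints each matching line prefixed by its line number.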

Workflow for Regular Podcast Transcription

Weekly Podcast Workflow

Record episode (as normal)
Export audio (mp3 or m4a)
Transcribe with Hapi (drag-and-drop)
Edit transcript (15-30 minutes for a 1-hour episode)
Export (Markdown for blog, TXT for show notes)
Publish (transcript on episode page)

Automation Ideas

For podcasters publishing frequently:

Create a Shortcut that opens Hapi with the latest export
Use Whisper.cpp in a script triggered by folder watch
Set up a GitHub Action to transcribe when audio is pushed
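
A folder-triggered script first has to work out which episodes still need transcribing. The sketch below shows one way to do that. The function name `pending_episodes` is hypothetical, and it assumes each transcript is saved next to its audio file as NAME.txt, which may not match how your tool names its output.

```shell
# Hypothetical helper: list .mp3 files in a folder that have no transcript yet.
# Assumes each transcript is saved next to its audio as NAME.txt; adjust the
# suffix if your tool names its output differently.
pending_episodes() {
    for audio in "$1"/*.mp3; do
        # When the folder has no .mp3 files the pattern stays literal; skip it
        [ -e "$audio" ] || continue
        # Print only episodes without a matching transcript
        [ -e "${audio%.mp3}.txt" ] || echo "$audio"
    done
}
```

Running `pending_episodes` on your export folder from a scheduled job and passing each printed path to your transcription command gives a simple polling alternative to a true folder watch.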

Common Questions

How accurate is local podcast transcription?

For clear podcast audio with good microphones, expect 95%+ accuracy. Multiple speakers with crosstalk, background music, or poor audio quality will reduce accuracy.

Can I transcribe video podcasts?

Yes. Extract the audio track first:

ffmpeg -i video.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav

Then transcribe the audio file.

How do I add speaker names instead of "Speaker 1"?

After transcription in Hapi, use find-and-replace:

Find: "Speaker 1:"
Replace: "Host:"

Or export to text and edit in your preferred editor.
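
For an exported text file, the same find-and-replace can be scripted with `sed`. This sketch assumes each speaker turn starts a line with a label such as "Speaker 1:". The function name `rename_speakers` and the replacement names "Host" and "Guest" are placeholders.

```shell
# Hypothetical helper: swap generic speaker labels for real names.
# Assumes labels appear at the start of a line, e.g. "Speaker 1: ...".
rename_speakers() {
    sed -e 's/^Speaker 1:/Host:/' \
        -e 's/^Speaker 2:/Guest:/' "$1"
}
```

The result goes to standard output, so `rename_speakers episode.txt > episode-named.txt` keeps the original export untouched.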

What about transcribing old episodes?

The same methods work for old episodes. If you have archives going back years, use Whisper.cpp batch processing to transcribe entire back catalogs.

Should I transcribe live recordings?

For live podcast recordings (no post-production), transcription may have more errors due to:

Background audience noise
Cross-talk between speakers
Uneven audio levels

Consider cleaning the audio before transcription, or accept that live show transcripts need more editing.

Summary

Podcast transcription on Mac works best with local tools:

Individual episodes: Hapi — drag-and-drop, speaker labels, free
Batch processing: Whisper.cpp — automated, flexible, free
GUI preference: MacWhisper — simple interface, one-time payment

Local transcription means your unreleased episodes stay private, you're not limited by subscription minutes, and processing speed depends only on your Mac — not your internet connection.