How to Transcribe Audio Files on Mac: Complete Guide (2026)

Step-by-step guide to transcribing audio files on Mac. Compare methods: built-in tools, local apps like Hapi, cloud services. Find the best solution for your workflow.

Quick Answer: Best Ways to Transcribe Audio Files on Mac

  1. For quick transcripts: Use Apple Voice Memos built-in transcription (iOS/Mac sync required)
  2. For privacy + accuracy: Use Hapi (local AI, 99% accuracy, works offline)
  3. For collaboration: Use cloud services (Otter.ai, Rev.ai) if you don't mind uploading audio

This guide covers all three methods and explains why macOS Dictation is not a workable fourth. Read on for step-by-step instructions.

Method 1: Apple Voice Memos Transcription (Free, Limited)

Best for: iPhone users who recorded on Voice Memos and synced to Mac

Apple's Voice Memos app includes basic transcription for recordings made on iPhone and synced via iCloud.

How to Use Voice Memos Transcription

  1. Record on iPhone: Open Voice Memos app, tap red record button
  2. Sync to Mac: Enable iCloud sync (Settings → [Your Name] → iCloud → Voice Memos)
  3. Open on Mac: Launch Voice Memos app on Mac
  4. View transcript: Select recording, click "Transcription" tab

Limitations

  • iPhone-only recordings: Only works for recordings made on iPhone, not audio files from other sources
  • Cloud required: Requires iCloud sync and internet connection
  • No export: Can't export transcript as text file (copy/paste only)
  • No speaker detection: Can't identify multiple speakers
  • Basic formatting: No punctuation improvements or filler word removal
  • Language limited: Works best with English; other languages have poor accuracy

Verdict: Good for casual iPhone recordings, but not suitable for professional transcription or imported audio files.

Method 2: Local AI Transcription with Hapi (Recommended)

Best for: Privacy-conscious users, long recordings, professional transcription, offline use

Hapi is a macOS app that transcribes audio files entirely on your Mac using local AI models. No internet required, no file size limits, no cloud processing.

How to Transcribe Audio Files with Hapi

Step 1: Download Hapi (Free)

  • Visit speakhapi.com
  • Download and install the app
  • Grant microphone permission when prompted

Step 2: Import Audio File

  • Open Hapi from menu bar
  • Click "Transcripts" → "Import Audio File"
  • Select your audio file (MP3, M4A, WAV, AAC, FLAC, etc.)

Step 3: Configure Settings (Optional)

  • Select language (or use auto-detection)
  • Enable speaker detection if multiple speakers
  • Choose engine: WhisperKit (accuracy) or Parakeet (speed)

Step 4: Start Transcription

  • Click "Transcribe"
  • Hapi processes the file locally on your Mac
  • Progress indicator shows completion percentage

Step 5: Review & Export

  • Read transcript in Hapi's interface
  • Edit any errors (rare with 99% accuracy)
  • Export as TXT, JSON, SRT, VTT, or Markdown
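To make the subtitle formats less abstract, here is a minimal Python sketch of what an SRT export contains. It is not Hapi's code: the `Segment` structure and function names are hypothetical, and the only assumption is that a transcript can be represented as timed text segments.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed chunk of transcript text (hypothetical structure)."""
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style SRT uses."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

VTT is close to the same layout: the file starts with a `WEBVTT` header line and the timestamps use a period instead of a comma before the milliseconds.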

Hapi Features

  • ✅ 100% local: audio never leaves your Mac
  • ✅ No file size limit: transcribe 3-hour podcasts or lectures
  • ✅ Speaker detection: identifies who said what
  • ✅ 25+ languages with automatic detection
  • ✅ Noise reduction: DTLN enhancement for better accuracy
  • ✅ AI chat: summarize, translate, or format transcripts with a local LLM
  • ✅ Export formats: TXT, JSON, SRT, VTT, Markdown
  • ✅ Free: no subscription, no usage limits

Why Hapi for Audio Files?

Privacy: Your audio files stay on your Mac. No uploads to servers, no data collection.

Accuracy: The WhisperKit engine reaches about 99% word accuracy (roughly a 1% word error rate), better than most cloud services.
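Accuracy figures like this are usually derived from word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the transcript into a trusted reference, divided by the reference length. A 99% accuracy figure corresponds to a WER of about 1%. The sketch below is a generic way to measure it on your own recordings, assuming simple lowercase, whitespace-based word splitting; it is not how Hapi or any vendor computes its published numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edits needed to turn the reference words seen
    # so far into the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

To use it, correct a few minutes of a transcript by hand, treat that as the reference, and compare it with the raw output. Punctuation and casing differences can inflate the score, so normalize both texts the same way first.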

Speed: Processes a 1-hour recording in ~5 minutes on Apple Silicon Macs (M1/M2/M3/M4).

Speaker detection: Automatically identifies different speakers and labels them in transcript (Speaker 1, Speaker 2, etc.)

Offline: Works anywhere, no internet required. Perfect for traveling or areas with poor connectivity.

AI transformation: Built-in local LLM lets you summarize transcripts, extract action items, or translate — all without sending data to ChatGPT/Claude.

Method 3: Cloud Transcription Services

Best for: Collaboration, shared transcripts, API integration

Cloud services offer transcription via web upload or API. Your audio is sent to their servers for processing.

Popular Services

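| Service    | Price                             | Accuracy | Features                  |
|------------|-----------------------------------|----------|---------------------------|
| Otter.ai   | Free (600 min/mo) / $17/mo Pro    | ~90%     | Collaboration, speaker ID |
| Rev.ai     | $0.02/min ($1.20/hour)            | ~95%     | API access, custom vocab  |
| AssemblyAI | $0.00025/sec ($0.90/hour)         | ~95%     | API, sentiment analysis   |
| Sonix      | $10/hour                          | ~95%     | Multi-language, subtitles |
| Descript   | $12/mo                            | ~90%     | Video editing, overdub    |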
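Because the services quote prices per second, per minute, or per hour, it helps to normalize them before comparing. The following Python sketch does that arithmetic with the rates listed above; the function names are made up for illustration, and real invoices may add minimums, rounding, or taxes that are not modeled here.

```python
from decimal import Decimal

# How many billing units fit into one hour of audio.
UNITS_PER_HOUR = {"second": 3600, "minute": 60, "hour": 1}


def cost_per_audio_hour(rate: str, unit: str) -> Decimal:
    """Convert a per-unit price (given as a string such as "0.02") to dollars per hour."""
    return Decimal(rate) * UNITS_PER_HOUR[unit]


def job_cost(rate: str, unit: str, audio_hours: str) -> Decimal:
    """Price of transcribing `audio_hours` hours of audio at the given rate."""
    return cost_per_audio_hour(rate, unit) * Decimal(audio_hours)
```

For example, `cost_per_audio_hour("0.02", "minute")` reproduces the $1.20/hour figure for a $0.02/min rate, and `job_cost("0.00025", "second", "10")` gives the cost of ten hours of audio at a per-second rate. `Decimal` is used instead of floats so that small per-second prices do not pick up rounding noise.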

How to Use Cloud Services

  1. Upload audio: Drag file to web interface or use API
  2. Wait for processing: Usually 5-15 minutes for 1-hour file
  3. Download transcript: Export as TXT, DOCX, SRT, etc.

Downsides of Cloud Services

  • ❌ Privacy risk: Your audio is uploaded and processed on their servers
  • ❌ Internet required: Can't transcribe offline
  • ❌ File size limits: Most cap at 2GB or 4 hours
  • ❌ Recurring costs: Subscription or pay-per-minute pricing
  • ❌ Data retention: Most keep your audio for 30-90 days

When to use cloud services: Team collaboration where multiple people need access, or API integration for automated workflows.

Method 4: macOS Speech-to-Text (Not Recommended)

macOS includes a "Dictation" feature, but it cannot transcribe audio files — it only works for live dictation into text fields.

Why it doesn't work for audio files:

  • Dictation captures microphone input, not playback audio
  • No way to feed an audio file into the dictation system
  • 40-second timeout in standard mode

Workaround (hacky, not recommended):

  1. Play audio file through speakers
  2. Use another device to record the playback
  3. Use Dictation to transcribe the re-recorded audio

This results in quality loss, background noise, and poor accuracy. Use Hapi or cloud services instead.

Comparison Table: All Methods

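| Method      | Privacy            | Accuracy | Speed        | Cost       | Offline | Speaker ID |
|-------------|--------------------|----------|--------------|------------|---------|------------|
| Voice Memos | ⚠️ iCloud required | ~85%     | Fast         | Free       | ❌ No   | ❌ No      |
| Hapi        | ✅ 100% local      | ~99%     | ~5 min/hour  | Free       | ✅ Yes  | ✅ Yes     |
| Otter.ai    | ❌ Cloud           | ~90%     | ~10 min/hour | $0-17/mo   | ❌ No   | ✅ Yes     |
| Rev.ai      | ❌ Cloud           | ~95%     | ~5 min/hour  | $1.20/hour | ❌ No   | ✅ Yes     |
| AssemblyAI  | ❌ Cloud           | ~95%     | ~10 min/hour | $0.90/hour | ❌ No   | ✅ Yes     |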

Audio File Formats Supported

Most transcription apps support these formats:

Common formats:

  • MP3 (MPEG audio)
  • M4A (Apple audio)
  • WAV (uncompressed)
  • AAC (Advanced Audio Coding)

Professional formats:

  • FLAC (lossless)
  • OGG (Vorbis)
  • WMA (Windows Media)
  • AIFF (Apple audio)

Hapi automatically converts any format to the required input format (16kHz mono WAV) before transcription, so you can drag-and-drop any audio file without preprocessing.
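If you ever need the same 16kHz mono WAV conversion for a tool that does not do it for you, one common approach is the `ffmpeg` command-line tool. The sketch below is an assumption-laden illustration, not a description of Hapi's internals: it assumes `ffmpeg` is installed and on your PATH, and the function names and the `_16k.wav` output naming are made up for the example.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: Path, target: Path) -> list[str]:
    """Build an ffmpeg call that resamples to 16 kHz, mono, 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the target if it already exists
        "-i", str(source),    # input file in any format ffmpeg can decode
        "-ar", "16000",       # audio sample rate: 16 kHz
        "-ac", "1",           # audio channels: mono
        "-c:a", "pcm_s16le",  # uncompressed 16-bit little-endian PCM
        str(target),
    ]


def convert_to_16k_mono_wav(source: Path) -> Path:
    """Run the conversion and return the path of the new WAV file."""
    target = source.with_name(source.stem + "_16k.wav")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(source, target), check=True)
    return target
```

Calling `convert_to_16k_mono_wav(Path("interview.mp3"))` would write `interview_16k.wav` next to the original. Passing the command as a list avoids shell quoting problems with file names that contain spaces.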

Tips for Better Transcription Accuracy

1. Improve Audio Quality

  • Reduce background noise: Record in a quiet environment or use apps with noise reduction (Hapi has DTLN enhancement)
  • Clear speech: Speak clearly and at a moderate pace
  • Good microphone: Use an external microphone for recordings (not laptop built-in)

2. Optimize Settings

  • Correct language: Select the right language or use auto-detection
  • Speaker detection: Enable for multi-person recordings (interviews, meetings, podcasts)
  • Longer recordings: Use batch processing (WhisperKit in Hapi) for better accuracy on 30+ minute files

3. Post-Processing

  • Review transcript: Even 99% accuracy means about 1 error per 100 words (roughly 10 errors in a 1,000-word transcript)
  • Use AI editing: Hapi's local LLM can clean up grammar and punctuation
  • Custom dictionary: Add industry terms or names to improve recognition
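To decide how much review time to budget, you can turn an accuracy figure into an expected error count. This small Python sketch does the arithmetic; the default speaking rate of 150 words per minute is an assumption for illustration, not a figure from any transcription tool.

```python
def words_in_recording(audio_minutes: float, words_per_minute: float = 150.0) -> int:
    """Rough word count for a recording at an assumed speaking rate."""
    return round(audio_minutes * words_per_minute)


def expected_errors(word_count: int, accuracy: float) -> int:
    """Expected number of wrong words at a given accuracy (0.99 means 99%)."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(word_count * (1.0 - accuracy))
```

At 99% accuracy, `expected_errors(1000, 0.99)` gives the 10 errors mentioned above. A one-hour recording at the assumed speaking rate is about 9,000 words, so the same accuracy leaves on the order of 90 words to fix.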

Use Cases

Podcast Transcription

  • Use Hapi for privacy and speaker detection
  • Export as Markdown for blog posts or show notes
  • Use AI chat to generate episode summaries

Interview Transcription

  • Use Hapi for speaker labels (Interviewer, Interviewee)
  • Export as TXT for analysis in qualitative research software
  • Summarize key quotes with local LLM

Lecture Transcription

  • Use Hapi for long recordings (2-3 hours)
  • Export as Markdown for study notes
  • Extract action items or key concepts with AI chat

Meeting Recordings

  • Use Hapi's live meeting transcription (auto-detects Zoom/Teams)
  • Export as JSON for searchable archive
  • Generate action items with custom prompts

Legal Depositions

  • Use Hapi for 100% local processing (compliance-friendly)
  • Speaker detection for multiple parties
  • Export as TXT with timestamps for legal review

Frequently Asked Questions

Can I transcribe multiple audio files at once?

Hapi: Not yet (batch processing is planned for a future update). Transcribe files one at a time.

Cloud services: Most support batch upload (5-10 files at once).

How long does transcription take?

Hapi: ~5 minutes for 1-hour audio on an M1 Mac (faster on M3/M4)

Cloud services: ~10-15 minutes for 1-hour audio (depends on server load)
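These per-hour figures scale roughly linearly, so you can estimate the wait for a longer file. The Python sketch below assumes that linear scaling and uses the approximate rates quoted in this guide; actual times depend on your hardware, the model, and server load.

```python
def estimate_minutes(audio_minutes: float, minutes_per_audio_hour: float) -> float:
    """Estimated processing time, assuming time grows linearly with audio length."""
    return audio_minutes * minutes_per_audio_hour / 60.0


def realtime_factor(minutes_per_audio_hour: float) -> float:
    """How many times faster than playback the transcription runs."""
    return 60.0 / minutes_per_audio_hour
```

For a 3-hour lecture (180 minutes), `estimate_minutes(180, 5)` gives the local estimate at about 5 minutes per audio hour, while `estimate_minutes(180, 10)` and `estimate_minutes(180, 15)` bracket the cloud estimate. Cloud figures exclude upload time, which can dominate on a slow connection.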

Can I edit transcripts in the app?

Hapi: Yes, live editing with auto-save

Cloud services: Most have in-app editors

What's the accuracy for non-English languages?

Hapi: 95-99% for 25+ languages (Spanish, French, German, Portuguese, Japanese, Chinese, Korean, etc.)

Cloud services: 90-95% for major languages, lower for less common languages

Can I transcribe phone call recordings?

Yes, as long as you have the audio file. Most transcription apps support phone call audio formats (M4A from iPhone, WAV from call recording apps).

Is there a file size limit?

Hapi: No hard limit, tested up to 4-hour recordings

Cloud services: Typically 2GB or 4 hours max

Which Method Should You Use?

Use Voice Memos if:

  • You only transcribe iPhone voice recordings
  • You don't need high accuracy
  • You're okay with iCloud sync

Use Hapi if:

  • You transcribe audio files from any source
  • You value privacy (100% local processing)
  • You want speaker detection
  • You need high accuracy (99%)
  • You want offline transcription
  • You need AI transformation (summarize, translate, etc.)
  • You want it free with no usage limits

Use cloud services if:

  • You need team collaboration on transcripts
  • You're integrating transcription into automated workflows (API)
  • You don't mind uploading audio to servers
  • You're okay with recurring subscription costs

Get Started with Hapi

For most Mac users who want to transcribe audio files locally with high accuracy, Hapi is the best choice — it's free, private, accurate, and works offline.

Why Hapi?

  • 100% local — nothing sent to the cloud
  • 25+ languages with auto-detection
  • Meeting recording with speaker labels
  • Free — no subscription
