Summarize

Summarize or extract text/transcripts from URLs, podcasts, YouTube videos, and local files.

Summarize is a powerhouse content-extraction CLI and Chrome extension built by Peter Steinberger (steipete). It's the Swiss Army knife for turning any URL, video, podcast, PDF, or local file into clean, structured text, then optionally summarizing it with your choice of AI model. Extraction is deterministic, and AI summarization is an optional layer for when you want it.

The media awareness is what makes Summarize special. Point it at a YouTube video and it extracts captions (with timestamps via --timestamps). Give it a podcast RSS feed and it pulls the audio transcript. Hand it a PDF and it extracts the text. A webpage? Clean Markdown with metadata. It auto-detects the content type and routes it through the right pipeline, with no manual configuration needed.

Transcription support is deep: YouTube captions (multiple fallback methods including yt-dlp), podcast episodes via RSS feed URLs, direct audio/video files, and even Gemini-powered audio/video transcription for content without existing captions. Local Whisper support means you can transcribe without any API calls when privacy matters.

The CLI outputs streaming Markdown with metrics and cache-aware status. JSON output is available for automation. It supports multiple LLM backends for summarization: OpenAI, Anthropic, Google Gemini, and local models. The Chrome extension adds a side panel for in-browser summarization of any page you're viewing.

For OpenClaw users, Summarize is the bridge between 'here's a link' and 'here's what it says.' Drop a YouTube URL in chat and get a transcript. Share an article link and get a summary. It's particularly valuable for the blogwatcher skill pipeline: detect new blog posts → extract content → summarize → notify.

Best suited for: anyone who needs to extract text from diverse media, content curators processing multiple sources, developers building content pipelines, and researchers who need transcripts from videos and podcasts.

Tags: summarization, youtube, transcription, url, ai

Category: AI

Use Cases

YouTube video transcription: get full text from any video with captions
Article summarization: drop a URL, get a clean summary
Podcast transcription: extract text from podcast episodes for notes
Content curation pipeline: blogwatcher → summarize → notify
Research: extract and summarize multiple papers/articles for comparison
Meeting recording transcription: process local audio/video files
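
The content curation use case (detect, extract, summarize, notify) could be wired up roughly as follows. This is a minimal sketch, not part of Summarize: it assumes new URLs arrive one per line on stdin, and `process_urls` and `notify` are hypothetical helper names standing in for your own glue code. Only the plain `summarize <url>` invocation is taken from the examples in this listing.

```shell
# Hypothetical notifier: replace the body with your own delivery step
# (chat message, email, file append). Here it just prints to stdout.
notify() {
  printf '== %s ==\n%s\n\n' "$1" "$2"
}

# Read URLs from stdin, summarize each one, and pass the result to notify.
process_urls() {
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines and lines starting with '#'.
    case $url in
      ''|'#'*) continue ;;
    esac
    # </dev/null keeps the command from consuming the URL list on stdin.
    if summary=$(summarize "$url" </dev/null); then
      notify "$url" "$summary"
    else
      printf 'skipped (summarize failed): %s\n' "$url" >&2
    fi
  done
}
```

A run might look like `process_urls < new-posts.txt`, where `new-posts.txt` is whatever list your watcher step produces. A failed URL is reported on stderr and skipped, so one blocked site does not stop the whole batch.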

Tips

Use `--extract-only` for deterministic text extraction without AI summarization
Add `--timestamps` for YouTube/podcast transcripts to get segment-level timings
Use `--json` output for automation pipelines: `summarize https://example.com --json | jq .text`
Pair with blogwatcher for automated content monitoring: detect → extract → summarize → notify
Use local Whisper (`--whisper-model medium`) for private transcription without API calls
The Chrome extension is great for quick summarization while browsing — install from summarize.sh
Cache saves API calls — repeated URLs return cached results instantly
Use the `--model` flag to pick a model suited to your content, for example Gemini for long articles that need a large context window

Known Issues & Gotchas

AI summarization requires an LLM API key (OpenAI, Anthropic, or Google) — extraction is free
YouTube caption extraction fails when a video has no captions; Summarize then falls back to a yt-dlp audio download plus Whisper transcription
yt-dlp must be installed for YouTube fallback transcription: `brew install yt-dlp`
Local Whisper transcription requires significant CPU/GPU — slow on older machines
Some websites block scraping — you may get incomplete extractions from heavily protected sites
Podcast transcription requires the RSS feed URL, not just the podcast name
Cache is per-URL — clear with `summarize cache clear` if you need fresh extraction
Gemini audio/video transcription requires a Google API key and may have file size limits
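
Several of the gotchas above come down to a missing external tool, so a small preflight check before a long batch can save time. The sketch below is an illustration only: `missing_tools` is a hypothetical helper, not something Summarize provides, and the tool names to check are whatever you pass in.

```shell
# Report which of the named commands are not on PATH.
# Returns 0 when everything is present, 1 otherwise.
missing_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  if [ -n "$missing" ]; then
    printf 'missing:%s\n' "$missing" >&2
    return 1
  fi
  return 0
}
```

For the workflows in this listing, a call such as `missing_tools summarize yt-dlp jq || exit 1` at the top of a script would stop early instead of failing halfway through a YouTube fallback.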

Alternatives

yt-dlp + Whisper (manual)
Fabric (danielmiessler)
Reader API (Jina)
Trafilatura (Python)
Whisper CLI (standalone)

Community Feedback

Step 1: great CLI. Step 2: Chrome Extension. Get Summarize: runs local/free/paid models. Any page: YouTube, podcasts, articles. No transcript? Local Whisper.
— Twitter/X (steipete)

YouTube Summarizer: Extracts and summarizes YouTube video transcripts to help generate descriptions, headlines, and social copy. One of the top 10 OpenClaw skills.
— Composio Blog

Peter Steinberger discusses how summarize was built to be the content extraction layer for OpenClaw — deterministic text extraction that any AI model can then process.
— Lex Fridman Podcast Transcript

OpenClaw is a game changer. Runs on my own machine, where I can control network traffic. I can review the Agent skills it uses — like summarize for content extraction.
— Reddit r/vibecoding

Configuration Examples

Basic URL extraction and summarization

# Extract text only (no AI)
summarize https://example.com/article --extract-only

# Summarize with default model
summarize https://example.com/article

# JSON output for automation
summarize https://example.com/article --json | jq '{title: .title, text: .text[:200]}'

YouTube and podcast transcription

# YouTube transcript with timestamps
summarize "https://youtube.com/watch?v=xxx" --timestamps

# Podcast episode from RSS
summarize https://feeds.example.com/podcast.xml --episode 1

# Local video file
summarize ./meeting-recording.mp4 --whisper-model medium
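
For a folder of recordings, the local-file command above can be wrapped in a loop. This is a rough sketch under stated assumptions: the recordings are `.mp4` files, the CLI writes its result to stdout, and the output naming (`<name>.md` next to each recording) is a hypothetical convention. `transcribe_dir` is an illustrative helper name, not a Summarize feature.

```shell
# Run `summarize <file> --whisper-model <model>` for every .mp4 in a folder.
# Usage: transcribe_dir <directory> [model]   (model defaults to "medium")
transcribe_dir() {
  dir=$1
  model=${2:-medium}
  for f in "$dir"/*.mp4; do
    # When nothing matches, the glob pattern stays literal; skip it.
    [ -e "$f" ] || continue
    out="${f%.mp4}.md"
    # Skip recordings that already have a non-empty output file.
    [ -s "$out" ] && continue
    if ! summarize "$f" --whisper-model "$model" >"$out"; then
      printf 'failed: %s\n' "$f" >&2
      # Remove the partial output so a later run retries this file.
      rm -f "$out"
    fi
  done
}
```

Because finished files are skipped, rerunning `transcribe_dir ./recordings` after an interruption only processes what is still missing, which matters when local Whisper is slow on the machine.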

Model selection and caching

# Use a specific model
summarize https://example.com --model gpt-4o

# Use Gemini for long content
summarize https://long-article.com --model gemini-2.0-flash

# Clear cache for a URL
summarize cache clear https://example.com

Installation

brew install steipete/tap/summarize

Homepage: https://summarize.sh

Source: bundled