SAG (ElevenLabs TTS)

ElevenLabs text-to-speech with a macOS `say`-style UX. High-quality voice synthesis with expressive audio tags.

SAG ('say' + ElevenLabs = 'sag') is a CLI by Peter Steinberger (steipete) that brings ElevenLabs' studio-quality text-to-speech to the terminal with the simplicity of macOS's `say` command. It is one-liner TTS that streams to speakers by default, lists available voices, or saves to audio files, and it is designed to be the fastest path from text to expressive speech. Where `say` gives you robotic system voices, SAG delivers ElevenLabs' neural voices: expressive, natural-sounding, and capable of conveying emotion, emphasis, and conversational cadence. The voices sound human enough to fool casual listeners.

Key features include streaming playback (audio starts playing before the full text is synthesized), voice selection from ElevenLabs' extensive library, file output in multiple formats (mp3, wav, opus), and support for ElevenLabs' expressive audio tags for fine-grained control over speech delivery. The CLI handles authentication via the ELEVENLABS_API_KEY environment variable.

The voice library is massive: ElevenLabs offers hundreds of pre-built voices across different genders, ages, accents, and speaking styles. You can also use custom cloned voices if you've created them in ElevenLabs. The `sag --list` command shows all available voices with their characteristics.

For OpenClaw users, SAG transforms the AI assistant from text-only to voice-capable. Stories become audiobooks, notifications become spoken alerts, and movie summaries become dramatic readings with character voices. The skill is especially delightful for storytelling, bedtime stories for kids, and adding personality to notifications.

Pricing follows ElevenLabs' character-based billing. The free tier gives ~10,000 characters/month (~10 minutes of audio). Paid plans start at $5/month for 30,000 characters. The API is token-efficient: it sends text and receives audio in a single streaming request.
Best suited for: OpenClaw users wanting voice output for their AI, creative projects (stories, podcasts, narration), notification systems that speak alerts, and anyone who wants better TTS than macOS `say` without the complexity.
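
Because billing is per character, it can help to size a text before speaking it. The following is a minimal sketch, not part of SAG itself: `estimate_usage` is a hypothetical helper name, the 10,000 default is the free-tier figure quoted above, and `wc -m` only approximates how ElevenLabs counts characters.

```shell
# Hypothetical helper (not a sag feature): rough character-quota estimate.
estimate_usage() {
  # $1: text file; $2: monthly character quota (assumed default: 10000)
  chars=$(( $(wc -m < "$1") ))   # arithmetic strips wc's leading spaces
  quota=${2:-10000}
  printf '%d characters, about %d%% of a %d-character monthly quota\n' \
    "$chars" "$(( chars * 100 / quota ))" "$quota"
}
```

For example, `estimate_usage story.txt 30000` would size the same file against the 30,000-character paid tier mentioned above.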

Tags: tts, voice, elevenlabs, audio, speech

Category: Voice

Use Cases

  • AI storytelling: voiced stories, bedtime tales, dramatic readings
  • Spoken notifications: deployment alerts, email summaries, calendar reminders
  • Content narration: blog posts, articles, newsletters read aloud
  • Podcast creation: generate voice segments for audio content
  • Accessibility: voice output for vision-impaired users
  • Movie/book summaries as engaging audio presentations
  • Multi-character voice acting for creative writing

Tips

  • Use `sag --list` to browse all available voices and find one that matches your use case
  • For OpenClaw storytelling, pick different voices for different characters — adds dramatic flair
  • Save audio files for reuse: `sag --output story.mp3 'Once upon a time...'`
  • Use streaming mode (default) for real-time feedback — don't wait for full synthesis
  • Pair with the summarize skill for voice-narrated content summaries
  • For notifications, keep the message short and punchy: `sag 'Your deployment is complete!'`
  • Combine with cron for scheduled spoken reminders or daily briefings
  • Use ElevenLabs' audio tags for expressive control: emphasis, pauses, and emotional tone
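
The per-character voice tip above can be sketched as a small wrapper. This is an illustration under assumptions, not something SAG ships: `narrate_script` and the `Voice|text` file format are hypothetical, `SAG_CMD` is an override added for testing, and only the `--voice` flag shown elsewhere in this document is used.

```shell
# Hypothetical helper: speak a script file with one "Voice|text" entry per line.
narrate_script() {
  # $1: path to the script file. It is read on fd 3 so that sag's own
  # stdin is left untouched inside the loop.
  while IFS='|' read -r voice text <&3 || [ -n "$voice" ]; do
    # Skip blank lines and entries that have no text
    if [ -n "$voice" ] && [ -n "$text" ]; then
      "${SAG_CMD:-sag}" --voice "$voice" "$text"
    fi
  done 3< "$1"
}
```

A script file containing `Adam|In a world where machines could think...` followed by `Bella|One AI chose to dream.` would then be read aloud line by line, each in its own voice.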

Known Issues & Gotchas

  • Requires ELEVENLABS_API_KEY — free tier available but limited to ~10,000 characters/month
  • Synthesis requires an internet connection; there is no offline mode (unlike Sherpa ONNX TTS)
  • Voice cloning features require an ElevenLabs paid plan
  • Character limits are per-month, not per-request — long texts consume quota quickly
  • Audio quality depends on the voice selected — some voices handle certain accents better
  • Playback requires audio output — headless servers won't play audio (use --output to save files)
  • ElevenLabs may add latency for very long texts — break into smaller chunks for smoother streaming
  • Some voices have usage restrictions — check ElevenLabs' terms for commercial use
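
Two of the gotchas above (long texts and headless servers) can be handled together. The sketch below assumes paragraphs separated by blank lines make reasonable chunks; `speak_in_chunks`, the `chunk-NNN.mp3` naming, and the `SAG_CMD` override are hypothetical, and only the `--output` flag shown in this document is used.

```shell
# Hypothetical helper: speak a long text one paragraph at a time.
speak_in_chunks() {
  # $1: text file
  # $2 (optional): existing directory; if given, chunks are saved as
  #    chunk-NNN.mp3 instead of being played (useful on headless servers)
  file=$1 out_dir=${2:-} n=0
  tmp=$(mktemp) || return 1
  # Paragraph mode (RS = ""): join each blank-line-separated paragraph onto one line
  awk 'BEGIN { RS = "" } { gsub(/\n/, " "); sub(/ +$/, ""); print }' "$file" > "$tmp"
  while IFS= read -r chunk <&3; do
    n=$((n + 1))
    if [ -n "$out_dir" ]; then
      "${SAG_CMD:-sag}" --output "$(printf '%s/chunk-%03d.mp3' "$out_dir" "$n")" "$chunk"
    else
      "${SAG_CMD:-sag}" "$chunk"
    fi
  done 3< "$tmp"
  rm -f "$tmp"
}
```

Splitting on paragraphs rather than at a fixed length keeps sentences whole, which should help each chunk sound natural on its own. Each chunk still counts toward the monthly character quota.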

Alternatives

  • macOS say
  • Sherpa ONNX TTS (Local)
  • OpenClaw built-in TTS
  • Piper TTS
  • Amazon Polly / Google Cloud TTS

Community Feedback

One-liner TTS that works like say: stream to speakers by default, list voices, or save audio files. Mac-style speech with ElevenLabs quality.

— GitHub

Initial release of sag ElevenLabs TTS CLI with macOS say-style flags. Streaming default playback to speakers with optional file output.

— GitHub CHANGELOG

SAG is incredible for storytelling. Movie summaries become dramatic readings, bedtime stories become audiobooks. The voices are so natural people forget it's AI.

— OpenClaw Community

Configuration Examples

Install and setup

# Install via Homebrew
brew install steipete/tap/sag

# Set API key
export ELEVENLABS_API_KEY="your-key-here"

# List available voices
sag --list

Basic speech

# Speak text (streams to speakers)
sag 'Hello! I am your AI assistant.'

# Use a specific voice
sag --voice 'Rachel' 'The weather today is sunny with a high of 28 degrees.'

# Save to file
sag --output greeting.mp3 'Welcome to the show!'

Creative usage

# Dramatic story narration
sag --voice 'Adam' 'In a world where machines could think... one AI chose to dream.'

# Read a file aloud
cat summary.txt | sag --voice 'Bella'

# Quick notification
sag 'Your build succeeded. All 47 tests passed.'
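
The quick notification above can be tied to a command's actual outcome. This is a minimal sketch: `speak_result` is a hypothetical wrapper name and `SAG_CMD` is an override added for testing; it calls `sag` with a single text argument, as in the examples in this section.

```shell
# Hypothetical wrapper: run a command, then announce whether it succeeded.
speak_result() {
  # $1: label to announce; remaining arguments: the command to run
  label=$1
  shift
  if "$@"; then
    "${SAG_CMD:-sag}" "$label succeeded."
  else
    rc=$?   # exit code of the wrapped command
    "${SAG_CMD:-sag}" "$label failed with exit code $rc."
    return "$rc"
  fi
}
```

Usage might look like `speak_result 'Build' make test`. The wrapper passes the command's exit code through on failure, so it can sit in the middle of a larger script without hiding errors.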

Installation

brew install steipete/tap/sag

Homepage: https://sag.sh

Source: bundled