SongSee
Generate spectrograms and feature-panel visualizations from audio files. Multi-panel analysis with customizable styles.
SongSee is a Go-based CLI by Peter Steinberger (steipete) that generates beautiful spectrogram and feature-panel visualizations from audio files. It transforms any audio (WAV, MP3, or anything ffmpeg can handle) into rich visual representations — spectrograms, mel-frequency plots, chroma diagrams, harmonic/percussive separation, self-similarity matrices, and more.
The tool offers 9 distinct visualization modes, each revealing different aspects of the audio: spectrogram (time × frequency magnitude), mel (perceptual frequency scale), chroma (12-bin pitch class distribution), hpss (harmonic vs. percussive separation), selfsim (self-similarity matrix for structural analysis), loudness (volume over time), tempogram (tempo variation), mfcc (timbre fingerprint), and flux (spectral change detection). You can generate any combination of these as a multi-panel grid image.
Six color palettes are available — classic, magma, inferno, viridis, gray, and the playful 'clawd' 🦞 palette. Each panel gets automatic per-panel percentile normalization (auto-contrast), so heatmaps stay readable regardless of the source audio's dynamic range.
The tool is written in pure Go with no Python dependencies, making it fast and easy to install. It supports configuration of FFT window size, hop size, frequency range filtering, time range selection (start/duration), and custom output dimensions. Output formats include PNG and JPEG.
For OpenClaw users, SongSee adds audio analysis capabilities to the AI toolkit. 'Show me the spectrogram of this podcast episode' or 'Analyze the frequency content of this recording' become visual outputs the agent can generate and share. Combined with the canvas tool, visualizations can be displayed directly on connected devices.
The most practical use cases are music production (analyzing mix frequency balance), podcast editing (identifying noise or silence patterns), audio forensics (visual inspection of recordings), and education (teaching audio signal processing concepts visually).
Best suited for: musicians and audio engineers wanting quick visualizations, podcast editors analyzing audio quality, developers building audio analysis pipelines, educators teaching audio/signal processing, anyone wanting to 'see' their audio.
Tags: audio, visualization, spectrogram, music, analysis
Category: Media
Use Cases
- Music production: analyze frequency balance and spectral content of a mix
- Podcast editing: identify noise, silence, or clipping in recordings
- Audio forensics: visual inspection of recordings for anomalies
- Education: generate visual aids for audio/signal processing courses
- AI audio analysis: generate visualizations that can be analyzed with vision models
- Content creation: beautiful spectrogram images for social media or documentation
- Tempo analysis: use tempogram mode to visualize BPM changes in music
Tips
- Start with `songsee track.mp3` for a quick default spectrogram — customize from there
- Use `--viz spectrogram,mel,chroma` for a comprehensive 3-panel music analysis
- The 'hpss' mode is great for identifying whether audio is percussion-heavy or harmonic
- Use `--start` and `--duration` to analyze specific sections without processing the full file
- The 'clawd' palette adds personality to your visualizations 🦞
- Combine with the canvas skill to display audio visualizations on connected screens
- Use high resolution (`--width 2560 --height 1440`) for detailed analysis of complex audio
- Pair with Whisper transcription: visualize + transcribe for comprehensive audio analysis
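For the pipeline use case, the per-file invocation scripts cleanly. A minimal batch sketch, assuming `songsee` is on PATH and using only the `--viz` and `-o` flags shown in the configuration examples below:

```shell
# Batch-generate a 3-panel analysis image for every MP3 in the
# current directory, writing results to a viz/ subfolder.
mkdir -p viz
for f in *.mp3; do
  songsee "$f" --viz spectrogram,mel,chroma -o "viz/${f%.mp3}.png"
done
```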
Known Issues & Gotchas
- Requires ffmpeg for non-WAV input — files are converted to WAV internally before analysis
- Large audio files (1hr+) may produce very wide spectrograms — use --start and --duration to crop
- Output image size can be very large when combining all 9 modes at high resolution
- The self-similarity matrix mode is computationally expensive for long audio files
- Color palette names are specific — use exactly: classic, magma, inferno, viridis, gray, clawd
- No real-time/streaming visualization — it processes the full file first
- The tool generates static images, not interactive visualizations
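If ffmpeg is not available on the analysis machine, inputs can be pre-converted to WAV elsewhere and fed in directly. A sketch using standard ffmpeg options (the mono/44.1 kHz settings are illustrative choices, not a SongSee requirement):

```shell
# Convert any ffmpeg-readable input to 16-bit PCM WAV up front,
# so songsee can read it without invoking ffmpeg itself.
ffmpeg -i episode.m4a -ac 1 -ar 44100 -c:a pcm_s16le episode.wav
songsee episode.wav --viz spectrogram,loudness
```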
Alternatives
- SoX (sox spectrogram)
- Audacity
- librosa (Python)
- ffmpeg showspectrumpic
- Raven Pro
Community Feedback
SongSee: 9 visualization modes, 6 color palettes, auto-contrast per-panel normalization. Universal input via ffmpeg, native Go, no Python dependencies.
— GitHub
SongSee delivers multi-panel audio visualization from the terminal. Stack spectrograms, mel plots, and chroma diagrams in a single grid image.
— GitHub Releases
Generate spectrograms and feature-panel visualizations from audio files. Multi-panel analysis with customizable styles. Built by steipete.
— OpenClaw Skills Registry
Configuration Examples
Install and basic usage
# Install via Homebrew
brew install steipete/tap/songsee
# Basic spectrogram
songsee track.mp3
# Mel spectrogram with magma palette
songsee track.mp3 --viz mel --style magma
Multi-panel analysis
# Combine multiple visualization modes
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,loudness
# All 9 modes in one grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux
Custom output and cropping
# High-resolution output of a specific section
songsee track.mp3 --viz hpss,chroma --style inferno \
-o viz.png --width 2560 --height 1440 \
--start 30 --duration 60
# Frequency range focus
songsee track.mp3 --viz spectrogram --min-freq 100 --max-freq 8000
Installation
brew install steipete/tap/songsee
Homepage: https://github.com/steipete/songsee
Source: bundled