SongSee
Generate spectrograms and feature-panel visualizations from audio files. Multi-panel analysis with customizable styles.
SongSee is a Go-based CLI by Peter Steinberger (steipete) that generates beautiful spectrogram and feature-panel visualizations from audio files. It transforms any audio (WAV, MP3, or anything ffmpeg can handle) into rich visual representations — spectrograms, mel-frequency plots, chroma diagrams, harmonic/percussive separation, self-similarity matrices, and more.
The tool offers 9 distinct visualization modes, each revealing different aspects of the audio: spectrogram (time × frequency magnitude), mel (perceptual frequency scale), chroma (12-bin pitch class distribution), hpss (harmonic vs. percussive separation), selfsim (self-similarity matrix for structural analysis), loudness (volume over time), tempogram (tempo variation), mfcc (timbre fingerprint), and flux (spectral change detection). You can generate any combination of these as a multi-panel grid image.
Six color palettes are available — classic, magma, inferno, viridis, gray, and the playful 'clawd' 🦞 palette. Each panel gets automatic per-panel percentile normalization (auto-contrast), so heatmaps stay readable regardless of the source audio's dynamic range.
The tool is written in pure Go with no Python dependencies, making it fast and easy to install. It supports configuration of FFT window size, hop size, frequency range filtering, time range selection (start/duration), and custom output dimensions. Output formats include PNG and JPEG.
For OpenClaw users, SongSee adds audio analysis capabilities to the AI toolkit. 'Show me the spectrogram of this podcast episode' or 'Analyze the frequency content of this recording' become visual outputs the agent can generate and share. Combined with the canvas tool, visualizations can be displayed directly on connected devices.
The most practical use cases are music production (analyzing mix frequency balance), podcast editing (identifying noise or silence patterns), audio forensics (visual inspection of recordings), and education (teaching audio signal processing concepts visually).
Best suited for: musicians and audio engineers wanting quick visualizations, podcast editors analyzing audio quality, developers building audio analysis pipelines, educators teaching audio/signal processing, anyone wanting to 'see' their audio.
Tags: audio, visualization, spectrogram, music, analysis
Category: Media
Use Cases
- Music production: analyze frequency balance and spectral content of a mix
- Podcast editing: identify noise, silence, or clipping in recordings
- Audio forensics: visual inspection of recordings for anomalies
- Education: generate visual aids for audio/signal processing courses
- AI audio analysis: generate visualizations that can be analyzed with vision models
- Content creation: beautiful spectrogram images for social media or documentation
- Tempo analysis: use tempogram mode to visualize BPM changes in music
Tips
- Start with `songsee track.mp3` for a quick default spectrogram — customize from there
- Use `--viz spectrogram,mel,chroma` for a comprehensive 3-panel music analysis
- The 'hpss' mode is great for identifying whether audio is percussion-heavy or harmonic
- Use `--start` and `--duration` to analyze specific sections without processing the full file
- The 'clawd' palette adds personality to your visualizations 🦞
- Combine with the canvas skill to display audio visualizations on connected screens
- Use high resolution (`--width 2560 --height 1440`) for detailed analysis of complex audio
- Pair with Whisper transcription: visualize + transcribe for comprehensive audio analysis
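For the pipeline use case, the per-file invocation scripts cleanly. A minimal batch sketch, assuming `songsee` is on PATH and using only the `--viz` and `-o` flags shown in the configuration examples below:

```shell
# Batch-generate a 3-panel analysis image for every MP3 in the
# current directory, writing results to a viz/ subfolder.
mkdir -p viz
for f in *.mp3; do
  songsee "$f" --viz spectrogram,mel,chroma -o "viz/${f%.mp3}.png"
done
```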
Known Issues & Gotchas
- Requires ffmpeg for non-WAV input — files are converted to WAV internally before analysis
- Large audio files (1hr+) may produce very wide spectrograms — use --start and --duration to crop
- Output image size can be very large when combining all 9 modes at high resolution
- The self-similarity matrix mode is computationally expensive for long audio files
- Color palette names are specific — use exactly: classic, magma, inferno, viridis, gray, clawd
- No real-time/streaming visualization — it processes the full file first
- The tool generates static images, not interactive visualizations
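If ffmpeg is not available on the analysis machine, inputs can be pre-converted to WAV elsewhere and fed in directly. A sketch using standard ffmpeg options (the mono/44.1 kHz settings are illustrative choices, not a SongSee requirement):

```shell
# Convert any ffmpeg-readable input to 16-bit PCM WAV up front,
# so songsee can read it without invoking ffmpeg itself.
ffmpeg -i episode.m4a -ac 1 -ar 44100 -c:a pcm_s16le episode.wav
songsee episode.wav --viz spectrogram,loudness
```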
Alternatives
- SoX (sox spectrogram)
- Audacity
- librosa (Python)
- ffmpeg showspectrumpic
- Raven Pro
Community Feedback
SongSee: 9 visualization modes, 6 color palettes, auto-contrast per-panel normalization. Universal input via ffmpeg, native Go, no Python dependencies.
— GitHub
SongSee delivers multi-panel audio visualization from the terminal. Stack spectrograms, mel plots, and chroma diagrams in a single grid image.
— GitHub Releases
Generate spectrograms and feature-panel visualizations from audio files. Multi-panel analysis with customizable styles. Built by steipete.
— OpenClaw Skills Registry
Configuration Examples
Install and basic usage
# Install via Homebrew
brew install steipete/tap/songsee
# Basic spectrogram
songsee track.mp3
# Mel spectrogram with magma palette
songsee track.mp3 --viz mel --style magma
Multi-panel analysis
# Combine multiple visualization modes
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,loudness
# All 9 modes in one grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux
Custom output and cropping
# High-resolution output of a specific section
songsee track.mp3 --viz hpss,chroma --style inferno \
-o viz.png --width 2560 --height 1440 \
--start 30 --duration 60
# Frequency range focus
songsee track.mp3 --viz spectrogram --min-freq 100 --max-freq 8000
Installation
brew install steipete/tap/songsee
Homepage: https://github.com/steipete/songsee
Source: bundled