Local & private
Whisper-class speech-to-text runs on your machine. Audio is never uploaded; the only network calls are when you ask CMVideo to download a video from a URL.
desktop v0.4.18-alpha · mini 2026.05.28.1-alpha · open · local-first · live
Drag in any clip. Drop out a clean copy. CMVideo finds every swear and slur with on-device speech recognition, then silences, beeps, or replaces them with a text-to-speech (TTS) overlay, whichever you pick.
Desktop app: 100% local — no accounts, no clip uploads. Audio never leaves your machine. This web page may use optional analytics/ads with your consent.
Drop one or many files, or paste a YouTube / yt-dlp-supported URL. Pick the output format (MP4, MOV, MP3, WAV, OGG) and quality. Batches run unattended.
Replace flagged words with crisp silence, a classic 1 kHz censor beep, or a TTS overlay. There are sixteen selectable voices, including six livestream-style ElevenLabs voices (Brian / Adam / Sam / ...) when you supply your own API key. Add the "Retro audio" colour for a lo-fi crunch, or use the File-size presets to ship a smaller MP4 / MP3.
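As a rough illustration of the silence and beep modes, the sketch below overwrites one flagged time span in a list of mono float samples. It is not CMVideo's actual code: the function name `censor_span`, its parameters, and the 0.3 amplitude are assumptions made for this example, and only the 1 kHz tone frequency comes from the description above.

```python
import math


def censor_span(samples, sample_rate, start_s, end_s,
                mode="beep", freq_hz=1000.0, amplitude=0.3):
    """Return a copy of `samples` with [start_s, end_s) silenced or beeped.

    `samples` is a sequence of mono floats in the range -1.0..1.0.
    All names here are hypothetical, chosen for illustration only.
    """
    out = list(samples)
    # Convert the flagged time span to sample indices, clamped to the clip.
    start = max(0, int(start_s * sample_rate))
    end = min(len(out), int(end_s * sample_rate))
    for i in range(start, end):
        if mode == "silence":
            out[i] = 0.0
        else:
            # Sine tone whose phase starts at zero at the span's first sample.
            t = (i - start) / sample_rate
            out[i] = amplitude * math.sin(2.0 * math.pi * freq_hz * t)
    return out


if __name__ == "__main__":
    clip = [0.5] * 8000  # one second of placeholder audio at 8 kHz
    beeped = censor_span(clip, 8000, 0.25, 0.5, mode="beep")
    muted = censor_span(clip, 8000, 0.25, 0.5, mode="silence")
    print(len(beeped), len(muted))
```

A real implementation would probably fade the edges of the span to avoid clicks; that is left out here to keep the sketch short.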
Linux gets a single-file AppImage; Windows gets a single .exe. Download and double-click: no installer, no Python. macOS stays source-first for now (untested).
.AppImage · chmod +x → double-click · no install
.exe · double-click to run · no install
.tar.gz · works via the Linux instructions · untested
Linux quickstart:
chmod +x CMVideo-0.4.18-alpha-x86_64.AppImage && ./CMVideo-0.4.18-alpha-x86_64.AppImage
Windows: if SmartScreen appears, choose “More info” → “Run anyway” (unsigned prerelease build).
checksums & release notes
Looking for an older build? All previous releases stay available on GitHub (0.4.0-alpha and onwards).
MP4, MOV, MP3, WAV, or OGG files locally, or any URL from YouTube and the 1,800+ sites supported by yt-dlp.
Swears, slurs, or both. Choose Silence, Beep, or TTS (sixteen voices). Add Retro audio or shrink the output via File-size presets. Optionally save a full transcript.
CMVideo transcribes, finds every match (including phonetic / leet-speak variants), and writes the cleaned file next to the original.
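One way to catch leet-speak variants is to normalise each transcribed word before comparing it with a blocklist. The sketch below is an assumption about how such matching could work, not CMVideo's actual matcher: the substitution table, the function names, and the mild placeholder words are all invented for illustration, and phonetic matching is not attempted.

```python
import re

# Hypothetical substitution table mapping common leet characters to letters.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalize(token):
    """Lower-case, undo leet substitutions, drop punctuation, collapse repeats."""
    token = token.lower().translate(LEET_MAP)
    token = re.sub(r"[^a-z]", "", token)
    # Collapse runs of the same letter so "heeeck" and "heck" compare equal.
    return re.sub(r"(.)\1+", r"\1", token)


def find_flagged(words, blocklist):
    """Return the indices of words whose normalised form is on the blocklist."""
    normalized_block = {normalize(w) for w in blocklist}
    return [i for i, w in enumerate(words) if normalize(w) in normalized_block]


if __name__ == "__main__":
    transcript = ["what", "the", "h3ck", "d4rn!", "heeeck"]
    print(find_flagged(transcript, {"heck", "darn"}))
```

Collapsing repeated letters is applied to the blocklist as well as the transcript, so both sides are compared in the same normalised form. In practice the returned indices would be paired with word timestamps from the transcription step to decide which spans to censor.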