Everything you need to know about using BleepKit to create broadcast-ready clean edits.
BleepKit combines AI detection with human review to produce clean edits you can trust. Upload your song, let the AI process it, review every flagged word yourself, and export your clean version.
Upload any audio file: WAV, MP3, FLAC, or AIFF. Choose your censor mode and output format.
The AI separates vocals, identifies the song, transcribes lyrics, and flags profanity automatically.
Listen to each flagged word in context. Approve words to censor or reject them to keep. You have full control.
Download your clean version, stems, and compliance reports: proof for labels, distributors, and legal.
Unlike fully automated tools, BleepKit never ships an edit you haven't reviewed. The AI does the heavy lifting, but you make every final decision. Nothing gets censored that shouldn't be, and nothing slips through.

WAV, MP3, FLAC, and AIFF files up to 200 MB. Maximum duration is 30 minutes for songs, 3 hours for podcasts.
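These limits can be checked locally before uploading. The following is a minimal sketch, not BleepKit's own validation: the function name, the content-type labels, and the reading of 200 MB as 200 × 1,000,000 bytes are all assumptions made for illustration.

```python
from pathlib import Path

# Limits taken from the text above; the MB-to-bytes conversion is an assumption.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac", ".aiff"}
MAX_BYTES = 200 * 1_000_000
MAX_SECONDS = {"song": 30 * 60, "podcast": 3 * 60 * 60}


def validate_upload(filename: str, size_bytes: int, duration_seconds: float,
                    content_type: str = "song") -> list[str]:
    """Return a list of problems; an empty list means the file fits the limits."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 200 MB")
    if content_type not in MAX_SECONDS:
        problems.append("unknown content type")
    elif duration_seconds > MAX_SECONDS[content_type]:
        problems.append("duration over the limit")
    return problems
```

Returning a list rather than raising on the first failure lets a caller show every problem with a file at once.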
| Duration | Credits |
|---|---|
| Up to 10 minutes | 1 credit |
| 10–20 minutes | 2 credits |
| 20–30 minutes | 3 credits |
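The table above amounts to one credit per started 10-minute block. Here is a small sketch of that arithmetic. It assumes that a song of exactly 10 or 20 minutes falls into the lower tier, which the table does not state explicitly, and the function name is hypothetical.

```python
import math

MAX_SONG_SECONDS = 30 * 60  # 30-minute limit for songs


def credits_for_song(duration_seconds: float) -> int:
    """Credits for a song, assuming tier boundaries are inclusive at the top."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds > MAX_SONG_SECONDS:
        raise ValueError("songs are limited to 30 minutes")
    # One credit per started block of 600 seconds (10 minutes).
    return math.ceil(duration_seconds / 600)
```

For example, a song of 3 minutes 35 seconds (215 seconds) costs 1 credit under this reading, and a song of 10 minutes 1 second costs 2.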
When you upload a song, BleepKit runs through several automated steps. Here is what happens behind the scenes:

AI-powered source separation splits the audio into vocals and instrumental stems with studio-quality precision.
The system attempts to identify the song using audio fingerprinting and metadata analysis. This helps find accurate lyrics.
If the song is identified, lyrics are fetched from online databases. Multiple sources are checked for the best match.
The vocal track is transcribed using AI speech recognition, providing word-level timestamps for precise censoring.
Transcribed words and retrieved lyrics are cross-referenced against profanity databases to flag explicit content.
Flagged words are compiled with timestamps and source information, ready for your review.
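The cross-referencing step can be pictured with a short sketch. This is an illustration under stated assumptions, not BleepKit's implementation: the `Word` and `Flag` types, the `flag_words` function, and the matching rule are hypothetical, and mild placeholder words stand in for a real profanity list. The `stt` and `both` labels mirror the Source column of the sample review table.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str     # transcribed token
    start: float  # seconds
    end: float    # seconds


@dataclass
class Flag:
    text: str
    start: float
    end: float
    source: str  # "stt" (transcript only) or "both" (transcript and lyrics)


def flag_words(transcript: list[Word], lyric_words: list[str],
               profanity: set[str]) -> list[Flag]:
    """Flag transcribed words found in the profanity set, noting lyric agreement."""
    lyric_set = {w.lower() for w in lyric_words}
    flags = []
    for w in transcript:
        # Normalize case and strip surrounding punctuation before matching.
        token = w.text.lower().strip(".,!?;:\"'")
        if token in profanity:
            source = "both" if token in lyric_set else "stt"
            flags.append(Flag(token, w.start, w.end, source))
    return flags
```

The sketch only covers words that appear in the transcript, because those carry timestamps. How words found only in fetched lyrics are timed is not described in the text, so it is left out here.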
The review page gives you full control over which words are censored. Every word flagged by the AI can be approved or rejected before export. This is what makes BleepKit zero-risk: you make the final call on every word.

| # | Timestamp | Word | Source | Action |
|---|---|---|---|---|
| 1 | 0:42–0:43 | **** | both | Approved |
| 2 | 1:15–1:16 | **** | stt | Approved |
| 3 | 2:08–2:09 | heck | stt | Rejected |
Each row has a play button to hear the word in context. Approve words to censor them, reject to keep them in the audio.
Use "Approve All" to censor every flagged word, or "Reject All" to keep them all. Then fine-tune individual words as needed.
Click the play button next to any word to hear a short audio snippet centered on that word. This helps you decide whether to censor it in context.
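As an illustration of what "centered on that word" means, the sketch below computes a playback window around a word and clamps it to the track boundaries. The 3-second snippet length and the function name are assumptions. The actual snippet length is not documented here.

```python
def snippet_window(word_start: float, word_end: float, track_length: float,
                   snippet_seconds: float = 3.0) -> tuple[float, float]:
    """Return (start, end) in seconds for a snippet centered on a word."""
    center = (word_start + word_end) / 2
    half = snippet_seconds / 2
    # Clamp so the window never runs past the start or end of the track.
    start = max(0.0, center - half)
    end = min(track_length, center + half)
    return start, end
```

With these assumed defaults, a word at 0:42 to 0:43 in a 200-second track would play from 41.0 to 44.0 seconds.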
Each flagged word shows where it was detected:
You can change the censor mode (silence, bleep, or reverse) directly on the review page before exporting. No need to re-upload.
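To make the three modes concrete, here is a minimal sketch that applies each one to a span of mono samples held in a plain Python list. It is not BleepKit's audio code: the function name, the 1 kHz bleep tone, and the gain are assumptions chosen for illustration.

```python
import math


def censor_span(samples: list[float], sample_rate: int, start_s: float,
                end_s: float, mode: str, bleep_hz: float = 1000.0,
                bleep_gain: float = 0.3) -> list[float]:
    """Return a copy of samples with [start_s, end_s) censored by the given mode."""
    lo = max(0, round(start_s * sample_rate))
    hi = min(len(samples), round(end_s * sample_rate))
    out = list(samples)  # leave the input untouched
    if hi <= lo:
        return out
    if mode == "silence":
        out[lo:hi] = [0.0] * (hi - lo)
    elif mode == "bleep":
        # Replace the span with a sine tone (assumed 1 kHz).
        out[lo:hi] = [bleep_gain * math.sin(2 * math.pi * bleep_hz * n / sample_rate)
                      for n in range(hi - lo)]
    elif mode == "reverse":
        out[lo:hi] = out[lo:hi][::-1]
    else:
        raise ValueError(f"unknown censor mode: {mode}")
    return out
```

Because the function works on a copy, the same source audio can be re-rendered with a different mode, which matches the idea of switching modes on the review page without re-uploading.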
Below the word table, the full transcript shows every word in the song. Already-flagged words are highlighted. Click any unflagged word to add it to the censor list, which is useful for catching words the AI missed.
After approving your word selections, BleepKit exports your files. Five downloads are available:
The full song with all approved words censored using your chosen censor mode.
The instrumental track with vocals completely removed.
The clean vocal track only, with approved words censored.
A plain text report listing every flagged word, timestamps, and your decisions.
A professionally formatted PDF with tables, song metadata, and your complete review decisions.
Changed your mind? Click "Re-Review Words" on the download page to go back and adjust your selections. Re-reviews are free and unlimited: you pay only once per upload.
Every processed song generates a compliance report: a detailed record of exactly what was censored, where, and by whom. This is documentation you can hand to a label, distributor, or legal team to prove the edit meets broadcast standards.
Machine-readable plain text with every flagged word, timestamp, detection source, and approval status. Ideal for archiving and automated workflows.
Professionally formatted with BleepKit branding, song metadata, censored word tables, and processing details. Ready to attach to submissions or hand to stakeholders.
Reports include: song/podcast metadata, processing date, censor mode used, total words flagged, individual word timestamps and sources, approval decisions, and output file details.
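For readers who want to picture the plain-text report, the sketch below renders a few of the fields listed above as tab-separated lines. The layout, field labels, and names (`ReviewedWord`, `render_report`) are assumptions. The real TXT format may differ.

```python
from dataclasses import dataclass


@dataclass
class ReviewedWord:
    start: str      # timestamp, e.g. "0:42"
    end: str
    word: str
    source: str     # detection source, e.g. "stt" or "both"
    approved: bool  # True if the word was approved for censoring


def render_report(title: str, processed_on: str, censor_mode: str,
                  words: list[ReviewedWord]) -> str:
    """Build a plain-text report: a short header, then one line per flagged word."""
    lines = [
        f"Title: {title}",
        f"Processed: {processed_on}",
        f"Censor mode: {censor_mode}",
        f"Words flagged: {len(words)}",
        "",
    ]
    for i, w in enumerate(words, start=1):
        decision = "Approved" if w.approved else "Rejected"
        lines.append(f"{i}\t{w.start}-{w.end}\t{w.word}\t{w.source}\t{decision}")
    return "\n".join(lines)
```

A tab-separated body keeps the report readable by people and easy to parse in an automated workflow.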
Customize BleepKit to match your workflow from the Settings page.
Set your preferred censor mode, output format, content type, and normalization settings. These defaults auto-populate the upload form so you don't have to set them every time.
Add words that should always be flagged for censoring, even if the AI does not flag them. One word per line. These words will be checked against every song you process.
Add words that should never be flagged, even if the AI detects them. Useful for brand names, slang, or context-specific terms that are not actually profanity.
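The two custom lists can be thought of as overrides on top of the AI's decision. The sketch below parses a one-word-per-line list and applies both overrides. The function names are hypothetical, and letting the never-flag list win when a word appears in both lists is an assumption, since the text does not say which takes precedence.

```python
def parse_word_list(text: str) -> set[str]:
    """Parse a one-word-per-line setting into a lowercase set, skipping blank lines."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def should_flag(word: str, ai_flagged: bool,
                always_flag: set[str], never_flag: set[str]) -> bool:
    """Combine the AI's decision with the custom lists."""
    token = word.lower()
    if token in never_flag:   # assumed to take precedence over always_flag
        return False
    if token in always_flag:
        return True
    return ai_flagged
```

A word on neither list simply keeps whatever decision the AI made.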
BleepKit automatically detects the language of your audio and applies language-specific profanity detection. The following languages are currently supported:
Accuracy is highest for major world languages. Less common languages are supported but results may vary. Language is auto-detected during processing.
Each song you upload costs credits based on its duration. Credits are deducted when processing begins. Re-reviewing and re-exporting a song does not cost additional credits.
Your first song is completely free, with no credit card required. Podcast users get 15 free minutes.

Podcast and Video modes use a simplified pipeline optimized for spoken word content. Select "Podcast" or "Video" when uploading to activate this mode. The same AI + human review approach applies: you approve every word before export.
Upload MP4, MOV, MKV, or WebM files up to 2 GB. BleepKit extracts the audio, cleans it, and muxes it back into the original video container, so your video quality is untouched. Perfect for YouTube creators who need to keep content monetization-safe without re-editing.
Podcast exports include the clean audio file and compliance reports (TXT + PDF). Video exports include the clean video file with the censored audio track. Instrumental and acapella stems are not available for media content since there is no vocal separation.
BleepKit is highly accurate, but no AI system is perfect. Here are scenarios where results may vary:
Extremely noisy recordings (low-quality live recordings, heavy distortion) may reduce transcription accuracy. Studio-quality audio produces the best results.
When multiple speakers or singers overlap simultaneously, the AI may miss words or flag incorrect timestamps. This is most common in group choruses or crosstalk in podcasts.
Some words have both clean and explicit meanings depending on context. The AI flags conservatively, and you can always reject false positives on the review page.
While 30+ languages are supported, accuracy is highest for major world languages. Less common languages or heavy regional accents may produce lower detection rates.
Very quiet, mumbled, or whispered vocals may not be accurately transcribed. The full transcript on the review page lets you catch anything the AI missed.
Your safety net: Even when the AI isn't perfect, the review step lets you catch what it got wrong. You see every flagged word, hear it in context, can add missed words from the full transcript, and make the final decision. That's why we say AI + human review = zero-risk.
Our team is here to help. Reach out and we will get back to you as soon as possible.
hello@bleepkit.com