Adding subtitles to your videos makes them accessible to more people and keeps viewers watching even with the sound off. This guide walks you through the entire process — from uploading your video to exporting it with burned-in subtitles or downloadable SRT files.
We'll use Captions AI by YEB at captions.yeb.to, a browser-based tool that handles transcription, styling, and translation in one place.
Step 1: Get Started with Captions AI by YEB
Go to captions.yeb.to and sign in with your Google account. It takes one click. New users get free credits to test the tool before buying anything. With free credits you can transcribe and export videos up to 1 minute long — enough to test everything and see the quality. The only catch is a small YEB.to watermark on the exported video. SRT and VTT subtitle files are always free with no watermark, regardless of plan.
If you need longer videos or no watermark, buy PRO credits. There's no subscription: you buy once and use the credits whenever you want, and they never expire. Currently, a video under 5 minutes costs 2 credits to transcribe and 2 more to export as video, and a 10-minute video costs 4 credits for each of those two steps. The price scales with duration, not with features; everything is unlocked from the start.
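As a worked example of the pricing above, here is a small sketch that adds up the credits for transcribing and exporting a video. It encodes only the two price points quoted in this guide; the table, the tier names, and the function name are illustrative assumptions, not an official price list, and other durations are deliberately left out.

```python
# Illustrative only: encodes just the two price points quoted in this guide.
# Current pricing may differ, so check the site before buying credits.
CREDITS_PER_STEP = {
    "under_5_min": 2,  # a video under 5 minutes: 2 credits per step
    "10_min": 4,       # a 10-minute video: 4 credits per step
}


def total_credits(tier: str, export_video: bool = True) -> int:
    """Credits to transcribe a video and, optionally, export it as video."""
    per_step = CREDITS_PER_STEP[tier]
    # Transcription and video export are charged separately at the same rate.
    # SRT and VTT downloads are free, so they add nothing here.
    return per_step * (2 if export_video else 1)


print(total_credits("under_5_min"))    # 4
print(total_credits("10_min"))         # 8
print(total_credits("10_min", False))  # 4
```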
Step 2: Upload Your Video or Audio
Once signed in, you'll see the upload zone on the main page. Either drag and drop your file directly onto it, or click "Select File" to browse.
The tool accepts most common video formats — MP4, MOV, AVI, MKV, and WebM, with files up to 500MB. But it's not limited to video. You can upload audio files too — MP3, WAV, M4A, AAC, OGG, FLAC, up to 100MB. If you're a podcaster or just need a transcript without video, upload your audio file and the rest of the process works the same way. You'll get a full transcript with timestamps that you can export as SRT or VTT.
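If you prepare files in bulk, you can check them against these limits before uploading. The sketch below is a local helper, not part of the tool: the function name is made up, "MB" is assumed to mean 1024 * 1024 bytes, and the extension lists cover only the formats named above (the tool may accept others).

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: "MB" is treated as 1024 * 1024 bytes

# Formats and size limits as listed in this guide.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac"}
VIDEO_LIMIT = 500 * MB
AUDIO_LIMIT = 100 * MB


def check_upload(filename: str, size_bytes: int) -> str:
    """Return "video" or "audio" if the file fits the limits, else raise ValueError."""
    ext = Path(filename).suffix.lower()
    if ext in VIDEO_EXTENSIONS:
        kind, limit = "video", VIDEO_LIMIT
    elif ext in AUDIO_EXTENSIONS:
        kind, limit = "audio", AUDIO_LIMIT
    else:
        raise ValueError(f"unsupported format: {ext or 'no extension'}")
    if size_bytes > limit:
        raise ValueError(f"{kind} file is over the {limit // MB}MB limit")
    return kind


print(check_upload("episode.mp3", 40 * MB))  # audio
```

For a file on disk, pass `Path(path).stat().st_size` as the size.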
After selecting your file, two settings appear:
Source Language — pick the language spoken in your video, or leave it on "Auto-detect" if you're not sure. Auto-detect works well for most major languages.
Project Name — give it a name so you can find it later in your projects list.
When everything looks right, hit "Start Processing." The AI will begin transcribing your content. This usually takes a fraction of the actual video length — a 10-minute video transcribes in about a minute.
Step 3: Review and Edit the Transcript
Once transcription is done, you'll land in the timeline editor. Every word has been timestamped, and the words are grouped into segments laid out along a timeline synced to your video.
Here you can:
Edit text — click on any segment and fix words the AI might have gotten wrong. This is especially useful for proper nouns, brand names, or technical terms.
Adjust timing — drag segment edges to shift when subtitles appear and disappear. If a subtitle shows up a beat too early or too late, this is where you fix it.
Split or merge segments — if a subtitle chunk is too long for comfortable reading, split it. If two short fragments belong together, merge them.
Play the video at any point to check how the subtitles sync up with the audio. What you see in the editor is exactly what you'll get in the export.
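To make the split and merge operations concrete, here is a minimal sketch of how a subtitle segment could be modeled. This is an assumption for illustration, not the editor's actual data model: the `Segment` class and both functions are hypothetical, and the cut time is estimated from word counts where a real editor would use the per-word timestamps.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


def split_segment(seg: Segment, word_index: int) -> tuple[Segment, Segment]:
    """Split a segment before the word at word_index.

    The cut time is estimated in proportion to the number of words on each
    side, which is an assumption made to keep the sketch short.
    """
    words = seg.text.split()
    if not 0 < word_index < len(words):
        raise ValueError("word_index must fall strictly inside the segment")
    cut = seg.start + (seg.end - seg.start) * word_index / len(words)
    first = Segment(seg.start, cut, " ".join(words[:word_index]))
    second = Segment(cut, seg.end, " ".join(words[word_index:]))
    return first, second


def merge_segments(a: Segment, b: Segment) -> Segment:
    """Merge two adjacent segments (a before b) into one that spans both."""
    return Segment(min(a.start, b.start), max(a.end, b.end), f"{a.text} {b.text}")


seg = Segment(0.0, 4.0, "welcome to the show everyone")
first, second = split_segment(seg, 3)
print(first.text, "|", second.text)  # welcome to the | show everyone
print(merge_segments(first, second).text)  # welcome to the show everyone
```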
Step 4: Style Your Subtitles
This is where you make the subtitles look the way you want. The fastest way to get started is with presets — ready-made subtitle styles that you can apply with one click. Pick one that fits your content, and you're done. If you want to tweak it further or start from scratch, the full style editor gives you control over everything:
Font and size — pick a typeface that matches your brand or content style. Larger text works better for TikTok and Instagram Reels where people watch on small screens.
Colors — set the text color, background color, and opacity. White text with a semi-transparent dark background is the safe default. Bright colored text with no background works for a more modern TikTok style.
Position — place subtitles at the bottom (standard), top, or center of the frame.
Effects — add outline, shadow, or animations to make text pop against busy backgrounds.
Transitions — control how subtitles appear and disappear on screen. Fade in, slide up, pop in, or use word-by-word reveal for a dynamic karaoke-style effect that highlights each word as it's spoken. This works especially well for short-form content on TikTok and Reels.
Display mode — choose how subtitles appear on screen. Standard shows the full segment at once. Word-by-word reveals one word at a time, TikTok-style. Word highlight is karaoke mode — the full sentence is visible but each word lights up as it's spoken. Line progress draws a progress bar across the text in sync with the audio. Word-by-word and karaoke modes work best for short-form content where you want maximum visual engagement.
You can also save your custom style as a preset for future projects — useful if you produce content regularly and want a consistent look across videos.
The preview updates in real time as you make changes, so you can see exactly how everything looks before committing.
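The display modes above differ only in what is drawn at a given moment of playback. The sketch below illustrates that idea with hypothetical per-word timings. It is not the tool's rendering code: the mode names are made up, word-by-word is assumed to show only the current word, square brackets stand in for the highlight color, and line progress is left out.

```python
from bisect import bisect_right

# Hypothetical per-word timings for one segment: (start_seconds, word).
WORDS = [(0.0, "Subtitles"), (0.6, "keep"), (0.9, "viewers"), (1.4, "watching")]
SEGMENT_END = 2.0


def render(mode: str, t: float) -> str:
    """Return the text shown at time t (seconds) for a given display mode."""
    if not WORDS[0][0] <= t < SEGMENT_END:
        return ""  # outside the segment, nothing is displayed
    starts = [start for start, _ in WORDS]
    active = bisect_right(starts, t) - 1  # index of the word being spoken
    words = [word for _, word in WORDS]
    if mode == "standard":
        return " ".join(words)
    if mode == "word_by_word":
        return words[active]
    if mode == "word_highlight":
        # Brackets stand in for the highlight color.
        return " ".join(f"[{w}]" if i == active else w for i, w in enumerate(words))
    raise ValueError(f"unknown mode: {mode}")


print(render("standard", 1.0))        # Subtitles keep viewers watching
print(render("word_by_word", 1.0))    # viewers
print(render("word_highlight", 1.0))  # Subtitles keep [viewers] watching
```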
Step 5: Add Translation (Optional)
If you want to reach viewers who speak a different language, open the translation panel and select a target language. The tool supports over 100 languages.
The AI translates your entire transcript with one click. But the interesting part is bilingual mode — instead of replacing the original text, it displays both languages simultaneously. Your English-speaking audience reads the original while your Spanish-speaking viewers read the translation, all in the same video.
Each language gets its own independent styling. You might want the original in white at the bottom and the translation in yellow slightly above it. Or the original in a larger font with the translation smaller underneath. You control both separately.
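In a subtitle file, one simple way to carry both languages is to put two lines in each cue. The sketch below builds SRT text that way, with the translation on the first line and the original beneath it. This layout is an assumption for illustration; the guide does not specify how the tool arranges bilingual files, and plain SRT cannot carry the per-language styling described above.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def bilingual_srt(cues) -> str:
    """Build SRT text from (start, end, original, translation) tuples."""
    blocks = []
    for number, (start, end, original, translation) in enumerate(cues, start=1):
        blocks.append(
            f"{number}\n"
            f"{srt_time(start)} --> {srt_time(end)}\n"
            f"{translation}\n"
            f"{original}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


print(bilingual_srt([
    (0.0, 2.0, "Hello everyone", "Hola a todos"),
    (2.0, 4.5, "Welcome back", "Bienvenidos de nuevo"),
]))
```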
Step 6: Add Emojis (Optional)
For more casual content — especially TikTok and Reels — the AI emoji feature adds contextually relevant emojis to your subtitles automatically. The AI reads what's being said and picks emojis that match.
You can set where emojis appear: above the text, below it, to the left, to the right, or randomly positioned for a more dynamic feel.
This is entirely optional and probably not what you want for a corporate training video. But for social content, it adds visual energy.
Step 7: Export
You have two export options, covering three file formats:
SRT or VTT files: subtitle files that you upload separately to YouTube, Vimeo, or any platform that supports subtitle tracks. YouTube accepts SRT, and web video players typically use VTT. Downloading these files is always free, with no credits required.
HD Video (1080p): a new copy of your video rendered with the subtitles permanently burned into the frames. The output is a standard MP4 file that you can upload anywhere: TikTok, Instagram, YouTube, LinkedIn, wherever. There are no subtitle compatibility issues because the text is part of the video itself.
If you added translations, the bilingual subtitles are included in both the subtitle files and the rendered video.
Pick your format, hit export, and download the result when it's ready. For SRT/VTT it's instant. For video rendering, expect a few minutes depending on length.
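The two subtitle formats are close cousins. A VTT file starts with a WEBVTT header and uses a period before the milliseconds where SRT uses a comma. The tool exports both, so you normally won't need to convert by hand. The sketch below only illustrates the difference, covers just the basic case, and uses a function name of our own choosing.

```python
import re

# Matches the comma before the milliseconds in an SRT timestamp.
TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert basic SRT text to WebVTT: add the header, use '.' for milliseconds."""
    lines = srt_text.replace("\r\n", "\n").strip().split("\n")
    out = ["WEBVTT", ""]
    for line in lines:
        if "-->" in line:
            # Only timing lines are touched; commas in the subtitle text stay.
            line = TIMESTAMP.sub(r"\1.\2", line)
        out.append(line)
    return "\n".join(out) + "\n"


srt = "1\n00:00:00,000 --> 00:00:02,000\nHello, everyone\n"
print(srt_to_vtt(srt))
```

The cue numbers from the SRT file are kept as they are, since WebVTT allows an identifier line before each cue's timing line.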
Quick Tips
For YouTube: Export an SRT file and upload it as a subtitle track in YouTube Studio. This keeps your video clean while giving viewers the option to toggle subtitles on or off. YouTube also indexes subtitle text, which helps your video appear in search results.
For TikTok and Reels: Use the burned-in video export. These platforms don't support separate subtitle files, so the text needs to be part of the video. Use a larger font size — people watch on phones, and small text disappears. Position subtitles in the center or upper-center to avoid overlap with TikTok's UI elements at the bottom.
For podcasts and audio-only content: Upload your audio file the same way. You won't get a video export, but you'll get a clean SRT/VTT transcript that you can use for show notes, blog posts, or accessibility.
General: Always review the transcript before exporting. AI transcription is accurate but not perfect — proper nouns, slang, and heavily accented speech sometimes need manual correction. Two minutes of editing saves you from publishing a subtitle that says "capitol" when you meant "capital."
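If you want to reuse an exported SRT file as show notes or a blog draft, you need the spoken text without cue numbers and timestamps. Here is a minimal sketch of that cleanup, run on your own machine after export. The function name is ours, and it assumes a well-formed SRT file.

```python
def srt_to_text(srt_text: str) -> str:
    """Strip cue numbers and timestamps from SRT text, keeping only the words."""
    spoken = []
    for block in srt_text.replace("\r\n", "\n").strip().split("\n\n"):
        lines = block.split("\n")
        # Keep everything after the timing line; the line before it is the cue number.
        for i, line in enumerate(lines):
            if "-->" in line:
                spoken.extend(text.strip() for text in lines[i + 1:] if text.strip())
                break
    return " ".join(spoken)


srt = (
    "1\n00:00:00,000 --> 00:00:02,000\nWelcome to\nthe show.\n\n"
    "2\n00:00:02,000 --> 00:00:04,000\nLet's begin.\n"
)
print(srt_to_text(srt))  # Welcome to the show. Let's begin.
```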