[G] GenAI subtitles for Korean BW content

Kraekkling
Profile Blog Joined June 2007
578 Posts
Last Edited: 2025-05-07 04:12:04
May 07 2025 01:42 GMT
#1
ASL RO8, Soulkey vs Rush, from Flash/Shuttle stream

ASL RO8 spoilers below!!

g1:
https://www.captionfy.com/video/youtube/Ixu6V3pCQf8?c=en

g2:
https://www.captionfy.com/video/youtube/p7l6c5qzoDw?c=en

g3:
https://www.captionfy.com/video/youtube/p_rvWNRKhgw?c=en

g4:
https://www.captionfy.com/video/youtube/a478rarEBTY?c=en

g5:
https://www.captionfy.com/video/youtube/YZWpi_IUi94?c=en-Ntb

g6:
https://www.captionfy.com/video/youtube/YZWpi_IUi94?c=en-Ntb

g7:
https://www.captionfy.com/video/youtube/YZWpi_IUi94?c=en-Ntb


The latest Gemini model by Google can handle video input and works surprisingly well for generating English subtitles for Korean Brood War videos. It still makes mistakes here and there and sometimes hallucinates, but it's a big step up from the gibberish you get from YouTube's auto-subtitles. If I had to guesstimate, I’d say it gets >80% right, which feels pretty impressive.

Workflow below.

I'm using Gemini 2.5 Pro Preview (05-06) at https://aistudio.google.com/ with default settings. The model is currently free to test. It supports up to 1 million tokens of context; one minute of video is roughly 20k tokens, so the videos above ended up at around 160k–170k tokens each. This means long videos like daily Proleague or KCM broadcasts would not work, as they exceed the context limit. Maybe chopping them up somehow could work?
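A rough sketch of the chopping idea, assuming the ~20k tokens per minute figure and the 1 million token limit quoted above. The function name, the 20% headroom, and the equal-length split are illustrative assumptions; none of this has been tested against the model.

```python
import math

TOKENS_PER_MINUTE = 20_000   # rough figure from the post above
CONTEXT_LIMIT = 1_000_000    # context window quoted above


def plan_chunks(video_minutes, headroom=0.2):
    """Split a video into equal (start, end) minute ranges that fit the context.

    headroom reserves a fraction of the context for the prompt, the summary
    step and the generated subtitles (an assumed value, not a measured one).
    """
    budget = CONTEXT_LIMIT * (1 - headroom)
    max_minutes = budget / TOKENS_PER_MINUTE
    n_chunks = max(1, math.ceil(video_minutes / max_minutes))
    length = video_minutes / n_chunks
    return [(round(i * length, 2), round((i + 1) * length, 2)) for i in range(n_chunks)]


print(plan_chunks(8))    # a single game fits in one chunk
print(plan_chunks(180))  # a 3-hour broadcast is split into several chunks
```

Each chunk would still need its own timestamps shifted back to the full video's timeline before the subtitle files are merged.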

Basically, I just pass it the YouTube link and ask it to generate English subtitles.

I've found it works better if I do this in two steps. First, I give it the link and just ask, "what is happening here?".

It will take a while and output a summary.

Interestingly, this summary often has hallucinations and often doesn’t accurately describe the video. Still, I noticed that when I skip this step and instead ask for subtitles right away, the results are worse. It seems like preloading the context window with Brood War jargon actually helps when it comes time to generate the subtitles. The summary itself being wrong doesn't seem to have any effect on the quality of the subtitles.


After that, I ask it to create the subtitles. The prompt I use looks like this:


create english subtitles (.srt)

Quick sanity checklist for SRT files:

Sequential numbers starting at 1.

Timestamp line exactly HH:MM:SS,mmm --> HH:MM:SS,mmm.

The video is less than 1 hour long so all timestamps must start with 00 for HH.

One subtitle text line.

A blank line after every cue.
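The checklist in this prompt can also be checked mechanically before loading the file anywhere. A minimal sketch, assuming the rules exactly as listed (including the under-one-hour rule); `check_srt` is a made-up helper, not part of any subtitle tool.

```python
import re

TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def check_srt(text):
    """Return a list of problems found; an empty list means the checklist passes."""
    problems = []
    # Cues are separated by blank lines, so a missing blank line merges two cues
    # and shows up below as a cue with too many lines.
    cues = [c for c in re.split(r"\n\s*\n", text.strip()) if c.strip()]
    for expected, cue in enumerate(cues, start=1):
        lines = [ln.strip() for ln in cue.splitlines()]
        if lines[0] != str(expected):
            problems.append(f"cue {expected}: expected number {expected}, got {lines[0]!r}")
        if len(lines) < 2 or not TIMESTAMP.match(lines[1]):
            problems.append(f"cue {expected}: bad timestamp line")
        elif not (lines[1].startswith("00:") and " --> 00:" in lines[1]):
            problems.append(f"cue {expected}: hour field is not 00")
        if len(lines) != 3:
            problems.append(f"cue {expected}: expected exactly one text line")
    return problems


sample = (
    "1\n00:00:01,000 --> 00:00:03,500\nRush opens with a fast expansion.\n\n"
    "2\n00:00:03,500 --> 00:00:06.000\nSoulkey scouts it.\n"
)
print(check_srt(sample))  # the second cue uses '.' instead of ',' before the milliseconds
```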


This should give you subtitles you can copy, save as an .srt file, and use with the video.


The resulting .srt file sometimes has errors that result in missing text, usually because the generated formatting is wrong. Most of the time I found it best to just re-run until it worked. Alternatively, you could adjust the prompt or fix the .srt yourself. I found the browser add-on Substital useful because it lets you use a local .srt file with YouTube videos, and it reports errors caused by wrong .srt formatting faster than captionfy.
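For the "fix the .srt yourself" route, some of the common formatting slips can be repaired automatically. This is only a sketch under assumptions: it guesses that the typical problems are wrong cue numbers, a `.` instead of `,` before the milliseconds, short hour fields, and text spread over several lines. `repair_srt` is a hypothetical helper, and blocks it cannot make sense of are dropped rather than guessed at.

```python
import re

# Accepts sloppy timestamps such as 0:00:01.5 as well as 00:00:01,500
LOOSE_TS = re.compile(r"(\d{1,2}):(\d{2}):(\d{2})[.,](\d{1,3})")


def _normalize(match):
    h, m, s, ms = match.groups()
    return f"{int(h):02d}:{m}:{s},{ms.ljust(3, '0')}"


def repair_srt(text):
    """Renumber cues from 1, normalize timestamps, join text into one line."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        ts_index = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if ts_index is None:
            continue  # no timing line, nothing to salvage
        stamps = [_normalize(m) for m in LOOSE_TS.finditer(lines[ts_index])]
        body = " ".join(lines[ts_index + 1:])
        if len(stamps) != 2 or not body:
            continue
        cues.append(f"{len(cues) + 1}\n{stamps[0]} --> {stamps[1]}\n{body}")
    return "\n\n".join(cues) + "\n" if cues else ""


broken = "3\n0:00:01.5 --> 0:00:03.000\nline one\nline two\n\nstray text\n\n9\n00:00:06,000 --> 00:00:07,250\nok\n"
print(repair_srt(broken))
```

Dropping unparseable blocks keeps the file loadable, but it also means text can go missing silently, so re-running the model may still be the better option when many cues are lost.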

I’m still figuring out the best way to share these or upload them for YouTube. I found captionfy, which seems pretty easy to use. You sign up and can create a shareable overlay for any YouTube video. The good thing is that traffic still goes to the original creator, and anyone can upload subtitles that are then available for everyone.

I guess the end goal would be to automate the full pipeline and translate a lot of stuff? It seems captionfy does not have an API, so maybe something else would be better suited?

Also, the Gemini model likely won't be free forever, but at current pricing it should cost about 6 cents per minute of content (for videos of similar length), which seems cheap enough. The price scales with (video) input length, so longer videos will be more expensive.
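As a back-of-the-envelope check, the per-minute figure can be turned into a quick estimate. The 6 cents per minute is the rough guess from this post, not an official price, and treating it as a flat rate is an assumption that only holds for videos of similar length to the ones above.

```python
CENTS_PER_MINUTE = 6  # rough estimate from the post, not an official price


def estimate_cost_usd(video_minutes, cents_per_minute=CENTS_PER_MINUTE):
    """Flat-rate cost estimate in US dollars for a given amount of video."""
    return round(video_minutes * cents_per_minute / 100, 2)


print(estimate_cost_usd(8))      # one game of about 8 minutes
print(estimate_cost_usd(7 * 8))  # a full 7-game series at about 8 minutes per game
```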
(*^^)(^*)
Last.Midnight
Profile Blog Joined July 2006
Australia906 Posts
May 07 2025 01:53 GMT
#2
I was curious about doing this. Surely there are models/n8n setups that can automatically replace/overdub the voice too?

Thanks for sharing man this is great.
Last.Midnight
Profile Blog Joined July 2006
Australia906 Posts
May 07 2025 02:49 GMT
#3
Recall (https://www.getrecall.ai/) provides written translations and app.vozo.ai apparently does voiceover dubs, but I'm not sure how accurate they are and it's expensive.
Simplistik
Profile Blog Joined November 2007
2094 Posts
May 07 2025 03:34 GMT
#4
I feel like there is a web service niche for automating this workflow if anyone has the patience to make it work.
Dear BW Gods, it IS now autumn, so...
Last.Midnight
Profile Blog Joined July 2006
Australia906 Posts
Last Edited: 2025-05-07 04:25:05
May 07 2025 04:24 GMT
#5
yt-dlp for download into ElevenLabs overdub most likely. Only problem is the EL credits.

Possibly with a specialised Eng>Kor model in between.
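The download half of that could look roughly like the sketch below. It only builds the yt-dlp command line (audio extraction to mp3) and does not run it; the ElevenLabs upload is left out entirely because its API is not described in this thread. The helper name, the output folder, and the YouTube URL built from the g1 video ID are assumptions for illustration.

```python
import shlex


def ytdlp_audio_command(url, out_dir="downloads"):
    """Build (but do not run) a yt-dlp command that saves the audio track as mp3."""
    return [
        "yt-dlp",
        "--extract-audio",
        "--audio-format", "mp3",
        "--output", f"{out_dir}/%(id)s.%(ext)s",
        url,
    ]


cmd = ytdlp_audio_command("https://www.youtube.com/watch?v=Ixu6V3pCQf8")
print(shlex.join(cmd))
```

To actually download, the list can be passed to `subprocess.run(cmd, check=True)` on a machine with yt-dlp and ffmpeg installed.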
rtyrt7
Profile Joined August 2018
48 Posts
May 07 2025 07:34 GMT
#6
Maybe the free models over here would also be helpful, as API:
https://openrouter.ai/models?max_price=0

But it has these limits for models whose ID ends in ":free":
- Per-Minute Limit: 20 requests per minute
- Daily Limit: 50 requests per day per account
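To see what those limits mean for a batch job, here is a small planning sketch. It takes the two numbers exactly as quoted above and assumes one account; `plan_requests` is a made-up helper and does not talk to OpenRouter.

```python
import math

PER_MINUTE = 20  # limits as quoted in the post above
PER_DAY = 50


def plan_requests(n_requests):
    """Return (days_needed, seconds_between_requests) under the quoted limits."""
    days = math.ceil(n_requests / PER_DAY)
    spacing = 60 / PER_MINUTE  # evenly spaced requests never exceed the per-minute cap
    return days, spacing


print(plan_requests(14))   # e.g. 7 games with a summary step and a subtitle step each
print(plan_requests(120))
```

With a two-step workflow (summary first, then subtitles), the daily cap works out to about 25 videos per account per day.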
prosatan
Profile Joined September 2009
Romania8500 Posts
May 07 2025 07:57 GMT
#7
Thank you Kraekkling !
Lee JaeDong Fighting! The only church that illuminates is the one that burns.
Kraekkling
Profile Blog Joined June 2007
578 Posts
May 07 2025 11:56 GMT
#8
On May 07 2025 10:53 Last.Midnight wrote:
Surely there are models/n8n setups that can automatically replace/overdub the voice too?


This is likely not feasible yet. What you're talking about is basically a different piece of technology.

You're right, though, that there are models that can translate audio and output speech in a voice similar to the speaker's. However, those models are several orders of magnitude smaller than what we have here and work purely audio-to-audio. They can't handle long-term context. Also, there just isn't much training data for these models to properly handle BW jargon.

The advantage of the Gemini model is that we're using information from the video itself (not only the audio) and also tapping into its "general intelligence", which comes from the very large model size. Additionally, we have inference-time scaling here, which means the model internally outputs an ensemble of chain-of-thought threads in which it discusses the best way to translate a given passage of video given the overall context, before giving an answer to the user.

However, I think we might not be too far from having models that can do what you suggested; give it 1-2 years at most and we'll be there. The next iteration of OpenAI's omni series might already do it.

(*^^)(^*)
yubo56
Profile Joined May 2014
690 Posts
May 07 2025 20:31 GMT
#9
On May 07 2025 20:56 Kraekkling wrote:
On May 07 2025 10:53 Last.Midnight wrote:
Surely there are models/n8n setups that can automatically replace/overdub the voice too?


This is likely not feasible yet. What you're talking about is basically a different piece of technology.

You're right, though, that there are models that can translate audio and output speech in a voice similar to the speaker's. However, those models are several orders of magnitude smaller than what we have here and work purely audio-to-audio. They can't handle long-term context. Also, there just isn't much training data for these models to properly handle BW jargon.

The advantage of the Gemini model is that we're using information from the video itself (not only the audio) and also tapping into its "general intelligence", which comes from the very large model size. Additionally, we have inference-time scaling here, which means the model internally outputs an ensemble of chain-of-thought threads in which it discusses the best way to translate a given passage of video given the overall context, before giving an answer to the user.

However, I think we might not be too far from having models that can do what you suggested; give it 1-2 years at most and we'll be there. The next iteration of OpenAI's omni series might already do it.


Wait, but you're describing the difficulty of direct audio-to-audio translation. If you can already do audio -> translated text, though, can't you just slap a text-to-speech step on top and have a (basic) audio-to-audio translation?

I guess you'd have trouble matching the duration of the sentences, but with some simple squeezing and stretching of audio bytes it's still surely quite feasible compared to direct audio-to-audio translation...
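The squeeze-and-stretch part boils down to one ratio per subtitle cue: how long the synthesized clip is versus how long its slot is. A minimal sketch, assuming SRT-style timestamps and a clip duration already measured in seconds; the function names and the 1.5x clamp are illustrative choices, and the actual time-stretching of the audio is not shown.

```python
def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    hms, ms = stamp.split(",")
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s + int(ms) / 1000


def speed_factor(cue_start, cue_end, tts_seconds, max_factor=1.5):
    """Playback-rate factor that fits a TTS clip into its subtitle slot.

    Values above 1 speed the clip up, values below 1 slow it down. The result
    is clamped to [1 / max_factor, max_factor] so speech stays intelligible.
    """
    slot = to_seconds(cue_end) - to_seconds(cue_start)
    if slot <= 0:
        raise ValueError("cue has no duration")
    factor = tts_seconds / slot
    return min(max(factor, 1 / max_factor), max_factor)


# A 3-second clip that has to fit a 2-second slot needs to play 1.5x faster.
print(speed_factor("00:00:01,000", "00:00:03,000", 3.0))
```

When the clamp kicks in, the clip will overrun or underfill its slot, which is where the rhythm problems with this approach would start to show.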
Jung Yoon Jong fighting, even after retirement! Feel better soon.
prion_
Profile Joined September 2022
79 Posts
Last Edited: 2025-05-07 22:10:15
May 07 2025 22:08 GMT
#10
The problem is that it would sound like TikTok caption voice. I mean, not exactly that, but you wouldn't be able to keep the rhythm and modulation of their voices by going audio->text->audio, even if you adjusted for time.
IntoTheWow
Profile Blog Joined May 2004
is awesome32277 Posts
May 08 2025 02:26 GMT
#11
This is really cool!

Do you think that adding some keywords in the prompt could help the model? Like units, BW jargon, etc? Or are errors due to other factors?
Moderator<:3-/-<
Last.Midnight
Profile Blog Joined July 2006
Australia906 Posts
May 08 2025 03:36 GMT
#12
I tried ElevenLabs' dubbing feature and it works pretty great. Of course I can't speak to the accuracy of the translation, but it's certainly more accurate than "translate to English" on Chrome. The only funny thing is that it also dubs the unit sounds, so whenever the player isn't speaking he'll repeat SCV commands etc. haha
Lorch
Profile Joined June 2011
Germany3689 Posts
May 08 2025 13:00 GMT
#13
This is completely useless if you don't speak Korean.
You can never know which parts of the translation are accurate and which aren't. Thinking that it sounds reasonable or makes sense is not a great heuristic, especially with how AIs tend to hallucinate.

Would probably need a dedicated bw ai model trained under the supervision of someone who speaks korean + english and is knowledgeable in starcraft to create something worth using.
Kraekkling
Profile Blog Joined June 2007
578 Posts
May 08 2025 14:32 GMT
#14
On May 08 2025 11:26 IntoTheWow wrote:
This is really cool!

Do you think that adding some keywords in the prompt could help the model? Like units, BW jargon, etc? Or are errors due to other factors?


We're pre-filling the prompt with BW jargon by asking for a video summary first. As to why there are errors - I guess the easiest answer is that the technology is not 100% there yet. Machine translation generally got useful only in the last decade or so... Additionally, BW is a niche domain - one needs a sufficient world model to make sense of the meaning behind words. Koreans often use abbreviations, for example they'd say "zildra" for a zealot/dragoon army; or "sam-hat" (삼햇) for a 3-hatchery opening, etc. I've also tried older models but this one by far is the best one to make sense of stuff like this.
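If someone wants to try the keyword idea directly, a glossary could be appended to the subtitle prompt, as sketched below. Whether this actually improves the output is untested; the two entries are just the examples mentioned above, and `subtitle_prompt` is a hypothetical helper that only builds the text.

```python
GLOSSARY = {
    "zildra": "zealot/dragoon army",
    "sam-hat (삼햇)": "3-hatchery opening",
}


def subtitle_prompt(glossary=GLOSSARY):
    """Build the subtitle request with a short list of Brood War terms attached."""
    lines = ["create english subtitles (.srt)", "", "Brood War terms that may come up:"]
    lines += [f"- {term}: {meaning}" for term, meaning in glossary.items()]
    return "\n".join(lines)


print(subtitle_prompt())
```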

To me, the fact that any of this works at all is pretty crazy.

On May 08 2025 22:00 Lorch wrote:
Would probably need a dedicated bw ai model trained under the supervision of someone who speaks korean + english and is knowledgeable in starcraft to create something worth using.


Unfortunately this won't happen, so for now it's either YouTube auto-subs or this.
(Also, this is not how models are trained.)


This is completely useless if you don't speak Korean.
You can never know which parts of the translation are accurate and which aren't. Thinking that it sounds reasonable or makes sense is not a great heuristic, especially with how AIs tend to hallucinate.


Maybe someone who speaks Korean could comment? I'm only comparing this to yt auto-subs, and it felt like even with some obvious hallucinations the overall commentary was pretty easy to grasp?
(*^^)(^*)
Last.Midnight
Profile Blog Joined July 2006
Australia906 Posts
Last Edited: 2025-05-08 21:25:04
May 08 2025 21:24 GMT
#15
On May 08 2025 22:00 Lorch wrote:
This is completely useless if you don't speak Korean.
You can never know which parts of the translation are accurate and which aren't. Thinking that it sounds reasonable or makes sense is not a great heuristic, especially with how AIs tend to hallucinate.

Would probably need a dedicated bw ai model trained under the supervision of someone who speaks korean + english and is knowledgeable in starcraft to create something worth using.


Not useless, but not optimal either. Some phrases are lost, but things like "focus fire the tank here" when he's also clicking a tank are pretty clear. Hallucinations don't happen as much when models draw from source material; they tend to happen when the model relies only on its trained parameters, learned from a massive dataset, and misinterprets a request.

That's why for enterprise integration RAG is all the rage, since the "database" the models link to is the company's data.