I cut a 2-hour NVDA keynote into 121 clips with one prompt - no hands

I clipped every moment Jensen Huang said "AI" in his CES keynote. Turns out he said it 121 times. One prompt handled the entire workflow.

How it worked:

- Dive chained two MCP servers together (yt-dlp + ffmpeg)

- Download video → parse subtitle timestamps → cut 121 clips → merge

- All processed locally
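
For anyone curious what the "parse subtitle timestamps" step might look like, here is a rough Python sketch. It is not what Dive actually generated. It assumes the commonly seen layout of YouTube's json3 captions (an `events` list with `tStartMs`, `dDurationMs`, and `segs` entries carrying `utf8` and an optional `tOffsetMs`). The function name `find_word_spans` and the `max_word_ms` cap are my own illustrative choices.

```python
import re


def find_word_spans(captions, word="AI", pad_ms=75, max_word_ms=600):
    """Return padded (start_s, end_s) pairs for each caption token equal to `word`.

    Assumes the json3 layout described above. The format carries no per-word
    end time, so the end is estimated from the next token's start.
    """
    spans = []
    for event in captions.get("events", []):
        segs = event.get("segs") or []
        ev_start = event.get("tStartMs", 0)
        ev_end = ev_start + event.get("dDurationMs", 0)
        for i, seg in enumerate(segs):
            # Strip whitespace and punctuation so " AI," still matches "AI".
            token = re.sub(r"[^\w]", "", seg.get("utf8", ""))
            if token != word:
                continue
            start = ev_start + seg.get("tOffsetMs", 0)
            # Estimated end: start of the next token, else the end of the event.
            end = ev_end
            if i + 1 < len(segs):
                nxt = ev_start + segs[i + 1].get("tOffsetMs", 0)
                if nxt > start:
                    end = nxt
            # Assumption: cap the word length so long pauses are not included.
            end = min(end, start + max_word_ms)
            spans.append((max(0, start - pad_ms) / 1000, (end + pad_ms) / 1000))
    return spans
```

The default `pad_ms=75` sits inside the 50-100ms padding range the prompt asks for. Load the subtitle file with `json.load` and pass the resulting dict in.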

One prompt:

Task: Create a compilation video of every exact moment Jensen Huang says "AI".
Video source:
Instructions:
Download video in 720p + subtitles in JSON3 format (word-level timestamps)
Parse JSON3 to find every "AI" instance with precise start/end times
Use ffmpeg to cut clips (~50-100ms padding for natural sound)
Concatenate all clips chronologically
Output: Jensen_CES_AI.mp4
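
The cut and concatenate steps in the prompt could be sketched like this in Python. This is my guess at one reasonable approach, not the commands the agent actually ran. It only builds the ffmpeg argument lists (the names `build_plan`, `keynote.mp4`, and `clip_0000.mp4` are made up for illustration). Each clip is re-encoded because stream-copying sub-second cuts tends to snap to keyframes.

```python
def build_plan(src, spans, out_path="Jensen_CES_AI.mp4"):
    """Build ffmpeg commands for cutting `spans` out of `src` and merging them.

    `spans` is a list of (start_s, end_s) pairs in seconds.
    Returns (cut_commands, concat_list_text, concat_command).
    """
    cut_commands = []
    list_lines = []
    # Sort by start time so the compilation stays chronological.
    for i, (start_s, end_s) in enumerate(sorted(spans)):
        clip = f"clip_{i:04d}.mp4"
        cut_commands.append([
            "ffmpeg", "-y",
            "-ss", f"{start_s:.3f}", "-to", f"{end_s:.3f}",
            "-i", src,
            # Re-encode so cuts land on the requested times, not on keyframes.
            "-c:v", "libx264", "-c:a", "aac",
            clip,
        ])
        list_lines.append(f"file '{clip}'")
    concat_list_text = "\n".join(list_lines) + "\n"
    # The concat demuxer reads the clip list from a text file (clips.txt here).
    concat_command = [
        "ffmpeg", "-y", "-f", "concat", "-safe", "0",
        "-i", "clips.txt", "-c", "copy", out_path,
    ]
    return cut_commands, concat_list_text, concat_command
```

To actually run it, write `concat_list_text` to `clips.txt`, then pass each command to `subprocess.run(cmd, check=True)`, cuts first and the concat command last.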