AI that truly understands video. Uses multimodal models (Marengo/Pegasus) to search, analyze & generate text from video content at scale.
This is the 2nd launch from TwelveLabs.
TwelveLabs Marengo 3.0
Marengo 3.0 is TwelveLabs' most significant model to date, delivering human-like video understanding at scale. A multimodal embedding model, Marengo fuses video, audio, and text for holistic video understanding to power precise video search and retrieval.
Free Options
TwelveLabs
Hi, Emily here from @TwelveLabs!
Why we built Marengo 3.0: Modern multimodal models break down on the things that actually matter in production: long videos, fast-moving sports, mixed-modality queries, noisy real-world audio, and multilingual content. We built Marengo 3.0 to solve those exact pain points. Instead of optimizing for short clips or English-only benchmarks, we focused on understanding the world as it really is—messy, long-form, multilingual, and multimodal.
What’s new and unique: Marengo 3.0 introduces a more efficient unified embedding space that works across video, audio, text, images, and even composed queries (e.g., image + text together). That unlocks new capabilities like action-level sports retrieval, long descriptive queries, accurate speech and non-speech audio retrieval, and native multilingual search across 36 languages. And it does this while being 3–6× more storage-efficient than alternative models.
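To make the "unified embedding space" and "composed queries" idea concrete, here is a minimal sketch of how retrieval in a shared vector space can work. This is not the TwelveLabs API or Marengo's actual method: the three-dimensional vectors, the clip names, and the `compose` helper (which simply averages an image vector and a text vector) are all assumptions for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def compose(image_vec, text_vec):
    """Hypothetical composed query: average the image and text embeddings.
    Assumption for illustration; a real model may fuse modalities differently."""
    return [(i + t) / 2 for i, t in zip(image_vec, text_vec)]

def search(query_vec, index, top_k=2):
    """Rank every indexed clip by similarity to the query, best first."""
    scored = [(clip_id, cosine(query_vec, vec)) for clip_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy index: made-up embeddings standing in for real video clip embeddings.
index = {
    "clip_goal": [0.9, 0.1, 0.0],
    "clip_interview": [0.1, 0.9, 0.1],
    "clip_crowd": [0.0, 0.2, 0.9],
}

# Made-up embeddings for an image and a text prompt used together as one query.
image_vec = [1.0, 0.0, 0.0]
text_vec = [0.6, 0.0, 0.4]

results = search(compose(image_vec, text_vec), index)
for clip_id, score in results:
    print(f"{clip_id}: {score:.3f}")
```

Because every modality lands in the same space, the same `search` function serves text-only, image-only, or composed queries; only the way the query vector is produced changes.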
What we’re most proud of: The biggest milestone: there’s no longer a trade-off between multimodality and performance. Marengo 3.0 hits state-of-the-art results across composed retrieval, sports, OCR, long-form understanding, audio, and multilingual tasks—while staying lightweight and production-friendly. Instead of chasing synthetic benchmarks, we designed a model that excels in real-world use.
Curious to hear what the Product Hunt community thinks! What would you build with access to multimodal video understanding that actually works at production scale?
@emilykurze The storage efficiency detail caught my eye
minimalist phone: creating folders
Only people with a movie historical background will understand the logo :) Love the idea behind it :)
do you refer to the TriStar Pictures logo? good ol' memories indeed!
TwelveLabs
@fmerian Nope! The first movie was of a horse running to see if all four hooves left the ground at the same time.
@emilykurze TIL!
minimalist phone: creating folders
@fmerian I mean this logo:
As @emilykurze said, the video as we know it today comes from capturing the motion of a horse (at a certain point, a horse, when running, has all 4 legs above the ground) – that's how we identified movement on the camera/photos. https://en.wikipedia.org/wiki/The_Horse_in_Motion
TwelveLabs
@busmark_w_nika I love that you got the reference!
Congratulations guys!! Could you use TwelveLabs to review the final cut of a video edit before publishing, for example for content creation on YouTube? It could be really useful for final-cut reviews and catching missed keyframes.
TwelveLabs
@milo_mccloud That's a great use case. I believe we've had customers do this, but let me check with the team and get back to you with more details.
@emilykurze Brilliant stuff, it's definitely some of the biggest time investment for creators atm
I had a blast collaborating with @emilykurze and the @TwelveLabs team on this launch.
Read the behind-the-scenes here in the Product Forum /p/twelvelabs and go to playground.twelvelabs.io to start playing around with the product. Enjoy!
I like the direction you’re taking with this. What kind of feedback from early users influenced this version?
TwelveLabs
@amelia_brooks3 We work really closely with those building with our models so I'd say most features are influenced by feedback from our users.
Unfold
it's insanely fast! do you think i can use this to detect if some animation is broken?
(it's hard to define what broken even means)
check out unfold to see what i mean (we just launched yesterday on PH), and all the best guys - you have #1 vibes!
TwelveLabs
@unlikefraction I always recommend testing it (you get 10 free hours in our API playground) to see how it performs for your use case. I'm not even sure I know what broken animation looks like!
Mom Clock
It looks amazing! Does it handle action retrieval in fast-moving sports? I am starting a new product and kinda need something like that.
TwelveLabs
@justin2025 Yes, Marengo 3.0 has enhanced sports understanding for American football, ice hockey, baseball, basketball, and soccer/football. Sports lingo can be tricky, so in this version we really focused on strengthening sports action recognition.