TwelveLabs

AI platform for deep video understanding

5.0 · 1 review · 689 followers

AI that truly understands video. Uses multimodal models (Marengo/Pegasus) to search, analyze & generate text from video content at scale.
This is the 2nd launch from TwelveLabs.

TwelveLabs Marengo 3.0

The most powerful embedding model for video understanding
Marengo 3.0 is TwelveLabs' most significant model to date, delivering human-like video understanding at scale. A multimodal embedding model, Marengo fuses video, audio, and text for holistic video understanding to power precise video search and retrieval.
TwelveLabs Marengo 3.0 gallery (8 images)
Free Options

Emily Kurze

Hi, Emily here from @TwelveLabs!

Why we built Marengo 3.0: Modern multimodal models break down on the things that actually matter in production: long videos, fast-moving sports, mixed-modality queries, noisy real-world audio, and multilingual content. We built Marengo 3.0 to solve those exact pain points. Instead of optimizing for short clips or English-only benchmarks, we focused on understanding the world as it really is—messy, long-form, multilingual, and multimodal.

What’s new and unique: Marengo 3.0 introduces a more efficient unified embedding space that works across video, audio, text, images, and even composed queries (e.g., image + text together). That unlocks new capabilities like action-level sports retrieval, long descriptive queries, accurate speech and non-speech audio retrieval, and native multilingual search across 36 languages. And it does this while being 3–6× more storage-efficient than alternative models.
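The unified embedding space described above can be sketched in a few lines: a composed query (image + text combined) and video segments all live in one vector space, so retrieval reduces to cosine similarity. This is a minimal illustrative sketch, not TwelveLabs' actual API — the toy vectors stand in for real model embeddings, and the averaging combiner and `normalize` helper are assumptions for the example.

```python
# Minimal sketch of composed-query retrieval over a shared embedding space.
# The vectors below are toy stand-ins for real multimodal model embeddings.
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """L2-normalize so a dot product equals cosine similarity."""
    return v / np.linalg.norm(v)

# Toy "video segment" embeddings; a real system would get these from the model.
video_segments = {
    "clip_goal":   normalize(np.array([0.9, 0.1, 0.2])),
    "clip_crowd":  normalize(np.array([0.1, 0.8, 0.3])),
    "clip_replay": normalize(np.array([0.7, 0.3, 0.1])),
}

# A composed query: fuse an image embedding and a text embedding into one
# query vector in the same space (simple vector averaging here).
image_emb = normalize(np.array([0.8, 0.2, 0.2]))
text_emb  = normalize(np.array([0.9, 0.0, 0.3]))
query = normalize(image_emb + text_emb)

# Rank segments by cosine similarity to the composed query.
ranked = sorted(video_segments.items(),
                key=lambda kv: float(query @ kv[1]),
                reverse=True)
for name, emb in ranked:
    print(name, round(float(query @ emb), 3))
```

Because every modality maps into the same space, the same ranking loop serves text-only, image-only, or composed queries — only the query vector changes.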

What we’re most proud of: The biggest milestone: there’s no longer a trade-off between multimodality and performance. Marengo 3.0 hits state-of-the-art results across composed retrieval, sports, OCR, long-form understanding, audio, and multilingual tasks—while staying lightweight and production-friendly. Instead of chasing synthetic benchmarks, we designed a model that excels in real-world use.

Curious to hear what the Product Hunt community thinks! What would you build with access to multimodal video understanding that actually works at production scale?

Masum Parvej

@emilykurze The storage efficiency detail caught my eye

Nika

Only people with a movie historical background will understand the logo :) Love the idea behind it :)

fmerian
Hunter

do you refer to the TriStar Pictures logo? good ol' memories indeed!

Emily Kurze

@fmerian Nope! The first movie was of a horse running to see if all four hooves left the ground at the same time.

Nika

@fmerian I mean this logo:

As @emilykurze said, video as we know it today traces back to capturing the motion of a horse (at a certain point in its gait, a running horse has all four hooves off the ground) – that's how motion was first captured on camera/photos. https://en.wikipedia.org/wiki/The_Horse_in_Motion

Emily Kurze

@busmark_w_nika I love that you got the reference!

Milo McCloud

Congratulations guys!! Could you use TwelveLabs to review a final cut of a video edit before publishing? In the context of content creation and YouTube, it could be really interesting for final-cut reviews and catching missed keyframes.

Emily Kurze

@milo_mccloud That's a great use case. I believe we've had customers do this, but let me check with the team and get back to you with more details.

Milo McCloud

@emilykurze Brilliant stuff, it's definitely one of the biggest time investments for creators atm

fmerian
Hunter

I had a blast collaborating with @emilykurze and the @TwelveLabs team on this launch.

Read the behind-the-scenes here in the Product Forum /p/twelvelabs and go to playground.twelvelabs.io to start playing around with the product. Enjoy!

Amelia Brooks

I like the direction you’re taking with this. What kind of feedback from early users influenced this version?

Emily Kurze

@amelia_brooks3 We work really closely with those building with our models so I'd say most features are influenced by feedback from our users.

Shubham

it's insanely fast! do you think i can use this to detect if some animation is broken?

(it's hard to define what broken even means)

check out unfold to see what i mean (we just launched yesterday on PH), and all the best guys - you have #1 vibes!

Emily Kurze

@unlikefraction I always recommend testing it (you get 10 free hours in our API playground) to see how it performs for your use case. I'm not even sure I know what broken animation looks like!

Justin Jincaid

It looks amazing! Does it handle fast-moving sports like action retrieval? I am starting a new product and kinda need something like that.

Emily Kurze

@justin2025 Yes, Marengo 3.0 has enhanced sports understanding for American football, ice hockey, baseball, basketball and soccer/football. Sports lingo can be so tricky, so in this version we really focused on strengthening the model's sports action recognition.
