Launched this week

Clipto
Fully local, natural language search over terabytes of media
799 followers
Like Google Photos, but fully local. Turn the terabytes of video, audio, meetings, and files you work with into searchable memories, without uploading anything to the cloud. Clipto automatically tags people, dialogue, and scenes, so you can instantly find any moment buried in your media just by describing what you're looking for. It's fast too: on a MacBook Pro M5, Clipto indexed 2TB of videos in just 24 hours.
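As a rough back-of-the-envelope check on that figure, the sustained indexing rate can be worked out as below. This assumes decimal terabytes (1 TB = 10^12 bytes) and a steady rate across the full 24 hours, neither of which the listing states. The function name is purely illustrative.

```python
# Back-of-the-envelope throughput for "2TB of videos in 24 hours".
# Assumptions (not stated in the listing): 1 TB = 10**12 bytes, constant rate.

def indexing_throughput_mb_per_s(total_bytes: float, hours: float) -> float:
    """Average throughput in megabytes per second (1 MB = 10**6 bytes)."""
    seconds = hours * 3600
    return total_bytes / seconds / 1e6

TB = 1e12
rate = indexing_throughput_mb_per_s(2 * TB, 24)
print(f"{rate:.1f} MB/s")  # about 23.1 MB/s under these assumptions
```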

Ada.im
For long-form team collaboration, is there a way to share the index file with another editor, or does each person need to re-index the same footage locally?
Clipto
@s_cen Great question!
Today, each Clipto library and index lives locally on the user’s machine. If multiple editors are working independently, each machine maintains its own local index.
That said, collaborative workflows are something we’re actively thinking about. As media libraries grow and teams become more distributed, sharing knowledge about a media collection becomes just as important as sharing the files themselves.
Team collaboration, shared knowledge, and more flexible ways to work across multiple users and machines are all areas we’re exploring for the roadmap.
Out of curiosity, what’s your setup today? A shared NAS, cloud storage, or a traditional post-production workflow with multiple editors?
How does the search handle lighting conditions? If I search for 'forest at night' vs 'forest during the day,' is the vision model sensitive enough to distinguish the cinematic mood?
Clipto
@carooolxxyy Great question.
Yes. Our visual models don’t just recognize objects and scenes; they also capture contextual signals such as lighting conditions, time of day, atmosphere, and other visual characteristics.
So in practice, searches like:
• “forest at night”
• “forest during the day”
• “sunset over the ocean”
• “dark and moody street scene”
can produce very different results, even when the underlying scene category is similar.
Of course, cinematic mood is inherently subjective, so there are limits to what any model can perfectly understand. But distinguishing things like day vs. night, bright vs. dark environments, or dramatically different visual atmospheres is something the system is designed to handle.
We’d actually love to hear the kinds of searches you’d want to run. “Cinematic mood” is an area where we’re continuing to push the models forward.
I am not a creator, but I do have lots of personal photos stored in different locations on my device. Will Clipto be able to organise those for me? And can it build a memory chart out of them? Rather than searching, I like what Google shows me from time to time, like memories.
But sometimes searching is also required.
Clipto
@bravo_5951 That’s a great question, and honestly it gets close to where we think this category is heading.
Today, Clipto can absolutely index and organize personal photos alongside videos and audio. It can identify people, scenes, objects, and other visual concepts, and you can even assign custom names to detected faces, making it much easier to search for family memories later.
Everything stays local and private on your own machine.
Right now, our primary focus is helping users find what they’re looking for instantly. But we also think there’s a future beyond search, where your media library becomes a personal memory system rather than just a collection of files.
Features like surfacing meaningful moments, relationships between people, places, and events, and helping users rediscover forgotten memories are all directions we’re actively thinking about.
I’m curious: what would make you use Clipto instead of Google Photos? Is it privacy, local ownership, better search, or something else entirely?
@henry_kang Privacy is the first thing, and then organisation of photos.
Clipto
@bravo_5951 That makes a lot of sense.
Privacy was one of the core reasons we built Clipto as a local-first product. And we completely agree that organization matters just as much as search when your media library starts to grow.
Thanks for sharing!
Lessie AI
Since it’s 100% local, does the indexing process completely lock up the Mac, or can I still smoothly edit 4K video in the foreground while it indexes in the background?
Clipto
@alexia_li Great question.
One of the biggest challenges with local AI is making sure the indexing work doesn’t get in the way of the work you’re actually trying to do.
We’ve spent a lot of time building orchestration, scheduling, and resource management systems so indexing can run efficiently in the background while minimizing its impact on foreground tasks.
So yes, the goal is that you can continue editing, reviewing footage, or working normally while Clipto processes your library in the background.
Of course, we’re still actively optimizing performance. Large media libraries can be demanding, but a huge part of our engineering effort has gone into balancing indexing speed with a smooth user experience.
Lessie AI
Can I drag and drop clips directly from the Clipto search window straight into my Premiere Pro or DaVinci Resolve timeline, or do I need to reveal in Finder first?
Clipto
@libin_yao Yes. In fact, we’ve already built a Premiere Pro plugin specifically for this workflow.
You can search your media directly inside Premiere using Clipto, find the exact moment you’re looking for, and add the selected clip to your timeline without jumping back and forth between Finder and your editor.
For many editors, the goal isn’t just finding the clip; it’s finding it without breaking creative flow. That’s one of the main reasons we built the integration in the first place.
If you had to choose, which one do you use more heavily: Premiere or DaVinci?
"Automatically tags people" — is that face recognition, voice matching, or something else? And when it misidentifies someone, is there a way to correct the label without re-indexing the entire library?
Clipto
@sounak_bhattacharya Yes! We actually use both visual face recognition and voice identification to build a more complete understanding of who appears across your media.
And yes, corrections are fully supported. If Clipto misidentifies someone, you can simply relabel that person (or merge/split identities), and the change is reflected throughout your library. There’s no need to re-process or re-index everything from scratch.
In fact, user corrections become part of the local memory layer, which helps make future search and retrieval much more accurate for your own media collection.
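One common way to make relabeling cheap is to keep the expensive detection results separate from a small, editable label map. The sketch below illustrates that idea only; it is not Clipto's actual implementation, and every name in it (`IdentityIndex`, `detections`, `labels`, `merged_into`) is hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class IdentityIndex:
    """Hypothetical sketch: detections are computed once, labels are edited freely."""
    # file path -> ids of the person clusters detected in that file (index-time output)
    detections: dict[str, list[int]] = field(default_factory=dict)
    # cluster id -> user-assigned name (cheap to change later)
    labels: dict[int, str] = field(default_factory=dict)
    # cluster id -> cluster it was merged into
    merged_into: dict[int, int] = field(default_factory=dict)

    def _canonical(self, cluster_id: int) -> int:
        # Follow merge links until reaching a cluster that was never merged away.
        while cluster_id in self.merged_into:
            cluster_id = self.merged_into[cluster_id]
        return cluster_id

    def relabel(self, cluster_id: int, name: str) -> None:
        # Only the label map changes; no media file is processed again.
        self.labels[self._canonical(cluster_id)] = name

    def merge(self, source: int, target: int) -> None:
        s, t = self._canonical(source), self._canonical(target)
        if s != t:
            self.merged_into[s] = t

    def search(self, name: str) -> list[str]:
        # Resolve each detection through the merge and label maps at query time.
        return sorted(
            path
            for path, ids in self.detections.items()
            if any(self.labels.get(self._canonical(c)) == name for c in ids)
        )

idx = IdentityIndex(detections={"a.mp4": [1], "b.mp4": [2], "c.mp4": [3]})
idx.relabel(1, "Alice")
idx.relabel(2, "Bob")
idx.relabel(2, "Alice")  # fix a misidentification
idx.merge(3, 1)          # cluster 3 turns out to be the same person as cluster 1
print(idx.search("Alice"))
```

Because names are resolved at query time, a correction touches one dictionary entry rather than the indexed media, which is consistent with the "no re-indexing" behaviour described above.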
ComputerX
So I can just dump all my messy folders into the app and let the AI do the heavy lifting? No manual tagging or renaming required at all?
Clipto
@bruceyongli Yes — just drag your local media files into Clipto.
From there, Clipto will watch and listen to the content for you. It can recognize basic media information like format, resolution, frame rate, and aspect ratio — for example MP4 or MOV files, 4K or 1080p footage, 24fps or 30fps, widescreen or vertical clips.
It can also understand what’s inside the content, including people, actions, dialogue, scenes, objects, and more. During indexing, Clipto automatically adds multi-dimensional tags to your media files, so in most cases, you don’t need to manually rename, tag, or organize everything first.
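To make the "basic media information" tags concrete, here is a minimal sketch of how width, height, and frame rate could be turned into labels like the ones mentioned above. The thresholds and the `describe_clip` helper are assumptions for illustration, not Clipto's actual rules.

```python
def describe_clip(width: int, height: int, fps: float) -> dict[str, str]:
    """Hypothetical tagger: derive coarse labels from basic stream properties."""
    short_side = min(width, height)  # works for both landscape and vertical clips
    if short_side >= 2160:
        resolution = "4K"
    elif short_side >= 1080:
        resolution = "1080p"
    elif short_side >= 720:
        resolution = "720p"
    else:
        resolution = "SD"

    ratio = width / height
    if ratio < 1:
        orientation = "vertical"
    elif ratio == 1:
        orientation = "square"
    elif ratio >= 1.7:  # assumed cutoff; 16:9 is roughly 1.78
        orientation = "widescreen"
    else:
        orientation = "standard"

    # ":g" drops trailing zeros, so 24.0 becomes "24" and 29.97 stays "29.97"
    return {"resolution": resolution, "orientation": orientation, "fps": f"{fps:g}fps"}

print(describe_clip(3840, 2160, 24))
print(describe_clip(1080, 1920, 30))
```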