Hunter
Hey Hunters!
Introducing Meta Perception Encoder, Meta FAIR's powerful new family of vision-language models!
From zero-shot classification to multimodal reasoning, PE pushes the boundaries of what's possible in computer vision. With variants like PE-Core, PE-Lang, and PE-Spatial, it's designed to tackle everything from image understanding to dense spatial tasks, all using a single contrastive objective.
What's exciting?
- Intermediate embeddings for richer representations
- Advanced alignment techniques
- Strong zero-shot and retrieval performance
- Open-source and research-friendly!
Built for researchers, developers, and AI enthusiasts alike. Let's reimagine visual understanding together.
Would love your feedback!
@saaswarrior Super impressive launch! Love the focus on visual understanding. How beginner-friendly is it for someone just getting into AI?
Impressive benchmarks on zero-shot tasks! The vision encoder's performance suggests Meta has made significant architectural innovations in cross-modal representation learning. Particularly curious about the training methodology - is this leveraging a new paradigm beyond contrastive learning?
minimalist phone: creating folders
Hey guys, can you please upload the video for this launch again? (ATM, it doesn't show the thumbnail.)
After republishing, the bug should be gone.
P.S.: This is very interesting. I saw something similar for video understanding 2 days ago: Twelvelabs, hunted by @zaczuo. I have also seen some kind of "video reading" in Notebooks.app by @dev_singh.
@busmark_w_nika I have just edited the video.
minimalist phone: creating folders
@saaswarrior I can see the final result, good job! :)
Telebugs
Congrats on the launch! Curious to see which models it surpasses.