Fuyu-8B is a multimodal model capable of...
🖼️ Visual Question Answering
🖼️ Image Captioning
🖼️ Text localization and more!

Fuyu-8B: A multimodal architecture for AI agents

