
CogView4
Open-Source, 2K Resolution Text-to-Image.
CogView4, from the ChatGLM team, is an open-source (Apache 2.0) text-to-image model. It offers native 2048x2048 resolution, unlimited-length prompts (Chinese/English), and in-image text generation!






Flowtica Scribe
Hi everyone!
I'm sharing CogView4, a new open-source text-to-image model from the ChatGLM team. It's got some seriously impressive capabilities!
What stands out:
Native 2K Resolution: Generates images at 2048x2048 natively, with no need for upscaling.
(Almost) Unlimited-Length Prompts: Supports verrrrrrrrry long and detailed prompts, in both Chinese and English.
In-Image Text Generation: Can generate images with text in them, in both English and Chinese characters! This is a big deal.
Bilingual: Excels at understanding and following Chinese instructions.
Apache 2.0 License: Fully open-source and commercially usable.
6B DiT + 9B Text Encoder: Uses a Diffusion Transformer architecture.
They're also planning to release a fine-tuning framework, ControlNet support, and ComfyUI integration soon. This is a major contribution to the open-source image generation community.
You can try it out directly on their Hugging Face Space.
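If you'd rather run it locally than use the hosted demo, here's a rough sketch of what that might look like. It assumes a recent Hugging Face diffusers release that ships `CogView4Pipeline`, and that the weights live under the `THUDM/CogView4-6B` repo id; neither detail comes from this post, so check the model card first. `build_generation_kwargs` and `generate` are hypothetical helper names, and the step count and guidance scale are placeholder values, not recommended settings.

```python
MAX_SIDE = 2048  # native resolution ceiling quoted in the post


def build_generation_kwargs(prompt: str, width: int = 2048, height: int = 2048) -> dict:
    """Collect the arguments for one text-to-image call (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    for name, side in (("width", width), ("height", height)):
        if not 0 < side <= MAX_SIDE:
            raise ValueError(f"{name}={side} is outside 1..{MAX_SIDE}")
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_inference_steps": 50,  # assumed placeholder value
        "guidance_scale": 3.5,      # assumed placeholder value
    }


def generate(prompt: str, width: int = 2048, height: int = 2048,
             model_id: str = "THUDM/CogView4-6B"):
    """Load the pipeline and return one PIL image (model_id is an assumption)."""
    # Heavy imports stay local so the helper above works without a GPU stack.
    import torch
    from diffusers import CogView4Pipeline

    pipe = CogView4Pipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # trades speed for lower GPU memory use
    return pipe(**build_generation_kwargs(prompt, width, height)).images[0]
```

Calling `generate("a neon sign that reads 'open source'").save("cogview4.png")` would then write the result to disk. Expect a large download and heavy GPU memory use at the full 2048x2048 size.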