OmniGen2 Released: Unified Image Understanding & Generation Model with Natural Language Instructions
The VectorSpaceLab team has officially released OmniGen2, a multimodal image generation model. Unlike its predecessor, OmniGen v1, OmniGen2 uses two separate decoding pathways for the text and image modalities, with independent parameters for each and a decoupled image tokenizer. This design yields significant performance improvements in image editing.
OmniGen2 has four core capabilities and is particularly strong at image editing:
- Natural Language Instruction-Guided Image Editing
- Text-to-Image Generation
- In-Context Generation
- Visual Understanding
Related links
- Project Homepage: https://vectorspacelab.github.io/OmniGen2
- GitHub Repository: https://github.com/VectorSpaceLab/OmniGen2
- Model Download: https://huggingface.co/OmniGen2/OmniGen2
- Online Demo: https://huggingface.co/spaces/OmniGen2/OmniGen2
- Technical Paper: https://arxiv.org/abs/2506.18871
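For readers who want to try the model locally, the weights linked above can be fetched with the `huggingface_hub` package. The snippet below is a minimal sketch, not the project's official setup procedure: the helper names `repo_id_from_url` and `download_weights` and the `./OmniGen2` target directory are illustrative assumptions, and only the repository id comes from the Model Download link. Check the GitHub repository for the supported installation and inference steps.

```python
from urllib.parse import urlparse

# Model Download link from the list above.
MODEL_URL = "https://huggingface.co/OmniGen2/OmniGen2"


def repo_id_from_url(url: str) -> str:
    """Return the 'owner/name' repo id from a Hugging Face model URL.

    Hypothetical helper for illustration; it only handles model URLs
    of the form https://huggingface.co/<owner>/<name>.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model URL: {url}")
    return "/".join(parts[:2])


def download_weights(url: str = MODEL_URL, local_dir: str = "./OmniGen2") -> str:
    """Download the full model snapshot and return its local path.

    Assumes `pip install huggingface_hub`; the import is done lazily so
    the URL helper above works without the package installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_from_url(url), local_dir=local_dir)


if __name__ == "__main__":
    # Downloads several gigabytes; run only when you actually need the weights.
    print(download_weights())
```

Running the script downloads the whole repository snapshot into the target directory and prints the resulting path.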