Qwen-Image: A Foundation Model That Writes Pictures With Words

Qwen-Image from the Qwen team at Alibaba Cloud is a 20B-parameter MMDiT (Multi-Modal Diffusion Transformer) image foundation model focused on two hard things at once: rendering complex, accurate text ...

Tags: AI Arena, Apache-2.0, ComfyUI, Diffusers, Image Editing, MMDiT, ModelScope, Qwen, Qwen-Image, Text-to-Image