Alibaba Group Holding has unveiled a new multimodal artificial intelligence (AI) model that can process text, images, audio and video on smartphones and laptops.
The company introduced Qwen2.5-Omni-7B on Thursday as the newest member of its Qwen model family, part of the tech giant's push to strengthen its position in generative AI, News.Az reports, citing foreign media.
The multimodal Qwen2.5-Omni-7B model brings advanced AI capabilities closer to everyday users.
According to a statement from Alibaba, the model can process various types of inputs and generate real-time responses in text or audio. Additionally, the company has made the model open-source.
The company emphasized potential use cases like offering real-time audio descriptions for visually impaired users and providing step-by-step cooking instructions by analyzing ingredients.
The model’s versatility highlights the increasing demand for AI systems that extend beyond just text generation.
Alibaba’s foundational Qwen models have become popular choices for AI developers to build on, making the family one of the few major alternatives to DeepSeek’s V3 and R1 models in China.