The large language models developed by Alibaba Cloud, Qwen-7B and Qwen-7B-Chat, are two smaller versions of the Tongyi Qianwen AI model launched in April 2023.
Heaptalk, Jakarta — The China-based cloud computing and artificial intelligence (AI) company Alibaba Cloud has released its large language models Qwen-7B and Qwen-7B-Chat to the open-source community (08/03).
The models are two smaller versions of Tongyi Qianwen, which launched in April 2023. Similar to ChatGPT, the AI model is capable of generating human-like content in Chinese and English.
Qwen-7B is a model with 7 billion parameters, while Qwen-7B-Chat is its conversationally fine-tuned version. Both models can be accessed publicly through the company’s AI model community ModelScope and the collaborative AI platform Hugging Face, as reported by Technode Global.
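For readers who want to try the weights directly, below is a minimal sketch of loading Qwen-7B-Chat from Hugging Face with the transformers library. It assumes the Qwen/Qwen-7B-Chat repository id and the custom modeling code the checkpoint ships with; it is an illustration, not an official quickstart.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The Qwen checkpoints ship custom modeling code, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat",
    device_map="auto",  # spread layers across available GPUs/CPU
    trust_remote_code=True,
).eval()

# The chat checkpoint exposes a chat() helper through its remote code;
# history=None starts a fresh conversation.
response, history = model.chat(tokenizer, "Hello! What can you do?", history=None)
print(response)
```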
Alibaba Cloud, the digital technology and intelligence backbone of Alibaba Group, took this step in an effort to democratize AI technologies. The models’ code, weights, and documentation will be freely accessible to academics, researchers, and commercial institutions around the world.
For commercial use, the models are free for companies with fewer than 100 million monthly active users. Companies with more users can request a license from Alibaba Cloud.
Constructing generative AI effectively and cost-efficiently
According to Jingren Zhou, Chief Technology Officer of Alibaba Cloud Intelligence, the company aims to promote inclusive technologies and enable more developers and SMEs to reap the benefits of generative AI by open-sourcing its proprietary large language models. “As a determined long-term champion of open-source initiatives, we hope that this open approach can also bring collective wisdom to further help open-source communities thrive,” said Zhou, as quoted by Technode Global.
Qwen-7B-Chat has been fine-tuned with alignment techniques based on human instructions. Both Qwen-7B and Qwen-7B-Chat can be deployed on cloud or on-premises infrastructure, a flexibility that allows users to fine-tune and build their own generative AI models effectively and cost-efficiently, as sketched below.
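As an illustration of what building on top of the open weights can look like, here is a hedged sketch of parameter-efficient fine-tuning with LoRA via the peft library. The rank and other hyperparameters are assumptions for demonstration, not an official recipe; the target module reflects Qwen-7B’s fused attention projection, named c_attn in its modeling code.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the open Qwen-7B weights as the base model.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

# Attach LoRA adapters to the fused attention projection; r, alpha, and
# dropout are illustrative values a user would tune for their own data.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only a small fraction of the 7B weights train
```

Because only the adapter weights are updated, an approach like this keeps fine-tuning within reach of a single-node setup, on cloud or on-premises alike.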
Further, Qwen-7B has been pre-trained on more than 2 trillion tokens, covering Chinese, English, and other languages, as well as code and mathematics across general and professional fields. The model supports a context length of up to 8,000 tokens.
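To make that token budget concrete, the short sketch below counts tokens in a mixed Chinese/English prompt with the Qwen tokenizer; the prompt plus any generated output must fit within the roughly 8K-token window. The sample text is illustrative.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

prompt = "Alibaba Cloud 开源了通义千问的 7B 模型。"  # mixed Chinese/English sample
n_tokens = len(tokenizer(prompt).input_ids)
print(n_tokens)  # prompt plus generation must stay within the ~8K context limit
```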