Alibaba research lab Damo Academy releases AI large language model trained in Southeast Asian languages as China AI industry heats up
Alibaba research unit Damo Academy has launched an artificial intelligence (AI) large language model (LLM) tailored for Southeast Asian languages, as Chinese tech giants continue to push forward with the technology amidst US sanctions.
Damo said on Monday the LLM was trained on Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Tagalog, and Burmese data sets and has outperformed competitors in linguistic and safety tasks.
Alibaba’s Southeast Asia e-commerce platform Lazada is targeting a turnover of around $100 billion (£79bn) with 300 million consumers in the region by 2030.
The LLM includes a chat assistant called SeaLLM Chat that is designed to help businesses using the model to engage with Southeast Asian markets.
Damo said SeaLLM performs better than other tools such as ChatGPT in non-Latin language tasks, and can interpret and process text up to nine times longer than competitors can handle in those languages.
It said SeaLLM also delivered better results in translating between English and languages such as Lao and Khmer, for which limited data is available to train AI models.
Bing Lidong, director of Damo’s language technology lab, said the LLM could “embrace the richness of Southeast Asia” and that innovation would empower communities historically under-represented in the digital world.
The move comes amidst an explosion of AI research in China, with 130 LLMs released by Chinese companies and research institutes as of July this year, following a boom in the sector sparked by OpenAI’s ChatGPT.
Chinese chip designers including Huawei and Tencent are also redoubling their efforts to develop AI chips and market them domestically as alternatives to those from Nvidia, many of which have been banned under increasingly strict US export controls.