---
title: README
emoji: 🌍
colorFrom: red
colorTo: gray
sdk: static
pinned: false
---

# SeaLLMs - Large Language Models for Southeast Asia

Welcome to the SeaLLMs project - a family of large language models tailored for languages spoken in Southeast Asia, including English, Chinese, Indonesian, Vietnamese, Thai, Tagalog, Malay, Burmese, Khmer, Lao, Tamil, and Javanese.

Most models are designed primarily for high-resource languages like English. Our mission is to democratize access to advanced language technologies for regional and potentially under-represented languages,
while prioritizing safety and trustworthiness within the regional context.

## ☄️ What's New (in 2025)?
After the release of SeaLLMs-v3, we have focused on extending the family in two directions: language coverage and multimodal support. We are happy to share:

- 🌏 [Babel](https://babel-llm.github.io/babel-llm/): a multilingual LLM that covers the top 25 languages by number of speakers and supports over 90% of the global population
- 🎧 [SeaLLMs-Audio](https://damo-nlp-sg.github.io/SeaLLMs-Audio/): the multimodal (audio) extension of SeaLLMs and the first large audio-language model designed to support multiple Southeast Asian languages

## SeaLLMs Models

- [SeaLLMs-v3](https://damo-nlp-sg.github.io/DAMO-SeaLLMs/): The latest version of the SeaLLMs family, achieving SOTA performance on diverse tasks while being specifically enhanced for trustworthiness. Available in multiple variants: [7B-Chat](https://huggingface.co/SeaLLMs/SeaLLM3-7B-Chat), [1.5B-Chat](https://huggingface.co/SeaLLMs/SeaLLMs-v3-1.5B-Chat), [1.5B-base](https://huggingface.co/SeaLLMs/SeaLLMs-v3-1.5B) and [7B-base](https://huggingface.co/SeaLLMs/SeaLLMs-v3-7B).
- [SeaLLMs/SeaLLM-7B-v2.5](https://huggingface.co/SeaLLMs/SeaLLM-7B-v2.5): A new SeaLLM-7B model that achieves SOTA among 7B models on many world knowledge and reasoning tasks in SEA languages.
- [SeaLLMs/SeaLLM-7B-v2](https://huggingface.co/SeaLLMs/SeaLLM-7B-v2): The most significant upgrade since SeaLLM-13B, at half the size, outperforming it across diverse multilingual tasks, including world knowledge, math reasoning, and instruction following.
- [SeaLLMs/SeaLLM-13B-Chat](https://huggingface.co/SeaLLMs/SeaLLM-13B-Chat): A chatbot optimized for Vietnamese 🇻🇳, Indonesian 🇮🇩, Thai 🇹🇭, Malay 🇲🇾, Khmer 🇰🇭, Lao 🇱🇦, Tagalog 🇵🇭 and Burmese 🇲🇲.

## Multilingual Evaluations for SEA
- [LLM Leaderboard for Southeast Asian Languages](https://huggingface.co/spaces/SeaLLMs/LLM_Leaderboard_for_SEA): evaluates LLMs on Southeast Asian languages through two comprehensive benchmarks, SeaExam and SeaBench:
  - SeaExam assesses world knowledge and reasoning capabilities through exam-style questions (for both base and chat models). [[data (public)](https://huggingface.co/datasets/SeaLLMs/SeaExam), [eval code](https://github.com/DAMO-NLP-SG/SeaExam)]
  - SeaBench evaluates instruction-following abilities and multi-turn conversational skills (and therefore applies only to chat models). [[data (public)](https://huggingface.co/datasets/SeaLLMs/SeaBench), [eval code](https://github.com/DAMO-NLP-SG/SeaBench)]

## Quick Links
- [Project Page](https://damo-nlp-sg.github.io/DAMO-SeaLLMs/): project page that contains links to everything you need
- [SeaLLM-Chatbot](https://huggingface.co/spaces/SeaLLMs/SeaLLM-Chat): online demo for the latest chatbot version of SeaLLMs (currently SeaLLMs-v3-7B-chat)
- [SeaLLMs Github Repo](https://github.com/DAMO-NLP-SG/SeaLLMs)
- [SeaLLMs Paper](https://arxiv.org/abs/2312.00738) (ACL 2024 Demo)
- [SeaLLMs 3 Paper](https://arxiv.org/abs/2407.19672) (NAACL 2025 Demo)