---
title: AI Language Monitor
emoji: 🌍
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 8000
license: cc-by-sa-4.0
short_description: Evaluating LLM performance across all human languages.
datasets:
- openlanguagedata/flores_plus
- google/fleurs
- mozilla-foundation/common_voice_1_0
- CohereForAI/Global-MMLU
models:
- meta-llama/Llama-3.3-70B-Instruct
- mistralai/Mistral-Small-24B-Instruct-2501
- deepseek-ai/DeepSeek-V3
- microsoft/phi-4
- openai/whisper-large-v3
- google/gemma-3-27b-it
tags:
- leaderboard
- submission:manual
- test:public
- judge:auto
- modality:text
- modality:artefacts
- eval:generation
- language:English
- language:German
---
<!--
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
For tag meaning, see https://huggingface.co/spaces/leaderboards/LeaderboardsExplorer
-->
[AI Language Monitor on Hugging Face Spaces](https://huggingface.co/spaces/datenlabor-bmz/ai-language-monitor)
# AI Language Monitor 🌍
_Tracking language proficiency of AI models for every language_
```bash
uv run evals/main.py
```