# LASER: sentence encoders for WMT '22 shared task - data track

More information on the shared task can be found here: 
https://statmt.org/wmt22/large-scale-multilingual-translation-task.html

## Downloading encoders

To download encoders for all 24 supported languages,
run the `download_models.sh` script in this directory:
```
bash ./download_models.sh
```
This places all supported models in the directory `$LASER/models/wmt22`.

**Note**: the encoder for each focus language is named `laser3-xxx`, where `xxx` is the language code, except for
Afrikaans (afr), English (eng), and French (fra), which are all supported by the laser2 model.

Languages with a `laser3-xxx` encoder are: amh, fuv, hau, ibo, kam, kin, lin, lug, luo, nso, nya, orm, sna, som, ssw, swh, tsn, tso, umb, wol, xho, yor, and zul.
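
The naming rule from the note can be sketched as a small shell helper. This is only an illustration: `encoder_for` is a hypothetical function that is not part of LASER, and it returns the encoder name only, not the file name or path of the downloaded model. It also does not check that the code is one of the languages listed above.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of LASER): print the encoder name for a
# three-letter language code, following the naming rule in the note.
encoder_for() {
    case "$1" in
        # Afrikaans, English, and French share the laser2 model.
        afr|eng|fra) echo "laser2" ;;
        # Every other focus language has its own laser3-xxx encoder.
        *)           echo "laser3-$1" ;;
    esac
}

# Example: print the encoder name for a few language codes.
for lang in afr hau zul; do
    echo "$lang -> $(encoder_for "$lang")"
done
```

A lookup like this can help when looping over several languages in a script, since the three laser2 languages need to be handled separately from the rest.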

## Embedding texts

Once all encoders are downloaded, you can begin embedding texts by following the
instructions in `LASER/tasks/embed/README.md`.