This is a checkpoint of a fine-tune of BART acting as an autoencoder with a fixed-size 32x64 latent space, to be used for training diffusion models. See https://arxiv.org/abs/2212.09462
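
For intuition, here is a minimal sketch of the idea behind that fixed-size latent (assuming PyTorch and Hugging Face `transformers`): a set of learned queries cross-attends over the variable-length BART encoder states and is projected down, so any input sentence compresses to a 32x64 tensor. The `LatentCompressor` class and everything inside it are illustrative only, not the actual modules from the training repo.

```
import torch
import torch.nn as nn
from transformers import BartModel, BartTokenizer

class LatentCompressor(nn.Module):
    """Illustrative only: compress variable-length encoder states to 32x64."""
    def __init__(self, d_model=768, num_latents=32, dim_ae=64, num_heads=8):
        super().__init__()
        # One learned query vector per latent slot (32 of them).
        self.latent_queries = nn.Parameter(torch.randn(num_latents, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        # Project each attended latent down to the autoencoder dim (64).
        self.proj = nn.Linear(d_model, dim_ae)

    def forward(self, encoder_states):  # (batch, seq_len, d_model)
        batch = encoder_states.size(0)
        queries = self.latent_queries.unsqueeze(0).expand(batch, -1, -1)
        attended, _ = self.attn(queries, encoder_states, encoder_states)
        return self.proj(attended)      # (batch, 32, 64)

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
bart = BartModel.from_pretrained("facebook/bart-base")
compressor = LatentCompressor()  # untrained; shapes are the point here

inputs = tokenizer("A sentence to compress.", return_tensors="pt")
with torch.no_grad():
    states = bart.encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    print(compressor(states).shape)                    # torch.Size([1, 32, 64])
```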

Trained on sentences from the C4 dataset.

Even though this was trained for less time than in the paper and on a more diverse dataset, it performs well: the validation loss is 0.14, and the reconstruction is correct more than 90% of the time.

Trained from https://github.com/bary12/latent-diffusion-for-language using the following command:

```
python train_latent_model.py \
    --dataset_name c4_sentences \
    --enc_dec_model facebook/bart-base \
    --learning_rate 1e-4 \
    --lr_warmup_steps 1000 \
    --train_batch_size 64 \
    --num_encoder_latents 32 \
    --dim_ae 64 \
    --num_decoder_latents 32 \
    --eval_every 10000 \
    --num_layers 3 \
    --wandb_name bart-roc-l2norm-test-32-64 \
    --l2_normalize_latent
```
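
Of the flags above, `--num_encoder_latents 32` and `--dim_ae 64` are what fix the latent shape at 32x64, and `--l2_normalize_latent` (judging by its name) constrains each latent to unit L2 norm, keeping the latents at a predictable scale for the diffusion model trained on top of them.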