This is a checkpoint of a fine-tune of BART acting as an autoencoder with a fixed-size 32x64 latent space, to be used for training diffusion models. See https://arxiv.org/abs/2212.09462
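
For intuition, here is a minimal sketch of the idea behind that fixed-size latent (assuming PyTorch and Hugging Face `transformers`): a set of learned queries cross-attends over the variable-length BART encoder states and is projected down, so any input sentence compresses to a 32x64 tensor. The `LatentCompressor` class and everything inside it are illustrative only, not the actual modules from the training repo.

```
import torch
import torch.nn as nn
from transformers import BartModel, BartTokenizer

class LatentCompressor(nn.Module):
    """Illustrative only: compress variable-length encoder states to 32x64."""
    def __init__(self, d_model=768, num_latents=32, dim_ae=64, num_heads=8):
        super().__init__()
        # One learned query vector per latent slot (32 of them).
        self.latent_queries = nn.Parameter(torch.randn(num_latents, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        # Project each attended latent down to the autoencoder dim (64).
        self.proj = nn.Linear(d_model, dim_ae)

    def forward(self, encoder_states):  # (batch, seq_len, d_model)
        batch = encoder_states.size(0)
        queries = self.latent_queries.unsqueeze(0).expand(batch, -1, -1)
        attended, _ = self.attn(queries, encoder_states, encoder_states)
        return self.proj(attended)      # (batch, 32, 64)

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
bart = BartModel.from_pretrained("facebook/bart-base")
compressor = LatentCompressor()  # untrained; shapes are the point here

inputs = tokenizer("A sentence to compress.", return_tensors="pt")
with torch.no_grad():
    states = bart.encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    print(compressor(states).shape)                    # torch.Size([1, 32, 64])
```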

Trained on sentences from the C4 dataset.

Even though this was trained for less time than in the paper and on a more diverse dataset, it performs well: the validation loss is 0.14, and the reconstruction is correct more than 90% of the time.

Trained from https://github.com/bary12/latent-diffusion-for-language using the following command:

```
python train_latent_model.py \
    --dataset_name c4_sentences \
    --enc_dec_model facebook/bart-base \
    --learning_rate 1e-4 \
    --lr_warmup_steps 1000 \
    --train_batch_size 64 \
    --num_encoder_latents 32 \
    --dim_ae 64 \
    --num_decoder_latents 32 \
    --eval_every 10000 \
    --num_layers 3 \
    --wandb_name bart-roc-l2norm-test-32-64 \
    --l2_normalize_latent
```
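
Of the flags above, `--num_encoder_latents 32` and `--dim_ae 64` are what fix the latent shape at 32x64, and `--l2_normalize_latent` (judging by its name) constrains each latent to unit L2 norm, keeping the latents at a predictable scale for the diffusion model trained on top of them.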