# Installing NeMo from source


You can either run this notebook locally (if you have all the dependencies and a GPU) or on Google Colab.

Instructions for setting up Colab are as follows:
1. Open a new Python 3 notebook.
2. Import this notebook from GitHub (File -> Upload Notebook -> "GITHUB" tab -> copy/paste GitHub URL)
3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select "GPU" for hardware accelerator)
4. Run the cell below to set up dependencies.


In [None]:
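import os

# NeMo branch (release tag) to install
BRANCH = 'r1.17.0'
# System libraries for audio I/O that NeMo depends on
!apt-get update && apt-get install -y libsndfile1 ffmpeg
# Clone the chosen branch and install it from source
!git clone https://github.com/NVIDIA/NeMo --branch $BRANCH
os.chdir('NeMo')
!./reinstall.sh
# Return to the working directory so relative paths such as NeMo/... resolve
os.chdir('..')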
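
After the install cell finishes, it can be worth confirming that the packages are importable before moving on. The following is a minimal sketch and not part of the original notebook: the helper name `is_installed` is hypothetical, and it only checks that a module can be located, not that its version matches `BRANCH`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report on the packages this tutorial relies on (NeMo is built on PyTorch).
for name in ("nemo", "torch"):
    print(f"{name}: {'found' if is_installed(name) else 'MISSING'}")
```

If either package is reported as missing, re-running the install cell (and restarting the runtime on Colab) is usually the first thing to try.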


# Overview

This tutorial covers three tasks:

1. Intent and Slot Classification using Assistant Dataset and a BERT model
2. Intent Classification using Schema Guided Dialogue Dataset and a GPT2 model
3. Answer Extender using MS Marco NLGen Dataset and a BART model

Feel free to skip to the task that interests you most after installing NeMo from source.

# 1. Intent and Slot Classification using Assistant Dataset

## 1.1 Task Description

**Joint Intent and Slot classification** is the task of classifying the Intent of a query and detecting all Slots (Entities)
relevant to that Intent.
For example, in the query: `What is the weather in Santa Clara tomorrow morning?`, we would like to classify the query
as a `weather` Intent, and detect `Santa Clara` as a `location` slot and `tomorrow morning` as a `date_time` slot.
Intent and Slot names are usually task-specific and defined as labels in the training data.
This is a fundamental step that is executed in any task-driven Conversational Assistant.

The model is trained on both tasks jointly and then predicts the Intent and the Slots together.
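
To make the task concrete, the weather query above can be written out as one intent label plus one slot label per token. This is only an illustrative sketch: the whitespace tokenization and the helper `token_slot_labels` are assumptions made for this example (the actual models use subword tokenizers), and `O` marks tokens outside any slot, as in the slot report later in this section.

```python
query = "What is the weather in Santa Clara tomorrow morning?"
intent = "weather"
# Slot values as (slot_name, text) pairs, taken from the example above.
slots = [("location", "Santa Clara"), ("date_time", "tomorrow morning")]


def token_slot_labels(query, slots, outside="O"):
    """Assign a slot label to every whitespace token; unmatched tokens get `outside`."""
    tokens = query.rstrip("?").split()
    labels = [outside] * len(tokens)
    for name, text in slots:
        span = text.split()
        # Find the first position where the slot text matches the token sequence.
        for i in range(len(tokens) - len(span) + 1):
            if tokens[i:i + len(span)] == span:
                labels[i:i + len(span)] = [name] * len(span)
                break
    return tokens, labels


tokens, labels = token_slot_labels(query, slots)
print(f"intent: {intent}")
for token, label in zip(tokens, labels):
    print(f"{token:10s} {label}")
```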

Note: A similar model is available in the [Joint Intent Slot Classification Colab](https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb). However, that model only supports BERT-style models, while the model in this tutorial also supports other types of models such as GPT2.


## 1.2 Download Assistant dataset and convert to NeMo format

This is a virtual assistant interaction dataset that can be downloaded from https://github.com/xliuhw/NLU-Evaluation-Data.
It contains about 10K training and 1K testing queries covering 64 Intents and 55 Slots.

An example is:

* utterance: what alarms have i set for tomorrow 
* intent: alarm_query
* slots: date(tomorrow)


Note: While only the assistant dataset is used here, `import_datasets.py` is also compatible with ATIS and SNIPS.

In [None]:
# download and unzip the example dataset from github
!wget https://github.com/xliuhw/NLU-Evaluation-Data/archive/master.zip
!unzip master.zip
# convert the dataset to the NeMo format
!python NeMo/scripts/dataset_processing/nlp/intent_and_slot/import_datasets.py --dataset_name=assistant --source_data_dir=./NLU-Evaluation-Data-master --target_data_dir=./assistant
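
Once the conversion script has run, a quick look at the target directory helps confirm that files were actually written. The sketch below is an addition to the notebook; `summarize_dir` is a hypothetical helper, and no particular output file names are assumed.

```python
import os


def summarize_dir(path):
    """Return a sorted list of (file_name, size_in_bytes) for regular files in `path`."""
    entries = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isfile(full):
            entries.append((name, os.path.getsize(full)))
    return entries


# Only inspect the folder if the conversion step has already created it.
if os.path.isdir("./assistant"):
    for name, size in summarize_dir("./assistant"):
        print(f"{size:>10}  {name}")
```

Empty or zero-byte files here would suggest that `--source_data_dir` did not point at the unzipped dataset.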

## 1.3 Training and/or Testing the model




In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
!(python NeMo/examples/nlp/dialogue/dialogue.py \
 do_training=True \
 model.dataset.data_dir='./assistant' \
 model.dataset.dialogues_example_dir='./assistant_bert_examples' \
 model.dataset.task='assistant' \
 model.language_model.pretrained_model_name='bert-base-uncased' \
 exp_manager.create_wandb_logger=False)
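
The command above passes its configuration as `key=value` overrides (NeMo example scripts are configured through Hydra). When experimenting with several settings, it can be convenient to keep the overrides in a dictionary and build the command string from it. This is a sketch under that assumption: `build_command` is a hypothetical helper, and the keys are copied from the cell above.

```python
import shlex


def build_command(script, overrides):
    """Build a shell command from a script path and key=value overrides."""
    parts = ["python", script] + [f"{key}={value}" for key, value in overrides.items()]
    # Quote each argument so values with special characters survive the shell.
    return " ".join(shlex.quote(part) for part in parts)


overrides = {
    "do_training": True,
    "model.dataset.data_dir": "./assistant",
    "model.dataset.dialogues_example_dir": "./assistant_bert_examples",
    "model.dataset.task": "assistant",
    "model.language_model.pretrained_model_name": "bert-base-uncased",
    "exp_manager.create_wandb_logger": False,
}
cmd = build_command("NeMo/examples/nlp/dialogue/dialogue.py", overrides)
print(cmd)
```

In a notebook cell the resulting string can then be executed with `!{cmd}`.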


**Results after 3 epochs**

Intent report: 
```
 label precision recall f1 support 
 alarm_query (label_id: 0) 100.00 94.44 97.14 18
 alarm_remove (label_id: 1) 100.00 90.91 95.24 11
 alarm_set (label_id: 2) 94.12 94.12 94.12 17
 audio_volume_down (label_id: 3) 75.00 42.86 54.55 7
 audio_volume_mute (label_id: 4) 100.00 92.86 96.30 14
 audio_volume_up (label_id: 5) 72.22 100.00 83.87 13
 calendar_query (label_id: 6) 87.50 77.78 82.35 18
 calendar_remove (label_id: 7) 94.44 100.00 97.14 17
 calendar_set (label_id: 8) 94.44 94.44 94.44 18
 cooking_recipe (label_id: 9) 85.71 70.59 77.42 17
 datetime_convert (label_id: 10) 88.89 100.00 94.12 8
 datetime_query (label_id: 11) 89.47 100.00 94.44 17
 email_addcontact (label_id: 12) 80.00 100.00 88.89 8
 email_query (label_id: 13) 100.00 83.33 90.91 18
 email_querycontact (label_id: 14) 78.95 88.24 83.33 17
 email_sendemail (label_id: 15) 94.44 94.44 94.44 18
 general_affirm (label_id: 16) 100.00 100.00 100.00 17
 general_commandstop (label_id: 17) 100.00 100.00 100.00 18
 general_confirm (label_id: 18) 100.00 100.00 100.00 17
 general_dontcare (label_id: 19) 100.00 100.00 100.00 18
 general_explain (label_id: 20) 100.00 100.00 100.00 17
 general_joke (label_id: 21) 91.67 100.00 95.65 11
 general_negate (label_id: 22) 100.00 100.00 100.00 18
 general_praise (label_id: 23) 100.00 100.00 100.00 17
 general_quirky (label_id: 24) 60.00 50.00 54.55 18
 general_repeat (label_id: 25) 100.00 100.00 100.00 17
 iot_cleaning (label_id: 26) 100.00 100.00 100.00 15
 iot_coffee (label_id: 27) 85.71 100.00 92.31 18
 iot_hue_lightchange (label_id: 28) 100.00 94.12 96.97 17
 iot_hue_lightdim (label_id: 29) 100.00 100.00 100.00 12
 iot_hue_lightoff (label_id: 30) 100.00 100.00 100.00 17
 iot_hue_lighton (label_id: 31) 100.00 50.00 66.67 4
 iot_hue_lightup (label_id: 32) 84.62 91.67 88.00 12
 iot_wemo_off (label_id: 33) 100.00 100.00 100.00 9
 iot_wemo_on (label_id: 34) 100.00 85.71 92.31 7
 lists_createoradd (label_id: 35) 90.00 100.00 94.74 18
 lists_query (label_id: 36) 100.00 94.12 96.97 17
 lists_remove (label_id: 37) 88.89 88.89 88.89 18
 music_likeness (label_id: 38) 100.00 93.75 96.77 16
 music_query (label_id: 39) 100.00 100.00 100.00 17
 music_settings (label_id: 40) 77.78 100.00 87.50 7
 news_query (label_id: 41) 72.73 88.89 80.00 18
 play_audiobook (label_id: 42) 100.00 100.00 100.00 17
 play_game (label_id: 43) 93.75 83.33 88.24 18
 play_music (label_id: 44) 85.00 100.00 91.89 17
 play_podcasts (label_id: 45) 100.00 88.89 94.12 18
 play_radio (label_id: 46) 84.21 94.12 88.89 17
 qa_currency (label_id: 47) 85.00 94.44 89.47 18
 qa_definition (label_id: 48) 89.47 100.00 94.44 17
 qa_factoid (label_id: 49) 64.00 88.89 74.42 18
 qa_maths (label_id: 50) 84.62 84.62 84.62 13
 qa_stock (label_id: 51) 87.50 77.78 82.35 18
 recommendation_events (label_id: 52) 87.50 82.35 84.85 17
 recommendation_locations (label_id: 53) 83.33 83.33 83.33 18
 recommendation_movies (label_id: 54) 100.00 60.00 75.00 10
 social_post (label_id: 55) 100.00 94.12 96.97 17
 social_query (label_id: 56) 100.00 82.35 90.32 17
 takeaway_order (label_id: 57) 92.31 70.59 80.00 17
 takeaway_query (label_id: 58) 93.75 83.33 88.24 18
 transport_query (label_id: 59) 81.25 76.47 78.79 17
 transport_taxi (label_id: 60) 100.00 100.00 100.00 16
 transport_ticket (label_id: 61) 85.00 94.44 89.47 18
 transport_traffic (label_id: 62) 93.75 88.24 90.91 17
 weather_query (label_id: 63) 89.47 100.00 94.44 17
 -------------------
 micro avg 91.16 91.16 91.16 996
 macro avg 91.66 90.44 90.48 996
 weighted avg 91.72 91.16 91.04 996
```
Slot report: 
```
 label precision recall f1 support 
 alarm_type (label_id: 0) 0.00 0.00 0.00 2
 app_name (label_id: 1) 0.00 0.00 0.00 1
 artist_name (label_id: 2) 17.39 80.00 28.57 5
 audiobook_author (label_id: 3) 0.00 0.00 0.00 0
 audiobook_name (label_id: 4) 64.52 74.07 68.97 27
 business_name (label_id: 5) 81.48 84.62 83.02 52
 business_type (label_id: 6) 80.00 80.00 80.00 20
 change_amount (label_id: 7) 57.14 66.67 61.54 6
 coffee_type (label_id: 8) 100.00 33.33 50.00 3
 color_type (label_id: 9) 75.00 92.31 82.76 13
 cooking_type (label_id: 10) 0.00 0.00 0.00 1
 currency_name (label_id: 11) 100.00 96.43 98.18 28
 date (label_id: 12) 87.88 87.22 87.55 133
 definition_word (label_id: 13) 85.00 85.00 85.00 20
 device_type (label_id: 14) 84.75 76.92 80.65 65
 drink_type (label_id: 15) 0.00 0.00 0.00 0
 email_address (label_id: 16) 64.29 100.00 78.26 9
 email_folder (label_id: 17) 100.00 50.00 66.67 2
 event_name (label_id: 18) 80.00 75.00 77.42 64
 food_type (label_id: 19) 84.38 77.14 80.60 35
 game_name (label_id: 20) 93.55 78.38 85.29 37
 game_type (label_id: 21) 0.00 0.00 0.00 0
 general_frequency (label_id: 22) 0.00 0.00 0.00 9
 house_place (label_id: 23) 80.95 91.89 86.08 37
 ingredient (label_id: 24) 0.00 0.00 0.00 1
 joke_type (label_id: 25) 100.00 100.00 100.00 5
 list_name (label_id: 26) 89.29 69.44 78.12 36
 meal_type (label_id: 27) 0.00 0.00 0.00 3
 media_type (label_id: 28) 78.95 83.33 81.08 36
 movie_name (label_id: 29) 0.00 0.00 0.00 1
 movie_type (label_id: 30) 0.00 0.00 0.00 0
 music_album (label_id: 31) 0.00 0.00 0.00 0
 music_descriptor (label_id: 32) 0.00 0.00 0.00 2
 music_genre (label_id: 33) 81.82 90.00 85.71 10
 news_topic (label_id: 34) 80.00 30.77 44.44 13
 order_type (label_id: 35) 100.00 42.11 59.26 19
 person (label_id: 36) 70.79 100.00 82.89 63
 personal_info (label_id: 37) 76.19 94.12 84.21 17
 place_name (label_id: 38) 82.86 84.47 83.65 103
 player_setting (label_id: 39) 75.00 42.86 54.55 7
 playlist_name (label_id: 40) 0.00 0.00 0.00 3
 podcast_descriptor (label_id: 41) 92.31 54.55 68.57 22
 podcast_name (label_id: 42) 66.67 16.67 26.67 12
 radio_name (label_id: 43) 94.87 94.87 94.87 39
 relation (label_id: 44) 90.91 90.91 90.91 11
 song_name (label_id: 45) 100.00 6.67 12.50 15
 time (label_id: 46) 77.57 84.69 80.98 98
 time_zone (label_id: 47) 44.44 100.00 61.54 4
 timeofday (label_id: 48) 86.96 80.00 83.33 25
 transport_agency (label_id: 49) 80.00 57.14 66.67 7
 transport_descriptor (label_id: 50) 0.00 0.00 0.00 5
 transport_name (label_id: 51) 0.00 0.00 0.00 0
 transport_type (label_id: 52) 88.89 100.00 94.12 40
 weather_descriptor (label_id: 53) 87.50 87.50 87.50 8
 O (label_id: 54) 97.07 97.52 97.30 5408
 -------------------
 micro avg 94.24 94.24 94.24 6582
 macro avg 64.87 59.93 59.17 6582
 weighted avg 94.23 94.24 93.95 6582
```
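
The gap between the micro and macro averages in the slot report (about 94 versus about 59 F1) follows from how the two are defined: the micro average pools all predictions, so the very frequent `O` label dominates, while the macro average weights every label equally, so rare labels with 0.00 F1 pull it down. The sketch below uses the standard textbook definitions, which may differ in detail from NeMo's report (for example in how zero-support labels are handled). The toy counts are made up for illustration; only the `alarm_query` numbers come from the intent report above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def averages(per_class):
    """per_class maps label -> (true_positives, false_positives, false_negatives)."""
    f1s = []
    tp_sum = fp_sum = fn_sum = 0
    for tp, fp, fn in per_class.values():
        precision = 100 * tp / (tp + fp) if tp + fp else 0.0
        recall = 100 * tp / (tp + fn) if tp + fn else 0.0
        f1s.append(f1_score(precision, recall))
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    micro_p = 100 * tp_sum / (tp_sum + fp_sum)
    micro_r = 100 * tp_sum / (tp_sum + fn_sum)
    return {"macro_f1": sum(f1s) / len(f1s), "micro_f1": f1_score(micro_p, micro_r)}


# Reproduce one row of the intent report: alarm_query, precision 100.00, recall 94.44.
print(round(f1_score(100.00, 94.44), 2))  # 97.14

# Made-up counts: one dominant label, one mid-sized label, one rare label never predicted correctly.
toy = {"O": (950, 20, 10), "date": (30, 5, 15), "song_name": (0, 5, 5)}
result = averages(toy)
print(f"micro F1: {result['micro_f1']:.2f}, macro F1: {result['macro_f1']:.2f}")
```

For single-label classification the pooled false positives equal the pooled false negatives, which is why micro precision, recall, and F1 coincide in the reports.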

## 1.4 (Optional) To train/test a GPT2 model on the assistant dataset, run the cell below

In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
# model.tokenizer.special_tokens="{pad_token:'<|endoftext|>'}": gpt2 doesn't specify a pad token, therefore using its EOS token as the pad token
# model.dataset.target_template=with_slots: this performs slot filling together with intent classification
!(python NeMo/examples/nlp/dialogue/dialogue.py \
 do_training=True \
 model.dataset.data_dir='./assistant' \
 model.dataset.dialogues_example_dir='./assistant_gpt2_examples' \
 model.dataset.task='assistant' \
 model.language_model.pretrained_model_name='gpt2' \
 trainer.max_epochs=1 \
 model.tokenizer.special_tokens="{pad_token:'<|endoftext|>'}" \
 model.dataset.target_template=with_slots \
 model.dataset.eval_mode=generation \
 exp_manager.create_wandb_logger=False)
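
The `pad_token` override matters because sequences in a batch must have equal length, and GPT2 has no dedicated padding token. The sketch below shows the general idea of padding with the EOS id and masking the padded positions; it is not NeMo's implementation. `pad_batch` is a hypothetical helper, the example token ids are arbitrary, right-padding is chosen only for simplicity, and 50256 is the id of `<|endoftext|>` in the GPT-2 vocabulary.

```python
GPT2_EOS_ID = 50256  # id of '<|endoftext|>' in the GPT-2 vocabulary


def pad_batch(sequences, pad_id=GPT2_EOS_ID):
    """Right-pad token id lists to equal length and build a matching attention mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding that the model should ignore.
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return input_ids, attention_mask


# Arbitrary token ids, for illustration only.
batch = [[10, 11, 12], [20]]
input_ids, attention_mask = pad_batch(batch)
print(input_ids)
print(attention_mask)
```

Because the pad id is the same as the EOS id, the attention mask is what distinguishes padding from a genuine end-of-text token.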

**After 1 epoch:**

Training for more epochs would likely improve these results.

Intent report:

 ```
 label precision recall f1 support 
 transport query (label_id: 0) 72.73 84.21 78.05 19
 weather query (label_id: 1) 94.74 94.74 94.74 19
 play game (label_id: 2) 92.86 68.42 78.79 19
 qa currency (label_id: 3) 100.00 100.00 100.00 19
 qa maths (label_id: 4) 100.00 100.00 100.00 14
 iot wemo off (label_id: 5) 75.00 100.00 85.71 9
 datetime convert (label_id: 6) 46.67 87.50 60.87 8
 email addcontact (label_id: 7) 70.00 87.50 77.78 8
 music likeness (label_id: 8) 57.89 61.11 59.46 18
 music query (label_id: 9) 78.57 57.89 66.67 19
 general negate (label_id: 10) 95.00 100.00 97.44 19
 email sendemail (label_id: 11) 92.86 68.42 78.79 19
 general affirm (label_id: 12) 95.00 100.00 97.44 19
 play audiobook (label_id: 13) 57.69 78.95 66.67 19
 general praise (label_id: 14) 100.00 94.74 97.30 19
 alarm set (label_id: 15) 85.71 94.74 90.00 19
 general explain (label_id: 16) 100.00 89.47 94.44 19
 iot wemo on (label_id: 17) 83.33 71.43 76.92 7
 cooking recipe (label_id: 18) 90.00 94.74 92.31 19
 music settings (label_id: 19) 60.00 42.86 50.00 7
 social post (label_id: 20) 84.21 84.21 84.21 19
 recommendation events (label_id: 21) 72.73 84.21 78.05 19
 audio volume up (label_id: 22) 76.47 100.00 86.67 13
 lists remove (label_id: 23) 73.08 100.00 84.44 19
 transport ticket (label_id: 24) 94.74 94.74 94.74 19
 general joke (label_id: 25) 100.00 100.00 100.00 12
 play podcasts (label_id: 26) 94.12 84.21 88.89 19
 iot hue lightchange (label_id: 27) 85.71 63.16 72.73 19
 audio volume mute (label_id: 28) 84.62 73.33 78.57 15
 general dontcare (label_id: 29) 95.00 100.00 97.44 19
 qa definition (label_id: 30) 77.27 89.47 82.93 19
 email querycontact (label_id: 31) 58.33 73.68 65.12 19
 general commandstop (label_id: 32) 100.00 100.00 100.00 19
 calendar remove (label_id: 33) 94.44 89.47 91.89 19
 news query (label_id: 34) 100.00 57.89 73.33 19
 calendar query (label_id: 35) 63.16 63.16 63.16 19
 social query (label_id: 36) 88.24 83.33 85.71 18
 transport traffic (label_id: 37) 90.48 100.00 95.00 19
 transport taxi (label_id: 38) 100.00 94.44 97.14 18
 alarm query (label_id: 39) 100.00 94.74 97.30 19
 iot hue lightoff (label_id: 40) 88.89 84.21 86.49 19
 takeaway order (label_id: 41) 81.25 68.42 74.29 19
 iot coffee (label_id: 42) 100.00 94.74 97.30 19
 recommendation movies (label_id: 43) 75.00 90.00 81.82 10
 iot hue lightup (label_id: 44) 78.57 78.57 78.57 14
 email query (label_id: 45) 85.71 94.74 90.00 19
 lists createoradd (label_id: 46) 82.35 73.68 77.78 19
 play radio (label_id: 47) 84.21 84.21 84.21 19
 audio volume down (label_id: 48) 100.00 87.50 93.33 8
 general quirky (label_id: 49) 30.00 15.79 20.69 19
 play music (label_id: 50) 71.43 52.63 60.61 19
 qa stock (label_id: 51) 90.48 100.00 95.00 19
 iot cleaning (label_id: 52) 93.33 87.50 90.32 16
 iot hue lightdim (label_id: 53) 100.00 100.00 100.00 12
 recommendation locations (label_id: 54) 100.00 89.47 94.44 19
 general repeat (label_id: 55) 100.00 100.00 100.00 19
 takeaway query (label_id: 56) 77.27 89.47 82.93 19
 alarm remove (label_id: 57) 100.00 100.00 100.00 11
 datetime query (label_id: 58) 75.00 63.16 68.57 19
 iot hue lighton (label_id: 59) 60.00 100.00 75.00 3
 qa factoid (label_id: 60) 50.00 57.89 53.66 19
 calendar set (label_id: 61) 75.00 78.95 76.92 19
 general confirm (label_id: 62) 100.00 100.00 100.00 19
 lists query (label_id: 63) 66.67 73.68 70.00 19
 label_id: 64 0.00 0.00 0.00 0
 -------------------
 micro avg 83.55 83.55 83.55 1076
 macro avg 83.53 83.93 83.01 1076
 weighted avg 84.26 83.55 83.30 1076
 
```

```
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Test metric DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 intent_f1 83.55018615722656
 intent_precision 83.55018615722656
 intent_recall 83.55018615722656
 slot_f1 73.99985919756773
slot_joint_goal_accuracy 65.89219330855019
 slot_precision 73.85223048327137
 slot_recall 74.14807930607186
 test_intent_accuracy 83.55018587360595
 test_loss_epoch 0.019178826361894608
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
```

# 2. Schema Guided Dialogue (SGD)

## 2.1 Task Description
---

SGD is a multi-domain intent classification dataset from Google with close to 100k examples.

An example is:

* utterance: I will be eating there at 11:30 am so make the reservation for then.
* intent: ReserveRestaurant
* slots: {"time": "11:30 am"}
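
Slot values in this example are substrings of the utterance, so each one can be located as a character span. The following sketch illustrates that with the example above; `slot_spans` is a hypothetical helper, and the `(start, exclusive_end)` convention is an assumption chosen for this illustration.

```python
utterance = "I will be eating there at 11:30 am so make the reservation for then."
intent = "ReserveRestaurant"
slots = {"time": "11:30 am"}


def slot_spans(utterance, slots):
    """Map each slot name to the (start, exclusive_end) character span of its value."""
    spans = {}
    for name, value in slots.items():
        start = utterance.find(value)
        if start != -1:  # skip values that do not appear verbatim
            spans[name] = (start, start + len(value))
    return spans


for name, (start, end) in slot_spans(utterance, slots).items():
    print(f"{name}: {utterance[start:end]!r} at [{start}, {end})")
```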




## 2.2 Download the dataset

In [None]:
!git clone https://github.com/google-research-datasets/dstc8-schema-guided-dialogue.git
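
Before training, it can help to check how intents are distributed in the downloaded data. The sketch below counts user-turn intents in one split. It assumes the layout published with the dataset: `dialogues_*.json` files under `train/`, each a list of dialogues whose `turns` contain `frames` with a `state.active_intent` field. These names are assumptions of this example, so adjust them if your copy differs; `count_active_intents` is a hypothetical helper.

```python
import glob
import json
import os
from collections import Counter


def count_active_intents(split_dir):
    """Count `active_intent` values over all user turns in one dataset split."""
    counts = Counter()
    for path in sorted(glob.glob(os.path.join(split_dir, "dialogues_*.json"))):
        with open(path, encoding="utf-8") as f:
            dialogues = json.load(f)
        for dialogue in dialogues:
            for turn in dialogue.get("turns", []):
                if turn.get("speaker") != "USER":
                    continue
                for frame in turn.get("frames", []):
                    intent = frame.get("state", {}).get("active_intent")
                    if intent:
                        counts[intent] += 1
    return counts


train_dir = "./dstc8-schema-guided-dialogue/train"
# Only run on the real data if the repository has been cloned.
if os.path.isdir(train_dir):
    print(count_active_intents(train_dir).most_common(5))
```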

## 2.3 Training and/or Testing the model


In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
# model.tokenizer.special_tokens="{pad_token:'<|endoftext|>'}": gpt2 doesn't specify a pad token, therefore using its EOS token as the pad token

!(python NeMo/examples/nlp/dialogue/dialogue.py \
 do_training=True \
 model.dataset.data_dir='./dstc8-schema-guided-dialogue' \
 model.dataset.dialogues_example_dir='./sgd_gpt2_predictions' \
 model.dataset.task='sgd' \
 model.language_model.pretrained_model_name='gpt2' \
 trainer.max_epochs=1 \
 model.tokenizer.special_tokens="{pad_token:'<|endoftext|>'}" \
 exp_manager.create_wandb_logger=False)


In [None]:
!ls sgd_gpt2_predictions

**After 1 epoch:**

More epochs would be needed to reach convergence.


```
 label precision recall f1 support 
 check balance (label_id: 0) 0.00 0.00 0.00 0
 find trains (label_id: 1) 80.20 91.95 85.68 348
 make payment (label_id: 2) 83.12 28.07 41.97 228
 book appointment (label_id: 3) 86.93 87.15 87.04 397
 get cars available (label_id: 4) 96.88 90.51 93.58 274
 get event dates (label_id: 5) 0.00 0.00 0.00 0
 buy bus ticket (label_id: 6) 78.61 91.33 84.49 173
 add event (label_id: 7) 0.00 0.00 0.00 0
 get alarms (label_id: 8) 58.33 77.78 66.67 45
 reserve car (label_id: 9) 83.75 72.43 77.68 185
 get events (label_id: 10) 0.00 0.00 0.00 0
 reserve roundtrip flights (label_id: 11) 0.00 0.00 0.00 0
 lookup music (label_id: 12) 89.83 86.89 88.33 61
 book house (label_id: 13) 91.13 92.50 91.81 200
 search oneway flight (label_id: 14) 74.77 47.70 58.25 174
 buy event tickets (label_id: 15) 72.19 95.31 82.15 128
 find apartment (label_id: 16) 0.00 0.00 0.00 0
 schedule visit (label_id: 17) 77.27 66.06 71.23 386
 play media (label_id: 18) 92.94 86.81 89.77 91
 get ride (label_id: 19) 99.41 98.82 99.12 170
 reserve oneway flight (label_id: 20) 0.00 0.00 0.00 0
 find bus (label_id: 21) 96.64 87.53 91.86 361
 find restaurants (label_id: 22) 77.14 91.22 83.59 148
 get times for movie (label_id: 23) 0.00 0.00 0.00 0
 transfer money (label_id: 24) 0.00 0.00 0.00 0
 request payment (label_id: 25) 46.71 63.39 53.79 112
 play movie (label_id: 26) 100.00 65.11 78.87 321
 search house (label_id: 27) 97.91 91.83 94.77 306
 search roundtrip flights (label_id: 28) 67.49 82.41 74.21 199
 find provider (label_id: 29) 95.11 90.53 92.77 602
 find attractions (label_id: 30) 100.00 89.01 94.19 91
 reserve hotel (label_id: 31) 56.75 97.04 71.62 169
 lookup song (label_id: 32) 0.00 0.00 0.00 0
 add alarm (label_id: 33) 95.68 60.18 73.89 221
 find home by area (label_id: 34) 48.95 59.79 53.83 194
 get available time (label_id: 35) 0.00 0.00 0.00 0
 buy movie tickets (label_id: 36) 100.00 29.39 45.42 473
 reserve restaurant (label_id: 37) 95.71 84.80 89.92 342
 find movies (label_id: 38) 62.40 97.61 76.14 335
 get weather (label_id: 39) 100.00 87.69 93.44 195
 search hotel (label_id: 40) 99.35 52.60 68.78 289
 find events (label_id: 41) 99.57 82.56 90.27 281
 play song (label_id: 42) 0.00 0.00 0.00 0
 rent movie (label_id: 43) 0.00 0.00 0.00 0
 get train tickets (label_id: 44) 45.83 5.56 9.91 198
 none (label_id: 45) 55.77 98.90 71.32 728
 label_id: 46 0.00 0.00 0.00 0
 -------------------
 micro avg 77.23 77.23 77.23 8425
 macro avg 82.01 76.68 76.56 8425
 weighted avg 83.23 77.23 76.86 8425

```

# 3. MS Marco

## 3.1 Task Description

MS Marco NLGen is a dataset from Microsoft in which a model takes a question and its extracted answer and outputs a fluent answer.

An example is:


* question: What county is Nine Mile in?
* extracted_answer: Onondaga
* fluent_answer: Nine Mile is in Onondaga county.


## 3.2 Download and unzip files

In [None]:
!mkdir ms_marco
os.chdir('ms_marco')
!wget https://msmarco.blob.core.windows.net/msmarco/train_v2.1.json.gz
!wget https://msmarco.blob.core.windows.net/msmarco/dev_v2.1.json.gz

!gunzip train_v2.1.json.gz
!gunzip dev_v2.1.json.gz

!python ../NeMo/examples/nlp/dialogue/remove_ms_marco_samples_without_wellFormedAnswers.py --filename train_v2.1.json 
!python ../NeMo/examples/nlp/dialogue/remove_ms_marco_samples_without_wellFormedAnswers.py --filename dev_v2.1.json 

os.chdir('..')
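
The filtering script keeps only samples that have a well-formed (fluent) answer. As a rough illustration of that idea, and not a copy of the NeMo script, the sketch below filters a column-oriented dictionary. It assumes each field maps sample ids to values and that a missing well-formed answer appears as an empty list or the string `"[]"`; the toy data and the helper `keep_well_formed` are hypothetical.

```python
def keep_well_formed(data):
    """Keep only the sample ids whose `wellFormedAnswers` entry is a non-empty list."""

    def has_answer(value):
        # A placeholder string such as "[]" is not a list, so it is dropped too.
        return isinstance(value, list) and len(value) > 0

    keep = [key for key, value in data["wellFormedAnswers"].items() if has_answer(value)]
    return {field: {key: column[key] for key in keep} for field, column in data.items()}


# Toy data for illustration only.
toy = {
    "query": {"0": "What county is Nine Mile in?", "1": "another question"},
    "answers": {"0": ["Onondaga"], "1": ["some answer"]},
    "wellFormedAnswers": {"0": ["Nine Mile is in Onondaga county."], "1": "[]"},
}
filtered = keep_well_formed(toy)
print(filtered["query"])
```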

## 3.3 Training and/or Testing the model


In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample

!(python NeMo/examples/nlp/dialogue/dialogue.py \
 do_training=True \
 model.dataset.dialogues_example_dir='./marco_bart_predictions' \
 model.dataset.data_dir='./ms_marco' \
 model.save_model=True \
 model.dataset.task='ms_marco' \
 model.language_model.pretrained_model_name='facebook/bart-base' \
 trainer.max_epochs=1 \
 model.dataset.debug_mode=False \
 exp_manager.create_wandb_logger=False)

**After 1 epoch:**

Train for more epochs to improve performance.

```
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Test metric DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 bleu 65.46179962158203
 f1 78.24439835896995
 precision 81.92473076099847
 recall 76.72508929408436
 test_accuracy 25.563487607283225
 test_loss 0.4419259166606655
 test_loss_epoch 0.4420809745788574
 test_ppl 1.5557004846779854
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
```
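
The reported `test_ppl` is consistent with perplexity being the exponential of the cross-entropy loss, which can be checked directly from the numbers in the table above. This is a small verification sketch using only those two reported values.

```python
import math

# Values copied from the test metrics table above.
test_loss = 0.4419259166606655
reported_ppl = 1.5557004846779854

# Perplexity as the exponential of the average cross-entropy loss.
ppl = math.exp(test_loss)
print(f"exp(test_loss) = {ppl:.6f}, reported test_ppl = {reported_ppl:.6f}")
```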