# ShareGPT4Video dataset For the text-to-video task, we sample 100 video captions from the ShareGPT4Video datset to feed to the diffusion model to generate videos. ## Filtering the dataset Download the dataset with captions and video paths. ```sh wget https://huggingface.co/datasets/ShareGPT4Video/ShareGPT4Video/resolve/main/sharegpt4video_40k.jsonl ``` Sample video-caption pairs. The sampled dataset will be saved under `sharegpt4video_100.json`. ```sh python sample.py ```