knockupabyss committed on
Commit ae96b0c · 1 Parent(s): 256c0c2

hopefully readme works

Files changed (1)
  1. README.md +9 -185
README.md CHANGED
@@ -1,185 +1,9 @@
- <p align="center">
-
- <h1 align="center">VACE: All-in-One Video Creation and Editing</h1>
- <p align="center">
- <strong>Zeyinzi Jiang<sup>*</sup></strong>
- ·
- <strong>Zhen Han<sup>*</sup></strong>
- ·
- <strong>Chaojie Mao<sup>*&dagger;</sup></strong>
- ·
- <strong>Jingfeng Zhang</strong>
- ·
- <strong>Yulin Pan</strong>
- ·
- <strong>Yu Liu</strong>
- <br>
- <b>Tongyi Lab - <a href="https://github.com/Wan-Video/Wan2.1"><img src='https://ali-vilab.github.io/VACE-Page/assets/logos/wan_logo.png' alt='wan_logo' style='margin-bottom: -4px; height: 20px;'></a> </b>
- <br>
- <br>
- <a href="https://arxiv.org/abs/2503.07598"><img src='https://img.shields.io/badge/VACE-arXiv-red' alt='Paper PDF'></a>
- <a href="https://ali-vilab.github.io/VACE-Page/"><img src='https://img.shields.io/badge/VACE-Project_Page-green' alt='Project Page'></a>
- <a href="https://huggingface.co/collections/ali-vilab/vace-67eca186ff3e3564726aff38"><img src='https://img.shields.io/badge/VACE-HuggingFace_Model-yellow'></a>
- <a href="https://modelscope.cn/collections/VACE-8fa5fcfd386e43"><img src='https://img.shields.io/badge/VACE-ModelScope_Model-purple'></a>
- <br>
- </p>
-
-
- ## Introduction
- <strong>VACE</strong> is an all-in-one model designed for video creation and editing. It encompasses various tasks, including reference-to-video generation (<strong>R2V</strong>), video-to-video editing (<strong>V2V</strong>), and masked video-to-video editing (<strong>MV2V</strong>), which users can compose freely. This streamlines workflows and unlocks a range of capabilities, such as Move-Anything, Swap-Anything, Reference-Anything, Expand-Anything, Animate-Anything, and more.
-
- <img src='./assets/materials/teaser.jpg'>
-
-
- ## 🎉 News
- - [x] Mar 31, 2025: 🔥VACE-Wan2.1-1.3B-Preview and VACE-LTX-Video-0.9 models are now available at [HuggingFace](https://huggingface.co/collections/ali-vilab/vace-67eca186ff3e3564726aff38) and [ModelScope](https://modelscope.cn/collections/VACE-8fa5fcfd386e43)!
- - [x] Mar 31, 2025: 🔥Released the code for model inference, preprocessing, and gradio demos.
- - [x] Mar 11, 2025: We propose [VACE](https://ali-vilab.github.io/VACE-Page/), an all-in-one model for video creation and editing.
-
-
- ## 🪄 Models
- | Models | Download Link | Video Size | License |
- |--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------|-----------------------------------------------------------------------------------------------|
- | VACE-Wan2.1-1.3B-Preview | [Huggingface](https://huggingface.co/ali-vilab/VACE-Wan2.1-1.3B-Preview) 🤗 [ModelScope](https://modelscope.cn/models/iic/VACE-Wan2.1-1.3B-Preview) 🤖 | ~ 81 x 480 x 832 | [Apache-2.0](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B/blob/main/LICENSE.txt) |
- | VACE-LTX-Video-0.9 | [Huggingface](https://huggingface.co/ali-vilab/VACE-LTX-Video-0.9) 🤗 [ModelScope](https://modelscope.cn/models/iic/VACE-LTX-Video-0.9) 🤖 | ~ 97 x 512 x 768 | [RAIL-M](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.license.txt) |
- | Wan2.1-VACE-1.3B | [To be released](https://github.com/Wan-Video) <img src='https://ali-vilab.github.io/VACE-Page/assets/logos/wan_logo.png' alt='wan_logo' style='margin-bottom: -4px; height: 15px;'> | ~ 81 x 480 x 832 | [Apache-2.0](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B/blob/main/LICENSE.txt) |
- | Wan2.1-VACE-14B | [To be released](https://github.com/Wan-Video) <img src='https://ali-vilab.github.io/VACE-Page/assets/logos/wan_logo.png' alt='wan_logo' style='margin-bottom: -4px; height: 15px;'> | ~ 81 x 720 x 1280 | [Apache-2.0](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/blob/main/LICENSE.txt) |
-
- - The input supports any resolution, but to achieve optimal results, the video size should fall within the ranges listed above (see the resizing sketch below).
- - All models inherit the license of the original model.
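-
- To bring an arbitrary input into one of these ranges, a plain ffmpeg one-liner works as a sketch (assuming ffmpeg is installed; the target size follows the 1.3B row above and is purely illustrative):
- ```bash
- # Resize an input clip to 832x480 for VACE-Wan2.1-1.3B-Preview.
- # Note: scale=832:480 forces the exact size and may change the aspect ratio.
- ffmpeg -i input.mp4 -vf "scale=832:480" -c:a copy input_480p.mp4
- ```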
-
-
- ## ⚙️ Installation
- The codebase was tested with Python 3.10.13, CUDA 12.4, and PyTorch >= 2.5.1.
-
- ### Setup for Model Inference
- You can set up VACE model inference by running:
- ```bash
- git clone https://github.com/ali-vilab/VACE.git && cd VACE
- pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124 # If PyTorch is not installed.
- pip install -r requirements.txt
- pip install wan@git+https://github.com/Wan-Video/Wan2.1 # If you want to use Wan2.1-based VACE.
- pip install ltx-video@git+https://github.com/Lightricks/[email protected] sentencepiece --no-deps # If you want to use LTX-Video-0.9-based VACE. It may conflict with Wan.
- ```
- Please download your preferred base model to `<repo-root>/models/`.
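-
- As a sketch, the preview checkpoint can be fetched with `huggingface-cli` (assuming `huggingface_hub` is installed; the target directory name just mirrors the layout below):
- ```bash
- # Download the Wan2.1 preview weights into <repo-root>/models/.
- huggingface-cli download ali-vilab/VACE-Wan2.1-1.3B-Preview --local-dir models/VACE-Wan2.1-1.3B-Preview
- ```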
-
- ### Setup for Preprocess Tools
- If you need preprocessing tools, please install:
- ```bash
- pip install -r requirements/annotator.txt
- ```
- Please download [VACE-Annotators](https://huggingface.co/ali-vilab/VACE-Annotators) to `<repo-root>/models/`.
-
- ### Local Directories Setup
- It is recommended to download [VACE-Benchmark](https://huggingface.co/datasets/ali-vilab/VACE-Benchmark) to `<repo-root>/benchmarks/`, which provides the examples used in `run_vace_xxx.sh`.
-
- We recommend organizing local directories as:
- ```
- VACE
- ├── ...
- ├── benchmarks
- │   └── VACE-Benchmark
- │       └── assets
- │           └── examples
- │               ├── animate_anything
- │               │   └── ...
- │               └── ...
- ├── models
- │   ├── VACE-Annotators
- │   │   └── ...
- │   ├── VACE-LTX-Video-0.9
- │   │   └── ...
- │   └── VACE-Wan2.1-1.3B-Preview
- │       └── ...
- └── ...
- ```
-
- ## 🚀 Usage
- In VACE, users can input a **text prompt** and an optional **video**, **mask**, and **image** for video generation or editing.
- Detailed instructions for using VACE can be found in the [User Guide](./UserGuide.md).
-
- ### Inference CLI
- #### 1) End-to-End Running
- To run VACE without diving into any implementation details, we suggest the end-to-end pipeline. For example:
- ```bash
- # run V2V depth
- python vace/vace_pipeline.py --base wan --task depth --video assets/videos/test.mp4 --prompt 'xxx'
-
- # run MV2V inpainting by providing bbox
- python vace/vace_pipeline.py --base wan --task inpainting --mode bbox --bbox 50,50,550,700 --video assets/videos/test.mp4 --prompt 'xxx'
- ```
- This script runs video preprocessing and model inference sequentially; you need to specify all the required arguments for preprocessing (`--task`, `--mode`, `--bbox`, `--video`, etc.) and inference (`--prompt`, etc.).
- The output video, together with the intermediate video, mask, and images, will be saved to `./results/` by default.
-
- > 💡**Note**:
- > Please refer to [run_vace_pipeline.sh](./run_vace_pipeline.sh) for usage examples of different task pipelines.
-
-
- #### 2) Preprocessing
- For more flexible control over the input, user inputs need to be preprocessed into `src_video`, `src_mask`, and `src_ref_images` before VACE model inference.
- We assign each [preprocessor](./vace/configs/__init__.py) a task name, so simply call [`vace_preproccess.py`](./vace/vace_preproccess.py) and specify the task name and task params. For example:
- ```bash
- # process video depth
- python vace/vace_preproccess.py --task depth --video assets/videos/test.mp4
-
- # process video inpainting by providing bbox
- python vace/vace_preproccess.py --task inpainting --mode bbox --bbox 50,50,550,700 --video assets/videos/test.mp4
- ```
- The outputs will be saved to `./processed/` by default.
-
- > 💡**Note**:
- > Please refer to [run_vace_pipeline.sh](./run_vace_pipeline.sh) for the preprocessing methods of different tasks.
- > Moreover, refer to [vace/configs/](./vace/configs/) for all the pre-defined tasks and required params.
- > You can also customize preprocessors by implementing them in [`annotators`](./vace/annotators/__init__.py) and registering them in [`configs`](./vace/configs).
-
-
- #### 3) Model Inference
- Using the input data obtained from **Preprocessing**, the model inference process can be performed as follows:
- ```bash
- # For Wan2.1 single GPU inference
- python vace/vace_wan_inference.py --ckpt_dir <path-to-model> --src_video <path-to-src-video> --src_mask <path-to-src-mask> --src_ref_images <paths-to-src-ref-images> --prompt "xxx"
-
- # For Wan2.1 Multi GPU Acceleration inference
- pip install "xfuser>=0.4.1"
- torchrun --nproc_per_node=8 vace/vace_wan_inference.py --dit_fsdp --t5_fsdp --ulysses_size 1 --ring_size 8 --ckpt_dir <path-to-model> --src_video <path-to-src-video> --src_mask <path-to-src-mask> --src_ref_images <paths-to-src-ref-images> --prompt "xxx"
-
- # For LTX inference, run
- python vace/vace_ltx_inference.py --ckpt_path <path-to-model> --text_encoder_path <path-to-model> --src_video <path-to-src-video> --src_mask <path-to-src-mask> --src_ref_images <paths-to-src-ref-images> --prompt "xxx"
- ```
- The output video, together with the intermediate video, mask, and images, will be saved to `./results/` by default.
-
- > 💡**Note**:
- > (1) Please refer to [vace/vace_wan_inference.py](./vace/vace_wan_inference.py) and [vace/vace_ltx_inference.py](./vace/vace_ltx_inference.py) for the inference args.
- > (2) For LTX-Video and English-language Wan2.1 users, prompt extension is needed to unlock the full model performance.
- > Please follow the [instruction of Wan2.1](https://github.com/Wan-Video/Wan2.1?tab=readme-ov-file#2-using-prompt-extension) and set `--use_prompt_extend` while running inference.
- > (3) When using prompt extension in editing tasks, be careful with the results of extending plain text: since the extender cannot see the input visuals, the extended prompt may not match the video being edited, which can affect the final result.
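-
- As a sketch, note (2) amounts to appending the flag to the inference command shown above (other args unchanged):
- ```bash
- python vace/vace_wan_inference.py --ckpt_dir <path-to-model> --src_video <path-to-src-video> --prompt "xxx" --use_prompt_extend
- ```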
-
- ### Inference Gradio
- For preprocessors, run
- ```bash
- python vace/gradios/preprocess_demo.py
- ```
- For model inference, run
- ```bash
- # For Wan2.1 gradio inference
- python vace/gradios/vace_wan_demo.py
-
- # For LTX gradio inference
- python vace/gradios/vace_ltx_demo.py
- ```
-
- ## Acknowledgement
-
- We are grateful for the following awesome projects: [Scepter](https://github.com/modelscope/scepter), [Wan](https://github.com/Wan-Video/Wan2.1), and [LTX-Video](https://github.com/Lightricks/LTX-Video).
-
-
- ## BibTeX
-
- ```bibtex
- @article{vace,
-     title = {VACE: All-in-One Video Creation and Editing},
-     author = {Jiang, Zeyinzi and Han, Zhen and Mao, Chaojie and Zhang, Jingfeng and Pan, Yulin and Liu, Yu},
-     journal = {arXiv preprint arXiv:2503.07598},
-     year = {2025}
- }
- ```

+ ---
+ title: vace-demo
+ emoji: 😭
+ colorFrom: red
+ colorTo: indigo
+ sdk: gradio
+ ---
+
+ vace demo