Flat Color - Style

Prompt
flat color, no lineart, blending, negative space, artist:[john kafka|ponsuke kaikai|hara id 21|yoneyama mai|fuzichoco], 1girl, hoshimachi suisei, virtual youtuber, blue hair, side ponytail, cowboy shot, black shirt, star print, off shoulder, outdoors, starry sky, wariza, looking up, half-closed eyes, black sky, live2d animation, upper body, high quality cinematic video of a woman sitting under the starry night sky. The Camera is steady, This is a cowboy shot. The animation is smooth and fluid.

Negative Prompt
bad quality video,色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走

Prompt
flat color, no lineart, blending, negative space, artist:[john kafka|ponsuke kaikai|hara id 21|yoneyama mai|fuzichoco], 1girl, sakura miko, pink hair, cowboy shot, white shirt, floral print, off shoulder, outdoors, cherry blossom, tree shade, wariza, looking up, falling petals, half-closed eyes, white sky, clouds, live2d animation, upper body, high quality cinematic video of a woman sitting under a sakura tree. Dreamy and lonely, the camera close-ups on the face of the woman as she turns towards the viewer. The Camera is steady, This is a cowboy shot. The animation is smooth and fluid.

Negative Prompt
bad quality video,色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走

Model description

Flat Color - Style: trained on images with flat colors, no visible lineart, and little to no indication of depth.

Reprinted from CivitAI: https://civitai.com/models/1132089?modelVersionId=1525407

Text-to-video previews were generated with the ComfyUI_examples/wan/#text-to-video workflow.

The LoRA was loaded with the LoraLoaderModelOnly node on top of the fp16 1.3B checkpoint, wan2.1_t2v_1.3B_fp16.safetensors.

Trigger words

You should use flat color to trigger the image generation.

You should use no lineart to trigger the image generation.
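
As a small illustration of the trigger words above, the following sketch prepends any trigger word that is missing from a comma-separated prompt. The helper name `with_trigger_words` is hypothetical and is not part of ComfyUI or of this model's tooling; it only does string handling.

```python
# Hypothetical helper, shown only to illustrate how the trigger words are used.
TRIGGER_WORDS = ("flat color", "no lineart")


def with_trigger_words(prompt: str, triggers=TRIGGER_WORDS) -> str:
    """Prepend each trigger word that is not already a comma-separated tag."""
    prompt = prompt.strip()
    tags = [tag.strip().lower() for tag in prompt.split(",")]
    missing = [t for t in triggers if t not in tags]
    parts = missing + ([prompt] if prompt else [])
    return ", ".join(parts)


print(with_trigger_words("1girl, blue hair, starry sky"))
```

A prompt that already starts with both trigger words, like the two example prompts at the top of this card, is returned unchanged.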

Download model

Weights for this model are available in Safetensors format.

Download them in the Files & versions tab.

Training Config

Trained with diffusion-pipe

dataset.toml

# Resolution settings.
resolutions = [512]

# Aspect ratio bucketing settings
enable_ar_bucket = true
min_ar = 0.5
max_ar = 2.0
num_ar_buckets = 7

# Frame buckets (1 is for images)
frame_buckets = [1]

[[directory]] # IMAGES
# Path to the directory containing images and their corresponding caption files.
path = '/mnt/d/huanvideo/training_data/images'
num_repeats = 5
resolutions = [720]
frame_buckets = [1] # Use 1 frame for images.

[[directory]] # VIDEOS
# Path to the directory containing videos and their corresponding caption files.
path = '/mnt/d/huanvideo/training_data/videos'
num_repeats = 5
resolutions = [512] # Set video resolution to 512.
frame_buckets = [6, 28, 31, 32, 36, 42, 43, 48, 50, 53]
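
To make the bucketing settings in dataset.toml more concrete, here is a rough sketch of how aspect-ratio and frame buckets could be derived from those values. It assumes the aspect-ratio buckets are evenly spaced in log space between min_ar and max_ar, and that a video goes into the longest frame bucket it can fill. The function names are hypothetical, and diffusion-pipe's actual implementation may differ in its details.

```python
import math
from typing import List, Optional


def log_spaced_ar_buckets(min_ar: float, max_ar: float, num_buckets: int) -> List[float]:
    """Return aspect ratios (width / height) evenly spaced in log space."""
    if num_buckets == 1:
        return [min_ar]
    step = (math.log(max_ar) - math.log(min_ar)) / (num_buckets - 1)
    return [math.exp(math.log(min_ar) + i * step) for i in range(num_buckets)]


def nearest_ar_bucket(width: int, height: int, buckets: List[float]) -> float:
    """Pick the bucket closest to the sample's aspect ratio, measured in log space."""
    log_ar = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b) - log_ar))


def frame_bucket_for(num_frames: int, frame_buckets: List[int]) -> Optional[int]:
    """Pick the longest frame bucket the clip can fill, or None if it is too short."""
    fitting = [b for b in sorted(frame_buckets) if b <= num_frames]
    return fitting[-1] if fitting else None


# Values taken from dataset.toml above.
ar_buckets = log_spaced_ar_buckets(0.5, 2.0, 7)
video_frame_buckets = [6, 28, 31, 32, 36, 42, 43, 48, 50, 53]

print([round(b, 3) for b in ar_buckets])
print(frame_bucket_for(40, video_frame_buckets))
```

Under these assumptions the seven aspect-ratio buckets come out at roughly 0.5, 0.63, 0.79, 1.0, 1.26, 1.59, and 2.0, and a 40-frame clip lands in the 36-frame bucket.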

config.toml

# Output directory and dataset config file.
output_dir = '/mnt/d/wan/training_output'
dataset = 'dataset.toml'

# Training settings
epochs = 50
micro_batch_size_per_gpu = 1
pipeline_stages = 1
gradient_accumulation_steps = 4
gradient_clipping = 1.0
warmup_steps = 100

# eval settings
eval_every_n_epochs = 5
eval_before_first_step = true
eval_micro_batch_size_per_gpu = 1
eval_gradient_accumulation_steps = 1

# misc settings
save_every_n_epochs = 5
checkpoint_every_n_minutes = 30
activation_checkpointing = true
partition_method = 'parameters'
save_dtype = 'bfloat16'
caching_batch_size = 1
steps_per_print = 1
video_clip_mode = 'single_middle'

[model]
type = 'wan'
ckpt_path = '../Wan2.1-T2V-1.3B'
dtype = 'bfloat16'
# You can use fp8 for the transformer when training LoRA.
transformer_dtype = 'float8'
timestep_sample_method = 'logit_normal'

[adapter]
type = 'lora'
rank = 32
dtype = 'bfloat16'

[optimizer]
type = 'adamw_optimi'
lr = 5e-5
betas = [0.9, 0.99]
weight_decay = 0.02
eps = 1e-8
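
The batch settings in config.toml combine into an effective batch size per optimizer step. The sketch below follows the usual DeepSpeed relation (micro batch × gradient accumulation × data-parallel size). The number of GPUs is not stated in this card, so `num_gpus=1` is an assumption, as is the reading of warmup_steps as optimizer steps. The function name is hypothetical.

```python
def effective_batch_size(micro_batch_size_per_gpu: int,
                         gradient_accumulation_steps: int,
                         num_gpus: int = 1,
                         pipeline_stages: int = 1) -> int:
    """Number of samples that contribute to one optimizer step."""
    if num_gpus % pipeline_stages != 0:
        raise ValueError("num_gpus must be divisible by pipeline_stages")
    # GPUs that share one pipeline do not add to the data-parallel size.
    data_parallel_size = num_gpus // pipeline_stages
    return micro_batch_size_per_gpu * gradient_accumulation_steps * data_parallel_size


# Values from config.toml above; num_gpus=1 is an assumption.
batch = effective_batch_size(1, 4, num_gpus=1, pipeline_stages=1)
warmup_samples = 100 * batch  # samples seen during warmup, if warmup_steps counts optimizer steps

print(batch, warmup_samples)
```

With a single GPU this gives an effective batch of 4 samples per optimizer step, so the 100 warmup steps would cover about 400 samples.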

motimalu/wan-flat-color-1.3b-v2