Zedong Wang (Jacky)

ZedongWangAI

https://jacky1128.github.io

AI & ML interests

Computer Vision, Multi-task Learning, Multi-modal Learning, Optimizers in the era of (M)LLMs.

ZedongWangAI's activity

upvoted 3 papers 12 days ago

Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation

Paper • 2504.17207 • Published 14 days ago • 29

Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

Paper • 2504.17789 • Published 14 days ago • 23

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

Paper • 2504.17432 • Published 14 days ago • 38

upvoted 2 papers about 1 month ago

AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

Paper • 2504.01014 • Published Apr 1 • 68

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

Paper • 2504.02542 • Published Apr 3 • 44

commented on a paper about 1 month ago

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Paper • 2504.00999 • Published Apr 1 • 89

authored 2 papers about 1 month ago

Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup

Paper • 2111.15454 • Published Nov 30, 2021

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Paper • 2504.00999 • Published Apr 1 • 89

commented on a paper 3 times about 1 month ago

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Paper • 2504.00999 • Published Apr 1 • 89

updated a collection about 1 month ago

Representation Learning & Generation

Collection

8 items • Updated Apr 3 • 1

upvoted a paper about 1 month ago

Scaling Language-Free Visual Representation Learning

Paper • 2504.01017 • Published Apr 1 • 29

upvoted a paper about 1 month ago

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Paper • 2504.00999 • Published Apr 1 • 89