SheldonXu

Sheldoooon

https://sheldontsui.github.io/

SheldonTsui

AI & ML interests

3D-aware image synthesis

Organizations

None yet

Sheldoooon's activity

upvoted a paper 11 days ago

DiMeR: Disentangled Mesh Reconstruction Model

Paper • 2504.17670 • Published 14 days ago • 23

upvoted a paper 13 days ago

PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models

Paper • 2504.16074 • Published 16 days ago • 35

upvoted 5 papers 20 days ago

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

Paper • 2504.11468 • Published 28 days ago • 28

BlockGaussian: Efficient Large-Scale Scene Novel View Synthesis via Adaptive Block-Based Gaussian Splatting

Paper • 2504.09048 • Published 26 days ago • 8

SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL

Paper • 2504.11455 • Published 23 days ago • 13

NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors

Paper • 2504.11427 • Published 23 days ago • 17

Seedream 3.0 Technical Report

Paper • 2504.11346 • Published 23 days ago • 54

upvoted 2 papers 23 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 24 days ago • 255

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Paper • 2504.08727 • Published 27 days ago • 11

upvoted 3 papers 24 days ago

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published 27 days ago • 123

HoloPart: Generative 3D Part Amodal Segmentation

Paper • 2504.07943 • Published 28 days ago • 29

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published 30 days ago • 159

upvoted 8 papers about 1 month ago

EvMic: Event-based Non-contact sound recovery from effective spatial-temporal modeling

Paper • 2504.02402 • Published Apr 3 • 6

Articulated Kinematics Distillation from Video Diffusion Models

Paper • 2504.01204 • Published Apr 1 • 24

SketchVideo: Sketch-based Video Generation and Editing

Paper • 2503.23284 • Published Mar 30 • 23

TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization

Paper • 2503.19901 • Published Mar 25 • 41

MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs

Paper • 2503.23022 • Published Mar 29 • 7

SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling

Paper • 2503.21732 • Published Mar 27 • 9

Segment Any Motion in Videos

Paper • 2503.22268 • Published Mar 28 • 17

Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging

Paper • 2503.22236 • Published Mar 28 • 11