Rex Zeng

stdKonjac

https://stdkonjac.icu/

AI & ML interests

Computer Vision

Recent Activity

new activity 14 days ago

stdKonjac/LiveSports-3K: Add task category, paper link, and link to github.

upvoted a paper 15 days ago

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

liked a dataset 15 days ago

chenjoya/Live-WhisperX-526K

Organizations

None yet

stdKonjac's activity

New activity in stdKonjac/LiveSports-3K 14 days ago

Add task category, paper link, and link to github.

#3 opened 14 days ago by nielsr

upvoted a paper 15 days ago

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Paper • 2504.16030 • Published 15 days ago • 34

liked 2 datasets 15 days ago

chenjoya/Live-WhisperX-526K

Preview • Updated 5 days ago • 5.23k • 3

chenjoya/Live-CC-5M

Preview • Updated 5 days ago • 147 • 4

liked 2 models 15 days ago

chenjoya/LiveCC-7B-Base

Updated 12 days ago • 789 • 5

chenjoya/LiveCC-7B-Instruct

Updated 12 days ago • 6.44k • 33

upvoted a collection 15 days ago

LiveCC

Collection

Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025) • 8 items • Updated 15 days ago • 4

New activity in stdKonjac/LiveSports-3K 15 days ago

Update README.md

#2 opened 15 days ago by chenjoya

liked a dataset 17 days ago

stdKonjac/LiveSports-3K

Viewer • Updated 14 days ago • 2.88k • 457 • 4

updated a dataset 17 days ago

stdKonjac/LiveSports-3K

Viewer • Updated 14 days ago • 2.88k • 457 • 4

published a dataset 19 days ago

stdKonjac/LiveSports-3K

Viewer • Updated 14 days ago • 2.88k • 457 • 4

upvoted a paper about 2 months ago

Automated Movie Generation via Multi-Agent CoT Planning

Paper • 2503.07314 • Published Mar 10 • 45

upvoted 2 papers 3 months ago

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11 • 45

WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

Paper • 2502.08047 • Published Feb 12 • 27

upvoted a paper 5 months ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 87

upvoted a paper 6 months ago

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 72