RegionGPT: Towards Region Understanding Vision Language Model Paper • 2403.02330 • Published Mar 4, 2024 • 2
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation Paper • 2411.08380 • Published Nov 13, 2024 • 27
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs Paper • 2410.05265 • Published Oct 7, 2024 • 31
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Paper • 2405.07990 • Published May 13, 2024 • 21
EGC: Image Generation and Classification via a Diffusion Energy-Based Model Paper • 2304.02012 • Published Apr 4, 2023 • 1
RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs Paper • 2308.07228 • Published Aug 14, 2023 • 10
RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths Paper • 2305.18295 • Published May 29, 2023 • 7