Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images using a View-based Representation Paper • 2003.14166 • Published Mar 23, 2020
Knowledge Hypergraph Embedding Meets Relational Algebra Paper • 2102.09557 • Published Feb 18, 2021
StarVector: Generating Scalable Vector Graphics Code from Images Paper • 2312.11556 • Published Dec 17, 2023 • 35
Capture the Flag: Uncovering Data Insights with Large Language Models Paper • 2312.13876 • Published Dec 21, 2023 • 1
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? Paper • 2403.07718 • Published Mar 12, 2024 • 2
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content Paper • 2406.11811 • Published Jun 17, 2024 • 16
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Paper • 2412.04626 • Published Dec 5, 2024 • 14
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation Paper • 2407.06423 • Published Jul 8, 2024
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction Paper • 2503.15661 • Published Mar 19, 2025 • 1
StarFlow: Generating Structured Workflow Outputs From Sketch Images Paper • 2503.21889 • Published Mar 27, 2025 • 1
Distilling semantically aware orders for autoregressive image generation Paper • 2504.17069 • Published 15 days ago • 5
💫 StarVector Models Collection • StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20 • 93
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3, 2025 • 39