PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines Paper • 2504.14738 • Published 17 days ago • 5
NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning Paper • 2504.13941 • Published 22 days ago • 7
TAPIP3D: Tracking Any Point in Persistent 3D Geometry Paper • 2504.14717 • Published 17 days ago • 8
RainbowPlus: Enhancing Adversarial Prompt Generation via Evolutionary Quality-Diversity Search Paper • 2504.15047 • Published 16 days ago • 6
DRAGON: Distributional Rewards Optimize Diffusion Generative Models Paper • 2504.15217 • Published 16 days ago • 11
LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark Paper • 2504.13805 • Published 19 days ago • 12
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners Paper • 2504.14239 • Published 19 days ago • 13
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation Paper • 2504.14899 • Published 17 days ago • 20
LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs Paper • 2504.14655 • Published 17 days ago • 19
EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models Paper • 2504.15133 • Published 16 days ago • 21
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs Paper • 2504.15280 • Published 16 days ago • 23
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models Paper • 2504.13367 • Published 20 days ago • 24
StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians Paper • 2504.15281 • Published 16 days ago • 23
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents Paper • 2504.13203 • Published 22 days ago • 31