PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models Paper • 2504.16074 • Published 16 days ago • 35