- LucasThil/randomized_clean_miniwob_episodes__image0_5000_v2
  Viewer • Updated • 2.5k • 17
- LucasThil/miniwob_plusplus_hierarchical_training_actions_drain
  Viewer • Updated • 40.2k • 57 • 1
- DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
  Paper • 2503.22677 • Published • 6
- MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
  Paper • 2503.23022 • Published • 7
Collections including paper arxiv:2309.10150
Collection 1
- Trusted Source Alignment in Large Language Models
  Paper • 2311.06697 • Published • 12
- Diffusion Model Alignment Using Direct Preference Optimization
  Paper • 2311.12908 • Published • 50
- SuperHF: Supervised Iterative Learning from Human Feedback
  Paper • 2310.16763 • Published • 1
- Enhancing Diffusion Models with Text-Encoder Reinforcement Learning
  Paper • 2311.15657 • Published • 2
Collection 2
- Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
  Paper • 2309.10150 • Published • 25
- Code as Policies: Language Model Programs for Embodied Control
  Paper • 2209.07753 • Published • 1
- Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
  Paper • 2402.10211 • Published • 14
Collection 3
- Moral Foundations of Large Language Models
  Paper • 2310.15337 • Published • 1
- Specific versus General Principles for Constitutional AI
  Paper • 2310.13798 • Published • 3
- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 25
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 49
Collection 4
- AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
  Paper • 2309.16414 • Published • 19
- Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
  Paper • 2309.13018 • Published • 9
- Robust Speech Recognition via Large-Scale Weak Supervision
  Paper • 2212.04356 • Published • 31
- Language models in molecular discovery
  Paper • 2309.16235 • Published • 10