- LucasThil/randomized_clean_miniwob_episodes__image0_5000_v2
  Viewer • Updated • 2.5k • 17
- LucasThil/miniwob_plusplus_hierarchical_training_actions_drain
  Viewer • Updated • 40.2k • 57 • 1
- DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
  Paper • 2503.22677 • Published • 6
- MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
  Paper • 2503.23022 • Published • 7
Collections including paper arxiv:2309.10150
Collection 1
- Trusted Source Alignment in Large Language Models
  Paper • 2311.06697 • Published • 12
- Diffusion Model Alignment Using Direct Preference Optimization
  Paper • 2311.12908 • Published • 50
- SuperHF: Supervised Iterative Learning from Human Feedback
  Paper • 2310.16763 • Published • 1
- Enhancing Diffusion Models with Text-Encoder Reinforcement Learning
  Paper • 2311.15657 • Published • 2
Collection 2
- Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
  Paper • 2309.10150 • Published • 25
- Code as Policies: Language Model Programs for Embodied Control
  Paper • 2209.07753 • Published • 1
- Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
  Paper • 2402.10211 • Published • 14
Collection 3
- Moral Foundations of Large Language Models
  Paper • 2310.15337 • Published • 1
- Specific versus General Principles for Constitutional AI
  Paper • 2310.13798 • Published • 3
- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 25
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 49
Collection 4
- AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
  Paper • 2309.16414 • Published • 19
- Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
  Paper • 2309.13018 • Published • 9
- Robust Speech Recognition via Large-Scale Weak Supervision
  Paper • 2212.04356 • Published • 31
- Language models in molecular discovery
  Paper • 2309.16235 • Published • 10