- I-Con: A Unifying Framework for Representation Learning
  Paper • 2504.16929 • Published • 30
- LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
  Paper • 2504.16078 • Published • 20
- WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
  Paper • 2504.15785 • Published • 19
- OTC: Optimal Tool Calls via Reinforcement Learning
  Paper • 2504.14870 • Published • 33
Collections including paper arxiv:2402.09353
- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 94
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
  Paper • 2404.10667 • Published • 19
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 27
- DoRA: Weight-Decomposed Low-Rank Adaptation
  Paper • 2402.09353 • Published • 27
- Scalable Diffusion Models with Transformers
  Paper • 2212.09748 • Published • 18
- Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
  Paper • 2311.15127 • Published • 13
- Learning Transferable Visual Models From Natural Language Supervision
  Paper • 2103.00020 • Published • 11
- U-Net: Convolutional Networks for Biomedical Image Segmentation
  Paper • 1505.04597 • Published • 9
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
  Paper • 1701.06538 • Published • 6
- Attention Is All You Need
  Paper • 1706.03762 • Published • 61
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Paper • 2005.11401 • Published • 11
- Language Model Evaluation Beyond Perplexity
  Paper • 2106.00085 • Published
- Efficient Few-Shot Learning Without Prompts
  Paper • 2209.11055 • Published • 3
- Parameter-Efficient Transfer Learning for NLP
  Paper • 1902.00751 • Published • 1
- GPT Understands, Too
  Paper • 2103.10385 • Published • 9
- The Power of Scale for Parameter-Efficient Prompt Tuning
  Paper • 2104.08691 • Published • 10
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 615
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 189
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 57
- ResLoRA: Identity Residual Mapping in Low-Rank Adaption
  Paper • 2402.18039 • Published • 11