Collections

Discover the best community collections!

Collections including paper arxiv:2402.17764

🔥BitNet family of large language models (1-bit LLMs).

microsoft/bitnet-b1.58-2B-4T

Text Generation • Updated 8 days ago • 84.8k • 948
microsoft/bitnet-b1.58-2B-4T-bf16

Text Generation • Updated 8 days ago • 3.64k • 25
microsoft/bitnet-b1.58-2B-4T-gguf

Text Generation • Updated 8 days ago • 35.3k • 159
BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published 22 days ago • 70

about 15 hours ago

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 148
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20, 2024 • 13
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 59
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 49

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 39
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 78
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 83

a collection of articles summarized and synthesized

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 615

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 367
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 615
meta-llama/Llama-4-Scout-17B-16E-Instruct

Image-Text-to-Text • Updated 30 days ago • 829k • 878
keras-io/GauGAN-Image-generation

Updated Jul 5, 2024 • 34 • 4

HiDream-ai/HiDream-I1-Full

Text-to-Image • Updated 10 days ago • 40k • 826
nvidia/Llama-Nemotron-Post-Training-Dataset

Viewer • Updated about 6 hours ago • 3.91M • 11.3k • 470
Running

6.17k

DeepSite

🐳

Generate any application with DeepSeek
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 615

meta-llama/Llama-4-Scout-17B-16E-Instruct

Image-Text-to-Text • Updated 30 days ago • 829k • 878
nvidia/Llama-Nemotron-Post-Training-Dataset

Viewer • Updated about 6 hours ago • 3.91M • 11.3k • 470
Running

6.17k

DeepSite

🐳

Generate any application with DeepSeek
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 615

ESP32_TF_MobileDetSSDLite_SETUP_AND_TRAINING

This collection gives an overview of how to create an environment, set up TF, and convert a model to TF_LITE and then to ".cc" for running it on an ESP32-S3 device.

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 615

WebLINX: Real-World Website Navigation with Multi-Turn Dialogue

Paper • 2402.05930 • Published Feb 8, 2024 • 40
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 615

open-r1/codeforces-cots

Viewer • Updated Mar 28 • 254k • 7.16k • 159
inclusionAI/Ling-Coder-SyntheticQA

Viewer • Updated Mar 27 • 21.8M • 1.16k • 11
deepmind/code_contests

Viewer • Updated Jun 11, 2023 • 4.04k • 11.6k • 166
xingyaoww/code-act

Viewer • Updated Feb 5, 2024 • 78.4k • 315 • 66
