BAAI's Collections

CCI

updated 9 days ago

Chinese Corpora Internet (中文互联网语料, "Chinese internet corpus")

  • BAAI/CCI3-HQ

    Viewer • Updated Nov 11, 2024 • 54.8M • 3.27k • 41

  • BAAI/CCI3-HQ-Classifier

    Updated Oct 28, 2024 • 4 • 8

  • BAAI/CCI3-HQ-Annotation-Benchmark

    Viewer • Updated Oct 28, 2024 • 14.1k • 93 • 4

  • BAAI/CCI3-Data

    Updated Nov 11, 2024 • 3.76k • 29

  • CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for pre-training large language models

    Paper • 2410.18505 • Published Oct 24, 2024 • 11

  • BAAI/CCI3-HQ-Intermediate-Checkpoints

    Updated Oct 28, 2024 • 2

  • BAAI/CCI-Data

    Updated Dec 17, 2024 • 22 • 67

  • BAAI/CCI2-Data

    Viewer • Updated Dec 17, 2024 • 179M • 2.58k • 49

  • ldwang/OpenHermes-2.5-zh

    Preview • Updated Sep 2, 2024 • 79 • 1

  • ldwang/lighteval-ceval-exam

    Updated Nov 14, 2024 • 8

  • ldwang/lighteval-cmmlu

    Updated Aug 13, 2024 • 8