Cuiunbo's Collections

VLM For OCR

Updated Jun 29, 2024 • 3 upvotes

  • Qwen/Qwen-VL

    Text Generation • Updated Jan 25, 2024 • 20.3k • 243

  • google/pix2struct-large

    Image-to-Text • Updated Sep 6, 2023 • 26.3k • 34

  • THUDM/cogagent-chat-hf

    Text Generation • Updated Dec 24, 2024 • 406 • 69

  • openbmb/MiniCPM-Llama3-V-2_5

    Image-Text-to-Text • Updated Jan 15 • 32.9k • 1.4k

  • google/paligemma-3b-pt-896

    Image-Text-to-Text • Updated Jul 19, 2024 • 1.37k • 117

  • UCSC-VLAA/Recap-DataComp-1B

    Viewer • Updated Jan 9 • 1.88B • 16.9k • 169

  • WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

    Paper • 2406.11069 • Published Jun 16, 2024 • 14

  • pbevan11/synthetic-ocr-correction-gpt4o

    Viewer • Updated Jul 25, 2024 • 10k • 19 • 5

  • yifeihu/ACL-23-Paper-OCR-Markdown

    Viewer • Updated Jun 8, 2024 • 2.15k • 104 • 17

  • LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

    Paper • 2406.15319 • Published Jun 21, 2024 • 65