Julius-L's Collections

multimodal dataset

updated Jan 20

  • BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

    Paper • 2412.04626 • Published Dec 5, 2024 • 14

  • GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI

    Paper • 2411.14522 • Published Nov 21, 2024 • 39

  • Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

    Paper • 2411.03823 • Published Nov 6, 2024 • 49

  • Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

    Paper • 2410.18558 • Published Oct 24, 2024 • 20

  • Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

    Paper • 2501.05767 • Published Jan 10, 2025 • 30

  • Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

    Paper • 2412.05271 • Published Dec 6, 2024 • 157