Rapidata

Enterprise

company

AI & ML interests

RLHF, Model Evaluation, Benchmarks, Data Labeling, Human Feedback, Computer Vision, Image Generation, Video Generation, LLMs, Translations

Recent Activity

LinoGiger updated a dataset 10 days ago

Rapidata/text-2-image-Rich-Human-Feedback-32k

jasoncorkill updated a dataset 11 days ago

Rapidata/text-2-image-Rich-Human-Feedback-32k

Sneccello updated a dataset 14 days ago

Rapidata/text-2-image-Rich-Human-Feedback-32k

Rapidata's activity

LinoGiger

updated a dataset 10 days ago

Rapidata/text-2-image-Rich-Human-Feedback-32k

Viewer • Updated 10 days ago • 31.9k • 1.18k • 21

jasoncorkill

posted an update 11 days ago

Post

5489

🚀 Building Better Evaluations: 32K Image Annotations Now Available

Today, we're releasing an expanded version: 32K images annotated with 3.7M responses from over 300K individuals, collected in under two weeks using the Rapidata Python API.

Rapidata/text-2-image-Rich-Human-Feedback-32k

A few months ago, we published one of our most liked datasets, with 13K images based on the @data-is-better-together dataset and following Google's research on "Rich Human Feedback for Text-to-Image Generation" (https://arxiv.org/abs/2312.10240). It collected over 1.5M responses from 150K+ participants.

Rapidata/text-2-image-Rich-Human-Feedback

In the examples below, users highlighted words from prompts that were not correctly depicted in the generated images. Higher word scores indicate more frequent issues. If an image captured the prompt accurately, users could select [No_mistakes].
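As a rough illustration of how per-word scores of this kind could be derived, here is a minimal sketch that scores each prompt word by the fraction of annotators who highlighted it. The function name `word_scores`, the index-based input format, and the example prompt are assumptions made for illustration; the dataset's actual schema and scoring formula may differ.

```python
from collections import Counter

NO_MISTAKES = "[No_mistakes]"


def word_scores(prompt, selections):
    """Score each prompt word by the fraction of annotators who flagged it.

    prompt:      the text prompt, split on whitespace (an assumption).
    selections:  one set of word indices per annotator; the index
                 len(words) stands for the [No_mistakes] option.
    Returns a list of (word, score) pairs in prompt order.
    """
    words = prompt.split() + [NO_MISTAKES]
    counts = Counter()
    for selected in selections:
        # A set guarantees each annotator counts at most once per word.
        counts.update(set(selected))
    total = len(selections)
    if total == 0:
        return [(word, 0.0) for word in words]
    return [(word, counts[i] / total) for i, word in enumerate(words)]


# Hypothetical example: four annotators reviewing one generated image.
example_prompt = "a red cube on a blue sphere"
example_selections = [{1}, {1, 5}, {7}, {1}]  # index 7 is [No_mistakes]
for word, score in word_scores(example_prompt, example_selections):
    print(f"{word}: {score:.2f}")
```

Indexing words by position rather than by text keeps repeated words (the two occurrences of "a" above) separate.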

We're continuing to work on large-scale human feedback and model evaluation. If you're working on related research and need large, high-quality annotations, feel free to get in touch: [email protected].

jasoncorkill

updated a dataset 11 days ago

Rapidata/text-2-image-Rich-Human-Feedback-32k

Viewer • Updated 10 days ago • 31.9k • 1.18k • 21

Sneccello

updated a dataset 14 days ago

Rapidata/text-2-image-Rich-Human-Feedback-32k

Viewer • Updated 10 days ago • 31.9k • 1.18k • 21

Kchanger

updated a dataset 14 days ago

Rapidata/text-2-image-Rich-Human-Feedback-32k

Viewer • Updated 10 days ago • 31.9k • 1.18k • 21

Kchanger

published a dataset 14 days ago

Rapidata/text-2-image-Rich-Human-Feedback-32k

Viewer • Updated 10 days ago • 31.9k • 1.18k • 21

jasoncorkill

posted an update 28 days ago

Post

3269

🚀 We tried something new!

We just published a dataset using a new (for us) preference modality: direct ranking based on aesthetic preference. We ranked a couple of thousand images from most to least preferred, all sampled from the Open Image Preferences v1 dataset by the amazing @data-is-better-together team.

📊 Check it out here:
Rapidata/2k-ranked-images-open-image-preferences-v1

We're really curious to hear your thoughts!
Is this kind of ranking interesting or useful to you? Let us know! 💬

If it is, please consider leaving a ❤️. If we hit 30 ❤️s, we’ll go ahead and rank the full 17k-image dataset!

6 replies

jasoncorkill

updated a dataset 29 days ago

Rapidata/2k-ranked-images-open-image-preferences-v1

Viewer • Updated 29 days ago • 2k • 196 • 18

maalber

updated a dataset 29 days ago

Rapidata/2k-ranked-images-open-image-preferences-v1

Viewer • Updated 29 days ago • 2k • 196 • 18

maalber

published a dataset 30 days ago

Rapidata/2k-ranked-images-open-image-preferences-v1

Viewer • Updated 29 days ago • 2k • 196 • 18

jasoncorkill

posted an update 30 days ago

Post

3050

🔥 Yesterday was a fire day!
We dropped two brand-new datasets capturing human preferences for text-to-video and text-to-image generation, both collected with our own crowdsourcing tool!

Whether you're working on model evaluation, alignment, or fine-tuning, this is for you.

1. Text-to-Video Dataset (Pika 2.2 model):
Rapidata/text-2-video-human-preferences-pika2.2

2. Text-to-Image Dataset (Reve-AI Halfmoon):
Rapidata/Reve-AI-Halfmoon_t2i_human_preference

Let’s train AI on AI-generated content with humans in the loop.
Let’s make generative models that actually get us.

Kchanger

published 2 datasets about 1 month ago

Rapidata/text-2-video-human-preferences-pika2.2

Viewer • Updated about 1 month ago • 1.68k • 302 • 8

Rapidata/Reve-AI-Halfmoon_t2i_human_preference

Viewer • Updated about 1 month ago • 13k • 219 • 7

Kchanger

updated a dataset about 1 month ago

Rapidata/text-2-video-human-preferences-pika2.2

Viewer • Updated about 1 month ago • 1.68k • 302 • 8

LinoGiger

updated a dataset about 1 month ago

Rapidata/Reve-AI-Halfmoon_t2i_human_preference

Viewer • Updated about 1 month ago • 13k • 219 • 7

Kchanger

updated a dataset about 1 month ago

Rapidata/Reve-AI-Halfmoon_t2i_human_preference

Viewer • Updated about 1 month ago • 13k • 219 • 7

jasoncorkill

posted an update about 1 month ago

Post

2378

🚀 Rapidata: Setting the Standard for Model Evaluation

Rapidata is proud to announce our first independent appearance in academic research, featured in the Lumina-Image 2.0 paper. This marks the beginning of our journey to become the standard for testing text-to-image and generative models. Our expertise in large-scale human annotations allows researchers to refine their models with accurate, real-world feedback.

As we continue to establish ourselves as a key player in model evaluation, we’re here to support researchers with high-quality annotations at scale. Reach out to [email protected] to see how we can help.

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework (2503.21758)

jasoncorkill

posted an update about 1 month ago

Post

2257

🔥 It's out! We published the dataset for our evaluation of @OpenAI 's new 4o image generation model.

Rapidata/OpenAI-4o_t2i_human_preference

Yesterday we published the first large evaluation of the new model, showing that it absolutely leaves the competition in the dust. We have now made the results and data available here! Please check it out and ❤️ !

jasoncorkill

posted an update about 1 month ago

Post

2058

🚀 First Benchmark of @OpenAI 's 4o Image Generation Model!

We've just completed the first-ever (to our knowledge) benchmarking of the new OpenAI 4o image generation model, and the results are impressive!

In our tests, OpenAI 4o image generation absolutely crushed leading competitors, including @black-forest-labs , @google , @xai-org , Ideogram, Recraft, and @deepseek-ai , in prompt alignment and coherence! It leads the nearest competitor by more than 20% in Bradley-Terry score, the biggest gap we have seen since the beginning of the benchmark!
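For readers unfamiliar with Bradley-Terry scores, the sketch below fits them from pairwise human votes using the standard iterative (minorization-maximization) update. It is only an illustration of the general technique: the model names and vote counts are made up, and the post does not say how Rapidata fits or normalizes its own scores.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=200, tol=1e-9):
    """Fit Bradley-Terry strengths from a list of (winner, loser) pairs.

    Returns a dict mapping each model to a strength; strengths sum to 1.
    Under the model, P(i beats j) = s_i / (s_i + s_j).
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    models = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    strengths = {m: 1.0 / len(models) for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            # Sum over opponents of n_ij / (s_i + s_j).
            denom = 0.0
            for other in models:
                if other == m:
                    continue
                n = pair_counts.get(frozenset((m, other)), 0)
                if n:
                    denom += n / (strengths[m] + strengths[other])
            updated[m] = wins[m] / denom if denom else strengths[m]
        total = sum(updated.values())
        updated = {m: s / total for m, s in updated.items()}
        delta = max(abs(updated[m] - strengths[m]) for m in models)
        strengths = updated
        if delta < tol:
            break
    return strengths


# Hypothetical votes: each tuple is (preferred model, other model).
comparisons = (
    [("model_a", "model_b")] * 8 + [("model_b", "model_a")] * 2
    + [("model_a", "model_c")] * 9 + [("model_c", "model_a")] * 1
    + [("model_b", "model_c")] * 6 + [("model_c", "model_b")] * 4
)
strengths = bradley_terry(comparisons)
for name, score in sorted(strengths.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

This plain update assumes every model wins at least one comparison; a model with zero wins would be driven to a strength of zero, which is usually handled with a small prior or regularization.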

The benchmarks are based on 200k human responses collected through our API. However, the most challenging part wasn't the benchmarking itself, but generating and downloading the images:

- 5 hours to generate 1000 images (no API available yet)
- Just 10 minutes to set up and launch the benchmark
- Over 200,000 responses rapidly collected

While generating the images, we hit some hurdles that forced us to leave out parts of our prompt set. In particular, we observed that the OpenAI 4o model proactively refused to generate certain images:

🚫 Styles of living artists: completely blocked
🚫 Copyrighted characters (e.g., Darth Vader, Pokémon): initially generated but subsequently blocked

Overall, OpenAI 4o stands out significantly in alignment and coherence, and it excels especially on unusual prompts that have historically caused issues, such as 'A chair on a cat.' See the images for more examples!

1 reply

jasoncorkill

posted an update about 2 months ago

Post

3813

At Rapidata, we compared DeepL with LLMs like DeepSeek-R1, Llama, and Mixtral for translation quality using feedback from over 51,000 native speakers. Despite the costs, DeepL's performance makes it a valuable investment, especially in critical applications where translation quality is paramount. Now we can say that Europe does more than impose regulations.

Our dataset, based on these comparisons, is now available on Hugging Face. This might be useful for anyone working on AI translation or language model evaluation.

Rapidata/Translation-deepseek-llama-mixtral-v-deepl

1 reply


Team members 7
