---
title: MTEB Human Evaluation Demo
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.23.3
app_file: app.py
pinned: false
---

# MTEB Human Evaluation Demo

This Space demos the human evaluation interface for the MTEB (Massive Text Embedding Benchmark) project. It lets annotators rank candidate documents by relevance for reranking tasks.

## How to use

1. Navigate to the "Demo" tab to try the interface with an example dataset (AskUbuntuDupQuestions)
2. Read the query at the top
3. For each document, assign a rank using the dropdown (1 = most relevant)
4. Submit your rankings
5. Navigate between samples using the Previous/Next buttons
6. Your annotations are saved automatically (see the sketch below for how this workflow fits together)
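
The actual `app.py` is not reproduced in this README, but the workflow above maps naturally onto a small Gradio Blocks app. Below is a minimal sketch of such an interface, assuming a fixed three documents per sample; the sample data, the `annotations.jsonl` path, and the handler names are hypothetical stand-ins, not the Space's real code.

```python
import json

import gradio as gr

# Hypothetical placeholder samples; the real Space loads AskUbuntuDupQuestions.
SAMPLES = [
    {
        "query": "How do I upgrade Ubuntu from the terminal?",
        "docs": [
            "Run 'sudo do-release-upgrade' to move to the next release.",
            "Use 'sudo apt update && sudo apt upgrade' for package updates.",
            "You can change the desktop wallpaper in Settings.",
        ],
    },
    {
        "query": "Why does Wi-Fi drop after suspend?",
        "docs": [
            "Reloading the wireless driver module after resume can help.",
            "Disable Wi-Fi power management in NetworkManager.",
            "GParted can resize disk partitions.",
        ],
    },
]
N_DOCS = 3  # assumed fixed number of candidate documents per sample


def render(i):
    """Format the current query and its candidate documents as Markdown."""
    s = SAMPLES[i % len(SAMPLES)]
    docs = "\n".join(f"{n + 1}. {d}" for n, d in enumerate(s["docs"]))
    return f"**Query:** {s['query']}\n\n{docs}"


def step(i, delta):
    """Move to the previous/next sample and re-render it."""
    i = (i + delta) % len(SAMPLES)
    return i, render(i)


def submit(i, *ranks):
    """Append the submitted rankings to a local JSONL file."""
    record = {"sample": i % len(SAMPLES), "ranks": list(ranks)}
    with open("annotations.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
    return f"Saved rankings for sample {record['sample']}."


with gr.Blocks() as demo:
    idx = gr.State(0)
    view = gr.Markdown(render(0))
    ranks = [
        gr.Dropdown(
            choices=list(range(1, N_DOCS + 1)),
            label=f"Rank for document {n + 1} (1 = most relevant)",
        )
        for n in range(N_DOCS)
    ]
    status = gr.Markdown()
    with gr.Row():
        prev_btn = gr.Button("Previous")
        submit_btn = gr.Button("Submit rankings")
        next_btn = gr.Button("Next")

    prev_btn.click(lambda i: step(i, -1), idx, [idx, view])
    next_btn.click(lambda i: step(i, 1), idx, [idx, view])
    submit_btn.click(submit, [idx, *ranks], status)

if __name__ == "__main__":
    demo.launch()
```

In this sketch, "saved automatically" is approximated by appending to a JSONL file on submit; the Space's real persistence layer may work differently.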

## About MTEB Human Evaluation

This project aims to establish human performance benchmarks for MTEB tasks, helping to establish a realistic "ceiling" for embedding model performance.