metadata
title: menu text detection
emoji: π¦
colorFrom: indigo
colorTo: pink
sdk: gradio
python_version: 3.11
short_description: Extract structured menu information from images into JSON...
tags:
- donut
- fine-tuning
- image-to-text
- transformer
Menu Text Detection System
Extract structured menu information from images into JSON using a fine-tuned Donut E2E model.
Based on Donut by Clova AI (ECCV '22)
Features
Overview
The system currently extracts the following information from menu images:
- Restaurant Name
- Business Hours
- Address
- Phone Number
- Dish Information
  - Name
  - Price
For the JSON schema, see the tools directory.
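As a rough illustration of how the fields listed above could fit together in one record, here is a minimal sketch. The class and field names (`MenuInfo`, `Dish`, `restaurant_name`, and so on) are assumptions made for this example; the authoritative schema is the one in the tools directory.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Dish:
    name: str
    price: str  # kept as text, since printed menus mix currencies and formats


@dataclass
class MenuInfo:
    # Field names are illustrative only; the real schema lives in the tools directory.
    restaurant_name: str = ""
    business_hours: str = ""
    address: str = ""
    phone_number: str = ""
    dishes: list[Dish] = field(default_factory=list)


def to_json(menu: MenuInfo) -> str:
    """Serialize the menu record, keeping non-ASCII dish names readable."""
    return json.dumps(asdict(menu), ensure_ascii=False, indent=2)


example = MenuInfo(
    restaurant_name="Example Bistro",
    phone_number="555-0100",
    dishes=[Dish(name="Beef Noodles", price="120")],
)
print(to_json(example))
```

Fields that cannot be read from an image simply stay empty, so every result has the same keys regardless of how much the menu reveals.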
Supported Methods to Extract Menu Information
- Fine-tuned Donut model
- OpenAI GPT API
- Google Gemini API
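For the first method, inference with a Donut checkpoint typically follows the pattern below, based on the Hugging Face `transformers` Donut classes. This is a sketch, not this project's actual code: `extract_menu` and `clean_donut_sequence` are hypothetical helper names, and `model_id` and `task_prompt` are placeholders for the fine-tuned checkpoint and the task-start token it was trained with.

```python
import re


def clean_donut_sequence(sequence: str, eos_token: str = "</s>", pad_token: str = "<pad>") -> str:
    """Strip EOS/pad tokens and the leading task-start token from decoded output."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Remove only the first <...> tag, which is the task prompt token.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def extract_menu(image, model_id: str, task_prompt: str) -> dict:
    """Run Donut on a PIL image and return the parsed result as a dict.

    model_id and task_prompt are placeholders: pass the checkpoint and the
    task-start token that the fine-tuned model was actually trained with.
    """
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = clean_donut_sequence(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    # token2json converts Donut's tag-style output into nested dicts and lists.
    return processor.token2json(sequence)
```

The GPT and Gemini methods would replace the body of `extract_menu` with an API call while returning a dict of the same shape.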
Training / Fine-Tuning
Setup
Use uv to set up the development environment:
uv sync
Training Script (Dataset Collection, Fine-Tuning)
Please refer to train.ipynb. Use Jupyter Notebook for training:
uv run jupyter-notebook
VS Code users: install the Jupyter extension, then select .venv/bin/python as your kernel.
Run Demo Locally
uv run python app.py
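For readers unfamiliar with Gradio, a demo of this kind is usually a small script that wraps a prediction function in an interface. The sketch below is not the contents of app.py: `predict_menu` is a hypothetical stand-in that only reports the image size, and `build_demo` is a name chosen for this example.

```python
def predict_menu(image) -> dict:
    """Placeholder predictor; a real app would call the extraction model here."""
    if image is None:
        return {"error": "no image provided"}
    return {"width": image.width, "height": image.height}


def build_demo():
    # Imported lazily so predict_menu can be exercised without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=predict_menu,
        inputs=gr.Image(type="pil", label="Menu image"),
        outputs=gr.JSON(label="Extracted menu"),
        title="Menu Text Detection",
    )
```

Calling `build_demo().launch()` starts a local server and prints the URL to open in a browser.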