metadata
title: PDF to Markdown Converter
emoji: π
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860
PDF to Markdown Converter API
A FastAPI-based service that converts PDF documents to Markdown format using the marker library.
Features
- Convert PDF files to Markdown format
- GPU-accelerated processing with CUDA support
- Simple RESTful API
- Docker containerization
Setup and Installation
Prerequisites
- Docker
- Docker Compose
- NVIDIA Container Toolkit (for GPU support)
Building and Running the Container
- Clone this repository:
git clone <repository-url>
cd docker_mineru
- Build and start the container:
docker-compose up -d
- The API will be available at:
http://localhost:7860
API Usage
Health Check
GET /health
Returns the current status of the service and whether CUDA is available.
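The health check can also be called from a script, for example to confirm the container is up before sending PDFs. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns a JSON body (typical for FastAPI, but not confirmed here), and the helper names `health_url` and `check_health` are illustrative, not part of this project.

```python
import json
import urllib.request

# Default address from the setup steps; adjust if the port is remapped.
BASE_URL = "http://localhost:7860"


def health_url(base_url: str = BASE_URL) -> str:
    """Build the health endpoint URL for a given base address."""
    return base_url.rstrip("/") + "/health"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0):
    """Send GET /health and return the decoded JSON body.

    Assumption: the service answers with JSON. The field names in the
    body are not documented here, so none are relied upon.
    """
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `check_health()` while the container is running should return the decoded status; if the service is not reachable, `urllib.error.URLError` is raised.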
Convert PDF to Markdown
POST /convert
Upload a PDF file as multipart form data in the file field; the service converts it and returns the Markdown in a JSON response.
Example cURL request:
curl -X POST "http://localhost:7860/convert" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@your_file.pdf"
Response:
{
  "filename": "your_file.pdf",
  "status": "success",
  "markdown_content": "# Your PDF content in Markdown...",
  "output_file": "/output/your_file.md"
}
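The same request can be made from Python. The sketch below mirrors the cURL example using only the standard library: it encodes the PDF as multipart form data under the `file` field, posts it to `/convert`, and writes the `markdown_content` field of the response to disk. The helper names (`build_multipart`, `convert_pdf`, `save_markdown`) and the 600-second timeout are assumptions chosen for illustration, not part of this project.

```python
import json
import uuid
import urllib.request
from pathlib import Path
from typing import Tuple

# Endpoint from the cURL example above.
CONVERT_URL = "http://localhost:7860/convert"


def build_multipart(filename: str, data: bytes, field: str = "file") -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def convert_pdf(pdf_path: str, url: str = CONVERT_URL, timeout: float = 600.0) -> dict:
    """POST a PDF to /convert and return the decoded JSON response."""
    path = Path(pdf_path)
    body, content_type = build_multipart(path.name, path.read_bytes())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
    # Conversion may be slow for large PDFs; the timeout is a guess.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def save_markdown(result: dict, out_path: str) -> None:
    """Write the "markdown_content" field of a /convert response to a file."""
    Path(out_path).write_text(result["markdown_content"], encoding="utf-8")
```

A typical call would be `save_markdown(convert_pdf("your_file.pdf"), "your_file.md")`. Note that the `output_file` path in the response refers to a location inside the container, so saving `markdown_content` locally is the simpler way to keep the result on the client side.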
Accessing the API Documentation
Once the API is running, interactive documentation is available at:
- Swagger UI:
http://localhost:7860/docs
- ReDoc:
http://localhost:7860/redoc
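Both pages are rendered from an OpenAPI schema. FastAPI serves that schema at `/openapi.json` by default, so, assuming this application keeps the default, the available endpoints can be listed programmatically. The sketch below is illustrative; `list_operations` and `fetch_schema` are hypothetical helper names.

```python
import json
import urllib.request


def list_operations(schema: dict) -> list:
    """Return sorted "METHOD /path" strings from an OpenAPI schema dict."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in http_methods:
                operations.append(f"{method.upper()} {path}")
    return sorted(operations)


def fetch_schema(base_url: str = "http://localhost:7860", timeout: float = 5.0) -> dict:
    """Download the OpenAPI schema (assumes FastAPI's default /openapi.json)."""
    url = base_url.rstrip("/") + "/openapi.json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `list_operations(fetch_schema())` should include the health and convert endpoints described above.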
Hugging Face Spaces Deployment
This application is also deployed on Hugging Face Spaces. You can access it at: https://huggingface.co/spaces/marcosremar2/docker_mineru