---
title: PDF to Markdown Converter
emoji: 📄
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860
---
# PDF to Markdown Converter API
A FastAPI-based service that converts PDF documents to Markdown format using the [marker](https://github.com/VikParuchuri/marker) library.
## Features
- Convert PDF files to Markdown format
- GPU-accelerated processing with CUDA support
- Simple RESTful API
- Docker containerization
## Setup and Installation
### Prerequisites
- Docker
- Docker Compose
- NVIDIA Container Toolkit (for GPU support)
### Building and Running the Container
1. Clone this repository:
```bash
git clone <repository-url>
cd docker_mineru
```
2. Build and start the container:
```bash
docker-compose up -d
```
3. The API will be available at: `http://localhost:7860`
## API Usage
### Health Check
```
GET /health
```
Returns the current status of the service and whether CUDA is available.
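
As a minimal sketch, the health check can be called from Python with only the standard library. The helper names `health_url` and `check_health` are hypothetical, and the code assumes the endpoint returns a JSON body; the exact field names are not documented here, so the result is returned as a plain dictionary.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # local address from the setup section


def health_url(base_url: str = BASE_URL) -> str:
    """Build the health-check URL, tolerating a trailing slash on the base."""
    return base_url.rstrip("/") + "/health"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """Send GET /health and return the decoded body (assumed to be JSON)."""
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return json.load(resp)
```

With the container running, `check_health()` returns the parsed response; if the service is not reachable, `urllib` raises `urllib.error.URLError`.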
### Convert PDF to Markdown
```
POST /convert
```
Upload a PDF as the `file` field of a `multipart/form-data` request to convert it to Markdown.
#### Example cURL request:
```bash
curl -X POST "http://localhost:7860/convert" \
  -H "accept: application/json" \
  -F "file=@your_file.pdf"
```
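
The same upload can be sketched in Python using only the standard library. The function names `build_multipart` and `convert_pdf` are hypothetical, and labeling the part as `application/pdf` is an assumption; the form field name `file` and the `/convert` path come from the cURL example above.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, data: bytes, boundary: str) -> bytes:
    """Encode one file as a multipart/form-data body with the given boundary."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"  # assumed part type
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail


def convert_pdf(pdf_path: str, base_url: str = "http://localhost:7860") -> dict:
    """POST the PDF to /convert and return the decoded JSON response."""
    path = Path(pdf_path)
    boundary = uuid.uuid4().hex  # random boundary, unlikely to occur in the file
    body = build_multipart("file", path.name, path.read_bytes(), boundary)
    request = urllib.request.Request(
        base_url.rstrip("/") + "/convert",
        data=body,
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Accept": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling `convert_pdf("your_file.pdf")` against a running container returns the response as a dictionary. Conversion of large documents may take a while, so a caller may want to pass a generous `timeout` to `urlopen`.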
#### Response:
```json
{
  "filename": "your_file.pdf",
  "status": "success",
  "markdown_content": "# Your PDF content in Markdown...",
  "output_file": "/output/your_file.md"
}
```
```
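
The `output_file` path presumably refers to a location inside the container, so a client that wants a local copy can write `markdown_content` to disk itself. The sketch below assumes the response fields shown above and treats any `status` other than `"success"` as a failure, which is an assumption because the error format is not documented here. The helper name `save_markdown` is hypothetical.

```python
from pathlib import Path


def save_markdown(response: dict, out_dir: str = ".") -> Path:
    """Write `markdown_content` to <out_dir>/<pdf stem>.md and return the path."""
    if response.get("status") != "success":
        # Assumed failure convention; adjust to the service's real error shape.
        raise RuntimeError(f"conversion failed: {response}")
    # Keep only the base name so a path in `filename` cannot escape out_dir.
    stem = Path(response["filename"]).stem
    target = Path(out_dir) / f"{stem}.md"
    target.write_text(response["markdown_content"], encoding="utf-8")
    return target
```

For the example response above, `save_markdown(response)` writes `your_file.md` in the current directory.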
## Accessing the API Documentation
Once the API is running, the interactive documentation is available at:
- Swagger UI: `http://localhost:7860/docs`
- ReDoc: `http://localhost:7860/redoc`
## Hugging Face Spaces Deployment
This application is also deployed on Hugging Face Spaces. You can access it at:
[https://huggingface.co/spaces/marcosremar2/docker_mineru](https://huggingface.co/spaces/marcosremar2/docker_mineru)