---
title: PDF to Markdown Converter
emoji: 📄
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860
---

# PDF to Markdown Converter API

A FastAPI-based service that converts PDF documents to Markdown format using the [marker](https://github.com/VikParuchuri/marker) library.

## Features

- Convert PDF files to Markdown format
- GPU-accelerated processing with CUDA support
- Simple RESTful API
- Docker containerization

## Setup and Installation

### Prerequisites

- Docker
- Docker Compose
- NVIDIA Container Toolkit (for GPU support)

### Building and Running the Container

1. Clone this repository:

```bash
git clone <repository-url>
cd docker_mineru
```

2. Build and start the container:

```bash
docker-compose up -d
```

3. The API will be available at: `http://localhost:7860`

## API Usage

### Health Check

```
GET /health
```

Returns the current status of the service and whether CUDA is available.
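
A minimal sketch of polling this endpoint from Python, for example to wait for the container to finish starting before sending work. The helper names `check_health` and `wait_until_healthy` are illustrative and not part of this project, and the exact keys in the JSON body are not documented here, so the code simply returns whatever the service sends.

```python
import json
import time
import urllib.error
import urllib.request


def check_health(base_url="http://localhost:7860", timeout=5.0):
    """Return the decoded JSON body of GET /health, or None if unreachable."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a non-JSON body.
        return None


def wait_until_healthy(base_url="http://localhost:7860", attempts=10, delay=2.0):
    """Poll /health until it answers or the attempts run out."""
    for _ in range(attempts):
        payload = check_health(base_url)
        if payload is not None:
            return payload
        time.sleep(delay)
    return None


if __name__ == "__main__":
    print(wait_until_healthy() or "service did not become healthy")
```

Returning `None` instead of raising keeps the polling loop simple; a caller that needs the failure reason could log the exception inside `check_health`.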

### Convert PDF to Markdown

```
POST /convert
```

Upload a PDF as the `file` field of a `multipart/form-data` request. The response contains the converted Markdown.

#### Example cURL request:

```bash
curl -X POST "http://localhost:7860/convert" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@your_file.pdf"
```

#### Response:

```json
{
  "filename": "your_file.pdf",
  "status": "success",
  "markdown_content": "# Your PDF content in Markdown...",
  "output_file": "/output/your_file.md"
}
```
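
The same request can be sent from Python. The sketch below uses only the standard library and assumes the form field is named `file` (as in the cURL example) and that the response has the fields shown above. The helper names `build_multipart`, `convert_pdf`, and `save_markdown` are illustrative, and the long default timeout is a guess, since conversion time depends on the document and hardware.

```python
import json
import uuid
import urllib.request
from pathlib import Path


def build_multipart(field, filename, data, content_type="application/pdf"):
    """Encode one file as a multipart/form-data body; return (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def convert_pdf(pdf_path, base_url="http://localhost:7860", timeout=600):
    """POST a PDF to /convert and return the decoded JSON response."""
    pdf_path = Path(pdf_path)
    body, content_type = build_multipart("file", pdf_path.name, pdf_path.read_bytes())
    request = urllib.request.Request(
        base_url.rstrip("/") + "/convert",
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def save_markdown(result, out_dir="."):
    """Write result["markdown_content"] to <out_dir>/<pdf stem>.md and return the path."""
    # Assumption: any status other than "success" means the conversion failed.
    if result.get("status") != "success":
        raise RuntimeError(f"conversion failed: {result}")
    target = Path(out_dir) / (Path(result["filename"]).stem + ".md")
    target.write_text(result["markdown_content"], encoding="utf-8")
    return target


if __name__ == "__main__":
    print(save_markdown(convert_pdf("your_file.pdf")))
```

Saving `markdown_content` locally avoids relying on `output_file`, which appears to be a path inside the container rather than on the client machine.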

## Accessing the API Documentation

Once the API is running, you can access the following:

- Swagger UI: `http://localhost:7860/docs`
- ReDoc: `http://localhost:7860/redoc`
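
Both pages are rendered from the machine-readable OpenAPI schema, which FastAPI serves at `/openapi.json` by default. The sketch below assumes this service keeps that default path; `fetch_openapi` and `list_routes` are illustrative helper names.

```python
import json
import urllib.request


def fetch_openapi(base_url="http://localhost:7860", timeout=10):
    """Download the OpenAPI schema (assumes the default /openapi.json path)."""
    url = base_url.rstrip("/") + "/openapi.json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def list_routes(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    http_methods = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() in http_methods:
                routes.append((method.upper(), path))
    return sorted(routes)


if __name__ == "__main__":
    for method, path in list_routes(fetch_openapi()):
        print(method, path)
```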

## Hugging Face Spaces Deployment

This application is also deployed on Hugging Face Spaces. You can access it at:
[https://huggingface.co/spaces/marcosremar2/docker_mineru](https://huggingface.co/spaces/marcosremar2/docker_mineru)