---
title: PDF to Markdown Converter
emoji: 📄
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: "1.29.0"
app_file: app.py
pinned: false
---

# PDF to Markdown Converter

This application converts PDF documents to Markdown format. It uses the `docling` library for document conversion and provides a simple Streamlit interface.
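As a rough illustration of the conversion step, the sketch below wraps Docling's `DocumentConverter` in a small function. It is not taken from `app.py`: the function names `convert_to_markdown` and `markdown_filename` are hypothetical, and the Docling calls assume the 2.x API, where `convert()` accepts a local path or URL and the result exposes `document.export_to_markdown()`.

```python
from pathlib import Path
from urllib.parse import urlparse


def markdown_filename(source: str) -> str:
    """Derive an output name such as 'report.md' from a file path or URL."""
    # For URLs only the path component matters; plain file names pass through.
    path = urlparse(source).path or source
    stem = Path(path).stem or "document"
    return f"{stem}.md"


def convert_to_markdown(source: str) -> str:
    """Convert a local PDF path or a PDF URL to a Markdown string (assumed Docling 2.x API)."""
    # Imported lazily so the filename helper works even without docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

Keeping the filename logic separate from the conversion call means the same helper can serve both uploaded files and URLs.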

## Features

- Upload PDF files directly
- Convert PDFs from URLs
- Batch process multiple images using vLLM
- Download the resulting Markdown files
- Clean, user-friendly interface

## How to Use

### PDF to Markdown
1. Select the "PDF to Markdown" tab
2. Upload a PDF file using the file uploader or enter a URL to a PDF document
3. Click the "Convert to Markdown" button
4. Once conversion is complete, download the Markdown file

### Batch Image Processing
1. Select the "Batch Image Processing" tab
2. Upload multiple image files (PNG, JPG, JPEG)
3. Optionally customize the model path and prompt text
4. Click the "Process Images" button
5. Once processing is complete, download the ZIP file containing all results
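The final step, packaging all results into one ZIP download, could look roughly like the sketch below. It assumes the per-image outputs are available as a mapping from image filename to generated text; `bundle_results` and the `.md` naming scheme are illustrative choices, not the application's actual code.

```python
import io
import zipfile
from pathlib import Path


def bundle_results(results: dict[str, str]) -> bytes:
    """Pack {image filename: generated text} into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for image_name, text in results.items():
            # Name each entry after its image, e.g. page1.png -> page1.md
            archive.writestr(f"{Path(image_name).stem}.md", text)
    return buffer.getvalue()


# Example with two hypothetical results.
zip_bytes = bundle_results({"page1.png": "# One", "scan.jpeg": "two"})
```

The returned bytes can be handed to Streamlit's `st.download_button` as its `data` argument, so nothing needs to be written to disk.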

## Technical Details

Built with:
- Streamlit 1.29.0
- Docling 2.7.0
- docling_core
- vLLM (for batch processing)
- Python 3.12

## Deployment

This application is deployed on Hugging Face Spaces.

To deploy this application:
1. Create a new Space on Hugging Face (https://huggingface.co/spaces)
2. Choose "Streamlit" as the SDK
3. Upload the following files to the Space repository:
   - app.py
   - requirements.txt
   - README.md
   - runtime.txt

The application will automatically create any necessary directories when it starts.
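A minimal sketch of that startup behaviour is shown below. The directory names `uploads` and `outputs` are placeholders; the directories the application actually creates are not listed here.

```python
import tempfile
from pathlib import Path


def ensure_directories(base: Path, names=("uploads", "outputs")) -> list[Path]:
    """Create each named directory under base if it does not exist yet."""
    created = []
    for name in names:
        directory = base / name
        # exist_ok=True makes repeated startups safe.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


# Demonstrate in a throwaway location; calling twice does not raise.
with tempfile.TemporaryDirectory() as tmp:
    dirs = ensure_directories(Path(tmp))
    dirs = ensure_directories(Path(tmp))
    all_exist = all(d.is_dir() for d in dirs)
```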

Note: The vLLM batch processing feature is resource-intensive and generally expects GPU hardware, so you may need to select a more powerful hardware configuration for your Space.