---
title: PDF to Markdown Converter
emoji: π
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: "1.29.0"
app_file: app.py
pinned: false
---
# PDF to Markdown Converter
This application converts PDF documents to Markdown format. It uses the `docling` library for document conversion and provides a simple Streamlit interface.
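The core conversion step might look roughly like the sketch below. It assumes docling's `DocumentConverter` entry point; the helper names `markdown_name` and `pdf_to_markdown` are hypothetical and are not taken from this app's `app.py`.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def markdown_name(source: str) -> str:
    """Derive an output filename such as 'report.md' from a POSIX-style path or URL."""
    # For URLs, keep only the path component; plain paths pass through unchanged.
    path = urlparse(source).path or source
    stem = PurePosixPath(path).stem or "document"
    return f"{stem}.md"


def pdf_to_markdown(source: str) -> str:
    """Convert a local PDF path or a PDF URL to Markdown text (hypothetical helper)."""
    # Imported lazily so markdown_name() stays usable without docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

For example, `pdf_to_markdown("https://example.com/report.pdf")` would return the Markdown text, and `markdown_name` would suggest `report.md` as the download filename.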
## Features
- Upload PDF files directly
- Convert PDFs from URLs
- Batch process multiple images using vLLM
- Download the resulting Markdown files
- Clean, user-friendly interface
## How to Use
### PDF to Markdown
1. Select the "PDF to Markdown" tab
2. Upload a PDF file using the file uploader or enter a URL to a PDF document
3. Click the "Convert to Markdown" button
4. Once conversion is complete, download the Markdown file
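The steps above can be sketched as a small input-selection routine. This is an illustration only: the function names, widget labels, and the rule that an uploaded file takes precedence over a URL are assumptions, not the app's confirmed behavior.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def choose_source(uploaded_name: Optional[str], url: str) -> Tuple[str, str]:
    """Return (kind, value), where kind is 'upload' or 'url'."""
    # Assumption: an uploaded file wins if both inputs are provided.
    if uploaded_name:
        return ("upload", uploaded_name)
    url = url.strip()
    parsed = urlparse(url)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return ("url", url)
    raise ValueError("Upload a PDF file or enter an http(s) URL.")


def render_inputs() -> None:
    """Hypothetical Streamlit wiring for the 'PDF to Markdown' tab."""
    import streamlit as st  # lazy import; only needed when the UI runs

    uploaded = st.file_uploader("Upload a PDF", type=["pdf"])
    url = st.text_input("Or enter a URL to a PDF document")
    if st.button("Convert to Markdown"):
        try:
            kind, value = choose_source(uploaded.name if uploaded else None, url)
        except ValueError as exc:
            st.error(str(exc))
        else:
            st.write(f"Converting {kind}: {value}")
```

Keeping the validation in a plain function makes it testable without starting Streamlit.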
### Batch Image Processing
1. Select the "Batch Image Processing" tab
2. Upload multiple image files (PNG, JPG, JPEG)
3. Optionally customize the model path and prompt text
4. Click the "Process Images" button
5. Once processing is complete, download the ZIP file containing all results
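Packaging the per-image results into a single download could be done in memory with the standard library, as in this sketch. The function name `build_results_zip` and the one-`.md`-file-per-image layout are assumptions about how the archive is organized.

```python
import io
import zipfile
from typing import Dict


def build_results_zip(results: Dict[str, str]) -> bytes:
    """Bundle {image filename: generated text} into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for image_name, text in results.items():
            # Swap the image extension for .md, e.g. 'page1.png' -> 'page1.md'.
            stem = image_name.rsplit(".", 1)[0]
            archive.writestr(f"{stem}.md", text)
    return buffer.getvalue()
```

The returned bytes can be passed to Streamlit's `st.download_button`. Note that two images sharing a stem (such as `a.png` and `a.jpg`) would map to the same entry name in this sketch.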
## Technical Details
Built with:
- Streamlit 1.29.0
- Docling 2.7.0
- docling_core
- vLLM (for batch processing)
- Python 3.12
## Deployment
This application is deployed on Hugging Face Spaces.
To deploy this application:
1. Create a new Space on Hugging Face (https://huggingface.co/spaces)
2. Choose "Streamlit" as the SDK
3. Upload the following files to the Space repository:
- app.py
- requirements.txt
- README.md
- runtime.txt
The application will automatically create any necessary directories when it starts.
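Startup directory creation is typically a few idempotent `mkdir` calls. In the sketch below, the directory names `uploads` and `outputs` are placeholders, since the actual names the app uses are not listed here.

```python
from pathlib import Path
from typing import Iterable, List


def ensure_dirs(base: Path, names: Iterable[str] = ("uploads", "outputs")) -> List[Path]:
    """Create each working directory under base if it does not already exist."""
    created = []
    for name in names:
        path = base / name
        # exist_ok=True makes repeated startups safe; parents=True creates base too.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `ensure_dirs(Path("."))` at the top of `app.py` would be safe on every rerun, because existing directories are left untouched.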
Note: The vLLM functionality requires significant computational resources, so you may need to select a more powerful hardware configuration for your Space.