	---
	title: PDF to Markdown Converter
	emoji: 📄
	colorFrom: blue
	colorTo: green
	sdk: streamlit
	sdk_version: "1.29.0"
	app_file: app.py
	pinned: false
	---

	# PDF to Markdown Converter

	This application converts PDF documents to Markdown format. It uses the `docling` library for document conversion and provides a simple Streamlit interface.

	## Features

	- Upload PDF files directly
	- Convert PDFs from URLs
	- Batch process multiple images using vLLM
	- Download the resulting Markdown files
	- Clean, user-friendly interface

	## How to Use

	### PDF to Markdown
	1. Select the "PDF to Markdown" tab
	2. Upload a PDF file using the file uploader or enter a URL to a PDF document
	3. Click the "Convert to Markdown" button
	4. Once conversion is complete, download the Markdown file
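
The conversion step above could be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function names `markdown_name` and `pdf_to_markdown` are hypothetical, and it assumes docling 2.x, where `DocumentConverter.convert` accepts either a local path or a URL.

```python
from pathlib import Path
from urllib.parse import urlparse


def markdown_name(source: str) -> str:
    """Derive an output file name such as 'report.md' from a path or URL."""
    # urlparse leaves plain file paths in .path, so one code path covers both.
    stem = Path(urlparse(source).path).stem
    # Fall back to a generic name when the URL has no file component.
    return f"{stem or 'document'}.md"


def pdf_to_markdown(source: str) -> str:
    """Convert a local PDF path or a PDF URL to a Markdown string."""
    # Imported lazily so markdown_name works without docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return result.document.export_to_markdown()
```

The returned string, together with the name from `markdown_name`, is the kind of payload a Streamlit download button would serve in step 4.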

	### Batch Image Processing
	1. Select the "Batch Image Processing" tab
	2. Upload multiple image files (PNG, JPG, JPEG)
	3. Optionally customize the model path and prompt text
	4. Click the "Process Images" button
	5. Once processing is complete, download the ZIP file containing all results
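
As an illustration of step 5, the results could be packed into an in-memory ZIP along these lines. This is a sketch under assumptions: `build_results_zip` and the `describe` callable are hypothetical stand-ins (the callable takes the place of the vLLM model call), and writing one `.md` file per image is a guess at the output layout, not something the app documents.

```python
import io
import zipfile
from pathlib import Path
from typing import Callable, Dict


def build_results_zip(images: Dict[str, bytes], describe: Callable[[bytes], str]) -> bytes:
    """Run `describe` on every image and pack the text outputs into a ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in images.items():
            # Assumption: one Markdown file per image, named after the upload.
            archive.writestr(f"{Path(name).stem}.md", describe(data))
    # The archive is finalized when the with-block exits, so the bytes are complete.
    return buffer.getvalue()
```

Building the archive in memory avoids writing temporary files, and the resulting bytes can be passed directly as the `data` argument of Streamlit's `st.download_button`.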

	## Technical Details

	Built with:
	- Streamlit 1.29.0
	- Docling 2.7.0
	- docling_core
	- vLLM (for batch processing)
	- Python 3.12

	## Deployment

	This application is deployed on Hugging Face Spaces.

	To deploy this application:
	1. Create a new Space on Hugging Face (https://huggingface.co/spaces)
	2. Choose "Streamlit" as the SDK
	3. Upload the following files to the Space repository:
	   - app.py
	   - requirements.txt
	   - README.md
	   - runtime.txt
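
The steps above use the web interface. As an alternative, the same upload could be scripted with the `huggingface_hub` client, roughly as sketched below. The helper names and the `repo_id` value are hypothetical, and the sketch assumes `huggingface_hub` is installed and an access token is already configured.

```python
from pathlib import Path
from typing import List

# The four files listed in the deployment steps.
SPACE_FILES = ("app.py", "requirements.txt", "README.md", "runtime.txt")


def missing_files(folder: Path) -> List[str]:
    """Return the names from SPACE_FILES that are not present in `folder`."""
    return [name for name in SPACE_FILES if not (folder / name).is_file()]


def upload_space(folder: Path, repo_id: str) -> None:
    """Create the Space if needed and upload the four files to it."""
    missing = missing_files(folder)
    if missing:
        raise FileNotFoundError(f"Missing files: {', '.join(missing)}")

    # Imported lazily so missing_files works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id=repo_id, repo_type="space", space_sdk="streamlit", exist_ok=True)
    for name in SPACE_FILES:
        api.upload_file(
            path_or_fileobj=str(folder / name),
            path_in_repo=name,
            repo_id=repo_id,
            repo_type="space",
        )
```

A call such as `upload_space(Path("."), "your-username/pdf-to-markdown-converter")` would then push the files, with the repository ID replaced by your own.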

	The application will automatically create any necessary directories when it starts.
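
Startup directory creation of this kind is commonly done with `Path.mkdir`. The sketch below shows the pattern only: `ensure_directories` and the directory names `uploads` and `output` are hypothetical, since the README does not say which directories the app creates.

```python
from pathlib import Path
from typing import Iterable, List


def ensure_directories(base: Path, names: Iterable[str] = ("uploads", "output")) -> List[Path]:
    """Create each named directory under `base` if it does not exist yet."""
    created = []
    for name in names:
        path = base / name
        # exist_ok=True makes the call safe to repeat on every app start.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Because the call is idempotent, it can run at the top of the script on every Streamlit rerun without raising errors.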

	Note: The vLLM batch processing is resource-intensive and typically expects a GPU, so you may need to select a more powerful hardware configuration (for example, a GPU tier) in your Space settings.