---
title: Smart Document Parser
emoji: π»
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.13.0
app_file: app.py
pinned: false
---
# Smart Document Parser
A document parsing application that automatically extracts structured information (content, metadata, sections, and named entities) from PDF, DOCX, TXT, HTML, and Markdown files.
## Features
- **Multiple Format Support**: PDF, DOCX, TXT, HTML, and Markdown
- **Rich Information Extraction**:
  - Document content with preserved formatting
  - Comprehensive metadata
  - Section breakdown
  - Named entity recognition
- **Smart Processing**:
  - Automatic format detection
  - Confidence scoring
  - Error handling
## How to Use
1. **Upload Document**: Click the upload button or drag and drop your document
2. **Process**: Click "Process Document"
3. **View Results**: Explore the extracted information in the result tabs:
   - Content: Main document text
   - Metadata: Document properties
   - Sections: Document structure
   - Entities: Named entities
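The four result tabs suggest a simple shape for the parsed output. The following is a minimal sketch of that shape, not the app's actual code: `ParsedDocument` and `split_markdown_sections` are hypothetical names, and a standard-library dataclass stands in for the Pydantic models the app is built with so the example has no dependencies.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedDocument:
    """Hypothetical container with one field per result tab."""

    content: str
    metadata: dict[str, str] = field(default_factory=dict)
    sections: list[str] = field(default_factory=list)
    entities: list[str] = field(default_factory=list)


def split_markdown_sections(text: str) -> list[str]:
    """Collect heading titles from Markdown text as a rough section list."""
    return [
        line.lstrip("#").strip()
        for line in text.splitlines()
        if line.startswith("#")
    ]


doc_text = "# Title\nIntro\n## Details\nMore"
parsed = ParsedDocument(
    content=doc_text,
    metadata={"format": "markdown"},
    sections=split_markdown_sections(doc_text),
)
print(parsed.sections)
```

The heading scan is deliberately naive: it treats any line starting with `#` as a heading, so it would also pick up comment lines inside fenced code blocks.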
## Supported Formats
- PDF Documents (`*.pdf`)
- Word Documents (`*.docx`)
- Text Files (`*.txt`)
- HTML Files (`*.html`)
- Markdown Files (`*.md`)
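As an illustration of how the supported formats could be recognized, here is a minimal sketch of extension-based detection. It is an assumption about one possible approach, not the app's implementation: the `SUPPORTED_FORMATS` mapping and the `detect_format` helper are hypothetical names.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a format label.
# The labels are illustrative and not taken from the app's source.
SUPPORTED_FORMATS = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".txt": "text",
    ".html": "html",
    ".md": "markdown",
}


def detect_format(filename: str) -> str:
    """Return a format label for a supported file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return SUPPORTED_FORMATS[suffix]


print(detect_format("report.PDF"))
```

Lowercasing the suffix makes the check case-insensitive, so `report.PDF` and `report.pdf` are treated the same. Files with an unlisted or missing extension raise a `ValueError` that the caller can turn into a user-facing message.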
## Technical Details
Built with:
- Docling: Advanced document processing
- Gradio: Interactive web interface
- Pydantic: Type-safe data handling
- Hugging Face Spaces: Cloud deployment
## License
MIT License