---
title: Resume Screener and Skill Extractor
emoji: π
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.31.0
app_file: app.py
pinned: false
license: mit
---

# Resume Screener and Skill Extractor

A Hugging Face Space application for efficiently screening resumes against job descriptions using a hybrid ranking approach that combines semantic similarity with keyword-based scoring.

## Features

- **Hybrid Resume Ranking**: Combines semantic similarity (via NV-Embed-v2) with keyword-based BM25 scoring
- **Skill Extraction**: Automatically identifies relevant skills in resumes based on the job requirements
- **Fast Search**: Uses FAISS for efficient similarity search over large resume collections
- **Multi-format Support**: Processes PDF, DOCX, TXT, and CSV files
- **Explanation Generation**: Explains why each resume was ranked highly
- **Visualization**: Displays comparative scores and key matches for easy analysis
- **Batch Processing**: Supports uploading multiple resumes at once

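Skill extraction can be pictured as a vocabulary lookup applied to both the job description and the resume. The sketch below is one minimal way to do that, not the app's actual extractor: `SKILL_VOCABULARY`, `find_skills`, and `match_skills` are hypothetical names, and the vocabulary is a toy list.

```python
import re

# Hypothetical skill vocabulary; a real deployment would load a much larger list.
SKILL_VOCABULARY = ["python", "machine learning", "sql", "docker", "streamlit"]


def find_skills(text, vocabulary=SKILL_VOCABULARY):
    """Return the vocabulary skills that occur in `text` as whole words or phrases."""
    lowered = text.lower()
    found = []
    for skill in vocabulary:
        # Lookarounds keep "sql" from matching inside "mysql" and still allow
        # skills that end in symbols, such as "c++".
        pattern = r"(?<![a-z0-9])" + re.escape(skill) + r"(?![a-z0-9])"
        if re.search(pattern, lowered):
            found.append(skill)
    return found


def match_skills(job_description, resume_text, vocabulary=SKILL_VOCABULARY):
    """Skills named in the job description that also appear in the resume."""
    required = find_skills(job_description, vocabulary)
    present = set(find_skills(resume_text, vocabulary))
    return [skill for skill in required if skill in present]


job = "Looking for a Python developer with SQL and Docker experience."
resume = "Built ETL pipelines in Python; administered MySQL; deployed with Docker."
matched = match_skills(job, resume)
print(matched)
```

Matching on word boundaries rather than raw substrings is the main design choice here: it avoids crediting a resume with "SQL" just because it mentions "MySQL".
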
## How It Works

1. **Input**: Provide a job description and upload resumes (PDF, DOCX, TXT, or CSV format).
2. **Processing**: The system creates embeddings for the job description and for each resume using the NV-Embed-v2 model.
3. **Ranking**: The system calculates a hybrid score from:
   - Semantic similarity (cosine similarity between embeddings)
   - Keyword relevance (BM25 scoring)
4. **Results**: The system returns the top 10 most suitable resumes (the count is adjustable in the sidebar) with:
   - Overall score and individual component scores
   - Matched skills and key phrases
   - Explanations for why each resume was ranked highly

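The ranking step can be sketched in plain Python. This is an illustration under stated assumptions rather than the app's code: the function names, the 0.7 default weight, and the min-max normalization are assumptions, the three-dimensional vectors stand in for NV-Embed-v2 embeddings, and `bm25_scores` is a dependency-free stand-in for `rank_bm25` that uses a common Okapi BM25 formulation which may differ from that library in detail.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of every tokenized document against the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    doc_freq = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq[term]
            if tf == 0:
                continue
            # The "+ 1" inside the log keeps the IDF non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return scores


def min_max(values):
    """Rescale values to [0, 1] so the two score types are comparable."""
    lo, hi = min(values), max(values)
    return [0.0] * len(values) if hi == lo else [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(job_vec, resume_vecs, job_tokens, resume_tokens,
                semantic_weight=0.7, top_k=10):
    """Return (resume_index, hybrid_score) pairs, best first."""
    semantic = min_max([cosine_similarity(job_vec, v) for v in resume_vecs])
    keyword = min_max(bm25_scores(job_tokens, resume_tokens))
    hybrid = [semantic_weight * s + (1 - semantic_weight) * k
              for s, k in zip(semantic, keyword)]
    return sorted(enumerate(hybrid), key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy 3-dimensional vectors standing in for real embeddings.
job_vec = [1.0, 0.0, 1.0]
resume_vecs = [[0.9, 0.1, 0.8], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
job_tokens = "python machine learning".split()
resume_tokens = [
    "python machine learning engineer".split(),
    "graphic designer photoshop".split(),
    "python web developer".split(),
]
ranking = hybrid_rank(job_vec, resume_vecs, job_tokens, resume_tokens)
print(ranking)
```

Normalizing both score lists before blending matters because cosine similarity is bounded while BM25 scores are not; without it the keyword term could dominate regardless of the chosen weight.
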
## Technical Details

### Models Used

- **NV-Embed-v2**: NVIDIA's text embedding model, used to compute semantic similarity
- **QwQ-32B**: Used for generating explanations (simulated in the current version)

### Libraries

- **FAISS**: Facebook AI Similarity Search, used for fast vector similarity search
- **rank_bm25**: Implementation of the BM25 algorithm for keyword-based scoring
- **Streamlit**: For the user interface
- **Hugging Face Transformers**: For accessing and using the models

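To show what the vector search step computes, here is a brute-force sketch in plain Python. It is the exact search that an index such as FAISS's `IndexFlatIP` performs far more efficiently; the function names and sample vectors are hypothetical, and the app's own search code may be organized differently.

```python
import heapq
import math


def l2_normalize(vector):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def top_k_inner_product(query, vectors, k):
    """Exact search: score every stored vector and keep the k largest."""
    scored = ((sum(q * v for q, v in zip(query, vec)), idx)
              for idx, vec in enumerate(vectors))
    return [(idx, score) for score, idx in heapq.nlargest(k, scored)]


# Toy 2-dimensional vectors standing in for resume embeddings.
resume_vectors = [l2_normalize(v) for v in ([3.0, 4.0], [1.0, 0.0], [0.0, 2.0])]
query = l2_normalize([2.0, 0.0])
hits = top_k_inner_product(query, resume_vectors, k=2)
print(hits)
```

Because every vector is scored, the cost grows linearly with the number of resumes, which is why a dedicated index becomes worthwhile for large collections.
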
## Configuration Options

The sidebar provides several configuration options:

- **Model Selection**: Choose which embedding model to use
- **Ranking Weights**: Adjust the balance between semantic similarity and keyword matching
- **Results Count**: Set how many top results to display
- **FAISS Usage**: Toggle the use of FAISS for faster searching with large resume collections

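One way to carry these settings through the code is a small validated container. The sketch below is hypothetical: `ScreenerConfig`, its field names, and its defaults are assumptions for illustration, not values read from the app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenerConfig:
    """Hypothetical container for the sidebar settings."""
    embedding_model: str = "NV-Embed-v2"
    semantic_weight: float = 0.7  # assumed default, not taken from the app
    top_k: int = 10
    use_faiss: bool = True

    def __post_init__(self):
        # Reject values a slider or number input should never produce.
        if not 0.0 <= self.semantic_weight <= 1.0:
            raise ValueError("semantic_weight must be between 0 and 1")
        if self.top_k < 1:
            raise ValueError("top_k must be at least 1")

    @property
    def keyword_weight(self):
        """The BM25 share is whatever the semantic share leaves over."""
        return 1.0 - self.semantic_weight


config = ScreenerConfig(semantic_weight=0.6, top_k=5)
print(config.keyword_weight)
```

Deriving the keyword weight from the semantic weight means a single control can set the balance and the two weights always sum to 1.
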
## Getting Started

### Online Usage

1. Visit the Hugging Face Space at [URL]
2. Enter a job description
3. Upload resumes (PDF, DOCX, TXT, or CSV)
4. Click "Find Top Candidates"
5. Review the results

### Local Installation

```bash
git clone https://huggingface.co/spaces/[username]/Resume_Screener_and_Skill_Extractor
cd Resume_Screener_and_Skill_Extractor
pip install -r requirements.txt
streamlit run app.py
```

## Future Enhancements

- Integration with Hugging Face datasets for loading resumes directly
- Enhanced skill extraction using more sophisticated NLP techniques
- Real-time explanation generation using QwQ-32B
- Support for additional file formats and languages
- Customizable scoring algorithms and weights

## License

MIT License

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference