---
tags:
- spacy
- token-classification
- ner
language:
- en
license: mit
model-index:
- name: en_core_web_sm_job
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.7516398746
    - name: NER Recall
      type: recall
      value: 0.6069711538
    - name: NER F Score
      type: f_score
      value: 0.6742971968
  - task:
      name: TAG
      type: token-classification
    metrics:
    - name: TAG (XPOS) Accuracy
      type: accuracy
      value: 0.7334810915
library_name: spacy
pipeline_tag: token-classification
---
# Custom spaCy NER Model for "Profession," "Facility," and "Experience" Entities
### Overview
This spaCy-based Named Entity Recognition (NER) model has been custom-trained to recognize and classify entities related to "profession," "facility," and "experience." It is designed to enhance your text analysis capabilities by identifying these specific entity types in unstructured text data.
### Key Features
- Custom-trained to recognize "profession," "facility," and "experience" entities; see the Accuracy section for the reported scores.
- Suitable for tasks on professional information streams, such as information extraction and content categorization.
- Currently focused on the job-seeker domain, and can be integrated into existing spaCy-based NLP pipelines.
### Usage
#### Installation
##### You can download the custom spaCy NER model by cloning its repository with Git LFS:
```bash
git lfs install
git clone https://huggingface.co/LPDoctor/en_core_web_sm_job_related
```
#### Example Usage
Here's how you can use the model for entity recognition in Python:
```python
import spacy
# Load the custom spaCy NER model
nlp = spacy.load("en_core_web_sm_job")
# Process your text
text = "HR Specialist needed at Google, Dallas, TX, with expertise in employee relations and a minimum of 4 years of HR experience."
doc = nlp(text)
# Extract named entities
for ent in doc.ents:
    print(f"Entity: {ent.text}, Type: {ent.label_}")
```
#### Entity Types
The model recognizes the following entity types:
- PROFESSION: Represents professions or job titles.
- FACILITY: Denotes facilities, buildings, or locations.
- EXPERIENCE: Identifies mentions of work experience, durations, or qualifications.
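
Because the `ner` component also emits the standard OntoNotes labels (for example `GPE` or `ORG`), it can be useful to keep only the three custom types. The following is a minimal sketch, not part of the model itself: `group_custom_entities` and `FakeSpan` are hypothetical names, and the sample spans are hand-written stand-ins rather than real model output, so the snippet runs without the model installed. With the model loaded, `doc.ents` could be passed in place of `sample_ents`.

```python
from collections import defaultdict, namedtuple

# The three custom labels listed above; other labels are filtered out.
CUSTOM_LABELS = {"PROFESSION", "FACILITY", "EXPERIENCE"}


def group_custom_entities(ents, labels=CUSTOM_LABELS):
    """Group entity texts by label, keeping only the requested labels.

    `ents` is any iterable of objects exposing `.text` and `.label_`,
    such as `doc.ents` from a spaCy `Doc`.
    """
    grouped = defaultdict(list)
    for ent in ents:
        if ent.label_ in labels:
            grouped[ent.label_].append(ent.text)
    return dict(grouped)


# Hand-written stand-ins for spaCy spans (NOT real model predictions).
FakeSpan = namedtuple("FakeSpan", ["text", "label_"])
sample_ents = [
    FakeSpan("HR Specialist", "PROFESSION"),
    FakeSpan("Dallas", "GPE"),
    FakeSpan("4 years of HR experience", "EXPERIENCE"),
]

result = group_custom_entities(sample_ents)
print(result)
```

Returning a plain `dict` keyed by label keeps the result easy to serialize, for example to JSON.
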
| Feature | Description |
| --- | --- |
| **Name** | `en_core_web_sm_job` |
| **Version** | `3.7.0` |
| **spaCy** | `>=3.7.0,<3.8.0` |
| **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
| **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | [OntoNotes 5](https://catalog.ldc.upenn.edu/LDC2013T19) (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston)<br />[ClearNLP Constituent-to-Dependency Conversion](https://github.com/clir/clearnlp-guidelines/blob/master/md/components/dependency_conversion.md) (Emory University)<br />[WordNet 3.0](https://wordnet.princeton.edu/) (Princeton University) |
| **License** | `MIT` |
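
The **spaCy** row above limits the compatible range to `>=3.7.0,<3.8.0`. As a rough sketch of checking a version string against that range before loading the model, the helpers below use only the standard library. `parse_version` and `is_supported` are hypothetical names, not spaCy functions, and the parser assumes plain numeric versions such as `3.7.2` (no pre-release suffixes). In practice one would pass `spacy.__version__`.

```python
def parse_version(version):
    """Turn a string like '3.7.2' into (3, 7, 2), padding missing parts with 0."""
    parts = [int(part) for part in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_supported(version, lower=(3, 7, 0), upper=(3, 8, 0)):
    """Return True when lower <= version < upper (the range from the table)."""
    return lower <= parse_version(version) < upper


# Example with literal version strings; replace with spacy.__version__ in real use.
for candidate in ("3.6.1", "3.7.0", "3.7.4", "3.8.0"):
    print(candidate, is_supported(candidate))
```
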
### Label Scheme
<details>
<summary>View label scheme (116 labels for 3 components)</summary>
| Component | Labels |
| --- | --- |
| **`tagger`** | `$`, `''`, `,`, `-LRB-`, `-RRB-`, `.`, `:`, `ADD`, `AFX`, `CC`, `CD`, `DT`, `EX`, `FW`, `HYPH`, `IN`, `JJ`, `JJR`, `JJS`, `LS`, `MD`, `NFP`, `NN`, `NNP`, `NNPS`, `NNS`, `PDT`, `POS`, `PRP`, `PRP$`, `RB`, `RBR`, `RBS`, `RP`, `SYM`, `TO`, `UH`, `VB`, `VBD`, `VBG`, `VBN`, `VBP`, `VBZ`, `WDT`, `WP`, `WP$`, `WRB`, `XX`, `_SP`, ``` `` ``` |
| **`parser`** | `ROOT`, `acl`, `acomp`, `advcl`, `advmod`, `agent`, `amod`, `appos`, `attr`, `aux`, `auxpass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `csubj`, `csubjpass`, `dative`, `dep`, `det`, `dobj`, `expl`, `intj`, `mark`, `meta`, `neg`, `nmod`, `npadvmod`, `nsubj`, `nsubjpass`, `nummod`, `oprd`, `parataxis`, `pcomp`, `pobj`, `poss`, `preconj`, `predet`, `prep`, `prt`, `punct`, `quantmod`, `relcl`, `xcomp` |
| **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `EXPERIENCE`, `FAC`, `FACILITY`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `PROFESSION`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
</details>
### Accuracy
| Type | Score |
| --- | --- |
| `TOKEN_P` | 78.59 |
| `TOKEN_R` | 63.58 |
| `TOKEN_F` | 70.57 |
| `CUSTOM_TAG_ACC` | 71.98 |