ashen007 committed on
Commit 5d1fba3 · verified · 1 Parent(s): 3195dd3

Update README.md

Files changed (1)
  1. README.md +59 -1
README.md CHANGED
@@ -8,4 +8,62 @@ metrics:
  base_model:
  - Ultralytics/YOLOv8
  pipeline_tag: object-detection
- ---
+ ---
+
+ # YOLO Document Layout Model
+
+ This model is a fine-tuned YOLOv8 detector for document layout analysis. It identifies document elements such as text columns, figures, tables, and typographical features.
+
+ ## Model Description
+
+ The model is trained to detect and classify 20 document components, including text structures (TextColumn, List), semantic elements (Title, Header), typographical features (Bold, Italic), and visual components (Figure, Table).
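
A minimal inference sketch, assuming the `ultralytics` package is installed and the fine-tuned weights are available as a local file. The weights path, image path, and both helper names (`detect_layout`, `count_by_label`) are hypothetical and do not come from this repository; the confidence threshold of 0.25 is only an illustrative default.

```python
from collections import Counter


def detect_layout(weights_path, image_path, conf=0.25):
    """Run the detector on one page image and return a list of detections.

    weights_path and image_path are placeholders supplied by the caller.
    """
    # Imported lazily so count_by_label can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights_path)
    result = model.predict(image_path, conf=conf)[0]  # one Results object per image
    detections = []
    for box in result.boxes:
        detections.append({
            "label": result.names[int(box.cls[0])],  # class index -> class name
            "confidence": float(box.conf[0]),
            "xyxy": box.xyxy[0].tolist(),            # [x1, y1, x2, y2] in pixels
        })
    return detections


def count_by_label(detections, min_confidence=0.0):
    """Count detections per class label, ignoring those below min_confidence."""
    return Counter(
        d["label"] for d in detections if d["confidence"] >= min_confidence
    )
```

With weights in hand, a call such as `count_by_label(detect_layout("best.pt", "page.png"), min_confidence=0.5)` would summarize how many elements of each class were found on a page (both file names are placeholders).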
+
+ ## Model Detections
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/661a149cd7c07238c2b3ddc2/RUOv2iWaY1sJCQcvQl2Ik.png)
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/661a149cd7c07238c2b3ddc2/07I3lmV59UljZfo7ItlWW.png)
+
+ ## Training
+
+ The model was fine-tuned on a proprietary dataset of document images.
+
+ ## Evaluation Results
+
+ The model was evaluated on a test set of 150 images containing 1,255 labeled instances, with the following results:
+
+ | Class | Images | Instances | Precision | Recall | mAP50 | mAP50-95 |
+ |-------|--------|-----------|-----------|--------|-------|----------|
+ | **all** | **150** | **1255** | **0.701** | **0.723** | **0.735** | **0.509** |
+ | Author | 7 | 65 | 0.693 | 0.174 | 0.307 | 0.134 |
+ | Bigletter | 11 | 11 | 1.000 | 0.900 | 0.976 | 0.563 |
+ | Bleeding | 9 | 10 | 0.618 | 0.700 | 0.667 | 0.547 |
+ | Bold | 23 | 77 | 0.679 | 0.753 | 0.798 | 0.395 |
+ | Caption | 50 | 71 | 0.892 | 0.816 | 0.881 | 0.642 |
+ | Date | 17 | 57 | 0.927 | 0.666 | 0.728 | 0.386 |
+ | Figure | 90 | 149 | 0.772 | 0.725 | 0.823 | 0.677 |
+ | Footnote | 14 | 15 | 0.500 | 0.667 | 0.612 | 0.478 |
+ | Header | 16 | 16 | 0.560 | 0.717 | 0.664 | 0.476 |
+ | Italic | 17 | 86 | 0.448 | 0.791 | 0.557 | 0.327 |
+ | List | 34 | 55 | 0.615 | 0.709 | 0.742 | 0.591 |
+ | Map | 4 | 4 | 0.606 | 0.750 | 0.656 | 0.599 |
+ | SubSubTitle | 37 | 97 | 0.627 | 0.520 | 0.599 | 0.300 |
+ | SubTitle | 54 | 96 | 0.605 | 0.562 | 0.605 | 0.327 |
+ | Table | 30 | 43 | 0.865 | 0.953 | 0.966 | 0.855 |
+ | TextColumn | 115 | 323 | 0.831 | 0.913 | 0.933 | 0.811 |
+ | Title | 47 | 66 | 0.712 | 0.711 | 0.649 | 0.441 |
+ | Underline | 2 | 4 | 0.681 | 1.000 | 0.995 | 0.665 |
+ | equations | 4 | 10 | 0.688 | 0.700 | 0.809 | 0.450 |
+
+ ### Key Performance Highlights
+
+ - **Best-performing classes with substantial support**: Table (mAP50: 0.966), TextColumn (mAP50: 0.933), and Caption (mAP50: 0.881). Underline (0.995) and Bigletter (0.976) score higher but have only 4 and 11 test instances.
+ - **High-precision classes**: Bigletter (1.000), Date (0.927), and Caption (0.892)
+ - **High-recall classes**: Underline (1.000), Table (0.953), and TextColumn (0.913)
+ - **Overall performance**: mAP50 of 0.735 and mAP50-95 of 0.509, averaged over the classes in the table
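
The rankings and the **all** row can be rechecked directly from the per-class rows of the evaluation table. The sketch below copies those values and shows that the **all** row agrees, to within rounding of the published three-decimal figures, with an unweighted mean over the 19 listed classes. The helper names and the `min_instances=20` cutoff are illustrative assumptions, not part of the original evaluation.

```python
# Per-class test metrics copied from the evaluation table:
# class -> (instances, precision, recall, mAP50, mAP50-95)
METRICS = {
    "Author": (65, 0.693, 0.174, 0.307, 0.134),
    "Bigletter": (11, 1.000, 0.900, 0.976, 0.563),
    "Bleeding": (10, 0.618, 0.700, 0.667, 0.547),
    "Bold": (77, 0.679, 0.753, 0.798, 0.395),
    "Caption": (71, 0.892, 0.816, 0.881, 0.642),
    "Date": (57, 0.927, 0.666, 0.728, 0.386),
    "Figure": (149, 0.772, 0.725, 0.823, 0.677),
    "Footnote": (15, 0.500, 0.667, 0.612, 0.478),
    "Header": (16, 0.560, 0.717, 0.664, 0.476),
    "Italic": (86, 0.448, 0.791, 0.557, 0.327),
    "List": (55, 0.615, 0.709, 0.742, 0.591),
    "Map": (4, 0.606, 0.750, 0.656, 0.599),
    "SubSubTitle": (97, 0.627, 0.520, 0.599, 0.300),
    "SubTitle": (96, 0.605, 0.562, 0.605, 0.327),
    "Table": (43, 0.865, 0.953, 0.966, 0.855),
    "TextColumn": (323, 0.831, 0.913, 0.933, 0.811),
    "Title": (66, 0.712, 0.711, 0.649, 0.441),
    "Underline": (4, 0.681, 1.000, 0.995, 0.665),
    "equations": (10, 0.688, 0.700, 0.809, 0.450),
}

# Column indices into each tuple above.
INSTANCES, PRECISION, RECALL, MAP50, MAP50_95 = 0, 1, 2, 3, 4


def macro_average(metrics, column):
    """Unweighted mean of one metric column over all classes."""
    values = [row[column] for row in metrics.values()]
    return sum(values) / len(values)


def top_classes(metrics, column, k=3, min_instances=0):
    """Names of the k classes with the highest value in `column`.

    min_instances is an illustrative filter for excluding rarely seen classes.
    """
    eligible = {n: r for n, r in metrics.items() if r[INSTANCES] >= min_instances}
    return sorted(eligible, key=lambda n: eligible[n][column], reverse=True)[:k]


print(round(macro_average(METRICS, MAP50), 3))
print(top_classes(METRICS, MAP50))                    # ['Underline', 'Bigletter', 'Table']
print(top_classes(METRICS, MAP50, min_instances=20))  # ['Table', 'TextColumn', 'Caption']
```

Because the average is unweighted, classes with very few test instances count as much toward the overall score as TextColumn with its 323 instances.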
+
+ ## Limitations
+
+ - Low performance on Author detection (mAP50: 0.307, recall: 0.174)
+ - Moderate performance on some typographical features, such as Italic (mAP50: 0.557, precision: 0.448)
+ - Small test sample for some classes (Map: 4 instances, Underline: 4, equations: 10), so their metrics are less reliable
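
One way to read these limitations is through the F1 score, the harmonic mean of precision and recall. F1 is not reported in the evaluation table; it is computed here from the table's precision and recall values purely for illustration. The `LOW_SUPPORT` cutoff of 10 instances is likewise an arbitrary assumption used to flag classes whose metrics rest on very few examples.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative cutoff: classes with this many test instances or fewer are flagged.
LOW_SUPPORT = 10

# class -> (instances, precision, recall), copied from the evaluation table
ROWS = {
    "Author": (65, 0.693, 0.174),
    "Italic": (86, 0.448, 0.791),
    "Map": (4, 0.606, 0.750),
    "Underline": (4, 0.681, 1.000),
    "equations": (10, 0.688, 0.700),
}

for name, (instances, precision, recall) in ROWS.items():
    note = "low support" if instances <= LOW_SUPPORT else ""
    print(f"{name:10s} F1={f1_score(precision, recall):.3f} "
          f"instances={instances:3d} {note}")
```

Author and Italic fail in opposite directions: Author detections are fairly precise but most instances are missed, while Italic is found often but with many false positives.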