.. _algorithm_ocr:
=============================================
OCR (Optical Character Recognition) Algorithm
=============================================
Introduction
====================
OCR (Optical Character Recognition) locates every text block in an image and recognizes its content.
Model Usage
====================
With the environment properly set up, run the OCR algorithm by executing ``scripts/ocr.py``:
.. code:: shell

   $ python scripts/ocr.py --config configs/ocr.yaml

Model Configuration
--------------------
.. code:: yaml

   inputs: assets/demo/ocr
   outputs: outputs/ocr
   visualize: True
   tasks:
     ocr:
       model: ocr_ppocr
       model_config:
         lang: ch
         show_log: True
         det_model_dir: models/OCR/PaddleOCR/det/ch_PP-OCRv4_det
         rec_model_dir: models/OCR/PaddleOCR/rec/ch_PP-OCRv4_rec
         det_db_box_thresh: 0.3

- inputs/outputs: Define the input path and the output path, respectively.
- visualize: Whether to visualize the model results. Visualized results are saved in the outputs directory.
- tasks: Define the task type. Currently only an OCR task is included.
- model: Define the specific model type. Currently only the PaddleOCR model is available.
- model_config: Define the model configuration.
- lang: Define the language. The default, ``ch``, supports both English and Chinese.
- show_log: Whether to print running logs.
- det_model_dir: Define the path of PaddleOCR's detection model. If the specified path does not exist, the model weights are downloaded to that path automatically.
- rec_model_dir: Define the path of PaddleOCR's recognition model. If the specified path does not exist, the model weights are downloaded to that path automatically.
- det_db_box_thresh: Confidence threshold for detected boxes. Bounding boxes whose confidence is lower than the threshold are discarded.
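The filtering rule behind ``det_db_box_thresh`` can be sketched in a few lines of Python. The ``filter_boxes`` helper and the ``(box, score)`` layout used here are hypothetical illustrations, not part of PaddleOCR or PDF-Extract-Kit; the sketch only shows that boxes scoring below the threshold are dropped.

.. code:: python

   from typing import List, Tuple

   # A detected text box: four (x, y) corner points (illustrative layout).
   Box = List[Tuple[int, int]]


   def filter_boxes(
       detections: List[Tuple[Box, float]], box_thresh: float = 0.3
   ) -> List[Tuple[Box, float]]:
       """Hypothetical helper: discard boxes whose confidence is below box_thresh."""
       return [(box, score) for box, score in detections if score >= box_thresh]


   # Made-up detections, used only to demonstrate the rule.
   detections = [
       ([(0, 0), (10, 0), (10, 5), (0, 5)], 0.92),
       ([(0, 8), (10, 8), (10, 12), (0, 12)], 0.12),
       ([(0, 15), (10, 15), (10, 20), (0, 20)], 0.30),
   ]
   kept = filter_boxes(detections, box_thresh=0.3)
   print(len(kept))  # 2: the 0.12 box is discarded

Raising the threshold typically removes more low-confidence boxes at the risk of missing faint text, while lowering it keeps more candidates.
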
Diverse Input Support
---------------------
The OCR script in PDF-Extract-Kit accepts two kinds of input: ``a single image/PDF file`` or ``a directory of image/PDF files``.
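As a rough illustration of how a script could handle both kinds of input, the sketch below resolves a path into a list of files. The ``collect_inputs`` helper and the ``SUPPORTED_SUFFIXES`` set are assumptions made for this example; the actual logic and accepted extensions in ``scripts/ocr.py`` may differ.

.. code:: python

   from pathlib import Path
   from typing import List

   # Assumed set of accepted extensions; the real script may accept others.
   SUPPORTED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg"}


   def collect_inputs(path: str) -> List[Path]:
       """Hypothetical helper: expand a file or directory into supported input files."""
       p = Path(path)
       if p.is_file():
           # A single image or PDF: return it only if the extension is supported.
           return [p] if p.suffix.lower() in SUPPORTED_SUFFIXES else []
       if p.is_dir():
           # A directory: pick up supported files directly inside it (non-recursive).
           return sorted(
               f for f in p.iterdir()
               if f.is_file() and f.suffix.lower() in SUPPORTED_SUFFIXES
           )
       raise FileNotFoundError(f"Input path does not exist: {path}")

With this helper, ``collect_inputs("assets/demo/ocr")`` would return the supported files in the demo directory, and passing a single file path would return a one-element list.
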
Viewing Visualization Results
-----------------------------
When the ``visualize`` option in the config file is set to ``True``, visualization results will be saved in the ``outputs`` directory.
.. note::

   Visualization facilitates the analysis of model results. However, for large-scale tasks, it is recommended to disable visualization (set ``visualize`` to ``False``) to reduce memory and disk usage.