.. _algorithm_ocr: ========================== OCR (Optical Character Recognition) Algorithm ========================== Introduction ==================== OCR(Optical Character Recognition) involves identifying the positions ajnd contents of all text blocks in pictures. Model Usage ==================== With the environment properly set up, simply run the ocr algorithm script by executing ``scripts/ocr.py`` . .. code:: shell $ python scripts/ocr.py --config configs/ocr.yaml Model Configuration -------------------- .. code:: yaml inputs: assets/demo/ocr outputs: outputs/ocr visualize: True tasks: ocr: model: ocr_ppocr model_config: lang: ch show_log: True det_model_dir: models/OCR/PaddleOCR/det/ch_PP-OCRv4_det rec_model_dir: models/OCR/PaddleOCR/rec/ch_PP-OCRv4_rec det_db_box_thresh: 0.3 - inputs/outputs: Define the input path and the output path, respectively. - visualize: Whether to visualize the model results. Visualized results will be saved in the outputs directory. - tasks: Define the task type, currently only a OCR task is included. - model: Define the specific model type, currently, only the PaddleOCR model is available. - model_config: Define the model configuration. - lang: Define the language, default language ch supports both english and chinese. - show_log: Whether to print running logs. - det_model_dir: Define the path of PaddleOCR' detection model, If the specified path does not exist, the model weight will be automatically downloaded to the path. - rec_model_dir: Define the path of PaddleOCR' recognize model, If the specified path does not exist, the model weight will be automatically downloaded to the path. - det_db_box_thresh: Confidence filter threshold, bounding boxes whose confidence is lower than the threshold are discarded. Diverse Input Support -------------------- The OCR script in PDF-Extract-Kit supports various input formats such as ``a single image/PDF``, ``a directory of image/PDF files``. Viewing Visualization Results -------------------- When the ``visualize`` option in the config file is set to ``True``, visualization results will be saved in the ``outputs`` directory. .. note:: Visualization facilitates the analysis of model results. However, for large-scale tasks, it is recommended to disable visualization (set ``visualize`` to ``False`` ) to reduce memory and disk usage.