.. _algorithm_table_recognition:

===========================
Table Recognition Algorithm
===========================

Introduction
=================

Table recognition takes a table image as input, identifies the table's structure and content, and converts the result into a format such as ``LaTeX`` or ``HTML``.

Model Usage
=================

With the environment properly configured, you can run the table recognition algorithm by directly executing ``scripts/table_parsing.py``.

.. code:: shell

   $ python scripts/table_parsing.py --config configs/table_parsing.yaml

Model Configuration
-------------------

.. code:: yaml

   inputs: assets/demo/table_parsing
   outputs: outputs/table_parsing
   tasks:
     table_parsing:
       model: table_parsing_struct_eqtable
       model_config:
         model_path: models/TabRec/StructEqTable
         max_new_tokens: 1024
         max_time: 30
         output_format: latex
         lmdeploy: False
         flash_attn: True

- inputs/outputs: Define the input file path and the directory for table recognition results, respectively
- tasks: Define the task type; currently only the table recognition task is included

  - model: Define the specific model type; currently the `StructEqTable <https://github.com/UniModal4Reasoning/StructEqTable-Deploy>`_ table recognition model
  - model_config: Define the model configuration

    - model_path: Path to the model weights
    - max_new_tokens: Maximum number of tokens to generate; the default is 1024 and the maximum supported is 4096
    - max_time: Maximum runtime for the model, in seconds
    - output_format: Output format; the default is ``latex``, and ``html`` and ``markdown`` are also available
    - lmdeploy: Whether to use LMDeploy for deployment; currently set to ``False``
    - flash_attn: Whether to use flash attention; only available on Ampere GPUs

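
The constraints in the option list can be checked before a long run is started. The following is a minimal sketch, not part of PDF-Extract-Kit: ``check_model_config`` and the constant names are hypothetical, and the dictionary is assumed to be what a YAML loader would return for the ``model_config`` block.

```python
# Hypothetical pre-flight check; the names here are illustrative and are
# not part of PDF-Extract-Kit.
ALLOWED_OUTPUT_FORMATS = ("latex", "html", "markdown")
MAX_SUPPORTED_NEW_TOKENS = 4096  # upper bound stated in the option list

# Mirrors the model_config block of the YAML configuration.
EXAMPLE_MODEL_CONFIG = {
    "model_path": "models/TabRec/StructEqTable",
    "max_new_tokens": 1024,
    "max_time": 30,
    "output_format": "latex",
    "lmdeploy": False,
    "flash_attn": True,
}


def check_model_config(model_config):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    tokens = model_config.get("max_new_tokens", 1024)
    tokens_is_int = isinstance(tokens, int) and not isinstance(tokens, bool)
    if not tokens_is_int or not 1 <= tokens <= MAX_SUPPORTED_NEW_TOKENS:
        problems.append(
            f"max_new_tokens must be an integer in 1..{MAX_SUPPORTED_NEW_TOKENS}, got {tokens!r}"
        )

    max_time = model_config.get("max_time", 30)
    time_is_number = isinstance(max_time, (int, float)) and not isinstance(max_time, bool)
    if not time_is_number or max_time <= 0:
        problems.append(f"max_time must be a positive number of seconds, got {max_time!r}")

    output_format = model_config.get("output_format", "latex")
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        problems.append(
            f"output_format must be one of {ALLOWED_OUTPUT_FORMATS}, got {output_format!r}"
        )

    for flag in ("lmdeploy", "flash_attn"):
        if not isinstance(model_config.get(flag, False), bool):
            problems.append(f"{flag} must be true or false")

    return problems


if __name__ == "__main__":
    for problem in check_model_config(EXAMPLE_MODEL_CONFIG):
        print("config problem:", problem)
```

Returning a list rather than raising on the first failure lets all misconfigured options be reported in one pass.
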
Diverse Input Support
---------------------

The table recognition script in PDF-Extract-Kit supports ``single table images`` and ``multiple table images`` as input.

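
One way to treat both input forms uniformly is to resolve ``inputs`` into a list of image paths first. This is a rough sketch under stated assumptions, not the script's actual loading code: ``collect_table_images`` is a hypothetical helper, and the set of accepted file extensions is assumed.

```python
from pathlib import Path

# Assumed set of image extensions; the real script may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def collect_table_images(inputs):
    """Resolve `inputs` (a single image file or a directory) to a list of paths."""
    path = Path(inputs)
    if path.is_file():
        # Single table image: wrap it in a list so callers always iterate.
        return [path]
    if path.is_dir():
        # Multiple table images: keep only files with an image extension,
        # sorted so that results are produced in a stable order.
        return sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    raise FileNotFoundError(f"Input path does not exist: {path}")
```

For example, ``collect_table_images("assets/demo/table_parsing")`` would return every matching image in the demo directory, while passing the path of one image returns a one-element list.
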
.. note::

   The StructEqTable model only supports running on GPU devices.

.. note::

   Adjust ``max_new_tokens`` and ``max_time`` according to the table content; the defaults are 1024 and 30, respectively.

.. note::

   ``lmdeploy`` is an option for accelerated inference. If set to ``True``, LMDeploy is used for accelerated inference deployment.
   To use it, you need to install LMDeploy first. For installation instructions, refer to `LMDeploy <https://github.com/InternLM/lmdeploy>`_.

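
Before setting ``lmdeploy: True``, it can help to confirm that the package is importable in the current environment. The sketch below uses only the standard library; ``package_available`` and ``resolve_lmdeploy_flag`` are hypothetical helpers, and falling back to ``False`` when the package is missing is an assumption of this example, not documented behavior of the script.

```python
from importlib.util import find_spec


def package_available(name):
    """Return True if a top-level package called `name` can be imported."""
    return find_spec(name) is not None


def resolve_lmdeploy_flag(requested):
    """Honor the requested `lmdeploy` setting only if LMDeploy is installed."""
    if requested and not package_available("lmdeploy"):
        # Assumed fallback: warn and continue without accelerated inference.
        print("LMDeploy is not installed; continuing with lmdeploy disabled.")
        return False
    return bool(requested)
```
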