---
title: RTMO Checkpoint Tester
emoji: π
colorFrom: pink
colorTo: green
sdk: gradio
sdk_version: 5.27.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: RTMO PyTorch Checkpoint Tester
---
# RTMO PyTorch Checkpoint Tester
This HuggingFace Space provides a real-time 2D multi-person pose estimation demo using the RTMO model from OpenMMLab, accelerated with ZeroGPU. It supports both image and video inputs.
## Features
- Remote Checkpoint Selection: Choose from multiple pre-trained variants (COCO, BODY7, CrowdPose, retrainable RTMO-s) via a dropdown.
- Custom Checkpoint Upload: Upload your own `.pth` file; the application auto-detects RTMO-t/s/m/l variants.
- Image Input: Upload images for single-frame pose estimation.
- Video Input: Upload video files (e.g., `.mp4`, `.mov`, `.avi`, `.mkv`, `.webm`) to perform pose estimation on video sequences and view annotated outputs.
- Threshold Adjustment: Fine-tune the Bounding Box Threshold and NMS Threshold sliders to refine detections.
- Example Images: Three license-free images with people are included for quick testing via the Examples panel.
- ZeroGPU Acceleration: Utilizes the `@spaces.GPU()` decorator for GPU inference on HuggingFace Spaces.
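The two thresholds work the way they usually do in detection pipelines: the Bounding Box Threshold drops low-confidence detections, and the NMS Threshold is the IoU cutoff above which an overlapping, lower-scoring box is suppressed. A minimal pure-Python sketch of that interaction (illustrative only, not the app's actual code, which relies on MMPose internals):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, bbox_thr=0.3, nms_thr=0.5):
    """Return indices of kept boxes after score filtering + greedy NMS."""
    # bbox_thr: discard detections below this confidence.
    candidates = [i for i, s in enumerate(scores) if s >= bbox_thr]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # nms_thr: suppress boxes overlapping a kept box too strongly.
        if all(iou(boxes[i], boxes[j]) <= nms_thr for j in kept):
            kept.append(i)
    return kept
```

Raising `nms_thr` keeps more overlapping boxes (useful in crowded scenes); raising `bbox_thr` trades recall for fewer false positives.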
## Usage
- Upload Image: Drag-and-drop or select an image in the Upload Image component (or choose from Examples).
- Upload Video: Drag-and-drop or select a video file in the Upload Video component.
- Select Remote Checkpoint: Pick a preloaded variant from the dropdown menu.
- (Optional) Upload Your Own Checkpoint: Provide a `.pth` file to override the remote selection; the model variant is detected automatically.
- Adjust Thresholds: Set the Bounding Box Threshold (`bbox_thr`) and NMS Threshold (`nms_thr`) to control confidence and suppression behavior.
- Run Inference: Click Run Inference.
- View Results:
  - For images, the annotated image will appear in the Annotated Image panel.
  - For videos, the annotated video will appear in the Annotated Video panel. The active checkpoint name will appear below.
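Whether an upload is routed to the image or the video path typically comes down to its file extension. A small sketch of that dispatch, using the video formats listed above (the function name is hypothetical, not taken from app.py):

```python
from pathlib import Path

# Video formats the Space accepts; anything else is treated as an image.
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}

def input_kind(path):
    """Classify an uploaded file as 'video' or 'image' by its extension."""
    return "video" if Path(path).suffix.lower() in VIDEO_EXTS else "image"
```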
## Remote Checkpoints
The following variants are available out of the box:
- `rtmo-s_8xb32-600e_coco`
- `rtmo-m_16xb16-600e_coco`
- `rtmo-l_16xb16-600e_coco`
- `rtmo-t_8xb32-600e_body7`
- `rtmo-s_8xb32-600e_body7`
- `rtmo-m_16xb16-600e_body7`
- `rtmo-l_16xb16-600e_body7`
- `rtmo-s_8xb32-700e_crowdpose`
- `rtmo-m_16xb16-700e_crowdpose`
- `rtmo-l_16xb16-700e_crowdpose`
- `rtmo-s_coco_retrainable` (from Hugging Face)
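Remote checkpoints are cached locally as `/tmp/{key}.pth` and fetched only on first use. A sketch of that download-on-demand pattern (the URL map and helper name are hypothetical; the real Space resolves URLs from the OpenMMLab model zoo and Hugging Face):

```python
import os
import urllib.request

# Hypothetical URL map -- illustrative entry, not a real download link.
CHECKPOINT_URLS = {
    "rtmo-s_8xb32-600e_coco": "https://example.com/rtmo-s_coco.pth",
}

def checkpoint_path(key, fetch=urllib.request.urlretrieve):
    """Return the local path for a checkpoint, downloading it on first use."""
    path = f"/tmp/{key}.pth"
    if not os.path.exists(path):
        # Cache miss: fetch the file once; later calls reuse the cached copy.
        fetch(CHECKPOINT_URLS[key], path)
    return path
```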
## Implementation Details
- GPU Decorator: `@spaces.GPU()` marks the `predict` function for GPU execution under ZeroGPU.
- Inference API: Leverages `MMPoseInferencer` from MMPose with `pose2d`, `pose2d_weights`, and category `[0]` for person detection.
- Monkey-Patch: Applies a regex patch to bypass `mmdet`'s MMCV version assertion for compatibility.
- Variant Detection: Inspects the `backbone.stem.conv.conv.weight` channels in the checkpoint to select the correct RTMO variant.
- Checkpoint Management: Remote files are downloaded to `/tmp/{key}.pth` on demand; uploads use the provided local path.
- Image & Video Support: The `predict` function automatically handles both image and video inputs.
- Output: Saves visualization images or videos to `/tmp/vis` and displays them in the Annotated Image / Annotated Video panels.
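The variant-detection idea above follows from conv weight layout: a PyTorch conv weight has shape `(out_channels, in_channels, kH, kW)`, and each RTMO size (t/s/m/l) uses a different stem width, so the first dimension of the stem weight identifies the variant. A sketch under that assumption (the channel counts in the map below are placeholders, not the real RTMO values):

```python
def detect_variant(state_dict, channel_map):
    """Pick an RTMO variant from the stem conv's output-channel count.

    channel_map maps out-channel counts to variant names; the counts
    used by the caller must match the actual RTMO configs.
    """
    weight = state_dict["backbone.stem.conv.conv.weight"]
    out_channels = weight.shape[0]  # conv weight layout: (out, in, kH, kW)
    return channel_map.get(out_channels)
```

For example, a checkpoint whose stem conv has 32 output channels would map to whichever variant the config assigns that width.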
## Files

- `app.py`: Main Gradio application script.
- `requirements.txt`: Python dependencies, including MMCV and MMPose.
- `README.md`: This documentation file.