# Video Analysis Project: Fight and Object Detection
## 1. Overview
This project analyzes video files to perform two main tasks:
- Fight Detection: Classifies video segments as containing a "FIGHT" or being "NORMAL" using a custom-trained 3D Convolutional Neural Network (CNN).
- Object Detection: Identifies and tracks specific objects within the video using a pre-trained YOLOv8 model.
The system processes an input video and outputs the fight classification result along with an annotated version of the video highlighting detected objects.
## 2. Features
- Dual analysis: Combines action recognition (fight detection) and object detection.
- Custom-trained model for fight detection tailored to specific data.
- Utilizes state-of-the-art YOLOv8 for object detection.
- Generates an annotated output video showing detected objects and their tracks.
- Provides confidence scores for fight detection results.
- Includes scripts for both inference (`full_project.py`) and training (`trainig.py`) of the fight detection model.
## 3. Project Structure
```
ComV/
├── [Project Directory]/        # e.g., AI_made
│   ├── full_project.py         # Main script for running inference
│   ├── Fight_detec_func.py     # Fight detection logic and model loading
│   ├── objec_detect_yolo.py    # Object detection logic using YOLOv8
│   ├── frame_slicer.py         # Utility for extracting frames for fight detection
│   ├── trainig.py              # Script for training the fight detection model
│   ├── README.md               # This documentation file
│   └── trainnig_output/        # Directory for training artifacts
│       ├── final_model_2.h5    # Trained fight detection model (relative path)
│       ├── checkpoint/         # Checkpoints saved during training (relative path)
│       └── training_log.csv    # Log file for training history (relative path)
├── yolo/                       # (Assumed location)
│   └── yolo/
│       └── best.pt             # Pre-trained YOLOv8 model weights (relative path)
├── train/
│   ├── Fighting/               # Fight video examples (relative path)
│   └── Normal/                 # Normal video examples (relative path)
└── try/
    ├── result/                 # Directory where output videos are saved (relative path)
    └── ...                     # Location for input videos (example)
```
(Note: Model paths and data directories might be hardcoded in the scripts. Consider making these configurable or using relative paths.)
## 4. Setup and Installation
**Python Version:**
- This project was developed and tested using Python 3.10.

**Dependencies:**
Based on the code imports and `pip freeze` output, the following libraries and versions were used:
- `opencv-python==4.11.0.86` (cv2)
- `numpy==1.26.4`
- `tensorflow==2.19.0` (tf)
- `ultralytics==8.3.108` (for YOLOv8)
- `matplotlib==3.10.1` (for debug visualizations)
- `scikit-learn==1.6.1` (sklearn; used in `trainig.py`)
(Note: Other versions might also work, but these are the ones confirmed in the development environment.)
**Installation (using pip):**
```bash
pip install opencv-python numpy tensorflow ultralytics matplotlib scikit-learn
```
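If you want to reproduce the development environment exactly, the confirmed versions listed above can be pinned at install time (a sketch; newer versions may also work):

```bash
pip install opencv-python==4.11.0.86 numpy==1.26.4 tensorflow==2.19.0 ultralytics==8.3.108 matplotlib==3.10.1 scikit-learn==1.6.1
```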
**Models:**
- Fight Detection Model: ensure the trained model (`final_model_2.h5`) is present in the `trainnig_output` subdirectory relative to the script location.
- YOLOv8 Model: ensure the YOLO model (`best.pt`) is present in the `yolo/yolo` subdirectory relative to the script location.
(Note: Absolute paths might be hardcoded in the scripts and may need adjustment depending on the deployment environment.)
## 5. Usage
To run the analysis on a video file:
- Navigate to the `d:/K_REPO/ComV/AI_made/` directory in your terminal (or ensure Python's working directory is `d:/K_REPO`).
- Run the main script: `python full_project.py`
- The script will prompt you to enter the path to the video file: `Enter the local path : <your_video_path.mp4>` (Ensure you provide the full path, removing any extra quotes if copying from Windows Explorer.)
**Output:**
- The console will print the fight detection result (e.g., "FIGHT (85.3% confidence)") and information about the object detection process.
- An annotated video file will be saved in the `D:\K_REPO\ComV\try\result` directory. The filename will include the original video name and the unique detected object labels (e.g., `input_video_label1_label2_output.mp4`).
- If debug mode is enabled in `Fight_detec_func.py`, additional debug images may be saved in the result directory.
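An illustrative run (the prompt string and the result format are taken from this README; the path and bracketed messages are placeholders, not actual program output):

```text
> python full_project.py
Enter the local path : D:\videos\sample.mp4
FIGHT (85.3% confidence)
[... object detection and tracking progress, annotated video saved to the result directory ...]
```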
## 6. Module Descriptions
- `full_project.py`: Orchestrates the process by taking user input and calling the fight detection and object detection functions.
- `Fight_detec_func.py`:
  - Contains the `fight_detec` function and the `FightDetector` class.
  - Loads the Keras model (`final_model_2.h5`).
  - Uses `frame_slicer` to prepare input for the model.
  - Performs prediction and calculates confidence.
  - Handles debug visualizations.
- `objec_detect_yolo.py`:
  - Contains the `detection` function.
  - Loads the YOLOv8 model (`best.pt`).
  - Iterates through video frames, performing object detection and tracking.
  - Generates and saves the annotated output video.
  - Returns detected object labels.
- `frame_slicer.py`:
  - Contains the `extract_video_frames` utility function.
  - Extracts a fixed number of frames, resizes and normalizes them, and handles potential errors during extraction.
- `trainig.py`:
  - Script for training the fight detection model.
  - Includes `VideoDataGenerator` for loading and processing video data.
  - Defines the 3D CNN model architecture.
  - Handles data loading, splitting, training loops, checkpointing, and saving the final model.
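Taken together, the orchestration in `full_project.py` presumably reduces to something like the sketch below. The function names come from the module descriptions above; their exact signatures and return values are assumptions.

```python
# Hypothetical sketch of the flow in full_project.py; fight_detec() and
# detection() exist per the module descriptions above, but their
# signatures and return values are assumed here.
from Fight_detec_func import fight_detec
from objec_detect_yolo import detection

if __name__ == "__main__":
    # Strip stray quotes that Windows Explorer's "Copy as path" adds.
    video_path = input("Enter the local path : ").strip().strip('"')

    fight_detec(video_path)   # prints e.g. 'FIGHT (85.3% confidence)'
    detection(video_path)     # saves the annotated video and returns labels
```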
## 7. Training Data
### Dataset Composition
| Category      | Count | Percentage | Formats     | Avg Duration |
|---------------|-------|------------|-------------|--------------|
| Fight Videos  | 2,340 | 61.9%      | .mp4, .mpeg | 5.2 sec      |
| Normal Videos | 1,441 | 38.1%      | .mp4, .mpeg | 6.1 sec      |
| Total         | 3,781 | 100%       |             |              |
### Technical Specifications
- Resolution: 64×64 pixels
- Color Space: RGB
- Frame Rate: 30 FPS (average)
- Frame Sampling: 50 frames per video
- Input Shape: (30, 96, 96, 3); the model resizes its input
### Data Sources
- Fighting videos: Collected from public surveillance datasets
- Normal videos: Sampled from public CCTV footage
- Manually verified and labeled by domain experts
### Preprocessing
- Frame extraction at 50 frames/video
- Resizing to 96×96 pixels
- Normalization (pixel values scaled to [0, 1])
- Temporal sampling down to 30 frames for model input (see the sketch below)
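A minimal sketch of this pipeline is shown below, collapsing the 50-frame extraction and the 30-frame temporal sampling into a single uniform sampling step for brevity; it illustrates the steps listed above rather than reproducing the actual `frame_slicer.py` implementation.

```python
import cv2
import numpy as np

def extract_video_frames(video_path, n_frames=30, img_size=(96, 96)):
    """Sketch: uniformly sample n_frames, resize, and normalize to [0, 1]."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    if total <= 0:
        cap.release()
        return None  # unreadable video

    # Uniformly spaced frame indices across the whole clip.
    indices = np.linspace(0, total - 1, n_frames, dtype=int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            frame = np.zeros((*img_size, 3), dtype=np.uint8)  # pad on read error
        else:
            frame = cv2.resize(frame, img_size)
        frames.append(frame.astype(np.float32) / 255.0)
    cap.release()
    return np.array(frames)  # shape: (n_frames, 96, 96, 3)
```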
## 8. Models Used
- Fight Detection: a custom 3D CNN trained using `trainig.py`. Located at `D:\K_REPO\ComV\AI_made\trainnig_output\final_model_2.h5`. The input shape expects `(30, 96, 96, 3)` frames.
- Object Detection: a YOLOv8 model. Weights are located at `D:\K_REPO\ComV\yolo\yolo\best.pt`. This model is trained to detect the following classes: `['Fire', 'Gun', 'License_Plate', 'Smoke', 'knife']`.
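Loading these weights uses the standard Ultralytics API. In the sketch below, the weight path and the `conf=0.7` threshold mirror values documented in this README; the input video name is a placeholder.

```python
from ultralytics import YOLO

# Load the custom-trained weights (path as documented above).
model = YOLO(r"D:\K_REPO\ComV\yolo\yolo\best.pt")
print(model.names)  # should list the five classes named above

# Track objects across frames; conf=0.7 matches the documented threshold.
results = model.track(source="input_video.mp4", conf=0.7, persist=True)
for r in results:
    labels = {model.names[int(c)] for c in r.boxes.cls}  # unique labels per frame
    print(labels)
```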
## 9. Fight Detection Model Performance
The following metrics represent the performance achieved during the training of the `final_model_2.h5` model:
- Best Training Accuracy: 0.8583 (Epoch 7)
- Best Validation Accuracy: 0.9167 (Epoch 10)
- Lowest Training Loss: 0.3636 (Epoch 7)
- Lowest Validation Loss: 0.2805 (Epoch 8)
(Note: These metrics are based on the training run that produced the saved model. Performance may vary slightly on different datasets or during retraining.)
## 10. Configuration
Key parameters and paths are mostly hardcoded within the scripts:
- `Fight_detec_func.py`: `MODEL_PATH`, `N_FRAMES`, `IMG_SIZE`, `RESULT_PATH`.
- `objec_detect_yolo.py`: the YOLO model path, the output directory path (`output_dir`), and the confidence threshold (`conf=0.7`).
- `trainig.py`: `DATA_DIR`, `N_FRAMES`, `IMG_SIZE`, `EPOCHS`, `BATCH_SIZE`, `CHECKPOINT_DIR`, `OUTPUT_PATH`.
Recommendation: Refactor these hardcoded values into a separate configuration file (e.g., YAML or JSON) or use command-line arguments for better flexibility.
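For example, the values could be gathered into a small JSON file loaded at startup. The sketch below is hypothetical: `config.json`, the key names, and the defaults are illustrative and mirror the parameters listed above, not the actual code.

```python
import json
from pathlib import Path

# Illustrative defaults mirroring the hardcoded parameters listed above.
DEFAULTS = {
    "model_path": "trainnig_output/final_model_2.h5",
    "yolo_weights": "yolo/yolo/best.pt",
    "result_path": "try/result",
    "n_frames": 30,
    "img_size": 96,
    "conf_threshold": 0.7,
}

def load_config(path="config.json"):
    """Return DEFAULTS overridden by any keys found in config.json."""
    cfg = dict(DEFAULTS)
    p = Path(path)
    if p.exists():
        cfg.update(json.loads(p.read_text()))
    return cfg
```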
## 11. Training the Fight Detection Model
To retrain or train the fight detection model:
- Prepare Data: place training videos into the `D:\K_REPO\ComV\train\Fighting` and `D:\K_REPO\ComV\train\Normal` subdirectories.
- Run Training Script: execute `trainig.py`: `python trainig.py`
- The script will load the data, build the model (or resume from a checkpoint if `RESUME_TRAINING=1` and a checkpoint exists), train it, and save the final model to `D:\K_REPO\ComV\AI_made\trainnig_output\final_model_2.h5`. Checkpoints and logs are saved in the `trainnig_output` directory.
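For orientation, a minimal 3D CNN with the documented `(30, 96, 96, 3)` input shape and a binary FIGHT/NORMAL output would look roughly like the sketch below; the actual architecture defined in `trainig.py` may differ in depth and layer sizes.

```python
import tensorflow as tf

def build_model(n_frames=30, img_size=96):
    """Sketch of a small 3D CNN for binary FIGHT/NORMAL classification."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_frames, img_size, img_size, 3)),
        tf.keras.layers.Conv3D(32, (3, 3, 3), activation="relu", padding="same"),
        tf.keras.layers.MaxPooling3D((1, 2, 2)),   # pool spatially first
        tf.keras.layers.Conv3D(64, (3, 3, 3), activation="relu", padding="same"),
        tf.keras.layers.MaxPooling3D((2, 2, 2)),   # then pool temporally too
        tf.keras.layers.GlobalAveragePooling3D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # FIGHT probability
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```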