inference-api-g1 / api / management.py

Commit History

Improve model unloading with explicit GPU memory cleanup and CUDA cache clearing
8a8fe7c

alexfremont committed on

Remove system status endpoint and monitoring functionality
e241d94

alexfremont committed on

Add system monitoring features and memory usage tracking for loaded models
6ba6dc7

alexfremont committed on

Remove DELETE endpoint for model unload, keep POST alternative only
28b854e

alexfremont committed on

Add POST endpoint alternative for unloading models from memory
ca20804

alexfremont committed on

Add API endpoint to unload models from memory without database deletion
db789ea

alexfremont committed on

Refactor model loading to store metadata alongside pipelines in model_pipelines dict
dfb1c84

alexfremont committed on

Update model logging to use hf_filename instead of model_name
419e526

alexfremont committed on

Clean up imports and remove unused code across API modules
3635acb

alexfremont committed on

Refactor API auth and add management endpoints for model loading/updating
bccef3b

alexfremont committed on