LLMServer / main / logs / llm_api.log (commit 15c704c: "Dockerfile set to Python3.12-mini")
2025-01-09 15:54:08,215 - hf_validation - WARNING - No .env file found. Fine if you're on Huggingface, but you need one to run locally on your PC.
2025-01-09 15:54:08,215 - hf_validation - ERROR - No HF_TOKEN found in environment variables
2025-01-09 15:54:08,215 - main - INFO - Starting LLM API server
2025-01-09 15:54:08,216 - llm_api - INFO - Initializing LLM API
2025-01-09 15:54:08,216 - llm_api - INFO - LLM API initialized successfully
2025-01-09 15:54:08,216 - api_routes - INFO - Router initialized with LLM API instance
2025-01-09 15:54:08,218 - main - INFO - FastAPI application created successfully
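
The two hf_validation messages in the startup block above indicate the server looks for an HF_TOKEN, loading a local .env file when it is not running on Hugging Face. A minimal sketch of that kind of check, assuming python-dotenv is used (the actual hf_validation module is not visible in this log):

```python
import logging
import os

from dotenv import find_dotenv, load_dotenv  # pip install python-dotenv

logger = logging.getLogger("hf_validation")

def validate_hf_token() -> str | None:
    """Reproduce the startup check implied by the log: load a local .env
    if one exists, then require HF_TOKEN in the environment."""
    env_path = find_dotenv()  # returns "" when no .env file is found
    if not env_path:
        logger.warning(
            "No .env file found. Fine if you're on Huggingface, "
            "but you need one to run locally on your PC."
        )
    else:
        load_dotenv(env_path)
    token = os.getenv("HF_TOKEN")
    if token is None:
        logger.error("No HF_TOKEN found in environment variables")
    return token
```
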
2025-01-09 16:46:10,118 - api_routes - INFO - Received request to download model: microsoft/phi-4
2025-01-09 16:46:10,118 - llm_api - INFO - Starting download of model: microsoft/phi-4
2025-01-09 16:46:10,118 - llm_api - INFO - Enabling stdout logging for download
2025-01-09 17:00:32,400 - llm_api - INFO - Disabling stdout logging
2025-01-09 17:00:32,400 - llm_api - INFO - Saving model to main/models/phi-4
2025-01-09 17:02:39,928 - llm_api - INFO - Successfully downloaded model: microsoft/phi-4
2025-01-09 17:02:41,075 - api_routes - INFO - Successfully downloaded model: microsoft/phi-4
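
The download episode above (16:46 to 17:02) fetches a repo and saves it under main/models/<repo name>; that directory layout is inferred from the "Saving model to" line. A hedged sketch of the flow using huggingface_hub's snapshot_download (the real llm_api implementation is not shown here):

```python
from pathlib import Path

from huggingface_hub import snapshot_download

def download_model(model_id: str, models_dir: str = "main/models") -> Path:
    """Download a model repo and place it under main/models/<repo name>."""
    target = Path(models_dir) / model_id.split("/")[-1]  # e.g. main/models/phi-4
    snapshot_download(repo_id=model_id, local_dir=str(target))
    return target

# Mirrors the episode above:
# download_model("microsoft/phi-4")  -> main/models/phi-4
```
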
2025-01-09 17:02:41,080 - api_routes - INFO - Received request to initialize model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:02:41,080 - llm_api - INFO - Initializing generation model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:02:41,081 - llm_api - INFO - Loading model from source: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:02:41,377 - llm_api - ERROR - Failed to initialize generation model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated: Using `low_cpu_mem_usage=True` or a `device_map` requires Accelerate: `pip install 'accelerate>=0.26.0'`
2025-01-09 17:02:41,377 - api_routes - ERROR - Error initializing model: Using `low_cpu_mem_usage=True` or a `device_map` requires Accelerate: `pip install 'accelerate>=0.26.0'`
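
This failure is transformers refusing to honor `low_cpu_mem_usage=True` or a `device_map` without the Accelerate package installed. The load call presumably looks something like the sketch below (the exact arguments in llm_api are an assumption); the fix is the one the message names, `pip install 'accelerate>=0.26.0'`, added to the image:

```python
from transformers import AutoModelForCausalLM

# Either of these arguments requires the `accelerate` package; without it,
# from_pretrained raises exactly the error captured in the log.
model = AutoModelForCausalLM.from_pretrained(
    "huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated",
    low_cpu_mem_usage=True,  # needs accelerate
    device_map="auto",       # also needs accelerate
)
```
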
2025-01-09 17:11:25,843 - hf_validation - WARNING - No .env file found. Fine if you're on Huggingface, but you need one to run locally on your PC.
2025-01-09 17:11:25,843 - hf_validation - ERROR - No HF_TOKEN found in environment variables
2025-01-09 17:11:25,843 - main - INFO - Starting LLM API server
2025-01-09 17:11:25,843 - llm_api - INFO - Initializing LLM API
2025-01-09 17:11:25,844 - llm_api - INFO - LLM API initialized successfully
2025-01-09 17:11:25,844 - api_routes - INFO - Router initialized with LLM API instance
2025-01-09 17:11:25,846 - main - INFO - FastAPI application created successfully
2025-01-09 17:11:38,299 - api_routes - INFO - Received request to initialize model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:11:38,299 - llm_api - INFO - Initializing generation model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:11:38,299 - llm_api - INFO - Loading model from source: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:11:38,487 - llm_api - ERROR - Failed to initialize generation model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated: Using `bitsandbytes` 8-bit quantization requires the latest version of bitsandbytes: `pip install -U bitsandbytes`
2025-01-09 17:11:38,487 - api_routes - ERROR - Error initializing model: Using `bitsandbytes` 8-bit quantization requires the latest version of bitsandbytes: `pip install -U bitsandbytes`
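
After the restart, the load gets past the Accelerate check and fails on the 8-bit quantization path instead, which means the server is requesting bitsandbytes quantization, presumably via something like the sketch below (the real config is not in the log). `pip install -U bitsandbytes` clears this particular message:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit weights via bitsandbytes; transformers validates the installed
# bitsandbytes version here and raises the "latest version" error when the
# package is missing or too old.
quant_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated",
    quantization_config=quant_config,
    device_map="auto",
)
```
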
2025-01-09 17:12:48,606 - hf_validation - WARNING - No .env file found. Fine if you're on Huggingface, but you need one to run locally on your PC.
2025-01-09 17:12:48,606 - hf_validation - ERROR - No HF_TOKEN found in environment variables
2025-01-09 17:12:48,606 - main - INFO - Starting LLM API server
2025-01-09 17:12:48,606 - llm_api - INFO - Initializing LLM API
2025-01-09 17:12:48,606 - llm_api - INFO - LLM API initialized successfully
2025-01-09 17:12:48,606 - api_routes - INFO - Router initialized with LLM API instance
2025-01-09 17:12:48,608 - main - INFO - FastAPI application created successfully
2025-01-09 17:12:59,453 - api_routes - INFO - Received request to initialize model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:12:59,453 - llm_api - INFO - Initializing generation model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:12:59,453 - llm_api - INFO - Loading model from source: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:12:59,628 - llm_api - ERROR - Failed to initialize generation model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:12:59,628 - api_routes - ERROR - Error initializing model: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:14:44,390 - api_routes - INFO - Received request to initialize model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:14:44,390 - llm_api - INFO - Initializing generation model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:14:44,390 - llm_api - INFO - Loading model from source: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:14:53,032 - llm_api - ERROR - Failed to initialize generation model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:14:53,032 - api_routes - ERROR - Error initializing model: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:15:14,956 - api_routes - INFO - Received request to initialize model: microsoft/phi-4
2025-01-09 17:15:14,956 - llm_api - INFO - Initializing generation model: microsoft/phi-4
2025-01-09 17:15:14,956 - llm_api - INFO - Loading model from local path: main/models/phi-4
2025-01-09 17:15:14,965 - llm_api - ERROR - Failed to initialize generation model microsoft/phi-4: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:15:14,965 - api_routes - ERROR - Error initializing model: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
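
All three attempts between 17:12 and 17:15, including the locally cached microsoft/phi-4, fail the same way: mainline bitsandbytes requires CUDA, and this Space appears to be running on CPU. One hedged workaround, assuming the goal is simply to get a model loaded, is to request quantization only when a GPU is actually present:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

def load_model(path: str):
    """Quantize with bitsandbytes only when CUDA is available; otherwise
    fall back to a plain (unquantized) CPU load."""
    if torch.cuda.is_available():
        return AutoModelForCausalLM.from_pretrained(
            path,
            quantization_config=BitsAndBytesConfig(load_in_8bit=True),
            device_map="auto",
        )
    # CPU fallback: bitsandbytes is never imported, so the
    # "CUDA is required but not available" error is avoided.
    return AutoModelForCausalLM.from_pretrained(path, device_map="cpu")

# model = load_model("main/models/phi-4")
```

The trade-off is memory: without 8-bit quantization the full-precision weights must fit in RAM, which rules out the 32B model on most CPU hosts but is plausible for the smaller Phi models downloaded above.
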
2025-01-13 16:04:32,247 - hf_validation - WARNING - No .env file found. Fine if you're on Huggingface, but you need one to run locally on your PC.
2025-01-13 16:04:32,247 - hf_validation - ERROR - No HF_TOKEN found in environment variables
2025-01-13 16:04:32,247 - main - INFO - Starting LLM API server
2025-01-13 16:04:32,248 - llm_api - INFO - Initializing LLM API
2025-01-13 16:04:32,248 - llm_api - INFO - LLM API initialized successfully
2025-01-13 16:04:32,248 - api_routes - INFO - Router initialized with LLM API instance
2025-01-13 16:04:32,252 - main - INFO - FastAPI application created successfully
2025-01-13 16:05:27,996 - api_routes - INFO - Received request to download model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:05:27,996 - llm_api - INFO - Starting download of model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:05:27,996 - llm_api - INFO - Enabling stdout logging for download
2025-01-13 16:08:46,773 - llm_api - INFO - Disabling stdout logging
2025-01-13 16:08:46,773 - llm_api - INFO - Saving model to main/models/Phi-3.5-mini-instruct
2025-01-13 16:10:23,543 - llm_api - INFO - Successfully downloaded model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:10:24,432 - api_routes - INFO - Successfully downloaded model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,409 - api_routes - INFO - Received request to initialize model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,409 - llm_api - INFO - Initializing generation model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,412 - llm_api - INFO - Loading model from local path: main/models/Phi-3.5-mini-instruct
2025-01-13 16:18:45,982 - llm_api - ERROR - Failed to initialize generation model microsoft/Phi-3.5-mini-instruct: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
Dynamo is not supported on Python 3.13+
2025-01-13 16:18:45,982 - api_routes - ERROR - Error initializing model: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
Dynamo is not supported on Python 3.13+
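
This final failure is an environment mismatch rather than a model problem: Dynamo (pulled in via the transformers bitsandbytes integration) does not support Python 3.13, which fits the commit message pinning the Dockerfile to Python 3.12. A small startup guard along these lines (an illustrative sketch, not the repo's actual code) would surface the problem before any model load is attempted:

```python
import sys

# torch Dynamo (imported through transformers.integrations.bitsandbytes)
# does not support Python 3.13+; fail fast with a clear message instead of
# the deferred import-time traceback seen above.
if sys.version_info >= (3, 13):
    raise RuntimeError(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected: "
        "Dynamo is not supported on Python 3.13+. Pin the runtime to 3.12, "
        "as in the 'Dockerfile set to Python3.12-mini' commit."
    )
```
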