DeepSeek-Light-V1: Optimized Version of DeepSeek-Coder-6.7B
Based in the Basque Country 🇪🇸
DeepSeek-Light-V1 is an optimized version of DeepSeek-Coder-6.7B, designed to reduce GPU memory consumption and make deployment feasible on more modest hardware. The optimization combines 4-bit quantization with pruning, roughly halving the parameter count at the cost of some performance (see the comparison table below).
Key Optimizations 🚀
- 4-bit NF4 Quantization (BFloat16 compute): Reduces VRAM usage with minimal precision loss.
- Pruning: Removes redundant parameters to enhance efficiency (an illustrative sketch follows this list).
- Optimized for lightweight deployment: Works on lower-end hardware.
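The card does not state which pruning method was applied, so the following is only a minimal sketch of one common approach: unstructured L1 magnitude pruning with `torch.nn.utils.prune`, which zeroes the smallest-magnitude weights in each linear layer.

```python
# Illustrative only: the exact pruning recipe used for DeepSeek-Light-V1
# is not documented here. This sketch zeroes the smallest 20% of weights
# (by absolute value) in every nn.Linear layer of a model.
import torch.nn as nn
import torch.nn.utils.prune as prune

def magnitude_prune(model: nn.Module, amount: float = 0.2) -> nn.Module:
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")  # bake the mask into the weights
    return model
```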
Model Comparison 📊
| Version | Model Size (on disk) | GPU VRAM Usage | Parameters | Relative Performance |
|---|---|---|---|---|
| Original (DeepSeek-Coder-6.7B) | 3.51 GB | 7.85 GB | 6.7B | 100% |
| Optimized (DeepSeek-Light-V1) | 3.51 GB | 3.93 GB (50% reduction!) | 3.5B | ~50% |
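As a rough sanity check on the VRAM figure, you can estimate the weight footprint alone. The ~4.5 bits-per-parameter figure below is an assumption (4-bit NF4 weights plus double-quantization overhead); the rest of the reported 3.93 GB goes to activations, the KV cache, and runtime overhead.

```python
# Back-of-envelope estimate of 4-bit weight memory (weights only;
# activations, KV cache, and framework overhead are not included).
params = 3.5e9            # parameter count after pruning (from the table)
bits_per_param = 4.5      # assumption: ~4-bit NF4 plus double-quant overhead
weight_gb = params * bits_per_param / 8 / 1e9
print(f"~{weight_gb:.2f} GB for weights alone")  # ~1.97 GB
```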
Why Use This Model? 💡
✅ Runs on more affordable hardware – No need for high-end GPUs.
✅ Reduces operational costs – More efficient deployment.
✅ Enhances security – Runs fully locally, keeping code and data on your own machine before moving to production.
How to Use 🛠️
You can load the model using `transformers` with 4-bit quantization (requires the `bitsandbytes` package):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Load tokenizer and 4-bit quantized model
model_name = "sanchezalonsodavid17/DeepSeek_Light_V1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bfloat16
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=quantization_config,
)

# Generate text
def generate_text(prompt, max_new_tokens=100):
    # model.device works regardless of where device_map="auto" placed the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = "Explain how deep learning works in neural networks."
response = generate_text(prompt)
print(response)
```
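To check the table's memory numbers on your own hardware, you can read PyTorch's peak-memory counter after a generation; exact figures will vary with driver, prompt length, and generation settings.

```python
# Optional: measure peak GPU memory during a generation to verify the
# VRAM figures above (assumes the model and generate_text from the
# previous snippet are already loaded).
import torch

if torch.cuda.is_available():
    torch.cuda.reset_peak_memory_stats()
    generate_text("Write a Python function that reverses a string.")
    peak_gb = torch.cuda.max_memory_allocated() / 1024**3
    print(f"Peak VRAM during generation: {peak_gb:.2f} GB")
```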
Base model: deepseek-ai/deepseek-coder-6.7b-instruct