DeepSeek-Light-V1: Optimized Version of DeepSeek-Coder-6.7B

Based in the Basque Country 🇪🇸

DeepSeek-Light-V1 is an optimized version of DeepSeek-Coder-6.7B, designed to reduce GPU memory consumption and make deployment easier. The optimization combines 4-bit quantization with pruning, roughly halving the parameter count while keeping the model usable at about half the original's performance (see the comparison below).

Key Optimizations 🚀

  • 4-bit Quantization (NF4, with bfloat16 compute): Stores weights in 4 bits, reducing VRAM usage with minimal precision loss.
  • Pruning: Removes redundant parameters to shrink the model (a rough sketch of the idea follows this list).
  • Optimized for lightweight deployment: Works on lower-end hardware.
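
The exact pruning recipe is not documented in this card. As a rough illustration of the idea only, here is a minimal sketch of unstructured magnitude (L1) pruning using PyTorch's built-in utilities; the 50% amount is an assumption based on the ~6.7B → ~3.5B parameter reduction reported below:

# Illustrative sketch only – not the actual recipe used for DeepSeek-Light-V1.
import torch.nn as nn
import torch.nn.utils.prune as prune

def magnitude_prune(model: nn.Module, amount: float = 0.5) -> nn.Module:
    # Zero out the `amount` fraction of smallest-magnitude weights in each
    # Linear layer, then bake the pruning mask into the weights permanently.
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")
    return model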

Model Comparison 📊

| Version | Model Size | GPU VRAM Usage | Parameters | Relative Performance |
|---|---|---|---|---|
| Original (DeepSeek-Coder-6.7B) | 3.51 GB | 7.85 GB | 6.7B | 100% |
| Optimized (DeepSeek-Light-V1) | 3.51 GB | 3.93 GB (~50% reduction) | 3.5B | ~50% |
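
To sanity-check the VRAM figure on your own hardware, one option is PyTorch's peak-memory counter: call torch.cuda.reset_peak_memory_stats() before loading the model (see How to Use below), then read the peak afterwards. A minimal sketch:

import torch

def report_peak_vram() -> float:
    # Peak GPU memory allocated by PyTorch (GB) since the last reset; this
    # tracks the caching allocator, so nvidia-smi may report slightly more.
    peak_gb = torch.cuda.max_memory_allocated() / 1024**3
    print(f"Peak GPU memory: {peak_gb:.2f} GB")
    return peak_gb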

Why Use This Model? 💡

  • Runs on more affordable hardware – no need for high-end GPUs.
  • Reduces operational costs – more efficient deployment.
  • Enhances security – allows local execution before moving to production.

How to Use 🛠️

You can load the model with transformers using 4-bit quantization. This requires the transformers, accelerate, and bitsandbytes packages (bitsandbytes supplies the 4-bit kernels):
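
pip install torch transformers accelerate bitsandbytes

Then load and run the model: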

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Load model and tokenizer
model_name = "sanchezalonsodavid17/DeepSeek_Light_V1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# 4-bit NF4 quantization with bfloat16 compute and nested (double) quantization
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Load the quantized model, letting accelerate place layers automatically
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=quantization_config,
)

# Generate text
def generate_text(prompt, max_new_tokens=100):
    # Move inputs to wherever device_map="auto" placed the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # max_new_tokens bounds generated tokens, excluding the prompt length
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = "Explain how deep learning works in neural networks."
response = generate_text(prompt)
print(response)
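
Since the base model is DeepSeek-Coder, code-oriented prompts are a natural fit. An illustrative example using the helper above (the prompt is arbitrary):

# Illustrative: any coding prompt works here
code_prompt = "Write a Python function that checks whether a number is prime."
print(generate_text(code_prompt, max_new_tokens=150))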