DeepSeek Distilled Qwen Model

This repository contains a distilled version of the Qwen model, produced with DeepSeek distillation techniques for improved inference performance. The model has been fine-tuned to balance accuracy and speed, making it suitable for a range of real-time applications.

Model Overview

The Qwen model, originally a large-scale transformer-based model, has been distilled using DeepSeek techniques to significantly reduce its size while retaining most of its performance. This makes it more efficient for deployment in resource-constrained environments.
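To illustrate the general idea of knowledge distillation described above, here is a minimal sketch of a standard distillation objective: the student is trained to match the teacher's temperature-softened output distribution. The function names, temperature, and toy logits below are illustrative only, not the exact recipe used to produce this model.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over the vocabulary axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Scaled by T^2 so gradient magnitudes stay comparable as the
    temperature changes (the standard distillation scaling).
    """
    p = softmax(teacher_logits, temperature)   # teacher targets
    q = softmax(student_logits, temperature)   # student predictions
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=-1)
    return float(temperature ** 2 * kl.mean())

# Toy example: batch of 2 positions over a 5-token vocabulary.
teacher = np.array([[2.0, 1.0, 0.1, -1.0, 0.0],
                    [0.5, 2.5, 0.0, 0.3, -0.5]])
student = np.random.randn(2, 5)
loss = distillation_loss(student, teacher)  # non-negative scalar
```

In practice this soft-label term is usually combined with the ordinary next-token cross-entropy loss on the training data.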

Features

  • Distilled Model: A compressed version of the original Qwen model that maintains high accuracy with lower computational costs.
  • Optimized Inference: Faster response times for real-time applications.
  • Cross-Domain Application: Suitable for a variety of use cases in natural language processing (NLP) and AI-driven applications.

Installation

You can install this model via Hugging Face's transformers library.

pip install transformers

Then, you can load the model using the following code:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "madaibaba/deepseek-distill-qwen"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

Usage

# Tokenize the prompt; passing **inputs forwards the attention mask as well.
inputs = tokenizer("Your text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Evaluation

We evaluated the distilled model on several NLP tasks, including text generation, summarization, and question answering. The results below show substantially faster inference and higher throughput than the original Qwen model, with trade-offs on some quality metrics such as perplexity.

| Metric                    | Model 1 (Original) | Model 2 (Ollama Distilled) | Model 3 (Self Distilled) |
|---------------------------|--------------------|----------------------------|--------------------------|
| Inference Time (s)        | 52.223             | 0.27                       | 0.809                    |
| CPU Memory Usage (MB)     | 0.01               | 0                          | 0                        |
| GPU Memory Usage (MB)     | 0.02               | 0                          | 0                        |
| Perplexity                | 5.72               | 40.76                      | 11.45                    |
| BLEU Score                | 0.69               | 45.63                      | 19.38                    |
| ROUGE-1 Score             | 0.02               | 0.67                       | 0.4                      |
| ROUGE-2 Score             | 0.01               | 0.64                       | 0.37                     |
| ROUGE-L Score             | 0.02               | 0.67                       | 0.4                      |
| Model Size (M Parameters) | 1543.71            | 1777.09                    | 1543.3                   |
| Throughput (samples/sec)  | 0.02               | 3.99                       | 1.22                     |
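The inference-time and throughput figures above can be reproduced with a simple wall-clock harness along these lines. The harness is a sketch: `generate_fn` is a placeholder for a real model call (e.g. wrapping `model.generate` and the tokenizer from the Usage section), and the prompt list is illustrative.

```python
import time

def measure_throughput(generate_fn, prompts):
    """Time generate_fn over a list of prompts.

    Returns (mean latency in seconds per sample, samples per second).
    generate_fn takes one prompt string and returns generated text.
    """
    start = time.perf_counter()
    for prompt in prompts:
        generate_fn(prompt)
    elapsed = time.perf_counter() - start
    mean_latency = elapsed / len(prompts)
    return mean_latency, len(prompts) / elapsed

# Stand-in function for demonstration; swap in a real model call such as:
#   lambda p: tokenizer.decode(
#       model.generate(**tokenizer(p, return_tensors="pt"), max_new_tokens=50)[0])
latency, throughput = measure_throughput(lambda p: p.upper(), ["a", "b", "c"])
```

For stable numbers, run a warm-up call first and average over many prompts, since the first generation often pays one-time initialization costs.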

Fine-Tuning (Optional)

If you want to fine-tune the model further on your specific dataset, you can follow the fine-tuning procedure in this repository.

  1. Prepare your custom dataset in the appropriate format.
  2. Run the training script as shown in the training section.
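As an illustration of step 1, a common format for causal-LM fine-tuning data is JSON Lines with one prompt/response pair per record. The field names (`prompt`, `response`) and file name below are illustrative, not a format this repository mandates; check the training script for the exact schema it expects.

```python
import json

def write_jsonl(pairs, path):
    """Write (prompt, response) pairs as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt, response in pairs:
            f.write(json.dumps({"prompt": prompt, "response": response},
                               ensure_ascii=False) + "\n")

def read_jsonl(path):
    """Load the records back for tokenization and training."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

pairs = [("What is distillation?",
          "Training a small student model to mimic a larger teacher.")]
write_jsonl(pairs, "train.jsonl")
records = read_jsonl("train.jsonl")
```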

License

This model is licensed under the MIT License.

Acknowledgements

  • DeepSeek for the distillation framework.
  • Qwen for the original model.