NotShrirang committed on
Commit
e6f0f9c
·
verified ·
1 Parent(s): c7713bc

Update README.md

Files changed (1)
  1. README.md +11 -1
README.md CHANGED
@@ -12,7 +12,17 @@ base_model:
12
  ---
13
 
14
  # DeepSeek R1 Distill Qwen 1.5B finetuned for SQL query generation
15
- This model is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, specifically trained for Text-to-SQL query generation. It has been fine-tuned on the GretelAI Synthetic Text-to-SQL dataset, enabling it to convert natural language questions into SQL queries accurately.
 
 
 
 
 
 
 
 
 
 
16
 
17
  ## Use Cases
18
  1. Assisting developers and analysts in writing SQL queries.
 
12
  ---
13
 
14
  # DeepSeek R1 Distill Qwen 1.5B finetuned for SQL query generation
15
+ This model is a fine-tuned version of DeepSeek R1 Distill Qwen 1.5B, specifically optimized for SQL query generation. It has been trained on the GretelAI Synthetic Text-to-SQL dataset to enhance its ability to convert natural language prompts into accurate SQL queries.
16
+
17
+ Due to its lightweight architecture, this model can be deployed efficiently on local machines without requiring a GPU, which makes it well suited to on-premises inference in resource-constrained environments. It balances performance and efficiency, offering businesses and developers a cost-effective SQL generation solution.
18
+
19
+ ## Training Methodology
20
+ 1. Fine-tuning approach: LoRA (Low-Rank Adaptation) for efficient parameter tuning.
21
+ 2. Precision: bfloat16 (bf16) to reduce memory consumption while maintaining numerical stability.
22
+ 3. Gradient Accumulation: Used to reach larger effective batch sizes within GPU memory limits.
23
+ 4. Optimizer: AdamW with learning rate scheduling.
24
+ 5. Cosine Scheduler: Cosine learning rate schedule for training stability (500 warm-up steps, 2000 steps for the cosine schedule).
25
+ 6. Hardware: Trained on 8xA100 GPUs with mixed precision training.
26
 
27
  ## Use Cases
28
  1. Assisting developers and analysts in writing SQL queries.
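
The warm-up and cosine schedule listed under Training Methodology in this change can be sketched as a learning rate multiplier. This is a minimal illustration, not the author's training code. It assumes a linear warm-up, assumes the 2000 cosine steps follow the 500 warm-up steps (the README does not say whether they overlap), and assumes decay to zero. The function name `lr_multiplier` is hypothetical.

```python
import math

WARMUP_STEPS = 500   # stated in the README
COSINE_STEPS = 2000  # stated in the README; assumed here to come after warm-up


def lr_multiplier(step: int,
                  warmup_steps: int = WARMUP_STEPS,
                  cosine_steps: int = COSINE_STEPS) -> float:
    """Return a factor in [0, 1] that scales the base learning rate."""
    if step < warmup_steps:
        # Linear warm-up from 0 to 1 (assumption: the README gives no shape).
        return step / warmup_steps
    # Fraction of the cosine phase completed, clamped at 1 once it ends.
    progress = min(1.0, (step - warmup_steps) / cosine_steps)
    # Half-cosine decay from 1 down to 0 (assumption: final value of zero).
    return 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 250, 500, 1500, 2500):
        print(s, round(lr_multiplier(s), 4))
```

A function of this shape is what `torch.optim.lr_scheduler.LambdaLR` accepts as its `lr_lambda` argument, so it could be paired with an AdamW optimizer if the assumptions above match the actual run.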