XGenerationLab committed on
Commit 497391b · verified · 1 Parent(s): 0450e29

Update README.md

Files changed (1)
  1. README.md +34 -31

README.md CHANGED
@@ -1,31 +1,32 @@
  ---
- license: apache-2.0
  base_model:
- - XGenerationLab/XiYanSQL-QwenCoder-7B-2502
- pipeline_tag: text-generation
  language:
  - en
  - zh
  ---

-
- We are excited to update our new XiYanSQL-QwenCoder series models. This version includes significant optimizations over the previous one, achieving new SOTA performance for single models.
-
  ### Important Links
  📖[Github](https://github.com/XGenerationLab/XiYanSQL-QwenCoder) |
  🤖[ModelScope](https://modelscope.cn/collections/XiYanSQL-Models-4483337b614241) |
  🌐[XiYan-SQL](https://github.com/XGenerationLab/XiYan-SQL) |
  🌕[析言GBI](https://bailian.console.aliyun.com/xiyan) |
- 🤖[Modelscope Space](https://www.modelscope.cn/studios/XGenerationLab/XiYanSQL-QwenCoder-32B)


  ## Introduction
- We are excited to release the XiYanSQL-QwenCoder-2504 version, our latest SQL generation model. This version continues to optimize upon the previous version, delivering enhanced performance.
- - Our model incorporates important explorations combining fine-tuning and GRPO training, leveraging the post-training strategies of GRPO without a thinking process, achieving both efficiency and accuracy in SQL generation.
- - It demonstrates impressive performance and supports multiple dialects, ready to use out of the box.
- - Improved generalization capabilities, excelling on different dialects and out-of-domain datasets.

- In this evaluation, we have also added a real-world SQL benchmark (the DW test set), which serves as an important internal evaluation baseline. This test set includes thousands of complex queries from real scenarios in both PostgreSQL and MySQL dialects, effectively reflecting the model's performance across multiple dialects and out-of-domain data.

  ## Model Downloads

@@ -42,23 +43,24 @@ In this evaluation, we have also added a real-world SQL benchmark (the DW test s
  ## Performance
  The XiYanSQL-QwenCoder models, as multi-dialect SQL base models, demonstrate robust SQL generation capabilities. The following presents the evaluation results at the time of release. We conducted a comprehensive evaluation of the model's performance under two schema formats, M-Schema and original DDL, using BIRD and Spider as SQLite benchmarks in the Text-to-SQL domain, as well as the DW benchmarks for the PostgreSQL and MySQL dialects.

- | Model name | Size | BIRD Dev@M-Schema | BIRD Dev@DDL | Spider Test@M-Schema | Spider Test@DDL | DW PostgreSQL@M-Schema | DW MySQL@M-Schema |
- |------------------------------|:----:|:-----------------:|:------------:|:--------------------:|:---------------:|:----------------------:|:----------------:|
- | GPT-4o-0806 | UNK | 58.47% | 54.82% | 82.89% | 78.45% | 46.79% | 57.77% |
- | GPT-4.1-0414 | UNK | 59.39% | 54.11% | 84.45% | 79.86% | 54.29% | 63.18% |
- | Claude3.5-sonnet-1022 | UNK | 53.32% | 50.46% | 76.27% | 73.04% | 55.22% | 52.84% |
- | Claude3.7-sonnet | UNK | 54.82% | 49.22% | 78.04% | 74.66% | 53.23% | 54.61% |
- | Gemini-1.5-Pro | UNK | 61.34% | 57.89% | 85.11% | 84.00% | 52.78% | 62.78% |
- | DeepSeek-V2.5-1210 | 236B | 55.74% | 55.61% | 82.08% | 80.57% | 45.74% | 52.18% |
- | DeepSeek-V3 | 685B | 59.58% | 56.71% | 81.52% | 79.91% | 52.56% | 55.95% |
- | DeepSeek-R1 | 685B | 58.15% | 55.61% | 80.72% | 78.85% | 60.56% | 62.00% |
- | DeepSeek-R1-Distill-Qwen-32B | 32B | 50.65% | 48.31% | 78.65% | 77.33% | 37.22% | 44.72% |
- | Deepseek-Coder-33B-Instruct | 33B | 47.52% | 44.72% | 72.39% | 62.0% | 31.48% | 36.17% |
- | OmniSQL-32B | 32B | 60.37% | 55.87% | 85.16% | 83.19% | 38.19% | 42.34% |
- | XiYanSQL-QwenCoder-32B-2412 | 32B | 67.07% | 63.04% | 88.39% | 85.46% | 45.07% | 52.84% |
- | XiYanSQL-QwenCoder-32B-2504 | 32B | 67.14% | 62.26% | 89.20% | 86.17% | 53.52% | 57.74% |
- | XiYanSQL-QwenCoder-7B-2502 | 7B | 59.65% | 56.32% | 84.15% | 80.01% | 39.38% | 42.10% |
- | XiYanSQL-QwenCoder-7B-2504 | 7B | 62.13% | 57.43% | 85.97% | 82.48% | 42.08% | 44.67% |


@@ -71,6 +73,7 @@ Currently, we mainly support mainstream dialects like SQLite, PostgreSQL, and My
  - transformers >= 4.37.0
  - vllm >= 0.7.2

  ### Prompt Template
  ```python
  nl2sqlite_template_cn = """你是一名{dialect}专家,现在需要阅读并理解下面的【数据库schema】描述,以及可能用到的【参考信息】,并运用{dialect}知识生成sql语句回答【用户问题】。
@@ -95,7 +98,7 @@ nl2sqlite_template_cn = """你是一名{dialect}专家,现在需要阅读并
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

- model_name = "XGenerationLab/XiYanSQL-QwenCoder-32B-2502"
  model = AutoModelForCausalLM.from_pretrained(
      model_name,
      torch_dtype=torch.bfloat16,
@@ -134,7 +137,7 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
  ```python
  from vllm import LLM, SamplingParams
  from transformers import AutoTokenizer
- model_path = "XGenerationLab/XiYanSQL-QwenCoder-32B-2502"
  llm = LLM(model=model_path, tensor_parallel_size=8)
  tokenizer = AutoTokenizer.from_pretrained(model_path)
  sampling_params = SamplingParams(
 
  ---
+ frameworks:
+ - Pytorch
+ tasks:
+ - text-generation
  base_model:
+ - XGenerationLab/XiYanSQL-QwenCoder-32B-2412
+ base_model_relation: adapter
  language:
  - en
  - zh
+ license: apache-2.0
  ---

  ### Important Links
  📖[Github](https://github.com/XGenerationLab/XiYanSQL-QwenCoder) |
  🤖[ModelScope](https://modelscope.cn/collections/XiYanSQL-Models-4483337b614241) |
  🌐[XiYan-SQL](https://github.com/XGenerationLab/XiYan-SQL) |
  🌕[析言GBI](https://bailian.console.aliyun.com/xiyan) |
+ 🤖[ModelScope Space](https://www.modelscope.cn/studios/XGenerationLab/XiYanSQL-QwenCoder-32B)


  ## Introduction
+ We are excited to release the **XiYanSQL-QwenCoder-2504** version, our latest SQL generation model. This version continues to optimize upon the previous version, delivering enhanced performance.
+ - Our model incorporates important explorations combining **fine-tuning and GRPO training**, leveraging the post-training strategies of GRPO without a thinking process, achieving both efficiency and accuracy in SQL generation.
+ - It demonstrates **impressive performance** and supports **multiple dialects**, ready to use out of the box.
+ - Improved generalization capabilities, excelling on different dialects and **out-of-domain datasets**.

+ In this evaluation, we have also added **a real-world SQL benchmark (the DW test set)**, which serves as an important internal evaluation baseline. This test set includes thousands of complex queries from real scenarios in both PostgreSQL and MySQL dialects, effectively reflecting the model's performance across multiple dialects and out-of-domain data.

  ## Model Downloads

  ## Performance
  The XiYanSQL-QwenCoder models, as multi-dialect SQL base models, demonstrate robust SQL generation capabilities. The following presents the evaluation results at the time of release. We conducted a comprehensive evaluation of the model's performance under two schema formats, M-Schema and original DDL, using BIRD and Spider as SQLite benchmarks in the Text-to-SQL domain, as well as the DW benchmarks for the PostgreSQL and MySQL dialects.

+ | Model name | Size | BIRD Dev@M-Schema | BIRD Dev@DDL | Spider Test@M-Schema | Spider Test@DDL | DW PostgreSQL@M-Schema | DW MySQL@M-Schema |
+ |------------------------------|:------:|:-----------------:|:------------:|:--------------------:|:---------------:|:----------------------:|:-----------------:|
+ | GPT-4o-0806 | UNK | 58.47% | 54.82% | 82.89% | 78.45% | 46.79% | 57.77% |
+ | GPT-4.1-0414 | UNK | 59.39% | 54.11% | 84.45% | 79.86% | 54.29% | 63.18% |
+ | Claude3.5-sonnet-1022 | UNK | 53.32% | 50.46% | 76.27% | 73.04% | 55.22% | 52.84% |
+ | Claude3.7-sonnet | UNK | 54.82% | 49.22% | 78.04% | 74.66% | 53.23% | 54.61% |
+ | Gemini-1.5-Pro | UNK | 61.34% | 57.89% | 85.11% | 84.00% | 52.78% | 62.78% |
+ | DeepSeek-V2.5-1210 | 236B | 55.74% | 55.61% | 82.08% | 80.57% | 45.74% | 52.18% |
+ | DeepSeek-V3 | 685B | 59.58% | 56.71% | 81.52% | 79.91% | 52.56% | 55.95% |
+ | DeepSeek-R1 | 685B | 58.15% | 55.61% | 80.72% | 78.85% | 60.56% | 62.00% |
+ | DeepSeek-R1-Distill-Qwen-32B | 32B | 50.65% | 48.31% | 78.65% | 77.33% | 37.22% | 44.72% |
+ | Deepseek-Coder-33B-Instruct | 33B | 47.52% | 44.72% | 72.39% | 62.0% | 31.48% | 36.17% |
+ | OmniSQL-32B | 32B | 60.37% | 55.87% | 85.16% | 83.19% | 38.19% | 42.34% |
+ | XiYanSQL-QwenCoder-7B-2502 | 7B | 59.65% | 56.32% | 84.15% | 80.01% | 39.38% | 42.10% |
+ | XiYanSQL-QwenCoder-7B-2504 | 7B | 62.13% | 57.43% | 85.97% | 82.48% | 42.08% | 44.67% |
+ | XiYanSQL-QwenCoder-32B-2412 | 32B | 67.07% | 63.04% | 88.39% | 85.46% | 45.07% | 52.84% |
+ | XiYanSQL-QwenCoder-32B-2504 | 32B | 67.14% | 62.26% | 89.20% | 86.17% | 53.52% | 57.74% |
+
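On BIRD and Spider these percentages are conventionally execution-style accuracies: the predicted SQL is run against the benchmark database and its result set is compared with that of the gold query. The authors' actual evaluation harness is not shown in this diff, so the following is only a minimal illustrative sketch of that comparison for the SQLite benchmarks.

```python
import sqlite3
from collections import Counter


def execution_match(db_path: str, pred_sql: str, gold_sql: str) -> bool:
    """Rough execution-accuracy check: do both queries return the same rows?

    Illustrative only -- real Text-to-SQL harnesses add per-query timeouts,
    order-sensitive comparison when the question asks for ordering, etc.
    """
    conn = sqlite3.connect(db_path)
    try:
        pred_rows = conn.execute(pred_sql).fetchall()
        gold_rows = conn.execute(gold_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to execute counts as wrong
    finally:
        conn.close()
    # Compare as multisets so row order does not matter.
    return Counter(map(repr, pred_rows)) == Counter(map(repr, gold_rows))
```

A benchmark score such as BIRD Dev@M-Schema is then simply the mean of this flag over the evaluation set, with the database schema serialized in M-Schema (or DDL) form inside the prompt.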
 
 
  - transformers >= 4.37.0
  - vllm >= 0.7.2

+
  ### Prompt Template
  ```python
  nl2sqlite_template_cn = """你是一名{dialect}专家,现在需要阅读并理解下面的【数据库schema】描述,以及可能用到的【参考信息】,并运用{dialect}知识生成sql语句回答【用户问题】。
 
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

+ model_name = "XGenerationLab/XiYanSQL-QwenCoder-32B-2504"
  model = AutoModelForCausalLM.from_pretrained(
      model_name,
      torch_dtype=torch.bfloat16,
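Only the checkpoint name changes in the Transformers snippet above. For orientation, here is a minimal end-to-end sketch built around the lines visible in this diff; the prompt construction, `device_map`, and generation arguments are assumptions for illustration, not taken from the README.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "XGenerationLab/XiYanSQL-QwenCoder-32B-2504"

# bfloat16 comes from the snippet above; device_map="auto" is an assumption.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# In the full README the prompt is built from nl2sqlite_template_cn; its format
# keys are not visible in this diff, so a plain stand-in prompt is used here.
prompt = (
    "You are a SQLite expert. Given the schema of table users(id, signup_date), "
    "answer: how many users signed up in 2024?"
)
messages = [{"role": "user", "content": prompt}]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding; the hunk header above shows the matching batch_decode call.
generated_ids = model.generate(inputs, max_new_tokens=512, do_sample=False)
generated_ids = generated_ids[:, inputs.shape[1]:]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```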
 
  ```python
  from vllm import LLM, SamplingParams
  from transformers import AutoTokenizer
+ model_path = "XGenerationLab/XiYanSQL-QwenCoder-32B-2504"
  llm = LLM(model=model_path, tensor_parallel_size=8)
  tokenizer = AutoTokenizer.from_pretrained(model_path)
  sampling_params = SamplingParams(
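    # --- illustrative continuation, not part of the commit above -------------
    # The hunk stops inside SamplingParams(...); the arguments below are
    # placeholders, not the values used in the actual README snippet.
    temperature=0.0,
    max_tokens=1024,
)

# Chat-format a prompt with the tokenizer and generate with vLLM (also
# illustrative; mirrors the Transformers example earlier on this page).
messages = [{"role": "user", "content": "You are a SQLite expert. Given the schema, generate the SQL for the question."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = llm.generate([text], sampling_params)
print(outputs[0].outputs[0].text)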