This is my reproduction of the Microsoft team's work, WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models. The training data is constructed entirely with open-source models, and the model is trained with supervised fine-tuning (SFT). I also reproduced the experimental results in the paper. The reproduced model scores 41.7 overall in the first table below, compared with the 38.1 published in the paper, which supports the paper's idea of 'learning from expert battles'.

Original paper link: https://arxiv.org/pdf/2412.17395

The training data constructed during my reproduction is published in a separate repository, and everyone is welcome to use it: https://huggingface.co/datasets/HuggingMicah/warrior_reproduce

Results on DS-1000, broken down by library. The number in parentheses after each library name is its problem count.

| Models | Matplotlib (155) | NumPy (220) | Pandas (291) | PyTorch (68) | SciPy (106) | Sklearn (115) | TensorFlow (45) | Overall (1000) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| INCODER (6.7B) | 28.3 | 4.4 | 3.1 | 4.4 | 2.8 | 2.8 | 3.8 | 7.4 |
| CodeGen-Mono (16B) | 31.7 | 10.9 | 3.4 | 7.0 | 9.0 | 10.8 | 15.2 | 11.7 |
| Code-Cushman-001 | 40.7 | 21.8 | 7.9 | 12.4 | 11.3 | 18.0 | 12.2 | 18.1 |
| StarCoder (15B) | 51.7 | 29.7 | 11.4 | 21.4 | 20.2 | 29.5 | 24.5 | 26.0 |
| WizardCoder-SC (15B) | 55.2 | 33.6 | 16.7 | 26.2 | 24.2 | 24.9 | 26.7 | 29.2 |
| CodeLlama-Python (6.7B) | 55.3 | 34.5 | 16.4 | 19.9 | 22.3 | 17.6 | 28.5 | 28.0 |
| WizardCoder-CL (6.7B) | 53.5 | 34.4 | 15.2 | 25.7 | 21.0 | 24.5 | 28.9 | 28.4 |
| Magicoder-CL (6.7B) | 54.6 | 34.8 | 19.0 | 24.7 | 25.0 | 22.6 | 28.9 | 29.9 |
| MagicoderS-CL (6.7B) | 55.9 | 40.6 | 28.4 | 40.4 | 28.8 | 35.8 | 37.6 | 37.5 |
| WarriorCoder_published_in_paper (6.7B) | 55.5 | 41.8 | 26.1 | 41.2 | 33.0 | 39.1 | 42.2 | 38.1 |
| WarriorCoder_my_reproduce (6.7B) | 56.1 | 45.0 | 32.0 | 38.2 | 36.8 | 44.3 | 48.9 | 41.7 |
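
As a sanity check on the table above, the sketch below recomputes the Overall column from the per-library scores. It assumes Overall is the mean weighted by each library's problem count, which is not stated in this README but is consistent with both WarriorCoder rows. The names `COUNTS` and `weighted_overall` are illustrative and do not come from any released code. Because the per-library scores are already rounded to one decimal, the recomputed value for other rows could differ in the last digit.

```python
# Problem counts per library, copied from the table header.
COUNTS = {
    "Matplotlib": 155,
    "NumPy": 220,
    "Pandas": 291,
    "PyTorch": 68,
    "SciPy": 106,
    "Sklearn": 115,
    "TensorFlow": 45,
}


def weighted_overall(scores):
    """Problem-count-weighted mean of per-library scores (assumed definition)."""
    total = sum(COUNTS.values())
    return sum(scores[lib] * n for lib, n in COUNTS.items()) / total


# Per-library scores copied from the two WarriorCoder rows of the table.
paper = {
    "Matplotlib": 55.5, "NumPy": 41.8, "Pandas": 26.1, "PyTorch": 41.2,
    "SciPy": 33.0, "Sklearn": 39.1, "TensorFlow": 42.2,
}
repro = {
    "Matplotlib": 56.1, "NumPy": 45.0, "Pandas": 32.0, "PyTorch": 38.2,
    "SciPy": 36.8, "Sklearn": 44.3, "TensorFlow": 48.9,
}

print(round(weighted_overall(paper), 1))  # 38.1, matches the table
print(round(weighted_overall(repro), 1))  # 41.7, matches the table
```

Weighting by problem count means Pandas (291 problems) and NumPy (220) dominate the Overall score, so the gains of the reproduction on those two libraries account for much of its 3.6-point overall improvement.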

| Models | HumanEval | HumanEval+ | MBPP | MBPP+ |
| --- | --- | --- | --- | --- |
| WizardCoder-CL (6.7B) | 48.7 | 40.5 | 56.4 | 47.0 |
| WizardCoder-SC (15B) | 51.4 | 45.3 | 61.6 | 50.7 |
| Magicoder-CL (6.7B) | 60.4 | 55.7 | 64.2 | 52.5 |
| MagicoderS-CL (6.7B) | 70.7 | 66.4 | 68.3 | 56.4 |
| WarriorCoder (6.7B) | 79.9 | 75.4 | 75.8 | 64.5 |
