matrix-multiply committed
Commit c81d735 · verified · 1 Parent(s): c2413d6

Update README.md

Files changed (1)
  1. README.md +10 -6
README.md CHANGED
@@ -45,22 +45,26 @@ The DocuMint model can be used directly to generate high-quality docstrings for

  The training data consists of 100,000 Python functions and their docstrings extracted from popular open-source repositories in the FLOSS ecosystem. Repositories were filtered based on metrics such as number of contributors (> 50), commits (> 5k), stars (> 35k), and forks (> 10k) to focus on well-established and actively maintained projects.

- An abstract syntax tree (AST) based parser was used to extract functions and docstrings. Challenges in the data sampling process included syntactic errors, multi-language repositories, computational expense, repository size discrepancies, and ensuring diversity while avoiding repetition.
-
  #### Training Hyperparameters

- The model was fine-tuned using Low-Rank Adaptation (LoRA) for 4 epochs with a batch size of 8 and gradient accumulation steps of 16. The initial learning rate was 2e-4. In total, there were 78,446,592 LoRA parameters and 185,040,896 training tokens. The full hyperparameter configuration is provided in Table 2 of the paper.
+ | Hyperparameter | Value |
+ |-----------------------------|-------------|
+ | Fine-tuning Method | LoRA |
+ | Epochs | 4 |
+ | Batch Size | 8 |
+ | Gradient Accumulation Steps | 16 |
+ | Initial Learning Rate | 2e-4 |
+ | LoRA Parameters | 78,446,592 |
+ | Training Tokens | 185,040,896 |

- Fine-tuning was performed using an Intel 12900K CPU, an Nvidia RTX-3090 GPU, and 64 GB RAM. Total fine-tuning time was 48 GPU hours.
+ Fine-tuning was performed using an Intel 12900K CPU, an Nvidia RTX-3090 GPU, and 64 GB RAM. Total fine-tuning time was 48 GPU hours.

  ## Evaluation

  <!-- This section describes the evaluation protocols and provides the results. -->

- ### Testing Data, Factors & Metrics
-
  #### Metrics

  - **Accuracy:** Measures the coverage of the generated docstring on code elements like input/output variables. Calculated using cosine similarity between the generated and expert docstring embeddings.
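
Taken literally, the repository-selection thresholds quoted in the training-data paragraph above amount to a simple predicate over repository metadata. A minimal sketch, assuming the metadata has already been collected into plain dictionaries; the field names and example values are illustrative, not from the model card:

```python
# Thresholds quoted in the model card for selecting FLOSS repositories.
THRESHOLDS = {
    "contributors": 50,
    "commits": 5_000,
    "stars": 35_000,
    "forks": 10_000,
}

def is_well_established(repo: dict) -> bool:
    """Return True if a repository exceeds every threshold.

    `repo` is assumed to be a dict carrying the metric names above,
    e.g. {"name": "org/project", "contributors": 1200, "commits": 32_000, ...}.
    """
    return all(repo.get(metric, 0) > minimum for metric, minimum in THRESHOLDS.items())

# Hypothetical metadata, for illustration only:
repos = [
    {"name": "big/project", "contributors": 900, "commits": 60_000, "stars": 80_000, "forks": 20_000},
    {"name": "small/tool", "contributors": 12, "commits": 800, "stars": 150, "forks": 30},
]
print([r["name"] for r in repos if is_well_established(r)])  # ['big/project']
```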
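The paragraph this commit removes described an AST-based parser for pulling out function/docstring pairs. Python's standard `ast` module covers that directly; the sketch below is one way to do it (the helper name and the skip-on-`SyntaxError` policy are assumptions, though the card does cite syntactic errors as a sampling challenge):

```python
import ast

def extract_function_docstrings(source: str):
    """Yield (function_name, source_segment, docstring) for each documented function."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Files that fail to parse are skipped; the card lists syntactic
        # errors as one of the data-sampling challenges.
        return
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            docstring = ast.get_docstring(node)
            if docstring:
                yield node.name, ast.get_source_segment(source, node), docstring

# Example usage:
sample = '''
def add(a, b):
    """Return the sum of a and b."""
    return a + b
'''
for name, code, doc in extract_function_docstrings(sample):
    print(name, "->", doc)
```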
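The hyperparameter table added by this commit maps onto a standard PEFT/Transformers fine-tuning setup. In the sketch below, only the epochs, batch size, gradient-accumulation steps, and learning rate come from the table; the base checkpoint, LoRA rank, alpha, and dropout are placeholders, since the card does not state them here:

```python
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# Placeholder: the card does not name the base checkpoint in this section.
model = AutoModelForCausalLM.from_pretrained("base-model-checkpoint")

lora_config = LoraConfig(
    r=64,              # placeholder: rank is not stated in the card
    lora_alpha=16,     # placeholder
    lora_dropout=0.05, # placeholder
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the trainable LoRA parameter count

# These values come from the hyperparameter table in the diff above.
training_args = TrainingArguments(
    output_dir="documint-lora",
    num_train_epochs=4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
)
```

With the batch size of 8 and 16 gradient-accumulation steps from the table, the effective batch size works out to 8 × 16 = 128 sequences per optimizer step.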
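The accuracy metric above is defined as cosine similarity between embeddings of the generated and expert docstrings. A minimal sketch using `sentence-transformers`; the choice of embedding model is an assumption, not something this card specifies:

```python
from sentence_transformers import SentenceTransformer, util

# Embedding model is an assumption; the card only says "docstring embeddings".
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def docstring_accuracy(generated: str, expert: str) -> float:
    """Cosine similarity between the generated and expert docstring embeddings."""
    gen_emb, exp_emb = embedder.encode([generated, expert], convert_to_tensor=True)
    return util.cos_sim(gen_emb, exp_emb).item()

# Example usage:
print(docstring_accuracy(
    "Returns the sum of the two input numbers a and b.",
    "Add two numbers and return the result.",
))
```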