ykarout committed on
Commit
8b8d4e1
·
verified ·
1 Parent(s): 87e8565

Update README.md

Files changed (1)
  1. README.md +117 -13
README.md CHANGED
@@ -2,21 +2,132 @@
2
  library_name: transformers
3
  tags:
4
  - unsloth
5
  ---
6
 
7
  # Model Card for Model ID
8
 
9
- <!-- Provide a quick summary of what the model is/does. -->
10
-
11
 
12
 
13
  ## Model Details
14
 
15
  ### Model Description
16
 
17
- <!-- Provide a longer summary of what this model is. -->
18
 
19
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
20
 
21
  - **Developed by:** [More Information Needed]
22
  - **Funded by [optional]:** [More Information Needed]
@@ -24,19 +135,12 @@ This is the model card of a 🤗 transformers model that has been pushed on the
24
  - **Model type:** [More Information Needed]
25
  - **Language(s) (NLP):** [More Information Needed]
26
  - **License:** [More Information Needed]
27
- - **Finetuned from model [optional]:** [More Information Needed]
28
-
29
- ### Model Sources [optional]
30
-
31
- <!-- Provide the basic links for the model. -->
32
 
33
- - **Repository:** [More Information Needed]
34
- - **Paper [optional]:** [More Information Needed]
35
- - **Demo [optional]:** [More Information Needed]
36
 
37
  ## Uses
38
 
39
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
40
 
41
  ### Direct Use
42
 
 
2
  library_name: transformers
3
  tags:
4
  - unsloth
5
+ license: apache-2.0
6
+ datasets:
7
+ - nvidia/Llama-Nemotron-Post-Training-Dataset
8
+ base_model:
9
+ - unsloth/phi-4
10
+ - deepseek-ai/DeepSeek-R1
11
+ pipeline_tag: text-generation
12
  ---
13
 
14
  # Model Card for Model ID
15
 
16
+ Phi-4 fine-tuned on reasoning outputs for complex logic, math, and coding challenges. The training data is derived from nvidia/Llama-Nemotron-Post-Training-Dataset, filtered to keep long reasoning answers generated by DeepSeek R1.
 
17
 
18
 
19
  ## Model Details
20
 
21
  ### Model Description
22
 
23
+ Phi-4 fine-tuned on reasoning outputs for complex logic, math, and coding challenges. The training data is derived from nvidia/Llama-Nemotron-Post-Training-Dataset, filtered to keep long reasoning answers generated by DeepSeek R1.
24
+ Training used 10,000 samples on an RTX 5090 (yes, I managed to make unsloth work on a 5090) with a context length of 16384. It took around 10 hours using unsloth 4-bit quants and the transformers SFT Trainer.
25
+ You do not need to add a system prompt, but it can help in some use cases. The model will automatically go into thinking mode when presented with complex tasks.
26
+
27
+
28
+ Recommended settings: temperature = 1.5 (you can test values from 1 to 1.5), min_p = 0.1, and a repeat penalty of 1.2 or 1.3 to mitigate extremely long reasoning around the same concept.
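
The recommended settings can be collected into one set of sampling arguments. The sketch below is a minimal illustration, not the author's inference script: the helper name `recommended_sampling` is hypothetical, and mapping "repeat penalty" to the transformers `repetition_penalty` argument is an assumption (llama.cpp-style frontends use a different name for the same idea).

```python
def recommended_sampling(temperature=1.5, repetition_penalty=1.2):
    """Build sampling keyword arguments from the model card's recommended settings.

    The function name and defaults are illustrative; the values come from the
    model card (temperature 1 to 1.5, min_p 0.1, repeat penalty 1.2 or 1.3).
    """
    return {
        "do_sample": True,            # sampling must be on for temperature/min_p to apply
        "temperature": temperature,
        "min_p": 0.1,
        "repetition_penalty": repetition_penalty,  # assumed equivalent of "repeat penalty"
    }


# Hypothetical usage with transformers. MODEL_ID is a placeholder, since the
# card does not state the repository name:
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
#   model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
#   inputs = tokenizer(prompt, return_tensors="pt")
#   output = model.generate(**inputs, max_new_tokens=4096, **recommended_sampling())

settings = recommended_sampling(temperature=1.0, repetition_penalty=1.3)
print(settings)
```

If the reasoning loops around the same idea, raising the penalty from 1.2 to 1.3 is the adjustment the card suggests trying first.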
29
+
30
+
31
+ Try the following prompt, or similarly structured prompts containing complex connections, and the model will automatically go into thinking mode and generate long reasoning chains akin to DeepSeek.
32
+
33
+ #### Prompt:
34
+
35
+ This prompt was generated using Claude 3.7 Sonnet and is not included in the train or test dataset. Use similarly structured prompts and see the magic!
36
+
37
+ 1. Network Packet Routing Optimization Challenge
38
+
39
+ You're designing a system to optimize packet routing in a network with multiple possible paths. The network consists of nodes connected by bidirectional links, each with different bandwidth capacities and latency values.
40
+
41
+ Your task is to find the most efficient routing path between a given source and destination node that satisfies specific constraints on bandwidth, latency, and hop count.
42
+
43
+ Input Specification
44
+
45
+ The first line contains four space-separated integers: `n`, `m`, `b_min`, and `l_max` (2 ≤ n ≤ 100, 1 ≤ m ≤ 5000, 1 ≤ b_min ≤ 1000, 1 ≤ l_max ≤ 10000)
46
+ - `n`: number of nodes in the network (numbered from 1 to n)
47
+ - `m`: number of links between nodes
48
+ - `b_min`: minimum required bandwidth for the path
49
+ - `l_max`: maximum allowed total latency for the path
50
+
51
+ The next `m` lines each contain four integers `u`, `v`, `b`, `l` (1 ≤ u, v ≤ n, u ≠ v, 1 ≤ b ≤ 1000, 1 ≤ l ≤ 1000):
52
+ - `u`, `v`: nodes connected by this link
53
+ - `b`: bandwidth capacity of the link
54
+ - `l`: latency of the link
55
+
56
+ The last line contains two integers `s` and `t` (1 ≤ s, t ≤ n, s ≠ t) - the source and destination nodes.
57
+
58
+ Constraints and Notes
59
+
60
+ 1. The bandwidth of a path is the minimum bandwidth among all links in the path
61
+ 2. The latency of a path is the sum of latencies of all links in the path
62
+ 3. A valid path must have bandwidth ≥ `b_min` and latency ≤ `l_max`
63
+ 4. Among all valid paths, you must choose the one with the highest bandwidth
64
+ 5. If there are multiple paths with the same highest bandwidth, choose the one with the lowest latency
65
+ 6. If there are still multiple paths, choose the one with the fewest hops (links)
66
+ 7. If no valid path exists, output "NO PATH"
67
+
68
+ Output
69
+
70
+ If a valid path exists, the first line should contain three space-separated integers: the bandwidth of the chosen path, the total latency of the chosen path, and the number of hops.
71
+
72
+ The second line should contain the sequence of nodes in the path, starting with `s` and ending with `t`.
73
+
74
+ If no valid path exists, output "NO PATH" (without quotes).
75
+
76
+ Examples
77
+
78
+ Example 1:
79
+ ```
80
+ 5 6 50 100
81
+ 1 2 100 20
82
+ 2 3 80 30
83
+ 3 5 70 10
84
+ 1 4 60 10
85
+ 4 5 90 30
86
+ 1 3 50 5
87
+ 1 5
88
+ ```
89
+
90
+ Output:
91
+ ```
92
+ 70 60 3
93
+ 1 2 3 5
94
+ ```
95
+
96
+ Example 2:
97
+ ```
98
+ 4 5 80 50
99
+ 1 2 80 20
100
+ 2 3 120 15
101
+ 3 4 90 10
102
+ 1 3 100 30
103
+ 2 4 70 25
104
+ 1 4
105
+ ```
106
+
107
+ Output:
108
+ ```
109
+ 90 40 2
110
+ 1 3 4
111
+ ```
112
+
113
+ Example 3:
114
+ ```
115
+ 3 3 100 100
116
+ 1 2 150 40
117
+ 2 3 180 70
118
+ 1 3 120 30
119
+ 1 3
120
+ ```
121
+
122
+ Output:
123
+ ```
124
+ 120 30 1
125
+ 1 3
126
+ ```
127
+
128
+ Your solution should efficiently find the optimal path that satisfies all constraints, handling potentially complex network topologies with multiple possible routes between source and destination.
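
To check the model's answer to the prompt above, a small reference solver is handy. The following is one possible approach under stated assumptions, not output from the model and not the only valid algorithm: it tries each candidate bottleneck bandwidth from highest to lowest, and for each one runs Dijkstra over the links that meet it, ordering paths by (latency, hops). The function name `best_route` and its argument layout are hypothetical.

```python
import heapq


def best_route(n, b_min, l_max, links, s, t):
    """Return (bandwidth, latency, hops, path), or None when no valid path exists.

    `links` is a list of (u, v, bandwidth, latency) tuples for bidirectional links.
    """
    # Candidate bottleneck values that satisfy b_min, highest first.
    thresholds = sorted({b for _, _, b, _ in links if b >= b_min}, reverse=True)
    for bw in thresholds:
        # Keep only links able to carry at least `bw`.
        adj = {node: [] for node in range(1, n + 1)}
        for u, v, b, lat in links:
            if b >= bw:
                adj[u].append((v, lat))
                adj[v].append((u, lat))

        # Dijkstra keyed on (total latency, hop count), compared lexicographically.
        best = {s: (0, 0)}
        parent = {s: None}
        heap = [(0, 0, s)]
        while heap:
            lat, hops, u = heapq.heappop(heap)
            if (lat, hops) > best[u]:
                continue  # stale heap entry
            if u == t:
                break  # destination settled with its final cost
            for v, link_lat in adj[u]:
                cand = (lat + link_lat, hops + 1)
                if v not in best or cand < best[v]:
                    best[v] = cand
                    parent[v] = u
                    heapq.heappush(heap, (cand[0], cand[1], v))

        # The first threshold with a path inside the latency budget is optimal,
        # because every higher bandwidth was already tried and rejected.
        if t in best and best[t][0] <= l_max:
            path, node = [], t
            while node is not None:
                path.append(node)
                node = parent[node]
            path.reverse()
            return bw, best[t][0], best[t][1], path
    return None


# Example 1 from the prompt.
example_links = [(1, 2, 100, 20), (2, 3, 80, 30), (3, 5, 70, 10),
                 (1, 4, 60, 10), (4, 5, 90, 30), (1, 3, 50, 5)]
result = best_route(5, 50, 100, example_links, 1, 5)
print("NO PATH" if result is None else result)
```

With at most 1000 distinct bandwidth values and 5000 links, repeating Dijkstra per threshold stays well within reach for the stated input limits.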
129
+
130
 
 
131
 
132
  - **Developed by:** [More Information Needed]
133
  - **Funded by [optional]:** [More Information Needed]
 
135
  - **Model type:** [More Information Needed]
136
  - **Language(s) (NLP):** [More Information Needed]
137
  - **License:** [More Information Needed]
138
+ - **Finetuned from model [optional]:** unsloth/phi-4
139
140
 
141
  ## Uses
142
 
143
+ Complex reasoning tasks that require challenging thinking and coding (mostly Python).
144
 
145
  ### Direct Use
146