maxhirez committed
Commit 409f846 · verified · 1 parent: 6592f8a

Update README.md


Added explanation and code sample.

Files changed (1): README.md (+106 -1)
README.md CHANGED
@@ -9,4 +9,109 @@ tags:
- cryptography
- math
- quantum_computing
---
# ShorNet
ShorNet is a multi-layer perceptron that predicts the two prime factors of large 1024-bit numbers supplied as hex strings. It was conceived as a function approximation of Shor's algorithm for prime factorization, which is intended for theoretical future quantum computers.
## Background
ShorNet began as a project for learning about neural networks and cryptography, and for testing the capabilities of the Mac Studio with the M3 Ultra chip. It was inspired by the success of Google DeepMind's AlphaFold, which predicts protein folding with high effectiveness, a problem that was similarly thought to be solvable only by mega- to giga-qubit quantum computers.
## Sample usage
```python
import torch
import torch.nn as nn

# Input your 1024-bit hex number to factorize
number_to_factorize = "0x7703af0000000000fa6ead00000000008d2a480000000000772ba480000000007c0a7100000000006b72bb00000000001b842200000000000f9c57100000000015e642c00000000050bf8f00000000003b7c390000000000127718200000000052345d8000000000e7d9db00000000004058fd00000000005eb1d50000000000"

# Model architecture
class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear1, self.linear2 = nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.activation = nn.ReLU()

    def forward(self, x):
        identity = x
        x = self.activation(self.linear1(self.norm1(x)))
        return self.linear2(self.norm2(x)) + identity

class PrimeFactorMLP(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, num_layers=6):
        super().__init__()
        self.input_proj = nn.Linear(input_dim, hidden_dim)
        self.residual_blocks = nn.ModuleList([ResidualBlock(hidden_dim) for _ in range(num_layers)])
        self.norm, self.activation = nn.LayerNorm(hidden_dim), nn.ReLU()
        self.p_head, self.q_head = nn.Linear(hidden_dim, output_dim), nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.input_proj(x)
        for block in self.residual_blocks:
            x = block(x)
        x = self.activation(self.norm(x))
        return self.p_head(x), self.q_head(x)

# Helper functions: pack a big integer into 64-bit chunks scaled to [0, 1], and back
def int_to_tensor(num, bits=1024, chunk_size=64):
    chunks = bits // chunk_size
    tensor = torch.zeros(chunks)
    for i in range(chunks):
        mask = (1 << chunk_size) - 1
        chunk_val = (num >> (i * chunk_size)) & mask
        tensor[chunks - i - 1] = chunk_val / ((1 << chunk_size) - 1)
    return tensor

def tensor_to_int(tensor, chunk_size=64):
    num = 0
    for i, chunk_val in enumerate(tensor):
        val = int(round(chunk_val.item() * ((1 << chunk_size) - 1)))
        num |= val << ((len(tensor) - i - 1) * chunk_size)
    return num

# Convert input to integer
n = int(number_to_factorize, 16)

# Load model (download from Hugging Face or a local path)
model = PrimeFactorMLP(input_dim=16, hidden_dim=2048, output_dim=8)
model.load_state_dict(torch.load('final_model.pth', map_location='cpu')['model_state_dict'])
model.eval()

# Prepare input (shape: [1, 16])
n_tensor = int_to_tensor(n).unsqueeze(0)

# Predict factors
with torch.no_grad():
    p_pred, q_pred = model(n_tensor)
    p = tensor_to_int(p_pred[0])
    q = tensor_to_int(q_pred[0])

# Print results
print(f"Input number: {number_to_factorize[:34]}...")
print(f"Predicted P: 0x{p:0128x}")
print(f"Predicted Q: 0x{q:0128x}")
print(f"Product: 0x{p*q:0256x}")
print(f"Bit match: {((p*q) == n)}")
```
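If the checkpoint is hosted on the Hugging Face Hub rather than stored locally, it could be fetched with the `huggingface_hub` library before loading. The snippet below is only a sketch: the repository id `maxhirez/ShorNet` is an assumption and may need to be adjusted to the actual repository layout.

```python
# Sketch only: repo_id is an assumption and may differ from the actual repository.
import torch
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download(repo_id="maxhirez/ShorNet", filename="final_model.pth")
state_dict = torch.load(checkpoint_path, map_location="cpu")["model_state_dict"]
model.load_state_dict(state_dict)  # `model` is the PrimeFactorMLP constructed above
model.eval()
```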
## Results
It will come as no surprise that the model does not reliably predict the prime factors of a 1024-bit product, but it does exhibit some interesting properties with their own implications.

### Patterns in the Model's Predictions

- Consistent Bit Accuracy: The model achieves remarkably consistent bit matching across samples, averaging around 80-82% of bits correct for both prime factors (one way to measure this is sketched after this list). This uniformity suggests the model is capturing fundamental structural patterns rather than just memorizing training examples.
- High-Order Bit Preservation: Looking at the hex representations, the model often gets the leftmost (most significant) digits fairly accurate. This indicates it prioritizes the high-order bits, which have the largest impact on the product.
- Q Factor Pattern Convergence: The predicted Q values show a noticeable pattern: many predictions have similar digit sequences in their middle sections. For example, many Q predictions contain segments like 80...7f...80... in similar positions, suggesting the model is applying a learned template.
- Relative Error Consistency: The average relative errors for P and Q are almost identical (0.159 vs. 0.160), indicating the model doesn't favor one factor over the other.
- Prediction Regularization: The predicted factors often appear "smoother" and less random than the true factors, suggesting the network applies some form of regularization or pattern-based prediction rather than capturing the true randomness of prime numbers.
- Value Range Compression: The predicted values are compressed relative to the true values, particularly for the Q factor: the model's predictions don't span as wide a numerical range as the actual values.

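As a rough illustration of how the bit-accuracy figures above might be computed, here is a minimal sketch that compares a predicted factor against a known true factor bit by bit. It is not the project's evaluation code; the inputs are assumed to be Python integers such as the ones produced by `tensor_to_int` above.

```python
# Minimal sketch (not the project's evaluation code): fraction of matching bits
# between a known factor and a predicted factor, both given as Python integers.
def bit_accuracy(true_val: int, pred_val: int, bits: int = 512) -> float:
    mismatches = bin((true_val ^ pred_val) & ((1 << bits) - 1)).count("1")  # XOR marks differing bits
    return 1.0 - mismatches / bits

# Tiny example with placeholder values (real factors would be 512-bit primes):
print(bit_accuracy(0b101101, 0b101001, bits=6))  # 5 of 6 bits agree -> ~0.833
```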
### Implications
The model seems to have learned to:

1. Approximate the general magnitude of each prime factor (getting high-order bits mostly right)
2. Apply certain templates or patterns it found useful during training
3. Balance errors between the two factors rather than optimizing for one

### Final analysis
While the model does achieve a significant search-space reduction, the ~80% bit accuracy means roughly 20% of bits (about 100 bits for each 512-bit prime) are incorrect. This shrinks the search space from 2^512 to approximately 2^100 (at least if the uncertain bit positions were known), a dramatic improvement that is still exponentially large.
This means that even when a prediction is used as the starting point for a classical iterative brute-force approach, the worst-case computation time remains exponential.
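For intuition, the arithmetic behind these figures can be spelled out directly. The 0.80 accuracy used below is the approximate value reported in the Results section, not an exact measurement.

```python
# Back-of-the-envelope arithmetic behind the search-space claim above.
prime_bits = 512              # each factor of a 1024-bit product is ~512 bits
approx_bit_accuracy = 0.80    # approximate figure from the Results section

wrong_bits = round((1 - approx_bit_accuracy) * prime_bits)
print(f"Incorrect bits per factor: ~{wrong_bits}")                    # ~102
print(f"Residual search space (if positions were known): ~2**{wrong_bits}")
print(f"Naive search space without the model:            2**{prime_bits}")
```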
While methods such as probabilistic search algorithms or lattice reduction might extract further computational advantage from the predictions as starting points, such investigations were beyond the scope of this machine learning project.
- _Claude 3.7 Sonnet was used for large portions of the coding and analysis in this project._