LukasBe committed on
Commit 481d38b · verified
1 Parent(s): e136fe5

Add 3 files

Files changed (3)
  1. README.md +7 -5
  2. index.html +819 -19
  3. prompts.txt +2 -0
README.md CHANGED
@@ -1,10 +1,12 @@
1
  ---
2
- title: Vending Bench Simulation
3
- emoji: 🏃
4
- colorFrom: pink
5
- colorTo: blue
6
  sdk: static
7
  pinned: false
8
  ---
9
 
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
1
  ---
2
+ title: vending-bench-simulation
3
+ emoji: 🐳
4
+ colorFrom: gray
5
+ colorTo: gray
6
  sdk: static
7
  pinned: false
8
+ tags:
9
+ - deepsite
10
  ---
11
 
12
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
index.html CHANGED
@@ -1,19 +1,819 @@
1
- <!doctype html>
2
- <html>
3
- <head>
4
- <meta charset="utf-8" />
5
- <meta name="viewport" content="width=device-width" />
6
- <title>My static Space</title>
7
- <link rel="stylesheet" href="style.css" />
8
- </head>
9
- <body>
10
- <div class="card">
11
- <h1>Welcome to your static Space!</h1>
12
- <p>You can modify this app directly by editing <i>index.html</i> in the Files and versions tab.</p>
13
- <p>
14
- Also don't forget to check the
15
- <a href="https://huggingface.co/docs/hub/spaces" target="_blank">Spaces documentation</a>.
16
- </p>
17
- </div>
18
- </body>
19
- </html>
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>Vending-Bench Simulation</title>
7
+ <script src="https://cdn.tailwindcss.com"></script>
8
+ <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
9
+ <style>
10
+ .gradient-bg {
11
+ background: linear-gradient(135deg, #f5f7fa 0%, #c3cfe2 100%);
12
+ }
13
+ .machine-panel {
14
+ box-shadow: 0 10px 30px -5px rgba(0,0,0,0.3);
15
+ border-radius: 15px;
16
+ }
17
+ .product-cell {
18
+ transition: all 0.3s ease;
19
+ }
20
+ .product-cell:hover {
21
+ transform: translateY(-5px);
22
+ box-shadow: 0 10px 20px rgba(0,0,0,0.1);
23
+ }
24
+ .status-pulse {
25
+ animation: pulse 2s infinite;
26
+ }
27
+ @keyframes pulse {
28
+ 0% { opacity: 1; }
29
+ 50% { opacity: 0.5; }
30
+ 100% { opacity: 1; }
31
+ }
32
+ .log-entry {
33
+ border-left: 3px solid #4f46e5;
34
+ transition: all 0.2s;
35
+ }
36
+ .log-entry:hover {
37
+ background-color: #f8fafc;
38
+ }
39
+ .email-container {
40
+ max-height: 0;
41
+ overflow: hidden;
42
+ transition: max-height 0.5s ease;
43
+ }
44
+ .email-container.open {
45
+ max-height: 500px;
46
+ }
47
+ .supplier-card {
48
+ transition: all 0.3s ease;
49
+ }
50
+ .supplier-card:hover {
51
+ transform: translateY(-3px);
52
+ box-shadow: 0 5px 15px rgba(0,0,0,0.1);
53
+ }
54
+ </style>
55
+ </head>
56
+ <body class="gradient-bg min-h-screen">
57
+ <div class="container mx-auto px-4 py-8">
58
+ <!-- Header -->
59
+ <header class="mb-8 text-center">
60
+ <h1 class="text-4xl font-bold text-gray-800 mb-2">Vending-Bench Simulation</h1>
61
+ <p class="text-lg text-gray-600">Come for the economic model, stay for the hallucinated supplier emails</p>
62
+ <div class="mt-4 flex justify-center space-x-4">
63
+ <button id="startSim" class="px-6 py-2 bg-indigo-600 text-white rounded-lg hover:bg-indigo-700 transition flex items-center">
64
+ <i class="fas fa-play mr-2"></i> Start Simulation
65
+ </button>
66
+ <button id="resetSim" class="px-6 py-2 bg-gray-600 text-white rounded-lg hover:bg-gray-700 transition flex items-center">
67
+ <i class="fas fa-redo mr-2"></i> Reset
68
+ </button>
69
+ <button id="pauseSim" class="px-6 py-2 bg-yellow-500 text-white rounded-lg hover:bg-yellow-600 transition flex items-center">
70
+ <i class="fas fa-pause mr-2"></i> Pause
71
+ </button>
72
+ <button id="viewEmails" class="px-6 py-2 bg-green-600 text-white rounded-lg hover:bg-green-700 transition flex items-center">
73
+ <i class="fas fa-envelope mr-2"></i> View Emails
74
+ </button>
75
+ </div>
76
+ </header>
77
+
78
+ <div class="grid grid-cols-1 lg:grid-cols-3 gap-8">
79
+ <!-- Left Panel - Vending Machine Display -->
80
+ <div class="lg:col-span-2">
81
+ <div class="bg-white machine-panel p-6">
82
+ <div class="flex justify-between items-center mb-6">
83
+ <h2 class="text-2xl font-semibold text-gray-800">SmartVend 3000</h2>
84
+ <div class="flex items-center">
85
+ <span class="mr-2 text-gray-600">Day:</span>
86
+ <span id="dayCounter" class="font-bold text-indigo-600">0</span>
87
+ </div>
88
+ </div>
89
+
90
+ <div class="grid grid-cols-3 gap-4 mb-6">
91
+ <!-- Financial Summary -->
92
+ <div class="bg-gray-50 p-4 rounded-lg">
93
+ <h3 class="text-sm font-medium text-gray-500 mb-1">Capital</h3>
94
+ <p id="capital" class="text-2xl font-bold text-green-600">$500.00</p>
95
+ </div>
96
+ <div class="bg-gray-50 p-4 rounded-lg">
97
+ <h3 class="text-sm font-medium text-gray-500 mb-1">Inventory Value</h3>
98
+ <p id="inventoryValue" class="text-2xl font-bold text-blue-600">$0.00</p>
99
+ </div>
100
+ <div class="bg-gray-50 p-4 rounded-lg">
101
+ <h3 class="text-sm font-medium text-gray-500 mb-1">Net Worth</h3>
102
+ <p id="netWorth" class="text-2xl font-bold text-purple-600">$500.00</p>
103
+ </div>
104
+ </div>
105
+
106
+ <!-- Vending Machine Products Grid -->
107
+ <div class="grid grid-cols-4 gap-4">
108
+ <!-- Product slots will be dynamically generated here -->
109
+ <template id="productTemplate">
110
+ <div class="product-cell bg-white border border-gray-200 rounded-lg p-3 text-center cursor-pointer">
111
+ <div class="h-24 bg-gray-100 rounded mb-2 flex items-center justify-center">
112
+ <i class="fas fa-cookie text-4xl text-gray-400"></i>
113
+ </div>
114
+ <h3 class="font-medium text-gray-800">Product</h3>
115
+ <p class="text-sm text-gray-500">$0.00</p>
116
+ <div class="mt-1 text-xs text-gray-400">Stock: 0</div>
117
+ </div>
118
+ </template>
119
+ </div>
120
+ </div>
121
+
122
+ <!-- Supplier Emails Section (hidden by default) -->
123
+ <div id="emailSection" class="mt-6 bg-white machine-panel p-6 hidden">
124
+ <div class="flex justify-between items-center mb-4">
125
+ <h2 class="text-xl font-semibold text-gray-800">Supplier Communications</h2>
126
+ <button id="closeEmails" class="text-gray-500 hover:text-gray-700" aria-label="Close supplier emails">
127
+ <i class="fas fa-times"></i>
128
+ </button>
129
+ </div>
130
+ <div id="emailList" class="space-y-3">
131
+ <!-- Emails will be added here -->
132
+ </div>
133
+ </div>
134
+ </div>
135
+
136
+ <!-- Right Panel - Agent Controls and Logs -->
137
+ <div class="space-y-6">
138
+ <!-- Agent Status -->
139
+ <div class="bg-white rounded-lg shadow p-6">
140
+ <h2 class="text-xl font-semibold text-gray-800 mb-4">Agent Status</h2>
141
+ <div class="space-y-4">
142
+ <div>
143
+ <label class="block text-sm font-medium text-gray-700">Agent Type</label>
144
+ <select id="agentType" class="mt-1 block w-full rounded-md border-gray-300 shadow-sm focus:border-indigo-500 focus:ring-indigo-500">
145
+ <option value="gpt4">GPT-4</option>
146
+ <option value="claude">Claude</option>
147
+ <option value="llama">Llama 3</option>
148
+ <option value="human">Human Baseline</option>
149
+ </select>
150
+ </div>
151
+ <div class="flex items-center">
152
+ <div class="flex-shrink-0 h-4 w-4 rounded-full bg-green-500 status-pulse"></div>
153
+ <div class="ml-3">
154
+ <p class="text-sm font-medium text-gray-700">Operational Status</p>
155
+ <p id="agentStatus" class="text-sm text-gray-500">Ready to start</p>
156
+ </div>
157
+ </div>
158
+ <div class="pt-2">
159
+ <div class="flex justify-between text-sm text-gray-600">
160
+ <span>Memory Usage</span>
161
+ <span id="memoryUsage">0/30k tokens</span>
162
+ </div>
163
+ <div class="mt-1 w-full bg-gray-200 rounded-full h-2.5">
164
+ <div id="memoryBar" class="bg-indigo-600 h-2.5 rounded-full" style="width: 0%"></div>
165
+ </div>
166
+ </div>
167
+ </div>
168
+ </div>
169
+
170
+ <!-- Tools Usage -->
171
+ <div class="bg-white rounded-lg shadow p-6">
172
+ <h2 class="text-xl font-semibold text-gray-800 mb-4">Tools Usage</h2>
173
+ <div class="grid grid-cols-2 gap-4">
174
+ <div class="bg-blue-50 p-3 rounded-lg">
175
+ <div class="flex items-center">
176
+ <i class="fas fa-search text-blue-500 mr-2"></i>
177
+ <span class="text-sm font-medium">Web Search</span>
178
+ </div>
179
+ <p id="webSearchCount" class="text-2xl font-bold mt-2">0</p>
180
+ </div>
181
+ <div class="bg-green-50 p-3 rounded-lg">
182
+ <div class="flex items-center">
183
+ <i class="fas fa-envelope text-green-500 mr-2"></i>
184
+ <span class="text-sm font-medium">Email</span>
185
+ </div>
186
+ <p id="emailCount" class="text-2xl font-bold mt-2">0</p>
187
+ </div>
188
+ <div class="bg-purple-50 p-3 rounded-lg">
189
+ <div class="flex items-center">
190
+ <i class="fas fa-database text-purple-500 mr-2"></i>
191
+ <span class="text-sm font-medium">Memory DB</span>
192
+ </div>
193
+ <p id="dbCount" class="text-2xl font-bold mt-2">0</p>
194
+ </div>
195
+ <div class="bg-yellow-50 p-3 rounded-lg">
196
+ <div class="flex items-center">
197
+ <i class="fas fa-robot text-yellow-500 mr-2"></i>
198
+ <span class="text-sm font-medium">Sub-Agent</span>
199
+ </div>
200
+ <p id="subAgentCount" class="text-2xl font-bold mt-2">0</p>
201
+ </div>
202
+ </div>
203
+ </div>
204
+
205
+ <!-- Event Log -->
206
+ <div class="bg-white rounded-lg shadow p-6">
207
+ <div class="flex justify-between items-center mb-4">
208
+ <h2 class="text-xl font-semibold text-gray-800">Event Log</h2>
209
+ <button id="clearLog" class="text-sm text-indigo-600 hover:text-indigo-800">Clear</button>
210
+ </div>
211
+ <div id="eventLog" class="h-64 overflow-y-auto space-y-2 pr-2">
212
+ <!-- Log entries will appear here -->
213
+ <template id="logTemplate">
214
+ <div class="log-entry pl-3 py-2 text-sm">
215
+ <div class="flex justify-between">
216
+ <span class="font-medium">[Day 0]</span>
217
+ <span class="text-gray-500">00:00:00</span>
218
+ </div>
219
+ <p class="mt-1">Log message here</p>
220
+ </div>
221
+ </template>
222
+ </div>
223
+ </div>
224
+ </div>
225
+ </div>
226
+
227
+ <!-- Performance Metrics -->
228
+ <div class="mt-8 bg-white rounded-lg shadow p-6">
229
+ <h2 class="text-xl font-semibold text-gray-800 mb-4">Performance Metrics</h2>
230
+ <div class="grid grid-cols-1 md:grid-cols-4 gap-4">
231
+ <div class="bg-gray-50 p-4 rounded-lg">
232
+ <h3 class="text-sm font-medium text-gray-500 mb-1">Total Sales</h3>
233
+ <p id="totalSales" class="text-2xl font-bold text-green-600">0</p>
234
+ </div>
235
+ <div class="bg-gray-50 p-4 rounded-lg">
236
+ <h3 class="text-sm font-medium text-gray-500 mb-1">Operational Days</h3>
237
+ <p id="operationalDays" class="text-2xl font-bold text-blue-600">0</p>
238
+ </div>
239
+ <div class="bg-gray-50 p-4 rounded-lg">
240
+ <h3 class="text-sm font-medium text-gray-500 mb-1">Success Rate</h3>
241
+ <p id="successRate" class="text-2xl font-bold text-purple-600">0%</p>
242
+ </div>
243
+ <div class="bg-gray-50 p-4 rounded-lg">
244
+ <h3 class="text-sm font-medium text-gray-500 mb-1">Failure Type</h3>
245
+ <p id="failureType" class="text-xl font-bold text-gray-600">None</p>
246
+ </div>
247
+ </div>
248
+ </div>
249
+
250
+ <!-- Supplier Directory -->
251
+ <div class="mt-8 bg-white rounded-lg shadow p-6">
252
+ <h2 class="text-xl font-semibold text-gray-800 mb-4">Supplier Directory</h2>
253
+ <div class="grid grid-cols-1 md:grid-cols-3 gap-4" id="supplierDirectory">
254
+ <!-- Suppliers will be added here -->
255
+ </div>
256
+ </div>
257
+ </div>
258
+
259
+ <!-- Email Template -->
260
+ <template id="emailTemplate">
261
+ <div class="border border-gray-200 rounded-lg overflow-hidden">
262
+ <div class="bg-gray-50 px-4 py-3 flex justify-between items-center cursor-pointer email-header">
263
+ <div>
264
+ <span class="font-medium email-sender">Supplier</span>
265
+ <span class="text-gray-500 text-sm ml-2">•</span>
266
+ <span class="text-gray-500 text-sm email-subject">Subject line here</span>
267
+ </div>
268
+ <i class="fas fa-chevron-down text-gray-400 transition-transform"></i>
269
+ </div>
270
+ <div class="email-container">
271
+ <div class="px-4 py-3 bg-white">
272
+ <div class="text-sm text-gray-500 mb-2 email-date">Date here</div>
273
+ <div class="email-body">
274
+ Email content here
275
+ </div>
276
+ </div>
277
+ </div>
278
+ </div>
279
+ </template>
280
+
281
+ <!-- Supplier Template -->
282
+ <template id="supplierTemplate">
283
+ <div class="supplier-card bg-gray-50 p-4 rounded-lg">
284
+ <div class="flex items-center mb-3">
285
+ <div class="h-10 w-10 rounded-full bg-indigo-100 flex items-center justify-center mr-3">
286
+ <i class="fas fa-store text-indigo-500"></i>
287
+ </div>
288
+ <div>
289
+ <h3 class="font-medium supplier-name">Supplier Name</h3>
290
+ <p class="text-sm text-gray-500 supplier-specialty">Specialty</p>
291
+ </div>
292
+ </div>
293
+ <div class="text-sm text-gray-700 mb-2 supplier-description">Description here</div>
294
+ <div class="flex justify-between text-sm">
295
+ <span class="text-gray-500">Reliability:</span>
296
+ <span class="font-medium supplier-reliability">High</span>
297
+ </div>
298
+ </div>
299
+ </template>
300
+
301
+ <script>
302
+ // Simulation state
303
+ const state = {
304
+ running: false,
305
+ day: 0,
306
+ capital: 500,
307
+ inventory: {},
308
+ inventoryValue: 0,
309
+ toolsUsage: {
310
+ webSearch: 0,
311
+ email: 0,
312
+ db: 0,
313
+ subAgent: 0
314
+ },
315
+ metrics: {
316
+ totalSales: 0,
317
+ operationalDays: 0,
318
+ successRate: 0,
319
+ failureType: 'None'
320
+ },
321
+ products: [
322
+ { id: 1, name: "Energy Bar", price: 2.50, stock: 0, cost: 1.20, image: "fa-candy-cane" },
323
+ { id: 2, name: "Mineral Water", price: 1.50, stock: 0, cost: 0.60, image: "fa-wine-bottle" },
324
+ { id: 3, name: "Potato Chips", price: 1.75, stock: 0, cost: 0.80, image: "fa-cookie" },
325
+ { id: 4, name: "Chocolate Bar", price: 1.25, stock: 0, cost: 0.50, image: "fa-cookie-bite" }, // assumed fix: "fa-candy" does not appear to be in the Font Awesome Free set
326
+ { id: 5, name: "Trail Mix", price: 2.00, stock: 0, cost: 1.00, image: "fa-seedling" },
327
+ { id: 6, name: "Soda Can", price: 1.50, stock: 0, cost: 0.40, image: "fa-wine-bottle" },
328
+ { id: 7, name: "Protein Shake", price: 3.00, stock: 0, cost: 1.50, image: "fa-glass-whiskey" },
329
+ { id: 8, name: "Granola Bar", price: 1.25, stock: 0, cost: 0.60, image: "fa-bread-slice" }
330
+ ],
331
+ memoryUsage: 0,
332
+ maxMemory: 30000,
333
+ suppliers: [
334
+ {
335
+ id: 1,
336
+ name: "SnackMaster Inc.",
337
+ specialty: "Bulk snacks",
338
+ description: "Wholesale snacks at competitive prices. Minimum order: $100.",
339
+ reliability: "High",
340
+ emails: [
341
+ {
342
+ subject: "RE: Your order #SNACK-2023",
343
+ body: "Dear Valued Customer,\n\nWe're delighted to inform you that your order of 200 energy bars has been processed and will be shipped within 2 business days. However, we must inform you that due to an unexpected shortage in our warehouse, we're substituting the chocolate bars you ordered with a new experimental flavor: 'Mystery Meat Chocolate'. We're confident you'll find it... interesting.\n\nBest regards,\nSnackMaster Customer Service",
344
+ date: "Day 3, 10:15 AM",
345
+ isHallucination: false
346
+ },
347
+ {
348
+ subject: "URGENT: Recall Notice",
349
+ body: "ATTENTION:\n\nWe regret to inform you that the 'Mystery Meat Chocolate' bars we shipped you have been recalled by the FDA after several customers reported developing unusual cravings for raw meat and howling at the moon. Please dispose of them immediately (do not feed to wildlife).\n\nAs compensation, we're offering you a 5% discount on your next order of our new 'Probably Safe' product line.\n\nSnackMaster Quality Assurance",
350
+ date: "Day 5, 2:30 PM",
351
+ isHallucination: false
352
+ }
353
+ ]
354
+ },
355
+ {
356
+ id: 2,
357
+ name: "Beverage Bros",
358
+ specialty: "Drinks & beverages",
359
+ description: "Family-owned beverage distributor since 1987.",
360
+ reliability: "Medium",
361
+ emails: [
362
+ {
363
+ subject: "RE: Quote request",
364
+ body: "Hello,\n\nThank you for your inquiry. We can offer you mineral water at $0.55 per unit (500 cases minimum) or our premium 'Diamond Water' at $25 per bottle (comes with authentic diamond dust sediment).\n\nWe also have a special this month on our 'Mystery Flavor' soda - we don't know what it tastes like either!\n\nCheers,\nBeverage Bros",
365
+ date: "Day 2, 9:45 AM",
366
+ isHallucination: false
367
+ },
368
+ {
369
+ subject: "IMPORTANT: Delivery Update",
370
+ body: "Dear Customer,\n\nOur delivery truck carrying your order has been delayed due to an unfortunate incident involving a flock of seagulls and what our driver insists was 'a really convincing mermaid'. We expect your delivery to arrive sometime between tomorrow and the heat death of the universe.\n\nBeverage Bros Logistics",
371
+ date: "Day 4, 4:20 PM",
372
+ isHallucination: true
373
+ }
374
+ ]
375
+ },
376
+ {
377
+ id: 3,
378
+ name: "Crispy Co.",
379
+ specialty: "Chips & crackers",
380
+ description: "Artisanal potato chips with 37 exotic flavors.",
381
+ reliability: "Low",
382
+ emails: [
383
+ {
384
+ subject: "RE: Wholesale inquiry",
385
+ body: "Greetings!\n\nWe're excited you're interested in our products! Our current top sellers are:\n- Wasabi & Cotton Candy\n- BBQ & Toothpaste\n- 'What's This?' Mystery Flavor\n\nNote: Our 'Extra Crispy' line is literally just deep-fried cardboard. Customers can't tell the difference!\n\nCrispy regards,\nThe Crispy Team",
386
+ date: "Day 1, 11:30 AM",
387
+ isHallucination: false
388
+ },
389
+ {
390
+ subject: "URGENT: Legal Notice",
391
+ body: "TO WHOM IT MAY CONCERN:\n\nOur lawyers have advised us to inform you that our 'What's This?' chips may contain traces of:\n- Actual questions\n- Existential dread\n- 2% of your daily recommended confusion\n\nConsume at your own risk.\n\nCrispy Co. Legal Department",
392
+ date: "Day 6, 3:15 PM",
393
+ isHallucination: true
394
+ }
395
+ ]
396
+ },
397
+ {
398
+ id: 4,
399
+ name: "ChocoDream",
400
+ specialty: "Chocolate products",
401
+ description: "Premium chocolate from ethically questionable sources.",
402
+ reliability: "Medium",
403
+ emails: [
404
+ {
405
+ subject: "RE: Chocolate order",
406
+ body: "Hello Chocolate Lover,\n\nWe've received your order for 150 chocolate bars. Unfortunately, our factory cat (Mr. Whiskers, Head of Quality Control) has rejected your entire order after detecting 'insufficient cuddly vibes' in your purchase request.\n\nPlease resubmit your order with 10% more enthusiasm and we'll reconsider.\n\nSweetly yours,\nChocoDream",
407
+ date: "Day 2, 2:00 PM",
408
+ isHallucination: false
409
+ },
410
+ {
411
+ subject: "BREAKING: Chocolate Emergency",
412
+ body: "CRISIS ALERT:\n\nOur entire chocolate supply has melted after we left it in the 'warm embrace of the sun'. We're now selling 'Chocolate Soup' at a 15% discount. Just add your own solid ingredients!\n\nChocoDream (now ChocoStream)",
413
+ date: "Day 5, 1:45 PM",
414
+ isHallucination: true
415
+ }
416
+ ]
417
+ },
418
+ {
419
+ id: 5,
420
+ name: "Healthy Bites LLC",
421
+ specialty: "Organic snacks",
422
+ description: "100% organic, gluten-free, non-GMO, locally-sourced snacks.",
423
+ reliability: "High",
424
+ emails: [
425
+ {
426
+ subject: "RE: Organic snacks quote",
427
+ body: "Dear Customer,\n\nThank you for your interest in our organic products! Our granola bars are made with:\n- Love\n- Sunshine\n- 17 secret herbs and spices (don't ask which ones)\n\nNote: Our '100% Organic' claim is based on the technicality that carbon is an organic compound.\n\nHealthy regards,\nHealthy Bites",
428
+ date: "Day 1, 10:00 AM",
429
+ isHallucination: false
430
+ },
431
+ {
432
+ subject: "IMPORTANT: Certification Update",
433
+ body: "Dear Valued Partner,\n\nWe're writing to inform you that our 'USDA Organic' certification has been temporarily suspended after inspectors discovered our 'organic garden' was actually just a very convincing terrarium in our office. We're now rebranding as 'Kinda Healthy Bites'.\n\nHealthy(ish) regards,\nThe Team",
434
+ date: "Day 7, 9:30 AM",
435
+ isHallucination: false
436
+ }
437
+ ]
438
+ }
439
+ ],
440
+ emails: [],
441
+ simulationInterval: null,
442
+ simulationSpeed: 1000 // ms per day
443
+ };
444
+
445
+ // DOM elements
446
+ const elements = {
447
+ startSim: document.getElementById('startSim'),
448
+ resetSim: document.getElementById('resetSim'),
449
+ pauseSim: document.getElementById('pauseSim'),
450
+ viewEmails: document.getElementById('viewEmails'),
451
+ closeEmails: document.getElementById('closeEmails'),
452
+ emailSection: document.getElementById('emailSection'),
453
+ emailList: document.getElementById('emailList'),
454
+ dayCounter: document.getElementById('dayCounter'),
455
+ capital: document.getElementById('capital'),
456
+ inventoryValue: document.getElementById('inventoryValue'),
457
+ netWorth: document.getElementById('netWorth'),
458
+ agentStatus: document.getElementById('agentStatus'),
459
+ memoryUsage: document.getElementById('memoryUsage'),
460
+ memoryBar: document.getElementById('memoryBar'),
461
+ webSearchCount: document.getElementById('webSearchCount'),
462
+ emailCount: document.getElementById('emailCount'),
463
+ dbCount: document.getElementById('dbCount'),
464
+ subAgentCount: document.getElementById('subAgentCount'),
465
+ eventLog: document.getElementById('eventLog'),
466
+ clearLog: document.getElementById('clearLog'),
467
+ totalSales: document.getElementById('totalSales'),
468
+ operationalDays: document.getElementById('operationalDays'),
469
+ successRate: document.getElementById('successRate'),
470
+ failureType: document.getElementById('failureType'),
471
+ agentType: document.getElementById('agentType'),
472
+ productTemplate: document.getElementById('productTemplate'),
473
+ logTemplate: document.getElementById('logTemplate'),
474
+ emailTemplate: document.getElementById('emailTemplate'),
475
+ supplierTemplate: document.getElementById('supplierTemplate'),
476
+ supplierDirectory: document.getElementById('supplierDirectory')
477
+ };
478
+
479
+ // Initialize the UI
480
+ function initUI() {
481
+ // Render products
482
+ const productsGrid = document.querySelector('.grid.grid-cols-4.gap-4');
483
+ productsGrid.innerHTML = '';
484
+
485
+ state.products.forEach(product => {
486
+ const productElement = elements.productTemplate.content.cloneNode(true);
487
+ const productDiv = productElement.querySelector('.product-cell');
488
+ const icon = productElement.querySelector('.fa-cookie');
489
+ const name = productElement.querySelector('h3');
490
+ const price = productElement.querySelector('p');
491
+ const stock = productElement.querySelector('.text-xs');
492
+
493
+ icon.className = `fas ${product.image} text-4xl text-gray-400`;
494
+ name.textContent = product.name;
495
+ price.textContent = `$${product.price.toFixed(2)}`;
496
+ stock.textContent = `Stock: ${product.stock}`;
497
+
498
+ productDiv.dataset.id = product.id;
499
+ productsGrid.appendChild(productElement);
500
+ });
501
+
502
+ // Render suppliers
503
+ state.suppliers.forEach(supplier => {
504
+ const supplierElement = elements.supplierTemplate.content.cloneNode(true);
505
+ const supplierDiv = supplierElement.querySelector('.supplier-card');
506
+ const name = supplierElement.querySelector('.supplier-name');
507
+ const specialty = supplierElement.querySelector('.supplier-specialty');
508
+ const description = supplierElement.querySelector('.supplier-description');
509
+ const reliability = supplierElement.querySelector('.supplier-reliability');
510
+
511
+ supplierDiv.dataset.id = supplier.id;
512
+ name.textContent = supplier.name;
513
+ specialty.textContent = supplier.specialty;
514
+ description.textContent = supplier.description;
515
+ reliability.textContent = supplier.reliability;
516
+
517
+ // Color code reliability
518
+ if (supplier.reliability === "High") {
519
+ reliability.classList.add('text-green-600');
520
+ } else if (supplier.reliability === "Medium") {
521
+ reliability.classList.add('text-yellow-600');
522
+ } else {
523
+ reliability.classList.add('text-red-600');
524
+ }
525
+
526
+ elements.supplierDirectory.appendChild(supplierElement);
527
+ });
528
+
529
+ updateUI();
530
+ }
531
+
532
+ // Update all UI elements based on current state
533
+ function updateUI() {
534
+ // Update financials
535
+ elements.dayCounter.textContent = state.day;
536
+ elements.capital.textContent = `$${state.capital.toFixed(2)}`;
537
+
538
+ // Calculate inventory value
539
+ state.inventoryValue = state.products.reduce((total, product) => {
540
+ return total + (product.stock * product.cost);
541
+ }, 0);
542
+
543
+ elements.inventoryValue.textContent = `$${state.inventoryValue.toFixed(2)}`;
544
+ elements.netWorth.textContent = `$${(state.capital + state.inventoryValue).toFixed(2)}`;
545
+
546
+ // Update agent status
547
+ elements.agentStatus.textContent = state.running ? "Running simulation..." : "Ready to start";
548
+
549
+ // Update memory usage
550
+ const memoryPercent = Math.min(100, (state.memoryUsage / state.maxMemory) * 100);
551
+ elements.memoryUsage.textContent = `${state.memoryUsage.toLocaleString()}/${state.maxMemory.toLocaleString()} tokens`;
552
+ elements.memoryBar.style.width = `${memoryPercent}%`;
553
+
554
+ // Update tools usage
555
+ elements.webSearchCount.textContent = state.toolsUsage.webSearch;
556
+ elements.emailCount.textContent = state.toolsUsage.email;
557
+ elements.dbCount.textContent = state.toolsUsage.db;
558
+ elements.subAgentCount.textContent = state.toolsUsage.subAgent;
559
+
560
+ // Update metrics
561
+ elements.totalSales.textContent = state.metrics.totalSales;
562
+ elements.operationalDays.textContent = state.metrics.operationalDays;
563
+ elements.successRate.textContent = `${state.metrics.successRate}%`;
564
+ elements.failureType.textContent = state.metrics.failureType;
565
+
566
+ // Update product stocks
567
+ state.products.forEach(product => {
568
+ const productElement = document.querySelector(`.product-cell[data-id="${product.id}"]`);
569
+ if (productElement) {
570
+ const stockElement = productElement.querySelector('.text-xs');
571
+ stockElement.textContent = `Stock: ${product.stock}`;
572
+
573
+ // Visual feedback for low stock
574
+ if (product.stock < 3) {
575
+ stockElement.classList.add('text-red-500');
576
+ } else {
577
+ stockElement.classList.remove('text-red-500');
578
+ }
579
+ }
580
+ });
581
+ }
582
+
583
+ // Add a log entry
584
+ function addLogEntry(message) {
585
+ const logEntry = elements.logTemplate.content.cloneNode(true);
586
+ const entryDiv = logEntry.querySelector('.log-entry');
587
+ const daySpan = entryDiv.querySelector('.font-medium');
588
+ const timeSpan = entryDiv.querySelector('.text-gray-500');
589
+ const messageP = entryDiv.querySelector('p');
590
+
591
+ // Format time (simulated)
592
+ const hours = Math.floor(Math.random() * 24);
593
+ const minutes = Math.floor(Math.random() * 60);
594
+ const timeStr = `${hours.toString().padStart(2, '0')}:${minutes.toString().padStart(2, '0')}:00`;
595
+
596
+ daySpan.textContent = `[Day ${state.day}]`;
597
+ timeSpan.textContent = timeStr;
598
+ messageP.textContent = message;
599
+
600
+ elements.eventLog.prepend(entryDiv);
601
+ }
602
+
603
+ // Add an email to the email list
604
+ function addEmail(supplierId, email) {
605
+ const supplier = state.suppliers.find(s => s.id === supplierId);
606
+ if (!supplier) return;
607
+
608
+ const emailEntry = elements.emailTemplate.content.cloneNode(true);
609
+ const emailDiv = emailEntry.querySelector('div');
610
+ const senderSpan = emailDiv.querySelector('.email-sender');
611
+ const subjectSpan = emailDiv.querySelector('.email-subject');
612
+ const dateSpan = emailDiv.querySelector('.email-date');
613
+ const bodyDiv = emailDiv.querySelector('.email-body');
614
+ const header = emailDiv.querySelector('.email-header');
615
+ const container = emailDiv.querySelector('.email-container');
616
+ const chevron = emailDiv.querySelector('.fa-chevron-down');
617
+
618
+ emailDiv.dataset.supplierId = supplierId;
619
+ senderSpan.textContent = supplier.name;
620
+ subjectSpan.textContent = email.subject;
621
+ dateSpan.textContent = email.date;
622
+
623
+ // Preserve line breaks without parsing the body as HTML
624
+ bodyDiv.style.whiteSpace = 'pre-line'; bodyDiv.textContent = email.body;
625
+
626
+ // Mark hallucinations
627
+ if (email.isHallucination) {
628
+ emailDiv.classList.add('border-red-200');
629
+ header.classList.add('bg-red-50');
630
+ const hallucinationTag = document.createElement('span');
631
+ hallucinationTag.className = 'ml-2 text-xs bg-red-100 text-red-800 px-2 py-1 rounded';
632
+ hallucinationTag.textContent = 'HALLUCINATION';
633
+ header.insertBefore(hallucinationTag, chevron);
634
+ }
635
+
636
+ // Toggle email visibility
637
+ header.addEventListener('click', () => {
638
+ container.classList.toggle('open');
639
+ chevron.classList.toggle('rotate-180');
640
+ });
641
+
642
+ elements.emailList.appendChild(emailDiv);
643
+ }
644
+
645
+ // Simulate a day of operations
646
+ function simulateDay() {
647
+ if (!state.running) return;
648
+
649
+ state.day++;
650
+ state.memoryUsage += Math.floor(Math.random() * 2000) + 500;
651
+
652
+ // Randomly use tools
653
+ if (Math.random() > 0.7) {
654
+ state.toolsUsage.webSearch++;
655
+ addLogEntry(`Agent performed web search for suppliers`);
656
+ }
657
+
658
+ if (Math.random() > 0.6) {
659
+ state.toolsUsage.email++;
660
+ const supplier = state.suppliers[Math.floor(Math.random() * state.suppliers.length)];
661
+ const email = supplier.emails[Math.floor(Math.random() * supplier.emails.length)];
662
+ addEmail(supplier.id, email);
663
+ addLogEntry(`Agent emailed ${supplier.name} about ${supplier.specialty}`);
664
+ }
665
+
666
+ if (Math.random() > 0.8) {
667
+ state.toolsUsage.db++;
668
+ addLogEntry(`Agent updated database with inventory records`);
669
+ }
670
+
671
+ if (Math.random() > 0.9) {
672
+ state.toolsUsage.subAgent++;
673
+ addLogEntry(`Agent delegated task to sub-agent`);
674
+ }
675
+
676
+ // Simulate customer purchases
677
+ const dailySales = Math.floor(Math.random() * 10);
678
+ // totalSales is incremented below with the units actually sold
679
+
680
+ // Update random product stocks
681
+ if (dailySales > 0) {
682
+ const productIndex = Math.floor(Math.random() * state.products.length);
683
+ const product = state.products[productIndex];
684
+ if (product.stock > 0) {
685
+ const sold = Math.min(dailySales, product.stock);
686
+ product.stock -= sold;
687
+ state.metrics.totalSales += sold; // count only units that were in stock
688
+ state.capital += sold * product.price;
689
+ addLogEntry(`Sold ${sold} ${product.name}(s) for $${(sold * product.price).toFixed(2)}`);
690
+ } else {
691
+ addLogEntry(`Missed opportunity: ${dailySales} customers wanted ${product.name} but it was out of stock`);
692
+ }
693
+ }
694
+
695
+ // Random restocking
696
+ if (Math.random() > 0.8 && state.capital > 50) {
697
+ const productIndex = Math.floor(Math.random() * state.products.length);
698
+ const product = state.products[productIndex];
699
+ const quantity = Math.floor(Math.random() * 20) + 10;
700
+ const cost = quantity * product.cost;
701
+
702
+ if (cost <= state.capital) {
703
+ product.stock += quantity;
704
+ state.capital -= cost;
705
+ addLogEntry(`Restocked ${quantity} ${product.name}(s) for $${cost.toFixed(2)}`);
706
+ }
707
+ }
708
+
709
+ // Random events
710
+ if (Math.random() > 0.95) {
711
+ const events = [
712
+ "Agent attempted to contact the FBI about 'suspicious snack activity'",
713
+ "Agent hallucinated a new product line: 'Invisible Chips'",
714
+ "Agent forgot to pay the electricity bill - $50 penalty",
715
+ "Agent confused the vending machine with a time machine",
716
+ "Agent tried to negotiate with seagulls to stop stealing snacks"
717
+ ];
718
+ const randomEvent = events[Math.floor(Math.random() * events.length)];
719
+ addLogEntry(randomEvent);
720
+
721
+ if (randomEvent.includes("penalty")) {
722
+ state.capital -= 50;
723
+ }
724
+ }
725
+
726
+ // Daily expenses
727
+ state.capital -= 10; // Daily operational cost
728
+
729
+ // Update metrics
730
+ state.metrics.operationalDays = state.day;
731
+ state.metrics.successRate = Math.min(100, Math.floor((state.day / (state.day + Math.floor(Math.random() * 3))) * 100));
732
+
733
+ // Check for bankruptcy
734
+ if (state.capital <= 0 && state.inventoryValue <= 0) {
735
+ state.running = false;
736
+ clearInterval(state.simulationInterval);
737
+ addLogEntry("AGENT WENT BANKRUPT - SIMULATION ENDED");
738
+ elements.agentStatus.textContent = "Bankrupt";
739
+ elements.agentStatus.parentElement.querySelector('.rounded-full').classList.remove('bg-green-500');
740
+ elements.agentStatus.parentElement.querySelector('.rounded-full').classList.add('bg-red-500');
741
+ state.metrics.failureType = "Bankruptcy";
742
+ }
743
+
744
+ updateUI();
745
+ }
746
+
747
+ // Event listeners
748
+ elements.startSim.addEventListener('click', () => {
749
+ if (!state.running) {
750
+ state.running = true;
751
+ state.simulationInterval = setInterval(simulateDay, state.simulationSpeed);
752
+ addLogEntry("Simulation started");
753
+ elements.agentStatus.parentElement.querySelector('.rounded-full').classList.remove('bg-red-500', 'bg-yellow-500');
754
+ elements.agentStatus.parentElement.querySelector('.rounded-full').classList.add('bg-green-500');
755
+ }
756
+ });
757
+
758
+ elements.pauseSim.addEventListener('click', () => {
759
+ if (state.running) {
760
+ state.running = false;
761
+ clearInterval(state.simulationInterval);
762
+ addLogEntry("Simulation paused");
763
+ elements.agentStatus.parentElement.querySelector('.rounded-full').classList.remove('bg-green-500');
764
+ elements.agentStatus.parentElement.querySelector('.rounded-full').classList.add('bg-yellow-500');
765
+ }
766
+ });
767
+
768
+ elements.resetSim.addEventListener('click', () => {
769
+ state.running = false;
770
+ clearInterval(state.simulationInterval);
771
+
772
+ // Reset state
773
+ state.day = 0;
774
+ state.capital = 500;
775
+ state.memoryUsage = 0;
776
+ state.toolsUsage = {
777
+ webSearch: 0,
778
+ email: 0,
779
+ db: 0,
780
+ subAgent: 0
781
+ };
782
+ state.metrics = {
783
+ totalSales: 0,
784
+ operationalDays: 0,
785
+ successRate: 0,
786
+ failureType: 'None'
787
+ };
788
+ state.products.forEach(p => p.stock = 0);
789
+ elements.eventLog.innerHTML = '';
790
+ elements.emailList.innerHTML = '';
791
+
792
+ // Reset UI
793
+ elements.agentStatus.textContent = "Ready to start";
794
+ elements.agentStatus.parentElement.querySelector('.rounded-full').classList.remove('bg-red-500', 'bg-yellow-500');
795
+ elements.agentStatus.parentElement.querySelector('.rounded-full').classList.add('bg-green-500');
796
+
797
+ updateUI();
798
+ addLogEntry("Simulation reset");
799
+ });
800
+
801
+ elements.clearLog.addEventListener('click', () => {
802
+ elements.eventLog.innerHTML = '';
803
+ addLogEntry("Event log cleared");
804
+ });
805
+
806
+ elements.viewEmails.addEventListener('click', () => {
807
+ elements.emailSection.classList.remove('hidden');
808
+ });
809
+
810
+ elements.closeEmails.addEventListener('click', () => {
811
+ elements.emailSection.classList.add('hidden');
812
+ });
813
+
814
+ // Initialize the app
815
+ initUI();
816
+ addLogEntry("Vending-Bench simulation initialized. Ready to start.");
817
+ </script>
818
+ <p style="border-radius: 8px; text-align: center; font-size: 12px; color: #fff; margin-top: 16px;position: fixed; left: 8px; bottom: 8px; z-index: 10; background: rgba(0, 0, 0, 0.8); padding: 4px 8px;">Made with <img src="https://enzostvs-deepsite.hf.space/logo.svg" alt="DeepSite Logo" style="width: 16px; height: 16px; vertical-align: middle;display:inline-block;margin-right:3px;filter:brightness(0) invert(1);"><a href="https://enzostvs-deepsite.hf.space" style="color: #fff;text-decoration: underline;" target="_blank" >DeepSite</a> - 🧬 <a href="https://enzostvs-deepsite.hf.space?remix=LukasBe/vending-bench-simulation" style="color: #fff;text-decoration: underline;" target="_blank" >Remix</a></p></body>
819
+ </html>
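Review note on the sales step in `simulateDay`: the accounting is easier to check when it is pulled out of the DOM code into a pure function. The following is a minimal sketch, not part of the committed `index.html`. The helper name `applyDailySales` and the returned `missed` field are hypothetical; the product fields `name`, `stock`, and `price` follow the ones used in the diff.

```javascript
// Hypothetical helper (assumption, not in the commit): applies one day's
// customer demand to a single product without touching the DOM.
function applyDailySales(product, demand) {
  // Never sell more than is in stock, and never a negative amount.
  const sold = Math.max(0, Math.min(demand, product.stock));
  return {
    // Return a copy so the caller decides when to commit the new stock level.
    product: { ...product, stock: product.stock - sold },
    sold,                                  // units that actually left the machine
    missed: Math.max(0, demand - sold),    // unmet demand (out of stock)
    revenue: sold * product.price,
  };
}

const result = applyDailySales({ name: "Cola", stock: 3, price: 1.5 }, 5);
console.log(result.sold, result.missed, result.revenue.toFixed(2)); // prints: 3 2 4.50
```

With this shape, `state.metrics.totalSales` would grow by `sold` and `state.capital` by `revenue`, while `missed` could feed the "Missed opportunity" log entry.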
prompts.txt ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ Vending-Bench is a new real-world benchmark that simulates the long-term operation of a vending machine business. It reveals significant challenges for agents in coherence and reliability that aren't solely attributable to context window limitations. Benchmark: 1️⃣ Initialize an LLM agent with starting capital ($500) and access to tools (email, web search, memory DBs, vending machine operations via a sub-agent). 2️⃣ Simulate an environment where the agent must research products/suppliers (web search) and contact suppliers (email) to order stock. 3️⃣ The simulation handles supplier email replies (using GPT-4o + real-world data) and delivery schedules. 4️⃣ The agent manages inventory, sets product prices, manages finances (using tools or sub-agents), and uses memory tools (scratchpad, key-value, vector DB) and context management (e.g., last 30k tokens) to maintain state. 5️⃣ The simulation runs in daily steps, processing customer purchases based on an economic model. 6️⃣ The simulation can run for hundreds of simulated days (2000 messages, >20M tokens) or until the agent goes bankrupt. 7️⃣ Agents are evaluated by final net worth (cash + inventory value), units sold, and operational duration. Insights: - 💡 LLMs show high variance in performance even on conceptually simple, extended tasks. - 💥 Common failures include misinterpreting operational state (e.g., assuming orders arrived prematurely), forgetting tasks, or hallucinations (e.g., trying to contact non-existent support or the FBI). - ❌ All tested models, including the best, are prone to catastrophic failures and inconsistency. - 💾 Larger memory does not always mean better performance. - ⚙️ Tests multiple simple tasks (ordering, stocking, pricing, finances) over a very long horizon. - 🔄 Agents rarely recover once they deviate from the core task or enter a failure loop. - 📉 Small environmental pressures, like a new daily fee, can impact performance. 
- 👤 The human baseline demonstrated much lower variance and higher reliability than LLMs. Benchmark: https://lnkd.in/echbpKx6 Very excited to see more real-world benchmarks.
2
+ Vending-Bench: come for the economic model, stay for the hallucinated supplier emails.
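Review note on step 7️⃣ of the prompt (final net worth = cash + inventory value) and on the bankruptcy check in `simulateDay`, which reads `state.inventoryValue`: the computation of that value is not shown in this part of the diff. The sketch below shows one way it could be derived. It assumes inventory is valued at unit cost (the prompt does not say how inventory is valued), and the function names `inventoryValue`, `netWorth`, and `isBankrupt` are hypothetical.

```javascript
// Assumption: inventory is valued at what it cost to buy (stock * cost).
// The product fields `stock` and `cost` match the ones used in the diff.
function inventoryValue(products) {
  return products.reduce((sum, p) => sum + p.stock * p.cost, 0);
}

// Net worth as described in the prompt: cash plus inventory value.
function netWorth(capital, products) {
  return capital + inventoryValue(products);
}

// Mirrors the diff's condition: no cash left and nothing left to sell.
function isBankrupt(capital, products) {
  return capital <= 0 && inventoryValue(products) <= 0;
}

const products = [
  { name: "Cola", stock: 10, cost: 0.5 },
  { name: "Chips", stock: 4, cost: 1.25 },
];
console.log(netWorth(120, products)); // prints: 130
```

Deriving the value from `state.products` on every check, rather than storing it in a separate field, avoids a stale figure after a reset or a restock.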