<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Watermark Detector</title>
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet">
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/font/bootstrap-icons.css">
    <link rel="stylesheet" href="{{ url_for('static', filename='styles.css') }}">
  </head>
  <body>
    <div class="container">
      <div class="d-flex justify-content-between align-items-center">
        <h1>Interactive watermark detector</h1>
        <div class="d-flex gap-2">
          <button class="btn btn-outline-secondary" data-bs-toggle="modal" data-bs-target="#helpModal">
            <i class="bi bi-question-circle"></i>
          </button>
          <button class="btn btn-outline-secondary" data-bs-toggle="modal" data-bs-target="#paramsModal">
            <i class="bi bi-gear"></i>
          </button>
        </div>
      </div>
      <!-- Help Modal -->
      <div class="modal fade" id="helpModal" tabindex="-1">
        <div class="modal-dialog modal-lg">
          <div class="modal-content">
            <div class="modal-header">
              <h5 class="modal-title">Watermark Detection Help</h5>
              <button type="button" class="btn-close" data-bs-dismiss="modal"></button>
            </div>
            <div class="modal-body help-modal-body">
              <h4>What is LLM watermarking?</h4>
              <p>LLM watermarking is a technique that slightly modifies how language models generate text, making it possible to detect if text was generated by a specific AI model without visibly changing the text quality.</p>
              <h4>How to use this demo</h4>
              <ol>
                <li>Enter a prompt in the top text area to generate watermarked text</li>
                <li>The generated text will appear in the second text box</li>
                <li>The text will be automatically analyzed to show which tokens (parts of text) were influenced by the watermark</li>
                <li>The statistics at the bottom show the detection results</li>
                <li>You can also paste any text in the second box to test if it contains a watermark</li>
              </ol>
              <h4>Detection Methods</h4>
              <p><strong>Maryland</strong>: A token-level detector that checks whether each token falls into a pseudorandom "greenlist" derived from the preceding tokens, based on the paper "<a href="https://arxiv.org/abs/2301.10226" target="_blank">A Watermark for Large Language Models</a>" by Kirchenbauer et al.</p>
              <p><strong>OpenAI</strong>: A similar watermarking method, inspired by the sampling-based scheme reported to be used at OpenAI (due to Scott Aaronson).</p>
              <p><strong>Maryland Z-score</strong>: A variant of the Maryland detector that uses a z-score (normal approximation) instead of an exact p-value; it is generally less reliable, especially on short texts.</p>
              <p><strong>OpenAI Z-score</strong>: The analogous z-score variant of the OpenAI detector, with the same caveats.</p>
              <h4>Parameters Explained</h4>
              <dl class="help-description-list">
                <dt>Detector Type</dt>
                <dd>The algorithm used to detect watermarks in the text. Different detectors perform better in different scenarios.</dd>
                <dt>Seed</dt>
                <dd>The random seed used for watermarking. The detector must use the same seed that was used when generating the text. In a real-world scenario, this would be kept private by the model provider.</dd>
                <dt>N-gram Size</dt>
                <dd>The number of previous tokens considered when choosing "greenlist" tokens. Larger values make the watermark less robust against edits but may improve text quality.</dd>
                <dt>Delta</dt>
                <dd>The bias added to "greenlist" tokens during generation. Higher values make the watermark stronger but might affect text quality. Typical values range from 1.0 to 5.0.</dd>
                <dt>Temperature</dt>
                <dd>Controls randomness in text generation. Higher values (e.g., 1.0) produce more diverse outputs; lower values (e.g., 0.2) make outputs more focused and deterministic.</dd>
              </dl>
              <h4>Understanding Results</h4>
              <dl class="help-description-list">
                <dt>Tokens</dt>
                <dd>The total number of tokens in the analyzed text. Tokens are units of text that may represent words, parts of words, or punctuation.</dd>
                <dt>Scored Tokens</dt>
                <dd>The number of tokens that were actually evaluated by the detector (excludes first few tokens that don't have enough context).</dd>
                <dt>Final Score</dt>
                <dd>A measure of how likely the text contains a watermark. Higher scores indicate stronger evidence of watermarking.</dd>
                <dt>P-value</dt>
                <dd>The statistical significance of the detection. Lower values (especially p < 1e-6) indicate strong evidence that the text was watermarked. Values close to 0.5 suggest no watermark is present.</dd>
              </dl>
              <h4>Related Papers</h4>
              <ul class="paper-references">
                <li>
                  <a href="https://arxiv.org/abs/2301.10226" target="_blank">A Watermark for Large Language Models</a>
                  <span class="paper-authors">Kirchenbauer et al. (2023)</span>
                </li>
                <li>
                  <a href="https://arxiv.org/abs/2308.00113" target="_blank">Three Bricks to Consolidate Watermarks for Large Language Models</a>
                  <span class="paper-authors">Fernandez et al. (2023)</span>
                </li>
              </ul>
            </div>
            <div class="modal-footer">
              <button type="button" class="btn btn-primary" data-bs-dismiss="modal">Close</button>
            </div>
          </div>
        </div>
      </div>
      <!-- Advanced Parameters Modal -->
      <div class="modal fade" id="paramsModal" tabindex="-1">
        <div class="modal-dialog">
          <div class="modal-content">
            <div class="modal-header">
              <h5 class="modal-title">Advanced Parameters</h5>
              <button type="button" class="btn-close" data-bs-dismiss="modal"></button>
            </div>
            <div class="modal-body">
              <div class="mb-3">
                <div class="d-flex align-items-center">
                  <i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="The algorithm used to detect watermarks in the text. Different algorithms have different strengths and weaknesses."></i>
                  <label for="detectorType" class="form-label mb-0">Detector Type</label>
                </div>
                <select class="form-select" id="detectorType">
                  <option value="maryland">Maryland</option>
                  <option value="marylandz">Maryland Z-score</option>
                  <option value="openai">OpenAI</option>
                  <option value="openaiz">OpenAI Z-score</option>
                </select>
                <div class="form-text">Type of watermark detection algorithm</div>
              </div>
              <div class="mb-3">
                <div class="d-flex align-items-center">
                  <i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="The random seed used for watermarking. For detection to work, this must match the seed used during generation."></i>
                  <label for="seed" class="form-label mb-0">Seed</label>
                </div>
                <input type="number" class="form-control" id="seed" value="0">
                <div class="form-text">Random seed for the watermark detector</div>
              </div>
              <div class="mb-3">
                <div class="d-flex align-items-center">
<i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="The number of previous tokens considered when determining the hashing. Higher values are more secure but may affect robustness. Lower values may introduce a bias in the text distribution."></i> | |
<label for="ngram" class="form-label mb-0">N-gram Size</label> | |
</div> | |
<input type="number" class="form-control" id="ngram" value="1"> | |
<div class="form-text">Size of the n-gram window used for detection</div> | |
</div> | |
<div class="mb-3"> | |
<div class="d-flex align-items-center"> | |
<i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="The bias added to greenlist tokens during generation. Higher values (2.0-5.0) make watermarks easier to detect but might affect text quality."></i> | |
<label for="delta" class="form-label mb-0">Delta</label> | |
</div> | |
<input type="number" step="0.1" class="form-control" id="delta" value="2.0"> | |
<div class="form-text">Bias added to greenlist tokens (for Maryland method)</div> | |
</div> | |
<div class="mb-3"> | |
<div class="d-flex align-items-center"> | |
<i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="Controls randomness in generation. Higher values (closer to 2.0) create more diverse outputs and make watermarks easier to detect."></i> | |
<label for="temperature" class="form-label mb-0">Temperature</label> | |
</div> | |
<input type="number" step="0.1" class="form-control" id="temperature" value="0.8"> | |
<div class="form-text">Temperature for sampling (higher = more random)</div> | |
</div> | |
</div> | |
<div class="modal-footer"> | |
<button type="button" class="btn btn-secondary" data-bs-dismiss="modal">Close</button> | |
<button type="button" class="btn btn-primary" id="applyParams">Apply</button> | |
</div> | |
</div> | |
</div> | |
</div> | |
      <!-- Input Form -->
      <div class="input-section">
        <div class="prompt-container">
          <textarea id="prompt_text"
                    placeholder="Enter your prompt here to generate text with the model..."></textarea>
          <button class="floating-btn" id="generateBtn">
            <i class="bi bi-send-fill send-icon"></i>
            <i class="bi bi-stop-fill stop-icon"></i>
          </button>
        </div>
        <textarea id="user_text"
                  placeholder="Generated text will appear here. Replace or edit this text to see how watermark detection works."></textarea>
      </div>
      <!-- Token Display -->
      <div class="token-display" id="tokenDisplay"></div>
      <!-- Statistics -->
      <div class="stats-container">
        <div>
          <div class="stat-value" id="tokenCount">0</div>
          <div class="stat-label">
            Tokens
            <i class="bi bi-question-circle help-icon"></i>
            <span class="help-tooltip">Total number of tokens in the text</span>
          </div>
        </div>
        <div>
          <div class="stat-value" id="scoredTokens">0</div>
          <div class="stat-label">
            Scored Tokens
            <i class="bi bi-question-circle help-icon"></i>
            <span class="help-tooltip">Number of tokens that were actually scored by the detector (excludes first n-gram tokens and duplicates)</span>
          </div>
        </div>
        <div>
          <div class="stat-value" id="finalScore">0.00</div>
          <div class="stat-label">
            Final Score
            <i class="bi bi-question-circle help-icon"></i>
            <span class="help-tooltip">Cumulative score from all scored tokens. Higher values indicate more likely watermarked text</span>
          </div>
        </div>
        <div>
          <div class="stat-value" id="pValue">0.500</div>
          <div class="stat-label">
            P-value
            <i class="bi bi-question-circle help-icon"></i>
            <span class="help-tooltip">Statistical significance of the score. Lower values indicate stronger evidence of watermarking (p < 0.05 is typically considered significant)</span>
          </div>
        </div>
      </div>
    </div>
    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js"></script>
    <script>
      let debounceTimeout = null;
      let abortController = null;
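      // Module-level state: debounceTimeout delays re-running detection while the
      // user is still typing; abortController lets the stop button cancel an
      // in-flight /generate request.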
      const textarea = document.getElementById('user_text');
      const promptArea = document.getElementById('prompt_text');
      const generateBtn = document.getElementById('generateBtn');
      const tokenDisplay = document.getElementById('tokenDisplay');
      const tokenCount = document.getElementById('tokenCount');
      const scoredTokens = document.getElementById('scoredTokens');
      const finalScore = document.getElementById('finalScore');
      const pValue = document.getElementById('pValue');
      const applyParamsBtn = document.getElementById('applyParams');
      const seedInput = document.getElementById('seed');
      const ngramInput = document.getElementById('ngram');
      const detectorTypeSelect = document.getElementById('detectorType');
      const deltaInput = document.getElementById('delta');
      const temperatureInput = document.getElementById('temperature');
      // Initialize tooltips
      document.addEventListener('DOMContentLoaded', function() {
        // Initialize Bootstrap tooltips
        const tooltipTriggerList = [].slice.call(document.querySelectorAll('[data-bs-toggle="tooltip"]'));
        const tooltipList = tooltipTriggerList.map(function (tooltipTriggerEl) {
          return new bootstrap.Tooltip(tooltipTriggerEl);
        });
        // Initialize our tokenization
        updateTokenization().catch(error => {
          console.error('Error during initial tokenization:', error);
        });
      });
      function startGeneration() {
        const prompt = promptArea.value.trim();
        if (!prompt) {
          alert('Please enter a prompt first.');
          return;
        }
        generateBtn.classList.add('generating');
        textarea.value = '';
        // Create new AbortController for this request
        abortController = new AbortController();
        // Get current parameters
        const params = {
          detector_type: detectorTypeSelect.value,
          seed: parseInt(seedInput.value) || 0,
          ngram: parseInt(ngramInput.value) || 1,
          delta: parseFloat(deltaInput.value) || 2.0,
          temperature: parseFloat(temperatureInput.value) || 0.8
        };
        // Create headers for SSE
        const headers = new Headers({
          'Content-Type': 'application/json',
          'Accept': 'text/event-stream',
        });
        // Start fetch request with signal
        fetch('/generate', {
          method: 'POST',
          headers: headers,
          body: JSON.stringify({
            prompt: prompt,
            params: params
          }),
          signal: abortController.signal // Add the abort signal
        }).then(response => {
          const reader = response.body.getReader();
          const decoder = new TextDecoder();
          let buffer = '';
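          // NOTE (assumed backend contract): /generate streams Server-Sent Events,
          // one JSON payload per event, e.g.
          //   data: {"token": " next"}   -> incremental token to append
          //   data: {"done": true}       -> generation finished
          // Events are separated by a blank line, which is why pump() splits the
          // buffer on '\n\n' and processText() looks for the 'data: ' prefix.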
          function processText(text) {
            const lines = text.split('\n');
            for (const line of lines) {
              if (line.startsWith('data: ')) {
                try {
                  const data = JSON.parse(line.slice(6));
                  if (data.error) {
                    alert('Error: ' + data.error);
                    stopGeneration();
                    return;
                  }
                  if (data.token) {
                    // Append new token to existing text
                    textarea.value += data.token;
                    updateTokenization();
                  }
                  if (data.text) {
                    // Final text (only used if something went wrong with streaming)
                    textarea.value = data.text;
                    updateTokenization();
                  }
                  if (data.done) {
                    stopGeneration();
                  }
                } catch (e) {
                  console.error('Error parsing SSE data:', e);
                }
              }
            }
          }
          function pump() {
            return reader.read().then(({value, done}) => {
              if (done) {
                if (buffer.length > 0) {
                  processText(buffer);
                }
                return;
              }
              buffer += decoder.decode(value, {stream: true});
              const lines = buffer.split('\n\n');
              buffer = lines.pop();
              for (const line of lines) {
                processText(line);
              }
              return pump();
            });
          }
          return pump();
        })
        .catch(error => {
          if (error.name === 'AbortError') {
            console.log('Generation stopped by user');
          } else {
            console.error('Error:', error);
            alert('Error: Failed to generate text');
          }
        })
        .finally(() => {
          generateBtn.classList.remove('generating');
          abortController = null;
        });
      }
      function stopGeneration() {
        if (abortController) {
          abortController.abort();
          abortController = null;
        }
        generateBtn.classList.remove('generating');
      }
      // The generate button toggles between starting and stopping generation
      generateBtn.addEventListener('click', function(e) {
        e.preventDefault(); // Prevent any double triggers
        if (generateBtn.classList.contains('generating')) {
          stopGeneration();
        } else {
          startGeneration();
        }
      });
      async function updateTokenization() {
        const text = textarea.value;
        try {
          // Validate parameters before sending
          const seed = parseInt(seedInput.value);
          const ngram = parseInt(ngramInput.value);
          const delta = parseFloat(deltaInput.value);
          const temperature = parseFloat(temperatureInput.value);
          const response = await fetch('/tokenize', {
            method: 'POST',
            headers: {
              'Content-Type': 'application/json',
            },
            body: JSON.stringify({
              text: text,
              params: {
                detector_type: detectorTypeSelect.value,
                seed: isNaN(seed) ? 0 : seed,
                ngram: isNaN(ngram) ? 1 : ngram,
                delta: isNaN(delta) ? 2.0 : delta,
                temperature: isNaN(temperature) ? 0.8 : temperature
              }
            })
          });
          if (!response.ok) {
            const errorData = await response.json();
            throw new Error(errorData.error || `HTTP error! status: ${response.status}`);
          }
          const data = await response.json();
          if (data.error) {
            throw new Error(data.error);
          }
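          // Fields expected in the /tokenize response (as used below): tokens,
          // scores, pvalues and colors (one entry per token), plus token_count,
          // ntoks_scored, final_score and final_pvalue.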
          // Update token display
          tokenDisplay.innerHTML = data.tokens.map((token, i) => {
            const score = data.scores[i];
            const pvalue = data.pvalues[i];
            const scoreDisplay = (score !== null && !isNaN(score)) ? score.toFixed(3) : 'N/A';
            const pvalueDisplay = (pvalue !== null && !isNaN(pvalue)) ? formatPValue(pvalue) : 'N/A';
            return `<span class="token" style="background-color: ${data.colors[i]}">
              ${token}
              <div class="token-tooltip">
                Score: ${scoreDisplay}<br>
                P-value: ${pvalueDisplay}
              </div>
            </span>`;
          }).join('');
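          // Note: token strings and colors come straight from the server and are
          // interpolated into innerHTML unescaped, so /tokenize is assumed to
          // return HTML-safe values.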
          // Update counts and stats - safely handle null values
          tokenCount.textContent = data.token_count || 0;
          scoredTokens.textContent = data.ntoks_scored || 0;
          finalScore.textContent = (data.final_score !== null && !isNaN(data.final_score)) ?
            data.final_score.toFixed(2) : '0.00';
          pValue.textContent = (data.final_pvalue !== null && !isNaN(data.final_pvalue)) ?
            formatPValue(data.final_pvalue) : '0.500';
          // Clear any previous error
          const existingError = tokenDisplay.querySelector('.alert-danger');
          if (existingError) {
            existingError.remove();
          }
        } catch (error) {
          console.error('Error updating tokenization:', error);
          // Show detailed error to user
          tokenDisplay.innerHTML = `<div class="alert alert-danger">
            <strong>Error:</strong> ${error.message || 'Error updating results. Please try again.'}
          </div>`;
          // Reset stats on error
          tokenCount.textContent = '0';
          scoredTokens.textContent = '0';
          finalScore.textContent = '0.00';
          pValue.textContent = '0.500';
        }
      }
      // Debounce re-tokenization while the user edits the text
      textarea.addEventListener('input', function() {
        if (debounceTimeout) {
          clearTimeout(debounceTimeout);
        }
        debounceTimeout = setTimeout(updateTokenization, 500); // 500 ms debounce
      });
      // For each parameter field: reset invalid input to its default value and
      // trigger a (debounced) re-run of detection
      [
        [seedInput, '0', parseInt],
        [ngramInput, '1', parseInt],
        [deltaInput, '2.0', parseFloat],
        [temperatureInput, '0.8', parseFloat]
      ].forEach(([input, fallback, parse]) => {
        input.addEventListener('input', function() {
          if (this.value !== '' && isNaN(parse(this.value))) {
            this.value = fallback;
          }
          if (debounceTimeout) {
            clearTimeout(debounceTimeout);
          }
          debounceTimeout = setTimeout(updateTokenization, 500);
        });
      });
      // Keyboard shortcut: Cmd/Ctrl+Enter starts or stops generation when the
      // prompt box is focused, otherwise applies the current parameters
      document.addEventListener('keydown', function(e) {
        if ((e.metaKey || e.ctrlKey) && e.key === 'Enter') {
          e.preventDefault();
          if (document.activeElement === promptArea) {
            if (generateBtn.classList.contains('generating')) {
              stopGeneration();
            } else {
              startGeneration();
            }
          } else {
            applyParamsBtn.click();
          }
        }
      });
      detectorTypeSelect.addEventListener('change', function() {
        if (debounceTimeout) {
          clearTimeout(debounceTimeout);
        }
        debounceTimeout = setTimeout(updateTokenization, 500);
      });
      // The Apply button re-runs detection with the current parameters, then closes the modal
      applyParamsBtn.addEventListener('click', function() {
        updateTokenization().then(() => {
          const modal = bootstrap.Modal.getInstance(document.getElementById('paramsModal'));
          if (modal) {
            modal.hide();
          }
        }).catch(error => {
          console.error('Error applying parameters:', error);
        });
      });
      // Helper for formatting p-values: fixed notation for p >= 0.001, scientific notation below
      function formatPValue(value) {
        if (value >= 0.001) {
          return value.toFixed(3);
        } else {
          return value.toExponential(2);
        }
      }
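      // For instance: formatPValue(0.0421) -> "0.042"; formatPValue(3.2e-7) -> "3.20e-7"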
    </script>
  </body>
</html> | |