Spaces: pierrefdz/interactive-llm-wm

pierrefdz committed on Apr 4

Commit 1cc9a2b, 1 Parent(s): e4409fa

added helpers

Files changed (1)

wm_interactive/templates/index.html +8 -12

@@ -44,9 +44,9 @@
                         <h4>Detection Methods</h4>
                         <p><strong>Maryland</strong>: A token-level detection algorithm that analyzes how unexpected each token is, based on the paper "<a href="https://arxiv.org/abs/2301.10226" target="_blank">A Watermark for Large Language Models</a>" by Kirchenbauer et al.</p>
-                        <p><strong>Maryland Z-score</strong>: A variant of the Maryland detector that uses z-scores for better statistical interpretation.</p>
-                        <p><strong>OpenAI</strong>: A watermarking method similar to what was described in "<a href="https://arxiv.org/abs/2306.04634" target="_blank">A Watermark for Large Language Models</a>" by Kirchenbauer et al., inspired by initial reports from OpenAI.</p>
-                        <p><strong>OpenAI Z-score</strong>: A variant of the OpenAI detector that uses z-scores for better statistical interpretation.</p>
+                        <p><strong>OpenAI</strong>: A similar watermarking method inspired by initial reports from OpenAI.</p>
+                        <p><strong>Maryland Z-score</strong>: A worse variant of the Maryland detector that uses z-scores for statistical interpretation.</p>
+                        <p><strong>OpenAI Z-score</strong>: A worse variant of the OpenAI detector that uses z-scores for statistical interpretation.</p>
                         <h4>Parameters Explained</h4>
                         <dl class="help-description-list">
@@ -57,7 +57,7 @@
                             <dd>The random seed used for watermarking. The detector must use the same seed that was used when generating the text. In a real-world scenario, this would be kept private by the model provider.</dd>
                             <dt>N-gram Size</dt>
-                            <dd>The number of previous tokens considered when choosing "greenlist" tokens. Larger values make the watermark more robust against edits but may reduce text quality.</dd>
+                            <dd>The number of previous tokens considered when choosing "greenlist" tokens. Larger values make the watermark less robust against edits but may improve text quality.</dd>
                             <dt>Delta</dt>
                             <dd>The bias added to "greenlist" tokens during generation. Higher values make the watermark stronger but might affect text quality. Typical values range from 1.0 to 5.0.</dd>
@@ -78,22 +78,18 @@
                             <dd>A measure of how likely the text contains a watermark. Higher scores indicate stronger evidence of watermarking.</dd>
                             <dt>P-value</dt>
-                            <dd>The statistical significance of the detection. Lower values (especially p &lt; 0.05) indicate strong evidence that the text was watermarked. Values close to 0.5 suggest no watermark is present.</dd>
+                            <dd>The statistical significance of the detection. Lower values (especially p &lt; 1e-6) indicate strong evidence that the text was watermarked. Values close to 0.5 suggest no watermark is present.</dd>
                         </dl>
                         <h4>Related Papers</h4>
                         <ul class="paper-references">
                             <li>
                                 <a href="https://arxiv.org/abs/2301.10226" target="_blank">A Watermark for Large Language Models</a>
-                                <span class="paper-authors">Kirchenbauer, Geiping, et al. (2023)</span>
+                                <span class="paper-authors">Kirchenbauer, et al. (2023)</span>
                             </li>
                             <li>
-                                <a href="https://arxiv.org/abs/2306.04634" target="_blank">Robust Distortion-free Watermarks for Language Models</a>
-                                <span class="paper-authors">Kuditipudi, Thickstun, et al. (2023)</span>
-                            </li>
-                            <li>
-                                <a href="https://arxiv.org/abs/2305.08883" target="_blank">Provable Robust Watermarking for AI-Generated Text</a>
-                                <span class="paper-authors">Christ, Mireshghallah, et al. (2023)</span>
+                                <a href="https://arxiv.org/abs/2308.00113" target="_blank">Three Bricks to Consolidate Watermarks for Large Language Models</a>
+                                <span class="paper-authors">Fernandez, et al. (2023)</span>
                             </li>
                         </ul>
                     </div>
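
The help text in this diff contrasts z-score detectors with p-value based ones. As a rough illustration only, the sketch below compares the two for a greenlist-counting detector. It is not taken from the Space's code: the function names, the `gamma` parameter (the assumed fraction of the vocabulary in the greenlist), and the binomial null model are all assumptions made for this example.

```python
import math


def z_score(green_count: int, total: int, gamma: float = 0.5) -> float:
    """Normal approximation: standard deviations above the expected green count.

    Assumes each scored token is green with probability `gamma` when the text
    is not watermarked (hypothetical parameter, not from the source).
    """
    expected = gamma * total
    std = math.sqrt(total * gamma * (1.0 - gamma))
    return (green_count - expected) / std


def binomial_p_value(green_count: int, total: int, gamma: float = 0.5) -> float:
    """Exact tail probability P(X >= green_count) for X ~ Binomial(total, gamma)."""
    return sum(
        math.comb(total, k) * gamma**k * (1.0 - gamma) ** (total - k)
        for k in range(green_count, total + 1)
    )


if __name__ == "__main__":
    # Hypothetical counts: 75 green tokens out of 100 scored tokens.
    print("z-score:", z_score(75, 100))
    print("p-value:", binomial_p_value(75, 100))
```

Under these assumptions, the exact binomial tail gives a p-value directly, whereas a z-score still has to be mapped to a probability through a normal approximation that can be loose for short texts. That may be one reason the updated help text prefers the non-z-score detectors, though the diff itself does not say so.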