davidberenstein1957 committed
Commit 1c9c07a · 1 Parent(s): 7b73541

feat: refactor app.py to load data from a JSONL file, enhance the leaderboard display, and improve the "About" section layout with new branding and community links. Remove outdated leaderboard_data.json file.

Files changed (5):
  1. README.md +1 -1
  2. app.py +65 -49
  3. assets.py +23 -0
  4. leaderboard_data.json +0 -0
  5. results.jsonl +112 -0
README.md CHANGED
@@ -7,7 +7,7 @@ sdk: gradio
 app_file: app.py
 pinned: true
 license: apache-2.0
-short_description: A Leaderboard for Inference Providers (cost/quality/speed)!
+short_description: A cost/quality/speed Leaderboard for Inference Providers!
 sdk_version: 5.19.0
 ---

app.py CHANGED
@@ -1,3 +1,4 @@
+import json
 import math
 from pathlib import Path

@@ -5,30 +6,42 @@ import gradio as gr
 import pandas as pd
 from gradio_leaderboard import ColumnFilter, Leaderboard

+from assets import custom_css
+
 abs_path = Path(__file__).parent

-# Any pandas-compatible data
-df = pd.read_csv(str(abs_path / "data.csv"))
+# Load the JSONL file into a pandas DataFrame using the json library
+with open(abs_path / "results.jsonl", "r") as file:
+    json_data = file.read()
+    fixed_json_data = f"[{json_data.replace('}\n{', '},\n{')}]"
+    json_data = json.loads(fixed_json_data)
+    df = pd.DataFrame(json_data)
+
 df["Model"] = df.apply(
-    lambda row: f'<a target="_blank" href="{row["URL"]}" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">{row["Provider"]}</a>',
+    lambda row: f'<a target="_blank" href="{row["URL"]}" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">{row["Model"]}</a>',
     axis=1,
 )
-df = df[["Model"] + [col for col in df.columns.tolist() if col not in ["URL", "Provider", "Model"]]]
+df = df[
+    ["Model", "Median Inference Time", "Price per Image"]
+    + [col for col in df.columns.tolist() if col not in ["URL", "Model", "Median Inference Time", "Price per Image"]]
+]
+df = df.sort_values(by="GenEval", ascending=False)

-with gr.Blocks("ParityError/Interstellar") as demo:
-    gr.Markdown(
+with gr.Blocks("ParityError/Interstellar", css=custom_css) as demo:
+    gr.HTML(
         """
-        <h1 style="margin: 0;">InferBench - A Leaderboard for Inference Providers</h1>
-        <br>
-        <div style="margin-bottom: 20px;">
-            <p>Welcome to InferBench, the ultimate leaderboard for evaluating inference providers. Our platform focuses on key metrics such as cost, quality, and compression to help you make informed decisions. Whether you're a developer, researcher, or business looking to optimize your inference processes, InferBench provides the insights you need to choose the best provider for your needs.</p>
-        </div>
-        """
+        <div style="text-align: center;">
+            <img src="https://huggingface.co/datasets/PrunaAI/documentation-images/resolve/main/inferbench/logo1.png" style="width: 200px; height: auto; max-width: 100%; margin: 0 auto;">
+            <h1>🏋️ InferBench 🏋️</h1>
+            <h2>A cost/quality/speed Leaderboard for Inference Providers!</h2>
+        </div>
+        """
     )
+
     with gr.Tabs():
         with gr.TabItem("FLUX.1 [dev] Leaderboard"):
-            median_inference_time_min = math.floor(float(df["Median Inference Time (in s)"].min()))
-            median_inference_time_max = math.ceil(float(df["Median Inference Time (in s)"].max()))
+            median_inference_time_min = math.floor(float(df["Median Inference Time"].min()))
+            median_inference_time_max = math.ceil(float(df["Median Inference Time"].max()))
             price_per_image_min = math.floor(float(df["Price per Image"].min()))
             price_per_image_max = math.ceil(float(df["Price per Image"].max()))
             Leaderboard(
@@ -36,7 +49,7 @@ with gr.Blocks("ParityError/Interstellar") as demo:
                 search_columns=["Model"],
                 filter_columns=[
                     ColumnFilter(
-                        column="Median Inference Time (in s)",
+                        column="Median Inference Time",
                         type="slider",
                         default=[median_inference_time_min, median_inference_time_max],
                         min=median_inference_time_min,
@@ -54,50 +67,53 @@ with gr.Blocks("ParityError/Interstellar") as demo:
                 datatype="markdown",
             )
         with gr.TabItem("About"):
-            gr.Markdown(
-                """
-                # 💜 About Pruna AI
-                We are Pruna AI, an open source AI optimisation engine and we simply make your models cheaper, faster, smaller, greener!
+            with gr.Row():
+                with gr.Column(scale=1):
+                    gr.Markdown(
+                        """
+                        # 💜 About Pruna AI
+                        We are Pruna AI, an open source AI optimisation engine and we simply make your models cheaper, faster, smaller, greener!

-                # 📊 About InferBench
-                InferBench is a leaderboard for inference providers, focusing on cost, quality, and compression.
-                Over the past few years, we’ve observed outstanding progress in image generation models fueled by ever-larger architectures.
-                Due to their size, state-of-the-art models such as FLUX take more than 6 seconds to generate a single image on a high-end H100 GPU.
-                While compression techniques can reduce inference time, their impact on quality often remains unclear.
+                        # 📊 About InferBench
+                        InferBench is a leaderboard for inference providers, focusing on cost, quality, and compression.
+                        Over the past few years, we’ve observed outstanding progress in image generation models fueled by ever-larger architectures.
+                        Due to their size, state-of-the-art models such as FLUX take more than 6 seconds to generate a single image on a high-end H100 GPU.
+                        While compression techniques can reduce inference time, their impact on quality often remains unclear.

-                <img src="https://huggingface.co/datasets/PrunaAI/documentation-images/resolve/main/inferbench/spider_comparison.png" alt="Spider Comparison">
+                        To bring more transparency around the quality of compressed models:

-                To bring more transparency around the quality of compressed models:
+                        - We release “juiced” endpoints for popular image generation models on Replicate, making it easy to play around with our compressed models.
+                        - We assess the quality of compressed FLUX-APIs from Replicate, fal, Fireworks AI and Together AI according to different benchmarks.

-                - We release “juiced” endpoints for popular image generation models on Replicate, making it easy to play around with our compressed models.
-                - We assess the quality of compressed FLUX-APIs from Replicate, fal, Fireworks AI and Together AI according to different benchmarks.
+                        FLUX-juiced was obtained using a combination of compilation and caching algorithms and we are proud to say that it consistently outperforms alternatives, while delivering performance on par with the original model.
+                        This combination is available in our Pruna Pro package and can be applied to almost every image generation model.
+                        """
+                    )
+                with gr.Column(scale=1):
+                    gr.Markdown("")

-                <img src="https://huggingface.co/datasets/PrunaAI/documentation-images/resolve/main/inferbench/speed_comparison.png" alt="Speed Comparison">
-
-                FLUX-juiced was obtained using a combination of compilation and caching algorithms and we are proud to say that it consistently outperforms alternatives, while delivering performance on par with the original model.
-                This combination is available in our Pruna Pro package and can be applied to almost every image generation model.
-
-                # 🌍 Join the Pruna AI community!
-                <p><a rel="nofollow" href="https://twitter.com/PrunaAI"><img alt="Twitter" src="https://img.shields.io/twitter/follow/PrunaAI?style=social"></a>
-                <a rel="nofollow" href="https://github.com/PrunaAI/pruna"><img alt="GitHub" src="https://img.shields.io/github/stars/prunaai/pruna"></a>
-                <a rel="nofollow" href="https://www.linkedin.com/company/93832878/admin/feed/posts/?feedType=following"><img alt="LinkedIn" src="https://img.shields.io/badge/LinkedIn-Connect-blue"></a>
-                <a rel="nofollow" href="https://discord.com/invite/rskEr4BZJx"><img alt="Discord" src="https://img.shields.io/badge/Discord-Join%20Us-blue?style=social&amp;logo=discord"></a>
-                <a rel="nofollow" href="https://www.reddit.com/r/PrunaAI/"><img alt="Reddit" src="https://img.shields.io/reddit/subreddit-subscribers/PrunaAI?style=social"></a></p>
+            with gr.Accordion("🌍 Join the Pruna AI community!", open=False):
+                gr.HTML(
                     """
+                    <a rel="nofollow" href="https://twitter.com/PrunaAI"><img alt="Twitter" src="https://img.shields.io/twitter/follow/PrunaAI?style=social"></a>
+                    <a rel="nofollow" href="https://github.com/PrunaAI/pruna"><img alt="GitHub" src="https://img.shields.io/github/stars/prunaai/pruna"></a>
+                    <a rel="nofollow" href="https://www.linkedin.com/company/93832878/admin/feed/posts/?feedType=following"><img alt="LinkedIn" src="https://img.shields.io/badge/LinkedIn-Connect-blue"></a>
+                    <a rel="nofollow" href="https://discord.com/invite/rskEr4BZJx"><img alt="Discord" src="https://img.shields.io/badge/Discord-Join%20Us-blue?style=social&amp;logo=discord"></a>
+                    <a rel="nofollow" href="https://www.reddit.com/r/PrunaAI/"><img alt="Reddit" src="https://img.shields.io/reddit/subreddit-subscribers/PrunaAI?style=social"></a>
+                    """
                 )
             with gr.Accordion("Citation", open=True):
                 gr.Markdown(
                     """
-                    ```bibtex
-                    @article{InferBench,
-                        title={InferBench: A Leaderboard for Inference Providers},
-                        author={PrunaAI},
-                        year={2025},
-                        howpublished={\\url{https://huggingface.co/spaces/PrunaAI/InferBench}}
-                    }
-                    ```
-                    """
+                    ```bibtex
+                    @article{InferBench,
+                        title={InferBench: A Leaderboard for Inference Providers},
+                        author={PrunaAI},
+                        year={2025},
+                        howpublished={\\url{https://huggingface.co/spaces/PrunaAI/InferBench}}
+                    }
+                    ```
+                    """
                 )
-
 if __name__ == "__main__":
     demo.launch()
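
The new loading step in app.py joins pretty-printed JSON objects into a single array before handing them to pandas. Below is a minimal standalone sketch of the same idea (not part of the commit); since a backslash inside an f-string expression is only accepted from Python 3.12 on, the sketch builds the string with plain concatenation instead:

```python
import json
from pathlib import Path

import pandas as pd

abs_path = Path(__file__).parent

# results.jsonl stores pretty-printed JSON objects back to back rather than one
# object per line, so line-oriented readers such as pd.read_json(..., lines=True)
# cannot parse it directly. Adding commas at the "}\n{" boundaries and wrapping
# the text in brackets turns it into one valid JSON array.
raw = (abs_path / "results.jsonl").read_text()
records = json.loads("[" + raw.replace("}\n{", "},\n{") + "]")

df = pd.DataFrame(records)
print(df[["Model", "Median Inference Time", "Price per Image"]].head())
```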
assets.py ADDED
@@ -0,0 +1,23 @@
+custom_css = """
+.logo {
+    width: 300px;
+    height: auto;
+    max-width: 100%;
+    margin: 0 auto;
+    object-fit: contain;
+    padding-bottom: 0;
+}
+.text {
+    font-size: 16px !important;
+}
+.tabs button {
+    font-size: 20px;
+}
+.subtabs button {
+    font-size: 20px;
+}
+h1, h2 {
+    margin: 0;
+    padding-top: 0;
+}
+"""
leaderboard_data.json DELETED
The diff for this file is too large to render. See raw diff
 
results.jsonl ADDED
@@ -0,0 +1,112 @@
+{
+    "Model": "Baseline [Nvidia H100]",
+    "URL": "https://huggingface.co/black-forest-labs/FLUX.1-dev?library=diffusers",
+    "GenEval": 67.98,
+    "HPS (v2.1)": 30.36,
+    "GenAI-Bench (VQA)": 0.74,
+    "DrawBench (Image Reward)": 1.0072,
+    "PartiPromts (ARNIQA)": 0.6758,
+    "PartiPromts (ClipIQA)": 0.8968,
+    "PartiPromts (ClipScore)": 27.4,
+    "PartiPromts (Sharpness - Laplacian Variance)": 6833,
+    "Median Inference Time": 6.88,
+    "Price per Image": 0.025
+}
+{
+    "Model": "fal",
+    "URL": "https://fal.ai/models/fal-ai/flux/dev",
+    "GenEval": 68.72,
+    "HPS (v2.1)": 29.97,
+    "GenAI-Bench (VQA)": 0.7441,
+    "DrawBench (Image Reward)": 1.0084,
+    "PartiPromts (ARNIQA)": 0.6702,
+    "PartiPromts (ClipIQA)": 0.8967,
+    "PartiPromts (ClipScore)": 27.61,
+    "PartiPromts (Sharpness - Laplacian Variance)": 7295,
+    "Median Inference Time": 4.06,
+    "Price per Image": 0.025
+}
+{
+    "Model": "fireworks [fp8]",
+    "URL": "https://fireworks.ai/models/fireworks/flux-1-dev-fp8",
+    "GenEval": 65.55,
+    "HPS (v2.1)": 30.26,
+    "GenAI-Bench (VQA)": 0.7455,
+    "DrawBench (Image Reward)": 0.9467,
+    "PartiPromts (ARNIQA)": 0.6639,
+    "PartiPromts (ClipIQA)": 0.8478,
+    "PartiPromts (ClipScore)": 27.24,
+    "PartiPromts (Sharpness - Laplacian Variance)": 5625,
+    "Median Inference Time": 4.66,
+    "Price per Image": 0.014
+}
+{
+    "Model": "Pruna [extra juiced]",
+    "URL": "https://replicate.com/prunaai/flux.1-juiced",
+    "GenEval": 69.9,
+    "HPS (v2.1)": 29.86,
+    "GenAI-Bench (VQA)": 0.7466,
+    "DrawBench (Image Reward)": 0.9458,
+    "PartiPromts (ARNIQA)": 0.6591,
+    "PartiPromts (ClipIQA)": 0.8887,
+    "PartiPromts (ClipScore)": 27.6,
+    "PartiPromts (Sharpness - Laplacian Variance)": 7997,
+    "Median Inference Time": 2.6,
+    "Price per Image": 0.004
+}
+{
+    "Model": "Pruna [juiced]",
+    "URL": "https://replicate.com/prunaai/flux.1-juiced",
+    "GenEval": 68.64,
+    "HPS (v2.1)": 30.38,
+    "GenAI-Bench (VQA)": 0.7408,
+    "DrawBench (Image Reward)": 0.9657,
+    "PartiPromts (ARNIQA)": 0.6762,
+    "PartiPromts (ClipIQA)": 0.9014,
+    "PartiPromts (ClipScore)": 27.55,
+    "PartiPromts (Sharpness - Laplacian Variance)": 7627,
+    "Median Inference Time": 3.14,
+    "Price per Image": 0.0048
+}
+{
+    "Model": "Pruna [lightly juiced]",
+    "URL": "https://replicate.com/prunaai/flux.1-lightly-juiced",
+    "GenEval": 69.12,
+    "HPS (v2.1)": 30.36,
+    "GenAI-Bench (VQA)": 0.7405,
+    "DrawBench (Image Reward)": 0.9972,
+    "PartiPromts (ARNIQA)": 0.6789,
+    "PartiPromts (ClipIQA)": 0.9031,
+    "PartiPromts (ClipScore)": 27.56,
+    "PartiPromts (Sharpness - Laplacian Variance)": 7849,
+    "Median Inference Time": 3.57,
+    "Price per Image": 0.0054
+}
+{
+    "Model": "Replicate [go_fast]",
+    "URL": "https://replicate.com/black-forest-labs/flux-dev",
+    "GenEval": 67.41,
+    "HPS (v2.1)": 29.25,
+    "GenAI-Bench (VQA)": 0.7547,
+    "DrawBench (Image Reward)": 0.9282,
+    "PartiPromts (ARNIQA)": 0.6356,
+    "PartiPromts (ClipIQA)": 0.8609,
+    "PartiPromts (ClipScore)": 27.56,
+    "PartiPromts (Sharpness - Laplacian Variance)": 4872,
+    "Median Inference Time": 3.38,
+    "Price per Image": 0.025
+}
+{
+    "Model": "Together AI",
+    "URL": "https://www.together.ai/models/flux-1-dev",
+    "GenEval": 64.61,
+    "HPS (v2.1)": 30.22,
+    "GenAI-Bench (VQA)": 0.7339,
+    "DrawBench (Image Reward)": 0.9463,
+    "PartiPromts (ARNIQA)": 0.5752,
+    "PartiPromts (ClipIQA)": 0.8709,
+    "PartiPromts (ClipScore)": 27.31,
+    "PartiPromts (Sharpness - Laplacian Variance)": 4501,
+    "Median Inference Time": 3.38,
+    "Price per Image": 0.025
+}
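
Because each record spans multiple lines, results.jsonl is concatenated JSON rather than strict one-object-per-line JSONL. A small validation sketch (not part of the commit) that walks the objects with json.JSONDecoder.raw_decode and checks for the fields app.py relies on:

```python
import json
from pathlib import Path

# Fields that app.py reads or displays; assumed from the diff above.
REQUIRED = {"Model", "URL", "GenEval", "Median Inference Time", "Price per Image"}


def iter_concatenated_json(text: str):
    """Yield successive JSON objects from back-to-back pretty-printed records."""
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        while idx < len(text) and text[idx].isspace():
            idx += 1  # skip whitespace between objects
        if idx >= len(text):
            break
        obj, idx = decoder.raw_decode(text, idx)
        yield obj


records = list(iter_concatenated_json(Path("results.jsonl").read_text()))
incomplete = [r.get("Model") for r in records if not REQUIRED.issubset(r)]
print(f"{len(records)} records parsed; records missing required fields: {incomplete or 'none'}")
```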