Commit · f60ead4
Parent(s): 0ffd2f5
feat: add detailed "About" tab in app.py to showcase InferBench and FLUX-juiced, highlighting performance comparisons and optimization techniques for enhanced user engagement
app.py CHANGED
@@ -47,6 +47,67 @@ with gr.Blocks("ParityError/Interstellar", fill_width=True, css=custom_css) as d
                 select_columns=df.columns.tolist(),
                 datatype=["markdown"] + ["number"] * (len(df.columns.tolist()) - 1),
             )
+        with gr.TabItem("About"):
+            with gr.Row():
+                with gr.Column():
+                    gr.Markdown(
+                        """
+                        # 📊 InferBench
+
+                        We ran a comprehensive benchmark comparing FLUX-juiced with the "FLUX.1 [dev]" endpoints offered by:
+
+                        - Replicate: https://replicate.com/black-forest-labs/flux-dev
+                        - Fal: https://fal.ai/models/fal-ai/flux/dev
+                        - Fireworks AI: https://fireworks.ai/models/fireworks/flux-1-dev-fp8
+                        - Together AI: https://www.together.ai/models/flux-1-dev
+
+                        All of these providers offer FLUX.1 [dev] implementations, but they don't always disclose the optimisation methods used behind the scenes, and most endpoints differ in response time and performance.
+
+                        For a fair comparison, we used the same generation configuration and hardware across providers:
+
+                        - 28 inference steps
+                        - 1024×1024 resolution
+                        - Guidance scale of 3.5
+                        - H100 GPU (80GB), the hardware only Replicate reports
+
+                        Although we tested with this configuration and hardware, the applied compression methods work with other configurations and hardware too!
+
+                        > We published a full blog post on [InferBench and FLUX-juiced](https://www.pruna.ai/blog/flux-juiced-the-fastest-image-generation-endpoint).
+                        """
+                    )
+                with gr.Column():
+                    gr.Markdown(
+                        """
+                        # 🧃 FLUX-juiced
+
+                        FLUX-juiced is our optimized version of FLUX.1, delivering up to **2.6x faster inference** than the official Replicate API, **without sacrificing image quality**.
+
+                        Under the hood, it uses a custom combination of:
+
+                        - **Graph compilation** for optimized execution paths
+                        - **Inference-time caching** for repeated operations
+
+                        We won't go deep into the internals here, but here's the gist:
+
+                        > We combine compiler-level execution-graph optimization with selective caching of heavy operations (like attention layers), allowing inference to skip redundant computations without any loss in fidelity.
+
+                        These techniques are generalized and plug-and-play via the **Pruna Pro** pipeline, and can be applied to nearly any diffusion-based image model, not just FLUX. For a free but still very juicy model, you can use our open-source solution.
+
+                        > 🧪 Try FLUX-juiced now → [replicate.com/prunaai/flux.1-juiced](https://replicate.com/prunaai/flux.1-juiced)
+
+                        ## Sample Images
+
+                        The prompts were randomly sampled from the [parti-prompts dataset](https://github.com/google-research/parti). The reported times represent the full duration of each API call.
+
+                        
+                        
+                        
+                        
+                        
+
+                        > **For more samples, check out the [Pruna Notion page](https://pruna.notion.site/FLUX-1-dev-vs-Pruna-s-FLUX-juiced-1d270a039e5f80c6a2a3c00fc0d75ef0)**
+                        """
+                    )
 
     with gr.Accordion("🌍 Join the Pruna AI community!", open=False):
         gr.HTML(
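
The generation settings held constant across providers translate directly into a `diffusers` call. A minimal sketch, assuming the public `black-forest-labs/FLUX.1-dev` checkpoint and the `FluxPipeline` class; neither is named in this commit, and the prompt is a placeholder:

```python
import torch
from diffusers import FluxPipeline

# Assumption: the benchmark targets the public FLUX.1 [dev] weights;
# the commit itself only lists hosted provider endpoints.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# The configuration the About tab says was fixed for every provider.
image = pipe(
    "a photo of an astronaut riding a horse",  # placeholder prompt
    num_inference_steps=28,  # 28 inference steps
    guidance_scale=3.5,      # guidance scale of 3.5
    height=1024,             # 1024x1024 resolution
    width=1024,
).images[0]
image.save("flux_dev_sample.png")
```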
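
The quoted gist pairs compiler-level graph optimization with selective caching of heavy operations. The caching half is Pruna-specific, but the compilation half can be approximated with stock PyTorch. A sketch only, not Pruna's actual pipeline:

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Graph compilation: torch.compile traces the FLUX transformer into an
# optimized execution graph (kernel fusion, less Python overhead). This
# approximates the "compiler-level execution-graph optimization" described
# above; the selective attention caching is not reproduced here.
pipe.transformer = torch.compile(pipe.transformer, mode="max-autotune")

# The first call pays the one-off compilation cost; later calls with the
# same shapes reuse the compiled graph.
image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=28, guidance_scale=3.5, height=1024, width=1024,
).images[0]
```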
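
Since the reported times cover the full duration of each API call, the measurement reduces to wall-clock timing around a blocking request. A sketch using Replicate's Python client; the input keys are illustrative assumptions, as each provider documents its own schema:

```python
import time
import replicate

# Illustrative inputs mirroring the fixed benchmark configuration; the exact
# key names vary per provider and are an assumption here.
inputs = {
    "prompt": "a photo of an astronaut riding a horse",
    "num_inference_steps": 28,
    "guidance": 3.5,
    "width": 1024,
    "height": 1024,
}

start = time.perf_counter()
# replicate.run blocks until the prediction finishes, so the elapsed time
# spans the full API call, matching how the times are reported.
output = replicate.run("black-forest-labs/flux-dev", input=inputs)
print(f"full API call: {time.perf_counter() - start:.2f}s")
```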
|