Commit · f60ead4
Parent(s): 0ffd2f5
feat: add detailed "About" tab in app.py to showcase InferBench and FLUX-juiced, highlighting performance comparisons and optimization techniques for enhanced user engagement
app.py CHANGED
@@ -47,6 +47,67 @@ with gr.Blocks("ParityError/Interstellar", fill_width=True, css=custom_css) as d
                 select_columns=df.columns.tolist(),
                 datatype=["markdown"] + ["number"] * (len(df.columns.tolist()) - 1),
             )
+        with gr.TabItem("About"):
+            with gr.Row():
+                with gr.Column():
+                    gr.Markdown(
+                        """
+                        # 📊 InferBench
+
+                        We ran a comprehensive benchmark comparing FLUX-juiced with the "FLUX.1 [dev]" endpoints offered by:
+
+                        - Replicate: https://replicate.com/black-forest-labs/flux-dev
+                        - Fal: https://fal.ai/models/fal-ai/flux/dev
+                        - Fireworks AI: https://fireworks.ai/models/fireworks/flux-1-dev-fp8
+                        - Together AI: https://www.together.ai/models/flux-1-dev
+
+                        All of these providers offer FLUX.1 [dev] implementations, but they don't always disclose the optimisation methods used behind the scenes, and most endpoints differ in response time and performance.
+
+                        For a fair comparison, we used the same generation configuration and hardware across providers:
+
+                        - 28 inference steps
+                        - 1024×1024 resolution
+                        - Guidance scale of 3.5
+                        - H100 GPU (80GB), the hardware only Replicate reports
+
+                        Although we tested with this configuration and hardware, the applied compression methods work with other configurations and hardware too!
+
+                        > We published a full blog post on [InferBench and FLUX-juiced](https://www.pruna.ai/blog/flux-juiced-the-fastest-image-generation-endpoint).
+                        """
+                    )
+                with gr.Column():
+                    gr.Markdown(
+                        """
+                        # 🧃 FLUX-juiced
+
+                        FLUX-juiced is our optimized version of FLUX.1, delivering up to **2.6x faster inference** than the official Replicate API, **without sacrificing image quality**.
+
+                        Under the hood, it uses a custom combination of:
+
+                        - **Graph compilation** for optimized execution paths
+                        - **Inference-time caching** for repeated operations
+
+                        We won't go deep into the internals here, but here's the gist:
+
+                        > We combine compiler-level execution-graph optimization with selective caching of heavy operations (like attention layers), allowing inference to skip redundant computations without any loss in fidelity.
+
+                        These techniques are generalized and plug-and-play via the **Pruna Pro** pipeline, and can be applied to nearly any diffusion-based image model, not just FLUX. For a free but still very juicy model, you can use our open-source solution.
+
+                        > 🧪 Try FLUX-juiced now → [replicate.com/prunaai/flux.1-juiced](https://replicate.com/prunaai/flux.1-juiced)
+
+                        ## Sample Images
+
+                        The prompts were randomly sampled from the [parti-prompts dataset](https://github.com/google-research/parti). The reported times represent the full duration of each API call.
+
+                        
+                        
+                        
+                        
+                        
+
+                        > **For more samples, check out the [Pruna Notion page](https://pruna.notion.site/FLUX-1-dev-vs-Pruna-s-FLUX-juiced-1d270a039e5f80c6a2a3c00fc0d75ef0)**
+                        """
+                    )
 
     with gr.Accordion("🌍 Join the Pruna AI community!", open=False):
         gr.HTML(
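
The generation settings held constant across providers translate directly into a `diffusers` call. A minimal sketch, assuming the public `black-forest-labs/FLUX.1-dev` checkpoint and the `FluxPipeline` class; neither is named in this commit, and the prompt is a placeholder:

```python
import torch
from diffusers import FluxPipeline

# Assumption: the benchmark targets the public FLUX.1 [dev] weights;
# the commit itself only lists hosted provider endpoints.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# The configuration the About tab says was fixed for every provider.
image = pipe(
    "a photo of an astronaut riding a horse",  # placeholder prompt
    num_inference_steps=28,  # 28 inference steps
    guidance_scale=3.5,      # guidance scale of 3.5
    height=1024,             # 1024x1024 resolution
    width=1024,
).images[0]
image.save("flux_dev_sample.png")
```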
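
The quoted gist pairs compiler-level graph optimization with selective caching of heavy operations. The caching half is Pruna-specific, but the compilation half can be approximated with stock PyTorch. A sketch only, not Pruna's actual pipeline:

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Graph compilation: torch.compile traces the FLUX transformer into an
# optimized execution graph (kernel fusion, less Python overhead). This
# approximates the "compiler-level execution-graph optimization" described
# above; the selective attention caching is not reproduced here.
pipe.transformer = torch.compile(pipe.transformer, mode="max-autotune")

# The first call pays the one-off compilation cost; later calls with the
# same shapes reuse the compiled graph.
image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=28, guidance_scale=3.5, height=1024, width=1024,
).images[0]
```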
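
Since the reported times cover the full duration of each API call, the measurement reduces to wall-clock timing around a blocking request. A sketch using Replicate's Python client; the input keys are illustrative assumptions, as each provider documents its own schema:

```python
import time
import replicate

# Illustrative inputs mirroring the fixed benchmark configuration; the exact
# key names vary per provider and are an assumption here.
inputs = {
    "prompt": "a photo of an astronaut riding a horse",
    "num_inference_steps": 28,
    "guidance": 3.5,
    "width": 1024,
    "height": 1024,
}

start = time.perf_counter()
# replicate.run blocks until the prediction finishes, so the elapsed time
# spans the full API call, matching how the times are reported.
output = replicate.run("black-forest-labs/flux-dev", input=inputs)
print(f"full API call: {time.perf_counter() - start:.2f}s")
```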
|