yale-CPSC-577 (Yale CPSC 577)

abidlabs

posted an update 7 days ago

HOW TO ADD MCP SUPPORT TO ANY 🤗 SPACE

Gradio now supports MCP! If you want to convert an existing Space, like this one hexgrad/Kokoro-TTS, so that you can use it with Claude Desktop / Cursor / Cline / TinyAgents / or any LLM that supports MCP, here's all you need to do:

1. Duplicate the Space (in the Settings Tab)
2. Upgrade the Gradio sdk_version to 5.28 (in the README.md)
3. Set mcp_server=True in launch()
4. (Optionally) add docstrings to the function so that the LLM knows how to use it, like this:

def generate(text, speed=1):
    """
    Convert text to speech audio.

    Parameters:
        text (str): The input text to be converted to speech.
        speed (float, optional): Playback speed of the generated speech.
    """
    ...  # the Space's existing function body stays unchanged

That's it! Now your LLM will be able to talk to you 🤯
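Step 2 above can also be scripted. The following is a minimal sketch, assuming the Space's README.md starts with the usual YAML front matter containing an sdk_version key. The helper name set_sdk_version, the sample README contents, and the exact version string "5.28.0" are illustrative assumptions, not part of the original post.

```python
import re


def set_sdk_version(readme_text: str, version: str) -> str:
    """Return readme_text with its sdk_version front-matter key set to version."""
    pattern = re.compile(r"^sdk_version:.*$", flags=re.MULTILINE)
    if not pattern.search(readme_text):
        raise ValueError("no sdk_version key found in the README front matter")
    # Replace only the first match, which is the front-matter entry.
    return pattern.sub(lambda _match: f"sdk_version: {version}", readme_text, count=1)


# Hypothetical README.md contents for a duplicated Space.
EXAMPLE_README = """---
title: My TTS Space
sdk: gradio
sdk_version: 5.20.0
app_file: app.py
---
"""

# "5.28.0" is an assumed full version string for the 5.28 release.
updated_readme = set_sdk_version(EXAMPLE_README, "5.28.0")
print(updated_readme)
```

Editing the file by hand in the Space's file browser works just as well; the script only helps if you are converting several Spaces.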

abidlabs

posted an update 8 days ago

Hi folks! Excited to share a new feature from the Gradio team along with a tutorial.

If you don't already know, Gradio is an open-source Python library used to build interfaces for machine learning models. Beyond just creating UIs, Gradio also exposes API capabilities, and now Gradio apps can be launched as Model Context Protocol (MCP) servers for LLMs.

If you already know how to use Gradio, there are only two additional things you need to do:
* Add standard docstrings to your function (these will be used to generate the descriptions for your tools for the LLM)
* Set mcp_server=True in launch()

Here's a complete example (make sure you already have the latest version of Gradio installed):

import gradio as gr

def letter_counter(word, letter):
    """Count the occurrences of a specific letter in a word.
    
    Args:
        word: The word or phrase to analyze
        letter: The letter to count occurrences of
        
    Returns:
        The number of times the letter appears in the word
    """
    return word.lower().count(letter.lower())

demo = gr.Interface(
    fn=letter_counter,
    inputs=["text", "text"],
    outputs="number",
    title="Letter Counter",
    description="Count how many times a letter appears in a word"
)

demo.launch(mcp_server=True)

This is a very simple example, but you can add the ability to generate Ghibli images or speak emotions to any LLM that supports MCP. Once you have an MCP server running locally, you can copy-paste the same app to host it on [Hugging Face Spaces](https://huggingface.co/spaces/) as well.

All free and open-source of course! Full tutorial: https://www.gradio.app/guides/building-mcp-server-with-gradio
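To see why the docstring matters, here is a rough sketch of how a tool description could be derived from a function's docstring. This is not Gradio's actual implementation; describe_tool and the returned dictionary layout are hypothetical, and the parser only handles simple "Args:" or "Parameters:" sections.

```python
import inspect


def describe_tool(fn):
    """Build a simple tool description from a function's docstring (illustrative only)."""
    doc = inspect.getdoc(fn) or ""
    lines = doc.splitlines()
    summary = lines[0].strip() if lines else ""
    params = {}
    in_args = False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped in ("Args:", "Parameters:"):
            in_args = True
        elif in_args and not stripped:
            in_args = False  # a blank line ends the argument section
        elif in_args and ":" in stripped:
            name, _, text = stripped.partition(":")
            # Drop an optional "(type)" annotation such as "text (str)".
            params[name.split("(")[0].strip()] = text.strip()
    return {"name": fn.__name__, "description": summary, "parameters": params}


def letter_counter(word, letter):
    """Count the occurrences of a specific letter in a word.

    Args:
        word: The word or phrase to analyze
        letter: The letter to count occurrences of

    Returns:
        The number of times the letter appears in the word
    """
    return word.lower().count(letter.lower())


tool = describe_tool(letter_counter)
print(tool)
```

Without a docstring, the description and parameter entries would be empty, and an LLM would have only the function name to go on.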

abidlabs

posted an update about 1 month ago

JOURNEY TO 1 MILLION DEVELOPERS

5 years ago, we launched Gradio as a simple Python library to let researchers at Stanford easily demo computer vision models with a web interface.

Today, Gradio is used by >1 million developers each month to build and share AI web apps. This includes some of the most popular open-source projects of all time, like Automatic1111, Fooocus, Oobabooga’s Text WebUI, Dall-E Mini, and LLaMA-Factory.

How did we get here? How did Gradio keep growing in the very crowded field of open-source Python libraries? I get this question a lot from folks who are building their own open-source libraries. This post distills some of the lessons that I have learned over the past few years:

1. Invest in good primitives, not high-level abstractions
2. Embed virality directly into your library
3. Focus on a (growing) niche
4. Your only roadmap should be rapid iteration
5. Maximize ways users can consume your library's outputs

1. Invest in good primitives, not high-level abstractions

When we first launched Gradio, we offered only one high-level class (gr.Interface), which created a complete web app from a single Python function. We quickly realized that developers wanted to create other kinds of apps (e.g. multi-step workflows, chatbots, streaming applications), but as we started listing out the apps users wanted to build, we realized what we needed to do:

Read the rest here: https://x.com/abidlabs/status/1907886

abidlabs

posted an update 7 months ago

👋 Hi Gradio community,

I'm excited to share that Gradio 5 will launch in October with improvements across security, performance, SEO, design (see the screenshot for Gradio 4 vs. Gradio 5), and user experience, making Gradio a mature framework for web-based ML applications.

Gradio 5 is currently in beta, so if you'd like to try it out early, please refer to the instructions below:

---------- Installation -------------

Gradio 5 depends on Python 3.10 or higher, so if you are running Gradio locally, please ensure that you have Python 3.10 or higher, or download it here: https://www.python.org/downloads/

* Locally: If you are running gradio locally, simply install the release candidate with pip install gradio --pre
* Spaces: If you would like to update an existing gradio Space to use Gradio 5, you can simply update the sdk_version to be 5.0.0b3 in the README.md file on Spaces.

In most cases, that’s all you have to do to run Gradio 5.0. If you start your Gradio application, you should see your Gradio app running, with a fresh new UI.

-----------------------------

For more information, please see: https://github.com/gradio-app/gradio/issues/9463

abidlabs

posted an update 11 months ago

𝗣𝗿𝗼𝘁𝗼𝘁𝘆𝗽𝗶𝗻𝗴 holds an important place in machine learning. But it has traditionally been quite difficult to go from prototype code to production-ready APIs.

We're working on making that a lot easier with 𝗚𝗿𝗮𝗱𝗶𝗼 and will unveil something new on June 6th: https://www.youtube.com/watch?v=44vi31hehw4&ab_channel=HuggingFace

nateraw

posted an update about 1 year ago

I just shared a blogpost on https://nateraw.com explaining the motivation + process of training nateraw/musicgen-songstarter-v0.2 - including training details, WandB logs, hparams, and notes on previous experiments.

Check it out here ⤵️
https://nateraw.com/posts/training_musicgen_songstarter.html

:) still kinda a WIP so if there's anything else you want to see, let me know.

abidlabs

posted an update about 1 year ago

Open Models vs. Closed APIs for Software Engineers
-----------------------------------------------------------------------

If you're an ML researcher / scientist, you probably don't need much convincing to use open models instead of closed APIs -- open models give you reproducibility and let you deeply investigate the model's behavior.

But what if you are a software engineer building products on top of LLMs? I'd argue that open models are a much better option even if you are using them as APIs. For at least 3 reasons:

1) The most obvious reason is the reliability of your product. Relying on a closed API means that your product has a single point of failure. On the other hand, there are at least 7 different API providers that offer Llama3 70B already. There are also libraries that abstract over these API providers, so that you can make a single request that goes to different API providers depending on availability / latency.

2) Another benefit is consistency if you eventually go local. If your product takes off, it will be more economical and lower latency to have a dedicated inference endpoint running on your VPC than to call external APIs. If you've started with an open-source model, you can always deploy the same model locally. You don't need to modify prompts or change any surrounding logic to get consistent behavior. Minimize your technical debt from the beginning.

3) Finally, open models give you much more flexibility. Even if you keep using APIs, you might want to trade off latency vs. cost, or use APIs that support batches of inputs, etc. Because different API providers have different infrastructure, you can use the API provider that makes the most sense for your product -- or you can even use multiple API providers for different users (free vs. paid) or different parts of your product (priority features vs. nice-to-haves).
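The reliability argument in point 1 can be sketched as a simple fallback loop. This is a minimal illustration, assuming each provider is wrapped in a plain Python callable; complete_with_fallback, flaky_provider, and healthy_provider are hypothetical stand-ins and do not correspond to any real provider SDK.

```python
from typing import Callable, Sequence

# A provider is modeled as a function that takes a prompt and returns text.
Provider = Callable[[str], str]


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_provider(prompt: str) -> str:
    """Stand-in for a provider that is currently down."""
    raise ConnectionError("service unavailable")


def healthy_provider(prompt: str) -> str:
    """Stand-in for a provider serving the same open model."""
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    "hello",
    [("provider_a", flaky_provider), ("provider_b", healthy_provider)],
)
print(used, reply)
```

Because every provider serves the same open model, the prompt is passed through unchanged; ordering the list by cost or latency is one way to express the trade-offs from point 3.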

nateraw

posted an update about 1 year ago

Turns out if you do a cute little hack, you can make nateraw/musicgen-songstarter-v0.2 work on vocal inputs. 👀

Now, you can hum an idea for a song and get a music sample generated with AI 🔥🔥

Give it a try: ➡️ nateraw/singing-songstarter ⬅️

It'll take your voice and try to autotune it (because let's be real, you're no michael jackson), then pass it along to the model to condition on the melody. It works surprisingly well!

abidlabs

posted an update about 1 year ago

Introducing the Gradio API Recorder 🪄

Every Gradio app now includes an API recorder that lets you reconstruct your interaction in a Gradio app as code using the Python or JS clients! Our goal is to make Gradio the easiest way to build ML APIs, not just UIs 🔥

abidlabs

posted an update over 1 year ago

Necessity is the mother of invention, and of Gradio components.

Sometimes we realize that we need a Gradio component to build a cool application and demo, so we just build it. For example, we just added a new gr.ParamViewer component because we needed it to display information about Python & JavaScript functions in our documentation.

Of course, our users should be able to do the same thing for their machine learning applications, so that's why Gradio lets you build custom components, and publish them to the world 🔥

abidlabs

posted an update over 1 year ago

Lots of cool Gradio custom components, but this is the most generally useful one I've seen so far: insert a Modal into any Gradio app by using the modal component!

import gradio as gr
from gradio_modal import Modal

with gr.Blocks() as demo:
    gr.Markdown("### Main Page")
    gr.Textbox("lorem ipsum " * 1000, lines=10)

    # The modal is shown on top of the page because visible=True
    with Modal(visible=True) as modal:
        gr.Markdown("# License Agreement")

demo.launch()

abidlabs

posted an update over 1 year ago

Just out: new custom Gradio component specifically designed for code completion models 🔥

abidlabs

posted an update over 1 year ago

The next version of Gradio will be significantly more efficient (as well as a bit faster) for anyone who uses Gradio's streaming features. Looking at you chatbot developers @oobabooga @pseudotensor :)

The major change is in how streamed data is sent. Gradio used to send the entire payload at each token, which is generally the most robust way to ensure that all the data is correctly transmitted. We've now switched to sending "diffs": at each time step, we automatically compute the diff between the two most recent updates and then send only the latest token (or whatever the diff may be). Coupled with the fact that we are now using SSE, which is a more robust communication protocol than WS (an SSE client reconnects automatically if the connection drops), we should have the best of both worlds: efficient *and* robust streaming.

Very cool stuff @aliabid94 ! PR: https://github.com/gradio-app/gradio/pull/7102
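As a rough illustration of the diff idea, here is a minimal sketch for the common case where each update extends the previous text. The post does not describe Gradio's actual diff format, so compute_diff, apply_diff, and the "append"/"replace" operations are assumptions made for this example only.

```python
def compute_diff(previous: str, current: str) -> dict:
    """Describe how to get from previous to current using as little data as possible."""
    if current.startswith(previous):
        # Typical streaming case: only the newly generated suffix is sent.
        return {"op": "append", "text": current[len(previous):]}
    # Fallback: the text changed in some other way, so resend it in full.
    return {"op": "replace", "text": current}


def apply_diff(previous: str, diff: dict) -> str:
    """Rebuild the current text on the receiving side."""
    if diff["op"] == "append":
        return previous + diff["text"]
    return diff["text"]


# Simulate a chatbot response growing token by token.
snapshots = ["Hel", "Hello", "Hello, wor", "Hello, world!"]
server_state = ""
client_state = ""
sent = []
for snapshot in snapshots:
    diff = compute_diff(server_state, snapshot)
    sent.append(diff)
    server_state = snapshot
    client_state = apply_diff(client_state, diff)

full_chars = sum(len(s) for s in snapshots)       # cost of sending every full payload
diff_chars = sum(len(d["text"]) for d in sent)    # cost of sending only the diffs
print(client_state, full_chars, diff_chars)
```

The saving grows with response length: full payloads cost roughly the square of the number of tokens, while diffs cost about one token per step.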

abidlabs

posted an update over 1 year ago

The most interesting LLM benchmark I've seen so far... reminder that there's lots of characterization of LLMs still yet to do.

EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models (2312.06281)

abidlabs

posted an update over 1 year ago

Gradio 4.16 introduces a new flow: you can hide/show Tabs or make them interactive/non-interactive.

Really nice for multi-step machine learning demos ⚡️

abidlabs

posted an update over 1 year ago

✨ Excited to release gradio 4.16. New features include:

🐻‍❄️ Native support for Polars Dataframe
🖼️ Gallery component can be used as an input
⚡ Much faster streaming for low-latency chatbots
📄 Auto generated docs for custom components

... and much more! This is a HUGE release, so check out everything else in our changelog: https://github.com/gradio-app/gradio/blob/main/CHANGELOG.md

abidlabs

posted an update over 1 year ago

𝗛𝗼𝘄 𝘄𝗲 𝗺𝗮𝗱𝗲 𝗚𝗿𝗮𝗱𝗶𝗼 𝗳𝗮𝘀𝘁𝗲𝗿 𝗯𝘆... 𝘀𝗹𝗼𝘄𝗶𝗻𝗴 𝗶𝘁 𝗱𝗼𝘄𝗻!

About a month ago, @oobabooga (who built the popular text generation webui) reported an interesting issue to the Gradio team. After upgrading to Gradio 4, @oobabooga noticed that chatbots that streamed very quickly had a lag before their text would show up in the Gradio app.

After some investigation, we determined that the Gradio frontend would receive the updates from the backend immediately, but the browser would lag before rendering the changes on the screen. The main difference between Gradio 3 and Gradio 4 was that we migrated the communication protocol between the backend and frontend from Websockets (WS) to Server-Sent Events (SSE), but we couldn't figure out why this would affect the browser's ability to render the streaming updates it was receiving.

After diving deep into browser events, @aliabid94 and @pngwn made a realization: most browsers treat WS events (specifically the WebSocket.onmessage function) with a lower priority than SSE events (EventSource.onmessage function), which allowed the browser to repaint the window between WS messages. With SSE, the streaming updates would stack up in the browser's event stack and be prioritized over any browser repaint. The browser would eventually clear the stack but it would take some time to go through each update, which produced a lag.

We debated different options, but the solution that we implemented was to introduce throttling: we slowed down how frequently we would push updates to the browser event stack to a maximum rate of 20/sec. Although this seemingly “slowed down” Gradio streaming, it actually would allow browsers to process updates in real-time and provide a much better experience to end users of Gradio apps.

See the PR here: https://github.com/gradio-app/gradio/pull/7084

Kudos to @aliabid94 and @pngwn for the fix, and to @oobabooga and @pseudotensor for helping us test it out!
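For readers curious what this kind of throttling looks like, here is a generic sketch of a rate limiter that forwards at most 20 updates per second and keeps only the newest update in between. It is not the code from the PR above; the Throttle class, its method names, and the injectable clock are all assumptions made for illustration.

```python
import time


class Throttle:
    """Forward at most max_per_sec updates per second, keeping only the newest pending one."""

    def __init__(self, send, max_per_sec=20, clock=time.monotonic):
        self.send = send                      # callback that delivers an update
        self.min_interval = 1.0 / max_per_sec
        self.clock = clock                    # injectable so the demo can use a fake clock
        self.last_sent = None
        self.pending = None

    def push(self, update):
        """Send the update now if enough time has passed, otherwise hold it as pending."""
        now = self.clock()
        if self.last_sent is None or now - self.last_sent >= self.min_interval:
            self.send(update)
            self.last_sent = now
            self.pending = None
        else:
            self.pending = update             # newer updates overwrite older pending ones

    def flush(self):
        """Deliver the held update, if any, so the final state is never lost."""
        if self.pending is not None:
            self.send(self.pending)
            self.last_sent = self.clock()
            self.pending = None


# Simulate 100 updates arriving 10 ms apart, using a fake clock instead of sleeping.
fake_now = [0.0]
delivered = []
throttle = Throttle(delivered.append, max_per_sec=20, clock=lambda: fake_now[0])
for i in range(100):
    fake_now[0] = i * 0.01
    throttle.push(i)
throttle.flush()
print(len(delivered), delivered[-1])
```

Dropping intermediate updates is safe here only because each update supersedes the previous one; a diff-based stream would need to merge pending updates instead of overwriting them.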

abidlabs

posted an update over 1 year ago

There's a lot of interest in machine learning models that generate 3D objects, so Gradio now supports previewing STL files natively in the Model3D component. Huge thanks to Monius for the contribution 🔥🔥

abidlabs

posted an update over 1 year ago

𝐄𝐦𝐛𝐫𝐚𝐜𝐞𝐝 𝐛𝐲 𝐇𝐮𝐠𝐠𝐢𝐧𝐠 𝐅𝐚𝐜𝐞: 𝐭𝐡𝐞 𝐈𝐧𝐬𝐢𝐝𝐞 𝐒𝐭𝐨𝐫𝐲 𝐨𝐟 𝐎𝐮𝐫 𝐒𝐭𝐚𝐫𝐭𝐮𝐩’𝐬 𝐀𝐜𝐪𝐮𝐢𝐬𝐢𝐭𝐢𝐨𝐧

In late 2021, our team of five engineers, scattered around the globe, signed the papers to shut down our startup, Gradio. For many founders, this would have been a moment of sadness or even bitter reflection.

But we were celebrating. We were getting acquired by Hugging Face!

We had been working very hard towards this acquisition, but for weeks, the acquisition had been blocked by a single investor. The more we pressed him, the more he dug in, refusing to sign off on the acquisition. Until, unexpectedly, the investor conceded, allowing us to join Hugging Face.

For the first time since our acquisition, I’m writing down the story in detail, hoping that it may shed some light on the obscure world of startup acquisitions and what decisions founders can make to improve their odds for a successful acquisition.

To understand how we got acquired by Hugging Face, you need to know why we started Gradio.

𝐀𝐧 𝐈𝐝𝐞𝐚 𝐟𝐫𝐨𝐦 𝐭𝐡𝐞 𝐇𝐞𝐚𝐫𝐭

Two years before the acquisition, in early 2019, I was working on a research project at Stanford. It was the third year of my PhD, and my labmates and I had trained a machine learning model that could predict patient biomarkers (such as whether patients had certain diseases or an implanted pacemaker) from an ultrasound image of their heart — as well as a cardiologist.

Naturally, cardiologists were skeptical... read the rest of the story here: https://twitter.com/abidlabs/status/1745533306492588303

nateraw

authored a paper about 2 years ago

Generative Disco: Text-to-Video Generation for Music Visualization

Paper • 2304.08551 • Published Apr 17, 2023 • 7

Team members 23