metadata
title: ML6-Gemini-Demo
app_file: src/app.py
sdk: gradio
sdk_version: 5.23.0

Gemini Voice Agent Demo

This repo contains a demo that uses the Gemini MultiModal API to create a voice-based agent that can conduct professional technical screening interviews.

Technical Overview

The system uses FastRTC and Gradio to provide a real-time voice UI.

About the modality

You can configure the output modality:

  • If set to AUDIO
    • The agent will respond with an audio response.
    • There is no text output, so no transcription is available.
  • If set to TEXT
    • The agent will respond with a text response.
    • The text output will be converted to audio using the TTS API.
    • Transcriptions are available.
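
The trade-off between the two modalities can be sketched in a few lines. This is a minimal illustration, not the code in src/app.py: `ModalityPlan` and `plan_for_modality` are hypothetical names, and the `response_modalities` field is only assumed to mirror the field of the same name in the google-genai live configuration.

```python
from dataclasses import dataclass
from typing import List

VALID_MODALITIES = ("AUDIO", "TEXT")


@dataclass
class ModalityPlan:
    """What the app has to do for a given output modality (hypothetical type)."""

    response_modalities: List[str]  # value assumed to be passed to the live session config
    needs_tts: bool                 # TEXT output must be converted to audio afterwards
    has_transcript: bool            # only TEXT output yields a transcript


def plan_for_modality(modality: str) -> ModalityPlan:
    """Map the configured modality to the behaviour described in the list above."""
    normalized = modality.strip().upper()
    if normalized not in VALID_MODALITIES:
        raise ValueError(f"modality must be one of {VALID_MODALITIES}, got {modality!r}")
    is_text = normalized == "TEXT"
    return ModalityPlan(
        response_modalities=[normalized],
        needs_tts=is_text,
        has_transcript=is_text,
    )
```

With AUDIO the model's audio is played directly and nothing is transcribed. With TEXT an extra TTS step is needed, but the text is available as a transcript.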

Function Calling

The agent can call two functions:

  • Answer validation
    • checks the answer type against the expected type
    • stores the answer
  • Log Input
    • logs the user input
    • this serves as a transcription of the incoming audio
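
As a rough sketch of what these two tools might do on the application side (the names `validate_answer`, `log_input`, and the `EXPECTED_TYPES` table are hypothetical and not taken from this repo):

```python
from typing import Any, Dict, List

# Assumed mapping from declared type names to Python types (illustrative only).
EXPECTED_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def validate_answer(
    question_id: str, answer: Any, expected_type: str, store: Dict[str, Any]
) -> Dict[str, Any]:
    """Check the answer type against the expected type; store it only if valid."""
    python_type = EXPECTED_TYPES.get(expected_type)
    if python_type is None:
        return {"valid": False, "reason": f"unknown expected type: {expected_type}"}
    # bool is a subclass of int in Python, so reject it for non-boolean questions.
    if isinstance(answer, bool) and expected_type != "boolean":
        return {"valid": False, "reason": f"expected {expected_type}, got boolean"}
    if not isinstance(answer, python_type):
        return {"valid": False, "reason": f"expected {expected_type}"}
    store[question_id] = answer
    return {"valid": True}


def log_input(text: str, transcript_log: List[str]) -> Dict[str, bool]:
    """Append what the model heard; the log doubles as a transcript of the audio."""
    transcript_log.append(text)
    return {"logged": True}
```

In a function-calling setup, the returned dictionaries would be sent back to the model as the tool response, so an invalid answer can prompt the agent to ask again.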

Getting Started

To run the application, follow these steps:

  1. Install uv (if not already installed): curl -LsSf https://astral.sh/uv/install.sh | sh

  2. Install dependencies: uv sync

  3. Set up the environment variables for either GenAI or VertexAI (see below)

  4. Run the application: python src/app.py

  5. Visit http://127.0.0.1:7860 in your browser to interact with the voice agent.

GenAI vs VertexAI

"gemini-2.0-flash-exp" can be used with both GenAI and VertexAI.

  • GenAI requires only a GEMINI_API_KEY environment variable
  • VertexAI requires a GCP project and the following environment variables:
export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
export GOOGLE_CLOUD_LOCATION=europe-west4
export GOOGLE_GENAI_USE_VERTEXAI=True

Depending on the GOOGLE_GENAI_USE_VERTEXAI flag, this demo will use either GenAI or VertexAI.
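
A small pre-flight check along these lines can catch a missing variable before the app starts. It is a sketch under assumptions: `select_backend` is a hypothetical helper, not part of this repo or of the google-genai SDK, and treating "true" or "1" (case-insensitive) as enabling VertexAI is an assumption about how the flag is read.

```python
import os
from typing import Mapping, Optional


def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "vertexai" or "genai" and fail early if required variables are missing."""
    env = os.environ if env is None else env
    # Assumption: "true" or "1" in any letter case turns VertexAI on.
    flag = env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower()
    if flag in ("true", "1"):
        required = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")
        missing = [name for name in required if not env.get(name)]
        if missing:
            raise RuntimeError(f"VertexAI selected but missing: {', '.join(missing)}")
        return "vertexai"
    if not env.get("GEMINI_API_KEY"):
        raise RuntimeError("GenAI selected but GEMINI_API_KEY is not set")
    return "genai"


if __name__ == "__main__":
    print(f"Using backend: {select_backend()}")
```

Passing a plain dictionary instead of the real environment makes the check easy to test in isolation.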

Note

The gradio-webrtc install fails unless ffmpeg@6 is installed. On macOS:

brew uninstall ffmpeg
brew install ffmpeg@6
brew link ffmpeg@6