---
title: IriusRiskTestChallenge
emoji: 🚀
colorFrom: green
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0
short_description: LLM backend for IriusRisk Tech challenge
---

# IriusRisk test challenge

This project implements a FastAPI-based API that uses LangChain and LangGraph to generate text with the `SmolLM2-1.7B-Instruct` model from HuggingFace. I chose that model so that I could deploy it on a free CPU-only backend from Hugging Face for this test. The API includes security features such as API Key authentication and rate limiting to protect against abuse.

## API URLs

- **Production**: `https://maximofn-iriusrisktestchallenge.hf.space`
- **Local Development**: `http://localhost:7860`

## Main Features

- 🤖 Text generation using SmolLM2-1.7B-Instruct
- 📝 Text summarization capabilities
- 🔑 API Key authentication
- ⚡ Rate limiting for abuse protection
- 🔄 Conversation thread support
- 📚 Interactive documentation with Swagger and ReDoc

## Configuration

### Environment Variables

For local deployment, create a `.env` file in the project root with the following variables:

```env
API_KEY="your_secret_api_key"
```
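As a minimal sketch of how the application could pick this variable up at startup, the snippet below reads `API_KEY` from the process environment and falls back to parsing the `.env` file. The function name `load_api_key` and the fail-fast behavior are assumptions for illustration, not the project's actual code, which may rely on a helper library instead.

```python
import os
from pathlib import Path


def load_api_key(env_file=".env"):
    """Return API_KEY from the environment, falling back to a .env file.

    Hypothetical helper: the name and the error behavior are assumptions.
    """
    value = os.environ.get("API_KEY")
    if value:
        return value

    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks and comments; only KEY=VALUE lines are considered.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, raw = line.partition("=")
            if key.strip() == "API_KEY":
                # Strip optional surrounding quotes, as in API_KEY="secret".
                value = raw.strip().strip('"').strip("'")
                if value:
                    return value

    # Fail fast so the service never starts without a key configured.
    raise RuntimeError("API_KEY is not set in the environment or in .env")
```

Failing at startup, rather than on the first request, makes a missing key visible immediately in the container logs.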

## Deployment

### In HuggingFace Spaces

This project is designed to run in HuggingFace Spaces. To configure it:

1. Create a new Space in HuggingFace using the Docker SDK with the blank template
2. Add all the project files to the Space
3. Add `API_KEY` as a secret in the Space's settings

### Local Deployment

For local deployment without Docker:

1. Clone this repository
2. Create the `.env` file with your API_KEY
3. Install the dependencies:
   ```bash
   pip install -r requirements.txt
   ```

### Local Docker Deployment

For local Docker deployment:

1. Clone the repository
2. Create the `.env` file with your API_KEY
3. Build the Docker image:
   ```bash
   docker build -t iriusrisk-test-challenge .
   ```
4. Run the Docker container:
   ```bash
   docker run -p 8000:8000 --env-file .env iriusrisk-test-challenge
   ```

## Local Execution

```bash
uvicorn app:app --reload
```

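The API will be available at `http://localhost:8000`, which is uvicorn's default port. To match the `http://localhost:7860` URLs used in the examples in this document, add `--port 7860` to the command.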

## Local Docker Execution

```bash
docker run -p 8000:8000 --env-file .env iriusrisk-test-challenge
```

The API will be available at `http://localhost:8000`.


## Endpoints

### GET `/`

Welcome endpoint that returns a greeting message.
- Rate limit: 10 requests per minute

### POST `/generate`

Endpoint to generate text using the language model.
- Rate limit: 5 requests per minute
- Requires API Key authentication

**Request parameters:**
```json
{
  "query": "Your question here",
  "thread_id": "optional_thread_identifier",
  "system_prompt": "optional_system_prompt"
}
```

### POST `/summarize`

Endpoint to summarize text using the language model.
- Rate limit: 5 requests per minute
- Requires API Key authentication

**Request parameters:**
```json
{
  "text": "Text to summarize",
  "thread_id": "optional_thread_identifier",
  "max_length": 200
}
```

## Authentication

The API uses API Key authentication. You must include your API Key in the `X-API-Key` header of every request to a protected endpoint (`/generate` and `/summarize`).

Example:
```bash
# Production
curl -X POST "https://maximofn-iriusrisktestchallenge.hf.space/generate" \
     -H "X-API-Key: your_api_key" \
     -H "Content-Type: application/json" \
     -d '{"query": "What is FastAPI?"}'

# Local development
curl -X POST "http://localhost:7860/generate" \
     -H "X-API-Key: your_api_key" \
     -H "Content-Type: application/json" \
     -d '{"query": "What is FastAPI?"}'
```

## Rate Limiting

To protect the API against abuse, the following limits have been implemented:

- Endpoint `/`: 10 requests per minute
- Endpoint `/generate`: 5 requests per minute
- Endpoint `/summarize`: 5 requests per minute

When these limits are exceeded, the API will return a 429 (Too Many Requests) error.
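To illustrate what "N requests per minute" means in practice, here is a minimal sliding-window limiter written with the standard library only. It is a sketch, not this project's implementation, which most likely delegates to a rate-limiting library. The class name `SlidingWindowLimiter` and the choice to track limits per client key are assumptions.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client key."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client key -> timestamps of recent requests

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would respond with HTTP 429
        hits.append(now)
        return True


# Example: the documented limit for /generate is 5 requests per minute.
generate_limiter = SlidingWindowLimiter(limit=5, window=60.0)
```

A request handler would call `generate_limiter.allow(client_id)` and return a 429 response whenever it yields `False`.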

## API Documentation

The interactive API documentation is available at:
- Swagger UI: 
  - Production: `https://maximofn-iriusrisktestchallenge.hf.space/docs`
  - Local: `http://localhost:7860/docs`
- ReDoc: 
  - Production: `https://maximofn-iriusrisktestchallenge.hf.space/redoc`
  - Local: `http://localhost:7860/redoc`

## Error Handling

The API includes error handling for the following situations:
- Error 401: API Key not provided
- Error 403: Invalid API Key
- Error 429: Rate limit exceeded
- Error 500: Internal server error
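
The distinction between 401 and 403 can be sketched as a small pure function. This is an illustration under assumptions, not the project's code: the name `api_key_status` is hypothetical, and the constant-time comparison via `hmac.compare_digest` is a common hardening choice that the source does not confirm.

```python
import hmac


def api_key_status(provided, expected):
    """Map the X-API-Key header value to the HTTP status the API documents."""
    if not provided:
        return 401  # API Key not provided
    # compare_digest avoids leaking key contents through timing differences.
    if not hmac.compare_digest(provided.encode(), expected.encode()):
        return 403  # Invalid API Key
    return 200  # key accepted, the request may proceed
```

For example, `api_key_status(None, "secret")` yields 401, while a wrong key yields 403.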

## Code Examples

### Python

Here are some examples of how to use the API with Python:

#### Text Generation

```python
import requests

# API configuration
API_URL = "https://maximofn-iriusrisktestchallenge.hf.space"  # Production URL
# API_URL = "http://localhost:7860"  # Local development URL
API_KEY = "your_api_key"  # Replace with your API key

# Headers for authentication
headers = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

# Generate text
def generate_text(query, thread_id="default", system_prompt=None):
    url = f"{API_URL}/generate"
    
    data = {
        "query": query,
        "thread_id": thread_id
    }
    
    # Add system prompt if provided
    if system_prompt:
        data["system_prompt"] = system_prompt
    
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["generated_text"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {str(e)}")
        return None

# Example usage
query = "What are the main features of Python?"
result = generate_text(query)
if result:
    print("Response:", result)

# Example with custom thread and system prompt
result = generate_text(
    query="Explain object-oriented programming",
    thread_id="programming_tutorial",
    system_prompt="You are a programming teacher. Explain concepts in simple terms."
)
```

#### Text Summarization

```python
import requests

# Reuses API_URL and headers from the Text Generation example
# Summarize text
def summarize_text(text, max_length=200, thread_id="default"):
    url = f"{API_URL}/summarize"
    
    data = {
        "text": text,
        "max_length": max_length,
        "thread_id": thread_id
    }
    
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["summary"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {str(e)}")
        return None

# Example usage
text_to_summarize = """
Python is a high-level, interpreted programming language created by Guido van Rossum 
and released in 1991. Python's design philosophy emphasizes code readability with 
the use of significant whitespace. Its language constructs and object-oriented 
approach aim to help programmers write clear, logical code for small and large-scale projects.
"""

summary = summarize_text(text_to_summarize, max_length=50)
if summary:
    print("Summary:", summary)
```

#### Error Handling Example

```python
import requests

# Reuses API_URL and headers from the Text Generation example
def make_api_request(endpoint, data):
    url = f"{API_URL}/{endpoint}"
    
    try:
        response = requests.post(url, json=data, headers=headers)
        
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            print("Rate limit exceeded. Please wait before making more requests.")
        elif response.status_code in (401, 403):
            print("Authentication error. Please check your API key.")
        else:
            print(f"Error {response.status_code}: {response.text}")
    except requests.exceptions.ConnectionError:
        print("Connection error. Please check if the API server is running.")
    except Exception as e:
        print(f"Unexpected error: {str(e)}")
    
    # Every non-200 response and every handled exception ends up here
    return None
```

These examples show how to:
- Make requests to different endpoints
- Handle authentication with API keys
- Process successful responses
- Handle various types of errors
- Use optional parameters like thread_id and system_prompt

Remember to:
- Replace `API_URL` with your actual API endpoint
- Set your API key in the headers
- Handle rate limiting by implementing appropriate delays between requests
- Implement proper error handling for your use case
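
One possible way to add delays between requests is to retry with exponential backoff whenever the API answers 429. The helper below is a sketch: `call_with_backoff`, its parameters, and the delay schedule are assumptions, and the values should be tuned to the limits listed in the Rate Limiting section.

```python
import time


def call_with_backoff(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `send()` and retry with exponential backoff while it returns 429.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute, for example:
        lambda: requests.post(url, json=data, headers=headers)
    """
    response = send()
    for attempt in range(max_retries):
        if response.status_code != 429:
            break
        # Wait base_delay, then twice that, then four times that, and so on.
        sleep(base_delay * (2 ** attempt))
        response = send()
    return response
```

The last response is returned even if it is still a 429, so the caller decides whether to give up or wait longer.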