# OpenAI

| Feature                     | Available |
| --------------------------- | --------- |
| [Tools](../tools)           | No        |
| [Multimodal](../multimodal) | Yes       |

Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [ialacol](https://github.com/chenhunghan/ialacol), and [vllm](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html).

The following example config makes Chat UI work with [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai). `endpoint.baseURL` is the URL of the OpenAI-compatible server and overrides the base URL used by the OpenAI client. `endpoint.completion` determines which endpoint to use: the default is `chat_completions`, which calls `/chat/completions`; set it to `completions` to use the `/completions` endpoint instead.

```ini
MODELS=`[
  {
    "name": "text-generation-webui",
    "id": "text-generation-webui",
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024,
      "stop": []
    },
    "endpoints": [{
      "type" : "openai",
      "baseURL": "http://localhost:8000/v1"
    }]
  }
]`
```
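As a rough sketch of that mapping, the `endpoint.completion` setting simply selects which path gets appended to `baseURL`. The helper below is illustrative only, not a Chat UI function:

```python
# Sketch of how `endpoint.completion` selects the request path.
# `completion_url` is a hypothetical helper, not part of Chat UI.
def completion_url(base_url: str, completion: str = "chat_completions") -> str:
    """Return the full request URL for an OpenAI-compatible server."""
    paths = {
        "chat_completions": "/chat/completions",  # the default
        "completions": "/completions",
    }
    return base_url.rstrip("/") + paths[completion]

print(completion_url("http://localhost:8000/v1"))
# http://localhost:8000/v1/chat/completions
```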

The `openai` type includes official OpenAI models. You can add, for example, GPT-4 or GPT-3.5 as an "openai" model:

```ini
OPENAI_API_KEY=#your openai api key here
MODELS=`[{
  "name": "gpt-4",
  "displayName": "GPT 4",
  "endpoints" : [{
    "type": "openai",
    "apiKey": "or your openai api key here"
  }]
},{
  "name": "gpt-3.5-turbo",
  "displayName": "GPT 3.5 Turbo",
  "endpoints" : [{
    "type": "openai",
    "apiKey": "or your openai api key here"
  }]
}]`
```

We also support models in the `o1` family. You need to add a few more options to the config. Here is an example for `o1-mini`:

```ini
MODELS=`[
  {
      "name": "o1-mini",
      "description": "ChatGPT o1-mini",
      "systemRoleSupported": false,
      "parameters": {
        "max_new_tokens": 2048,
      },
      "endpoints" : [{
        "type": "openai",
        "useCompletionTokens": true,
      }]
  }
]`
```
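If I read the flag correctly, `useCompletionTokens` exists because `o1` models reject the usual `max_tokens` request parameter in favor of `max_completion_tokens`. A hedged sketch of that parameter mapping (the helper below is illustrative, not Chat UI's actual code):

```python
def token_limit_param(max_new_tokens: int, use_completion_tokens: bool = False) -> dict:
    """Map the configured token limit onto the OpenAI request parameter.

    o1-family models expect `max_completion_tokens` instead of `max_tokens`.
    Illustrative helper; not part of Chat UI.
    """
    key = "max_completion_tokens" if use_completion_tokens else "max_tokens"
    return {key: max_new_tokens}

print(token_limit_param(2048, use_completion_tokens=True))
# {'max_completion_tokens': 2048}
```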

You may also consume any model provider that offers an OpenAI-compatible API endpoint. For example, you can self-host a [Portkey](https://github.com/Portkey-AI/gateway) gateway and experiment with Claude from Anthropic or GPT models offered by Azure OpenAI. Example for Claude from Anthropic:

```ini
MODELS=`[{
  "name": "claude-2.1",
  "displayName": "Claude 2.1",
  "description": "Anthropic has been founded by former OpenAI researchers...",
  "parameters": {
    "temperature": 0.5,
    "max_new_tokens": 4096,
  },
  "endpoints": [
    {
      "type": "openai",
      "baseURL": "https://gateway.example.com/v1",
      "defaultHeaders": {
        "x-portkey-config": '{"provider":"anthropic","api_key":"sk-ant-abc...xyz"}'
      }
    }
  ]
}]`
```

Example for GPT 4 deployed on Azure OpenAI:

```ini
MODELS=`[{
  "id": "gpt-4-1106-preview",
  "name": "gpt-4-1106-preview",
  "displayName": "gpt-4-1106-preview",
  "parameters": {
    "temperature": 0.5,
    "max_new_tokens": 4096,
  },
  "endpoints": [
    {
      "type": "openai",
      "baseURL": "https://{resource-name}.openai.azure.com/openai/deployments/{deployment-id}",
      "defaultHeaders": {
        "api-key": "{api-key}"
      },
      "defaultQuery": {
        "api-version": "2023-05-15"
      }
    }
  ]
}]`
```
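On Azure, the deployment URL carries no `/v1` segment, authentication uses an `api-key` header rather than a bearer token, and the mandatory `api-version` is supplied through `defaultQuery`. A sketch of how the final request URL is assembled (illustrative only, not Chat UI internals):

```python
from urllib.parse import urlencode

def azure_chat_url(base_url: str, default_query: dict) -> str:
    """Append the chat completions path and the query string from defaultQuery."""
    return f"{base_url.rstrip('/')}/chat/completions?{urlencode(default_query)}"

print(azure_chat_url(
    "https://my-resource.openai.azure.com/openai/deployments/gpt-4",
    {"api-version": "2023-05-15"},
))
# https://my-resource.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2023-05-15
```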

## DeepInfra

Or try Mistral from [DeepInfra](https://deepinfra.com/mistralai/Mistral-7B-Instruct-v0.1/api?example=openai-http):

> Note: `apiKey` can be set per endpoint, or globally using the `OPENAI_API_KEY` variable.

```ini
MODELS=`[{
  "name": "mistral-7b",
  "displayName": "Mistral 7B",
  "description": "A 7B dense Transformer, fast-deployed and easily customisable. Small, yet powerful for a variety of use cases. Supports English and code, and a 8k context window.",
  "parameters": {
    "temperature": 0.5,
    "max_new_tokens": 4096,
  },
  "endpoints": [
    {
      "type": "openai",
      "baseURL": "https://api.deepinfra.com/v1/openai",
      "apiKey": "abc...xyz"
    }
  ]
}]`
```

_Non-streaming endpoints_

For endpoints that don't support streaming, like o1 on Azure, you can pass `streamingSupported: false` in your endpoint config:

```ini
MODELS=`[{
  "id": "o1-preview",
  "name": "o1-preview",
  "displayName": "o1-preview",
  "systemRoleSupported": false,
  "endpoints": [
    {
      "type": "openai",
      "baseURL": "https://my-deployment.openai.azure.com/openai/deployments/o1-preview",
      "defaultHeaders": {
        "api-key": "$SECRET"
      },
      "streamingSupported": false,
    }
  ]
}]`
```
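Effectively, `streamingSupported: false` tells Chat UI not to ask the server for a token stream: the request is made without `stream: true`, and the full response arrives in one piece. Sketched below with a hypothetical helper (not Chat UI's actual request code):

```python
def build_request_body(model: str, messages: list, streaming_supported: bool = True) -> dict:
    """Build an OpenAI-style chat request body; omit streaming when unsupported."""
    body = {"model": model, "messages": messages}
    if streaming_supported:
        body["stream"] = True  # server replies with server-sent events
    return body

print(build_request_body("o1-preview", [{"role": "user", "content": "hi"}],
                         streaming_supported=False))
# {'model': 'o1-preview', 'messages': [{'role': 'user', 'content': 'hi'}]}
```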

## Other

Some other providers and their `baseURL` values, for reference:

- [Groq](https://groq.com/): https://api.groq.com/openai/v1
- [Fireworks](https://fireworks.ai/): https://api.fireworks.ai/inference/v1
