English | 中文
There are an increasing number of free large-scale models available on the market, and one-api can be somewhat cumbersome for personal use. What's desired is an adaptation program that does not require accounting, traffic, billing, etc.
Another point is that even though some manufacturers claim compatibility with the openai interface, there are still some differences in reality!!!
simple-one-api mainly addresses the above two points, aiming to be compatible with various large model interfaces and uniformly providing the OpenAI interface. Through this project, users can easily integrate and call various large models, simplifying the complexity brought by different platform interface differences.
| Large Model | Free Version | Free Limitations | Console (api_key etc.) | Documentation URL |
|---|---|---|---|---|
| Cloudflare Workers AI | All Models |
Free to use 10,000 times per day, 300,000 times per month; unlimited in test version | Access Link | Documentation View |
| ByteDance Coze.com | Various Models including Function call, General question-asking models and more | Current Coze API free for developers, with API request limit per space: QPS (requests per second): 2 |
QPM (requests per minute): 60
QPD (requests per day): 3000 | Access Link | Documentation View | | Llama Family | Various Models including Chat models with different capabilities | 1. 8 AM to 10 PM: API rate limit of 20 requests per minute
Support for multiple large models: - [x] OpenAI ChatGPT series models - [x] OpenAI - [x] Cloudflare Workers AI - [x] Azure OpenAI - [x] Groq
If compatible with the OpenAI interface, it can be used directly. See the document [docs/Compatibility with OpenAI Model Protocol
Integration Guide.md](docs/Compatibility with OpenAI Model Protocol Integration Guide.md)
api_key for a model, and can balance load randomlyapi_keyrandom model, automatically finds a configured available modelView CHANGELOG.md for detailed update history of this project.
git clone https://github.com/fruitbars/simple-one-api.git
First, ensure you have installed Go, version should be 1.18 or above, refer to the official tutorial for installation: https://go.dev/doc/install
You can check the Go version with go version.
linux/macOS
chmod +x quick_build.sh
./quick_build.sh
This will generate simple-one-api in the current directory.
Windows
Double-click quick_build.bat to execute.
quick_build.bat
This will generate simple-one-api.exe in the current directory.
Cross-compile for different platforms
Sometimes you need to compile versions for different platforms, such as windows, linux, macOS; after installing Go, execute build.sh
shell
chmod +x build.sh
./build.sh
This will automatically compile executable files for the above three platforms in different architectures, generated in the build directory.
Next, configure your model services and credentials:
Add your model service and credential information in the config.json file, refer to the configuration file description below.
Default to read and start the config.json in the same directory as simple-one-api
bash
./simple-one-api
If you want to specify the path of config.json, you can start like this
bash
./simple-one-api /path/to/config.json
Here are the steps to deploy simple-one-api using Docker:
Running
Run the Docker container using the following command while mounting your configuration file config.json:
docker run -d --name simple-one-api -p 9090:9090 -v /path/to/config.json:/app/config.json fruitbars/simple-one-api
Note: Make sure to replace /path/to/config.json with the absolute path of the config.json file on your host.
View Container Logs You can view the log output of the container with the following command:
docker logs -f simple-one-api
or
docker logs -f <container_id>
Where is the container ID, which can be viewed using the docker ps command.
Configuration File: In docker-compose.yml, first make sure you have replaced the path of your config.json file with the correct absolute path.
Start Container:
Using Docker Compose to start the service, you can run the following command in the directory containing docker-compose.yml:
sh
docker-compose up -d
This command will start the simple-one-api service in the background.
Other command references can be found in the docker-compose documentation.
Other start methods: - nohup Start - systemd Start
Now, you can call your configured large model services through the OpenAI compatible interface. Service address: http://host:port/v1, api-key can be set arbitrarily
Supported model names set to random, the backend will automatically find a model marked "enabled": true to use.
{
"server_port": ":9099",
"load_balancing": "random",
"services": {
"openai": [
{
"models": [
"@cf/meta/llama-
2-7b-chat-int8"
],
"enabled": true,
"credentials": {
"api_key": "xxx"
},
"server_url": "https://api.cloudflare.com/client/v4/accounts/0b4a4013591101f6f5657fcb68f32043/ai/v1/chat/completions"
}
]
}
}
Other model's configuration file examples can be found at
Refer to the document: Detailed config.json Explanation
Detailed configuration descriptions for each vendor: https://github.com/fruitbars/simple-one-api/tree/main/docs
Detailed example configs for each vendor: https://github.com/fruitbars/simple-one-api/tree/main/samples
Here is a complete configuration example, covering multiple large model platforms and different models:
{
"server_port":":9090",
"load_balancing": "random",
"services": {
"openai": [
{
"models": [
"@cf/meta/llama-2-7b-chat-int8"
],
"enabled": true,
"credentials": {
"api_key": "xxx"
},
"server_url": "https://api.cloudflare.com/client/v4/accounts/0b4a4013591101f6f5657fcb68f32043/ai/v1/chat/completions"
},
{
"models": ["llama3-70b-8192","llama3-8b-8192","gemma-7b-it","mixtral-8x7b-32768"],
"enabled": true,
"credentials": {
"api_key": "xxx"
},
"server_url":"https://api.groq.com/openai/v1"
}
],
"cozecom": [
{
"models": ["xxx"],
"enabled": true,
"credentials": {
"token": "xxx"
},
"server_url": "https://api.coze.com/open_api/v2/chat"
}
],
"azure": [
{
"models": ["gpt-4o"],
"enabled": true,
"credentials": {
"api_key": "xxx"
},
"server_url":"https://xxx.openai.azure.com/openai/deployments/xxx/completions?api-version=2024-05-13"
}
],
"ollama": [
{
"models": ["llama2"],
"enabled": true,
"server_url":"http://127.0.0.1:11434/api/chat"
}
]
}
}
Refer to docs/How to Use simple-one-api in Immersive Translation
Yes, it is supported. Refer to the following configuration, the free Coze.com model has a 2qps limit, so it can be set like this
{
"server_port": ":9090",
"debug": false,
"load_balancing": "random",
"services": {
"cozecom": [
{
"models": ["xxx"],
"enabled": true,
"credentials": {
"token": "xxx"
},
"limit": {
"qps":2,
"timeout": 10
},
"server_url": "https://api.coze.com/open_api/v2/chat"
}
]
}
}
It can be set through the api_key field
{
"qpi_key": "123456",
"server_port": ":9099",
"load_balancing": "random",
"services": {
"openai": [
{
"models": [
"@cf/meta/llama-2-7b-chat-int8"
],
"enabled": true,
"credentials": {
"api_key": "xxx"
},
"server_url": "https://api.cloudflare.com/client/v4/accounts/0b4a4013591101f6f5657fcb68f32043/ai/v1/chat/completions"
}
]
}
}
For client selection of spark-lite, you can configure it as follows, randomly choosing credentials
{
"server_port": ":9099",
"load_balancing": "random",
"services": {
"openai": [
{
"models": [
"@cf/meta/llama-2-7b-chat-int8"
],
"enabled": true,
"credentials": {
"api_key": "xxx"
},
"server_url": "https://api.cloudflare.com/client/v4/accounts/0b4a4013591101f6f5657fcb68f32043/ai/v1/chat/completions"
},
{
"models": [
"@cf/meta/llama-2-7b-chat-int8"
],
"enabled": true,
"credentials": {
"api_key": "xxx"
},
"server_url": "https://api.cloudflare.com/client/v4/accounts/0b4a4013591101f6f5657fcb68f32043/ai/v1/chat/completions"
}
]
}
}
load_balancing is configured to automatically select a model, supporting random, automatically choosing a model with enabled set to true
```json { "server_port": ":9099", "load_balancing": "random", "services": { "openai": [
$ claude mcp add simple-one-api \
-- python -m otcore.mcp_server <graph>