github.com/SakuraLLM/SakuraLLM @v1.1.0 sqlite

138 symbols · 510 edges · 25 files · 9 documented (7%)
README

Prerequisite

  1. A stable internet connection to Hugging Face and Docker Hub. The scripts need to download the latest tokenization_baichuan.py and other files, so make sure you can reach both sites.

  2. Install nvidia-container-runtime according to the NVIDIA website.

    2.1 Remember to run sudo systemctl restart docker after nvidia-container-runtime is installed.

    2.2 You can check whether the GPU is supported with the following command:

    docker run --gpus all nvidia/cuda:12.1.1-base-ubuntu20.04 nvidia-smi

  3. Download the model from sakuraumi/Sakura-13B-Galgame and put it into the models folder.
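The GPU check from step 2 can also be scripted, which is handy when setting up several machines. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it only wraps the same docker command shown above.

```python
import shutil
import subprocess

# Image tag taken from the check command in the prerequisites.
CUDA_IMAGE = "nvidia/cuda:12.1.1-base-ubuntu20.04"


def gpu_check_command(image: str = CUDA_IMAGE) -> list[str]:
    """Return the argument list for the GPU check command (hypothetical helper)."""
    return ["docker", "run", "--gpus", "all", image, "nvidia-smi"]


def gpu_visible_in_docker(image: str = CUDA_IMAGE) -> bool:
    """Run the check; True only if docker is installed and nvidia-smi exits with 0."""
    if shutil.which("docker") is None:
        # docker is not on PATH, so the check cannot run at all.
        return False
    result = subprocess.run(
        gpu_check_command(image), capture_output=True, text=True
    )
    return result.returncode == 0
```

Calling `gpu_visible_in_docker()` pulls the image on first use, so it can take a while on a slow connection.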

Hardware and Environments

Currently, 轻小说机翻机器人 (a light novel machine translation site) uses Sakura-13B-LNovel-v0.8-4bit. We strongly recommend the 4-bit version for its balance between GPU memory usage and translation speed.

The 4-bit model on an RTX 3090 runs at around 46 tokens/s. It consumes about 14 GiB of GPU memory when idle and up to 18 GiB when working.
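These figures allow a rough capacity estimate. The sketch below is only back-of-the-envelope arithmetic using the numbers quoted above (46 tokens/s, 18 GiB peak); the helper names are hypothetical, and real throughput depends on the GPU, prompt length, and sampling settings.

```python
def estimated_seconds(output_tokens: int, tokens_per_second: float = 46.0) -> float:
    """Estimate generation time from a token count and a measured throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def fits_in_vram(total_gib: float, peak_gib: float = 18.0) -> bool:
    """Check whether a GPU has at least the peak memory quoted for the 4-bit model."""
    return total_gib >= peak_gib
```

For example, generating 46,000 tokens at the quoted speed takes about 1,000 seconds, a little under 17 minutes.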

WSL2 with WSLg should work, but since I only have a 4060 laptop, it has not been tested further.

Usage

Server

Copy compose.yaml.example to compose.yaml and change the following settings in compose.yaml to keep the server secure:

- USERNAME
- PASSWORD
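One way to pick a non-default PASSWORD is to generate it randomly. This is a small sketch using Python's standard `secrets` module; the helper name and the default username are assumptions for illustration, and how the values are written into compose.yaml depends on the layout of compose.yaml.example.

```python
import secrets


def make_credentials(username: str = "sakura", password_bytes: int = 24) -> dict[str, str]:
    """Return a USERNAME/PASSWORD pair with a random URL-safe password.

    Hypothetical helper: the keys mirror the two settings named above.
    """
    return {
        "USERNAME": username,
        # token_urlsafe draws from the OS cryptographic random source.
        "PASSWORD": secrets.token_urlsafe(password_bytes),
    }
```

Print the result once, copy both values into compose.yaml, and pass the same pair to any client that connects to the server.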

To start the service, simply use the command:

docker compose run server
# or docker compose up server

You can use the Python script tests/single.py to test the connection and the performance of your GPU.

Remember to replace the username and password with your own:

python3 tests/single.py --auth sakura:itsmygo http://127.0.0.1:5000
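If you write your own client, the `USER:PASSWORD` pair has to be sent with each request. The sketch below assumes the server accepts HTTP Basic authentication, which the README does not confirm; the function names are hypothetical, and no specific API route is assumed.

```python
import base64
import urllib.request


def basic_auth_header(auth: str) -> str:
    """Turn 'USER:PASSWORD' into the value of an HTTP Basic Authorization header."""
    user, sep, password = auth.partition(":")
    if not sep or not user:
        raise ValueError("expected USER:PASSWORD")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_request(base_url: str, auth: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries the credentials.

    Assumption: the server checks HTTP Basic auth on incoming requests.
    """
    return urllib.request.Request(
        base_url, headers={"Authorization": basic_auth_header(auth)}
    )
```

Passing the result of `build_request("http://127.0.0.1:5000", "USER:PASSWORD")` to `urllib.request.urlopen` sends the request; a 401 response would suggest the credentials do not match those in compose.yaml.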

The Docker version seems to be a little slower than running the scripts directly on the host machine. If that applies to you, follow the setup instructions in the Dockerfile to prepare your own environment.

Translation

TODO(kuriko)

Put the file to translate into the models directory.

docker compose run translate-epub {--data_path <EPUB> | --data_folder <EPUB folder>} --output_folder <output>
docker compose run translate-novel --data_path <TXT> --output_path <TXT OUT> [--compare_text true|false]

# docker compose run translate-novel --data_path /models/a.txt --output_path /models/b.txt
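To translate several TXT files in a row, the translate-novel command can be generated per file. The following sketch only builds the argument lists from the flags shown above; the helper names and the `_translated` output suffix are hypothetical, and `/models` is assumed to be the container path of the models directory, as the commented example suggests.

```python
from pathlib import PurePosixPath
from typing import Optional


def translate_novel_command(
    data_path: str, output_path: str, compare_text: Optional[bool] = None
) -> list[str]:
    """Build the argv for one translate-novel run."""
    cmd = [
        "docker", "compose", "run", "translate-novel",
        "--data_path", data_path,
        "--output_path", output_path,
    ]
    if compare_text is not None:
        # The usage line lists the accepted values as true|false.
        cmd += ["--compare_text", "true" if compare_text else "false"]
    return cmd


def batch_commands(txt_names: list[str], models_dir: str = "/models") -> list[list[str]]:
    """Build one command per TXT file, writing <name>_translated.txt next to it."""
    commands = []
    for name in txt_names:
        src = PurePosixPath(models_dir) / name
        dst = src.with_name(f"{src.stem}_translated.txt")
        commands.append(translate_novel_command(str(src), str(dst)))
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to run the translations one after another.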

Core symbols most depended-on inside this repo

generate · called by 15 · infers/ollama.py
completion · called by 5 · utils/model.py
make_prompt_stable · called by 4 · utils/model.py
get_model_response · called by 3 · translate_epub.py
check_messages · called by 3 · utils/model.py
stream_generate · called by 3 · infers/ollama.py
generate · called by 2 · translate_novel.py
generate · called by 2 · translate_epub.py

Shape

Method 58
Function 45
Class 30
Route 5

Languages

Python 100%

Modules by API surface

utils/model.py · 23 symbols
sampler_hijack.py · 20 symbols
api/legacy/type.py · 17 symbols
infers/ollama.py · 10 symbols
translate_novel.py · 8 symbols
translate_epub.py · 7 symbols
infers/vllm.py · 7 symbols
api/openai/v1/chat.py · 7 symbols
infers/transformer.py · 6 symbols
infers/llama.py · 5 symbols
utils/state.py · 4 symbols
utils/__init__.py · 4 symbols

Dependencies from manifests, versioned

torch 2.1.0 · 1×
transformers 4.33.2 · 1×

For agents

$ claude mcp add SakuraLLM \
  -- python -m otcore.mcp_server <graph>
