
github.com/ShishirPatil/gorilla @v1.3 sqlite


1,946 symbols 5,221 edges 229 files 492 documented · 25%

README

Gorilla: Large Language Model Connected with Massive APIs

Latest Updates

📢 Check out our detailed Berkeley Function Calling Leaderboard changelog for the latest dataset and model updates to the leaderboard!

🎯 [10/04/2024] Introducing the Agent Arena by Gorilla X LMSYS Chatbot Arena! Compare different agents in tasks like search, finance, RAG, and beyond. Explore which models and tools work best for specific tasks through our novel ranking system and community-driven prompt hub. [Blog] [Arena] [Leaderboard] [Dataset] [Tweet]
📣 [09/21/2024] Announcing BFCL V3 - Evaluating multi-turn and multi-step function calling capabilities! New state-based evaluation system tests models on handling complex workflows, sequential functions, and service states. [Blog] [Leaderboard] [Code] [Tweet]
🚀 [08/20/2024] Released BFCL V2 • Live! The Berkeley Function-Calling Leaderboard now features enterprise-contributed data and real-world scenarios. [Blog] [Live Leaderboard] [V2 Categories Leaderboard] [Tweet]
⚡️ [04/12/2024] Excited to release GoEx - a runtime for LLM-generated actions like code, API calls, and more. Featuring "post-facto validation" for assessing LLM actions after execution, "undo" and "damage confinement" abstractions to manage unintended actions & risks. This paves the way for fully autonomous LLM agents, enhancing interaction between apps & services with human-out-of-loop. [Blog] [Code] [Paper] [Tweet]
⏰ [04/01/2024] Introducing cost and latency metrics into Berkeley function calling leaderboard!
🚀 [03/15/2024] RAFT: Adapting Language Model to Domain Specific RAG is live! [MSFT-Meta blog] [Berkeley Blog]
🏆 [02/26/2024] Berkeley Function Calling Leaderboard is live!
🎯 [02/25/2024] OpenFunctions v2 sets new SoTA for open-source LLMs!
🔥 [11/16/2023] Excited to release Gorilla OpenFunctions
💻 [06/29/2023] Released gorilla-cli, LLMs for your CLI!
🟢 [06/06/2023] Released Commercially usable, Apache 2.0 licensed Gorilla models
🚀 [05/30/2023] Provided the CLI interface to chat with Gorilla!
🚀 [05/28/2023] Released Torch Hub and TensorFlow Hub Models!
🚀 [05/27/2023] Released the first Gorilla model! or 🤗!
🔥 [05/27/2023] We released the APIZoo contribution guide for community API contributions!
🔥 [05/25/2023] We released the APIBench dataset and the evaluation code of Gorilla!

About

Gorilla enables LLMs to use tools by invoking APIs. Given a natural language query, Gorilla comes up with the semantically and syntactically correct API to invoke.

With Gorilla, we are the first to demonstrate how to use LLMs to invoke 1,600+ (and growing) API calls accurately while reducing hallucination. This repository contains inference code for running Gorilla finetuned models, evaluation code for reproducing results from our paper, and APIBench, the largest collection of APIs, curated and easy to train on!

Since our initial release, we've served ~500k requests and witnessed incredible adoption by developers worldwide. The project has expanded to include tools, evaluations, leaderboard, end-to-end finetuning recipes, infrastructure components, and the Gorilla API Store:

Gorilla Paper
Type: 🤖 Model · 📝 Fine-tuning · 📚 Dataset · 📊 Evaluation · 🔧 Infra
Large Language Model Connected with Massive APIs
• Novel finetuning approach for API invocation
• Evaluation on 1,600+ APIs (APIBench)
• Retrieval-augmented training for test-time adaptation

Gorilla OpenFunctions-V2
Type: 🤖 Model
Drop-in alternative for function calling, supporting multiple complex data types and parallel execution
• Multiple & parallel function execution with OpenAI-compatible endpoints
• Native support for Python, Java, JavaScript, and REST APIs with expanded data types
• Function relevance detection to reduce hallucinations
• Enhanced RESTful API formatting capabilities
• State-of-the-art performance among open-source models

Berkeley Function Calling Leaderboard (BFCL)
Type: 📊 Evaluation · 🏆 Leaderboard · 🔧 Function Calling Infra · 📚 Dataset
Comprehensive evaluation of function-calling capabilities
• V1: Expert-curated dataset for evaluating single-turn function calling
• V2: Enterprise-contributed data for real-world scenarios
• V3: Multi-turn & multi-step function calling evaluation
• Cost and latency metrics for all models
• Interactive API explorer for testing
• Community-driven benchmarking platform

Agent Arena
Type: 📊 Evaluation · 🏆 Leaderboard
Compare LLM agents across models, tools, and frameworks
• Head-to-head agent comparisons with ELO rating system
• Framework compatibility testing (LangChain, AutoGPT)
• Community-driven evaluation platform
• Real-world task performance metrics

Gorilla Execution Engine (GoEx)
Type: 🔧 Infra
Runtime for executing LLM-generated actions with safety guarantees
• Post-facto validation for verifying LLM actions after execution
• Undo capabilities and damage confinement for risk mitigation
• OAuth2 and API key authentication for multiple services
• Support for RESTful APIs, databases, and filesystem operations
• Docker-based sandboxed execution environment

Retrieval-Augmented Fine-tuning (RAFT)
Type: 📝 Fine-tuning · 🤖 Model
Fine-tuning LLMs for robust domain-specific retrieval
• Novel fine-tuning recipe for domain-specific RAG
• Chain-of-thought answers with direct document quotes
• Training with oracle and distractor documents
• Improved performance on PubMed, HotpotQA, and Gorilla benchmarks
• Efficient adaptation of smaller models for domain QA

Gorilla CLI
Type: 🤖 Model · 🔧 Local CLI Infra
LLMs for your command-line interface
• User-friendly CLI tool supporting ~1500 APIs (Kubernetes, AWS, GCP, etc.)
• Natural language command generation with multi-LLM fusion
• Privacy-focused with explicit execution approval
• Command history and interactive selection interface

Gorilla API Zoo
Type: 📚 Dataset
A community-maintained repository of up-to-date API documentation
• Centralized, searchable index of APIs across domains
• Structured documentation format with arguments, versioning, and examples
• Community-driven updates to keep pace with API changes
• Rich data source for model training and fine-tuning
• Enables retrieval-augmented training and inference
• Reduces hallucination through up-to-date documentation
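
The Agent Arena entry above mentions head-to-head comparisons ranked with an ELO rating system. As a rough illustration of how such a rating moves after one comparison, here is a minimal sketch of the standard Elo update. The K-factor of 32, the 1500 starting rating, and the function names are assumptions for illustration; Agent Arena's actual ranking method may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one head-to-head match.

    score_a is 1.0 if A won, 0.5 for a tie, 0.0 if A lost.
    k=32 is an assumed K-factor, not a value taken from Agent Arena.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two agents start level; A wins the comparison.
    print(update_elo(1500.0, 1500.0, score_a=1.0))  # (1516.0, 1484.0)
```

An upset (a low-rated agent beating a high-rated one) moves both ratings further than an expected result does, which is why a handful of surprising votes can reorder a leaderboard early on.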

Getting Started

Quick Start

Try Gorilla in your browser:
- 🚀 Gorilla Colab Demo: Try the base Gorilla model
- 🌐 Gorilla Gradio Demo: Interactive web interface
- 🔥 OpenFunctions Colab Demo: Try the latest OpenFunctions model
- 🎯 OpenFunctions Website Demo: Experiment with function calling
- 📊 Berkeley Function Calling Leaderboard: Compare function calling capabilities

Installation Options

Gorilla CLI - Fastest way to get started

pip install gorilla-cli
gorilla generate 100 random characters into a file called test.txt

Learn more about Gorilla CLI →

Run Gorilla Locally

git clone https://github.com/ShishirPatil/gorilla.git
cd gorilla/inference

Detailed local setup instructions →

Use OpenFunctions

# Uses the legacy (pre-1.0) OpenAI Python SDK interface, e.g. openai==0.28
import openai

openai.api_key = "EMPTY"
openai.api_base = "http://luigi.millennium.berkeley.edu:8000/v1"

# Define your functions
functions = [{
    "name": "get_current_weather",
    "description": "Get weather in a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
    }
}]

# Make API call
completion = openai.ChatCompletion.create(
    model="gorilla-openfunctions-v2",
    messages=[{"role": "user", "content": "What's the weather in San Francisco?"}],
    functions=functions
)

OpenFunctions documentation →
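
Before acting on a function call proposed by a model, it is usually worth checking it against the schema you declared. The sketch below assumes the call arrives as a function name plus a JSON-encoded arguments string, as in OpenAI-style function calling; check the OpenFunctions documentation for the exact response shape. `validate_call` is a hypothetical helper, not part of Gorilla or the OpenAI SDK.

```python
import json

# Same schema as in the example above.
functions = [{
    "name": "get_current_weather",
    "description": "Get weather in a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}]


def validate_call(functions, name, arguments_json):
    """Return a list of problems; an empty list means the call fits the schema.

    Hypothetical helper: checks only the function name, required arguments,
    unexpected arguments, and enum values (not full JSON Schema validation).
    """
    spec = next((f for f in functions if f["name"] == name), None)
    if spec is None:
        return [f"unknown function: {name}"]
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return [f"arguments are not valid JSON: {exc}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    params = spec["parameters"]
    props = params.get("properties", {})
    problems = []
    for required in params.get("required", []):
        if required not in args:
            problems.append(f"missing required argument: {required}")
    for key, value in args.items():
        if key not in props:
            problems.append(f"unexpected argument: {key}")
            continue
        allowed = props[key].get("enum")
        if allowed is not None and value not in allowed:
            problems.append(f"{key}={value!r} not in {allowed}")
    return problems


if __name__ == "__main__":
    print(validate_call(functions, "get_current_weather",
                        '{"location": "San Francisco", "unit": "celsius"}'))
```

Rejecting a malformed call at this point is cheaper than letting it reach a real API; for stricter checks a full JSON Schema validator would be the natural next step.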

🔧 Other Quick Starts

📊 Evaluation & Benchmarking
- Berkeley Function Calling Leaderboard: Compare function calling capabilities
- Agent Arena: Evaluate agent workflows
- Gorilla Paper Evaluation Scripts: Run your own evaluations

🛠️ Development Tools
- GoEx: Safe execution of LLM-generated actions
- RAFT: Fine-tune models for domain-specific tasks
- API Store: Contribute and use APIs
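
RAFT is described above as training with oracle and distractor documents. To make that idea concrete, here is a minimal sketch of assembling one training example: the question is paired with a few distractor documents, and the oracle document is included only some of the time. The function name, the field names, the number of distractors, and the `p_oracle` fraction are all assumptions for illustration; the scripts in the repository's `raft` directory may be organized differently.

```python
import random


def build_raft_example(question, oracle_doc, distractor_pool, answer,
                       num_distractors=3, p_oracle=0.8, rng=None):
    """Assemble one retrieval-augmented fine-tuning example (illustrative only).

    With probability p_oracle the oracle document is mixed into the context;
    otherwise the context contains distractors only.
    """
    if rng is None:
        rng = random.Random()
    k = min(num_distractors, len(distractor_pool))
    context = rng.sample(distractor_pool, k=k)
    has_oracle = rng.random() < p_oracle
    if has_oracle:
        context.append(oracle_doc)
    # Shuffle so the oracle's position carries no signal.
    rng.shuffle(context)
    return {
        "question": question,
        "context": context,
        "answer": answer,
        "has_oracle": has_oracle,
    }


if __name__ == "__main__":
    pool = ["doc about Kubernetes", "doc about AWS S3", "doc about GCP IAM", "doc about Docker"]
    example = build_raft_example(
        question="How do I load a Torch Hub model?",
        oracle_doc="doc about torch.hub.load",
        distractor_pool=pool,
        answer="Use torch.hub.load with the repo and model name.",
        rng=random.Random(0),
    )
    print(example["has_oracle"], len(example["context"]))
```

Passing a seeded `random.Random` keeps dataset generation reproducible, which helps when comparing fine-tuning runs.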

Frequently Asked Questions

I would like to use Gorilla commercially. Is there going to be an Apache 2.0 licensed version?

Yes! We now have models that you can use commercially without any obligations.

Can we use Gorilla with other tools like LangChain?

Absolutely! You've highlighted a great aspect of our tools. Gorilla is an end-to-end model, specifically tailored to serve correct API calls (tools) without requiring any additional coding. It's designed to work as part of a wider ecosystem and can be flexibly integrated within agentic frameworks and other tools.

LangChain is a versatile developer tool. Its "agents" can swap in any LLM, Gorilla included, making it adaptable to a wide range of needs.

The beauty of these tools truly shines when they collaborate, complementing each other's strengths and capabilities to create an even more powerful and comprehensive solution. This is where your contribution can make a difference. We enthusiastically welcome any inputs to further refine and enhance these tools.

Check out our blog on How to Use Gorilla: A Step-by-Step Walkthrough to see all the different ways you can integrate Gorilla in your projects.

Project Roadmap

In the immediate future, we plan to release the following:

[ ] Multimodal function-calling leaderboard
[ ] Agentic function-calling leaderboard
[ ] New batch of user-contributed live function-calling evals
[ ] BFCL metrics to evaluate contamination
[ ] OpenFunctions-v3 model to support more languages and multi-turn capability
[x] Agent Arena to compare LLM agents across models, t

Core symbols most depended-on inside this repo

… · berkeley-function-call-leaderboard/bfcl_eval/eval_checker/ast_eval/type_convertor/java_type_converter.py

error · called by 55 · goex/cli.py

tree_to_variable_index · called by 50 · gorilla/eval/eval-scripts/codebleu/parser/utils.py

js_type_converter · called by 45 · berkeley-function-call-leaderboard/bfcl_eval/eval_checker/ast_eval/type_convertor/js_type_converter.py

load · called by 42 · raft/checkpointing.py

func_doc_language_specific_pre_processing · called by 39 · berkeley-function-call-leaderboard/bfcl_eval/model_handler/utils.py

copy · called by 34 · gorilla/inference/serve/conv_template.py

Shape

Method 1,234

Function 514

Class 195

Route 3

Languages

Python 96%

TypeScript 4%

Modules by API surface

gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python3.8_grammar.py · 155 symbols

gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python2-grammar.py · 101 symbols

gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python2-grammar-crlf.py · 101 symbols

gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python3-grammar.py · 98 symbols

gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python3-grammar-crlf.py · 98 symbols

gorilla/eval/retrievers/schema.py · 51 symbols

berkeley-function-call-leaderboard/bfcl_eval/eval_checker/multi_turn_eval/func_source_code/gorilla_file_system.py · 44 symbols

goex/exec_engine/db_manager.py · 28 symbols

berkeley-function-call-leaderboard/bfcl_eval/model_handler/utils.py · 27 symbols

raft/format.py · 26 symbols

berkeley-function-call-leaderboard/bfcl_eval/eval_checker/multi_turn_eval/func_source_code/vehicle_control.py · 26 symbols

berkeley-function-call-leaderboard/bfcl_eval/eval_checker/multi_turn_eval/func_source_code/trading_bot.py · 26 symbols

Dependencies from manifests, versioned

@codemirror/lang-javascript 6.2.2 · 1×

@testing-library/jest-dom 5.17.0 · 1×

@testing-library/react 13.4.0 · 1×

@testing-library/user-event 13.5.0 · 1×

@uiw/codemirror-theme-material 4.23.2 · 1×

@uiw/react-codemirror 4.23.2 · 1×

@vercel/analytics 1.3.1 · 1×

ace-builds 1.35.2 · 1×

ansi_up 6.0.2 · 1×

axios 1.7.2 · 1×

bootstrap 5.3.3 · 1×

bootstrap-dark 1.0.3 · 1×

For agents

$ claude mcp add gorilla \
  -- python -m otcore.mcp_server <graph>
