
DeepSearcher combines cutting-edge LLMs (OpenAI o3, Qwen3, DeepSeek, Grok 3, Claude 3.7 Sonnet, Llama 4, QwQ, etc.) and vector databases (Milvus, Zilliz Cloud, etc.) to perform search, evaluation, and reasoning over private data, providing highly accurate answers and comprehensive reports. This project is suitable for enterprise knowledge management, intelligent Q&A systems, and information retrieval scenarios.


Install DeepSearcher using one of the following methods:
Create and activate a virtual environment (Python 3.10 is recommended).
python -m venv .venv
source .venv/bin/activate
Install DeepSearcher
pip install deepsearcher
For optional dependencies, e.g., ollama:
pip install "deepsearcher[ollama]"
We recommend using uv for faster and more reliable installation. Follow the official installation instructions to install it.
Clone the repository and navigate to the project directory:
git clone https://github.com/zilliztech/deep-searcher.git && cd deep-searcher
Synchronize and install dependencies:
uv sync
source .venv/bin/activate
For more detailed development setup and optional dependency installation options, see CONTRIBUTING.md.
To run this quick start demo, set OPENAI_API_KEY in your environment variables. If you change the LLM in the configuration, make sure to set the corresponding API key.
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query
config = Configuration()
# Customize your config here.
# For more options, see the Configuration Details section below.
config.set_provider_config("llm", "OpenAI", {"model": "o1-mini"})
config.set_provider_config("embedding", "OpenAIEmbedding", {"model": "text-embedding-ada-002"})
init_config(config=config)
# Load your local data
from deepsearcher.offline_loading import load_from_local_files
load_from_local_files(paths_or_directory=your_local_path)
# (Optional) Load from web crawling (`FIRECRAWL_API_KEY` env variable required)
from deepsearcher.offline_loading import load_from_website
load_from_website(urls=website_url)
# Query
result = query("Write a report about xxx.") # Your question here
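The quick start above assumes OPENAI_API_KEY is already set. As a minimal sketch, a fail-fast check could run before `init_config`. The helper name `require_env` is hypothetical and not part of DeepSearcher:

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper, not part of DeepSearcher.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it before running the quick start."
        )
    return value


# Usage, placed before init_config(config=config):
# require_env("OPENAI_API_KEY")
```

Checking up front gives a readable message instead of an error raised later inside a provider call.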
config.set_provider_config("llm", "(LLMName)", "(Arguments dict)")
The "LLMName" can be one of the following: ["DeepSeek", "OpenAI", "XAI", "SiliconFlow", "Aliyun", "PPIO", "TogetherAI", "Gemini", "Ollama", "Novita"]. The examples below also use "Anthropic", "Volcengine", "GLM", and "Bedrock".
The "Arguments dict" is a dictionary that contains the necessary arguments for the LLM class.
Example (OpenAI)
Make sure you have prepared your OPENAI API KEY as an env variable OPENAI_API_KEY.
<pre><code>config.set_provider_config("llm", "OpenAI", {"model": "o1-mini"})</code></pre>
More details about OpenAI models: https://platform.openai.com/docs/models
Example (Qwen3 from Aliyun Bailian)
Make sure you have prepared your Bailian API KEY as an env variable DASHSCOPE_API_KEY.
<pre><code>config.set_provider_config("llm", "Aliyun", {"model": "qwen-plus-latest"})</code></pre>
More details about Aliyun Bailian models: https://bailian.console.aliyun.com
Example (Qwen3 from OpenRouter)
<pre><code>config.set_provider_config("llm", "OpenAI", {"model": "qwen/qwen3-235b-a22b:free", "base_url": "https://openrouter.ai/api/v1", "api_key": "OPENROUTER_API_KEY"})</code></pre>
More details about OpenRouter models: https://openrouter.ai/qwen/qwen3-235b-a22b:free
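In the OpenRouter example, "OPENROUTER_API_KEY" looks like a placeholder for the key itself. A minimal sketch that builds the same arguments dict from an environment variable, assuming "api_key" expects the key value and that the key is stored in an OPENROUTER_API_KEY variable. The helper `openrouter_llm_args` is hypothetical:

```python
import os


def openrouter_llm_args(model, environ=None):
    """Build the arguments dict for the "OpenAI" provider pointed at OpenRouter.

    Hypothetical helper. It assumes "api_key" expects the key value itself,
    read here from an OPENROUTER_API_KEY environment variable.
    """
    environ = os.environ if environ is None else environ
    api_key = environ.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return {
        "model": model,
        "base_url": "https://openrouter.ai/api/v1",
        "api_key": api_key,
    }


# Usage, with `config` created as in the quick start:
# config.set_provider_config("llm", "OpenAI", openrouter_llm_args("qwen/qwen3-235b-a22b:free"))
```

Reading the key at run time keeps it out of source files and version control.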
Example (DeepSeek from official)
Make sure you have prepared your DEEPSEEK API KEY as an env variable DEEPSEEK_API_KEY.
<pre><code>config.set_provider_config("llm", "DeepSeek", {"model": "deepseek-reasoner"})</code></pre>
More details about DeepSeek: https://api-docs.deepseek.com/
Example (DeepSeek from SiliconFlow)
Make sure you have prepared your SILICONFLOW API KEY as an env variable SILICONFLOW_API_KEY.
<pre><code>config.set_provider_config("llm", "SiliconFlow", {"model": "deepseek-ai/DeepSeek-R1"})</code></pre>
More details about SiliconFlow: https://docs.siliconflow.cn/quickstart
Example (DeepSeek from TogetherAI)
Make sure you have prepared your TOGETHER API KEY as an env variable TOGETHER_API_KEY.
For DeepSeek R1:
<pre><code>config.set_provider_config("llm", "TogetherAI", {"model": "deepseek-ai/DeepSeek-R1"})</code></pre>
For Llama 4:
<pre><code>config.set_provider_config("llm", "TogetherAI", {"model": "meta-llama/Llama-4-Scout-17B-16E-Instruct"})</code></pre>
You need to install the together package before running: pip install together. More details about TogetherAI: https://www.together.ai/
Example (XAI Grok)
Make sure you have prepared your XAI API KEY as an env variable XAI_API_KEY.
<pre><code>config.set_provider_config("llm", "XAI", {"model": "grok-2-latest"})</code></pre>
More details about XAI Grok: https://docs.x.ai/docs/overview#featured-models
Example (Claude)
Make sure you have prepared your ANTHROPIC API KEY as an env variable ANTHROPIC_API_KEY.
<pre><code>config.set_provider_config("llm", "Anthropic", {"model": "claude-3-7-sonnet-latest"})</code></pre>
More details about Anthropic Claude: https://docs.anthropic.com/en/home
Example (Google Gemini)
Make sure you have prepared your GEMINI API KEY as an env variable GEMINI_API_KEY.
<pre><code>config.set_provider_config('llm', 'Gemini', { 'model': 'gemini-2.0-flash' })</code></pre>
You need to install the google-genai package before running: pip install google-genai. More details about Gemini: https://ai.google.dev/gemini-api/docs
Example (DeepSeek from PPIO)
Make sure you have prepared your PPIO API KEY as an env variable PPIO_API_KEY. You can create an API Key here.
<pre><code>config.set_provider_config("llm", "PPIO", {"model": "deepseek/deepseek-r1-turbo"})</code></pre>
More details about PPIO: https://ppinfra.com/docs/get-started/quickstart.html?utm_source=github_deep-searcher
Example (Ollama)
Follow these instructions to set up and run a local Ollama instance:
Download and install Ollama onto the available supported platforms (including Windows Subsystem for Linux).
View a list of available models via the model library.
Fetch available LLM models via ollama pull <name-of-model>
Example: ollama pull qwen3
To chat directly with a model from the command line, use ollama run <name-of-model>.
By default, Ollama has a REST API for running and managing models on http://localhost:11434.
<pre><code>config.set_provider_config("llm", "Ollama", {"model": "qwen3"})</code></pre>
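Before pointing DeepSearcher at Ollama, it can help to confirm that the local server is reachable. A minimal sketch using only the standard library, assuming the default address mentioned above and Ollama's `/api/tags` endpoint, which lists locally available models. The function name `ollama_is_running` is hypothetical:

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an Ollama server answers at base_url.

    Uses GET /api/tags, the Ollama REST endpoint that lists local models.
    Hypothetical helper, not part of DeepSearcher.
    """
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


# Usage:
# if not ollama_is_running():
#     print("Start Ollama first, then pull a model with: ollama pull qwen3")
```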
Example (Volcengine)
Make sure you have prepared your Volcengine API KEY as an env variable VOLCENGINE_API_KEY. You can create an API Key here.
<pre><code>config.set_provider_config("llm", "Volcengine", {"model": "deepseek-r1-250120"})</code></pre>
More details about Volcengine: https://www.volcengine.com/docs/82379/1099455?utm_source=github_deep-searcher
Example (GLM)
Make sure you have prepared your GLM API KEY as an env variable GLM_API_KEY.
<pre><code>config.set_provider_config("llm", "GLM", {"model": "glm-4-plus"})</code></pre>
You need to install zhipuai before running: pip install zhipuai. More details about GLM: https://bigmodel.cn/dev/welcome
Example (Amazon Bedrock)
Make sure you have prepared your Amazon Bedrock credentials as the env variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
<pre><code>config.set_provider_config("llm", "Bedrock", {"model": "us.deepseek.r1-v1:0"})</code></pre>
You need to install boto3 before running: pip install boto3. More details about Amazon Bedrock: https://docs.aws.amazon.com/bedrock/
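The environment variables named in the LLM examples above can be collected in one table and checked before a provider is selected. This is a sketch only: `LLM_ENV_VARS` and `missing_llm_env` are hypothetical names, and the mapping simply restates the examples in this section rather than anything read from DeepSearcher itself.

```python
import os

# Environment variables named in the LLM examples above, keyed by provider name.
# Ollama runs locally and is not listed; the OpenRouter example passes its key
# through the "api_key" argument instead.
LLM_ENV_VARS = {
    "OpenAI": ["OPENAI_API_KEY"],
    "Aliyun": ["DASHSCOPE_API_KEY"],
    "DeepSeek": ["DEEPSEEK_API_KEY"],
    "SiliconFlow": ["SILICONFLOW_API_KEY"],
    "TogetherAI": ["TOGETHER_API_KEY"],
    "XAI": ["XAI_API_KEY"],
    "Anthropic": ["ANTHROPIC_API_KEY"],
    "Gemini": ["GEMINI_API_KEY"],
    "PPIO": ["PPIO_API_KEY"],
    "Volcengine": ["VOLCENGINE_API_KEY"],
    "GLM": ["GLM_API_KEY"],
    "Bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
}


def missing_llm_env(provider, environ=None):
    """Return the variables from LLM_ENV_VARS that are unset for a provider."""
    environ = os.environ if environ is None else environ
    return [name for name in LLM_ENV_VARS.get(provider, []) if not environ.get(name)]


# Usage:
# missing = missing_llm_env("Bedrock")
# if missing:
#     raise SystemExit("Set these variables first: " + ", ".join(missing))
```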
config.set_provider_config("embedding", "(EmbeddingModelName)", "(Arguments dict)")
The "EmbeddingModelName" can be one of the following: ["MilvusEmbedding", "OpenAIEmbedding", "VoyageEmbedding", "SiliconflowEmbedding", "PPIOEmbedding", "NovitaEmbedding"]. The examples below also use "BedrockEmbedding".
The "Arguments dict" is a dictionary that contains the necessary arguments for the embedding model class.
Example (OpenAI embedding)
Make sure you have prepared your OpenAI API KEY as an env variable OPENAI_API_KEY.
<pre><code>config.set_provider_config("embedding", "OpenAIEmbedding", {"model": "text-embedding-3-small"})</code></pre>
More details about OpenAI models: https://platform.openai.com/docs/guides/embeddings/use-cases
Example (OpenAI embedding Azure)
Make sure you have prepared your OpenAI API KEY as an env variable OPENAI_API_KEY.
<pre><code>config.set_provider_config("embedding", "OpenAIEmbedding", {
    "model": "text-embedding-ada-002",
    "azure_endpoint": "https://<youraifoundry>.openai.azure.com/",
    "api_version": "2023-05-15"
})</code></pre>
Example (Pymilvus built-in embedding model)
To use the built-in embedding model in Pymilvus, set the model name to "default", "BAAI/bge-base-en-v1.5", "BAAI/bge-large-en-v1.5", "jina-embeddings-v3", etc.
See [milvus_embedding.py](deepsearcher/embedding/milvus_embedding.py) for more details.
<pre><code>config.set_provider_config("embedding", "MilvusEmbedding", {"model": "BAAI/bge-base-en-v1.5"})</code></pre>
<pre><code>config.set_provider_config("embedding", "MilvusEmbedding", {"model": "jina-embeddings-v3"})</code></pre>
For Jina's embedding model, you need to set JINAAI_API_KEY.
You need to install the pymilvus.model package before running: pip install pymilvus.model. More details about Pymilvus: https://milvus.io/docs/embeddings.md
Example (VoyageAI embedding)
Make sure you have prepared your VOYAGE API KEY as an env variable VOYAGE_API_KEY.
<pre><code>config.set_provider_config("embedding", "VoyageEmbedding", {"model": "voyage-3"})</code></pre>
You need to install voyageai before running: pip install voyageai. More details about VoyageAI: https://docs.voyageai.com/embeddings/
Example (Amazon Bedrock embedding)
<pre><code>config.set_provider_config("embedding", "BedrockEmbedding", {"model": "amazon.titan-embed-text-v2:0"})</code></pre>
You need to install boto3 before running: pip install boto3. More details about Amazon Bedrock: https://docs.aws.amazon.com/bedrock/
Example (Novita AI embedding)
Make sure you have prepared your Novita AI API KEY as an env variable NOVITA_API_KEY.
<pre><code>config.se
$ claude mcp add deep-searcher \
-- python -m otcore.mcp_server <graph>