github.com/morphik-org/morphik-core @main

2,388 symbols · 9,332 edges · 263 files · 1,247 documented (52%) · updated 1d ago · ★ 3,624 · 14 open issues
README

Migration Required for Existing Installations: We've optimized our authentication system for 70-80% faster query performance. If you installed Morphik before June 22nd, 2025, run the migration script before launching Morphik:

python scripts/migrate_auth_columns_complete.py --postgres-uri "postgresql+asyncpg://user:pass@host:port/db"
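
The `--postgres-uri` value above embeds credentials in a URL, so a password containing characters such as `@`, `:`, or `/` generally needs percent-encoding. The sketch below shows one way to assemble the URI; `build_postgres_uri` and the sample credentials are hypothetical and not part of Morphik.

```python
from urllib.parse import quote


def build_postgres_uri(user: str, password: str, host: str, port: int, db: str) -> str:
    """Assemble a postgresql+asyncpg URI, percent-encoding the credentials.

    Hypothetical helper for illustration; only the URI shape comes from
    the migration command above.
    """
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql+asyncpg://{safe_user}:{safe_password}@{host}:{port}/{db}"


# Sample values are made up for the example.
uri = build_postgres_uri("morphik", "p@ss:word/1", "localhost", 5432, "morphik")
print(uri)
```

The printed string can then be passed, quoted, as the `--postgres-uri` argument.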

Morphik is an AI-native toolset for visually rich documents and multimodal data.

We are building the best way for developers to integrate context, however complex and nuanced, into their AI applications. We offer a treasure chest of tools to store, represent, and search (shallow and deep) unstructured data, end to end.

Why?

Building AI applications that interact with data shouldn't require duct-taping together a dozen different tools just to get relevant results to your LLM.

Traditional RAG approaches that work in proof-of-concepts often fail spectacularly in production. Cobbling together separate systems for text extraction, OCR, embeddings, vector databases, and retrieval creates fragile pipelines that break under real-world load. Each component brings its own APIs, configurations, and failure modes - what starts as a simple demo becomes an unmaintainable mess at scale.

Even worse, these pipelines fundamentally fail at understanding visually rich documents. Charts become meaningless text fragments. Critical diagrams lose their spatial relationships. Tables get mangled into unreadable strings. Technical specifications with mixed text and visuals? Forget about accuracy.

The result is AI applications that confidently return wrong answers because they never truly understood the documents. They miss crucial information embedded in images, misinterpret technical diagrams, and treat visual data as an afterthought. And performance? Watch your infrastructure costs explode as your LLM re-processes the same 500-page manual for every single query.

What?

Morphik provides developers the tools to ingest, search (deep and shallow), transform, and manage unstructured and multimodal documents. Some of our features include:

  • Multimodal Search: We employ techniques such as ColPali to build search that actually understands the visual content of documents you provide. Search over images, PDFs, videos, and more with a single endpoint.
  • Fast and Scalable Metadata Extraction: Extract metadata from documents - including bounding boxes, labeling, classification, and more.
  • Integrations: Integrate with existing tools and workflows, including (but not limited to) Google Suite, Slack, and Confluence.
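
For context on the Multimodal Search item above: ColPali-style retrieval represents each page as many patch embeddings and scores a query by late interaction, that is, for every query vector it takes the best-matching page vector and sums those maxima. The toy sketch below illustrates that scoring rule only. The vectors are made up, the function names are hypothetical, and this is not Morphik's implementation.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, page_vecs):
    """Late-interaction score: sum over query vectors of the best page match."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


# Made-up 2-dimensional embeddings: two query vectors, two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    "page_b": [[0.5, 0.5], [0.5, 0.5]],
}

# Rank pages from best to worst late-interaction score.
ranked = sorted(pages, key=lambda name: maxsim_score(query, pages[name]), reverse=True)
print(ranked)
```

With this toy data `page_a` scores 2.0 and `page_b` scores 1.0, so `page_a` ranks first. Real systems use high-dimensional embeddings from a vision language model and an index that avoids scoring every page.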

The best part? Morphik has a free tier! Get started by signing up at Morphik.

Getting Started with Morphik (Recommended)

The fastest and easiest way to get started with Morphik is by signing up for free at Morphik. We have a generous free tier and transparent, compute-usage based pricing if you're looking to ingest a lot of data.

Self-hosting Morphik

If you'd like to self-host Morphik, you can find the dedicated instructions here. We offer options for direct installation and installation via Docker.

Important: Due to limited resources, we cannot provide full support for self-hosted deployments. We have an installation guide, and a Discord community to help, but we can't guarantee full support.

Using Morphik

Once you've signed up for Morphik, you can get started with ingesting and searching your data right away.

Code (Example: Python SDK)

For programmers, we offer a Python SDK and a REST API. Ingesting a file is as simple as:

from morphik import Morphik

morphik = Morphik("<your-morphik-uri>")
morphik.ingest_file("path/to/your/super/complex/file.pdf")

Similarly, searching and querying your data is easy too:

morphik.query("What's the height of screw 14-A in the chair assembly instructions?")
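
The two calls above can be combined into a small helper. The sketch below relies only on the `ingest_file` and `query` methods shown in this README; `MorphikLike`, `ingest_and_ask`, and `FakeClient` are hypothetical names, and the fake client exists only so the example runs without a Morphik server. Whether a file is searchable as soon as `ingest_file` returns is not stated here, so treat the immediate query as an assumption.

```python
from typing import Any, Protocol


class MorphikLike(Protocol):
    """Anything exposing the two SDK calls shown above."""

    def ingest_file(self, path: str) -> Any: ...

    def query(self, question: str) -> Any: ...


def ingest_and_ask(client: MorphikLike, path: str, question: str) -> Any:
    """Ingest one file, then ask a question (assumes ingestion has finished)."""
    client.ingest_file(path)
    return client.query(question)


class FakeClient:
    """Hypothetical stand-in that records calls instead of contacting a server."""

    def __init__(self):
        self.ingested = []

    def ingest_file(self, path: str) -> str:
        self.ingested.append(path)
        return path

    def query(self, question: str) -> str:
        return f"{len(self.ingested)} file(s) searched for: {question}"


fake = FakeClient()
answer = ingest_and_ask(fake, "manual.pdf", "How tall is screw 14-A?")
print(answer)
```

With the real SDK, a `Morphik("<your-morphik-uri>")` instance would take the place of `FakeClient()`.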

Morphik Console

You can also interact with Morphik via the Morphik Console. This is a web-based interface that allows you to ingest, search, and query your data. You can upload files, connect to different data sources, and chat with your data all within the same place.

Model Context Protocol

Finally, you can also access Morphik via MCP. Instructions are available here.

Contributing

You're welcome to contribute to the project! We love:

  • Bug reports via GitHub issues
  • Feature requests via GitHub issues
  • Pull requests

Currently, we're focused on improving speed, integrating with more tools, and finding the research papers that provide the most value to our users. If you have thoughts, let us know on Discord or on GitHub!

License

Morphik Core is source-available under the Business Source License 1.1.

  • Personal / Indie use: free.
  • Commercial production use: free if your Morphik deployment generates less than $2,000/month in gross revenue. Otherwise, purchase a commercial key at https://morphik.ai/pricing.
  • Future open source: each code version automatically re-licenses to Apache 2.0 exactly four years after its first release.

See the full license text for details.

Contributors

Visit our special thanks page dedicated to our contributors.

Extension points (exported contracts: how you extend this code)

  • DynamicSiteHeaderProps (Interface) · ee/ui-component/components/dynamic-site-header.tsx · (no doc)
  • BaseSidebarProps (Interface) · ee/ui-component/components/sidebar-base.tsx · (no doc)
  • NavUserProps (Interface) · ee/ui-component/components/nav-user.tsx · (no doc)
  • MorphikSidebarStatefulProps (Interface) · ee/ui-component/components/sidebar-stateful.tsx · (no doc)
  • MorphikSidebarProps (Interface) · ee/ui-component/components/sidebar.tsx · (no doc)

Core symbols (most depended-on inside this repo)

  • error · called by 478 · sdks/python/morphik/models.py
  • get · called by 167 · ee/ui-component/lib/api-client.ts
  • cn · called by 116 · ee/ui-component/lib/utils.ts
  • showAlert · called by 57 · ee/ui-component/components/ui/alert-system.tsx
  • get_settings · called by 55 · core/config.py
  • _request · called by 46 · sdks/python/morphik/async_.py
  • _request · called by 46 · sdks/python/morphik/sync.py
  • build · called by 42 · core/database/metadata_filters.py

Shape

Method 1,110
Function 819
Class 270
Interface 113
Route 76

Languages

Python 82%
TypeScript 18%

Modules by API surface

sdks/python/morphik/async_.py · 113 symbols
sdks/python/morphik/sync.py · 112 symbols
sdks/python/morphik/tests/test_scoped_ops_unit.py · 68 symbols
core/services/ingestion_service.py · 56 symbols
core/tests/unit/test_typed_metadata.py · 52 symbols
core/database/postgres_database.py · 52 symbols
core/database/metadata_filters.py · 51 symbols
sdks/python/morphik/models.py · 50 symbols
core/api.py · 48 symbols
core/vector_store/fast_multivector_store.py · 47 symbols
core/tests/unit/test_metadata_filters.py · 46 symbols
core/services/document_service.py · 38 symbols

Dependencies from manifests, versioned

@radix-ui/react-accordion 1.2.3 · 1×
@radix-ui/react-avatar 1.1.10 · 1×
@radix-ui/react-checkbox 1.1.5 · 1×
@radix-ui/react-dialog 1.1.7 · 1×
@radix-ui/react-dropdown-menu 2.1.7 · 1×
@radix-ui/react-label 2.1.2 · 1×
@radix-ui/react-progress 1.1.3 · 1×
@radix-ui/react-radio-group 1.2.4 · 1×
@radix-ui/react-scroll-area 1.2.2 · 1×
@radix-ui/react-slider 1.2.3 · 1×

Datastores touched

test_db · Database · 1 repo

For agents

$ claude mcp add morphik-core \
  -- python -m otcore.mcp_server <graph>