
github.com/seaweedfs/seaweedfs @4.37 sqlite


31,857 symbols 137,349 edges 2,653 files 11,347 documented · 36%

README

SeaweedFS


Sponsor SeaweedFS via Patreon

SeaweedFS is an independent Apache-licensed open source project with its ongoing development made possible entirely thanks to the support of these awesome backers. If you'd like to grow SeaweedFS even stronger, please consider joining our sponsors on Patreon.

Your support will be really appreciated by me and other supporters!


Quick Start
- Quick Start with weed mini
- Quick Start for S3 API on Docker
Introduction
Features
- Additional Features
- Filer Features
Example: Using Seaweed Object Store
Architecture
Compared to Other File Systems
Dev Plan
Installation Guide
Disk Related Topics
Benchmark
Enterprise
License

Quick Start

Quick Start with weed mini

Download the latest binary from https://github.com/seaweedfs/seaweedfs/releases and unzip the single weed (or weed.exe) file, or run go install github.com/seaweedfs/seaweedfs/weed@latest. Then start a ready-to-use S3 object store with credentials and a pre-created bucket in one command:

AWS_ACCESS_KEY_ID=admin \
AWS_SECRET_ACCESS_KEY=secret \
S3_BUCKET=my-bucket \
./weed mini -dir=/data

That's it — the S3 endpoint is at http://localhost:8333, my-bucket already exists, and admin/secret are valid credentials. S3_BUCKET accepts a comma-separated list (e.g. raw,processed); use S3_TABLE_BUCKET for S3 Tables (Iceberg) buckets. Drop any of the env vars to skip that piece (no AWS keys → S3 runs in unauthenticated "Allow All" mode for development).

The same command starts everything else too:
- S3 Endpoint: http://localhost:8333
- Master UI: http://localhost:9333
- Volume Server: http://localhost:9340
- Filer UI: http://localhost:8888
- WebDAV: http://localhost:7333
- Admin UI: http://localhost:23646
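To check which of the endpoints listed above are actually up, a small port probe can help. This is a minimal sketch using only the Python standard library; the port numbers are copied from the list above, while the names `MINI_ENDPOINTS`, `is_listening`, and `report` are hypothetical helpers, not part of SeaweedFS.

```python
import socket

# Default ports as listed in the quick start for `weed mini`.
MINI_ENDPOINTS = {
    "S3 Endpoint": 8333,
    "Master UI": 9333,
    "Volume Server": 9340,
    "Filer UI": 8888,
    "WebDAV": 7333,
    "Admin UI": 23646,
}


def is_listening(port, host="localhost", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(endpoints=MINI_ENDPOINTS, host="localhost"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(port, host) for name, port in endpoints.items()}
```

Calling `report()` after starting `weed mini` returns a dictionary of service names to booleans. A `False` entry only means the TCP port did not accept a connection within the timeout; it does not say why.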

macOS: if the binary is quarantined, run xattr -d com.apple.quarantine ./weed first.

Perfect for development, testing, learning SeaweedFS, and single-node deployments. To scale out, add more volume servers by running weed volume -dir="/some/data/dir2" -master="<master_host>:9333" -port=8081 locally, on another machine, or on thousands of machines.

Quick Start for S3 API on Docker

docker run -p 8333:8333 \
  -e AWS_ACCESS_KEY_ID=admin \
  -e AWS_SECRET_ACCESS_KEY=secret \
  -e S3_BUCKET=my-bucket \
  chrislusf/seaweedfs

Same behavior as the weed mini command above — the S3 endpoint is at http://localhost:8333 with my-bucket pre-created. Drop the env vars to run anonymously for development.
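When scripting either quick start, it can be convenient to build the environment and the Docker command line programmatically. The sketch below only assembles values; it does not start anything. The variable names, image name, and port come from the commands above, while `quickstart_env` and `docker_command` are hypothetical helper names. It assumes the Docker image accepts a comma-separated `S3_BUCKET` the same way `weed mini` does.

```python
def quickstart_env(access_key="admin", secret_key="secret", buckets=("my-bucket",)):
    """Build the environment variables used by the quick start commands.

    Passing no keys omits the credentials, which the quick start describes
    as running unauthenticated for development.
    """
    env = {}
    if access_key and secret_key:
        env["AWS_ACCESS_KEY_ID"] = access_key
        env["AWS_SECRET_ACCESS_KEY"] = secret_key
    if buckets:
        # S3_BUCKET is described as accepting a comma-separated list.
        env["S3_BUCKET"] = ",".join(buckets)
    return env


def docker_command(env, image="chrislusf/seaweedfs", port=8333):
    """Return the `docker run` argument list matching the quick start."""
    argv = ["docker", "run", "-p", f"{port}:{port}"]
    for key, value in env.items():
        argv += ["-e", f"{key}={value}"]
    argv.append(image)
    return argv
```

The resulting list can be passed to `subprocess.run(docker_command(quickstart_env()))` on a machine with Docker installed.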

Introduction

SeaweedFS is a simple and highly scalable distributed file system. There are two objectives:

to store billions of files!
to serve the files fast!

SeaweedFS started as a blob store to handle small files efficiently. Instead of managing all file metadata in a central master, the central master only manages volumes on volume servers, and these volume servers manage files and their metadata. This relieves concurrency pressure from the central master and spreads file metadata into volume servers, allowing faster file access (O(1), usually just one disk read operation).
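The split of responsibilities described above can be illustrated with a toy model. This is a simplified sketch, not SeaweedFS code: the class names, the in-memory dictionaries, and the `fetch` helper are all assumptions made for illustration. The master only maps a volume id to a server, and the volume server maps a file key to an offset and size inside one large blob, so a read is a single slice.

```python
class Master:
    """Knows only which volume server hosts each volume, not individual files."""

    def __init__(self):
        self.volume_locations = {}  # volume id -> server name

    def lookup(self, volume_id):
        return self.volume_locations[volume_id]


class VolumeServer:
    """Stores file bytes in one append-only blob per volume, with a small index."""

    def __init__(self):
        self.blobs = {}  # volume id -> bytearray holding all file data
        self.index = {}  # (volume id, file key) -> (offset, size)

    def write(self, volume_id, file_key, data):
        blob = self.blobs.setdefault(volume_id, bytearray())
        self.index[(volume_id, file_key)] = (len(blob), len(data))
        blob.extend(data)

    def read(self, volume_id, file_key):
        # One index lookup, then one contiguous read: the O(1) access path.
        offset, size = self.index[(volume_id, file_key)]
        return bytes(self.blobs[volume_id][offset:offset + size])


def fetch(master, servers, volume_id, file_key):
    """Ask the master where the volume lives, then read from that server."""
    return servers[master.lookup(volume_id)].read(volume_id, file_key)


master = Master()
servers = {"vs1": VolumeServer()}
master.volume_locations[3] = "vs1"
servers["vs1"].write(3, 1, b"hello")
servers["vs1"].write(3, 2, b"world")
```

The master's table grows with the number of volumes rather than the number of files, which is the reason given above for the reduced pressure on the central master.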

There are only 40 bytes of disk storage overhead for each file's metadata. With O(1) disk reads, the design is simple enough that you are welcome to challenge the performance with your actual use cases.
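To put the 40-byte figure in perspective, the arithmetic below estimates total metadata overhead for a given file count. It is a back-of-the-envelope sketch: the 40-byte constant comes from the text above, while the function names and the example file sizes are illustrative assumptions.

```python
METADATA_BYTES_PER_FILE = 40  # per-file disk overhead stated above


def metadata_overhead_bytes(file_count):
    """Total disk bytes spent on per-file metadata."""
    return file_count * METADATA_BYTES_PER_FILE


def overhead_ratio(average_file_size):
    """Metadata overhead relative to the average payload size."""
    return METADATA_BYTES_PER_FILE / average_file_size


one_billion = 10**9
total_gb = metadata_overhead_bytes(one_billion) / 10**9  # decimal gigabytes
```

Under these assumptions, one billion files cost about 40 GB of metadata, and for files averaging 4,000 bytes the overhead is roughly 1% of the stored payload.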

SeaweedFS started by implementing Facebook's Haystack design paper. SeaweedFS also implements erasure coding with ideas from f4: Facebook’s Warm BLOB Storage System, and has many similarities with Facebook’s Tectonic Filesystem and Google's Colossus File System.

On top of the blob store, an optional Filer can support directories and POSIX attributes. Filer is a separate linearly-scalable stateless server with customizable metadata stores, e.g., MySql, Postgres, Redis, Cassandra, HBase, Mongodb, Elastic Search, LevelDB, RocksDB, Sqlite, MemSql, TiDB, Etcd, CockroachDB, YDB, etc.

SeaweedFS can transparently integrate with the cloud. With hot data on the local cluster and warm data in the cloud with O(1) access time, SeaweedFS can achieve both fast local access time and elastic cloud storage capacity. What's more, the cloud storage access API cost is minimized. Faster and cheaper than direct cloud storage!

SeaweedFS also ships a built-in Iceberg REST Catalog, turning the same cluster into a self-contained lakehouse. Spark, Trino, Dremio, DuckDB, and RisingWave can query Iceberg tables directly — no Hive Metastore, Glue, or external catalog service required. Storage and table metadata live in one system, simplifying on-prem and small-team analytics stacks.


Features

Additional Blob Store Features

Support different replication levels, with rack and data center awareness.
Automatic master server failover - no single point of failure (SPOF).
Automatic compression depending on file MIME type.
Automatic compaction to reclaim disk space after deletion or update.
Automatic entry TTL expiration.
Flexible Capacity Expansion: Any server with some disk space can add to the total storage space.
Adding/Removing servers does not cause any data re-balancing unless triggered by admin commands.
Optional picture resizing.
Support ETag, Accept-Ranges, Last-Modified, etc.
Support in-memory/leveldb/readonly mode tuning for memory/performance balance.
Support rebalancing the writable and readonly volumes.
Customizable Multiple Storage Tiers: Customizable storage disk types to balance performance and cost.
Transparent cloud integration: unlimited capacity via tiered cloud storage for warm data.
Erasure Coding for warm storage: Rack-aware 10.4 erasure coding reduces storage cost and increases availability. The Enterprise version can customize the EC ratio.
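The storage-cost claim for erasure coding can be checked with simple arithmetic. The sketch below assumes that "10.4" means 10 data shards plus 4 parity shards, and compares it with plain replication; the function names and the 3-copy comparison are illustrative assumptions, not figures from SeaweedFS.

```python
def ec_storage_factor(data_shards=10, parity_shards=4):
    """Raw bytes stored per byte of user data under erasure coding."""
    return (data_shards + parity_shards) / data_shards


def ec_tolerated_losses(parity_shards=4):
    """Shards that can be lost while the data stays recoverable."""
    return parity_shards


def replication_storage_factor(copies):
    """Raw bytes stored per byte of user data with full copies."""
    return float(copies)


def replication_tolerated_losses(copies):
    """Copies that can be lost while at least one survives."""
    return copies - 1


ec_factor = ec_storage_factor()                # 14 shards for 10 shards of data
triple_factor = replication_storage_factor(3)  # three full copies
```

Under that reading, 10.4 erasure coding stores 1.4 bytes per user byte and survives the loss of any 4 shards, while 3-way replication stores 3 bytes per user byte and survives the loss of 2 copies.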


Filer Features

Filer server provides "normal" directories and files via HTTP.
File TTL automatically expires file metadata and actual file data.
Mount filer reads and writes files directly as a local directory via FUSE.
Filer Store Replication enables HA for filer metadata stores.
Active-Active Replication enables asynchronous one-way or two-way cross cluster continuous replication.
Amazon S3 compatible API accesses files with S3 tooling.
Hadoop Compatible File System accesses files from Hadoop/Spark/Flink/etc or even runs HBase.
Async Replication To Cloud has extremely fast local access and backups to Amazon S3, Google Cloud Storage, Azure, BackBlaze.
WebDAV accesses as a mapped drive on Mac and Windows, or from mobile devices.
AES256-GCM Encrypted Storage safely stores the encrypted data.
Super Large Files stores large or super large files in tens of TB.
Cloud Drive mounts cloud storage to local cluster, cached for fast read and write with asynchronous write back.
Gateway to Remote Object Store mirrors bucket operations to remote object storage, in addition to Cloud Drive.
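Since the Filer exposes directories and files over HTTP, a client needs little more than a URL. The sketch below builds requests against the Filer address from the quick start (http://localhost:8888) without sending them. The `Accept: application/json` header for directory listings reflects the Filer HTTP API as I understand it, so treat it as an assumption to verify; `filer_url` and `listing_request` are hypothetical helper names.

```python
import urllib.request
from urllib.parse import quote

FILER_BASE = "http://localhost:8888"  # Filer address from the quick start


def filer_url(path, base=FILER_BASE):
    """Join a filer path onto the base URL, percent-encoding unsafe characters."""
    return base.rstrip("/") + quote(path)


def listing_request(directory, base=FILER_BASE):
    """Build a GET request asking for a directory listing as JSON."""
    if not directory.endswith("/"):
        directory += "/"
    return urllib.request.Request(
        filer_url(directory, base),
        headers={"Accept": "application/json"},
    )
```

With a Filer running, `urllib.request.urlopen(listing_request("/docs"))` would send the request, and a plain `urllib.request.urlopen(filer_url("/docs/report.txt"))` would fetch a file.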

Data Lakehouse Features

S3 Table Buckets expose a dedicated namespace for Iceberg tables with strict layout validation.
Built-in Iceberg REST Catalog runs alongside the S3 endpoint, so no external metastore is needed.
Native integrations with Apache Spark, Trino, Dremio, DuckDB, and RisingWave.
Automated table maintenance: compaction, snapshot expiration, orphan removal, manifest rewriting.
Granular IAM at the bucket, namespace, and table level via standard S3 bucket policies.

Kubernetes

Kubernetes CSI Driver: a Container Storage Interface (CSI) driver.
SeaweedFS Operator

Extension points: exported contracts (how you extend this code)

FlushableEncoder (Interface)

Represents an Encoder that buffers encoded data prior to writing to the backing stream. [10 implementers]

test/random_access/src/main/java/seaweedfs/client/btree/serialize/FlushableEncoder.java

DetectionSender (Interface)

DetectionSender sends detection responses for one request. [10 implementers]

weed/plugin/worker/worker.go

KMSProvider (Interface)

KMSProvider defines the interface for Key Management Service implementations [6 implementers]

weed/kms/kms.go

HTTPClient (Interface)

HTTPClient interface for testing [132 implementers]

weed/operation/upload_content.go

PolicyManager (Interface)

PolicyManager interface for managing IAM policies [8 implementers]

weed/credential/credential_store.go

UserStore (Interface)

UserStore defines the interface for retrieving IAM user policy attachments. [15 implementers]

weed/iam/integration/iam_manager.go

Value (Interface)

Value is the interface to the dynamic value stored in a flag. (The default value is represented as a string.) If a Valu [7 …

weed/util/fla9/fla9.go

FilerClient (Interface)

FilerClient provides access to the filer for storage operations. [25 implementers]

weed/s3api/iceberg/server.go

Core symbols: most depended-on inside this repo

String

called by 4129

weed/topology/node.go

weed/query/engine/datetime_functions.go

Error

called by 1260

weed/worker/types/task.go

Unlock

called by 1255

weed/filer/posixlock/manager.go

Contains

called by 1152

weed/filer/empty_folder_cleanup/cleanup_queue.go

Lock

called by 1047

weed/cluster/lock_manager/lock_manager.go

Shape

Method 16,832

Function 11,124

Struct 3,248

Interface 222

TypeAlias 214

Class 174

FuncType 43

Languages

Go 94%

Java 3%

TypeScript 2%

Python 1%

Modules by API surface

weed/pb/volume_server_pb/volume_server.pb.go: 1,061 symbols

weed/pb/filer_pb/filer.pb.go: 990 symbols

weed/pb/mq_pb/mq_broker.pb.go: 662 symbols

weed/pb/master_pb/master.pb.go: 648 symbols

weed/pb/worker_pb/worker.pb.go: 580 symbols

weed/pb/plugin_pb/plugin.pb.go: 576 symbols

weed/pb/iam_pb/iam.pb.go: 472 symbols

weed/admin/static/js/bootstrap.bundle.min.js: 392 symbols

weed/pb/s3_lifecycle_pb/s3_lifecycle.pb.go: 259 symbols

weed/pb/volume_server_pb/volume_server_grpc.pb.go: 251 symbols

weed/query/engine/engine.go: 202 symbols

weed/pb/schema_pb/mq_schema.pb.go: 189 symbols

Dependencies: from manifests, versioned

atomicgo.dev/cursor v0.2.0 · 1×

atomicgo.dev/keyboard v0.2.9 · 1×

atomicgo.dev/schedule v0.1.0 · 1×

cel.dev/expr v0.25.1 · 1×

cloud.google.com/go v0.123.0 · 1×

cloud.google.com/go/auth v0.20.0 · 1×

cloud.google.com/go/auth/oauth2adapt v0.2.8 · 1×

cloud.google.com/go/compute/metadata v0.9.0 · 1×

cloud.google.com/go/iam v1.7.0 · 1×

cloud.google.com/go/kms v1.31.0 · 1×

cloud.google.com/go/longrunning v0.9.0 · 1×

cloud.google.com/go/monitoring v1.24.3 · 1×

Datastores touched

defaultDatabase · 1 repos

(mongodb)Database · 1 repos

(mysql)Database · 1 repos

For agents

$ claude mcp add seaweedfs \
  -- python -m otcore.mcp_server <graph>
