
github.com/cloudflare/ebpf_exporter @v2.5.1 sqlite

266 symbols 737 edges 85 files 106 documented · 40%
README

ebpf_exporter

Prometheus exporter for custom eBPF metrics and OpenTelemetry traces.

  • Metrics: (image: metrics)

  • Tracing: (image: tracing)

The motivation for this exporter is to let you write eBPF code and export metrics that are not otherwise accessible from the Linux kernel.

ebpf.io describes eBPF:

eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules.

An easy way to think about this exporter is bcc tools as Prometheus metrics:

  • https://iovisor.github.io/bcc

We use libbpf rather than legacy bcc driven code, so it's more like libbpf-tools:

  • https://github.com/iovisor/bcc/tree/master/libbpf-tools

Producing OpenTelemetry compatible traces is also supported, see Tracing docs for more information on that.

Reading material

  • https://www.brendangregg.com/ebpf.html
  • https://nakryiko.com/posts/bpf-core-reference-guide/
  • https://nakryiko.com/posts/bpf-portability-and-co-re/
  • https://nakryiko.com/posts/bcc-to-libbpf-howto-guide/
  • https://libbpf.readthedocs.io/en/latest/program_types.html

Building and running

Actual building

To build a binary, clone the repo and run:

make build

The default build target makes a static binary, but you could also use the build-dynamic target if you'd like a dynamically linked binary. In either case libbpf is built from source, but you could override this behavior with BUILD_LIBBPF=0, if you want to use your system libbpf.

If you're having trouble building on the host, you can try building in Docker:

docker build --tag ebpf_exporter --target ebpf_exporter .
docker cp $(docker create ebpf_exporter):/ebpf_exporter ./

To build examples (see building examples section):

make -C examples clean build

To run with biolatency config:

sudo ./ebpf_exporter --config.dir=examples --config.names=biolatency

If you pass --debug, you can see raw maps at /maps endpoint and see debug output from libbpf itself.

Docker image

A docker image can be built from this repo. A prebuilt image with examples included is also available for download from GitHub Container Registry:

  • https://github.com/cloudflare/ebpf_exporter/pkgs/container/ebpf_exporter

To build the image with just the exporter binary, run the following:

docker build --tag ebpf_exporter --target ebpf_exporter .

To run it with the examples, you need to build them first (see above). Then you can start a privileged container with the following bind mounts:

  • $(pwd)/examples:/examples:ro to allow access to examples on the host
  • /sys/fs/cgroup:/sys/fs/cgroup:ro to allow resolving cgroups

You might have to bind-mount additional directories depending on your needs. You might also not need to bind-mount anything for simple kprobe examples.

The actual command to run the docker container (from the repo directory):

docker run --rm -it --privileged -p 9435:9435 \
  -v $(pwd)/examples:/examples \
  -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
  ebpf_exporter --config.dir=examples --config.names=timers

For production use you would either bind-mount your own config and compiled bpf programs corresponding to it, or build your own image based on ours with your own config baked in.

For development use, when you don't want or don't have any dev tools on the host, you can build the docker image with examples bundled:

docker build --tag ebpf_exporter --target ebpf_exporter_with_examples .

Some examples then can run without any bind mounts:

docker run --rm -it --privileged -p 9435:9435 \
  ebpf_exporter --config.dir=examples --config.names=timers

Or with the publicly available prebuilt image:

docker run --rm -it --privileged -p 9435:9435 \
  ghcr.io/cloudflare/ebpf_exporter --config.dir=examples --config.names=timers

Kubernetes Helm chart

A third party helm chart is available here:

  • https://github.com/kubeservice-stack/kubservice-charts/tree/master/charts/kubeservice-ebpf-exporter

Please note that the helm chart is not provided or supported by Cloudflare, so do your own due diligence and use it at your own risk.

Benchmarking overhead

See the benchmark directory to get an idea of how low eBPF overhead is.

Required capabilities

While you can run ebpf_exporter as root, it is not strictly necessary. Only the following two capabilities are necessary for normal operation:

  • CAP_BPF: required for privileged bpf operations and for reading memory
  • CAP_PERFMON: required to attach bpf programs to kprobes and tracepoints

If you are using systemd, you can use the following configuration to run as an otherwise unprivileged dynamic user with the needed capabilities:

DynamicUser=true
AmbientCapabilities=CAP_BPF CAP_PERFMON
CapabilityBoundingSet=CAP_BPF CAP_PERFMON

Prior to Linux v5.8 there were no dedicated CAP_BPF and CAP_PERFMON capabilities, but you can use CAP_SYS_ADMIN instead if your kernel is older.
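
To check whether a process holds the capabilities described above, one option is to look at the CapEff mask in /proc/self/status. The following is a minimal sketch, not part of ebpf_exporter: the helper names are hypothetical, and the bit numbers are the ones defined in the kernel's include/uapi/linux/capability.h. It treats CAP_SYS_ADMIN as an acceptable fallback, as the paragraph above suggests for older kernels.

```go
package main

import (
	"fmt"
	"strconv"
)

// Capability bit numbers from include/uapi/linux/capability.h.
const (
	capSysAdmin = 21
	capPerfmon  = 38
	capBPF      = 39
)

// hasCap reports whether capability bit n is set in the mask.
func hasCap(mask uint64, n uint) bool {
	return mask&(1<<n) != 0
}

// canRunExporter is a hypothetical helper. It takes the hex value of the
// CapEff field from /proc/<pid>/status and reports whether the mask has
// either CAP_SYS_ADMIN or both CAP_BPF and CAP_PERFMON.
func canRunExporter(capEffHex string) (bool, error) {
	mask, err := strconv.ParseUint(capEffHex, 16, 64)
	if err != nil {
		return false, err
	}
	if hasCap(mask, capSysAdmin) {
		return true, nil
	}
	return hasCap(mask, capBPF) && hasCap(mask, capPerfmon), nil
}

func main() {
	// A mask with only bits 38 (CAP_PERFMON) and 39 (CAP_BPF) set.
	ok, err := canRunExporter("000000c000000000")
	fmt.Println(ok, err)
}
```

The additional capabilities listed further down (for example CAP_SYSLOG for the ksym decoder) are not covered by this check.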

If you pass the --capabilities.keep=none flag to ebpf_exporter, it drops all capabilities after attaching the probes, leaving it fully unprivileged.

The following additional capabilities might be needed:

  • CAP_SYSLOG: if you use ksym decoder to have access to /proc/kallsyms. Note that you must keep this capability: --capabilities.keep=cap_syslog. See: https://elixir.bootlin.com/linux/v6.4/source/kernel/kallsyms.c#L982
  • CAP_IPC_LOCK: if you use perf_event_array for reading from the kernel. Note that you must keep it: --capabilities.keep=cap_perfmon,cap_ipc_lock.
  • CAP_SYS_ADMIN: if you want BTF information from modules. See: https://github.com/libbpf/libbpf/blob/v1.2.0/src/libbpf.c#L8654-L8666 and https://elixir.bootlin.com/linux/v6.5-rc1/source/kernel/bpf/syscall.c#L3789
  • CAP_NET_ADMIN: if you use net admin related programs like xdp. See: https://elixir.bootlin.com/linux/v6.4/source/kernel/bpf/syscall.c#L3787
  • CAP_SYS_RESOURCE: if you run an older kernel without memcg accounting for bpf memory. Upstream Linux kernel added support for this in v5.11. See: https://github.com/libbpf/libbpf/blob/v1.2.0/src/bpf.c#L98-L106
  • CAP_DAC_READ_SEARCH: if you want to use fanotify to monitor cgroup changes, which is the preferred way, but only available since Linux v6.6. See: https://github.com/torvalds/linux/commit/0ce7c12e88cf
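
As an illustration of how the --capabilities.keep values above fit together, here is a small sketch that builds the flag from the features a config uses. The helper is hypothetical, and combining the ksym and perf_event_array values into one comma-separated list is an assumption based on the two examples in the list; only the individual values come from the text.

```go
package main

import (
	"fmt"
	"strings"
)

// keepFlag is a hypothetical helper that builds a --capabilities.keep
// argument from the features a config relies on.
func keepFlag(usesKsym, usesPerfEventArray bool) string {
	var caps []string
	if usesKsym {
		// The ksym decoder needs continued access to /proc/kallsyms.
		caps = append(caps, "cap_syslog")
	}
	if usesPerfEventArray {
		// Reading from a perf_event_array needs both of these kept.
		caps = append(caps, "cap_perfmon", "cap_ipc_lock")
	}
	if len(caps) == 0 {
		// Nothing to keep: drop everything after attaching probes.
		return "--capabilities.keep=none"
	}
	return "--capabilities.keep=" + strings.Join(caps, ",")
}

func main() {
	fmt.Println(keepFlag(true, false))
	fmt.Println(keepFlag(false, false))
}
```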

External BTF Support

Executing eBPF programs requires kernel data types, normally available in /sys/kernel/btf/vmlinux, which is created during the kernel build process. However, on some older kernel configurations this file might not be available. If that's the case, an external BTF file can be supplied with --btf.path. An archive of BTFs for some older distros and kernel versions can be found here.
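
A wrapper script or launcher could make this decision automatically. The sketch below is one way to do it, assuming a hypothetical helper and a made-up location for the external BTF file; only the /sys/kernel/btf/vmlinux path and the --btf.path flag come from the text above.

```go
package main

import (
	"fmt"
	"os"
)

// btfArgs is a hypothetical helper that returns extra ebpf_exporter
// arguments. kernelBTF is normally /sys/kernel/btf/vmlinux; externalBTF
// is a path to a BTF file obtained separately for the running kernel.
func btfArgs(kernelBTF, externalBTF string) []string {
	if _, err := os.Stat(kernelBTF); err == nil {
		// The kernel exposes its own BTF, so no extra flag is needed.
		return nil
	}
	return []string{"--btf.path=" + externalBTF}
}

func main() {
	// The external path here is an example, not a default of the exporter.
	args := btfArgs("/sys/kernel/btf/vmlinux", "/var/lib/btf/vmlinux.btf")
	fmt.Println(args)
}
```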

Supported scenarios

Currently the only supported way of getting data out of the kernel is via maps.

See examples section for real world examples.

If you have examples you want to share, please feel free to open a PR.

Configuration

Skip to format to see the full specification.

Examples

You can find additional examples in the examples directory.

Unless otherwise specified, all examples are expected to work on Linux 5.15, which is the latest LTS release at the time of writing. Thanks to CO-RE, examples are also supposed to work on any modern kernel with BTF enabled.

You can find the list of supported distros in libbpf README:

  • https://github.com/libbpf/libbpf#bpf-co-re-compile-once--run-everywhere

Building examples

To build examples, run:

make -C examples clean build

This will use clang to build examples with the vmlinux.h we provide in this repo (see include for more on vmlinux.h).

Examples need to be compiled before they can be used.

Note that compiled examples can be used as is on any BTF enabled kernel with no runtime dependencies. Most modern Linux distributions have it enabled.

Timers via tracepoints (counters)

This config attaches to kernel tracepoints for the timers subsystem and counts timers that fire, broken down by timer function name.

Resulting metrics:

# HELP ebpf_exporter_timer_starts_total Timers fired in the kernel
# TYPE ebpf_exporter_timer_starts_total counter
ebpf_exporter_timer_starts_total{function="blk_stat_timer_fn"} 10
ebpf_exporter_timer_starts_total{function="commit_timeout   [jbd2]"} 1
ebpf_exporter_timer_starts_total{function="delayed_work_timer_fn"} 25
ebpf_exporter_timer_starts_total{function="dev_watchdog"} 1
ebpf_exporter_timer_starts_total{function="mix_interrupt_randomness"} 3
ebpf_exporter_timer_starts_total{function="neigh_timer_handler"} 1
ebpf_exporter_timer_starts_total{function="process_timeout"} 49
ebpf_exporter_timer_starts_total{function="reqsk_timer_handler"} 2
ebpf_exporter_timer_starts_total{function="tcp_delack_timer"} 5
ebpf_exporter_timer_starts_total{function="tcp_keepalive_timer"} 6
ebpf_exporter_timer_starts_total{function="tcp_orphan_update"} 16
ebpf_exporter_timer_starts_total{function="tcp_write_timer"} 12
ebpf_exporter_timer_starts_total{function="tw_timer_handler"} 1
ebpf_exporter_timer_starts_total{function="writeout_period"} 5

There's a config file for it:

metrics:
  counters:
    - name: timer_starts_total
      help: Timers fired in the kernel
      labels:
        - name: function
          size: 8
          decoders:
            - name: ksym

And corresponding C code that compiles into an ELF file with eBPF bytecode:

#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include "maps.bpf.h"

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, u64);
    __type(value, u64);
} timer_starts_total SEC(".maps");

SEC("tp_btf/timer_start")
int BPF_PROG(timer_start, struct timer_list *timer)
{
    u64 function = (u64) timer->function;
    increment_map(&timer_starts_total, &function, 1);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";

Block IO histograms (histograms)

This config attaches to the block IO subsystem and reports disk latency as a Prometheus histogram, allowing you to compute percentiles.

The following tools work with similar concepts:

  • https://github.com/iovisor/bcc/blob/master/tools/biosnoop_example.txt
  • https://github.com/iovisor/bcc/blob/master/tools/biolatency_example.txt
  • https://github.com/iovisor/bcc/blob/master/tools/bitesize_example.txt

This program was the initial reason for the exporter and was heavily influenced by the experimental exporter from Daniel Swarbrick:

  • https://github.com/dswarbrick/ebpf_exporter

Resulting metrics:

```
# HELP ebpf_exporter_bio_latency_seconds Block IO latency histogram
# TYPE ebpf_exporter_bio_latency_seconds histogram
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="1e-06"} 0
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="2e-06"} 0
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="4e-06"} 0
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="8e-06"} 0
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="1.6e-05"} 0
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="3.2e-05"} 0
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="6.4e-05"} 0
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.000128"} 22
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.000256"} 36
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.000512"} 40
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.001024"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.002048"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.004096"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.008192"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.016384"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.032768"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.065536"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.131072"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.262144"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="0.524288"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="1.048576"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="2.097152"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="4.194304"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="8.388608"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="16.777216"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="33.554432"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="67.108864"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="134.217728"} 48
ebpf_exporter_bio_latency_seconds_bucket{device="nvme0n1",operation="write",le="+Inf"} 48
ebpf_exporter_bio_latency_seconds_sum{device="nvme0n1",operation="write"} 0.021772
ebpf_export
```

Extension points exported contracts — how you extend this code

Decoder (Interface)
Decoder transforms byte field value into a byte value representing string to either use as an input for another Decoder [18 …
decoder/decoder.go
Provider (Interface)
Provider creates tracers for requested service names [1 implementers]
tracing/provider.go

Core symbols most depended-on inside this repo

GetHostByteOrder
called by 11
util/byte_order.go
Tracer
called by 9
tracing/provider.go
DecodeLabelsForMetrics
called by 8
decoder/decoder.go
Resolve
called by 8
cgroup/monitor.go
NewSet
called by 7
decoder/decoder.go
inode
called by 6
cgroup/inode.go
NewDecoder
called by 5
kallsyms/decoder.go
add
called by 5
cgroup/observer.go

Shape

Function 137
Method 71
Struct 51
Interface 3
TypeAlias 3
FuncType 1

Languages

Go 100%

Modules by API surface

exporter/exporter.go 24 symbols
benchmark/getpid_test.go 18 symbols
config/config.go 13 symbols
cgroup/fanotify.go 10 symbols
kallsyms/decoder.go 9 symbols
exporter/perf_event_array.go 8 symbols
decoder/decoder.go 8 symbols
cgroup/observer.go 8 symbols
cgroup/monitor.go 8 symbols
exporter/cgroup_id_map.go 7 symbols
exporter/histogram.go 6 symbols
decoder/decoder_test.go 6 symbols

Dependencies from manifests, versioned

github.com/alecthomas/units v0.0.0-2023120207171 · 1×
github.com/aquasecurity/libbpfgo v0.9.1-libbpf-1.5.1 · 1×
github.com/beorn7/perks v1.0.1 · 1×
github.com/cespare/xxhash/v2 v2.3.0 · 1×
github.com/coreos/go-systemd v0.0.0-2019110409311 · 1×
github.com/elastic/go-perf v0.0.0-2019121214071 · 1×
github.com/go-logr/logr v1.4.3 · 1×
github.com/go-logr/stdr v1.2.2 · 1×
github.com/grafana/regexp v0.0.0-2024051813331 · 1×

For agents

$ claude mcp add ebpf_exporter \
  -- python -m otcore.mcp_server <graph>
