From Tracepoints to Metrics: A journey from kernel to user-space

Written by Balint Molnar
Published on May 26, 2025

From Hooks to Userspace

In our last post, we explored how to define custom tracepoints inside the kernel to expose meaningful events. In this follow-up, we shift our focus to the second half of the journey: how to move that data from the kernel to user-space efficiently.

The question we set out to answer was: what’s the best mechanism for streaming kernel events to user-space at scale? Our journey led us through sockets, character devices, and virtual filesystems—before ultimately embracing eBPF.

In this post, we’ll walk through how we built our tracing pipeline, the trade-offs we considered, and why eBPF turned out to be the right tool for the job.

Streaming Events from the Kernel to User Space

Riptides is designed to operate silently and unobtrusively, which is one of the reasons we chose to use tracepoints for generating events inside the kernel. However, this design choice is only effective if the event data can be streamed efficiently to user space for further processing. There are several methods to stream such events, each with trade-offs in performance, complexity, and suitability for different use cases. Below are the options we considered:

Netlink Sockets

Netlink is a special IPC mechanism between the Linux kernel and user space, built on top of sockets. It supports multicast, unicast, and asynchronous messaging, enabling structured communication with one or more user space processes. In essence, the kernel creates a special socket, and user space applications interact with it using traditional recvmsg/sendmsg calls. While Netlink supports structured messages, it is not ideal for high-rate, stream-based messaging. Each message involves context switches and kernel locking, and there is no support for zero-copy transmission. Netlink is best suited for status updates, control messages, and occasional events rather than continuous data streams.
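
Netlink isn't part of our final pipeline, but to make the trade-off concrete, here is a minimal, illustrative sketch (not Riptides code) of a user-space process that opens a netlink socket and waits for a message from the kernel; the protocol number and message layout are placeholders:

#include <linux/netlink.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
   /* NETLINK_USERSOCK is a generic protocol slot; a real setup would use
      the protocol (or generic netlink family) the kernel side registers. */
   int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_USERSOCK);
   if (fd < 0)
      return 1;

   struct sockaddr_nl addr = { .nl_family = AF_NETLINK, .nl_pid = getpid() };
   if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
      return 1;

   /* Each event arrives as an nlmsghdr-framed message; recvmsg() copies it
      out of the kernel, which is where the per-message overhead comes from. */
   char buf[4096];
   struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
   struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
   ssize_t len = recvmsg(fd, &msg, 0);
   if (len > 0)
      printf("received %zd bytes\n", len);

   close(fd);
   return 0;
}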

eBPF (Extended Berkeley Packet Filter)

eBPF is the successor to the original BPF filtering mechanism in Linux. It allows programs to run within a privileged context, such as the Linux kernel, and is primarily used to extend kernel capabilities in a safe and efficient manner. While classic BPF was limited to packet filtering, eBPF enables a much broader range of functionality. Traditionally, extending kernel behavior required writing a kernel module, which is overkill for many use cases. eBPF changes this by allowing developers to load small programs at runtime without modifying the kernel or rebooting. Safety and performance are enforced through a verification engine and a JIT compiler. Since eBPF programs can attach to kernel hooks like kprobes and tracepoints, it is an excellent choice for streaming events from kernel to user space. With features like ring buffer support and zero-copy data transfer, eBPF provides high performance with minimal overhead. Compared to character devices, it is also considered safer due to its built-in verification system. Note that user-space loaders are required to manage the lifecycle of eBPF programs and their associated resources (e.g., maps and ring buffers).

Procfs / Sysfs

Procfs (/proc) and Sysfs (/sys) are virtual filesystems used to expose kernel data to user space. Procfs is primarily used for exporting system and process information, while Sysfs represents kernel object attributes. These interfaces are intentionally simple, using standard file read/write operations. However, this simplicity comes at a cost: there is no built-in push mechanism, so user space must poll these files to detect updates, which can lead to increased CPU usage. There is also no buffering support; only the current snapshot of the data is available, making these interfaces unsuitable for scenarios where events occur frequently or in rapid succession. Overall, Procfs and Sysfs are best suited for configuration and introspection, not for real-time telemetry or high-frequency event streaming.
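
To illustrate the polling pattern this forces on user space, here is a tiny, purely illustrative C loop (the file read here is just a stand-in, not one of our interfaces):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
   char prev[256] = "", cur[256];

   for (;;) {
      /* No push mechanism: re-open and re-read the file to get the
         current snapshot, then compare it with the previous one. */
      FILE *f = fopen("/proc/loadavg", "r");
      if (!f)
         return 1;
      if (fgets(cur, sizeof(cur), f) && strcmp(cur, prev) != 0) {
         fputs(cur, stdout);
         strcpy(prev, cur);
      }
      fclose(f);
      sleep(1); /* anything that changed between polls is simply missed */
   }
}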

Character Device

A character device is a custom kernel device exposed under /dev (for example, /dev/mydevice), allowing user space processes to interact with it using standard file operations like open, read, and write. To use a character device, a kernel module must register it and implement the necessary callbacks. Because the module owns the device, it is responsible for managing its entire lifecycle, including creation, deletion, data handling, and synchronization. This adds significant complexity. Moreover, there is no memory protection between your code and the kernel, so bugs in the module can lead to serious issues like kernel panics. Despite these risks, character devices offer maximum flexibility and very fast I/O performance for data transfer between kernel and user space.
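
To give a sense of the moving parts, here is a minimal sketch (not our actual module) of a kernel module that registers a character device through the misc-device interface and serves reads; all names here are illustrative:

#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/module.h>

static ssize_t mydev_read(struct file *file, char __user *buf,
                          size_t len, loff_t *off)
{
   static const char msg[] = "hello from the kernel\n";

   /* The module itself is responsible for buffering, synchronization,
      and copying data across the kernel/user boundary. */
   return simple_read_from_buffer(buf, len, off, msg, sizeof(msg) - 1);
}

static const struct file_operations mydev_fops = {
   .owner = THIS_MODULE,
   .read  = mydev_read,
};

static struct miscdevice mydev = {
   .minor = MISC_DYNAMIC_MINOR,
   .name  = "mydevice", /* shows up as /dev/mydevice */
   .fops  = &mydev_fops,
};

module_misc_device(mydev);
MODULE_LICENSE("GPL");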

Comparing the Four

Feature          | Netlink        | eBPF                 | proc/sysfs        | Character Device
Performance      | Medium         | High                 | Low               | High
Complexity       | Medium         | Medium-High          | Low               | Medium-High
Stream suitable? | Limited        | ✅ Yes (ring buffer)  | ❌ No              | ✅ Yes
Push-based?      | Partial        | Yes                  | No (polling only) | Yes (poll/push)
Best for         | Events, config | Tracing, telemetry   | Status/config     | Custom data flows

At Riptides, we already use a character device to transfer control messages between our kernel module and user-space agent. We're fully aware of the complexity involved in managing a character device, so at first glance, it seemed like the perfect mechanism for streaming events as well. However, after careful consideration and internal discussions, we made the decision to keep character devices solely for control messages and adopt eBPF for all tracing and telemetry needs. Here’s why:

  • Simplified Kernel Module
    We aim to keep our kernel module as minimal and focused as possible. Offloading all streaming logic to eBPF makes the codebase cleaner and easier to maintain.
  • Unidirectional Data Flow
    For tracing and telemetry, we mostly need one-way data flow from kernel to user space. eBPF is perfectly suited for this use case with minimal overhead.
  • Built-in Synchronization
    eBPF provides standard mechanisms (e.g., ring buffers) with built-in synchronization and locking, eliminating the need to implement and maintain our own.
  • High Performance with Zero-Copy
    eBPF offers first-class support for ring buffers and zero-copy data transfer, making it extremely efficient for high-frequency event streaming.
  • Easy Extensibility
    With native support for hooking into tracepoints, kprobes, and other kernel hooks, eBPF allows us to expand our tracing capabilities quickly and safely.

In summary, eBPF enables us to decouple tracing logic from our core kernel module, while delivering high performance, easier maintenance, and future flexibility.

Deep Dive into Our Trace Highway

In our tracing and telemetry architecture, we use tracepoints to statically define events within our kernel module. These tracepoints serve as well-defined hooks for significant system or application events. To transport these events to user space, we rely on eBPF, which provides an efficient and safe mechanism for capturing, buffering, and forwarding event data. This setup allows us to build telemetry and metrics with minimal overhead and high reliability.

Let’s explore how this is achieved, step by step:

Requirements to Use an eBPF Program

If you’ve read our previous blog about tracepoints, you’ll recall that each tracepoint defines a hookable location in the kernel. These hooks can be used to attach eBPF programs, allowing us to run custom logic when the tracepoint is triggered.

Here’s an example of one of our eBPF handlers:

SEC("tracepoint/riptides/riptides_accept_start")
int riptides_accept_start(struct driver_socket_start_ctx *ctx)
{
   return handle_socket_start_event(ctx, EVENT_DRIVER_SOCKET_ACCEPT_START);
}

SEC() Macro

The SEC() macro is a section annotation used in eBPF programs to specify the type of program and where it should be attached. In our example:

SEC("tracepoint/riptides/riptides_accept_start")

means this eBPF program attaches to the tracepoint riptides_accept_start inside the riptides subsystem.

How to Find What to Write Inside SEC()

You might wonder: How do I know the exact tracepoint name to use here?

Linux provides a helpful virtual file called /proc/kallsyms, which lists all kernel symbols currently known to the system. Think of it as the symbol table for the running kernel.

Since we want to find tracepoints related to our kernel module called riptides, we can filter symbols like this:

cat /proc/kallsyms | grep 'riptides'

This produces many symbols because it lists all kernel symbols, including functions and data. To narrow it down to tracepoints, we add another filter:

cat /proc/kallsyms | grep 'riptides' | grep 'tracepoint'
ffff80007c2e0010 d __tracepoint_riptides_accept_start [riptides]
...

  • The first column is the symbol’s memory address.
  • The d indicates it’s a local data symbol.
  • The symbol name tells us the tracepoint: __tracepoint_riptides_accept_start.
  • The [riptides] at the end shows it belongs to the riptides kernel module.

A Simpler Way: Available Tracepoints List

For convenience, you can also list all available tracepoints directly via:

cat /sys/kernel/debug/tracing/available_events | grep riptides
riptides:riptides_accept_start
...

Although the methods described can list tracepoints, they do not show kprobes or other hook types. That’s an important caveat to keep in mind when exploring available kernel hooks.

Once the eBPF program is attached to a tracepoint, the kernel automatically invokes this function whenever the tracepoint fires. In our example, that’s the function:

int riptides_accept_start(struct driver_socket_start_ctx *ctx)

You may notice this handler has a single parameter, ctx. This parameter is a context structure, and its layout is strictly defined by the tracepoint itself.

If the program’s context structure does not match the tracepoint’s expected layout, one of two things will happen:

  • You may get garbage or invalid data inside the handler.
  • Or, more commonly, the eBPF verifier will detect the mismatch and reject the program during loading.

Therefore, it’s crucial to ensure your context struct aligns exactly with the tracepoint’s data format.

In our case, the trace event was declared using the following macro in the kernel:

TP_STRUCT__entry(
   __array(u8, uuid, UUID_SIZE)
   __field(s64, trace_timestamp)
),

To verify the actual layout used by the kernel, you can inspect the generated tracepoint format with:

sudo cat /sys/kernel/debug/tracing/events/riptides/riptides_accept_start/format
name: riptides_accept_start
ID: 1704
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;

field:u8 uuid[16]; offset:8; size:16; signed:0;
field:s64 trace_timestamp; offset:24; size:8; signed:1;

You’ll notice the first four fields (common_type, common_flags, etc.) are automatically inserted by the tracepoint system. These are standard metadata fields shared across all tracepoints and are used internally for scheduling, filtering, or debugging.

Even though you don't define these fields yourself, you must account for them in your context structure, as they are part of the layout passed to your eBPF program.

Here’s the corresponding context structure in our eBPF code:

struct driver_socket_start_ctx
{
   struct header h; // Matches the common_* fields
   __u8 uuid[16];   // Custom field
   __s64 timestamp; // Custom field
};

The struct header here represents the first four "common" fields and should be defined to match them precisely.
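
For reference, a definition matching the common_* fields from the format output above would look roughly like this (field names are illustrative; the types, sizes, and order are what matter):

struct header
{
   __u16 common_type;          /* offset 0, size 2 */
   __u8  common_flags;         /* offset 2, size 1 */
   __u8  common_preempt_count; /* offset 3, size 1 */
   __s32 common_pid;           /* offset 4, size 4 */
};

With these 8 bytes in place, the custom uuid and timestamp fields land at offsets 8 and 24, exactly as reported in the format file.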

Ring Buffer

Before diving into what happens inside the handle_socket_start_event function, let’s take a moment to understand a key component of our setup: the ring buffer, which serves as the bridge between kernel space and user space.

Here's how we declare it in eBPF:

struct
{
   __uint(type, BPF_MAP_TYPE_RINGBUF);
   __uint(max_entries, 65536);
} driver_socket_buf SEC(".maps");

The ring buffer is a lockless, high-performance FIFO data structure provided by the eBPF subsystem. It is ideal for sending a large number of events efficiently from the kernel to user space, maintaining their order.

Key points:

  • max_entries: Together with type, this is the only field you must set. It defines the total buffer size in bytes and must be a power of two and a multiple of the page size (e.g., 65536).
  • No struct is predefined: Unlike hash or array maps, a ring buffer does not enforce a data schema. It is the responsibility of the producer and consumer to agree on the data format and parse it correctly.
  • Memory-mapped in user space: To consume data from the ring buffer, user-space code memory maps the buffer using mmap(), enabling zero-copy access.
  • Epoll-friendly: Ring buffers support epoll, allowing the user-space application to efficiently wait for new events without constant polling.

This combination of performance, simplicity, and ordering makes ring buffers an excellent choice for high-throughput, one-way data transfer from the kernel.
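
Riptides consumes the buffer from Go (more on that below), but to make the mmap/epoll model concrete, here is a minimal, illustrative consumer using libbpf in C; the object file name is a placeholder, while "driver_socket_buf" matches the map declared above:

#include <bpf/libbpf.h>
#include <stdio.h>

static int on_event(void *ctx, void *data, size_t size)
{
   /* 'data' points straight into the memory-mapped buffer: zero-copy. */
   printf("event of %zu bytes\n", size);
   return 0;
}

int main(void)
{
   struct bpf_object *obj = bpf_object__open_file("tracer.bpf.o", NULL);
   if (!obj || bpf_object__load(obj))
      return 1;

   int map_fd = bpf_object__find_map_fd_by_name(obj, "driver_socket_buf");
   struct ring_buffer *rb = ring_buffer__new(map_fd, on_event, NULL, NULL);
   if (!rb)
      return 1;

   /* Under the hood this waits on epoll; the callback runs per record. */
   while (ring_buffer__poll(rb, 100 /* ms */) >= 0)
      ;

   ring_buffer__free(rb);
   bpf_object__close(obj);
   return 0;
}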

Propagate Data

Inside the handle_socket_start_event function, we populate the ring buffer with telemetry data. This process relies on three key eBPF helper functions provided by the kernel:

  • bpf_ringbuf_reserve(void *ringbuf, __u64 size, __u64 flags)
    This function reserves a region of memory inside the ring buffer and returns a pointer to it. You can write data directly into this space from your eBPF program.
    • If the buffer is full, it returns NULL.
    • There is no need for additional memory copies—data is written directly into the buffer (zero-copy).
    • It’s especially useful for writing large samples since the data does not live on the stack.
    • The eBPF verifier ensures you don’t write beyond the reserved memory.
  • bpf_ringbuf_submit(void *data, __u64 flags)
    Once you’ve finished writing data into the reserved space, this call marks it as ready to be consumed by user space.
  • bpf_ringbuf_discard(void *data, __u64 flags)
    If you decide not to submit the data (e.g., based on a runtime condition), use this function to discard the reserved space.

These three functions provide a reliable and efficient way to transfer structured data from kernel to user space with minimal overhead.
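
Putting these helpers together, a handler like handle_socket_start_event boils down to the reserve / fill / submit pattern below. This is only a sketch: the event struct we actually emit is simplified here, but the pattern is the one described above.

struct socket_start_event
{
   __u32 type;       /* e.g. EVENT_DRIVER_SOCKET_ACCEPT_START */
   __u8  uuid[16];
   __s64 timestamp;
};

static __always_inline int
handle_socket_start_event(struct driver_socket_start_ctx *ctx, __u32 type)
{
   struct socket_start_event *event;

   /* Reserve space directly inside the ring buffer (zero-copy). */
   event = bpf_ringbuf_reserve(&driver_socket_buf, sizeof(*event), 0);
   if (!event)
      return 0; /* buffer full: drop the event */

   event->type = type;
   __builtin_memcpy(event->uuid, ctx->uuid, sizeof(event->uuid));
   event->timestamp = ctx->timestamp;

   /* Mark the record as ready to be consumed in user space. */
   bpf_ringbuf_submit(event, 0);
   return 0;
}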

Earlier in this post, we mentioned that eBPF code and associated maps (like the ring buffer) must be set up from a user-space application. So what does Riptides use to handle this initialization and data consumption? And how is the buffer read in user space?

That’s what we’ll cover next.

User-Space Helpers

There are several libraries available to load, verify, and attach eBPF programs to various kernel hooks. The most mature and feature-rich option is libbpf, written in C. However, since our agent component is implemented in Go, we opted to use Cilium’s eBPF library, which is a pure Go alternative.

This library comes with a tool called bpf2go, which not only compiles your eBPF programs but also generates Go scaffolding code. This simplifies integration by eliminating the need to manually interface with the underlying C code.

The Cilium eBPF library is optimized for performance and supports the latest eBPF features. It also abstracts away the complexity of managing the lifecycle of eBPF programs and related resources like ring buffers. With this setup, raw events from the kernel are efficiently streamed into user space, ready for processing by our telemetry and metrics pipeline.

Conclusion

With our eBPF-based tracing architecture in place, we now have a high-performance, low-overhead pipeline for streaming raw events from kernel space to user space using ring buffers. By leveraging tracepoints and the Cilium eBPF library in Go, we've built a clean and efficient mechanism for capturing real-time telemetry data without overcomplicating our kernel module.

In the next post, we'll dive deeper into how these raw events are processed inside our user-space agent to generate meaningful metrics and telemetry data.
