* [RFC PATCH v2 00/14] kcov: add per-task dataflow tracking for function arguments/return values
@ 2026-06-11 16:21 Yunseong Kim
From: Yunseong Kim @ 2026-06-11 16:21 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak, Andrey Konovalov,
	Alexander Potapenko, Dmitry Vyukov, Andrew Morton, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Nathan Chancellor, Nicolas Schier, Nick Desaulniers,
	Bill Wendling, Justin Stitt, Kees Cook, David Hildenbrand,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Shuah Khan, Jonathan Corbet,
	Shuah Khan, Yunseong Kim
  Cc: linux-kernel, kasan-dev, rust-for-linux, linux-kbuild, llvm,
	linux-mm, linux-kselftest, workflows, linux-doc, Yeoreum Yun,
	sashiko-bot

Introduce kcov_dataflow, a per-task dataflow tracking mechanism for function
arguments/return values at instrumented function boundaries.

Motivation
==========

First, coverage-guided kernel fuzzers use KCOV edge coverage as their
sole feedback signal. This cannot distinguish two executions of the same
function with different argument values. Fuzzers plateau on stateful
subsystems where security-critical behavior depends on runtime values
rather than control-flow topology.
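
To make the limitation concrete, here is a small user-space sketch. It models
edge coverage as one flag per basic block. The struct, the block IDs and
set_state() are hypothetical and not part of KCOV; the sketch only shows that
two calls with different arguments can produce identical coverage while the
stored value differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCKS 8

/* Hypothetical model of edge coverage: one flag per basic block. */
struct cov {
	bool hit[MAX_BLOCKS];
};

/* Hypothetical stateful target: stores the value, never branches on it. */
static uint64_t dev_state;

static void set_state(struct cov *c, uint64_t value)
{
	assert(c != NULL);
	c->hit[0] = true;	/* function entry block */
	dev_state = value;	/* state changes, control flow does not */
	c->hit[1] = true;	/* function exit block */
}

/* Returns true when both runs visited exactly the same blocks. */
static bool same_coverage(const struct cov *a, const struct cov *b)
{
	assert(a != NULL && b != NULL);
	for (int i = 0; i < MAX_BLOCKS; i++)
		if (a->hit[i] != b->hit[i])
			return false;
	return true;
}
```

Calling set_state() with 0x11 and then with 0x22 on two separate struct cov
instances leaves same_coverage() true although dev_state differs between the
runs. An argument record is the extra signal that tells them apart.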

Second, existing tracing tools address only parts of this challenge:

 1. Per-Task Wide-Scale Tracing Contexts (ftrace / kprobes / eBPF)

 Breakpoint instruction and redirection: hooks physically patch global kernel
 text. The kernel cannot selectively hook functions per task; every CPU core
 triggers the hook, deferring PID filtering to post-trigger logic.

 2. Rust for Linux Tracing Status

 rustc correctly emits -mfentry code stubs via its LLVM backend, enabling
 native integration with ftrace, function_graph, and eBPF trampolines
 (fentry/fexit). Metadata & Signature Analysis: funcgraph-args parses Rust
 via pahole BTF generation. However, idiomatic types like generics or slices
 are difficult to represent cleanly compared to standard repr(C) structs.

 3. Inline Function Tracing Limitations

 Tracing Visibility: Inlined code cannot be targeted via tracefs. Its runtime
 footprint is absorbed by the caller. Debugging requires explicit noinline (C)
 or #[inline(never)] (Rust) markers.

Approach
========

An LLVM SanitizerCoverage [1] pass inserts callbacks at function entry/exit
that record argument values into a per-task mmap'd ring buffer. The kernel
backend reads struct fields via copy_from_kernel_nofault(). When not enabled
for a task, the cost is a single boolean check.
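
A user-space sketch of that fast path, under stated assumptions: struct
df_task, df_record_entry() and the record layout (word 0 holds the number of
words used, followed by pc, argument count and arguments) are hypothetical and
are not taken from the patches. The sketch also drops a record when the buffer
is full instead of wrapping, and it does not model copy_from_kernel_nofault().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task state; the real fields live in task_struct. */
struct df_task {
	bool enabled;		/* the "single boolean check" */
	uint64_t *area;		/* models the mmap'd buffer, area[0] = words used */
	size_t size;		/* capacity of area in 64-bit words */
};

/*
 * Hypothetical entry callback. Appends [pc, nargs, args...] after the
 * words already used and drops the record if it does not fit.
 * Returns true when a record was written.
 */
static bool df_record_entry(struct df_task *t, uint64_t pc,
			    const uint64_t *args, size_t nargs)
{
	if (!t->enabled)	/* disabled task: nothing else is touched */
		return false;

	assert(t->area != NULL && t->size >= 1);

	size_t used = (size_t)t->area[0];
	size_t need = 2 + nargs;

	if (used + 1 + need > t->size)	/* +1 for the counter word itself */
		return false;

	uint64_t *rec = &t->area[1 + used];

	rec[0] = pc;
	rec[1] = nargs;
	for (size_t i = 0; i < nargs; i++)
		rec[2 + i] = args[i];

	t->area[0] = used + need;	/* the counter is updated last */
	return true;
}
```

With an 8-word buffer, one call with two arguments uses four words after the
counter. A second identical call no longer fits and is dropped.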

The system captures:
- Function argument values at entry (with automatic struct field expansion)
- Return values at exit
- Per-task isolation (no interference between processes)
- Both C and Rust kernel modules
- Instrumentation of inlined functions (default n)

For the C kernel module example, eight_args_c:

  vfs_write(0x0)
  0x0 = full_proxy_write()
  full_proxy_write(0x0, 0x1, 0x0)
  0x8200080 = __debugfs_file_get()
  __debugfs_file_get(0x0)
  0x0 = __debugfs_file_get()
  0x0 = trigger_write [eight_args_c]()
  trigger_write [eight_args_c](0x0, 0x1, 0x0)
    df_func2 [eight_args_c](0x11, 0x22)
    0x33 = df_func2 [eight_args_c]()
    df_func3 [eight_args_c](0x11, 0x22, 0x33)
    0x66 = df_func3 [eight_args_c]()
    df_func4 [eight_args_c](0x11, 0x22, 0x33, 0x44)
    0xaa = df_func4 [eight_args_c]()
    df_func5 [eight_args_c](0x11, 0x22, 0x33, 0x44, 0x55)
    0xff = df_func5 [eight_args_c]()
    df_func6 [eight_args_c](0x11, 0x22, 0x33, 0x44, 0x55, 0x66)
    0x165 = df_func6 [eight_args_c]()
    df_func7 [eight_args_c](0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77)
    0x1dc = df_func7 [eight_args_c]()
    df_func8 [eight_args_c](0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88)
    0x264 = df_func8 [eight_args_c]()
    df_func_struct [eight_args_c](0xaaaa)
    0x16665 = df_func_struct [eight_args_c]()
  0x1 = trigger_write [eight_args_c]()
  0x1 = full_proxy_write()
  0x1 = vfs_write()
  0x1 = ksys_write()
  0x1 = __x64_sys_write()
  0x0 = fpregs_assert_state_consistent()
  0xba5748 = __x64_sys_close()
  file_close_fd(0x4)
  0x0 = file_close_fd()
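
The return values in the df_func2..df_func8 lines are consistent with each
function returning the sum of its arguments (for example 0x11 + 0x22 = 0x33).
The module source is not shown in this message, so the helper below is only a
hypothetical way to check that reading of the trace. It is not the test
module's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical checker: sum the first n argument values from a trace line. */
static uint64_t sum_args(const uint64_t *args, size_t n)
{
	uint64_t sum = 0;

	assert(args != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		sum += args[i];
	return sum;
}

/* Argument values as they appear in the df_func8 entry line above. */
static const uint64_t trace_args[8] = {
	0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88,
};

/* Return values as they appear in the df_func2..df_func8 exit lines. */
static const uint64_t trace_rets[7] = {
	0x33, 0x66, 0xaa, 0xff, 0x165, 0x1dc, 0x264,
};

/* Returns 1 if every recorded return value equals the sum of its arguments. */
static int trace_matches_sum(void)
{
	for (size_t n = 2; n <= 8; n++)
		if (sum_args(trace_args, n) != trace_rets[n - 2])
			return 0;
	return 1;
}
```

trace_matches_sum() returns 1 for the values listed above, so the recorded
entry and exit lines agree with each other under that assumption.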

For the corresponding Rust kernel module example, eight_args_rust:

  ksys_write(0x0, 0x1)
    fdget_pos(0x4)
    0xffff891481d2bc00 = fdget_pos()
  0x0 = vfs_write()
  vfs_write(0x0, 0x1, 0x0)
  0x0 = _RNvCs3p16QzTwthP_15eight_args_rust13write_handler [eight_args_rust]()
  _RNvCs3p16QzTwthP_15eight_args_rust13write_handler [eight_args_rust](0x0, 0x1, 0x0)
    rdf_func2 [eight_args_rust](0x11, 0x22)
    0x33 = rdf_func2 [eight_args_rust]()
    rdf_func3 [eight_args_rust](0x11, 0x22, 0x33)
    0x66 = rdf_func3 [eight_args_rust]()
    rdf_func4 [eight_args_rust](0x11, 0x22, 0x33, 0x44)
    0xaa = rdf_func4 [eight_args_rust]()
    rdf_func5 [eight_args_rust](0x11, 0x22, 0x33, 0x44, 0x55)
    0xff = rdf_func5 [eight_args_rust]()
    rdf_func6 [eight_args_rust](0x11, 0x22, 0x33, 0x44, 0x55, 0x66)
    0x165 = rdf_func6 [eight_args_rust]()
    rdf_func7 [eight_args_rust](0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77)
    0x1dc = rdf_func7 [eight_args_rust]()
    rdf_func8 [eight_args_rust](0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88)
    0x264 = rdf_func8 [eight_args_rust]()
    rdf_func_struct [eight_args_rust](0xaaaa)
    0x16665 = rdf_func_struct [eight_args_rust]()
  0x1 = _RNvCs3p16QzTwthP_15eight_args_rust13write_handler [eight_args_rust]()
  0x1 = vfs_write()
  0x1 = ksys_write()
  0x1 = __x64_sys_write()
  0x0 = fpregs_assert_state_consistent()
  0xba5748 = __x64_sys_close()
  file_close_fd(0x4)
  0x0 = file_close_fd()
  0x0 = filp_flush()
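
The write handler appears under its Rust v0 mangled name. In that scheme
identifiers are length-prefixed, so 15eight_args_rust and 13write_handler
encode the crate and function names. The helper below is a deliberately narrow
sketch that only handles the _RNvC[s<base62>_]<len><crate><len><item> shape
seen in this trace. It is not a general demangler, and both function names are
hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy one <decimal-length><identifier> token to out; return the position after it, or NULL. */
static const char *take_ident(const char *p, char *out, size_t outsz)
{
	size_t len = 0;

	if (!isdigit((unsigned char)*p))
		return NULL;
	while (isdigit((unsigned char)*p) && len < outsz)
		len = len * 10 + (size_t)(*p++ - '0');
	if (len >= outsz || strlen(p) < len)
		return NULL;
	memcpy(out, p, len);
	out[len] = '\0';
	return p + len;
}

/*
 * Hypothetical helper: split a symbol of the shape described above into
 * crate and item names. Returns 0 on success, -1 for any other shape.
 */
static int split_rust_v0(const char *sym, char *crate, char *item, size_t sz)
{
	const char *p = sym;

	assert(sym != NULL && crate != NULL && item != NULL);
	if (strncmp(p, "_RNvC", 5) != 0)
		return -1;
	p += 5;
	if (*p == 's') {		/* optional crate disambiguator */
		p++;
		while (isalnum((unsigned char)*p))
			p++;
		if (*p++ != '_')
			return -1;
	}
	p = take_ident(p, crate, sz);
	if (!p)
		return -1;
	p = take_ident(p, item, sz);
	return (p && *p == '\0') ? 0 : -1;
}
```

For the symbol in the trace this yields the crate name eight_args_rust and the
item name write_handler. Unmangled C symbols such as vfs_write are rejected.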

Design
======

- Independent from existing /sys/kernel/debug/kcov
- Separate device: /sys/kernel/debug/kcov_dataflow
- Separate ioctl namespace ('d'), separate per-task buffer
- Lock-free write path: READ_ONCE/WRITE_ONCE (tested on x86_64/arm64)
- Safe pointer reads: copy_from_kernel_nofault()
- in_task() guard rejects interrupt/NMI context
- Per-module opt-in: KCOV_DATAFLOW_file.o := y
- Optional global: CONFIG_KCOV_DATAFLOW_INSTRUMENT_ALL
- Compiler flags: -fsanitize-coverage=trace-args,trace-ret
  (Kconfig uses cc-option to verify compiler support)
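
As an illustration of how a context check and a recursion guard could gate the
write path, here is a user-space model. All names are hypothetical:
in_task_ctx stands in for the kernel's in_task(), and the volatile accessors
only imitate READ_ONCE/WRITE_ONCE. The barriers added in patch 03 are not
modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Volatile accessors imitating READ_ONCE/WRITE_ONCE for a bool. */
static inline bool read_once_bool(const bool *p)
{
	return *(const volatile bool *)p;
}

static inline void write_once_bool(bool *p, bool v)
{
	*(volatile bool *)p = v;
}

/* Hypothetical per-task context. */
struct df_ctx {
	bool in_task_ctx;	/* false models interrupt/NMI context */
	bool in_callback;	/* recursion guard */
	unsigned int written;	/* records accepted */
	unsigned int dropped;	/* records rejected by a guard */
};

typedef void (*df_body_fn)(struct df_ctx *);

/* Accept a record only in task context and only when not already inside. */
static void df_write(struct df_ctx *c, df_body_fn body)
{
	assert(c != NULL);
	if (!c->in_task_ctx || read_once_bool(&c->in_callback)) {
		c->dropped++;
		return;
	}
	write_once_bool(&c->in_callback, true);
	c->written++;
	if (body)
		body(c);	/* models instrumented code reached from the callback */
	write_once_bool(&c->in_callback, false);
}

/* A body that re-enters the write path, as an instrumented helper would. */
static void reenter(struct df_ctx *c)
{
	df_write(c, NULL);
}
```

Calling df_write(&c, reenter) in task context accepts the outer record and
drops the nested one. With in_task_ctx set to false, every call is dropped.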

CI results:

  https://github.com/yskzalloc/kcov-dataflow/actions

Performance
===========

Per-module instrumentation (recording active):
  +8.3% on instrumented paths, ~27ns per callback

Global instrumentation (INSTRUMENT_ALL, recording disabled):
  .text: +9.5%, .data: +44%, boot: +71%, syscall latency: +133%

Prerequisites
=============

Requires a custom LLVM/Clang build with the trace-args/trace-ret passes:

  git clone --recursive --depth 1 --shallow-submodules \
    --jobs $(nproc) https://github.com/yskzalloc/kcov-dataflow.git
  cd kcov-dataflow

  cd llvm-project
  cmake -S llvm -B build -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DLLVM_ENABLE_LLD=ON \
    -DLLVM_ENABLE_PROJECTS="clang;lld" \
    -DLLVM_TARGETS_TO_BUILD="X86;AArch64"
  ninja -C build
  cd ..

Build and boot kernel (using virtme-ng):

  export PATH=$PWD/llvm-project/build/bin:$PATH
  export RUSTC=$PWD/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc
  export RUST_LIB_SRC=$PWD/rust/library
  cd linux
  # CONFIG_RUST=y is only needed for Rust kernel tracking.
  vng --build \
    --configitem CONFIG_KCOV=y \
    --configitem CONFIG_KCOV_DATAFLOW_ARGS=y \
    --configitem CONFIG_KCOV_DATAFLOW_RET=y \
    --configitem CONFIG_KCOV_DATAFLOW_INSTRUMENT_ALL=y \
    --configitem CONFIG_DEBUG_INFO=y \
    --configitem CONFIG_RUST=y \
    LLVM=1 CC=clang RUSTC=$RUSTC RUST_LIB_SRC=$RUST_LIB_SRC

Or without virtme-ng:

  cd linux
  make LLVM=1 CC=clang defconfig
  scripts/config --enable KCOV \
                 --enable KCOV_DATAFLOW_ARGS \
                 --enable KCOV_DATAFLOW_RET \
                 --enable KCOV_DATAFLOW_INSTRUMENT_ALL \
                 --enable DEBUG_INFO
  make LLVM=1 CC=clang olddefconfig
  make LLVM=1 CC=clang -j$(nproc)

For Rust module support, build rustc against the custom LLVM:

  https://github.com/yskzalloc/rust

Testing
=======

Tested on linux-next 7.1.0-rc6 (next-20260608) with custom clang/LLVM 23
and rustc 1.98-nightly. Verified on both x86_64 and arm64:

- user_ioctl: 9/9 tests pass (ioctl interface correctness: init, mmap,
  enable/disable, double-enable rejection, buffer capture verification)
- eight_args_c: nested call tree with df_func2..8 + struct (65 context records)
- eight_args_rust: nested call tree with rdf_func2..8 + struct (65 context records)
- rust_ffi_contract: detects FFI contract violation where callee returns
  success (0) but leaves buffer=NULL - captured without crash or KASAN
- binderfs: exercises binder driver via binderfs ioctls (BINDER_VERSION,
  BINDER_SET_MAX_THREADS) with kcov_dataflow recording active, verifies
  argument records captured at binder ioctl boundaries

Links
=====

[1] LLVM RFC: https://discourse.llvm.org/t/rfc-sanitizercoverage-add-fsanitize-coverage-trace-args-trace-ret/91026
[2] LLVM PR: https://github.com/llvm/llvm-project/pull/201410
[3] Repository: https://github.com/yskzalloc/kcov-dataflow
[4] Paper: https://arxiv.org/pdf/2606.00455

---
Change log:

Changes since v1 (https://lore.kernel.org/all/20260603-kcov-dataflow-next-20260603-v2-0-fee0939de2c4@est.tech/):
- Separate from /sys/kernel/debug/kcov (own device, own ioctl namespace)
- Rename internal symbols to avoid collision with existing kcov
- Add CONFIG_KCOV_DATAFLOW_INSTRUMENT_ALL for whole-kernel capture
- Fix INIT_TRACK race, fork cleanup, task exit cleanup
- Add recursion guard barriers
- Reject concurrent enable on multiple fds
- Move tests from tools to kselftest, adding:
  user_ioctl, eight_args_c, eight_args_rust, rust_ffi_contract, binderfs_test
- Split the kcov-dataflow documentation into a separate patch

To: Ingo Molnar <mingo@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
To: Juri Lelli <juri.lelli@redhat.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Steven Rostedt <rostedt@goodmis.org>
To: Ben Segall <bsegall@google.com>
To: Mel Gorman <mgorman@suse.de>
To: Valentin Schneider <vschneid@redhat.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
To: Andrey Konovalov <andreyknvl@gmail.com>
To: Alexander Potapenko <glider@google.com>
To: Dmitry Vyukov <dvyukov@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
To: Miguel Ojeda <ojeda@kernel.org>
To: Boqun Feng <boqun@kernel.org>
To: Gary Guo <gary@garyguo.net>
To: Björn Roy Baron <bjorn3_gh@protonmail.com>
To: Benno Lossin <lossin@kernel.org>
To: Andreas Hindborg <a.hindborg@kernel.org>
To: Alice Ryhl <aliceryhl@google.com>
To: Trevor Gross <tmgross@umich.edu>
To: Danilo Krummrich <dakr@kernel.org>
To: Nathan Chancellor <nathan@kernel.org>
To: Nicolas Schier <nsc@kernel.org>
To: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
To: Bill Wendling <morbo@google.com>
To: Justin Stitt <justinstitt@google.com>
To: Kees Cook <kees@kernel.org>
To: David Hildenbrand <david@kernel.org>
To: Lorenzo Stoakes <ljs@kernel.org>
To: "Liam R. Howlett" <liam@infradead.org>
To: Vlastimil Babka <vbabka@kernel.org>
To: Mike Rapoport <rppt@kernel.org>
To: Suren Baghdasaryan <surenb@google.com>
To: Michal Hocko <mhocko@suse.com>
To: Shuah Khan <shuah@kernel.org>
To: Jonathan Corbet <corbet@lwn.net>
To: Shuah Khan <skhan@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: kasan-dev@googlegroups.com
Cc: rust-for-linux@vger.kernel.org
Cc: linux-kbuild@vger.kernel.org
Cc: llvm@lists.linux.dev
Cc: linux-mm@kvack.org
Cc: linux-kselftest@vger.kernel.org
Cc: workflows@vger.kernel.org
Cc: linux-doc@vger.kernel.org

---
Yunseong Kim (14):
      kcov: add per-task dataflow tracking for function arguments/return values
      kcov: fix INIT_TRACK race in kcov_dataflow
      kcov: add barriers to recursion guard in kcov_df_write
      kcov: reject enable on multiple dataflow fds simultaneously
      kcov: clear dataflow fields on fork
      kcov: clean up dataflow state on task exit
      kcov: exclude kcov_dataflow.o from sanitizer instrumentation
      selftests/kcov_dataflow: add trigger-view.py
      selftests/kcov_dataflow: add ioctl interface selftest
      selftests/kcov_dataflow: add eight_args_c test module
      selftests/kcov_dataflow: add eight_args_rust test module
      selftests/kcov_dataflow: add rust_ffi_contract test module
      selftests/kcov_dataflow: add binderfs ioctl capture test
      Documentation: add kcov-dataflow.rst

 Documentation/dev-tools/index.rst                  |   1 +
 Documentation/dev-tools/kcov-dataflow.rst          | 321 ++++++++++++++++++
 include/linux/kcov.h                               |   8 +
 include/linux/sched.h                              |  10 +
 kernel/Makefile                                    |   9 +
 kernel/exit.c                                      |   1 +
 kernel/fork.c                                      |   1 +
 kernel/kcov.c                                      |   2 +
 kernel/kcov_dataflow.c                             | 356 +++++++++++++++++++
 lib/Kconfig.debug                                  |  43 +++
 rust/kernel/str.rs                                 |   2 +-
 scripts/Makefile.kcov                              |  12 +
 scripts/Makefile.lib                               |   9 +
 tools/testing/selftests/kcov_dataflow/.gitignore   |   9 +
 tools/testing/selftests/kcov_dataflow/Makefile     |   4 +
 tools/testing/selftests/kcov_dataflow/README.rst   |  58 ++++
 .../selftests/kcov_dataflow/binderfs/Makefile      |   4 +
 .../kcov_dataflow/binderfs/binderfs_test.c         | 177 ++++++++++
 .../selftests/kcov_dataflow/eight_args_c/Makefile  |   3 +
 .../kcov_dataflow/eight_args_c/eight_args_c.c      |  95 ++++++
 .../kcov_dataflow/eight_args_rust/Makefile         |   3 +
 .../eight_args_rust/eight_args_rust.rs             | 143 ++++++++
 .../selftests/kcov_dataflow/run_binderfs.sh        |  13 +
 .../selftests/kcov_dataflow/run_eight_args_c.sh    |  35 ++
 .../selftests/kcov_dataflow/run_eight_args_rust.sh |  35 ++
 .../kcov_dataflow/run_rust_ffi_contract.sh         |  35 ++
 .../kcov_dataflow/rust_ffi_contract/Makefile       |   3 +
 .../rust_ffi_contract/rust_ffi_contract.c          | 111 ++++++
 .../selftests/kcov_dataflow/trigger-view.py        | 377 +++++++++++++++++++++
 .../kcov_dataflow/user_ioctl/user_ioctl.c          | 156 +++++++++
 30 files changed, 2035 insertions(+), 1 deletion(-)
---
base-commit: a87737435cfa134f9cdcc696ba3080759d04cf72
change-id: 20260611-b4-kcov-dataflow-v2-3ccff828eb31

Best regards,
--  
Yunseong Kim <yunseong.kim@est.tech>



