From: Nicolas Schier <nsc@kernel.org>
To: Yunseong Kim <yunseong.kim@est.tech>
Cc: "Ingo Molnar" <mingo@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Juri Lelli" <juri.lelli@redhat.com>,
"Vincent Guittot" <vincent.guittot@linaro.org>,
"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Ben Segall" <bsegall@google.com>, "Mel Gorman" <mgorman@suse.de>,
"Valentin Schneider" <vschneid@redhat.com>,
"K Prateek Nayak" <kprateek.nayak@amd.com>,
"Dmitry Vyukov" <dvyukov@google.com>,
"Andrey Konovalov" <andreyknvl@gmail.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Nathan Chancellor" <nathan@kernel.org>,
"Nick Desaulniers" <nick.desaulniers+lkml@gmail.com>,
"Bill Wendling" <morbo@google.com>,
"Justin Stitt" <justinstitt@google.com>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Boqun Feng" <boqun@kernel.org>, "Gary Guo" <gary@garyguo.net>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Benno Lossin" <lossin@kernel.org>,
"Andreas Hindborg" <a.hindborg@kernel.org>,
"Alice Ryhl" <aliceryhl@google.com>,
"Trevor Gross" <tmgross@umich.edu>,
"Danilo Krummrich" <dakr@kernel.org>,
"Jonathan Corbet" <corbet@lwn.net>,
"Shuah Khan" <skhan@linuxfoundation.org>,
linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
llvm@lists.linux.dev, linux-kbuild@vger.kernel.org,
rust-for-linux@vger.kernel.org, workflows@vger.kernel.org,
linux-doc@vger.kernel.org, "Yunseong Kim" <ysk@kzalloc.com>
Subject: Re: [RFC PATCH v2 1/6] kcov: add per-task dataflow tracking for function arguments/return values
Date: Wed, 3 Jun 2026 21:25:19 +0200 [thread overview]
Message-ID: <aiB_nycHL-MLN-3g@levanger> (raw)
In-Reply-To: <20260603-kcov-dataflow-next-20260603-v2-1-fee0939de2c4@est.tech>
On Wed, Jun 03, 2026 at 07:43:28PM +0200, Yunseong Kim wrote:
> Add a new KCOV subsystem that captures function arguments at entry and
> return values at exit, with automatic struct field expansion using
> compiler-generated DebugInfo metadata.
>
> Key components:
> - CONFIG_KCOV_DATAFLOW_ARGS: enables argument capture
> - CONFIG_KCOV_DATAFLOW_RET: enables return value capture
> - /sys/kernel/debug/kcov_dataflow: separate device from legacy kcov
> - Ioctl namespace 'd' (KCOV_DF_INIT_TRACE, KCOV_DF_ENABLE, KCOV_DF_DISABLE)
> - Per-task buffer: task->kcov_df_area with atomic xadd reservation
> - Fault-tolerant: all reads via copy_from_kernel_nofault()
> - Recursion-safe: notrace __no_sanitize_coverage noinline
> - ERR_PTR aware: skips struct expansion for error pointers
>
> The callbacks (__sanitizer_cov_trace_args/ret) are inserted by the
> compiler when -fsanitize-coverage=dataflow-args,dataflow-ret is used.
> The Kconfig options depend on cc-option to verify compiler support.
>
> Buffer format (TLV records, all u64):
> area[0]: atomic word count
> [pos+0]: type_and_seq (0xE=entry, 0xF=return in upper 4 bits)
> [pos+1]: PC
> [pos+2]: meta (arg_idx | arg_size | ptr)
> [pos+3..N]: field values read via copy_from_kernel_nofault()
>
> This is completely independent from legacy /sys/kernel/debug/kcov.
> Existing users (syzkaller, oss-fuzz) are unaffected.
>
> Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
> ---
> include/linux/sched.h | 8 ++
> kernel/kcov.c | 291 ++++++++++++++++++++++++++++++++++++++++++++++++++
> lib/Kconfig.debug | 22 ++++
> 3 files changed, 321 insertions(+)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index c4433c185ad8..03be4b495f70 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1533,6 +1533,14 @@ struct task_struct {
> /* KCOV sequence number: */
> int kcov_sequence;
>
> + /* KCOV dataflow per-task sequence counter for TLV records: */
> + u32 kcov_dataflow_seq;
> +
> + /* KCOV dataflow: separate buffer for trace-args/trace-ret */
> + unsigned int kcov_df_size;
> + void *kcov_df_area;
> + bool kcov_df_enabled;
> +
> /* Collect coverage from softirq context: */
> unsigned int kcov_softirq;
> #endif
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index 1df373fb562b..d3c9c0efe961 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -353,6 +353,288 @@ void notrace __sanitizer_cov_trace_switch(kcov_u64 val, void *arg)
> EXPORT_SYMBOL(__sanitizer_cov_trace_switch);
> #endif /* ifdef CONFIG_KCOV_ENABLE_COMPARISONS */
>
> +#if defined(CONFIG_KCOV_DATAFLOW_ARGS) || defined(CONFIG_KCOV_DATAFLOW_RET)
> +/*
> + * KCOV Dataflow: /sys/kernel/debug/kcov_dataflow
> + *
> + * Completely separate from legacy /sys/kernel/debug/kcov.
> + * Own buffer, own ioctl, own mmap. No printk — buffer only.
> + *
> + * TLV record layout (all u64):
> + * area[0]: total u64 words written (atomic counter)
> + * [pos+0]: type_and_seq (0xE=entry|0xF=return in upper 4 bits, seq in lower 24)
> + * [pos+1]: PC
> + * [pos+2]: raw pointer | (arg_idx << 56) | (arg_size << 48) for entry
> + * [pos+3..N]: field values (or scalar value if num_fields=0)
> + */
> +#define KCOV_DF_TYPE_ENTRY 0xE0000000ULL
> +#define KCOV_DF_TYPE_RET 0xF0000000ULL
> +#define KCOV_DF_MAGIC_BAD 0xBADADD85ULL
> +#define KCOV_DF_IS_ERR(p) ((unsigned long)(p) >= (unsigned long)-4095UL)
> +
> +/* Ioctl commands for /sys/kernel/debug/kcov_dataflow */
> +#define KCOV_DF_INIT_TRACE _IOR('d', 1, unsigned long)
> +#define KCOV_DF_ENABLE _IO('d', 100)
> +#define KCOV_DF_DISABLE _IO('d', 101)
> +
> +struct kcov_dataflow {
> + refcount_t refcount;
> + spinlock_t lock;
> + unsigned int size; /* in u64 words */
> + void *area;
> + struct task_struct *t;
> +};
> +
> +static void kcov_df_put(struct kcov_dataflow *df)
> +{
> + if (refcount_dec_and_test(&df->refcount)) {
> + vfree(df->area);
> + kfree(df);
> + }
> +}
> +
> +/*
> + * Core write function — no printk, no locks, just atomic buffer write.
> + * Called from __sanitizer_cov_trace_args/ret in instrumented code.
> + */
> +static noinline notrace __no_sanitize_coverage void
> +kcov_df_write(u64 type_marker, u64 pc, u64 meta, void *ptr,
> + u64 *offsets, u32 num_fields)
> +{
> + struct task_struct *t = current;
> + u64 *area;
> + unsigned long pos, max_pos;
> + u32 record_len, seq, i;
> +
> + if (!t->kcov_df_enabled)
> + return;
> +
> + area = (u64 *)t->kcov_df_area;
> + if (!area)
> + return;
> +
> + max_pos = t->kcov_df_size;
> +
> + /* Record: header(1) + pc(1) + meta(1) + fields or scalar(max 1) */
> + record_len = 3 + (num_fields > 0 ? num_fields : 1);
> +
> + /* Atomic reservation */
> + pos = 1 + xadd((unsigned long *)&area[0], record_len);
> + if (unlikely(pos + record_len > max_pos)) {
> + xadd((unsigned long *)&area[0], -(long)record_len);
> + return;
> + }
> +
> + seq = ++t->kcov_dataflow_seq;
> + area[pos] = type_marker | (seq & 0x00FFFFFFULL);
> + area[pos + 1] = pc;
> + area[pos + 2] = meta;
> +
> + if (num_fields == 0) {
> + /* Scalar: read value from ptr using size from meta */
> + u64 val = 0;
> + u32 sz = (meta >> 48) & 0xFF;
> +
> + if (sz > sizeof(val))
> + sz = sizeof(val);
> + if (ptr && !KCOV_DF_IS_ERR(ptr))
> + copy_from_kernel_nofault(&val, ptr, sz);
> + area[pos + 3] = val;
> + } else {
> + /* Struct fields */
> + if (KCOV_DF_IS_ERR(ptr)) {
> + for (i = 0; i < num_fields; i++)
> + area[pos + 3 + i] = KCOV_DF_MAGIC_BAD;
> + return;
> + }
> + for (i = 0; i < num_fields; i++) {
> + u64 off, sz, val = KCOV_DF_MAGIC_BAD;
> + void *fa;
> +
> + if (copy_from_kernel_nofault(&off, &offsets[i * 2], sizeof(off)) ||
> + copy_from_kernel_nofault(&sz, &offsets[i * 2 + 1], sizeof(sz))) {
> + area[pos + 3 + i] = KCOV_DF_MAGIC_BAD;
> + continue;
> + }
> + fa = (void *)((unsigned long)ptr + off);
> + val = 0;
> + if (sz <= sizeof(val))
> + copy_from_kernel_nofault(&val, fa, sz);
> + else
> + copy_from_kernel_nofault(&val, fa, sizeof(val));
> + area[pos + 3 + i] = val;
> + }
> + }
> +}
> +
> +#ifdef CONFIG_KCOV_DATAFLOW_ARGS
> +noinline void notrace __no_sanitize_coverage
> +__sanitizer_cov_trace_args(u64 pc, u32 arg_idx, u32 arg_size, void *arg_ptr,
> + u64 *offsets, u32 num_fields);
> +
> +noinline void notrace __no_sanitize_coverage
> +__sanitizer_cov_trace_args(u64 pc, u32 arg_idx, u32 arg_size, void *arg_ptr,
> + u64 *offsets, u32 num_fields)
> +{
> + /* meta: [arg_idx(8) | arg_size(8) | ptr(48)] */
> + u64 meta = ((u64)arg_idx << 56) | ((u64)arg_size << 48) |
> + ((u64)(unsigned long)arg_ptr & 0xFFFFFFFFFFFFULL);
> + kcov_df_write(KCOV_DF_TYPE_ENTRY, pc, meta, arg_ptr,
> + offsets, num_fields);
> +}
> +EXPORT_SYMBOL(__sanitizer_cov_trace_args);
> +#endif
> +
> +#ifdef CONFIG_KCOV_DATAFLOW_RET
> +noinline void notrace __no_sanitize_coverage
> +__sanitizer_cov_trace_ret(u64 pc, u32 ret_size, void *ret_val,
> + u64 *offsets, u32 num_fields);
> +
> +noinline void notrace __no_sanitize_coverage
> +__sanitizer_cov_trace_ret(u64 pc, u32 ret_size, void *ret_val,
> + u64 *offsets, u32 num_fields)
> +{
> + u64 meta = ((u64)ret_size << 48) |
> + ((u64)(unsigned long)ret_val & 0xFFFFFFFFFFFFULL);
> + kcov_df_write(KCOV_DF_TYPE_RET, pc, meta, ret_val,
> + offsets, num_fields);
> +}
> +EXPORT_SYMBOL(__sanitizer_cov_trace_ret);
> +#endif
> +
> +/* --- /sys/kernel/debug/kcov_dataflow file operations --- */
> +
> +static int kcov_df_open(struct inode *inode, struct file *filep)
> +{
> + struct kcov_dataflow *df;
> +
> + df = kzalloc(sizeof(*df), GFP_KERNEL);
> + if (!df)
> + return -ENOMEM;
> + spin_lock_init(&df->lock);
> + refcount_set(&df->refcount, 1);
> + filep->private_data = df;
> + return nonseekable_open(inode, filep);
> +}
> +
> +static int kcov_df_close(struct inode *inode, struct file *filep)
> +{
> + struct kcov_dataflow *df = filep->private_data;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&df->lock, flags);
> + if (df->t == current) {
> + current->kcov_df_enabled = false;
> + current->kcov_df_area = NULL;
> + current->kcov_df_size = 0;
> + df->t = NULL;
> + }
> + spin_unlock_irqrestore(&df->lock, flags);
> + kcov_df_put(df);
> + return 0;
> +}
> +
> +static int kcov_df_mmap(struct file *filep, struct vm_area_struct *vma)
> +{
> + struct kcov_dataflow *df = filep->private_data;
> + unsigned long size, off;
> + struct page *page;
> + unsigned long flags;
> + void *area;
> + int res = 0;
> +
> + spin_lock_irqsave(&df->lock, flags);
> + size = df->size * sizeof(u64);
> + if (!df->area || vma->vm_pgoff != 0 ||
> + vma->vm_end - vma->vm_start != size) {
> + res = -EINVAL;
> + goto out;
> + }
> + area = df->area;
> + spin_unlock_irqrestore(&df->lock, flags);
> +
> + vm_flags_set(vma, VM_DONTEXPAND);
> + for (off = 0; off < size; off += PAGE_SIZE) {
> + page = vmalloc_to_page(area + off);
> + res = vm_insert_page(vma, vma->vm_start + off, page);
> + if (res)
> + return res;
> + }
> + return 0;
> +out:
> + spin_unlock_irqrestore(&df->lock, flags);
> + return res;
> +}
> +
> +static long kcov_df_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> +{
> + struct kcov_dataflow *df = filep->private_data;
> + unsigned long flags;
> + unsigned long size;
> + int res = 0;
> +
> + spin_lock_irqsave(&df->lock, flags);
> + switch (cmd) {
> + case KCOV_DF_INIT_TRACE:
> + if (df->area) {
> + res = -EBUSY;
> + break;
> + }
> + size = arg;
> + if (size < 2 || size > (128 << 20) / sizeof(u64)) {
> + res = -EINVAL;
> + break;
> + }
> + spin_unlock_irqrestore(&df->lock, flags);
> + df->area = vmalloc_user(size * sizeof(u64));
> + if (!df->area)
> + return -ENOMEM;
> + spin_lock_irqsave(&df->lock, flags);
> + df->size = size;
> + break;
> +
> + case KCOV_DF_ENABLE:
> + if (!df->area || df->t) {
> + res = -EINVAL;
> + break;
> + }
> + df->t = current;
> + current->kcov_df_area = df->area;
> + current->kcov_df_size = df->size;
> + current->kcov_dataflow_seq = 0;
> + /* Barrier before enabling */
> + barrier();
> + current->kcov_df_enabled = true;
> + break;
> +
> + case KCOV_DF_DISABLE:
> + if (df->t != current) {
> + res = -EINVAL;
> + break;
> + }
> + current->kcov_df_enabled = false;
> + barrier();
> + current->kcov_df_area = NULL;
> + current->kcov_df_size = 0;
> + df->t = NULL;
> + break;
> +
> + default:
> + res = -ENOTTY;
> + }
> + spin_unlock_irqrestore(&df->lock, flags);
> + return res;
> +}
> +
> +static const struct file_operations kcov_df_fops = {
> + .open = kcov_df_open,
> + .unlocked_ioctl = kcov_df_ioctl,
> + .compat_ioctl = kcov_df_ioctl,
> + .mmap = kcov_df_mmap,
> + .release = kcov_df_close,
> +};
> +#endif /* CONFIG_KCOV_DATAFLOW_ARGS || CONFIG_KCOV_DATAFLOW_RET */
> +
> static void kcov_start(struct task_struct *t, struct kcov *kcov,
> unsigned int size, void *area, enum kcov_mode mode,
> int sequence)
> @@ -1146,6 +1428,15 @@ static int __init kcov_init(void)
> */
> debugfs_create_file_unsafe("kcov", 0600, NULL, NULL, &kcov_fops);
>
> +#if defined(CONFIG_KCOV_DATAFLOW_ARGS) || defined(CONFIG_KCOV_DATAFLOW_RET)
> + /*
> + * Toggle verbose printk: echo 1 > /sys/kernel/debug/kcov_dataflow_verbose
> + * Default off — zero overhead when not debugging.
> + */
> + debugfs_create_file_unsafe("kcov_dataflow", 0600, NULL, NULL,
> + &kcov_df_fops);
> +#endif
> +
> #ifdef CONFIG_KCOV_SELFTEST
> selftest();
> #endif
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index e2f976c3301b..abd1a94589aa 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -2261,6 +2261,28 @@ config KCOV_SELFTEST
> On test failure, causes the kernel to panic. Recommended to be
> enabled, ensuring critical functionality works as intended.
>
> +
> +config KCOV_DATAFLOW_ARGS
> + bool "Enable KCOV dataflow: function argument capture"
> + depends on KCOV
> + depends on $(cc-option,-fsanitize-coverage=dataflow-args)
> + help
> + Captures function arguments at entry via /sys/kernel/debug/kcov_dataflow.
> + Struct pointer arguments are auto-expanded using compiler DebugInfo
> + metadata, recording individual field values at runtime.
> + Enable per-module with: KCOV_DATAFLOW_file.o := y in the Makefile.
> + Requires clang with -fsanitize-coverage=dataflow-args support.
> +
> +config KCOV_DATAFLOW_RET
> + bool "Enable KCOV dataflow: return value capture"
> + depends on KCOV
> + depends on $(cc-option,-fsanitize-coverage=dataflow-ret)
> + help
> + Captures function return values via /sys/kernel/debug/kcov_dataflow.
> + Struct pointer returns are auto-expanded using compiler DebugInfo
> + metadata, recording individual field values at runtime.
> + Enable per-module with: KCOV_DATAFLOW_file.o := y in the Makefile.
> + Requires clang with -fsanitize-coverage=dataflow-ret support.
You might want to add an empty line here.
--
Nicolas
next prev parent reply other threads:[~2026-06-05 17:23 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-03 17:43 [RFC PATCH v2 0/6] kcov: per-task dataflow extraction at kernel function boundaries Yunseong Kim
2026-06-03 17:43 ` [RFC PATCH v2 1/6] kcov: add per-task dataflow tracking for function arguments/return values Yunseong Kim
2026-06-03 19:25 ` Nicolas Schier [this message]
2026-06-04 8:41 ` Peter Zijlstra
2026-06-05 16:05 ` Alexander Potapenko
2026-06-03 17:43 ` [RFC PATCH v2 2/6] kcov: add build system support for dataflow instrumentation Yunseong Kim
2026-06-04 8:45 ` Peter Zijlstra
2026-06-04 21:48 ` Nathan Chancellor
2026-06-05 15:29 ` Alexander Potapenko
2026-06-03 17:43 ` [RFC PATCH v2 3/6] kcov: add CONFIG_KCOV_DATAFLOW_INSTRUMENT_ALL and NO_INLINE Yunseong Kim
2026-06-04 8:46 ` Peter Zijlstra
2026-06-03 17:43 ` [RFC PATCH v2 4/6] tools/kcov-dataflow: add userspace consumer and test modules Yunseong Kim
2026-06-05 15:19 ` Alexander Potapenko
2026-06-03 17:43 ` [RFC PATCH v2 5/6] kcov: add interrupt context guard to kcov_df_write() Yunseong Kim
2026-06-04 8:48 ` Peter Zijlstra
2026-06-03 17:43 ` [RFC PATCH v2 6/6] kcov: add recursion guard and documentation for kcov-dataflow Yunseong Kim
2026-06-04 8:52 ` Peter Zijlstra
2026-06-04 8:40 ` [RFC PATCH v2 0/6] kcov: per-task dataflow extraction at kernel function boundaries Peter Zijlstra
2026-06-04 9:29 ` Yunseong Kim
2026-06-05 16:20 ` Alexander Potapenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aiB_nycHL-MLN-3g@levanger \
--to=nsc@kernel.org \
--cc=a.hindborg@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=aliceryhl@google.com \
--cc=andreyknvl@gmail.com \
--cc=bjorn3_gh@protonmail.com \
--cc=boqun@kernel.org \
--cc=bsegall@google.com \
--cc=corbet@lwn.net \
--cc=dakr@kernel.org \
--cc=dietmar.eggemann@arm.com \
--cc=dvyukov@google.com \
--cc=gary@garyguo.net \
--cc=juri.lelli@redhat.com \
--cc=justinstitt@google.com \
--cc=kasan-dev@googlegroups.com \
--cc=kprateek.nayak@amd.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kbuild@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=llvm@lists.linux.dev \
--cc=lossin@kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=morbo@google.com \
--cc=nathan@kernel.org \
--cc=nick.desaulniers+lkml@gmail.com \
--cc=ojeda@kernel.org \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=rust-for-linux@vger.kernel.org \
--cc=skhan@linuxfoundation.org \
--cc=tmgross@umich.edu \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
--cc=workflows@vger.kernel.org \
--cc=ysk@kzalloc.com \
--cc=yunseong.kim@est.tech \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.