* [RFC PATCH v2 07/14] kcov: exclude kcov_dataflow.o from sanitizer instrumentation
From: Yunseong Kim @ 2026-06-11 16:21 UTC
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Andrey Konovalov,
Alexander Potapenko, Dmitry Vyukov, Andrew Morton, Miguel Ojeda,
Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
Nathan Chancellor, Nicolas Schier, Nick Desaulniers,
Bill Wendling, Justin Stitt, Kees Cook, David Hildenbrand,
Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, Jonathan Corbet,
Shuah Khan, Yunseong Kim
Cc: linux-kernel, kasan-dev, rust-for-linux, linux-kbuild, llvm,
linux-mm, linux-kselftest, workflows, linux-doc, Yeoreum Yun
In-Reply-To: <20260611-b4-kcov-dataflow-v2-v2-0-0a261da3987c@est.tech>
Exclude kcov_dataflow.o from KCOV, KASAN, KCSAN, UBSAN, and KMSAN
instrumentation, matching the exclusions already applied to kcov.o.
Without this, the dataflow callbacks would themselves be instrumented,
and a hook firing inside a callback could recurse indefinitely.
Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
---
kernel/Makefile | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel/Makefile b/kernel/Makefile
index b70e524c4074..307b7fd1e1f9 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -44,6 +44,12 @@ KCSAN_SANITIZE_kcov.o := n
UBSAN_SANITIZE_kcov.o := n
KMSAN_SANITIZE_kcov.o := n
+KCOV_INSTRUMENT_kcov_dataflow.o := n
+KASAN_SANITIZE_kcov_dataflow.o := n
+KCSAN_SANITIZE_kcov_dataflow.o := n
+UBSAN_SANITIZE_kcov_dataflow.o := n
+KMSAN_SANITIZE_kcov_dataflow.o := n
+
CONTEXT_ANALYSIS_kcov.o := y
CFLAGS_kcov.o := $(call cc-option, -fno-conserve-stack) -fno-stack-protector
--
2.43.0
* [RFC PATCH v2 08/14] selftests/kcov_dataflow: add trigger-view.py
From: Yunseong Kim @ 2026-06-11 16:21 UTC
In-Reply-To: <20260611-b4-kcov-dataflow-v2-v2-0-0a261da3987c@est.tech>
Add a Python script that loads a test module, enables kcov_dataflow
recording, writes to the module's debugfs trigger, then pretty-prints the
captured records as a nested call tree with kallsyms symbol resolution.

Features:
  - 8MB ring buffer (1M u64 words) for INSTRUMENT_ALL kernels
  - Enable recording after module load, before trigger (avoids VFS/loader
    noise)
  - Variable-length record parsing using header-encoded field count
  - Module-only filtering via kallsyms symbol lookup
  - --context/-C N: show N records before/after each module function call
  - --raw: print raw records instead of call tree
  - Architecture-aware syscall numbers (x86_64 and arm64)

Usage:
  python3 trigger-view.py eight_args_c \
      --ko eight_args_c/eight_args_c.ko
  python3 trigger-view.py eight_args_rust \
      --ko eight_args_rust/eight_args_rust.ko
  python3 trigger-view.py rust_ffi_contract \
      --ko rust_ffi_contract/rust_ffi_contract.ko

Cc: Alexander Potapenko <glider@google.com>
Assisted-by: Claude:claude-opus-4-6 [kiro-chat]
Link: https://github.com/yskzalloc/kcov-dataflow/actions
Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
---
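Note on the record layout: the variable-length parsing described above can be sketched in isolation. The bit positions below mirror parse_records() and print_raw() in this patch; decode_header(), decode_meta() and walk() are hypothetical helper names, and treating words[0] as the count of words written is an assumption taken from how the script reads buf[0]. Unlike the script, which keeps only the first value, this sketch collects every value word of a record.

```python
# Sketch only: field positions follow trigger-view.py in this patch.
DF_TYPE_ENTRY = 0xE
DF_TYPE_RET = 0xF


def decode_header(hdr):
    """Split a 32-bit record header into (type, value count, sequence)."""
    rtype = (hdr >> 28) & 0xF
    num_vals = (hdr >> 24) & 0xF
    if num_vals == 0:          # the script treats a count of 0 as 1
        num_vals = 1
    seq = hdr & 0x00FFFFFF
    return rtype, num_vals, seq


def decode_meta(meta):
    """Return (argument index, size) from the top two bytes of meta."""
    return (meta >> 56) & 0xFF, (meta >> 48) & 0xFF


def walk(words):
    """Yield (type, seq, pc, meta, values) for each plausible record."""
    total = words[0]           # assumed: number of words written so far
    pos = 1
    while pos + 3 <= total and pos + 3 < len(words):
        hdr = words[pos]
        rtype, num_vals, seq = decode_header(hdr)
        pc, meta = words[pos + 1], words[pos + 2]
        # Skip words that cannot be a header: upper 32 bits set,
        # unknown type, or a zero PC.
        if hdr >> 32 or rtype not in (DF_TYPE_ENTRY, DF_TYPE_RET) or pc == 0:
            pos += 1
            continue
        values = list(words[pos + 3:pos + 3 + num_vals])
        yield rtype, seq, pc, meta, values
        pos += 3 + num_vals
```

Each record therefore occupies 3 + num_vals words: header, PC, meta, then the captured values.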
.../selftests/kcov_dataflow/trigger-view.py | 377 +++++++++++++++++++++
1 file changed, 377 insertions(+)
diff --git a/tools/testing/selftests/kcov_dataflow/trigger-view.py b/tools/testing/selftests/kcov_dataflow/trigger-view.py
new file mode 100755
index 000000000000..a3274e472dc1
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/trigger-view.py
@@ -0,0 +1,377 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+"""
+trigger-view.py - Load a test module, trigger it with kcov_dataflow
+recording active, then pretty-print captured records.
+
+Usage:
+    python3 trigger-view.py eight_args_c
+    python3 trigger-view.py rust_ffi_contract
+    python3 trigger-view.py eight_args_c --raw
+
+The script:
+  1. Opens /sys/kernel/debug/kcov_dataflow
+  2. Inits and mmaps the buffer
+  3. Loads the module via finit_module()
+  4. Enables recording for this process
+  5. Writes to the module's debugfs trigger file
+  6. Disables recording and unloads the module
+  7. Parses and prints captured records with kallsyms resolution
+"""
+import os
+import sys
+import struct
+import ctypes
+import ctypes.util
+import argparse
+import fcntl
+
+# Constants
+DF_TYPE_ENTRY = 0xE
+DF_TYPE_RET = 0xF
+MAGIC_BAD = 0xBADADD85
+BUF_SIZE = 1048576 # 1M words = 8MB
+
+# Ioctl numbers
+def _IOR(t, nr, size):
+    return (2 << 30) | (ord(t) << 8) | nr | (size << 16)
+
+def _IO(t, nr):
+    return (ord(t) << 8) | nr
+
+KCOV_DF_INIT_TRACK = _IOR('d', 1, 8)
+KCOV_DF_ENABLE = _IO('d', 100)
+KCOV_DF_DISABLE = _IO('d', 101)
+
+# syscall numbers (x86_64 and arm64)
+import platform
+_machine = platform.machine()
+if _machine == "aarch64":
+    SYS_FINIT_MODULE = 273
+    SYS_DELETE_MODULE = 106
+else:  # x86_64
+    SYS_FINIT_MODULE = 313
+    SYS_DELETE_MODULE = 176
+
+SELFTEST_DIR = os.path.dirname(os.path.abspath(__file__))
+
+
+def load_kallsyms():
+    """Load kernel symbols for PC resolution."""
+    syms = []
+    try:
+        with open("/proc/kallsyms") as f:
+            for line in f:
+                parts = line.split()
+                if len(parts) >= 3:
+                    addr = int(parts[0], 16)
+                    name = parts[2]
+                    mod = parts[3].strip("[]") if len(parts) > 3 else ""
+                    syms.append((addr, name, mod))
+    except (PermissionError, FileNotFoundError):
+        pass
+    syms.sort()
+    return syms
+
+
+def symbolize(pc, syms):
+    """Find nearest symbol <= pc."""
+    if not syms:
+        return f"0x{pc:x}"
+    lo, hi = 0, len(syms) - 1
+    while lo < hi:
+        mid = (lo + hi + 1) // 2
+        if syms[mid][0] <= pc:
+            lo = mid
+        else:
+            hi = mid - 1
+    addr, name, mod = syms[lo]
+    if addr > pc:
+        return f"0x{pc:x}"
+    offset = pc - addr
+    if mod:
+        return f"{name}+0x{offset:x} [{mod}]" if offset else f"{name} [{mod}]"
+    return f"{name}+0x{offset:x}" if offset else name
+
+
+def format_val(v):
+    """Format a captured value."""
+    if v == MAGIC_BAD:
+        return "FAULT"
+    if v == 0:
+        return "0x0"
+    return f"0x{v:x}"
+
+
+def find_module(name):
+    """Find the .ko file for the given test name."""
+    ko_path = os.path.join(SELFTEST_DIR, name, f"{name}_mod.ko")
+    if os.path.exists(ko_path):
+        return ko_path
+    # Try without _mod suffix
+    ko_path = os.path.join(SELFTEST_DIR, name, f"{name}.ko")
+    if os.path.exists(ko_path):
+        return ko_path
+    # Search for any .ko in the directory
+    mod_dir = os.path.join(SELFTEST_DIR, name)
+    if os.path.isdir(mod_dir):
+        for f in os.listdir(mod_dir):
+            if f.endswith(".ko"):
+                return os.path.join(mod_dir, f)
+    return None
+
+
+def finit_module(ko_path):
+    """Load a kernel module via finit_module syscall."""
+    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
+    fd = os.open(ko_path, os.O_RDONLY)
+    ret = libc.syscall(SYS_FINIT_MODULE, fd, b"", 0)
+    os.close(fd)
+    if ret != 0:
+        errno = ctypes.get_errno()
+        raise OSError(errno, f"finit_module({ko_path}): {os.strerror(errno)}")
+
+
+def delete_module(name):
+    """Unload a kernel module."""
+    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
+    ret = libc.syscall(SYS_DELETE_MODULE, name.encode(), 0)
+    if ret != 0:
+        errno = ctypes.get_errno()
+        raise OSError(errno, f"delete_module({name}): {os.strerror(errno)}")
+
+
+def parse_records(buf, total_words):
+    """Parse the ring buffer into a list of records."""
+    records = []
+    pos = 1
+    while pos + 3 <= total_words and pos < BUF_SIZE:
+        hdr = buf[pos]
+
+        # Valid headers fit in 32 bits (upper 32 must be zero)
+        if hdr >> 32:
+            pos += 1
+            continue
+
+        rtype = (hdr >> 28) & 0xF
+
+        if rtype not in (DF_TYPE_ENTRY, DF_TYPE_RET):
+            pos += 1
+            continue
+
+        pc = buf[pos + 1]
+        meta = buf[pos + 2]
+        seq = hdr & 0x00FFFFFF
+        num_vals = (hdr >> 24) & 0xF
+        if num_vals == 0:
+            num_vals = 1
+
+        # Valid records always have a non-zero PC (kernel text address)
+        if pc == 0:
+            pos += 1
+            continue
+
+        val = buf[pos + 3] if pos + 3 < BUF_SIZE else 0
+        records.append({
+            "type": rtype,
+            "seq": seq,
+            "pc": pc,
+            "meta": meta,
+            "val": val,
+        })
+        pos += 3 + num_vals
+    return records
+
+
+def print_raw(records, syms):
+    """Print records in raw format."""
+    for r in records:
+        sym = symbolize(r["pc"], syms)
+        t = "ENTRY" if r["type"] == DF_TYPE_ENTRY else "RET "
+        arg_idx = (r["meta"] >> 56) & 0xFF
+        size = (r["meta"] >> 48) & 0xFF
+        print(f"[{t}] seq={r['seq']:3d} {sym} "
+              f"arg[{arg_idx}]({size}) = {format_val(r['val'])}")
+
+
+def print_tree(records, syms):
+    """Print records as indented call tree matching converted.txt format."""
+    depth = 0
+    # Group consecutive ENTRY records by PC to collect all args
+    i = 0
+    while i < len(records):
+        r = records[i]
+        sym = symbolize(r["pc"], syms)
+
+        if r["type"] == DF_TYPE_ENTRY:
+            # Collect all args for this call (same PC, consecutive entries)
+            args = []
+            pc = r["pc"]
+            while i < len(records) and records[i]["type"] == DF_TYPE_ENTRY \
+                    and records[i]["pc"] == pc:
+                args.append(format_val(records[i]["val"]))
+                i += 1
+            indent = " " * depth
+            print(f"{indent}{sym}({', '.join(args)})")
+            depth += 1
+        else:
+            depth = max(0, depth - 1)
+            indent = " " * depth
+            print(f"{indent}{format_val(r['val'])} = {sym}()")
+            i += 1
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Load a test module with kcov_dataflow and view records")
+    parser.add_argument("module", help="Test module name (e.g. eight_args_c)")
+    parser.add_argument("--raw", action="store_true",
+                        help="Print raw records instead of tree")
+    parser.add_argument("--ko", help="Explicit path to .ko file")
+    parser.add_argument("--context", "-C", type=int, default=0,
+                        help="Show N records before/after each module record")
+    args = parser.parse_args()
+
+    # Find module
+    if args.ko:
+        ko_path = args.ko
+    else:
+        ko_path = find_module(args.module)
+    if not ko_path or not os.path.exists(ko_path):
+        print(f"Cannot find module for '{args.module}'", file=sys.stderr)
+        print(f"Build it first: make LLVM=1 CC=clang "
+              f"M=tools/testing/selftests/kcov_dataflow/{args.module} modules",
+              file=sys.stderr)
+        sys.exit(1)
+
+    # Ensure kallsyms shows real addresses
+    try:
+        with open("/proc/sys/kernel/kptr_restrict", "w") as f:
+            f.write("0")
+    except (PermissionError, FileNotFoundError):
+        pass
+
+    # Open kcov_dataflow
+    try:
+        df_fd = os.open("/sys/kernel/debug/kcov_dataflow", os.O_RDWR)
+    except OSError as e:
+        print(f"Cannot open kcov_dataflow: {e}", file=sys.stderr)
+        sys.exit(1)
+
+    # Init + mmap
+    fcntl.ioctl(df_fd, KCOV_DF_INIT_TRACK, BUF_SIZE)
+    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
+    libc.mmap.restype = ctypes.c_void_p
+    libc.mmap.argtypes = [
+        ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
+        ctypes.c_int, ctypes.c_int, ctypes.c_long
+    ]
+    buf_ptr = libc.mmap(None, BUF_SIZE * 8, 0x3, 0x01, df_fd, 0)
+    if buf_ptr == ctypes.c_void_p(-1).value:
+        print("mmap failed", file=sys.stderr)
+        sys.exit(1)
+    buf = (ctypes.c_uint64 * BUF_SIZE).from_address(buf_ptr)
+
+    # Load module first (generates noise with INSTRUMENT_ALL)
+    mod_name = os.path.basename(ko_path).replace(".ko", "")
+    try:
+        finit_module(ko_path)
+        print(f"# Loaded {mod_name}")
+    except OSError as e:
+        print(f"Failed to load module: {e}", file=sys.stderr)
+        sys.exit(1)
+
+    # Get module .text address for PC filtering
+    mod_text_start = 0
+    try:
+        with open(f"/sys/module/{mod_name}/sections/.text") as f:
+            mod_text_start = int(f.read().strip(), 16)
+    except (FileNotFoundError, ValueError, PermissionError):
+        pass
+
+    # Enable recording AFTER load, BEFORE trigger (avoids VFS/loader noise)
+    fcntl.ioctl(df_fd, KCOV_DF_ENABLE, 0)
+    buf[0] = 0
+
+    # Trigger the module's debugfs file to invoke test functions
+    trigger_paths = [
+        "/sys/kernel/debug/kcov_dataflow_test/trigger",
+        "/sys/kernel/debug/kcov_dataflow_test/rust_ffi_trigger",
+        "/sys/kernel/debug/trigger_rust",
+        f"/sys/kernel/debug/{mod_name}/trigger",
+    ]
+    for tp in trigger_paths:
+        try:
+            with open(tp, "w") as f:
+                f.write("1")
+            break
+        except (FileNotFoundError, PermissionError):
+            continue
+
+    fcntl.ioctl(df_fd, KCOV_DF_DISABLE, 0)
+
+    # Read kallsyms while module is still loaded (symbols available)
+    syms = load_kallsyms()
+
+    # Unload
+    try:
+        delete_module(mod_name)
+    except OSError:
+        pass
+
+    # Parse and display
+    total = int(buf[0])
+    print(f"# Captured {total} words")
+    records = parse_records(buf, total)
+    print(f"# {len(records)} records")
+
+    # Filter to module records using kallsyms
+    # Build set of module symbol addresses for fast lookup
+    mod_syms = set()
+    for addr, name, mod in syms:
+        if mod == mod_name and addr != 0:
+            mod_syms.add(addr)
+
+    def is_module_pc(pc):
+        """Check if PC belongs to mod_name via kallsyms."""
+        if mod_syms:
+            # Binary search: find nearest symbol <= pc, check module
+            lo, hi = 0, len(syms) - 1
+            while lo < hi:
+                mid = (lo + hi + 1) // 2
+                if syms[mid][0] <= pc:
+                    lo = mid
+                else:
+                    hi = mid - 1
+            return syms[lo][2] == mod_name
+        # Fallback: if no module symbols (kptr_restrict), use .text start
+        return mod_text_start and pc >= mod_text_start
+
+    if syms or mod_text_start:
+        if args.context > 0:
+            module_indices = set()
+            for i, r in enumerate(records):
+                if is_module_pc(r["pc"]):
+                    for j in range(max(0, i - args.context),
+                                   min(len(records), i + args.context + 1)):
+                        module_indices.add(j)
+            records = [records[i] for i in sorted(module_indices)]
+            print(f"# showing {len(records)} records with context={args.context} "
+                  f"around {mod_name}\n")
+        else:
+            module_records = [r for r in records if is_module_pc(r["pc"])]
+            print(f"# {len(module_records)} from {mod_name}\n")
+            records = module_records
+    else:
+        print("")
+
+    if args.raw:
+        print_raw(records, syms)
+    else:
+        print_tree(records, syms)
+
+    os.close(df_fd)
+
+
+if __name__ == "__main__":
+    main()
--
2.43.0
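
Note on symbol lookup: the "nearest symbol <= pc" search that symbolize() and is_module_pc() both hand-roll in the patch above could probably be expressed with the standard bisect module. The sketch below is an alternative formulation, not what the patch does; make_symbolizer() is a hypothetical name, and the (addr, name, module) tuple shape is taken from load_kallsyms().

```python
import bisect


def make_symbolizer(syms):
    """Build a lookup over a sorted list of (addr, name, module) tuples."""
    addrs = [s[0] for s in syms]   # computed once, reused for every lookup

    def symbolize(pc):
        # Index of the last symbol whose address is <= pc, or -1 if none.
        i = bisect.bisect_right(addrs, pc) - 1
        if i < 0:
            return f"0x{pc:x}"
        addr, name, mod = syms[i]
        offset = pc - addr
        text = f"{name}+0x{offset:x}" if offset else name
        return f"{text} [{mod}]" if mod else text

    return symbolize
```

Sharing one helper like this would also let the module filter reuse the same index instead of repeating the binary search.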
* [RFC PATCH v2 09/14] selftests/kcov_dataflow: add ioctl interface selftest
From: Yunseong Kim @ 2026-06-11 16:21 UTC
In-Reply-To: <20260611-b4-kcov-dataflow-v2-v2-0-0a261da3987c@est.tech>
Add a kselftest_harness-based test in user_ioctl/ covering the
kcov_dataflow ioctl interface (9 TAP cases): init, mmap, enable,
disable, error paths, double-enable rejection, and record capture.

Test:
  make -C tools/testing/selftests/kcov_dataflow
  ./user_ioctl/user_ioctl

Result:
TAP version 13
1..9
# Starting 9 tests from 1 test cases.
# RUN kcov_dataflow.init_track ...
# OK kcov_dataflow.init_track
ok 1 kcov_dataflow.init_track
# RUN kcov_dataflow.init_track_too_small ...
# OK kcov_dataflow.init_track_too_small
ok 2 kcov_dataflow.init_track_too_small
# RUN kcov_dataflow.init_track_double ...
# OK kcov_dataflow.init_track_double
ok 3 kcov_dataflow.init_track_double
# RUN kcov_dataflow.mmap_before_init ...
# OK kcov_dataflow.mmap_before_init
ok 4 kcov_dataflow.mmap_before_init
# RUN kcov_dataflow.enable_disable ...
# OK kcov_dataflow.enable_disable
ok 5 kcov_dataflow.enable_disable
# RUN kcov_dataflow.enable_without_mmap ...
# OK kcov_dataflow.enable_without_mmap
ok 6 kcov_dataflow.enable_without_mmap
# RUN kcov_dataflow.disable_without_enable ...
# OK kcov_dataflow.disable_without_enable
ok 7 kcov_dataflow.disable_without_enable
# RUN kcov_dataflow.double_enable ...
# OK kcov_dataflow.double_enable
ok 8 kcov_dataflow.double_enable
# RUN kcov_dataflow.records_captured ...
# OK kcov_dataflow.records_captured
Cc: Alexander Potapenko <glider@google.com>
Assisted-by: Claude:claude-opus-4-6 [kiro-chat]
Link: https://github.com/yskzalloc/kcov-dataflow/actions
Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
---
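Note on the ioctl numbers: the three request codes used by this test (and by trigger-view.py) can be worked out by hand. The sketch below assumes the asm-generic ioctl bit layout used on x86_64 and arm64 (nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31); some other architectures encode the direction differently, so the values may not carry over. ioc() is a hypothetical helper name.

```python
# Sketch only: asm-generic ioctl encoding as used on x86_64 and arm64.
IOC_NRSHIFT = 0
IOC_TYPESHIFT = 8
IOC_SIZESHIFT = 16
IOC_DIRSHIFT = 30

IOC_NONE = 0
IOC_READ = 2


def ioc(direction, type_char, nr, size):
    """Compose an ioctl request number from its four fields."""
    return ((direction << IOC_DIRSHIFT) |
            (size << IOC_SIZESHIFT) |
            (ord(type_char) << IOC_TYPESHIFT) |
            (nr << IOC_NRSHIFT))


# _IOR('d', 1, unsigned long) with an 8-byte unsigned long
KCOV_DF_INIT_TRACK = ioc(IOC_READ, 'd', 1, 8)
# _IO('d', 100) and _IO('d', 101) carry no direction and no size
KCOV_DF_ENABLE = ioc(IOC_NONE, 'd', 100, 0)
KCOV_DF_DISABLE = ioc(IOC_NONE, 'd', 101, 0)

print(hex(KCOV_DF_INIT_TRACK), hex(KCOV_DF_ENABLE), hex(KCOV_DF_DISABLE))
```

Under those assumptions the values come out as 0x80086401, 0x6464 and 0x6465.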
tools/testing/selftests/kcov_dataflow/.gitignore | 8 ++
tools/testing/selftests/kcov_dataflow/Makefile | 3 +
tools/testing/selftests/kcov_dataflow/README.rst | 37 +++++
.../kcov_dataflow/user_ioctl/user_ioctl.c | 156 +++++++++++++++++++++
4 files changed, 204 insertions(+)
diff --git a/tools/testing/selftests/kcov_dataflow/.gitignore b/tools/testing/selftests/kcov_dataflow/.gitignore
new file mode 100644
index 000000000000..f71fc89580f8
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/.gitignore
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0
+user_ioctl/user_ioctl
+*.o
+*.ko
+*.mod
+*.mod.c
+Module.symvers
+modules.order
diff --git a/tools/testing/selftests/kcov_dataflow/Makefile b/tools/testing/selftests/kcov_dataflow/Makefile
new file mode 100644
index 000000000000..b9fc1c5f0104
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+TEST_GEN_PROGS := user_ioctl/user_ioctl
+include ../lib.mk
diff --git a/tools/testing/selftests/kcov_dataflow/README.rst b/tools/testing/selftests/kcov_dataflow/README.rst
new file mode 100644
index 000000000000..8b650a62acb1
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/README.rst
@@ -0,0 +1,37 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+KCOV-Dataflow Selftests
+========================
+
+This directory contains selftests for the KCOV-Dataflow subsystem
+(``/sys/kernel/debug/kcov_dataflow``).
+
+Prerequisites
+-------------
+
+Build the kernel with::
+
+ CONFIG_KCOV=y
+ CONFIG_KCOV_DATAFLOW_ARGS=y
+ CONFIG_KCOV_DATAFLOW_RET=y
+ CONFIG_DEBUG_INFO=y
+
+For full capture, also enable::
+
+ CONFIG_KCOV_DATAFLOW_INSTRUMENT_ALL=y
+
+Tests
+-----
+
+user_ioctl/user_ioctl.c
+ Automated ioctl interface test (9 TAP cases)::
+
+ make -C tools/testing/selftests/kcov_dataflow
+ ./user_ioctl/user_ioctl
+
+trigger-view.py
+ Loads a test module via finit_module() with recording active,
+ prints captured records with symbol resolution::
+
+ python3 trigger-view.py <module_name>
+ python3 trigger-view.py <module_name> --raw
diff --git a/tools/testing/selftests/kcov_dataflow/user_ioctl/user_ioctl.c b/tools/testing/selftests/kcov_dataflow/user_ioctl/user_ioctl.c
new file mode 100644
index 000000000000..48448bc02d2f
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/user_ioctl/user_ioctl.c
@@ -0,0 +1,156 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * user_ioctl.c - Selftest for /sys/kernel/debug/kcov_dataflow
+ *
+ * Verifies the ioctl interface: open, INIT_TRACK, mmap, ENABLE, DISABLE.
+ * With INSTRUMENT_ALL, also verifies that records are produced for
+ * syscalls executed while recording is active.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <string.h>
+#include <errno.h>
+
+#include "../../kselftest_harness.h"
+
+#define KCOV_DF_INIT_TRACK _IOR('d', 1, unsigned long)
+#define KCOV_DF_ENABLE _IO('d', 100)
+#define KCOV_DF_DISABLE _IO('d', 101)
+
+#define BUF_SIZE 65536
+
+#define DF_TYPE_ENTRY 0xE
+#define DF_TYPE_RET 0xF
+
+FIXTURE(kcov_dataflow) {
+ int fd;
+ uint64_t *buf;
+};
+
+FIXTURE_SETUP(kcov_dataflow)
+{
+ self->fd = open("/sys/kernel/debug/kcov_dataflow", O_RDWR);
+ if (self->fd < 0)
+ SKIP(return, "kcov_dataflow not available (need CONFIG_KCOV_DATAFLOW_ARGS)");
+ self->buf = MAP_FAILED;
+}
+
+FIXTURE_TEARDOWN(kcov_dataflow)
+{
+ if (self->buf != MAP_FAILED)
+ munmap(self->buf, BUF_SIZE * sizeof(uint64_t));
+ if (self->fd >= 0)
+ close(self->fd);
+}
+
+TEST_F(kcov_dataflow, init_track)
+{
+ int ret = ioctl(self->fd, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE);
+
+ ASSERT_EQ(0, ret);
+}
+
+TEST_F(kcov_dataflow, init_track_too_small)
+{
+ int ret = ioctl(self->fd, KCOV_DF_INIT_TRACK, 1UL);
+
+ ASSERT_EQ(-1, ret);
+ ASSERT_EQ(EINVAL, errno);
+}
+
+TEST_F(kcov_dataflow, init_track_double)
+{
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE));
+ ASSERT_EQ(-1, ioctl(self->fd, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE));
+ ASSERT_EQ(EBUSY, errno);
+}
+
+TEST_F(kcov_dataflow, mmap_before_init)
+{
+ self->buf = mmap(NULL, BUF_SIZE * sizeof(uint64_t),
+ PROT_READ | PROT_WRITE, MAP_SHARED, self->fd, 0);
+ ASSERT_EQ(MAP_FAILED, self->buf);
+}
+
+TEST_F(kcov_dataflow, enable_disable)
+{
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE));
+ self->buf = mmap(NULL, BUF_SIZE * sizeof(uint64_t),
+ PROT_READ | PROT_WRITE, MAP_SHARED, self->fd, 0);
+ ASSERT_NE(MAP_FAILED, self->buf);
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_ENABLE, 0));
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_DISABLE, 0));
+}
+
+TEST_F(kcov_dataflow, enable_without_mmap)
+{
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE));
+ /* enable works even without mmap (mmap is optional for setup) */
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_ENABLE, 0));
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_DISABLE, 0));
+}
+
+TEST_F(kcov_dataflow, disable_without_enable)
+{
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE));
+ ASSERT_EQ(-1, ioctl(self->fd, KCOV_DF_DISABLE, 0));
+ ASSERT_EQ(EINVAL, errno);
+}
+
+TEST_F(kcov_dataflow, double_enable)
+{
+ int fd2;
+
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE));
+ self->buf = mmap(NULL, BUF_SIZE * sizeof(uint64_t),
+ PROT_READ | PROT_WRITE, MAP_SHARED, self->fd, 0);
+ ASSERT_NE(MAP_FAILED, self->buf);
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_ENABLE, 0));
+
+ /* Second fd should fail to enable (task already active) */
+ fd2 = open("/sys/kernel/debug/kcov_dataflow", O_RDWR);
+ ASSERT_GE(fd2, 0);
+ ASSERT_EQ(0, ioctl(fd2, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE));
+ ASSERT_EQ(-1, ioctl(fd2, KCOV_DF_ENABLE, 0));
+ ASSERT_EQ(EBUSY, errno);
+ close(fd2);
+
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_DISABLE, 0));
+}
+
+TEST_F(kcov_dataflow, records_captured)
+{
+ uint64_t count;
+
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_INIT_TRACK, (unsigned long)BUF_SIZE));
+ self->buf = mmap(NULL, BUF_SIZE * sizeof(uint64_t),
+ PROT_READ | PROT_WRITE, MAP_SHARED, self->fd, 0);
+ ASSERT_NE(MAP_FAILED, self->buf);
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_ENABLE, 0));
+
+ /* Trigger some kernel code in this task */
+ getpid();
+
+ ASSERT_EQ(0, ioctl(self->fd, KCOV_DF_DISABLE, 0));
+
+ count = self->buf[0];
+ /*
+ * With INSTRUMENT_ALL, getpid() produces records.
+ * Without it, count may be 0 (no instrumented code).
+ * Either way, the interface works correctly.
+ */
+ if (count > 0) {
+ uint64_t hdr = self->buf[1];
+ unsigned int type = (hdr >> 28) & 0xF;
+
+ /* First record should be ENTRY or RET */
+ ASSERT_TRUE(type == DF_TYPE_ENTRY || type == DF_TYPE_RET);
+ }
+}
+
+TEST_HARNESS_MAIN
--
2.43.0
* [RFC PATCH v2 10/14] selftests/kcov_dataflow: add eight_args_c test module
From: Yunseong Kim @ 2026-06-11 16:21 UTC
In-Reply-To: <20260611-b4-kcov-dataflow-v2-v2-0-0a261da3987c@est.tech>
Add a C module exercising 1-8 argument functions plus a struct pointer.
It verifies capture of register-passed (1-6) and stack-passed (7-8)
arguments on x86_64.
Test:
make LLVM=1 CC=clang \
M=tools/testing/selftests/kcov_dataflow/eight_args_c modules
vng --user root --exec \
"python3 tools/testing/selftests/kcov_dataflow/trigger-view.py \
eight_args_c -C 8 --ko \
tools/testing/selftests/kcov_dataflow/eight_args_c/eight_args_c.ko"
Result:
# Loaded eight_args_c
# Captured 6195 words
# 578 records
# showing 65 records with context=8 around eight_args_c
vfs_write(0x0)
0x0 = full_proxy_write()
full_proxy_write(0x0, 0x1, 0x0)
0x8200080 = __debugfs_file_get()
__debugfs_file_get(0x0)
0x0 = __debugfs_file_get()
0x0 = trigger_write [eight_args_c]()
trigger_write [eight_args_c](0x0, 0x1, 0x0)
df_func2 [eight_args_c](0x11, 0x22)
0x33 = df_func2 [eight_args_c]()
df_func3 [eight_args_c](0x11, 0x22, 0x33)
0x66 = df_func3 [eight_args_c]()
df_func4 [eight_args_c](0x11, 0x22, 0x33, 0x44)
0xaa = df_func4 [eight_args_c]()
df_func5 [eight_args_c](0x11, 0x22, 0x33, 0x44, 0x55)
0xff = df_func5 [eight_args_c]()
df_func6 [eight_args_c](0x11, 0x22, 0x33, 0x44, 0x55, 0x66)
0x165 = df_func6 [eight_args_c]()
df_func7 [eight_args_c](0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77)
0x1dc = df_func7 [eight_args_c]()
df_func8 [eight_args_c](0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88)
0x264 = df_func8 [eight_args_c]()
df_func_struct [eight_args_c](0xaaaa)
0x16665 = df_func_struct [eight_args_c]()
0x1 = trigger_write [eight_args_c]()
0x1 = full_proxy_write()
0x1 = vfs_write()
0x1 = ksys_write()
0x1 = __x64_sys_write()
0x0 = fpregs_assert_state_consistent()
0xba5748 = __x64_sys_close()
file_close_fd(0x4)
0x0 = file_close_fd()
Cc: Alexander Potapenko <glider@google.com>
Assisted-by: Claude:claude-opus-4-6 [kiro-chat]
Link: https://github.com/yskzalloc/kcov-dataflow/actions
Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
---
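Note on checking the trace: every df_funcN in this module returns the sum of its arguments, so the return values in the result above can be cross-checked with a few lines of arithmetic. The sketch below only recomputes the expected values from the constants passed in trigger_write(); expected_return() is a hypothetical helper name.

```python
# Sketch only: constants copied from trigger_write() in eight_args_c.c.
ARGS = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88]
PAIR_X, PAIR_Y = 0xAAAA, 0xBBBB


def expected_return(n):
    """df_funcN returns the sum of its first n arguments."""
    return sum(ARGS[:n])


expected = {f"df_func{n}": expected_return(n) for n in range(1, 9)}
expected["df_func_struct"] = PAIR_X + PAIR_Y

for name, value in expected.items():
    print(f"{name}: 0x{value:x}")
```

The computed values for df_func2 through df_func8 (0x33, 0x66, 0xaa, 0xff, 0x165, 0x1dc, 0x264) and for df_func_struct (0x16665) agree with the captured return values listed in the result.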
tools/testing/selftests/kcov_dataflow/Makefile | 1 +
tools/testing/selftests/kcov_dataflow/README.rst | 6 ++
.../selftests/kcov_dataflow/eight_args_c/Makefile | 3 +
.../kcov_dataflow/eight_args_c/eight_args_c.c | 95 ++++++++++++++++++++++
.../selftests/kcov_dataflow/run_eight_args_c.sh | 35 ++++++++
5 files changed, 140 insertions(+)
diff --git a/tools/testing/selftests/kcov_dataflow/Makefile b/tools/testing/selftests/kcov_dataflow/Makefile
index b9fc1c5f0104..3a42c54e954d 100644
--- a/tools/testing/selftests/kcov_dataflow/Makefile
+++ b/tools/testing/selftests/kcov_dataflow/Makefile
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
TEST_GEN_PROGS := user_ioctl/user_ioctl
+TEST_PROGS := run_eight_args_c.sh
include ../lib.mk
diff --git a/tools/testing/selftests/kcov_dataflow/README.rst b/tools/testing/selftests/kcov_dataflow/README.rst
index 8b650a62acb1..e93b4e573504 100644
--- a/tools/testing/selftests/kcov_dataflow/README.rst
+++ b/tools/testing/selftests/kcov_dataflow/README.rst
@@ -35,3 +35,9 @@ trigger-view.py
python3 trigger-view.py <module_name>
python3 trigger-view.py <module_name> --raw
+
+eight_args_c/
+ C module with 1-8 argument functions + struct pointer::
+
+ make LLVM=1 CC=clang M=tools/testing/selftests/kcov_dataflow/eight_args_c modules
+ python3 trigger-view.py eight_args_c
diff --git a/tools/testing/selftests/kcov_dataflow/eight_args_c/Makefile b/tools/testing/selftests/kcov_dataflow/eight_args_c/Makefile
new file mode 100644
index 000000000000..aad45c7e3863
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/eight_args_c/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-m := eight_args_c.o
+KCOV_DATAFLOW_eight_args_c.o := y
diff --git a/tools/testing/selftests/kcov_dataflow/eight_args_c/eight_args_c.c b/tools/testing/selftests/kcov_dataflow/eight_args_c/eight_args_c.c
new file mode 100644
index 000000000000..09fbbbf8d14b
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/eight_args_c/eight_args_c.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * eight_args_c.c - Verify kcov_dataflow captures 1-8 argument functions.
+ *
+ * Write to /sys/kernel/debug/kcov_dataflow_test/trigger to invoke all
+ * eight functions and a struct-pointer function. Use with the
+ * kcov_dataflow selftest to verify correct capture of register-passed
+ * (1-6) and stack-passed (7-8) arguments on x86_64.
+ */
+#include <linux/module.h>
+#include <linux/debugfs.h>
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("KCOV dataflow 8-argument stress test");
+
+struct pair {
+ u32 x;
+ u32 y;
+};
+
+/* Prototypes */
+u64 df_func1(u64 a1);
+u64 df_func2(u64 a1, u64 a2);
+u64 df_func3(u64 a1, u64 a2, u64 a3);
+u64 df_func4(u64 a1, u64 a2, u64 a3, u64 a4);
+u64 df_func5(u64 a1, u64 a2, u64 a3, u64 a4, u64 a5);
+u64 df_func6(u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6);
+u64 df_func7(u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6, u64 a7);
+u64 df_func8(u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6, u64 a7,
+ u64 a8);
+u64 df_func_struct(struct pair *p);
+
+/* Implementations - noinline ensures trace callbacks are emitted */
+#define DEF_FUNC(name, ret_expr, ...) \
+noinline u64 name(__VA_ARGS__) { return (ret_expr); } \
+EXPORT_SYMBOL(name)
+
+DEF_FUNC(df_func1, a1, u64 a1);
+DEF_FUNC(df_func2, a1 + a2, u64 a1, u64 a2);
+DEF_FUNC(df_func3, a1 + a2 + a3, u64 a1, u64 a2, u64 a3);
+DEF_FUNC(df_func4, a1 + a2 + a3 + a4, u64 a1, u64 a2, u64 a3, u64 a4);
+DEF_FUNC(df_func5, a1 + a2 + a3 + a4 + a5,
+ u64 a1, u64 a2, u64 a3, u64 a4, u64 a5);
+DEF_FUNC(df_func6, a1 + a2 + a3 + a4 + a5 + a6,
+ u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6);
+DEF_FUNC(df_func7, a1 + a2 + a3 + a4 + a5 + a6 + a7,
+ u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6, u64 a7);
+DEF_FUNC(df_func8, a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8,
+ u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6, u64 a7, u64 a8);
+
+noinline u64 df_func_struct(struct pair *p)
+{
+ return (u64)p->x + (u64)p->y;
+}
+EXPORT_SYMBOL(df_func_struct);
+
+static struct dentry *test_dir;
+
+static ssize_t trigger_write(struct file *f, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct pair p = { .x = 0xAAAA, .y = 0xBBBB };
+ volatile u64 sum = 0;
+
+ sum += df_func1(0x11);
+ sum += df_func2(0x11, 0x22);
+ sum += df_func3(0x11, 0x22, 0x33);
+ sum += df_func4(0x11, 0x22, 0x33, 0x44);
+ sum += df_func5(0x11, 0x22, 0x33, 0x44, 0x55);
+ sum += df_func6(0x11, 0x22, 0x33, 0x44, 0x55, 0x66);
+ sum += df_func7(0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77);
+ sum += df_func8(0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88);
+ sum += df_func_struct(&p);
+
+ return count;
+}
+
+static const struct file_operations trigger_fops = {
+ .write = trigger_write,
+};
+
+static int __init eight_args_init(void)
+{
+ test_dir = debugfs_create_dir("kcov_dataflow_test", NULL);
+ debugfs_create_file("trigger", 0200, test_dir, NULL, &trigger_fops);
+ return 0;
+}
+
+static void __exit eight_args_exit(void)
+{
+ debugfs_remove_recursive(test_dir);
+}
+
+module_init(eight_args_init);
+module_exit(eight_args_exit);
diff --git a/tools/testing/selftests/kcov_dataflow/run_eight_args_c.sh b/tools/testing/selftests/kcov_dataflow/run_eight_args_c.sh
new file mode 100755
index 000000000000..d24092e920ff
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/run_eight_args_c.sh
@@ -0,0 +1,35 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Test eight_args_c module capture via kcov_dataflow
+DIR="$(dirname "$0")"
+KO="$DIR/eight_args_c/eight_args_c.ko"
+
+if [ ! -f "$KO" ]; then
+ echo "SKIP: $KO not found"
+ echo "Build: make LLVM=1 CC=clang M=...eight_args_c modules"
+ exit 4 # kselftest SKIP
+fi
+
+if [ ! -e /sys/kernel/debug/kcov_dataflow ]; then
+ echo "SKIP: kcov_dataflow not available"
+ exit 4
+fi
+
+OUTPUT=$(python3 "$DIR/trigger-view.py" eight_args_c --ko "$KO" --raw 2>&1)
+RC=$?
+
+if [ $RC -ne 0 ]; then
+ echo "FAIL: trigger-view.py exited with $RC"
+ echo "$OUTPUT"
+ exit 1
+fi
+
+RECORDS=$(echo "$OUTPUT" | grep -c "^\[ENTRY\]\|^\[RET")
+if [ "$RECORDS" -gt 0 ]; then
+ echo "PASS: captured $RECORDS records from eight_args_c"
+ exit 0
+else
+ echo "FAIL: no records captured"
+ echo "$OUTPUT"
+ exit 1
+fi
--
2.43.0
^ permalink raw reply related
* [RFC PATCH v2 11/14] selftests/kcov_dataflow: add eight_args_rust test module
From: Yunseong Kim @ 2026-06-11 16:21 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Andrey Konovalov,
Alexander Potapenko, Dmitry Vyukov, Andrew Morton, Miguel Ojeda,
Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
Nathan Chancellor, Nicolas Schier, Nick Desaulniers,
Bill Wendling, Justin Stitt, Kees Cook, David Hildenbrand,
Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, Jonathan Corbet,
Shuah Khan, Yunseong Kim
Cc: linux-kernel, kasan-dev, rust-for-linux, linux-kbuild, llvm,
linux-mm, linux-kselftest, workflows, linux-doc, Yeoreum Yun
In-Reply-To: <20260611-b4-kcov-dataflow-v2-v2-0-0a261da3987c@est.tech>
Rust module exercising 1-8 argument functions plus struct pointer.
Verifies register-passed (1-6) and stack-passed (7-8) arguments.
Test:
make LLVM=1 CC=clang RUSTC=$RUSTC RUST_LIB_SRC=$RUST_LIB_SRC \
M=tools/testing/selftests/kcov_dataflow/eight_args_rust modules
vng --user root --exec \
"python3 tools/testing/selftests/kcov_dataflow/trigger-view.py \
eight_args_rust -C 8 --ko \
tools/testing/selftests/kcov_dataflow/eight_args_rust/eight_args_rust.ko"
Result:
ksys_write(0x0, 0x1)
fdget_pos(0x4)
0xffff891481d2bc00 = fdget_pos()
0x0 = vfs_write()
vfs_write(0x0, 0x1, 0x0)
0x0 = _RNvCs3p16QzTwthP_15eight_args_rust13write_handler [eight_args_rust]()
_RNvCs3p16QzTwthP_15eight_args_rust13write_handler [eight_args_rust](0x0, 0x1, 0x0)
rdf_func2 [eight_args_rust](0x11, 0x22)
0x33 = rdf_func2 [eight_args_rust]()
rdf_func3 [eight_args_rust](0x11, 0x22, 0x33)
0x66 = rdf_func3 [eight_args_rust]()
rdf_func4 [eight_args_rust](0x11, 0x22, 0x33, 0x44)
0xaa = rdf_func4 [eight_args_rust]()
rdf_func5 [eight_args_rust](0x11, 0x22, 0x33, 0x44, 0x55)
0xff = rdf_func5 [eight_args_rust]()
rdf_func6 [eight_args_rust](0x11, 0x22, 0x33, 0x44, 0x55, 0x66)
0x165 = rdf_func6 [eight_args_rust]()
rdf_func7 [eight_args_rust](0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77)
0x1dc = rdf_func7 [eight_args_rust]()
rdf_func8 [eight_args_rust](0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88)
0x264 = rdf_func8 [eight_args_rust]()
rdf_func_struct [eight_args_rust](0xaaaa)
0x16665 = rdf_func_struct [eight_args_rust]()
0x1 = _RNvCs3p16QzTwthP_15eight_args_rust13write_handler [eight_args_rust]()
0x1 = vfs_write()
0x1 = ksys_write()
0x1 = __x64_sys_write()
0x0 = fpregs_assert_state_consistent()
0xba5748 = __x64_sys_close()
file_close_fd(0x4)
0x0 = file_close_fd()
0x0 = filp_flush()
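As a cross-check on the trace above, the return values follow from the
arguments: write_handler calls each rdf_funcN with 0x11, 0x22, ..., N * 0x11,
and rdf_func_struct adds the two Pair fields. A minimal sketch in C; the
helper names expected_ret() and expected_struct_ret() are hypothetical and
not part of this patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of this patch): expected return value of
 * rdf_funcN when called with 0x11, 0x22, ..., N * 0x11 as in write_handler.
 */
static uint64_t expected_ret(unsigned int nargs)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = 1; i <= nargs; i++)
		sum += (uint64_t)i * 0x11;
	return sum;
}

/* Expected return of rdf_func_struct for a Pair with fields x and y. */
static uint64_t expected_struct_ret(uint32_t x, uint32_t y)
{
	return (uint64_t)x + (uint64_t)y;
}
```

For example, expected_ret(8) gives 0x264 and expected_struct_ret(0xAAAA,
0xBBBB) gives 0x16665, matching the rdf_func8 and rdf_func_struct lines in
the trace.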
Cc: Alexander Potapenko <glider@google.com>
Assisted-by: Claude:claude-opus-4-6 [kiro-chat]
Link: https://github.com/yskzalloc/kcov-dataflow/actions
Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
---
tools/testing/selftests/kcov_dataflow/README.rst | 7 +
.../kcov_dataflow/eight_args_rust/Makefile | 3 +
.../eight_args_rust/eight_args_rust.rs | 143 +++++++++++++++++++++
.../selftests/kcov_dataflow/run_eight_args_rust.sh | 35 +++++
4 files changed, 188 insertions(+)
diff --git a/tools/testing/selftests/kcov_dataflow/README.rst b/tools/testing/selftests/kcov_dataflow/README.rst
index e93b4e573504..61a41f3bd596 100644
--- a/tools/testing/selftests/kcov_dataflow/README.rst
+++ b/tools/testing/selftests/kcov_dataflow/README.rst
@@ -41,3 +41,10 @@ eight_args_c/
make LLVM=1 CC=clang M=tools/testing/selftests/kcov_dataflow/eight_args_c modules
python3 trigger-view.py eight_args_c
+
+eight_args_rust/
+ Rust equivalent of eight_args_c. Captures arguments at -O2 where
+ drgn/vmcore cannot. Requires CONFIG_RUST::
+
+ make LLVM=1 CC=clang M=tools/testing/selftests/kcov_dataflow/eight_args_rust modules
+ python3 trigger-view.py eight_args_rust
diff --git a/tools/testing/selftests/kcov_dataflow/eight_args_rust/Makefile b/tools/testing/selftests/kcov_dataflow/eight_args_rust/Makefile
new file mode 100644
index 000000000000..c1e9ea2c5622
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/eight_args_rust/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-m := eight_args_rust.o
+KCOV_DATAFLOW_eight_args_rust.o := y
diff --git a/tools/testing/selftests/kcov_dataflow/eight_args_rust/eight_args_rust.rs b/tools/testing/selftests/kcov_dataflow/eight_args_rust/eight_args_rust.rs
new file mode 100644
index 000000000000..3026265cda97
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/eight_args_rust/eight_args_rust.rs
@@ -0,0 +1,143 @@
+// SPDX-License-Identifier: GPL-2.0
+//! Verify kcov_dataflow captures 1-8 argument Rust functions at -O2.
+//!
+//! This is the Rust equivalent of eight_args_c. Since rustc elides DWARF
+//! variable locations at -O2, drgn/vmcore cannot observe these arguments.
+//! kcov_dataflow captures them via the post-compilation pipeline.
+//!
+//! Write to /sys/kernel/debug/trigger_rust to invoke.
+
+#![allow(missing_docs)]
+
+use kernel::prelude::*;
+use kernel::c_str;
+
+module! {
+ type: EightArgsRust,
+ name: "eight_args_rust",
+ authors: ["kcov-dataflow"],
+ description: "1-8 arg Rust verification for kcov_dataflow",
+ license: "GPL",
+}
+
+#[repr(C)]
+pub struct Pair {
+ pub x: u32,
+ pub y: u32,
+}
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func1(a1: u64) -> u64 { a1 }
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func2(a1: u64, a2: u64) -> u64 { a1 + a2 }
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func3(a1: u64, a2: u64, a3: u64) -> u64 {
+ a1 + a2 + a3
+}
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func4(a1: u64, a2: u64, a3: u64, a4: u64) -> u64 {
+ a1 + a2 + a3 + a4
+}
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func5(a1: u64, a2: u64, a3: u64, a4: u64, a5: u64) -> u64 {
+ a1 + a2 + a3 + a4 + a5
+}
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func6(
+ a1: u64, a2: u64, a3: u64, a4: u64, a5: u64, a6: u64,
+) -> u64 {
+ a1 + a2 + a3 + a4 + a5 + a6
+}
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func7(
+ a1: u64, a2: u64, a3: u64, a4: u64, a5: u64, a6: u64, a7: u64,
+) -> u64 {
+ a1 + a2 + a3 + a4 + a5 + a6 + a7
+}
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func8(
+ a1: u64, a2: u64, a3: u64, a4: u64, a5: u64, a6: u64, a7: u64, a8: u64,
+) -> u64 {
+ a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8
+}
+
+#[no_mangle]
+#[inline(never)]
+pub extern "C" fn rdf_func_struct(p: *const Pair) -> u64 {
+ unsafe { (*p).x as u64 + (*p).y as u64 }
+}
+
+unsafe extern "C" fn write_handler(
+ _file: *mut kernel::bindings::file,
+ _buf: *const core::ffi::c_char,
+ count: usize,
+ _ppos: *mut kernel::bindings::loff_t,
+) -> kernel::ffi::c_long {
+ let p = Pair { x: 0xAAAA, y: 0xBBBB };
+
+ let mut sum: u64 = 0;
+ sum = sum.wrapping_add(rdf_func1(0x11));
+ sum = sum.wrapping_add(rdf_func2(0x11, 0x22));
+ sum = sum.wrapping_add(rdf_func3(0x11, 0x22, 0x33));
+ sum = sum.wrapping_add(rdf_func4(0x11, 0x22, 0x33, 0x44));
+ sum = sum.wrapping_add(rdf_func5(0x11, 0x22, 0x33, 0x44, 0x55));
+ sum = sum.wrapping_add(rdf_func6(0x11, 0x22, 0x33, 0x44, 0x55, 0x66));
+ sum = sum.wrapping_add(rdf_func7(0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77));
+ sum = sum.wrapping_add(rdf_func8(0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88));
+ sum = sum.wrapping_add(rdf_func_struct(&p as *const Pair));
+ core::hint::black_box(sum);
+
+ count as kernel::ffi::c_long
+}
+
+#[repr(transparent)]
+struct SyncFops(kernel::bindings::file_operations);
+unsafe impl Sync for SyncFops {}
+
+static FOPS: SyncFops = SyncFops(kernel::bindings::file_operations {
+ write: Some(unsafe { core::mem::transmute(write_handler as *const ()) }),
+ ..unsafe { core::mem::zeroed() }
+});
+
+struct EightArgsRust {
+ d: *mut kernel::bindings::dentry,
+}
+
+impl kernel::Module for EightArgsRust {
+ fn init(_module: &'static ThisModule) -> Result<Self> {
+ let d = unsafe {
+ kernel::bindings::debugfs_create_file_unsafe(
+ c_str!("trigger_rust").as_char_ptr(),
+ 0o222,
+ core::ptr::null_mut(),
+ core::ptr::null_mut(),
+ &FOPS.0,
+ )
+ };
+ Ok(Self { d })
+ }
+}
+
+impl Drop for EightArgsRust {
+ fn drop(&mut self) {
+ unsafe { kernel::bindings::debugfs_remove(self.d) };
+ }
+}
+
+unsafe impl Send for EightArgsRust {}
+unsafe impl Sync for EightArgsRust {}
diff --git a/tools/testing/selftests/kcov_dataflow/run_eight_args_rust.sh b/tools/testing/selftests/kcov_dataflow/run_eight_args_rust.sh
new file mode 100755
index 000000000000..c5f11866e19d
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/run_eight_args_rust.sh
@@ -0,0 +1,35 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Test eight_args_rust module capture via kcov_dataflow
+DIR="$(dirname "$0")"
+KO="$DIR/eight_args_rust/eight_args_rust.ko"
+
+if [ ! -f "$KO" ]; then
+ echo "SKIP: $KO not found"
+ echo "Build: make LLVM=1 CC=clang RUSTC=\$RUSTC M=...eight_args_rust modules"
+ exit 4 # kselftest SKIP
+fi
+
+if [ ! -e /sys/kernel/debug/kcov_dataflow ]; then
+ echo "SKIP: kcov_dataflow not available"
+ exit 4
+fi
+
+OUTPUT=$(python3 "$DIR/trigger-view.py" eight_args_rust --ko "$KO" --raw 2>&1)
+RC=$?
+
+if [ $RC -ne 0 ]; then
+ echo "FAIL: trigger-view.py exited with $RC"
+ echo "$OUTPUT"
+ exit 1
+fi
+
+RECORDS=$(echo "$OUTPUT" | grep -c "^\[ENTRY\]\|^\[RET")
+if [ "$RECORDS" -gt 0 ]; then
+ echo "PASS: captured $RECORDS records from eight_args_rust"
+ exit 0
+else
+ echo "FAIL: no records captured"
+ echo "$OUTPUT"
+ exit 1
+fi
--
2.43.0
^ permalink raw reply related
* [RFC PATCH v2 12/14] selftests/kcov_dataflow: add rust_ffi_contract test module
From: Yunseong Kim @ 2026-06-11 16:21 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Andrey Konovalov,
Alexander Potapenko, Dmitry Vyukov, Andrew Morton, Miguel Ojeda,
Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
Nathan Chancellor, Nicolas Schier, Nick Desaulniers,
Bill Wendling, Justin Stitt, Kees Cook, David Hildenbrand,
Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, Jonathan Corbet,
Shuah Khan, Yunseong Kim
Cc: linux-kernel, kasan-dev, rust-for-linux, linux-kbuild, llvm,
linux-mm, linux-kselftest, workflows, linux-doc, Yeoreum Yun
In-Reply-To: <20260611-b4-kcov-dataflow-v2-v2-0-0a261da3987c@est.tech>
Demonstrates FFI contract violation detection. A C callee returns
success (0) but leaves buffer=NULL, violating the postcondition
"ret==0 implies buffer!=NULL". kcov_dataflow captures struct fields
at the boundary proving the violation without a crash or KASAN report.
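The postcondition can be stated as a predicate over one observed call. The
following is only a sketch: struct ffi_observation and ffi_contract_holds()
are hypothetical names for an already-decoded record pair, not the record
format or any API added by this series:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, already-decoded view of one call: the callee's return
 * value and the buffer field observed at the next function boundary.
 */
struct ffi_observation {
	int64_t ret;
	uint64_t buffer;
};

/* Postcondition: ret == 0 implies buffer != NULL. */
static bool ffi_contract_holds(const struct ffi_observation *obs)
{
	return obs->ret != 0 || obs->buffer != 0;
}
```

An observation with ret == 0 and buffer == 0, as produced by the async path
in this module, fails the predicate; an error return with a NULL buffer does
not.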
Test:
make LLVM=1 CC=clang \
M=tools/testing/selftests/kcov_dataflow/rust_ffi_contract modules
vng --user root --exec \
"python3 tools/testing/selftests/kcov_dataflow/trigger-view.py \
rust_ffi_contract -C 8 --ko \
tools/testing/selftests/kcov_dataflow/rust_ffi_contract/rust_ffi_contract.ko"
Result:
vfs_write(0x0)
0x0 = full_proxy_write()
full_proxy_write(0x0, 0x1, 0x0)
0x8200080 = __debugfs_file_get()
__debugfs_file_get(0x0)
0x0 = __debugfs_file_get()
0x0 = rust_ffi_trigger_write [rust_ffi_contract]()
rust_ffi_trigger_write [rust_ffi_contract](0x0, 0x1, 0x0)
ffi_alloc_buf [rust_ffi_contract](0xffffffff912288ad, 0x100, 0x0, 0x1)
0x0 = ffi_alloc_buf [rust_ffi_contract]()
_printk(0x6f635f6966663601)
vprintk(0x6f635f6966663601, 0x8)
vprintk_default(0x6f635f6966663601, 0x8)
vprintk_emit(0x0, 0xffffffff, 0x0)
0x0 = panic_on_this_cpu()
0x0 = _prb_read_valid()
0x0 = prb_read_valid()
0x0 = console_unlock()
0x3f = vprintk_emit()
0x3f = vprintk_default()
0x3f = vprintk()
0x3f = _printk()
ffi_check_result [rust_ffi_contract](0x0)
_printk(0x6f635f6966663301)
vprintk(0x6f635f6966663301, 0x8)
vprintk_default(0x6f635f6966663301, 0x8)
vprintk_emit(0x0, 0xffffffff, 0x0)
0x0 = panic_on_this_cpu()
0x0 = _prb_read_valid()
0x0 = prb_read_valid()
0x0 = console_unlock()
0x3f = vprintk_emit()
0x3f = vprintk_default()
0x3f = vprintk()
0x3f = _printk()
0xfffffff2 = ffi_check_result [rust_ffi_contract]()
0x1 = rust_ffi_trigger_write [rust_ffi_contract]()
0x1 = full_proxy_write()
0x1 = vfs_write()
0x1 = ksys_write()
0x1 = __x64_sys_write()
0x0 = fpregs_assert_state_consistent()
0xba5748 = __x64_sys_close()
file_close_fd(0x4)
0x0 = file_close_fd()
Cc: Alexander Potapenko <glider@google.com>
Assisted-by: Claude:claude-opus-4-6 [kiro-chat]
Link: https://github.com/yskzalloc/kcov-dataflow/actions
Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
---
tools/testing/selftests/kcov_dataflow/Makefile | 2 +-
tools/testing/selftests/kcov_dataflow/README.rst | 8 ++
.../kcov_dataflow/run_rust_ffi_contract.sh | 35 +++++++
.../kcov_dataflow/rust_ffi_contract/Makefile | 3 +
.../rust_ffi_contract/rust_ffi_contract.c | 111 +++++++++++++++++++++
5 files changed, 158 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kcov_dataflow/Makefile b/tools/testing/selftests/kcov_dataflow/Makefile
index 3a42c54e954d..6412c90edfa1 100644
--- a/tools/testing/selftests/kcov_dataflow/Makefile
+++ b/tools/testing/selftests/kcov_dataflow/Makefile
@@ -1,4 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
TEST_GEN_PROGS := user_ioctl/user_ioctl
-TEST_PROGS := run_eight_args_c.sh
+TEST_PROGS := run_eight_args_c.sh run_rust_ffi_contract.sh
include ../lib.mk
diff --git a/tools/testing/selftests/kcov_dataflow/README.rst b/tools/testing/selftests/kcov_dataflow/README.rst
index 61a41f3bd596..06a0c805cc74 100644
--- a/tools/testing/selftests/kcov_dataflow/README.rst
+++ b/tools/testing/selftests/kcov_dataflow/README.rst
@@ -48,3 +48,11 @@ eight_args_rust/
make LLVM=1 CC=clang M=tools/testing/selftests/kcov_dataflow/eight_args_rust modules
python3 trigger-view.py eight_args_rust
+
+rust_ffi_contract/
+ Demonstrates FFI contract violation detection. A callee returns
+ success but leaves buffer=NULL. kcov_dataflow captures struct
+ fields proving the violation::
+
+ make LLVM=1 CC=clang M=tools/testing/selftests/kcov_dataflow/rust_ffi_contract modules
+ python3 trigger-view.py rust_ffi_contract
diff --git a/tools/testing/selftests/kcov_dataflow/run_rust_ffi_contract.sh b/tools/testing/selftests/kcov_dataflow/run_rust_ffi_contract.sh
new file mode 100755
index 000000000000..8662e532296b
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/run_rust_ffi_contract.sh
@@ -0,0 +1,35 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Test rust_ffi_contract module capture via kcov_dataflow
+DIR="$(dirname "$0")"
+KO="$DIR/rust_ffi_contract/rust_ffi_contract.ko"
+
+if [ ! -f "$KO" ]; then
+ echo "SKIP: $KO not found"
+ echo "Build: make LLVM=1 CC=clang M=...rust_ffi_contract modules"
+ exit 4 # kselftest SKIP
+fi
+
+if [ ! -e /sys/kernel/debug/kcov_dataflow ]; then
+ echo "SKIP: kcov_dataflow not available"
+ exit 4
+fi
+
+OUTPUT=$(python3 "$DIR/trigger-view.py" rust_ffi_contract --ko "$KO" --raw 2>&1)
+RC=$?
+
+if [ $RC -ne 0 ]; then
+ echo "FAIL: trigger-view.py exited with $RC"
+ echo "$OUTPUT"
+ exit 1
+fi
+
+RECORDS=$(echo "$OUTPUT" | grep -c "^\[ENTRY\]\|^\[RET")
+if [ "$RECORDS" -gt 0 ]; then
+ echo "PASS: captured $RECORDS records from rust_ffi_contract"
+ exit 0
+else
+ echo "FAIL: no records captured"
+ echo "$OUTPUT"
+ exit 1
+fi
diff --git a/tools/testing/selftests/kcov_dataflow/rust_ffi_contract/Makefile b/tools/testing/selftests/kcov_dataflow/rust_ffi_contract/Makefile
new file mode 100644
index 000000000000..d2a0261070b1
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/rust_ffi_contract/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-m := rust_ffi_contract.o
+KCOV_DATAFLOW_rust_ffi_contract.o := y
diff --git a/tools/testing/selftests/kcov_dataflow/rust_ffi_contract/rust_ffi_contract.c b/tools/testing/selftests/kcov_dataflow/rust_ffi_contract/rust_ffi_contract.c
new file mode 100644
index 000000000000..9cbb17c42195
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/rust_ffi_contract/rust_ffi_contract.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * rust_ffi_contract.c - Demonstrates kcov_dataflow detecting an FFI
+ * contract violation at a function boundary.
+ *
+ * The pattern: caller passes a struct pointer to callee. Callee's
+ * contract says "returns 0 implies out->buffer is valid". A bug in
+ * the async path returns 0 but leaves buffer=NULL.
+ *
+ * kcov_dataflow captures:
+ * [ENTRY] ffi_alloc_buf(alloc={.buffer=NULL, .data_size=0}, 256, 16, 1)
+ * [RET] ffi_alloc_buf() = 0
+ * [ENTRY] ffi_check_result(alloc={.buffer=NULL, ...})
+ * ^ proves contract violated
+ *
+ * Write to /sys/kernel/debug/kcov_dataflow_test/rust_ffi_trigger to run.
+ */
+#include <linux/module.h>
+#include <linux/debugfs.h>
+#include <linux/slab.h>
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("FFI contract violation detection via kcov_dataflow");
+
+struct ffi_alloc {
+ void *buffer;
+ u64 data_size;
+ u32 free_async;
+ u32 flags;
+};
+
+/* Prototypes */
+int ffi_alloc_buf(struct ffi_alloc *alloc, u64 data_size,
+ u64 offsets_size, int is_async);
+int ffi_check_result(struct ffi_alloc *alloc);
+
+/*
+ * Callee with contract: returns 0 implies alloc->buffer is valid.
+ * BUG: async path with free_async==0 returns 0 but buffer stays NULL.
+ */
+noinline int ffi_alloc_buf(struct ffi_alloc *alloc, u64 data_size,
+ u64 offsets_size, int is_async)
+{
+ if (!is_async) {
+ alloc->buffer = kmalloc(data_size, GFP_KERNEL);
+ if (!alloc->buffer)
+ return -ENOMEM;
+ return 0;
+ }
+ /* BUG: returns success but buffer is NULL when pool empty */
+ if (alloc->free_async == 0) {
+ alloc->buffer = NULL;
+ return 0; /* contract violation */
+ }
+ alloc->buffer = kmalloc(data_size, GFP_KERNEL);
+ alloc->free_async--;
+ return 0;
+}
+EXPORT_SYMBOL(ffi_alloc_buf);
+
+/* Caller that trusts the contract */
+noinline int ffi_check_result(struct ffi_alloc *alloc)
+{
+ if (!alloc->buffer) {
+ pr_err("ffi_contract: VIOLATION detected - buffer is NULL after success\n");
+ return -EFAULT;
+ }
+ kfree(alloc->buffer);
+ return 0;
+}
+EXPORT_SYMBOL(ffi_check_result);
+
+static struct dentry *test_dir;
+
+static ssize_t rust_ffi_trigger_write(struct file *f, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct ffi_alloc alloc = { .buffer = NULL, .data_size = 0,
+ .free_async = 0, .flags = 0 };
+ int ret;
+
+ /* Trigger the bug: is_async=1, free_async=0 */
+ ret = ffi_alloc_buf(&alloc, 256, 16, 1);
+ pr_info("ffi_contract: ffi_alloc_buf returned %d, buffer=%p\n",
+ ret, alloc.buffer);
+
+ if (ret == 0)
+ ffi_check_result(&alloc);
+
+ return count;
+}
+
+static const struct file_operations rust_ffi_trigger_fops = {
+ .write = rust_ffi_trigger_write,
+};
+
+static int __init ffi_contract_init(void)
+{
+ test_dir = debugfs_create_dir("kcov_dataflow_test", NULL);
+ debugfs_create_file("rust_ffi_trigger", 0200, test_dir, NULL,
+ &rust_ffi_trigger_fops);
+ return 0;
+}
+
+static void __exit ffi_contract_exit(void)
+{
+ debugfs_remove_recursive(test_dir);
+}
+
+module_init(ffi_contract_init);
+module_exit(ffi_contract_exit);
--
2.43.0
^ permalink raw reply related
* [RFC PATCH v2 13/14] selftests/kcov_dataflow: add binderfs ioctl capture test
From: Yunseong Kim @ 2026-06-11 16:21 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Andrey Konovalov,
Alexander Potapenko, Dmitry Vyukov, Andrew Morton, Miguel Ojeda,
Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
Nathan Chancellor, Nicolas Schier, Nick Desaulniers,
Bill Wendling, Justin Stitt, Kees Cook, David Hildenbrand,
Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, Jonathan Corbet,
Shuah Khan, Yunseong Kim
Cc: linux-kernel, kasan-dev, rust-for-linux, linux-kbuild, llvm,
linux-mm, linux-kselftest, workflows, linux-doc, Yeoreum Yun
In-Reply-To: <20260611-b4-kcov-dataflow-v2-v2-0-0a261da3987c@est.tech>
Exercise the binder driver via binderfs with kcov_dataflow recording
active. Verifies that function argument records are captured at binder
ioctl boundaries (BINDER_VERSION, BINDER_SET_MAX_THREADS).
Requires CONFIG_ANDROID_BINDER_IPC=y and CONFIG_ANDROID_BINDERFS=y.
Gracefully skips if binderfs is not available.
Build and test:
export PATH=$PWD/../llvm-project/build/bin:$PATH
vng --build \
--configitem CONFIG_KCOV=y \
--configitem CONFIG_KCOV_DATAFLOW_ARGS=y \
--configitem CONFIG_KCOV_DATAFLOW_RET=y \
--configitem CONFIG_KCOV_DATAFLOW_INSTRUMENT_ALL=y \
--configitem CONFIG_DEBUG_INFO=y \
--configitem CONFIG_ANDROID_BINDER_IPC=y \
--configitem CONFIG_ANDROID_BINDERFS=y \
LLVM=1 CC=clang
make -C tools/testing/selftests/kcov_dataflow/binderfs
vng --user root --exec \
tools/testing/selftests/kcov_dataflow/binderfs/binderfs_test
Result:
TAP version 13
1..3
ok 1 kcov_dataflow.binderfs_setup
ok 2 kcov_dataflow.binderfs_captured # 636 words
ok 3 kcov_dataflow.binderfs_valid_records
# Totals: pass:3 fail:0 skip:0
#
# Captured call records:
# ENTRY pc=0xffffffff... arg=0x4 (fd)
# ENTRY pc=0xffffffff... arg=0xc0046209 (BINDER_VERSION)
# ENTRY pc=0xffffffff... arg=0x0 (binder_get_thread)
# RET pc=0xffffffff... ret=0x0 (success)
# ENTRY pc=0xffffffff... arg=0x40046205 (SET_MAX_THREADS)
# ENTRY pc=0xffffffff... arg=0x4 (_copy_from_user size)
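The two command words in the records above can be decoded by hand. The
sketch below assumes the asm-generic _IOC() bit layout (which x86 uses); the
ioc_*() helper names are illustrative and not part of the selftest:

```c
#include <stdint.h>

/*
 * Field layout of an ioctl command word in the asm-generic _IOC()
 * encoding: nr in bits 0-7, type in bits 8-15, size in bits 16-29,
 * direction in bits 30-31 (1 = write, 2 = read, 3 = both).
 * The ioc_*() helper names are illustrative only.
 */
static unsigned int ioc_nr(uint32_t cmd)
{
	return cmd & 0xff;
}

static unsigned int ioc_type(uint32_t cmd)
{
	return (cmd >> 8) & 0xff;
}

static unsigned int ioc_size(uint32_t cmd)
{
	return (cmd >> 16) & 0x3fff;
}

static unsigned int ioc_dir(uint32_t cmd)
{
	return (cmd >> 30) & 0x3;
}
```

With this layout, 0xc0046209 decodes to type 'b', nr 9, size 4, direction
read and write (BINDER_VERSION), and 0x40046205 to type 'b', nr 5, size 4,
direction write (BINDER_SET_MAX_THREADS).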
Cc: Alexander Potapenko <glider@google.com>
Assisted-by: Claude:claude-opus-4-6 [kiro-chat]
Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
---
tools/testing/selftests/kcov_dataflow/.gitignore | 1 +
.../selftests/kcov_dataflow/binderfs/Makefile | 4 +
.../kcov_dataflow/binderfs/binderfs_test.c | 177 +++++++++++++++++++++
.../selftests/kcov_dataflow/run_binderfs.sh | 13 ++
4 files changed, 195 insertions(+)
diff --git a/tools/testing/selftests/kcov_dataflow/.gitignore b/tools/testing/selftests/kcov_dataflow/.gitignore
index f71fc89580f8..da4c189ad3be 100644
--- a/tools/testing/selftests/kcov_dataflow/.gitignore
+++ b/tools/testing/selftests/kcov_dataflow/.gitignore
@@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
user_ioctl/user_ioctl
+binderfs/binderfs_test
*.o
*.ko
*.mod
diff --git a/tools/testing/selftests/kcov_dataflow/binderfs/Makefile b/tools/testing/selftests/kcov_dataflow/binderfs/Makefile
new file mode 100644
index 000000000000..9f1588512dba
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/binderfs/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+TEST_GEN_PROGS := binderfs_test
+CFLAGS += -Wall -O2
+include ../../lib.mk
diff --git a/tools/testing/selftests/kcov_dataflow/binderfs/binderfs_test.c b/tools/testing/selftests/kcov_dataflow/binderfs/binderfs_test.c
new file mode 100644
index 000000000000..ce9b49aa0b9f
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/binderfs/binderfs_test.c
@@ -0,0 +1,177 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * binderfs selftest for kcov_dataflow
+ *
+ * Exercises the binder driver via binderfs with kcov_dataflow recording
+ * active, then verifies that function argument records were captured at
+ * binder ioctl boundaries.
+ *
+ * Requires: CONFIG_ANDROID_BINDER_IPC=y (or _RUST), CONFIG_ANDROID_BINDERFS=y
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <string.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/mount.h>
+#include <sys/stat.h>
+#include <linux/android/binder.h>
+#include <linux/android/binderfs.h>
+
+#define KCOV_DF_INIT_TRACK _IOR('d', 1, unsigned long)
+#define KCOV_DF_ENABLE _IO('d', 100)
+#define KCOV_DF_DISABLE _IO('d', 101)
+
+#define BUF_SIZE (1 << 20)
+#define BINDERFS_PATH "/tmp/binderfs_test"
+#define BINDER_DEV BINDERFS_PATH "/my_binder"
+
+static int setup_binderfs(void)
+{
+ struct binderfs_device dev = {};
+
+ mkdir(BINDERFS_PATH, 0755);
+
+ if (mount("binder", BINDERFS_PATH, "binder", 0, NULL)) {
+ if (errno == ENODEV || errno == ENOENT) {
+ printf("SKIP: binderfs not available\n");
+ return -1;
+ }
+ perror("mount binderfs");
+ return -1;
+ }
+
+ /* Create a binder device via BINDER_CTL_ADD ioctl */
+ int ctl_fd;
+
+ ctl_fd = open(BINDERFS_PATH "/binder-control", O_RDONLY);
+ if (ctl_fd < 0) {
+ perror("open binder-control");
+ umount(BINDERFS_PATH);
+ return -1;
+ }
+
+ strcpy(dev.name, "my_binder");
+ if (ioctl(ctl_fd, BINDER_CTL_ADD, &dev) && errno != EEXIST) {
+ perror("BINDER_CTL_ADD");
+ close(ctl_fd);
+ umount(BINDERFS_PATH);
+ return -1;
+ }
+ close(ctl_fd);
+ return 0;
+}
+
+static void cleanup_binderfs(void)
+{
+ umount(BINDERFS_PATH);
+ rmdir(BINDERFS_PATH);
+}
+
+int main(void)
+{
+ uint64_t *buf;
+ int df_fd, binder_fd;
+ uint64_t total;
+ int valid = 0;
+
+ printf("TAP version 13\n");
+ printf("1..3\n");
+
+ /* Setup binderfs */
+ if (setup_binderfs()) {
+ printf("ok 1 # SKIP binderfs not available\n");
+ printf("ok 2 # SKIP\n");
+ printf("ok 3 # SKIP\n");
+ return 0;
+ }
+
+ /* Open kcov_dataflow */
+ df_fd = open("/sys/kernel/debug/kcov_dataflow", O_RDWR);
+ if (df_fd < 0) {
+ printf("not ok 1 cannot open kcov_dataflow\n");
+ cleanup_binderfs();
+ return 1;
+ }
+
+ if (ioctl(df_fd, KCOV_DF_INIT_TRACK, BUF_SIZE)) {
+ printf("not ok 1 INIT_TRACK failed\n");
+ close(df_fd);
+ cleanup_binderfs();
+ return 1;
+ }
+
+ buf = mmap(NULL, BUF_SIZE * sizeof(uint64_t),
+ PROT_READ | PROT_WRITE, MAP_SHARED, df_fd, 0);
+ if (buf == MAP_FAILED) {
+ printf("not ok 1 mmap failed\n");
+ close(df_fd);
+ cleanup_binderfs();
+ return 1;
+ }
+
+ printf("ok 1 kcov_dataflow.binderfs_setup\n");
+
+ /* Open binder device */
+ binder_fd = open(BINDER_DEV, O_RDWR | O_CLOEXEC);
+ if (binder_fd < 0) {
+ printf("not ok 2 cannot open %s: %s\n", BINDER_DEV,
+ strerror(errno));
+ munmap(buf, BUF_SIZE * sizeof(uint64_t));
+ close(df_fd);
+ cleanup_binderfs();
+ return 1;
+ }
+
+ /* Enable recording and exercise binder ioctls */
+ ioctl(df_fd, KCOV_DF_ENABLE, 0);
+ __atomic_store_n(&buf[0], 0, __ATOMIC_RELAXED);
+
+ /* BINDER_VERSION - simple ioctl that exercises the binder path */
+ struct binder_version ver = {};
+
+ ioctl(binder_fd, BINDER_VERSION, &ver);
+
+ /* BINDER_SET_MAX_THREADS */
+ uint32_t max_threads = 4;
+
+ ioctl(binder_fd, BINDER_SET_MAX_THREADS, &max_threads);
+
+ ioctl(df_fd, KCOV_DF_DISABLE, 0);
+
+ total = __atomic_load_n(&buf[0], __ATOMIC_RELAXED);
+ close(binder_fd);
+
+ if (total > 0)
+ printf("ok 2 kcov_dataflow.binderfs_captured # %lu words\n",
+ (unsigned long)total);
+ else
+ printf("not ok 2 kcov_dataflow.binderfs_captured # 0 words\n");
+
+ /* Verify the first record has a valid header (type 0xE or 0xF) */
+
+ if (total > 3) {
+ uint64_t hdr = buf[1];
+ uint32_t type = (hdr >> 28) & 0xF;
+
+ if (type == 0xE || type == 0xF)
+ valid = 1;
+ }
+
+ if (valid)
+ printf("ok 3 kcov_dataflow.binderfs_valid_records\n");
+ else
+ printf("not ok 3 kcov_dataflow.binderfs_valid_records\n");
+
+ printf("# Totals: pass:%d fail:%d skip:0\n",
+ 1 + (total > 0) + valid, (total == 0) + !valid);
+
+ munmap(buf, BUF_SIZE * sizeof(uint64_t));
+ close(df_fd);
+ cleanup_binderfs();
+ return valid ? 0 : 1;
+}
diff --git a/tools/testing/selftests/kcov_dataflow/run_binderfs.sh b/tools/testing/selftests/kcov_dataflow/run_binderfs.sh
new file mode 100755
index 000000000000..5376e5350061
--- /dev/null
+++ b/tools/testing/selftests/kcov_dataflow/run_binderfs.sh
@@ -0,0 +1,13 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Test binderfs ioctl capture via kcov_dataflow
+DIR="$(dirname "$0")"
+BIN="$DIR/binderfs/binderfs_test"
+
+if [ ! -f "$BIN" ]; then
+ echo "SKIP: $BIN not found"
+ echo "Build: make -C tools/testing/selftests/kcov_dataflow/binderfs"
+ exit 4
+fi
+
+exec "$BIN"
--
2.43.0
^ permalink raw reply related
* [RFC PATCH v2 14/14] Documentation: add kcov-dataflow.rst
From: Yunseong Kim @ 2026-06-11 16:21 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Andrey Konovalov,
Alexander Potapenko, Dmitry Vyukov, Andrew Morton, Miguel Ojeda,
Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
Nathan Chancellor, Nicolas Schier, Nick Desaulniers,
Bill Wendling, Justin Stitt, Kees Cook, David Hildenbrand,
Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, Jonathan Corbet,
Shuah Khan, Yunseong Kim
Cc: linux-kernel, kasan-dev, rust-for-linux, linux-kbuild, llvm,
linux-mm, linux-kselftest, workflows, linux-doc, Yeoreum Yun
In-Reply-To: <20260611-b4-kcov-dataflow-v2-v2-0-0a261da3987c@est.tech>
Add documentation for the KCOV-Dataflow subsystem covering:
- Prerequisites and Kconfig options
- Per-module and per-directory instrumentation
- Data collection example with buffer parsing
- Ring buffer TLV record format
- Safety properties
- Ioctl interface reference
- Compatibility with legacy KCOV
- Rust module support via post-compilation pipeline
- Fork/child process tracing pattern
Signed-off-by: Yunseong Kim <yunseong.kim@est.tech>
---
Documentation/dev-tools/index.rst | 1 +
Documentation/dev-tools/kcov-dataflow.rst | 321 +++++++++++++++++++++++++
tools/testing/selftests/kcov_dataflow/Makefile | 2 +-
3 files changed, 323 insertions(+), 1 deletion(-)
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 59cbb77b33ff..541c58cc65ea 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -24,6 +24,7 @@ Documentation/process/debugging/index.rst
context-analysis
sparse
kcov
+ kcov-dataflow
gcov
kasan
kmsan
diff --git a/Documentation/dev-tools/kcov-dataflow.rst b/Documentation/dev-tools/kcov-dataflow.rst
new file mode 100644
index 000000000000..603c83946d12
--- /dev/null
+++ b/Documentation/dev-tools/kcov-dataflow.rst
@@ -0,0 +1,321 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+KCOV-Dataflow: function argument and return value extraction
+=============================================================
+
+KCOV-Dataflow captures function arguments and return values, including
+automatic struct field decomposition, at instrumented kernel function
+boundaries. It provides per-task, lock-free ring buffers accessible via
+``mmap()``, enabling data-flow-aware fuzzing and post-mortem contract
+verification.
+
+Unlike KCOV's ``trace-pc``, which reports *which* code executed,
+KCOV-Dataflow reports *what values* were passed and returned. This is
+a completely separate device from ``/sys/kernel/debug/kcov``.
+
+Prerequisites
+-------------
+
+KCOV-Dataflow requires Clang/LLVM with the ``trace-args`` and
+``trace-ret`` SanitizerCoverage extensions. With a standard (unpatched)
+compiler, the Kconfig options below are not available.
+
+To enable KCOV-Dataflow, configure the kernel with::
+
+ CONFIG_KCOV=y
+ CONFIG_KCOV_DATAFLOW_ARGS=y
+ CONFIG_KCOV_DATAFLOW_RET=y
+
+Optional: instrument the entire kernel (significant overhead)::
+
+ CONFIG_KCOV_DATAFLOW_INSTRUMENT_ALL=y
+
+The ``kcov_dataflow`` device becomes accessible once debugfs is mounted::
+
+ mount -t debugfs none /sys/kernel/debug
+
+Per-module instrumentation
+--------------------------
+
+To instrument a specific module, add to its Makefile::
+
+ KCOV_DATAFLOW_my_module.o := y
+
+For example, to instrument the Android binder driver::
+
+ # drivers/android/Makefile
+ KCOV_DATAFLOW_binder.o := y
+ KCOV_DATAFLOW_binder_alloc.o := y
+
+To instrument an entire directory, set the variable without a filename::
+
+ # fs/Makefile
+ KCOV_DATAFLOW := y
+
+The build system automatically adds the required compiler flags
+(``-fsanitize-coverage=trace-args,trace-ret``). Debug info is provided
+by ``CONFIG_DEBUG_INFO``, which is a Kconfig dependency.
+
+Data collection
+---------------
+
+The following program demonstrates how to collect function argument and
+return value data for a single syscall:
+
+.. code-block:: c
+
+ #include <stdio.h>
+ #include <stdint.h>
+ #include <stdlib.h>
+ #include <sys/types.h>
+ #include <sys/ioctl.h>
+ #include <sys/mman.h>
+ #include <unistd.h>
+ #include <fcntl.h>
+
+ #define KCOV_DF_INIT_TRACE _IOR('d', 1, unsigned long)
+ #define KCOV_DF_ENABLE _IO('d', 100)
+ #define KCOV_DF_DISABLE _IO('d', 101)
+ #define BUF_SIZE (1 << 20) /* 1M words = 8MB */
+
+ int main(void)
+ {
+ int fd;
+ uint64_t *buf, n, i;
+
+ fd = open("/sys/kernel/debug/kcov_dataflow", O_RDWR);
+ if (fd == -1)
+ perror("open"), exit(1);
+
+ /* Allocate buffer (size in u64 words). */
+ if (ioctl(fd, KCOV_DF_INIT_TRACE, BUF_SIZE))
+ perror("ioctl(INIT)"), exit(1);
+
+ /* Map the buffer into user space. */
+ buf = (uint64_t *)mmap(NULL, BUF_SIZE * sizeof(uint64_t),
+ PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ if (buf == MAP_FAILED)
+ perror("mmap"), exit(1);
+
+ /* Enable data-flow collection for this task. */
+ if (ioctl(fd, KCOV_DF_ENABLE, 0))
+ perror("ioctl(ENABLE)"), exit(1);
+
+ /* Reset counter. */
+ __atomic_store_n(&buf[0], 0, __ATOMIC_RELAXED);
+
+ /* === Trigger syscall(s) here === */
+ read(-1, NULL, 0);
+
+ /* Read how many words were written. */
+ n = __atomic_load_n(&buf[0], __ATOMIC_RELAXED);
+
+ /* Parse TLV records. */
+ i = 1;
+ while (i + 3 < n) {
+ uint64_t type_seq = buf[i];
+ uint64_t pc = buf[i + 1];
+ uint64_t meta = buf[i + 2];
+ uint32_t type = (type_seq >> 28) & 0xF;
+ uint32_t num_vals = (type_seq >> 24) & 0xF;
+ uint32_t seq = type_seq & 0x00FFFFFF;
+ uint32_t arg_idx = (meta >> 56) & 0xFF;
+ uint32_t size = (meta >> 48) & 0xFF;
+
+ if (type_seq >> 32 || (type != 0xE && type != 0xF)) {
+ i++;
+ continue;
+ }
+ if (!num_vals)
+ num_vals = 1;
+
+ printf("[%s] seq=%u pc=0x%lx arg_idx=%u size=%u val=0x%lx\n",
+ type == 0xE ? "ENTRY" : "RET",
+ seq, pc, arg_idx, size, buf[i + 3]);
+ i += 3 + num_vals;
+ }
+
+ if (ioctl(fd, KCOV_DF_DISABLE, 0))
+ perror("ioctl(DISABLE)"), exit(1);
+
+ munmap(buf, BUF_SIZE * sizeof(uint64_t));
+ close(fd);
+ return 0;
+ }
+
+Ring buffer format
+------------------
+
+The buffer is an array of ``u64`` words::
+
+ buf[0]: atomic counter -- total words written
+
+Each record occupies 3 + N words:
+
+.. list-table::
+ :header-rows: 1
+
+ * - Offset
+ - Field
+ - Description
+ * - 0
+ - type_and_seq
+ - bits[31:28] = 0xE (entry) or 0xF (return), bits[27:24] = num_vals,
+ bits[23:0] = sequence number
+ * - 1
+ - pc
+ - Instrumented function address
+ * - 2
+ - meta
+ - bits[63:56] = arg_idx (0 for return), bits[55:48] = size in bytes,
+ bits[47:0] = raw pointer value
+ * - 3..3+N-1
+ - field_val[0..N-1]
+ - Struct field values or single scalar
+
+Magic values:
+
+- ``0xBADADD85``: field read failed (pointer was invalid/freed/poisoned)
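The layout in the table above can be turned into a small decoder. The following is a minimal sketch and not part of the patch: ``struct df_record`` and ``df_decode()`` are hypothetical names, and the bounds rule (every word of a record must sit at an index below the count read from ``buf[0]``) mirrors the loop condition of the example program rather than a documented guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one record header (hypothetical helper type). */
struct df_record {
	uint32_t type;     /* 0xE = entry, 0xF = return */
	uint32_t num_vals; /* number of value words after the header */
	uint32_t seq;      /* 24-bit sequence number */
	uint64_t pc;       /* instrumented function address */
	uint32_t arg_idx;  /* argument index, 0 for a return record */
	uint32_t size;     /* size in bytes */
	uint64_t ptr;      /* low 48 bits of meta: raw pointer value */
};

/*
 * Decode the record that starts at buf[i]; n is the word count read
 * from buf[0]. Returns the index of the next record, or 0 if buf[i]
 * is not a valid header or the record would extend past n.
 */
static uint64_t df_decode(const uint64_t *buf, uint64_t n, uint64_t i,
			  struct df_record *rec)
{
	uint64_t type_seq, meta;

	assert(buf != NULL && rec != NULL);

	/* Three header words plus one value word must lie below n. */
	if (i < 1 || n < 4 || i > n - 4)
		return 0;

	type_seq = buf[i];
	meta = buf[i + 2];

	/* The upper 32 bits of a header word are expected to be zero. */
	rec->type = (uint32_t)((type_seq >> 28) & 0xF);
	if ((type_seq >> 32) != 0 || (rec->type != 0xE && rec->type != 0xF))
		return 0;

	rec->num_vals = (uint32_t)((type_seq >> 24) & 0xF);
	if (rec->num_vals == 0)
		rec->num_vals = 1; /* same fallback as the example parser */
	if (rec->num_vals > n - i - 3)
		return 0; /* value words would run past n */

	rec->seq = (uint32_t)(type_seq & 0x00FFFFFF);
	rec->pc = buf[i + 1];
	rec->arg_idx = (uint32_t)((meta >> 56) & 0xFF);
	rec->size = (uint32_t)((meta >> 48) & 0xFF);
	rec->ptr = meta & 0xFFFFFFFFFFFFULL;

	return i + 3 + rec->num_vals;
}
```

A caller would start at index 1 and keep calling ``df_decode()`` with the returned index until it returns 0. Returning 0 instead of skipping a word is a design choice of this sketch; the example program above resynchronizes by advancing one word at a time.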
+
+Safety
+------
+
+- Callbacks are ``notrace``, ``__no_sanitize_coverage``, ``noinline``
+ to prevent recursion.
+- All pointer reads use ``copy_from_kernel_nofault()``, so reads of
+ freed, poisoned, or unmapped memory fail safely instead of faulting.
+- An ``in_task()`` guard rejects calls from hardirq/softirq/NMI context,
+ preventing reentrant buffer corruption.
+- No ``printk`` or allocation in the data path.
+- When not enabled for a task, overhead is a single boolean check.
+
+Ioctl interface
+---------------
+
+.. list-table::
+ :header-rows: 1
+
+ * - Command
+ - Value
+ - Description
+ * - KCOV_DF_INIT_TRACE
+ - ``_IOR('d', 1, unsigned long)``
+ - Allocate buffer (size in u64 words)
+ * - KCOV_DF_ENABLE
+ - ``_IO('d', 100)``
+ - Start collection for current task
+ * - KCOV_DF_DISABLE
+ - ``_IO('d', 101)``
+ - Stop collection
+
+Compatibility
+-------------
+
+KCOV-Dataflow is completely independent from legacy KCOV:
+
+- Separate device: ``/sys/kernel/debug/kcov_dataflow``
+- Separate ioctl namespace (``'d'`` vs ``'c'``)
+- Separate per-task buffer
+- Both can be used simultaneously without interference
+- syzkaller and other KCOV users are unaffected
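The separate ioctl namespace can be checked from user space. The sketch below is illustrative only: the ``KCOV_DF_*`` values are copied from the ioctl table above, the legacy values are the ones in ``include/uapi/linux/kcov.h``, and ``same_ioctl_namespace()`` and ``check_no_collisions()`` are hypothetical helper names.

```c
#include <assert.h>
#include <linux/ioctl.h>

/* KCOV-Dataflow request codes, as listed in the ioctl table. */
#define KCOV_DF_INIT_TRACE	_IOR('d', 1, unsigned long)
#define KCOV_DF_ENABLE		_IO('d', 100)
#define KCOV_DF_DISABLE		_IO('d', 101)

/* Legacy KCOV request codes (include/uapi/linux/kcov.h). */
#define KCOV_INIT_TRACE		_IOR('c', 1, unsigned long)
#define KCOV_ENABLE		_IO('c', 100)
#define KCOV_DISABLE		_IO('c', 101)

/* Return 1 if both request codes carry the same type ("magic") byte. */
static int same_ioctl_namespace(unsigned long a, unsigned long b)
{
	return _IOC_TYPE(a) == _IOC_TYPE(b);
}

/* The command numbers match pairwise, but the full codes never do. */
static void check_no_collisions(void)
{
	assert(KCOV_DF_INIT_TRACE != KCOV_INIT_TRACE);
	assert(KCOV_DF_ENABLE != KCOV_ENABLE);
	assert(KCOV_DF_DISABLE != KCOV_DISABLE);
}
```

Because the type byte is part of the request code, an ioctl meant for one device carries a code that the other device does not define, even though both use command numbers 1, 100, and 101.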
+
+Rust module support
+-------------------
+
+Rust kernel modules are supported via a post-compilation pipeline::
+
+ rustc --emit=llvm-ir -g module.rs
+ opt -passes=sancov-module \
+ -sanitizer-coverage-trace-args \
+ -sanitizer-coverage-trace-ret module.ll -S -o module_inst.ll
+ llc -filetype=obj module_inst.ll -o module.o
+
+Selftests
+---------
+
+Automated tests and visualization tools are in
+``tools/testing/selftests/kcov_dataflow/``::
+
+ # Automated ioctl interface test (TAP output):
+ make -C tools/testing/selftests/kcov_dataflow
+ vng --user root --exec \
+ tools/testing/selftests/kcov_dataflow/user_ioctl/user_ioctl
+
+ # Load a test module and view captured records:
+ make LLVM=1 CC=clang M=tools/testing/selftests/kcov_dataflow/eight_args_c modules
+ vng --user root --exec \
+ "python3 tools/testing/selftests/kcov_dataflow/trigger-view.py \
+ eight_args_c -C 8 --ko \
+ tools/testing/selftests/kcov_dataflow/eight_args_c/eight_args_c.ko"
+
+ # Binderfs ioctl capture test (requires CONFIG_ANDROID_BINDER_IPC):
+ make -C tools/testing/selftests/kcov_dataflow/binderfs
+ vng --user root --exec \
+ tools/testing/selftests/kcov_dataflow/binderfs/binderfs_test
+
+See ``tools/testing/selftests/kcov_dataflow/README.rst`` for details.
+
+Tracing child processes
+-----------------------
+
+KCOV-Dataflow is per-task: after ``fork()``, the child does not inherit
+the enabled state. To trace child processes, re-enable on the inherited
+file descriptor in the child before ``exec()``. The ``mmap``'d buffer is
+shared (``MAP_SHARED``), so both parent and child write to the same ring
+buffer atomically.
+
+.. code-block:: c
+
+ #include <stdio.h>
+ #include <stdint.h>
+ #include <stdlib.h>
+ #include <sys/ioctl.h>
+ #include <sys/mman.h>
+ #include <sys/wait.h>
+ #include <unistd.h>
+ #include <fcntl.h>
+
+ #define KCOV_DF_INIT_TRACE _IOR('d', 1, unsigned long)
+ #define KCOV_DF_ENABLE _IO('d', 100)
+ #define KCOV_DF_DISABLE _IO('d', 101)
+ #define BUF_SIZE (1 << 20)
+
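+ int main(int argc, char **argv)
+ {
+ /* A command to exec in the child is required. */
+ if (argc < 2) {
+ fprintf(stderr, "usage: %s <command> [args...]\n", argv[0]);
+ return 1;
+ }
+
+ int fd = open("/sys/kernel/debug/kcov_dataflow", O_RDWR);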
+ ioctl(fd, KCOV_DF_INIT_TRACE, BUF_SIZE);
+ uint64_t *buf = mmap(NULL, BUF_SIZE * 8,
+ PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+
+ /* Enable for parent task. */
+ ioctl(fd, KCOV_DF_ENABLE, 0);
+ __atomic_store_n(&buf[0], 0, __ATOMIC_RELAXED);
+
+ pid_t pid = fork();
+ if (pid == 0) {
+ /*
+ * Child: re-enable on inherited fd.
+ * The shared mmap buffer receives records from both tasks.
+ */
+ ioctl(fd, KCOV_DF_ENABLE, 0);
+ execvp(argv[1], &argv[1]);
+ _exit(1);
+ }
+
+ waitpid(pid, NULL, 0);
+ ioctl(fd, KCOV_DF_DISABLE, 0);
+
+ uint64_t n = __atomic_load_n(&buf[0], __ATOMIC_RELAXED);
+ printf("Captured %lu words from parent + child\n", n);
+
+ munmap(buf, BUF_SIZE * 8);
+ close(fd);
+ return 0;
+ }
+
+Note: the child's ``ioctl(fd, KCOV_DF_ENABLE)`` will fail if the parent
+has not yet called ``KCOV_DF_DISABLE``, because only one task can be
+associated with a descriptor at a time. For true multi-process tracing,
+open a separate ``kcov_dataflow`` fd per child, or disable in the parent
+before the child enables (the parent is blocked in ``waitpid`` during
+that time, so it generates no records anyway). The example above does
+not disable in the parent before ``fork()`` and does not check the
+return value of the child's ``ioctl()``; real code should do both.
diff --git a/tools/testing/selftests/kcov_dataflow/Makefile b/tools/testing/selftests/kcov_dataflow/Makefile
index 6412c90edfa1..9691b41ffd3e 100644
--- a/tools/testing/selftests/kcov_dataflow/Makefile
+++ b/tools/testing/selftests/kcov_dataflow/Makefile
@@ -1,4 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
TEST_GEN_PROGS := user_ioctl/user_ioctl
-TEST_PROGS := run_eight_args_c.sh run_rust_ffi_contract.sh
+TEST_PROGS := run_eight_args_c.sh run_eight_args_rust.sh run_rust_ffi_contract.sh
include ../lib.mk
--
2.43.0
^ permalink raw reply related
* Re: [PATCH v3 1/3] dt-bindings: hwmon: pmbus: Add bindings for Silergy SQ24860
From: Rob Herring @ 2026-06-11 16:38 UTC (permalink / raw)
To: Ziming Zhu
Cc: Guenter Roeck, Krzysztof Kozlowski, Conor Dooley, Jonathan Corbet,
Shuah Khan, linux-hwmon, devicetree, linux-kernel, linux-doc,
Ziming Zhu
In-Reply-To: <20260611074335.4415-2-zmzhu0630@163.com>
On Thu, Jun 11, 2026 at 03:43:33PM +0800, Ziming Zhu wrote:
> From: Ziming Zhu <ziming.zhu@silergycorp.com>
>
> Add devicetree binding documentation for the Silergy SQ24860 eFuse.
>
> The device is a PMBus hardware monitoring device which reports voltage,
> current, power, and temperature telemetry. The board-specific IMON
> resistor value is described with silergy,rimon-micro-ohms.
>
> Signed-off-by: Ziming Zhu <ziming.zhu@silergycorp.com>
Missing Conor's reviewed-by.
> ---
> .../bindings/hwmon/pmbus/silergy,sq24860.yaml | 74 +++++++++++++++++++
> 1 file changed, 74 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/hwmon/pmbus/silergy,sq24860.yaml
^ permalink raw reply
* Re: [RFC PATCH 0/5] mm/slub: preserve previous object lifetime
From: Vlastimil Babka (SUSE) @ 2026-06-11 17:13 UTC (permalink / raw)
To: Harry Yoo, Pengpeng Hou, Andrew Morton, linux-mm
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
David Hildenbrand, Lorenzo Stoakes, liam, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Jonathan Corbet, Shuah Khan,
linux-doc, linux-kernel
In-Reply-To: <2b5577b0-d81a-4dee-b4e2-acadcf7f7db2@kernel.org>
On 6/11/26 09:19, Harry Yoo wrote:
> Hi Pengpeng,
>
> On 6/11/26 3:39 PM, Pengpeng Hou wrote:
>> SLAB_STORE_USER currently stores one allocation track and one free track
>> for an object. This is useful, but it loses part of the previous lifetime
>> when the object is reused: the new allocation overwrites the allocation
>> track, and a later stale free can overwrite the free track.
>
> I'm not sure what you meant by "stale free", UAF is accessing object
> that are freed. What makes the free "stale"?
I'm guessing it means "second/duplicated free" of the previous owner.
Accesses (UAF) perhaps may not happen by that owner, or if they happen after
the object is reallocated, they are not recognized as such.
> In general, I don't think slab_debug=UP is the right tool to debug
> use-after-frees, because slab will never know _when_ the object was
> overwritten. It can only tell that somebody has overwritten freed
> objects by checking if the object content is POISON_FREE or POISON_END.
It could give more information about double frees like this, however.
> KASAN is a better tool to debug use-after-frees, because it can
> tell you which kernel code is accessing memory it shouldn't. (It also
> quarantines slab objects to avoid immediately reusing the object for
> better coverage).
>
> So I have to ask, "Why not use KASAN instead?" before enhancing
> slab_debug (neither is intended for production anyway).
From my distro experience, it's very useful to tell a user to just enable
slub_debug for a specific cache with the existing kernel, with some but not
prohibitive overhead. And with some luck it gives you enough information to
find the root cause too. So in that sense it can be used in production.
KASAN is indeed superior wrt catching issues, but almost never applicable in
such environment. It would need a rebuilt kernel and the overhead is much
higher. So it's a tradeoff.
>> For free-after-reuse bugs, the report can therefore contain the victim
>> allocation and the stale free, while the earlier alloc/free pair that
>> explains where the stale pointer came from is no longer available.
>
> Again, I'm confused. I have no idea what "free-after-reuse" means.
> Objects cannot be reused until they are not freed, no?
^ permalink raw reply
* Re: [PATCH net-next 2/3] docs: net: tls-offload: document tls_dev_del, tls_dev_resync, and rekey
From: Jakub Kicinski @ 2026-06-11 17:18 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, corbet,
linux-doc, bpf, john.fastabend, skhan
In-Reply-To: <ainR5GAK8LaHJYMP@krikkit>
On Wed, 10 Jun 2026 23:06:44 +0200 Sabrina Dubroca wrote:
> > +The third TLS device callback is :c:member:`tls_dev_resync`, called by the core
> > +to synchronize the TCP stream with the record boundaries:
> > +
> > +.. code-block:: c
> > +
> > + int (*tls_dev_resync)(struct net_device *netdev,
> > + struct sock *sk, u32 seq, u8 *rcd_sn,
> > + enum tls_offload_ctx_dir direction);
> > +
> > +See the `Resync handling`_ section for details.
>
> Hmm, this callback is not mentioned at all in the "Resync handling"
> section. I think it'd be good to add at least a quick note there about
> how/when it's invoked, and what the arguments mean (at least the two
> types of sequence numbers, since the rest is identical to the other
> driver CBs).
Something like this, you mean?
--- a/Documentation/networking/tls-offload.rst
+++ b/Documentation/networking/tls-offload.rst
@@ -278,9 +278,9 @@ sequence number (as it will be updated from a different context).
bool tls_offload_tx_resync_pending(struct sock *sk)
Next time ``ktls`` pushes a record it will first send its TCP sequence number
-and TLS record number to the driver. Stack will also make sure that
-the new record will start on a segment boundary (like it does when
-the connection is initially added).
+and TLS record number to the driver via the ``tls_dev_resync`` callback.
+Stack will also make sure that the new record will start on a segment boundary
+(like it does when the connection is initially added).
RX
--
@@ -372,9 +372,10 @@ all TLS record headers that have been logged since the resync request
started.
The kernel confirms the guessed location was correct and tells the device
-the record sequence number. Meanwhile, the device had been parsing
-and counting all records since the just-confirmed one, it adds the number
-of records it had seen to the record number provided by the kernel.
+the record sequence number via the ``tls_dev_resync`` callback. Meanwhile,
+the device had been parsing and counting all records since the just-confirmed
+one, it adds the number of records it had seen to the record number provided
+by the kernel.
At this point the device is in sync and can resume decryption at next
segment boundary.
@@ -398,7 +399,8 @@ schedules resynchronization after it has received two completely encrypted
records.
The stack waits for the socket to drain and informs the device about
-the next expected record number and its TCP sequence number. If the
+the next expected record number and its TCP sequence number via the
+``tls_dev_resync`` callback. If the
records continue to be received fully encrypted stack retries the
synchronization with an exponential back off (first after 2 encrypted
records, then after 4 records, after 8, after 16... up until every
^ permalink raw reply
* [PATCH net-next v2] docs: networking: add guidance on what to push via extack
From: Jakub Kicinski @ 2026-06-11 17:21 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
Joe Damato, corbet, skhan, linux-doc
Every now and then someone tries to duplicate extack
messages to dmesg. Document our guidance against this.
Also indicate that system level faults should continue
to go to system logs. The high level thinking is to try
to distinguish between what's important to the user vs
system admin.
Reviewed-by: Joe Damato <joe@dama.to>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2: some nit picks
CC: corbet@lwn.net
CC: skhan@linuxfoundation.org
CC: linux-doc@vger.kernel.org
---
Documentation/networking/driver.rst | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/Documentation/networking/driver.rst b/Documentation/networking/driver.rst
index 195a916dc0de..920c0ec98759 100644
--- a/Documentation/networking/driver.rst
+++ b/Documentation/networking/driver.rst
@@ -128,3 +128,16 @@ to be freed up.
If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you
must not keep any reference to that SKB and you must not attempt
to free it up.
+
+Error message reporting
+=======================
+
+A number of driver configuration interfaces pass a Netlink extended ACK
+(``extack``) object to the driver (either directly as an argument or
+as a member of a parameter struct). The drivers should try to report
+most errors via the ``extack`` object. System level exceptions,
+indicating that the system or device is misbehaving or is in a bad state,
+should continue to be reported to system logs.
+
+Messages should be passed **either** via ``extack`` **or** to system logs.
+Drivers should not try to report the same information to both.
--
2.54.0
^ permalink raw reply related
* Re: [RFC PATCH 0/5] mm/slub: preserve previous object lifetime
From: Vlastimil Babka (SUSE) @ 2026-06-11 17:28 UTC (permalink / raw)
To: Pengpeng Hou, Harry Yoo, Andrew Morton, linux-mm
Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
David Hildenbrand, Lorenzo Stoakes, liam, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Jonathan Corbet, Shuah Khan,
linux-doc, linux-kernel
In-Reply-To: <20260611063926.38111-1-pengpeng@iscas.ac.cn>
On 6/11/26 08:39, Pengpeng Hou wrote:
> SLAB_STORE_USER currently stores one allocation track and one free track
> for an object. This is useful, but it loses part of the previous lifetime
> when the object is reused: the new allocation overwrites the allocation
> track, and a later stale free can overwrite the free track.
>
> For free-after-reuse bugs, the report can therefore contain the victim
> allocation and the stale free, while the earlier alloc/free pair that
> explains where the stale pointer came from is no longer available.
>
> This RFC adds an opt-in SLUB debug option to keep one previous completed
> object lifetime. The option is disabled by default, is not part of the
> default debug flags, and only takes effect when user tracking is already
> enabled:
Sounds useful!
> slab_debug=UH,kmalloc-128
>
> The series intentionally does not attempt to infer semantic ownership or
> identify the root cause of a use-after-free. It only preserves and prints
> additional track records that SLUB already knows how to collect.
Indeed, it would be generally impossible to infer I think.
> This is sent as RFC because the user-visible interface and the cost/benefit
> tradeoff should be agreed on before this becomes a normal patch series.
> In particular, feedback would be useful on:
>
> - whether a separate H option is preferable to extending U directly
I think before we converted U to stackdepot, the memory overhead of the
stacks was higher than U+H with stackdepot. So I think it would be
acceptable to extend directly. If a user is willing to pay the current U
overhead to debug something in production, the addition of U+H shouldn't
make it suddenly unacceptable.
> - whether H should require U, as implemented here, or imply U
> - whether the extra per-object metadata is useful enough for this debug path
One could think of scenarios where even longer object history would be
needed to find the culprit. But adding one extra lifetime probably has the
biggest impact.
> Not included yet:
>
> - KUnit coverage or a standalone reproducer
That would be nice.
> - object-size/order comparison data for representative caches
Could be useful too.
> - runtime benchmark data for slab_debug=U vs slab_debug=UH
This is probably not necessary as we do expect debugging has a performance
cost, and this is not doing anything extra slow, only copying the previous
tracking info? In practice the overhead depends on the workload anyway.
Thanks!
> Those should be added before a non-RFC submission if the direction looks
> acceptable.
>
> Pengpeng Hou (5):
> mm/slub: factor user tracking metadata size calculation
> mm/slub: add optional previous lifetime user tracking
> mm/slub: print previous object lifetime in debug reports
> Documentation/mm: document SLUB previous lifetime tracking
> mm/slub: sanitize previous lifetime tracking flags
>
> Documentation/admin-guide/mm/slab.rst | 22 ++++-
> include/linux/slab.h | 3 +
> mm/slab.h | 3 +-
> mm/slub.c | 118 ++++++++++++++++++++++----
> 4 files changed, 128 insertions(+), 18 deletions(-)
>
^ permalink raw reply
* Re: [PATCH v3 1/4] mm/zswap: Make shrink_worker writeback cursor per-memcg
From: Yosry Ahmed @ 2026-06-11 17:39 UTC (permalink / raw)
To: Hao Jia
Cc: Nhat Pham, shakeel.butt, akpm, tj, hannes, mhocko, mkoutny,
chengming.zhou, muchun.song, roman.gushchin, cgroups, linux-mm,
linux-kernel, linux-doc, Hao Jia
In-Reply-To: <1c25650e-bf98-2863-d505-9b94c385668b@gmail.com>
On Tue, Jun 09, 2026 at 11:18:26AM +0800, Hao Jia wrote:
>
>
> On 2026/6/9 02:01, Nhat Pham wrote:
> > On Mon, Jun 8, 2026 at 9:48 AM Yosry Ahmed <yosry@kernel.org> wrote:
> > >
> > > > But OTOH, this does seem like a recipe for inefficient reclaim. We
> > > > might exhaust hotter memory of a cgroup while sparing colder memory of
> > > > another cgroup... But maybe if they're all cold anyway, then who
> > > > cares, and eventually you'll get to the cold stuff of other child?
> > >
> > > Forgot to respond to this part, the unfairness is limited to the batch
> > > size per-invocation, so it should be fine as long as you don't divide
> > > the amount over 100 iterations for some reason. Also yes, all memory
> > > in zswap is cold, the relative coldness is not that important (e.g.
> > > compared to relative coldness during reclaim).
> >
> > Ok then yeah, I think we should shelve per-memcg cursor for the next
> > version. Down the line, if we have more data that unfairness is an
> > issue, we can always fix it. One step at a time :)
>
> Thanks a lot to Yosry, Nhat, and Shakeel for the great suggestions!
>
> Let me summarize what I plan to do in the next version to make sure we are
> on the same page:
>
> - Drop the per-memcg cursor and keep the root cgroup cursor
> (zswap_next_shrink) logic intact.
> - Stick to using the zswap_writeback_only key, and change the proactive
> writeback size to use the compressed size.
> - Consolidate and reuse the logic between shrink_worker() and
> shrink_memcg(). Enable batch writeback in the shrink_worker() path, while
> keeping the writeback behavior in the zswap_store() path unchanged.
>
> Please let me know if I missed or misunderstood anything. Thanks again for
> clearing things up!
Sorry for the late response, yes I think this makes sense. However, I
have some comments about how this interacts with swap tiering, let me
reply to the other thread.
>
> Thanks,
> Hao
^ permalink raw reply
* Re: [PATCH net-next V3 2/7] netdevsim: Register devlink after device init
From: Mark Bloch @ 2026-06-11 17:43 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Eric Dumazet, Paolo Abeni, Andrew Lunn, David S. Miller,
Jonathan Corbet, Shuah Khan, Jiri Pirko, Simon Horman,
Sunil Goutham, Linu Cherian, Geetha sowjanya, hariprasad,
Subbaraya Sundeep, Bharat Bhushan, Saeed Mahameed,
Leon Romanovsky, Tariq Toukan, Ethan Nelson-Moore, linux-doc,
netdev, linux-rdma
In-Reply-To: <20260611085440.4fe36bf2@kernel.org>
On 11/06/2026 18:54, Jakub Kicinski wrote:
> On Thu, 11 Jun 2026 09:02:03 +0300 Mark Bloch wrote:
>> On 11/06/2026 2:50, Jakub Kicinski wrote:
>>> On Fri, 5 Jun 2026 21:10:25 +0300 Mark Bloch wrote:
>>>> devl_register() makes the devlink instance visible to userspace. A later
>>>> patch also makes registration the point where devlink core may call
>>>> eswitch_mode_set() to apply a boot-time default eswitch mode.
>>>>
>>>> Move netdevsim registration after all objects (resources, params, regions,
>>>> traps, debugfs etc) are initialized, and after the initial eswitch mode is
>>>> set to legacy.
>>>>
>>>> Move devl_unregister() to the beginning of nsim_drv_remove(), before those
>>>> devlink objects are torn down. This keeps devlink register/unregister as
>>>> the notification barrier and makes the later object teardown paths run
>>>> after devlink is no longer registered, so they do not emit their own
>>>> netlink DEL notifications.
>>>
>>> This is going backwards. At some point someone from nVidia thought that
>>> we can order our way out of locking, so mlx5 is likely ordered this way,
>>> but this must not be required, or in any way normalized.
>>> We (syzbot) quickly discovered that it doesn't cover all corner cases.
>>> devl_lock() is exposed specifically to allow the driver to finish
>>> whatever init it needs without letting user space invoke callbacks, yet.
>>> Almost (?) all driver callbacks hold devl_lock(), so maybe the devlink
>>> instance is "visible" to user space but that should not matter.
>>
>> Let me clarify.
>>
>> No locking is changed here, and I don't want to make register/unregister
>> ordering a substitute for devl_lock().
>>
>> The only requirement I have for this series is that devl_register() is called
>> only once the driver is ready for devlink core to call eswitch_mode_set().
>> That follows from the earlier direction to have the core apply the default
>> mode from devl_register() instead of adding an explicit driver call.
>
> This is exactly what I'm objecting to. AFAIU we are trading off
> explicit call to get the default value for an implicit behavior
> depending on order of calls. We want to optimize for how easy it
> is to get the API wrong, not for LoC.
Right, the reason I moved in this direction is that in v1 I had
the explicit driver call, and Jiri asked to make this transparent
from devlink core instead.
>
> If we don't have a clean way to implement this without driver
> changes let's add the explicit API to get the default value.
> If driver doesn't call it schedule a work to go via the callback
> once devl_lock() is dropped. That way drivers which care can optimize
> themselves by reading the default value upfront. Drivers which don't
> care will work correctly, and there's no API call order trap.
The workqueue fallback is possible, but I think it makes the semantics
more complicated.
We would need to track devlink instances which still need the default
applied, and the worker would have to skip/remove them once handled.
More importantly, the worker can race with userspace setting the
eswitch mode, so we would also need some state to tell whether the user
already changed the mode. That feels more fragile than an explicit
driver call.
>
> Not ideal, but isn't that best we can do here?
> I still have flashbacks of the fallout from the call ordering games,
> we have too many drivers to keep this straight...
That's why I started with the explicit call in the first place.
I can switch back to this model: drivers which support boot time eswitch
defaults will opt in and call the helper once they are ready. This keeps
the support explicit per driver and avoids making it depend on where
devl_register() happens in the init path.
With that, devlink can tell at register time whether the instance supports
boot time eswitch defaults. If the user configured a default for an instance
whose driver did not opt in, devlink can write to dmesg from
devl_register().
Not perfect, but at least the user gets a visible failure instead of the
config being silently ignored.
Mark
>
>> So if the objection is to the commit message wording, I can fix that and drop
>> the "notification barrier" language.
>>
>> For unregister, I can probably leave the old ordering as-is. I moved it only
>> to mirror the register path, which felt cleaner, but it is not required for
>> the default-mode change and as the lock is held I see no issue with doing
>> that.
^ permalink raw reply
* Re: [PATCH v3 2/4] mm/zswap: Implement proactive writeback
From: Yosry Ahmed @ 2026-06-11 17:45 UTC (permalink / raw)
To: YoungJun Park
Cc: Shakeel Butt, Hao Jia, Johannes Weiner, mhocko, tj, mkoutny,
roman.gushchin, Nhat Pham, akpm, chengming.zhou, muchun.song,
cgroups, linux-mm, linux-kernel, linux-doc, Hao Jia, chrisl,
kasong, baoquan.he
In-Reply-To: <aieUQUBHI+E3uNPW@yjaykim-PowerEdge-T330>
On Tue, Jun 09, 2026 at 01:19:13PM +0900, YoungJun Park wrote:
> On Mon, Jun 08, 2026 at 03:27:07PM -0700, Yosry Ahmed wrote:
>
> +Chris +Kairui +Baoquan
>
> Hello
>
> Thanks for inviting me to the discussion, Shakeel.
>
> > > > > Youngjun is working on swap tiers. At the moment he is more interested in
> > > > > allowing a specific swap device to a memcg or not. I can imagine in future there
> > > > > will be use-cases where there will be a need to demote data on higher tier swap
> > > > > to lower tier swap. What would be the appropriate interface?
>
> Speaking of my work on swap tiers, I recently submitted a patch and am
> currently considering memcg integration:
> https://lore.kernel.org/linux-mm/20260527062247.3440692-1-youngjun.park@lge.com/
>
> The future use-cases imagined above seem to align with this
> direction. (BTW, I am currently waiting for reviews/feedback from the memcg
> folks on this patch. Any reviews would be highly appreciated!)
>
> We could potentially assign a target tier
> for writeback within the existing memory.zswap.writeback interface.
>
> For instance, '0' could mean disabled, while non-zero values could represent
> specific tiers, which would maintain backward compatibility with the current
> version. Alternatively, if zswap is treated as the default top tier,
> the `memory.swap.tiers` interface could potentially replace `memory.zswap.writeback`.
>
> Furthermore, this could be expanded so that each swap tier can demote data
> user-triggered demotion between swap tiers.
>
> Based on the current patch's ideas combined with my swap tiers concept:
>
> Assuming a hierarchy like:
> zswap -> tier1 (SSD swap) -> tier2 (HDD swap) -> tier3 (Network swap)
>
> We could configure the active tiers via a setting like `memory.swap.tiers`
> (tier2 enabled, tier3 enabled).
>
> For example, the concept of `echo "100M zswap_writeback_only > memory.reclaim"`
> could be extended. A user could run `echo "100M tier2 > memory.reclaim"`
> to explicitly trigger demotion from tier2 to tier3.
> (BTW, if we combine these features, my personal preference for the keyword
> format would be `<size> <demote_prefix><tier_name>`. I think it would be
> better to explicitly indicate that it is a swap demotion by using a specific
> prefix followed by the tier name.
> Or make demote prefix another key is also possible)
I am not sure if proactive demotion between swap tiers would be driven
by memory.reclaim, I am guessing a new interface might be more suitable.
But yes, you are right that it's very possible that
'zswap_writeback_only' with memory.reclaim will become obsolete once
swap tiering matures and starts supporting things like proactive
demotion.
Part of me wants to wait until the swap tiering interfaces are figured
out so that we don't end up with redundant interfaces, but I also don't
want to hold Hao's work since it doesn't directly depend on swap
tiering.
Shakeel, how do you want to handle this? I think there are a few options:
1. Add zswap_writeback_only now, and when we have swap tiering demotion
it becomes a redundant interface, like memory.zswap.writeback -- or
maybe we try to deprecate both of them at that point. It's difficult to
remove interfaces tho, but maybe easier to stop supporting
zswap_writeback_only.
2. Add zswap_writeback_only behind an experimental config option, to
unblock development but have a line of sight to dropping support once we
have a swap tiering interface.
3. Wait until we figure out the swap tiering interfaces and then add
the proactive zswap writeback as part of it.
WDYT?
^ permalink raw reply
* Re: [PATCH v3] arm64: errata: Workaround NVIDIA Olympus device store/load ordering erratum
From: Jason Gunthorpe @ 2026-06-11 17:49 UTC (permalink / raw)
To: Will Deacon
Cc: Shanker Donthineni, Catalin Marinas, Vladimir Murzin,
linux-arm-kernel, Mark Rutland, linux-kernel, linux-doc,
Vikram Sethi, Jason Sequeira
In-Reply-To: <aiq5VigmtZq9GlAm@willie-the-truck>
On Thu, Jun 11, 2026 at 02:34:14PM +0100, Will Deacon wrote:
> I still reckon you should do something with the memcpy-to-io routines.
> A simple option could be to make dgh() a dmb on parts with the erratum?
> That at least moves the barrier out of the loop.
AFAIK only callers that know they are using WC memory should be
calling dgh(), and in that case we know it is NORMAL-NC and we don't
need a different barrier.
Other random users calling memcpy_to_io functions on real IO don't
have to do dgh(), and AFAIK it doesn't do anything on the Device
memory types?
Jason
^ permalink raw reply
* [PATCH v6 00/20] nfsd: add support for CB_NOTIFY callbacks in directory delegations
From: Jeff Layton @ 2026-06-11 17:50 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Anna Schumaker, Jonathan Corbet, Shuah Khan,
Chuck Lever
Cc: Steven Rostedt, Alexander Aring, Amir Goldstein, Jan Kara,
Alexander Viro, Christian Brauner, Calum Mackay, linux-kernel,
linux-doc, linux-nfs, Jeff Layton
This version of the patchset fixes up yet more problems that Sashiko
and Chuck flagged during review. Progress!
Please consider for v7.3. Original cover letter follows:
---------------------------------8<------------------------------------
This patchset builds on the directory delegation work we did a few
months ago to add support for CB_NOTIFY callbacks for some events. In
particular, creates, unlinks and renames. The server also sends updated
directory attributes in the notifications. With this support, the client
can register interest in a directory and get notifications about changes
within it without losing its lease.
The series starts with patches to allow the vfs to ignore certain types
of events on directories. nfsd can then request these sorts of
delegations on directories, and then set up inotify watches on the
directory to trigger sending CB_NOTIFY events.
This has mainly been tested with pynfs, with some new testcases that
I'll be posting soon. The patches seem to work fine with those tests,
but I don't think we'll want to merge them until we have a complete
client-side implementation to test against.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Changes in v6:
- fold earlier fix series into their respective patches
- tighten up RCU handling on fi_deleg_file
- move nfsd_fsnotify_recalc_mask() to filecache.c
- encoding failure now triggers deleg recall
- take snapshot of new dentry name when creating event
- Link to v5: https://lore.kernel.org/r/20260522-dir-deleg-v5-0-542cddfad576@kernel.org
Changes in v5:
- properly free dir delegation when alloc_pages_bulk() fails
- handle nfsd_file with no mark in nfsd_fsnotify_recalc_mask()
- nfsd_get_dir_deleg() should use stable nf pointer instead of
depending on fi_deleg_file
- use GFP_NOFS in alloc_nfsd_notify_event() since it's called with locks
held
- nfsd_handle_dir_event() tracepoint now handles NULL pointers safely
- Link to v4: https://lore.kernel.org/r/20260522-dir-deleg-v4-0-2acb883ac6bc@kernel.org
Changes in v4:
- Rebase onto Chuck's nfsd-testing branch. Minor contextual fixups.
- Link to v3: https://lore.kernel.org/r/20260428-dir-deleg-v3-0-5a0780ba9def@kernel.org
Changes in v3:
- Fix error handling in alloc_init_dir_deleg()
- Link to v2: https://lore.kernel.org/r/20260416-dir-deleg-v2-0-851426a550f6@kernel.org
Changes in v2:
- Fix __break_lease handling with different lease types on flc_lease list
- Add FSNOTIFY_EVENT_RENAME data type to properly handle cross-directory rename events
- Display fsnotify mask symbolically in tracepoints
- New tracepoint in fsnotify()
- Recalc fsnotify mask after unlocking lease instead of before
- Don't notify client that is making the changes
- After sending CB_NOTIFY, requeue if new events came in while running
- Document removal of NFS4_VERIFIER_SIZE/NFS4_FHSIZE from UAPI headers
- Properly release nfsd_dir_fsnotify_group on server shutdown
- Link to v1: https://lore.kernel.org/r/20260407-dir-deleg-v1-0-aaf68c478abd@kernel.org
---
Jeff Layton (20):
nfsd: check fl_lmops in nfsd_breaker_owns_lease()
nfsd: add protocol support for CB_NOTIFY
nfs_common: add new NOTIFY4_* flags proposed in RFC8881bis
nfsd: allow nfsd to get a dir lease with an ignore mask
nfsd: update the fsnotify mark when setting or removing a dir delegation
nfsd: make nfsd4_callback_ops->prepare operation bool return
nfsd: add callback encoding and decoding linkages for CB_NOTIFY
nfsd: use RCU to protect fi_deleg_file
nfsd: add data structures for handling CB_NOTIFY
nfsd: add notification handlers for dir events
nfsd: apply the notify mask to the delegation when requested
nfsd: add helper to marshal a fattr4 from completed args
nfsd: allow nfsd4_encode_fattr4_change() to work with no export
nfsd: send basic file attributes in CB_NOTIFY
nfsd: allow encoding a filehandle into fattr4 without a svc_fh
nfsd: add a fi_connectable flag to struct nfs4_file
nfsd: add the filehandle to returned attributes in CB_NOTIFY
nfsd: properly track requested child attributes
nfsd: track requested dir attributes
nfsd: add support to CB_NOTIFY for dir attribute changes
Documentation/sunrpc/xdr/nfs4_1.x | 262 ++++++++++++++-
fs/nfsd/filecache.c | 122 ++++++-
fs/nfsd/filecache.h | 3 +
fs/nfsd/nfs4callback.c | 97 +++++-
fs/nfsd/nfs4layouts.c | 10 +-
fs/nfsd/nfs4proc.c | 17 +
fs/nfsd/nfs4state.c | 590 ++++++++++++++++++++++++++++++----
fs/nfsd/nfs4xdr.c | 330 +++++++++++++++++--
fs/nfsd/nfs4xdr_gen.c | 601 ++++++++++++++++++++++++++++++++++-
fs/nfsd/nfs4xdr_gen.h | 20 +-
fs/nfsd/nfsfh.c | 10 +-
fs/nfsd/nfsfh.h | 1 +
fs/nfsd/state.h | 85 ++++-
fs/nfsd/trace.h | 24 ++
fs/nfsd/xdr4.h | 5 +
fs/nfsd/xdr4cb.h | 12 +
include/linux/nfs4.h | 127 --------
include/linux/sunrpc/xdrgen/nfs4_1.h | 291 ++++++++++++++++-
include/uapi/linux/nfs4.h | 2 -
19 files changed, 2339 insertions(+), 270 deletions(-)
---
base-commit: 8defc3ed26a2b4c8677ce2106c2c92cd26ef1316
change-id: 20260325-dir-deleg-339066dd1017
Best regards,
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply
* [PATCH v6 01/20] nfsd: check fl_lmops in nfsd_breaker_owns_lease()
From: Jeff Layton @ 2026-06-11 17:50 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Anna Schumaker, Jonathan Corbet, Shuah Khan,
Chuck Lever
Cc: Steven Rostedt, Alexander Aring, Amir Goldstein, Jan Kara,
Alexander Viro, Christian Brauner, Calum Mackay, linux-kernel,
linux-doc, linux-nfs, Jeff Layton
In-Reply-To: <20260611-dir-deleg-v6-0-4c45080e5f3f@kernel.org>
Any lease created by nfsd will have its fl_lmops set to
nfsd_lease_mng_ops. Do a quick check for that first when testing whether
the lease breaker owns the lease.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
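Note for readers: the fast path added here is a pointer-identity test on the
lease's operations table. The sketch below shows the same pattern in plain
userspace C. The types and names (`struct lease_ops`, `struct lease`,
`breaker_owns_lease()`) are stand-ins invented for illustration, not the
kernel's `struct file_lease` or `struct lease_manager_operations`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for an operations table; each subsystem has one static instance. */
struct lease_ops {
	const char *name;
};

struct lease {
	const struct lease_ops *ops;	/* identifies which subsystem made the lease */
	int owner_id;
};

static const struct lease_ops my_lease_ops = { .name = "mine" };
static const struct lease_ops other_lease_ops = { .name = "other" };

static bool breaker_owns_lease(const struct lease *l, int breaker_id)
{
	/*
	 * Cheap check first: a lease whose ops table is not ours was
	 * created by someone else, so the breaker cannot own it.
	 */
	if (l->ops != &my_lease_ops)
		return false;

	/* Only now do the more expensive ownership comparison. */
	return l->owner_id == breaker_id;
}
```

Comparing the address of the ops table avoids interpreting any
subsystem-private fields of a lease that another subsystem created.
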
fs/nfsd/nfs4state.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e59aec57e9e8..489558bf124c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -91,6 +91,8 @@ static void _free_cpntf_state_locked(struct nfsd_net *nn, struct nfs4_cpntf_stat
static void nfsd4_file_hash_remove(struct nfs4_file *fi);
static void deleg_reaper(struct nfsd_net *nn);
+static const struct lease_manager_operations nfsd_lease_mng_ops;
+
/* Locking: */
enum nfsd4_st_mutex_lock_subclass {
@@ -5734,6 +5736,10 @@ static bool nfsd_breaker_owns_lease(struct file_lease *fl)
struct svc_rqst *rqst;
struct nfs4_client *clp;
+ /* Only nfsd leases */
+ if (fl->fl_lmops != &nfsd_lease_mng_ops)
+ return false;
+
rqst = nfsd_current_rqst();
if (!nfsd_v4client(rqst))
return false;
--
2.54.0
^ permalink raw reply related
* [PATCH v6 02/20] nfsd: add protocol support for CB_NOTIFY
From: Jeff Layton @ 2026-06-11 17:50 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Anna Schumaker, Jonathan Corbet, Shuah Khan,
Chuck Lever
Cc: Steven Rostedt, Alexander Aring, Amir Goldstein, Jan Kara,
Alexander Viro, Christian Brauner, Calum Mackay, linux-kernel,
linux-doc, linux-nfs, Jeff Layton
In-Reply-To: <20260611-dir-deleg-v6-0-4c45080e5f3f@kernel.org>
Add the necessary bits to nfs4_1.x and remove the duplicate definitions
from nfs4.h and the uapi nfs4 header. Regenerate the xdr files.
Note that regenerating these files caused conflicts with the definitions
of NFS4_VERIFIER_SIZE and NFS4_FHSIZE in include/uapi/linux/nfs4.h.
These constants are defined by the RFC, and are not part of the kernel
API. They have been removed. Userspace consumers who require those
constants should plan to get them from more authoritative sources.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
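Note for userspace consumers: one possible way to cope with the removal is to
define the two constants locally, using the values given in nfs4_1.x in this
patch (8 and 128). The sketch below does that and uses NFS4_FHSIZE to bound a
decode of `nfs_fh4`, which is an XDR variable-length opaque. `struct my_nfs_fh4`
and `decode_nfs_fh4()` are hypothetical names for illustration and are not
part of this series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local fallbacks; values match the constants in nfs4_1.x. */
#ifndef NFS4_VERIFIER_SIZE
#define NFS4_VERIFIER_SIZE 8
#endif
#ifndef NFS4_FHSIZE
#define NFS4_FHSIZE 128
#endif

struct my_nfs_fh4 {
	uint32_t len;
	uint8_t data[NFS4_FHSIZE];
};

/*
 * Decode an XDR variable-length opaque: a 4-byte big-endian length,
 * the data, then zero padding up to a multiple of 4 bytes.
 * Returns the number of bytes consumed, or 0 on malformed input.
 */
static size_t decode_nfs_fh4(const uint8_t *buf, size_t buflen,
			     struct my_nfs_fh4 *fh)
{
	uint32_t len;
	size_t padded;

	if (buflen < 4)
		return 0;
	len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	      ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];

	/* Enforce the <NFS4_FHSIZE> bound from the typedef. */
	if (len > NFS4_FHSIZE)
		return 0;

	padded = ((size_t)len + 3) & ~(size_t)3;
	if (buflen - 4 < padded)
		return 0;

	fh->len = len;
	memcpy(fh->data, buf + 4, len);
	return 4 + padded;
}
```

The `#ifndef` guards keep the same source building against both older headers
that still export the constants and newer ones that do not.
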
Documentation/sunrpc/xdr/nfs4_1.x | 250 ++++++++++++++-
fs/nfsd/nfs4xdr_gen.c | 590 ++++++++++++++++++++++++++++++++++-
fs/nfsd/nfs4xdr_gen.h | 20 +-
fs/nfsd/trace.h | 1 +
include/linux/nfs4.h | 127 --------
include/linux/sunrpc/xdrgen/nfs4_1.h | 280 ++++++++++++++++-
include/uapi/linux/nfs4.h | 2 -
7 files changed, 1129 insertions(+), 141 deletions(-)
diff --git a/Documentation/sunrpc/xdr/nfs4_1.x b/Documentation/sunrpc/xdr/nfs4_1.x
index 5b45547b2ebc..632f5b579c39 100644
--- a/Documentation/sunrpc/xdr/nfs4_1.x
+++ b/Documentation/sunrpc/xdr/nfs4_1.x
@@ -45,19 +45,165 @@ pragma header nfs4;
/*
* Basic typedefs for RFC 1832 data type definitions
*/
-typedef hyper int64_t;
-typedef unsigned int uint32_t;
+typedef int int32_t;
+typedef unsigned int uint32_t;
+typedef hyper int64_t;
+typedef unsigned hyper uint64_t;
+
+const NFS4_VERIFIER_SIZE = 8;
+const NFS4_FHSIZE = 128;
+
+enum nfsstat4 {
+ NFS4_OK = 0, /* everything is okay */
+ NFS4ERR_PERM = 1, /* caller not privileged */
+ NFS4ERR_NOENT = 2, /* no such file/directory */
+ NFS4ERR_IO = 5, /* hard I/O error */
+ NFS4ERR_NXIO = 6, /* no such device */
+ NFS4ERR_ACCESS = 13, /* access denied */
+ NFS4ERR_EXIST = 17, /* file already exists */
+ NFS4ERR_XDEV = 18, /* different filesystems */
+
+ /*
+ * Please do not allocate value 19; it was used in NFSv3
+ * and we do not want a value in NFSv3 to have a different
+ * meaning in NFSv4.x.
+ */
+
+ NFS4ERR_NOTDIR = 20, /* should be a directory */
+ NFS4ERR_ISDIR = 21, /* should not be directory */
+ NFS4ERR_INVAL = 22, /* invalid argument */
+ NFS4ERR_FBIG = 27, /* file exceeds server max */
+ NFS4ERR_NOSPC = 28, /* no space on filesystem */
+ NFS4ERR_ROFS = 30, /* read-only filesystem */
+ NFS4ERR_MLINK = 31, /* too many hard links */
+ NFS4ERR_NAMETOOLONG = 63, /* name exceeds server max */
+ NFS4ERR_NOTEMPTY = 66, /* directory not empty */
+ NFS4ERR_DQUOT = 69, /* hard quota limit reached*/
+ NFS4ERR_STALE = 70, /* file no longer exists */
+ NFS4ERR_BADHANDLE = 10001,/* Illegal filehandle */
+ NFS4ERR_BAD_COOKIE = 10003,/* READDIR cookie is stale */
+ NFS4ERR_NOTSUPP = 10004,/* operation not supported */
+ NFS4ERR_TOOSMALL = 10005,/* response limit exceeded */
+ NFS4ERR_SERVERFAULT = 10006,/* undefined server error */
+ NFS4ERR_BADTYPE = 10007,/* type invalid for CREATE */
+ NFS4ERR_DELAY = 10008,/* file "busy" - retry */
+ NFS4ERR_SAME = 10009,/* nverify says attrs same */
+ NFS4ERR_DENIED = 10010,/* lock unavailable */
+ NFS4ERR_EXPIRED = 10011,/* lock lease expired */
+ NFS4ERR_LOCKED = 10012,/* I/O failed due to lock */
+ NFS4ERR_GRACE = 10013,/* in grace period */
+ NFS4ERR_FHEXPIRED = 10014,/* filehandle expired */
+ NFS4ERR_SHARE_DENIED = 10015,/* share reserve denied */
+ NFS4ERR_WRONGSEC = 10016,/* wrong security flavor */
+ NFS4ERR_CLID_INUSE = 10017,/* clientid in use */
+
+ /* NFS4ERR_RESOURCE is not a valid error in NFSv4.1 */
+ NFS4ERR_RESOURCE = 10018,/* resource exhaustion */
+
+ NFS4ERR_MOVED = 10019,/* filesystem relocated */
+ NFS4ERR_NOFILEHANDLE = 10020,/* current FH is not set */
+ NFS4ERR_MINOR_VERS_MISMATCH= 10021,/* minor vers not supp */
+ NFS4ERR_STALE_CLIENTID = 10022,/* server has rebooted */
+ NFS4ERR_STALE_STATEID = 10023,/* server has rebooted */
+ NFS4ERR_OLD_STATEID = 10024,/* state is out of sync */
+ NFS4ERR_BAD_STATEID = 10025,/* incorrect stateid */
+ NFS4ERR_BAD_SEQID = 10026,/* request is out of seq. */
+ NFS4ERR_NOT_SAME = 10027,/* verify - attrs not same */
+ NFS4ERR_LOCK_RANGE = 10028,/* overlapping lock range */
+ NFS4ERR_SYMLINK = 10029,/* should be file/directory*/
+ NFS4ERR_RESTOREFH = 10030,/* no saved filehandle */
+ NFS4ERR_LEASE_MOVED = 10031,/* some filesystem moved */
+ NFS4ERR_ATTRNOTSUPP = 10032,/* recommended attr not sup*/
+ NFS4ERR_NO_GRACE = 10033,/* reclaim outside of grace*/
+ NFS4ERR_RECLAIM_BAD = 10034,/* reclaim error at server */
+ NFS4ERR_RECLAIM_CONFLICT= 10035,/* conflict on reclaim */
+ NFS4ERR_BADXDR = 10036,/* XDR decode failed */
+ NFS4ERR_LOCKS_HELD = 10037,/* file locks held at CLOSE*/
+ NFS4ERR_OPENMODE = 10038,/* conflict in OPEN and I/O*/
+ NFS4ERR_BADOWNER = 10039,/* owner translation bad */
+ NFS4ERR_BADCHAR = 10040,/* utf-8 char not supported*/
+ NFS4ERR_BADNAME = 10041,/* name not supported */
+ NFS4ERR_BAD_RANGE = 10042,/* lock range not supported*/
+ NFS4ERR_LOCK_NOTSUPP = 10043,/* no atomic up/downgrade */
+ NFS4ERR_OP_ILLEGAL = 10044,/* undefined operation */
+ NFS4ERR_DEADLOCK = 10045,/* file locking deadlock */
+ NFS4ERR_FILE_OPEN = 10046,/* open file blocks op. */
+ NFS4ERR_ADMIN_REVOKED = 10047,/* lockowner state revoked */
+ NFS4ERR_CB_PATH_DOWN = 10048,/* callback path down */
+
+ /* NFSv4.1 errors start here. */
+
+ NFS4ERR_BADIOMODE = 10049,
+ NFS4ERR_BADLAYOUT = 10050,
+ NFS4ERR_BAD_SESSION_DIGEST = 10051,
+ NFS4ERR_BADSESSION = 10052,
+ NFS4ERR_BADSLOT = 10053,
+ NFS4ERR_COMPLETE_ALREADY = 10054,
+ NFS4ERR_CONN_NOT_BOUND_TO_SESSION = 10055,
+ NFS4ERR_DELEG_ALREADY_WANTED = 10056,
+ NFS4ERR_BACK_CHAN_BUSY = 10057,/*backchan reqs outstanding*/
+ NFS4ERR_LAYOUTTRYLATER = 10058,
+ NFS4ERR_LAYOUTUNAVAILABLE = 10059,
+ NFS4ERR_NOMATCHING_LAYOUT = 10060,
+ NFS4ERR_RECALLCONFLICT = 10061,
+ NFS4ERR_UNKNOWN_LAYOUTTYPE = 10062,
+ NFS4ERR_SEQ_MISORDERED = 10063,/* unexpected seq.ID in req*/
+ NFS4ERR_SEQUENCE_POS = 10064,/* [CB_]SEQ. op not 1st op */
+ NFS4ERR_REQ_TOO_BIG = 10065,/* request too big */
+ NFS4ERR_REP_TOO_BIG = 10066,/* reply too big */
+ NFS4ERR_REP_TOO_BIG_TO_CACHE =10067,/* rep. not all cached*/
+ NFS4ERR_RETRY_UNCACHED_REP =10068,/* retry & rep. uncached*/
+ NFS4ERR_UNSAFE_COMPOUND =10069,/* retry/recovery too hard */
+ NFS4ERR_TOO_MANY_OPS = 10070,/*too many ops in [CB_]COMP*/
+ NFS4ERR_OP_NOT_IN_SESSION =10071,/* op needs [CB_]SEQ. op */
+ NFS4ERR_HASH_ALG_UNSUPP = 10072, /* hash alg. not supp. */
+ /* Error 10073 is unused. */
+ NFS4ERR_CLIENTID_BUSY = 10074,/* clientid has state */
+ NFS4ERR_PNFS_IO_HOLE = 10075,/* IO to _SPARSE file hole */
+ NFS4ERR_SEQ_FALSE_RETRY= 10076,/* Retry != original req. */
+ NFS4ERR_BAD_HIGH_SLOT = 10077,/* req has bad highest_slot*/
+ NFS4ERR_DEADSESSION = 10078,/*new req sent to dead sess*/
+ NFS4ERR_ENCR_ALG_UNSUPP= 10079,/* encr alg. not supp. */
+ NFS4ERR_PNFS_NO_LAYOUT = 10080,/* I/O without a layout */
+ NFS4ERR_NOT_ONLY_OP = 10081,/* addl ops not allowed */
+ NFS4ERR_WRONG_CRED = 10082,/* op done by wrong cred */
+ NFS4ERR_WRONG_TYPE = 10083,/* op on wrong type object */
+ NFS4ERR_DIRDELEG_UNAVAIL=10084,/* delegation not avail. */
+ NFS4ERR_REJECT_DELEG = 10085,/* cb rejected delegation */
+ NFS4ERR_RETURNCONFLICT = 10086,/* layout get before return*/
+ NFS4ERR_DELEG_REVOKED = 10087, /* deleg./layout revoked */
+ NFS4ERR_PARTNER_NOTSUPP = 10088,
+ NFS4ERR_PARTNER_NO_AUTH = 10089,
+ NFS4ERR_UNION_NOTSUPP = 10090,
+ NFS4ERR_OFFLOAD_DENIED = 10091,
+ NFS4ERR_WRONG_LFS = 10092,
+ NFS4ERR_BADLABEL = 10093,
+ NFS4ERR_OFFLOAD_NO_REQS = 10094,
+ NFS4ERR_NOXATTR = 10095,
+ NFS4ERR_XATTR2BIG = 10096,
+
+ /* always set this to one more than the last one in the enum */
+ NFS4ERR_FIRST_FREE = 10097
+};
/*
* Basic data types
*/
+typedef opaque attrlist4<>;
typedef uint32_t bitmap4<>;
+typedef opaque verifier4[NFS4_VERIFIER_SIZE];
+typedef uint64_t nfs_cookie4;
+typedef opaque nfs_fh4<NFS4_FHSIZE>;
typedef opaque utf8string<>;
typedef utf8string utf8str_cis;
typedef utf8string utf8str_cs;
typedef utf8string utf8str_mixed;
+typedef utf8str_cs component4;
+typedef utf8str_cs linktext4;
+typedef component4 pathname4<>;
+
/*
* Timeval
*/
@@ -66,6 +212,21 @@ struct nfstime4 {
uint32_t nseconds;
};
+/*
+ * File attribute container
+ */
+struct fattr4 {
+ bitmap4 attrmask;
+ attrlist4 attr_vals;
+};
+
+/*
+ * Stateid
+ */
+struct stateid4 {
+ uint32_t seqid;
+ opaque other[12];
+};
/*
* The following content was extracted from draft-ietf-nfsv4-delstid
@@ -245,3 +406,88 @@ const FATTR4_ACL_TRUEFORM = 89;
const FATTR4_ACL_TRUEFORM_SCOPE = 90;
const FATTR4_POSIX_DEFAULT_ACL = 91;
const FATTR4_POSIX_ACCESS_ACL = 92;
+
+/*
+ * Directory notification types.
+ */
+enum notify_type4 {
+ NOTIFY4_CHANGE_CHILD_ATTRS = 0,
+ NOTIFY4_CHANGE_DIR_ATTRS = 1,
+ NOTIFY4_REMOVE_ENTRY = 2,
+ NOTIFY4_ADD_ENTRY = 3,
+ NOTIFY4_RENAME_ENTRY = 4,
+ NOTIFY4_CHANGE_COOKIE_VERIFIER = 5
+};
+
+/* Changed entry information. */
+struct notify_entry4 {
+ component4 ne_file;
+ fattr4 ne_attrs;
+};
+
+/* Previous entry information */
+struct prev_entry4 {
+ notify_entry4 pe_prev_entry;
+ /* what READDIR returned for this entry */
+ nfs_cookie4 pe_prev_entry_cookie;
+};
+
+struct notify_remove4 {
+ notify_entry4 nrm_old_entry;
+ nfs_cookie4 nrm_old_entry_cookie;
+};
+pragma public notify_remove4;
+
+struct notify_add4 {
+ /*
+ * Information on object
+ * possibly renamed over.
+ */
+ notify_remove4 nad_old_entry<1>;
+ notify_entry4 nad_new_entry;
+ /* what READDIR would have returned for this entry */
+ nfs_cookie4 nad_new_entry_cookie<1>;
+ prev_entry4 nad_prev_entry<1>;
+ bool nad_last_entry;
+};
+pragma public notify_add4;
+
+struct notify_attr4 {
+ notify_entry4 na_changed_entry;
+};
+pragma public notify_attr4;
+
+struct notify_rename4 {
+ notify_remove4 nrn_old_entry;
+ notify_add4 nrn_new_entry;
+};
+pragma public notify_rename4;
+
+struct notify_verifier4 {
+ verifier4 nv_old_cookieverf;
+ verifier4 nv_new_cookieverf;
+};
+
+/*
+ * Objects of type notify_<>4 and
+ * notify_device_<>4 are encoded in this.
+ */
+typedef opaque notifylist4<>;
+
+struct notify4 {
+ /* composed from notify_type4 or notify_deviceid_type4 */
+ bitmap4 notify_mask;
+ notifylist4 notify_vals;
+};
+
+struct CB_NOTIFY4args {
+ stateid4 cna_stateid;
+ nfs_fh4 cna_fh;
+ notify4 cna_changes<>;
+};
+pragma public CB_NOTIFY4args;
+
+struct CB_NOTIFY4res {
+ nfsstat4 cnr_status;
+};
+pragma public CB_NOTIFY4res;
diff --git a/fs/nfsd/nfs4xdr_gen.c b/fs/nfsd/nfs4xdr_gen.c
index 824497051b87..5e656d6bbb8e 100644
--- a/fs/nfsd/nfs4xdr_gen.c
+++ b/fs/nfsd/nfs4xdr_gen.c
@@ -1,16 +1,16 @@
// SPDX-License-Identifier: GPL-2.0
// Generated by xdrgen. Manual edits will be lost.
// XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x
-// XDR specification modification time: Thu Jan 8 23:12:07 2026
+// XDR specification modification time: Wed Mar 25 11:39:22 2026
#include <linux/sunrpc/svc.h>
#include "nfs4xdr_gen.h"
static bool __maybe_unused
-xdrgen_decode_int64_t(struct xdr_stream *xdr, int64_t *ptr)
+xdrgen_decode_int32_t(struct xdr_stream *xdr, int32_t *ptr)
{
- return xdrgen_decode_hyper(xdr, ptr);
+ return xdrgen_decode_int(xdr, ptr);
}
static bool __maybe_unused
@@ -19,6 +19,155 @@ xdrgen_decode_uint32_t(struct xdr_stream *xdr, uint32_t *ptr)
return xdrgen_decode_unsigned_int(xdr, ptr);
}
+static bool __maybe_unused
+xdrgen_decode_int64_t(struct xdr_stream *xdr, int64_t *ptr)
+{
+ return xdrgen_decode_hyper(xdr, ptr);
+}
+
+static bool __maybe_unused
+xdrgen_decode_uint64_t(struct xdr_stream *xdr, uint64_t *ptr)
+{
+ return xdrgen_decode_unsigned_hyper(xdr, ptr);
+}
+
+static bool __maybe_unused
+xdrgen_decode_nfsstat4(struct xdr_stream *xdr, nfsstat4 *ptr)
+{
+ u32 val;
+
+ if (xdr_stream_decode_u32(xdr, &val) < 0)
+ return false;
+ /* Compiler may optimize to a range check for dense enums */
+ switch (val) {
+ case NFS4_OK:
+ case NFS4ERR_PERM:
+ case NFS4ERR_NOENT:
+ case NFS4ERR_IO:
+ case NFS4ERR_NXIO:
+ case NFS4ERR_ACCESS:
+ case NFS4ERR_EXIST:
+ case NFS4ERR_XDEV:
+ case NFS4ERR_NOTDIR:
+ case NFS4ERR_ISDIR:
+ case NFS4ERR_INVAL:
+ case NFS4ERR_FBIG:
+ case NFS4ERR_NOSPC:
+ case NFS4ERR_ROFS:
+ case NFS4ERR_MLINK:
+ case NFS4ERR_NAMETOOLONG:
+ case NFS4ERR_NOTEMPTY:
+ case NFS4ERR_DQUOT:
+ case NFS4ERR_STALE:
+ case NFS4ERR_BADHANDLE:
+ case NFS4ERR_BAD_COOKIE:
+ case NFS4ERR_NOTSUPP:
+ case NFS4ERR_TOOSMALL:
+ case NFS4ERR_SERVERFAULT:
+ case NFS4ERR_BADTYPE:
+ case NFS4ERR_DELAY:
+ case NFS4ERR_SAME:
+ case NFS4ERR_DENIED:
+ case NFS4ERR_EXPIRED:
+ case NFS4ERR_LOCKED:
+ case NFS4ERR_GRACE:
+ case NFS4ERR_FHEXPIRED:
+ case NFS4ERR_SHARE_DENIED:
+ case NFS4ERR_WRONGSEC:
+ case NFS4ERR_CLID_INUSE:
+ case NFS4ERR_RESOURCE:
+ case NFS4ERR_MOVED:
+ case NFS4ERR_NOFILEHANDLE:
+ case NFS4ERR_MINOR_VERS_MISMATCH:
+ case NFS4ERR_STALE_CLIENTID:
+ case NFS4ERR_STALE_STATEID:
+ case NFS4ERR_OLD_STATEID:
+ case NFS4ERR_BAD_STATEID:
+ case NFS4ERR_BAD_SEQID:
+ case NFS4ERR_NOT_SAME:
+ case NFS4ERR_LOCK_RANGE:
+ case NFS4ERR_SYMLINK:
+ case NFS4ERR_RESTOREFH:
+ case NFS4ERR_LEASE_MOVED:
+ case NFS4ERR_ATTRNOTSUPP:
+ case NFS4ERR_NO_GRACE:
+ case NFS4ERR_RECLAIM_BAD:
+ case NFS4ERR_RECLAIM_CONFLICT:
+ case NFS4ERR_BADXDR:
+ case NFS4ERR_LOCKS_HELD:
+ case NFS4ERR_OPENMODE:
+ case NFS4ERR_BADOWNER:
+ case NFS4ERR_BADCHAR:
+ case NFS4ERR_BADNAME:
+ case NFS4ERR_BAD_RANGE:
+ case NFS4ERR_LOCK_NOTSUPP:
+ case NFS4ERR_OP_ILLEGAL:
+ case NFS4ERR_DEADLOCK:
+ case NFS4ERR_FILE_OPEN:
+ case NFS4ERR_ADMIN_REVOKED:
+ case NFS4ERR_CB_PATH_DOWN:
+ case NFS4ERR_BADIOMODE:
+ case NFS4ERR_BADLAYOUT:
+ case NFS4ERR_BAD_SESSION_DIGEST:
+ case NFS4ERR_BADSESSION:
+ case NFS4ERR_BADSLOT:
+ case NFS4ERR_COMPLETE_ALREADY:
+ case NFS4ERR_CONN_NOT_BOUND_TO_SESSION:
+ case NFS4ERR_DELEG_ALREADY_WANTED:
+ case NFS4ERR_BACK_CHAN_BUSY:
+ case NFS4ERR_LAYOUTTRYLATER:
+ case NFS4ERR_LAYOUTUNAVAILABLE:
+ case NFS4ERR_NOMATCHING_LAYOUT:
+ case NFS4ERR_RECALLCONFLICT:
+ case NFS4ERR_UNKNOWN_LAYOUTTYPE:
+ case NFS4ERR_SEQ_MISORDERED:
+ case NFS4ERR_SEQUENCE_POS:
+ case NFS4ERR_REQ_TOO_BIG:
+ case NFS4ERR_REP_TOO_BIG:
+ case NFS4ERR_REP_TOO_BIG_TO_CACHE:
+ case NFS4ERR_RETRY_UNCACHED_REP:
+ case NFS4ERR_UNSAFE_COMPOUND:
+ case NFS4ERR_TOO_MANY_OPS:
+ case NFS4ERR_OP_NOT_IN_SESSION:
+ case NFS4ERR_HASH_ALG_UNSUPP:
+ case NFS4ERR_CLIENTID_BUSY:
+ case NFS4ERR_PNFS_IO_HOLE:
+ case NFS4ERR_SEQ_FALSE_RETRY:
+ case NFS4ERR_BAD_HIGH_SLOT:
+ case NFS4ERR_DEADSESSION:
+ case NFS4ERR_ENCR_ALG_UNSUPP:
+ case NFS4ERR_PNFS_NO_LAYOUT:
+ case NFS4ERR_NOT_ONLY_OP:
+ case NFS4ERR_WRONG_CRED:
+ case NFS4ERR_WRONG_TYPE:
+ case NFS4ERR_DIRDELEG_UNAVAIL:
+ case NFS4ERR_REJECT_DELEG:
+ case NFS4ERR_RETURNCONFLICT:
+ case NFS4ERR_DELEG_REVOKED:
+ case NFS4ERR_PARTNER_NOTSUPP:
+ case NFS4ERR_PARTNER_NO_AUTH:
+ case NFS4ERR_UNION_NOTSUPP:
+ case NFS4ERR_OFFLOAD_DENIED:
+ case NFS4ERR_WRONG_LFS:
+ case NFS4ERR_BADLABEL:
+ case NFS4ERR_OFFLOAD_NO_REQS:
+ case NFS4ERR_NOXATTR:
+ case NFS4ERR_XATTR2BIG:
+ case NFS4ERR_FIRST_FREE:
+ break;
+ default:
+ return false;
+ }
+ *ptr = val;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_decode_attrlist4(struct xdr_stream *xdr, attrlist4 *ptr)
+{
+ return xdrgen_decode_opaque(xdr, ptr, 0);
+}
+
static bool __maybe_unused
xdrgen_decode_bitmap4(struct xdr_stream *xdr, bitmap4 *ptr)
{
@@ -30,6 +179,24 @@ xdrgen_decode_bitmap4(struct xdr_stream *xdr, bitmap4 *ptr)
return true;
}
+static bool __maybe_unused
+xdrgen_decode_verifier4(struct xdr_stream *xdr, verifier4 *ptr)
+{
+ return xdr_stream_decode_opaque_fixed(xdr, ptr, NFS4_VERIFIER_SIZE) == 0;
+}
+
+static bool __maybe_unused
+xdrgen_decode_nfs_cookie4(struct xdr_stream *xdr, nfs_cookie4 *ptr)
+{
+ return xdrgen_decode_uint64_t(xdr, ptr);
+}
+
+static bool __maybe_unused
+xdrgen_decode_nfs_fh4(struct xdr_stream *xdr, nfs_fh4 *ptr)
+{
+ return xdrgen_decode_opaque(xdr, ptr, NFS4_FHSIZE);
+}
+
static bool __maybe_unused
xdrgen_decode_utf8string(struct xdr_stream *xdr, utf8string *ptr)
{
@@ -54,6 +221,29 @@ xdrgen_decode_utf8str_mixed(struct xdr_stream *xdr, utf8str_mixed *ptr)
return xdrgen_decode_utf8string(xdr, ptr);
}
+static bool __maybe_unused
+xdrgen_decode_component4(struct xdr_stream *xdr, component4 *ptr)
+{
+ return xdrgen_decode_utf8str_cs(xdr, ptr);
+}
+
+static bool __maybe_unused
+xdrgen_decode_linktext4(struct xdr_stream *xdr, linktext4 *ptr)
+{
+ return xdrgen_decode_utf8str_cs(xdr, ptr);
+}
+
+static bool __maybe_unused
+xdrgen_decode_pathname4(struct xdr_stream *xdr, pathname4 *ptr)
+{
+ if (xdr_stream_decode_u32(xdr, &ptr->count) < 0)
+ return false;
+ for (u32 i = 0; i < ptr->count; i++)
+ if (!xdrgen_decode_component4(xdr, &ptr->element[i]))
+ return false;
+ return true;
+}
+
static bool __maybe_unused
xdrgen_decode_nfstime4(struct xdr_stream *xdr, struct nfstime4 *ptr)
{
@@ -64,6 +254,26 @@ xdrgen_decode_nfstime4(struct xdr_stream *xdr, struct nfstime4 *ptr)
return true;
}
+static bool __maybe_unused
+xdrgen_decode_fattr4(struct xdr_stream *xdr, struct fattr4 *ptr)
+{
+ if (!xdrgen_decode_bitmap4(xdr, &ptr->attrmask))
+ return false;
+ if (!xdrgen_decode_attrlist4(xdr, &ptr->attr_vals))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_decode_stateid4(struct xdr_stream *xdr, struct stateid4 *ptr)
+{
+ if (!xdrgen_decode_uint32_t(xdr, &ptr->seqid))
+ return false;
+ if (xdr_stream_decode_opaque_fixed(xdr, ptr->other, 12) < 0)
+ return false;
+ return true;
+}
+
static bool __maybe_unused
xdrgen_decode_fattr4_offline(struct xdr_stream *xdr, fattr4_offline *ptr)
{
@@ -366,9 +576,160 @@ xdrgen_decode_fattr4_posix_access_acl(struct xdr_stream *xdr, fattr4_posix_acces
*/
static bool __maybe_unused
-xdrgen_encode_int64_t(struct xdr_stream *xdr, const int64_t value)
+xdrgen_decode_notify_type4(struct xdr_stream *xdr, notify_type4 *ptr)
{
- return xdrgen_encode_hyper(xdr, value);
+ u32 val;
+
+ if (xdr_stream_decode_u32(xdr, &val) < 0)
+ return false;
+ /* Compiler may optimize to a range check for dense enums */
+ switch (val) {
+ case NOTIFY4_CHANGE_CHILD_ATTRS:
+ case NOTIFY4_CHANGE_DIR_ATTRS:
+ case NOTIFY4_REMOVE_ENTRY:
+ case NOTIFY4_ADD_ENTRY:
+ case NOTIFY4_RENAME_ENTRY:
+ case NOTIFY4_CHANGE_COOKIE_VERIFIER:
+ break;
+ default:
+ return false;
+ }
+ *ptr = val;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_decode_notify_entry4(struct xdr_stream *xdr, struct notify_entry4 *ptr)
+{
+ if (!xdrgen_decode_component4(xdr, &ptr->ne_file))
+ return false;
+ if (!xdrgen_decode_fattr4(xdr, &ptr->ne_attrs))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_decode_prev_entry4(struct xdr_stream *xdr, struct prev_entry4 *ptr)
+{
+ if (!xdrgen_decode_notify_entry4(xdr, &ptr->pe_prev_entry))
+ return false;
+ if (!xdrgen_decode_nfs_cookie4(xdr, &ptr->pe_prev_entry_cookie))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_decode_notify_remove4(struct xdr_stream *xdr, struct notify_remove4 *ptr)
+{
+ if (!xdrgen_decode_notify_entry4(xdr, &ptr->nrm_old_entry))
+ return false;
+ if (!xdrgen_decode_nfs_cookie4(xdr, &ptr->nrm_old_entry_cookie))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_decode_notify_add4(struct xdr_stream *xdr, struct notify_add4 *ptr)
+{
+ if (xdr_stream_decode_u32(xdr, &ptr->nad_old_entry.count) < 0)
+ return false;
+ if (ptr->nad_old_entry.count > 1)
+ return false;
+ for (u32 i = 0; i < ptr->nad_old_entry.count; i++)
+ if (!xdrgen_decode_notify_remove4(xdr, &ptr->nad_old_entry.element[i]))
+ return false;
+ if (!xdrgen_decode_notify_entry4(xdr, &ptr->nad_new_entry))
+ return false;
+ if (xdr_stream_decode_u32(xdr, &ptr->nad_new_entry_cookie.count) < 0)
+ return false;
+ if (ptr->nad_new_entry_cookie.count > 1)
+ return false;
+ for (u32 i = 0; i < ptr->nad_new_entry_cookie.count; i++)
+ if (!xdrgen_decode_nfs_cookie4(xdr, &ptr->nad_new_entry_cookie.element[i]))
+ return false;
+ if (xdr_stream_decode_u32(xdr, &ptr->nad_prev_entry.count) < 0)
+ return false;
+ if (ptr->nad_prev_entry.count > 1)
+ return false;
+ for (u32 i = 0; i < ptr->nad_prev_entry.count; i++)
+ if (!xdrgen_decode_prev_entry4(xdr, &ptr->nad_prev_entry.element[i]))
+ return false;
+ if (!xdrgen_decode_bool(xdr, &ptr->nad_last_entry))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_decode_notify_attr4(struct xdr_stream *xdr, struct notify_attr4 *ptr)
+{
+ if (!xdrgen_decode_notify_entry4(xdr, &ptr->na_changed_entry))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_decode_notify_rename4(struct xdr_stream *xdr, struct notify_rename4 *ptr)
+{
+ if (!xdrgen_decode_notify_remove4(xdr, &ptr->nrn_old_entry))
+ return false;
+ if (!xdrgen_decode_notify_add4(xdr, &ptr->nrn_new_entry))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_decode_notify_verifier4(struct xdr_stream *xdr, struct notify_verifier4 *ptr)
+{
+ if (!xdrgen_decode_verifier4(xdr, &ptr->nv_old_cookieverf))
+ return false;
+ if (!xdrgen_decode_verifier4(xdr, &ptr->nv_new_cookieverf))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_decode_notifylist4(struct xdr_stream *xdr, notifylist4 *ptr)
+{
+ return xdrgen_decode_opaque(xdr, ptr, 0);
+}
+
+static bool __maybe_unused
+xdrgen_decode_notify4(struct xdr_stream *xdr, struct notify4 *ptr)
+{
+ if (!xdrgen_decode_bitmap4(xdr, &ptr->notify_mask))
+ return false;
+ if (!xdrgen_decode_notifylist4(xdr, &ptr->notify_vals))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_decode_CB_NOTIFY4args(struct xdr_stream *xdr, struct CB_NOTIFY4args *ptr)
+{
+ if (!xdrgen_decode_stateid4(xdr, &ptr->cna_stateid))
+ return false;
+ if (!xdrgen_decode_nfs_fh4(xdr, &ptr->cna_fh))
+ return false;
+ if (xdr_stream_decode_u32(xdr, &ptr->cna_changes.count) < 0)
+ return false;
+ for (u32 i = 0; i < ptr->cna_changes.count; i++)
+ if (!xdrgen_decode_notify4(xdr, &ptr->cna_changes.element[i]))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_decode_CB_NOTIFY4res(struct xdr_stream *xdr, struct CB_NOTIFY4res *ptr)
+{
+ if (!xdrgen_decode_nfsstat4(xdr, &ptr->cnr_status))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_encode_int32_t(struct xdr_stream *xdr, const int32_t value)
+{
+ return xdrgen_encode_int(xdr, value);
}
static bool __maybe_unused
@@ -377,6 +738,30 @@ xdrgen_encode_uint32_t(struct xdr_stream *xdr, const uint32_t value)
return xdrgen_encode_unsigned_int(xdr, value);
}
+static bool __maybe_unused
+xdrgen_encode_int64_t(struct xdr_stream *xdr, const int64_t value)
+{
+ return xdrgen_encode_hyper(xdr, value);
+}
+
+static bool __maybe_unused
+xdrgen_encode_uint64_t(struct xdr_stream *xdr, const uint64_t value)
+{
+ return xdrgen_encode_unsigned_hyper(xdr, value);
+}
+
+static bool __maybe_unused
+xdrgen_encode_nfsstat4(struct xdr_stream *xdr, nfsstat4 value)
+{
+ return xdr_stream_encode_u32(xdr, value) == XDR_UNIT;
+}
+
+static bool __maybe_unused
+xdrgen_encode_attrlist4(struct xdr_stream *xdr, const attrlist4 value)
+{
+ return xdr_stream_encode_opaque(xdr, value.data, value.len) >= 0;
+}
+
static bool __maybe_unused
xdrgen_encode_bitmap4(struct xdr_stream *xdr, const bitmap4 value)
{
@@ -388,6 +773,24 @@ xdrgen_encode_bitmap4(struct xdr_stream *xdr, const bitmap4 value)
return true;
}
+static bool __maybe_unused
+xdrgen_encode_verifier4(struct xdr_stream *xdr, const verifier4 value)
+{
+ return xdr_stream_encode_opaque_fixed(xdr, value, NFS4_VERIFIER_SIZE) >= 0;
+}
+
+static bool __maybe_unused
+xdrgen_encode_nfs_cookie4(struct xdr_stream *xdr, const nfs_cookie4 value)
+{
+ return xdrgen_encode_uint64_t(xdr, value);
+}
+
+static bool __maybe_unused
+xdrgen_encode_nfs_fh4(struct xdr_stream *xdr, const nfs_fh4 value)
+{
+ return xdr_stream_encode_opaque(xdr, value.data, value.len) >= 0;
+}
+
static bool __maybe_unused
xdrgen_encode_utf8string(struct xdr_stream *xdr, const utf8string value)
{
@@ -412,6 +815,29 @@ xdrgen_encode_utf8str_mixed(struct xdr_stream *xdr, const utf8str_mixed value)
return xdrgen_encode_utf8string(xdr, value);
}
+static bool __maybe_unused
+xdrgen_encode_component4(struct xdr_stream *xdr, const component4 value)
+{
+ return xdrgen_encode_utf8str_cs(xdr, value);
+}
+
+static bool __maybe_unused
+xdrgen_encode_linktext4(struct xdr_stream *xdr, const linktext4 value)
+{
+ return xdrgen_encode_utf8str_cs(xdr, value);
+}
+
+static bool __maybe_unused
+xdrgen_encode_pathname4(struct xdr_stream *xdr, const pathname4 value)
+{
+ if (xdr_stream_encode_u32(xdr, value.count) != XDR_UNIT)
+ return false;
+ for (u32 i = 0; i < value.count; i++)
+ if (!xdrgen_encode_component4(xdr, value.element[i]))
+ return false;
+ return true;
+}
+
static bool __maybe_unused
xdrgen_encode_nfstime4(struct xdr_stream *xdr, const struct nfstime4 *value)
{
@@ -422,6 +848,26 @@ xdrgen_encode_nfstime4(struct xdr_stream *xdr, const struct nfstime4 *value)
return true;
}
+static bool __maybe_unused
+xdrgen_encode_fattr4(struct xdr_stream *xdr, const struct fattr4 *value)
+{
+ if (!xdrgen_encode_bitmap4(xdr, value->attrmask))
+ return false;
+ if (!xdrgen_encode_attrlist4(xdr, value->attr_vals))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_encode_stateid4(struct xdr_stream *xdr, const struct stateid4 *value)
+{
+ if (!xdrgen_encode_uint32_t(xdr, value->seqid))
+ return false;
+ if (xdr_stream_encode_opaque_fixed(xdr, value->other, 12) < 0)
+ return false;
+ return true;
+}
+
static bool __maybe_unused
xdrgen_encode_fattr4_offline(struct xdr_stream *xdr, const fattr4_offline value)
{
@@ -567,3 +1013,137 @@ xdrgen_encode_fattr4_posix_access_acl(struct xdr_stream *xdr, const fattr4_posix
return false;
return true;
}
+
+static bool __maybe_unused
+xdrgen_encode_notify_type4(struct xdr_stream *xdr, notify_type4 value)
+{
+ return xdr_stream_encode_u32(xdr, value) == XDR_UNIT;
+}
+
+static bool __maybe_unused
+xdrgen_encode_notify_entry4(struct xdr_stream *xdr, const struct notify_entry4 *value)
+{
+ if (!xdrgen_encode_component4(xdr, value->ne_file))
+ return false;
+ if (!xdrgen_encode_fattr4(xdr, &value->ne_attrs))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_encode_prev_entry4(struct xdr_stream *xdr, const struct prev_entry4 *value)
+{
+ if (!xdrgen_encode_notify_entry4(xdr, &value->pe_prev_entry))
+ return false;
+ if (!xdrgen_encode_nfs_cookie4(xdr, value->pe_prev_entry_cookie))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_encode_notify_remove4(struct xdr_stream *xdr, const struct notify_remove4 *value)
+{
+ if (!xdrgen_encode_notify_entry4(xdr, &value->nrm_old_entry))
+ return false;
+ if (!xdrgen_encode_nfs_cookie4(xdr, value->nrm_old_entry_cookie))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_encode_notify_add4(struct xdr_stream *xdr, const struct notify_add4 *value)
+{
+ if (value->nad_old_entry.count > 1)
+ return false;
+ if (xdr_stream_encode_u32(xdr, value->nad_old_entry.count) != XDR_UNIT)
+ return false;
+ for (u32 i = 0; i < value->nad_old_entry.count; i++)
+ if (!xdrgen_encode_notify_remove4(xdr, &value->nad_old_entry.element[i]))
+ return false;
+ if (!xdrgen_encode_notify_entry4(xdr, &value->nad_new_entry))
+ return false;
+ if (value->nad_new_entry_cookie.count > 1)
+ return false;
+ if (xdr_stream_encode_u32(xdr, value->nad_new_entry_cookie.count) != XDR_UNIT)
+ return false;
+ for (u32 i = 0; i < value->nad_new_entry_cookie.count; i++)
+ if (!xdrgen_encode_nfs_cookie4(xdr, value->nad_new_entry_cookie.element[i]))
+ return false;
+ if (value->nad_prev_entry.count > 1)
+ return false;
+ if (xdr_stream_encode_u32(xdr, value->nad_prev_entry.count) != XDR_UNIT)
+ return false;
+ for (u32 i = 0; i < value->nad_prev_entry.count; i++)
+ if (!xdrgen_encode_prev_entry4(xdr, &value->nad_prev_entry.element[i]))
+ return false;
+ if (!xdrgen_encode_bool(xdr, value->nad_last_entry))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_encode_notify_attr4(struct xdr_stream *xdr, const struct notify_attr4 *value)
+{
+ if (!xdrgen_encode_notify_entry4(xdr, &value->na_changed_entry))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_encode_notify_rename4(struct xdr_stream *xdr, const struct notify_rename4 *value)
+{
+ if (!xdrgen_encode_notify_remove4(xdr, &value->nrn_old_entry))
+ return false;
+ if (!xdrgen_encode_notify_add4(xdr, &value->nrn_new_entry))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_encode_notify_verifier4(struct xdr_stream *xdr, const struct notify_verifier4 *value)
+{
+ if (!xdrgen_encode_verifier4(xdr, value->nv_old_cookieverf))
+ return false;
+ if (!xdrgen_encode_verifier4(xdr, value->nv_new_cookieverf))
+ return false;
+ return true;
+}
+
+static bool __maybe_unused
+xdrgen_encode_notifylist4(struct xdr_stream *xdr, const notifylist4 value)
+{
+ return xdr_stream_encode_opaque(xdr, value.data, value.len) >= 0;
+}
+
+static bool __maybe_unused
+xdrgen_encode_notify4(struct xdr_stream *xdr, const struct notify4 *value)
+{
+ if (!xdrgen_encode_bitmap4(xdr, value->notify_mask))
+ return false;
+ if (!xdrgen_encode_notifylist4(xdr, value->notify_vals))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_encode_CB_NOTIFY4args(struct xdr_stream *xdr, const struct CB_NOTIFY4args *value)
+{
+ if (!xdrgen_encode_stateid4(xdr, &value->cna_stateid))
+ return false;
+ if (!xdrgen_encode_nfs_fh4(xdr, value->cna_fh))
+ return false;
+ if (xdr_stream_encode_u32(xdr, value->cna_changes.count) != XDR_UNIT)
+ return false;
+ for (u32 i = 0; i < value->cna_changes.count; i++)
+ if (!xdrgen_encode_notify4(xdr, &value->cna_changes.element[i]))
+ return false;
+ return true;
+}
+
+bool
+xdrgen_encode_CB_NOTIFY4res(struct xdr_stream *xdr, const struct CB_NOTIFY4res *value)
+{
+ if (!xdrgen_encode_nfsstat4(xdr, value->cnr_status))
+ return false;
+ return true;
+}
diff --git a/fs/nfsd/nfs4xdr_gen.h b/fs/nfsd/nfs4xdr_gen.h
index 1c487f1a11ab..503fe2ccba51 100644
--- a/fs/nfsd/nfs4xdr_gen.h
+++ b/fs/nfsd/nfs4xdr_gen.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
/* Generated by xdrgen. Manual edits will be lost. */
/* XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x */
-/* XDR specification modification time: Thu Jan 8 23:12:07 2026 */
+/* XDR specification modification time: Wed Mar 25 11:39:22 2026 */
#ifndef _LINUX_XDRGEN_NFS4_1_DECL_H
#define _LINUX_XDRGEN_NFS4_1_DECL_H
@@ -32,4 +32,22 @@ bool xdrgen_decode_posixaceperm4(struct xdr_stream *xdr, posixaceperm4 *ptr);
bool xdrgen_encode_posixaceperm4(struct xdr_stream *xdr, const posixaceperm4 value);
+bool xdrgen_decode_notify_remove4(struct xdr_stream *xdr, struct notify_remove4 *ptr);
+bool xdrgen_encode_notify_remove4(struct xdr_stream *xdr, const struct notify_remove4 *value);
+
+bool xdrgen_decode_notify_add4(struct xdr_stream *xdr, struct notify_add4 *ptr);
+bool xdrgen_encode_notify_add4(struct xdr_stream *xdr, const struct notify_add4 *value);
+
+bool xdrgen_decode_notify_attr4(struct xdr_stream *xdr, struct notify_attr4 *ptr);
+bool xdrgen_encode_notify_attr4(struct xdr_stream *xdr, const struct notify_attr4 *value);
+
+bool xdrgen_decode_notify_rename4(struct xdr_stream *xdr, struct notify_rename4 *ptr);
+bool xdrgen_encode_notify_rename4(struct xdr_stream *xdr, const struct notify_rename4 *value);
+
+bool xdrgen_decode_CB_NOTIFY4args(struct xdr_stream *xdr, struct CB_NOTIFY4args *ptr);
+bool xdrgen_encode_CB_NOTIFY4args(struct xdr_stream *xdr, const struct CB_NOTIFY4args *value);
+
+bool xdrgen_decode_CB_NOTIFY4res(struct xdr_stream *xdr, struct CB_NOTIFY4res *ptr);
+bool xdrgen_encode_CB_NOTIFY4res(struct xdr_stream *xdr, const struct CB_NOTIFY4res *value);
+
#endif /* _LINUX_XDRGEN_NFS4_1_DECL_H */
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 33953d38314e..171e8fdbafb6 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -1677,6 +1677,7 @@ TRACE_EVENT(nfsd_cb_setup_err,
{ OP_CB_RECALL, "CB_RECALL" }, \
{ OP_CB_LAYOUTRECALL, "CB_LAYOUTRECALL" }, \
{ OP_CB_RECALL_ANY, "CB_RECALL_ANY" }, \
+ { OP_CB_NOTIFY, "CB_NOTIFY" }, \
{ OP_CB_NOTIFY_LOCK, "CB_NOTIFY_LOCK" }, \
{ OP_CB_OFFLOAD, "CB_OFFLOAD" })
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index d87be1f25273..44e5e9fa12e1 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -171,133 +171,6 @@ Needs to be updated if more operations are defined in future.*/
#define LAST_NFS42_OP OP_REMOVEXATTR
#define LAST_NFS4_OP LAST_NFS42_OP
-enum nfsstat4 {
- NFS4_OK = 0,
- NFS4ERR_PERM = 1,
- NFS4ERR_NOENT = 2,
- NFS4ERR_IO = 5,
- NFS4ERR_NXIO = 6,
- NFS4ERR_ACCESS = 13,
- NFS4ERR_EXIST = 17,
- NFS4ERR_XDEV = 18,
- /* Unused/reserved 19 */
- NFS4ERR_NOTDIR = 20,
- NFS4ERR_ISDIR = 21,
- NFS4ERR_INVAL = 22,
- NFS4ERR_FBIG = 27,
- NFS4ERR_NOSPC = 28,
- NFS4ERR_ROFS = 30,
- NFS4ERR_MLINK = 31,
- NFS4ERR_NAMETOOLONG = 63,
- NFS4ERR_NOTEMPTY = 66,
- NFS4ERR_DQUOT = 69,
- NFS4ERR_STALE = 70,
- NFS4ERR_BADHANDLE = 10001,
- NFS4ERR_BAD_COOKIE = 10003,
- NFS4ERR_NOTSUPP = 10004,
- NFS4ERR_TOOSMALL = 10005,
- NFS4ERR_SERVERFAULT = 10006,
- NFS4ERR_BADTYPE = 10007,
- NFS4ERR_DELAY = 10008,
- NFS4ERR_SAME = 10009,
- NFS4ERR_DENIED = 10010,
- NFS4ERR_EXPIRED = 10011,
- NFS4ERR_LOCKED = 10012,
- NFS4ERR_GRACE = 10013,
- NFS4ERR_FHEXPIRED = 10014,
- NFS4ERR_SHARE_DENIED = 10015,
- NFS4ERR_WRONGSEC = 10016,
- NFS4ERR_CLID_INUSE = 10017,
- NFS4ERR_RESOURCE = 10018,
- NFS4ERR_MOVED = 10019,
- NFS4ERR_NOFILEHANDLE = 10020,
- NFS4ERR_MINOR_VERS_MISMATCH = 10021,
- NFS4ERR_STALE_CLIENTID = 10022,
- NFS4ERR_STALE_STATEID = 10023,
- NFS4ERR_OLD_STATEID = 10024,
- NFS4ERR_BAD_STATEID = 10025,
- NFS4ERR_BAD_SEQID = 10026,
- NFS4ERR_NOT_SAME = 10027,
- NFS4ERR_LOCK_RANGE = 10028,
- NFS4ERR_SYMLINK = 10029,
- NFS4ERR_RESTOREFH = 10030,
- NFS4ERR_LEASE_MOVED = 10031,
- NFS4ERR_ATTRNOTSUPP = 10032,
- NFS4ERR_NO_GRACE = 10033,
- NFS4ERR_RECLAIM_BAD = 10034,
- NFS4ERR_RECLAIM_CONFLICT = 10035,
- NFS4ERR_BADXDR = 10036,
- NFS4ERR_LOCKS_HELD = 10037,
- NFS4ERR_OPENMODE = 10038,
- NFS4ERR_BADOWNER = 10039,
- NFS4ERR_BADCHAR = 10040,
- NFS4ERR_BADNAME = 10041,
- NFS4ERR_BAD_RANGE = 10042,
- NFS4ERR_LOCK_NOTSUPP = 10043,
- NFS4ERR_OP_ILLEGAL = 10044,
- NFS4ERR_DEADLOCK = 10045,
- NFS4ERR_FILE_OPEN = 10046,
- NFS4ERR_ADMIN_REVOKED = 10047,
- NFS4ERR_CB_PATH_DOWN = 10048,
-
- /* nfs41 */
- NFS4ERR_BADIOMODE = 10049,
- NFS4ERR_BADLAYOUT = 10050,
- NFS4ERR_BAD_SESSION_DIGEST = 10051,
- NFS4ERR_BADSESSION = 10052,
- NFS4ERR_BADSLOT = 10053,
- NFS4ERR_COMPLETE_ALREADY = 10054,
- NFS4ERR_CONN_NOT_BOUND_TO_SESSION = 10055,
- NFS4ERR_DELEG_ALREADY_WANTED = 10056,
- NFS4ERR_BACK_CHAN_BUSY = 10057, /* backchan reqs outstanding */
- NFS4ERR_LAYOUTTRYLATER = 10058,
- NFS4ERR_LAYOUTUNAVAILABLE = 10059,
- NFS4ERR_NOMATCHING_LAYOUT = 10060,
- NFS4ERR_RECALLCONFLICT = 10061,
- NFS4ERR_UNKNOWN_LAYOUTTYPE = 10062,
- NFS4ERR_SEQ_MISORDERED = 10063, /* unexpected seq.id in req */
- NFS4ERR_SEQUENCE_POS = 10064, /* [CB_]SEQ. op not 1st op */
- NFS4ERR_REQ_TOO_BIG = 10065, /* request too big */
- NFS4ERR_REP_TOO_BIG = 10066, /* reply too big */
- NFS4ERR_REP_TOO_BIG_TO_CACHE = 10067, /* rep. not all cached */
- NFS4ERR_RETRY_UNCACHED_REP = 10068, /* retry & rep. uncached */
- NFS4ERR_UNSAFE_COMPOUND = 10069, /* retry/recovery too hard */
- NFS4ERR_TOO_MANY_OPS = 10070, /* too many ops in [CB_]COMP */
- NFS4ERR_OP_NOT_IN_SESSION = 10071, /* op needs [CB_]SEQ. op */
- NFS4ERR_HASH_ALG_UNSUPP = 10072, /* hash alg. not supp. */
- /* Error 10073 is unused. */
- NFS4ERR_CLIENTID_BUSY = 10074, /* clientid has state */
- NFS4ERR_PNFS_IO_HOLE = 10075, /* IO to _SPARSE file hole */
- NFS4ERR_SEQ_FALSE_RETRY = 10076, /* retry not original */
- NFS4ERR_BAD_HIGH_SLOT = 10077, /* sequence arg bad */
- NFS4ERR_DEADSESSION = 10078, /* persistent session dead */
- NFS4ERR_ENCR_ALG_UNSUPP = 10079, /* SSV alg mismatch */
- NFS4ERR_PNFS_NO_LAYOUT = 10080, /* direct I/O with no layout */
- NFS4ERR_NOT_ONLY_OP = 10081, /* bad compound */
- NFS4ERR_WRONG_CRED = 10082, /* permissions:state change */
- NFS4ERR_WRONG_TYPE = 10083, /* current operation mismatch */
- NFS4ERR_DIRDELEG_UNAVAIL = 10084, /* no directory delegation */
- NFS4ERR_REJECT_DELEG = 10085, /* on callback */
- NFS4ERR_RETURNCONFLICT = 10086, /* outstanding layoutreturn */
- NFS4ERR_DELEG_REVOKED = 10087, /* deleg./layout revoked */
-
- /* nfs42 */
- NFS4ERR_PARTNER_NOTSUPP = 10088,
- NFS4ERR_PARTNER_NO_AUTH = 10089,
- NFS4ERR_UNION_NOTSUPP = 10090,
- NFS4ERR_OFFLOAD_DENIED = 10091,
- NFS4ERR_WRONG_LFS = 10092,
- NFS4ERR_BADLABEL = 10093,
- NFS4ERR_OFFLOAD_NO_REQS = 10094,
-
- /* xattr (RFC8276) */
- NFS4ERR_NOXATTR = 10095,
- NFS4ERR_XATTR2BIG = 10096,
-
- /* can be used for internal errors */
- NFS4ERR_FIRST_FREE
-};
-
/* error codes for internal client use */
#define NFS4ERR_RESET_TO_MDS 12001
#define NFS4ERR_RESET_TO_PNFS 12002
diff --git a/include/linux/sunrpc/xdrgen/nfs4_1.h b/include/linux/sunrpc/xdrgen/nfs4_1.h
index 4ac54bdbd335..f761c3ddb4c7 100644
--- a/include/linux/sunrpc/xdrgen/nfs4_1.h
+++ b/include/linux/sunrpc/xdrgen/nfs4_1.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
/* Generated by xdrgen. Manual edits will be lost. */
/* XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x */
-/* XDR specification modification time: Thu Jan 8 23:12:07 2026 */
+/* XDR specification modification time: Wed Mar 25 11:39:22 2026 */
#ifndef _LINUX_XDRGEN_NFS4_1_DEF_H
#define _LINUX_XDRGEN_NFS4_1_DEF_H
@@ -9,15 +9,150 @@
#include <linux/types.h>
#include <linux/sunrpc/xdrgen/_defs.h>
-typedef s64 int64_t;
+typedef s32 int32_t;
typedef u32 uint32_t;
+typedef s64 int64_t;
+
+typedef u64 uint64_t;
+
+enum { NFS4_VERIFIER_SIZE = 8 };
+
+enum { NFS4_FHSIZE = 128 };
+
+enum nfsstat4 {
+ NFS4_OK = 0,
+ NFS4ERR_PERM = 1,
+ NFS4ERR_NOENT = 2,
+ NFS4ERR_IO = 5,
+ NFS4ERR_NXIO = 6,
+ NFS4ERR_ACCESS = 13,
+ NFS4ERR_EXIST = 17,
+ NFS4ERR_XDEV = 18,
+ NFS4ERR_NOTDIR = 20,
+ NFS4ERR_ISDIR = 21,
+ NFS4ERR_INVAL = 22,
+ NFS4ERR_FBIG = 27,
+ NFS4ERR_NOSPC = 28,
+ NFS4ERR_ROFS = 30,
+ NFS4ERR_MLINK = 31,
+ NFS4ERR_NAMETOOLONG = 63,
+ NFS4ERR_NOTEMPTY = 66,
+ NFS4ERR_DQUOT = 69,
+ NFS4ERR_STALE = 70,
+ NFS4ERR_BADHANDLE = 10001,
+ NFS4ERR_BAD_COOKIE = 10003,
+ NFS4ERR_NOTSUPP = 10004,
+ NFS4ERR_TOOSMALL = 10005,
+ NFS4ERR_SERVERFAULT = 10006,
+ NFS4ERR_BADTYPE = 10007,
+ NFS4ERR_DELAY = 10008,
+ NFS4ERR_SAME = 10009,
+ NFS4ERR_DENIED = 10010,
+ NFS4ERR_EXPIRED = 10011,
+ NFS4ERR_LOCKED = 10012,
+ NFS4ERR_GRACE = 10013,
+ NFS4ERR_FHEXPIRED = 10014,
+ NFS4ERR_SHARE_DENIED = 10015,
+ NFS4ERR_WRONGSEC = 10016,
+ NFS4ERR_CLID_INUSE = 10017,
+ NFS4ERR_RESOURCE = 10018,
+ NFS4ERR_MOVED = 10019,
+ NFS4ERR_NOFILEHANDLE = 10020,
+ NFS4ERR_MINOR_VERS_MISMATCH = 10021,
+ NFS4ERR_STALE_CLIENTID = 10022,
+ NFS4ERR_STALE_STATEID = 10023,
+ NFS4ERR_OLD_STATEID = 10024,
+ NFS4ERR_BAD_STATEID = 10025,
+ NFS4ERR_BAD_SEQID = 10026,
+ NFS4ERR_NOT_SAME = 10027,
+ NFS4ERR_LOCK_RANGE = 10028,
+ NFS4ERR_SYMLINK = 10029,
+ NFS4ERR_RESTOREFH = 10030,
+ NFS4ERR_LEASE_MOVED = 10031,
+ NFS4ERR_ATTRNOTSUPP = 10032,
+ NFS4ERR_NO_GRACE = 10033,
+ NFS4ERR_RECLAIM_BAD = 10034,
+ NFS4ERR_RECLAIM_CONFLICT = 10035,
+ NFS4ERR_BADXDR = 10036,
+ NFS4ERR_LOCKS_HELD = 10037,
+ NFS4ERR_OPENMODE = 10038,
+ NFS4ERR_BADOWNER = 10039,
+ NFS4ERR_BADCHAR = 10040,
+ NFS4ERR_BADNAME = 10041,
+ NFS4ERR_BAD_RANGE = 10042,
+ NFS4ERR_LOCK_NOTSUPP = 10043,
+ NFS4ERR_OP_ILLEGAL = 10044,
+ NFS4ERR_DEADLOCK = 10045,
+ NFS4ERR_FILE_OPEN = 10046,
+ NFS4ERR_ADMIN_REVOKED = 10047,
+ NFS4ERR_CB_PATH_DOWN = 10048,
+ NFS4ERR_BADIOMODE = 10049,
+ NFS4ERR_BADLAYOUT = 10050,
+ NFS4ERR_BAD_SESSION_DIGEST = 10051,
+ NFS4ERR_BADSESSION = 10052,
+ NFS4ERR_BADSLOT = 10053,
+ NFS4ERR_COMPLETE_ALREADY = 10054,
+ NFS4ERR_CONN_NOT_BOUND_TO_SESSION = 10055,
+ NFS4ERR_DELEG_ALREADY_WANTED = 10056,
+ NFS4ERR_BACK_CHAN_BUSY = 10057,
+ NFS4ERR_LAYOUTTRYLATER = 10058,
+ NFS4ERR_LAYOUTUNAVAILABLE = 10059,
+ NFS4ERR_NOMATCHING_LAYOUT = 10060,
+ NFS4ERR_RECALLCONFLICT = 10061,
+ NFS4ERR_UNKNOWN_LAYOUTTYPE = 10062,
+ NFS4ERR_SEQ_MISORDERED = 10063,
+ NFS4ERR_SEQUENCE_POS = 10064,
+ NFS4ERR_REQ_TOO_BIG = 10065,
+ NFS4ERR_REP_TOO_BIG = 10066,
+ NFS4ERR_REP_TOO_BIG_TO_CACHE = 10067,
+ NFS4ERR_RETRY_UNCACHED_REP = 10068,
+ NFS4ERR_UNSAFE_COMPOUND = 10069,
+ NFS4ERR_TOO_MANY_OPS = 10070,
+ NFS4ERR_OP_NOT_IN_SESSION = 10071,
+ NFS4ERR_HASH_ALG_UNSUPP = 10072,
+ NFS4ERR_CLIENTID_BUSY = 10074,
+ NFS4ERR_PNFS_IO_HOLE = 10075,
+ NFS4ERR_SEQ_FALSE_RETRY = 10076,
+ NFS4ERR_BAD_HIGH_SLOT = 10077,
+ NFS4ERR_DEADSESSION = 10078,
+ NFS4ERR_ENCR_ALG_UNSUPP = 10079,
+ NFS4ERR_PNFS_NO_LAYOUT = 10080,
+ NFS4ERR_NOT_ONLY_OP = 10081,
+ NFS4ERR_WRONG_CRED = 10082,
+ NFS4ERR_WRONG_TYPE = 10083,
+ NFS4ERR_DIRDELEG_UNAVAIL = 10084,
+ NFS4ERR_REJECT_DELEG = 10085,
+ NFS4ERR_RETURNCONFLICT = 10086,
+ NFS4ERR_DELEG_REVOKED = 10087,
+ NFS4ERR_PARTNER_NOTSUPP = 10088,
+ NFS4ERR_PARTNER_NO_AUTH = 10089,
+ NFS4ERR_UNION_NOTSUPP = 10090,
+ NFS4ERR_OFFLOAD_DENIED = 10091,
+ NFS4ERR_WRONG_LFS = 10092,
+ NFS4ERR_BADLABEL = 10093,
+ NFS4ERR_OFFLOAD_NO_REQS = 10094,
+ NFS4ERR_NOXATTR = 10095,
+ NFS4ERR_XATTR2BIG = 10096,
+ NFS4ERR_FIRST_FREE = 10097,
+};
+
+typedef enum nfsstat4 nfsstat4;
+
+typedef opaque attrlist4;
+
typedef struct {
u32 count;
uint32_t *element;
} bitmap4;
+typedef u8 verifier4[NFS4_VERIFIER_SIZE];
+
+typedef uint64_t nfs_cookie4;
+
+typedef opaque nfs_fh4;
+
typedef opaque utf8string;
typedef utf8string utf8str_cis;
@@ -26,11 +161,30 @@ typedef utf8string utf8str_cs;
typedef utf8string utf8str_mixed;
+typedef utf8str_cs component4;
+
+typedef utf8str_cs linktext4;
+
+typedef struct {
+ u32 count;
+ component4 *element;
+} pathname4;
+
struct nfstime4 {
int64_t seconds;
uint32_t nseconds;
};
+struct fattr4 {
+ bitmap4 attrmask;
+ attrlist4 attr_vals;
+};
+
+struct stateid4 {
+ uint32_t seqid;
+ u8 other[12];
+};
+
typedef bool fattr4_offline;
enum { FATTR4_OFFLINE = 83 };
@@ -216,11 +370,98 @@ enum { FATTR4_POSIX_DEFAULT_ACL = 91 };
enum { FATTR4_POSIX_ACCESS_ACL = 92 };
-#define NFS4_int64_t_sz \
- (XDR_hyper)
+enum notify_type4 {
+ NOTIFY4_CHANGE_CHILD_ATTRS = 0,
+ NOTIFY4_CHANGE_DIR_ATTRS = 1,
+ NOTIFY4_REMOVE_ENTRY = 2,
+ NOTIFY4_ADD_ENTRY = 3,
+ NOTIFY4_RENAME_ENTRY = 4,
+ NOTIFY4_CHANGE_COOKIE_VERIFIER = 5,
+};
+
+typedef enum notify_type4 notify_type4;
+
+struct notify_entry4 {
+ component4 ne_file;
+ struct fattr4 ne_attrs;
+};
+
+struct prev_entry4 {
+ struct notify_entry4 pe_prev_entry;
+ nfs_cookie4 pe_prev_entry_cookie;
+};
+
+struct notify_remove4 {
+ struct notify_entry4 nrm_old_entry;
+ nfs_cookie4 nrm_old_entry_cookie;
+};
+
+struct notify_add4 {
+ struct {
+ u32 count;
+ struct notify_remove4 *element;
+ } nad_old_entry;
+ struct notify_entry4 nad_new_entry;
+ struct {
+ u32 count;
+ nfs_cookie4 *element;
+ } nad_new_entry_cookie;
+ struct {
+ u32 count;
+ struct prev_entry4 *element;
+ } nad_prev_entry;
+ bool nad_last_entry;
+};
+
+struct notify_attr4 {
+ struct notify_entry4 na_changed_entry;
+};
+
+struct notify_rename4 {
+ struct notify_remove4 nrn_old_entry;
+ struct notify_add4 nrn_new_entry;
+};
+
+struct notify_verifier4 {
+ verifier4 nv_old_cookieverf;
+ verifier4 nv_new_cookieverf;
+};
+
+typedef opaque notifylist4;
+
+struct notify4 {
+ bitmap4 notify_mask;
+ notifylist4 notify_vals;
+};
+
+struct CB_NOTIFY4args {
+ struct stateid4 cna_stateid;
+ nfs_fh4 cna_fh;
+ struct {
+ u32 count;
+ struct notify4 *element;
+ } cna_changes;
+};
+
+struct CB_NOTIFY4res {
+ nfsstat4 cnr_status;
+};
+
+#define NFS4_int32_t_sz \
+ (XDR_int)
#define NFS4_uint32_t_sz \
(XDR_unsigned_int)
+#define NFS4_int64_t_sz \
+ (XDR_hyper)
+#define NFS4_uint64_t_sz \
+ (XDR_unsigned_hyper)
+#define NFS4_nfsstat4_sz (XDR_int)
+#define NFS4_attrlist4_sz (XDR_unsigned_int)
#define NFS4_bitmap4_sz (XDR_unsigned_int)
+#define NFS4_verifier4_sz (XDR_QUADLEN(NFS4_VERIFIER_SIZE))
+#define NFS4_nfs_cookie4_sz \
+ (NFS4_uint64_t_sz)
+#define NFS4_nfs_fh4_sz (XDR_unsigned_int + XDR_QUADLEN(NFS4_FHSIZE))
#define NFS4_utf8string_sz (XDR_unsigned_int)
#define NFS4_utf8str_cis_sz \
(NFS4_utf8string_sz)
@@ -228,8 +469,17 @@ enum { FATTR4_POSIX_ACCESS_ACL = 92 };
(NFS4_utf8string_sz)
#define NFS4_utf8str_mixed_sz \
(NFS4_utf8string_sz)
+#define NFS4_component4_sz \
+ (NFS4_utf8str_cs_sz)
+#define NFS4_linktext4_sz \
+ (NFS4_utf8str_cs_sz)
+#define NFS4_pathname4_sz (XDR_unsigned_int)
#define NFS4_nfstime4_sz \
(NFS4_int64_t_sz + NFS4_uint32_t_sz)
+#define NFS4_fattr4_sz \
+ (NFS4_bitmap4_sz + NFS4_attrlist4_sz)
+#define NFS4_stateid4_sz \
+ (NFS4_uint32_t_sz + XDR_QUADLEN(12))
#define NFS4_fattr4_offline_sz \
(XDR_bool)
#define NFS4_open_arguments4_sz \
@@ -259,5 +509,27 @@ enum { FATTR4_POSIX_ACCESS_ACL = 92 };
(NFS4_aclscope4_sz)
#define NFS4_fattr4_posix_default_acl_sz (XDR_unsigned_int)
#define NFS4_fattr4_posix_access_acl_sz (XDR_unsigned_int)
+#define NFS4_notify_type4_sz (XDR_int)
+#define NFS4_notify_entry4_sz \
+ (NFS4_component4_sz + NFS4_fattr4_sz)
+#define NFS4_prev_entry4_sz \
+ (NFS4_notify_entry4_sz + NFS4_nfs_cookie4_sz)
+#define NFS4_notify_remove4_sz \
+ (NFS4_notify_entry4_sz + NFS4_nfs_cookie4_sz)
+#define NFS4_notify_add4_sz \
+ (XDR_unsigned_int + (1 * (NFS4_notify_remove4_sz)) + NFS4_notify_entry4_sz + XDR_unsigned_int + (1 * (NFS4_nfs_cookie4_sz)) + XDR_unsigned_int + (1 * (NFS4_prev_entry4_sz)) + XDR_bool)
+#define NFS4_notify_attr4_sz \
+ (NFS4_notify_entry4_sz)
+#define NFS4_notify_rename4_sz \
+ (NFS4_notify_remove4_sz + NFS4_notify_add4_sz)
+#define NFS4_notify_verifier4_sz \
+ (NFS4_verifier4_sz + NFS4_verifier4_sz)
+#define NFS4_notifylist4_sz (XDR_unsigned_int)
+#define NFS4_notify4_sz \
+ (NFS4_bitmap4_sz + NFS4_notifylist4_sz)
+#define NFS4_CB_NOTIFY4args_sz \
+ (NFS4_stateid4_sz + NFS4_nfs_fh4_sz + XDR_unsigned_int)
+#define NFS4_CB_NOTIFY4res_sz \
+ (NFS4_nfsstat4_sz)
#endif /* _LINUX_XDRGEN_NFS4_1_DEF_H */
diff --git a/include/uapi/linux/nfs4.h b/include/uapi/linux/nfs4.h
index 4273e0249fcb..289205b53a08 100644
--- a/include/uapi/linux/nfs4.h
+++ b/include/uapi/linux/nfs4.h
@@ -17,11 +17,9 @@
#include <linux/types.h>
#define NFS4_BITMAP_SIZE 3
-#define NFS4_VERIFIER_SIZE 8
#define NFS4_STATEID_SEQID_SIZE 4
#define NFS4_STATEID_OTHER_SIZE 12
#define NFS4_STATEID_SIZE (NFS4_STATEID_SEQID_SIZE + NFS4_STATEID_OTHER_SIZE)
-#define NFS4_FHSIZE 128
#define NFS4_MAXPATHLEN PATH_MAX
#define NFS4_MAXNAMLEN NAME_MAX
#define NFS4_OPAQUE_LIMIT 1024
--
2.54.0
^ permalink raw reply related
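
For readers less familiar with the wire format behind the generated encoders, the following is a minimal userspace sketch of what xdrgen_encode_stateid4() above produces: a big-endian 32-bit seqid followed by 12 opaque bytes, for 16 bytes in total (matching NFS4_stateid4_sz, i.e. 1 + XDR_QUADLEN(12) = 4 XDR units). The sketch_ names, the flat byte buffer, and the "return 0 on overflow" convention are assumptions made for illustration; the kernel code works on a struct xdr_stream instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_XDR_UNIT 4		/* XDR encodes data in 4-byte units */
#define SKETCH_STATEID_OTHER_SIZE 12	/* same length as other[] in the patch */

/* Hypothetical userspace mirror of struct stateid4 from the patch. */
struct sketch_stateid4 {
	uint32_t seqid;
	uint8_t other[SKETCH_STATEID_OTHER_SIZE];
};

/* Round a byte count up to a whole number of XDR units (result in bytes). */
static size_t sketch_xdr_pad(size_t len)
{
	return (len + SKETCH_XDR_UNIT - 1) / SKETCH_XDR_UNIT * SKETCH_XDR_UNIT;
}

/*
 * Encode @value into @buf. Returns the number of bytes written, or 0 if
 * @buf is too small (an illustrative convention, not the kernel's).
 */
static size_t sketch_encode_stateid4(uint8_t *buf, size_t buflen,
				     const struct sketch_stateid4 *value)
{
	size_t need = SKETCH_XDR_UNIT + sketch_xdr_pad(sizeof(value->other));

	assert(buf != NULL && value != NULL);
	if (buflen < need)
		return 0;

	/* 32-bit integers go on the wire most significant byte first. */
	buf[0] = (uint8_t)(value->seqid >> 24);
	buf[1] = (uint8_t)(value->seqid >> 16);
	buf[2] = (uint8_t)(value->seqid >> 8);
	buf[3] = (uint8_t)value->seqid;

	/* Fixed-length opaque data is copied as-is, with no length prefix. */
	memcpy(buf + SKETCH_XDR_UNIT, value->other, sizeof(value->other));
	return need;
}
```

Because 12 is already a multiple of 4, no padding bytes are needed here. A variable-length opaque such as nfs_fh4 would additionally carry a 4-byte length prefix and zero padding up to the next XDR unit.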
* [PATCH v6 03/20] nfs_common: add new NOTIFY4_* flags proposed in RFC8881bis
From: Jeff Layton @ 2026-06-11 17:50 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Anna Schumaker, Jonathan Corbet, Shuah Khan,
Chuck Lever
Cc: Steven Rostedt, Alexander Aring, Amir Goldstein, Jan Kara,
Alexander Viro, Christian Brauner, Calum Mackay, linux-kernel,
linux-doc, linux-nfs, Jeff Layton
In-Reply-To: <20260611-dir-deleg-v6-0-4c45080e5f3f@kernel.org>
RFC8881bis adds some new flags to GET_DIR_DELEGATION that later patches
will consume. In particular, Linux nfsd can't easily provide info about
directory cookies and ordering. The new flags allow it to omit that
information.
There is some risk here -- RFC8881bis is still a working group document,
and has been for years. The changes to directory delegations have been
stable for the last year or so, however, so the hope is that those parts
won't change (much).
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Documentation/sunrpc/xdr/nfs4_1.x | 14 +++++++++++++-
fs/nfsd/nfs4xdr_gen.c | 13 ++++++++++++-
fs/nfsd/nfs4xdr_gen.h | 2 +-
include/linux/sunrpc/xdrgen/nfs4_1.h | 13 ++++++++++++-
4 files changed, 38 insertions(+), 4 deletions(-)
diff --git a/Documentation/sunrpc/xdr/nfs4_1.x b/Documentation/sunrpc/xdr/nfs4_1.x
index 632f5b579c39..6039eb024e0e 100644
--- a/Documentation/sunrpc/xdr/nfs4_1.x
+++ b/Documentation/sunrpc/xdr/nfs4_1.x
@@ -416,7 +416,19 @@ enum notify_type4 {
NOTIFY4_REMOVE_ENTRY = 2,
NOTIFY4_ADD_ENTRY = 3,
NOTIFY4_RENAME_ENTRY = 4,
- NOTIFY4_CHANGE_COOKIE_VERIFIER = 5
+ NOTIFY4_CHANGE_COOKIE_VERIFIER = 5,
+ /* Proposed in RFC8881bis */
+ NOTIFY4_GFLAG_EXTEND = 6,
+ NOTIFY4_AUFLAG_VALID = 7,
+ NOTIFY4_AUFLAG_USER = 8,
+ NOTIFY4_AUFLAG_GROUP = 9,
+ NOTIFY4_AUFLAG_OTHER = 10,
+ NOTIFY4_CHANGE_AUTH = 11,
+ NOTIFY4_CFLAG_ORDER = 12,
+ NOTIFY4_AUFLAG_GANOW = 13,
+ NOTIFY4_AUFLAG_GALATER = 14,
+ NOTIFY4_CHANGE_GA = 15,
+ NOTIFY4_CHANGE_AMASK = 16
};
/* Changed entry information. */
diff --git a/fs/nfsd/nfs4xdr_gen.c b/fs/nfsd/nfs4xdr_gen.c
index 5e656d6bbb8e..80369139ef7e 100644
--- a/fs/nfsd/nfs4xdr_gen.c
+++ b/fs/nfsd/nfs4xdr_gen.c
@@ -1,7 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
// Generated by xdrgen. Manual edits will be lost.
// XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x
-// XDR specification modification time: Wed Mar 25 11:39:22 2026
+// XDR specification modification time: Wed Mar 25 11:40:02 2026
#include <linux/sunrpc/svc.h>
@@ -590,6 +590,17 @@ xdrgen_decode_notify_type4(struct xdr_stream *xdr, notify_type4 *ptr)
case NOTIFY4_ADD_ENTRY:
case NOTIFY4_RENAME_ENTRY:
case NOTIFY4_CHANGE_COOKIE_VERIFIER:
+ case NOTIFY4_GFLAG_EXTEND:
+ case NOTIFY4_AUFLAG_VALID:
+ case NOTIFY4_AUFLAG_USER:
+ case NOTIFY4_AUFLAG_GROUP:
+ case NOTIFY4_AUFLAG_OTHER:
+ case NOTIFY4_CHANGE_AUTH:
+ case NOTIFY4_CFLAG_ORDER:
+ case NOTIFY4_AUFLAG_GANOW:
+ case NOTIFY4_AUFLAG_GALATER:
+ case NOTIFY4_CHANGE_GA:
+ case NOTIFY4_CHANGE_AMASK:
break;
default:
return false;
diff --git a/fs/nfsd/nfs4xdr_gen.h b/fs/nfsd/nfs4xdr_gen.h
index 503fe2ccba51..092a1ed399c7 100644
--- a/fs/nfsd/nfs4xdr_gen.h
+++ b/fs/nfsd/nfs4xdr_gen.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
/* Generated by xdrgen. Manual edits will be lost. */
/* XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x */
-/* XDR specification modification time: Wed Mar 25 11:39:22 2026 */
+/* XDR specification modification time: Wed Mar 25 11:40:02 2026 */
#ifndef _LINUX_XDRGEN_NFS4_1_DECL_H
#define _LINUX_XDRGEN_NFS4_1_DECL_H
diff --git a/include/linux/sunrpc/xdrgen/nfs4_1.h b/include/linux/sunrpc/xdrgen/nfs4_1.h
index f761c3ddb4c7..537504069f24 100644
--- a/include/linux/sunrpc/xdrgen/nfs4_1.h
+++ b/include/linux/sunrpc/xdrgen/nfs4_1.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
/* Generated by xdrgen. Manual edits will be lost. */
/* XDR specification file: ../../Documentation/sunrpc/xdr/nfs4_1.x */
-/* XDR specification modification time: Wed Mar 25 11:39:22 2026 */
+/* XDR specification modification time: Wed Mar 25 11:40:02 2026 */
#ifndef _LINUX_XDRGEN_NFS4_1_DEF_H
#define _LINUX_XDRGEN_NFS4_1_DEF_H
@@ -377,6 +377,17 @@ enum notify_type4 {
NOTIFY4_ADD_ENTRY = 3,
NOTIFY4_RENAME_ENTRY = 4,
NOTIFY4_CHANGE_COOKIE_VERIFIER = 5,
+ NOTIFY4_GFLAG_EXTEND = 6,
+ NOTIFY4_AUFLAG_VALID = 7,
+ NOTIFY4_AUFLAG_USER = 8,
+ NOTIFY4_AUFLAG_GROUP = 9,
+ NOTIFY4_AUFLAG_OTHER = 10,
+ NOTIFY4_CHANGE_AUTH = 11,
+ NOTIFY4_CFLAG_ORDER = 12,
+ NOTIFY4_AUFLAG_GANOW = 13,
+ NOTIFY4_AUFLAG_GALATER = 14,
+ NOTIFY4_CHANGE_GA = 15,
+ NOTIFY4_CHANGE_AMASK = 16,
};
typedef enum notify_type4 notify_type4;
--
2.54.0
^ permalink raw reply related
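
The notify_type4 values above are used as bit numbers rather than as plain enumerators: patch 04 in this series tests them with BIT(NOTIFY4_...) against a notification word. As a rough userspace sketch of that convention, assuming the usual NFSv4 bitmap layout (bit n lives in 32-bit word n / 32 at position n % 32), the helpers below set and test such bits. The sketch_ and SKETCH_ names are hypothetical; only the numeric values are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of enum notify_type4; the values are copied from the patch. */
enum sketch_notify_type4 {
	SKETCH_NOTIFY4_REMOVE_ENTRY = 2,
	SKETCH_NOTIFY4_ADD_ENTRY = 3,
	SKETCH_NOTIFY4_RENAME_ENTRY = 4,
	SKETCH_NOTIFY4_CFLAG_ORDER = 12,
	SKETCH_NOTIFY4_CHANGE_AMASK = 16,
};

/* Set bit @bit in a bitmap of @nwords 32-bit words; out-of-range is a no-op. */
static void sketch_bitmap_set(uint32_t *words, size_t nwords, unsigned int bit)
{
	assert(words != NULL);
	if (bit / 32 < nwords)
		words[bit / 32] |= UINT32_C(1) << (bit % 32);
}

/* Return true if bit @bit is set; bits beyond the bitmap read as clear. */
static bool sketch_bitmap_test(const uint32_t *words, size_t nwords,
			       unsigned int bit)
{
	assert(words != NULL);
	return bit / 32 < nwords &&
	       (words[bit / 32] & (UINT32_C(1) << (bit % 32))) != 0;
}
```

All of the values added by this patch are below 32, so they fit in the first bitmap word.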
* [PATCH v6 04/20] nfsd: allow nfsd to get a dir lease with an ignore mask
From: Jeff Layton @ 2026-06-11 17:50 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Anna Schumaker, Jonathan Corbet, Shuah Khan,
Chuck Lever
Cc: Steven Rostedt, Alexander Aring, Amir Goldstein, Jan Kara,
Alexander Viro, Christian Brauner, Calum Mackay, linux-kernel,
linux-doc, linux-nfs, Jeff Layton
In-Reply-To: <20260611-dir-deleg-v6-0-4c45080e5f3f@kernel.org>
When requesting a directory lease, enable the FL_IGN_DIR_* bits that
correspond to the requested notification types.
In nfsd_get_dir_deleg(), gddr_notification[0] will ultimately represent
the notifications that will be provided to the client. For now, that
field is always set to 0. That will change once the upper layers are
ready to start ignoring certain events.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfs4state.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 489558bf124c..ae8505747dc2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6119,7 +6119,22 @@ static bool nfsd4_cb_channel_good(struct nfs4_client *clp)
return clp->cl_minorversion && clp->cl_cb_state == NFSD4_CB_UNKNOWN;
}
-static struct file_lease *nfs4_alloc_init_lease(struct nfs4_delegation *dp)
+static unsigned int
+nfsd_notify_to_ignore(u32 notify)
+{
+ unsigned int mask = 0;
+
+ if (notify & BIT(NOTIFY4_REMOVE_ENTRY))
+ mask |= FL_IGN_DIR_DELETE;
+ if (notify & BIT(NOTIFY4_ADD_ENTRY))
+ mask |= FL_IGN_DIR_CREATE;
+ if (notify & BIT(NOTIFY4_RENAME_ENTRY))
+ mask |= FL_IGN_DIR_RENAME;
+
+ return mask;
+}
+
+static struct file_lease *nfs4_alloc_init_lease(struct nfs4_delegation *dp, u32 notify)
{
struct file_lease *fl;
@@ -6127,7 +6142,7 @@ static struct file_lease *nfs4_alloc_init_lease(struct nfs4_delegation *dp)
if (!fl)
return NULL;
fl->fl_lmops = &nfsd_lease_mng_ops;
- fl->c.flc_flags = FL_DELEG;
+ fl->c.flc_flags = FL_DELEG | nfsd_notify_to_ignore(notify);
fl->c.flc_type = deleg_is_read(dp->dl_type) ? F_RDLCK : F_WRLCK;
fl->c.flc_owner = (fl_owner_t)dp;
fl->c.flc_pid = current->tgid;
@@ -6344,7 +6359,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
if (stp->st_stid.sc_export)
dp->dl_stid.sc_export = exp_get(stp->st_stid.sc_export);
- fl = nfs4_alloc_init_lease(dp);
+ fl = nfs4_alloc_init_lease(dp, 0);
if (!fl)
goto out_clnt_odstate;
@@ -9771,7 +9786,11 @@ nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
dp->dl_stid.sc_export =
exp_get(cstate->current_fh.fh_export);
- fl = nfs4_alloc_init_lease(dp);
+ /*
+ * NB: gddr_notification[0] represents the notifications that
+ * will be granted to the client
+ */
+ fl = nfs4_alloc_init_lease(dp, gdd->gddr_notification[0]);
if (!fl)
goto out_put_stid;
--
2.54.0
^ permalink raw reply related
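
To make the mapping in nfsd_notify_to_ignore() easier to experiment with, here is a userspace sketch of the same translation. The NOTIFY4_* bit numbers come from the XDR definitions earlier in this series; the SKETCH_FL_* values are placeholders chosen for illustration only, since the real FL_DELEG and FL_IGN_DIR_* definitions are not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BIT(n) (UINT32_C(1) << (n))

/* Bit numbers from enum notify_type4. */
enum {
	SKETCH_NOTIFY4_REMOVE_ENTRY = 2,
	SKETCH_NOTIFY4_ADD_ENTRY = 3,
	SKETCH_NOTIFY4_RENAME_ENTRY = 4,
};

/* Placeholder flag values; NOT the kernel's actual definitions. */
#define SKETCH_FL_DELEG			0x0001u
#define SKETCH_FL_IGN_DIR_CREATE	0x0100u
#define SKETCH_FL_IGN_DIR_DELETE	0x0200u
#define SKETCH_FL_IGN_DIR_RENAME	0x0400u

static_assert((SKETCH_FL_DELEG & (SKETCH_FL_IGN_DIR_CREATE |
				  SKETCH_FL_IGN_DIR_DELETE |
				  SKETCH_FL_IGN_DIR_RENAME)) == 0,
	      "placeholder flags must not overlap");

/* Translate granted notification bits into lease "ignore" flags. */
static unsigned int sketch_notify_to_ignore(uint32_t notify)
{
	unsigned int mask = 0;

	if (notify & SKETCH_BIT(SKETCH_NOTIFY4_REMOVE_ENTRY))
		mask |= SKETCH_FL_IGN_DIR_DELETE;
	if (notify & SKETCH_BIT(SKETCH_NOTIFY4_ADD_ENTRY))
		mask |= SKETCH_FL_IGN_DIR_CREATE;
	if (notify & SKETCH_BIT(SKETCH_NOTIFY4_RENAME_ENTRY))
		mask |= SKETCH_FL_IGN_DIR_RENAME;

	return mask;
}

/* Mirrors the flc_flags computation in nfs4_alloc_init_lease(). */
static unsigned int sketch_lease_flags(uint32_t notify)
{
	return SKETCH_FL_DELEG | sketch_notify_to_ignore(notify);
}
```

Passing 0, as nfs4_set_delegation() does, yields plain SKETCH_FL_DELEG, so existing file delegations keep their current behaviour. Notification bits with no corresponding ignore flag (for example NOTIFY4_CHANGE_COOKIE_VERIFIER, bit 5) are silently dropped by the translation.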
* [PATCH v6 05/20] nfsd: update the fsnotify mark when setting or removing a dir delegation
From: Jeff Layton @ 2026-06-11 17:50 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Anna Schumaker, Jonathan Corbet, Shuah Khan,
Chuck Lever
Cc: Steven Rostedt, Alexander Aring, Amir Goldstein, Jan Kara,
Alexander Viro, Christian Brauner, Calum Mackay, linux-kernel,
linux-doc, linux-nfs, Jeff Layton
In-Reply-To: <20260611-dir-deleg-v6-0-4c45080e5f3f@kernel.org>
Add a new helper function that updates the mask on the nfsd_file's
fsnotify_mark to be the union of the events required by all current
directory delegations on an inode.
Call it whenever a directory delegation is added or removed, since that
can change which fsnotify events nfsd requires from the VFS layer.
The fsnotify_mark is shared by every nfsd_file open on the inode, so
concurrent delegation adds and removes on the same directory can run
nfsd_fsnotify_recalc_mask() in parallel. Because it reads the lease
state and updates the mark in two separate locked sections, a recalc
working from a stale snapshot of the lease list could clobber a
concurrent update and leave the mark missing required events. Add an
nfm_recalc_mutex to the nfsd_file_mark and hold it across the recalc to
serialize callers.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/filecache.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/filecache.h | 3 +++
fs/nfsd/nfs4state.c | 5 +++--
3 files changed, 58 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 1ea2bfd51825..c5f2c5768324 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -192,6 +192,7 @@ nfsd_file_mark_find_or_create(struct inode *inode)
fsnotify_init_mark(&new->nfm_mark, nfsd_file_fsnotify_group);
new->nfm_mark.mask = FS_ATTRIB|FS_DELETE_SELF;
refcount_set(&new->nfm_ref, 1);
+ mutex_init(&new->nfm_recalc_mutex);
err = fsnotify_add_inode_mark(&new->nfm_mark, inode, 0);
@@ -1473,3 +1474,54 @@ int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
seq_printf(m, "mean age (ms): -\n");
return 0;
}
+
+/**
+ * nfsd_fsnotify_recalc_mask - recalculate the fsnotify mask for a nfsd_file
+ * @nf: nfsd_file to recalculate the mask on
+ *
+ * When a directory nfsd_file has a delegation added or removed, that may
+ * change the events that nfsd requires from the VFS layer. This function
+ * recalculates the fsnotify mask based on the leases present.
+ */
+void nfsd_fsnotify_recalc_mask(struct nfsd_file *nf)
+{
+ struct inode *inode = file_inode(nf->nf_file);
+ u32 lease_mask, set = 0, clear = 0;
+ struct fsnotify_mark *mark;
+
+ /* This is only needed when adding or removing dir delegs */
+ if (!S_ISDIR(inode->i_mode) || !nf->nf_mark)
+ return;
+
+ mark = &nf->nf_mark->nfm_mark;
+
+ /*
+ * The mark is shared by every nfsd_file on this inode, so concurrent
+ * delegation add/remove on the same directory can recalc it in
+ * parallel. Serialize the read of the lease state and the update of
+ * the mark so that a recalc working from a stale snapshot of the
+ * lease list can't clobber a concurrent recalc's update.
+ */
+ mutex_lock(&nf->nf_mark->nfm_recalc_mutex);
+
+ /* Set up notifications for any ignored delegation events */
+ lease_mask = inode_lease_ignore_mask(inode);
+
+ if (lease_mask & FL_IGN_DIR_CREATE)
+ set |= FS_CREATE | FS_MOVED_TO;
+ else
+ clear |= FS_CREATE | FS_MOVED_TO;
+
+ if (lease_mask & FL_IGN_DIR_DELETE)
+ set |= FS_DELETE | FS_MOVED_FROM;
+ else
+ clear |= FS_DELETE | FS_MOVED_FROM;
+
+ if (lease_mask & FL_IGN_DIR_RENAME)
+ set |= FS_RENAME;
+ else
+ clear |= FS_RENAME;
+
+ fsnotify_modify_mark_mask(mark, set, clear);
+ mutex_unlock(&nf->nf_mark->nfm_recalc_mutex);
+}
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 683b6437cacc..b224902b438d 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -26,6 +26,8 @@
struct nfsd_file_mark {
struct fsnotify_mark nfm_mark;
refcount_t nfm_ref;
+ /* serializes nfsd_fsnotify_recalc_mask() against itself */
+ struct mutex nfm_recalc_mutex;
};
/*
@@ -86,4 +88,5 @@ __be32 nfsd_file_acquire_local(struct net *net, struct svc_cred *cred,
__be32 nfsd_file_acquire_dir(struct svc_rqst *rqstp, struct svc_fh *fhp,
struct nfsd_file **pnf);
int nfsd_file_cache_stats_show(struct seq_file *m, void *v);
+void nfsd_fsnotify_recalc_mask(struct nfsd_file *nf);
#endif /* _FS_NFSD_FILECACHE_H */
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ae8505747dc2..0cbb37f73ee7 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1255,6 +1255,7 @@ static void nfs4_unlock_deleg_lease(struct nfs4_delegation *dp)
nfsd4_finalize_deleg_timestamps(dp, nf->nf_file);
kernel_setlease(nf->nf_file, F_UNLCK, NULL, (void **)&dp);
+ nfsd_fsnotify_recalc_mask(nf);
put_deleg_file(fp);
}
@@ -9725,8 +9726,7 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry *dentry,
* @nf: nfsd_file opened on the directory
*
* Given a GET_DIR_DELEGATION request @gdd, attempt to acquire a delegation
- * on the directory to which @nf refers. Note that this does not set up any
- * sort of async notifications for the delegation.
+ * on the directory to which @nf refers.
*/
struct nfs4_delegation *
nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
@@ -9816,6 +9816,7 @@ nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate,
if (!status) {
put_nfs4_file(fp);
+ nfsd_fsnotify_recalc_mask(nf);
return dp;
}
--
2.54.0
^ permalink raw reply related
* [PATCH v6 06/20] nfsd: make nfsd4_callback_ops->prepare operation bool return
From: Jeff Layton @ 2026-06-11 17:50 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Anna Schumaker, Jonathan Corbet, Shuah Khan,
Chuck Lever
Cc: Steven Rostedt, Alexander Aring, Amir Goldstein, Jan Kara,
Alexander Viro, Christian Brauner, Calum Mackay, linux-kernel,
linux-doc, linux-nfs, Jeff Layton
In-Reply-To: <20260611-dir-deleg-v6-0-4c45080e5f3f@kernel.org>
For a CB_NOTIFY operation, we need to stop processing the callback
if an allocation fails. Change the ->prepare callback operation to
return true if processing should continue, and false otherwise.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
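Illustration only, not part of the commit: a minimal userspace sketch of
the new ->prepare contract, where a false return stops processing before
anything is sent. All ex_ names are hypothetical, and the sketch assumes
that tearing the callback down invokes ->release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ex_callback;

struct ex_callback_ops {
	/* Return true to continue processing, false to abort. */
	bool (*prepare)(struct ex_callback *cb);
	void (*release)(struct ex_callback *cb);
};

struct ex_callback {
	const struct ex_callback_ops *ops;
	bool fail_prepare;	/* simulates a failed allocation in ->prepare */
	int sent;		/* number of times the callback was "sent" */
	int released;		/* number of times ->release ran */
};

static bool ex_prepare(struct ex_callback *cb)
{
	return !cb->fail_prepare;
}

static void ex_release(struct ex_callback *cb)
{
	cb->released++;
}

static const struct ex_callback_ops ex_ops = {
	.prepare = ex_prepare,
	.release = ex_release,
};

/* Stand-in for the teardown path; assumed to call ->release. */
static void ex_destroy_cb(struct ex_callback *cb)
{
	if (cb->ops && cb->ops->release)
		cb->ops->release(cb);
}

static void ex_run_cb(struct ex_callback *cb)
{
	assert(cb != NULL);

	/* A missing ->prepare means there is nothing to veto the send. */
	if (cb->ops && cb->ops->prepare && !cb->ops->prepare(cb)) {
		ex_destroy_cb(cb);
		return;
	}

	cb->sent++;	/* stands in for issuing the RPC */
}
```

Existing ->prepare implementations that cannot fail simply return true, so
their behavior is unchanged.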
fs/nfsd/nfs4callback.c | 5 ++++-
fs/nfsd/nfs4layouts.c | 3 ++-
fs/nfsd/nfs4state.c | 6 ++++--
fs/nfsd/state.h | 6 +++---
4 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 1628bb9ef9dd..a3c46905fd47 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1786,7 +1786,10 @@ nfsd4_run_cb_work(struct work_struct *work)
if (!test_and_clear_bit(NFSD4_CALLBACK_REQUEUE, &cb->cb_flags)) {
if (cb->cb_ops && cb->cb_ops->prepare)
- cb->cb_ops->prepare(cb);
+ if (!cb->cb_ops->prepare(cb)) {
+ nfsd41_destroy_cb(cb);
+ return;
+ }
}
cb->cb_msg.rpc_cred = clp->cl_cb_cred;
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 279ff1e9dffb..4c3f253c7d07 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -659,7 +659,7 @@ nfsd4_cb_layout_fail(struct nfs4_layout_stateid *ls, struct nfsd_file *file)
}
}
-static void
+static bool
nfsd4_cb_layout_prepare(struct nfsd4_callback *cb)
{
struct nfs4_layout_stateid *ls =
@@ -668,6 +668,7 @@ nfsd4_cb_layout_prepare(struct nfsd4_callback *cb)
mutex_lock(&ls->ls_mutex);
nfs4_inc_and_copy_stateid(&ls->ls_recall_sid, &ls->ls_stid);
mutex_unlock(&ls->ls_mutex);
+ return true;
}
static int
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0cbb37f73ee7..1ff954a18f93 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -357,12 +357,13 @@ remove_blocked_locks(struct nfs4_lockowner *lo)
}
}
-static void
+static bool
nfsd4_cb_notify_lock_prepare(struct nfsd4_callback *cb)
{
struct nfsd4_blocked_lock *nbl = container_of(cb,
struct nfsd4_blocked_lock, nbl_cb);
locks_delete_block(&nbl->nbl_lock);
+ return true;
}
static int
@@ -5599,7 +5600,7 @@ bool nfsd_wait_for_delegreturn(struct svc_rqst *rqstp, struct inode *inode)
return timeo > 0;
}
-static void nfsd4_cb_recall_prepare(struct nfsd4_callback *cb)
+static bool nfsd4_cb_recall_prepare(struct nfsd4_callback *cb)
{
struct nfs4_delegation *dp = cb_to_delegation(cb);
struct nfsd_net *nn = net_generic(dp->dl_stid.sc_client->net,
@@ -5620,6 +5621,7 @@ static void nfsd4_cb_recall_prepare(struct nfsd4_callback *cb)
list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
}
spin_unlock(&nn->deleg_lock);
+ return true;
}
static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index f44ea672670f..4c6765a4cf22 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -98,9 +98,9 @@ struct nfsd4_callback {
};
struct nfsd4_callback_ops {
- void (*prepare)(struct nfsd4_callback *);
- int (*done)(struct nfsd4_callback *, struct rpc_task *);
- void (*release)(struct nfsd4_callback *);
+ bool (*prepare)(struct nfsd4_callback *cb);
+ int (*done)(struct nfsd4_callback *cb, struct rpc_task *task);
+ void (*release)(struct nfsd4_callback *cb);
uint32_t opcode;
};
--
2.54.0
^ permalink raw reply related
* [PATCH v6 07/20] nfsd: add callback encoding and decoding linkages for CB_NOTIFY
From: Jeff Layton @ 2026-06-11 17:50 UTC (permalink / raw)
To: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Trond Myklebust, Anna Schumaker, Jonathan Corbet, Shuah Khan,
Chuck Lever
Cc: Steven Rostedt, Alexander Aring, Amir Goldstein, Jan Kara,
Alexander Viro, Christian Brauner, Calum Mackay, linux-kernel,
linux-doc, linux-nfs, Jeff Layton
In-Reply-To: <20260611-dir-deleg-v6-0-4c45080e5f3f@kernel.org>
Add routines for encoding and decoding CB_NOTIFY messages. These call
into the code generated by xdrgen to do the actual encoding and
decoding.
For now, the encoder is a stub. Later patches will flesh out the payload
encoding.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfs4callback.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/state.h | 8 ++++++++
fs/nfsd/xdr4cb.h | 12 ++++++++++++
3 files changed, 66 insertions(+)
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index a3c46905fd47..ca4dd2f969eb 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -887,6 +887,51 @@ static void encode_stateowner(struct xdr_stream *xdr, struct nfs4_stateowner *so
xdr_encode_opaque(p, so->so_owner.data, so->so_owner.len);
}
+static void nfs4_xdr_enc_cb_notify(struct rpc_rqst *req,
+ struct xdr_stream *xdr,
+ const void *data)
+{
+ const struct nfsd4_callback *cb = data;
+ struct nfs4_cb_compound_hdr hdr = {
+ .ident = 0,
+ .minorversion = cb->cb_clp->cl_minorversion,
+ };
+ struct CB_NOTIFY4args args = { };
+
+ WARN_ON_ONCE(hdr.minorversion == 0);
+
+ encode_cb_compound4args(xdr, &hdr);
+ encode_cb_sequence4args(xdr, cb, &hdr);
+
+ /*
+ * FIXME: get stateid and fh from delegation. Inline the cna_changes
+ * buffer, and zero it.
+ */
+ xdrgen_encode_CB_NOTIFY4args(xdr, &args);
+
+ hdr.nops++;
+ encode_cb_nops(&hdr);
+}
+
+static int nfs4_xdr_dec_cb_notify(struct rpc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ void *data)
+{
+ struct nfsd4_callback *cb = data;
+ struct nfs4_cb_compound_hdr hdr;
+ int status;
+
+ status = decode_cb_compound4res(xdr, &hdr);
+ if (unlikely(status))
+ return status;
+
+ status = decode_cb_sequence4res(xdr, cb);
+ if (unlikely(status || cb->cb_seq_status))
+ return status;
+
+ return decode_cb_op_status(xdr, OP_CB_NOTIFY, &cb->cb_status);
+}
+
static void nfs4_xdr_enc_cb_notify_lock(struct rpc_rqst *req,
struct xdr_stream *xdr,
const void *data)
@@ -1048,6 +1093,7 @@ static const struct rpc_procinfo nfs4_cb_procedures[] = {
#ifdef CONFIG_NFSD_PNFS
PROC(CB_LAYOUT, COMPOUND, cb_layout, cb_layout),
#endif
+ PROC(CB_NOTIFY, COMPOUND, cb_notify, cb_notify),
PROC(CB_NOTIFY_LOCK, COMPOUND, cb_notify_lock, cb_notify_lock),
PROC(CB_OFFLOAD, COMPOUND, cb_offload, cb_offload),
PROC(CB_RECALL_ANY, COMPOUND, cb_recall_any, cb_recall_any),
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 4c6765a4cf22..9f321e9ed76d 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -190,6 +190,13 @@ struct nfs4_cb_fattr {
u64 ncf_cur_fsize;
};
+/*
+ * FIXME: the current backchannel encoder can't handle a send buffer longer
+ * than a single page (see bc_malloc/bc_free).
+ */
+#define NOTIFY4_EVENT_QUEUE_SIZE 3
+#define NOTIFY4_PAGE_ARRAY_SIZE 1
+
/*
* Represents a delegation stateid. The nfs4_client holds references to these
* and they are put when it is being destroyed or when the delegation is
@@ -776,6 +783,7 @@ enum nfsd4_cb_op {
NFSPROC4_CLNT_CB_NOTIFY_LOCK,
NFSPROC4_CLNT_CB_RECALL_ANY,
NFSPROC4_CLNT_CB_GETATTR,
+ NFSPROC4_CLNT_CB_NOTIFY,
};
/* Returns true iff a is later than b: */
diff --git a/fs/nfsd/xdr4cb.h b/fs/nfsd/xdr4cb.h
index f4e29c0c701c..b06d0170d7c4 100644
--- a/fs/nfsd/xdr4cb.h
+++ b/fs/nfsd/xdr4cb.h
@@ -33,6 +33,18 @@
cb_sequence_dec_sz + \
op_dec_sz)
+#define NFS4_enc_cb_notify_sz (cb_compound_enc_hdr_sz + \
+ cb_sequence_enc_sz + \
+ 1 + enc_stateid_sz + \
+ enc_nfs4_fh_sz + \
+ 1 + \
+ NOTIFY4_EVENT_QUEUE_SIZE * \
+ (2 + (NFS4_OPAQUE_LIMIT >> 2)))
+
+#define NFS4_dec_cb_notify_sz (cb_compound_dec_hdr_sz + \
+ cb_sequence_dec_sz + \
+ op_dec_sz)
+
#define NFS4_enc_cb_notify_lock_sz (cb_compound_enc_hdr_sz + \
cb_sequence_enc_sz + \
2 + 1 + \
--
2.54.0
^ permalink raw reply related