* [PATCH v13 25/36] dyndbg-API: replace DECLARE_DYNDBG_CLASSMAP
From: Jim Cromie @ 2026-04-08 20:02 UTC
  To: linux-kernel; +Cc: gregkh, jbaron, louis.chauvet, Jim Cromie, linux-doc
In-Reply-To: <20260408200211.43821-1-jim.cromie@gmail.com>

commit aad0214f3026 ("dyndbg: add DECLARE_DYNDBG_CLASSMAP macro")

DECLARE_DYNDBG_CLASSMAP() has a design error; its usage fails a
basic K&R rule: "define once, refer many times".

When CONFIG_DRM_USE_DYNAMIC_DEBUG=y, it is used across DRM core &
drivers; each invocation allocates/inits the classmap understood by
that module.  They *all* must match for the DRM modules to respond
consistently when drm.debug categories are enabled.  This is at least
a maintenance hassle.

Worse, it's the root cause of the CONFIG_DRM_USE_DYNAMIC_DEBUG=y
regression; its use in both core & drivers obfuscates the 2 roles,
muddling the design, yielding an incomplete initialization when
modprobing drivers:

1st drm.ko loads, and dyndbg initializes its drm.debug callsites, then
a drm-driver loads, but too late for the drm.debug enablement.

And that led to:
commit bb2ff6c27bc9 ("drm: Disable dynamic debug as broken")

So retire it, replace with 2 macros:
  DYNAMIC_DEBUG_CLASSMAP_DEFINE - invoked once from core - drm.ko
  DYNAMIC_DEBUG_CLASSMAP_USE*   - from all drm drivers and helpers.
  NB: name-space de-noise

DYNAMIC_DEBUG_CLASSMAP_DEFINE: this reworks DECLARE_DYNDBG_CLASSMAP,
basically by dropping the static qualifier on the classmap, and
exporting it instead.

DYNAMIC_DEBUG_CLASSMAP_USE: then refers to the exported var by name:
  used from drivers, helper-mods
  lets us drop the repetitive "classname" declarations
  fixes 2nd-defn problem
  creates a ddebug_class_user record in new __dyndbg_class_users section
  new section is scanned "differently"

DECLARE_DYNDBG_CLASSMAP is preserved temporarily, to decouple DRM
adaptation work and avoid compile errors before it's done.

The DEFINE,USE distinction, and the separate classmap-use record,
allow dyndbg to initialize the driver's & helper's drm.debug
callsites separately after each is modprobed.

Basically, the classmap initial scan is repeated for classmap-users.

dyndbg's existing __dyndbg_classes[] section does:

. catalogs the module's classmaps
. tells dyndbg about them, allowing >control
. DYNAMIC_DEBUG_CLASSMAP_DEFINE creates section records.
. we rename it to: __dyndbg_class_maps[]

this patch adds __dyndbg_class_users[] section:

. catalogs users of classmap definitions from elsewhere
. authorizes dyndbg to >control user's class'd prdbgs
. DYNAMIC_DEBUG_CLASSMAP_USE() creates section records.

Now ddebug_add_module(etal) can handle classmap-uses similar to (and
after) classmaps; when a dependent module is loaded, if it has
classmap-uses (to a classmap-def in another module), that module's
kernel params are scanned to find if it has a kparam that is wired to
dyndbg's param-ops, and whose classmap is the one being ref'd.

To support this, there are a few data/header changes:

new struct ddebug_class_user
  contains: user-module-name, &classmap-defn
  it records drm-driver's use of a classmap in the section, allowing lookup
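
A minimal userspace sketch of the two record types, to illustrate
"define once, refer many times".  The struct layouts are simplified and
every name here (class_map, class_user, demo_*) is a hypothetical
stand-in, not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct _ddebug_class_map. */
struct class_map {
	const char *mod_name;		/* defining module */
	const char **class_names;
	int length;
	int base;			/* first class_id covered by this map */
};

/* Hypothetical stand-in for struct _ddebug_class_user. */
struct class_user {
	const char *mod_name;		/* using module */
	const struct class_map *map;	/* the one shared definition */
	int offset;
};

/* "Core" side: the classmap is defined exactly once. */
const char *demo_names[] = { "DEMO_CORE", "DEMO_DRIVER", "DEMO_KMS" };
const struct class_map demo_classes = {
	.mod_name = "demo_core",
	.class_names = demo_names,
	.length = 3,
	.base = 0,
};

/* "Driver" side: each user only records a reference to that definition. */
const struct class_user demo_user_a = { "demo_drv_a", &demo_classes, 0 };
const struct class_user demo_user_b = { "demo_drv_b", &demo_classes, 0 };

/* Return the name a user sees for class index idx, or NULL if out of range. */
const char *user_class_name(const struct class_user *cu, int idx)
{
	assert(cu && cu->map);
	if (idx < 0 || idx >= cu->map->length)
		return NULL;
	return cu->map->class_names[idx];
}
```

Both users resolve names through the same map, so the class names
cannot drift apart between modules.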

struct _ddebug_info gets a new member for the new section:
  users: a struct _ddebug_class_users { start, len }.
  set by dynamic_debug_init() for builtins.
  or by kernel/module/main.c:find_module_sections() for loadable modules.

dyndbg.lds.h: Add a new BOUNDED_SECTION_BY for __dyndbg_class_users.
This creates start,stop C symbol-names for the section.

TLDR ?

dynamic_debug.c: 2 changes from ddebug_add_module() & ddebug_change():

ddebug_add_module():

ddebug_attach_module_classes() is reworked/renamed/split into
ddebug_apply_class_maps(), ddebug_apply_class_users(), which both call
ddebug_apply_params().

ddebug_apply_params(new fn):

It scans module's/builtin kernel-params, calls ddebug_match_apply_kparam
for each to find any params/sysfs-nodes which may be wired to a classmap.

ddebug_match_apply_kparam(new fn):

1st, it tests the kernel-param.ops is dyndbg's; this guarantees that
the attached arg is a struct ddebug_class_param, which has a ref to
the param's state, and to the classmap defining the param's handling.

2nd, it requires that the classmap ref'd by the kparam is the one
we've been called for; modules can use many separate classmaps (as
test_dynamic_debug does).

Then apply the "parent" kparam's setting to the dependent module,
using ddebug_apply_class_bitmap().
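
The two tests and the apply step can be modelled in userspace roughly
as below.  This is only a sketch: the types, dyndbg_ops/other_ops, and
both functions are hypothetical, and "apply" is reduced to copying the
parent's bits, where the patch calls ddebug_apply_class_bitmap():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace models; none of these are the kernel types. */
struct param_ops { const char *name; };
struct class_map { const char *mod_name; int length; };
struct class_param { unsigned long *bits; const struct class_map *map; };
struct kparam {
	const char *name;
	const struct param_ops *ops;
	void *arg;			/* meaning depends on ops */
};

const struct param_ops dyndbg_ops = { "dyndbg_classes" };
const struct param_ops other_ops = { "int" };

/*
 * Return 1 and copy the parent's bits into *user_bits when kp is a
 * dyndbg class-param wired to map; return 0 and change nothing otherwise.
 */
int match_apply_kparam(const struct kparam *kp, const struct class_map *map,
		       unsigned long *user_bits)
{
	const struct class_param *cp;

	assert(kp && map && user_bits);

	/* 1st: only dyndbg's ops guarantee that arg is a class_param */
	if (kp->ops != &dyndbg_ops)
		return 0;
	cp = kp->arg;

	/* 2nd: a module may have several classmaps; require the right one */
	if (cp->map != map)
		return 0;

	/* then: propagate the parent's setting to the dependent module */
	*user_bits = *cp->bits;
	return 1;
}

/* Scan an array of params; the sketch's analogue of ddebug_apply_params(). */
int apply_params(const struct kparam *kps, size_t n,
		 const struct class_map *map, unsigned long *user_bits)
{
	size_t i;
	int hits = 0;

	for (i = 0; i < n; i++)
		hits += match_apply_kparam(&kps[i], map, user_bits);
	return hits;
}
```

The ops test must come first: for a param with foreign ops, arg points
at some unrelated object, and reading it as a class_param would be
invalid.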

ddebug_change(and callees) also gets adjustments:

ddebug_find_valid_class(): This does a search over the module's
classmaps, looking for the class FOO echo'd to >control.  So now it
searches over __dyndbg_class_users[] after __dyndbg_classes[].

ddebug_class_name(): return class-names for defined OR used classes.
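
A userspace sketch of that lookup order, assuming simplified
hypothetical types (class_map, class_user, find_class_id).  For a used
classmap it follows the formula in the DYNAMIC_DEBUG_CLASSMAP_USE_
kdoc: map base + class index + offset:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the classmap records. */
struct class_map { const char **class_names; int length; int base; };
struct class_user { const struct class_map *map; int offset; };

/* Index of name in map->class_names, or -1 if absent. */
static int map_index(const struct class_map *map, const char *name)
{
	int i;

	for (i = 0; i < map->length; i++)
		if (!strcmp(map->class_names[i], name))
			return i;
	return -1;
}

/*
 * Resolve a class name for one module: its definitions first, then its
 * uses.  Returns the class_id, or -1 if the module neither defines nor
 * uses a class of that name.
 */
int find_class_id(const struct class_map *maps, size_t nmaps,
		  const struct class_user *users, size_t nusers,
		  const char *name)
{
	size_t i;
	int idx;

	assert(name);

	for (i = 0; i < nmaps; i++) {
		idx = map_index(&maps[i], name);
		if (idx >= 0)
			return maps[i].base + idx;
	}
	for (i = 0; i < nusers; i++) {
		idx = map_index(users[i].map, name);
		if (idx >= 0)
			return users[i].map->base + idx + users[i].offset;
	}
	return -1;
}
```

A module that only USEs a classmap thereby accepts the same class
names at >control as the definer, while unknown names are rejected.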

test_dynamic_debug.c, test_dynamic_debug_submod.c:

This demonstrates the 2 types of classmaps & sysfs-params, following
the 4-part recipe:

0. define an enum for the classmap's class_ids
   drm.debug gives us DRM_UT_<*> (aka <T>)
   multiple classmaps in a module(s) must share 0-62 classid space.

1. DYNAMIC_DEBUG_CLASSMAP_DEFINE(classmap_name, .. "<T>")
   names the classes, maps them to consecutive class-ids.
   convention here is stringified ENUM_SYMBOLS
   these become API/ABI if 2 is done.

2. DYNAMIC_DEBUG_CLASSMAP_PARAM* (classmap_name)
   adds a controlling kparam to the class

3. DYNAMIC_DEBUG_CLASSMAP_USE(classmap_name)
   for subsystem/group/drivers to use extern created by 1.

Move all the enum declarations together, to better explain how they
share the 0..62 class-id space available to a module (non-overlapping
subranges).
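
The non-overlap requirement can be checked with a 64-bit reservation
mask.  The sketch below is a userspace model loosely based on
ddebug_class_range_overlap() in this patch; reserve_class_range() and
CLASS_DFLT are hypothetical names, and the extra bounds test is an
addition of the sketch:

```c
#include <assert.h>
#include <stdint.h>

#define CLASS_DFLT 63	/* assumption: ids 0..62 usable, 63 is the default */

/*
 * Reserve [base, base + length) in *reserved.  Returns 0 on success,
 * -1 if the range is invalid or collides with an earlier reservation.
 */
int reserve_class_range(int base, int length, uint64_t *reserved)
{
	uint64_t range;

	assert(reserved);
	if (base < 0 || length <= 0 || base + length > CLASS_DFLT)
		return -1;

	/* length <= 63 here, so the shift is well defined */
	range = ((UINT64_C(1) << length) - 1) << base;
	if (range & *reserved)
		return -1;
	*reserved |= range;
	return 0;
}
```

For example, reserving 0..10 and then 16..23 succeeds, while a later
request for 10..11 fails because id 10 is already taken.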

reorg macros 2,3 by name.  This gives a tabular format, making it easy
to see the pattern of repetition, and the points of change.

And extend the test to replicate the 2-module (parent & dependent)
scenario which caused the CONFIG_DRM_USE_DYNAMIC_DEBUG=y regression
seen in drm & drivers.

The _submod.c is a 2-line file: #define _SUBMOD, #include parent.

This gives identical complements of prdbgs in parent & _submod, and
thus identical print behavior when all of: >control, >params, and
parent->_submod propagation are working correctly.

It also puts all the parent/_submod declarations together in the same
source; the new ifdef _SUBMOD block invokes DYNAMIC_DEBUG_CLASSMAP_USE
for the 2 test-interfaces.  I think this is clearer.

These 2 modules are both tristate, allowing 3 super/sub combos: Y/Y,
Y/M, M/M (not N/Y, since this is disallowed by dependence).

Y/Y, Y/M testing once exposed a missing __aligned(8) in the _METADATA
macro, which M/M didn't see, probably because the module-loader memory
placement constrained it from misalignment.

Fixes: aad0214f3026 ("dyndbg: add DECLARE_DYNDBG_CLASSMAP macro")
cc: linux-doc@vger.kernel.org
Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
---
-v?
replace di with &dt->info, since di becomes stale
fix dd_mark_vector_subrange macro param ordering to match kdoc
s/base/offset/ in _ddebug_class_user, to reduce later churn

-v12 - squash in _USE_ and refinements.

A: dyndbg: add DYNAMIC_DEBUG_CLASSMAP_USE_(dd_class_name, offset)

Allow a module to use 2 classmaps together that would otherwise have a
class_id range conflict.

Suppose a drm-driver does:

  DYNAMIC_DEBUG_CLASSMAP_USE(drm_debug_classes);
  DYNAMIC_DEBUG_CLASSMAP_USE(drm_accel_xfer_debug);

If (for some reason) drm-accel cannot define their constants to avoid
DRM's drm_debug_category 0..10 reservations, we would have a conflict
with reserved-ids.

In this case a driver needing to use both would _USE_ one of them with
an offset to avoid the conflict.  This will handle most foreseeable
cases; perhaps a 3-X-3 of classmap-defns X classmap-users would get
too awkward and fiddly.

B: dyndbg: refine DYNAMIC_DEBUG_CLASSMAP_USE_ macro

The struct _ddebug_class_user _varname construct is needlessly
permissive; it has a static qualifier, and a unique name.  Together,
these allow a module to have 2 or more _USE(foo)s, which is contrary
to its purpose, and therefore potentially confusing.

So drop the unique name, and the static qualifier, and replace it with
an extern pre-declaration.  Construct the name by pasting together the
_var (which is the name of the exported ddebug_class_map), and
__KBUILD_MODNAME (which is the user module name).  This allows only a
single USE() reference to the exported record, which is all that is
required.
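
A userspace sketch of the name-pasting idea.  PASTE, MODNAME,
CLASSMAP_USE and the struct names are hypothetical stand-ins for
__PASTE, __KBUILD_MODNAME and the real macro; sections, alignment and
the offset are left out:

```c
#include <assert.h>
#include <stddef.h>

struct class_map { const char *name; };
struct class_user { const char *mod_name; const struct class_map *map; };

/* Two-level paste, so that MODNAME is expanded before ## is applied. */
#define PASTE_(a, b) a##b
#define PASTE(a, b) PASTE_(a, b)

#define MODNAME demo_drv	/* hypothetical stand-in for __KBUILD_MODNAME */

/*
 * The record is named <map>_<module>, so a second CLASSMAP_USE(map) in
 * the same module defines the same symbol twice and fails to compile.
 */
#define CLASSMAP_USE(_var)						\
	extern struct class_user PASTE(_var##_, MODNAME);		\
	struct class_user PASTE(_var##_, MODNAME) = {			\
		.mod_name = "demo_drv",					\
		.map = &(_var),						\
	}

struct class_map drm_like_classes = { "drm_like_classes" };

CLASSMAP_USE(drm_like_classes);	/* defines drm_like_classes_demo_drv */
/* CLASSMAP_USE(drm_like_classes);	a repeat is a redefinition error */

/* The generated symbol is an ordinary object, reachable by its name. */
const struct class_map *used_map(void)
{
	assert(drm_like_classes_demo_drv.map != NULL);
	return drm_like_classes_demo_drv.map;
}
```

Uncommenting the second CLASSMAP_USE() line should make the compiler
reject the file, which is the single-reference property wanted here.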

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
---
 MAINTAINERS                      |   2 +-
 include/asm-generic/dyndbg.lds.h |   7 +-
 include/linux/dynamic_debug.h    | 159 +++++++++++++++++++++++---
 kernel/module/main.c             |   3 +
 lib/Kconfig.debug                |  24 +++-
 lib/Makefile                     |   5 +
 lib/dynamic_debug.c              | 185 ++++++++++++++++++++++++-------
 lib/test_dynamic_debug.c         | 132 ++++++++++++++++------
 lib/test_dynamic_debug_submod.c  |  14 +++
 9 files changed, 433 insertions(+), 98 deletions(-)
 create mode 100644 lib/test_dynamic_debug_submod.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 0f4c2f182d63..31c945228fab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9034,7 +9034,7 @@ M:	Jim Cromie <jim.cromie@gmail.com>
 S:	Maintained
 F:	include/linux/dynamic_debug.h
 F:	lib/dynamic_debug.c
-F:	lib/test_dynamic_debug.c
+F:	lib/test_dynamic_debug*.c
 F:	tools/testing/selftests/dynamic_debug/*
 
 DYNAMIC INTERRUPT MODERATION
diff --git a/include/asm-generic/dyndbg.lds.h b/include/asm-generic/dyndbg.lds.h
index 8345ac6c52b7..6e38d0f1d00b 100644
--- a/include/asm-generic/dyndbg.lds.h
+++ b/include/asm-generic/dyndbg.lds.h
@@ -6,7 +6,8 @@
 #define DYNDBG_SECTIONS()						\
 	. = ALIGN(8);							\
 	BOUNDED_SECTION_BY(__dyndbg_descriptors, ___dyndbg_descs)	\
-	BOUNDED_SECTION_BY(__dyndbg_class_maps, ___dyndbg_class_maps)
+	BOUNDED_SECTION_BY(__dyndbg_class_maps, ___dyndbg_class_maps)	\
+	BOUNDED_SECTION_BY(__dyndbg_class_users, ___dyndbg_class_users)
 
 #define MOD_DYNDBG_SECTIONS()                                           \
 	__dyndbg_descriptors : {					\
@@ -16,6 +17,10 @@
 	__dyndbg_class_maps : {						\
 		BOUNDED_SECTION_BY(__dyndbg_class_maps,			\
 				   ___dyndbg_class_maps)		\
+	}								\
+	__dyndbg_class_users : {					\
+		BOUNDED_SECTION_BY(__dyndbg_class_users,		\
+				   ___dyndbg_class_users)		\
 	}
 
 #endif /* __ASM_GENERIC_DYNDBG_LDS_H */
diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 33291abd8971..71c91bc8d3a6 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -72,19 +72,30 @@ enum ddebug_class_map_type {
 	 */
 };
 
+/*
+ * map @class_names 0..N to consecutive constants starting at @base.
+ */
 struct _ddebug_class_map {
-	struct module *mod;	/* NULL for builtins */
-	const char *mod_name;	/* needed for builtins */
+	const struct module *mod;	/* NULL for builtins */
+	const char *mod_name;		/* needed for builtins */
 	const char **class_names;
 	const int length;
 	const int base;		/* index of 1st .class_id, allows split/shared space */
 	enum ddebug_class_map_type map_type;
 };
 
+struct _ddebug_class_user {
+	char *mod_name;
+	struct _ddebug_class_map *map;
+	const int offset;	/* offset from map->base */
+};
+
 /*
- * @_ddebug_info: gathers module/builtin dyndbg_* __sections together.
+ * @_ddebug_info: gathers module/builtin __dyndbg_<T> __sections
+ * together, each is a vec_<T>: a struct { struct T start[], int len }.
+ *
  * For builtins, it is used as a cursor, with the inner structs
- * marking sub-vectors of the builtin __sections in DATA.
+ * marking sub-vectors of the builtin __sections in DATA_DATA
  */
 struct _ddebug_descs {
 	struct _ddebug *start;
@@ -96,10 +107,16 @@ struct _ddebug_class_maps {
 	int len;
 };
 
+struct _ddebug_class_users {
+	struct _ddebug_class_user *start;
+	int len;
+};
+
 struct _ddebug_info {
 	const char *mod_name;
 	struct _ddebug_descs descs;
 	struct _ddebug_class_maps maps;
+	struct _ddebug_class_users users;
 };
 
 struct _ddebug_class_param {
@@ -118,12 +135,81 @@ struct _ddebug_class_param {
 #if defined(CONFIG_DYNAMIC_DEBUG) || \
 	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
 
+/*
+ * dyndbg classmaps is modelled closely upon drm.debug:
+ *
+ *  1. run-time control via sysfs node (api/abi)
+ *  2. each bit 0..N controls a single "category"
+ *  3. a pr_debug can have only 1 category, not several.
+ *  4. "kind" is a compile-time constant: 0..N or BIT() thereof
+ *  5. macro impls - give compile-time resolution or fail.
+ *
+ * dyndbg classmaps design axioms/constraints:
+ *
+ *  . optimizing compilers use 1-5 above, so preserve them.
+ *  . classmaps.class_id *is* the category.
+ *  . classmap definers/users are modules.
+ *  . every user wants 0..N
+ *  . 0..N exposes as ABI
+ *  . no 1 use-case wants N > 32, 16 is more usable
+ *  . N <= 64 in *all* cases
+ *  . modules/subsystems make category/classmap decisions
+ *  . ie an enum: DRM has DRM_UT_CORE..DRM_UT_DRMRES
+ *  . some categories are exposed to user: ABI
+ *  . making modules change their numbering is bogus, avoid if possible
+ *
+ * We can solve for all these at once:
+ *  A: map class-names to a .class_id range at compile-time
+ *  B: allow only "class NAME" changes to class'd callsites at run-time
+ *  C: users/modules must manage 0..62 hardcoded .class_id range limit.
+ *  D: existing pr_debugs get CLASS_DFLT=63
+ *
+ * By mapping class-names at >control to class-ids underneath, and
+ * responding only to class-names DEFINEd or USEd by the module, we
+ * can private-ize the class-id, and adjust class'd pr_debugs only by
+ * their names.
+ *
+ * This gives us:
+ *  E: class_ids without classnames are unreachable
+ *  F: user modules opt-in by DEFINEing a classmap and/or USEing another
+ *
+ * Multi-classmap modules/groups are supported, if the classmaps share
+ * the class_id space [0..62] without overlap/conflict.
+ *
+ * NOTE: Due to the integer class_id, this api cannot disallow these:
+ * __pr_debug_cls(0, "fake CORE msg");  works only if a classmap maps 0.
+ * __pr_debug_cls(22, "no such class"); compiles but is not reachable
+ */
+
 /**
- * DECLARE_DYNDBG_CLASSMAP - declare classnames known by a module
- * @_var:   a struct ddebug_class_map, passed to module_param_cb
- * @_type:  enum class_map_type, chooses bits/verbose, numeric/symbolic
- * @_base:  offset of 1st class-name. splits .class_id space
- * @classes: class-names used to control class'd prdbgs
+ * DYNAMIC_DEBUG_CLASSMAP_DEFINE - define debug classes used by a module.
+ * @_var:   name of the classmap, exported for other modules' coordinated use.
+ * @_mapty: enum ddebug_class_map_type: 0:DISJOINT - independent, 1:LEVEL - v2>v1
+ * @_base:  reserve N classids starting at _base, to split 0..62 classid space
+ * @classes: names of the N classes.
+ *
+ * This tells dyndbg what class_ids the module is using: _base..+N, by
+ * mapping names onto them.  This qualifies "class NAME" >controls on
+ * the defining module, ignoring unknown names.
+ */
+#define DYNAMIC_DEBUG_CLASSMAP_DEFINE(_var, _mapty, _base, ...)		\
+	static const char *_var##_classnames[] = { __VA_ARGS__ };	\
+	extern struct _ddebug_class_map _var;				\
+	struct _ddebug_class_map __aligned(8) __used			\
+		__section("__dyndbg_class_maps") _var = {		\
+		.mod = THIS_MODULE,					\
+		.mod_name = KBUILD_MODNAME,				\
+		.base = (_base),					\
+		.map_type = (_mapty),					\
+		.length = ARRAY_SIZE(_var##_classnames),		\
+		.class_names = _var##_classnames,			\
+	};								\
+	EXPORT_SYMBOL(_var)
+
+/*
+ * XXX: keep this until DRM adapts to use the DEFINE/USE api, it
+ * differs from DYNAMIC_DEBUG_CLASSMAP_DEFINE by the lack of the
+ * extern/EXPORT on the struct init, and cascading thinkos.
  */
 #define DECLARE_DYNDBG_CLASSMAP(_var, _maptype, _base, ...)		\
 	static const char *_var##_classnames[] = { __VA_ARGS__ };	\
@@ -137,6 +223,44 @@ struct _ddebug_class_param {
 		.class_names = _var##_classnames,			\
 	}
 
+/**
+ * DYNAMIC_DEBUG_CLASSMAP_USE - refer to a classmap, DEFINEd elsewhere.
+ * @_var: name of the exported classmap var
+ *
+ * This tells dyndbg that the module has prdbgs with classids defined
+ * in the named classmap.  This qualifies "class NAME" >controls on
+ * the user module, and ignores unknown names. This is a wrapper for
+ * DYNAMIC_DEBUG_CLASSMAP_USE_() with a base offset of 0.
+ */
+#define DYNAMIC_DEBUG_CLASSMAP_USE(_var) \
+	DYNAMIC_DEBUG_CLASSMAP_USE_(_var, 0)
+
+/**
+ * DYNAMIC_DEBUG_CLASSMAP_USE_ - refer to a classmap with a manual offset.
+ * @_var:   name of the exported classmap var to use.
+ * @_offset:  an integer offset to add to the class IDs of the used map.
+ *
+ * This is an extended version of DYNAMIC_DEBUG_CLASSMAP_USE(). It should
+ * only be used to resolve class ID conflicts when a module uses multiple
+ * classmaps that have overlapping ID ranges.
+ *
+ * The final class IDs for the used map will be calculated as:
+ * original_map_base + class_index + @_offset.
+ */
+#define DYNAMIC_DEBUG_CLASSMAP_USE_(_var, _offset)			\
+	extern struct _ddebug_class_map _var;				\
+	static_assert((_offset) >= 0 && (_offset) < _DPRINTK_CLASS_DFLT, \
+		      "classmap use offset must be in 0..62");          \
+	extern struct _ddebug_class_user __aligned(8)			\
+		__PASTE(_var ## _, __KBUILD_MODNAME);			\
+	struct _ddebug_class_user __aligned(8) __used			\
+		__section("__dyndbg_class_users")			\
+		__PASTE(_var ## _, __KBUILD_MODNAME) = {		\
+		.mod_name = KBUILD_MODNAME,				\
+		.map = &(_var),						\
+		.offset = _offset					\
+	}
+
 extern __printf(2, 3)
 void __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...);
 
@@ -298,12 +422,18 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor,
 				   KERN_DEBUG, prefix_str, prefix_type,	\
 				   rowsize, groupsize, buf, len, ascii)
 
-/* for test only, generally expect drm.debug style macro wrappers */
-#define __pr_debug_cls(cls, fmt, ...) do {			\
+/*
+ * This is the "model" class variant of pr_debug.  It is not really
+ * intended for direct use; I'd encourage DRM-style drm_dbg_<T>
+ * macros for the interface, along with an enum for the <T>
+ *
+ * __printf(2, 3) would apply.
+ */
+#define __pr_debug_cls(cls, fmt, ...) ({			\
 	BUILD_BUG_ON_MSG(!__builtin_constant_p(cls),		\
 			 "expecting constant class int/enum");	\
 	dynamic_pr_debug_cls(cls, fmt, ##__VA_ARGS__);		\
-	} while (0)
+})
 
 #else /* !(CONFIG_DYNAMIC_DEBUG || (CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE)) */
 
@@ -311,6 +441,8 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor,
 #include <linux/errno.h>
 #include <linux/printk.h>
 
+#define DYNAMIC_DEBUG_CLASSMAP_DEFINE(_var, _mapty, _base, ...)
+#define DYNAMIC_DEBUG_CLASSMAP_USE(_var)
 #define DEFINE_DYNAMIC_DEBUG_METADATA(name, fmt)
 #define DYNAMIC_DEBUG_BRANCH(descriptor) false
 #define DECLARE_DYNDBG_CLASSMAP(...)
@@ -357,8 +489,7 @@ static inline int param_set_dyndbg_classes(const char *instr, const struct kerne
 static inline int param_get_dyndbg_classes(char *buffer, const struct kernel_param *kp)
 { return 0; }
 
-#endif
-
+#endif /* !CONFIG_DYNAMIC_DEBUG_CORE */
 
 extern const struct kernel_param_ops param_ops_dyndbg_classes;
 
diff --git a/kernel/module/main.c b/kernel/module/main.c
index a0fe6c7aab75..8a3fc98d8a4c 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2723,6 +2723,9 @@ static int find_module_sections(struct module *mod, struct load_info *info)
 	mod->dyndbg_info.maps.start = section_objs(info, "__dyndbg_class_maps",
 						   sizeof(*mod->dyndbg_info.maps.start),
 						   &mod->dyndbg_info.maps.len);
+	mod->dyndbg_info.users.start = section_objs(info, "__dyndbg_class_users",
+						   sizeof(*mod->dyndbg_info.users.start),
+						   &mod->dyndbg_info.users.len);
 #endif
 
 	return 0;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 93f356d2b3d9..302bb2656682 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -3106,12 +3106,26 @@ config TEST_STATIC_KEYS
 	  If unsure, say N.
 
 config TEST_DYNAMIC_DEBUG
-	tristate "Test DYNAMIC_DEBUG"
-	depends on DYNAMIC_DEBUG
+	tristate "Build test-dynamic-debug module"
+	depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE
 	help
-	  This module registers a tracer callback to count enabled
-	  pr_debugs in a 'do_debugging' function, then alters their
-	  enablements, calls the function, and compares counts.
+	  This module exercises/demonstrates dyndbg's classmap API, by
+	  creating 2 classes: a DISJOINT classmap (supporting DRM.debug)
+	  and a LEVELS/VERBOSE classmap (like verbose2 > verbose1).
+
+	  If unsure, say N.
+
+config TEST_DYNAMIC_DEBUG_SUBMOD
+	tristate "Build test-dynamic-debug submodule"
+	default m
+	depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE
+	depends on TEST_DYNAMIC_DEBUG
+	help
+	  This sub-module uses a classmap defined and exported by the
+	  parent module, recapitulating drm & driver's shared use of
+	  drm.debug to control enabled debug-categories.
+	  It is tristate, independent of parent, to allow testing all
+	  proper combinations of parent=y/m submod=y/m.
 
 	  If unsure, say N.
 
diff --git a/lib/Makefile b/lib/Makefile
index 1b9ee167517f..19ab40903436 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -83,6 +83,9 @@ obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o
 obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_keys.o
 obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o
 obj-$(CONFIG_TEST_DYNAMIC_DEBUG) += test_dynamic_debug.o
+obj-$(CONFIG_TEST_DYNAMIC_DEBUG_SUBMOD) += test_dynamic_debug_submod.o
+obj-$(CONFIG_TEST_PRINTF) += test_printf.o
+obj-$(CONFIG_TEST_SCANF) += test_scanf.o
 
 obj-$(CONFIG_TEST_BITMAP) += test_bitmap.o
 ifeq ($(CONFIG_CC_IS_CLANG)$(CONFIG_KASAN),yy)
@@ -206,6 +209,8 @@ obj-$(CONFIG_ARCH_NEED_CMPXCHG_1_EMU) += cmpxchg-emu.o
 obj-$(CONFIG_DYNAMIC_DEBUG_CORE) += dynamic_debug.o
 #ensure exported functions have prototypes
 CFLAGS_dynamic_debug.o := -DDYNAMIC_DEBUG_MODULE
+CFLAGS_test_dynamic_debug.o := -DDYNAMIC_DEBUG_MODULE
+CFLAGS_test_dynamic_debug_submod.o := -DDYNAMIC_DEBUG_MODULE
 
 obj-$(CONFIG_SYMBOLIC_ERRNAME) += errname.o
 
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index b8983e095e60..ce512efaeffd 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -29,6 +29,7 @@
 #include <linux/string_helpers.h>
 #include <linux/uaccess.h>
 #include <linux/dynamic_debug.h>
+
 #include <linux/debugfs.h>
 #include <linux/slab.h>
 #include <linux/jump_label.h>
@@ -43,6 +44,8 @@ extern struct _ddebug __start___dyndbg_descs[];
 extern struct _ddebug __stop___dyndbg_descs[];
 extern struct _ddebug_class_map __start___dyndbg_class_maps[];
 extern struct _ddebug_class_map __stop___dyndbg_class_maps[];
+extern struct _ddebug_class_user __start___dyndbg_class_users[];
+extern struct _ddebug_class_user __stop___dyndbg_class_users[];
 
 struct ddebug_table {
 	struct list_head link;
@@ -168,20 +171,37 @@ static void vpr_info_dq(const struct ddebug_query *query, const char *msg)
 		  query->first_lineno, query->last_lineno, query->class_string);
 }
 
-static struct _ddebug_class_map *ddebug_find_valid_class(struct ddebug_table const *dt,
-							 const char *class_string,
-							 int *class_id)
+#define vpr_di_info(di_p, msg_p, ...) ({				\
+	struct _ddebug_info const *_di = di_p;				\
+	v2pr_info(msg_p "module:%s nd:%d nc:%d nu:%d\n", ##__VA_ARGS__, \
+		  _di->mod_name, _di->descs.len, _di->maps.len,		\
+		  _di->users.len);					\
+	})
+
+static struct _ddebug_class_map *
+ddebug_find_valid_class(struct _ddebug_info const *di, const char *query_class, int *class_id)
 {
 	struct _ddebug_class_map *map;
+	struct _ddebug_class_user *cli;
 	int i, idx;
 
-	for_subvec(i, map, &dt->info, maps) {
-		idx = match_string(map->class_names, map->length, class_string);
+	for_subvec(i, map, di, maps) {
+		idx = match_string(map->class_names, map->length, query_class);
 		if (idx >= 0) {
+			vpr_di_info(di, "good-class: %s.%s ", map->mod_name, query_class);
 			*class_id = idx + map->base;
 			return map;
 		}
 	}
+	for_subvec(i, cli, di, users) {
+		idx = match_string(cli->map->class_names, cli->map->length, query_class);
+		if (idx >= 0) {
+			vpr_di_info(di, "class-ref: %s -> %s.%s ",
+				    cli->mod_name, cli->map->mod_name, query_class);
+			*class_id = idx + cli->map->base + cli->offset;
+			return cli->map;
+		}
+	}
 	*class_id = -ENOENT;
 	return NULL;
 }
@@ -238,8 +258,7 @@ static bool ddebug_match_desc(const struct ddebug_query *query,
 	return true;
 }
 
-static int ddebug_change(const struct ddebug_query *query,
-			 struct flag_settings *modifiers)
+static int ddebug_change(const struct ddebug_query *query, struct flag_settings *modifiers)
 {
 	int i;
 	struct ddebug_table *dt;
@@ -260,7 +279,8 @@ static int ddebug_change(const struct ddebug_query *query,
 			continue;
 
 		if (query->class_string) {
-			map = ddebug_find_valid_class(dt, query->class_string, &valid_class);
+			map = ddebug_find_valid_class(&dt->info, query->class_string,
+						      &valid_class);
 			if (!map)
 				continue;
 		} else {
@@ -590,7 +610,7 @@ static int ddebug_exec_query(char *query_string, const char *modname)
 
 /* handle multiple queries in query string, continue on error, return
    last error or number of matching callsites.  Module name is either
-   in param (for boot arg) or perhaps in query string.
+   in the modname arg (for boot args) or perhaps in query string.
 */
 static int ddebug_exec_queries(char *query, const char *modname)
 {
@@ -721,7 +741,7 @@ static int param_set_dyndbg_module_classes(const char *instr,
 /**
  * param_set_dyndbg_classes - classmap kparam setter
  * @instr: string echo>d to sysfs, input depends on map_type
- * @kp:    kp->arg has state: bits/lvl, map, map_type
+ * @kp:    kp->arg has state: bits/lvl, classmap, map_type
  *
  * enable/disable all class'd pr_debugs in the classmap. For LEVEL
  * map-types, enforce * relative levels by bitpos.
@@ -758,6 +778,7 @@ int param_get_dyndbg_classes(char *buffer, const struct kernel_param *kp)
 	default:
 		return -1;
 	}
+	return 0;
 }
 EXPORT_SYMBOL(param_get_dyndbg_classes);
 
@@ -1073,12 +1094,17 @@ static bool ddebug_class_in_range(const int class_id, const struct _ddebug_class
 static const char *ddebug_class_name(struct _ddebug_info *di, struct _ddebug *dp)
 {
 	struct _ddebug_class_map *map;
+	struct _ddebug_class_user *cli;
 	int i;
 
 	for_subvec(i, map, di, maps)
 		if (ddebug_class_in_range(dp->class_id, map))
 			return map->class_names[dp->class_id - map->base];
 
+	for_subvec(i, cli, di, users)
+		if (ddebug_class_in_range(dp->class_id, cli->map))
+			return cli->map->class_names[dp->class_id - cli->map->base - cli->offset];
+
 	return NULL;
 }
 
@@ -1159,32 +1185,87 @@ static const struct proc_ops proc_fops = {
 	.proc_write = ddebug_proc_write
 };
 
-static void ddebug_attach_module_classes(struct ddebug_table *dt, struct _ddebug_info *di)
+#define vpr_cm_info(cm_p, msg_fmt, ...) ({				\
+	struct _ddebug_class_map const *_cm = cm_p;			\
+	v2pr_info(msg_fmt "%s [%d..%d] %s..%s\n", ##__VA_ARGS__,	\
+		  _cm->mod_name, _cm->base, _cm->base + _cm->length,	\
+		  _cm->class_names[0], _cm->class_names[_cm->length - 1]); \
+	})
+
+static void ddebug_sync_classbits(const struct kernel_param *kp, const char *modname)
 {
-	struct _ddebug_class_map *cm;
-	int i, nc = 0;
+	const struct _ddebug_class_param *dcp = kp->arg;
 
-	/*
-	 * Find this module's classmaps in a subrange/wholerange of
-	 * the builtin/modular classmap vector/section.  Save the start
-	 * and length of the subrange at its edges.
-	 */
-	for_subvec(i, cm, di, maps) {
-		if (!strcmp(cm->mod_name, dt->info.mod_name)) {
-			if (!nc) {
-				v2pr_info("start subrange, class[%d]: module:%s base:%d len:%d ty:%d\n",
-					  i, cm->mod_name, cm->base, cm->length, cm->map_type);
-				dt->info.maps.start = cm;
-			}
-			nc++;
-		}
+	/* clamp initial bitvec, mask off hi-bits */
+	if (*dcp->bits & ~CLASSMAP_BITMASK(dcp->map->length)) {
+		*dcp->bits &= CLASSMAP_BITMASK(dcp->map->length);
+		v2pr_info("preset classbits: %lx\n", *dcp->bits);
+	}
+	/* force class'd prdbgs (in USEr module) to match (DEFINEr module) class-param */
+	ddebug_apply_class_bitmap(dcp, dcp->bits, ~0, modname);
+	ddebug_apply_class_bitmap(dcp, dcp->bits, 0, modname);
+}
+
+static void ddebug_match_apply_kparam(const struct kernel_param *kp,
+				      const struct _ddebug_class_map *map,
+				      const char *mod_name)
+{
+	struct _ddebug_class_param *dcp;
+
+	if (kp->ops != &param_ops_dyndbg_classes)
+		return;
+
+	dcp = (struct _ddebug_class_param *)kp->arg;
+
+	if (map == dcp->map) {
+		v2pr_info(" kp:%s.%s =0x%lx", mod_name, kp->name, *dcp->bits);
+		vpr_cm_info(map, " %s maps ", mod_name);
+		ddebug_sync_classbits(kp, mod_name);
+	}
+}
+
+static void ddebug_apply_params(const struct _ddebug_class_map *cm, const char *mod_name)
+{
+	const struct kernel_param *kp;
+#if IS_ENABLED(CONFIG_MODULES)
+	int i;
+
+	if (cm->mod) {
+		vpr_cm_info(cm, "loaded classmap: %s ", mod_name);
+		/* ifdef protects the cm->mod->kp deref */
+		for (i = 0, kp = cm->mod->kp; i < cm->mod->num_kp; i++, kp++)
+			ddebug_match_apply_kparam(kp, cm, mod_name);
 	}
-	if (nc) {
-		dt->info.maps.len = nc;
-		vpr_info("module:%s attached %d classes\n", dt->info.mod_name, nc);
+#endif
+	if (!cm->mod) {
+		vpr_cm_info(cm, "builtin classmap: %s ", mod_name);
+		for (kp = __start___param; kp < __stop___param; kp++)
+			ddebug_match_apply_kparam(kp, cm, mod_name);
 	}
 }
 
+static void ddebug_apply_class_maps(const struct _ddebug_info *di)
+{
+	struct _ddebug_class_map *cm;
+	int i;
+
+	for_subvec(i, cm, di, maps)
+		ddebug_apply_params(cm, cm->mod_name);
+
+	vpr_di_info(di, "attached %d class-maps to ", i);
+}
+
+static void ddebug_apply_class_users(const struct _ddebug_info *di)
+{
+	struct _ddebug_class_user *cli;
+	int i;
+
+	for_subvec(i, cli, di, users)
+		ddebug_apply_params(cli->map, cli->mod_name);
+
+	vpr_di_info(di, "attached %d class-users to ", i);
+}
+
 /*
  * Narrow a _ddebug_info's vector (@_vec) to the contiguous subrange
  * of elements where ->mod_name matches @__di->mod_name.
@@ -1214,6 +1295,22 @@ static void ddebug_attach_module_classes(struct ddebug_table *dt, struct _ddebug
 	__di->_vec.len = __nc;						\
 })
 
+static int __maybe_unused
+ddebug_class_range_overlap(struct _ddebug_class_map *cm,
+			   u64 *reserved_ids)
+{
+	u64 range = (((1ULL << cm->length) - 1) << cm->base);
+
+	if (range & *reserved_ids) {
+		pr_err("[%d..%d] on %s conflicts with %llx\n", cm->base,
+		       cm->base + cm->length - 1, cm->class_names[0],
+		       *reserved_ids);
+		return -EINVAL;
+	}
+	*reserved_ids |= range;
+	return 0;
+}
+
 /*
  * Allocate a new ddebug_table for the given module
  * and add it to the global list.
@@ -1222,6 +1319,7 @@ static int ddebug_add_module(struct _ddebug_info *di)
 {
 	struct ddebug_table *dt;
 	struct _ddebug_class_map *cm;
+	struct _ddebug_class_user *cli;
 	int i;
 
 	if (!di->descs.len)
@@ -1234,26 +1332,29 @@ static int ddebug_add_module(struct _ddebug_info *di)
 		pr_err("error adding module: %s\n", di->mod_name);
 		return -ENOMEM;
 	}
+	INIT_LIST_HEAD(&dt->link);
 	/*
-	 * For built-in modules, name (as supplied in di by its
-	 * callers) lives in .rodata and is immortal. For loaded
-	 * modules, name points at the name[] member of struct module,
-	 * which lives at least as long as this struct ddebug_table.
+	 * For built-in modules, di-> referents live in .*data and are
+	 * immortal. For loaded modules, di points at the dyndbg_info
+	 * member of its struct module, which lives at least as
+	 * long as this struct ddebug_table.
 	 */
 	dt->info = *di;
-
-	INIT_LIST_HEAD(&dt->link);
-
 	dd_set_module_subrange(i, cm, &dt->info, maps);
-
-	if (di->maps.len)
-		ddebug_attach_module_classes(dt, di);
+	dd_set_module_subrange(i, cli, &dt->info, users);
+	/* now di is stale */
 
 	mutex_lock(&ddebug_lock);
 	list_add_tail(&dt->link, &ddebug_tables);
 	mutex_unlock(&ddebug_lock);
 
-	vpr_info("%3u debug prints in module %s\n", di->descs.len, di->mod_name);
+	if (dt->info.maps.len)
+		ddebug_apply_class_maps(&dt->info);
+	if (dt->info.users.len)
+		ddebug_apply_class_users(&dt->info);
+
+	vpr_info("%3u debug prints in module %s\n",
+		 dt->info.descs.len, dt->info.mod_name);
 	return 0;
 }
 
@@ -1403,8 +1504,10 @@ static int __init dynamic_debug_init(void)
 	struct _ddebug_info di = {
 		.descs.start = __start___dyndbg_descs,
 		.maps.start  = __start___dyndbg_class_maps,
+		.users.start = __start___dyndbg_class_users,
 		.descs.len = __stop___dyndbg_descs - __start___dyndbg_descs,
 		.maps.len  = __stop___dyndbg_class_maps - __start___dyndbg_class_maps,
+		.users.len = __stop___dyndbg_class_users - __start___dyndbg_class_users,
 	};
 
 #ifdef CONFIG_MODULES
diff --git a/lib/test_dynamic_debug.c b/lib/test_dynamic_debug.c
index 9c3e53cd26bd..6c4548f63512 100644
--- a/lib/test_dynamic_debug.c
+++ b/lib/test_dynamic_debug.c
@@ -6,11 +6,30 @@
  *      Jim Cromie	<jim.cromie@gmail.com>
  */
 
-#define pr_fmt(fmt) "test_dd: " fmt
+/*
+ * This file is built 2x, also making test_dynamic_debug_submod.ko,
+ * whose 2-line src file #includes this file.  This gives us a _submod
+ * clone with identical pr_debugs, without further maintenance.
+ *
+ * If things are working properly, they should operate identically
+ * when printed or adjusted by >control.  This eases visual perusal of
+ * the logs, and simplifies testing, by easing the proper accounting
+ * of expectations.
+ *
+ * It also puts both halves of the subsystem _DEFINE & _USE use case
+ * together, and integrates the common ENUM providing both class_ids
+ * and class-names to both _DEFINErs and _USERs.  I think this makes
+ * the usage clearer.
+ */
+#if defined(TEST_DYNAMIC_DEBUG_SUBMOD)
+  #define pr_fmt(fmt) "test_dd_submod: " fmt
+#else
+  #define pr_fmt(fmt) "test_dd: " fmt
+#endif
 
 #include <linux/module.h>
 
-/* run tests by reading or writing sysfs node: do_prints */
+/* re-gen output by reading or writing sysfs node: do_prints */
 
 static void do_prints(void); /* device under test */
 static int param_set_do_prints(const char *instr, const struct kernel_param *kp)
@@ -29,24 +48,39 @@ static const struct kernel_param_ops param_ops_do_prints = {
 };
 module_param_cb(do_prints, &param_ops_do_prints, NULL, 0600);
 
-/*
- * Using the CLASSMAP api:
- * - classmaps must have corresponding enum
- * - enum symbols must match/correlate with class-name strings in the map.
- * - base must equal enum's 1st value
- * - multiple maps must set their base to share the 0-30 class_id space !!
- *   (build-bug-on tips welcome)
- * Additionally, here:
- * - tie together sysname, mapname, bitsname, flagsname
- */
-#define DD_SYS_WRAP(_model, _flags)					\
-	static unsigned long bits_##_model;				\
-	static struct _ddebug_class_param _flags##_model = {		\
+#define CLASSMAP_BITMASK(width, base) (((1UL << (width)) - 1) << (base))
+
+/* sysfs param wrapper, proto-API */
+#define DYNAMIC_DEBUG_CLASSMAP_PARAM_(_model, _flags, _init)		\
+	static unsigned long bits_##_model = _init;			\
+	static struct _ddebug_class_param _flags##_##_model = {		\
 		.bits = &bits_##_model,					\
 		.flags = #_flags,					\
 		.map = &map_##_model,					\
 	};								\
-	module_param_cb(_flags##_##_model, &param_ops_dyndbg_classes, &_flags##_model, 0600)
+	module_param_cb(_flags##_##_model, &param_ops_dyndbg_classes,	\
+			&_flags##_##_model, 0600)
+#ifdef DEBUG
+#define DYNAMIC_DEBUG_CLASSMAP_PARAM(_model, _flags)		\
+	DYNAMIC_DEBUG_CLASSMAP_PARAM_(_model, _flags, ~0)
+#else
+#define DYNAMIC_DEBUG_CLASSMAP_PARAM(_model, _flags)		\
+	DYNAMIC_DEBUG_CLASSMAP_PARAM_(_model, _flags, 0)
+#endif
+
+/*
+ * Demonstrate/test DISJOINT & LEVEL typed classmaps with a sys-param.
+ *
+ * To comport with DRM debug-category (an int), classmaps map names to
+ * ids (also an int).  So a classmap starts with an enum; DRM has enum
+ * debug_category: with DRM_UT_<CORE,DRIVER,KMS,etc>.  We use the enum
+ * values as class-ids, and stringified enum-symbols as classnames.
+ *
+ * Modules with multiple CLASSMAPS must have enums with distinct
+ * value-ranges, as arranged below with explicit enum_sym = X inits.
+ * To clarify this sharing, declare the 2 enums now, for the 2
+ * different classmap types
+ */
 
 /* numeric input, independent bits */
 enum cat_disjoint_bits {
@@ -60,26 +94,51 @@ enum cat_disjoint_bits {
 	D2_LEASE,
 	D2_DP,
 	D2_DRMRES };
-DECLARE_DYNDBG_CLASSMAP(map_disjoint_bits, DD_CLASS_TYPE_DISJOINT_BITS, 0,
-			"D2_CORE",
-			"D2_DRIVER",
-			"D2_KMS",
-			"D2_PRIME",
-			"D2_ATOMIC",
-			"D2_VBL",
-			"D2_STATE",
-			"D2_LEASE",
-			"D2_DP",
-			"D2_DRMRES");
-DD_SYS_WRAP(disjoint_bits, p);
-DD_SYS_WRAP(disjoint_bits, T);
-
-/* numeric verbosity, V2 > V1 related */
-enum cat_level_num { V0 = 14, V1, V2, V3, V4, V5, V6, V7 };
-DECLARE_DYNDBG_CLASSMAP(map_level_num, DD_CLASS_TYPE_LEVEL_NUM, 14,
-		       "V0", "V1", "V2", "V3", "V4", "V5", "V6", "V7");
-DD_SYS_WRAP(level_num, p);
-DD_SYS_WRAP(level_num, T);
+
+/* numeric verbosity, V2 > V1 related.  V0 is > D2_DRMRES */
+enum cat_level_num { V0 = 16, V1, V2, V3, V4, V5, V6, V7 };
+
+/* recapitulate DRM's multi-classmap setup */
+#if !defined(TEST_DYNAMIC_DEBUG_SUBMOD)
+/*
+ * In single user, or parent / coordinator (drm.ko) modules, define
+ * classmaps on the client enums above, and then declares the PARAMS
+ * ref'g the classmaps.  Each is exported.
+ */
+DYNAMIC_DEBUG_CLASSMAP_DEFINE(map_disjoint_bits, DD_CLASS_TYPE_DISJOINT_BITS,
+			      D2_CORE,
+			      "D2_CORE",
+			      "D2_DRIVER",
+			      "D2_KMS",
+			      "D2_PRIME",
+			      "D2_ATOMIC",
+			      "D2_VBL",
+			      "D2_STATE",
+			      "D2_LEASE",
+			      "D2_DP",
+			      "D2_DRMRES");
+
+DYNAMIC_DEBUG_CLASSMAP_DEFINE(map_level_num, DD_CLASS_TYPE_LEVEL_NUM,
+			      V0, "V0", "V1", "V2", "V3", "V4", "V5", "V6", "V7");
+
+/*
+ * now add the sysfs-params
+ */
+
+DYNAMIC_DEBUG_CLASSMAP_PARAM(disjoint_bits, p);
+DYNAMIC_DEBUG_CLASSMAP_PARAM(level_num, p);
+
+#else /* TEST_DYNAMIC_DEBUG_SUBMOD */
+
+/*
+ * in submod/drm-drivers, use the classmaps defined in top/parent
+ * module above.
+ */
+
+DYNAMIC_DEBUG_CLASSMAP_USE(map_disjoint_bits);
+DYNAMIC_DEBUG_CLASSMAP_USE(map_level_num);
+
+#endif
 
 /* stand-in for all pr_debug etc */
 #define prdbg(SYM) __pr_debug_cls(SYM, #SYM " msg\n")
@@ -115,6 +174,7 @@ static void do_levels(void)
 
 static void do_prints(void)
 {
+	pr_debug("do_prints:\n");
 	do_cats();
 	do_levels();
 }
diff --git a/lib/test_dynamic_debug_submod.c b/lib/test_dynamic_debug_submod.c
new file mode 100644
index 000000000000..672aabf40160
--- /dev/null
+++ b/lib/test_dynamic_debug_submod.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Kernel module for testing dynamic_debug
+ *
+ * Authors:
+ *      Jim Cromie	<jim.cromie@gmail.com>
+ */
+
+/*
+ * clone the parent, inherit all the properties, for consistency and
+ * simpler accounting in test expectations.
+ */
+#define TEST_DYNAMIC_DEBUG_SUBMOD
+#include "test_dynamic_debug.c"
-- 
2.53.0


^ permalink raw reply related

* [PATCH v13 29/36] dyndbg-API: promote DYNAMIC_DEBUG_CLASSMAP_PARAM to API
From: Jim Cromie @ 2026-04-08 20:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: gregkh, jbaron, louis.chauvet, Jim Cromie, linux-doc
In-Reply-To: <20260408200211.43821-1-jim.cromie@gmail.com>

Move the DYNAMIC_DEBUG_CLASSMAP_PARAM macro from test_dynamic_debug.c into
the header, and refine it by distinguishing the 2 use cases:

1. DYNAMIC_DEBUG_CLASSMAP_PARAM_REF
    for DRM, to pass in extern __drm_debug by name.
    dyndbg keeps bits in it, so drm can still use it as before.

2. DYNAMIC_DEBUG_CLASSMAP_PARAM
    new user (test_dynamic_debug) doesn't need to share state;
    declares a static unsigned long to store the bitvec.

__DYNAMIC_DEBUG_CLASSMAP_PARAM
   bottom layer - allocate,init a ddebug-class-param, module-param-cb.
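
The layering described above can be sketched in plain userspace C.  This
is only an illustration, not the kernel code: the struct names, the
SKETCH_* macros, sketch_set_bit() and shared_debug_bits are all
hypothetical stand-ins, and module_param_cb() is left out entirely.

```c
/* Hypothetical userspace stand-ins; not the kernel's struct definitions. */
struct classmap_sketch {
	const char **class_names;
	int length;
};

struct class_param_sketch {
	unsigned long *bits;			/* where the class state lives */
	const char *flags;			/* flag chars to toggle, e.g. "p" */
	const struct classmap_sketch *map;
};

/* bottom layer: bind caller-chosen storage and a classmap to a param record */
#define SKETCH_PARAM__(_name, _bits, _var, _flags)			\
	static struct class_param_sketch _name##_##_flags = {		\
		.bits = &(_bits),					\
		.flags = #_flags,					\
		.map = &(_var),						\
	}

/* private-storage variant: declares its own bit-vector */
#define SKETCH_PARAM(_name, _var, _flags)				\
	static unsigned long _name##_bvec;				\
	SKETCH_PARAM__(_name, _name##_bvec, _var, _flags)

/* shared-storage variant: caller names an existing bit-vector */
#define SKETCH_PARAM_REF(_name, _bits, _var, _flags)			\
	SKETCH_PARAM__(_name, _bits, _var, _flags)

static const char *demo_names[] = { "D2_CORE", "D2_DRIVER", "D2_KMS" };
static const struct classmap_sketch demo_map = { demo_names, 3 };

/* assumed stand-in for a module-wide variable such as __drm_debug */
unsigned long shared_debug_bits;

SKETCH_PARAM(p_private, demo_map, p);
SKETCH_PARAM_REF(p_shared, shared_debug_bits, demo_map, p);

/* set one class bit through the param record; returns the new bit-vector */
static unsigned long sketch_set_bit(struct class_param_sketch *cp, int bit)
{
	*cp->bits |= 1UL << bit;
	return *cp->bits;
}
```

In this sketch a write through p_shared_p is visible in
shared_debug_bits, which other code can keep testing directly, while
p_private_p only touches its own p_private_bvec.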

Modify ddebug_sync_classbits() argtype deref inside the fn, to give
access to all kp members.

Also add stub macros, clean up and improve comments in test-code, and
add MODULE_DESCRIPTIONs.

cc: linux-doc@vger.kernel.org
Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
---
 include/linux/dynamic_debug.h   | 40 ++++++++++++++++++++++
 lib/dynamic_debug.c             | 60 ++++++++++++++++++++++-----------
 lib/test_dynamic_debug.c        | 47 ++++++++++----------------
 lib/test_dynamic_debug_submod.c |  9 ++++-
 4 files changed, 106 insertions(+), 50 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index a1c75237abaa..1cae9a2f32d7 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -273,6 +273,44 @@ struct _ddebug_class_param {
 		.offset = _offset					\
 	}
 
+/**
+ * DYNAMIC_DEBUG_CLASSMAP_PARAM - control a ddebug-classmap from a sys-param
+ * @_name:  sysfs node name
+ * @_var:   name of the classmap var defining the controlled classes/bits
+ * @_flags: flags to be toggled, typically just 'p'
+ *
+ * Creates a sysfs-param to control the classes defined by the
+ * exported classmap, with bits 0..N-1 mapped to the classes named.
+ * This version keeps class-state in a private long int.
+ */
+#define DYNAMIC_DEBUG_CLASSMAP_PARAM(_name, _var, _flags)		\
+	static unsigned long _name##_bvec;				\
+	__DYNAMIC_DEBUG_CLASSMAP_PARAM(_name, _name##_bvec, _var, _flags)
+
+/**
+ * DYNAMIC_DEBUG_CLASSMAP_PARAM_REF - wrap a classmap with a controlling sys-param
+ * @_name:  sysfs node name
+ * @_bits:  name of the module's unsigned long bit-vector, ex: __drm_debug
+ * @_var:   name of the (exported) classmap var defining the classes/bits
+ * @_flags: flags to be toggled, typically just 'p'
+ *
+ * Creates a sysfs-param to control the classes defined by the
+ * exported classmap, with bits 0..N-1 mapped to the classes named.
+ * This version keeps class-state in user @_bits.  This lets drm check
+ * __drm_debug elsewhere too.
+ */
+#define DYNAMIC_DEBUG_CLASSMAP_PARAM_REF(_name, _bits, _var, _flags)	\
+	__DYNAMIC_DEBUG_CLASSMAP_PARAM(_name, _bits, _var, _flags)
+
+#define __DYNAMIC_DEBUG_CLASSMAP_PARAM(_name, _bits, _var, _flags)	\
+	static struct _ddebug_class_param _name##_##_flags = {		\
+		.bits = &(_bits),					\
+		.flags = #_flags,					\
+		.map = &(_var),						\
+	};								\
+	module_param_cb(_name, &param_ops_dyndbg_classes,		\
+			&_name##_##_flags, 0600)
+
 extern __printf(2, 3)
 void __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...);
 
@@ -455,6 +493,8 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor,
 
 #define DYNAMIC_DEBUG_CLASSMAP_DEFINE(_var, _mapty, _base, ...)
 #define DYNAMIC_DEBUG_CLASSMAP_USE(_var)
+#define DYNAMIC_DEBUG_CLASSMAP_PARAM(_name, _var, _flags)
+#define DYNAMIC_DEBUG_CLASSMAP_PARAM_REF(_name, _bits, _var, _flags)
 #define DEFINE_DYNAMIC_DEBUG_METADATA(name, fmt)
 #define DYNAMIC_DEBUG_BRANCH(descriptor) false
 #define DECLARE_DYNDBG_CLASSMAP(...)
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index a7f67ecbc4d7..c6f3b0452dfa 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -687,6 +687,30 @@ static int ddebug_apply_class_bitmap(const struct _ddebug_class_param *dcp,
 
 #define CLASSMAP_BITMASK(width) ((1UL << (width)) - 1)
 
+static void ddebug_class_param_clamp_input(unsigned long *inrep, const struct kernel_param *kp)
+{
+	const struct _ddebug_class_param *dcp = kp->arg;
+	const struct _ddebug_class_map *map = dcp->map;
+
+	switch (map->map_type) {
+	case DD_CLASS_TYPE_DISJOINT_BITS:
+		/* expect bits. mask and warn if too many */
+		if (*inrep & ~CLASSMAP_BITMASK(map->length)) {
+			pr_warn("%s: input: 0x%lx exceeds mask: 0x%lx, masking\n",
+				KP_NAME(kp), *inrep, CLASSMAP_BITMASK(map->length));
+			*inrep &= CLASSMAP_BITMASK(map->length);
+		}
+		break;
+	case DD_CLASS_TYPE_LEVEL_NUM:
+		/* input is bitpos, of highest verbosity to be enabled */
+		if (*inrep > map->length) {
+			pr_warn("%s: level:%ld exceeds max:%d, clamping\n",
+				KP_NAME(kp), *inrep, map->length);
+			*inrep = map->length;
+		}
+		break;
+	}
+}
 static int param_set_dyndbg_module_classes(const char *instr,
 					   const struct kernel_param *kp,
 					   const char *mod_name)
@@ -705,26 +729,15 @@ static int param_set_dyndbg_module_classes(const char *instr,
 		pr_err("expecting numeric input, not: %s > %s\n", instr, KP_NAME(kp));
 		return -EINVAL;
 	}
+	ddebug_class_param_clamp_input(&inrep, kp);
 
 	switch (map->map_type) {
 	case DD_CLASS_TYPE_DISJOINT_BITS:
-		/* expect bits. mask and warn if too many */
-		if (inrep & ~CLASSMAP_BITMASK(map->length)) {
-			pr_warn("%s: input: 0x%lx exceeds mask: 0x%lx, masking\n",
-				KP_NAME(kp), inrep, CLASSMAP_BITMASK(map->length));
-			inrep &= CLASSMAP_BITMASK(map->length);
-		}
 		v2pr_info("bits:0x%lx > %s.%s\n", inrep, mod_name ?: "*", KP_NAME(kp));
 		totct += ddebug_apply_class_bitmap(dcp, &inrep, *dcp->bits, mod_name);
 		*dcp->bits = inrep;
 		break;
 	case DD_CLASS_TYPE_LEVEL_NUM:
-		/* input is bitpos, of highest verbosity to be enabled */
-		if (inrep > map->length) {
-			pr_warn("%s: level:%ld exceeds max:%d, clamping\n",
-				KP_NAME(kp), inrep, map->length);
-			inrep = map->length;
-		}
 		old_bits = CLASSMAP_BITMASK(*dcp->lvl);
 		new_bits = CLASSMAP_BITMASK(inrep);
 		v2pr_info("lvl:%ld bits:0x%lx > %s\n", inrep, new_bits, KP_NAME(kp));
@@ -1200,15 +1213,24 @@ static const struct proc_ops proc_fops = {
 static void ddebug_sync_classbits(const struct kernel_param *kp, const char *modname)
 {
 	const struct _ddebug_class_param *dcp = kp->arg;
+	unsigned long new_bits;
 
-	/* clamp initial bitvec, mask off hi-bits */
-	if (*dcp->bits & ~CLASSMAP_BITMASK(dcp->map->length)) {
-		*dcp->bits &= CLASSMAP_BITMASK(dcp->map->length);
-		v2pr_info("preset classbits: %lx\n", *dcp->bits);
+	ddebug_class_param_clamp_input(dcp->bits, kp);
+
+	switch (dcp->map->map_type) {
+	case DD_CLASS_TYPE_DISJOINT_BITS:
+		v2pr_info("  %s: classbits: 0x%lx\n", KP_NAME(kp), *dcp->bits);
+		ddebug_apply_class_bitmap(dcp, dcp->bits, 0UL, modname);
+		break;
+	case DD_CLASS_TYPE_LEVEL_NUM:
+		new_bits = CLASSMAP_BITMASK(*dcp->lvl);
+		v2pr_info("  %s: lvl:%ld bits:0x%lx\n", KP_NAME(kp), *dcp->lvl, new_bits);
+		ddebug_apply_class_bitmap(dcp, &new_bits, 0UL, modname);
+		break;
+	default:
+		pr_err("bad map type %d\n", dcp->map->map_type);
+		return;
 	}
-	/* force class'd prdbgs (in USEr module) to match (DEFINEr module) class-param */
-	ddebug_apply_class_bitmap(dcp, dcp->bits, ~0, modname);
-	ddebug_apply_class_bitmap(dcp, dcp->bits, 0, modname);
 }
 
 static void ddebug_match_apply_kparam(const struct kernel_param *kp,
diff --git a/lib/test_dynamic_debug.c b/lib/test_dynamic_debug.c
index 5de839f13c8b..d65fa3f3ef9e 100644
--- a/lib/test_dynamic_debug.c
+++ b/lib/test_dynamic_debug.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * Kernel module for testing dynamic_debug
+ * Kernel module to test/demonstrate dynamic_debug features,
+ * particularly classmaps and their support for subsystems like DRM.
  *
  * Authors:
  *      Jim Cromie	<jim.cromie@gmail.com>
@@ -57,24 +58,6 @@ module_param_cb(do_prints, &param_ops_do_prints, NULL, 0600);
 
 #define CLASSMAP_BITMASK(width, base) (((1UL << (width)) - 1) << (base))
 
-/* sysfs param wrapper, proto-API */
-#define DYNAMIC_DEBUG_CLASSMAP_PARAM_(_model, _flags, _init)		\
-	static unsigned long bits_##_model = _init;			\
-	static struct _ddebug_class_param _flags##_##_model = {		\
-		.bits = &bits_##_model,					\
-		.flags = #_flags,					\
-		.map = &map_##_model,					\
-	};								\
-	module_param_cb(_flags##_##_model, &param_ops_dyndbg_classes,	\
-			&_flags##_##_model, 0600)
-#ifdef DEBUG
-#define DYNAMIC_DEBUG_CLASSMAP_PARAM(_model, _flags)		\
-	DYNAMIC_DEBUG_CLASSMAP_PARAM_(_model, _flags, ~0)
-#else
-#define DYNAMIC_DEBUG_CLASSMAP_PARAM(_model, _flags)		\
-	DYNAMIC_DEBUG_CLASSMAP_PARAM_(_model, _flags, 0)
-#endif
-
 /*
  * Demonstrate/test DISJOINT & LEVEL typed classmaps with a sys-param.
  *
@@ -105,12 +88,15 @@ enum cat_disjoint_bits {
 /* numeric verbosity, V2 > V1 related.  V0 is > D2_DRMRES */
 enum cat_level_num { V0 = 16, V1, V2, V3, V4, V5, V6, V7 };
 
-/* recapitulate DRM's multi-classmap setup */
+/*
+ * use/demonstrate multi-module-group classmaps, as for DRM
+ */
 #if !defined(TEST_DYNAMIC_DEBUG_SUBMOD)
 /*
- * In single user, or parent / coordinator (drm.ko) modules, define
- * classmaps on the client enums above, and then declares the PARAMS
- * ref'g the classmaps.  Each is exported.
+ * For module-groups of 1+, define classmaps with names (stringified
+ * enum-symbols) copied from above. 1-to-1 mapping is recommended.
+ * The classmap is exported, so that other modules in the group can
+ * link to it and control their prdbgs.
  */
 DYNAMIC_DEBUG_CLASSMAP_DEFINE(map_disjoint_bits, DD_CLASS_TYPE_DISJOINT_BITS,
 			      D2_CORE,
@@ -129,11 +115,13 @@ DYNAMIC_DEBUG_CLASSMAP_DEFINE(map_level_num, DD_CLASS_TYPE_LEVEL_NUM,
 			      V0, "V0", "V1", "V2", "V3", "V4", "V5", "V6", "V7");
 
 /*
- * now add the sysfs-params
+ * for use-cases that want it, provide a sysfs-param to set the
+ * classes in the classmap.  It is at this interface where the
+ * "v3>v2" property is applied to DD_CLASS_TYPE_LEVEL_NUM inputs.
  */
 
-DYNAMIC_DEBUG_CLASSMAP_PARAM(disjoint_bits, p);
-DYNAMIC_DEBUG_CLASSMAP_PARAM(level_num, p);
+DYNAMIC_DEBUG_CLASSMAP_PARAM(p_disjoint_bits,	map_disjoint_bits, p);
+DYNAMIC_DEBUG_CLASSMAP_PARAM(p_level_num,	map_level_num, p);
 
 #ifdef FORCE_CLASSID_CONFLICT
 /*
@@ -144,12 +132,10 @@ DYNAMIC_DEBUG_CLASSMAP_DEFINE(classid_range_conflict, 0, D2_CORE + 1, "D3_CORE")
 #endif
 
 #else /* TEST_DYNAMIC_DEBUG_SUBMOD */
-
 /*
- * in submod/drm-drivers, use the classmaps defined in top/parent
- * module above.
+ * the +1 members of a multi-module group refer to the classmap
+ * DEFINEd (and exported) above.
  */
-
 DYNAMIC_DEBUG_CLASSMAP_USE(map_disjoint_bits);
 DYNAMIC_DEBUG_CLASSMAP_USE(map_level_num);
 
@@ -228,6 +214,7 @@ static void __exit test_dynamic_debug_exit(void)
 module_init(test_dynamic_debug_init);
 module_exit(test_dynamic_debug_exit);
 
+MODULE_DESCRIPTION("test/demonstrate dynamic-debug features");
 MODULE_AUTHOR("Jim Cromie <jim.cromie@gmail.com>");
 MODULE_DESCRIPTION("Kernel module for testing dynamic_debug");
 MODULE_LICENSE("GPL");
diff --git a/lib/test_dynamic_debug_submod.c b/lib/test_dynamic_debug_submod.c
index 672aabf40160..3adf3925fb86 100644
--- a/lib/test_dynamic_debug_submod.c
+++ b/lib/test_dynamic_debug_submod.c
@@ -1,6 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Kernel module for testing dynamic_debug
+ * Kernel module to test/demonstrate dynamic_debug features,
+ * particularly classmaps and their support for subsystems, like DRM,
+ * which defines its drm_debug classmap in drm module, and uses it in
+ * helpers & drivers.
  *
  * Authors:
  *      Jim Cromie	<jim.cromie@gmail.com>
@@ -12,3 +15,7 @@
  */
 #define TEST_DYNAMIC_DEBUG_SUBMOD
 #include "test_dynamic_debug.c"
+
+MODULE_DESCRIPTION("test/demonstrate dynamic-debug subsystem support");
+MODULE_AUTHOR("Jim Cromie <jim.cromie@gmail.com>");
+MODULE_LICENSE("GPL");
-- 
2.53.0


^ permalink raw reply related

* [PATCH v13 35/36] docs/dyndbg: add classmap info to howto
From: Jim Cromie @ 2026-04-08 20:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: gregkh, jbaron, louis.chauvet, Jim Cromie, linux-doc
In-Reply-To: <20260408200211.43821-1-jim.cromie@gmail.com>

Describe the API macros providing dynamic_debug's classmaps:

DYNAMIC_DEBUG_CLASSMAP_DEFINE - create & export a classmap
DYNAMIC_DEBUG_CLASSMAP_USE    - refer to exported map
DYNAMIC_DEBUG_CLASSMAP_PARAM  - bind control param to the classmap
DYNAMIC_DEBUG_CLASSMAP_PARAM_REF + use module's storage - __drm_debug

NB: The _DEFINE & _USE model makes the user dependent on the definer,
just like EXPORT_SYMBOL(__drm_debug) already does.
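
As a rough illustration of the _DEFINE & _USE model, here is a
userspace C sketch.  It assumes simplified, hypothetical structs
(classmap_sketch, class_user_sketch), made-up class names, and a
hypothetical lookup helper sketch_class_id(); the kernel's records and
section handling differ.

```c
#include <string.h>

/* Hypothetical stand-ins; the kernel's struct layouts differ. */
struct classmap_sketch {
	const char *mod_name;			/* defining module */
	const char **class_names;
	int base;				/* class-id of class_names[0] */
	int length;
};

struct class_user_sketch {
	const char *mod_name;			/* using module */
	const struct classmap_sketch *map;	/* refers to the one definition */
};

/* defined once, in the "core" module */
static const char *x_names[] = { "X_CORE", "X_DRIVER", "X_KMS" };
const struct classmap_sketch x_map = {
	.mod_name = "core", .class_names = x_names, .base = 0, .length = 3,
};

/* used many times: each "driver" holds only a reference to the definition */
const struct class_user_sketch driver_a_use = { "driver_a", &x_map };
const struct class_user_sketch driver_b_use = { "driver_b", &x_map };

/* resolve a class name to its class-id via a user record, or -1 if unknown */
int sketch_class_id(const struct class_user_sketch *user, const char *name)
{
	const struct classmap_sketch *cm = user->map;
	int i;

	for (i = 0; i < cm->length; i++)
		if (!strcmp(cm->class_names[i], name))
			return cm->base + i;
	return -1;
}
```

Because both user records point at the same definition, the names and
ids cannot drift apart between modules; the cost is that the users
cannot exist without the definer.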

cc: linux-doc@vger.kernel.org
Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
---
 .../admin-guide/dynamic-debug-howto.rst       | 132 ++++++++++++++++--
 1 file changed, 122 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/dynamic-debug-howto.rst b/Documentation/admin-guide/dynamic-debug-howto.rst
index 0a42b9de55ac..734be0b5fe9a 100644
--- a/Documentation/admin-guide/dynamic-debug-howto.rst
+++ b/Documentation/admin-guide/dynamic-debug-howto.rst
@@ -146,6 +146,9 @@ keywords are::
   "1-30" is valid range but "1 - 30" is not.
 
 
+Keywords
+--------
+
 The meanings of each keyword are:
 
 func
@@ -194,16 +197,6 @@ format
 	format "nfsd: SETATTR"  // a neater way to match a format with whitespace
 	format 'nfsd: SETATTR'  // yet another way to match a format with whitespace
 
-class
-    The given class_name is validated against each module, which may
-    have declared a list of known class_names.  If the class_name is
-    found for a module, callsite & class matching and adjustment
-    proceeds.  Examples::
-
-	class DRM_UT_KMS	# a DRM.debug category
-	class JUNK		# silent non-match
-	// class TLD_*		# NOTICE: no wildcard in class names
-
 line
     The given line number or range of line numbers is compared
     against the line number of each ``pr_debug()`` callsite.  A single
@@ -218,6 +211,25 @@ line
 	line -1605          // the 1605 lines from line 1 to line 1605
 	line 1600-          // all lines from line 1600 to the end of the file
 
+class
+
+    The given class_name is validated against each module, which may
+    have declared a list of class_names it accepts.  If the class_name
+    is accepted by a module, callsite & class matching and adjustment
+    proceeds.  Examples::
+
+	class DRM_UT_KMS	# a drm.debug category
+	class JUNK		# silent non-match
+	// class TLD_*		# NOTICE: no wildcard in class names
+
+.. note::
+
+    Unlike other keywords, classes are "name-to-change", not
+    "omitting-constraint-allows-change".  See Dynamic Debug Classmaps
+
+Flags
+-----
+
 The flags specification comprises a change operation followed
 by one or more flag characters.  The change operation is one
 of the characters::
@@ -239,6 +251,11 @@ The flags are::
   l    Include line number
   d    Include call trace
 
+.. note::
+
+   * To query without changing, use ``+_`` or ``-_``.
+   * To clear all flags, use ``=_`` or ``-fslmpt``.
+
 For ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``, only
 the ``p`` flag has meaning, other flags are ignored.
 
@@ -395,3 +412,98 @@ just a shortcut for ``print_hex_dump(KERN_DEBUG)``.
 For ``print_hex_dump_debug()``/``print_hex_dump_bytes()``, format string is
 its ``prefix_str`` argument, if it is constant string; or ``hexdump``
 in case ``prefix_str`` is built dynamically.
+
+.. _dyndbg-classmaps:
+
+Dynamic Debug Classmaps
+=======================
+
+The "class" keyword selects prdbgs based on author-supplied,
+domain-oriented names.  This complements the nested-scope keywords:
+module, file, function, line.
+
+The main difference from the others: classes must be named to be
+changed.  This protects them from unintended overwrite::
+
+  # IOW this cannot undo any drm.debug settings
+  :#> ddcmd -p
+
+This protection is needed; /sys/module/drm/parameters/debug is ABI.
+drm.debug is authoritative when dyndbg is not used; dyndbg-under-DRM
+is an implementation detail, and must not behave erratically just
+because another admin fed >control something unrelated.
+
+So each class must be enabled individually (no wildcards)::
+
+  :#> ddcmd class DRM_UT_CORE +p
+  :#> ddcmd class DRM_UT_KMS +p
+  # or more selectively
+  :#> ddcmd class DRM_UT_CORE module drm +p
+
+That makes direct >control wordy and annoying, but it is a secondary
+interface; it is not intended to replace the ABI, just slide in
+underneath and reimplement the guaranteed behavior.  So DRM would keep
+using the convenient way, and be able to trust it::
+
+  :#> echo 0x1ff > /sys/module/drm/parameters/debug
+
+That said, since the sysfs/kparam is the ABI, if the author omits the
+CLASSMAP_PARAM, there's no ABI to guard, and they probably want a less
+pedantic >control interface.  In this case, protection is dropped.
+
+Dynamic Debug Classmap API
+==========================
+
+DYNAMIC_DEBUG_CLASSMAP_DEFINE(clname,type,_base,classnames) - this maps
+classnames (a list of strings) onto class-ids consecutively, starting
+at _base.
+
+DYNAMIC_DEBUG_CLASSMAP_USE(clname) & _USE_(clname,_base) - modules
+call this to refer to the var _DEFINEd elsewhere (and exported).
+
+DYNAMIC_DEBUG_CLASSMAP_PARAM(name,clname,flags) - creates the sysfs/kparam,
+maps/exposes bits 0..N-1 as class-names.
+
+Classmaps are opt-in: modules invoke _DEFINE or _USE to authorize
+dyndbg to update those named classes.  "class FOO" queries are
+validated against the classes defined or used by the module; this
+finds the classid to alter.  Classes are not directly selectable by
+their classid.
+
+Classnames are global in scope, so subsystems (module-groups) should
+prepend a subsystem name; unqualified names like "CORE" are discouraged.
+
+NB: It is an inherent API limitation (due to class_id's int type) that
+the following are possible::
+
+  // these errors should be caught in review
+  __pr_debug_cls(0, "fake DRM_UT_CORE msg");  // this works
+  __pr_debug_cls(62, "un-known classid msg"); // this compiles, does nothing
+
+There are 2 types of classmaps:
+
+* DD_CLASS_TYPE_DISJOINT_BITS: classes are independent, like drm.debug
+* DD_CLASS_TYPE_LEVEL_NUM: classes are relative, ordered (V3 > V2)
+
+DYNAMIC_DEBUG_CLASSMAP_PARAM - modelled after module_param_cb, it
+refers to a DEFINEd classmap, and associates it to the param's
+data-store.  This state is then applied to DEFINEr and USEr modules
+when they're modprobed.
+
+The PARAM interface also enforces the DD_CLASS_TYPE_LEVEL_NUM relation
+amongst the contained classnames; all classes are independent in the
+control parser itself.  There is no implied meaning in names like "V4"
+or "PL_ERROR" vs "PL_WARNING".
+
+Modules or subsystems (drm & drivers) can define multiple classmaps,
+as long as they (all the classmaps) share the limited 0..62
+per-module-group _class_id range, without overlap.
+
+If a module encounters a conflict between 2 classmaps it is _USEing or
+_DEFINEing, it can invoke the extended _USE_(name,_base) macro to
+de-conflict the respective ranges.
+
+``#define DEBUG`` will enable all pr_debugs in scope, including any
+class'd ones.  This won't be reflected in the PARAM readback value,
+but the class'd pr_debug callsites can be forced off by toggling the
+classmap-kparam all-on then all-off.
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH v10 12/21] gpu: nova-core: mm: Add unified page table entry wrapper enums
From: Joel Fernandes @ 2026-04-08 20:19 UTC (permalink / raw)
  To: Alexandre Courbot, Eliot Courtney, Danilo Krummrich
  Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo, Bjorn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
	Dave Airlie, Daniel Almeida, Koen Koning, dri-devel,
	rust-for-linux, Nikola Djukic, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
	Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
	Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
	Alex Gaynor, Boqun Feng, John Hubbard, Alistair Popple,
	Timur Tabi, Edwin Peer, Andrea Righi, Andy Ritger, Zhi Wang,
	Balbir Singh, Philipp Stanner, Elle Rhumsaa, alexeyi, joel,
	linux-doc, amd-gfx, intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <DHNKYBM159T9.2UUQ7CU0RN0BU@nvidia.com>

Hi Alex, Eliot, Danilo,

Thanks for taking a look. Let me respond to the specific points below.

On Wed, 08 Apr 2026, Alexandre Courbot wrote:
> After a quick look I'd say that having a trait here would actually be
> *good* for correctness and maintainability.
>
> The current design implies that every operation on a page table (most
> likely using the walker) goes through a branching point. Just looking at
> `PtWalk::read_pte_at_level`, there are already at least 6
> `if version == 2 { } else { }` branches that all resolve to the same
> result. Include walking down the PDEs and you have at least a dozen of
> these just to resolve a virtual address. I know CPUs are fast, but this
> is still wasted cycles for no good reason.

I did some measurements and there is no noticeable difference between the two
approaches. I ran perf and loaded nova with self-tests running. The extra
potential branching is lost in the noise. In both cases, loading nova and
running the self-tests has ~119.7M branch instructions on my Ampere. The total
instruction count is also identical (~615M).

I measured like this:
perf stat -e branches,branch-misses,cache-references,cache-misses,instructions,cycles \
    -- modprobe nova_core

So I think the branching argument is not a strong one. I also did more
measurements and the dominant time taken is MMIO. During the map prep and
execute, page table walks are done. A TLB flush alone costs ~1.4 microseconds.
And a PRAMIN BAR0 write of the PTE also takes about 1 microsecond. Considering
this, I don't think the extra branching argument holds (even without branch
prediction and speculation).

Also some branches cannot be eliminated even with parameterization:

    if level == self.mmu_version.dual_pde_level() {
        // 128-bit dual PDE read
    } else {
        // Regular 64-bit PDE read
    }

This isn't really a version branch -- it's a structural branch that
distinguishes between 64-bit PDE and 128-bit dual PDE entries. Any MMU
version with a dual PDE level would need this same distinction.

I also did code-generation size analysis (see diff of code used below):

Code generation analysis:

  Module .ko size:   Before: 511,792 bytes   After: 524,464 bytes  (+2.5%)
  .text section:     Before: 112,620 bytes   After: 116,628 bytes  (+4,008 bytes)

  The +4K .text growth is the monomorphization cost: every generic function
  is compiled twice (once for MmuV2, once for MmuV3).

> If you use a trait here, and make `PtWalk` generic against it, you can
> optimize this away. We had a similar situation when we introduced Turing
> support and the v2 ucode header, and tried both approaches: the
> trait-based one was slightly shorter, and arguably more readable.

Actually I was the one who suggested traits for the Falcon ucode descriptor, as
you can see in this thread [1]. So basically you and Eliot are telling me to do what I
suggested in [1]. :-) However, I disagree that it is the right choice for this code.

[1] https://lore.kernel.org/all/20251117231028.GA1095236@joelbox2/

I think the two cases are quite different in complexity:

The falcon ucode descriptor is essentially a set of flat field accessors
and a few params (imem_sec_load_params, dmem_load_params).
The trait has ~10 simple getter methods. There's no multi-level hierarchy,
no walker, and no generic propagation.

The MMU page table case is structurally different. Making PtWalk generic
over an Mmu trait would require:

  - PtWalk<M: Mmu> (the walker)
  - Plus all the associated types: M::Pte, M::Pde, M::DualPde each
    needing their own trait bounds

And we would also need:
  - Vmm<M: Mmu> (which creates PtWalk)
  - BarUser<M: Mmu> (which creates Vmm)

I am also against making Vmm an enum as Eliot suggested:
       enum Vmm {
           V2(VmmInner<MmuV2>),
           V3(VmmInner<MmuV3>),
       }

That moves the version complexity up to the reader. Code complexity IMO should
decrease as we go up abstractions, making it easier for users (Vmm/Bar).

If you look at the changes in vmm.rs to handle version dispatch there [2]:
Added: +109
Removed: -28

[2]
https://github.com/Edgeworth/linux/commit/3627af550b61256184d589e7ec666c1108971f0e

The main benefit of my approach is that version-specific dispatch complexity is
completely isolated inside MmuVersion, making the code outside of
pagetable.rs much more readable, without having to parameterize anything and
without a code size increase. I think that is worth considering.

> But the main argument to use a trait here IMO is that it enables
> associated types and constants. That's particularly critical since some
> equivalent fields have different lengths between v2 and v3. An
> associated `Bounded` type for these would force the caller to validate
> the length of these fields before calling a non-fallible operation,
> which is exactly the level of caution that we want when dealing with
> page tables.

I think Bounded validation is orthogonal to the dispatch model.
We can add Bounded to the current design without restructuring
into traits. For example:

    // In ver2::Pte
    pub fn new_vram(pfn: Bounded<Pfn, 25>, writable: bool) -> Self { ... }

    // In ver3::Pte
    pub fn new_vram(pfn: Bounded<Pfn, 40>, writable: bool) -> Self { ... }

The unified Pte enum wrapper already dispatches to the correct
version-specific constructor, which would enforce the correct Bounded
constraint for that version.

> In order to fully benefit from it, we will need the bitfield macro from
> the `kernel` crate so the PDE/PTE fields can be `Bounded`, I will try to
> make it available quickly in a patch that you can depend on.

That would be great, and I'd be happy to integrate Bounded validation once
the macro is available. I just don't think we need to restructure the
dispatch model in order to benefit from it.

> But long story short, and although I need to dive deeper into the code,
> this looks like a good candidate for using a trait and associated types.

The walker code (walk.rs) is already version-agnostic and reads cleanly.
The version dispatch is encapsulated behind method calls, not exposed as
inline if/else blocks.

Generic propagation (or version-specific dispatch at higher levels) adds more
complexity at higher layers.

Linked below [3] is the diff I used for my testing with the data. I don't
really see a net readability win there (IMO, it is a net loss in readability).

[3]
https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=trait-pt-dispatch&id=5eb0e98af11ba608ff4d0f7a06065ee863f5066a

thanks,

--
Joel Fernandes



* [PATCH 0/2] KVM: arm64: Add per-VM WFI/WFE exit disable capability
From: David Woodhouse @ 2026-04-08 20:23 UTC (permalink / raw)
  To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Marc Zyngier,
	Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, kvm, linux-doc, linux-kernel,
	linux-arm-kernel, kvmarm, linux-kselftest, Colton Lewis,
	Jing Zhang, David Woodhouse

Add KVM_CAP_ARM_DISABLE_EXITS, modelled after the existing x86
KVM_CAP_X86_DISABLE_EXITS, to allow userspace to disable WFI and/or
WFE trapping on a per-VM basis.

KVM already has system-wide kernel command line parameters
(kvm-arm.wfi_trap_policy and kvm-arm.wfe_trap_policy, added in
0b5afe05377d) to control WFx trapping. However, these are global and
set at boot time. A per-VM capability allows the VMM to make the
decision per guest — for example, disabling WFI trapping for
latency-sensitive VMs with pinned vCPUs while keeping it enabled for
overcommitted guests on the same host.

When a flag is set via KVM_ENABLE_CAP, the corresponding trap is
unconditionally cleared, overriding the system-wide policy. When the
flag is not set, the system policy (including the default
single-task heuristic) applies as before.

As with the x86 equivalent, disabling exits is a one-way operation
per VM.
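
The semantics described above can be sketched as a small userspace model.
This is only an illustration, not the kernel code: the flag values are the
ones proposed in patch 1/2, while struct vm_model, enum wfx_policy and the
model_* helpers are hypothetical names invented for this sketch, and the
single-task heuristic is reduced to a boolean supplied by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as proposed in patch 1/2 of this series. */
#define KVM_ARM_DISABLE_EXITS_WFI   (1u << 0)
#define KVM_ARM_DISABLE_EXITS_WFE   (1u << 1)
#define KVM_ARM_DISABLE_VALID_EXITS (KVM_ARM_DISABLE_EXITS_WFI | \
				     KVM_ARM_DISABLE_EXITS_WFE)

/* Hypothetical stand-ins for the system-wide WFx trap policy values. */
enum wfx_policy { POLICY_SINGLE_TASK, POLICY_NOTRAP, POLICY_TRAP };

/* Hypothetical per-VM state; mirrors the two bools added to struct kvm_arch. */
struct vm_model {
	bool wfi_in_guest;
	bool wfe_in_guest;
};

/*
 * Models KVM_ENABLE_CAP: unknown bits are rejected with -EINVAL, known
 * bits only ever set a flag. Nothing clears a flag again, which is what
 * makes the operation one-way; an argument of 0 changes nothing.
 */
static int model_enable_cap(struct vm_model *vm, uint64_t arg)
{
	assert(vm != NULL);

	if (arg & ~(uint64_t)KVM_ARM_DISABLE_VALID_EXITS)
		return -EINVAL;
	if (arg & KVM_ARM_DISABLE_EXITS_WFI)
		vm->wfi_in_guest = true;
	if (arg & KVM_ARM_DISABLE_EXITS_WFE)
		vm->wfe_in_guest = true;
	return 0;
}

/*
 * Models the trap decision: the per-VM flag wins unconditionally,
 * otherwise the system policy applies. heuristic_ok is an assumption
 * standing in for the outcome of the default single-task heuristic.
 */
static bool model_should_clear_trap(bool in_guest, enum wfx_policy policy,
				    bool heuristic_ok)
{
	if (in_guest)
		return true;
	if (policy != POLICY_SINGLE_TASK)
		return policy == POLICY_NOTRAP;
	return heuristic_ok;
}
```

In this model, enabling WFI and later passing 0 leaves wfi_in_guest set,
and a VM with the flag set clears the trap even under a "trap" policy.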

Tested on Graviton 3 (Neoverse-V1) metal.

David Woodhouse (2):
  KVM: arm64: Add KVM_CAP_ARM_DISABLE_EXITS for WFI/WFE passthrough
  KVM: arm64: selftests: Add KVM_CAP_ARM_DISABLE_EXITS UAPI test

 Documentation/virt/kvm/api.rst                    | 28 +++++++++++++
 arch/arm64/include/asm/kvm_host.h                 |  4 ++
 arch/arm64/kvm/arm.c                              | 20 ++++++++++
 include/uapi/linux/kvm.h                          |  6 +++
 tools/testing/selftests/kvm/Makefile.kvm          |  1 +
 tools/testing/selftests/kvm/arm64/disable_exits.c | 48 +++++++++++++++++++++++
 6 files changed, 107 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/arm64/disable_exits.c



* [PATCH 2/2] KVM: arm64: selftests: Add KVM_CAP_ARM_DISABLE_EXITS UAPI test
From: David Woodhouse @ 2026-04-08 20:23 UTC (permalink / raw)
  To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Marc Zyngier,
	Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, kvm, linux-doc, linux-kernel,
	linux-arm-kernel, kvmarm, linux-kselftest, Colton Lewis,
	Jing Zhang, David Woodhouse
In-Reply-To: <20260408202557.2102476-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

Test the KVM_CAP_ARM_DISABLE_EXITS capability interface:
 - KVM_CHECK_EXTENSION reports KVM_ARM_DISABLE_EXITS_WFI and
   KVM_ARM_DISABLE_EXITS_WFE
 - KVM_ENABLE_CAP succeeds with valid flags (WFI, WFE, both, zero)
 - KVM_ENABLE_CAP fails with EINVAL for unknown flags

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 tools/testing/selftests/kvm/Makefile.kvm      |  1 +
 .../selftests/kvm/arm64/disable_exits.c       | 48 +++++++++++++++++++
 2 files changed, 49 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/arm64/disable_exits.c

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 878d7cb92555..d8e7ff122445 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -179,6 +179,7 @@ TEST_GEN_PROGS_arm64 += arm64/vgic_irq
 TEST_GEN_PROGS_arm64 += arm64/vgic_lpi_stress
 TEST_GEN_PROGS_arm64 += arm64/vgic_group_iidr
 TEST_GEN_PROGS_arm64 += arm64/vgic_group_v2
+TEST_GEN_PROGS_arm64 += arm64/disable_exits
 TEST_GEN_PROGS_arm64 += arm64/vpmu_counter_access
 TEST_GEN_PROGS_arm64 += arm64/no-vgic-v3
 TEST_GEN_PROGS_arm64 += arm64/idreg-idst
diff --git a/tools/testing/selftests/kvm/arm64/disable_exits.c b/tools/testing/selftests/kvm/arm64/disable_exits.c
new file mode 100644
index 000000000000..27fe6c9297b2
--- /dev/null
+++ b/tools/testing/selftests/kvm/arm64/disable_exits.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * disable_exits.c - Test KVM_CAP_ARM_DISABLE_EXITS UAPI
+ *
+ * Verify that KVM_CHECK_EXTENSION reports the valid exit disable mask
+ * and that KVM_ENABLE_CAP accepts valid flags and rejects invalid ones.
+ */
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+
+int main(int argc, char *argv[])
+{
+	struct kvm_vm *vm;
+	int r;
+
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_DISABLE_EXITS));
+
+	r = kvm_check_cap(KVM_CAP_ARM_DISABLE_EXITS);
+	TEST_ASSERT(r & KVM_ARM_DISABLE_EXITS_WFI,
+		    "KVM_CHECK_EXTENSION should report WFI: got 0x%x", r);
+	TEST_ASSERT(r & KVM_ARM_DISABLE_EXITS_WFE,
+		    "KVM_CHECK_EXTENSION should report WFE: got 0x%x", r);
+
+	vm = vm_create(1);
+
+	/* Valid: disable WFI trapping */
+	vm_enable_cap(vm, KVM_CAP_ARM_DISABLE_EXITS, KVM_ARM_DISABLE_EXITS_WFI);
+
+	/* Valid: disable WFE trapping */
+	vm_enable_cap(vm, KVM_CAP_ARM_DISABLE_EXITS, KVM_ARM_DISABLE_EXITS_WFE);
+
+	/* Valid: disable both */
+	vm_enable_cap(vm, KVM_CAP_ARM_DISABLE_EXITS,
+		      KVM_ARM_DISABLE_EXITS_WFI | KVM_ARM_DISABLE_EXITS_WFE);
+
+	/* Valid: no exits disabled (no-op) */
+	vm_enable_cap(vm, KVM_CAP_ARM_DISABLE_EXITS, 0);
+
+	/* Invalid: unknown bit set */
+	r = __vm_enable_cap(vm, KVM_CAP_ARM_DISABLE_EXITS, 1ULL << 31);
+	TEST_ASSERT(r == -1 && errno == EINVAL,
+		    "Unknown flags should fail with EINVAL: got %d errno %d",
+		    r, errno);
+
+	kvm_vm_free(vm);
+	return 0;
+}
-- 
2.51.0



* [PATCH 1/2] KVM: arm64: Add KVM_CAP_ARM_DISABLE_EXITS for WFI/WFE passthrough
From: David Woodhouse @ 2026-04-08 20:23 UTC (permalink / raw)
  To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Marc Zyngier,
	Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, kvm, linux-doc, linux-kernel,
	linux-arm-kernel, kvmarm, linux-kselftest, Colton Lewis,
	Jing Zhang, David Woodhouse
In-Reply-To: <20260408202557.2102476-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

Add a per-VM capability to allow userspace to disable WFI and/or WFE
trapping, modelled after x86's KVM_CAP_X86_DISABLE_EXITS. When the
corresponding flag is set, the trap is unconditionally cleared
regardless of the global kvm-arm.wf{i,e}_trap_policy setting.

The existing kernel command line parameters provide a system-wide
override, but a per-VM capability allows the VMM to make the decision
per guest.

This is useful for hypervisors running a combination of dedicated
pinned vCPUs which want to avoid the cost of trapping WFI/WFE, as
well as overcommitted floating instances where it is necessary.

As with the x86 equivalent, KVM_CHECK_EXTENSION returns the bitmask of
supported exit disables.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 Documentation/virt/kvm/api.rst    | 28 ++++++++++++++++++++++++++++
 arch/arm64/include/asm/kvm_host.h |  4 ++++
 arch/arm64/kvm/arm.c              | 20 ++++++++++++++++++++
 include/uapi/linux/kvm.h          |  6 ++++++
 4 files changed, 58 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 032516783e96..e3b3bd9edeec 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8902,6 +8902,34 @@ helpful if user space wants to emulate instructions which are not
 This capability can be enabled dynamically even if VCPUs were already
 created and are running.
 
+7.47 KVM_CAP_ARM_DISABLE_EXITS
+------------------------------
+
+:Architecture: arm64
+:Target: VM
+:Parameters: args[0] is a bitmask of exits to disable
+:Returns: 0 on success, -EINVAL if unsupported bits are set.
+
+Valid bits in args[0]:
+
+ - ``KVM_ARM_DISABLE_EXITS_WFI``: Disable trapping of WFI (Wait For
+   Interrupt) instructions. The guest WFI will execute natively instead
+   of causing a VM exit.
+
+ - ``KVM_ARM_DISABLE_EXITS_WFE``: Disable trapping of WFE (Wait For
+   Event) instructions. The guest WFE will execute natively instead of
+   causing a VM exit.
+
+When a bit is set, the corresponding trap is unconditionally cleared for
+all vCPUs in the VM, overriding the system-wide ``kvm-arm.wfi_trap_policy``
+and ``kvm-arm.wfe_trap_policy`` kernel parameters.
+
+Disabling exits is a one-way operation: once an exit type is disabled for
+a VM, it cannot be re-enabled. Calling this ioctl with args[0] = 0 is a
+no-op.
+
+``KVM_CHECK_EXTENSION`` returns the bitmask of exits that can be disabled.
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 70cb9cfd760a..a1bb025c641f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -312,6 +312,10 @@ struct kvm_arch {
 	size_t nested_mmus_size;
 	int nested_mmus_next;
 
+	/* Per-VM WFI/WFE trap overrides; set via KVM_CAP_ARM_DISABLE_EXITS */
+	bool wfi_in_guest;
+	bool wfe_in_guest;
+
 	/* Interrupt controller */
 	struct vgic_dist	vgic;
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 410ffd41fd73..326a99fea753 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -178,6 +178,17 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		}
 		mutex_unlock(&kvm->lock);
 		break;
+	case KVM_CAP_ARM_DISABLE_EXITS:
+		if (cap->args[0] & ~KVM_ARM_DISABLE_VALID_EXITS) {
+			r = -EINVAL;
+			break;
+		}
+		if (cap->args[0] & KVM_ARM_DISABLE_EXITS_WFI)
+			kvm->arch.wfi_in_guest = true;
+		if (cap->args[0] & KVM_ARM_DISABLE_EXITS_WFE)
+			kvm->arch.wfe_in_guest = true;
+		r = 0;
+		break;
 	case KVM_CAP_ARM_SEA_TO_USER:
 		r = 0;
 		set_bit(KVM_ARCH_FLAG_EXIT_SEA, &kvm->arch.flags);
@@ -379,6 +390,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_SEA_TO_USER:
 		r = 1;
 		break;
+	case KVM_CAP_ARM_DISABLE_EXITS:
+		r = KVM_ARM_DISABLE_VALID_EXITS;
+		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
 		return KVM_GUESTDBG_VALID_MASK;
 	case KVM_CAP_ARM_SET_DEVICE_ADDR:
@@ -610,6 +624,9 @@ static void vcpu_set_pauth_traps(struct kvm_vcpu *vcpu)
 
 static bool kvm_vcpu_should_clear_twi(struct kvm_vcpu *vcpu)
 {
+	if (vcpu->kvm->arch.wfi_in_guest)
+		return true;
+
 	if (unlikely(kvm_wfi_trap_policy != KVM_WFX_NOTRAP_SINGLE_TASK))
 		return kvm_wfi_trap_policy == KVM_WFX_NOTRAP;
 
@@ -621,6 +638,9 @@ static bool kvm_vcpu_should_clear_twi(struct kvm_vcpu *vcpu)
 
 static bool kvm_vcpu_should_clear_twe(struct kvm_vcpu *vcpu)
 {
+	if (vcpu->kvm->arch.wfe_in_guest)
+		return true;
+
 	if (unlikely(kvm_wfe_trap_policy != KVM_WFX_NOTRAP_SINGLE_TASK))
 		return kvm_wfe_trap_policy == KVM_WFX_NOTRAP;
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 80364d4dbebb..694cf699ed0a 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -669,6 +669,11 @@ struct kvm_ioeventfd {
 #define KVM_X86_DISABLE_EXITS_CSTATE         (1 << 3)
 #define KVM_X86_DISABLE_EXITS_APERFMPERF     (1 << 4)
 
+#define KVM_ARM_DISABLE_EXITS_WFI            (1 << 0)
+#define KVM_ARM_DISABLE_EXITS_WFE            (1 << 1)
+#define KVM_ARM_DISABLE_VALID_EXITS          (KVM_ARM_DISABLE_EXITS_WFI | \
+					      KVM_ARM_DISABLE_EXITS_WFE)
+
 /* for KVM_ENABLE_CAP */
 struct kvm_enable_cap {
 	/* in */
@@ -989,6 +994,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_ARM_SEA_TO_USER 245
 #define KVM_CAP_S390_USER_OPEREXEC 246
 #define KVM_CAP_S390_KEYOP 247
+#define KVM_CAP_ARM_DISABLE_EXITS 248
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
-- 
2.51.0



* Re: [PATCH] hwmon: (asus-ec-sensors) add ROG STRIX B650E-E GAMING WIFI
From: Veronika Kossmann @ 2026-04-08 20:28 UTC (permalink / raw)
  To: Eugene Shalygin, Guenter Roeck
  Cc: Veronika Kossmann, Veronika Kossmann, Jonathan Corbet, Shuah Khan,
	linux-hwmon, linux-doc, linux-kernel
In-Reply-To: <CAB95QATxrJa0koMq=BCjnXvLHJ5boRBUA+76FwqWJhmhEi-Tqg@mail.gmail.com>

On 4/4/26 10:12, Eugene Shalygin wrote:
> On Sat, 4 Apr 2026 at 06:38, Guenter Roeck <linux@roeck-us.net> wrote:
>> Sashiko has a problem with this patch:
> I must admit now, that these _SET macros were a bad idea, it turned
> out to be too easy to misread. I'm going to remove them.
>
> Veronika, could you, please, show us the output from sensors with this
> version of the code?
>
> Cheers,
> Eugene

Of course:

$ sensors asusec-isa-000a
asusec-isa-000a
Adapter: ISA adapter
CPU:          +37.0°C
Motherboard:  +38.0°C
VRM:          +51.0°C

These are consistent with the actual temperatures.

Best wishes,

Veronika



* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Babu Moger @ 2026-04-08 20:45 UTC (permalink / raw)
  To: Reinette Chatre, corbet@lwn.net, tony.luck@intel.com,
	Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
  Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
	akpm@linux-foundation.org, pmladek@suse.com,
	rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
	kees@kernel.org, elver@google.com, paulmck@kernel.org,
	lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
	seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
	xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
	Lendacky, Thomas, elena.reshetova@intel.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-coco@lists.linux.dev, kvm@vger.kernel.org,
	eranian@google.com, peternewman@google.com
In-Reply-To: <efc269f8-bf98-4f12-8d76-1fee564be84c@intel.com>

Hi Reinette,

On 4/7/26 23:45, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/7/26 6:01 PM, Babu Moger wrote:
>> Hi Reinette,
>>
>> On 4/7/26 12:48, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/6/26 3:45 PM, Babu Moger wrote:
>>>> Hi Reinette,
>>>>
>>>> Sorry for the late response. I was trying to get confirmation about the use case.
>>>
>>> No problem. I appreciate that you did this so that we can make sure resctrl supports
>>> needed use cases.
>>>
>>>>
>>>> On 3/31/26 17:24, Reinette Chatre wrote:
>>>>> On 3/30/26 11:46 AM, Babu Moger wrote:
>>>>>> On 3/27/26 17:11, Reinette Chatre wrote:
>>>>>>> On 3/26/26 10:12 AM, Babu Moger wrote:
>>>>>>>> On 3/24/26 17:51, Reinette Chatre wrote:
>>>>>>>>> On 3/12/26 1:36 PM, Babu Moger wrote:
>>>
>>>>> can have domains that span different CPUs. There thus seem to be a built in assumption of what a "domain"
>>>>> means for PQR_PLZA_ASSOC so it sounds to me as though, instead of saying that "PQR_PLZA_ASSOC needs
>>>>> to be the same in QoS domain" it may be more accurate to, for example, say that "PQR_PLZA_ASSOC has L3 scope"?
>>>>
>>>> Yes.
>>>
>>> Above is about L3 scope ...
>>
>> Yes. The scope for PQR_PLZA_ASSOC is L3.
>>
>> Is that what you are asking here?
> 
> I was trying to point out that there appears to be a mismatch between the actual scope and
> the planned implementation. As highlighted below during the discussion about "global" this is
> fine with me and I just wanted to confirm that this matches your intentions.

Ack.

> 
>>
>>>   
>>>>>
>>>>> This seems to be what this implementation does since it hardcodes PQR_PLZA_ASSOC scope to the L3
>>>>> resource but that creates dependency to the L3 resource that would make PLZA unusable if, for example,
>>>>> the user boots with "rdt=!l3cat" while wanting to use PLZA to manage MBA allocations when in kernel?
>>>>
>>>> Yes. that is correct. It should not be attached to one resource. We need to change it to global scope.
>>>
>>> Can I interpret "global scope" as "all online CPUs"? Doing so will simplify
>>
>> Yes. That is correct.
>>
>>
>>> supporting this feature. It does not sound practical for a user wanting to assign
>>> different resource groups to kernel work done in different domains ... the guidance should
>>> instead be to just set the allocations of one resource group to what is needed in the different
>>> domains? There may be more flexibility when supporting per-domain RMIDs though but so far
>>> it sounds as though the focus is global. We can consider what needs to be done to support
>>> some type of "per-domain" assignment as exercise whether current interface could support it
>>> in the future.
>>
>> Yes. Makes sense.
>>
>>>
> 
> ...
> 
>>>> The PLZA MSR is updated when user changes the association to the
>>>> file. No context switch code changes are needed. This will be
>>>> dedicated group. The current resctrl group files, "cpus, cpus_list
>>>
>>> Why does this have to be a dedicated group? One of the conclusions from v1
>>> discussion was that the "PLZA group" need *not* be a dedicated group. I repeated that
>>> in my earlier response that I left quoted above. You did not respond to these
>>> conclusions and statements in this regard while you keep coming back to this
>>> needing to be a dedicated group without providing a motivation to do so.
>>> Could you please elaborate why a dedicated group is required?
>>
>> If the same group applies identical limits to both user and kernel
>> space, it essentially behaves like a current resctrl group. In that
>> sense, it’s not really a PLZA group. PLZA’s key value is the ability
>> to separate allocations between user space and kernel space. A
> 
> The plan has never been to force identical allocations for user and kernel
> space since that would go against this feature entirely. Even so, just as
> user and kernel space cannot be forced to have identical allocations they
> also cannot be forced to have different allocations. Specifically,
> a task *can* use the same CLOSID for user and kernel space work just as easily
> as it can use *different* CLOSID for user and kernel space work. There
> should not be any CLOSID reserved just for kernel work. Or am I missing something?

No. You are not missing anything.


> 
>> single CPU can belong to two groups: one group manages the user-
>> space allocation for that CPU, while another manages the kernel-mode
>> allocation.
> 
> Exactly. This is why it is important to have two files for this CPU association
> within a resource group. The cpus/cpus_list file continues to be used as today
> while the new kernel_mode_cpus/kernel_mode_cpus_list is used for kernel work.
> With this a task can be associated with any resource group for its user space
> allocations but when it runs on one of the CPUs within kernel_mode_cpus then
> its kernel work will be done with allocations of the resource group the
> kernel_mode_cpus file belongs to, which may or may not be the same
> resource group that the user space task belongs to.

Yes. Exactly.

> 
>> This approach also simplifies file handling, which is another reason
>> I prefer it.
> 
> I *think* we have different interpretations of "dedicated group":
> It sounds as though you interpret "dedicated group" as a way that enforces
> the same allocations to user space and kernel work.
> I interpret "dedicated group" essentially as a CLOSID reserved for kernel
> work. Since I do not see that resctrl should dedicate a CLOSID/resource group
> for kernel work I have been pushing against such "dedicated group".

Actually, our understanding is the same. I am probably not explaining it
well. I hope we get there soon.


> 
>> That said, I’m open to not having a dedicated group if we can still support all the features that PLZA provides without it.
> 
> I find that enabling user space to share CLOSID/RMID between user space
> and kernel space to indeed support what PLZA provides. I think I am missing
> something here since below proposal again attempts to isolate a resource group
> (CLOSID) for kernel work.

No. I don't want to isolate a group just for PLZA. All I am saying is that
we should provide an option to create a dedicated group if the user wants
to do so.

> 
>>>> Add a file, "info/kmode_monitor", to describe how kmode is monitored.
>>>>
>>>> # cat info/kmode_monitor
>>>> [inherit_ctrl_and_mon] <- Kernel uses the same CLOSID/RMID as user. Default option for the "global"
>>>> assign_ctrl_inherit_mon <- One CLOSID for all kernel work; RMID inherited from user.
>>>> assign_ctrl_assign_mon <- One resource group (CLOSID+RMID) for all kernel work. Default option for "cpu" type.
>>>
>>> My first thought is that the naming is confusing. resctrl has a very strong relationship between
>>> "RMID" and "monitoring" so naming a file "monitor" that deals with allocation/ctrl/CLOSID is
>>> potentially confusion.
>>>
>>> Apart from that, while I think I understand where you are going by separating the mode into
>>> two files I am concerned about future complications needing to accommodate all different
>>> combinations of the (now) essentially two modes. My preference is thus to keep this simple by
>>> keeping the mode within one file.
>>>
>>> Even so, when stepping back, it does not really look like we need to separate the "global"
>>> and "per CPU" modes. We could just have a single "per CPU" mode and the "global" is just
>>> its default of "all CPUs", no?
>>
>> Yes. That correct.
>>
>>>
>>> Consider, for example, the implementation just consisting of:
>>>
>>>      # cat info/kernel_mode
>>>      [inherit_ctrl_and_mon]
>>>      global_assign_ctrl_inherit_mon_per_cpu
>>>      global_assign_ctrl_assign_mon_per_cpu
>>>   
>>>>
>>>> Rename “kernel_mode_assignment” to “kmode_group” to assign the specific group to kmode. This file usage is same as before.
>>>>
>>>> #cat info/kmode_groups (Renamed "kernel_mode_assignment")
>>>> //
>>>
>>> Please consider the intent of this file when thinking about names. The idea is that "info/kernel_mode"
>>> specifies the "mode" of how kernel work is handled and it determines the configuration files used in that
>>> mode as well as the syntax when interacting with those files. By renaming "kernel_mode_assignment" to
>>> "kmode_groups" it implicitly requires all future kernel mode enhancements to need some data related to "groups".
>>>
>>> In summary, I think this can be simplified by introducing just two new files in info/ that enables the
>>> user to (a) select and (b) configure the "kernel mode". To start there can be just two modes,
>>> global_assign_ctrl_inherit_mon_per_cpu and global_assign_ctrl_assign_mon_per_cpu.
>>> global_assign_ctrl_inherit_mon_per_cpu mode requires a control group in kernel_mode_assignment while
>>> global_assign_ctrl_assign_mon_per_cpu requires a control and monitoring group.
>>>
>>> The resource group in info/kernel_mode_assignment gets two additional files "kernel_mode_cpus" and
>>> "kernel_mode_cpus_list" that contains the CPUs enabled with the kernel mode configuration, by default
>>> it will be all online CPUs. The resource group can continue to be used to manage allocations of and
>>> monitor user space tasks. Specifically, the "cpus", "cpus_list", and "tasks" files remain.
>>>
>>> A user wanting just "global" settings will get just that when writing the group to
>>> info/kernel_mode_assignment. A user wanting "per CPU" settings can follow the
>>> info/kernel_mode_assignment setting with changes to that resource group's kernel_mode_cpus/kernel_mode_cpus_list
>>> files. Any task running on a CPU that is *not* in kernel_mode_cpus/kernel_mode_cpus_list can be
>>> expected to inherit both CLOSID and RMID from user space for all kernel work.
>>
>> After further consideration, I don’t think the info/kernel_mode file
>> is necessary. There’s no need to enforce a specific mode for all the
>> PLZA groups. Avoiding this constraint makes the design more
>> flexible, particularly as we move toward supporting multiple PLZA
>> groups in the future. MPAM already appears capable of handling more
>> than one group—for example, one group could use
>> inherit_ctrl_and_mon, while another could use
>> global_assign_ctrl_inherit_mon_per_cpu.
> 
> You are looking ahead at future capabilities for which we do not know all requirements
> at this time. I think it is very good to consider how things may progress and your example
> of MPAM is of course on point. I believe the current design does consider this progression.
> Please see https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@intel.com/
> (search for "per_group_assign_ctrl_assign_mon"). In that exploration per-group assignment
> is actually accomplished with global files. I thus think we should not make such a big
> architectural decision that does not benefit the immediate feature using partial information.
> As it is, a "info/kernel_mode" gives the flexibility to expand to, if needed, configuration
> files within a resource group. That is why the intention is to associate the mode within
> info/kernel_mode with the presence/absence of info/kernel_mode_assignment (search for
> "Visibility depends on active mode in info/kernel_mode" in linked email) since in the
> future resctrl may need to enable a mode that needs configuration files within each
> resource group and when enabling such mode the per-resource group files will appear
> instead of the global info/kernel_mode_assignment.
> 
>>
>> The mode can simply be determined on a per-group basis. We can introduce two new files—kernel_mode_cpus and kernel_mode_cpus_list—within each resctrl group when kmode (or PLZA) is supported.
> 
> I think having these files in every resource group is confusing since user can only interact
> with these files in one resource group for current PLZA. Why not *just* have the files in the
> resource group that matches the group in info/kernel_mode_assignment?

The default group can also serve as the PLZA group.

# cat info/kernel_mode_assignment
//

At this point, the kmode_cpus and kmode_cpus_list files will exist in
the default group.

Then the user changes the PLZA group to "test".

# echo "test//" > info/kernel_mode_assignment

At this point, we expect the kmode_cpus and kmode_cpus_list files to be
visible in the "test//" group.

One open question is whether we should remove the visibility of these 
files from the default group. It’s unclear if we can safely do this 
dynamically.

An alternative approach would be to always keep the files present, but 
allow access to them only for groups that are listed in 
"info/kernel_mode_assignment".


>>
>> The info/kernel_mode_assignment file would indicate which resctrl
>> group(or groups) is used for PLZA. The files—kernel_mode_cpus and
>> kernel_mode_cpus_list would indicate how the plza is applied which
>> each group.
> 
> The "how PLZA is applied" should be learned from info/kernel_mode where user
> space learns whether RMID is inherited or not. While I find kernel_mode_cpus
> and kernel_mode_cpus_list to be just for configuration and just found in the
> resource group listed in info/kernel_mode_assignment.

ok.

> 
>>
>> Files and behavior:
>> - cpus / cpus_list:
>>
>> CPUs listed here use the same allocation for both user and kernel space.
> 
> Both user and kernel space?

As it stands today, the group's CLOSID is written to MSR_PQR_ASSOC on the
CPUs in the cpus list, resulting in the same allocation for both user and
kernel within a given CLOS.

Kernel-mode allocation changes only if specific CPUs are included in the 
kmode_cpus list.
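
To make the rule above concrete, here is a minimal sketch of how the
CLOSID used for kernel work could be resolved. It is a model of the
behaviour being discussed, not resctrl code: struct group_model,
model_kernel_closid() and the 64-CPU bitmask are assumptions made up for
this illustration, and it only covers the allocation (CLOSID) side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 64

/* Hypothetical model of the group named in info/kernel_mode_assignment. */
struct group_model {
	uint32_t closid;      /* CLOSID owned by this resource group */
	uint64_t kmode_cpus;  /* bit n set: CPU n is listed in kmode_cpus */
};

/*
 * Returns the CLOSID that kernel-mode work on @cpu would use.
 * If @cpu is in the group's kmode_cpus mask, the group's CLOSID applies;
 * otherwise kernel work keeps the CLOSID already in effect for user
 * space (@user_closid), i.e. today's behaviour.
 */
static uint32_t model_kernel_closid(unsigned int cpu, uint32_t user_closid,
				    const struct group_model *kmode_group)
{
	assert(cpu < MODEL_MAX_CPUS);

	if (kmode_group &&
	    (kmode_group->kmode_cpus & (UINT64_C(1) << cpu)))
		return kmode_group->closid;
	return user_closid;
}
```

With an empty kmode_cpus mask the function always returns the user-space
CLOSID, which matches the "no change unless CPUs are listed" case.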


> Monitoring would depend on info/kernel_mode_assignment ("inherit_mon")
> and kernel space allocation would depend on whether the CPU on which the task runs
> can be found in kernel_mode_cpus, no?

Yes. That is correct.

> 
> 
>> There is no change to the current semantics of these files.
>> If these files are empty, the group effectively becomes a PLZA-dedicated group.
> 
> I do not see it this way. If the cpu/cpus_list files are empty then it means that the
> tasks in the group will use their own CLOSID/RMID for user space allocation and
> monitoring. What allocations/monitoring is used by tasks when in kernel mode depends
> on whether the CPU the task is running on can be found in a kernel_mode_cpus/kernel_mode_cpuslist
> file. If the CPU the task is running on can be found in a kernel_mode_cpus/kernel_mode_cpuslist
> file then it will inherit whatever the PQR_PLZA setting of that CPU which is the allocation
> associated with the resource group to which that kernel_mode_cpus/kernel_mode_cpuslist belongs.
> If the CPU the task is running on cannot be found in kernel_mode_cpus/kernel_mode_cpuslist
> then its kernel work will inherit its user space allocations and monitoring.
> 

Yes. That is correct. I think our understanding is the same, but our
implementation ideas seem to differ.

>>
>> - kernel_mode_cpus / kernel_mode_cpus_list:
>>
>> These files determine whether a separate kernel allocation is applied.
>> If empty, user and kernel share the same allocation.
>> If non-empty, the kernel uses a separate allocation.
>>
>> The group can be CTL_MON or MON group. Based on type the group the CLOSID and RMID will be used to enable PLZA. If it is MON, then rmid_en = 1 when writing PLZA MSR.
> 
> This will be difficult to get right since CTRL_MON groups also have RMID assigned.
> 
>> Here’s the proposed flow:
>>
>> # mount -t resctrl resctrl /sys/fs/resctrl/
>> # cd /sys/fs/resctrl/
>> # cat info/kernel_mode_assignment
>> //
>>
>> By default, the root (default) group is PLZA-enabled when resctrl is mounted. All CPUs use CLOSID 0 for both user and kernel-mode allocation.
>>
>> # cat cpus_list
>> 1-64
>> # cat kmode_cpus_list
>> 1-64
>>
>> Next, create a new group for PLZA:
>>
>> # mkdir plza_group
>>
>> # echo "plza_group//" > info/kernel_mode_assignment
>>
>> At this point, plza_group becomes the new PLZA-enabled group, and the PLZA-related MSRs are updated accordingly.
> 
> It really looks like you are getting back to trying to dedicate a resource group to
> kernel work and that is not something that resctrl should enforce.
> 
>>
>> # cat plza_group/cpus_list
>> <empty>
>>
>> # cat plza_group/kmode_cpus_list
>> 1-64
>>
>> The user can then update kmode_cpus_list to apply PLZA only to a specific subset of CPUs, if desired.
>>
>>
>> What do you think of this approach?
> 
> It is difficult to predict how the "next" PLZA will actually end up looking like and I find resctrl creating a complicated
> interface to support this to be risky. Instead I would prefer to focus on efficiently supporting what PLZA can do today
> and make it extensible. Apart from that I find the implicit interface, "If it is MON, then rmid_en = 1" to be too
> architecture specific for a generic interface while also not able to accurately capture user's intent (i.e. user may
> indeed, for example, want "a CTRL_MON group to have rmid_en = 1"). Finally, I am just so confused about why the implementations
> keep needing to dedicate a resource group/CLOSID to kernel work.

Let me make sure I understand what you mentioned earlier. I copied the
text below from the thread for context:

https://lore.kernel.org/lkml/3305c18e-9e50-4df0-b9f1-c61028628967@intel.com/
=====================================================================

Please consider the intent of this file when thinking about names. The idea is that "info/kernel_mode"
specifies the "mode" of how kernel work is handled and it determines the configuration files used in that
mode as well as the syntax when interacting with those files. By renaming "kernel_mode_assignment" to
"kmode_groups" it implicitly requires all future kernel mode enhancements to need some data related to "groups".

In summary, I think this can be simplified by introducing just two new files in info/ that enables the
user to (a) select and (b) configure the "kernel mode". To start there can be just two modes,
global_assign_ctrl_inherit_mon_per_cpu and global_assign_ctrl_assign_mon_per_cpu.
global_assign_ctrl_inherit_mon_per_cpu mode requires a control group in kernel_mode_assignment while
global_assign_ctrl_assign_mon_per_cpu requires a control and monitoring group.

The resource group in info/kernel_mode_assignment gets two additional files "kernel_mode_cpus" and
"kernel_mode_cpus_list" that contains the CPUs enabled with the kernel mode configuration, by default
it will be all online CPUs. The resource group can continue to be used to manage allocations of and
monitor user space tasks. Specifically, the "cpus", "cpus_list", and "tasks" files remain.

A user wanting just "global" settings will get just that when writing the group to
info/kernel_mode_assignment. A user wanting "per CPU" settings can follow the
info/kernel_mode_assignment setting with changes to that resource group's kernel_mode_cpus/kernel_mode_cpus_list
files. Any task running on a CPU that is *not* in kernel_mode_cpus/kernel_mode_cpus_list can be
expected to inherit both CLOSID and RMID from user space for all kernel work.

======================================================================

Let me try to get a few clarifications here.

# cat info/kernel_mode
   [inherit_ctrl_and_mon]
   global_assign_ctrl_inherit_mon_per_cpu
   global_assign_ctrl_assign_mon_per_cpu

My understanding of "inherit_ctrl_and_mon" is that the kernel inherits
both the CLOSID and the RMID from user space. Basically, both user and
kernel use the same CLOSID and RMID. This reflects the current behavior
(without PLZA), correct? This would correspond to the default group when
resctrl is mounted.

The modes "global_assign_ctrl_inherit_mon_per_cpu" and 
"global_assign_ctrl_assign_mon_per_cpu" represent the actual PLZA modes.

Both of these modes introduce new files kernel_mode_cpus and
kernel_mode_cpus_list in the resctrl group.

When the user echoes a group name into info/kernel_mode_assignment, PLZA
is applied globally across all CPUs. This is the default behavior.

If the user wants PLZA to apply only to a specific subset of CPUs, then 
the kernel_mode_cpus or kernel_mode_cpus_list files need to be updated 
accordingly.

global_assign_ctrl_inherit_mon_per_cpu: The group needs to be a CTRL_MON
group. This mode uses rmid_en=0 when writing the PLZA MSR.

global_assign_ctrl_assign_mon_per_cpu: The group needs to be a
CTRL_MON or MON group. This mode uses rmid_en=1 when writing the PLZA MSR.
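
The reading above of the two PLZA modes can be written down as a small
Python model. This is only an illustration of the description in this mail,
not kernel code: the mode names come from the thread, while the table layout
and the helper name check_assignment() are made up for the example.

```python
# Illustrative model only; not kernel code. Mode names are taken from the
# thread, everything else (table layout, helper name) is hypothetical.
PLZA_MODES = {
    # mode name: (group types accepted in info/kernel_mode_assignment,
    #             rmid_en value used when writing the PLZA MSR)
    "global_assign_ctrl_inherit_mon_per_cpu": (frozenset({"CTRL_MON"}), 0),
    "global_assign_ctrl_assign_mon_per_cpu": (frozenset({"CTRL_MON", "MON"}), 1),
}


def check_assignment(mode: str, group_type: str) -> int:
    """Return rmid_en for a valid (mode, group type) pair, else raise ValueError."""
    allowed, rmid_en = PLZA_MODES[mode]
    if group_type not in allowed:
        raise ValueError(f"a {group_type} group is not valid in mode {mode}")
    return rmid_en
```

In this model a MON group is rejected in global_assign_ctrl_inherit_mon_per_cpu
and accepted in global_assign_ctrl_assign_mon_per_cpu.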

Did I get it right?

Thanks
Babu


* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-08 21:24 UTC (permalink / raw)
  To: Babu Moger, corbet@lwn.net, tony.luck@intel.com,
	Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
  Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
	akpm@linux-foundation.org, pmladek@suse.com,
	rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
	kees@kernel.org, elver@google.com, paulmck@kernel.org,
	lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
	seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
	xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
	Lendacky, Thomas, elena.reshetova@intel.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-coco@lists.linux.dev, kvm@vger.kernel.org,
	eranian@google.com, peternewman@google.com
In-Reply-To: <0ae2b267-4527-4251-9136-6afdc3fc97a5@amd.com>

Hi Babu,

On 4/8/26 1:45 PM, Babu Moger wrote:
> On 4/7/26 23:45, Reinette Chatre wrote:
>> On 4/7/26 6:01 PM, Babu Moger wrote:

>>> That said, I’m open to not having a dedicated group if we can still support all the features that PLZA provides without it.
>>
>> I find that enabling user space to share CLOSID/RMID between user space
>> and kernel space to indeed support what PLZA provides. I think I am missing
>> something here since below proposal again attempts to isolate a resource group
>> (CLOSID) for kernel work.
> 
> No. I dont want to isolate a group just for PLZA. All I am saying
> is, we should provide option to create a dedicated group if the user
> wants to do it.
I agree. I do not see resctrl needing to do anything to accomplish this though. If
the user wants a group dedicated to kernel mode/PLZA then all that is needed is for the
user not to assign any tasks to this group, either via changes to the group's tasks file
or via the group's cpus/cpus_list files.

>>>
>>> The mode can simply be determined on a per-group basis. We can
>>> introduce two new files—kernel_mode_cpus and
>>> kernel_mode_cpus_list—within each resctrl group when kmode (or
>>> PLZA) is supported.
>>
>> I think having these files in every resource group is confusing since user can only interact
>> with these files in one resource group for current PLZA. Why not *just* have the files in the
>> resource group that matches the group in info/kernel_mode_assignment?
> 
> The default group can also serve as the PLZA group.
> 
> #cat info/kernel_mode_assignment
> //
> 
> At this point, the (kmode_cpus / kmode_cpus_list) files will exist in the default group:
> 
> Then user changes the PLZA group to "test".
> 
> #echo "test//" > info/kernel_mode_assignment
> 
> At this point, we expect the files "(kmode_cpus/kmode_cpus_list)" to be visible in "test//" group.
> 
> One open question is whether we should remove the visibility of these files from the default group. It’s unclear if we can safely do this dynamically.
> 
> An alternative approach would be to always keep the files present, but allow access to them only for groups that are listed in "info/kernel_mode_assignment".

The files appearing/disappearing is just how the user experiences the resctrl fs interface.
Within resctrl the files could indeed always exist but resctrl can use the kernfs_show()
API to show/hide them as needed. Similar to resctrl_bmec_files_show() that you created.
Allowing/removing access becomes complicated because user space can always do a chmod
to change permissions that resctrl would need to handle.

I do not know if there are sharp corners here when thinking about strange scenarios where
user opens a file before resctrl changes visibility or permissions and then user space
interacts with the file. This may be worthwhile to test no matter which mechanism is used.

>>> Files and behavior:
>>> - cpus / cpus_list:
>>>
>>> CPUs listed here use the same allocation for both user and kernel space.
>>
>> Both user and kernel space?
> 
> As it stands today, the CPU list is written to MSR_PQR_ASSOC, resulting in the same allocation for both user and kernel within a given CLOS.
> 
> Kernel-mode allocation changes only if specific CPUs are included in the kmode_cpus list.

ack.

>>> There is no change to the current semantics of these files.
>>> If these files are empty, the group effectively becomes a PLZA-dedicated group.
>>
>> I do not see it this way. If the cpu/cpus_list files are empty then it means that the
>> tasks in the group will use their own CLOSID/RMID for user space allocation and
>> monitoring. What allocations/monitoring is used by tasks when in kernel mode depends
>> on whether the CPU the task is running on can be found in a kernel_mode_cpus/kernel_mode_cpuslist
>> file. If the CPU the task is running on can be found in a kernel_mode_cpus/kernel_mode_cpuslist
>> file then it will inherit whatever the PQR_PLZA setting of that CPU which is the allocation
>> associated with the resource group to which that kernel_mode_cpus/kernel_mode_cpuslist belongs.
>> If the CPU the task is running on cannot be found in kernel_mode_cpus/kernel_mode_cpuslist
>> then its kernel work will inherit its user space allocations and monitoring.
>>
> 
> Yes. that is correct. I think our understanding is correct, but our implementation ideas are different it seems.

While we have been sharing different ideas I have tried to be clear on *why* I made
certain choices and attempted to provide specific feedback to your ideas. If you find
your plan to be better then please respond to my feedback about it to help me understand
why that may be the better solution. If you find your solution is better then could you please
describe it in detail? At this time I do not have a clear understanding of what you propose.

...
> 
> Let me make sure I understand what you mentioned earlier. Copied the text below from the thread for the context:
> 
> https://lore.kernel.org/lkml/3305c18e-9e50-4df0-b9f1-c61028628967@intel.com/
> =====================================================================
> 
> Please consider the intent of this file when thinking about names. The idea is that "info/kernel_mode"
> specifies the "mode" of how kernel work is handled and it determines the configuration files used in that
> mode as well as the syntax when interacting with those files. By renaming "kernel_mode_assignment" to
> "kmode_groups" it implicitly requires all future kernel mode enhancements to need some data related to "groups".
> 
> In summary, I think this can be simplified by introducing just two new files in info/ that enables the
> user to (a) select and (b) configure the "kernel mode". To start there can be just two modes,
> global_assign_ctrl_inherit_mon_per_cpu and global_assign_ctrl_assign_mon_per_cpu.
> global_assign_ctrl_inherit_mon_per_cpu mode requires a control group in kernel_mode_assignment while
> global_assign_ctrl_assign_mon_per_cpu requires a control and monitoring group.
> 
> The resource group in info/kernel_mode_assignment gets two additional files "kernel_mode_cpus" and
> "kernel_mode_cpus_list" that contains the CPUs enabled with the kernel mode configuration, by default
> it will be all online CPUs. The resource group can continue to be used to manage allocations of and
> monitor user space tasks. Specifically, the "cpus", "cpus_list", and "tasks" files remain.
> 
> A user wanting just "global" settings will get just that when writing the group to
> info/kernel_mode_assignment. A user wanting "per CPU" settings can follow the
> info/kernel_mode_assignment setting with changes to that resource group's kernel_mode_cpus/kernel_mode_cpus_list
> files. Any task running on a CPU that is *not* in kernel_mode_cpus/kernel_mode_cpus_list can be
> expected to inherit both CLOSID and RMID from user space for all kernel work.
> 
> ======================================================================
> 
> Let me try to get few clarification on things here.
> 
> # cat info/kernel_mode
>   [inherit_ctrl_and_mon]
>   global_assign_ctrl_inherit_mon_per_cpu
>   global_assign_ctrl_assign_mon_per_cpu
> 
> My understanding of "inherit_ctrl_and_mon" is that the kernel
> inherits both the CLOS and the RMID from user space. Basically both
> user and kernel uses same CLOSID and RMID. This reflects the current
> behavior (without PLZA) correct? This would correspond to the

Correct.

> default group when resctrl is mounted.

> 
> The modes "global_assign_ctrl_inherit_mon_per_cpu" and "global_assign_ctrl_assign_mon_per_cpu" represent the actual PLZA modes.
> 
> Both of these modes introduce new files kernel_mode_cpus/ and kernel_mode_cpus_list in the resctrl group.

Right. To be specific when the user changes the mode to either "global_assign_ctrl_inherit_mon_per_cpu" or
"global_assign_ctrl_assign_mon_per_cpu" the new files will be created in the default resource group with
associated setting applied globally at that time.

> 
> When the user echoes a group name into info/kernel_mode_assignment, PLZA is applied globally across all CPUs. This is default behavior.
> 
> If the user wants PLZA to apply only to a specific subset of CPUs, then the kernel_mode_cpus or kernel_mode_cpus_list files need to be updated accordingly.
> 
> global_assign_ctrl_inherit_mon_per_cpu : The group needs to be CTLR_MON group. This mode uses rmid_en=0 when writing PLZA MSR.
> 
> global_assign_ctrl_assign_mon_per_cpu: The group needs to be CTLR_MON/MON group. This mode uses rmid_en=1 when writing PLZA MSR.
> 
> Did I get it right?

This is my understanding also, yes.
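
The per-CPU behavior agreed on above can be summarized in a short Python
model. It is a sketch of the semantics as described in this thread, not of
any resctrl implementation: the Ids type, the kernel_ids() helper and its
parameters are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ids:
    """A CLOSID/RMID pair (hypothetical helper type for this sketch)."""
    closid: int
    rmid: int


def kernel_ids(cpu, user, group, kernel_mode_cpus, rmid_en):
    """Return the IDs used for kernel work of a task running on `cpu`.

    user:             IDs the task uses in user space
    group:            IDs of the group named in info/kernel_mode_assignment
    kernel_mode_cpus: set of CPUs listed in that group's kernel_mode_cpus
    rmid_en:          0 for ..._inherit_mon_per_cpu, 1 for ..._assign_mon_per_cpu
    """
    if cpu not in kernel_mode_cpus:
        # CPU not covered: kernel work inherits both CLOSID and RMID.
        return user
    # CPU covered: the CLOSID always comes from the assigned group, the RMID
    # only when rmid_en is set.
    return Ids(group.closid, group.rmid if rmid_en else user.rmid)
```

For example, with user = Ids(1, 5), group = Ids(3, 9) and kernel_mode_cpus = {2, 3},
kernel work on CPU 0 keeps Ids(1, 5), while on CPU 2 it uses Ids(3, 5) with
rmid_en=0 and Ids(3, 9) with rmid_en=1.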

Reinette



* Re: [PATCH] doc: watchdog: fix typos etc.
From: Randy Dunlap @ 2026-04-08 21:28 UTC (permalink / raw)
  To: Björn Persson
  Cc: Andrew Morton, Jonathan Corbet, Shuah Khan, linux-doc,
	linux-kernel
In-Reply-To: <20260408205611.0f7e38de@tag.xn--rombobjrn-67a.se>



On 4/8/26 11:56 AM, Björn Persson wrote:
> Randy Dunlap wrote:
>> -Similarly to the softlockup case, the current stack trace is displayed
>> +Similar to the softlockup case, the current stack trace is displayed
> 
> "Similarly" modifies "is displayed", so the adverbial form is correct.
> 
>> -The core of the detectors in a hrtimer. It servers multiple purpose:
>> +The core of the detectors is an hrtimer. It servers multiple purposes:
> 
> And "servers" should be "serves".

Thank you.

Andrew, I'll send a v2 patch.

-- 
~Randy



* [PATCH v2] doc: watchdog: fix typos etc.
From: Randy Dunlap @ 2026-04-08 21:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: Randy Dunlap, Andrew Morton, Jonathan Corbet, Shuah Khan,
	linux-doc, Björn Persson

Correct typos in lockup-watchdogs.rst.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
---
v2: corrections from Björn (Thanks)

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: Björn Persson <Bjorn@xn--rombobjrn-67a.se>

 Documentation/admin-guide/lockup-watchdogs.rst |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-next-20260406.orig/Documentation/admin-guide/lockup-watchdogs.rst
+++ linux-next-20260406/Documentation/admin-guide/lockup-watchdogs.rst
@@ -41,7 +41,7 @@ is a trade-off between fast response to
 Implementation
 ==============
 
-The soft and hard lockup detectors are built around a hrtimer.
+The soft and hard lockup detectors are built around an hrtimer.
 In addition, the softlockup detector regularly schedules a job, and
 the hard lockup detector might use Perf/NMI events on architectures
 that support it.
@@ -49,7 +49,7 @@ that support it.
 Frequency and Heartbeats
 ------------------------
 
-The core of the detectors in a hrtimer. It servers multiple purpose:
+The core of the detectors is an hrtimer. It serves multiple purposes:
 
 - schedules watchdog job for the softlockup detector
 - bumps the interrupt counter for hardlockup detectors (heartbeat)


* Re: allowing '-' instead of ':' in kernel-doc descriptions
From: Randy Dunlap @ 2026-04-08 22:44 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Jonathan Corbet, Linux Documentation
In-Reply-To: <dskdc44um6l6sw43uazfpzmsv4tkesog7sro22qkvzxyflvurt@pwhb3rs44ga7>

Hi,
[modified Subject & recipients]

On 11/13/25 2:32 AM, Mauro Carvalho Chehab wrote:
> On Thu, Nov 13, 2025 at 03:49:27AM -0500, Michael S. Tsirkin wrote:
>> On Thu, Nov 13, 2025 at 12:55:37PM +1100, Stephen Rothwell wrote:
>>> Hi all,
>>>
>>> Today's linux-next build (htmldocs) produced these warnings:
>>>
>>> WARNING: /home/sfr/kernels/next/next/include/linux/virtio_config.h:174 duplicate section name 'Return'
>>> WARNING: /home/sfr/kernels/next/next/include/linux/virtio_config.h:184 duplicate section name 'Return'
>>> WARNING: /home/sfr/kernels/next/next/include/linux/virtio_config.h:190 duplicate section name 'Return'
>>>
>>> Introduced by commit
>>>
>>>   bee8c7c24b73 ("virtio: introduce map ops in virtio core")
>>>
>>> but is probably a bug in our scripts as those lines above have "Returns:"
>>> in them, not "Return:".
>>>
>>> These have turned up now since a bug was fixed that was repressing a
>>> lot of warnings.
>>
>> Indeed. But the rest of header says Returns ... without : so I will just
>> fix this one to do the same. I also fixed other issues in the comments
>> in this header while I was at it. Will post shortly.
> 
> That's the best approach. We could instead change the new section detection
> regex to accept just one space at most:
> 
>     diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
>     index f7dbb0868367..bab0ec3abe31 100644
>     --- a/scripts/lib/kdoc/kdoc_parser.py
>     +++ b/scripts/lib/kdoc/kdoc_parser.py
>     @@ -46,7 +46,7 @@ doc_decl = doc_com + KernRe(r'(\w+)', cache=False)
>      known_section_names = 'description|context|returns?|notes?|examples?'
>      known_sections = KernRe(known_section_names, flags = re.I)
>      doc_sect = doc_com + \
>     -    KernRe(r'\s*(@[.\w]+|@\.\.\.|' + known_section_names + r')\s*:([^:].*)?$',
>     +    KernRe(r'\s?(@[.\w]+|@\.\.\.|' + known_section_names + r')\s*:([^:].*)?$',
>                 flags=re.I, cache=False)
>  
>      doc_content = doc_com_body + KernRe(r'(.*)', cache=False)
> 
> (patch not tested)
> 
> But, if we do so, someone has to check if this won't cause regressions
> elsewhere. I'm almost sure a change like that will break something...

Following up:

I've been testing this patch for about 3 months now.
The only problems that I have seen with it are these:
(in linux-next-20260408)


WARNING: ../drivers/pci/msi/api.c:102 duplicate section name 'Return'
WARNING: ../mm/damon/core.c:1472 duplicate section name 'Return'
WARNING: ../mm/damon/core.c:1472 duplicate section name 'Return'
WARNING: ../include/uapi/drm/i915_drm.h:2403 duplicate section name 'Return'
WARNING: ../include/uapi/drm/i915_drm.h:2403 duplicate section name 'Return'
WARNING: ../include/uapi/drm/i915_drm.h:2403 duplicate section name 'Return'
WARNING: ../drivers/gpu/drm/drm_atomic_helper.c:3546 duplicate section name 'Return'
WARNING: ../drivers/gpu/drm/drm_atomic_helper.c:3710 duplicate section name 'Return'
WARNING: ../drivers/gpu/drm/drm_of.c:382 duplicate section name 'Return'
WARNING: ../drivers/gpu/drm/drm_of.c:432 duplicate section name 'Return'
WARNING: ../drivers/gpu/drm/drm_gem.c:900 duplicate section name 'Return'
WARNING: ../include/linux/w1.h:115 duplicate section name 'Return'
WARNING: ../include/linux/w1.h:115 duplicate section name 'Return'


-- 
~Randy



* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Moger, Babu @ 2026-04-08 23:07 UTC (permalink / raw)
  To: Reinette Chatre, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
	Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
  Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
	akpm@linux-foundation.org, pmladek@suse.com,
	rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
	kees@kernel.org, elver@google.com, paulmck@kernel.org,
	lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
	seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
	xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
	Lendacky, Thomas, elena.reshetova@intel.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-coco@lists.linux.dev, kvm@vger.kernel.org,
	eranian@google.com, peternewman@google.com
In-Reply-To: <72297351-2954-4318-81b6-7de409e5552c@intel.com>

Hi Reinette,

On 4/8/2026 4:24 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/8/26 1:45 PM, Babu Moger wrote:
>> On 4/7/26 23:45, Reinette Chatre wrote:
>>> On 4/7/26 6:01 PM, Babu Moger wrote:
> 
>>>> That said, I’m open to not having a dedicated group if we can still support all the features that PLZA provides without it.
>>>
>>> I find that enabling user space to share CLOSID/RMID between user space
>>> and kernel space to indeed support what PLZA provides. I think I am missing
>>> something here since below proposal again attempts to isolate a resource group
>>> (CLOSID) for kernel work.
>>
>> No. I dont want to isolate a group just for PLZA. All I am saying
>> is, we should provide option to create a dedicated group if the user
>> wants to do it.
> I agree. I do not see resctrl needing to do anything to accomplish this though. If
> the user wants a group dedicated to kernel mode/PLZA then all that is needed is for the
> user not to assign any tasks to this group, either via changes to the group's tasks file
> or via the group's cpus/cpus_list files.
> 
>>>>
>>>> The mode can simply be determined on a per-group basis. We can
>>>> introduce two new files—kernel_mode_cpus and
>>>> kernel_mode_cpus_list—within each resctrl group when kmode (or
>>>> PLZA) is supported.
>>>
>>> I think having these files in every resource group is confusing since user can only interact
>>> with these files in one resource group for current PLZA. Why not *just* have the files in the
>>> resource group that matches the group in info/kernel_mode_assignment?
>>
>> The default group can also serve as the PLZA group.
>>
>> #cat info/kernel_mode_assignment
>> //
>>
>> At this point, the (kmode_cpus / kmode_cpus_list) files will exist in the default group:
>>
>> Then user changes the PLZA group to "test".
>>
>> #echo "test//" > info/kernel_mode_assignment
>>
>> At this point, we expect the files "(kmode_cpus/kmode_cpus_list)" to be visible in "test//" group.
>>
>> One open question is whether we should remove the visibility of these files from the default group. It’s unclear if we can safely do this dynamically.
>>
>> An alternative approach would be to always keep the files present, but allow access to them only for groups that are listed in "info/kernel_mode_assignment".
> 
> The files appearing/disappearing is just how the user experiences the resctrl fs interface.
> Within resctrl the files could indeed always exist but resctrl can use the kernfs_show()
> API to show/hide them as needed. Similar to resctrl_bmec_files_show() that you created.
> Allowing/removing access becomes complicated because user space can always do a chmod
> to change permissions that resctrl would need to handle.
> 
> I do not know if there are sharp corners here when thinking about strange scenarios where
> user opens a file before resctrl changes visibility or permissions and then user space
> interacts with the file. This may be worthwhile to test to matter which mechanism is used.
> 
>>>> Files and behavior:
>>>> - cpus / cpus_list:
>>>>
>>>> CPUs listed here use the same allocation for both user and kernel space.
>>>
>>> Both user and kernel space?
>>
>> As it stands today, the CPU list is written to MSR_PQR_ASSOC, resulting in the same allocation for both user and kernel within a given CLOS.
>>
>> Kernel-mode allocation changes only if specific CPUs are included in the kmode_cpus list.
> 
> ack.
> 
>>>> There is no change to the current semantics of these files.
>>>> If these files are empty, the group effectively becomes a PLZA-dedicated group.
>>>
>>> I do not see it this way. If the cpu/cpus_list files are empty then it means that the
>>> tasks in the group will use their own CLOSID/RMID for user space allocation and
>>> monitoring. What allocations/monitoring is used by tasks when in kernel mode depends
>>> on whether the CPU the task is running on can be found in a kernel_mode_cpus/kernel_mode_cpuslist
>>> file. If the CPU the task is running on can be found in a kernel_mode_cpus/kernel_mode_cpuslist
>>> file then it will inherit whatever the PQR_PLZA setting of that CPU which is the allocation
>>> associated with the resource group to which that kernel_mode_cpus/kernel_mode_cpuslist belongs.
>>> If the CPU the task is running on cannot be found in kernel_mode_cpus/kernel_mode_cpuslist
>>> then its kernel work will inherit its user space allocations and monitoring.
>>>
>>
>> Yes. that is correct. I think our understanding is correct, but our implementation ideas are different it seems.
> 
> While we have been sharing different ideas I have tried to be clear on *why* I made
> certain choices and attempted to provide specific feedback to your ideas. If you find
> your plan to be better then please respond to my feedback about it to help me understand
> why that may be the better solution. If you find your solution is better then could you please
> describe it with detail? At this time I do not have a clear understanding of what you propose.
> 
> ...
>>
>> Let me make sure I understand what you mentioned earlier. Copied the text below from the thread for the context:
>>
>> https://lore.kernel.org/lkml/3305c18e-9e50-4df0-b9f1-c61028628967@intel.com/
>> =====================================================================
>>
>> Please consider the intent of this file when thinking about names. The idea is that "info/kernel_mode"
>> specifies the "mode" of how kernel work is handled and it determines the configuration files used in that
>> mode as well as the syntax when interacting with those files. By renaming "kernel_mode_assignment" to
>> "kmode_groups" it implicitly requires all future kernel mode enhancements to need some data related to "groups".
>>
>> In summary, I think this can be simplified by introducing just two new files in info/ that enables the
>> user to (a) select and (b) configure the "kernel mode". To start there can be just two modes,
>> global_assign_ctrl_inherit_mon_per_cpu and global_assign_ctrl_assign_mon_per_cpu.
>> global_assign_ctrl_inherit_mon_per_cpu mode requires a control group in kernel_mode_assignment while
>> global_assign_ctrl_assign_mon_per_cpu requires a control and monitoring group.
>>
>> The resource group in info/kernel_mode_assignment gets two additional files "kernel_mode_cpus" and
>> "kernel_mode_cpus_list" that contains the CPUs enabled with the kernel mode configuration, by default
>> it will be all online CPUs. The resource group can continue to be used to manage allocations of and
>> monitor user space tasks. Specifically, the "cpus", "cpus_list", and "tasks" files remain.
>>
>> A user wanting just "global" settings will get just that when writing the group to
>> info/kernel_mode_assignment. A user wanting "per CPU" settings can follow the
>> info/kernel_mode_assignment setting with changes to that resource group's kernel_mode_cpus/kernel_mode_cpus_list
>> files. Any task running on a CPU that is *not* in kernel_mode_cpus/kernel_mode_cpus_list can be
>> expected to inherit both CLOSID and RMID from user space for all kernel work.
>>
>> ======================================================================
>>
>> Let me try to get few clarification on things here.
>>
>> # cat info/kernel_mode
>>    [inherit_ctrl_and_mon]
>>    global_assign_ctrl_inherit_mon_per_cpu
>>    global_assign_ctrl_assign_mon_per_cpu
>>
>> My understanding of "inherit_ctrl_and_mon" is that the kernel
>> inherits both the CLOS and the RMID from user space. Basically both
>> user and kernel uses same CLOSID and RMID. This reflects the current
>> behavior (without PLZA) correct? This would correspond to the
> 
> Correct.
> 
>> default group when resctrl is mounted.
> 
>>
>> The modes "global_assign_ctrl_inherit_mon_per_cpu" and "global_assign_ctrl_assign_mon_per_cpu" represent the actual PLZA modes.
>>
>> Both of these modes introduce new files kernel_mode_cpus/ and kernel_mode_cpus_list in the resctrl group.
> 
> Right. To be specific when the user changes the mode to either "global_assign_ctrl_inherit_mon_per_cpu" or
> "global_assign_ctrl_assign_mon_per_cpu" the new files will be created in the default resource group with
> associated setting applied globally at that time.

That holds if, at that point, "info/kernel_mode_assignment" points to //
(the default group). Is that correct?

And if "info/kernel_mode_assignment" points to a different group (for
example, test//), then the kernel_mode_cpus and kernel_mode_cpus_list
files will be created only under the test// group. Is that correct?

Thanks
Babu



* Re: [PATCH v8 0/2] PCI: s390: Expose the UID as an arch specific PCI slot attribute
From: Vasily Gorbik @ 2026-04-08 23:12 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: Bjorn Helgaas, Jonathan Corbet, Lukas Wunner, Shuah Khan,
	Farhan Ali, Alexander Gordeev, Christian Borntraeger,
	Gerald Schaefer, Gerd Bayer, Heiko Carstens, Julian Ruess,
	Matthew Rosato, Peter Oberparleiter, Ramesh Errabolu,
	Sven Schnelle, linux-doc, linux-kernel, linux-pci, linux-s390,
	Randy Dunlap
In-Reply-To: <20260407-uid_slot-v8-0-15ae4409d2ce@linux.ibm.com>

On Tue, Apr 07, 2026 at 03:24:44PM +0200, Niklas Schnelle wrote:
> Add a mechanism for architecture specific attributes on
> PCI slots in order to add the user-defined ID (UID) as an s390 specific
> PCI slot attribute. First though improve some issues with the s390 specific
> documentation of PCI sysfs attributes noticed during development. 

> Niklas Schnelle (2):
>       docs: s390/pci: Improve and update PCI documentation
>       PCI: s390: Expose the UID as an arch specific PCI slot attribute
> 
>  Documentation/arch/s390/pci.rst | 151 +++++++++++++++++++++++++++-------------
>  arch/s390/include/asm/pci.h     |   4 ++
>  arch/s390/pci/pci_sysfs.c       |  20 ++++++
>  drivers/pci/slot.c              |  13 +++-
>  4 files changed, 140 insertions(+), 48 deletions(-)

Applied to s390 tree, thank you!


* Re: [PATCH v10 12/21] gpu: nova-core: mm: Add unified page table entry wrapper enums
From: John Hubbard @ 2026-04-08 23:13 UTC (permalink / raw)
  To: Joel Fernandes, Eliot Courtney, linux-kernel
  Cc: Miguel Ojeda, Boqun Feng, Gary Guo, Bjorn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Dave Airlie, Daniel Almeida, Koen Koning, dri-devel,
	rust-for-linux, Nikola Djukic, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
	Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
	Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
	Alex Gaynor, Boqun Feng, Alistair Popple, Timur Tabi, Edwin Peer,
	Alexandre Courbot, Andrea Righi, Andy Ritger, Zhi Wang,
	Balbir Singh, Philipp Stanner, Elle Rhumsaa, alexeyi, joel,
	linux-doc, amd-gfx, intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <da8d03f8-0294-417b-b684-2c20d577f94a@nvidia.com>

On 4/8/26 9:58 AM, Joel Fernandes wrote:
> On 4/8/2026 9:26 AM, Eliot Courtney wrote:
>> On Tue Apr 7, 2026 at 10:59 PM JST, Joel Fernandes wrote:
>>> On 4/7/2026 9:42 AM, Eliot Courtney wrote:
>>>> On Tue Apr 7, 2026 at 6:55 AM JST, Joel Fernandes wrote:
...
>> [1]: https://github.com/Edgeworth/linux/commits/review/nova-mm-v10/
> First, thanks for the effort. I looked through this, it's pretty much what I
> had before when I used traits. I don't think it is better to be honest. In
> fact your version is worse, it adds many new types and things like the
> following which I did not need before.

Hi Joel and all,

I also looked through Eliot's above attempt carefully, and actually
liked it a lot (sorry! haha):

* It cleans up the code. The initial working version was readable, but
  also had lots of noise on the screen: match statements and pairs of
  v2/v3 statements.

  And interestingly, the mmu_version was, in effect, sporadically
  implementing a Trait-based approach. But because it is custom,
  readers don't benefit as much as they would with Traits, which
  tell you immediately how things are structured.

Joel, I am passionately in agreement with your principles: code must
be readable on the screen.

In this case, though, Traits make the code considerably more readable,
especially if one makes the very reasonable assumption that readers are
thoroughly accustomed to dealing with Rust traits.

> 
> To put it mildly, the following suggestion should not be anywhere near my code:
> 

lol I understand, believe me. But this is short and not too bad, really.

> /// Type-erased MMU-specific [`Vmm`] implementations.

Type erasure remains a semi-exotic thing, IMHO. As such, another
sentence to elaborate on this would be a nice touch.

> enum VmmInner {
>     /// `Vmm` implementation for MMU v2.
>     V2(VmmImpl<MmuV2>),
>     /// `Vmm` implementation for MMU v3.
>     V3(VmmImpl<MmuV3>),
> }
> 
> /// MMU-specific [`Vmm`] implementation.
> struct VmmImpl<M: Mmu> {
> 
> Seriously, I have to pass on this. :-)
> 
> And, you unfortunately seem to have ignored my point about requiring 4 NEW
> traits (Mmu, PteOps, PdeOps, DualPdeOps etc), which I did not need before.
> So you're making the code much much worse than before actually. We don't
> add new traits and types pointlessly.

They are not pointless.

However! What I think would be nice is: do a new v11 with approximately
this approach, and then we can beat it into being as readable as 
possible.
 

thanks,
-- 
John Hubbard


^ permalink raw reply

* Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Reinette Chatre @ 2026-04-08 23:41 UTC (permalink / raw)
  To: Moger, Babu, Babu Moger, corbet@lwn.net, tony.luck@intel.com,
	Dave.Martin@arm.com, james.morse@arm.com, tglx@kernel.org,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
  Cc: skhan@linuxfoundation.org, x86@kernel.org, hpa@zytor.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, kas@kernel.org, rick.p.edgecombe@intel.com,
	akpm@linux-foundation.org, pmladek@suse.com,
	rdunlap@infradead.org, dapeng1.mi@linux.intel.com,
	kees@kernel.org, elver@google.com, paulmck@kernel.org,
	lirongqing@baidu.com, safinaskar@gmail.com, fvdl@google.com,
	seanjc@google.com, pawan.kumar.gupta@linux.intel.com,
	xin@zytor.com, tiala@microsoft.com, chang.seok.bae@intel.com,
	Lendacky, Thomas, elena.reshetova@intel.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-coco@lists.linux.dev, kvm@vger.kernel.org,
	eranian@google.com, peternewman@google.com
In-Reply-To: <20aaacfb-9601-4343-a5d5-f3df6152155b@amd.com>

Hi Babu,

On 4/8/26 4:07 PM, Moger, Babu wrote:
> On 4/8/2026 4:24 PM, Reinette Chatre wrote:
>> On 4/8/26 1:45 PM, Babu Moger wrote:
...

>>> The modes "global_assign_ctrl_inherit_mon_per_cpu" and "global_assign_ctrl_assign_mon_per_cpu" represent the actual PLZA modes.
>>>
>>> Both of these modes introduce new files kernel_mode_cpus/ and kernel_mode_cpus_list in the resctrl group.
>>
>> Right. To be specific when the user changes the mode to either "global_assign_ctrl_inherit_mon_per_cpu" or
>> "global_assign_ctrl_assign_mon_per_cpu" the new files will be created in the default resource group with
>> associated setting applied globally at that time.
> 
> If, at that point, "info/kernel_mode_assignment" points to // (the default group), is that correct?

I see "info/kernel_mode_assignment" pointing to default group as the only
option right after a mode switch away from "inherit_ctrl_and_mon".

To elaborate, the current idea is that the mode within info/kernel_mode determines
which, if any, control files are presented to user space.
Assuming that the system boots up with:
	# cat info/kernel_mode
	[inherit_ctrl_and_mon]
	global_assign_ctrl_inherit_mon_per_cpu
	global_assign_ctrl_assign_mon_per_cpu

In above scenario "info/kernel_mode_assignment" does not exist (is not visible to
user space).

When the user switches to either "global_assign_ctrl_inherit_mon_per_cpu" or
"global_assign_ctrl_assign_mon_per_cpu" then "info/kernel_mode_assignment" is created
(or made visible to user space) and is expected to point to default group.
User can change the group using "info/kernel_mode_assignment" at this point.

If the current scenario is below ...
	# cat info/kernel_mode
	[global_assign_ctrl_inherit_mon_per_cpu]
	inherit_ctrl_and_mon
	global_assign_ctrl_assign_mon_per_cpu

... then "info/kernel_mode_assignment" will exist but what it should contain if
user switches mode at this point may be up for discussion.

option 1)
When user switches mode to "global_assign_ctrl_assign_mon_per_cpu" then
the resource group in "info/kernel_mode_assignment" is reset to the
default group and all CPUs' PLZA state is reset to match. The kernel_mode_cpus
and kernel_mode_cpuslist files become visible in default resource group
and they contain "all online CPUs".

option 2)
When user switches mode to "global_assign_ctrl_assign_mon_per_cpu" then
the resource group in "info/kernel_mode_assignment" is kept and all
CPUs' PLZA state is set to match it while also keeping the current
values of that resource group's kernel_mode_cpus and kernel_mode_cpuslist
files.

I am leaning towards "option 1" to keep it consistent with a switch from
"inherit_ctrl_and_mon" and being deterministic about how a mode is started with
a clean slate. What are your thoughts? What would be the use case where a user would
want to switch between "global_assign_ctrl_inherit_mon_per_cpu" and
"global_assign_ctrl_assign_mon_per_cpu" to just switch rmid_en on and off?
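To make the two options concrete, here is a minimal Python model of the
behavior described above. It is only a sketch of the proposal as discussed in
this thread, not resctrl code: the class, its method names, and the rule that a
user write to "info/kernel_mode_assignment" also supplies the CPU set are
assumptions made for illustration.

```python
DEFAULT_GROUP = "/"
INHERIT = "inherit_ctrl_and_mon"
PLZA_MODES = {
    "global_assign_ctrl_inherit_mon_per_cpu",
    "global_assign_ctrl_assign_mon_per_cpu",
}


class KernelModeModel:
    """Toy model of info/kernel_mode and info/kernel_mode_assignment."""

    def __init__(self, online_cpus):
        self.online_cpus = frozenset(online_cpus)
        self.mode = INHERIT
        # None models "file does not exist / is not visible to user space".
        self.assignment = None
        self.cpus = None

    def assign(self, group, cpus):
        """Model a user write that moves the assignment to another group."""
        if self.mode == INHERIT:
            raise RuntimeError("kernel_mode_assignment is not visible")
        self.assignment = group
        self.cpus = frozenset(cpus)

    def switch_mode(self, new_mode, option=1):
        """Switch mode; `option` selects option 1 (reset) or option 2 (keep)."""
        if new_mode == INHERIT:
            self.assignment = None
            self.cpus = None
        elif new_mode in PLZA_MODES:
            # Leaving inherit_ctrl_and_mon always starts from the default
            # group; option 1 applies the same reset between PLZA modes.
            if self.mode == INHERIT or option == 1:
                self.assignment = DEFAULT_GROUP
                self.cpus = self.online_cpus
            # Option 2: keep the current group and its CPU list unchanged.
        else:
            raise ValueError(f"unknown mode: {new_mode}")
        self.mode = new_mode
```

With option 1, every entry into a PLZA mode looks the same regardless of the
previous mode; with option 2, the result depends on what the user configured
before the switch.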


> And if "info/kernel_mode_assignment" points to a different group
> (for example, test//), then the kernel_mode_cpus/ and
> kernel_mode_cpus_list files will be created only under the test//
> group. Is that correct?

I expect that if "info/kernel_mode_assignment" exists then the group
listed within contains kernel_mode_cpus and kernel_mode_cpuslist.
How the group ends up in "info/kernel_mode_assignment" could result
from mode change or from write by user space.

Reinette


^ permalink raw reply

* Re: [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio
From: SeongJae Park @ 2026-04-09  0:00 UTC (permalink / raw)
  To: Bijan Tabatabai
  Cc: SeongJae Park, Liam R. Howlett, Andrew Morton, Brendan Higgins,
	David Gow, David Hildenbrand, Jonathan Corbet, Lorenzo Stoakes,
	Michal Hocko, Mike Rapoport, Shuah Khan, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, damon, kunit-dev, linux-doc,
	linux-kernel, linux-kselftest, linux-mm
In-Reply-To: <20260408165001.8473-1-bijan311@gmail.com>

On Wed,  8 Apr 2026 11:48:27 -0500 Bijan Tabatabai <bijan311@gmail.com> wrote:

> On Mon,  6 Apr 2026 18:05:22 -0700 SeongJae Park <sj@kernel.org> wrote:
> 
> Hi SJ,
> 
> > TL; DR: Let users set different DAMOS quota charge ratios for DAMOS
> > action failed regions, for deterministic and consistent DAMOS action
> > progress.
> > 
> > Common Reports: Unexpectedly Slow DAMOS
> > =======================================
> > 
> > One common issue report that we get from DAMON users is that DAMOS
> > action applying progress speed is sometimes much slower than expected.
> > And one common root cause is that the DAMOS quota is exceeded by the
> > action applying failed memory regions.
> > 
> > For example, a group of users tried to run DAMOS-based proactive memory
> > reclamation (DAMON_RECLAIM) with 100 MiB per second DAMOS quota.  They
> > ran it on a system having no active workload which means all memory of
> > the system is cold.  The expectation was that the system will show 100
> > MiB per second reclamation until (nearly) all memory is reclaimed. But
> > what they found is that the speed is quite inconsistent and sometimes it
> > becomes much slower than expected, sometimes even no reclamation
> > at all for about tens of seconds.  The upper limit of the speed (100 MiB
> > per second) was being kept as expected, though.
> > 
> > By monitoring the qt_exceeds (number of DAMOS quota exceed events) DAMOS
> > stat, we found DAMOS quota is always exceeded when the speed is slow. By
> > monitoring sz_tried and sz_applied (the total amount of DAMOS action
> > tried memory and succeeded memory) DAMOS stats together, we found the
> > reclamation attempts nearly always failed when the speed is slow.
> > 
> > DAMOS quota charges DAMOS action tried regions regardless of the
> > successfulness of the try.  Hence in the example reported case, there
> > was unreclaimable memory spread around the system memory.  Sometimes
> > nearly 100 MiB of memory that DAMOS tried to reclaim in the given quota
> > interval was reclaimable, and therefore showed nearly 100 MiB per second
> > speed.  Sometimes nearly 99 MiB of memory that DAMOS was trying to
> > reclaim in the given quota interval was unreclaimable, and therefore
> > showing only about 1 MiB per second reclaim speed.
> > 
> > We explained it is an expected behavior of the feature rather than a
> > bug, as DAMOS quota is there for only the upper-limit of the speed.  The
> > users agreed and later reported a huge win from the adoption of
> > DAMON_RECLAIM on their products.
> 
> Thanks for this series. This is a problem I have come across and am looking
> forward to seeing this land.

Thank you for acknowledging.  I'm hoping this will land in 7.2-rc1.

[...]
> > DAMOS Action Failed Region Quota Charge Ratio
> > =============================================
> > 
> > Let users set the charge ratio for the action-failed memory, for more
> > optimal and deterministic use of DAMOS.  It allows users to specify the
> > numerator and the denominator of the ratio for flexible setup.  For
> > example, let's suppose the numerator and the denominator are set to 1
> > and 4,096, respectively.  The ratio is 1 / 4,096.  A DAMOS scheme action
> > is applied to 5 GiB memory.  For 1 GiB of the memory, the action is
> > succeeded.  For the rest (4 GiB), the action is failed.  Then, only 1
> > GiB and 1 MiB quota is charged.
> > 
> > The optimal charge ratio will depend on the use case and
> > system/workload.  I'd recommend starting from setting the numerator as 1
> > and the denominator as PAGE_SIZE and tune based on the results, because
> > many DAMOS actions are applied at page level.
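The arithmetic in the quoted example can be checked with a few lines of Python.
This is only a sketch of the calculation as described in the cover letter; the
function name and the use of integer (floor) division are assumptions, and the
actual kernel code may round differently.

```python
GIB = 1 << 30
MIB = 1 << 20


def charged_bytes(sz_applied, sz_failed, numerator, denominator):
    """Charge succeeded bytes in full and failed bytes at the given ratio."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    # Assumption: the failed part is scaled with floor division.
    return sz_applied + sz_failed * numerator // denominator


# 5 GiB tried: 1 GiB succeeded, 4 GiB failed, ratio 1 / 4,096.
charge = charged_bytes(1 * GIB, 4 * GIB, 1, 4096)
print(charge == 1 * GIB + 1 * MIB)
```

With a ratio of 1 / 1 the function charges the full 5 GiB, which corresponds to
the existing behavior of charging every tried region.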
> 
> This makes sense, but the quota is also considered when setting the minimum
> allowable score in damos_adjust_quota(), which, to my understanding, assumes
> that the action will be applied to all of a region's data. If an action fails for
> a significant amount of the memory, a lower score than what was calculated in
> damos_adjust_quota() could be valid. If that's the case, the scheme would be
> applied to fewer regions than strictly necessary.

Good point, you are right.

> 
> As you mention above, this is not a correctness issue because the quota only
> guarantees an upper limit on the amount of data the scheme is applied to.

I agree.

> Additionally, it may very well be true that what I listed above would not be
> very noticeable in practice.

I guess it is hopefully true, for the following reason.

The score for each region is calculated as a weighted sum of the access
frequency and the age of the region.  To avoid repeatedly applying a DAMOS
action to only a few regions, we reset the age of a region after a DAMOS action
is applied to it, regardless of whether the action failed.  So the score of
regions containing memory that the action cannot be applied to will
periodically drop, and should have no big impact on the minimum score threshold
calculation.

But real data could say something different.  I will be happy to be proven
wrong by real data. :)

> I just thought this was worth pointing out as
> something to think about.

Indeed.  Thank you for pointing this out.  Nonetheless, this is not a new issue
introduced by this patch series.  And the impact is not clear at the moment.  I
will be happy to revisit this in parallel to this patch series.


Thanks,
SJ

[...]

^ permalink raw reply

* Re: [PATCH] crash: Support high memory reservation for range syntax
From: Youling Tang @ 2026-04-09  1:55 UTC (permalink / raw)
  To: Baoquan He, Sourabh Jain
  Cc: Andrew Morton, Jonathan Corbet, Vivek Goyal, Dave Young, kexec,
	linux-kernel, linux-doc, Youling Tang
In-Reply-To: <adZYpnwOxgvFMLaT@MiWiFi-R3L-srv>

Hi, Baoquan

On 4/8/26 21:32, Baoquan He wrote:
> On 04/08/26 at 10:01am, Sourabh Jain wrote:
>> Hello Youling,
>>
>> On 04/04/26 13:11, Youling Tang wrote:
>>> From: Youling Tang <tangyouling@kylinos.cn>
>>>
>>> The crashkernel range syntax (range1:size1[,range2:size2,...]) allows
>>> automatic size selection based on system RAM, but it always reserves
>>> from low memory. When a large crashkernel is selected, this can
>>> consume most of the low memory, causing subsequent hardware
>>> hotplug or drivers requiring low memory to fail due to allocation
>>> failures.
>>
>> Support for high crashkernel reservation has been added to
>> address the above problem.
>>
>> However, high crashkernel reservation is not supported with
>> range-based crashkernel kernel command-line arguments.
>> For example: crashkernel=0M-1G:100M,1G-4G:160M,4G-8G:192M
>>
>> Many users, including some distributions, use range-based
>> crashkernel configuration. So, adding support for high crashkernel
>> reservation with range-based configuration would be useful.
> Sorry for late response. And I have to say sorry because I have some
> negative tendency on this change.
>
> We use crashkernel=xM|G and crashkernel=range1:size1[,range2:size2,...]
> as default setting, so that people only need to set suggested amount
> of memory. While crashkernel=,high|low is for advanced user to customize
> their crashkernel value. In that case, user knows what's high memory and
> low memory, and how much is needed separately to achieve their goal, e.g
> saving low memory, taking away more high memory.
>
> To be honest, above grammars sound simple, right? I believe both of you
> know very well how complicated the current crashkernel code is. I would
> suggest not letting them become more and more complicated by extending
> the grammar further and further. Unless you meet unavoidable issue with
> the existing grammar.
>
> Here comes my question, do you meet unavoidable issue with the existing
> grammar when you use crashkernel=range1:size1[,range2:size2,...] and
> think it's not satisfactory, and at the same time crashkernel=,high|low
> can't meet your demand either?

Yes, regular users generally don't know about high memory and low memory,
and probably don't know how much crashkernel memory should be reserved
either. They mostly just use the default crashkernel parameters configured
by the distribution.

For advanced users, the current grammar is sufficient, because
'crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset],>boundary'
can definitely be replaced with 'crashkernel=size,high'.
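For readers less familiar with the range syntax under discussion, the following
Python sketch shows how a size could be selected from a
"range1:size1[,range2:size2,...]" specification. It is an illustration only and
not the kernel parser: the function names are made up, the rule that a range
matches when start <= RAM < end is my understanding of the existing behavior,
and the ",>boundary" extension proposed by this patch is not modeled.

```python
import re

UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Parse a size such as '160M' or '4G' into bytes."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def select_crashkernel_size(spec, system_ram):
    """Return the size of the first range containing system_ram, else 0."""
    for entry in spec.split(","):
        range_part, size_part = entry.split(":")
        start_text, _, end_text = range_part.partition("-")
        start = parse_size(start_text)
        # An omitted end (e.g. '8G-') means the range has no upper bound.
        end = parse_size(end_text) if end_text else float("inf")
        if start <= system_ram < end:
            # Ignore an optional '@offset' suffix in this sketch.
            return parse_size(size_part.split("@")[0])
    return 0


spec = "0M-1G:100M,1G-4G:160M,4G-8G:192M"
print(select_crashkernel_size(spec, 2 << 30) // (1 << 20))
```

A machine with 2 GiB of RAM falls into the 1G-4G range and gets 160M, while a
machine with 16 GiB matches no range in this spec and gets no reservation.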

The main purpose of this patch is to provide distributions with a more
reasonable default parameter configuration (satisfying most requirements),
without having to set different distribution default parameters for different
scenarios (physical machines, virtual machines) and different machine models.

Thanks,
Youling.
>
> Thanks
> Baoquan
>

^ permalink raw reply

* Re: [PATCH net-next] docs: netdev: improve wording of reviewer guidance
From: patchwork-bot+netdevbpf @ 2026-04-09  2:10 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, corbet,
	skhan, workflows, linux-doc
In-Reply-To: <20260406175334.3153451-1-kuba@kernel.org>

Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon,  6 Apr 2026 10:53:34 -0700 you wrote:
> Reword the reviewer guidance based on behavior we see on the list.
> Steer folks:
>  - towards sending tags
>  - away from process issues.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> 
> [...]

Here is the summary with links:
  - [net-next] docs: netdev: improve wording of reviewer guidance
    https://git.kernel.org/netdev/net-next/c/bd5c24e4001d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [RFC net-next 15/15] Documentation: networking: add ipxlat translator guide
From: Xavier HSINYUAN @ 2026-04-09  2:17 UTC (permalink / raw)
  To: Daniel Gröber
  Cc: ralf, antonio, corbet, davem, edumazet, horms, kuba, linux-doc,
	linux-kernel, netdev, pabeni, skhan
In-Reply-To: <fldksy7obiaonlcxrjcbnfkfmaup27t3fq3ktubd7sx35fsswx@hjmchh6sr7rw>

Hi Daniel,

> Indeed, the JSON is just wrong and --do dev-set is missing. However
> `--family ipxlat` works for me and looking at the code is basically the
> same as specifying --spec.
> 
> Could you try this:
>
>    $ JSON='{"ifindex": '"$IID"', "config": {"xlat-prefix6": { "prefix": "'$ADDR_HEX'", "prefix-len": 96}}}'
>    $ ./tools/net/ynl/pyynl/cli.py --family ipxlat --do dev-set --json "$JSON"
This looks good to me now. `--family ipxlat` is fine with me if this runs
from the source tree.

> I worry once we start with that we're really just re-stating what's already
> extensively documented in the RFCs.
> 
> How about a reference to RFC 7915 Appendix A? This has a full bidirectional
> end-to-end example of how translation operates:
> https://datatracker.ietf.org/doc/html/rfc7915#appendix-A
>
> Admittedly using a /96 prefix (which the appendix doesn't) would make it
> easier to grok whats going on. Not sure that's reason enough to get into
> more detailed examples here.

A reference to RFC 7915 Appendix A sounds good to me. Still, a short /96
mapping example would help readers quickly see how the translation works
before reading the full RFC, and would make the following NAT64 section
easier to follow as well.
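As a rough idea of what such a short /96 example could look like, here is a
sketch using Python's standard ipaddress module. It follows the RFC 6052
embedding for a /96 prefix (the IPv4 address occupies the last 32 bits); the
helper names are made up, and the well-known prefix 64:ff9b::/96 with the
documentation address 192.0.2.33 is my choice of example, not something taken
from the ipxlat patch.

```python
import ipaddress


def embed_ipv4(prefix96, ipv4):
    """Map an IPv4 address into an IPv6 /96 translation prefix."""
    net = ipaddress.IPv6Network(prefix96)
    if net.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    # With a /96 prefix the IPv4 address fills the low 32 bits.
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))


def extract_ipv4(prefix96, ipv6):
    """Recover the IPv4 address from an address inside the /96 prefix."""
    net = ipaddress.IPv6Network(prefix96)
    addr = ipaddress.IPv6Address(ipv6)
    if net.prefixlen != 96 or addr not in net:
        raise ValueError("address is not inside the /96 prefix")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


mapped = embed_ipv4("64:ff9b::/96", "192.0.2.33")
print(mapped)
print(extract_ipv4("64:ff9b::/96", mapped))
```

The mapping is a plain bit-level embedding, so it is reversible in both
directions, which is what makes the /96 case easy to follow by eye.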

Best regards,
Xavier

^ permalink raw reply

* Re: [PATCH v2] Documentation/kernel-parameters: fix architecture alignment for pt, nopt, and nobypass
From: Li,Rongqing(ACG CCN) @ 2026-04-09  2:18 UTC (permalink / raw)
  To: Jonathan Corbet, Andrew Morton, Borislav Petkov, Randy Dunlap,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
  Cc: Shuah Khan, Peter Zijlstra, Feng Tang, Pawan Gupta, Dapeng Mi,
	Kees Cook, Marco Elver, Paul E . McKenney, Askar Safin,
	Bjorn Helgaas, Sohil Mehta
In-Reply-To: <20260330105957.2271-1-lirongqing@baidu.com>

> Subject: [PATCH v2] Documentation/kernel-parameters: fix architecture alignment
> for pt, nopt, and nobypass
> 
> From: Li RongQing <lirongqing@baidu.com>
> 
> Commit ab0e7f20768a ("Documentation: Merge x86-specific boot options doc
> into kernel-parameters.txt") introduced a formatting regression where
> architecture tags were placed on separate lines with broken indentation.
> This caused the 'nopt' [X86] parameter to appear as if it belonged to the
> [PPC/POWERNV] section.
> 
> Furthermore, since the main 'iommu=' parameter heading already specifies it is
> for [X86, EARLY], the subsequent standalone [X86] tags for 'pt', 'nopt', and the
> AMD GART options are redundant and clutter the documentation.
> 
> Clean up the formatting by removing these redundant tags and properly
> attributing the 'nobypass' option to [PPC/POWERNV].
> 


Ping

thanks

[Li,Rongqing] 



> Fixes: ab0e7f20768a ("Documentation: Merge x86-specific boot options doc
> into kernel-parameters.txt")
> Acked-by: Randy Dunlap <rdunlap@infradead.org>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Borislav Petkov (AMD) <bp@alien8.de>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Feng Tang <feng.tang@linux.alibaba.com>
> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Cc: Kees Cook <kees@kernel.org>
> Cc: Marco Elver <elver@google.com>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Askar Safin <safinaskar@gmail.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Sohil Mehta <sohil.mehta@intel.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt
> b/Documentation/admin-guide/kernel-parameters.txt
> index 03a5506..5253c23 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2615,15 +2615,11 @@ Kernel parameters
>  			Intel machines). This can be used to prevent the usage
>  			of an available hardware IOMMU.
> 
> -			[X86]
>  		pt
> -			[X86]
>  		nopt
> -			[PPC/POWERNV]
> -		nobypass
> +		nobypass	[PPC/POWERNV]
>  			Disable IOMMU bypass, using IOMMU for PCI devices.
> 
> -		[X86]
>  		AMD Gart HW IOMMU-specific options:
> 
>  		<size>
> --
> 2.9.4


^ permalink raw reply

* Re: [PATCH net-next V5 00/12] devlink: add per-port resource support
From: patchwork-bot+netdevbpf @ 2026-04-09  3:10 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: edumazet, kuba, pabeni, andrew+netdev, davem, horms,
	donald.hunter, jiri, corbet, skhan, saeedm, leon, mbloch, shuah,
	matttbe, chuck.lever, cjubran, ohartoov, moshe, dtatulea,
	daniel.zahka, shshitrit, cratiu, jacob.e.keller, parav,
	ajayachandra, shayd, kees, danielj, netdev, linux-kernel,
	linux-doc, linux-rdma, linux-kselftest, gal
In-Reply-To: <20260407194107.148063-1-tariqt@nvidia.com>

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 7 Apr 2026 22:40:55 +0300 you wrote:
> Hi,
> 
> This series by Or adds devlink per-port resource support.
> See detailed description by Or below [1].
> 
> Regards,
> Tariq
> 
> [...]

Here is the summary with links:
  - [net-next,V5,01/12] devlink: Refactor resource functions to be generic
    https://git.kernel.org/netdev/net-next/c/7be3163c49b2
  - [net-next,V5,02/12] devlink: Add port-level resource registration infrastructure
    https://git.kernel.org/netdev/net-next/c/6f38acfed5ed
  - [net-next,V5,03/12] net/mlx5: Register SF resource on PF port representor
    https://git.kernel.org/netdev/net-next/c/4be8326d817e
  - [net-next,V5,04/12] netdevsim: Add devlink port resource registration
    https://git.kernel.org/netdev/net-next/c/085b234b28cc
  - [net-next,V5,05/12] devlink: Add dump support for device-level resources
    https://git.kernel.org/netdev/net-next/c/11636b550eea
  - [net-next,V5,06/12] devlink: Include port resources in resource dump dumpit
    https://git.kernel.org/netdev/net-next/c/810b76394d69
  - [net-next,V5,07/12] devlink: Add port-specific option to resource dump doit
    https://git.kernel.org/netdev/net-next/c/7511ff14f30d
  - [net-next,V5,08/12] selftest: netdevsim: Add devlink port resource doit test
    https://git.kernel.org/netdev/net-next/c/396135377104
  - [net-next,V5,09/12] devlink: Document port-level resources and full dump
    https://git.kernel.org/netdev/net-next/c/170e160a0e7c
  - [net-next,V5,10/12] devlink: Add resource scope filtering to resource dump
    https://git.kernel.org/netdev/net-next/c/1bc45341a6ea
  - [net-next,V5,11/12] selftest: netdevsim: Add resource dump and scope filter test
    https://git.kernel.org/netdev/net-next/c/2a8e91235254
  - [net-next,V5,12/12] devlink: Document resource scope filtering
    https://git.kernel.org/netdev/net-next/c/78c327c1728d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH 3/4] docs/zh_CN: update rust/quick-start.rst translation
From: Dongliang Mu @ 2026-04-09  5:37 UTC (permalink / raw)
  To: Gary Guo, Ben Guo, Alex Shi, Yanteng Si, Jonathan Corbet
  Cc: linux-doc, linux-kernel, rust-for-linux
In-Reply-To: <DHNYKCR34P1F.1EZ3D0A8UB8S5@garyguo.net>


On 4/9/26 1:43 AM, Gary Guo wrote:
> On Wed Apr 8, 2026 at 5:51 PM BST, Ben Guo wrote:
>> On 4/8/26 7:33 PM, Gary Guo wrote:
>>> Hi Ben,
>>>
>>> Thanks on updating the doc translation. There has been new changes to
>>> quick-start.rst on rust-next, could you update the translation to base on that
>>> please?
>>>
>>> Thanks,
>>> Gary
>> Hi Gary,
>>
>> Thanks for the review. This series is based on the Chinese documentation
>> maintainer's tree (alexs/linux.git docs-next), which does not yet have
>> the latest quick-start.rst changes from the Rust-for-Linux rust-next
>> tree.
>>
>> Would it be better to wait until those changes land in our base tree
>> and then resend with the updated translation? Or would you prefer a
>> different approach?
>>
>> Thanks,
>> Ben
> I don't see the issue of sending translation of the latest quick-start.rst even
> if it's not in your base yet. By the time the changes land upstream, the
> original quick-start.rst would already be there.

Hi Gary,

Let’s wait for the rust-next changes to land upstream first, then I’ll
ask Ben Guo to sync that commit. Otherwise, the Chinese translation
would not match the original English doc, which would confuse readers.

We have checktransupdate.py in place for monitoring the updates in 
English documents.

Dongliang Mu


>
> Best,
> Gary


^ permalink raw reply

* Re: [PATCH 0/1] Documentation: leds: leds-class: Document keyboard backlight LED class naming
From: Kate Hsuan @ 2026-04-09  6:43 UTC (permalink / raw)
  To: Hans de Goede, Lee Jones, Pavel Machek, Jonathan Corbet,
	Shuah Khan
  Cc: Rishit Bansal, Carlos Ferreira, Edip Hazuri, Mustafa Ekşi,
	Xavier Bestel, linux-leds, linux-doc
In-Reply-To: <20260406174638.320135-1-johannes.goede@oss.qualcomm.com>

Hi Hans,

On 4/7/26 1:46 AM, Hans de Goede wrote:
> Hi All,
>
> Over the last couple of years there have been several attempts to add
> upstream kernel support for controlling keyboard backlights consisting of
> a small number of backlight zones, think e.g. : "main", "cursor" and
> "keypad" zones.
>
> All of these attempts have gotten or are stuck on the lack of consensus on
> a userspace API (1) for controlling such zoned keyboard backlights.
>
> Previous discussion can be summarized as there being consensus that
> these backlights should be represented as (multi-color) LED class devices
> with one LED class device per zone, mirroring the existing use of
> a LED class device for controlling single zone keyboard backlights.
>
> The only thing which really still needs to be agreed upon is a naming
> scheme for the per zone LED class devices so that userspace can detect:
>
> 1. That the function of these is to control a zoned keyboard backlight.
> 2. How to group the per zone devices together for a single keyboard.
>
> The single patch in this series documents the currently undocumented naming
> scheme for single zone keyboard backlights and extends this with a naming
> scheme to use for multi-zone keyboard backlights.
>
> This is sent out as a separate patch rather than as part of a series
> implementing this in the hope to get multiple drivers which are in
> the process of being upstreamed unstuck wrt the LED class naming problem.
>
> Drivers which need this are:
>
> 1. HP WMI laptop driver Omen gaming keyboards backlight control support:
> First 2023 attempt:
> https://lore.kernel.org/platform-driver-x86/20230131235027.36304-1-rishitbansal0@gmail.com/
> Later 2024 attempt which includes an earlier version of this doc patch:
> https://lore.kernel.org/platform-driver-x86/20240719100011.16656-1-carlosmiguelferreira.2003@gmail.com/
> Current ongoing 2026 attempt:
> https://lore.kernel.org/platform-driver-x86/20260304105831.119349-3-edip@medip.dev/
>
> 2. Casper Excalibur laptop driver (inc. multi-zone kbd backlight control):
> https://lore.kernel.org/platform-driver-x86/20240806205001.191551-2-mustafa.eskieksi@gmail.com/
> This one unfortunately seems to have stalled.
>
> 3. Logitech G710/G710+ gaming keyboards HID driver:
> https://lore.kernel.org/linux-input/20260402075239.3829699-1-xav@bes.tel/
> Posted a week ago, needs an agreement on the LED class dev naming scheme
> to continue.
>
> Regards,
>
> Hans
>
>
> 1) The lack of such an API may not always have been the sole reason these
> drivers have gotten stuck, but it was always a factor.
>
>
> Carlos Ferreira (1):
>    Documentation: leds: leds-class: Document keyboard backlight LED class
>      naming
>
>   Documentation/leds/leds-class.rst | 63 +++++++++++++++++++++++++++++++
>   1 file changed, 63 insertions(+)
>
Thank you for your work.

The kbd_zoned_backlight is pretty useful for upper-layer apps, such as
upower. This gives additional information about the location of the
keyboard backlight LED and allows upower to expose APIs with the zone
information to user space. It also improves the user experience of
keyboard backlight control.

Acked-by: Kate Hsuan <hpa@redhat.com>


^ permalink raw reply

