public inbox for linux-kernel@vger.kernel.org
* [PATCH v3 00/12] x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
@ 2026-04-30 23:24 Babu Moger
From: Babu Moger @ 2026-04-30 23:24 UTC
  To: corbet, tony.luck, reinette.chatre, Dave.Martin, james.morse,
	tglx, bp, dave.hansen
  Cc: skhan, x86, babu.moger, mingo, hpa, akpm, rdunlap,
	pawan.kumar.gupta, feng.tang, dapeng1.mi, kees, elver, lirongqing,
	paulmck, bhelgaas, seanjc, alexandre.chartre, yazen.ghannam,
	peterz, chang.seok.bae, kim.phillips, xin, naveen,
	thomas.lendacky, linux-doc, linux-kernel, eranian, peternewman,
	sos-linux-ext-patches


Hi,

This series adds support for AMD's Privilege-Level Zero Association
(PLZA) so kernel work can be assigned to a resctrl group, and wires it
up through a small generic "kernel mode" (kmode) layer in fs/resctrl
so future architectures can plug in without touching core resctrl.

The features are documented in:
 
   AMD64 Zen6 Platform Quality of Service (PQOS) Extensions,
   Publication # 69193 Revision 1.00, Issue Date March 2026
 
available at https://bugzilla.kernel.org/show_bug.cgi?id=206537

The patches are based on commit 3382329a309d ("Merge branch into
tip/master: 'timers/clocksource'"), 7.1.0-rc1.

Background
==========

Customers have identified an issue while using the QoS resource control
feature. If the memory bandwidth associated with a CLOSID is aggressively
throttled and a task running with that CLOSID enters kernel mode, its
kernel operations are throttled just as aggressively. This can stall
forward progress and eventually degrade overall system performance.

Privilege-Level Zero Association (PLZA) allows the user to specify a CLOSID
and/or RMID to be used for execution in Privilege-Level Zero. When PLZA is
enabled on a HW thread and that thread enters Privilege-Level Zero, its
transactions are associated with the PLZA CLOSID and/or RMID. Otherwise,
the HW thread is associated with the CLOSID and RMID identified by
PQR_ASSOC.
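
As a rough illustration of the selection rule above, the following shell
sketch models which CLOSID the hardware would use for a transaction. It is
not kernel code: the function name and argument order are made up for this
example, and the RMID is assumed to follow the same rule.

```shell
# Hypothetical model of the PLZA selection rule described above.
# $1 = current privilege level (0 = kernel)
# $2 = 1 if PLZA is enabled on this HW thread, else 0
# $3 = PLZA CLOSID
# $4 = CLOSID from PQR_ASSOC
effective_closid() {
    if [ "$1" -eq 0 ] && [ "$2" -eq 1 ]; then
        echo "$3"    # kernel mode with PLZA enabled: PLZA CLOSID
    else
        echo "$4"    # otherwise: PQR_ASSOC CLOSID
    fi
}
```

For example, "effective_closid 0 1 5 2" selects the PLZA CLOSID 5, while
"effective_closid 3 1 5 2" falls back to the PQR_ASSOC CLOSID 2.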

Design
======

A new sysfs file, info/kernel_mode, holds a single global policy that
selects what kernel work is steered and which rdtgroup it is steered
to.  Reads describe the supported modes and the currently-active
binding; writes change the policy or rebind to a different group.
See the following thread for the design discussion:
https://lore.kernel.org/lkml/14a8ad0a-e842-4268-871a-0762f1169e03@intel.com/
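
A script can pick out the active policy from the info/kernel_mode output.
A minimal sketch, assuming the "[mode:group=<ctrl>/<mon>/]" layout shown in
the Examples section; kmode_active is a hypothetical helper name and is not
part of this series:

```shell
# Print "<mode> <group>" for the bracketed (active) entry read from stdin.
# Lines without brackets (inactive modes) produce no output.
kmode_active() {
    sed -n 's/^ *\[\([^:]*\):group=\(.*\)] *$/\1 \2/p'
}
```

On a mounted resctrl filesystem it would be used as
"kmode_active < /sys/fs/resctrl/info/kernel_mode".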

Per-rdtgroup files kmode_cpus and kmode_cpus_list scope the binding
to a subset of online CPUs without unbind/rebind churn.  They are
visible only on the group that is currently the active kernel-mode
binding.

The arch hooks (resctrl_arch_get_kmode_support,
resctrl_arch_configure_kmode) keep the fs/resctrl layer arch-neutral.
Only AMD PLZA is wired up here; Intel and ARM can add their own
support later by implementing the hooks.

Layout
======

  01-02  x86: PLZA CPU feature + MSR/data-structure plumbing.
  03-05  fs/resctrl + x86: kmode data structures, arch hooks, and
         population of supported modes.
  06-08  fs/resctrl: global kmode config, info/kernel_mode read/write
         and documentation.
  09     fs/resctrl: reset the binding when the bound rdtgroup is
         removed.
  10-12  fs/resctrl: per-rdtgroup kmode_cpus[_list] - expose, gate
         visibility on the bound group, and allow incremental writes.

Examples
========

(See Documentation/filesystems/resctrl.rst, "kernel_mode" and
"kmode_cpus" sections, for the full UAPI.)

  # Mount resctrl
  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl

  # Read the supported modes.  The active mode is bracketed and reports
  # the bound "<ctrl>/<mon>/" group; other supported modes report
  # ":group=none" because nothing is bound to them.
  # cat info/kernel_mode
  [inherit_ctrl_and_mon:group=//]
  global_assign_ctrl_inherit_mon_per_cpu:group=none
  global_assign_ctrl_assign_mon_per_cpu:group=none

  # Create a CTRL_MON group plus a MON child and bind both the kernel
  # CLOSID and RMID to them.
  # mkdir ctrl1
  # mkdir ctrl1/mon_groups/mon1
  # echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/" \
          > info/kernel_mode
  # cat info/kernel_mode
  inherit_ctrl_and_mon:group=none
  global_assign_ctrl_inherit_mon_per_cpu:group=none
  [global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/]

  # kmode_cpus and kmode_cpus_list are visible only on the bound group.
  # ls ctrl1/kmode_cpus*
  ctrl1/kmode_cpus  ctrl1/kmode_cpus_list

  # Restrict the binding to a CPU subset; the write is incremental.
  # echo 0-3 > ctrl1/kmode_cpus_list
  # cat ctrl1/kmode_cpus
  f
  # cat ctrl1/kmode_cpus_list
  0-3

  # Empty masks are rejected; use info/kernel_mode to reset to
  # "every online CPU".
  # echo "" > ctrl1/kmode_cpus_list
  bash: echo: write error: Invalid argument
  # cat info/last_cmd_status
  Empty mask not allowed; use info/kernel_mode to unbind

  # Disable kernel-mode steering (back to inherit, default group).
  # echo "inherit_ctrl_and_mon" > info/kernel_mode
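
The kmode_cpus and kmode_cpus_list values in the example above are the same
CPU set in mask and list notation. A small sketch of the conversion, for
illustration only: cpulist_to_mask is a hypothetical helper, and it handles
only CPU numbers that fit in a shell integer, not the kernel's full
comma-separated mask format.

```shell
# Convert a CPU list such as "0-3,8" to a hexadecimal mask.
# The function body runs in a subshell so the IFS change stays local.
cpulist_to_mask() (
    mask=0
    IFS=,
    for range in $1; do
        first=${range%-*}    # "0-3" -> 0, "8" -> 8
        last=${range#*-}     # "0-3" -> 3, "8" -> 8
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            mask=$(( mask | (1 << cpu) ))
            cpu=$(( cpu + 1 ))
        done
    done
    printf '%x\n' "$mask"
)
```

With this helper, "cpulist_to_mask 0-3" prints "f", matching the
kmode_cpus output above.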

Tested on AMD with PLZA; the generic bits build clean on x86 without
PLZA support and are no-ops at runtime.

Changelog
=========

v3:
  - Generalise the layer beyond AMD: rename "PLZA mode" to "kernel
    mode" (kmode) in code, sysfs, and Documentation.  The public
    interface is now info/kernel_mode and per-group kmode_cpus[_list].
  - info/kernel_mode UAPI cleanups: ":group=none" instead of
    ":group=uninitialized"; designated initialisers + static_assert
    for the mode-name table; strim() the input; clearer error
    messages via last_cmd_status.
  - kmode_cpus / kmode_cpus_list:
      * 0010 exposes them read-only on every group.
      * 0011 toggles their visibility via kernfs_show() so they
        appear only on the rdtgroup currently bound to the active
        kernel mode.
      * 0012 (new) makes them writable: incremental
        enable/disable deltas via resctrl_arch_configure_kmode(),
        empty masks rejected with -EINVAL ("use info/kernel_mode
        to unbind"), offline CPUs rejected, defensive -EBUSY for
        stale fds opened before an info/kernel_mode rebind.
  - 0009: reset the binding when the bound rdtgroup is removed,
    instead of leaving stale state.
  - Kerneldoc/comment cleanups across the series; Documentation
    updated alongside the UAPI changes.
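
The incremental write in 0012 can be pictured as two mask differences: the
CPUs to enable are in the new mask but not in the old one, and the CPUs to
disable are the reverse. A sketch under that assumption; kmode_cpus_delta
is a hypothetical helper, masks are hex strings without a "0x" prefix, and
the real handler applies the deltas through resctrl_arch_configure_kmode():

```shell
# Print "enable=<hex> disable=<hex>" for a change from mask $1 to mask $2.
# $3 is the mask of online CPUs. Returns non-zero when the write would be
# rejected (empty mask, or a CPU outside the online mask).
kmode_cpus_delta() (
    old=$(( 0x$1 ))
    new=$(( 0x$2 ))
    online=$(( 0x$3 ))
    if [ "$new" -eq 0 ]; then
        echo "empty mask rejected" >&2
        exit 1
    fi
    if [ $(( new & ~online )) -ne 0 ]; then
        echo "offline CPU rejected" >&2
        exit 1
    fi
    printf 'enable=%x disable=%x\n' $(( new & ~old )) $(( old & ~new ))
)
```

Moving the binding from CPUs 0-3 (mask f) to CPUs 2-5 (mask 3c) with CPUs
0-7 online, "kmode_cpus_delta f 3c ff" prints "enable=30 disable=3": only
CPUs 4-5 are newly enabled and only CPUs 0-1 are disabled.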

v2: 
     This is similar to the RFC, with a new proposal. Names of some of the
     interfaces are not final. Let's fix that later as we move forward.

     Separated the two features: Global Bandwidth Enforcement (GLBE) and
     Privilege Level Zero Association (PLZA).
 
     This series only adds support for PLZA.

     Named the feature kmode instead of PLZA. That can be changed as well.

     Tony suggested using global variables to store the kernel mode
     CLOSID and RMID. However, the kernel mode CLOSID and RMID are
     coming from rdtgroup structure with the new interface. Accessing
     them requires holding the associated lock, which would make the
     context switch path unnecessarily expensive. So, dropped the idea.
     https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
     Let me know if there are other ways to optimize this.

Patch 1: Data structures and arch hook: Add resctrl_kmode,
	resctrl_kmode_cfg, kernel-mode bits, and resctrl_arch_get_kmode_cfg()
	for generic resctrl kernel mode (e.g. PLZA).

Patch 2: Implement resctrl_arch_get_kmode_cfg() on x86, add global resctrl_kcfg
	and resctrl_kmode_init() to set default kmode.

Patch 3: Add info/kernel_mode and resctrl_kernel_mode_show() to list supported
	kernel modes and show the current one in brackets.

Patch 4: Add x86 PLZA support and boot option rdt=plza.

Patch 5: Add supported modes from CPUID.

Patch 6: Add rdt_kmode_enable_key and arch enable/disable helpers so PLZA only
	touches fast paths when enabled.

Patch 7: Add MSR_IA32_PQR_PLZA_ASSOC, bit defines, and union qos_pqr_plza_assoc
	for programming PLZA.

Patch 8: Add Per-CPU and per-task state.

Patch 9: Add resctrl_arch_configure_kmode() and resctrl_arch_set_kmode()
	to program PLZA per domain and set/clear it on a CPU.

Patch 10: In the sched-in path, program MSR_IA32_PQR_PLZA_ASSOC from task or
	per-CPU kmode; only write when kmode changes; guard with rdt_kmode_enable_key.

Patch 11: Add write handler so the current kernel mode can be set by name.

Patch 12: Add info/kernel_mode_assignment and show which rdtgroup is assigned
	for kernel mode in CTRL_MON/MON/ form.

Patch 13: Add write handler to assign/clear the group used for kernel mode;
	enforce single assignment and clear on rmdir.

Patch 14: Update per-CPU PLZA state when its cpu_mask changes (add/remove CPUs)
	via cpus_write_kmode() and helpers.

Patch 15: Refactor so task list respects t->kmode when the group has kmode (PLZA),
	so tasks are shown correctly.



v2: https://lore.kernel.org/lkml/cover.1773347820.git.babu.moger@amd.com/
v1: https://lore.kernel.org/lkml/cover.1769029977.git.babu.moger@amd.com/

Babu Moger (12):
  x86/resctrl: Support Privilege-Level Zero Association (PLZA)
  x86/resctrl: Add data structures and definitions for PLZA
    configuration
  fs/resctrl: Add kernel mode (kmode) data structures and arch hook
  x86,fs/resctrl: Program PLZA through kmode arch hooks
  x86/resctrl: Initialize supported kernel modes for PLZA
  fs/resctrl: Initialize the global kernel-mode policy at subsystem init
  fs/resctrl: Add info/kernel_mode for kernel-mode policy introspection
  fs/resctrl: Make info/kernel_mode writable and identify the bound
    group
  fs/resctrl: Reset kernel-mode binding when its rdtgroup goes away
  fs/resctrl: Expose kmode_cpus / kmode_cpus_list per rdtgroup
  resctrl: Hide kmode_cpus[_list] on groups not bound to kernel-mode
  fs/resctrl: Allow user space to write kmode_cpus / kmode_cpus_list

 .../admin-guide/kernel-parameters.txt         |   2 +-
 Documentation/filesystems/resctrl.rst         |  84 ++
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |   7 +
 arch/x86/kernel/cpu/resctrl/core.c            |  17 +
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c     |  35 +
 arch/x86/kernel/cpu/resctrl/internal.h        |  27 +
 arch/x86/kernel/cpu/scattered.c               |   1 +
 fs/resctrl/internal.h                         |   6 +
 fs/resctrl/rdtgroup.c                         | 784 ++++++++++++++++++
 include/linux/resctrl.h                       |  23 +
 include/linux/resctrl_types.h                 |  46 +
 12 files changed, 1032 insertions(+), 1 deletion(-)

-- 
2.43.0

