* [PATCH v1 0/4] x86,fs/resctrl: Pave the way for MPAM counter assignment
@ 2026-02-25 20:19 Ben Horgan
  2026-02-25 20:19 ` [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode Ben Horgan
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Ben Horgan @ 2026-02-25 20:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, reinette.chatre, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, ben.horgan, fenghuay,
	tan.shaopeng

A little bit of preparatory work to get ready for MPAM counter
assignment. Resctrl gained support last year for counter assignment on AMD
machines supporting ABMC. Tighten up a few things that weren't needed for
AMD so that the MPAM driver can emulate ABMC and hence support counter
assignment.

Based on v7.0-rc1. The last patch [1] is only there to resolve the conflict
in the case that the mpam resctrl glue series [2] is merged first.

[1] arm_mpam: resctrl: Use new signature for resctrl_arch_is_evt_configurable()
[2] https://lore.kernel.org/linux-arm-kernel/20260224175720.2663924-1-ben.horgan@arm.com/


Ben Horgan (4):
  x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of
    mbm_assign_mode
  fs/resctrl: Only show 'event_filter' files if events are configurable
  fs/resctrl: Disallow the software controller when mbm counters are
    assignable
  arm_mpam: resctrl: Use new signature for
    resctrl_arch_is_evt_configurable()

 arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++++++--
 drivers/resctrl/mpam_resctrl.c     |  2 +-
 fs/resctrl/monitor.c               |  4 ++--
 fs/resctrl/rdtgroup.c              |  8 ++++++--
 include/linux/resctrl.h            |  2 +-
 5 files changed, 20 insertions(+), 8 deletions(-)

-- 
2.43.0



* [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-02-25 20:19 [PATCH v1 0/4] x86,fs/resctrl: Pave the way for MPAM counter assignment Ben Horgan
@ 2026-02-25 20:19 ` Ben Horgan
  2026-03-02 23:11   ` Reinette Chatre
  2026-02-25 20:19 ` [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable Ben Horgan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-02-25 20:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, reinette.chatre, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, ben.horgan, fenghuay,
	tan.shaopeng

The features BMEC and ABMC provide separate interfaces to configuring which
bandwidth types a counter tracks. Currently
resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
supported.

ABMC is useful even when BMEC is supported as it also provides counter
assignment which reduces the number of hardware monitors a system
requires. It is an architectural detail that ABMC provides counter
configurability without requiring the prior feature, BMEC. On MPAM systems
these two features are independent and the bandwidth types are limited to a
choice of only read or write.

In order to give resctrl the information to support these features
independently extend resctrl_arch_is_evt_configurable() to report whether
events are configurable when using the mbm_event counter assignment mode.

Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
 arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++++++--
 fs/resctrl/monitor.c               |  4 ++--
 include/linux/resctrl.h            |  2 +-
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 7667cf7c4e94..d35f4417d55e 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -877,11 +877,19 @@ bool rdt_cpu_has(int flag)
 	return ret;
 }
 
-bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt, bool assignable)
 {
-	if (!rdt_cpu_has(X86_FEATURE_BMEC))
+	if (!assignable && !rdt_cpu_has(X86_FEATURE_BMEC))
 		return false;
 
+	if (assignable && !rdt_cpu_has(X86_FEATURE_ABMC))
+		return false;
+
+	/*
+	 * When ABMC is used the mbm_local and mbm_total events are enabled
+	 * based on the equivalently named cpu features. (In order to allow
+	 * fallback to the default counter assignment mode).
+	 */
 	switch (evt) {
 	case QOS_L3_MBM_TOTAL_EVENT_ID:
 		return rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL);
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 49f3f6b846b2..d25622ee22c5 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1854,12 +1854,12 @@ int resctrl_l3_mon_resource_init(void)
 	if (ret)
 		return ret;
 
-	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_TOTAL_EVENT_ID)) {
+	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_TOTAL_EVENT_ID, false)) {
 		mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].configurable = true;
 		resctrl_file_fflags_init("mbm_total_bytes_config",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 	}
-	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_LOCAL_EVENT_ID)) {
+	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_LOCAL_EVENT_ID, false)) {
 		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].configurable = true;
 		resctrl_file_fflags_init("mbm_local_bytes_config",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 006e57fd7ca5..9ac1fe76bff4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -419,7 +419,7 @@ bool resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu,
 
 bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid);
 
-bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt, bool assignable);
 
 static inline bool resctrl_is_mbm_event(enum resctrl_event_id eventid)
 {
-- 
2.43.0



* [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable
  2026-02-25 20:19 [PATCH v1 0/4] x86,fs/resctrl: Pave the way for MPAM counter assignment Ben Horgan
  2026-02-25 20:19 ` [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode Ben Horgan
@ 2026-02-25 20:19 ` Ben Horgan
  2026-03-02 23:12   ` Reinette Chatre
  2026-02-25 20:19 ` [PATCH v1 3/4] fs/resctrl: Disallow the software controller when mbm counters are assignable Ben Horgan
  2026-02-25 20:19 ` [PATCH v1 4/4] arm_mpam: resctrl: Use new signature for resctrl_arch_is_evt_configurable() Ben Horgan
  3 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-02-25 20:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, reinette.chatre, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, ben.horgan, fenghuay,
	tan.shaopeng

When the counter assignment mode is mbm_event resctrl assumes the mbm
events are configurable and exposes the 'event_filter' files. These files
live at info/L3_MON/event_configs/<event>/event_filter and are used to
display and set the event configuration.

ABMC always supports event configuration but for MPAM they are
independent. Decouple event configuration from counter assignment by only
showing the 'event_filter' interface when event configuration is supported
at the same time as 'mbm_event' counter assignment.

Remove only the 'event_filter' files and not the containing directories so
that their names can still be used to identify the assignable counters.

Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
 fs/resctrl/rdtgroup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 5da305bd36c9..e79929d84317 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -2338,6 +2338,9 @@ static int resctrl_mkdir_event_configs(struct rdt_resource *r, struct kernfs_nod
 		if (ret)
 			goto out;
 
+		if (!resctrl_arch_is_evt_configurable(mevt->evtid, true))
+			continue;
+
 		ret = rdtgroup_add_files(kn_subdir2, RFTYPE_ASSIGN_CONFIG);
 		if (ret)
 			break;
-- 
2.43.0



* [PATCH v1 3/4] fs/resctrl: Disallow the software controller when mbm counters are assignable
  2026-02-25 20:19 [PATCH v1 0/4] x86,fs/resctrl: Pave the way for MPAM counter assignment Ben Horgan
  2026-02-25 20:19 ` [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode Ben Horgan
  2026-02-25 20:19 ` [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable Ben Horgan
@ 2026-02-25 20:19 ` Ben Horgan
  2026-02-25 20:19 ` [PATCH v1 4/4] arm_mpam: resctrl: Use new signature for resctrl_arch_is_evt_configurable() Ben Horgan
  3 siblings, 0 replies; 23+ messages in thread
From: Ben Horgan @ 2026-02-25 20:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, reinette.chatre, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, ben.horgan, fenghuay,
	tan.shaopeng

The software controller requires that there are free running mbm counters
for each control group in order to provide the feedback necessary to
control the memory bandwidth allocation for that control group. Prior to
the introduction of counter assignment support (ABMC) resctrl required this
in order to advertise support for mbm, but now if the mbm counters are
assignable then this can't be guaranteed.

Currently, only AMD systems support counter assignment but the MBA is
non-linear and so the software controller is never supported anyway. For
MPAM systems the MBA is linear and so the dependency on counters not being
assignable needs to be made explicit. Hence, fail the mount if the user
requests the software controller (the mba_MBps option) and the mbm counters
are assignable.

Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
 fs/resctrl/rdtgroup.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index e79929d84317..10ff3a125751 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -2523,7 +2523,8 @@ static bool supports_mba_mbps(void)
 
 	return (resctrl_is_mbm_enabled() &&
 		r->alloc_capable && is_mba_linear() &&
-		r->ctrl_scope == rmbm->mon_scope);
+		r->ctrl_scope == rmbm->mon_scope &&
+		!rmbm->mon.mbm_cntr_assignable);
 }
 
 /*
@@ -2938,7 +2939,7 @@ static int rdt_parse_param(struct fs_context *fc, struct fs_parameter *param)
 		ctx->enable_cdpl2 = true;
 		return 0;
 	case Opt_mba_mbps:
-		msg = "mba_MBps requires MBM and linear scale MBA at L3 scope";
+		msg = "mba_MBps requires dedicated MBM counters and linear scale MBA at L3 scope";
 		if (!supports_mba_mbps())
 			return invalfc(fc, msg);
 		ctx->enable_mba_mbps = true;
-- 
2.43.0



* [PATCH v1 4/4] arm_mpam: resctrl: Use new signature for resctrl_arch_is_evt_configurable()
  2026-02-25 20:19 [PATCH v1 0/4] x86,fs/resctrl: Pave the way for MPAM counter assignment Ben Horgan
                   ` (2 preceding siblings ...)
  2026-02-25 20:19 ` [PATCH v1 3/4] fs/resctrl: Disallow the software controller when mbm counters are assignable Ben Horgan
@ 2026-02-25 20:19 ` Ben Horgan
  3 siblings, 0 replies; 23+ messages in thread
From: Ben Horgan @ 2026-02-25 20:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: tony.luck, reinette.chatre, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, ben.horgan, fenghuay,
	tan.shaopeng

Fix MPAM build after change to new version of
resctrl_arch_is_evt_configurable().  The MPAM driver doesn't support
event configuration regardless of whether default or mbm_event counter
assignment mode is used and so return false unconditionally.

Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
This patch is only here in the case it's needed to resolve a merge conflict.
---
 drivers/resctrl/mpam_resctrl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 694ea8548a05..afe43034a516 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -105,7 +105,7 @@ bool resctrl_arch_mon_capable(void)
 	return l3->mon_capable;
 }
 
-bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt, bool assignable)
 {
 	return false;
 }
-- 
2.43.0



* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-02-25 20:19 ` [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode Ben Horgan
@ 2026-03-02 23:11   ` Reinette Chatre
  2026-03-03 12:29     ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-02 23:11 UTC (permalink / raw)
  To: Ben Horgan, linux-kernel
  Cc: tony.luck, Dave.Martin, james.morse, babu.moger, tglx, mingo, bp,
	dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 2/25/26 12:19 PM, Ben Horgan wrote:
> The features BMEC and ABMC provide separate interfaces to configuring which
> bandwidth types a counter tracks. Currently
> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
> supported.
> 
> ABMC is useful even when BMEC is supported as it also provides counter
> assignment which reduces the number of hardware monitors a system
> requires. It is an architectural detail that ABMC provides counter

Since the goal is to support MPAM I'd suggest that the first focus be on what
resctrl fs supports and exposes and how it does or does not work for MPAM.

> configurability without requiring the prior feature, BMEC. On MPAM systems
> these two features are independent and the bandwidth types are limited to a
> choice of only read or write.

Does MPAM support exactly these two features? Specifically, does MPAM support
a feature that allows user to configure events globally per domain and another
feature that allows user to configure events per PMG?

These different features are how I understand assignable counters and BMEC to
be and to support both at the same time requires a user interface that is
confusing since the user can concurrently configure events globally per-domain
and per resource group.

Could you please elaborate how event configuration works on MPAM? I find this
series quite cryptic. I think it will help if you could elaborate what MPAM
capabilities are and how you expect resctrl fs to expose these features to
an MPAM user and how said user is expected to interact with resctrl fs to use
the features.

> 
> In order to give resctrl the information to support these features
> independently extend resctrl_arch_is_evt_configurable() to report whether
> events are configurable when using the mbm_event counter assignment mode.
> 
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
>  arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++++++--
>  fs/resctrl/monitor.c               |  4 ++--
>  include/linux/resctrl.h            |  2 +-
>  3 files changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 7667cf7c4e94..d35f4417d55e 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -877,11 +877,19 @@ bool rdt_cpu_has(int flag)
>  	return ret;
>  }
>  
> -bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
> +bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt, bool assignable)
>  {
> -	if (!rdt_cpu_has(X86_FEATURE_BMEC))
> +	if (!assignable && !rdt_cpu_has(X86_FEATURE_BMEC))
>  		return false;
>  
> +	if (assignable && !rdt_cpu_has(X86_FEATURE_ABMC))
> +		return false;
> +

I find this very confusing. 

> +	/*
> +	 * When ABMC is used the mbm_local and mbm_total events are enabled
> +	 * based on the equivalently named cpu features. (In order to allow
> +	 * fallback to the default counter assignment mode).
> +	 */
>  	switch (evt) {
>  	case QOS_L3_MBM_TOTAL_EVENT_ID:
>  		return rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL);


Reinette


* Re: [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable
  2026-02-25 20:19 ` [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable Ben Horgan
@ 2026-03-02 23:12   ` Reinette Chatre
  2026-03-03 14:00     ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-02 23:12 UTC (permalink / raw)
  To: Ben Horgan, linux-kernel
  Cc: tony.luck, Dave.Martin, james.morse, babu.moger, tglx, mingo, bp,
	dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 2/25/26 12:19 PM, Ben Horgan wrote:
> When the counter assignment mode is mbm_event resctrl assumes the mbm
> events are configurable and exposes the 'event_filter' files. These files
> live at info/L3_MON/event_configs/<event>/event_filter and are used to
> display and set the event configuration.
> 
> ABMC always supports event configuration but for MPAM they are
> independent. Decouple event configuration from counter assignment by only

Could you please elaborate what you mean with "independent" here? If event
configuration is still supported when assignable counter mode is enabled, why
can event configuration interface not just remain as-is? Could resctrl not
display the existing event configuration and if user cannot modify it return
a failure when user attempts to do so?

Reinette



* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-02 23:11   ` Reinette Chatre
@ 2026-03-03 12:29     ` Ben Horgan
  2026-03-03 18:09       ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-03-03 12:29 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
> Hi Ben,
> 
> On 2/25/26 12:19 PM, Ben Horgan wrote:
> > The features BMEC and ABMC provide separate interfaces to configuring which
> > bandwidth types a counter tracks. Currently
> > resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
> > supported.
> > 
> > ABMC is useful even when BMEC is supported as it also provides counter
> > assignment which reduces the number of hardware monitors a system
> > requires. It is an architectural detail that ABMC provides counter
> 
> Since the goal is to support MPAM I'd suggest that the first focus be on what
> resctrl fs supports and exposes and how it does or does not work for MPAM.
> 
> > configurability without requiring the prior feature, BMEC. On MPAM systems
> > these two features are independent and the bandwidth types are limited to a
> > choice of only read or write.
> 
> Does MPAM support exactly these two features? Specifically, does MPAM support
> a feature that allows user to configure events globally per domain and another
> feature that allows user to configure events per PMG?

No, the bandwidth type configuration in MPAM is per counter and so effectively
per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
or both.
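
To illustrate the per-counter filter, here is a small userspace C model. It is
only a sketch under assumptions: the enum, struct and function names are made
up for illustration, and the enum values are not claimed to match the hardware
encoding of the RWBW field.

```c
#include <stdbool.h>

/* Hypothetical names; the values are NOT the hardware RWBW encoding. */
enum bw_filter {
	BW_FILTER_READ_WRITE,
	BW_FILTER_READ_ONLY,
	BW_FILTER_WRITE_ONLY,
};

enum bw_access { BW_ACCESS_READ, BW_ACCESS_WRITE };

/* Model of one bandwidth counter bound to a single (PARTID, PMG) pair. */
struct bw_counter {
	unsigned int partid;
	unsigned int pmg;
	enum bw_filter filter;
	unsigned long long bytes;
};

/* A transaction is counted only if both the pair and the filter match. */
static bool bw_counter_matches(const struct bw_counter *c, unsigned int partid,
			       unsigned int pmg, enum bw_access access)
{
	if (c->partid != partid || c->pmg != pmg)
		return false;

	switch (c->filter) {
	case BW_FILTER_READ_ONLY:
		return access == BW_ACCESS_READ;
	case BW_FILTER_WRITE_ONLY:
		return access == BW_ACCESS_WRITE;
	case BW_FILTER_READ_WRITE:
		return true;
	}
	return false;
}

/* Accumulate the transaction size when the counter matches it. */
static void bw_counter_observe(struct bw_counter *c, unsigned int partid,
			       unsigned int pmg, enum bw_access access,
			       unsigned int bytes)
{
	if (bw_counter_matches(c, partid, pmg, access))
		c->bytes += bytes;
}
```

The point of the model is that the filter lives in the counter itself, so two
counters watching different (PARTID, PMG) pairs can use different filters.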

> 
> These different features are how I understand assignable counters and BMEC to

We are each approaching this from a different view point. I've just been looking at
ABMC as a way of dealing with systems where there are fewer hardware counters than
(PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
as they need a fixed (PARTID, PMG) configuration to avoid missing counts.

The intent of this patch is to allow splitting these two features of ABMC,
bandwidth type configuration and hardware counter assignment in order to just
support the hardware counter assignment.

I'm still not understanding the distinction you are making though.
The files are,
With ABMC:
info/L3_MON/event_configs/mbm_[local,total]_bytes/event_filter
and with BMEC they are:
info/L3_MON/mbm_[local,total]_bytes_config

In both cases they allow configuration for two event types,
mbm_local_bytes, and mbm_total_bytes. What am I missing?

> be and to support both at the same time requires a user interface that is
> confusing since the user can concurrently configure events globally per-domain
> and per resource group.

Sure.

> 
> Could you please elaborate how event configuration works on MPAM? I find this
> series quite cryptic. I think it will help if you could elaborate what MPAM
> capabilities are and how you expect resctrl fs to expose these features to
> an MPAM user and how said user is expected to interact with resctrl fs to use
> the features.

Ok, firstly regarding hardware counter assignment, on MPAM systems with more
(PARTID, PMG) pairs than bandwidth hardware counters we'd like to expose the
mbm_L3_assignments for tracking which CTRL_MON/MON groups have bandwidth
counting events and otherwise not.

I haven't put much thought into how we would support event configuration with
MPAM but we would want something that allows the configuration per hardware
counter or (PARTID, PMG) pair. I'd rather not commit to the existing interface
as this looks to commit us to per event type configuration. Please correct me
if this isn't the case.

> 
> > 
> > In order to give resctrl the information to support these features
> > independently extend resctrl_arch_is_evt_configurable() to report whether
> > events are configurable when using the mbm_event counter assignment mode.
> > 
> > Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> > ---
> >  arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++++++--
> >  fs/resctrl/monitor.c               |  4 ++--
> >  include/linux/resctrl.h            |  2 +-
> >  3 files changed, 13 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> > index 7667cf7c4e94..d35f4417d55e 100644
> > --- a/arch/x86/kernel/cpu/resctrl/core.c
> > +++ b/arch/x86/kernel/cpu/resctrl/core.c
> > @@ -877,11 +877,19 @@ bool rdt_cpu_has(int flag)
> >  	return ret;
> >  }
> >  
> > -bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
> > +bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt, bool assignable)
> >  {
> > -	if (!rdt_cpu_has(X86_FEATURE_BMEC))
> > +	if (!assignable && !rdt_cpu_has(X86_FEATURE_BMEC))
> >  		return false;
> >  
> > +	if (assignable && !rdt_cpu_has(X86_FEATURE_ABMC))
> > +		return false;
> > +
> 
> I find this very confusing. 
> 
> > +	/*
> > +	 * When ABMC is used the mbm_local and mbm_total events are enabled
> > +	 * based on the equivalently named cpu features. (In order to allow
> > +	 * fallback to the default counter assignment mode).
> > +	 */
> >  	switch (evt) {
> >  	case QOS_L3_MBM_TOTAL_EVENT_ID:
> >  		return rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL);
> 
> 
> Reinette

Thanks,

Ben


* Re: [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable
  2026-03-02 23:12   ` Reinette Chatre
@ 2026-03-03 14:00     ` Ben Horgan
  2026-03-03 18:14       ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-03-03 14:00 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On Mon, Mar 02, 2026 at 03:12:52PM -0800, Reinette Chatre wrote:
> Hi Ben,
> 
> On 2/25/26 12:19 PM, Ben Horgan wrote:
> > When the counter assignment mode is mbm_event resctrl assumes the mbm
> > events are configurable and exposes the 'event_filter' files. These files
> > live at info/L3_MON/event_configs/<event>/event_filter and are used to
> > display and set the event configuration.
> > 
> > ABMC always supports event configuration but for MPAM they are
> > independent. Decouple event configuration from counter assignment by only
> 
> Could you please elaborate what you mean with "independent" here? If event
> configuration is still supported when assignable counter mode is enabled, why
> can event configuration interface not just remain as-is? Could resctrl not

The two features of ABMC that I'm claiming are independent are: firstly,
requiring assignment of a hardware counter to a CTRL_MON/MON group in order to
allow using bandwidth monitoring when there are fewer hardware counters than
possible CTRL_MON/MON groups (num_rmid) and secondly bandwidth type
configuration for the counters.

The first is concerned with which, if any, hardware counter is used per group
and the second with what the counters are counting. To me these appear as two
things that should be considered separately. Is this clearer?
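
As a rough illustration of treating the two as separate capabilities, here is
a minimal userspace sketch in C. The struct and helper names are hypothetical
and are not resctrl code; it only models the visibility rule this patch aims
for, where the per-event directories follow counter assignment and the
'event_filter' files additionally need event configuration.

```c
#include <stdbool.h>

/* Hypothetical capability flags, one per feature. */
struct mbm_caps {
	bool cntr_assignable;	/* counters must be assigned to groups */
	bool evt_configurable;	/* bandwidth type of a counter can be set */
};

/* The directory names the assignable event, so it only follows assignment. */
static bool show_event_dir(const struct mbm_caps *caps)
{
	return caps->cntr_assignable;
}

/* The filter file needs both assignment and event configuration. */
static bool show_event_filter(const struct mbm_caps *caps)
{
	return caps->cntr_assignable && caps->evt_configurable;
}
```

With both flags set (the ABMC case) both helpers return true. With only
cntr_assignable set (the MPAM case described here) the directory is shown
and the filter file is not.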

I'm first trying to address the case where event configuration isn't supported,
as we haven't currently got support for that in the MPAM driver, and supporting
systems with fewer hardware counters than (PARTID, PMG) pairs without
unnecessarily limiting the exposed PARTID/PMG. Some MPAM hardware systems only
have a single bandwidth counter.

> display the existing event configuration and if user cannot modify it return
> a failure when user attempts to do so?

I guess it could. Currently in MPAM we just support mbm_total_bytes and so it
would always be 0x1F. Would we want some other way to indicate that it is fixed
rather than trying to change it? However, if we just remove the configuration
files then it seems natural for mbm_total_bytes to just have the same meaning as
it has when BMEC and ABMC are not enabled.

> 
> Reinette
> 

Thanks,

Ben


* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-03 12:29     ` Ben Horgan
@ 2026-03-03 18:09       ` Reinette Chatre
  2026-03-04 11:07         ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-03 18:09 UTC (permalink / raw)
  To: Ben Horgan
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 3/3/26 4:29 AM, Ben Horgan wrote:
> Hi Reinette,
> 
> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>> Hi Ben,
>>
>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>> bandwidth types a counter tracks. Currently
>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>> supported.
>>>
>>> ABMC is useful even when BMEC is supported as it also provides counter
>>> assignment which reduces the number of hardware monitors a system
>>> requires. It is an architectural detail that ABMC provides counter
>>
>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>
>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>> these two features are independent and the bandwidth types are limited to a
>>> choice of only read or write.
>>
>> Does MPAM support exactly these two features? Specifically, does MPAM support
>> a feature that allows user to configure events globally per domain and another
>> feature that allows user to configure events per PMG?
> 
> No, the bandwidth type configuration in MPAM is per counter and so effectively
> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
> or both.

Thank you for confirming.

Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.


>> These different features are how I understand assignable counters and BMEC to
> 
> We are each approaching this from a different view point. I've just been looking at
> ABMC as a way of dealing with systems where there are fewer hardware counters than
> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work

No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
counter mode the counter assignment is per monitoring group AND event as a pair:
(CTRL_MON/MON group, event).
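
To make the pairing concrete, here is a minimal userspace sketch in C. It is
an illustration only: the names (cntr_slot, cntr_assign, ...) and the pool
size are made up and this is not how resctrl fs implements it.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_HW_COUNTERS 2	/* arbitrary for the sketch */

/* Hypothetical record of one hardware counter's assignment. */
struct cntr_slot {
	bool in_use;
	int group_id;	/* CTRL_MON/MON group */
	int event_id;	/* e.g. mbm_total_bytes or mbm_local_bytes */
};

static struct cntr_slot cntr_pool[NUM_HW_COUNTERS];

/* Return the counter index for (group, event), or -1 if none is assigned. */
static int cntr_find(int group_id, int event_id)
{
	for (size_t i = 0; i < NUM_HW_COUNTERS; i++) {
		if (cntr_pool[i].in_use && cntr_pool[i].group_id == group_id &&
		    cntr_pool[i].event_id == event_id)
			return (int)i;
	}
	return -1;
}

/* Assign a free counter to (group, event); -1 when the pool is exhausted. */
static int cntr_assign(int group_id, int event_id)
{
	int idx = cntr_find(group_id, event_id);

	if (idx >= 0)
		return idx;

	for (size_t i = 0; i < NUM_HW_COUNTERS; i++) {
		if (!cntr_pool[i].in_use) {
			cntr_pool[i] = (struct cntr_slot){ true, group_id, event_id };
			return (int)i;
		}
	}
	return -1;
}

/* Release the counter held by (group, event), if any. */
static void cntr_unassign(int group_id, int event_id)
{
	int idx = cntr_find(group_id, event_id);

	if (idx >= 0)
		cntr_pool[idx].in_use = false;
}
```

Note that one group monitoring two events consumes two counters in this
model, which is why the key is the pair and not the group alone.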

> as they need a fixed (PARTID, PMG) configuration to avoid missing counts.

It is not clear to me how sharing counters is at play here.

> The intent of this patch is to allow splitting these two features of ABMC,
> bandwidth type configuration and hardware counter assignment in order to just

Why keep BMEC, which by its name does event configuration? And on top
of that, it is event configuration at a scope that MPAM does not support?

> support the hardware counter assignment.
> 
> I'm still not understanding the distinction you are making though.
> The files are,
> With ABMC:
> info/L3_MON/event_configs/mbm_[local,total]_bytes/event_filter

This is an event configuration that is global without any assignment. This
interface communicates to user space which transactions are counted when
this particular event is assigned to a CTRL_MON/MON group. This interface
is intended to be extensible. The interface starts with the original mbm_local_bytes
and mbm_total_bytes events in order to be backward compatible. The vision is that
if the user prefers to count different transactions then they could create
a new event with the transactions needing counting. For example,

  # mkdir /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow
  # echo local_reads_slow_memory > /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow/event_filter

The events are just tracked and managed in software with the above interface,
no hardware configuration is involved at this point in the above example*.

The new "just_local_slow" event can then be assigned to a monitor group via
mbm_L3_assignments that will at that time consume one hardware counter and
program it with the event (which transactions to monitor) and monitor group
details (PARTID, PMG).

This is based on original suggestion by Peter in a way that we thus expect to
work for customers. See [1].

> and with BMEC they are:
> info/L3_MON/mbm_[local,total]_bytes_config

This is essentially both an event configuration and assignment that is not
compatible with assignable counters. With this interface the user
both configures which transactions are counted by a particular event and
programs all counters in a domain (across all resource groups) to use that
particular configuration. Due to this incompatibility resctrl fs will not expose
BMEC files when assignable counters are enabled.


> In both cases they allow configuration for two event types,
> mbm_local_bytes, and mbm_total_bytes. What am I missing?

The way I see it:
BMEC: per domain across all resource groups event configuration and assignment that
      applies to all counters - intended to support the "default" mode where there
      is no counter assignment from user space.
assignable counters: event configuration via event_filter with assignment done
                     separately using per resource group mbm_L3_assignments file

> 
>> be and to support both at the same time requires a user interface that is
>> confusing since the user can concurrently configure events globally per-domain
>> and per resource group.
> 
> Sure.
> 
>>
>> Could you please elaborate how event configuration works on MPAM? I find this
>> series quite cryptic. I think it will help if you could elaborate what MPAM
>> capabilities are and how you expect resctrl fs to expose these features to
>> an MPAM user and how said user is expected to interact with resctrl fs to use
>> the features.
> 
> Ok, firstly regarding hardware counter assignment, on MPAM systems with more
> (PARTID, PMG) pairs than bandwidth hardware counters we'd like to expose the
> mbm_L3_assignments for tracking which CTRL_MON/MON groups have bandwidth
> counting events and otherwise not.

ok. This sounds like assignable counters to me. I do not believe BMEC comes
into play.

> 
> I haven't put much thought into how we would support event configuration with
> MPAM but we would want something that allows the configuration per hardware
> counter or (PARTID, PMG) pair. I'd rather not commit to the existing interface

This is what assignable counters already does, no?

> as this looks to commit us to per event type configuration. Please correct me
> if this isn't the case.
> 

The existing interface was created with MPAM in mind so I do hope that it can
indeed accommodate MPAM.

Reinette

[1] https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/

* If the user changes which memory transactions are counted by an event *after* this
event has been assigned to resource groups then some hardware reconfiguration is indeed 
required.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable
  2026-03-03 14:00     ` Ben Horgan
@ 2026-03-03 18:14       ` Reinette Chatre
  2026-03-04 11:31         ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-03 18:14 UTC (permalink / raw)
  To: Ben Horgan
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 3/3/26 6:00 AM, Ben Horgan wrote:
> Hi Reinette,
> 
> On Mon, Mar 02, 2026 at 03:12:52PM -0800, Reinette Chatre wrote:
>> Hi Ben,
>>
>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>> When the counter assignment mode is mbm_event resctrl assumes the mbm
>>> events are configurable and exposes the 'event_filter' files. These files
>>> live at info/L3_MON/event_configs/<event>/event_filter and are used to
>>> display and set the event configuration.
>>>
>>> ABMC always supports event configuration but for MPAM they are
>>> independent. Decouple event configuration from counter assignment by only
>>
>> Could you please elaborate what you mean with "independent" here? If event
>> configuration is still supported when assignable counter mode is enabled, why
>> can event configuration interface not just remain as-is? Could resctrl not
> 
> The two features of ABMC that I'm claiming are independent are: firstly,
> requiring assignment of a hardware counter to a CTRL_MON/MON group in order to
> allow using bandwidth monitoring when there are fewer hardware counters than
> possible CTRL_MON/MON groups (num_rmid) and secondly bandwidth type
> configuration for the counters.

These are implemented separately in resctrl fs.


> The first is concerned with which, if any, hardware counter is used per group
> and the second with what the counters are counting. To me these appear as two
> things that should be considered separately. Is this clearer?

They can, and indeed resctrl manages these two parts of assignable counters with
two separate interfaces.

> 
> I'm first trying to address the case where event configuration isn't supported
> as we haven't currently got support for that in the MPAM driver and supporting
> systems with fewer hardware counters than (PARTID, PMG) without unnecessarily
> limiting the exposed PARTID/PMG. Some MPAM hardware systems only have a single
> bandwidth counter.

ok, but if assignment is supported first then that assignment needs to have some
configuration associated with it as to which memory transactions are counted, no?

If I understand correctly MPAM would have hardcoded events (hardcoded which
transactions are counted by current default mbm_total_bytes and/or mbm_local_bytes).
The memory transactions that the event counts can be exposed in the
associated event_filter file. User space can use the per-resource group
mbm_L3_assignments file to assign the event to the resource group, which will
result in a counter being allocated and programmed to count those transactions
for that resource group.

With only one bandwidth counter it will only be possible to assign one event to
one resource group at a time.

> 
>> display the existing event configuration and if user cannot modify it return
>> a failure when user attempts to do so?
> 
> I guess it could. Currently in MPAM we just support mbm_total_bytes and so it
> would always be 0x1F. Would we want some other way to indicate that it is fixed
> rather than trying to change it? However, if we just remove the configuration

event_filter could also be made read-only. This sounds temporary though and at
some point the MPAM driver will support event configuration. I thus do not think
resctrl fs should make big interface changes to accommodate this.

> files then it seems natural for mbm_total_bytes to just have the same meaning as
> it has when BMEC and ABMC are not enabled.

I do not see a reason to remove the configuration file. If MPAM just supports
mbm_total_bytes then user space can still look at its event_filter to see which
transactions are counted by it.  I am not sure about the 0x1F you mention (was
expecting 0x7F), but below is how it would appear:

  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
  local_reads_slow_memory
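
For reference, a small sketch of how a filter bitmask maps onto the names
above. The bit order is an assumption (bit 0 first, following the order of the
listing), and the last two names in the table are taken from my recollection
of the resctrl documentation rather than from this thread. filter_to_names()
is a hypothetical helper, not a resctrl function.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed bit order: index in this table == bit number in the mask. */
static const char *const txn_names[] = {
	"local_reads",
	"remote_reads",
	"local_non_temporal_writes",
	"remote_non_temporal_writes",
	"local_reads_slow_memory",
	"remote_reads_slow_memory",
	"dirty_victim_writes_all",
};

/*
 * Write the comma-separated names of the bits set in @mask into @buf.
 * Names that do not fit are dropped. Returns @buf.
 */
char *filter_to_names(unsigned int mask, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(txn_names) / sizeof(txn_names[0]); i++) {
		int n;

		if (!(mask & (1u << i)))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "," : "", txn_names[i]);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
	}
	return buf;
}
```

Under that assumed ordering a mask of 0x1F produces the five names shown in
the listing, and the two upper bits of 0x7F add the last two table entries.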

Reinette


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-03 18:09       ` Reinette Chatre
@ 2026-03-04 11:07         ` Ben Horgan
  2026-03-04 17:02           ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-03-04 11:07 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On 3/3/26 18:09, Reinette Chatre wrote:
> Hi Ben,
> 
> On 3/3/26 4:29 AM, Ben Horgan wrote:
>> Hi Reinette,
>>
>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>> Hi Ben,
>>>
>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>>> bandwidth types a counter tracks. Currently
>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>> supported.
>>>>
>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>> assignment which reduces the number of hardware monitors a system
>>>> requires. It is an architectural detail that ABMC provides counter
>>>
>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>
>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>> these two features are independent and the bandwidth types are limited to a
>>>> choice of only read or write.
>>>
>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>> a feature that allows user to configure events globally per domain and another
>>> feature that allows user to configure events per PMG?
>>
>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>> or both.
> 
> Thank you for confirming.
> 
> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
> 
> 
>>> These different features are how I understand assignable counters and BMEC to
>>
>> We are each approaching this from a different view point. I've just been looking at
>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
> 
> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
> counter mode the counter assignment is per monitoring group AND event as a pair:
> (CTRL_MON/MON group, event).

Yes but these counters aren't necessarily fungible. For MPAM the
mbm_local_bytes and mbm_total_bytes are necessarily backed by different
hardware counters. An MPAM bandwidth counter just counts all traffic on
a link with the only configurability being for read/write. The counters
are just placed at different points in the topology to get the different
events.

> 
>> as they need a fixed (PARTID, PMG) configuration to avoid missing counts.
> 
> It is not clear to me how sharing counters are at play here.

I was just saying it wasn't possible for bandwidth counters. For
llc_occupancy, CSU in MPAM, you can share 'counters' as they can just
recount to get the current cache occupancy.

> 
>> The intent of this patch is to allow splitting these two features of ABMC,
>> bandwidth type configuration and hardware counter assignment in order to just
> 
> Why keep BMEC which by its name does event configuration? And then on top
> of that it is event configuration at a scope that MPAM does not support?
> 
>> support the hardware counter assignment.
>>
>> I'm still not understanding the distinction you are making though.
>> The files are,
>> With ABMC:
>> info/L3_MON/event_configs/mbm_[local,total]_bytes/event_filter
> 
> This is an event configuration that is global without any assignment. This
> interface communicates to user space which transactions are counted when
> this particular event is assigned to a CTRL_MON/MON group. This interface
> is intended to be extensible. The interface starts with the original mbm_local_bytes
> and mbm_total_bytes events in order to be backward compatible. The vision is that
> if the user prefers to count different transactions then they could create
> a new event with the transactions needing counting. For example,
> 
>   # mkdir /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow
>   # echo local_reads_slow_memory > /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow/event_filter
> 
> The events are just tracked and managed in software with the above interface,
> no hardware configuration is involved at this point in the above example*.
> 
> The new "just_local_slow" can then be assigned to a monitor group via
> mbm_L3_assignments that will at that time consume one hardware counter and
> program it with the event (which transactions to monitor) and monitor group
> details (PARTID, PMG).
> 
> This is based on original suggestion by Peter in a way that we thus expect to
> work for customers. See [1].
> 
>> and with BMEC they are:
>> info/L3_MON/mbm_[local,total]_bytes_config

I see, this makes the intent much clearer to me. Thanks for sharing this
plan. I think the general idea is good. To me this implies that for MPAM
to support event configuration we'd want ABMC enabled at the same time.
Which indeed makes sense as you can then count read and write
separately for a given CTRL_MON/MON group without requiring twice the
number of hardware counters.

However, I now spot an existing issue: bundling mbm_local_bytes and
mbm_total_bytes together for one pool of counters doesn't work for MPAM.
As noted above, they require different sets of hardware counters. With
the current counter assignment mode interface, num_mbm_cntrs is
scoped to all mbm counters. In an MPAM system that supports both
mbm_local_bytes and mbm_total_bytes this could lead to a
num_mbm_total_cntrs and a num_mbm_local_cntrs, or something equivalent.
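
To make the point concrete, here is a hypothetical sketch of what per-event
pools could look like. None of these types or helpers exist in resctrl or the
MPAM driver; they only model the idea that a counter able to back one event
cannot be borrowed for the other.

```c
#include <stddef.h>

/* Hypothetical model: one counter pool per event type, not a shared pool. */
enum mbm_evt { MBM_TOTAL, MBM_LOCAL, MBM_NUM_EVTS };

struct cntr_pool {
	unsigned int num;	/* counters able to count this event */
	unsigned int used;	/* counters currently assigned */
};

struct mon_domain_model {
	struct cntr_pool pool[MBM_NUM_EVTS];
};

/* Take one counter for @evt. Returns 0 on success, -1 if that pool is empty. */
int pool_take(struct mon_domain_model *d, enum mbm_evt evt)
{
	struct cntr_pool *p = &d->pool[evt];

	if (p->used == p->num)
		return -1;	/* free counters in the other pool do not help */
	p->used++;
	return 0;
}

/* Number of counters still free for @evt. */
unsigned int pool_available(const struct mon_domain_model *d, enum mbm_evt evt)
{
	return d->pool[evt].num - d->pool[evt].used;
}
```

A single num_mbm_cntrs value cannot describe this state: with one total
counter and two local counters, a user who has assigned the total counter
still sees free counters but cannot assign mbm_total_bytes anywhere else.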

> 
> This is essentially both an event configuration and assignment that is not
> compatible with assignable counters. With this interface the user
> both configures which transactions are counted by a particular event and
> programs all counters in a domain (across all resource groups) to use that
> particular configuration. Due to this incompatibility resctrl fs will not expose
> BMEC files when assignable counters are enabled.
> 
> 
>> In both cases they allow configuration for two event types,
>> mbm_local_bytes, and mbm_total_bytes. What am I missing?
> 
> The way I see it:
> BMEC: per domain across all resource groups event configuration and assignment that
>       applies to all counters - intended to support the "default" mode where there
>       is no counter assignment from user space.
> assignable counters: event configuration via event_filter with assignment done
>                      separately using per resource group mbm_L3_assignments file

Makes sense.

> 
>>
>>> be and to support both at the same time requires a user interface that is
>>> confusing since the user can concurrently configure events globally per-domain
>>> and per resource group.
>>
>> Sure.
>>
>>>
>>> Could you please elaborate how event configuration works on MPAM? I find this
>>> series quite cryptic. I think it will help if you could elaborate what MPAM
>>> capabilities are and how you expect resctrl fs to expose these features to
>>> an MPAM user and how said user is expected to interact with resctrl fs to use
>>> the features.
>>
>> Ok, firstly regarding hardware counter assignment, on MPAM systems with more
>> (PARTID, PMG) pairs than bandwidth hardware counters we'd like to expose the
>> mbm_L3_assignments for tracking which CTRL_MON/MON groups have bandwidth
>> counting events and otherwise not.
> 
> ok. This sounds like assignable counters to me. I do not believe BMEC comes
> into play.
> 
>>
>> I haven't put much thought into how we would support event configuration with
>> MPAM but we would want something that allows the configuration per hardware
>> counter or (PARTID, PMG) pair. I'd rather not commit to the existing interface
> 
> This is what assignable counters already does, no?

Isn't that only with the future plan you shared above?

> 
>> as this looks to commit us to per event type configuration. Please correct me
>> if this isn't the case.
>>
> 
> The existing interface was created with MPAM in mind so I do hope that it can
> indeed accommodate MPAM.

Me too!

> 
> Reinette
> 
> [1] https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/
> 
> * If the user changes which memory transactions are counted by an event *after* this
> event has been assigned to resource groups then some hardware reconfiguration is indeed 
> required.


Thanks,

Ben


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable
  2026-03-03 18:14       ` Reinette Chatre
@ 2026-03-04 11:31         ` Ben Horgan
  2026-03-04 17:03           ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-03-04 11:31 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On 3/3/26 18:14, Reinette Chatre wrote:
> Hi Ben,
> 
> On 3/3/26 6:00 AM, Ben Horgan wrote:
>> Hi Reinette,
>>
>> On Mon, Mar 02, 2026 at 03:12:52PM -0800, Reinette Chatre wrote:
>>> Hi Ben,
>>>
>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>> When the counter assignment mode is mbm_event resctrl assumes the mbm
>>>> events are configurable and exposes the 'event_filter' files. These files
>>>> live at info/L3_MON/event_configs/<event>/event_filter and are used to
>>>> display and set the event configuration.
>>>>
>>>> ABMC always supports event configuration but for MPAM they are
>>>> independent. Decouple event configuration from counter assignment by only
>>>
>>> Could you please elaborate what you mean with "independent" here? If event
>>> configuration is still supported when assignable counter mode is enabled, why
>>> can event configuration interface not just remain as-is? Could resctrl not
>>
>> The two features of ABMC that I'm claiming are independent are: firstly,
>> requiring assignment of a hardware counter to a CTRL_MON/MON group in order to
>> allow using bandwidth monitoring when there are fewer hardware counters than
>> possible CTRL_MON/MON groups (num_rmid) and secondly bandwidth type
>> configuration for the counters.
> 
> These are implemented separately in resctrl fs.
> 
> 
>> The first is concerned with which, if any, hardware counter is used per group
>> and the second with what the counters are counting. To me these appear as two
>> things that should be considered separately. Is this clearer?
> 
> They can, and indeed resctrl manages these two parts of assignable counters with
> two separate interfaces.
> 
>>
>> I'm first trying to address the case where event configuration isn't supported
>> as we haven't currently got support for that in the MPAM driver and supporting
>> systems with fewer hardware counters than (PARTID, PMG) without unnecessarily
>> limiting the exposed PARTID/PMG. Some MPAM hardware systems only have a single
>> bandwidth counter.
> 
> ok, but if assignment is supported first then that assignment needs to have some
> configuration associated with it as to which memory transactions are counted, no?
> 
> If I understand correctly MPAM would have hardcoded events (hardcoded which
> transactions are counted by current default mbm_total_bytes and/or mbm_local_bytes).
> The memory transactions that the event counts can be exposed in the
> associated event_filter file. User space can use the per-resource group
> mbm_L3_assignments file to assign the event to the resource group that will
> result in counter allocated and programmed to count those transactions for
> that resource group.

That's right.

Just wondering, why aren't there <telemetry>_L3_assignments files to
allow keeping all CTRL_MON/MON groups when there are, as pointed out in
the Documentation/filesystems/resctrl.rst, platforms with limited
telemetry counters?

> 
> With only one bandwidth counter it will only be possible to assign one event to
> one resource group at a time.

yes

> 
>>
>>> display the existing event configuration and if user cannot modify it return
>>> a failure when user attempts to do so?
>>
>> I guess it could. Currently in MPAM we just support mbm_total_bytes and so it
>> would always be 0x1F. Would we want some other way to indicate that it is fixed
>> rather than trying to change it? However, if we just remove the configuration
> 
> event_filter could also be made readable only. This sounds temporary though and at
> some point the MPAM driver will support event configuration. I thus do not think
> resctrl fs should make big interface changes to accommodate this.

For MPAM, some of the bits will always be unchangeable as they depend on
which hardware counter is used rather than on the configuration of that
hardware counter. Also, not all MPAM hardware supports the read/write
configuration.

> 
>> files then it seems natural for mbm_total_bytes to just have the same meaning as
>> it has when BMEC and ABMC are not enabled.
> 
> I do not see a reason to remove the configuration file. If MPAM just supports
> mbm_total_bytes then user space can still look at its event_filter to see which
> transactions are counted by it.  I am not sure about the 0x1F you mention (was
> expecting 0x7F), but below is how it would appear:

0x1F was a mistake, it is indeed 0x7F.

> 
>   # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>   local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
>   local_reads_slow_memory

Ok, it's not a natural description for MPAM bandwidth counters but I
think we can live with that.

> 
> Reinette
> 

Thanks,

Ben


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-04 11:07         ` Ben Horgan
@ 2026-03-04 17:02           ` Reinette Chatre
  2026-03-04 17:37             ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-04 17:02 UTC (permalink / raw)
  To: Ben Horgan
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 3/4/26 3:07 AM, Ben Horgan wrote:
> Hi Reinette,
> 
> On 3/3/26 18:09, Reinette Chatre wrote:
>> Hi Ben,
>>
>> On 3/3/26 4:29 AM, Ben Horgan wrote:
>>> Hi Reinette,
>>>
>>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>>> Hi Ben,
>>>>
>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>>>> bandwidth types a counter tracks. Currently
>>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>>> supported.
>>>>>
>>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>>> assignment which reduces the number of hardware monitors a system
>>>>> requires. It is an architectural detail that ABMC provides counter
>>>>
>>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>>
>>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>>> these two features are independent and the bandwidth types are limited to a
>>>>> choice of only read or write.
>>>>
>>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>>> a feature that allows user to configure events globally per domain and another
>>>> feature that allows user to configure events per PMG?
>>>
>>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>>> or both.
>>
>> Thank you for confirming.
>>
>> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
>>
>>
>>>> These different features are how I understand assignable counters and BMEC to
>>>
>>> We are each approaching this from a different view point. I've just been looking at
>>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
>>
>> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
>> counter mode the counter assignment is per monitoring group AND event as a pair:
>> (CTRL_MON/MON group, event).
> 
> Yes but these counters aren't necessarily fungible. For MPAM the
> mbm_local_bytes and mbm_total_bytes are necessarily backed by different
> hardware counters. An MPAM bandwidth counter just counts all traffic on
> a link with the only configurability being for read/write. The counters
> are just placed at different points in the topology to get the different
> events.

The distinction between "different hardware counters for mbm_local_bytes and
mbm_total_bytes" and "The counters are just placed at different points in the
topology" is not clear to me. The former implies different counters for the
two events while the latter implies the same counters are used for both events
but perhaps accumulated/displayed differently?

I re-read the thread starting with 
https://lore.kernel.org/lkml/CALPaoCh+mRLJEfhKBve3hRf+vHHoObjvWRt74OfpopgtR9g9FQ@mail.gmail.com/
and it sounded to me as though MPAM would only expose the mbm_total_bytes event.

Ignoring for a moment that counters could be configured to count different
transactions, and so assuming all counters count the same transactions, could
you please clarify how MPAM determines the counts returned by
mbm_local_bytes and mbm_total_bytes respectively?

>>> as they need a fixed (PARTID, PMG) configuration to avoid missing counts.
>>
>> It is not clear to me how sharing counters are at play here.
> 
> I was just saying it wasn't possible for bandwidth counters. For
> llc_occupancy, CSU in MPAM, you can share 'counters' as they can just
> recount to get the current cache occupancy.

ack.

>>> The intent of this patch is to allow splitting these two features of ABMC,
>>> bandwidth type configuration and hardware counter assignment in order to just
>>
>> Why keep BMEC which by its name does event configuration? And then on top
>> of that it is event configuration at a scope that MPAM does not support?
>>
>>> support the hardware counter assignment.
>>>
>>> I'm still not understanding the distinction you are making though.
>>> The files are,
>>> With ABMC:
>>> info/L3_MON/event_configs/mbm_[local,total]_bytes/event_filter
>>
>> This is an event configuration that is global without any assignment. This
>> interface communicates to user space which transactions are counted when
>> this particular event is assigned to a CTRL_MON/MON group. This interface
>> is intended to be extensible. The interface starts with the original mbm_local_bytes
>> and mbm_total_bytes events in order to be backward compatible. The vision is that
>> if the user prefers to count different transactions then they could create
>> a new event with the transactions needing counting. For example,
>>
>>   # mkdir /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow
>>   # echo local_reads_slow_memory > /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow/event_filter
>>
>> The events are just tracked and managed in software with the above interface,
>> no hardware configuration is involved at this point in the above example*.
>>
>> The new "just_local_slow" can then be assigned to a monitor group via
>> mbm_L3_assignments that will at that time consume one hardware counter and
>> program it with the event (which transactions to monitor) and monitor group
>> details (PARTID, PMG).
>>
>> This is based on original suggestion by Peter in a way that we thus expect to
>> work for customers. See [1].
>>
>>> and with BMEC they are:
>>> info/L3_MON/mbm_[local,total]_bytes_config
> 
> I see, this makes the intent much clearer to me. Thanks for sharing this
> plan. I think the general idea is good. To me this implies that for MPAM
> to support event configuration we'd want ABMC enabled at the same time.
> Which indeed makes sense as you can then count read and write
> separately for a given CTRL_MON/MON group without requiring twice the
> number of hardware counters.
> 
> However, I now spot an existing issue, bundling mbm_local_bytes and
> mbm_total_bytes together for one pool of counters doesn't work for MPAM.
> As noted above they require different sets of hardware counters. With
> the current counter assignment mode interface the num_mbm_cntrs is
> scoped to all mbm counters. In an MPAM system that supports both
> mbm_local_bytes and mbm_total_bytes this could lead to
> num_mbm_total_cntrs and a num_mbm_local_cntrs or something equivalent.

Is this just needed because the MPAM driver does not support counter
configuration yet?

>> This is essentially both an event configuration and assignment that is not
>> compatible with assignable counters. With this interface the user
>> both configures which transactions are counted by a particular event and
>> programs all counters in a domain (across all resource groups) to use that
>> particular configuration. Due to this incompatibility resctrl fs will not expose
>> BMEC files when assignable counters are enabled.
>>
>>
>>> In both cases they allow configuration for two event types,
>>> mbm_local_bytes, and mbm_total_bytes. What am I missing?
>>
>> The way I see it:
>> BMEC: per domain across all resource groups event configuration and assignment that
>>       applies to all counters - intended to support the "default" mode where there
>>       is no counter assignment from user space.
>> assignable counters: event configuration via event_filter with assignment done
>>                      separately using per resource group mbm_L3_assignments file
> 
> Makes sense.
> 
>>
>>>
>>>> be and to support both at the same time requires a user interface that is
>>>> confusing since the user can concurrently configure events globally per-domain
>>>> and per resource group.
>>>
>>> Sure.
>>>
>>>>
>>>> Could you please elaborate how event configuration works on MPAM? I find this
>>>> series quite cryptic. I think it will help if you could elaborate what MPAM
>>>> capabilities are and how you expect resctrl fs to expose these features to
>>>> an MPAM user and how said user is expected to interact with resctrl fs to use
>>>> the features.
>>>
>>> Ok, firstly regarding hardware counter assignment, on MPAM systems with more
>>> (PARTID, PMG) pairs than bandwidth hardware counters we'd like to expose the
>>> mbm_L3_assignments for tracking which CTRL_MON/MON groups have bandwidth
>>> counting events and otherwise not.
>>
>> ok. This sounds like assignable counters to me. I do not believe BMEC comes
>> into play.
>>
>>>
>>> I haven't put much thought into how we would support event configuration with
>>> MPAM but we would want something that allows the configuration per hardware
>>> counter or (PARTID, PMG) pair. I'd rather not commit to the existing interface
>>
>> This is what assignable counters already does, no?
> 
> Isn't that only with the future plan you shared above?

Assigning a counter to a (PARTID, PMG) pair is what assignable counters does
today.

Reinette

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable
  2026-03-04 11:31         ` Ben Horgan
@ 2026-03-04 17:03           ` Reinette Chatre
  2026-03-04 17:53             ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-04 17:03 UTC (permalink / raw)
  To: Ben Horgan
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 3/4/26 3:31 AM, Ben Horgan wrote:
> Hi Reinette,
> 
> On 3/3/26 18:14, Reinette Chatre wrote:
>> Hi Ben,
>>
>> On 3/3/26 6:00 AM, Ben Horgan wrote:
>>> Hi Reinette,
>>>
>>> On Mon, Mar 02, 2026 at 03:12:52PM -0800, Reinette Chatre wrote:
>>>> Hi Ben,
>>>>
>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>> When the counter assignment mode is mbm_event resctrl assumes the mbm
>>>>> events are configurable and exposes the 'event_filter' files. These files
>>>>> live at info/L3_MON/event_configs/<event>/event_filter and are used to
>>>>> display and set the event configuration.
>>>>>
>>>>> ABMC always supports event configuration but for MPAM they are
>>>>> independent. Decouple event configuration from counter assignment by only
>>>>
>>>> Could you please elaborate what you mean with "independent" here? If event
>>>> configuration is still supported when assignable counter mode is enabled, why
>>>> can event configuration interface not just remain as-is? Could resctrl not
>>>
>>> The two features of ABMC that I'm claiming are independent are: firstly,
>>> requiring assignment of a hardware counter to a CTRL_MON/MON group in order to
>>> allow using bandwidth monitoring when there are fewer hardware counters than
>>> possible CTRL_MON/MON groups (num_rmid) and secondly bandwidth type
>>> configuration for the counters.
>>
>> These are implemented separately in resctrl fs.
>>
>>
>>> The first is concerned with which, if any, hardware counter is used per group
>>> and the second with what the counters are counting. To me these appear as two
>>> things that should be considered separately. Is this clearer?
>>
>> They can, and indeed resctrl manages these two parts of assignable counters with
>> two separate interfaces.
>>
>>>
>>> I'm first trying to address the case where event configuration isn't supported
>>> as we haven't currently got support for that in the MPAM driver and supporting
>>> systems with fewer hardware counters than (PARTID, PMG) without unnecessary
>>> limiting the exposed PARTID/PMG. Some MPAM hardware systems only have a single
>>> bandwidth counter.
>>
>> ok, but if assignment is supported first then that assignment needs to have some
>> configuration associated with it as to which memory transactions are counted, no?
>>
>> If I understand correctly MPAM would have hardcoded events (hardcoded which
>> transactions are counted by current default mbm_total_bytes and/or mbm_local_bytes).
>> The memory transactions that the event counts can be exposed in the
>> associated event_filter file. User space can use the per-resource group
>> mbm_L3_assignments file to assign the event to the resource group that will
>> result in counter allocated and programmed to count those transactions for
>> that resource group.
> 
> That's right.
> 
> Just wondering, why aren't there <telemetry>_L3_assignments files to
> allow keeping all CTRL_MON/MON groups when there are, as pointed out in
> the Documentation/filesystems/resctrl.rst, platforms with limited
> telemetry counters?

Apologies but I do not understand this question. Are you referring to the new
telemetry monitoring feature? 

>> With only one bandwidth counter it will only be possible to assign one event to
>> one resource group at a time.
> 
> yes
> 
>>
>>>
>>>> display the existing event configuration and if user cannot modify it return
>>>> a failure when user attempts to do so?
>>>
>>> I guess it could. Currently in MPAM we just support mbm_total_bytes and so it
>>> would always be 0x1F. Would we want some other way to indicate that it is fixed
>>> rather than trying to change it? However, if we just remove the configuration
>>
>> event_filter could also be made readable only. This sounds temporary though and at
>> some point the MPAM driver will support event configuration. I thus do not think
>> resctrl fs should make big interface changes to accommodate this.
> 
> For MPAM, some of the bits will always be unchangeable as they depend on
> which hardware counter rather than the configuration of that hardware
> counter. Also, not all MPAM hardware supports the read/write configuration.

This seems to tie in with the discussion in the other thread on how values for
mbm_total_bytes and mbm_local_bytes are computed. It almost sounds as though there
is just one bandwidth measurement returned from the various counters on the system
and that the MPAM driver uses system knowledge to accumulate the needed values and
display them to user space as these separate events.

>>> files then it seems natural for mbm_total_bytes to just have the same meaning as
>>> it has when BMEC and ABMC are not enabled.
>>
>> I do not see a reason to remove the configuration file. If MPAM just supports
>> mbm_total_bytes then user space can still look at its event_filter to see which
>> transactions are counted by it.  I am not sure about the 0x1F you mention (was
>> expecting 0x7F), but below is how it would appear:
> 
> 0x1F was a mistake, it is indeed 0x7F.
> 
>>
>>   # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>>   local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
>>   local_reads_slow_memory
> 
> Ok, it's not a natural description for MPAM bandwidth counters but I
> think we can live with that.

What are the MPAM descriptions?

Reinette


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-04 17:02           ` Reinette Chatre
@ 2026-03-04 17:37             ` Ben Horgan
  2026-03-04 19:23               ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-03-04 17:37 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On 3/4/26 17:02, Reinette Chatre wrote:
> Hi Ben,
> 
> On 3/4/26 3:07 AM, Ben Horgan wrote:
>> Hi Reinette,
>>
>> On 3/3/26 18:09, Reinette Chatre wrote:
>>> Hi Ben,
>>>
>>> On 3/3/26 4:29 AM, Ben Horgan wrote:
>>>> Hi Reinette,
>>>>
>>>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>>>> Hi Ben,
>>>>>
>>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>>>>> bandwidth types a counter tracks. Currently
>>>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>>>> supported.
>>>>>>
>>>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>>>> assignment which reduces the number of hardware monitors a system
>>>>>> requires. It is an architectural detail that ABMC provides counter
>>>>>
>>>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>>>
>>>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>>>> these two features are independent and the bandwidth types are limited to a
>>>>>> choice of only read or write.
>>>>>
>>>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>>>> a feature that allows user to configure events globally per domain and another
>>>>> feature that allows user to configure events per PMG?
>>>>
>>>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>>>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>>>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>>>> or both.
>>>
>>> Thank you for confirming.
>>>
>>> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
>>>
>>>
>>>>> These different features are how I understand assignable counters and BMEC to
>>>>
>>>> We are each approaching this from a different view point. I've just been looking at
>>>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>>>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>>>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>>>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
>>>
>>> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
>>> counter mode the counter assignment is per monitoring group AND event as a pair:
>>> (CTRL_MON/MON group, event).
>>
>> Yes but these counters aren't necessarily fungible. For MPAM the
>> mbm_local_bytes and mbm_total_bytes are necessarily backed by different
>> hardware counters. An MPAM bandwidth counter just counts all traffic on
>> a link with the only configurability being for read/write. The counters
>> are just placed at different point in the topology to get the different
>> events.
> 
> The distinction between "different hardware counters for mbm_local_bytes and
> mbm_total_bytes" and "The counters are just placed at different point in the
> topology" is not clear to me. The former implies different counters for the
> two events while the latter implies the same counters are used for both events
> but perhaps accumulated/displayed differently?

For a given RIS (the MPAM hardware unit of which an MSC consists of
one or more), there are MPAMF_MBWUMON_IDR.NUM_MON hardware bandwidth
counters which measure traffic passing a specific point with no
filtering for where it's going. The filtering of these counters is
set up in MSMON_CFG_MBWU_FLT which only allows pmg/partid/(read/write).
Whether these count traffic that leaves the local numa node or
only traffic that's internal to the numa node is a h/w design time (or
perhaps f/w) decision and so the mbm_local_bytes/mbm_total_bytes
distinction is a property of the RIS/MSC.

By different counters I'm referring to different RIS and by "different
places in the topology" I'm referring to the design decision of where
you put those RIS.
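
As a rough illustration of the filtering described above, here is a minimal
sketch. The struct and function names are hypothetical and this is neither
the MPAM driver's code nor the MSMON_CFG_MBWU_FLT register layout. It only
models the point that a counter can match on PARTID, PMG and read/write,
and cannot filter on where the traffic is going:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one bandwidth monitor's filter. */
enum rw_filter { RW_BOTH, RW_READ_ONLY, RW_WRITE_ONLY };

struct mbwu_filter {
	uint16_t partid;
	uint8_t pmg;
	enum rw_filter rw;
};

/* Hypothetical description of one transaction seen at the monitor. */
struct transaction {
	uint16_t partid;
	uint8_t pmg;
	bool is_write;
	bool leaves_node;	/* known to the system, invisible to the filter */
	uint32_t bytes;
};

/* A transaction matches on (PARTID, PMG) and read/write only. */
static bool mbwu_matches(const struct mbwu_filter *f,
			 const struct transaction *t)
{
	if (t->partid != f->partid || t->pmg != f->pmg)
		return false;
	if (f->rw == RW_READ_ONLY && t->is_write)
		return false;
	if (f->rw == RW_WRITE_ONLY && !t->is_write)
		return false;
	return true;
}

/* Sum the bytes of all matching transactions; leaves_node is never used. */
static uint64_t mbwu_count(const struct mbwu_filter *f,
			   const struct transaction *ts, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (mbwu_matches(f, &ts[i]))
			total += ts[i].bytes;
	return total;
}
```

In this model, whether a count is "local" or "total" depends only on which
transactions reach the monitor, which is the placement decision mentioned
above.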

> 
> I re-read the thread starting with 
> https://lore.kernel.org/lkml/CALPaoCh+mRLJEfhKBve3hRf+vHHoObjvWRt74OfpopgtR9g9FQ@mail.gmail.com/
> and it sounded to me as though MPAM would only expose the mbm_total_bytes event.

That's the case initially, but only due to current hardware support
and what can be described by ACPI at the moment.

> 
> Ignoring for a moment that counters could be configured to count different
> transactions, so assuming all counters count the same transactions. Could you
> please clarify how MPAM determines the counts returned by the
> mbm_local_bytes and mbm_total_bytes respectively?
> 
>>>> as they need a fixed (PARTID, PMG) configuration to avoid missing counts.
>>>
>>> It is not clear to me how sharing counters are at play here.
>>
>> I was just saying it wasn't possible for bandwidth counters. For
>> llc_occupancy, CSU in MPAM, you can share 'counters' as they can just
>> recount to get the current cache occupancy.
> 
> ack.
> 
>>>> The intent of this patch is to allow splitting these two features of ABMC,
>>>> bandwidth type configuration and hardware counter assignment in order to just
>>>
>>> Why keep BMEC, which by its name does event configuration? And then on top
>>> of that it is event configuration at a scope that MPAM does not support?
>>>
>>>> support the hardware counter assignment.
>>>>
>>>> I'm still not understanding the distinction you are making though.
>>>> The files are,
>>>> With ABMC:
>>>> info/L3_MON/event_configs/mbm_[local,total]_bytes/event_filter
>>>
>>> This is an event configuration that is global without any assignment. This
>>> interface communicates to user space which transactions are counted when
>>> this particular event is assigned to a CTRL_MON/MON group. This interface
>>> is intended to be extensible. The interface starts with the original mbm_local_bytes
>>> and mbm_total_bytes events in order to be backward compatible. The vision is that
>>> if the user prefers to count different transactions then they could create
>>> a new event with the transactions needing counting. For example,
>>>
>>>   # mkdir /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow
>>>   # echo local_reads_slow_memory > /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow/event_filter
>>>
>>> The events are just tracked and managed in software with the above interface,
>>> no hardware configuration is involved at this point in the above example*.
>>>
>>> The new "just_local_slow" can then be assigned to a monitor group via
>>> mbm_L3_assignments that will at that time consume one hardware counter and
>>> program it with the event (which transactions to monitor) and monitor group
>>> details (PARTID, PMG).
>>>
>>> This is based on original suggestion by Peter in a way that we thus expect to
>>> work for customers. See [1].
>>>
>>>> and with BMEC they are:
>>>> info/L3_MON/mbm_[local,total]_bytes_config
>>
>> I see this makes the intent much clearer to me. Thanks for sharing this
>> plan. I think the general idea is good. To me this implies that for MPAM
>> to support event configuration we'd want ABMC enabled at the same time.
>> Which indeed makes sense as then you can then count read and write
>> separately for a given CTRL_MON/MON group without requiring twice the
>> number of hardware counters.
>>
>> However, I now spot an existing issue, bundling mbm_local_bytes and
>> mbm_total_bytes together for one pool of counters doesn't work for MPAM.
>> As noted above they require different sets of hardware counters. With
>> the current counter assignment mode interface the num_mbm_cntrs is
>> scoped to all mbm counters. In an MPAM system that supports both
>> mbm_local_bytes and mbm_total_bytes this could lead to
>> num_mbm_total_cntrs and a num_mbm_local_cntrs or something equivalent.
> 
> Is this just needed because MPAM driver does not support counter configuration
> yet?

No. As I've hopefully managed to explain a bit better above these
necessarily come from different pools of counters.

> 
>>> This is essentially both an event configuration and assignment that is not
>>> compatible with assignable counters. With this interface the user
>>> both configures which transactions are counted by a particular event and
>>> programs all counters in a domain (across all resource groups) to use that
>>> particular configuration. Due to this incompatibility resctrl fs will not expose
>>> BMEC files when assignable counters are enabled.
>>>
>>>
>>>> In both cases they have allow configuration for two event types,
>>>> mbm_local_bytes, and mbm_total_bytes. What am I missing?
>>>
>>> The way I see it:
>>> BMEC: per domain across all resource groups event configuration and assignment that
>>>       applies to all counters - intended to support the "default" mode where there
>>>       is no counter assignment from user space.
>>> assignable counters: event configuration via event_filter with assignment done
>>>                      separately using per resource group mbm_L3_assignments file
>>
>> Make sense.
>>
>>>
>>>>
>>>>> be and to support both at the same time requires a user interface that is
>>>>> confusing since the user can concurrently configure events globally per-domain
>>>>> and per resource group.
>>>>
>>>> Sure.
>>>>
>>>>>
>>>>> Could you please elaborate how event configuration works on MPAM? I find this
>>>>> series quite cryptic. I think it will help if you could elaborate what MPAM
>>>>> capabilities are and how you expect resctrl fs to expose these features to
>>>>> an MPAM user and how said user is expected to interact with resctrl fs to use
>>>>> the features.
>>>>
>>>> Ok, firstly regarding hardware counter assignment, on MPAM systems with more
>>>> (PARTID, PMG) pairs than bandwidth hardware counters we'd like to expose the
>>>> mbm_L3_assignments for tracking which CTRL_MON/MON groups have bandwidth
>>>> counting events and otherwise not.
>>>
>>> ok. This sounds like assignable counters to me. I do not believe BMEC comes
>>> into play.
>>>
>>>>
>>>> I haven't put much thought into how we would support event configuration with
>>>> MPAM but we would want something that allows the configuration per hardware
>>>> counter or (PARTID, PMG) pair. I'd rather not commit to the existing interface
>>>
>>> This is what assignable counters already does, no?
>>
>> Isn't that only with the future plan you shared above?
> 
> Assigning a counter to a (PARTID, PMG) pair is what assignable counters does
> today.

Yes, but isn't it the case that currently, once you've chosen the
configuration for mbm_local_bytes and for mbm_total_bytes, each hardware
event is tied to one of those two configurations? The future work will
allow the user to construct custom named events to give more general
event configuration where there can be more than 2 different
configurations at once. (Where I'm using configuration to mean selecting
which of the resctrl/bmec/abmc list of bandwidth types are used.)
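
To make "configuration" concrete, here is a minimal sketch that turns an
event configuration bitmask into the comma-separated names shown by
event_filter. The bit-to-name order is an assumption taken from the memory
transaction table in the resctrl documentation, and decode_event_filter()
is a hypothetical helper, not a resctrl function:

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed bit order (bit 0 first), following the resctrl documentation. */
static const char *const filter_names[] = {
	"local_reads",
	"remote_reads",
	"local_non_temporal_writes",
	"remote_non_temporal_writes",
	"local_reads_slow_memory",
	"remote_reads_slow_memory",
	"dirty_victim_writes_all",
};

#define NUM_FILTER_NAMES (sizeof(filter_names) / sizeof(filter_names[0]))

/*
 * Write the names of the bits set in mask into buf, separated by commas.
 * Returns the number of characters written, not counting the terminator.
 * Names that do not fit in buf are dropped.
 */
static size_t decode_event_filter(unsigned int mask, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len)
		buf[0] = '\0';

	for (i = 0; i < NUM_FILTER_NAMES; i++) {
		int n;

		if (!(mask & (1u << i)))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "," : "", filter_names[i]);
		if (n < 0 || (size_t)n >= len - used) {
			/* Did not fit: cut back to the last complete name. */
			if (used < len)
				buf[used] = '\0';
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

With this mapping 0x7F selects all seven transaction types and 0x15 selects
the three local ones.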

> 
> Reinette

Thanks,

Ben


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable
  2026-03-04 17:03           ` Reinette Chatre
@ 2026-03-04 17:53             ` Ben Horgan
  0 siblings, 0 replies; 23+ messages in thread
From: Ben Horgan @ 2026-03-04 17:53 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On 3/4/26 17:03, Reinette Chatre wrote:
> Hi Ben,
> 
> On 3/4/26 3:31 AM, Ben Horgan wrote:
>> Hi Reinette,
>>
>> On 3/3/26 18:14, Reinette Chatre wrote:
>>> Hi Ben,
>>>
>>> On 3/3/26 6:00 AM, Ben Horgan wrote:
>>>> Hi Reinette,
>>>>
>>>> On Mon, Mar 02, 2026 at 03:12:52PM -0800, Reinette Chatre wrote:
>>>>> Hi Ben,
>>>>>
>>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>>> When the counter assignment mode is mbm_event resctrl assumes the mbm
>>>>>> events are configurable and exposes the 'event_filter' files. These files
>>>>>> live at info/L3_MON/event_configs/<event>/event_filter and are used to
>>>>>> display and set the event configuration.
>>>>>>
>>>>>> ABMC always supports event configuration but for MPAM they are
>>>>>> independent. Decouple event configuration from counter assignment by only
>>>>>
>>>>> Could you please elaborate what you mean with "independent" here? If event
>>>>> configuration is still supported when assignable counter mode is enabled, why
>>>>> can event configuration interface not just remain as-is? Could resctrl not
>>>>
>>>> The two features of ABMC that I'm claiming are independent are: firstly,
>>>> requiring assignment of a hardware counter to a CTRL_MON/MON group in order to
>>>> allow using bandwidth monitoring when there are fewer hardware counters than
>>>> possible CTRL_MON/MON groups (num_rmid) and secondly bandwidth type
>>>> configuration for the counters.
>>>
>>> These are implemented separately in resctrl fs.
>>>
>>>
>>>> The first is concerned with which, if any, hardware counter is used per group
>>>> and the second with what the counters are counting. To me these appear as two
>>>> things that should be considered separately. Is this clearer?
>>>
>>> They can, and indeed resctrl manages these two parts of assignable counters with
>>> two separate interfaces.
>>>
>>>>
>>>> I'm first trying to address the case where event configuration isn't supported
>>>> as we haven't currently got support for that in the MPAM driver and supporting
>>>> systems with fewer hardware counters than (PARTID, PMG) without unnecessary
>>>> limiting the exposed PARTID/PMG. Some MPAM hardware systems only have a single
>>>> bandwidth counter.
>>>
>>> ok, but if assignment is supported first then that assignment needs to have some
>>> configuration associated with it as to which memory transactions are counted, no?
>>>
>>> If I understand correctly MPAM would have hardcoded events (hardcoded which
>>> transactions are counted by current default mbm_total_bytes and/or mbm_local_bytes).
>>> The memory transactions that the event counts can be exposed in the
>>> associated event_filter file. User space can use the per-resource group
>>> mbm_L3_assignments file to assign the event to the resource group that will
>>> result in counter allocated and programmed to count those transactions for
>>> that resource group.
>>
>> That's right.
>>
>> Just wondering, why aren't there <telemetry>_L3_assignments files to
>> allow keeping all CTRL_MON/MON groups when there are, as pointed out in
>> the Documentation/filesystems/resctrl.rst, platforms with limited
>> telemetry counters?
> 
> Apologies but I do not understand this question. Are you referring to the new
> telemetry monitoring feature? 

Yes, it's not important, but I was just wondering if the telemetry
monitoring feature had the same counter scarcity problem that the
counter assignment feature addresses and if it was solved in the same
way. Although, I realise it may just be to do with the number of rmid
supported for telemetry.

> 
>>> With only one bandwidth counter it will only be possible to assign one event to
>>> one resource group at a time.
>>
>> yes
>>
>>>
>>>>
>>>>> display the existing event configuration and if user cannot modify it return
>>>>> a failure when user attempts to do so?
>>>>
>>>> I guess it could. Currently in MPAM we just support mbm_total_bytes and so it
>>>> would always be 0x1F. Would we want some other way to indicate that it is fixed
>>>> rather than trying to change it? However, if we just remove the configuration
>>>
>>> event_filter could also be made readable only. This sounds temporary though and at
>>> some point the MPAM driver will support event configuration. I thus do not think
>>> resctrl fs should make big interface changes to accommodate this.
>>
>> For MPAM, some of the bits will always be unchangeable as they depend on
>> which hardware counter rather than the configuration of that hardware
>> counter. Also, not all MPAM hardware supports the read/write configuration.
> 
> This seems to tie in with the discussion in the other thread on how values for
> mbm_total_bytes and mbm_local_bytes are computed. It almost sounds as though there
> is just one bandwidth measurement returned from the various counters on the system
> and that the MPAM driver uses system knowledge to accumulate the needed values and
> display them to user space as these separate events.

Let's concentrate on the other thread, but no. Each MSC.RIS counts a single
source and the driver only ever sums equivalent parts within a
component. (I guess the hardware counters could be implemented using
accumulation, but that's not our concern.)

> 
>>>> files then it seems natural for mbm_total_bytes to just have the same meaning as
>>>> it has when BMEC and ABMC are not enabled.
>>>
>>> I do not see a reason to remove the configuration file. If MPAM just supports
>>> mbm_total_bytes then user space can still look at its event_filter to see which
>>> transactions are counted by it.  I am not sure about the 0x1F you mention (was
>>> expecting 0x7F), but below is how it would appear:
>>
>> 0x1F was a mistake, it is indeed 0x7F.
>>
>>>
>>>   # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>>>   local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
>>>   local_reads_slow_memory
>>
>> Ok, it's not a natural description for MPAM bandwidth counters but I
>> think we can live with that.
> 
> What are the MPAM descriptions?

We mainly talk about the placement of the MSC rather than bandwidth
types. For mbm_total_bytes I'm just viewing that as all reads/writes
leaving the L3. The temporal/slow_memory detail is just additional
information. Still, it should be ok.

> 
> Reinette
> 

-- 
Thanks,

Ben


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-04 17:37             ` Ben Horgan
@ 2026-03-04 19:23               ` Reinette Chatre
  2026-03-04 21:01                 ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-04 19:23 UTC (permalink / raw)
  To: Ben Horgan
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 3/4/26 9:37 AM, Ben Horgan wrote:
> On 3/4/26 17:02, Reinette Chatre wrote:
>> On 3/4/26 3:07 AM, Ben Horgan wrote:
>>> On 3/3/26 18:09, Reinette Chatre wrote:
>>>> On 3/3/26 4:29 AM, Ben Horgan wrote:
>>>>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>>>>> Hi Ben,
>>>>>>
>>>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>>>>>> bandwidth types a counter tracks. Currently
>>>>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>>>>> supported.
>>>>>>>
>>>>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>>>>> assignment which reduces the number of hardware monitors a system
>>>>>>> requires. It is an architectural detail that ABMC provides counter
>>>>>>
>>>>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>>>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>>>>
>>>>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>>>>> these two features are independent and the bandwidth types are limited to a
>>>>>>> choice of only read or write.
>>>>>>
>>>>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>>>>> a feature that allows user to configure events globally per domain and another
>>>>>> feature that allows user to configure events per PMG?
>>>>>
>>>>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>>>>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>>>>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>>>>> or both.
>>>>
>>>> Thank you for confirming.
>>>>
>>>> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
>>>>
>>>>
>>>>>> These different features are how I understand assignable counters and BMEC to
>>>>>
>>>>> We are each approaching this from a different view point. I've just been looking at
>>>>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>>>>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>>>>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>>>>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
>>>>
>>>> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
>>>> counter mode the counter assignment is per monitoring group AND event as a pair:
>>>> (CTRL_MON/MON group, event).
>>>
>>> Yes but these counters aren't necessarily fungible. For MPAM the
>>> mbm_local_bytes and mbm_total_bytes are necessarily backed by different
>>> hardware counters. An MPAM bandwidth counter just counts all traffic on
>>> a link with the only configurability being for read/write. The counters
>>> are just placed at different point in the topology to get the different
>>> events.
>>
>> The distinction between "different hardware counters for mbm_local_bytes and
>> mbm_total_bytes" and "The counters are just placed at different point in the
>> topology" is not clear to me. The former implies different counters for the
>> two events while the latter implies the same counters are used for both events
>> but perhaps accumulated/displayed differently?
> 
> For a given RIS, mpam device hardware unit of which an MSC may consist
> of 1 or more, there are MPAMF_MBWUMON_IDR.NUM_MON hardware bandwidth
> counters which measure traffic passing a specific point with no
> filtering for where it's going. The filtering of this counter is
> set up in MSMON_CFG_MBWU_FLT which only allows pmg/partid/(read/write).

Thank you for the details. Is the expectation that user should be able to
program all these counters via resctrl? If an MSC consists of multiple RIS
with different counters then things get complicated very fast. Could it be
constrained to only expose the maximum number of counters supported by
all RIS at a particular scope? This would match what the existing
num_mbm_cntrs file supports.
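
One reading of "the maximum number of counters supported by all RIS" is the
smallest NUM_MON among the RIS at that scope. A minimal sketch of that
reading follows. The ris_info struct and usable_mbm_cntrs() are hypothetical
names for illustration, not existing MPAM driver or resctrl code:

```c
#include <stddef.h>

/* Hypothetical per-RIS summary; num_mon mirrors MPAMF_MBWUMON_IDR.NUM_MON. */
struct ris_info {
	unsigned int num_mon;
};

/*
 * Largest counter count that every RIS in the array can provide, which is
 * the minimum of their num_mon values. Returns 0 for an empty array.
 */
static unsigned int usable_mbm_cntrs(const struct ris_info *ris, size_t n)
{
	unsigned int min = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (i == 0 || ris[i].num_mon < min)
			min = ris[i].num_mon;
	return min;
}
```

For three RIS with 4, 1 and 8 monitors this yields 1, which would be the
single value reported to user space.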


> Whether or not these count traffic that leaves the local numa node or
> only traffic that's internal to the numa node is a h/w design time (or
> perhaps f/w) decision and so the mbm_local_bytes/mbm_total_bytes
> distinction is a property of the RIS/MSC.

mbm_local_bytes and mbm_total_bytes are already established as counting
external bandwidth. Specifically, mbm_local_bytes counting L3 external
bandwidth satisfied by the local memory.
Do you have insight into what these systems will actually end up being
programmed with? It is difficult to reason with so many hypotheticals.
I wonder if it may not be simpler to expose a new unique event for the
internal numbers? Could initial work be constrained to fit into
existing definitions and then build from there?


> By different counters I'm referring to different RIS and by "different
> places in the topology" I'm referring to the design decision of where
> you put those RIS.

resctrl is very much focused on monitoring external memory bandwidth at L3 scope.
Monitoring memory bandwidth at different scopes still needs to be supported.
This sounds related to the work Fenghua is planning? RISC-V also seems
to have requirements around monitoring at different scope. Also, for
reference, https://lore.kernel.org/lkml/fb1e2686-237b-4536-acd6-15159abafcba@intel.com/

Could we start by seeing how MPAM parts that support monitoring of external bandwidth
at L3 scope can be supported, evaluate what is missing, and work from there? 

>> I re-read the thread starting with 
>> https://lore.kernel.org/lkml/CALPaoCh+mRLJEfhKBve3hRf+vHHoObjvWRt74OfpopgtR9g9FQ@mail.gmail.com/
>> and it sounded to me as though MPAM would only expose the mbm_total_bytes event.
> 
> That's the case initially but is only due to current hardware support
> and what can be described by acpi at the moment.

I am becoming lost here. Are we discussing adding features to resctrl to support
changes to ACPI that are currently under discussion for hardware that may or
may not be built on what those ACPI descriptions may look like? This all sounds
too hypothetical to seriously consider changes to resctrl at this time.

>> Ignoring for a moment that counters could be configured to count different
>> transactions, so assuming all counters count the same transactions. Could you
>> please clarify how MPAM determines the counts returned by the
>> mbm_local_bytes and mbm_total_bytes respectively?
>>
>>>>> as they need a fixed (PARTID, PMG) configuration to avoid missing counts.
>>>>
>>>> It is not clear to me how sharing counters are at play here.
>>>
>>> I was just saying it wasn't possible for bandwidth counters. For
>>> llc_occupancy, CSU in MPAM, you can share 'counters' as they can just
>>> recount to get the current cache occupancy.
>>
>> ack.
>>
>>>>> The intent of this patch is to allow splitting these two features of ABMC,
>>>>> bandwidth type configuration and hardware counter assignment in order to just
>>>>
>>>> Why keep BMEC, which by its name does event configuration? And then on top
>>>> of that it is event configuration at a scope that MPAM does not support?
>>>>
>>>>> support the hardware counter assignment.
>>>>>
>>>>> I'm still not understanding the distinction you are making though.
>>>>> The files are,
>>>>> With ABMC:
>>>>> info/L3_MON/event_configs/mbm_[local,total]_bytes/event_filter
>>>>
>>>> This is an event configuration that is global without any assignment. This
>>>> interface communicates to user space which transactions are counted when
>>>> this particular event is assigned to a CTRL_MON/MON group. This interface
>>>> is intended to be extensible. The interface starts with the original mbm_local_bytes
>>>> and mbm_total_bytes events in order to be backward compatible. The vision is that
>>>> if the user prefers to count different transactions then they could create
>>>> a new event with the transactions needing counting. For example,
>>>>
>>>>   # mkdir /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow
>>>>   # echo local_reads_slow_memory > /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow/event_filter
>>>>
>>>> The events are just tracked and managed in software with the above interface,
>>>> no hardware configuration is involved at this point in the above example*.
>>>>
>>>> The new "just_local_slow" can then be assigned to a monitor group via
>>>> mbm_L3_assignments that will at that time consume one hardware counter and
>>>> program it with the event (which transactions to monitor) and monitor group
>>>> details (PARTID, PMG).
>>>>
>>>> This is based on original suggestion by Peter in a way that we thus expect to
>>>> work for customers. See [1].
>>>>
>>>>> and with BMEC they are:
>>>>> info/L3_MON/mbm_[local,total]_bytes_config
>>>
>>> I see this makes the intent much clearer to me. Thanks for sharing this
>>> plan. I think the general idea is good. To me this implies that for MPAM
>>> to support event configuration we'd want ABMC enabled at the same time.
>>> Which indeed makes sense as then you can then count read and write
>>> separately for a given CTRL_MON/MON group without requiring twice the
>>> number of hardware counters.
>>>
>>> However, I now spot an existing issue, bundling mbm_local_bytes and
>>> mbm_total_bytes together for one pool of counters doesn't work for MPAM.
>>> As noted above they require different sets of hardware counters. With
>>> the current counter assignment mode interface the num_mbm_cntrs is
>>> scoped to all mbm counters. In an MPAM system that supports both
>>> mbm_local_bytes and mbm_total_bytes this could lead to
>>> num_mbm_total_cntrs and a num_mbm_local_cntrs or something equivalent.
>>
>> Is this just needed because MPAM driver does not support counter configuration
>> yet?
> 
> No. As I've hopefully managed to explain a bit better above these
> necessarily come from different pools of counters.

It sounds like the "different pools" may be managed separately based on scope
and if there are different "internal" vs "external" capabilities of these counters
then indeed they need to be assigned based on the type of the event. Do you have more
details about these systems? If the "internal" vs "external" distinction is
tied to the scope then resctrl may have a clear path to support this.

>>>> This is essentially both an event configuration and assignment that is not
>>>> compatible with assignable counters. With this interface the user
>>>> both configures which transactions are counted by a particular event and
>>>> programs all counters in a domain (across all resource groups) to use that
>>>> particular configuration. Due to this incompatibility resctrl fs will not expose
>>>> BMEC files when assignable counters are enabled.
>>>>
>>>>
>>>>> In both cases they have allow configuration for two event types,
>>>>> mbm_local_bytes, and mbm_total_bytes. What am I missing?
>>>>
>>>> The way I see it:
>>>> BMEC: per domain across all resource groups event configuration and assignment that
>>>>       applies to all counters - intended to support the "default" mode where there
>>>>       is no counter assignment from user space.
>>>> assignable counters: event configuration via event_filter with assignment done
>>>>                      separately using per resource group mbm_L3_assignments file
>>>
>>> Make sense.
>>>
>>>>
>>>>>
>>>>>> be and to support both at the same time requires a user interface that is
>>>>>> confusing since the user can concurrently configure events globally per-domain
>>>>>> and per resource group.
>>>>>
>>>>> Sure.
>>>>>
>>>>>>
>>>>>> Could you please elaborate how event configuration work on MPAM? If find this
>>>>>> series quite cryptic. I think it will help if you could elaborate what MPAM
>>>>>> capabilities are and how you expect resctrl fs to expose these features to
>>>>>> an MPAM user and how said used is expected to interact with resctrl fs to use
>>>>>> the features.
>>>>>
>>>>> Ok, firstly regarding hardware counter assignment, on MPAM systems with more
>>>>> (PARTID, PMG) pairs than bandwidth hardware counters we'd like to expose the
>>>>> mbm_L3_assignments for tracking which CTRL_MON/MON groups have bandwidth
>>>>> counting events and otherwise not.
>>>>
>>>> ok. This sounds like assignable counters to me. I do not believe BMEC comes
>>>> into play.
>>>>
>>>>>
>>>>> I haven't put much thought into how we would support event configuration with
>>>>> MPAM but we would want something that allows the configuration per hardware
>>>>> counter or (PARTID, PMG) pair. I'd rather not commit to the existing interface
>>>>
>>>> This is what assignable counters already does, no?
>>>
>>> Isn't that only with the future plan you shared above?
>>
>> Assigning a counter to a (PARTID, PMG) pair is what assignable counters does
>> today.
> 
> Yes, but isn't it the case that currently, once you've chosen the
> configuration for mbm_local_bytes and for mbm_total_bytes, each hardware
> event is tied to one of those two configurations? The future work will
> allow the user to construct custom named events to give more general
> event configuration where there can be more than 2 different
> configurations at once. (Where I'm using configuration to mean selecting
> which of the resctrl/bmec/abmc list of bandwidth types are used.)

Right.

Reinette



* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-04 19:23               ` Reinette Chatre
@ 2026-03-04 21:01                 ` Ben Horgan
  2026-03-04 22:50                   ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-03-04 21:01 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On 3/4/26 19:23, Reinette Chatre wrote:
> Hi Ben,
> 
> On 3/4/26 9:37 AM, Ben Horgan wrote:
>> On 3/4/26 17:02, Reinette Chatre wrote:
>>> On 3/4/26 3:07 AM, Ben Horgan wrote:
>>>> On 3/3/26 18:09, Reinette Chatre wrote:
>>>>> On 3/3/26 4:29 AM, Ben Horgan wrote:
>>>>>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>>>>>> Hi Ben,
>>>>>>>
>>>>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>>>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>>>>>>> bandwidth types a counter tracks. Currently
>>>>>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>>>>>> supported.
>>>>>>>>
>>>>>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>>>>>> assignment which reduces the number of hardware monitors a system
>>>>>>>> requires. It is an architectural detail that ABMC provides counter
>>>>>>>
>>>>>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>>>>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>>>>>
>>>>>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>>>>>> these two features are independent and the bandwidth types are limited to a
>>>>>>>> choice of only read or write.
>>>>>>>
>>>>>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>>>>>> a feature that allows user to configure events globally per domain and another
>>>>>>> feature that allows user to configure events per PMG?
>>>>>>
>>>>>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>>>>>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>>>>>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>>>>>> or both.
>>>>>
>>>>> Thank you for confirming.
>>>>>
>>>>> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
>>>>>
>>>>>
>>>>>>> These different features are how I understand assignable counters and BMEC to
>>>>>>
>>>>>> We are each approaching this from a different view point. I've just been looking at
>>>>>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>>>>>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>>>>>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>>>>>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
>>>>>
>>>>> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
>>>>> counter mode the counter assignment is per monitoring group AND event as a pair:
>>>>> (CTRL_MON/MON group, event).
>>>>
>>>> Yes but these counters aren't necessarily fungible. For MPAM the
>>>> mbm_local_bytes and mbm_total_bytes are necessarily backed by different
>>>> hardware counters. A MPAM bandwidth counters just counts all traffic on
>>>> a link with the only configurability being for read/write. The counters
>>>> are just placed at different point in the topology to get the different
>>>> events.
>>>
>>> The distinction between "different hardware counters for mbm_local_bytes and
>>> mbm_total_bytes" and "The counters are just placed at different point in the
>>> topology" is not clear to me". The former implies different counters for the
>>> two events while the latter implies the same counters are used for both events
>>> but perhaps accumulated/displayed differently?
>>
>> For a given RIS, mpam device hardware unit of which an MSC may consist
>> of 1 or more, there are MPAMF_MBWUMON_IDR.NUM_MON hardware bandwidth
>> counters which measure traffic passing a specific point with no
>> filtering for where it's going. The filtering of this counter is
>> set up in MSMON_CFG_MBWU_FLT which only allows pmg/partid/(read/write).
> 
> Thank you for the details. Is the expectation that user should be able to
> program all these counters via resctrl? If an MSC consists of multiple RIS
> with different counters then things get complicated very fast. Could it be
> constrained to only expose the maximum number of counters supported by
> all RIS at a particular scope? This would match what the existing
> num_mbm_cntrs file supports.

Not individually, no. There will generally just be one per cache slice or
memory controller, and they will all be programmed together as a component.
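
To illustrate the constraint raised above, here is a minimal sketch (Python,
for illustration only). It assumes that a component programmed as a whole can
only expose as many counters as its least-capable RIS provides;
`usable_mbm_cntrs` and the per-RIS counts are hypothetical and not taken from
the MPAM driver.

```python
def usable_mbm_cntrs(num_mon_per_ris):
    """Return how many bandwidth counters a component could expose.

    num_mon_per_ris: one NUM_MON-style value per RIS in the component.
    Assumption: every RIS is programmed identically, so the RIS with the
    fewest monitors bounds what can be offered via num_mbm_cntrs.
    """
    counts = list(num_mon_per_ris)
    if not counts:
        return 0
    return min(counts)


# Hypothetical component made of four RIS, one of which has fewer monitors.
print(usable_mbm_cntrs([16, 16, 8, 16]))
```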

> 
> 
>> Whether or not these count traffic that leaves the local numa node or
>> only traffic that's internal to the numa node is a h/w design time (or
>> perhaps f/w) decision and so the mbm_local_bytes/mbm_total_bytes
>> distinction is a property of the RIS/MSC.
> 
> mbm_local_bytes and mbm_total_bytes are already established as counting
> external bandwidth. Specifically, mbm_local_bytes counting L3 external
> bandwidth satisfied by the local memory.
> Do you have insight into what these systems will actually end up being
> programmed with? It is difficult to reason with so many hypotheticals.
> I wonder if it may not be simpler to expose a new unique event for the
> internal numbers? Could initial work be constrained to fit into
> existing definitions and then build from there?

Yes, we can assume mpam just supports mbm_total_bytes for the moment.

> 
> 
>> By different counters I'm referring to different RIS and by "different
>> places in the topology" I'm referring to the design decision of where
>> you put those RIS.
> 
> resctrl is very much focused on monitoring external memory bandwidth at L3 scope.
> Monitoring memory bandwidth at different scopes still needs to be supported.
> This sounds related to the work Fenghua is planning? RISC-V also seems
> to have requirements around monitoring at different scope. Also, for
> reference, https://lore.kernel.org/lkml/fb1e2686-237b-4536-acd6-15159abafcba@intel.com/
> 
> Could we start by seeing how MPAM parts that support monitoring of external bandwidth
> at L3 scope can be supported, evaluate what is missing, and work from there? 

Yes.

> 
>>> I re-read the thread starting with 
>>> https://lore.kernel.org/lkml/CALPaoCh+mRLJEfhKBve3hRf+vHHoObjvWRt74OfpopgtR9g9FQ@mail.gmail.com/
>>> and it sounded to me as though MPAM would only expose the mbm_total_bytes event.
>>
>> That's the case initially but is only due to current hardware support
>> and what can be described by acpi at the moment.
> 
> I am becoming lost here. Are we discussing adding features to resctrl to support
> changes to ACPI that are currently under discussion for hardware that may or
> may not be built on what those ACPI descriptions may look like? This all sounds
> too hypothetical to seriously consider changes to resctrl at this time.

Sorry, yes. I was just thinking about not constraining what is
architecturally possible, but we don't need to go there.

> 
>>> Ignoring for a moment that counters could be configured to count different
>>> transactions, so assuming all counters count the same transactions. Could you
>>> please clarify how MPAM determines the counts returned by the
>>> mbm_local_bytes and mbm_total_bytes respectively?
>>>
>>>>>> as they need a fixed (PARTID, PMG) configuration to avoid missing counts.
>>>>>
>>>>> It is not clear to me how sharing counters are at play here.
>>>>
>>>> I was just saying it wasn't possible for bandwidth counters. For
>>>> llc_occupancy, CSU in MPAM, you can share 'counters' as they can just
>>>> recount to get the current cache occupancy.
>>>
>>> ack.
>>>
>>>>>> The intent of this patch is to allow splitting these two features of ABMC,
>>>>>> bandwidth type configuration and hardware counter assignment in order to just
>>>>>
>>>>> Why keep BMEC which is by its name does event configuration? And then on top
>>>>> of that it is event configuration at a scope that MPAM does not support?
>>>>>
>>>>>> support the hardware counter assignment.
>>>>>>
>>>>>> I'm still not understanding the distinction you are making though.
>>>>>> The files are,
>>>>>> With ABMC:
>>>>>> info/L3_MON/event_configs/mbm_[local,total]_bytes/event_filter
>>>>>
>>>>> This is an event configuration that is global without any assignment. This
>>>>> interface communicates to user space which transactions are counted when
>>>>> this particular event is assigned to a CTRL_MON/MON group. This interface
>>>>> is intended to be extensible. The interface starts with the original mbm_local_bytes
>>>>> and mbm_total_bytes events in order to be backward compatible. The vision is that
>>>>> if the user prefers to count different transactions then they could create
>>>>> a new event with the transactions needing counting. For example,
>>>>>
>>>>>   # mkdir /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow
>>>>>   # echo local_reads_slow_memory > /sys/fs/resctrl/info/L3_MON/event_configs/just_local_slow/event_filter
>>>>>
>>>>> The events are just tracked and managed in software with the above interface,
>>>>> no hardware configuration is involved at this point in the above example*.
>>>>>
>>>>> The new "just_local_slow" can can then be assigned to a monitor group via
>>>>> mbm_L3_assignments that will at that time consume one hardware counter and
>>>>> program it with the event (which transactions to monitor) and monitor group
>>>>> details (PARTID, PMG).
>>>>>
>>>>> This is based on original suggestion by Peter in a way that we thus expect to
>>>>> work for customers. See [1].
>>>>>
>>>>>> and with BMEC they are:
>>>>>> info/L3_MON/mbm_[local,total]_bytes_config
>>>>
>>>> I see this makes the intent much clearer to me. Thanks for sharing this
>>>> plan. I think the general idea is good. To me this implies that for MPAM
>>>> to support event configuration we'd want ABMC enabled at the same time.
>>>> Which indeed makes sense as then you can then count read and write
>>>> separately for a given CTRL_MON/MON group without requiring twice the
>>>> number of hardware counters.
>>>>
>>>> However, I now spot an existing issue, bundling mbm_local_bytes and
>>>> mbm_total_bytes together for one pool of counters doesn't work for MPAM.
>>>> As noted above they require different sets of hardware counters. With
>>>> the current counter assignment mode interface the num_mbm_cntrs is
>>>> scoped to all mbm counters. In an MPAM system that supports both
>>>> mbm_local_bytes and mbm_total_bytes this could lead to
>>>> num_mbm_total_cntrs and a num_mbm_local_cntrs or something equivalent.
>>>
>>> Is this just needed because MPAM driver does not support counter configuration
>>> yet?
>>
>> No. As I've hopefully managed to explain a bit better above these
>> necessarily come from different pools of counters.
> 
> It sounds like the "different pools" may be managed separately based on scope
> and if there are different "internal" vs "external" capabilities of these counters
> then indeed they need to be assigned based on the type of the event. Do you have more
> details about these systems? If the "internal" vs "external" distinction is
> tied to the scope then resctrl may have a clear path to support this.

Not really, I think we are quite far away from this now.

> 
>>>>> This is essentially both an event configuration and assignment that is not
>>>>> compatible with assignable counters. With this interface the user
>>>>> both configures which transactions are counted by a particular event and
>>>>> programs all counters in a domain (across all resource groups) to use that
>>>>> particular configuration. Due to this incompatibility resctrl fs will not expose
>>>>> BMEC files when assignable counters are enabled.
>>>>>
>>>>>
>>>>>> In both cases they have allow configuration for two event types,
>>>>>> mbm_local_bytes, and mbm_total_bytes. What am I missing?
>>>>>
>>>>> The way I see it:
>>>>> BMEC: per domain across all resource groups event configuration and assignment that
>>>>>       applies to all counters - intended to support the "default" mode where there
>>>>>       is no counter assignment from user space.
>>>>> assignable counters: event configuration via event_filter with assignment done
>>>>>                      separately using per resource group mbm_L3_assignments file
>>>>
>>>> Make sense.
>>>>
>>>>>
>>>>>>
>>>>>>> be and to support both at the same time requires a user interface that is
>>>>>>> confusing since the user can concurrently configure events globally per-domain
>>>>>>> and per resource group.
>>>>>>
>>>>>> Sure.
>>>>>>
>>>>>>>
>>>>>>> Could you please elaborate how event configuration work on MPAM? If find this
>>>>>>> series quite cryptic. I think it will help if you could elaborate what MPAM
>>>>>>> capabilities are and how you expect resctrl fs to expose these features to
>>>>>>> an MPAM user and how said used is expected to interact with resctrl fs to use
>>>>>>> the features.
>>>>>>
>>>>>> Ok, firstly regarding hardware counter assignment, on MPAM systems with more
>>>>>> (PARTID, PMG) pairs than bandwidth hardware counters we'd like to expose the
>>>>>> mbm_L3_assignments for tracking which CTRL_MON/MON groups have bandwidth
>>>>>> counting events and otherwise not.
>>>>>
>>>>> ok. This sounds like assignable counters to me. I do not believe BMEC comes
>>>>> into play.
>>>>>
>>>>>>
>>>>>> I haven't put much thought into how we would support event configuration with
>>>>>> MPAM but we would want something that allows the configuration per hardware
>>>>>> counter or (PARTID, PMG) pair. I'd rather not commit to the existing interface
>>>>>
>>>>> This is what assignable counters already does, no?
>>>>
>>>> Isn't that only with the future plan you shared above?
>>>
>>> Assigning a counter to a (PARTID, PMG) pair is what assignable counters does
>>> today.
>>
>> Yes, but isn't it the case that currently, once you've chosen the
>> configuration for mbm_local_bytes and for mbm_total_bytes, each hardware
>> event is tied to one of those two configurations? The future work will
>> allow the user to construct custom named events to give more general
>> event configuration where there can be more than 2 different
>> configurations at once. (Where I'm using configuration to mean selecting
>> which of the resctrl/bmec/abmc list of bandwidth types are used.)
> 
> Right.
> 
> Reinette
>

So, to try and bring this back to what can be done now for MPAM to
fit into the counter assignment mode interface: just support
mbm_total_bytes and then num_mbm_cntrs is correct (nothing to do). Make
the event_filter file always display all the bandwidth types and make
that the only value that be the only value it accepts (instead of hiding
the event_filter file). If you agree I'll respin with that.
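
A minimal sketch of the behaviour proposed above, in Python for illustration
only. The names `ALL_TYPES`, `show_event_filter` and `write_event_filter` are
hypothetical, and the list of bandwidth types is a placeholder rather than
what resctrl would actually report.

```python
# Placeholder list: the real set of bandwidth types would come from resctrl.
ALL_TYPES = ("local_reads", "remote_reads", "local_reads_slow_memory")


def show_event_filter():
    """What reading event_filter would return: always every type."""
    return ",".join(ALL_TYPES)


def write_event_filter(buf):
    """Accept a write only if it names exactly the full set of types."""
    requested = {t for t in buf.strip().split(",") if t}
    if requested != set(ALL_TYPES):
        raise ValueError("only the full set of bandwidth types is accepted")
    return show_event_filter()
```

With this model, writing back whatever was read succeeds and anything else is
rejected, so the file is effectively not configurable.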

Thanks,

Ben



* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-04 21:01                 ` Ben Horgan
@ 2026-03-04 22:50                   ` Reinette Chatre
  2026-03-05 10:01                     ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-04 22:50 UTC (permalink / raw)
  To: Ben Horgan
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 3/4/26 1:01 PM, Ben Horgan wrote:
> Hi Reinette,
> 
> On 3/4/26 19:23, Reinette Chatre wrote:
>> Hi Ben,
>>
>> On 3/4/26 9:37 AM, Ben Horgan wrote:
>>> On 3/4/26 17:02, Reinette Chatre wrote:
>>>> On 3/4/26 3:07 AM, Ben Horgan wrote:
>>>>> On 3/3/26 18:09, Reinette Chatre wrote:
>>>>>> On 3/3/26 4:29 AM, Ben Horgan wrote:
>>>>>>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>>>>>>> Hi Ben,
>>>>>>>>
>>>>>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>>>>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>>>>>>>> bandwidth types a counter tracks. Currently
>>>>>>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>>>>>>> supported.
>>>>>>>>>
>>>>>>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>>>>>>> assignment which reduces the number of hardware monitors a system
>>>>>>>>> requires. It is an architectural detail that ABMC provides counter
>>>>>>>>
>>>>>>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>>>>>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>>>>>>
>>>>>>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>>>>>>> these two features are independent and the bandwidth types are limited to a
>>>>>>>>> choice of only read or write.
>>>>>>>>
>>>>>>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>>>>>>> a feature that allows user to configure events globally per domain and another
>>>>>>>> feature that allows user to configure events per PMG?
>>>>>>>
>>>>>>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>>>>>>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>>>>>>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>>>>>>> or both.
>>>>>>
>>>>>> Thank you for confirming.
>>>>>>
>>>>>> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
>>>>>>
>>>>>>
>>>>>>>> These different features are how I understand assignable counters and BMEC to
>>>>>>>
>>>>>>> We are each approaching this from a different view point. I've just been looking at
>>>>>>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>>>>>>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>>>>>>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>>>>>>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
>>>>>>
>>>>>> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
>>>>>> counter mode the counter assignment is per monitoring group AND event as a pair:
>>>>>> (CTRL_MON/MON group, event).
>>>>>
>>>>> Yes but these counters aren't necessarily fungible. For MPAM the
>>>>> mbm_local_bytes and mbm_total_bytes are necessarily backed by different
>>>>> hardware counters. A MPAM bandwidth counters just counts all traffic on
>>>>> a link with the only configurability being for read/write. The counters
>>>>> are just placed at different point in the topology to get the different
>>>>> events.
>>>>
>>>> The distinction between "different hardware counters for mbm_local_bytes and
>>>> mbm_total_bytes" and "The counters are just placed at different point in the
>>>> topology" is not clear to me". The former implies different counters for the
>>>> two events while the latter implies the same counters are used for both events
>>>> but perhaps accumulated/displayed differently?
>>>
>>> For a given RIS, mpam device hardware unit of which an MSC may consist
>>> of 1 or more, there are MPAMF_MBWUMON_IDR.NUM_MON hardware bandwidth
>>> counters which measure traffic passing a specific point with no
>>> filtering for where it's going. The filtering of this counter is
>>> set up in MSMON_CFG_MBWU_FLT which only allows pmg/partid/(read/write).
>>
>> Thank you for the details. Is the expectation that user should be able to
>> program all these counters via resctrl? If an MSC consists of multiple RIS
>> with different counters then things get complicated very fast. Could it be
>> constrained to only expose the maximum number of counters supported by
>> all RIS at a particular scope? This would match what the existing
>> num_mbm_cntrs file supports.
> 
> Not individually, no, they will generally just be one per cache slice or
> memory controller and all be programmed together as a component.

Is this where the risk of double counting comes in? That is, adding up the
memory bandwidth at the cache to the memory bandwidth at memory controller
for a total memory bandwidth count?

...

 
> So, to try and bring this back to what we can be done now for MPAM to
> fit into the counter mode assignment interface. Just support
> mbm_total_bytes and then num_mbm_cntrs is correct (nothing to do). Make
> the event_filter file always display all the bandwidth types and make
> that the only value that be the only value it accepts (instead of hiding
> the event_filter file). If you agree I'll respin with that.

From resctrl side this sounds fine. I don't have any insight into what, if any,
kind of gymnastics the MPAM driver needs to do to make the discovered MSCs with
their varying scope and internal vs external counts fit into this. If initial
implementation indeed forces some components into categories that are not a good
match then when resctrl later does get support for diverse components there may
be surprises to user space along the way. For example, user space may not see the
same memory bandwidth numbers reported by the same events on the same system as
the interface evolves.

"make that the only value that be the only value it accepts" - are you saying that
whatever is displayed when user views the "event_filter" file is what the
user can write to the "event_filter" file? I find this a challenging interface
for user space to use. The expectation is that the user can write any supported
memory transaction to that file and when writing fails it can only be because
of an invalid memory transaction. How can user space know that events are not
configurable at all? It sounds as though user space is expected to try configuring
the event with a memory transaction and then, presumably, check last_cmd_status?

Could this not be simplified by making the "event_filter" file read-only on
MPAM systems? 
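
As a sketch of what that check could look like from user space (Python,
illustrative only): the helper name is hypothetical, and a temporary file
stands in for info/L3_MON/event_configs/mbm_total_bytes/event_filter so the
example runs without resctrl mounted.

```python
import os
import stat
import tempfile


def event_filter_is_configurable(path):
    """Return True if any write permission bit is set on the file."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


# Stand-in file; on a real system the path would be under /sys/fs/resctrl.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "event_filter")
    open(path, "w").close()

    os.chmod(path, 0o444)  # read-only, as suggested for MPAM
    read_only_result = event_filter_is_configurable(path)

    os.chmod(path, 0o644)  # writable, as when events are configurable
    writable_result = event_filter_is_configurable(path)
```

The point is that the file mode alone tells user space whether events are
configurable, without attempting a write and parsing last_cmd_status.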

Reinette



* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-04 22:50                   ` Reinette Chatre
@ 2026-03-05 10:01                     ` Ben Horgan
  2026-03-05 17:22                       ` Reinette Chatre
  0 siblings, 1 reply; 23+ messages in thread
From: Ben Horgan @ 2026-03-05 10:01 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On 3/4/26 22:50, Reinette Chatre wrote:
> Hi Ben,
> 
> On 3/4/26 1:01 PM, Ben Horgan wrote:
>> Hi Reinette,
>>
>> On 3/4/26 19:23, Reinette Chatre wrote:
>>> Hi Ben,
>>>
>>> On 3/4/26 9:37 AM, Ben Horgan wrote:
>>>> On 3/4/26 17:02, Reinette Chatre wrote:
>>>>> On 3/4/26 3:07 AM, Ben Horgan wrote:
>>>>>> On 3/3/26 18:09, Reinette Chatre wrote:
>>>>>>> On 3/3/26 4:29 AM, Ben Horgan wrote:
>>>>>>>> On Mon, Mar 02, 2026 at 03:11:48PM -0800, Reinette Chatre wrote:
>>>>>>>>> Hi Ben,
>>>>>>>>>
>>>>>>>>> On 2/25/26 12:19 PM, Ben Horgan wrote:
>>>>>>>>>> The features BMEC and ABMC provide separate interfaces to configuring which
>>>>>>>>>> bandwidth types a counter tracks. Currently
>>>>>>>>>> resctrl_arch_is_evt_configurable() only ever returns true if BMEC is
>>>>>>>>>> supported.
>>>>>>>>>>
>>>>>>>>>> ABMC is useful even when BMEC is supported as it also provides counter
>>>>>>>>>> assignment which reduces the number of hardware monitors a system
>>>>>>>>>> requires. It is an architectural detail that ABMC provides counter
>>>>>>>>>
>>>>>>>>> Since the goal is to support MPAM I'd suggest that the first focus be on what
>>>>>>>>> resctrl fs supports and exposes and how it does or does not work for MPAM.
>>>>>>>>>
>>>>>>>>>> configurability without requiring the prior feature, BMEC. On MPAM systems
>>>>>>>>>> these two features are independent and the bandwidth types are limited to a
>>>>>>>>>> choice of only read or write.
>>>>>>>>>
>>>>>>>>> Does MPAM support exactly these two features? Specifically, does MPAM support
>>>>>>>>> a feature that allows user to configure events globally per domain and another
>>>>>>>>> feature that allows user to configure events per PMG?
>>>>>>>>
>>>>>>>> No, the bandwidth type configuration in MPAM is per counter and so effectively
>>>>>>>> per (PARTID, PMG) pair. In supporting hardware, the configuration is made in the
>>>>>>>> RWBW field of MSMON_CFG_MBWU_FLT and allows counting of just read, just write,
>>>>>>>> or both.
>>>>>>>
>>>>>>> Thank you for confirming.
>>>>>>>
>>>>>>> Since BMEC event configuration is per domain I do not believe BMEC is relevant to MPAM.
>>>>>>>
>>>>>>>
>>>>>>>>> These different features are how I understand assignable counters and BMEC to
>>>>>>>>
>>>>>>>> We are each approaching this from a different view point. I've just been looking at
>>>>>>>> ABMC as a way of dealing with systems where there are fewer hardware counters than
>>>>>>>> (PARTID, PMG) pairs (num_rmid) by requiring a counter to be assigned to a
>>>>>>>> CTRL_MON or MON group in order to be usable. resctrl otherwise expects a counter
>>>>>>>> per CTRL_MON/MON group. Sharing bandwidth counters doesn't work
>>>>>>>
>>>>>>> No, resctrl does not expect a counter per CTRL_MON/MON group - in assignable
>>>>>>> counter mode the counter assignment is per monitoring group AND event as a pair:
>>>>>>> (CTRL_MON/MON group, event).
>>>>>>
>>>>>> Yes but these counters aren't necessarily fungible. For MPAM the
>>>>>> mbm_local_bytes and mbm_total_bytes are necessarily backed by different
>>>>>> hardware counters. A MPAM bandwidth counters just counts all traffic on
>>>>>> a link with the only configurability being for read/write. The counters
>>>>>> are just placed at different point in the topology to get the different
>>>>>> events.
>>>>>
>>>>> The distinction between "different hardware counters for mbm_local_bytes and
>>>>> mbm_total_bytes" and "The counters are just placed at different point in the
>>>>> topology" is not clear to me". The former implies different counters for the
>>>>> two events while the latter implies the same counters are used for both events
>>>>> but perhaps accumulated/displayed differently?
>>>>
>>>> For a given RIS, mpam device hardware unit of which an MSC may consist
>>>> of 1 or more, there are MPAMF_MBWUMON_IDR.NUM_MON hardware bandwidth
>>>> counters which measure traffic passing a specific point with no
>>>> filtering for where it's going. The filtering of this counter is
>>>> set up in MSMON_CFG_MBWU_FLT which only allows pmg/partid/(read/write).
>>>
>>> Thank you for the details. Is the expectation that user should be able to
>>> program all these counters via resctrl? If an MSC consists of multiple RIS
>>> with different counters then things get complicated very fast. Could it be
>>> constrained to only expose the maximum number of counters supported by
>>> all RIS at a particular scope? This would match what the existing
>>> num_mbm_cntrs file supports.
>>
>> Not individually, no, they will generally just be one per cache slice or
>> memory controller and all be programmed together as a component.
> 
> Is this where the risk of double counting comes in? That is, adding up the
> memory bandwidth at the cache to the memory bandwidth at memory controller
> for a total memory bandwidth count?
> 

Not double counting, so much. The problem is more about using these at
the same time. We were initially thinking that if the memory controller
topology matched that of the l3 caches then we could have
mbm_local_bytes at one and mbm_total_bytes at the other but we realised
we weren't counting the right things. (Where 'topology matches' means
that there is a pairing between numa nodes and l3 caches where within
each pair they have the same affine cpus.) This would have led to having
more than one pool of hardware counters for memory bandwidth counters
that are, effectively, at the l3 cache. Going forward there are ideas
about placing the MSC in different places in the design which are
logically the l3 cache but mean that different bandwidth types could be
counted, but this would need firmware description help (device
tree/acpi), so that is very much in the future.

For the moment the only abuse we do around this in the MPAM driver is
that if there is a single l3 and a single numa node then we say that an
MSC counting traffic at the entry to the memory is the same as one at
the exit from the l3 (assuming l3 is the last level cache).
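
To make the 'topology matches' condition above a bit more concrete, here
is a rough sketch of the check. It is an illustration only and not code
from the MPAM driver: the function name topology_matches(), the
MAX_UNITS limit and the representation of affine cpus as 64-bit masks
are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound on nodes/caches handled by this sketch. */
#define MAX_UNITS 64

/*
 * Hypothetical model, not MPAM driver code: each numa node and each l3
 * cache is described only by a bitmask of its affine cpus (bit n set
 * means cpu n is affine).
 *
 * Returns true if every numa node can be paired with a distinct l3
 * cache that has exactly the same affine cpus.
 */
static bool topology_matches(const uint64_t *node_cpus, size_t nr_nodes,
			     const uint64_t *l3_cpus, size_t nr_l3)
{
	bool used[MAX_UNITS] = { false };

	/* A one-to-one pairing needs the same number of each. */
	if (nr_nodes != nr_l3 || nr_l3 > MAX_UNITS)
		return false;

	for (size_t i = 0; i < nr_nodes; i++) {
		bool found = false;

		/* Assumption: a node with no cpus cannot be paired. */
		if (!node_cpus[i])
			return false;

		for (size_t j = 0; j < nr_l3; j++) {
			if (!used[j] && node_cpus[i] == l3_cpus[j]) {
				used[j] = true;	/* each l3 is paired once */
				found = true;
				break;
			}
		}

		if (!found)
			return false;
	}

	return true;
}
```

With two nodes covering cpus 0-3 and 4-7, two l3 caches with those same
masks match in either order, while l3 caches covering cpus 0-1 and 2-7,
or a single l3 covering all eight cpus, do not.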

> ...
> 
>  
>> So, to try and bring this back to what we can be done now for MPAM to
>> fit into the counter mode assignment interface. Just support
>> mbm_total_bytes and then num_mbm_cntrs is correct (nothing to do). Make
>> the event_filter file always display all the bandwidth types and make
>> that the only value that be the only value it accepts (instead of hiding
>> the event_filter file). If you agree I'll respin with that.
> 
> From resctrl side this sounds fine. I don't have any insight into what, if any,
> kind of gymnastics the MPAM driver needs to do to make the discovered MSCs with
> their varying scope and internal vs external counts fit into this. If initial
> implementation indeed forces some components into categories that are not a good
> match then when resctrl later does get support for diverse components there may
> be surprises to user space along the way. For example, user space may not see the
> same memory bandwidth numbers reported by the same events on the same system as
> the interface evolves.

Indeed, we have already weeded a few things out of the MPAM driver for
similar reasons. If we start with mpam only supporting a
non-configurable mbm_total_bytes with ABMC I think we're ok. I'll drop
the non-ABMC bandwidth counter support from the MPAM driver as even if
we've got enough counters, one per (CTRL_MON/MON, evt), we can use ABMC.
Also, when event configuration (read/write filtering) using user defined
(or new) events is added this will mean that enough counters becomes a
higher limit. That will mean that the software controller is not usable
but for now I think we can just fail when that mount option, mba_MBps,
is used. Later we can consider using non-ABMC bandwidth counters when
the software controller is requested.

> 
> "make that the only value that be the only value it accepts" - are you saying that
> whatever is displayed when user views the "event_filter" file is what the
> user can write to the "event_filter" file? I find this a challenging interface
> for user space to use. The expectation is that the user can write any supported
> memory transaction to that file and when writing fails it can only be because
> of an invalid memory transaction. How can user space know that events are not
> configurable at all? It sounds as though user space is expected to try configuring
> the event with a memory transaction and then, presumably, check last_cmd_status?
> 
> Could this not be simplified by making the "event_filter" file read-only on
> MPAM systems?

Yes, we'll need some finer grained control for which sets of bandwidth
types can be configured further down the line but going with read-only
for when there is only one fixed set seems good to me.

> 
> Reinette
> 

Thanks,

Ben



* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-05 10:01                     ` Ben Horgan
@ 2026-03-05 17:22                       ` Reinette Chatre
  2026-03-05 17:34                         ` Ben Horgan
  0 siblings, 1 reply; 23+ messages in thread
From: Reinette Chatre @ 2026-03-05 17:22 UTC (permalink / raw)
  To: Ben Horgan
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Ben,

On 3/5/26 2:01 AM, Ben Horgan wrote:
> On 3/4/26 22:50, Reinette Chatre wrote:
>> On 3/4/26 1:01 PM, Ben Horgan wrote:
  
>>> So, to try and bring this back to what we can be done now for MPAM to
>>> fit into the counter mode assignment interface. Just support
>>> mbm_total_bytes and then num_mbm_cntrs is correct (nothing to do). Make
>>> the event_filter file always display all the bandwidth types and make
>>> that the only value that be the only value it accepts (instead of hiding
>>> the event_filter file). If you agree I'll respin with that.
>>
>> From resctrl side this sounds fine. I don't have any insight into what, if any,
>> kind of gymnastics the MPAM driver needs to do to make the discovered MSCs with
>> their varying scope and internal vs external counts fit into this. If initial
>> implementation indeed forces some components into categories that are not a good
>> match then when resctrl later does get support for diverse components there may
>> be surprises to user space along the way. For example, user space may not see the
>> same memory bandwidth numbers reported by the same events on the same system as
>> the interface evolves.
> 
> Indeed, we have already weeded a few things out of the MPAM driver for
> similar reasons. If we start with mpam only supporting a
> non-configurable mbm_total_bytes with ABMC I think we're ok. I'll drop
> the non-ABMC bandwidth counter support from the MPAM driver as even if
> we've got enough counters, one per (CTRL_MON/MON, evt), we can use ABMC.

It sounds like the expectation is that when there are enough counters the user
will then run with "mbm_assign_on_mkdir" set so that counters are always dynamically
assigned and thus essentially be "non-ABMC bandwidth counter support"?
Since "mbm_assign_on_mkdir" is set by default then user space should get sane
behavior by default.

Alternatively, user space could just always run with "mbm_assign_on_mkdir"
off and have full control over monitor assignment even when it is not necessary.

Sounds good to me.

> Also, when event configuration (read/write filtering) using user defined
> (or new) events is added this will mean that enough counters becomes a
> higher limit. That will mean that the software controller is not usable
> but for now I think we can just fail when that mount option, mba_MBps,
> is used. Later we can consider using non-ABMC bandwidth counters when
> the software controller is requested.

ok. Since info/last_cmd_status is not available to the user when mount fails,
please do add a message to the kernel log to help diagnose resctrl mount failure.


> 
>>
>> "make that the only value that be the only value it accepts" - are you saying that
>> whatever is displayed when user views the "event_filter" file is what the
>> user can write to the "event_filter" file? I find this a challenging interface
>> for user space to use. The expectation is that the user can write any supported
>> memory transaction to that file and when writing fails it can only be because
>> of an invalid memory transaction. How can user space know that events are not
>> configurable at all? It sounds as though user space is expected to try configuring
>> the event with a memory transaction and then, presumably, check last_cmd_status?
>>
>> Could this not be simplified by making the "event_filter" file read-only on
>> MPAM systems?
> 
> Yes, we'll need some finer grained control for which sets of bandwidth
> types can be configured further down the line but going with read-only
> for when there is only one fixed set seems good to me.

Thank you very much.

Reinette



* Re: [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode
  2026-03-05 17:22                       ` Reinette Chatre
@ 2026-03-05 17:34                         ` Ben Horgan
  0 siblings, 0 replies; 23+ messages in thread
From: Ben Horgan @ 2026-03-05 17:34 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: linux-kernel, tony.luck, Dave.Martin, james.morse, babu.moger,
	tglx, mingo, bp, dave.hansen, x86, hpa, fenghuay, tan.shaopeng

Hi Reinette,

On 3/5/26 17:22, Reinette Chatre wrote:
> Hi Ben,
> 
> On 3/5/26 2:01 AM, Ben Horgan wrote:
>> On 3/4/26 22:50, Reinette Chatre wrote:
>>> On 3/4/26 1:01 PM, Ben Horgan wrote:
>   
>>>> So, to try and bring this back to what we can be done now for MPAM to
>>>> fit into the counter mode assignment interface. Just support
>>>> mbm_total_bytes and then num_mbm_cntrs is correct (nothing to do). Make
>>>> the event_filter file always display all the bandwidth types and make
>>>> that the only value that be the only value it accepts (instead of hiding
>>>> the event_filter file). If you agree I'll respin with that.
>>>
>>> From resctrl side this sounds fine. I don't have any insight into what, if any,
>>> kind of gymnastics the MPAM driver needs to do to make the discovered MSCs with
>>> their varying scope and internal vs external counts fit into this. If initial
>>> implementation indeed forces some components into categories that are not a good
>>> match then when resctrl later does get support for diverse components there may
>>> be surprises to user space along the way. For example, user space may not see the
>>> same memory bandwidth numbers reported by the same events on the same system as
>>> the interface evolves.
>>
>> Indeed, we have already weeded a few things out of the MPAM driver for
>> similar reasons. If we start with mpam only supporting a
>> non-configurable mbm_total_bytes with ABMC I think we're ok. I'll drop
>> the non-ABMC bandwidth counter support from the MPAM driver as even if
>> we've got enough counters, one per (CTRL_MON/MON, evt), we can use ABMC.
> 
> It sounds like the expectation is that when there are enough counters the user
> will then run with "mbm_assign_on_mkdir" set so that counters are always dynamically
> assigned and thus essentially be "non-ABMC bandwidth counter support"?

Yes, that's the intent.

> Since "mbm_assign_on_mkdir" is set by default then user space should get sane
> behavior by default.
> 
> Alternatively, user space could just always run with "mbm_assign_on_mkdir"
> off and have full control over monitor assignment even when it is not necessary.
> 
> Sounds good to me.
> 
>> Also, when event configuration (read/write filtering) using user defined
>> (or new) events is added this will mean that enough counters becomes a
>> higher limit. That will mean that the software controller is not usable
>> but for now I think we can just fail when that mount option, mba_MBps,
>> is used. Later we can consider using non-ABMC bandwidth counters when
>> the software controller is requested.
> 
> ok. Since info/last_cmd_status is not available to the user when mount fails,
> please do add a message to the kernel log to help diagnose resctrl mount failure.
>

Yep, patch 3 does this.

> 
>>
>>>
>>> "make that the only value that be the only value it accepts" - are you saying that
>>> whatever is displayed when user views the "event_filter" file is what the
>>> user can write to the "event_filter" file? I find this a challenging interface
>>> for user space to use. The expectation is that the user can write any supported
>>> memory transaction to that file and when writing fails it can only be because
>>> of an invalid memory transaction. How can user space know that events are not
>>> configurable at all? It sounds as though user space is expected to try configuring
>>> the event with a memory transaction and then, presumably, check last_cmd_status?
>>>
>>> Could this not be simplified by making the "event_filter" file read-only on
>>> MPAM systems?
>>
>> Yes, we'll need some finer grained control for which sets of bandwidth
>> types can be configured further down the line but going with read-only
>> for when there is only one fixed set seems good to me.
> 
> Thank you very much.
> 
> Reinette
> 

Sounds like we've got a plan :) Thanks!

Ben



Thread overview: 23+ messages
2026-02-25 20:19 [PATCH v1 0/4] x86,fs/resctrl: Pave the way for MPAM counter assignment Ben Horgan
2026-02-25 20:19 ` [PATCH v1 1/4] x86,fs/resctrl: Make resctrl_arch_is_evt_configurable() aware of mbm_assign_mode Ben Horgan
2026-03-02 23:11   ` Reinette Chatre
2026-03-03 12:29     ` Ben Horgan
2026-03-03 18:09       ` Reinette Chatre
2026-03-04 11:07         ` Ben Horgan
2026-03-04 17:02           ` Reinette Chatre
2026-03-04 17:37             ` Ben Horgan
2026-03-04 19:23               ` Reinette Chatre
2026-03-04 21:01                 ` Ben Horgan
2026-03-04 22:50                   ` Reinette Chatre
2026-03-05 10:01                     ` Ben Horgan
2026-03-05 17:22                       ` Reinette Chatre
2026-03-05 17:34                         ` Ben Horgan
2026-02-25 20:19 ` [PATCH v1 2/4] fs/resctrl: Only show 'event_filter' files if events are configurable Ben Horgan
2026-03-02 23:12   ` Reinette Chatre
2026-03-03 14:00     ` Ben Horgan
2026-03-03 18:14       ` Reinette Chatre
2026-03-04 11:31         ` Ben Horgan
2026-03-04 17:03           ` Reinette Chatre
2026-03-04 17:53             ` Ben Horgan
2026-02-25 20:19 ` [PATCH v1 3/4] fs/resctrl: Disallow the software controller when mbm counters are assignable Ben Horgan
2026-02-25 20:19 ` [PATCH v1 4/4] arm_mpam: resctrl: Use new signature for resctrl_arch_is_evt_configurable() Ben Horgan
