All of lore.kernel.org
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Ben Horgan <ben.horgan@arm.com>, <tony.luck@intel.com>,
	<james.morse@arm.com>, <Dave.Martin@arm.com>,
	<babu.moger@amd.com>, <bp@alien8.de>, <tglx@linutronix.de>,
	<dave.hansen@linux.intel.com>
Cc: <x86@kernel.org>, <hpa@zytor.com>, <fustini@kernel.org>,
	<fenghuay@nvidia.com>, <peternewman@google.com>,
	<linux-kernel@vger.kernel.org>, <patches@lists.linux.dev>
Subject: Re: [PATCH 00/11] x86,fs/resctrl: Improve resctrl quality and consistency
Date: Tue, 17 Mar 2026 11:09:45 -0700	[thread overview]
Message-ID: <f37db924-dc65-432b-ae48-ddd986b3efd9@intel.com> (raw)
In-Reply-To: <38e5c384-4d08-425e-a4fb-a63913be35ac@arm.com>

Hi Ben,

On 3/17/26 3:25 AM, Ben Horgan wrote:
> On 3/16/26 18:18, Reinette Chatre wrote:
>> On 3/16/26 10:44 AM, Ben Horgan wrote:
>>> On 3/2/26 18:46, Reinette Chatre wrote:
>> ...
>>> One related issue I've just noticed is that when ABMC and mbm_assign_on_mkdir are
>>> enabled the creation of MON/CTRL_MON directories may succeed but an error message
>>> is written to last_cmd_status. E.g.
>>>
>>> /sys/fs/resctrl# mkdir mon_groups/new5
>>> /sys/fs/resctrl# cat info/last_cmd_status
>>> Failed to allocate counter for mbm_total_bytes in domain 2
>>>
>>> The failure is ignored, as expected, in rdt_assign_cntrs() but the last_cmd_status
>>> is never cleared. I think this could be fixed by:
>>>
>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>>> index 62edb464410a..396f17ed72c6 100644
>>> --- a/fs/resctrl/monitor.c
>>> +++ b/fs/resctrl/monitor.c
>>> @@ -1260,6 +1260,8 @@ void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
>>>         if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
>>>                 rdtgroup_assign_cntr_event(NULL, rdtgrp,
>>>                                            &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
>>> +
>>> +       rdt_last_cmd_clear();
>>>  }
>>>
>>> Is this right thing to do? Let me know if you want a proper patch.
>>
>> Letting group be created without any counters assigned while writing the error
>> to last_cmd_status is the intended behavior. If the last_cmd_status buffer is cleared
>> at this point then user space will never have the opportunity to see the message that
>> contains the details.
>>
>> It did not seem appropriate to let resource group creation fail when no counters
>> are available. I see that the documentation is not clear on this. What do you think
>> of an update to documentation instead? Would something like below help clarify behavior?
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index ba609f8d4de5..20dc58d281cf 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -478,6 +478,12 @@ with the following files:
>>  	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
>>  	  0
>>  
>> +	Automatic counter assignment is done with best effort. If auto assignment
>> +	is enabled but there are not enough available counters then monitor group
>> +	creation could succeed while one or more events belonging to the group may
>> +	not have a counter assigned. Consult last_cmd_status for details during
>> +	such scenario.
>> +
> 
> This does the improve the situation but the multiple domain failure behaviour depends
> on the order the domains are iterated through. This is stable as the list is sorted but
> does seem a bit complicated.
> I.e. if you have two domains, with ids 2 and 3, with no counters remaining on domain 2 but
> some on domain 3 then rdtgroup_assign_cntr_event() will fail early and the counter won't
> be assigned for domain 3 but the last_cmd_status will only be about domain 2. The user
> either needs to know a failure at one domain means all higher domains will not be
> considered for that counter or look at the new mbm_L3_assignments to understand what's happened.
> In this case we have:
> 
> /sys/fs/resctrl# cat info/L3_MON/mbm_assign_on_mkdir
> 1
> /sys/fs/resctrl# cat info/L3_MON/available_mbm_cntrs
> 2=0;3=1
> /sys/fs/resctrl# mkdir mon_groups/new
> /sys/fs/resctrl# cat info/last_cmd_status
> Failed to allocate counter for mbm_total_bytes in domain 2
> /sys/fs/resctrl# cat mon_groups/new/mbm_L3_assignments
> mbm_total_bytes:2=_;3=_

Good point.

> 
> Would it be better for each domain to be considered even if a previous failure occurred or
> is this now a fixed behaviour? For illustration:

I do not believe this needs to be fixed.

> 
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 62edb464410a..8e061bce9742 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -1248,18 +1248,25 @@ static int rdtgroup_assign_cntr_event(struct rdt_l3_mon_domain *d, struct rdtgro
>  void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
>  {
>         struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> +       struct rdt_l3_mon_domain *d;
> 
>         if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r) ||
>             !r->mon.mbm_assign_on_mkdir)
>                 return;
> 
> -       if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
> -               rdtgroup_assign_cntr_event(NULL, rdtgrp,
> -                                          &mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID]);
> +       if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) {
> +               list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +                       rdtgroup_assign_cntr_event(d, rdtgrp,
> +                                                  &mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID]);
> +               }
> +       }
> 
> -       if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
> -               rdtgroup_assign_cntr_event(NULL, rdtgrp,
> -                                          &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
> +       if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) {
> +               list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +                       rdtgroup_assign_cntr_event(d, rdtgrp,
> +                                                  &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
> +               }
> +       }
>  }

With a solution like this last_cmd_status could potentially contain multiple lines, one
line per domain that failed. last_cmd_status is 512 bytes so if this is a system with 
many domains there is a risk of overflow and user space not seeing all failures.
That may be ok?

I think this can be simplified within rdt_assign_cntr_event() though. Consider:

diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 49f3f6b846b2..a6a791a15e30 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1209,12 +1209,13 @@ static int rdtgroup_alloc_assign_cntr(struct rdt_resource *r, struct rdt_l3_mon_
  * NULL; otherwise, assign the counter to the specified domain @d.
  *
  * If all counters in a domain are already in use, rdtgroup_alloc_assign_cntr()
- * will fail. The assignment process will abort at the first failure encountered
- * during domain traversal, which may result in the event being only partially
- * assigned.
+ * will fail. Ignore errors when attempting to assign a counter to all domains
+ * since only some domains may have counters available and goal is to assign
+ * counters where possible. Only caller providing @d of NULL is
+ * rdtgroup_assign_cntrs() that ignores errors.
  *
  * Return:
- * 0 on success, < 0 on failure.
+ * 0 on success when @d is specified, 0 always when @d is NULL, < 0 on failure.
  */
 static int rdtgroup_assign_cntr_event(struct rdt_l3_mon_domain *d, struct rdtgroup *rdtgrp,
 				      struct mon_evt *mevt)
@@ -1223,11 +1224,8 @@ static int rdtgroup_assign_cntr_event(struct rdt_l3_mon_domain *d, struct rdtgro
 	int ret = 0;
 
 	if (!d) {
-		list_for_each_entry(d, &r->mon_domains, hdr.list) {
-			ret = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
-			if (ret)
-				return ret;
-		}
+		list_for_each_entry(d, &r->mon_domains, hdr.list)
+			rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
 	} else {
 		ret = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
 	}


Reinette


  reply	other threads:[~2026-03-17 18:10 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-02 18:46 [PATCH 00/11] x86,fs/resctrl: Improve resctrl quality and consistency Reinette Chatre
2026-03-02 18:46 ` [PATCH 01/11] fs/resctrl: Add missing return value descriptions Reinette Chatre
2026-03-02 18:46 ` [PATCH 02/11] fs/resctrl: Avoid "may be used uninitialized" warning Reinette Chatre
2026-03-02 18:46 ` [PATCH 03/11] fs/resctrl: Use correct format specifier for printing error pointers Reinette Chatre
2026-03-02 18:46 ` [PATCH 04/11] x86/resctrl: Protect against bad shift Reinette Chatre
2026-03-02 18:46 ` [PATCH 05/11] fs/resctrl: Use accurate type for rdt_resource::rid Reinette Chatre
2026-03-03 18:20   ` Luck, Tony
2026-03-03 19:06     ` Reinette Chatre
2026-03-03 19:54       ` Luck, Tony
2026-03-03 22:29         ` Reinette Chatre
2026-03-03 23:26           ` Luck, Tony
2026-03-17 11:23   ` Ben Horgan
2026-03-17 17:34     ` Reinette Chatre
2026-03-02 18:46 ` [PATCH 06/11] fs/resctrl: Pass error reading event through to user space Reinette Chatre
2026-03-17 17:08   ` Ben Horgan
2026-03-02 18:46 ` [PATCH 07/11] fs/resctrl: Add last_cmd_status support for writes to max_threshold_occupancy Reinette Chatre
2026-03-17 17:13   ` Ben Horgan
2026-03-02 18:46 ` [PATCH 08/11] fs/resctrl: Use accurate and symmetric exit flows Reinette Chatre
2026-03-02 18:46 ` [PATCH 09/11] fs/resctrl: Use stricter checks on input to cpus/cpus_list file Reinette Chatre
2026-03-02 18:46 ` [PATCH 10/11] fs/resctrl: Change last_cmd_status custom during input parsing Reinette Chatre
2026-03-17 17:20   ` Ben Horgan
2026-03-02 18:46 ` [PATCH 11/11] fs/resctrl: Communicate resource group deleted error via last_cmd_status Reinette Chatre
2026-03-02 23:37 ` [PATCH 00/11] x86,fs/resctrl: Improve resctrl quality and consistency Luck, Tony
2026-03-03  2:18   ` Reinette Chatre
2026-03-04 11:48   ` Ben Horgan
2026-03-16 22:28     ` Reinette Chatre
2026-03-16 17:44 ` Ben Horgan
2026-03-16 18:18   ` Reinette Chatre
2026-03-17 10:25     ` Ben Horgan
2026-03-17 18:09       ` Reinette Chatre [this message]
2026-03-18 11:59         ` Ben Horgan
2026-03-18 16:35           ` Reinette Chatre
2026-03-18 17:10             ` Ben Horgan
2026-03-18 20:12               ` Reinette Chatre
2026-03-19  9:53                 ` Ben Horgan
2026-03-19 16:18                   ` Reinette Chatre
2026-03-19 17:18                     ` Ben Horgan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f37db924-dc65-432b-ae48-ddd986b3efd9@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=babu.moger@amd.com \
    --cc=ben.horgan@arm.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=fenghuay@nvidia.com \
    --cc=fustini@kernel.org \
    --cc=hpa@zytor.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.