[PATCH] fs/resctrl: Eliminate false positive lockdep warning when reading SNC counters

patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed

* [PATCH] fs/resctrl: Eliminate false positive lockdep warning when reading SNC counters
@ 2025-09-08 22:15 Reinette Chatre
  2025-09-12  9:51 ` qinyuntan
  0 siblings, 1 reply; 3+ messages in thread
From: Reinette Chatre @ 2025-09-08 22:15 UTC (permalink / raw)
  To: qinyuntan, tony.luck, james.morse, Dave.Martin, babu.moger, bp,
	tglx, dave.hansen
  Cc: x86, hpa, peternewman, linux-kernel, patches, reinette.chatre

Running resctrl_tests on an SNC-2 system with lockdep debugging enabled
triggers several warnings with following trace:
	WARNING: CPU: 0 PID: 1914 at kernel/cpu.c:528 lockdep_assert_cpus_held+0x8a/0xc0
	...
	Call Trace:
	__mon_event_count
	? __lock_acquire
	? __pfx___mon_event_count
	mon_event_count
	? __pfx_smp_mon_event_count
	smp_mon_event_count
	smp_call_on_cpu_callback
	<snip>

get_cpu_cacheinfo_level() called from __mon_event_count() requires CPU
hotplug lock to be held. The hotplug lock is indeed held during this time,
as confirmed by the lockdep_assert_cpus_held() within mon_event_read()
that calls mon_event_count() via IPI, but the lockdep tracking is not able
to follow the IPI.

Fresh CPU cache information via get_cpu_cacheinfo_level() from
__mon_event_count() was added to support the fix for the issue where
resctrl inappropriately maintained links to L3 cache information that will
be stale in the case when the associated CPU goes offline.

Keep the cacheinfo ID in struct rdt_mon_domain to ensure that
resctrl does not maintain stale cache information while CPUs can go
offline. Return to using a pointer to the L3 cache information (struct
cacheinfo) in struct rmid_read, rmid_read::ci. Initialize rmid_read::ci
before the IPI where it is used. CPU hotplug lock is held across
rmid_read::ci initialization and use to ensure that it points to accurate
cache information.

Fixes: 594902c986e2 ("x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 fs/resctrl/ctrlmondata.c | 2 +-
 fs/resctrl/internal.h    | 4 ++--
 fs/resctrl/monitor.c     | 6 ++----
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index d98e0d2de09f..3c39cfacb251 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -625,11 +625,11 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 		 */
 		list_for_each_entry(d, &r->mon_domains, hdr.list) {
 			if (d->ci_id == domid) {
-				rr.ci_id = d->ci_id;
 				cpu = cpumask_any(&d->hdr.cpu_mask);
 				ci = get_cpu_cacheinfo_level(cpu, RESCTRL_L3_CACHE);
 				if (!ci)
 					continue;
+				rr.ci = ci;
 				mon_event_read(&rr, r, NULL, rdtgrp,
 					       &ci->shared_cpu_map, evtid, false);
 				goto checkresult;
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 0a1eedba2b03..9a8cf6f11151 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -98,7 +98,7 @@ struct mon_data {
  *	   domains in @r sharing L3 @ci.id
  * @evtid: Which monitor event to read.
  * @first: Initialize MBM counter when true.
- * @ci_id: Cacheinfo id for L3. Only set when @d is NULL. Used when summing domains.
+ * @ci:    Cacheinfo for L3. Only set when @d is NULL. Used when summing domains.
  * @err:   Error encountered when reading counter.
  * @val:   Returned value of event counter. If @rgrp is a parent resource group,
  *	   @val includes the sum of event counts from its child resource groups.
@@ -112,7 +112,7 @@ struct rmid_read {
 	struct rdt_mon_domain	*d;
 	enum resctrl_event_id	evtid;
 	bool			first;
-	unsigned int		ci_id;
+	struct cacheinfo	*ci;
 	int			err;
 	u64			val;
 	void			*arch_mon_ctx;
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index f5637855c3ac..7326c28a7908 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -361,7 +361,6 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
 	int cpu = smp_processor_id();
 	struct rdt_mon_domain *d;
-	struct cacheinfo *ci;
 	struct mbm_state *m;
 	int err, ret;
 	u64 tval = 0;
@@ -389,8 +388,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 	}
 
 	/* Summing domains that share a cache, must be on a CPU for that cache. */
-	ci = get_cpu_cacheinfo_level(cpu, RESCTRL_L3_CACHE);
-	if (!ci || ci->id != rr->ci_id)
+	if (!cpumask_test_cpu(cpu, &rr->ci->shared_cpu_map))
 		return -EINVAL;
 
 	/*
@@ -402,7 +400,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 	 */
 	ret = -EINVAL;
 	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
-		if (d->ci_id != rr->ci_id)
+		if (d->ci_id != rr->ci->id)
 			continue;
 		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
 					     rr->evtid, &tval, rr->arch_mon_ctx);
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] fs/resctrl: Eliminate false positive lockdep warning when reading SNC counters
  2025-09-08 22:15 [PATCH] fs/resctrl: Eliminate false positive lockdep warning when reading SNC counters Reinette Chatre
@ 2025-09-12  9:51 ` qinyuntan
  2025-09-12 15:30   ` Reinette Chatre
  0 siblings, 1 reply; 3+ messages in thread
From: qinyuntan @ 2025-09-12  9:51 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, james.morse, Dave.Martin, babu.moger,
	bp, tglx, dave.hansen
  Cc: x86, hpa, peternewman, linux-kernel, patches

Hi Reinette,

Thank you for reporting this issue, and my apologies for the delayed 
reply :).

I was able to reproduce the same lockdep warning on my local SNC-3 
system, and the warning disappears after applying your patch.

The change looks good to me.

Reviewed-by: Qinyun Tan <qinyuntan@linux.alibaba.com>

Thanks,

On 9/9/25 6:15 AM, Reinette Chatre wrote:
> Running resctrl_tests on an SNC-2 system with lockdep debugging enabled
> triggers several warnings with following trace:
> 	WARNING: CPU: 0 PID: 1914 at kernel/cpu.c:528 lockdep_assert_cpus_held+0x8a/0xc0
> 	...
> 	Call Trace:
> 	__mon_event_count
> 	? __lock_acquire
> 	? __pfx___mon_event_count
> 	mon_event_count
> 	? __pfx_smp_mon_event_count
> 	smp_mon_event_count
> 	smp_call_on_cpu_callback
> 	<snip>
> 
> get_cpu_cacheinfo_level() called from __mon_event_count() requires CPU
> hotplug lock to be held. The hotplug lock is indeed held during this time,
> as confirmed by the lockdep_assert_cpus_held() within mon_event_read()
> that calls mon_event_count() via IPI, but the lockdep tracking is not able
> to follow the IPI.
> 
> Fresh CPU cache information via get_cpu_cacheinfo_level() from
> __mon_event_count() was added to support the fix for the issue where
> resctrl inappropriately maintained links to L3 cache information that will
> be stale in the case when the associated CPU goes offline.
> 
> Keep the cacheinfo ID in struct rdt_mon_domain to ensure that
> resctrl does not maintain stale cache information while CPUs can go
> offline. Return to using a pointer to the L3 cache information (struct
> cacheinfo) in struct rmid_read, rmid_read::ci. Initialize rmid_read::ci
> before the IPI where it is used. CPU hotplug lock is held across
> rmid_read::ci initialization and use to ensure that it points to accurate
> cache information.
> 
> Fixes: 594902c986e2 ("x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem")
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>   fs/resctrl/ctrlmondata.c | 2 +-
>   fs/resctrl/internal.h    | 4 ++--
>   fs/resctrl/monitor.c     | 6 ++----
>   3 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index d98e0d2de09f..3c39cfacb251 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -625,11 +625,11 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>   		 */
>   		list_for_each_entry(d, &r->mon_domains, hdr.list) {
>   			if (d->ci_id == domid) {
> -				rr.ci_id = d->ci_id;
>   				cpu = cpumask_any(&d->hdr.cpu_mask);
>   				ci = get_cpu_cacheinfo_level(cpu, RESCTRL_L3_CACHE);
>   				if (!ci)
>   					continue;
> +				rr.ci = ci;
>   				mon_event_read(&rr, r, NULL, rdtgrp,
>   					       &ci->shared_cpu_map, evtid, false);
>   				goto checkresult;
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index 0a1eedba2b03..9a8cf6f11151 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -98,7 +98,7 @@ struct mon_data {
>    *	   domains in @r sharing L3 @ci.id
>    * @evtid: Which monitor event to read.
>    * @first: Initialize MBM counter when true.
> - * @ci_id: Cacheinfo id for L3. Only set when @d is NULL. Used when summing domains.
> + * @ci:    Cacheinfo for L3. Only set when @d is NULL. Used when summing domains.
>    * @err:   Error encountered when reading counter.
>    * @val:   Returned value of event counter. If @rgrp is a parent resource group,
>    *	   @val includes the sum of event counts from its child resource groups.
> @@ -112,7 +112,7 @@ struct rmid_read {
>   	struct rdt_mon_domain	*d;
>   	enum resctrl_event_id	evtid;
>   	bool			first;
> -	unsigned int		ci_id;
> +	struct cacheinfo	*ci;
>   	int			err;
>   	u64			val;
>   	void			*arch_mon_ctx;
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index f5637855c3ac..7326c28a7908 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -361,7 +361,6 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>   {
>   	int cpu = smp_processor_id();
>   	struct rdt_mon_domain *d;
> -	struct cacheinfo *ci;
>   	struct mbm_state *m;
>   	int err, ret;
>   	u64 tval = 0;
> @@ -389,8 +388,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>   	}
>   
>   	/* Summing domains that share a cache, must be on a CPU for that cache. */
> -	ci = get_cpu_cacheinfo_level(cpu, RESCTRL_L3_CACHE);
> -	if (!ci || ci->id != rr->ci_id)
> +	if (!cpumask_test_cpu(cpu, &rr->ci->shared_cpu_map))
>   		return -EINVAL;
>   
>   	/*
> @@ -402,7 +400,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>   	 */
>   	ret = -EINVAL;
>   	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
> -		if (d->ci_id != rr->ci_id)
> +		if (d->ci_id != rr->ci->id)
>   			continue;
>   		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
>   					     rr->evtid, &tval, rr->arch_mon_ctx);


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] fs/resctrl: Eliminate false positive lockdep warning when reading SNC counters
  2025-09-12  9:51 ` qinyuntan
@ 2025-09-12 15:30   ` Reinette Chatre
  0 siblings, 0 replies; 3+ messages in thread
From: Reinette Chatre @ 2025-09-12 15:30 UTC (permalink / raw)
  To: qinyuntan, tony.luck, james.morse, Dave.Martin, babu.moger, bp,
	tglx, dave.hansen
  Cc: x86, hpa, peternewman, linux-kernel, patches

Hi Qinyun,

On 9/12/25 2:51 AM, qinyuntan wrote:
> Hi Reinette,
> 
> Thank you for reporting this issue, and my apologies for the delayed reply :).
> 
> I was able to reproduce the same lockdep warning on my local SNC-3 system, and the warning disappears after applying your patch.
> 
> The change looks good to me.
> 
> Reviewed-by: Qinyun Tan <qinyuntan@linux.alibaba.com>
> 

Thank you very much for testing and reviewing!

This patch has already been queued for inclusion. Unfortunately
there may not be another opportunity for your tag to be added, but it will
forever be associated with this patch in lore.kernel.org.

Reinette



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-09-12 15:31 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-08 22:15 [PATCH] fs/resctrl: Eliminate false positive lockdep warning when reading SNC counters Reinette Chatre
2025-09-12  9:51 ` qinyuntan
2025-09-12 15:30   ` Reinette Chatre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).