All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] acpi/hmat: hmat_register_target() refactor to address lockdep warning
@ 2025-10-17 21:21 Dave Jiang
  2025-10-17 21:21 ` [PATCH v3 1/2] acpi/hmat: Return when generic target is updated Dave Jiang
  2025-10-17 21:21 ` [PATCH v3 2/2] acpi/hmat: Fix lockdep warning for hmem_register_resource() Dave Jiang
  0 siblings, 2 replies; 6+ messages in thread
From: Dave Jiang @ 2025-10-17 21:21 UTC (permalink / raw)
  To: nvdimm, linux-cxl, linux-acpi
  Cc: dan.j.williams, vishal.l.verma, ira.weiny, rafael

This series has changes that refactor the hmat_register_target() function
to clean up a lockdep warning.

Dave Jiang (2):
  acpi/hmat: Return when generic target is updated
  acpi/hmat: Fix lockdep warning for hmem_register_resource()

 drivers/acpi/numa/hmat.c | 47 ++++++++++++++++++++++------------------
 1 file changed, 26 insertions(+), 21 deletions(-)


base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
-- 
2.51.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v3 1/2] acpi/hmat: Return when generic target is updated
  2025-10-17 21:21 [PATCH v3 0/2] acpi/hmat: hmat_register_target() refactor to address lockdep warning Dave Jiang
@ 2025-10-17 21:21 ` Dave Jiang
  2025-10-28 15:08   ` Jonathan Cameron
  2025-10-17 21:21 ` [PATCH v3 2/2] acpi/hmat: Fix lockdep warning for hmem_register_resource() Dave Jiang
  1 sibling, 1 reply; 6+ messages in thread
From: Dave Jiang @ 2025-10-17 21:21 UTC (permalink / raw)
  To: nvdimm, linux-cxl, linux-acpi
  Cc: dan.j.williams, vishal.l.verma, ira.weiny, rafael

With the current code flow, once the generic target is updated
target->registered is set and the remaining code is skipped.
So return immediately instead of going through the checks and
then skip.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/acpi/numa/hmat.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 5a36d57289b4..1dc73d20d989 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -888,12 +888,13 @@ static void hmat_register_target(struct memory_target *target)
 	 * Register generic port perf numbers. The nid may not be
 	 * initialized and is still NUMA_NO_NODE.
 	 */
-	mutex_lock(&target_lock);
-	if (*(u16 *)target->gen_port_device_handle) {
-		hmat_update_generic_target(target);
-		target->registered = true;
+	scoped_guard(mutex, &target_lock) {
+		if (*(u16 *)target->gen_port_device_handle) {
+			hmat_update_generic_target(target);
+			target->registered = true;
+			return;
+		}
 	}
-	mutex_unlock(&target_lock);
 
 	/*
 	 * Skip offline nodes. This can happen when memory
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v3 2/2] acpi/hmat: Fix lockdep warning for hmem_register_resource()
  2025-10-17 21:21 [PATCH v3 0/2] acpi/hmat: hmat_register_target() refactor to address lockdep warning Dave Jiang
  2025-10-17 21:21 ` [PATCH v3 1/2] acpi/hmat: Return when generic target is updated Dave Jiang
@ 2025-10-17 21:21 ` Dave Jiang
  2025-10-28 15:15   ` Jonathan Cameron
  1 sibling, 1 reply; 6+ messages in thread
From: Dave Jiang @ 2025-10-17 21:21 UTC (permalink / raw)
  To: nvdimm, linux-cxl, linux-acpi
  Cc: dan.j.williams, vishal.l.verma, ira.weiny, rafael

The following lockdep splat was observed while kernel auto-online a CXL
memory region:

======================================================
WARNING: possible circular locking dependency detected
6.17.0djtest+ #53 Tainted: G        W
------------------------------------------------------
systemd-udevd/3334 is trying to acquire lock:
ffffffff90346188 (hmem_resource_lock){+.+.}-{4:4}, at: hmem_register_resource+0x31/0x50

but task is already holding lock:
ffffffff90338890 ((node_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x2e/0x70

which lock already depends on the new lock.
[..]
Chain exists of:
  hmem_resource_lock --> mem_hotplug_lock --> (node_chain).rwsem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rlock((node_chain).rwsem);
                               lock(mem_hotplug_lock);
                               lock((node_chain).rwsem);
  lock(hmem_resource_lock);

The lock ordering can cause potential deadlock. There are instances
where hmem_resource_lock is taken after (node_chain).rwsem, and vice
versa.

Split out the target update section of hmat_register_target() so that
hmat_callback() only envokes that section instead of attempt to register
hmem devices that it does not need to.

Fixes: cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as a
n "hmem" device")
notmuch/
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v3:
- Refactor to split out target device setup vs target update (Dan)
---
 drivers/acpi/numa/hmat.c | 48 ++++++++++++++++++++++------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 1dc73d20d989..ddbdd32e79a8 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -874,28 +874,10 @@ static void hmat_register_target_devices(struct memory_target *target)
 	}
 }
 
-static void hmat_register_target(struct memory_target *target)
+static void hmat_hotplug_target(struct memory_target *target)
 {
 	int nid = pxm_to_node(target->memory_pxm);
 
-	/*
-	 * Devices may belong to either an offline or online
-	 * node, so unconditionally add them.
-	 */
-	hmat_register_target_devices(target);
-
-	/*
-	 * Register generic port perf numbers. The nid may not be
-	 * initialized and is still NUMA_NO_NODE.
-	 */
-	scoped_guard(mutex, &target_lock) {
-		if (*(u16 *)target->gen_port_device_handle) {
-			hmat_update_generic_target(target);
-			target->registered = true;
-			return;
-		}
-	}
-
 	/*
 	 * Skip offline nodes. This can happen when memory
 	 * marked EFI_MEMORY_SP, "specific purpose", is applied
@@ -906,7 +888,7 @@ static void hmat_register_target(struct memory_target *target)
 	if (nid == NUMA_NO_NODE || !node_online(nid))
 		return;
 
-	mutex_lock(&target_lock);
+	guard(mutex)(&target_lock);
 	if (!target->registered) {
 		hmat_register_target_initiators(target);
 		hmat_register_target_cache(target);
@@ -914,7 +896,29 @@ static void hmat_register_target(struct memory_target *target)
 		hmat_register_target_perf(target, ACCESS_COORDINATE_CPU);
 		target->registered = true;
 	}
-	mutex_unlock(&target_lock);
+}
+
+static void hmat_register_target(struct memory_target *target)
+{
+	/*
+	 * Devices may belong to either an offline or online
+	 * node, so unconditionally add them.
+	 */
+	hmat_register_target_devices(target);
+
+	/*
+	 * Register generic port perf numbers. The nid may not be
+	 * initialized and is still NUMA_NO_NODE.
+	 */
+	scoped_guard(mutex, &target_lock) {
+		if (*(u16 *)target->gen_port_device_handle) {
+			hmat_update_generic_target(target);
+			target->registered = true;
+			return;
+		}
+	}
+
+	hmat_hotplug_target(target);
 }
 
 static void hmat_register_targets(void)
@@ -940,7 +944,7 @@ static int hmat_callback(struct notifier_block *self,
 	if (!target)
 		return NOTIFY_OK;
 
-	hmat_register_target(target);
+	hmat_hotplug_target(target);
 	return NOTIFY_OK;
 }
 
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 1/2] acpi/hmat: Return when generic target is updated
  2025-10-17 21:21 ` [PATCH v3 1/2] acpi/hmat: Return when generic target is updated Dave Jiang
@ 2025-10-28 15:08   ` Jonathan Cameron
  0 siblings, 0 replies; 6+ messages in thread
From: Jonathan Cameron @ 2025-10-28 15:08 UTC (permalink / raw)
  To: Dave Jiang
  Cc: nvdimm, linux-cxl, linux-acpi, dan.j.williams, vishal.l.verma,
	ira.weiny, rafael

On Fri, 17 Oct 2025 14:21:04 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> With the current code flow, once the generic target is updated
> target->registered is set and the remaining code is skipped.
> So return immediately instead of going through the checks and
> then skip.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 2/2] acpi/hmat: Fix lockdep warning for hmem_register_resource()
  2025-10-17 21:21 ` [PATCH v3 2/2] acpi/hmat: Fix lockdep warning for hmem_register_resource() Dave Jiang
@ 2025-10-28 15:15   ` Jonathan Cameron
  0 siblings, 0 replies; 6+ messages in thread
From: Jonathan Cameron @ 2025-10-28 15:15 UTC (permalink / raw)
  To: Dave Jiang
  Cc: nvdimm, linux-cxl, linux-acpi, dan.j.williams, vishal.l.verma,
	ira.weiny, rafael

On Fri, 17 Oct 2025 14:21:05 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> The following lockdep splat was observed while kernel auto-online a CXL
> memory region:
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.17.0djtest+ #53 Tainted: G        W
> ------------------------------------------------------
> systemd-udevd/3334 is trying to acquire lock:
> ffffffff90346188 (hmem_resource_lock){+.+.}-{4:4}, at: hmem_register_resource+0x31/0x50
> 
> but task is already holding lock:
> ffffffff90338890 ((node_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x2e/0x70
> 
> which lock already depends on the new lock.
> [..]
> Chain exists of:
>   hmem_resource_lock --> mem_hotplug_lock --> (node_chain).rwsem
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   rlock((node_chain).rwsem);
>                                lock(mem_hotplug_lock);
>                                lock((node_chain).rwsem);
>   lock(hmem_resource_lock);
> 
> The lock ordering can cause potential deadlock. There are instances
> where hmem_resource_lock is taken after (node_chain).rwsem, and vice
> versa.
> 
> Split out the target update section of hmat_register_target() so that
> hmat_callback() only envokes that section instead of attempt to register
> hmem devices that it does not need to.
> 
> Fixes: cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as a
> n "hmem" device")
Fix up whatever caused that line to wrap!

> notmuch/

?

> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

The guard change seems to be unrelated to rest of the patch.
Probably shouldn't be here.

> 
> ---
> v3:
> - Refactor to split out target device setup vs target update (Dan)
> ---
>  drivers/acpi/numa/hmat.c | 48 ++++++++++++++++++++++------------------
>  1 file changed, 26 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
> index 1dc73d20d989..ddbdd32e79a8 100644
> --- a/drivers/acpi/numa/hmat.c
> +++ b/drivers/acpi/numa/hmat.c
> @@ -874,28 +874,10 @@ static void hmat_register_target_devices(struct memory_target *target)
>  	}
>  }
>  
> -static void hmat_register_target(struct memory_target *target)
> +static void hmat_hotplug_target(struct memory_target *target)
>  {

> @@ -906,7 +888,7 @@ static void hmat_register_target(struct memory_target *target)
>  	if (nid == NUMA_NO_NODE || !node_online(nid))
>  		return;
>  
> -	mutex_lock(&target_lock);
> +	guard(mutex)(&target_lock);
Smells unrelated...

If you did want to do this I'd also do
	if (target->registered)
		return;

	hmat_register_target_initiators();
etc.


>  	if (!target->registered) {
>  		hmat_register_target_initiators(target);
>  		hmat_register_target_cache(target);
> @@ -914,7 +896,29 @@ static void hmat_register_target(struct memory_target *target)
>  		hmat_register_target_perf(target, ACCESS_COORDINATE_CPU);
>  		target->registered = true;
>  	}
> -	mutex_unlock(&target_lock);
> +}



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v3 1/2] acpi/hmat: Return when generic target is updated
  2025-11-05 23:48 [PATCH v3 0/2] acpi/hmat: hmat_register_target() refactor to address lockdep warning Dave Jiang
@ 2025-11-05 23:48 ` Dave Jiang
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Jiang @ 2025-11-05 23:48 UTC (permalink / raw)
  To: nvdimm, linux-cxl, linux-acpi
  Cc: dan.j.williams, vishal.l.verma, ira.weiny, rafael

With the current code flow, once the generic target is updated
target->registered is set and the remaining code is skipped.
So return immediately instead of going through the checks and
then skip.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/acpi/numa/hmat.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 5a36d57289b4..1dc73d20d989 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -888,12 +888,13 @@ static void hmat_register_target(struct memory_target *target)
 	 * Register generic port perf numbers. The nid may not be
 	 * initialized and is still NUMA_NO_NODE.
 	 */
-	mutex_lock(&target_lock);
-	if (*(u16 *)target->gen_port_device_handle) {
-		hmat_update_generic_target(target);
-		target->registered = true;
+	scoped_guard(mutex, &target_lock) {
+		if (*(u16 *)target->gen_port_device_handle) {
+			hmat_update_generic_target(target);
+			target->registered = true;
+			return;
+		}
 	}
-	mutex_unlock(&target_lock);
 
 	/*
 	 * Skip offline nodes. This can happen when memory
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-11-05 23:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-17 21:21 [PATCH v3 0/2] acpi/hmat: hmat_register_target() refactor to address lockdep warning Dave Jiang
2025-10-17 21:21 ` [PATCH v3 1/2] acpi/hmat: Return when generic target is updated Dave Jiang
2025-10-28 15:08   ` Jonathan Cameron
2025-10-17 21:21 ` [PATCH v3 2/2] acpi/hmat: Fix lockdep warning for hmem_register_resource() Dave Jiang
2025-10-28 15:15   ` Jonathan Cameron
  -- strict thread matches above, loose matches on Subject: below --
2025-11-05 23:48 [PATCH v3 0/2] acpi/hmat: hmat_register_target() refactor to address lockdep warning Dave Jiang
2025-11-05 23:48 ` [PATCH v3 1/2] acpi/hmat: Return when generic target is updated Dave Jiang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.