* [PATCH v4 0/7] Add managed SOFT RESERVE resource handling
@ 2025-06-03 22:19 Smita Koralahalli
  2025-06-03 22:19 ` [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region() Smita Koralahalli
                   ` (7 more replies)
  0 siblings, 8 replies; 40+ messages in thread
From: Smita Koralahalli @ 2025-06-03 22:19 UTC (permalink / raw)
  To: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Smita Koralahalli, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati, Zhijian Li

Add the ability to manage SOFT RESERVE iomem resources prior to them being
added to the iomem resource tree. This allows drivers, such as CXL, to
remove any pieces of the SOFT RESERVE resource that intersect with created
CXL regions.

The current approach of leaving the SOFT RESERVE resources as is can cause
failures during hotplug of devices, such as CXL, because the resource is
not available for reuse after teardown of the device.

The approach is to add SOFT RESERVE resources to a separate tree during
boot. This allows any drivers to update the SOFT RESERVE resources before
they are merged into the iomem resource tree. In addition, a notifier chain
is added so that drivers can be notified when these SOFT RESERVE resources
are added to the iomem resource tree.

The CXL driver is modified to use a worker thread that waits for the CXL
PCI and CXL mem drivers to be loaded and for their probe routine to
complete. Then the driver walks through any created CXL regions to trim any
intersections with SOFT RESERVE resources in the iomem tree.

The dax driver uses the new soft reserve notifier chain so it can consume
any remaining SOFT RESERVES once they're added to the iomem tree.
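
In outline, the coordination that patches 4, 5 and 7 below add looks
roughly like this (condensed from the diffs in this series, not a
standalone implementation):

	/* scheduled from cxl_acpi_probe() */
	static void cxl_softreserv_mem_work_fn(struct work_struct *work)
	{
		/* Patch 4: wait for cxl_pci and cxl_mem to load and probe */
		cxl_wait_for_pci_mem();
		wait_for_device_probe();

		/*
		 * Patch 5: trim Soft Reserved ranges that intersect CXL
		 * regions; patch 7: hand what remains to the dax_hmem driver.
		 */
		cxl_region_softreserv_update();
	}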

The following scenarios have been tested:

Example 1: Exact alignment, soft reserved is a child of the region

|---------- "Soft Reserved" -----------|
|-------------- "Region #" ------------|

Before:
  1050000000-304fffffff : CXL Window 0
    1050000000-304fffffff : region0
      1050000000-304fffffff : Soft Reserved
        1080000000-2fffffffff : dax0.0
          1080000000-2fffffffff : System RAM (kmem)

After:
  1050000000-304fffffff : CXL Window 0
    1050000000-304fffffff : region1
      1080000000-2fffffffff : dax0.0
        1080000000-2fffffffff : System RAM (kmem)

Example 2: Start and/or end aligned and soft reserved spans multiple
regions

|----------- "Soft Reserved" -----------|
|-------- "Region #" -------|
or
|----------- "Soft Reserved" -----------|
|-------- "Region #" -------|

Before:
  850000000-684fffffff : Soft Reserved
    850000000-284fffffff : CXL Window 0
      850000000-284fffffff : region3
        850000000-284fffffff : dax0.0
          850000000-284fffffff : System RAM (kmem)
    2850000000-484fffffff : CXL Window 1
      2850000000-484fffffff : region4
        2850000000-484fffffff : dax1.0
          2850000000-484fffffff : System RAM (kmem)
    4850000000-684fffffff : CXL Window 2
      4850000000-684fffffff : region5
        4850000000-684fffffff : dax2.0
          4850000000-684fffffff : System RAM (kmem)

After:
  850000000-284fffffff : CXL Window 0
    850000000-284fffffff : region3
      850000000-284fffffff : dax0.0
        850000000-284fffffff : System RAM (kmem)
  2850000000-484fffffff : CXL Window 1
    2850000000-484fffffff : region4
      2850000000-484fffffff : dax1.0
        2850000000-484fffffff : System RAM (kmem)
  4850000000-684fffffff : CXL Window 2
    4850000000-684fffffff : region5
      4850000000-684fffffff : dax2.0
        4850000000-684fffffff : System RAM (kmem)

Example 3: No alignment
|---------- "Soft Reserved" ----------|
	|---- "Region #" ----|

Before:
  00000000-3050000ffd : Soft Reserved
    ..
    ..
    1050000000-304fffffff : CXL Window 0
      1050000000-304fffffff : region1
        1080000000-2fffffffff : dax0.0
          1080000000-2fffffffff : System RAM (kmem)

After:
  00000000-104fffffff : Soft Reserved
    ..
    ..
  1050000000-304fffffff : CXL Window 0
    1050000000-304fffffff : region1
      1080000000-2fffffffff : dax0.0
        1080000000-2fffffffff : System RAM (kmem)
  3050000000-3050000ffd : Soft Reserved
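
The "After" layouts above map onto the three cases handled by
remove_soft_reserved() in patch 5; in condensed form (see the full diff
below for the details and error handling):

	if (soft->start == start && soft->end == end) {
		/* Example 1: exact fit, just drop the Soft Reserved node */
	} else if (soft->start == start || soft->end == end) {
		/* Example 2: aligned at one end, re-insert the leftover piece */
	} else {
		/* Example 3: no alignment, re-insert leading and trailing pieces */
	}
	remove_resource(soft);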

v4 updates:
 - Split first patch into 4 smaller patches.
 - Correct the logic for cxl_pci_loaded() and cxl_mem_active() to return
   false by default instead of true.
 - Cleanup cxl_wait_for_pci_mem() to remove config checks for cxl_pci
   and cxl_mem.
 - Fixed multiple bugs and build issues, including correcting
   walk_iomem_res_desc() usage and the alignment calculations.
 
v3 updates:
 - Remove the srmem resource tree from kernel/resource.c; it is no longer
   needed in the current implementation. All SOFT RESERVE resources are
   now placed on the iomem resource tree.
 - Remove the no longer needed SOFT_RESERVED_MANAGED kernel config option.
 - Add the 'nid' parameter back to hmem_register_resource().
 - Remove the no longer used soft reserve notification chain (introduced
   in v2). The dax driver is now notified of SOFT RESERVED resources by
   the CXL driver.

v2 updates:
 - Add config option SOFT_RESERVE_MANAGED to control use of the
   separate srmem resource tree at boot.
 - Only add SOFT RESERVE resources to the soft reserve tree during
   boot; they go to the iomem resource tree after boot.
 - Remove the resource trimming code in the previous patch to re-use
   the existing code in kernel/resource.c
 - Add functionality for the cxl_acpi driver to wait for the cxl_pci
   and cxl_mem drivers to load.

Smita Koralahalli (7):
  cxl/region: Avoid null pointer dereference in is_cxl_region()
  cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o
  cxl/pci: Add pci_loaded tracking to mark PCI driver readiness
  cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe
  cxl/region: Introduce SOFT RESERVED resource removal on region
    teardown
  dax/hmem: Save the DAX HMEM platform device pointer
  cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after
    CXL region creation

 drivers/cxl/Kconfig        |   4 -
 drivers/cxl/acpi.c         |  25 ++++++
 drivers/cxl/core/Makefile  |   2 +-
 drivers/cxl/core/region.c  | 163 ++++++++++++++++++++++++++++++++++++-
 drivers/cxl/core/suspend.c |  34 +++++++-
 drivers/cxl/cxl.h          |   7 ++
 drivers/cxl/cxlmem.h       |   9 --
 drivers/cxl/cxlpci.h       |   1 +
 drivers/cxl/pci.c          |   2 +
 drivers/dax/hmem/device.c  |  47 +++++------
 drivers/dax/hmem/hmem.c    |  10 ++-
 include/linux/dax.h        |  11 ++-
 include/linux/pm.h         |   7 --
 13 files changed, 270 insertions(+), 52 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region()
  2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
@ 2025-06-03 22:19 ` Smita Koralahalli
  2025-06-03 23:49   ` Dave Jiang
  2025-06-03 22:19 ` [PATCH v4 2/7] cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o Smita Koralahalli
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Smita Koralahalli @ 2025-06-03 22:19 UTC (permalink / raw)
  To: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Smita Koralahalli, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati, Zhijian Li

Add a NULL check in is_cxl_region() to prevent potential null pointer
dereference if a caller passes a NULL device. This change ensures the
function safely returns false instead of triggering undefined behavior
when dev is NULL.

Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
 drivers/cxl/core/region.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c3f4dc244df7..109b8a98c4c7 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2333,7 +2333,7 @@ const struct device_type cxl_region_type = {
 
 bool is_cxl_region(struct device *dev)
 {
-	return dev->type == &cxl_region_type;
+	return dev && dev->type == &cxl_region_type;
 }
 EXPORT_SYMBOL_NS_GPL(is_cxl_region, "CXL");
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 2/7] cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o
  2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
  2025-06-03 22:19 ` [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region() Smita Koralahalli
@ 2025-06-03 22:19 ` Smita Koralahalli
  2025-06-09 11:02   ` Jonathan Cameron
  2025-06-03 22:19 ` [PATCH v4 3/7] cxl/pci: Add pci_loaded tracking to mark PCI driver readiness Smita Koralahalli
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Smita Koralahalli @ 2025-06-03 22:19 UTC (permalink / raw)
  To: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Smita Koralahalli, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati, Zhijian Li

In preparation for soft-reserved resource handling, make the suspend
infrastructure always available by removing the CONFIG_CXL_SUSPEND
Kconfig option.

This ensures cxl_mem_active_inc()/dec() and cxl_mem_active() are
unconditionally available, enabling coordination between cxl_pci and
cxl_mem drivers during region setup and hotplug operations.

Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
 drivers/cxl/Kconfig        | 4 ----
 drivers/cxl/core/Makefile  | 2 +-
 drivers/cxl/core/suspend.c | 5 ++++-
 drivers/cxl/cxlmem.h       | 9 ---------
 include/linux/pm.h         | 7 -------
 5 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index cf1ba673b8c2..d09144c2002e 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -118,10 +118,6 @@ config CXL_PORT
 	default CXL_BUS
 	tristate
 
-config CXL_SUSPEND
-	def_bool y
-	depends on SUSPEND && CXL_MEM
-
 config CXL_REGION
 	bool "CXL: Region Support"
 	default CXL_BUS
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 086df97a0fcf..035864db8a32 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += cxl_core.o
-obj-$(CONFIG_CXL_SUSPEND) += suspend.o
+obj-y += suspend.o
 
 ccflags-y += -I$(srctree)/drivers/cxl
 CFLAGS_trace.o = -DTRACE_INCLUDE_PATH=. -I$(src)
diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
index 29aa5cc5e565..5ba4b4de0e33 100644
--- a/drivers/cxl/core/suspend.c
+++ b/drivers/cxl/core/suspend.c
@@ -8,7 +8,10 @@ static atomic_t mem_active;
 
 bool cxl_mem_active(void)
 {
-	return atomic_read(&mem_active) != 0;
+	if (IS_ENABLED(CONFIG_CXL_MEM))
+		return atomic_read(&mem_active) != 0;
+
+	return false;
 }
 
 void cxl_mem_active_inc(void)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 3ec6b906371b..1bd1e88c4cc0 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -853,17 +853,8 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
 
-#ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
-#else
-static inline void cxl_mem_active_inc(void)
-{
-}
-static inline void cxl_mem_active_dec(void)
-{
-}
-#endif
 
 int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
 
diff --git a/include/linux/pm.h b/include/linux/pm.h
index f0bd8fbae4f2..415928e0b6ca 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -35,14 +35,7 @@ static inline void pm_vt_switch_unregister(struct device *dev)
 }
 #endif /* CONFIG_VT_CONSOLE_SLEEP */
 
-#ifdef CONFIG_CXL_SUSPEND
 bool cxl_mem_active(void);
-#else
-static inline bool cxl_mem_active(void)
-{
-	return false;
-}
-#endif
 
 /*
  * Device power management
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 3/7] cxl/pci: Add pci_loaded tracking to mark PCI driver readiness
  2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
  2025-06-03 22:19 ` [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region() Smita Koralahalli
  2025-06-03 22:19 ` [PATCH v4 2/7] cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o Smita Koralahalli
@ 2025-06-03 22:19 ` Smita Koralahalli
  2025-06-04  9:29   ` Zhijian Li (Fujitsu)
  2025-06-03 22:19 ` [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe Smita Koralahalli
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Smita Koralahalli @ 2025-06-03 22:19 UTC (permalink / raw)
  To: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Smita Koralahalli, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati, Zhijian Li

Introduce a pci_loaded flag similar to mem_active, and define
mark_cxl_pci_loaded() to indicate when the PCI driver has initialized.

This will be used by other CXL components, such as cxl_acpi and the soft
reserved resource handling logic, to coordinate initialization and ensure
that dependent operations proceed only after both cxl_pci and cxl_mem
drivers are ready.

Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
 drivers/cxl/core/suspend.c | 8 ++++++++
 drivers/cxl/cxlpci.h       | 1 +
 drivers/cxl/pci.c          | 2 ++
 3 files changed, 11 insertions(+)

diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
index 5ba4b4de0e33..72818a2c8ec8 100644
--- a/drivers/cxl/core/suspend.c
+++ b/drivers/cxl/core/suspend.c
@@ -3,8 +3,10 @@
 #include <linux/atomic.h>
 #include <linux/export.h>
 #include "cxlmem.h"
+#include "cxlpci.h"
 
 static atomic_t mem_active;
+static atomic_t pci_loaded;
 
 bool cxl_mem_active(void)
 {
@@ -25,3 +27,9 @@ void cxl_mem_active_dec(void)
 	atomic_dec(&mem_active);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
+
+void mark_cxl_pci_loaded(void)
+{
+	atomic_inc(&pci_loaded);
+}
+EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 54e219b0049e..5a811ac63fcf 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -135,4 +135,5 @@ void read_cdat_data(struct cxl_port *port);
 void cxl_cor_error_detected(struct pci_dev *pdev);
 pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
 				    pci_channel_state_t state);
+void mark_cxl_pci_loaded(void);
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 785aa2af5eaa..b019bd324dba 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1189,6 +1189,8 @@ static int __init cxl_pci_driver_init(void)
 	if (rc)
 		pci_unregister_driver(&cxl_pci_driver);
 
+	mark_cxl_pci_loaded();
+
 	return rc;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe
  2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
                   ` (2 preceding siblings ...)
  2025-06-03 22:19 ` [PATCH v4 3/7] cxl/pci: Add pci_loaded tracking to mark PCI driver readiness Smita Koralahalli
@ 2025-06-03 22:19 ` Smita Koralahalli
  2025-06-03 23:45   ` Dave Jiang
  2025-06-04  9:40   ` Zhijian Li (Fujitsu)
  2025-06-03 22:19 ` [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown Smita Koralahalli
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Smita Koralahalli @ 2025-06-03 22:19 UTC (permalink / raw)
  To: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Smita Koralahalli, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati, Zhijian Li

Introduce a waitqueue mechanism to coordinate initialization between the
cxl_pci and cxl_mem drivers.

Launch a background worker from cxl_acpi_probe() that waits for both
drivers to complete initialization before invoking wait_for_device_probe().
Without this, the probe completion wait could begin prematurely, before
the drivers are present, leading to missed updates.

Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
 drivers/cxl/acpi.c         | 23 +++++++++++++++++++++++
 drivers/cxl/core/suspend.c | 21 +++++++++++++++++++++
 drivers/cxl/cxl.h          |  2 ++
 3 files changed, 46 insertions(+)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index cb14829bb9be..978f63b32b41 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -813,6 +813,24 @@ static int pair_cxl_resource(struct device *dev, void *data)
 	return 0;
 }
 
+static void cxl_softreserv_mem_work_fn(struct work_struct *work)
+{
+	/* Wait for cxl_pci and cxl_mem drivers to load */
+	cxl_wait_for_pci_mem();
+
+	/*
+	 * Wait for the driver probe routines to complete after cxl_pci
+	 * and cxl_mem drivers are loaded.
+	 */
+	wait_for_device_probe();
+}
+static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
+
+static void cxl_softreserv_mem_update(void)
+{
+	schedule_work(&cxl_sr_work);
+}
+
 static int cxl_acpi_probe(struct platform_device *pdev)
 {
 	int rc;
@@ -887,6 +905,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 
 	/* In case PCI is scanned before ACPI re-trigger memdev attach */
 	cxl_bus_rescan();
+
+	/* Update SOFT RESERVE resources that intersect with CXL regions */
+	cxl_softreserv_mem_update();
+
 	return 0;
 }
 
@@ -918,6 +940,7 @@ static int __init cxl_acpi_init(void)
 
 static void __exit cxl_acpi_exit(void)
 {
+	cancel_work_sync(&cxl_sr_work);
 	platform_driver_unregister(&cxl_acpi_driver);
 	cxl_bus_drain();
 }
diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
index 72818a2c8ec8..c0d8f70aed56 100644
--- a/drivers/cxl/core/suspend.c
+++ b/drivers/cxl/core/suspend.c
@@ -2,12 +2,15 @@
 /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
 #include <linux/atomic.h>
 #include <linux/export.h>
+#include <linux/wait.h>
 #include "cxlmem.h"
 #include "cxlpci.h"
 
 static atomic_t mem_active;
 static atomic_t pci_loaded;
 
+static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
+
 bool cxl_mem_active(void)
 {
 	if (IS_ENABLED(CONFIG_CXL_MEM))
@@ -19,6 +22,7 @@ bool cxl_mem_active(void)
 void cxl_mem_active_inc(void)
 {
 	atomic_inc(&mem_active);
+	wake_up(&cxl_wait_queue);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
 
@@ -28,8 +32,25 @@ void cxl_mem_active_dec(void)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
 
+static bool cxl_pci_loaded(void)
+{
+	if (IS_ENABLED(CONFIG_CXL_PCI))
+		return atomic_read(&pci_loaded) != 0;
+
+	return false;
+}
+
 void mark_cxl_pci_loaded(void)
 {
 	atomic_inc(&pci_loaded);
+	wake_up(&cxl_wait_queue);
 }
 EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
+
+void cxl_wait_for_pci_mem(void)
+{
+	if (!wait_event_timeout(cxl_wait_queue, cxl_pci_loaded() &&
+				cxl_mem_active(), 30 * HZ))
+		pr_debug("Timeout waiting for cxl_pci or cxl_mem probing\n");
+}
+EXPORT_SYMBOL_NS_GPL(cxl_wait_for_pci_mem, "CXL");
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index a9ab46eb0610..1ba7d39c2991 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -902,6 +902,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
 
 bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
 
+void cxl_wait_for_pci_mem(void);
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown
  2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
                   ` (3 preceding siblings ...)
  2025-06-03 22:19 ` [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe Smita Koralahalli
@ 2025-06-03 22:19 ` Smita Koralahalli
  2025-06-06  4:16   ` Zhijian Li (Fujitsu)
                     ` (2 more replies)
  2025-06-03 22:19 ` [PATCH v4 6/7] dax/hmem: Save the DAX HMEM platform device pointer Smita Koralahalli
                   ` (2 subsequent siblings)
  7 siblings, 3 replies; 40+ messages in thread
From: Smita Koralahalli @ 2025-06-03 22:19 UTC (permalink / raw)
  To: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Smita Koralahalli, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati, Zhijian Li

Reworked from a patch by Alison Schofield <alison.schofield@intel.com>

Previously, when CXL regions were created through autodiscovery and their
resources overlapped with SOFT RESERVED ranges, the soft reserved resource
remained in place after region teardown. This left the HPA range
unavailable for reuse even after the region was destroyed.

Enhance the logic to reliably remove SOFT RESERVED resources associated
with a region, regardless of alignment or hierarchy in the iomem tree.

Link: https://lore.kernel.org/linux-cxl/29312c0765224ae76862d59a17748c8188fb95f1.1692638817.git.alison.schofield@intel.com/
Co-developed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
 drivers/cxl/acpi.c        |   2 +
 drivers/cxl/core/region.c | 151 ++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h         |   5 ++
 3 files changed, 158 insertions(+)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 978f63b32b41..1b1388feb36d 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -823,6 +823,8 @@ static void cxl_softreserv_mem_work_fn(struct work_struct *work)
 	 * and cxl_mem drivers are loaded.
 	 */
 	wait_for_device_probe();
+
+	cxl_region_softreserv_update();
 }
 static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
 
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 109b8a98c4c7..3a5ca44d65f3 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3443,6 +3443,157 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
 
+static int add_soft_reserved(resource_size_t start, resource_size_t len,
+			     unsigned long flags)
+{
+	struct resource *res = kmalloc(sizeof(*res), GFP_KERNEL);
+	int rc;
+
+	if (!res)
+		return -ENOMEM;
+
+	*res = DEFINE_RES_MEM_NAMED(start, len, "Soft Reserved");
+
+	res->desc = IORES_DESC_SOFT_RESERVED;
+	res->flags = flags;
+	rc = insert_resource(&iomem_resource, res);
+	if (rc) {
+		kfree(res);
+		return rc;
+	}
+
+	return 0;
+}
+
+static void remove_soft_reserved(struct cxl_region *cxlr, struct resource *soft,
+				 resource_size_t start, resource_size_t end)
+{
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	resource_size_t new_start, new_end;
+	int rc;
+
+	/* Prevent new usage while removing or adjusting the resource */
+	guard(mutex)(&cxlrd->range_lock);
+
+	/* Aligns at both resource start and end */
+	if (soft->start == start && soft->end == end)
+		goto remove;
+
+	/* Aligns at either resource start or end */
+	if (soft->start == start || soft->end == end) {
+		if (soft->start == start) {
+			new_start = end + 1;
+			new_end = soft->end;
+		} else {
+			new_start = soft->start;
+			new_end = start - 1;
+		}
+
+		rc = add_soft_reserved(new_start, new_end - new_start + 1,
+				       soft->flags);
+		if (rc)
+			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
+				 &new_start);
+
+		/* Remove the original Soft Reserved resource */
+		goto remove;
+	}
+
+	/*
+	 * No alignment. Attempt a 3-way split that removes the part of
+	 * the resource the region occupied, and then creates new soft
+	 * reserved resources for the leading and trailing addr space.
+	 */
+	new_start = soft->start;
+	new_end = soft->end;
+
+	rc = add_soft_reserved(new_start, start - new_start, soft->flags);
+	if (rc)
+		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
+			 &new_start);
+
+	rc = add_soft_reserved(end + 1, new_end - end, soft->flags);
+	if (rc)
+		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa + 1\n",
+			 &end);
+
+remove:
+	rc = remove_resource(soft);
+	if (rc)
+		dev_warn(&cxlr->dev, "cannot remove soft reserved resource %pr\n",
+			 soft);
+}
+
+/*
+ * normalize_resource
+ *
+ * The walk_iomem_res_desc() returns a copy of a resource, not a reference
+ * to the actual resource in the iomem_resource tree. As a result,
+ * __release_resource() which relies on pointer equality will fail.
+ *
+ * This helper walks the children of the resource's parent to find and
+ * return the original resource pointer that matches the given resource's
+ * start and end addresses.
+ *
+ * Return: Pointer to the matching original resource in iomem_resource, or
+ *         NULL if not found or invalid input.
+ */
+static struct resource *normalize_resource(struct resource *res)
+{
+	if (!res || !res->parent)
+		return NULL;
+
+	for (struct resource *res_iter = res->parent->child;
+	     res_iter != NULL; res_iter = res_iter->sibling) {
+		if ((res_iter->start == res->start) &&
+		    (res_iter->end == res->end))
+			return res_iter;
+	}
+
+	return NULL;
+}
+
+static int __cxl_region_softreserv_update(struct resource *soft,
+					  void *_cxlr)
+{
+	struct cxl_region *cxlr = _cxlr;
+	struct resource *res = cxlr->params.res;
+
+	/* Skip non-intersecting soft-reserved regions */
+	if (soft->end < res->start || soft->start > res->end)
+		return 0;
+
+	soft = normalize_resource(soft);
+	if (!soft)
+		return -EINVAL;
+
+	remove_soft_reserved(cxlr, soft, res->start, res->end);
+
+	return 0;
+}
+
+int cxl_region_softreserv_update(void)
+{
+	struct device *dev = NULL;
+
+	while ((dev = bus_find_next_device(&cxl_bus_type, dev))) {
+		struct device *put_dev __free(put_device) = dev;
+		struct cxl_region *cxlr;
+
+		if (!is_cxl_region(dev))
+			continue;
+
+		cxlr = to_cxl_region(dev);
+
+		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
+				    IORESOURCE_MEM, 0, -1, cxlr,
+				    __cxl_region_softreserv_update);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
+
 u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa)
 {
 	struct cxl_region_ref *iter;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 1ba7d39c2991..fc39c4b24745 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -859,6 +859,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
 int cxl_add_to_region(struct cxl_port *root,
 		      struct cxl_endpoint_decoder *cxled);
 struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
+int cxl_region_softreserv_update(void);
 u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
 #else
 static inline bool is_cxl_pmem_region(struct device *dev)
@@ -878,6 +879,10 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
 {
 	return NULL;
 }
+static inline int cxl_region_softreserv_update(void)
+{
+	return 0;
+}
 static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
 					       u64 spa)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 6/7] dax/hmem: Save the DAX HMEM platform device pointer
  2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
                   ` (4 preceding siblings ...)
  2025-06-03 22:19 ` [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown Smita Koralahalli
@ 2025-06-03 22:19 ` Smita Koralahalli
  2025-06-05 16:54   ` Dave Jiang
  2025-06-03 22:19 ` [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation Smita Koralahalli
  2025-06-04  8:43 ` [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Zhijian Li (Fujitsu)
  7 siblings, 1 reply; 40+ messages in thread
From: Smita Koralahalli @ 2025-06-03 22:19 UTC (permalink / raw)
  To: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Smita Koralahalli, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati, Zhijian Li

From: Nathan Fontenot <nathan.fontenot@amd.com>

To enable registration of HMEM devices for SOFT RESERVED regions after
the DAX HMEM device is initialized, this patch saves a reference to the
DAX HMEM platform device.

This saved pointer will be used in a follow-up patch to allow late
registration of SOFT RESERVED memory ranges. It also enables
simplification of walk_hmem_resources() by removing the need to
pass a struct device argument.

There are no functional changes.

Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
 drivers/dax/hmem/device.c | 4 ++--
 drivers/dax/hmem/hmem.c   | 9 ++++++---
 include/linux/dax.h       | 5 ++---
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
index f9e1a76a04a9..59ad44761191 100644
--- a/drivers/dax/hmem/device.c
+++ b/drivers/dax/hmem/device.c
@@ -17,14 +17,14 @@ static struct resource hmem_active = {
 	.flags = IORESOURCE_MEM,
 };
 
-int walk_hmem_resources(struct device *host, walk_hmem_fn fn)
+int walk_hmem_resources(walk_hmem_fn fn)
 {
 	struct resource *res;
 	int rc = 0;
 
 	mutex_lock(&hmem_resource_lock);
 	for (res = hmem_active.child; res; res = res->sibling) {
-		rc = fn(host, (int) res->desc, res);
+		rc = fn((int) res->desc, res);
 		if (rc)
 			break;
 	}
diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
index 5e7c53f18491..3aedef5f1be1 100644
--- a/drivers/dax/hmem/hmem.c
+++ b/drivers/dax/hmem/hmem.c
@@ -9,6 +9,8 @@
 static bool region_idle;
 module_param_named(region_idle, region_idle, bool, 0644);
 
+static struct platform_device *dax_hmem_pdev;
+
 static int dax_hmem_probe(struct platform_device *pdev)
 {
 	unsigned long flags = IORESOURCE_DAX_KMEM;
@@ -59,9 +61,9 @@ static void release_hmem(void *pdev)
 	platform_device_unregister(pdev);
 }
 
-static int hmem_register_device(struct device *host, int target_nid,
-				const struct resource *res)
+static int hmem_register_device(int target_nid, const struct resource *res)
 {
+	struct device *host = &dax_hmem_pdev->dev;
 	struct platform_device *pdev;
 	struct memregion_info info;
 	long id;
@@ -125,7 +127,8 @@ static int hmem_register_device(struct device *host, int target_nid,
 
 static int dax_hmem_platform_probe(struct platform_device *pdev)
 {
-	return walk_hmem_resources(&pdev->dev, hmem_register_device);
+	dax_hmem_pdev = pdev;
+	return walk_hmem_resources(hmem_register_device);
 }
 
 static struct platform_driver dax_hmem_platform_driver = {
diff --git a/include/linux/dax.h b/include/linux/dax.h
index dcc9fcdf14e4..a4ad3708ea35 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -305,7 +305,6 @@ static inline void hmem_register_resource(int target_nid, struct resource *r)
 }
 #endif
 
-typedef int (*walk_hmem_fn)(struct device *dev, int target_nid,
-			    const struct resource *res);
-int walk_hmem_resources(struct device *dev, walk_hmem_fn fn);
+typedef int (*walk_hmem_fn)(int target_nid, const struct resource *res);
+int walk_hmem_resources(walk_hmem_fn fn);
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation
  2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
                   ` (5 preceding siblings ...)
  2025-06-03 22:19 ` [PATCH v4 6/7] dax/hmem: Save the DAX HMEM platform device pointer Smita Koralahalli
@ 2025-06-03 22:19 ` Smita Koralahalli
  2025-06-09 13:01   ` Jonathan Cameron
                     ` (2 more replies)
  2025-06-04  8:43 ` [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Zhijian Li (Fujitsu)
  7 siblings, 3 replies; 40+ messages in thread
From: Smita Koralahalli @ 2025-06-03 22:19 UTC (permalink / raw)
  To: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Smita Koralahalli, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati, Zhijian Li

From: Nathan Fontenot <nathan.fontenot@amd.com>

The DAX HMEM driver currently consumes all SOFT RESERVED iomem resources
during initialization. This interferes with the CXL driver’s ability to
create regions and trim overlapping SOFT RESERVED ranges before DAX uses
them.

To resolve this, defer the DAX driver's resource consumption if the
cxl_acpi driver is enabled. The DAX HMEM initialization skips walking the
iomem resource tree in this case. After CXL region creation completes,
any remaining SOFT RESERVED resources are explicitly registered with the
DAX driver by the CXL driver.

This sequencing ensures proper handling of overlaps and fixes hotplug
failures.

Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
 drivers/cxl/core/region.c | 10 +++++++++
 drivers/dax/hmem/device.c | 43 ++++++++++++++++++++-------------------
 drivers/dax/hmem/hmem.c   |  3 ++-
 include/linux/dax.h       |  6 ++++++
 4 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 3a5ca44d65f3..c6c0c7ba3b20 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -10,6 +10,7 @@
 #include <linux/sort.h>
 #include <linux/idr.h>
 #include <linux/memory-tiers.h>
+#include <linux/dax.h>
 #include <cxlmem.h>
 #include <cxl.h>
 #include "core.h"
@@ -3553,6 +3554,11 @@ static struct resource *normalize_resource(struct resource *res)
 	return NULL;
 }
 
+static int cxl_softreserv_mem_register(struct resource *res, void *unused)
+{
+	return hmem_register_device(phys_to_target_node(res->start), res);
+}
+
 static int __cxl_region_softreserv_update(struct resource *soft,
 					  void *_cxlr)
 {
@@ -3590,6 +3596,10 @@ int cxl_region_softreserv_update(void)
 				    __cxl_region_softreserv_update);
 	}
 
+	/* Now register any remaining SOFT RESERVES with DAX */
+	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
+			    0, -1, NULL, cxl_softreserv_mem_register);
+
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
index 59ad44761191..cc1ed7bbdb1a 100644
--- a/drivers/dax/hmem/device.c
+++ b/drivers/dax/hmem/device.c
@@ -8,7 +8,6 @@
 static bool nohmem;
 module_param_named(disable, nohmem, bool, 0444);
 
-static bool platform_initialized;
 static DEFINE_MUTEX(hmem_resource_lock);
 static struct resource hmem_active = {
 	.name = "HMEM devices",
@@ -35,9 +34,7 @@ EXPORT_SYMBOL_GPL(walk_hmem_resources);
 
 static void __hmem_register_resource(int target_nid, struct resource *res)
 {
-	struct platform_device *pdev;
 	struct resource *new;
-	int rc;
 
 	new = __request_region(&hmem_active, res->start, resource_size(res), "",
 			       0);
@@ -47,21 +44,6 @@ static void __hmem_register_resource(int target_nid, struct resource *res)
 	}
 
 	new->desc = target_nid;
-
-	if (platform_initialized)
-		return;
-
-	pdev = platform_device_alloc("hmem_platform", 0);
-	if (!pdev) {
-		pr_err_once("failed to register device-dax hmem_platform device\n");
-		return;
-	}
-
-	rc = platform_device_add(pdev);
-	if (rc)
-		platform_device_put(pdev);
-	else
-		platform_initialized = true;
 }
 
 void hmem_register_resource(int target_nid, struct resource *res)
@@ -83,9 +65,28 @@ static __init int hmem_register_one(struct resource *res, void *data)
 
 static __init int hmem_init(void)
 {
-	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
-			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
-	return 0;
+	struct platform_device *pdev;
+	int rc;
+
+	if (!IS_ENABLED(CONFIG_CXL_ACPI)) {
+		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
+				    IORESOURCE_MEM, 0, -1, NULL,
+				    hmem_register_one);
+	}
+
+	pdev = platform_device_alloc("hmem_platform", 0);
+	if (!pdev) {
+		pr_err("failed to register device-dax hmem_platform device\n");
+		return -1;
+	}
+
+	rc = platform_device_add(pdev);
+	if (rc) {
+		pr_err("failed to add device-dax hmem_platform device\n");
+		platform_device_put(pdev);
+	}
+
+	return rc;
 }
 
 /*
diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
index 3aedef5f1be1..a206b9b383e4 100644
--- a/drivers/dax/hmem/hmem.c
+++ b/drivers/dax/hmem/hmem.c
@@ -61,7 +61,7 @@ static void release_hmem(void *pdev)
 	platform_device_unregister(pdev);
 }
 
-static int hmem_register_device(int target_nid, const struct resource *res)
+int hmem_register_device(int target_nid, const struct resource *res)
 {
 	struct device *host = &dax_hmem_pdev->dev;
 	struct platform_device *pdev;
@@ -124,6 +124,7 @@ static int hmem_register_device(int target_nid, const struct resource *res)
 	platform_device_put(pdev);
 	return rc;
 }
+EXPORT_SYMBOL_GPL(hmem_register_device);
 
 static int dax_hmem_platform_probe(struct platform_device *pdev)
 {
diff --git a/include/linux/dax.h b/include/linux/dax.h
index a4ad3708ea35..5052dca8b3bc 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -299,10 +299,16 @@ static inline int dax_mem2blk_err(int err)
 
 #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
 void hmem_register_resource(int target_nid, struct resource *r);
+int hmem_register_device(int target_nid, const struct resource *res);
 #else
 static inline void hmem_register_resource(int target_nid, struct resource *r)
 {
 }
+
+static inline int hmem_register_device(int target_nid, const struct resource *res)
+{
+	return 0;
+}
 #endif
 
 typedef int (*walk_hmem_fn)(int target_nid, const struct resource *res);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe
  2025-06-03 22:19 ` [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe Smita Koralahalli
@ 2025-06-03 23:45   ` Dave Jiang
  2025-06-04 18:31     ` Koralahalli Channabasappa, Smita
  2025-06-04  9:40   ` Zhijian Li (Fujitsu)
  1 sibling, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-06-03 23:45 UTC (permalink / raw)
  To: Smita Koralahalli, linux-cxl, linux-kernel, nvdimm, linux-fsdevel,
	linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li



On 6/3/25 3:19 PM, Smita Koralahalli wrote:
> Introduce a waitqueue mechanism to coordinate initialization between the
> cxl_pci and cxl_mem drivers.
> 
> Launch a background worker from cxl_acpi_probe() that waits for both
> drivers to complete initialization before invoking wait_for_device_probe().
> Without this, the probe completion wait could begin prematurely, before
> the drivers are present, leading to missed updates.
> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  drivers/cxl/acpi.c         | 23 +++++++++++++++++++++++
>  drivers/cxl/core/suspend.c | 21 +++++++++++++++++++++
>  drivers/cxl/cxl.h          |  2 ++
>  3 files changed, 46 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index cb14829bb9be..978f63b32b41 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -813,6 +813,24 @@ static int pair_cxl_resource(struct device *dev, void *data)
>  	return 0;
>  }
>  
> +static void cxl_softreserv_mem_work_fn(struct work_struct *work)
> +{
> +	/* Wait for cxl_pci and cxl_mem drivers to load */
> +	cxl_wait_for_pci_mem();
> +
> +	/*
> +	 * Wait for the driver probe routines to complete after cxl_pci
> +	 * and cxl_mem drivers are loaded.
> +	 */
> +	wait_for_device_probe();
> +}
> +static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
> +
> +static void cxl_softreserv_mem_update(void)
> +{
> +	schedule_work(&cxl_sr_work);
> +}
> +
>  static int cxl_acpi_probe(struct platform_device *pdev)
>  {
>  	int rc;
> @@ -887,6 +905,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>  
>  	/* In case PCI is scanned before ACPI re-trigger memdev attach */
>  	cxl_bus_rescan();
> +
> +	/* Update SOFT RESERVE resources that intersect with CXL regions */
> +	cxl_softreserv_mem_update();
> +
>  	return 0;
>  }
>  
> @@ -918,6 +940,7 @@ static int __init cxl_acpi_init(void)
>  
>  static void __exit cxl_acpi_exit(void)
>  {
> +	cancel_work_sync(&cxl_sr_work);
>  	platform_driver_unregister(&cxl_acpi_driver);
>  	cxl_bus_drain();
>  }
> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
> index 72818a2c8ec8..c0d8f70aed56 100644
> --- a/drivers/cxl/core/suspend.c
> +++ b/drivers/cxl/core/suspend.c
> @@ -2,12 +2,15 @@
>  /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
>  #include <linux/atomic.h>
>  #include <linux/export.h>
> +#include <linux/wait.h>
>  #include "cxlmem.h"
>  #include "cxlpci.h"
>  
>  static atomic_t mem_active;
>  static atomic_t pci_loaded;
>  
> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
> +
>  bool cxl_mem_active(void)
>  {
>  	if (IS_ENABLED(CONFIG_CXL_MEM))
> @@ -19,6 +22,7 @@ bool cxl_mem_active(void)
>  void cxl_mem_active_inc(void)
>  {
>  	atomic_inc(&mem_active);
> +	wake_up(&cxl_wait_queue);
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
>  
> @@ -28,8 +32,25 @@ void cxl_mem_active_dec(void)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
>  
> +static bool cxl_pci_loaded(void)
> +{
> +	if (IS_ENABLED(CONFIG_CXL_PCI))
> +		return atomic_read(&pci_loaded) != 0;
> +
> +	return false;
> +}
> +
>  void mark_cxl_pci_loaded(void)
>  {
>  	atomic_inc(&pci_loaded);
> +	wake_up(&cxl_wait_queue);
>  }
>  EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
> +
> +void cxl_wait_for_pci_mem(void)
> +{
> +	if (!wait_event_timeout(cxl_wait_queue, cxl_pci_loaded() &&
> +				cxl_mem_active(), 30 * HZ))

I'm trying to understand why cxl_pci_loaded() is needed. cxl_mem_active() goes above 0 when a cxl_mem_probe() instance succeeds. cxl_mem_probe() being triggered implies that an instance of cxl_pci_probe() has already been called, since cxl_mem_probe() is triggered from devm_cxl_add_memdev() once the memdev has been added and the cxl_mem driver has been loaded. So does cxl_mem_active() not imply cxl_pci_loaded(), making it unnecessary?

DJ


> +		pr_debug("Timeout waiting for cxl_pci or cxl_mem probing\n");
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_wait_for_pci_mem, "CXL");
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index a9ab46eb0610..1ba7d39c2991 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -902,6 +902,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>  
>  bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>  
> +void cxl_wait_for_pci_mem(void);
> +
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
>   * of these symbols in tools/testing/cxl/.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region()
  2025-06-03 22:19 ` [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region() Smita Koralahalli
@ 2025-06-03 23:49   ` Dave Jiang
  2025-06-04 19:56     ` Nathan Fontenot
  0 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-06-03 23:49 UTC (permalink / raw)
  To: Smita Koralahalli, linux-cxl, linux-kernel, nvdimm, linux-fsdevel,
	linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li



On 6/3/25 3:19 PM, Smita Koralahalli wrote:
> Add a NULL check in is_cxl_region() to prevent potential null pointer
> dereference if a caller passes a NULL device. This change ensures the
> function safely returns false instead of triggering undefined behavior
> when dev is NULL.

Don't think this change is necessary. The code paths should not be hitting any NULL region devices unless it's a programming error.

> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  drivers/cxl/core/region.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index c3f4dc244df7..109b8a98c4c7 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2333,7 +2333,7 @@ const struct device_type cxl_region_type = {
>  
>  bool is_cxl_region(struct device *dev)
>  {
> -	return dev->type == &cxl_region_type;
> +	return dev && dev->type == &cxl_region_type;
>  }
>  EXPORT_SYMBOL_NS_GPL(is_cxl_region, "CXL");
>  


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 0/7] Add managed SOFT RESERVE resource handling
  2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
                   ` (6 preceding siblings ...)
  2025-06-03 22:19 ` [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation Smita Koralahalli
@ 2025-06-04  8:43 ` Zhijian Li (Fujitsu)
  2025-06-04 18:59   ` Koralahalli Channabasappa, Smita
  7 siblings, 1 reply; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-06-04  8:43 UTC (permalink / raw)
  To: Smita Koralahalli, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati

Smita,

Thanks for your awesome work. I just tested the scenarios you listed, and they work as expected. Thanks again.
(Minor comments inlined)

Tested-by: Li Zhijian <lizhijian@fujitsu.com>


To the CXL community,

The scenarios mentioned here essentially cover what a correct firmware may provide. However,
I would like to discuss one more scenario that I can simulate with a modified QEMU:
The E820 exposes a SOFT RESERVED region which is the same as a CFMW, but the HDM decoders are not committed. This means no region will be auto-created during boot.

As an example, after boot, the iomem tree is as follows:
1050000000-304fffffff : CXL Window 0
   1050000000-304fffffff : Soft Reserved
     <No region>

In this case, the SOFT RESERVED resource is not trimmed, so the end-user cannot create a new region.
My question is: Is this scenario a problem? If it is, should we fix it in this patchset or create a new patch?




On 04/06/2025 06:19, Smita Koralahalli wrote:
> Add the ability to manage SOFT RESERVE iomem resources prior to them being
> added to the iomem resource tree. This allows drivers, such as CXL, to
> remove any pieces of the SOFT RESERVE resource that intersect with created
> CXL regions.
> 
> The current approach of leaving the SOFT RESERVE resources as is can cause
> failures during hotplug of devices, such as CXL, because the resource is
> not available for reuse after teardown of the device.
> 
> The approach is to add SOFT RESERVE resources to a separate tree during
> boot.

There is no separate tree at all since v3.


> This allows any drivers to update the SOFT RESERVE resources before
> they are merged into the iomem resource tree. In addition, a notifier chain
> is added so that drivers can be notified when these SOFT RESERVE resources
> are added to the iomem resource tree.
> 
> The CXL driver is modified to use a worker thread that waits for the CXL
> PCI and CXL mem drivers to be loaded and for their probe routine to
> complete. Then the driver walks through any created CXL regions to trim any
> intersections with SOFT RESERVE resources in the iomem tree.
> 
> The dax driver uses the new soft reserve notifier chain so it can consume
> any remaining SOFT RESERVES once they're added to the iomem tree.
> 
> The following scenarios have been tested:
> 
> Example 1: Exact alignment, soft reserved is a child of the region
> 
> |---------- "Soft Reserved" -----------|
> |-------------- "Region #" ------------|
> 
> Before:
>    1050000000-304fffffff : CXL Window 0
>      1050000000-304fffffff : region0
>        1050000000-304fffffff : Soft Reserved
>          1080000000-2fffffffff : dax0.0

BTW, I'm curious how to set up a dax with an address range different from its corresponding region.


>            1080000000-2fffffffff : System RAM (kmem)
> 
> After:
>    1050000000-304fffffff : CXL Window 0
>      1050000000-304fffffff : region1
>        1080000000-2fffffffff : dax0.0
>          1080000000-2fffffffff : System RAM (kmem)
> 
> Example 2: Start and/or end aligned and soft reserved spans multiple
> regions

Tested

> 
> |----------- "Soft Reserved" -----------|
> |-------- "Region #" -------|
> or
> |----------- "Soft Reserved" -----------|
> |-------- "Region #" -------|

Typo? should be:
|----------- "Soft Reserved" -----------|
             |-------- "Region #" -------|

> 
> Example 3: No alignment
> |---------- "Soft Reserved" ----------|
> 	|---- "Region #" ----|

Tested.


Thanks
Zhijian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 3/7] cxl/pci: Add pci_loaded tracking to mark PCI driver readiness
  2025-06-03 22:19 ` [PATCH v4 3/7] cxl/pci: Add pci_loaded tracking to mark PCI driver readiness Smita Koralahalli
@ 2025-06-04  9:29   ` Zhijian Li (Fujitsu)
  0 siblings, 0 replies; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-06-04  9:29 UTC (permalink / raw)
  To: Smita Koralahalli, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati



On 04/06/2025 06:19, Smita Koralahalli wrote:
> Introduce a pci_loaded flag similar to mem_active, and define
> mark_cxl_pci_loaded() to indicate when the PCI driver has initialized.
> 
> This will be used by other CXL components, such as cxl_acpi and the soft
> reserved resource handling logic, to coordinate initialization and ensure
> that dependent operations proceed only after both cxl_pci and cxl_mem
> drivers are ready.
> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>   drivers/cxl/core/suspend.c | 8 ++++++++
>   drivers/cxl/cxlpci.h       | 1 +
>   drivers/cxl/pci.c          | 2 ++
>   3 files changed, 11 insertions(+)
> 
> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
> index 5ba4b4de0e33..72818a2c8ec8 100644
> --- a/drivers/cxl/core/suspend.c
> +++ b/drivers/cxl/core/suspend.c
> @@ -3,8 +3,10 @@
>   #include <linux/atomic.h>
>   #include <linux/export.h>
>   #include "cxlmem.h"
> +#include "cxlpci.h"
>   
>   static atomic_t mem_active;
> +static atomic_t pci_loaded;


I find it odd to place these changes in suspend.c. Also, I noticed that in the
review of a subsequent patch, DJ has mentioned (and I agree) that this patch is unnecessary.


Thanks
Zhijian


>   
>   bool cxl_mem_active(void)
>   {
> @@ -25,3 +27,9 @@ void cxl_mem_active_dec(void)
>   	atomic_dec(&mem_active);
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
> +
> +void mark_cxl_pci_loaded(void)
> +{
> +	atomic_inc(&pci_loaded);
> +}
> +EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 54e219b0049e..5a811ac63fcf 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -135,4 +135,5 @@ void read_cdat_data(struct cxl_port *port);
>   void cxl_cor_error_detected(struct pci_dev *pdev);
>   pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
>   				    pci_channel_state_t state);
> +void mark_cxl_pci_loaded(void);
>   #endif /* __CXL_PCI_H__ */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 785aa2af5eaa..b019bd324dba 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1189,6 +1189,8 @@ static int __init cxl_pci_driver_init(void)
>   	if (rc)
>   		pci_unregister_driver(&cxl_pci_driver);
>   
> +	mark_cxl_pci_loaded();
> +
>   	return rc;
>   }
>   

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe
  2025-06-03 22:19 ` [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe Smita Koralahalli
  2025-06-03 23:45   ` Dave Jiang
@ 2025-06-04  9:40   ` Zhijian Li (Fujitsu)
  2025-06-04 14:35     ` Dave Jiang
  1 sibling, 1 reply; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-06-04  9:40 UTC (permalink / raw)
  To: Smita Koralahalli, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati



On 04/06/2025 06:19, Smita Koralahalli wrote:
>   drivers/cxl/acpi.c         | 23 +++++++++++++++++++++++
>   drivers/cxl/core/suspend.c | 21 +++++++++++++++++++++
>   drivers/cxl/cxl.h          |  2 ++
>   3 files changed, 46 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index cb14829bb9be..978f63b32b41 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -813,6 +813,24 @@ static int pair_cxl_resource(struct device *dev, void *data)
>   	return 0;
>   }
>   
> +static void cxl_softreserv_mem_work_fn(struct work_struct *work)
> +{
> +	/* Wait for cxl_pci and cxl_mem drivers to load */
> +	cxl_wait_for_pci_mem();
> +
> +	/*
> +	 * Wait for the driver probe routines to complete after cxl_pci
> +	 * and cxl_mem drivers are loaded.
> +	 */
> +	wait_for_device_probe();
> +}
> +static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
> +
> +static void cxl_softreserv_mem_update(void)
> +{
> +	schedule_work(&cxl_sr_work);
> +}
> +
>   static int cxl_acpi_probe(struct platform_device *pdev)
>   {
>   	int rc;
> @@ -887,6 +905,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>   
>   	/* In case PCI is scanned before ACPI re-trigger memdev attach */
>   	cxl_bus_rescan();
> +
> +	/* Update SOFT RESERVE resources that intersect with CXL regions */
> +	cxl_softreserv_mem_update();
> +
>   	return 0;
>   }
>   
> @@ -918,6 +940,7 @@ static int __init cxl_acpi_init(void)
>   
>   static void __exit cxl_acpi_exit(void)
>   {
> +	cancel_work_sync(&cxl_sr_work);
>   	platform_driver_unregister(&cxl_acpi_driver);
>   	cxl_bus_drain();
>   }
> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
> index 72818a2c8ec8..c0d8f70aed56 100644
> --- a/drivers/cxl/core/suspend.c
> +++ b/drivers/cxl/core/suspend.c
> @@ -2,12 +2,15 @@
>   /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
>   #include <linux/atomic.h>
>   #include <linux/export.h>
> +#include <linux/wait.h>
>   #include "cxlmem.h"
>   #include "cxlpci.h"
>   
>   static atomic_t mem_active;
>   static atomic_t pci_loaded;
>   
> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);

Given that this file (suspend.c) focuses on power management functions,
it might be more appropriate to move the wait queue declaration and its
related changes to acpi.c, where its caller is.


Thanks
Zhijian

> +
>   bool cxl_mem_active(void)
>   {
>   	if (IS_ENABLED(CONFIG_CXL_MEM))
> @@ -19,6 +22,7 @@ bool cxl_mem_active(void)
>   void cxl_mem_active_inc(void)
>   {
>   	atomic_inc(&mem_active);
> +	wake_up(&cxl_wait_queue);
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
>   
> @@ -28,8 +32,25 @@ void cxl_mem_active_dec(void)
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
>   
> +static bool cxl_pci_loaded(void)
> +{
> +	if (IS_ENABLED(CONFIG_CXL_PCI))
> +		return atomic_read(&pci_loaded) != 0;
> +
> +	return false;
> +}
> +
>   void mark_cxl_pci_loaded(void)
>   {
>   	atomic_inc(&pci_loaded);
> +	wake_up(&cxl_wait_queue);
>   }
>   EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
> +
> +void cxl_wait_for_pci_mem(void)
> +{
> +	if (!wait_event_timeout(cxl_wait_queue, cxl_pci_loaded() &&
> +				cxl_mem_active(), 30 * HZ))
> +		pr_debug("Timeout waiting for cxl_pci or cxl_mem probing\n");
> +}

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe
  2025-06-04  9:40   ` Zhijian Li (Fujitsu)
@ 2025-06-04 14:35     ` Dave Jiang
  2025-06-05  2:18       ` Zhijian Li (Fujitsu)
  0 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-06-04 14:35 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu), Smita Koralahalli,
	linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
	nvdimm@lists.linux.dev, linux-fsdevel@vger.kernel.org,
	linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati



On 6/4/25 2:40 AM, Zhijian Li (Fujitsu) wrote:
> 
> 
> On 04/06/2025 06:19, Smita Koralahalli wrote:
>>   drivers/cxl/acpi.c         | 23 +++++++++++++++++++++++
>>   drivers/cxl/core/suspend.c | 21 +++++++++++++++++++++
>>   drivers/cxl/cxl.h          |  2 ++
>>   3 files changed, 46 insertions(+)
>>
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index cb14829bb9be..978f63b32b41 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -813,6 +813,24 @@ static int pair_cxl_resource(struct device *dev, void *data)
>>   	return 0;
>>   }
>>   
>> +static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>> +{
>> +	/* Wait for cxl_pci and cxl_mem drivers to load */
>> +	cxl_wait_for_pci_mem();
>> +
>> +	/*
>> +	 * Wait for the driver probe routines to complete after cxl_pci
>> +	 * and cxl_mem drivers are loaded.
>> +	 */
>> +	wait_for_device_probe();
>> +}
>> +static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>> +
>> +static void cxl_softreserv_mem_update(void)
>> +{
>> +	schedule_work(&cxl_sr_work);
>> +}
>> +
>>   static int cxl_acpi_probe(struct platform_device *pdev)
>>   {
>>   	int rc;
>> @@ -887,6 +905,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>>   
>>   	/* In case PCI is scanned before ACPI re-trigger memdev attach */
>>   	cxl_bus_rescan();
>> +
>> +	/* Update SOFT RESERVE resources that intersect with CXL regions */
>> +	cxl_softreserv_mem_update();
>> +
>>   	return 0;
>>   }
>>   
>> @@ -918,6 +940,7 @@ static int __init cxl_acpi_init(void)
>>   
>>   static void __exit cxl_acpi_exit(void)
>>   {
>> +	cancel_work_sync(&cxl_sr_work);
>>   	platform_driver_unregister(&cxl_acpi_driver);
>>   	cxl_bus_drain();
>>   }
>> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
>> index 72818a2c8ec8..c0d8f70aed56 100644
>> --- a/drivers/cxl/core/suspend.c
>> +++ b/drivers/cxl/core/suspend.c
>> @@ -2,12 +2,15 @@
>>   /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
>>   #include <linux/atomic.h>
>>   #include <linux/export.h>
>> +#include <linux/wait.h>
>>   #include "cxlmem.h"
>>   #include "cxlpci.h"
>>   
>>   static atomic_t mem_active;
>>   static atomic_t pci_loaded;
>>   
>> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
> 
> Given that this file (suspend.c) focuses on power management functions,
> it might be more appropriate to move the wait queue declaration and its
> related changes to acpi.c, where its caller is.

You mean drivers/cxl/acpi.c and not drivers/cxl/core/acpi.c, right? The core one is my mistake and I'm going to create a patch to remove that.

DJ

> 
> 
> Thanks
> Zhijian
> 
>> +
>>   bool cxl_mem_active(void)
>>   {
>>   	if (IS_ENABLED(CONFIG_CXL_MEM))
>> @@ -19,6 +22,7 @@ bool cxl_mem_active(void)
>>   void cxl_mem_active_inc(void)
>>   {
>>   	atomic_inc(&mem_active);
>> +	wake_up(&cxl_wait_queue);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
>>   
>> @@ -28,8 +32,25 @@ void cxl_mem_active_dec(void)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
>>   
>> +static bool cxl_pci_loaded(void)
>> +{
>> +	if (IS_ENABLED(CONFIG_CXL_PCI))
>> +		return atomic_read(&pci_loaded) != 0;
>> +
>> +	return false;
>> +}
>> +
>>   void mark_cxl_pci_loaded(void)
>>   {
>>   	atomic_inc(&pci_loaded);
>> +	wake_up(&cxl_wait_queue);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
>> +
>> +void cxl_wait_for_pci_mem(void)
>> +{
>> +	if (!wait_event_timeout(cxl_wait_queue, cxl_pci_loaded() &&
>> +				cxl_mem_active(), 30 * HZ))
>> +		pr_debug("Timeout waiting for cxl_pci or cxl_mem probing\n");
>> +}


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe
  2025-06-03 23:45   ` Dave Jiang
@ 2025-06-04 18:31     ` Koralahalli Channabasappa, Smita
  0 siblings, 0 replies; 40+ messages in thread
From: Koralahalli Channabasappa, Smita @ 2025-06-04 18:31 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-kernel, nvdimm, linux-fsdevel,
	linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On 6/3/2025 4:45 PM, Dave Jiang wrote:
> 
> 
> On 6/3/25 3:19 PM, Smita Koralahalli wrote:
>> Introduce a waitqueue mechanism to coordinate initialization between the
>> cxl_pci and cxl_mem drivers.
>>
>> Launch a background worker from cxl_acpi_probe() that waits for both
>> drivers to complete initialization before invoking wait_for_device_probe().
>> Without this, the probe completion wait could begin prematurely, before
>> the drivers are present, leading to missed updates.
>>
>> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>   drivers/cxl/acpi.c         | 23 +++++++++++++++++++++++
>>   drivers/cxl/core/suspend.c | 21 +++++++++++++++++++++
>>   drivers/cxl/cxl.h          |  2 ++
>>   3 files changed, 46 insertions(+)
>>
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index cb14829bb9be..978f63b32b41 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -813,6 +813,24 @@ static int pair_cxl_resource(struct device *dev, void *data)
>>   	return 0;
>>   }
>>   
>> +static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>> +{
>> +	/* Wait for cxl_pci and cxl_mem drivers to load */
>> +	cxl_wait_for_pci_mem();
>> +
>> +	/*
>> +	 * Wait for the driver probe routines to complete after cxl_pci
>> +	 * and cxl_mem drivers are loaded.
>> +	 */
>> +	wait_for_device_probe();
>> +}
>> +static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>> +
>> +static void cxl_softreserv_mem_update(void)
>> +{
>> +	schedule_work(&cxl_sr_work);
>> +}
>> +
>>   static int cxl_acpi_probe(struct platform_device *pdev)
>>   {
>>   	int rc;
>> @@ -887,6 +905,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>>   
>>   	/* In case PCI is scanned before ACPI re-trigger memdev attach */
>>   	cxl_bus_rescan();
>> +
>> +	/* Update SOFT RESERVE resources that intersect with CXL regions */
>> +	cxl_softreserv_mem_update();
>> +
>>   	return 0;
>>   }
>>   
>> @@ -918,6 +940,7 @@ static int __init cxl_acpi_init(void)
>>   
>>   static void __exit cxl_acpi_exit(void)
>>   {
>> +	cancel_work_sync(&cxl_sr_work);
>>   	platform_driver_unregister(&cxl_acpi_driver);
>>   	cxl_bus_drain();
>>   }
>> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
>> index 72818a2c8ec8..c0d8f70aed56 100644
>> --- a/drivers/cxl/core/suspend.c
>> +++ b/drivers/cxl/core/suspend.c
>> @@ -2,12 +2,15 @@
>>   /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
>>   #include <linux/atomic.h>
>>   #include <linux/export.h>
>> +#include <linux/wait.h>
>>   #include "cxlmem.h"
>>   #include "cxlpci.h"
>>   
>>   static atomic_t mem_active;
>>   static atomic_t pci_loaded;
>>   
>> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
>> +
>>   bool cxl_mem_active(void)
>>   {
>>   	if (IS_ENABLED(CONFIG_CXL_MEM))
>> @@ -19,6 +22,7 @@ bool cxl_mem_active(void)
>>   void cxl_mem_active_inc(void)
>>   {
>>   	atomic_inc(&mem_active);
>> +	wake_up(&cxl_wait_queue);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
>>   
>> @@ -28,8 +32,25 @@ void cxl_mem_active_dec(void)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
>>   
>> +static bool cxl_pci_loaded(void)
>> +{
>> +	if (IS_ENABLED(CONFIG_CXL_PCI))
>> +		return atomic_read(&pci_loaded) != 0;
>> +
>> +	return false;
>> +}
>> +
>>   void mark_cxl_pci_loaded(void)
>>   {
>>   	atomic_inc(&pci_loaded);
>> +	wake_up(&cxl_wait_queue);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
>> +
>> +void cxl_wait_for_pci_mem(void)
>> +{
>> +	if (!wait_event_timeout(cxl_wait_queue, cxl_pci_loaded() &&
>> +				cxl_mem_active(), 30 * HZ))
> 
> I'm trying to understand why cxl_pci_loaded() is needed. cxl_mem_active() goes above 0 when a cxl_mem_probe() instance succeeds. cxl_mem_probe() being triggered implies that an instance of cxl_pci_probe() has been called since cxl_mem_probe() is triggered from devm_cxl_add_memdev() with memdev being added and cxl_mem driver also have been loaded. So does cxl_mem_active() not imply cxl_pci_loaded() and makes it unnecessary?

Yeah you are right. I will remove this check.
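
For reference, the wait would then reduce to roughly the following (just a
sketch of the direction, untested):

void cxl_wait_for_pci_mem(void)
{
	/*
	 * cxl_mem_active() already implies cxl_pci has probed, so the
	 * separate cxl_pci_loaded() check is dropped.
	 */
	if (!wait_event_timeout(cxl_wait_queue, cxl_mem_active(), 30 * HZ))
		pr_debug("Timeout waiting for cxl_mem probing\n");
}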

Thanks
Smita
> 
> DJ
> 
> 
>> +		pr_debug("Timeout waiting for cxl_pci or cxl_mem probing\n");
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_wait_for_pci_mem, "CXL");
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index a9ab46eb0610..1ba7d39c2991 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -902,6 +902,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>>   
>>   bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>>   
>> +void cxl_wait_for_pci_mem(void);
>> +
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong' version
>>    * of these symbols in tools/testing/cxl/.
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 0/7] Add managed SOFT RESERVE resource handling
  2025-06-04  8:43 ` [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Zhijian Li (Fujitsu)
@ 2025-06-04 18:59   ` Koralahalli Channabasappa, Smita
  2025-06-16  1:00     ` Zhijian Li (Fujitsu)
  0 siblings, 1 reply; 40+ messages in thread
From: Koralahalli Channabasappa, Smita @ 2025-06-04 18:59 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu), linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati

Hi Zhijian,

Thanks for testing my patches.

On 6/4/2025 1:43 AM, Zhijian Li (Fujitsu) wrote:
> Smita,
> 
> Thanks for your awesome work. I just tested the scenarios you listed, and they work as expected. Thanks again.
> (Minor comments inlined)
> 
> Tested-by: Li Zhijian <lizhijian@fujitsu.com>
> 
> 
> To the CXL community,
> 
> The scenarios mentioned here essentially cover what a correct firmware may provide. However,
> I would like to discuss one more scenario that I can simulate with a modified QEMU:
> The E820 exposes a SOFT RESERVED region which is the same as a CFMW, but the HDM decoders are not committed. This means no region will be auto-created during boot.
> 
> As an example, after boot, the iomem tree is as follows:
> 1050000000-304fffffff : CXL Window 0
>     1050000000-304fffffff : Soft Reserved
>       <No region>
> 
> In this case, the SOFT RESERVED resource is not trimmed, so the end-user cannot create a new region.
> My question is: Is this scenario a problem? If it is, should we fix it in this patchset or create a new patch?
> 

I believe firmware should handle this correctly by ensuring that any 
exposed SOFT RESERVED ranges correspond to committed HDM decoders and 
result in region creation.

That said, I’d be interested in hearing what the rest of the community 
thinks.

> 
> 
> 
> On 04/06/2025 06:19, Smita Koralahalli wrote:
>> Add the ability to manage SOFT RESERVE iomem resources prior to them being
>> added to the iomem resource tree. This allows drivers, such as CXL, to
>> remove any pieces of the SOFT RESERVE resource that intersect with created
>> CXL regions.
>>
>> The current approach of leaving the SOFT RESERVE resources as is can cause
>> failures during hotplug of devices, such as CXL, because the resource is
>> not available for reuse after teardown of the device.
>>
>> The approach is to add SOFT RESERVE resources to a separate tree during
>> boot.
> 
> No special tree at all since V3

Will make changes. I overlooked the cover letter.

> 
> 
>> This allows any drivers to update the SOFT RESERVE resources before
>> they are merged into the iomem resource tree. In addition a notifier chain
>> is added so that drivers can be notified when these SOFT RESERVE resources
>> are added to the ioeme resource tree.
>>
>> The CXL driver is modified to use a worker thread that waits for the CXL
>> PCI and CXL mem drivers to be loaded and for their probe routine to
>> complete. Then the driver walks through any created CXL regions to trim any
>> intersections with SOFT RESERVE resources in the iomem tree.
>>
>> The dax driver uses the new soft reserve notifier chain so it can consume
>> any remaining SOFT RESERVES once they're added to the iomem tree.
>>
>> The following scenarios have been tested:
>>
>> Example 1: Exact alignment, soft reserved is a child of the region
>>
>> |---------- "Soft Reserved" -----------|
>> |-------------- "Region #" ------------|
>>
>> Before:
>>     1050000000-304fffffff : CXL Window 0
>>       1050000000-304fffffff : region0
>>         1050000000-304fffffff : Soft Reserved
>>           1080000000-2fffffffff : dax0.0
> 
> BTW, I'm curious how to set up a dax with an address range different from its corresponding region.

Hmm, this configuration was provided directly by our BIOS. The DAX 
device was mapped to a subset of the region's address space as part of 
the platform's firmware setup, so I did not explicitly configure it.

> 
> 
>>             1080000000-2fffffffff : System RAM (kmem)
>>
>> After:
>>     1050000000-304fffffff : CXL Window 0
>>       1050000000-304fffffff : region1
>>         1080000000-2fffffffff : dax0.0
>>           1080000000-2fffffffff : System RAM (kmem)
>>
>> Example 2: Start and/or end aligned and soft reserved spans multiple
>> regions
> 
> Tested
> 
>>
>> |----------- "Soft Reserved" -----------|
>> |-------- "Region #" -------|
>> or
>> |----------- "Soft Reserved" -----------|
>> |-------- "Region #" -------|
> 
> Typo? should be:
> |----------- "Soft Reserved" -----------|
>               |-------- "Region #" -------|

Yeah, will fix.

> 
>>
>> Example 3: No alignment
>> |---------- "Soft Reserved" ----------|
>> 	|---- "Region #" ----|
> 
> Tested.
> 
> 
> Thanks
> Zhijian

Thanks
Smita

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region()
  2025-06-03 23:49   ` Dave Jiang
@ 2025-06-04 19:56     ` Nathan Fontenot
  2025-06-25 15:23       ` Robert Richter
  0 siblings, 1 reply; 40+ messages in thread
From: Nathan Fontenot @ 2025-06-04 19:56 UTC (permalink / raw)
  To: Dave Jiang, Smita Koralahalli, linux-cxl, linux-kernel, nvdimm,
	linux-fsdevel, linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On 6/3/2025 6:49 PM, Dave Jiang wrote:
> 
> 
> On 6/3/25 3:19 PM, Smita Koralahalli wrote:
>> Add a NULL check in is_cxl_region() to prevent potential null pointer
>> dereference if a caller passes a NULL device. This change ensures the
>> function safely returns false instead of triggering undefined behavior
>> when dev is NULL.
> 
> Don't think this change is necessary. The code paths should not be hitting any NULL region devices unless it's a programming error.

I originally added this to the patchset during some initial development to handle possible
NULL dev pointers when updating soft reserve resources, see cxl_region_softreserv_update()
in patch 5/7.

In the current form of that routine it appears we shouldn't execute the while loop
if dev is NULL, so this check could be dropped from the patch set.

-Nathan 

> 
>>
>> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>  drivers/cxl/core/region.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index c3f4dc244df7..109b8a98c4c7 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -2333,7 +2333,7 @@ const struct device_type cxl_region_type = {
>>  
>>  bool is_cxl_region(struct device *dev)
>>  {
>> -	return dev->type == &cxl_region_type;
>> +	return dev && dev->type == &cxl_region_type;
>>  }
>>  EXPORT_SYMBOL_NS_GPL(is_cxl_region, "CXL");
>>  
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe
  2025-06-04 14:35     ` Dave Jiang
@ 2025-06-05  2:18       ` Zhijian Li (Fujitsu)
  0 siblings, 0 replies; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-06-05  2:18 UTC (permalink / raw)
  To: Dave Jiang, Smita Koralahalli, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati



On 04/06/2025 22:35, Dave Jiang wrote:
>>>    
>>> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
>> Given that this file (suspend.c) focuses on power management functions,
>> it might be more appropriate to move the wait queue declaration and its
>> related changes to acpi.c, where its caller is.
> You mean drivers/cxl/acpi.c and not drivers/cxl/core/acpi.c right? 

Yes


> The core one is my mistake and I'm going to create a patch to remove that.


Completely missed its existence - thank you for catching and cleaning that up!

Thanks
Zhijian


> 
> DJ

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 6/7] dax/hmem: Save the DAX HMEM platform device pointer
  2025-06-03 22:19 ` [PATCH v4 6/7] dax/hmem: Save the DAX HMEM platform device pointer Smita Koralahalli
@ 2025-06-05 16:54   ` Dave Jiang
  2025-06-06  8:11     ` Zhijian Li (Fujitsu)
  0 siblings, 1 reply; 40+ messages in thread
From: Dave Jiang @ 2025-06-05 16:54 UTC (permalink / raw)
  To: Smita Koralahalli, linux-cxl, linux-kernel, nvdimm, linux-fsdevel,
	linux-pm
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li



On 6/3/25 3:19 PM, Smita Koralahalli wrote:
> From: Nathan Fontenot <nathan.fontenot@amd.com>
> 
> To enable registration of HMEM devices for SOFT RESERVED regions after
> the DAX HMEM device is initialized, this patch saves a reference to the
> DAX HMEM platform device.
> 
> This saved pointer will be used in a follow-up patch to allow late
> registration of SOFT RESERVED memory ranges. It also enables
> simplification of the walk_hmem_resources() by removing the need to
> pass a struct device argument.
> 
> There are no functional changes.
> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  drivers/dax/hmem/device.c | 4 ++--
>  drivers/dax/hmem/hmem.c   | 9 ++++++---
>  include/linux/dax.h       | 5 ++---
>  3 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
> index f9e1a76a04a9..59ad44761191 100644
> --- a/drivers/dax/hmem/device.c
> +++ b/drivers/dax/hmem/device.c
> @@ -17,14 +17,14 @@ static struct resource hmem_active = {
>  	.flags = IORESOURCE_MEM,
>  };
>  
> -int walk_hmem_resources(struct device *host, walk_hmem_fn fn)
> +int walk_hmem_resources(walk_hmem_fn fn)
>  {
>  	struct resource *res;
>  	int rc = 0;
>  
>  	mutex_lock(&hmem_resource_lock);
>  	for (res = hmem_active.child; res; res = res->sibling) {
> -		rc = fn(host, (int) res->desc, res);
> +		rc = fn((int) res->desc, res);
>  		if (rc)
>  			break;
>  	}
> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
> index 5e7c53f18491..3aedef5f1be1 100644
> --- a/drivers/dax/hmem/hmem.c
> +++ b/drivers/dax/hmem/hmem.c
> @@ -9,6 +9,8 @@
>  static bool region_idle;
>  module_param_named(region_idle, region_idle, bool, 0644);
>  
> +static struct platform_device *dax_hmem_pdev;
> +
>  static int dax_hmem_probe(struct platform_device *pdev)
>  {
>  	unsigned long flags = IORESOURCE_DAX_KMEM;
> @@ -59,9 +61,9 @@ static void release_hmem(void *pdev)
>  	platform_device_unregister(pdev);
>  }
>  
> -static int hmem_register_device(struct device *host, int target_nid,
> -				const struct resource *res)
> +static int hmem_register_device(int target_nid, const struct resource *res)
>  {
> +	struct device *host = &dax_hmem_pdev->dev;
>  	struct platform_device *pdev;
>  	struct memregion_info info;
>  	long id;
> @@ -125,7 +127,8 @@ static int hmem_register_device(struct device *host, int target_nid,
>  
>  static int dax_hmem_platform_probe(struct platform_device *pdev)
>  {
> -	return walk_hmem_resources(&pdev->dev, hmem_register_device);
> +	dax_hmem_pdev = pdev;

Is there never more than 1 DAX HMEM platform device that can show up? The global pointer makes me nervous.

DJ

> +	return walk_hmem_resources(hmem_register_device);
>  }
>  
>  static struct platform_driver dax_hmem_platform_driver = {
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index dcc9fcdf14e4..a4ad3708ea35 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -305,7 +305,6 @@ static inline void hmem_register_resource(int target_nid, struct resource *r)
>  }
>  #endif
>  
> -typedef int (*walk_hmem_fn)(struct device *dev, int target_nid,
> -			    const struct resource *res);
> -int walk_hmem_resources(struct device *dev, walk_hmem_fn fn);
> +typedef int (*walk_hmem_fn)(int target_nid, const struct resource *res);
> +int walk_hmem_resources(walk_hmem_fn fn);
>  #endif


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown
  2025-06-03 22:19 ` [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown Smita Koralahalli
@ 2025-06-06  4:16   ` Zhijian Li (Fujitsu)
  2025-06-06 17:53     ` Koralahalli Channabasappa, Smita
  2025-06-09 12:54   ` Jonathan Cameron
  2025-07-09  1:14   ` Alison Schofield
  2 siblings, 1 reply; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-06-06  4:16 UTC (permalink / raw)
  To: Smita Koralahalli, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati


The term "region teardown" in the subject line appears inaccurate.


As I understand, cxl_region_softreserv_update() should work only
when the region is ready, but the current logic only guarantees that the *memdev* is ready.

Is there truly no timing gap between region readiness and memdev readiness?
If this assumption is true, could we document this relationship in both the commit log and code comments?


Thanks
Zhijian


On 04/06/2025 06:19, Smita Koralahalli wrote:
> Reworked from a patch by Alison Schofield <alison.schofield@intel.com>
> 
> Previously, when CXL regions were created through autodiscovery and their
> resources overlapped with SOFT RESERVED ranges, the soft reserved resource
> remained in place after region teardown. This left the HPA range
> unavailable for reuse even after the region was destroyed.
> 
> Enhance the logic to reliably remove SOFT RESERVED resources associated
> with a region, regardless of alignment or hierarchy in the iomem tree.
> 
> Link: https://lore.kernel.org/linux-cxl/29312c0765224ae76862d59a17748c8188fb95f1.1692638817.git.alison.schofield@intel.com/
> Co-developed-by: Alison Schofield <alison.schofield@intel.com>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>   drivers/cxl/acpi.c        |   2 +
>   drivers/cxl/core/region.c | 151 ++++++++++++++++++++++++++++++++++++++
>   drivers/cxl/cxl.h         |   5 ++
>   3 files changed, 158 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 978f63b32b41..1b1388feb36d 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -823,6 +823,8 @@ static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>   	 * and cxl_mem drivers are loaded.
>   	 */
>   	wait_for_device_probe();
> +
> +	cxl_region_softreserv_update();
>   }
>   static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>   
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 109b8a98c4c7..3a5ca44d65f3 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3443,6 +3443,157 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
>   
> +static int add_soft_reserved(resource_size_t start, resource_size_t len,
> +			     unsigned long flags)
> +{
> +	struct resource *res = kmalloc(sizeof(*res), GFP_KERNEL);
> +	int rc;
> +
> +	if (!res)
> +		return -ENOMEM;
> +
> +	*res = DEFINE_RES_MEM_NAMED(start, len, "Soft Reserved");
> +
> +	res->desc = IORES_DESC_SOFT_RESERVED;
> +	res->flags = flags;
> +	rc = insert_resource(&iomem_resource, res);
> +	if (rc) {
> +		kfree(res);
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static void remove_soft_reserved(struct cxl_region *cxlr, struct resource *soft,
> +				 resource_size_t start, resource_size_t end)
> +{
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> +	resource_size_t new_start, new_end;
> +	int rc;
> +
> +	/* Prevent new usage while removing or adjusting the resource */
> +	guard(mutex)(&cxlrd->range_lock);
> +
> +	/* Aligns at both resource start and end */
> +	if (soft->start == start && soft->end == end)
> +		goto remove;
> +
> +	/* Aligns at either resource start or end */
> +	if (soft->start == start || soft->end == end) {
> +		if (soft->start == start) {
> +			new_start = end + 1;
> +			new_end = soft->end;
> +		} else {
> +			new_start = soft->start;
> +			new_end = start - 1;
> +		}
> +
> +		rc = add_soft_reserved(new_start, new_end - new_start + 1,
> +				       soft->flags);
> +		if (rc)
> +			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
> +				 &new_start);
> +
> +		/* Remove the original Soft Reserved resource */
> +		goto remove;
> +	}
> +
> +	/*
> +	 * No alignment. Attempt a 3-way split that removes the part of
> +	 * the resource the region occupied, and then creates new soft
> +	 * reserved resources for the leading and trailing addr space.
> +	 */
> +	new_start = soft->start;
> +	new_end = soft->end;
> +
> +	rc = add_soft_reserved(new_start, start - new_start, soft->flags);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
> +			 &new_start);
> +
> +	rc = add_soft_reserved(end + 1, new_end - end, soft->flags);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa + 1\n",
> +			 &end);
> +
> +remove:
> +	rc = remove_resource(soft);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot remove soft reserved resource %pr\n",
> +			 soft);
> +}
> +
> +/*
> + * normalize_resource
> + *
> + * The walk_iomem_res_desc() returns a copy of a resource, not a reference
> + * to the actual resource in the iomem_resource tree. As a result,
> + * __release_resource() which relies on pointer equality will fail.
> + *
> + * This helper walks the children of the resource's parent to find and
> + * return the original resource pointer that matches the given resource's
> + * start and end addresses.
> + *
> + * Return: Pointer to the matching original resource in iomem_resource, or
> + *         NULL if not found or invalid input.
> + */
> +static struct resource *normalize_resource(struct resource *res)
> +{
> +	if (!res || !res->parent)
> +		return NULL;
> +
> +	for (struct resource *res_iter = res->parent->child;
> +	     res_iter != NULL; res_iter = res_iter->sibling) {
> +		if ((res_iter->start == res->start) &&
> +		    (res_iter->end == res->end))
> +			return res_iter;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int __cxl_region_softreserv_update(struct resource *soft,
> +					  void *_cxlr)
> +{
> +	struct cxl_region *cxlr = _cxlr;
> +	struct resource *res = cxlr->params.res;
> +
> +	/* Skip non-intersecting soft-reserved regions */
> +	if (soft->end < res->start || soft->start > res->end)
> +		return 0;
> +
> +	soft = normalize_resource(soft);
> +	if (!soft)
> +		return -EINVAL;
> +
> +	remove_soft_reserved(cxlr, soft, res->start, res->end);
> +
> +	return 0;
> +}
> +
> +int cxl_region_softreserv_update(void)
> +{
> +	struct device *dev = NULL;
> +
> +	while ((dev = bus_find_next_device(&cxl_bus_type, dev))) {
> +		struct device *put_dev __free(put_device) = dev;
> +		struct cxl_region *cxlr;
> +
> +		if (!is_cxl_region(dev))
> +			continue;
> +
> +		cxlr = to_cxl_region(dev);
> +
> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> +				    IORESOURCE_MEM, 0, -1, cxlr,
> +				    __cxl_region_softreserv_update);
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
> +
>   u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa)
>   {
>   	struct cxl_region_ref *iter;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 1ba7d39c2991..fc39c4b24745 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -859,6 +859,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
>   int cxl_add_to_region(struct cxl_port *root,
>   		      struct cxl_endpoint_decoder *cxled);
>   struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
> +int cxl_region_softreserv_update(void);
>   u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
>   #else
>   static inline bool is_cxl_pmem_region(struct device *dev)
> @@ -878,6 +879,10 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
>   {
>   	return NULL;
>   }
> +static inline int cxl_region_softreserv_update(void)
> +{
> +	return 0;
> +}
>   static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
>   					       u64 spa)
>   {

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 6/7] dax/hmem: Save the DAX HMEM platform device pointer
  2025-06-05 16:54   ` Dave Jiang
@ 2025-06-06  8:11     ` Zhijian Li (Fujitsu)
  2025-06-06 20:09       ` Nathan Fontenot
  0 siblings, 1 reply; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-06-06  8:11 UTC (permalink / raw)
  To: Dave Jiang, Smita Koralahalli, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati



On 06/06/2025 00:54, Dave Jiang wrote:
>> -static int hmem_register_device(struct device *host, int target_nid,
>> -				const struct resource *res)
>> +static int hmem_register_device(int target_nid, const struct resource *res)
>>   {
>> +	struct device *host = &dax_hmem_pdev->dev;
>>   	struct platform_device *pdev;
>>   	struct memregion_info info;
>>   	long id;
>> @@ -125,7 +127,8 @@ static int hmem_register_device(struct device *host, int target_nid,
>>   
>>   static int dax_hmem_platform_probe(struct platform_device *pdev)
>>   {
>> -	return walk_hmem_resources(&pdev->dev, hmem_register_device);
>> +	dax_hmem_pdev = pdev;

> Is there never more than 1 DAX HMEM platform device that can show up? The global pointer makes me nervous.

IIUC,

Referring to the device creation logic in `__hmem_register_resource()` (shown below),
only one `hmem_platform` instance can ever be registered. This ensures the change is safe.


However, I agree that using a global pointer in a function that may be called multiple times
does raise valid concerns.

To strengthen this, how about:
1. Add a comment clarifying single-instance enforcement
2. Add a warn_on/bug_on for it: `WARN_ON(dax_hmem_pdev && dax_hmem_pdev != pdev)`


static void __hmem_register_resource(int target_nid, struct resource *res)
{
	struct platform_device *pdev;
	struct resource *new;
	int rc;

	new = __request_region(&hmem_active, res->start, resource_size(res), "",
			       0);
	if (!new) {
		pr_debug("hmem range %pr already active\n", res);
		return;
	}

	new->desc = target_nid;

	if (platform_initialized)
		return;

	pdev = platform_device_alloc("hmem_platform", 0);
	if (!pdev) {
		pr_err_once("failed to register device-dax hmem_platform device\n");
		return;
	}

	rc = platform_device_add(pdev);
	if (rc)
		platform_device_put(pdev);
	else
		platform_initialized = true;
}
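
For illustration, the platform probe with such a check could then look
roughly like this (sketch only):

static int dax_hmem_platform_probe(struct platform_device *pdev)
{
	/* Only one "hmem_platform" device is ever registered, see above. */
	WARN_ON(dax_hmem_pdev && dax_hmem_pdev != pdev);

	dax_hmem_pdev = pdev;
	return walk_hmem_resources(hmem_register_device);
}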

Thanks
Zhijian





> 
> DJ

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown
  2025-06-06  4:16   ` Zhijian Li (Fujitsu)
@ 2025-06-06 17:53     ` Koralahalli Channabasappa, Smita
  0 siblings, 0 replies; 40+ messages in thread
From: Koralahalli Channabasappa, Smita @ 2025-06-06 17:53 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu), linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati

On 6/5/2025 9:16 PM, Zhijian Li (Fujitsu) wrote:
> 
> The term "region teardown" in the subject line appears inaccurate.
> 
> 
> As I understand, cxl_region_softreserv_update() should work only
> when the region is ready, but the current logic only guarantees that the *memdev* is ready.
> 
> Is there truly no timing gap between region readiness and memdev readiness?
> If this assumption is true, could we document this relationship in both the commit log and code comments?
> 
> 
> Thanks
> Zhijian

I think you're right in pointing this out.

We cannot guarantee region readiness unless there's an explicit ordering
or dependency ensuring that cxl_region_probe() always completes after or 
synchronously with memdev probing and wait_for_device_probe().

I don't believe we currently have such a guarantee.

I was considering adding a null check for cxlr->params.res in 
__cxl_region_softreserv_update(), along with a comment like below:

static int __cxl_region_softreserv_update(struct resource *soft,
					  void *_cxlr)
{
	struct cxl_region *cxlr = _cxlr;
	struct resource *res = cxlr->params.res;

	/*
	 * Skip if the region has not yet been fully initialized.
	 *
	 * This code assumes all cxl_region devices have completed
	 * probing before this function runs, and that params.res is
	 * valid.
	 */

	if (!res)
		return 0;

..
..

}

This would prevent a null dereference and make the assumption around 
region readiness more explicit.

That said, I'd appreciate input from Terry, Nathan, or others on whether 
there's a better way to guarantee that region creation has completed by 
the time this function is invoked.

Thanks
Smita

> 
> 
> On 04/06/2025 06:19, Smita Koralahalli wrote:
>> Reworked from a patch by Alison Schofield <alison.schofield@intel.com>
>>
>> Previously, when CXL regions were created through autodiscovery and their
>> resources overlapped with SOFT RESERVED ranges, the soft reserved resource
>> remained in place after region teardown. This left the HPA range
>> unavailable for reuse even after the region was destroyed.
>>
>> Enhance the logic to reliably remove SOFT RESERVED resources associated
>> with a region, regardless of alignment or hierarchy in the iomem tree.
>>
>> Link: https://lore.kernel.org/linux-cxl/29312c0765224ae76862d59a17748c8188fb95f1.1692638817.git.alison.schofield@intel.com/
>> Co-developed-by: Alison Schofield <alison.schofield@intel.com>
>> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>    drivers/cxl/acpi.c        |   2 +
>>    drivers/cxl/core/region.c | 151 ++++++++++++++++++++++++++++++++++++++
>>    drivers/cxl/cxl.h         |   5 ++
>>    3 files changed, 158 insertions(+)
>>
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index 978f63b32b41..1b1388feb36d 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -823,6 +823,8 @@ static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>>    	 * and cxl_mem drivers are loaded.
>>    	 */
>>    	wait_for_device_probe();
>> +
>> +	cxl_region_softreserv_update();
>>    }
>>    static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>>    
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 109b8a98c4c7..3a5ca44d65f3 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -3443,6 +3443,157 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>>    }
>>    EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
>>    
>> +static int add_soft_reserved(resource_size_t start, resource_size_t len,
>> +			     unsigned long flags)
>> +{
>> +	struct resource *res = kmalloc(sizeof(*res), GFP_KERNEL);
>> +	int rc;
>> +
>> +	if (!res)
>> +		return -ENOMEM;
>> +
>> +	*res = DEFINE_RES_MEM_NAMED(start, len, "Soft Reserved");
>> +
>> +	res->desc = IORES_DESC_SOFT_RESERVED;
>> +	res->flags = flags;
>> +	rc = insert_resource(&iomem_resource, res);
>> +	if (rc) {
>> +		kfree(res);
>> +		return rc;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void remove_soft_reserved(struct cxl_region *cxlr, struct resource *soft,
>> +				 resource_size_t start, resource_size_t end)
>> +{
>> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>> +	resource_size_t new_start, new_end;
>> +	int rc;
>> +
>> +	/* Prevent new usage while removing or adjusting the resource */
>> +	guard(mutex)(&cxlrd->range_lock);
>> +
>> +	/* Aligns at both resource start and end */
>> +	if (soft->start == start && soft->end == end)
>> +		goto remove;
>> +
>> +	/* Aligns at either resource start or end */
>> +	if (soft->start == start || soft->end == end) {
>> +		if (soft->start == start) {
>> +			new_start = end + 1;
>> +			new_end = soft->end;
>> +		} else {
>> +			new_start = soft->start;
>> +			new_end = start - 1;
>> +		}
>> +
>> +		rc = add_soft_reserved(new_start, new_end - new_start + 1,
>> +				       soft->flags);
>> +		if (rc)
>> +			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
>> +				 &new_start);
>> +
>> +		/* Remove the original Soft Reserved resource */
>> +		goto remove;
>> +	}
>> +
>> +	/*
>> +	 * No alignment. Attempt a 3-way split that removes the part of
>> +	 * the resource the region occupied, and then creates new soft
>> +	 * reserved resources for the leading and trailing addr space.
>> +	 */
>> +	new_start = soft->start;
>> +	new_end = soft->end;
>> +
>> +	rc = add_soft_reserved(new_start, start - new_start, soft->flags);
>> +	if (rc)
>> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
>> +			 &new_start);
>> +
>> +	rc = add_soft_reserved(end + 1, new_end - end, soft->flags);
>> +	if (rc)
>> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa + 1\n",
>> +			 &end);
>> +
>> +remove:
>> +	rc = remove_resource(soft);
>> +	if (rc)
>> +		dev_warn(&cxlr->dev, "cannot remove soft reserved resource %pr\n",
>> +			 soft);
>> +}
>> +
>> +/*
>> + * normalize_resource
>> + *
>> + * The walk_iomem_res_desc() returns a copy of a resource, not a reference
>> + * to the actual resource in the iomem_resource tree. As a result,
>> + * __release_resource() which relies on pointer equality will fail.
>> + *
>> + * This helper walks the children of the resource's parent to find and
>> + * return the original resource pointer that matches the given resource's
>> + * start and end addresses.
>> + *
>> + * Return: Pointer to the matching original resource in iomem_resource, or
>> + *         NULL if not found or invalid input.
>> + */
>> +static struct resource *normalize_resource(struct resource *res)
>> +{
>> +	if (!res || !res->parent)
>> +		return NULL;
>> +
>> +	for (struct resource *res_iter = res->parent->child;
>> +	     res_iter != NULL; res_iter = res_iter->sibling) {
>> +		if ((res_iter->start == res->start) &&
>> +		    (res_iter->end == res->end))
>> +			return res_iter;
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +static int __cxl_region_softreserv_update(struct resource *soft,
>> +					  void *_cxlr)
>> +{
>> +	struct cxl_region *cxlr = _cxlr;
>> +	struct resource *res = cxlr->params.res;
>> +
>> +	/* Skip non-intersecting soft-reserved regions */
>> +	if (soft->end < res->start || soft->start > res->end)
>> +		return 0;
>> +
>> +	soft = normalize_resource(soft);
>> +	if (!soft)
>> +		return -EINVAL;
>> +
>> +	remove_soft_reserved(cxlr, soft, res->start, res->end);
>> +
>> +	return 0;
>> +}
>> +
>> +int cxl_region_softreserv_update(void)
>> +{
>> +	struct device *dev = NULL;
>> +
>> +	while ((dev = bus_find_next_device(&cxl_bus_type, dev))) {
>> +		struct device *put_dev __free(put_device) = dev;
>> +		struct cxl_region *cxlr;
>> +
>> +		if (!is_cxl_region(dev))
>> +			continue;
>> +
>> +		cxlr = to_cxl_region(dev);
>> +
>> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
>> +				    IORESOURCE_MEM, 0, -1, cxlr,
>> +				    __cxl_region_softreserv_update);
>> +	}
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
>> +
>>    u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa)
>>    {
>>    	struct cxl_region_ref *iter;
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 1ba7d39c2991..fc39c4b24745 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -859,6 +859,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
>>    int cxl_add_to_region(struct cxl_port *root,
>>    		      struct cxl_endpoint_decoder *cxled);
>>    struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
>> +int cxl_region_softreserv_update(void);
>>    u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
>>    #else
>>    static inline bool is_cxl_pmem_region(struct device *dev)
>> @@ -878,6 +879,10 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
>>    {
>>    	return NULL;
>>    }
>> +static inline int cxl_region_softreserv_update(void)
>> +{
>> +	return 0;
>> +}
>>    static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
>>    					       u64 spa)
>>    {


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 6/7] dax/hmem: Save the DAX HMEM platform device pointer
  2025-06-06  8:11     ` Zhijian Li (Fujitsu)
@ 2025-06-06 20:09       ` Nathan Fontenot
  0 siblings, 0 replies; 40+ messages in thread
From: Nathan Fontenot @ 2025-06-06 20:09 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu), Dave Jiang, Smita Koralahalli,
	linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
	nvdimm@lists.linux.dev, linux-fsdevel@vger.kernel.org,
	linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati

On 6/6/2025 3:11 AM, Zhijian Li (Fujitsu) wrote:
> 
> 
> On 06/06/2025 00:54, Dave Jiang wrote:
>>> -static int hmem_register_device(struct device *host, int target_nid,
>>> -				const struct resource *res)
>>> +static int hmem_register_device(int target_nid, const struct resource *res)
>>>   {
>>> +	struct device *host = &dax_hmem_pdev->dev;
>>>   	struct platform_device *pdev;
>>>   	struct memregion_info info;
>>>   	long id;
>>> @@ -125,7 +127,8 @@ static int hmem_register_device(struct device *host, int target_nid,
>>>   
>>>   static int dax_hmem_platform_probe(struct platform_device *pdev)
>>>   {
>>> -	return walk_hmem_resources(&pdev->dev, hmem_register_device);
>>> +	dax_hmem_pdev = pdev;
> 
>> Is there never more than 1 DAX HMEM platform device that can show up? The global pointer makes me nervous.
> 
> IIUC,
> 
> Referring to the device creation logic in `__hmem_register_resource()` (shown below),
> only one `hmem_platform` instance can ever be registered. This ensures the change is safe.
> 
> 
> However, I agree that using a global pointer in a function that may be called multiple times
> does raise valid concerns.

Note: The creation of the platform device is moved in patch 7/7. I think the place
it was moved to may not be correct. The creation of the platform device should be
in hmem_init so that only one instance is ever created.
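
Roughly what I have in mind, as a sketch only (exact init function name and
error handling aside):

static int __init hmem_init(void)
{
	struct platform_device *pdev;
	int rc;

	/* Create the single hmem_platform device unconditionally at init. */
	pdev = platform_device_alloc("hmem_platform", 0);
	if (!pdev)
		return -ENOMEM;

	rc = platform_device_add(pdev);
	if (rc) {
		platform_device_put(pdev);
		return rc;
	}

	return 0;
}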

-Nathan

> 
> To strengthen this, how about:
> 1. Add a comment clarifying single-instance enforcement
> 2. Add a warn_on/bug_on for it: `WARN_ON(dax_hmem_pdev && dax_hmem_pdev != pdev)`
> 
> 
> static void __hmem_register_resource(int target_nid, struct resource *res)
> {
> 	struct platform_device *pdev;
> 	struct resource *new;
> 	int rc;
> 
> 	new = __request_region(&hmem_active, res->start, resource_size(res), "",
> 			       0);
> 	if (!new) {
> 		pr_debug("hmem range %pr already active\n", res);
> 		return;
> 	}
> 
> 	new->desc = target_nid;
> 
> 	if (platform_initialized)
> 		return;
> 
> 	pdev = platform_device_alloc("hmem_platform", 0);
> 	if (!pdev) {
> 		pr_err_once("failed to register device-dax hmem_platform device\n");
> 		return;
> 	}
> 
> 	rc = platform_device_add(pdev);
> 	if (rc)
> 		platform_device_put(pdev);
> 	else
> 		platform_initialized = true;
> }
> 
> Thanks
> Zhijian
> 
> 
> 
> 
> 
>>
>> DJ


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/7] cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o
  2025-06-03 22:19 ` [PATCH v4 2/7] cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o Smita Koralahalli
@ 2025-06-09 11:02   ` Jonathan Cameron
  2025-06-09 23:25     ` Koralahalli Channabasappa, Smita
  0 siblings, 1 reply; 40+ messages in thread
From: Jonathan Cameron @ 2025-06-09 11:02 UTC (permalink / raw)
  To: Smita Koralahalli
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On Tue, 3 Jun 2025 22:19:44 +0000
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:

> In preparation for soft-reserved resource handling, make the suspend
> infrastructure always available by removing the CONFIG_CXL_SUSPEND
> Kconfig option.
> 
> This ensures cxl_mem_active_inc()/dec() and cxl_mem_active() are
> unconditionally available, enabling coordination between cxl_pci and
> cxl_mem drivers during region setup and hotplug operations.

If these are no longer just being used for suspend, given there
is nothing else in the file, maybe move them to somewhere else?


> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  drivers/cxl/Kconfig        | 4 ----
>  drivers/cxl/core/Makefile  | 2 +-
>  drivers/cxl/core/suspend.c | 5 ++++-
>  drivers/cxl/cxlmem.h       | 9 ---------
>  include/linux/pm.h         | 7 -------
>  5 files changed, 5 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index cf1ba673b8c2..d09144c2002e 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -118,10 +118,6 @@ config CXL_PORT
>  	default CXL_BUS
>  	tristate
>  
> -config CXL_SUSPEND
> -	def_bool y
> -	depends on SUSPEND && CXL_MEM
> -
>  config CXL_REGION
>  	bool "CXL: Region Support"
>  	default CXL_BUS
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 086df97a0fcf..035864db8a32 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_CXL_BUS) += cxl_core.o
> -obj-$(CONFIG_CXL_SUSPEND) += suspend.o
> +obj-y += suspend.o
>  
>  ccflags-y += -I$(srctree)/drivers/cxl
>  CFLAGS_trace.o = -DTRACE_INCLUDE_PATH=. -I$(src)
> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
> index 29aa5cc5e565..5ba4b4de0e33 100644
> --- a/drivers/cxl/core/suspend.c
> +++ b/drivers/cxl/core/suspend.c
> @@ -8,7 +8,10 @@ static atomic_t mem_active;
>  
>  bool cxl_mem_active(void)
>  {
> -	return atomic_read(&mem_active) != 0;
> +	if (IS_ENABLED(CONFIG_CXL_MEM))
> +		return atomic_read(&mem_active) != 0;
> +
> +	return false;
>  }
>  
>  void cxl_mem_active_inc(void)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 3ec6b906371b..1bd1e88c4cc0 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -853,17 +853,8 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
>  int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
>  int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
>  
> -#ifdef CONFIG_CXL_SUSPEND
>  void cxl_mem_active_inc(void);
>  void cxl_mem_active_dec(void);
> -#else
> -static inline void cxl_mem_active_inc(void)
> -{
> -}
> -static inline void cxl_mem_active_dec(void)
> -{
> -}
> -#endif
>  
>  int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
>  
> diff --git a/include/linux/pm.h b/include/linux/pm.h
> index f0bd8fbae4f2..415928e0b6ca 100644
> --- a/include/linux/pm.h
> +++ b/include/linux/pm.h
> @@ -35,14 +35,7 @@ static inline void pm_vt_switch_unregister(struct device *dev)
>  }
>  #endif /* CONFIG_VT_CONSOLE_SLEEP */
>  
> -#ifdef CONFIG_CXL_SUSPEND
>  bool cxl_mem_active(void);
> -#else
> -static inline bool cxl_mem_active(void)
> -{
> -	return false;
> -}
> -#endif
>  
>  /*
>   * Device power management


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown
  2025-06-03 22:19 ` [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown Smita Koralahalli
  2025-06-06  4:16   ` Zhijian Li (Fujitsu)
@ 2025-06-09 12:54   ` Jonathan Cameron
  2025-06-10  1:25     ` Koralahalli Channabasappa, Smita
  2025-07-09  1:14   ` Alison Schofield
  2 siblings, 1 reply; 40+ messages in thread
From: Jonathan Cameron @ 2025-06-09 12:54 UTC (permalink / raw)
  To: Smita Koralahalli
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On Tue, 3 Jun 2025 22:19:47 +0000
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:

> Reworked from a patch by Alison Schofield <alison.schofield@intel.com>
> 
> Previously, when CXL regions were created through autodiscovery and their
> resources overlapped with SOFT RESERVED ranges, the soft reserved resource
> remained in place after region teardown. This left the HPA range
> unavailable for reuse even after the region was destroyed.
> 
> Enhance the logic to reliably remove SOFT RESERVED resources associated
> with a region, regardless of alignment or hierarchy in the iomem tree.
> 
> Link: https://lore.kernel.org/linux-cxl/29312c0765224ae76862d59a17748c8188fb95f1.1692638817.git.alison.schofield@intel.com/
> Co-developed-by: Alison Schofield <alison.schofield@intel.com>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  drivers/cxl/acpi.c        |   2 +
>  drivers/cxl/core/region.c | 151 ++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h         |   5 ++
>  3 files changed, 158 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 978f63b32b41..1b1388feb36d 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -823,6 +823,8 @@ static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>  	 * and cxl_mem drivers are loaded.
>  	 */
>  	wait_for_device_probe();
> +
> +	cxl_region_softreserv_update();
>  }
>  static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>  
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 109b8a98c4c7..3a5ca44d65f3 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3443,6 +3443,157 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
>  
> +static int add_soft_reserved(resource_size_t start, resource_size_t len,
> +			     unsigned long flags)
> +{
> +	struct resource *res = kmalloc(sizeof(*res), GFP_KERNEL);
> +	int rc;
> +
> +	if (!res)
> +		return -ENOMEM;
> +
> +	*res = DEFINE_RES_MEM_NAMED(start, len, "Soft Reserved");
> +
> +	res->desc = IORES_DESC_SOFT_RESERVED;
> +	res->flags = flags;

I'm a bit doubtful about using a define that restricts a bunch of the
elements, then overriding 2 of them immediately after.

DEFINE_RES_NAMED_DESC(start, len, "Soft Reserved", flags | IORESOURCE_MEM,
		      IORES_DESC_SOFT_RESERVED);
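
i.e. the whole helper could then collapse to a single initializer with no
field overrides (rough sketch, untested, keeping the names from the patch):

static int add_soft_reserved(resource_size_t start, resource_size_t len,
			     unsigned long flags)
{
	struct resource *res = kmalloc(sizeof(*res), GFP_KERNEL);
	int rc;

	if (!res)
		return -ENOMEM;

	/* One initializer, nothing overridden afterwards */
	*res = DEFINE_RES_NAMED_DESC(start, len, "Soft Reserved",
				     flags | IORESOURCE_MEM,
				     IORES_DESC_SOFT_RESERVED);

	rc = insert_resource(&iomem_resource, res);
	if (rc) {
		kfree(res);
		return rc;
	}

	return 0;
}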

> +	rc = insert_resource(&iomem_resource, res);
> +	if (rc) {
> +		kfree(res);
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static void remove_soft_reserved(struct cxl_region *cxlr, struct resource *soft,
> +				 resource_size_t start, resource_size_t end)
> +{
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> +	resource_size_t new_start, new_end;
> +	int rc;
> +
> +	/* Prevent new usage while removing or adjusting the resource */
> +	guard(mutex)(&cxlrd->range_lock);
> +
> +	/* Aligns at both resource start and end */
> +	if (soft->start == start && soft->end == end)
> +		goto remove;
> +

Might be a clearer flow with else if rather than
a goto.  
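
i.e. something along these lines, same logic as in the patch, just
restructured as if / else if / else (untested sketch):

static void remove_soft_reserved(struct cxl_region *cxlr, struct resource *soft,
				 resource_size_t start, resource_size_t end)
{
	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
	resource_size_t new_start, new_end;
	int rc;

	/* Prevent new usage while removing or adjusting the resource */
	guard(mutex)(&cxlrd->range_lock);

	if (soft->start == start && soft->end == end) {
		/* Exact match: nothing to re-add, only the removal below */
	} else if (soft->start == start || soft->end == end) {
		/* Aligned at one end: re-add the remaining piece */
		if (soft->start == start) {
			new_start = end + 1;
			new_end = soft->end;
		} else {
			new_start = soft->start;
			new_end = start - 1;
		}

		rc = add_soft_reserved(new_start, new_end - new_start + 1,
				       soft->flags);
		if (rc)
			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
				 &new_start);
	} else {
		/* No alignment: 3-way split, re-add leading and trailing parts */
		new_start = soft->start;
		new_end = soft->end;

		rc = add_soft_reserved(new_start, start - new_start,
				       soft->flags);
		if (rc)
			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
				 &new_start);

		rc = add_soft_reserved(end + 1, new_end - end, soft->flags);
		if (rc)
			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa + 1\n",
				 &end);
	}

	rc = remove_resource(soft);
	if (rc)
		dev_warn(&cxlr->dev, "cannot remove soft reserved resource %pr\n",
			 soft);
}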

> +	/* Aligns at either resource start or end */
> +	if (soft->start == start || soft->end == end) {
> +		if (soft->start == start) {
> +			new_start = end + 1;
> +			new_end = soft->end;
> +		} else {
> +			new_start = soft->start;
> +			new_end = start - 1;
> +		}
> +
> +		rc = add_soft_reserved(new_start, new_end - new_start + 1,
> +				       soft->flags);

This is the remnant of what was there before, but the flags are from
the bit we are dropping?  That feels odd.  They might well be the same
but maybe we need to make that explicit?

> +		if (rc)
> +			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
> +				 &new_start);
> +
> +		/* Remove the original Soft Reserved resource */
> +		goto remove;
> +	}
> +
> +	/*
> +	 * No alignment. Attempt a 3-way split that removes the part of
> +	 * the resource the region occupied, and then creates new soft
> +	 * reserved resources for the leading and trailing addr space.
> +	 */
> +	new_start = soft->start;
> +	new_end = soft->end;
> +
> +	rc = add_soft_reserved(new_start, start - new_start, soft->flags);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
> +			 &new_start);
> +
> +	rc = add_soft_reserved(end + 1, new_end - end, soft->flags);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa + 1\n",
> +			 &end);
> +
> +remove:
> +	rc = remove_resource(soft);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot remove soft reserved resource %pr\n",
> +			 soft);
> +}
> +
> +/*
> + * normalize_resource
> + *
> + * The walk_iomem_res_desc() returns a copy of a resource, not a reference
> + * to the actual resource in the iomem_resource tree. As a result,
> + * __release_resource() which relies on pointer equality will fail.

Probably want some statement on why nothing can race with this given
the resource_lock is not being held.

> + *
> + * This helper walks the children of the resource's parent to find and
> + * return the original resource pointer that matches the given resource's
> + * start and end addresses.
> + *
> + * Return: Pointer to the matching original resource in iomem_resource, or
> + *         NULL if not found or invalid input.
> + */
> +static struct resource *normalize_resource(struct resource *res)
> +{
> +	if (!res || !res->parent)
> +		return NULL;
> +
> +	for (struct resource *res_iter = res->parent->child;
> +	     res_iter != NULL; res_iter = res_iter->sibling) {

I'd move the res_iter != NULL to previous line as it is still under 80 chars.


> +		if ((res_iter->start == res->start) &&
> +		    (res_iter->end == res->end))
> +			return res_iter;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int __cxl_region_softreserv_update(struct resource *soft,
> +					  void *_cxlr)
> +{
> +	struct cxl_region *cxlr = _cxlr;
> +	struct resource *res = cxlr->params.res;
> +
> +	/* Skip non-intersecting soft-reserved regions */
> +	if (soft->end < res->start || soft->start > res->end)
> +		return 0;
> +
> +	soft = normalize_resource(soft);
> +	if (!soft)
> +		return -EINVAL;
> +
> +	remove_soft_reserved(cxlr, soft, res->start, res->end);
> +
> +	return 0;
> +}
> +
> +int cxl_region_softreserv_update(void)
> +{
> +	struct device *dev = NULL;
> +
> +	while ((dev = bus_find_next_device(&cxl_bus_type, dev))) {
> +		struct device *put_dev __free(put_device) = dev;
This __free() is a little bit outside of the usual rule of declaring the
constructor and destructor together.

I wonder if bus_for_each_dev() is cleaner here or is there a reason
we have to have released the subsystem lock + grabbed the device
one before calling __cxl_region_softreserv_update?
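
If there isn't, a minimal sketch with bus_for_each_dev() might look like
this (the per-device helper name is made up, untested):

static int cxl_region_softreserv_update_dev(struct device *dev, void *data)
{
	struct cxl_region *cxlr;

	if (!is_cxl_region(dev))
		return 0;

	cxlr = to_cxl_region(dev);

	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
			    0, -1, cxlr, __cxl_region_softreserv_update);

	/* Keep walking the bus even if one region had nothing to trim */
	return 0;
}

int cxl_region_softreserv_update(void)
{
	return bus_for_each_dev(&cxl_bus_type, NULL, NULL,
				cxl_region_softreserv_update_dev);
}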

> +		struct cxl_region *cxlr;
> +
> +		if (!is_cxl_region(dev))
> +			continue;
If you stick to bus_find_X, I wonder if we should define helpers for
the match function and use bus_find_device().

> +
> +		cxlr = to_cxl_region(dev);
> +
> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> +				    IORESOURCE_MEM, 0, -1, cxlr,
> +				    __cxl_region_softreserv_update);
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation
  2025-06-03 22:19 ` [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation Smita Koralahalli
@ 2025-06-09 13:01   ` Jonathan Cameron
  2025-06-13  2:12   ` Zhijian Li (Fujitsu)
  2025-07-15 16:27   ` Alison Schofield
  2 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2025-06-09 13:01 UTC (permalink / raw)
  To: Smita Koralahalli
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On Tue, 3 Jun 2025 22:19:49 +0000
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:

> From: Nathan Fontenot <nathan.fontenot@amd.com>
> 
> The DAX HMEM driver currently consumes all SOFT RESERVED iomem resources
> during initialization. This interferes with the CXL driver’s ability to
> create regions and trim overlapping SOFT RESERVED ranges before DAX uses
> them.
> 
> To resolve this, defer the DAX driver's resource consumption if the
> cxl_acpi driver is enabled. The DAX HMEM initialization skips walking the
> iomem resource tree in this case. After CXL region creation completes,
> any remaining SOFT RESERVED resources are explicitly registered with the
> DAX driver by the CXL driver.
> 
> This sequencing ensures proper handling of overlaps and fixes hotplug
> failures.
> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  drivers/cxl/core/region.c | 10 +++++++++
>  drivers/dax/hmem/device.c | 43 ++++++++++++++++++++-------------------
>  drivers/dax/hmem/hmem.c   |  3 ++-
>  include/linux/dax.h       |  6 ++++++
>  4 files changed, 40 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 3a5ca44d65f3..c6c0c7ba3b20 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -10,6 +10,7 @@
>  #include <linux/sort.h>
>  #include <linux/idr.h>
>  #include <linux/memory-tiers.h>
> +#include <linux/dax.h>
>  #include <cxlmem.h>
>  #include <cxl.h>
>  #include "core.h"
> @@ -3553,6 +3554,11 @@ static struct resource *normalize_resource(struct resource *res)
>  	return NULL;
>  }
>  
> +static int cxl_softreserv_mem_register(struct resource *res, void *unused)
> +{
> +	return hmem_register_device(phys_to_target_node(res->start), res);
> +}
> +
>  static int __cxl_region_softreserv_update(struct resource *soft,
>  					  void *_cxlr)
>  {
> @@ -3590,6 +3596,10 @@ int cxl_region_softreserv_update(void)
>  				    __cxl_region_softreserv_update);
>  	}
>  
> +	/* Now register any remaining SOFT RESERVES with DAX */
> +	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
> +			    0, -1, NULL, cxl_softreserv_mem_register);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
> index 59ad44761191..cc1ed7bbdb1a 100644
> --- a/drivers/dax/hmem/device.c
> +++ b/drivers/dax/hmem/device.c
> @@ -8,7 +8,6 @@
>  static bool nohmem;
>  module_param_named(disable, nohmem, bool, 0444);
>  
> -static bool platform_initialized;
>  static DEFINE_MUTEX(hmem_resource_lock);
>  static struct resource hmem_active = {
>  	.name = "HMEM devices",
> @@ -35,9 +34,7 @@ EXPORT_SYMBOL_GPL(walk_hmem_resources);
>  
>  static void __hmem_register_resource(int target_nid, struct resource *res)
>  {
> -	struct platform_device *pdev;
>  	struct resource *new;
> -	int rc;
>  
>  	new = __request_region(&hmem_active, res->start, resource_size(res), "",
>  			       0);
> @@ -47,21 +44,6 @@ static void __hmem_register_resource(int target_nid, struct resource *res)
>  	}
>  
>  	new->desc = target_nid;
> -
> -	if (platform_initialized)
> -		return;
> -
> -	pdev = platform_device_alloc("hmem_platform", 0);
> -	if (!pdev) {
> -		pr_err_once("failed to register device-dax hmem_platform device\n");
> -		return;
> -	}
> -
> -	rc = platform_device_add(pdev);
> -	if (rc)
> -		platform_device_put(pdev);
> -	else
> -		platform_initialized = true;
>  }
>  
>  void hmem_register_resource(int target_nid, struct resource *res)
> @@ -83,9 +65,28 @@ static __init int hmem_register_one(struct resource *res, void *data)
>  
>  static __init int hmem_init(void)
>  {
> -	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> -			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
> -	return 0;
> +	struct platform_device *pdev;
> +	int rc;
> +
> +	if (!IS_ENABLED(CONFIG_CXL_ACPI)) {
> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> +				    IORESOURCE_MEM, 0, -1, NULL,
> +				    hmem_register_one);
> +	}
> +
> +	pdev = platform_device_alloc("hmem_platform", 0);
> +	if (!pdev) {
> +		pr_err("failed to register device-dax hmem_platform device\n");
> +		return -1;
> +	}
> +
> +	rc = platform_device_add(pdev);

platform_device_register_simple("hmem_platform", -1, NULL, 0); or something like
that?  There are quite a few variants of platform_device_register to cover
simple cases.
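
For example, something like this (rough sketch, untested; keeping id 0 as in
the current code, and the error handling is only my guess):

static __init int hmem_init(void)
{
	struct platform_device *pdev;

	if (!IS_ENABLED(CONFIG_CXL_ACPI))
		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
				    0, -1, NULL, hmem_register_one);

	/* Allocates, names and adds the device in one call */
	pdev = platform_device_register_simple("hmem_platform", 0, NULL, 0);
	if (IS_ERR(pdev)) {
		pr_err("failed to register device-dax hmem_platform device\n");
		return PTR_ERR(pdev);
	}

	return 0;
}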


> +	if (rc) {
> +		pr_err("failed to add device-dax hmem_platform device\n");
> +		platform_device_put(pdev);
> +	}
> +
> +	return rc;
>  }
>  
>  /*


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/7] cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o
  2025-06-09 11:02   ` Jonathan Cameron
@ 2025-06-09 23:25     ` Koralahalli Channabasappa, Smita
  2025-06-10  9:38       ` Jonathan Cameron
  0 siblings, 1 reply; 40+ messages in thread
From: Koralahalli Channabasappa, Smita @ 2025-06-09 23:25 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li


On 6/9/2025 4:02 AM, Jonathan Cameron wrote:
> On Tue, 3 Jun 2025 22:19:44 +0000
> Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:
> 
>> In preparation for soft-reserved resource handling, make the suspend
>> infrastructure always available by removing the CONFIG_CXL_SUSPEND
>> Kconfig option.
>>
>> This ensures cxl_mem_active_inc()/dec() and cxl_mem_active() are
>> unconditionally available, enabling coordination between cxl_pci and
>> cxl_mem drivers during region setup and hotplug operations.
> 
> If these are no longer just being used for suspend, given there
> is nothing else in the file, maybe move them to somewhere else?

There was a recommendation to move the wait queue declaration and its
related changes to acpi.c. I was considering that. Let me know if there
is a better place for this.

Thanks
Smita
> 
> 
>>
>> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>   drivers/cxl/Kconfig        | 4 ----
>>   drivers/cxl/core/Makefile  | 2 +-
>>   drivers/cxl/core/suspend.c | 5 ++++-
>>   drivers/cxl/cxlmem.h       | 9 ---------
>>   include/linux/pm.h         | 7 -------
>>   5 files changed, 5 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
>> index cf1ba673b8c2..d09144c2002e 100644
>> --- a/drivers/cxl/Kconfig
>> +++ b/drivers/cxl/Kconfig
>> @@ -118,10 +118,6 @@ config CXL_PORT
>>   	default CXL_BUS
>>   	tristate
>>   
>> -config CXL_SUSPEND
>> -	def_bool y
>> -	depends on SUSPEND && CXL_MEM
>> -
>>   config CXL_REGION
>>   	bool "CXL: Region Support"
>>   	default CXL_BUS
>> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
>> index 086df97a0fcf..035864db8a32 100644
>> --- a/drivers/cxl/core/Makefile
>> +++ b/drivers/cxl/core/Makefile
>> @@ -1,6 +1,6 @@
>>   # SPDX-License-Identifier: GPL-2.0
>>   obj-$(CONFIG_CXL_BUS) += cxl_core.o
>> -obj-$(CONFIG_CXL_SUSPEND) += suspend.o
>> +obj-y += suspend.o
>>   
>>   ccflags-y += -I$(srctree)/drivers/cxl
>>   CFLAGS_trace.o = -DTRACE_INCLUDE_PATH=. -I$(src)
>> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
>> index 29aa5cc5e565..5ba4b4de0e33 100644
>> --- a/drivers/cxl/core/suspend.c
>> +++ b/drivers/cxl/core/suspend.c
>> @@ -8,7 +8,10 @@ static atomic_t mem_active;
>>   
>>   bool cxl_mem_active(void)
>>   {
>> -	return atomic_read(&mem_active) != 0;
>> +	if (IS_ENABLED(CONFIG_CXL_MEM))
>> +		return atomic_read(&mem_active) != 0;
>> +
>> +	return false;
>>   }
>>   
>>   void cxl_mem_active_inc(void)
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index 3ec6b906371b..1bd1e88c4cc0 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -853,17 +853,8 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
>>   int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
>>   int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
>>   
>> -#ifdef CONFIG_CXL_SUSPEND
>>   void cxl_mem_active_inc(void);
>>   void cxl_mem_active_dec(void);
>> -#else
>> -static inline void cxl_mem_active_inc(void)
>> -{
>> -}
>> -static inline void cxl_mem_active_dec(void)
>> -{
>> -}
>> -#endif
>>   
>>   int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
>>   
>> diff --git a/include/linux/pm.h b/include/linux/pm.h
>> index f0bd8fbae4f2..415928e0b6ca 100644
>> --- a/include/linux/pm.h
>> +++ b/include/linux/pm.h
>> @@ -35,14 +35,7 @@ static inline void pm_vt_switch_unregister(struct device *dev)
>>   }
>>   #endif /* CONFIG_VT_CONSOLE_SLEEP */
>>   
>> -#ifdef CONFIG_CXL_SUSPEND
>>   bool cxl_mem_active(void);
>> -#else
>> -static inline bool cxl_mem_active(void)
>> -{
>> -	return false;
>> -}
>> -#endif
>>   
>>   /*
>>    * Device power management
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown
  2025-06-09 12:54   ` Jonathan Cameron
@ 2025-06-10  1:25     ` Koralahalli Channabasappa, Smita
  2025-06-10  9:37       ` Jonathan Cameron
  2025-06-25 15:20       ` Robert Richter
  0 siblings, 2 replies; 40+ messages in thread
From: Koralahalli Channabasappa, Smita @ 2025-06-10  1:25 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

Hi Jonathan,

On 6/9/2025 5:54 AM, Jonathan Cameron wrote:
> On Tue, 3 Jun 2025 22:19:47 +0000
> Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:
> 
>> Reworked from a patch by Alison Schofield <alison.schofield@intel.com>
>>
>> Previously, when CXL regions were created through autodiscovery and their
>> resources overlapped with SOFT RESERVED ranges, the soft reserved resource
>> remained in place after region teardown. This left the HPA range
>> unavailable for reuse even after the region was destroyed.
>>
>> Enhance the logic to reliably remove SOFT RESERVED resources associated
>> with a region, regardless of alignment or hierarchy in the iomem tree.
>>
>> Link: https://lore.kernel.org/linux-cxl/29312c0765224ae76862d59a17748c8188fb95f1.1692638817.git.alison.schofield@intel.com/
>> Co-developed-by: Alison Schofield <alison.schofield@intel.com>
>> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>   drivers/cxl/acpi.c        |   2 +
>>   drivers/cxl/core/region.c | 151 ++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h         |   5 ++
>>   3 files changed, 158 insertions(+)
>>
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index 978f63b32b41..1b1388feb36d 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -823,6 +823,8 @@ static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>>   	 * and cxl_mem drivers are loaded.
>>   	 */
>>   	wait_for_device_probe();
>> +
>> +	cxl_region_softreserv_update();
>>   }
>>   static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>>   
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 109b8a98c4c7..3a5ca44d65f3 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -3443,6 +3443,157 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
>>   
>> +static int add_soft_reserved(resource_size_t start, resource_size_t len,
>> +			     unsigned long flags)
>> +{
>> +	struct resource *res = kmalloc(sizeof(*res), GFP_KERNEL);
>> +	int rc;
>> +
>> +	if (!res)
>> +		return -ENOMEM;
>> +
>> +	*res = DEFINE_RES_MEM_NAMED(start, len, "Soft Reserved");
>> +
>> +	res->desc = IORES_DESC_SOFT_RESERVED;
>> +	res->flags = flags;
> 
> I'm a bit doubtful about using a define that restricts a bunch of the
> elements, then overriding 2 of them immediately after.
> 
> DEFINE_RES_NAMED_DESC(start, len, "Soft Reserved", flags | IORESOURCE_MEM,
> 		      IORES_DESC_SOFT_RESERVED);

Sure, I will change to the above.

> 
>> +	rc = insert_resource(&iomem_resource, res);
>> +	if (rc) {
>> +		kfree(res);
>> +		return rc;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void remove_soft_reserved(struct cxl_region *cxlr, struct resource *soft,
>> +				 resource_size_t start, resource_size_t end)
>> +{
>> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>> +	resource_size_t new_start, new_end;
>> +	int rc;
>> +
>> +	/* Prevent new usage while removing or adjusting the resource */
>> +	guard(mutex)(&cxlrd->range_lock);
>> +
>> +	/* Aligns at both resource start and end */
>> +	if (soft->start == start && soft->end == end)
>> +		goto remove;
>> +
> 
> Might be a clearer flow with else if rather than
> a goto.

Okay will change.

> 
>> +	/* Aligns at either resource start or end */
>> +	if (soft->start == start || soft->end == end) {
>> +		if (soft->start == start) {
>> +			new_start = end + 1;
>> +			new_end = soft->end;
>> +		} else {
>> +			new_start = soft->start;
>> +			new_end = start - 1;
>> +		}
>> +
>> +		rc = add_soft_reserved(new_start, new_end - new_start + 1,
>> +				       soft->flags);
> 
> This is the remnant of what was there before, but the flags are from
> the bit we are dropping?  That feels odd.  They might well be the same
> but maybe we need to make that explicit?

Okay. I can probably update the code to clarify this by adding a comment,
or I can filter or copy only the relevant flag bits if you think that's
more appropriate.

> 
>> +		if (rc)
>> +			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
>> +				 &new_start);
>> +
>> +		/* Remove the original Soft Reserved resource */
>> +		goto remove;
>> +	}
>> +
>> +	/*
>> +	 * No alignment. Attempt a 3-way split that removes the part of
>> +	 * the resource the region occupied, and then creates new soft
>> +	 * reserved resources for the leading and trailing addr space.
>> +	 */
>> +	new_start = soft->start;
>> +	new_end = soft->end;
>> +
>> +	rc = add_soft_reserved(new_start, start - new_start, soft->flags);
>> +	if (rc)
>> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
>> +			 &new_start);
>> +
>> +	rc = add_soft_reserved(end + 1, new_end - end, soft->flags);
>> +	if (rc)
>> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa + 1\n",
>> +			 &end);
>> +
>> +remove:
>> +	rc = remove_resource(soft);
>> +	if (rc)
>> +		dev_warn(&cxlr->dev, "cannot remove soft reserved resource %pr\n",
>> +			 soft);
>> +}
>> +
>> +/*
>> + * normalize_resource
>> + *
>> + * The walk_iomem_res_desc() returns a copy of a resource, not a reference
>> + * to the actual resource in the iomem_resource tree. As a result,
>> + * __release_resource() which relies on pointer equality will fail.
> 
> Probably want some statement on why nothing can race with this given
> the resource_lock is not being held.

Hmm, you are probably right that normalize_resource() is accessing the
resource tree without holding resource_lock, which could lead to races.

I will update the function to take read_lock(&resource_lock) before
walking res->parent->child.

Let me know if you'd prefer this locking be handled before calling
normalize_resource() instead.

> 
>> + *
>> + * This helper walks the children of the resource's parent to find and
>> + * return the original resource pointer that matches the given resource's
>> + * start and end addresses.
>> + *
>> + * Return: Pointer to the matching original resource in iomem_resource, or
>> + *         NULL if not found or invalid input.
>> + */
>> +static struct resource *normalize_resource(struct resource *res)
>> +{
>> +	if (!res || !res->parent)
>> +		return NULL;
>> +
>> +	for (struct resource *res_iter = res->parent->child;
>> +	     res_iter != NULL; res_iter = res_iter->sibling) {
> 
> I'd move the res_iter != NULL to previous line as it is still under 80 chars.

Sure will fix.

> 
> 
>> +		if ((res_iter->start == res->start) &&
>> +		    (res_iter->end == res->end))
>> +			return res_iter;
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +static int __cxl_region_softreserv_update(struct resource *soft,
>> +					  void *_cxlr)
>> +{
>> +	struct cxl_region *cxlr = _cxlr;
>> +	struct resource *res = cxlr->params.res;
>> +
>> +	/* Skip non-intersecting soft-reserved regions */
>> +	if (soft->end < res->start || soft->start > res->end)
>> +		return 0;
>> +
>> +	soft = normalize_resource(soft);
>> +	if (!soft)
>> +		return -EINVAL;
>> +
>> +	remove_soft_reserved(cxlr, soft, res->start, res->end);
>> +
>> +	return 0;
>> +}
>> +
>> +int cxl_region_softreserv_update(void)
>> +{
>> +	struct device *dev = NULL;
>> +
>> +	while ((dev = bus_find_next_device(&cxl_bus_type, dev))) {
>> +		struct device *put_dev __free(put_device) = dev;
> > This __free() is a little bit outside of the usual rule of declaring the
> > constructor and destructor together.
> 
> I wonder if bus_for_each_dev() is cleaner here or is there a reason
> we have to have released the subsystem lock + grabbed the device
> one before calling __cxl_region_softreserv_update?

Thanks for the suggestion. I will replace the bus_find_next_device() 
with bus_for_each_dev(). I think bus_for_each_dev() simplifies the flow 
as there's also no need to call put_device() explicitly.

Thanks
Smita

> 
>> +		struct cxl_region *cxlr;
>> +
>> +		if (!is_cxl_region(dev))
>> +			continue;
> If you stick to bus_find_X, I wonder if we should define helpers for
> the match function and use bus_find_device().
> 
>> +
>> +		cxlr = to_cxl_region(dev);
>> +
>> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
>> +				    IORESOURCE_MEM, 0, -1, cxlr,
>> +				    __cxl_region_softreserv_update);
>> +	}
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
> 
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown
  2025-06-10  1:25     ` Koralahalli Channabasappa, Smita
@ 2025-06-10  9:37       ` Jonathan Cameron
  2025-06-25 15:20       ` Robert Richter
  1 sibling, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2025-06-10  9:37 UTC (permalink / raw)
  To: Koralahalli Channabasappa, Smita
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

Hi Smita,

> >> +/*
> >> + * normalize_resource
> >> + *
> >> + * The walk_iomem_res_desc() returns a copy of a resource, not a reference
> >> + * to the actual resource in the iomem_resource tree. As a result,
> >> + * __release_resource() which relies on pointer equality will fail.  
> > 
> > Probably want some statement on why nothing can race with this given
> > the resource_lock is not being held.  
> 
> Hmm, you are probably right that normalize_resource() is accessing the
> resource tree without holding resource_lock, which could lead to races.
> 
> I will update the function to take read_lock(&resource_lock) before
> walking res->parent->child.
> 
> Let me know if you'd prefer this locking be handled before calling
> normalize_resource() instead.
I don't mind either way - see what looks better to you.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/7] cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o
  2025-06-09 23:25     ` Koralahalli Channabasappa, Smita
@ 2025-06-10  9:38       ` Jonathan Cameron
  0 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2025-06-10  9:38 UTC (permalink / raw)
  To: Koralahalli Channabasappa, Smita
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On Mon, 9 Jun 2025 16:25:49 -0700
"Koralahalli Channabasappa, Smita" <Smita.KoralahalliChannabasappa@amd.com> wrote:

> On 6/9/2025 4:02 AM, Jonathan Cameron wrote:
> > On Tue, 3 Jun 2025 22:19:44 +0000
> > Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:
> >   
> >> In preparation for soft-reserved resource handling, make the suspend
> >> infrastructure always available by removing the CONFIG_CXL_SUSPEND
> >> Kconfig option.
> >>
> >> This ensures cxl_mem_active_inc()/dec() and cxl_mem_active() are
> >> unconditionally available, enabling coordination between cxl_pci and
> >> cxl_mem drivers during region setup and hotplug operations.  
> > 
> > If these are no longer just being used for suspend, given there
> > is nothing else in the file, maybe move them to somewhere else?  
> 
> There was a recommendation to move the wait queue declaration and its
> related changes to acpi.c. I was considering that. Let me know if there
> is a better place for this.
> 
I wasn't sure of the best location (which is why I was lazy and didn't
suggest one ;)  Dan, Dave, etc.: does anyone have a strong mental model for
where this should be?

> Thanks
> Smita
> > 
> >   
> >>
> >> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> >> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> >> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> >> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> >> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> >> ---
> >>   drivers/cxl/Kconfig        | 4 ----
> >>   drivers/cxl/core/Makefile  | 2 +-
> >>   drivers/cxl/core/suspend.c | 5 ++++-
> >>   drivers/cxl/cxlmem.h       | 9 ---------
> >>   include/linux/pm.h         | 7 -------
> >>   5 files changed, 5 insertions(+), 22 deletions(-)
> >>
> >> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> >> index cf1ba673b8c2..d09144c2002e 100644
> >> --- a/drivers/cxl/Kconfig
> >> +++ b/drivers/cxl/Kconfig
> >> @@ -118,10 +118,6 @@ config CXL_PORT
> >>   	default CXL_BUS
> >>   	tristate
> >>   
> >> -config CXL_SUSPEND
> >> -	def_bool y
> >> -	depends on SUSPEND && CXL_MEM
> >> -
> >>   config CXL_REGION
> >>   	bool "CXL: Region Support"
> >>   	default CXL_BUS
> >> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> >> index 086df97a0fcf..035864db8a32 100644
> >> --- a/drivers/cxl/core/Makefile
> >> +++ b/drivers/cxl/core/Makefile
> >> @@ -1,6 +1,6 @@
> >>   # SPDX-License-Identifier: GPL-2.0
> >>   obj-$(CONFIG_CXL_BUS) += cxl_core.o
> >> -obj-$(CONFIG_CXL_SUSPEND) += suspend.o
> >> +obj-y += suspend.o
> >>   
> >>   ccflags-y += -I$(srctree)/drivers/cxl
> >>   CFLAGS_trace.o = -DTRACE_INCLUDE_PATH=. -I$(src)
> >> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
> >> index 29aa5cc5e565..5ba4b4de0e33 100644
> >> --- a/drivers/cxl/core/suspend.c
> >> +++ b/drivers/cxl/core/suspend.c
> >> @@ -8,7 +8,10 @@ static atomic_t mem_active;
> >>   
> >>   bool cxl_mem_active(void)
> >>   {
> >> -	return atomic_read(&mem_active) != 0;
> >> +	if (IS_ENABLED(CONFIG_CXL_MEM))
> >> +		return atomic_read(&mem_active) != 0;
> >> +
> >> +	return false;
> >>   }
> >>   
> >>   void cxl_mem_active_inc(void)
> >> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> >> index 3ec6b906371b..1bd1e88c4cc0 100644
> >> --- a/drivers/cxl/cxlmem.h
> >> +++ b/drivers/cxl/cxlmem.h
> >> @@ -853,17 +853,8 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
> >>   int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
> >>   int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
> >>   
> >> -#ifdef CONFIG_CXL_SUSPEND
> >>   void cxl_mem_active_inc(void);
> >>   void cxl_mem_active_dec(void);
> >> -#else
> >> -static inline void cxl_mem_active_inc(void)
> >> -{
> >> -}
> >> -static inline void cxl_mem_active_dec(void)
> >> -{
> >> -}
> >> -#endif
> >>   
> >>   int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> >>   
> >> diff --git a/include/linux/pm.h b/include/linux/pm.h
> >> index f0bd8fbae4f2..415928e0b6ca 100644
> >> --- a/include/linux/pm.h
> >> +++ b/include/linux/pm.h
> >> @@ -35,14 +35,7 @@ static inline void pm_vt_switch_unregister(struct device *dev)
> >>   }
> >>   #endif /* CONFIG_VT_CONSOLE_SLEEP */
> >>   
> >> -#ifdef CONFIG_CXL_SUSPEND
> >>   bool cxl_mem_active(void);
> >> -#else
> >> -static inline bool cxl_mem_active(void)
> >> -{
> >> -	return false;
> >> -}
> >> -#endif
> >>   
> >>   /*
> >>    * Device power management  
> >   
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation
  2025-06-03 22:19 ` [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation Smita Koralahalli
  2025-06-09 13:01   ` Jonathan Cameron
@ 2025-06-13  2:12   ` Zhijian Li (Fujitsu)
  2025-07-10  4:22     ` Koralahalli Channabasappa, Smita
  2025-07-15 16:27   ` Alison Schofield
  2 siblings, 1 reply; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-06-13  2:12 UTC (permalink / raw)
  To: Smita Koralahalli, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati

Hi Smita, Nathan, Terry

I am struggling to understand if this patch is truly necessary, or if I haven't
fully grasped the scenario where it provides value. Without applying this patch
on a QEMU/VM with both HMEM and CXL.mem installed, I observed no issues. (Are there
specific config options required to reproduce the problem?)

Here is the /proc/iomem without the patch:
180000000-1ffffffff : Soft Reserved  ### 2 hmem nodes
   180000000-1bfffffff : dax1.0
     180000000-1bfffffff : System RAM (kmem)
   1c0000000-1ffffffff : dax2.0
     1c0000000-1ffffffff : System RAM (kmem)
5c0001128-5c00011b7 : port1
5d0000000-64fffffff : CXL Window 0  ### 1 CXL node
   5d0000000-64fffffff : region0
     5d0000000-64fffffff : dax0.0
       5d0000000-64fffffff : System RAM (kmem)

On 04/06/2025 06:19, Smita Koralahalli wrote:
> From: Nathan Fontenot <nathan.fontenot@amd.com>
> 
> The DAX HMEM driver currently consumes all SOFT RESERVED iomem resources
> during initialization. This interferes with the CXL driver’s ability to
> create regions and trim overlapping SOFT RESERVED ranges before DAX uses
> them.

When referring to "HMEM driver" in the commit message, is it
`dax_hmem_platform_driver` or `dax_hmem_driver`? Regardless of which,
what is the impact if one consumes all SOFT RESERVED resources?

Since `hmem_register_device()` only creates HMEM devices for ranges
*without* `IORES_DESC_CXL`, which could be marked in cxl_acpi, cxl_core/cxl_dax
should still create regions and DAX devices without conflicts.

> To resolve this, defer the DAX driver's resource consumption if the
> cxl_acpi driver is enabled. The DAX HMEM initialization skips walking the
> iomem resource tree in this case. After CXL region creation completes,
> any remaining SOFT RESERVED resources are explicitly registered with the
> DAX driver by the CXL driver.

Conversely, with this patch applied, `cxl_region_softreserv_update()` attempts
to register new HMEM devices. This may cause duplicate registrations for the
same range (e.g., 0x180000000-0x1ffffffff), triggering warnings like:

[   14.984108] kmem dax4.0: mapping0: 0x180000000-0x1ffffffff could not reserve region
[   14.987204] kmem dax4.0: probe with driver kmem failed with error -16

Because the HMAT initialization already registered these sub-ranges:
   180000000-1bfffffff
   1c0000000-1ffffffff


If I'm missing something, please correct me.

Thanks,
Zhijian



> 
> This sequencing ensures proper handling of overlaps and fixes hotplug
> failures.
> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>   drivers/cxl/core/region.c | 10 +++++++++
>   drivers/dax/hmem/device.c | 43 ++++++++++++++++++++-------------------
>   drivers/dax/hmem/hmem.c   |  3 ++-
>   include/linux/dax.h       |  6 ++++++
>   4 files changed, 40 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 3a5ca44d65f3..c6c0c7ba3b20 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -10,6 +10,7 @@
>   #include <linux/sort.h>
>   #include <linux/idr.h>
>   #include <linux/memory-tiers.h>
> +#include <linux/dax.h>
>   #include <cxlmem.h>
>   #include <cxl.h>
>   #include "core.h"
> @@ -3553,6 +3554,11 @@ static struct resource *normalize_resource(struct resource *res)
>   	return NULL;
>   }
>   
> +static int cxl_softreserv_mem_register(struct resource *res, void *unused)
> +{
> +	return hmem_register_device(phys_to_target_node(res->start), res);
> +}
> +
>   static int __cxl_region_softreserv_update(struct resource *soft,
>   					  void *_cxlr)
>   {
> @@ -3590,6 +3596,10 @@ int cxl_region_softreserv_update(void)
>   				    __cxl_region_softreserv_update);
>   	}
>   
> +	/* Now register any remaining SOFT RESERVES with DAX */
> +	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
> +			    0, -1, NULL, cxl_softreserv_mem_register);
> +
>   	return 0;
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
> index 59ad44761191..cc1ed7bbdb1a 100644
> --- a/drivers/dax/hmem/device.c
> +++ b/drivers/dax/hmem/device.c
> @@ -8,7 +8,6 @@
>   static bool nohmem;
>   module_param_named(disable, nohmem, bool, 0444);
>   
> -static bool platform_initialized;
>   static DEFINE_MUTEX(hmem_resource_lock);
>   static struct resource hmem_active = {
>   	.name = "HMEM devices",
> @@ -35,9 +34,7 @@ EXPORT_SYMBOL_GPL(walk_hmem_resources);
>   
>   static void __hmem_register_resource(int target_nid, struct resource *res)
>   {
> -	struct platform_device *pdev;
>   	struct resource *new;
> -	int rc;
>   
>   	new = __request_region(&hmem_active, res->start, resource_size(res), "",
>   			       0);
> @@ -47,21 +44,6 @@ static void __hmem_register_resource(int target_nid, struct resource *res)
>   	}
>   
>   	new->desc = target_nid;
> -
> -	if (platform_initialized)
> -		return;
> -
> -	pdev = platform_device_alloc("hmem_platform", 0);
> -	if (!pdev) {
> -		pr_err_once("failed to register device-dax hmem_platform device\n");
> -		return;
> -	}
> -
> -	rc = platform_device_add(pdev);
> -	if (rc)
> -		platform_device_put(pdev);
> -	else
> -		platform_initialized = true;
>   }
>   
>   void hmem_register_resource(int target_nid, struct resource *res)
> @@ -83,9 +65,28 @@ static __init int hmem_register_one(struct resource *res, void *data)
>   
>   static __init int hmem_init(void)
>   {
> -	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> -			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
> -	return 0;
> +	struct platform_device *pdev;
> +	int rc;
> +
> +	if (!IS_ENABLED(CONFIG_CXL_ACPI)) {
> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> +				    IORESOURCE_MEM, 0, -1, NULL,
> +				    hmem_register_one);
> +	}
> +
> +	pdev = platform_device_alloc("hmem_platform", 0);
> +	if (!pdev) {
> +		pr_err("failed to register device-dax hmem_platform device\n");
> +		return -1;
> +	}
> +
> +	rc = platform_device_add(pdev);
> +	if (rc) {
> +		pr_err("failed to add device-dax hmem_platform device\n");
> +		platform_device_put(pdev);
> +	}
> +
> +	return rc;
>   }
>   
>   /*
> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
> index 3aedef5f1be1..a206b9b383e4 100644
> --- a/drivers/dax/hmem/hmem.c
> +++ b/drivers/dax/hmem/hmem.c
> @@ -61,7 +61,7 @@ static void release_hmem(void *pdev)
>   	platform_device_unregister(pdev);
>   }
>   
> -static int hmem_register_device(int target_nid, const struct resource *res)
> +int hmem_register_device(int target_nid, const struct resource *res)
>   {
>   	struct device *host = &dax_hmem_pdev->dev;
>   	struct platform_device *pdev;
> @@ -124,6 +124,7 @@ static int hmem_register_device(int target_nid, const struct resource *res)
>   	platform_device_put(pdev);
>   	return rc;
>   }
> +EXPORT_SYMBOL_GPL(hmem_register_device);
>   
>   static int dax_hmem_platform_probe(struct platform_device *pdev)
>   {
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index a4ad3708ea35..5052dca8b3bc 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -299,10 +299,16 @@ static inline int dax_mem2blk_err(int err)
>   
>   #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
>   void hmem_register_resource(int target_nid, struct resource *r);
> +int hmem_register_device(int target_nid, const struct resource *res);
>   #else
>   static inline void hmem_register_resource(int target_nid, struct resource *r)
>   {
>   }
> +
> +static inline int hmem_register_device(int target_nid, const struct resource *res)
> +{
> +	return 0;
> +}
>   #endif
>   
>   typedef int (*walk_hmem_fn)(int target_nid, const struct resource *res);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 0/7] Add managed SOFT RESERVE resource handling
  2025-06-04 18:59   ` Koralahalli Channabasappa, Smita
@ 2025-06-16  1:00     ` Zhijian Li (Fujitsu)
  0 siblings, 0 replies; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-06-16  1:00 UTC (permalink / raw)
  To: Koralahalli Channabasappa, Smita, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati



On 05/06/2025 02:59, Koralahalli Channabasappa, Smita wrote:
>>
>>
>> To the CXL community,
>>
>> The scenarios mentioned here essentially cover what a correct firmware may provide. However,
>> I would like to discuss one more scenario that I can simulate with a modified QEMU:
>> The E820 exposes a SOFT RESERVED region which is the same as a CFMW, but the HDM decoders are not committed. This means no region will be auto-created during boot.
>>
>> As an example, after boot, the iomem tree is as follows:
>> 1050000000-304fffffff : CXL Window 0
>>     1050000000-304fffffff : Soft Reserved
>>       <No region>
>>
>> In this case, the SOFT RESERVED resource is not trimmed, so the end-user cannot create a new region.
>> My question is: Is this scenario a problem? If it is, should we fix it in this patchset or create a new patch?
>>
> 
> I believe firmware should handle this correctly by ensuring that any exposed SOFT RESERVED ranges correspond to committed HDM decoders and result in region creation.
> 
> That said, I’d be interested in hearing what the rest of the community thinks.


After several days, we still haven't heard other significant opinions. I'm fine with keeping the current case coverage.
If the case I described becomes common in the future, we can revisit it then.

Thanks
Zhijian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown
  2025-06-10  1:25     ` Koralahalli Channabasappa, Smita
  2025-06-10  9:37       ` Jonathan Cameron
@ 2025-06-25 15:20       ` Robert Richter
  1 sibling, 0 replies; 40+ messages in thread
From: Robert Richter @ 2025-06-25 15:20 UTC (permalink / raw)
  To: Koralahalli Channabasappa, Smita
  Cc: Jonathan Cameron, linux-cxl, linux-kernel, nvdimm, linux-fsdevel,
	linux-pm, Davidlohr Bueso, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On 09.06.25 18:25:23, Koralahalli Channabasappa, Smita wrote:
> On 6/9/2025 5:54 AM, Jonathan Cameron wrote:
> > On Tue, 3 Jun 2025 22:19:47 +0000
> > Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:

> > > +static int __cxl_region_softreserv_update(struct resource *soft,
> > > +					  void *_cxlr)
> > > +{
> > > +	struct cxl_region *cxlr = _cxlr;
> > > +	struct resource *res = cxlr->params.res;
> > > +
> > > +	/* Skip non-intersecting soft-reserved regions */
> > > +	if (soft->end < res->start || soft->start > res->end)
> > > +		return 0;
> > > +
> > > +	soft = normalize_resource(soft);
> > > +	if (!soft)
> > > +		return -EINVAL;
> > > +
> > > +	remove_soft_reserved(cxlr, soft, res->start, res->end);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +int cxl_region_softreserv_update(void)
> > > +{
> > > +	struct device *dev = NULL;
> > > +
> > > +	while ((dev = bus_find_next_device(&cxl_bus_type, dev))) {
> > > +		struct device *put_dev __free(put_device) = dev;
> > This free is a little bit outside of the constructor and destructor
> > together rules.
> > 
> > I wonder if bus_for_each_dev() is cleaner here or is there a reason
> > we have to have released the subsystem lock + grabbed the device
> > one before calling __cxl_region_softreserv_update?
> 
> Thanks for the suggestion. I will replace the bus_find_next_device() with
> bus_for_each_dev(). I think bus_for_each_dev() simplifies the flow as
> there's also no need to call put_device() explicitly.

That change makes patch #1 obsolete (see comments there too). I would
prefer to remove it.

Thanks,

-Robert

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region()
  2025-06-04 19:56     ` Nathan Fontenot
@ 2025-06-25 15:23       ` Robert Richter
  0 siblings, 0 replies; 40+ messages in thread
From: Robert Richter @ 2025-06-25 15:23 UTC (permalink / raw)
  To: Nathan Fontenot
  Cc: Dave Jiang, Smita Koralahalli, linux-cxl, linux-kernel, nvdimm,
	linux-fsdevel, linux-pm, Davidlohr Bueso, Jonathan Cameron,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Matthew Wilcox, Jan Kara, Rafael J . Wysocki, Len Brown,
	Pavel Machek, Li Ming, Jeff Johnson, Ying Huang, Yao Xingtao,
	Peter Zijlstra, Greg KH, Terry Bowman, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On 04.06.25 14:56:22, Nathan Fontenot wrote:
> On 6/3/2025 6:49 PM, Dave Jiang wrote:
> > 
> > 
> > On 6/3/25 3:19 PM, Smita Koralahalli wrote:
> >> Add a NULL check in is_cxl_region() to prevent potential null pointer
> >> dereference if a caller passes a NULL device. This change ensures the
> >> function safely returns false instead of triggering undefined behavior
> >> when dev is NULL.
> > 
> > Don't think this change is necessary. The code paths should not be hitting any NULL region devices unless it's a programming error.
> 
> I originally added this to the patchset during some initial development to handle possible
> NULL dev pointers when updating soft reserve resources, see cxl_region_softreserv_update()
> in patch 5/7.
> 
> In the current form of that routine it appears we shouldn't execute the while loop
> if dev is NULL, so this could be dropped from the patch set.

With the suggested change to use bus_for_each_dev() in patch #5 this
patch should be dropped.

-Robert

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown
  2025-06-03 22:19 ` [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown Smita Koralahalli
  2025-06-06  4:16   ` Zhijian Li (Fujitsu)
  2025-06-09 12:54   ` Jonathan Cameron
@ 2025-07-09  1:14   ` Alison Schofield
  2 siblings, 0 replies; 40+ messages in thread
From: Alison Schofield @ 2025-07-09  1:14 UTC (permalink / raw)
  To: Smita Koralahalli
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On Tue, Jun 03, 2025 at 10:19:47PM +0000, Smita Koralahalli wrote:
> Reworked from a patch by Alison Schofield <alison.schofield@intel.com>

This isn't really "Introducing" is it? Looks like it adds the new softreserved
helpers and then adds the caller to the notifier function. And, that notifier
function triggers after probe, not on region teardown. It's setting us up for
a clean teardown for sure!

So let's rename something like:
Remove a region SOFT RESERVED resource after probe

I'm chasing a problem with v4 that wasn't present in v2 (I skipped v3).
The notifier is working and I see the SOFT RESERVED gone after probe, but
the dax device below the region is still present after region teardown.
I'll get more details, but thought I'd throw it out in case anyone has
seen similar.

-- Alison

> 
> Previously, when CXL regions were created through autodiscovery and their
> resources overlapped with SOFT RESERVED ranges, the soft reserved resource
> remained in place after region teardown. This left the HPA range
> unavailable for reuse even after the region was destroyed.
> 
> Enhance the logic to reliably remove SOFT RESERVED resources associated
> with a region, regardless of alignment or hierarchy in the iomem tree.
> 
> Link: https://lore.kernel.org/linux-cxl/29312c0765224ae76862d59a17748c8188fb95f1.1692638817.git.alison.schofield@intel.com/
> Co-developed-by: Alison Schofield <alison.schofield@intel.com>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  drivers/cxl/acpi.c        |   2 +
>  drivers/cxl/core/region.c | 151 ++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h         |   5 ++
>  3 files changed, 158 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 978f63b32b41..1b1388feb36d 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -823,6 +823,8 @@ static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>  	 * and cxl_mem drivers are loaded.
>  	 */
>  	wait_for_device_probe();
> +
> +	cxl_region_softreserv_update();
>  }
>  static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>  
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 109b8a98c4c7..3a5ca44d65f3 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3443,6 +3443,157 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
>  
> +static int add_soft_reserved(resource_size_t start, resource_size_t len,
> +			     unsigned long flags)
> +{
> +	struct resource *res = kmalloc(sizeof(*res), GFP_KERNEL);
> +	int rc;
> +
> +	if (!res)
> +		return -ENOMEM;
> +
> +	*res = DEFINE_RES_MEM_NAMED(start, len, "Soft Reserved");
> +
> +	res->desc = IORES_DESC_SOFT_RESERVED;
> +	res->flags = flags;
> +	rc = insert_resource(&iomem_resource, res);
> +	if (rc) {
> +		kfree(res);
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static void remove_soft_reserved(struct cxl_region *cxlr, struct resource *soft,
> +				 resource_size_t start, resource_size_t end)
> +{
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> +	resource_size_t new_start, new_end;
> +	int rc;
> +
> +	/* Prevent new usage while removing or adjusting the resource */
> +	guard(mutex)(&cxlrd->range_lock);
> +
> +	/* Aligns at both resource start and end */
> +	if (soft->start == start && soft->end == end)
> +		goto remove;
> +
> +	/* Aligns at either resource start or end */
> +	if (soft->start == start || soft->end == end) {
> +		if (soft->start == start) {
> +			new_start = end + 1;
> +			new_end = soft->end;
> +		} else {
> +			new_start = soft->start;
> +			new_end = start - 1;
> +		}
> +
> +		rc = add_soft_reserved(new_start, new_end - new_start + 1,
> +				       soft->flags);
> +		if (rc)
> +			dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
> +				 &new_start);
> +
> +		/* Remove the original Soft Reserved resource */
> +		goto remove;
> +	}
> +
> +	/*
> +	 * No alignment. Attempt a 3-way split that removes the part of
> +	 * the resource the region occupied, and then creates new soft
> +	 * reserved resources for the leading and trailing addr space.
> +	 */
> +	new_start = soft->start;
> +	new_end = soft->end;
> +
> +	rc = add_soft_reserved(new_start, start - new_start, soft->flags);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa\n",
> +			 &new_start);
> +
> +	rc = add_soft_reserved(end + 1, new_end - end, soft->flags);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot add new soft reserved resource at %pa + 1\n",
> +			 &end);
> +
> +remove:
> +	rc = remove_resource(soft);
> +	if (rc)
> +		dev_warn(&cxlr->dev, "cannot remove soft reserved resource %pr\n",
> +			 soft);
> +}
> +
> +/*
> + * normalize_resource
> + *
> + * The walk_iomem_res_desc() returns a copy of a resource, not a reference
> + * to the actual resource in the iomem_resource tree. As a result,
> + * __release_resource() which relies on pointer equality will fail.
> + *
> + * This helper walks the children of the resource's parent to find and
> + * return the original resource pointer that matches the given resource's
> + * start and end addresses.
> + *
> + * Return: Pointer to the matching original resource in iomem_resource, or
> + *         NULL if not found or invalid input.
> + */
> +static struct resource *normalize_resource(struct resource *res)
> +{
> +	if (!res || !res->parent)
> +		return NULL;
> +
> +	for (struct resource *res_iter = res->parent->child;
> +	     res_iter != NULL; res_iter = res_iter->sibling) {
> +		if ((res_iter->start == res->start) &&
> +		    (res_iter->end == res->end))
> +			return res_iter;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int __cxl_region_softreserv_update(struct resource *soft,
> +					  void *_cxlr)
> +{
> +	struct cxl_region *cxlr = _cxlr;
> +	struct resource *res = cxlr->params.res;
> +
> +	/* Skip non-intersecting soft-reserved regions */
> +	if (soft->end < res->start || soft->start > res->end)
> +		return 0;
> +
> +	soft = normalize_resource(soft);
> +	if (!soft)
> +		return -EINVAL;
> +
> +	remove_soft_reserved(cxlr, soft, res->start, res->end);
> +
> +	return 0;
> +}
> +
> +int cxl_region_softreserv_update(void)
> +{
> +	struct device *dev = NULL;
> +
> +	while ((dev = bus_find_next_device(&cxl_bus_type, dev))) {
> +		struct device *put_dev __free(put_device) = dev;
> +		struct cxl_region *cxlr;
> +
> +		if (!is_cxl_region(dev))
> +			continue;
> +
> +		cxlr = to_cxl_region(dev);
> +
> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> +				    IORESOURCE_MEM, 0, -1, cxlr,
> +				    __cxl_region_softreserv_update);
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
> +
>  u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa)
>  {
>  	struct cxl_region_ref *iter;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 1ba7d39c2991..fc39c4b24745 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -859,6 +859,7 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
>  int cxl_add_to_region(struct cxl_port *root,
>  		      struct cxl_endpoint_decoder *cxled);
>  struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
> +int cxl_region_softreserv_update(void);
>  u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
>  #else
>  static inline bool is_cxl_pmem_region(struct device *dev)
> @@ -878,6 +879,10 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
>  {
>  	return NULL;
>  }
> +static inline int cxl_region_softreserv_update(void)
> +{
> +	return 0;
> +}
>  static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
>  					       u64 spa)
>  {
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation
  2025-06-13  2:12   ` Zhijian Li (Fujitsu)
@ 2025-07-10  4:22     ` Koralahalli Channabasappa, Smita
  2025-07-10  8:18       ` Zhijian Li (Fujitsu)
  0 siblings, 1 reply; 40+ messages in thread
From: Koralahalli Channabasappa, Smita @ 2025-07-10  4:22 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu), linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati

Hi Zhijian,

Sorry for the delay in getting back on this. It took me a bit of time to 
fully understand what was happening. Responses inline.

On 6/12/2025 7:12 PM, Zhijian Li (Fujitsu) wrote:
> Hi Smita, Nathan, Terry
> 
> I am struggling to understand if this patch is truly necessary, or if I haven't
> fully grasped the scenario where it provides value. Without applying this patch
> on a QEMU/VM with both HMEM and CXL.mem installed, I observed no issues. (Are there
> specific config options required to reproduce the problem?)
> 
> Here is the /proc/iomem without the patch:
> 180000000-1ffffffff : Soft Reserved  ### 2 hmem nodes
>     180000000-1bfffffff : dax1.0
>       180000000-1bfffffff : System RAM (kmem)
>     1c0000000-1ffffffff : dax2.0
>       1c0000000-1ffffffff : System RAM (kmem)
> 5c0001128-5c00011b7 : port1
> 5d0000000-64fffffff : CXL Window 0  ### 1 CXL node
>     5d0000000-64fffffff : region0
>       5d0000000-64fffffff : dax0.0
>         5d0000000-64fffffff : System RAM (kmem)
> 
> On 04/06/2025 06:19, Smita Koralahalli wrote:
>> From: Nathan Fontenot <nathan.fontenot@amd.com>
>>
>> The DAX HMEM driver currently consumes all SOFT RESERVED iomem resources
>> during initialization. This interferes with the CXL driver’s ability to
>> create regions and trim overlapping SOFT RESERVED ranges before DAX uses
>> them.
> 
> When referring to "HMEM driver" in the commit message, is it
> `dax_hmem_platform_driver` or `dax_hmem_driver`? Regardless of which,
> what is the impact if one consumes all SOFT RESERVED resources?
> 
> Since `hmem_register_device()` only creates HMEM devices for ranges
> *without* `IORES_DESC_CXL` which could be marked in cxl_acpi , cxl_core/cxl_dax
> should still create regions and DAX devices without conflicts.

You're correct that hmem_register_device() includes a check to skip
regions overlapping with IORES_DESC_CXL. However, this check only works 
if the CXL region driver has already inserted those regions into 
iomem_resource. If dax_hmem_platform_probe() runs too early (before CXL 
region probing), that check fails to detect overlaps — leading to 
erroneous registration.

This is what I think; I may be wrong. Also, Alison/Dan commented here: 
"New approach is to not have the CXL intersecting soft reserved 
resources in iomem_resource tree."

https://lore.kernel.org/linux-cxl/ZPdoduf5IckVWQVD@aschofie-mobl2/
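
For reference, the ordering-sensitive check in question is roughly the 
following fragment of hmem_register_device() (paraphrased from mainline 
drivers/dax/hmem/hmem.c; exact details may differ by kernel version):

	if (IS_ENABLED(CONFIG_CXL_REGION) &&
	    region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
			      IORES_DESC_CXL) != REGION_DISJOINT) {
		dev_dbg(host, "deferring range to CXL: %pr\n", res);
		return 0;
	}

region_intersects() can only see what is already in iomem_resource, which 
is why the outcome depends on whether cxl_acpi has inserted the CXL window 
resources before this runs.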

> 
>> To resolve this, defer the DAX driver's resource consumption if the
>> cxl_acpi driver is enabled. The DAX HMEM initialization skips walking the
>> iomem resource tree in this case. After CXL region creation completes,
>> any remaining SOFT RESERVED resources are explicitly registered with the
>> DAX driver by the CXL driver.
> 
> Conversely, with this patch applied, `cxl_region_softreserv_update()` attempts
> to register new HMEM devices. This may cause duplicate registrations for the
>    same range (e.g., 0x180000000-0x1ffffffff), triggering warnings like:
> 
> [   14.984108] kmem dax4.0: mapping0: 0x180000000-0x1ffffffff could not reserve region
> [   14.987204] kmem dax4.0: probe with driver kmem failed with error -16
> 
> Because the HMAT initialization already registered these sub-ranges:
>     180000000-1bfffffff
>     1c0000000-1ffffffff
> 
> 
> If I'm missing something, please correct me.

Yeah, this bug is due to a double invocation of hmem_register_device() 
once from cxl_softreserv_mem_register() and once from 
dax_hmem_platform_probe().

When CONFIG_CXL_ACPI=y, walk_iomem_res_desc() is skipped in hmem_init(),
so I expected hmem_active to remain empty. However, I missed the detail 
that the ACPI HMAT parser (drivers/acpi/numa/hmat.c) calls 
hmem_register_resource(), which populates hmem_active via 
__hmem_register_resource().
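
For context, the hmat.c path referred to above looks roughly like this 
(paraphrased from drivers/acpi/numa/hmat.c; it may vary by kernel version):

static void hmat_register_target_devices(struct memory_target *target)
{
	struct resource *res;

	/*
	 * Do not bother creating devices if no driver is available to
	 * consume them.
	 */
	if (!IS_ENABLED(CONFIG_DEV_DAX_HMEM))
		return;

	for (res = target->memregions.child; res; res = res->sibling) {
		int target_nid = pxm_to_node(target->memory_pxm);

		hmem_register_resource(target_nid, res);
	}
}

So hmem_active gets populated from HMAT regardless of whether hmem_init() 
walked the Soft Reserved resources.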

Case 1 (No bug): If dax_hmem_platform_probe() runs when hmem_active is 
still empty.

walk_hmem_resources() walks nothing — it's effectively a no-op.

Later, cxl_softreserv_mem_register() is invoked to register leftover 
soft-reserved regions via hmem_register_device().

Only one registration occurs, no conflict.

Case 2: If dax_hmem_platform_probe() runs after hmem_active is populated 
by hmat_register_target_devices() (via hmem_register_resource()):

walk_hmem_resources() iterates those regions. It invokes 
hmem_register_device(). Later, cxl_region driver does the same again.

This results in duplicate instances for the same physical range, and the 
second call fails as shown below:

[   14.984108] kmem dax4.0: mapping0: 0x180000000-0x1ffffffff could not 
reserve region
[   14.987204] kmem dax4.0: probe with driver kmem failed with error -16

The change below fixed the above bug for me, and I have incorporated 
it in v5.

static int dax_hmem_platform_probe(struct platform_device *pdev)
{
+	if (IS_ENABLED(CONFIG_CXL_ACPI))
+		return 0;

	dax_hmem_pdev = pdev;
	return walk_hmem_resources(hmem_register_device);
}

Let me know if my thought process is right. I would appreciate any 
additional feedback or suggestions.

Meanwhile, I should also mention that my approach fails if cxl_acpi 
finishes probing before dax_hmem is even loaded: it attempts to call 
into unresolved dax_hmem symbols, causing probe failures. This is 
particularly a problem when CXL_BUS=y and DEV_DAX_HMEM=m:

ld: vmlinux.o: in function `cxl_softreserv_mem_register':
region.c:(.text+0xc15160): undefined reference to `hmem_register_device'
make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1

I spent some time exploring possible fixes for this symbol dependency 
issue, which delayed my v5 submission. I would welcome any ideas.
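
One generic idiom for this kind of built-in-calls-into-module problem, 
offered only as a sketch and not as the fix chosen for v5, is to guard 
the call with IS_REACHABLE() so the compiler drops it when it cannot be 
satisfied (the function name below is from this thread; using the guard 
here is an assumption):

static int cxl_softreserv_mem_register(struct resource *res, void *unused)
{
	/*
	 * Sketch only: IS_REACHABLE(CONFIG_DEV_DAX_HMEM) is false when this
	 * code is built-in while dax_hmem is =m, so the branch (and the
	 * symbol reference) is compiled away, at the cost of silently
	 * skipping the hand-off in that configuration.
	 */
	if (IS_REACHABLE(CONFIG_DEV_DAX_HMEM))
		return hmem_register_device(phys_to_target_node(res->start),
					    res);
	return 0;
}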

In the meantime, I noticed your new patchset that introduces a different 
approach for resolving resource conflicts between CFMW and Soft Reserved 
regions. I will take a closer look at that.

Thanks
Smita

> 
> Thanks,
> Zhijian
> 
> 
> 
>>
>> This sequencing ensures proper handling of overlaps and fixes hotplug
>> failures.
>>
>> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>    drivers/cxl/core/region.c | 10 +++++++++
>>    drivers/dax/hmem/device.c | 43 ++++++++++++++++++++-------------------
>>    drivers/dax/hmem/hmem.c  |  3 ++-
>>    include/linux/dax.h       |  6 ++++++
>>    4 files changed, 40 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 3a5ca44d65f3..c6c0c7ba3b20 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -10,6 +10,7 @@
>>    #include <linux/sort.h>
>>    #include <linux/idr.h>
>>    #include <linux/memory-tiers.h>
>> +#include <linux/dax.h>
>>    #include <cxlmem.h>
>>    #include <cxl.h>
>>    #include "core.h"
>> @@ -3553,6 +3554,11 @@ static struct resource *normalize_resource(struct resource *res)
>>    	return NULL;
>>    }
>>    
>> +static int cxl_softreserv_mem_register(struct resource *res, void *unused)
>> +{
>> +	return hmem_register_device(phys_to_target_node(res->start), res);
>> +}
>> +
>>    static int __cxl_region_softreserv_update(struct resource *soft,
>>    					  void *_cxlr)
>>    {
>> @@ -3590,6 +3596,10 @@ int cxl_region_softreserv_update(void)
>>    				    __cxl_region_softreserv_update);
>>    	}
>>    
>> +	/* Now register any remaining SOFT RESERVES with DAX */
>> +	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
>> +			    0, -1, NULL, cxl_softreserv_mem_register);
>> +
>>    	return 0;
>>    }
>>    EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
>> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
>> index 59ad44761191..cc1ed7bbdb1a 100644
>> --- a/drivers/dax/hmem/device.c
>> +++ b/drivers/dax/hmem/device.c
>> @@ -8,7 +8,6 @@
>>    static bool nohmem;
>>    module_param_named(disable, nohmem, bool, 0444);
>>    
>> -static bool platform_initialized;
>>    static DEFINE_MUTEX(hmem_resource_lock);
>>    static struct resource hmem_active = {
>>    	.name = "HMEM devices",
>> @@ -35,9 +34,7 @@ EXPORT_SYMBOL_GPL(walk_hmem_resources);
>>    
>>    static void __hmem_register_resource(int target_nid, struct resource *res)
>>    {
>> -	struct platform_device *pdev;
>>    	struct resource *new;
>> -	int rc;
>>    
>>    	new = __request_region(&hmem_active, res->start, resource_size(res), "",
>>    			       0);
>> @@ -47,21 +44,6 @@ static void __hmem_register_resource(int target_nid, struct resource *res)
>>    	}
>>    
>>    	new->desc = target_nid;
>> -
>> -	if (platform_initialized)
>> -		return;
>> -
>> -	pdev = platform_device_alloc("hmem_platform", 0);
>> -	if (!pdev) {
>> -		pr_err_once("failed to register device-dax hmem_platform device\n");
>> -		return;
>> -	}
>> -
>> -	rc = platform_device_add(pdev);
>> -	if (rc)
>> -		platform_device_put(pdev);
>> -	else
>> -		platform_initialized = true;
>>    }
>>    
>>    void hmem_register_resource(int target_nid, struct resource *res)
>> @@ -83,9 +65,28 @@ static __init int hmem_register_one(struct resource *res, void *data)
>>    
>>    static __init int hmem_init(void)
>>    {
>> -	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
>> -			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
>> -	return 0;
>> +	struct platform_device *pdev;
>> +	int rc;
>> +
>> +	if (!IS_ENABLED(CONFIG_CXL_ACPI)) {
>> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
>> +				    IORESOURCE_MEM, 0, -1, NULL,
>> +				    hmem_register_one);
>> +	}
>> +
>> +	pdev = platform_device_alloc("hmem_platform", 0);
>> +	if (!pdev) {
>> +		pr_err("failed to register device-dax hmem_platform device\n");
>> +		return -1;
>> +	}
>> +
>> +	rc = platform_device_add(pdev);
>> +	if (rc) {
>> +		pr_err("failed to add device-dax hmem_platform device\n");
>> +		platform_device_put(pdev);
>> +	}
>> +
>> +	return rc;
>>    }
>>    
>>    /*
>> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
>> index 3aedef5f1be1..a206b9b383e4 100644
>> --- a/drivers/dax/hmem/hmem.c
>> +++ b/drivers/dax/hmem/hmem.c
>> @@ -61,7 +61,7 @@ static void release_hmem(void *pdev)
>>    	platform_device_unregister(pdev);
>>    }
>>    
>> -static int hmem_register_device(int target_nid, const struct resource *res)
>> +int hmem_register_device(int target_nid, const struct resource *res)
>>    {
>>    	struct device *host = &dax_hmem_pdev->dev;
>>    	struct platform_device *pdev;
>> @@ -124,6 +124,7 @@ static int hmem_register_device(int target_nid, const struct resource *res)
>>    	platform_device_put(pdev);
>>    	return rc;
>>    }
>> +EXPORT_SYMBOL_GPL(hmem_register_device);
>>    
>>    static int dax_hmem_platform_probe(struct platform_device *pdev)
>>    {
>> diff --git a/include/linux/dax.h b/include/linux/dax.h
>> index a4ad3708ea35..5052dca8b3bc 100644
>> --- a/include/linux/dax.h
>> +++ b/include/linux/dax.h
>> @@ -299,10 +299,16 @@ static inline int dax_mem2blk_err(int err)
>>    
>>    #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
>>    void hmem_register_resource(int target_nid, struct resource *r);
>> +int hmem_register_device(int target_nid, const struct resource *res);
>>    #else
>>    static inline void hmem_register_resource(int target_nid, struct resource *r)
>>    {
>>    }
>> +
>> +static inline int hmem_register_device(int target_nid, const struct resource *res)
>> +{
>> +	return 0;
>> +}
>>    #endif
>>    
>>    typedef int (*walk_hmem_fn)(int target_nid, const struct resource *res);


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation
  2025-07-10  4:22     ` Koralahalli Channabasappa, Smita
@ 2025-07-10  8:18       ` Zhijian Li (Fujitsu)
  0 siblings, 0 replies; 40+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-07-10  8:18 UTC (permalink / raw)
  To: Koralahalli Channabasappa, Smita, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Xingtao Yao (Fujitsu), Peter Zijlstra,
	Greg KH, Nathan Fontenot, Terry Bowman, Robert Richter,
	Benjamin Cheatham, PradeepVineshReddy Kodamati



On 10/07/2025 12:22, Koralahalli Channabasappa, Smita wrote:
>>
>> what is the impact if one consumes all SOFT RESERVED resources?
>>
>> Since `hmem_register_device()` only creates HMEM devices for ranges
>> *without* `IORES_DESC_CXL` which could be marked in cxl_acpi , cxl_core/cxl_dax
>> should still create regions and DAX devices without conflicts.
> 
> You're correct that hmem_register_device() includes a check to skip
> regions overlapping with IORES_DESC_CXL. However, this check only works if the CXL region driver has already inserted those regions into iomem_resource. 

IIUC, this relies on the root decoder resource (CFMW) having already been inserted into iomem_resource, which is currently done in cxl_acpi.

This can also be resolved via the module loading dependency chain,
something like this:

#if IS_MODULE(CONFIG_CXL_ACPI)
MODULE_SOFTDEP("pre: cxl_acpi");
#endif



> If dax_hmem_platform_probe() runs too early (before CXL region probing), that check fails to detect overlaps — leading to erroneous registration.
> 
> This is what I think. I may be wrong. Also, Alison/Dan comment here: "New approach is to not have the CXL intersecting soft reserved
> resources in iomem_resource tree."..

I think his point was the latter sentence:
"Only move them there if CXL region assembly fails and we want to make them available to DAX directly."

which, in my understanding, means removing the 'Soft Reserved' as part of the next region creation.


> 
> https://lore.kernel.org/linux-cxl/ZPdoduf5IckVWQVD@aschofie-mobl2/
> 
>>
>>> To resolve this, defer the DAX driver's resource consumption if the
>>> cxl_acpi driver is enabled. The DAX HMEM initialization skips walking the
>>> iomem resource tree in this case. After CXL region creation completes,
>>> any remaining SOFT RESERVED resources are explicitly registered with the
>>> DAX driver by the CXL driver.
>>
>> Conversely, with this patch applied, `cxl_region_softreserv_update()` attempts
>> to register new HMEM devices. This may cause duplicate registrations for the
>>    same range (e.g., 0x180000000-0x1ffffffff), triggering warnings like:
>>
>> [   14.984108] kmem dax4.0: mapping0: 0x180000000-0x1ffffffff could not reserve region
>> [   14.987204] kmem dax4.0: probe with driver kmem failed with error -16
>>
>> Because the HMAT initialization already registered these sub-ranges:
>>     180000000-1bfffffff
>>     1c0000000-1ffffffff
>>
>>
>> If I'm missing something, please correct me.
> 
> Yeah, this bug is due to a double invocation of hmem_register_device() once from cxl_softreserv_mem_register() and once from dax_hmem_platform_probe().
> 
> When CONFIG_CXL_ACPI=y, walk_iomem_res_desc() is skipped in hmem_init(),
> so I expected hmem_active to remain empty. However, I missed the detail that the ACPI HMAT parser (drivers/acpi/numa/hmat.c) calls hmem_register_resource(), which populates hmem_active via __hmem_register_resource().
> 
> Case 1 (No bug): If dax_hmem_platform_probe() runs when hmem_active is still empty.
> 
> walk_hmem_resources() walks nothing — it's effectively a no-op.
> 
> Later, cxl_softreserv_mem_register() is invoked to register leftover soft-reserved regions via hmem_register_device().
> 
> Only one registration occurs, no conflict.
> 
> Case 2: If dax_hmem_platform_probe() runs after hmem_active is populated by hmat_register_target_devices() (via hmem_register_resource()):
> 
> walk_hmem_resources() iterates those regions. It invokes hmem_register_device(). Later, cxl_region driver does the same again.
> 
> This results in duplicate instances for the same physical range and second call fails like below:
> 
> [   14.984108] kmem dax4.0: mapping0: 0x180000000-0x1ffffffff could not reserve region
> [   14.987204] kmem dax4.0: probe with driver kmem failed with error -16
> 
> Below, did the job to fix the above bug for me and I did incorporate this in v5.

Actually, I don't think it's a real problem; the current code can tolerate such duplicate registration, so I'm fine with just turning that dev_warn() into dev_dbg().
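
Assuming the message being downgraded is the "could not reserve region" 
one quoted above, the site is roughly this fragment of dev_dax_kmem_probe() 
in drivers/dax/kmem.c (paraphrased; details may differ by version):

	res = request_mem_region(range.start, range_len(&range),
				 data->res_name);
	if (!res) {
		/* the suggestion is to make this dev_dbg() instead */
		dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve region\n",
			 i, range.start, range.end);
		if (mapped)
			continue;
		rc = -EBUSY;
		goto err_request_mem;
	}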



> 
> static int dax_hmem_platform_probe(struct platform_device *pdev)
> {
> +    if (IS_ENABLED(CONFIG_CXL_ACPI))
> +        return 0;
> 
>      dax_hmem_pdev = pdev;
>      return walk_hmem_resources(hmem_register_device);
> }
> 
> Let me know if my thought process is right. I would appreciate any additional feedback or suggestions.
> 
> Meanwhile, I should also mention that my approach fails, if cxl_acpi finishes probing before dax_hmem is even loaded, it attempts to call into unresolved dax_hmem symbols, causing probe failures. Particularly when CXL_BUS=y and DEV_DAX_HMEM=m.
> 
> ld: vmlinux.o: in function `cxl_softreserv_mem_register':
> region.c:(.text+0xc15160): undefined reference to `hmem_register_device'
> make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
> 
> I spent some time exploring possible fixes for this symbol dependency issue, which delayed my v5 submission. I would welcome any ideas..
> 
> In the meantime, I noticed your new patchset that introduces a different approach for resolving resource conflicts between CMFW and Soft Reserved regions. I will take a closer look at that.


Thanks in advance.

(I am uncertain whether such an approach has ever been proposed previously.)




> 
> Thanks
> Smita
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation
  2025-06-03 22:19 ` [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation Smita Koralahalli
  2025-06-09 13:01   ` Jonathan Cameron
  2025-06-13  2:12   ` Zhijian Li (Fujitsu)
@ 2025-07-15 16:27   ` Alison Schofield
  2025-07-15 18:19     ` Koralahalli Channabasappa, Smita
  2 siblings, 1 reply; 40+ messages in thread
From: Alison Schofield @ 2025-07-15 16:27 UTC (permalink / raw)
  To: Smita Koralahalli
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On Tue, Jun 03, 2025 at 10:19:49PM +0000, Smita Koralahalli wrote:
> From: Nathan Fontenot <nathan.fontenot@amd.com>
> 
> The DAX HMEM driver currently consumes all SOFT RESERVED iomem resources
> during initialization. This interferes with the CXL driver’s ability to
> create regions and trim overlapping SOFT RESERVED ranges before DAX uses
> them.
> 
> To resolve this, defer the DAX driver's resource consumption if the
> cxl_acpi driver is enabled. The DAX HMEM initialization skips walking the
> iomem resource tree in this case. After CXL region creation completes,
> any remaining SOFT RESERVED resources are explicitly registered with the
> DAX driver by the CXL driver.
> 
> This sequencing ensures proper handling of overlaps and fixes hotplug
> failures.

Hi Smita,

About the issue I first mentioned here [1]: the HMEM driver is not
waiting for region probe to finish. By the time region probe attempts
to hand off the memory to DAX, the memory is already marked as System RAM.

See 'case CXL_PARTMODE_RAM:' in cxl_region_probe(). The is_system_ram()
test fails, so devm_cxl_add_dax_region() is not possible.
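
For reference, the path being described is roughly (paraphrased from
cxl_region_probe() in drivers/cxl/core/region.c; exact parameters may
differ by kernel version):

	case CXL_PARTMODE_RAM:
		/*
		 * The region cannot be managed by CXL if any portion of
		 * it is already online as 'System RAM'.
		 */
		if (walk_iomem_res_desc(IORES_DESC_NONE,
					IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
					p->res->start, p->res->end, cxlr,
					is_system_ram) > 0)
			return 0;
		return devm_cxl_add_dax_region(cxlr);

Once the HMEM/kmem path has already onlined the range as System RAM, that
walk returns non-zero and the region driver never registers the dax region
itself.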

This means that in appearance, just looking at /proc/iomem/, this
seems to have worked. There is no soft reserved and the dax and
kmem resources are child resources of the region resource. But they
were not set up by the region driver, hence no unregister callback
is triggered when the region is disabled.

It appears like this: 

c080000000-17dbfffffff : CXL Window 0
  c080000000-c47fffffff : region2
    c080000000-c47fffffff : dax0.0
      c080000000-c47fffffff : System RAM (kmem)

Now, to make the memory available for reuse, need to do:
# daxctl offline-memory dax0.0
# daxctl destroy-device --force dax0.0
# cxl disable-region 2
# cxl destroy-region 2

Whereas previously, did this:
# daxctl offline-memory dax0.0
# cxl disable-region 2
  After disabling region, dax device unregistered.
# cxl destroy-region 2

I do see that __cxl_region_softreserv_update() is not called until
after cxl_region_probe() completes, so that is waiting properly to
pick up the scraps. I'm actually not sure there would be any scraps
though, if the HMEM driver has already done its thing. In my case
the Soft Reserved size is the same as the region's, so I cannot tell what
would happen if that Soft Reserved had more capacity than the region.

If I do this: # CONFIG_DEV_DAX_HMEM is not set, it works the same as before,
which is as expected.

Let me know if I can try anything else out or collect more info.

--Alison


[1] https://lore.kernel.org/nvdimm/20250603221949.53272-1-Smita.KoralahalliChannabasappa@amd.com/T/#m10c0eb7b258af7cd0c84c7ee2c417c055724f921


> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  drivers/cxl/core/region.c | 10 +++++++++
>  drivers/dax/hmem/device.c | 43 ++++++++++++++++++++-------------------
>  drivers/dax/hmem/hmem.c   |  3 ++-
>  include/linux/dax.h       |  6 ++++++
>  4 files changed, 40 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 3a5ca44d65f3..c6c0c7ba3b20 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -10,6 +10,7 @@
>  #include <linux/sort.h>
>  #include <linux/idr.h>
>  #include <linux/memory-tiers.h>
> +#include <linux/dax.h>
>  #include <cxlmem.h>
>  #include <cxl.h>
>  #include "core.h"
> @@ -3553,6 +3554,11 @@ static struct resource *normalize_resource(struct resource *res)
>  	return NULL;
>  }
>  
> +static int cxl_softreserv_mem_register(struct resource *res, void *unused)
> +{
> +	return hmem_register_device(phys_to_target_node(res->start), res);
> +}
> +
>  static int __cxl_region_softreserv_update(struct resource *soft,
>  					  void *_cxlr)
>  {
> @@ -3590,6 +3596,10 @@ int cxl_region_softreserv_update(void)
>  				    __cxl_region_softreserv_update);
>  	}
>  
> +	/* Now register any remaining SOFT RESERVES with DAX */
> +	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
> +			    0, -1, NULL, cxl_softreserv_mem_register);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
> index 59ad44761191..cc1ed7bbdb1a 100644
> --- a/drivers/dax/hmem/device.c
> +++ b/drivers/dax/hmem/device.c
> @@ -8,7 +8,6 @@
>  static bool nohmem;
>  module_param_named(disable, nohmem, bool, 0444);
>  
> -static bool platform_initialized;
>  static DEFINE_MUTEX(hmem_resource_lock);
>  static struct resource hmem_active = {
>  	.name = "HMEM devices",
> @@ -35,9 +34,7 @@ EXPORT_SYMBOL_GPL(walk_hmem_resources);
>  
>  static void __hmem_register_resource(int target_nid, struct resource *res)
>  {
> -	struct platform_device *pdev;
>  	struct resource *new;
> -	int rc;
>  
>  	new = __request_region(&hmem_active, res->start, resource_size(res), "",
>  			       0);
> @@ -47,21 +44,6 @@ static void __hmem_register_resource(int target_nid, struct resource *res)
>  	}
>  
>  	new->desc = target_nid;
> -
> -	if (platform_initialized)
> -		return;
> -
> -	pdev = platform_device_alloc("hmem_platform", 0);
> -	if (!pdev) {
> -		pr_err_once("failed to register device-dax hmem_platform device\n");
> -		return;
> -	}
> -
> -	rc = platform_device_add(pdev);
> -	if (rc)
> -		platform_device_put(pdev);
> -	else
> -		platform_initialized = true;
>  }
>  
>  void hmem_register_resource(int target_nid, struct resource *res)
> @@ -83,9 +65,28 @@ static __init int hmem_register_one(struct resource *res, void *data)
>  
>  static __init int hmem_init(void)
>  {
> -	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> -			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
> -	return 0;
> +	struct platform_device *pdev;
> +	int rc;
> +
> +	if (!IS_ENABLED(CONFIG_CXL_ACPI)) {
> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> +				    IORESOURCE_MEM, 0, -1, NULL,
> +				    hmem_register_one);
> +	}
> +
> +	pdev = platform_device_alloc("hmem_platform", 0);
> +	if (!pdev) {
> +		pr_err("failed to register device-dax hmem_platform device\n");
> +		return -1;
> +	}
> +
> +	rc = platform_device_add(pdev);
> +	if (rc) {
> +		pr_err("failed to add device-dax hmem_platform device\n");
> +		platform_device_put(pdev);
> +	}
> +
> +	return rc;
>  }
>  
>  /*
> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
> index 3aedef5f1be1..a206b9b383e4 100644
> --- a/drivers/dax/hmem/hmem.c
> +++ b/drivers/dax/hmem/hmem.c
> @@ -61,7 +61,7 @@ static void release_hmem(void *pdev)
>  	platform_device_unregister(pdev);
>  }
>  
> -static int hmem_register_device(int target_nid, const struct resource *res)
> +int hmem_register_device(int target_nid, const struct resource *res)
>  {
>  	struct device *host = &dax_hmem_pdev->dev;
>  	struct platform_device *pdev;
> @@ -124,6 +124,7 @@ static int hmem_register_device(int target_nid, const struct resource *res)
>  	platform_device_put(pdev);
>  	return rc;
>  }
> +EXPORT_SYMBOL_GPL(hmem_register_device);
>  
>  static int dax_hmem_platform_probe(struct platform_device *pdev)
>  {
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index a4ad3708ea35..5052dca8b3bc 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -299,10 +299,16 @@ static inline int dax_mem2blk_err(int err)
>  
>  #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
>  void hmem_register_resource(int target_nid, struct resource *r);
> +int hmem_register_device(int target_nid, const struct resource *res);
>  #else
>  static inline void hmem_register_resource(int target_nid, struct resource *r)
>  {
>  }
> +
> +static inline int hmem_register_device(int target_nid, const struct resource *res)
> +{
> +	return 0;
> +}
>  #endif
>  
>  typedef int (*walk_hmem_fn)(int target_nid, const struct resource *res);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation
  2025-07-15 16:27   ` Alison Schofield
@ 2025-07-15 18:19     ` Koralahalli Channabasappa, Smita
  2025-07-15 18:45       ` Alison Schofield
  0 siblings, 1 reply; 40+ messages in thread
From: Koralahalli Channabasappa, Smita @ 2025-07-15 18:19 UTC (permalink / raw)
  To: Alison Schofield
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

Hi Alison,

Sorry I missed this email before sending out v5. Comments inline.

On 7/15/2025 9:27 AM, Alison Schofield wrote:
> On Tue, Jun 03, 2025 at 10:19:49PM +0000, Smita Koralahalli wrote:
>> From: Nathan Fontenot <nathan.fontenot@amd.com>
>>
>> The DAX HMEM driver currently consumes all SOFT RESERVED iomem resources
>> during initialization. This interferes with the CXL driver’s ability to
>> create regions and trim overlapping SOFT RESERVED ranges before DAX uses
>> them.
>>
>> To resolve this, defer the DAX driver's resource consumption if the
>> cxl_acpi driver is enabled. The DAX HMEM initialization skips walking the
>> iomem resource tree in this case. After CXL region creation completes,
>> any remaining SOFT RESERVED resources are explicitly registered with the
>> DAX driver by the CXL driver.
>>
>> This sequencing ensures proper handling of overlaps and fixes hotplug
>> failures.
> 
> Hi Smita,
> 
> About the issue I first mentioned here [1]. The HMEM driver is not
> waiting for region probe to finish. By the time region probe attempts
> to hand off the memory to DAX, the memory is already marked as System RAM.
> 
> See 'case CXL_PARTMODE_RAM:' in cxl_region_probe(). The is_system_ram()
> test fails so devm_cxl_add_dax_region() not possible.
> 
> This means that in appearance, just looking at /proc/iomem/, this
> seems to have worked. There is no soft reserved and the dax and
> kmem resources are child resources of the region resource. But they
> were not set up by the region driver, hence no unregister callback
> is triggered when the region is disabled.

I believe this should be resolved in v5. I see the following dmesg 
entries indicating that devm_cxl_add_dax_region() is being called 
successfully for all regions:

# dmesg | grep devm_cxl_add_dax_region
[   40.730864] devm_cxl_add_dax_region: cxl_region region0: region0: 
register dax_region0
[   40.756307] devm_cxl_add_dax_region: cxl_region region1: region1: 
register dax_region1
[   43.689882] devm_cxl_add_dax_region: cxl_region region2: region2: 
register dax_region2

cat /proc/iomem

850000000-284fffffff : CXL Window 0
   850000000-284fffffff : region0
     850000000-284fffffff : dax0.0
       850000000-284fffffff : System RAM (kmem)
2850000000-484fffffff : CXL Window 1
   2850000000-484fffffff : region1
     2850000000-484fffffff : dax1.0
       2850000000-484fffffff : System RAM (kmem)
4850000000-684fffffff : CXL Window 2
   4850000000-684fffffff : region2
     4850000000-684fffffff : dax2.0
       4850000000-684fffffff : System RAM (kmem)

I suspect devm_cxl_add_dax_region() didn't execute in v4 because 
hmem_register_resource() (called from hmat.c) preemptively created 
hmem_active entries. These were consumed during walk_hmem_resources() 
and registered by hmem_register_device() before the CXL region probe 
could complete.

In v5, if CONFIG_CXL_ACPI is enabled, soft reserved resources are stored 
in hmem_deferred_active instead, and the hmem_register_resource() calls 
from hmat.c are disabled. This ensures DAX registration happens only 
through the CXL drivers' probe path.

> 
> It appears like this:
> 
> c080000000-17dbfffffff : CXL Window 0
>    c080000000-c47fffffff : region2
>      c080000000-c47fffffff : dax0.0
>        c080000000-c47fffffff : System RAM (kmem)
> 
> Now, to make the memory available for reuse, need to do:
> # daxctl offline-memory dax0.0
> # daxctl destroy-device --force dax0.0
> # cxl disable-region 2
> # cxl destroy-region 2
> 
> Whereas previously, did this:
> # daxctl offline-memory dax0.0
> # cxl disable-region 2
>    After disabling region, dax device unregistered.
> # cxl destroy-region 2

I haven’t yet tested this specific unregister flow with v5. I’ll verify 
and follow up if I see any issues. Meanwhile, please let me know if you 
still observe the same behavior or need additional debug info from my side.

Thanks
Smita
> 
> I do see that __cxl_region_softreserv_update() is not called until
> after cxl_region_probe() completes, so that is waiting properly to
> pick up the scraps. I'm actually not sure there would be any scraps
> though, if the HMEM driver has already done it's thing. In my case
> the Soft Reserved size is same as region, so I cannot tell what
> would happen if that Soft Reserved had more capacity than the region.
> 
> If I do this: # CONFIG_DEV_DAX_HMEM is not set, works same as before,
> which is as expected.
> 
> Let me know if I can try anything else out or collect more info.
> 
> --Alison
> 
> 
> [1] https://lore.kernel.org/nvdimm/20250603221949.53272-1-Smita.KoralahalliChannabasappa@amd.com/T/#m10c0eb7b258af7cd0c84c7ee2c417c055724f921
> 
> 
>>
>> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>   drivers/cxl/core/region.c | 10 +++++++++
>>   drivers/dax/hmem/device.c | 43 ++++++++++++++++++++-------------------
>>   drivers/dax/hmem/hmem.c   |  3 ++-
>>   include/linux/dax.h       |  6 ++++++
>>   4 files changed, 40 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 3a5ca44d65f3..c6c0c7ba3b20 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -10,6 +10,7 @@
>>   #include <linux/sort.h>
>>   #include <linux/idr.h>
>>   #include <linux/memory-tiers.h>
>> +#include <linux/dax.h>
>>   #include <cxlmem.h>
>>   #include <cxl.h>
>>   #include "core.h"
>> @@ -3553,6 +3554,11 @@ static struct resource *normalize_resource(struct resource *res)
>>   	return NULL;
>>   }
>>   
>> +static int cxl_softreserv_mem_register(struct resource *res, void *unused)
>> +{
>> +	return hmem_register_device(phys_to_target_node(res->start), res);
>> +}
>> +
>>   static int __cxl_region_softreserv_update(struct resource *soft,
>>   					  void *_cxlr)
>>   {
>> @@ -3590,6 +3596,10 @@ int cxl_region_softreserv_update(void)
>>   				    __cxl_region_softreserv_update);
>>   	}
>>   
>> +	/* Now register any remaining SOFT RESERVES with DAX */
>> +	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
>> +			    0, -1, NULL, cxl_softreserv_mem_register);
>> +
>>   	return 0;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
>> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
>> index 59ad44761191..cc1ed7bbdb1a 100644
>> --- a/drivers/dax/hmem/device.c
>> +++ b/drivers/dax/hmem/device.c
>> @@ -8,7 +8,6 @@
>>   static bool nohmem;
>>   module_param_named(disable, nohmem, bool, 0444);
>>   
>> -static bool platform_initialized;
>>   static DEFINE_MUTEX(hmem_resource_lock);
>>   static struct resource hmem_active = {
>>   	.name = "HMEM devices",
>> @@ -35,9 +34,7 @@ EXPORT_SYMBOL_GPL(walk_hmem_resources);
>>   
>>   static void __hmem_register_resource(int target_nid, struct resource *res)
>>   {
>> -	struct platform_device *pdev;
>>   	struct resource *new;
>> -	int rc;
>>   
>>   	new = __request_region(&hmem_active, res->start, resource_size(res), "",
>>   			       0);
>> @@ -47,21 +44,6 @@ static void __hmem_register_resource(int target_nid, struct resource *res)
>>   	}
>>   
>>   	new->desc = target_nid;
>> -
>> -	if (platform_initialized)
>> -		return;
>> -
>> -	pdev = platform_device_alloc("hmem_platform", 0);
>> -	if (!pdev) {
>> -		pr_err_once("failed to register device-dax hmem_platform device\n");
>> -		return;
>> -	}
>> -
>> -	rc = platform_device_add(pdev);
>> -	if (rc)
>> -		platform_device_put(pdev);
>> -	else
>> -		platform_initialized = true;
>>   }
>>   
>>   void hmem_register_resource(int target_nid, struct resource *res)
>> @@ -83,9 +65,28 @@ static __init int hmem_register_one(struct resource *res, void *data)
>>   
>>   static __init int hmem_init(void)
>>   {
>> -	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
>> -			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
>> -	return 0;
>> +	struct platform_device *pdev;
>> +	int rc;
>> +
>> +	if (!IS_ENABLED(CONFIG_CXL_ACPI)) {
>> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
>> +				    IORESOURCE_MEM, 0, -1, NULL,
>> +				    hmem_register_one);
>> +	}
>> +
>> +	pdev = platform_device_alloc("hmem_platform", 0);
>> +	if (!pdev) {
>> +		pr_err("failed to register device-dax hmem_platform device\n");
>> +		return -1;
>> +	}
>> +
>> +	rc = platform_device_add(pdev);
>> +	if (rc) {
>> +		pr_err("failed to add device-dax hmem_platform device\n");
>> +		platform_device_put(pdev);
>> +	}
>> +
>> +	return rc;
>>   }
>>   
>>   /*
>> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
>> index 3aedef5f1be1..a206b9b383e4 100644
>> --- a/drivers/dax/hmem/hmem.c
>> +++ b/drivers/dax/hmem/hmem.c
>> @@ -61,7 +61,7 @@ static void release_hmem(void *pdev)
>>   	platform_device_unregister(pdev);
>>   }
>>   
>> -static int hmem_register_device(int target_nid, const struct resource *res)
>> +int hmem_register_device(int target_nid, const struct resource *res)
>>   {
>>   	struct device *host = &dax_hmem_pdev->dev;
>>   	struct platform_device *pdev;
>> @@ -124,6 +124,7 @@ static int hmem_register_device(int target_nid, const struct resource *res)
>>   	platform_device_put(pdev);
>>   	return rc;
>>   }
>> +EXPORT_SYMBOL_GPL(hmem_register_device);
>>   
>>   static int dax_hmem_platform_probe(struct platform_device *pdev)
>>   {
>> diff --git a/include/linux/dax.h b/include/linux/dax.h
>> index a4ad3708ea35..5052dca8b3bc 100644
>> --- a/include/linux/dax.h
>> +++ b/include/linux/dax.h
>> @@ -299,10 +299,16 @@ static inline int dax_mem2blk_err(int err)
>>   
>>   #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
>>   void hmem_register_resource(int target_nid, struct resource *r);
>> +int hmem_register_device(int target_nid, const struct resource *res);
>>   #else
>>   static inline void hmem_register_resource(int target_nid, struct resource *r)
>>   {
>>   }
>> +
>> +static inline int hmem_register_device(int target_nid, const struct resource *res)
>> +{
>> +	return 0;
>> +}
>>   #endif
>>   
>>   typedef int (*walk_hmem_fn)(int target_nid, const struct resource *res);
>> -- 
>> 2.17.1
>>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation
  2025-07-15 18:19     ` Koralahalli Channabasappa, Smita
@ 2025-07-15 18:45       ` Alison Schofield
  0 siblings, 0 replies; 40+ messages in thread
From: Alison Schofield @ 2025-07-15 18:45 UTC (permalink / raw)
  To: Koralahalli Channabasappa, Smita
  Cc: linux-cxl, linux-kernel, nvdimm, linux-fsdevel, linux-pm,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Matthew Wilcox, Jan Kara,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Li Ming,
	Jeff Johnson, Ying Huang, Yao Xingtao, Peter Zijlstra, Greg KH,
	Nathan Fontenot, Terry Bowman, Robert Richter, Benjamin Cheatham,
	PradeepVineshReddy Kodamati, Zhijian Li

On Tue, Jul 15, 2025 at 11:19:05AM -0700, Koralahalli Channabasappa, Smita wrote:
> Hi Alison,
> 
> Sorry I missed this email before sending out v5. Comments inline.

Ah, I see v5 now and will try it out. Thanks!


^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2025-07-15 18:46 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-06-03 22:19 [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Smita Koralahalli
2025-06-03 22:19 ` [PATCH v4 1/7] cxl/region: Avoid null pointer dereference in is_cxl_region() Smita Koralahalli
2025-06-03 23:49   ` Dave Jiang
2025-06-04 19:56     ` Nathan Fontenot
2025-06-25 15:23       ` Robert Richter
2025-06-03 22:19 ` [PATCH v4 2/7] cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o Smita Koralahalli
2025-06-09 11:02   ` Jonathan Cameron
2025-06-09 23:25     ` Koralahalli Channabasappa, Smita
2025-06-10  9:38       ` Jonathan Cameron
2025-06-03 22:19 ` [PATCH v4 3/7] cxl/pci: Add pci_loaded tracking to mark PCI driver readiness Smita Koralahalli
2025-06-04  9:29   ` Zhijian Li (Fujitsu)
2025-06-03 22:19 ` [PATCH v4 4/7] cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe Smita Koralahalli
2025-06-03 23:45   ` Dave Jiang
2025-06-04 18:31     ` Koralahalli Channabasappa, Smita
2025-06-04  9:40   ` Zhijian Li (Fujitsu)
2025-06-04 14:35     ` Dave Jiang
2025-06-05  2:18       ` Zhijian Li (Fujitsu)
2025-06-03 22:19 ` [PATCH v4 5/7] cxl/region: Introduce SOFT RESERVED resource removal on region teardown Smita Koralahalli
2025-06-06  4:16   ` Zhijian Li (Fujitsu)
2025-06-06 17:53     ` Koralahalli Channabasappa, Smita
2025-06-09 12:54   ` Jonathan Cameron
2025-06-10  1:25     ` Koralahalli Channabasappa, Smita
2025-06-10  9:37       ` Jonathan Cameron
2025-06-25 15:20       ` Robert Richter
2025-07-09  1:14   ` Alison Schofield
2025-06-03 22:19 ` [PATCH v4 6/7] dax/hmem: Save the DAX HMEM platform device pointer Smita Koralahalli
2025-06-05 16:54   ` Dave Jiang
2025-06-06  8:11     ` Zhijian Li (Fujitsu)
2025-06-06 20:09       ` Nathan Fontenot
2025-06-03 22:19 ` [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after CXL region creation Smita Koralahalli
2025-06-09 13:01   ` Jonathan Cameron
2025-06-13  2:12   ` Zhijian Li (Fujitsu)
2025-07-10  4:22     ` Koralahalli Channabasappa, Smita
2025-07-10  8:18       ` Zhijian Li (Fujitsu)
2025-07-15 16:27   ` Alison Schofield
2025-07-15 18:19     ` Koralahalli Channabasappa, Smita
2025-07-15 18:45       ` Alison Schofield
2025-06-04  8:43 ` [PATCH v4 0/7] Add managed SOFT RESERVE resource handling Zhijian Li (Fujitsu)
2025-06-04 18:59   ` Koralahalli Channabasappa, Smita
2025-06-16  1:00     ` Zhijian Li (Fujitsu)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).