From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Cc: <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<nvdimm@lists.linux.dev>, <linux-fsdevel@vger.kernel.org>,
<linux-pm@vger.kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
"Alison Schofield" <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Yazen Ghannam <yazen.ghannam@amd.com>,
"Dave Jiang" <dave.jiang@intel.com>,
Davidlohr Bueso <dave@stgolabs.net>,
"Matthew Wilcox" <willy@infradead.org>, Jan Kara <jack@suse.cz>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Len Brown <len.brown@intel.com>, Pavel Machek <pavel@kernel.org>,
Li Ming <ming.li@zohomail.com>,
Jeff Johnson <jeff.johnson@oss.qualcomm.com>,
Ying Huang <huang.ying.caritas@gmail.com>,
Yao Xingtao <yaoxt.fnst@fujitsu.com>,
"Peter Zijlstra" <peterz@infradead.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Nathan Fontenot <nathan.fontenot@amd.com>,
Terry Bowman <terry.bowman@amd.com>,
Robert Richter <rrichter@amd.com>,
Benjamin Cheatham <benjamin.cheatham@amd.com>,
Zhijian Li <lizhijian@fujitsu.com>,
Borislav Petkov <bp@alien8.de>,
Tomasz Wolski <tomasz.wolski@fujitsu.com>
Subject: Re: [PATCH v8 8/9] dax/hmem, cxl: Defer and resolve Soft Reserved ownership
Date: Mon, 23 Mar 2026 18:13:31 +0000
Message-ID: <20260323181331.000018f2@huawei.com>
In-Reply-To: <20260322195343.206900-9-Smita.KoralahalliChannabasappa@amd.com>
On Sun, 22 Mar 2026 19:53:41 +0000
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:
> The current probe-time ownership check for Soft Reserved memory, based
> solely on CXL window intersection, is insufficient. dax_hmem probing is
> not guaranteed to run after CXL enumeration and region assembly, which
> can lead to incorrect ownership decisions before the CXL stack has
> finished publishing windows and assembling committed regions.
>
> Introduce deferred ownership handling for Soft Reserved ranges that
> intersect CXL windows. When such a range is encountered during the
> initial dax_hmem probe, schedule deferred work to wait for the CXL stack
> to complete enumeration and region assembly before deciding ownership.
>
> Once the deferred work runs, evaluate each Soft Reserved range
> individually: if a CXL region fully contains the range, skip it and let
> dax_cxl bind. Otherwise, register it with dax_hmem. This per-range
> ownership model avoids the need for CXL region teardown, and
> alloc_dax_region() resource exclusion prevents double claiming.
>
> Introduce a boolean flag, dax_hmem_initial_probe, that lives in device.c
> so it survives module reload. Ensure dax_cxl defers driver registration
> until dax_hmem has completed ownership resolution. dax_cxl calls
> dax_hmem_flush_work() before cxl_driver_register(), which both waits for
> the deferred work to complete and creates a module symbol dependency that
> forces dax_hmem.ko to load before dax_cxl.
>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
https://sashiko.dev/#/patchset/20260322195343.206900-1-Smita.KoralahalliChannabasappa%40amd.com
Might be worth a look. I think the last comment is potentially correct,
though it's unlikely that platform_driver_register() actually fails.
I've not looked too closely at the others. Given this patch was doing
something unusual, I thought I'd see what it found. Looks like it raises
some interesting questions, if nothing else.
> ---
> drivers/dax/bus.h | 7 ++++
> drivers/dax/cxl.c | 1 +
> drivers/dax/hmem/device.c | 3 ++
> drivers/dax/hmem/hmem.c | 74 +++++++++++++++++++++++++++++++++++++++
> 4 files changed, 85 insertions(+)
>
> diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
> index cbbf64443098..ebbfe2d6da14 100644
> --- a/drivers/dax/bus.h
> +++ b/drivers/dax/bus.h
> @@ -49,6 +49,13 @@ void dax_driver_unregister(struct dax_device_driver *dax_drv);
> void kill_dev_dax(struct dev_dax *dev_dax);
> bool static_dev_dax(struct dev_dax *dev_dax);
>
> +#if IS_ENABLED(CONFIG_DEV_DAX_HMEM)
> +extern bool dax_hmem_initial_probe;
> +void dax_hmem_flush_work(void);
> +#else
> +static inline void dax_hmem_flush_work(void) { }
> +#endif
> +
> #define MODULE_ALIAS_DAX_DEVICE(type) \
> MODULE_ALIAS("dax:t" __stringify(type) "*")
> #define DAX_DEVICE_MODALIAS_FMT "dax:t%d"
> diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
> index a2136adfa186..3ab39b77843d 100644
> --- a/drivers/dax/cxl.c
> +++ b/drivers/dax/cxl.c
> @@ -44,6 +44,7 @@ static struct cxl_driver cxl_dax_region_driver = {
>
> static void cxl_dax_region_driver_register(struct work_struct *work)
> {
> + dax_hmem_flush_work();
> cxl_driver_register(&cxl_dax_region_driver);
> }
>
> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
> index 56e3cbd181b5..991a4bf7d969 100644
> --- a/drivers/dax/hmem/device.c
> +++ b/drivers/dax/hmem/device.c
> @@ -8,6 +8,9 @@
> static bool nohmem;
> module_param_named(disable, nohmem, bool, 0444);
>
> +bool dax_hmem_initial_probe;
> +EXPORT_SYMBOL_GPL(dax_hmem_initial_probe);
> +
> static bool platform_initialized;
> static DEFINE_MUTEX(hmem_resource_lock);
> static struct resource hmem_active = {
> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
> index ca752db03201..9ceda6b5cadf 100644
> --- a/drivers/dax/hmem/hmem.c
> +++ b/drivers/dax/hmem/hmem.c
> @@ -3,6 +3,7 @@
> #include <linux/memregion.h>
> #include <linux/module.h>
> #include <linux/dax.h>
> +#include <cxl/cxl.h>
> #include "../bus.h"
>
> static bool region_idle;
> @@ -58,6 +59,23 @@ static void release_hmem(void *pdev)
> platform_device_unregister(pdev);
> }
>
> +struct dax_defer_work {
> + struct platform_device *pdev;
> + struct work_struct work;
> +};
> +
> +static void process_defer_work(struct work_struct *w);
> +
> +static struct dax_defer_work dax_hmem_work = {
> + .work = __WORK_INITIALIZER(dax_hmem_work.work, process_defer_work),
> +};
> +
> +void dax_hmem_flush_work(void)
> +{
> + flush_work(&dax_hmem_work.work);
> +}
> +EXPORT_SYMBOL_GPL(dax_hmem_flush_work);
> +
> static int __hmem_register_device(struct device *host, int target_nid,
> const struct resource *res)
> {
> @@ -122,6 +140,11 @@ static int hmem_register_device(struct device *host, int target_nid,
> if (IS_ENABLED(CONFIG_DEV_DAX_CXL) &&
> region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
> IORES_DESC_CXL) != REGION_DISJOINT) {
> + if (!dax_hmem_initial_probe) {
> + dev_dbg(host, "await CXL initial probe: %pr\n", res);
> + queue_work(system_long_wq, &dax_hmem_work.work);
> + return 0;
> + }
> dev_dbg(host, "deferring range to CXL: %pr\n", res);
> return 0;
> }
> @@ -129,8 +152,54 @@ static int hmem_register_device(struct device *host, int target_nid,
> return __hmem_register_device(host, target_nid, res);
> }
>
> +static int hmem_register_cxl_device(struct device *host, int target_nid,
> + const struct resource *res)
> +{
> + if (region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
> + IORES_DESC_CXL) == REGION_DISJOINT)
> + return 0;
> +
> + if (cxl_region_contains_resource((struct resource *)res)) {
> + dev_dbg(host, "CXL claims resource, dropping: %pr\n", res);
> + return 0;
> + }
> +
> + dev_dbg(host, "CXL did not claim resource, registering: %pr\n", res);
> + return __hmem_register_device(host, target_nid, res);
> +}
> +
> +static void process_defer_work(struct work_struct *w)
> +{
> + struct dax_defer_work *work = container_of(w, typeof(*work), work);
> + struct platform_device *pdev;
> +
> + if (!work->pdev)
> + return;
> +
> + pdev = work->pdev;
> +
> + /* Relies on cxl_acpi and cxl_pci having had a chance to load */
> + wait_for_device_probe();
> +
> + guard(device)(&pdev->dev);
> + if (!pdev->dev.driver)
> + return;
> +
> + if (!dax_hmem_initial_probe) {
> + dax_hmem_initial_probe = true;
> + walk_hmem_resources(&pdev->dev, hmem_register_cxl_device);
> + }
> +}
> +
> static int dax_hmem_platform_probe(struct platform_device *pdev)
> {
> + if (work_pending(&dax_hmem_work.work))
> + return -EBUSY;
> +
> + if (!dax_hmem_work.pdev)
> + dax_hmem_work.pdev =
> + to_platform_device(get_device(&pdev->dev));
> +
> return walk_hmem_resources(&pdev->dev, hmem_register_device);
> }
>
> @@ -168,6 +237,11 @@ static __init int dax_hmem_init(void)
>
> static __exit void dax_hmem_exit(void)
> {
> + if (dax_hmem_work.pdev) {
> + flush_work(&dax_hmem_work.work);
> + put_device(&dax_hmem_work.pdev->dev);
> + }
> +
> platform_driver_unregister(&dax_hmem_driver);
> platform_driver_unregister(&dax_hmem_platform_driver);
> }