From: "Koralahalli Channabasappa, Smita" <skoralah@amd.com>
To: Gregory Price <gourry@gourry.net>,
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] dax/hmem, cxl: Defer and resolve ownership of Soft Reserved memory ranges
Date: Tue, 17 Feb 2026 14:21:55 -0800 [thread overview]
Message-ID: <e5deb402-b07a-4fd5-9de0-570bfe9ad715@amd.com> (raw)
In-Reply-To: <20260213144704.2780265-1-gourry@gourry.net>
On 2/13/2026 6:47 AM, Gregory Price wrote:
> On Tue, Feb 10, 2026 at 06:45:00AM +0000, Smita Koralahalli wrote:
>
> This is a review generated by kreview-0811365ff2
> Reference: https://github.com/masoncl/review-prompts/
>
> This is not an automated email, I thought this looked valid, the rest of
> the text here is auto-generated.
>
> ~Gregory
>
> ---
>
> [...]
>
>> +static void process_defer_work(void *data)
>> +{
>> + struct platform_device *pdev = data;
>> + int rc;
>> +
>> + /* relies on cxl_acpi and cxl_pci having had a chance to load */
>> + wait_for_device_probe();
>
> [...]
>
>> +static void kill_defer_work(void *data)
>> +{
>> + struct platform_device *pdev = data;
>> +
>> + dax_hmem_flush_work();
>> + dax_hmem_unregister_work(process_defer_work, pdev);
>> +}
>> +
>> static int dax_hmem_platform_probe(struct platform_device *pdev)
>> {
>> + int rc;
>> +
>> + rc = dax_hmem_register_work(process_defer_work, pdev);
>> + if (rc)
>> + return rc;
>> +
>> + rc = devm_add_action_or_reset(&pdev->dev, kill_defer_work, pdev);
>> + if (rc)
>> + return rc;
>> +
>> return walk_hmem_resources(&pdev->dev, hmem_register_device);
>> }
>
> Is there a potential deadlock in the error path here? If
> walk_hmem_resources() encounters a CXL-intersecting range first
> (calling dax_hmem_queue_work(), which schedules process_defer_work
> on system_long_wq), and then a subsequent non-CXL Soft Reserved
> range fails to register, the probe returns an error. The devres
> cleanup then calls kill_defer_work -> dax_hmem_flush_work() ->
> flush_work(&dax_hmem_work).
>
> Meanwhile, process_defer_work calls wait_for_device_probe(), which
> waits for probe_count to reach zero. Since devres_release_all runs
> inside really_probe before driver_probe_device decrements
> probe_count, this looks like a circular wait:
>
> driver_probe_device
> atomic_inc(&probe_count)
> __driver_probe_device
> really_probe
> dax_hmem_platform_probe
> walk_hmem_resources -> hmem_register_device
> CXL range: dax_hmem_queue_work()
> non-CXL range: fails
> returns error
> devres_release_all
> kill_defer_work
> dax_hmem_flush_work
> flush_work(&dax_hmem_work) <-- waits for process_defer_work
> process_defer_work
> wait_for_device_probe() <-- waits for probe_count == 0
> atomic_dec(&probe_count) <-- never reached
>
> The trigger requires both CXL-intersecting and non-CXL Soft Reserved
> ranges with the non-CXL registration failing, so the window is narrow,
> but the deadlock would be permanent if hit.
>
> Would it be safer to cancel the work instead of flushing it in the
> error path, or to avoid queuing deferred work during the initial
> walk?
Yes, you're right, that's a real deadlock. Thanks for pointing this out.
This might affect any wait on the work — flush_work, cancel_work_sync,
all the same problem.
Just using dax_hmem_unregister_work (which only grabs the mutex and
NULLs pointers) avoids the deadlock but opens a use-after-free: the
worker may have already copied fn/data and be mid-execution of
process_defer_work(pdev) while devres tears down the pdev underneath it.
Your suggestion of avoiding queuing during the initial walk is the right
approach I think. Set a flag when hmem_register_device defers a
CXL-intersecting range, queue the work after walk_hmem_resources returns
successfully. Something like:
static int dax_hmem_platform_probe(struct platform_device *pdev)
{
..
..
- return walk_hmem_resources(&pdev->dev, hmem_register_device);
+ rc = walk_hmem_resources(&pdev->dev, hmem_register_device);
+ if (rc)
+ return rc;
+ if (defer) //defer is set under DAX_CXL_MODE_DEFER
+ dax_hmem_queue_work();
}
I think this works but there might be subtleties I'm missing. What do
you think?
Thanks
Smita
next prev parent reply other threads:[~2026-02-17 22:22 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-10 6:44 [PATCH v6 0/9] dax/hmem, cxl: Coordinate Soft Reserved handling with CXL and HMEM Smita Koralahalli
2026-02-10 6:44 ` [PATCH v6 1/9] dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges Smita Koralahalli
2026-02-19 3:22 ` Alison Schofield
2026-02-10 6:44 ` [PATCH v6 2/9] dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL Smita Koralahalli
2026-02-19 3:23 ` Alison Schofield
2026-02-10 6:44 ` [PATCH v6 3/9] cxl/region: Skip decoder reset on detach for autodiscovered regions Smita Koralahalli
2026-02-19 3:44 ` Alison Schofield
2026-02-20 20:35 ` Koralahalli Channabasappa, Smita
2026-03-11 21:37 ` Dan Williams
2026-03-12 19:53 ` Dan Williams
2026-03-12 21:28 ` Koralahalli Channabasappa, Smita
2026-03-13 12:54 ` Alejandro Lucero Palau
2026-03-17 2:14 ` Dan Williams
2026-03-18 7:33 ` Alejandro Lucero Palau
2026-03-18 21:49 ` Dave Jiang
2026-03-18 21:27 ` Alison Schofield
2026-03-24 14:06 ` Alejandro Lucero Palau
2026-03-24 19:46 ` Dan Williams
2026-03-24 22:23 ` Alejandro Lucero Palau
2026-03-25 1:51 ` Alison Schofield
2026-02-10 6:44 ` [PATCH v6 4/9] dax/cxl, hmem: Initialize hmem early and defer dax_cxl binding Smita Koralahalli
2026-02-18 15:54 ` Dave Jiang
2026-03-09 14:31 ` Jonathan Cameron
2026-02-10 6:44 ` [PATCH v6 5/9] dax: Track all dax_region allocations under a global resource tree Smita Koralahalli
2026-02-18 16:04 ` Dave Jiang
2026-03-09 14:37 ` Jonathan Cameron
2026-03-12 21:30 ` Koralahalli Channabasappa, Smita
2026-03-12 0:27 ` Dan Williams
2026-03-12 21:31 ` Koralahalli Channabasappa, Smita
2026-02-10 6:44 ` [PATCH v6 6/9] cxl/region: Add helper to check Soft Reserved containment by CXL regions Smita Koralahalli
2026-03-12 0:29 ` Dan Williams
2026-02-10 6:44 ` [PATCH v6 7/9] dax: Add deferred-work helpers for dax_hmem and dax_cxl coordination Smita Koralahalli
2026-02-18 17:52 ` Dave Jiang
2026-02-20 0:02 ` Koralahalli Channabasappa, Smita
2026-02-20 15:55 ` Dave Jiang
2026-03-09 14:49 ` Jonathan Cameron
2026-02-10 6:45 ` [PATCH v6 8/9] dax/hmem, cxl: Defer and resolve ownership of Soft Reserved memory ranges Smita Koralahalli
2026-02-13 14:47 ` [PATCH] " Gregory Price
2026-02-17 22:21 ` Koralahalli Channabasappa, Smita [this message]
2026-02-18 18:05 ` [PATCH v6 8/9] " Dave Jiang
2026-02-20 19:54 ` Koralahalli Channabasappa, Smita
2026-02-20 10:14 ` Alejandro Lucero Palau
2026-03-12 2:28 ` Dan Williams
2026-03-13 18:41 ` Koralahalli Channabasappa, Smita
2026-03-17 2:36 ` Dan Williams
2026-03-16 22:26 ` Koralahalli Channabasappa, Smita
2026-03-17 2:42 ` Dan Williams
2026-02-10 6:45 ` [PATCH v6 9/9] dax/hmem: Reintroduce Soft Reserved ranges back into the iomem tree Smita Koralahalli
2026-02-10 19:16 ` [PATCH v6 0/9] dax/hmem, cxl: Coordinate Soft Reserved handling with CXL and HMEM Alison Schofield
2026-02-10 19:49 ` Koralahalli Channabasappa, Smita
2026-02-12 6:38 ` Alison Schofield
2026-02-20 21:00 ` Koralahalli Channabasappa, Smita
2026-02-12 14:44 ` Tomasz Wolski
2026-02-12 21:18 ` Alison Schofield
2026-02-13 7:47 ` Yasunori Goto (Fujitsu)
2026-02-13 17:31 ` Alison Schofield
2026-02-16 5:15 ` Yasunori Goto (Fujitsu)
2026-02-12 20:02 ` [sos-linux-dev] " Koralahalli Channabasappa, Smita
2026-02-13 14:04 ` Gregory Price
2026-02-20 20:47 ` Koralahalli Channabasappa, Smita
2026-02-20 9:45 ` Tomasz Wolski
2026-02-20 21:19 ` Koralahalli Channabasappa, Smita
2026-02-22 23:17 ` Tomasz Wolski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e5deb402-b07a-4fd5-9de0-570bfe9ad715@amd.com \
--to=skoralah@amd.com \
--cc=Smita.KoralahalliChannabasappa@amd.com \
--cc=dan.j.williams@intel.com \
--cc=gourry@gourry.net \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox