From: Alison Schofield <alison.schofield@intel.com>
To: <dan.j.williams@intel.com>
Cc: <gourry@gourry.net>, Davidlohr Bueso <dave@stgolabs.net>,
Jonathan Cameron <jonathan.cameron@huawei.com>,
Dave Jiang <dave.jiang@intel.com>,
"Vishal Verma" <vishal.l.verma@intel.com>,
Ira Weiny <ira.weiny@intel.com>, <linux-cxl@vger.kernel.org>
Subject: Re: [PATCH 2/2] cxl/region: Unregister auto-created region when assembly fails
Date: Wed, 4 Feb 2026 16:20:00 -0800
Message-ID: <aYPiMCuNQ2torvwF@aschofie-mobl2.lan>
In-Reply-To: <6981668d2182_55fa10024@dwillia2-mobl4.notmuch>
On Mon, Feb 02, 2026 at 07:07:57PM -0800, Dan Williams wrote:
> Alison Schofield wrote:
> > On Fri, Jan 30, 2026 at 09:45:29AM -0800, Dan Williams wrote:
> > > Alison Schofield wrote:
> > > > When auto-created region assembly fails the region remains registered
> > > > but disabled.
> > >
> > > Right, that is good forensics, administrator action is needed to figure
> > > out what to do next.
> > >
> > > > The region continues to reserve its memory resource, preventing DAX
> > > > from registering the memory.
> > >
> > > I would rather have the partially assembled region continue to
> > > exist. It can help debug the expected catastrophic error reports from
> > > DAX enabling access to a memory range that the CXL side can see has
> > > completely failed (lost an interleave member). If the failure is more
> > > benign and DAX side access is viable, then the forensics matter and
> > > userspace can cleanup.
> >
> > Thanks for all the feedback, Dan & Greg -
> >
> > I'm responding here because this is the overriding topic of do we want to
> > behave better upon region assembly failures. If there is a path here to
> > becoming a better region driver then I'll take a look at the implementation
> > comments, like if or how to timeout.
> >
> > One point I should have led with: while we are focused on failover to DAX,
> > the issue here is more general. It is about the region driver leaving behind
> > an unrecoverable partial configuration on assembly failure, independent of
> > consumers.
>
> This gets to the heart of the question of what practical problem is
> being solved with this and is the solution suitable? Outside of the
> "platform is doing something strange" case like "Normalized Addressing"
> or "Non-CXL Interleave Target" I am struggling to imagine an end user
> benefiting from this automatic cleanup. A system which is so flaky that
> it can not arrange for BIOS configured interleave to stay alive through
> Linux boot. At that point I expect the end user to decommission that
> system, and flag it for remediation, not recover it and keep running.
Hi Dan,
Why is the response different here than with DAX failover due to wonky
BIOS usage of Soft Reserved resources? When BIOS is unclear(?), we give
up on the CXL regions and give all the memory directly to DAX so the
system can come up with all its expected resources, yet for these region
assembly failures we are willing to strand memory, even though the
option to give it to DAX is so easily available.
In the past, we've seen BIOS-defined regions that the kernel was unable
to assemble for reasons other than outright hardware failure. While we
can hope the worst of those issues are behind us, the region assembly
code path has not been quiet, so assuming no more issues seems
ill-fated.
>
> > Neither of these failures are recoverable from userspace today. If they
> > should be recoverable from userspace, prove me wrong, but I'm doubtful
> > that we are just one smart admin or one good cxl-cli update away from handling
> > this in userspace.
>
> That is my bad. I mixed some unverified wishes in with my replies, but
> the end goal for me remains the same. Userspace should be able to undo
> every step that auto-region assembly performs.
I haven't heard that before. Sounds good to do. (Repeating myself,
I know, but that'll take a coordinated effort, i.e. changes to both
the region driver and userspace.)
>
> > That's why these patches make the region driver fail gracefully. And I
> > do think it is the region driver’s job to fail gracefully.
>
> This is where you lose me. It fails gracefully today. It stops in a safe
> configuration same as if userspace stopped short of fully configuring a
> region. I keep coming back to the RAID example because CXL region
> assembly is roughly patterned after RAID assembly. In that example a
> RAID0 array does not disappear after 30 seconds if auto-assembly fails,
> it waits for administrator action.
>
This is where you lose me. An unrepairable config may be safe, but its use
is limited.
wrt the RAID analogy: I’m not familiar with md internals. IIUC RAID tooling
provides supported admin actions to stop, tear down, and rebuild incomplete
arrays. This CXL failure mode leaves a partial configuration that userspace
cannot repair, so leaving the object behind is not comparable without a
supported repair path. That repair path is aspirational, not within reach
like this solution.
> The potential conflict with a DAX takeover is a separate problem that
> also might not need full teardown if we can make it work with
> incremental fixes.
This is intended to work with, and is tested with, the DAX takeover patches.
As said above, it seems odd to let DAX take over if BIOS gives us wonky
Soft Reserved boundaries, but not let DAX take over for region
assembly failures.
>
> > When auto-created region assembly fails, the region remains registered
> > with decoders still enabled. In that state, userspace does not have a
> > supported way to unwind the configuration. cxl destroy-region fails because
> > the decoders are still enabled (--force fails). So while “administrator action
> > is needed” is true in principle, the admin has no effective action available.
> > Leaving the region behind does not provide a viable recovery path because
> > it leaves all the things related to this region stuck. All the things being
> > the HPA resource, the DPA resources, and the decoders.
>
> Right, I think that is a gap worth fixing to have all the same tools
> available for partial creation recovery also available for partial assembly
> recovery. A "gap" and not a "bug" because only a unit test might care
> about this presently.
This is not a unit test driven issue. It was not found in unit testing and
is not motivated by trying to make a contrived unit test pass. This was
observed in real configurations where BIOS-defined regions failed to
assemble and memory was not recoverable.
Agree to call it a gap. Did I call it a bug? I'm not reading any intent into
why the region was not unregistered upon assembly failure previously. If you
tell me that it was with the intent that user space tooling would pick up the
pieces, I believe you and it's worth examining:
Which will work better:
-- improving the existing stop on assembly failure so userspace can repair
-- or unregistering completely with a failover to DAX. Non-DAX users can
recreate the region from the command line.
It's difficult not to be biased towards these patches, when they are simple
and within reach and the other is aspirational.
>
> > On the forensics point, the most actionable diagnostic information is not in
> > cxl-list output. cxl-list can show the existence of a disabled region, but
> > it does not show why assembly failed, which endpoint is missing, or what
> > happened at the time of failure.
>
> cxl list -RDi -r $region
>
> ...shows the region, the number of expected targets, and the ones that
> have arrived.
>
> > The forensic info is in the kernel log, because that’s where the
> > assembly failure is detected and where the relevant context exists.
> > With the changes here, the kernel messaging is improved so that the
> > failure is explicit rather than requiring the admin to infer the
> > situation from a disabled region in cxl-list.
>
> The kernel log does not know the device that was meant to arrive. The
> kernel log likely has debug disabled by default. This situation should
> be debuggable without the kernel log.
>
> Likely the first notification of something wrong is operations tooling
> noticing that serverX came up with less memory than expected, not a
> kernel log message.
(feels out of order here, but to finish response on comments)
I agree the cxl list output is useful, but not as useful as making the
failure explicit in a non-debug kernel log message, nor as useful
as giving the user their expected memory via failover to DAX, nor as
useful as allowing the user to create a new region from userspace
with the same resources.
--Alison