From: <dan.j.williams@intel.com>
To: "Cheatham, Benjamin" <benjamin.cheatham@amd.com>,
<linux-cxl@vger.kernel.org>
Cc: <benjamin.cheatham@amd.com>
Subject: Re: RFC: CXL Isolation Support
Date: Mon, 2 Feb 2026 12:20:24 -0800
Message-ID: <69810708cf7df_55fa10055@dwillia2-mobl4.notmuch>
In-Reply-To: <b4f7a760-5cf4-4343-8f7e-9698ba435fa4@amd.com>
Cheatham, Benjamin wrote:
> Quick Background:
> CXL.mem isolation and timeout is a mechanism that allows the host to
> continue operation in the event a CXL.mem link goes down or a CXL.mem
> transaction times out (semi-analogous to PCIe DPC for CXL)[1]. After CXL.mem
> isolation is triggered all CXL memory below the root port is inaccessible.
...and this is unrecoverable in the generic memory expansion case as
detailed previously [1].
[1]: http://lore.kernel.org/65cea1bc6ac0c_5e9bf294ed@dwillia2-xfh.jf.intel.com.notmuch
> At this point writes to the memory are dropped and reads return synchronous
> exceptions (platform specific, but probably poisoned data). The alternative
> to this support (which is the case now) is the host system resets when a
> CXL.mem link goes down or a CXL.mem transaction times out.
>
> Why I'm Sending This:
> I sent out a patch series a few months back that implemented CXL.mem
> error isolation to this list [2]. It didn't really gain traction due
> to not having a customer requesting it. We (AMD) have heard from some
> customers that they are interested in this support, but aren't willing to
> help out upstream.
Then they get the status quo until that "interest" matures into shared
requirements definition, clarification of assumptions, and consensus of
tradeoffs.
> The main motivation behind using isolation we've heard
> is that customers would like to use CXL but are worried about system
> reliability since it's still a new technology.
That does not appear prohibitive given CXL uptake to date. Isolation
does not improve reliability on its own. It replaces hangs with poison
that is fatal outside of constrained use cases.
Now, all of the push back to date has been with respect to the general
purpose memory expansion use case. The way forward from there is new
evidence that the expected mitigations to make isolation useful still
result in a usable feature. The evidence of *that* is the new use case
that Vikram proposed several months back in the CXL collaboration call,
CXL Accelerator error recovery.
In that case there is a chance that the accelerator error model meets the
requirements to make isolation useful: guarantees like 1:1 host bridge
to endpoint direct-attach, non-interleaved CXL.mem, and limited risk of
core kernel dependencies on that CXL.mem.
I am interested in the isolation for CXL accelerator discussion. I am
not interested in muddling through isolation for the general memory
expander use case without engagement from deployment use cases.