From: Dan Williams <dan.j.williams@intel.com>
To: Vikram Sethi <vsethi@nvidia.com>,
Dan Williams <dan.j.williams@intel.com>,
"Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
"catalin.marinas@arm.com" <Catalin.Marinas@arm.com>,
James Morse <james.morse@arm.com>
Cc: "Natu, Mahesh" <mahesh.natu@intel.com>
Subject: RE: Questions about CXL device (type 3 memory) hotplug
Date: Wed, 24 May 2023 14:20:23 -0700
Message-ID: <646e7f96f33e2_33fb3294c1@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <BN8PR12MB3330831F2E666E9BB1319E66BD419@BN8PR12MB3330.namprd12.prod.outlook.com>
Vikram Sethi wrote:
[..]
> > I don't understand this failure mode. Accelerator is added, driver sets up an
> > HDM decode range and triggers CPU cache invalidation before mapping the
> > memory into page tables. Wouldn't the device, upon receiving an invalidation
> > request, just snoop its caches and say "nothing for me to do"?
>
> Device's snoop filter is in a clean reset/power on state. It is not
> tracking anything checked out by the host CPU/peer. If it starts
> receiving writebacks or even CleanEvicts for its memory,

CleanEvict is a device-to-host request. We are talking about
host-to-device requests, which are only SnpData, SnpInv, and SnpCur,
right?

> looks like an unexpected coherency message and I know of at least one
> implementation that triggers an error interrupt in response. I don't
> know of a statement in the specification that this is expected and
> implementations should ignore. If there is such a statement, could you
> please point me to it?

All the specification says (CXL 3.0, section 3.2.4.4, Host to Device
Requests) is what to do *if* the device is holding that cacheline.

If a device fails when it receives one of those requests for a line it
does not hold, then how can this work in the nominal case where the
device simply does not own some arbitrary cacheline?

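To make the question concrete, here is a toy model of the behavior being
asked about. It is a sketch only, not taken from the CXL specification or
from any real device: the types, the function handle_snoop(), and the
simplified result codes are all hypothetical, and the only names borrowed
from the discussion are the three snoop opcodes. The point it illustrates
is that a snoop for a line the device does not hold falls through to a
harmless "not held" result instead of an error.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host-to-device snoop opcodes named in the discussion above. */
enum h2d_snoop { SNP_DATA, SNP_INV, SNP_CUR };

/* Hypothetical, simplified outcomes; NOT the spec's response encodings. */
enum snoop_result { RSP_NOT_HELD, RSP_KEPT, RSP_DROPPED };

#define NUM_LINES 4

/* Toy device cache: a handful of tracked cacheline addresses. */
struct toy_cache {
	uint64_t addr[NUM_LINES];
	bool valid[NUM_LINES];
};

/*
 * Assumption for illustration: only SnpInv forces the line out, while
 * SnpData and SnpCur leave it in place. Real state transitions are
 * richer than this.
 */
static enum snoop_result handle_snoop(struct toy_cache *c,
				      enum h2d_snoop op, uint64_t addr)
{
	for (size_t i = 0; i < NUM_LINES; i++) {
		if (c->valid[i] && c->addr[i] == addr) {
			if (op == SNP_INV) {
				c->valid[i] = false;
				return RSP_DROPPED;
			}
			return RSP_KEPT;
		}
	}
	/* Line not held: nothing to do, and not an error condition. */
	return RSP_NOT_HELD;
}
```

In this model a device that has just come out of reset has no valid
entries, so every snoop it receives returns RSP_NOT_HELD.
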
> Remove memory needs a cache flush IMO, in a way that prevents
> speculative fetches. This can be done in kernel with uncacheable
> mappings alone, if possible in the arch callback, or via FW call.

That assumes that the kernel owns all mappings. I worry about mappings
that the kernel cannot see, like x86 SMM. That's why it's currently an
invalidate before next usage, but I am not opposed to also flushing on
remove if the current solution is causing device failures in practice.

Can you confirm that the current kernel arrangement is causing failures
in practice, or is this a theoretical concern? ...and if it is happening
in practice, do you have an example patch that fixes it?