From: Dan Williams <dan.j.williams@intel.com>
To: Vikram Sethi <vsethi@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	"Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"catalin.marinas@arm.com" <Catalin.Marinas@arm.com>,
	James Morse <james.morse@arm.com>
Cc: "Natu, Mahesh" <mahesh.natu@intel.com>
Subject: RE: Questions about CXL device (type 3 memory) hotplug
Date: Tue, 23 May 2023 11:40:19 -0700
Message-ID: <646d0892eadc3_afb77294cb@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <BYAPR12MB333685D98C43D5247BEA998EBD409@BYAPR12MB3336.namprd12.prod.outlook.com>

Vikram Sethi wrote:
> Hi Dan, 
> 
> > From: Dan Williams <dan.j.williams@intel.com>
> > Sent: Monday, May 22, 2023 7:12 PM
> > To: Yasunori Gotou (Fujitsu) <y-goto@fujitsu.com>; linux-
> > cxl@vger.kernel.org
> > Cc: 'Dan Williams' <dan.j.williams@intel.com>
> > Subject: RE: Questions about CXL device (type 3 memory) hotplug
> > 
> > > Q4) Current CXL drivers/tools support Hot-removal request from PCIe?
> > >
> > >     CXL specification says "In a managed Hot-Remove flow, software is
> > >     notified of a hot removal request."
> > 
> > Currently there is a requirement that:
> > 
> > cxl disable-memdev
> > 
> > ...is run before the device can be removed. There is no warning from the PCI
> > hotplug driver. Which means that if end user does the wrong sequence they
> > can crash the kernel / remove memory that may still be in active use.
> >
> Is there any notion of a cache flush when memory is removed (or in future CXL reset)?

No.

> Generally, CPU caches must be flushed when memory is removed because any evictions
> when the memory isn't present can cause async errors which can be fatal to the system
> or at least to VMs, depending on ISA.

That requirement seems incompatible with memory hotplug. The cache
flushing is only done on the subsequent reuse of the physical address
range, to make sure that any pending evictions are complete before the
newly constituted address range is put into service, or that any prior
clean cache lines of old content are dropped. See
cxl_region_invalidate_memregion() for where this is done.

> If the kernel does the cache flush, it must be done
> with only uncacheable mappings present to prevent speculative fetches after the cache flush. 

This is why the invalidation is done after the physical address range is
populated by new devices: it flushes any speculative fetches to the old
composition of the address range.
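
To make the ordering concrete, here is a toy userspace model in C. It is
only a sketch: every toy_* name is made up for illustration, one byte
stands in for a cache line, and only clean lines are modelled (dirty
evictions are not). It is not the kernel implementation. It shows that a
clean line cached from the old composition keeps answering reads until
the range is invalidated, which is why the invalidation belongs after
the new devices are in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NLINES 4

/* Toy model: one byte per "cache line", no real hardware semantics. */
struct toy_mem {
	unsigned char backing[TOY_NLINES]; /* device currently behind the range */
	unsigned char cache[TOY_NLINES];   /* CPU-cached copies */
	bool valid[TOY_NLINES];            /* line is present in the cache */
};

/* Read through the cache: a valid line hides whatever backing now holds. */
unsigned char toy_read(struct toy_mem *m, int line)
{
	if (!m->valid[line]) {
		m->cache[line] = m->backing[line];
		m->valid[line] = true;
	}
	return m->cache[line];
}

/* Put a new device behind the same physical address range. */
void toy_reconstitute(struct toy_mem *m, unsigned char fill)
{
	/* Deliberately no cache maintenance at this point. */
	memset(m->backing, fill, sizeof(m->backing));
}

/* Hypothetical stand-in for the invalidation done on reuse of the range. */
void toy_invalidate(struct toy_mem *m)
{
	memset(m->valid, 0, sizeof(m->valid));
}
```

Reading a line, swapping the backing contents, and reading again returns
the old value in this model; only after toy_invalidate() does the read
see the new device.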

> Even so, kernel VA based cache flushes will likely be slow, so may be better to have the notion
> of an arch callback that can invoke firmware to do the cache flush. 
> Perhaps arch_remove_memory is the right place to invoke such a cache flush/FW call?
> I think the CXL specification should also address the need for cache flush when removing memory
> or doing CXL reset.

That seems out of scope for the CXL specification; it is up to each
arch to handle.
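
As a rough illustration of what "up to each arch" could look like, here
is a userspace sketch of a per-arch hook that generic code calls without
knowing whether the flush is a CPU instruction or a firmware call. The
struct, the function names, and the choice of -ENXIO as the "not
supported" error are all assumptions made for this example, not the
kernel's actual interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-arch hooks; names are illustrative only. */
struct toy_arch_cache_ops {
	bool (*has_invalidate)(void); /* can this arch invalidate at all? */
	int (*invalidate)(void);      /* e.g. an instruction or a FW call */
};

/* Generic caller: refuse to proceed when the arch offers no mechanism. */
int toy_region_invalidate(const struct toy_arch_cache_ops *ops)
{
	if (!ops || !ops->has_invalidate || !ops->has_invalidate())
		return -ENXIO;
	return ops->invalidate();
}

/* Example "arch" backend that pretends to ask firmware for the flush. */
int toy_flush_count;

bool toy_yes(void)
{
	return true;
}

int toy_fw_flush(void)
{
	toy_flush_count++; /* record that the flush was requested */
	return 0;
}
```

An arch with no hooks registered gets an error back rather than a silent
no-op, so the caller cannot put the range into service by mistake.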

Here is some discussion of what ARM is thinking about in this space:

https://lore.kernel.org/all/40cd479b-f0f8-5dba-0e41-4cef73693927@arm.com/
