From: Dan Williams <dan.j.williams@intel.com>
To: Gregory Price <gregory.price@memverge.com>,
Dan Williams <dan.j.williams@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Jonathan Cameron via <qemu-devel@nongnu.org>,
Lukas Wunner <lukas@wunner.de>, Michael Tsirkin <mst@redhat.com>,
Ben Widawsky <bwidawsk@kernel.org>, <linux-cxl@vger.kernel.org>,
<linuxarm@huawei.com>, Ira Weiny <ira.weiny@intel.com>,
Gregory Price <gourry.memverge@gmail.com>
Subject: Re: cxl nvdimm Potential probe ordering issues.
Date: Fri, 20 Jan 2023 14:41:05 -0800
Message-ID: <63cb1881b078a_3a36e5294ab@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <Y8sNfC1YQVj/DfBU@memverge.com>

Gregory Price wrote:
> On Fri, Jan 20, 2023 at 09:38:13AM -0800, Dan Williams wrote:
> > As it stands currently that dax device and the cxl device are not
> > related since a default dax-device is loaded just based on the presence
> > of an EFI_MEMORY_SP address range in the address map. With the new ram
> > enabling that default device will be elided and CXL will register a
> > dax-device parented by a cxl region.
> >
> > > - The memory *does not* auto-online, instead the dax device can be
> > > onlined as system-ram *manually* via ndctl and friends
> >
> > That *manually* part is the problem that needs distro help to solve. It
> > should be the case that by default all Linux distributions auto-online
> > all dax-devices. If that happens to online memory that is too slow for
> > general use, or too high-performance / precious for general purpose use,
> > then the administrator can set policy after the fact. Unfortunately, user
> > policy cannot be applied if these memory ranges were onlined by the
> > kernel at boot, so that's why the kernel policy defaults to not-online.
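
For concreteness, the "manual" step and the distro default argued for above
look roughly like this (a sketch assuming daxctl from the ndctl project and a
hypothetical dax0.0 device backed by a CXL region):

  # Manual step: rebind the dax device to the kmem driver and hand its
  # capacity to the page allocator as system-ram
  daxctl reconfigure-device --mode=system-ram dax0.0

  # What a distro default could do instead: auto-online newly hot-added
  # memory blocks (as movable)
  echo online_movable > /sys/devices/system/memory/auto_online_blocks
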
> >
> > In other words, there is no guarantee that memory that was assigned to
> > the general purpose pool at boot can be removed. The only guaranteed
> > behavior is to never give the memory to the core kernel in the first
> > instance and always let user policy route the memory.
> >
> > > 3) The code creates an nvdimm_bridge IFF a CFMW is defined - regardless
> > > of the type-3 device configuration (pmem-only or vmem-only)
> >
> > Correct, the top-level bus code (cxl_acpi) and the endpoint code
> > (cxl_mem, cxl_port) need to handshake before establishing regions. For
> > pmem regions the platform needs to claim the availability of a pmem
> > capable CXL window.
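
A hedged userspace illustration of that claim (decoder0.0 and mem0 are
hypothetical object names; syntax per the cxl tool shipped with ndctl):

  # Assembling a pmem region only works against a root decoder whose
  # window the platform advertised as pmem capable
  cxl create-region -m -t pmem -d decoder0.0 -w 1 mem0
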
> >
> > > 4) As you can see above, multiple decoders are registered. I'm not sure
> > > if that's correct or not, but it does seem odd given there's only one
> > > cxl type-3 device. Odd that decoder0.0 shows up when CFMW is there,
> > > but not when it isn't.
> >
> > CXL windows are modeled as decoders hanging off the CXL root device
> > (ACPI0017 on ACPI based platforms). An endpoint decoder can then map a
> > selection of that window.
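
For example (object names are illustrative), both levels of that hierarchy
can be inspected from userspace:

  # Root decoders (one per CXL window) hang off the CXL root object
  ls /sys/bus/cxl/devices/root0/
  # Endpoint decoders that map a selection of a window show up alongside
  # the root decoders in:
  cxl list -D
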
> >
> > > Don't know why I haven't thought of this until now, but is the CFMW code
> > > reporting something odd about what's behind it? Is it assuming the
> > > devices are pmem?
> >
> > No, the cxl_acpi code is just advertising platform decode possibilities
> > independent of what devices show up. Think of this like the PCI MMIO
> > space that gets allocated to a root bridge at the beginning of time.
> > That space may or may not get consumed based on what devices show up
> > downstream.
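
On the QEMU side this is the fixed memory window declared up front on the
command line; a fragment along the lines of the QEMU CXL documentation
(values are illustrative):

  # Enable CXL and define a host bridge for the window to target
  -machine q35,cxl=on
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1
  # Fixed memory window (CFMWS) advertised to the guest whether or not a
  # type-3 device ever shows up behind it
  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
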
>
> Thank you for the explanation, Dan, and thank you for your patience,
> @JCameron. I'm fairly sure I grok it now.
>
> Summarizing to make sure: the cxl driver is providing what would be the
> CXL.io (control) path, and the CXL.mem path is basically being simulated
> by what otherwise would be a traditional PCI memory region. This explains
> why turning off Legacy mode drops the dax devices, and why the topology
> looks strange - the devices are basically attached in 2 different ways.
>
> Might there be interest from the QEMU community to implement this
> legacy-style setup in the short term, in an effort to test the
> control path of type-3 devices while we wait for the kernel to catch up?
>
> Or should we forget this mode ever existed and just barrel forward
> with HDM decoders and writing the kernel code to hook up the underlying
> devices in drivers/cxl?

Which mode are you referring to?

The next steps for the kernel enabling relevant to this thread are (see the
sketch after the list):
* ram region discovery (platform firmware or kexec established)
* ram region creation
* pmem region discovery (from labels)
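
A rough userspace sketch of that flow as the pieces land (command syntax per
the cxl/ndctl tools; region0 and dax0.0 are hypothetical names):

  # Regions the kernel discovered (platform firmware / kexec established)
  # or that userspace created
  cxl list -R
  # A ram region surfaces as a dax device that can then be onlined
  daxctl reconfigure-device --mode=system-ram dax0.0
  # pmem namespaces, and the labels they are later rediscovered from, are
  # managed with ndctl
  ndctl create-namespace -m fsdax -r region0
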