From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Gregory Price <gregory.price@memverge.com>,
Jonathan Cameron via <qemu-devel@nongnu.org>,
Lukas Wunner <lukas@wunner.de>, Michael Tsirkin <mst@redhat.com>,
Ben Widawsky <bwidawsk@kernel.org>, <linux-cxl@vger.kernel.org>,
<linuxarm@huawei.com>, Ira Weiny <ira.weiny@intel.com>,
Gregory Price <gourry.memverge@gmail.com>
Subject: Re: cxl nvdimm Potential probe ordering issues.
Date: Mon, 23 Jan 2023 09:44:44 +0000
Message-ID: <20230123094444.00006bd3@Huawei.com>
In-Reply-To: <63cb1881b078a_3a36e5294ab@dwillia2-xfh.jf.intel.com.notmuch>
On Fri, 20 Jan 2023 14:41:05 -0800
Dan Williams <dan.j.williams@intel.com> wrote:
> Gregory Price wrote:
> > On Fri, Jan 20, 2023 at 09:38:13AM -0800, Dan Williams wrote:
> > > As it stands currently that dax device and the cxl device are not
> > > related since a default dax-device is loaded just based on the presence
> > > of an EFI_MEMORY_SP address range in the address map. With the new ram
> > > enabling that default device will be elided and CXL will register a
> > > dax-device parented by a cxl region.
> > >
> > > > - The memory *does not* auto-online, instead the dax device can be
> > > > onlined as system-ram *manually* via ndctl and friends
> > >
> > > That *manually* part is the problem that needs distro help to solve. It
> > > should be the case that by default all Linux distributions auto-online
> > > all dax-devices. If that happens to online memory that is too slow for
> > > general use, or too high-performance / precious for general purpose use
> > > then the administrator can set policy after the fact. Unfortunately user
> > > policy cannot be applied if these memory ranges were onlined by the
> > > kernel at boot, so that's why the kernel policy defaults to not-online.
> > >
> > > In other words, there is no guarantee that memory that was assigned to
> > > the general purpose pool at boot can be removed. The only guaranteed
> > > behavior is to never give the memory to the core kernel in the first
> > > instance and always let user policy route the memory.
> > >
> > > > 3) The code creates an nvdimm_bridge IFF a CFMW is defined - regardless
> > > > of the type-3 device configuration (pmem-only or vmem-only)
> > >
> > > Correct, the top-level bus code (cxl_acpi) and the endpoint code
> > > (cxl_mem, cxl_port) need to handshake before establishing regions. For
> > > pmem regions the platform needs to claim the availability of a pmem
> > > capable CXL window.
> > >
> > > > 4) As you can see above, multiple decoders are registered. I'm not sure
> > > > if that's correct or not, but it does seem odd given there's only one
> > > > cxl type-3 device. Odd that decoder0.0 shows up when CFMW is there,
> > > > but not when it isn't.
> > >
> > > CXL windows are modeled as decoders hanging off the CXL root device
> > > (ACPI0017 on ACPI based platforms). An endpoint decoder can then map a
> > > selection of that window.
> > >
> > > > Don't know why I haven't thought of this until now, but is the CFMW code
> > > > reporting something odd about what's behind it? Is it assuming the
> > > > devices are pmem?
> > >
> > > No, the cxl_acpi code is just advertising platform decode possibilities
> > > independent of what devices show up. Think of this like the PCI MMIO
> > > space that gets allocated to a root bridge at the beginning of time.
> > > That space may or may not get consumed based on what devices show up
> > > downstream.
> >
> > Thank you for the explanation Dan, and thank you for your patience
> > @JCameron. I'm fairly sure I grok it now.
> >
> > Summarizing to make sure: the cxl driver is providing what would be the
> > CXL.io (control) path, and the CXL.mem path is basically being simulated
> > by what otherwise would be a traditional PCI memory region. This explains
> > why turning off Legacy mode drops the dax devices, and why the topology
> > looks strange - the devices are basically attached in 2 different ways.
> >
> > Might there be interest from the QEMU community to implement this
> > legacy-style setup in the short term, in an effort to test the
> > control path of type-3 devices while we wait for the kernel to catch up?
I'd happily review such code, but it's not on my list of things to work on
otherwise. Too many other things to support!
Jonathan
> >
> > Or should we forget this mode ever existed and just barrel forward
> > with HDM decoders and writing the kernel code to hook up the underlying
> > devices in drivers/cxl?
>
> Which mode are you referring to?
>
> The next steps for the kernel enabling relevant to this thread are:
>
> * ram region discovery (platform firmware or kexec established)
> * ram region creation
> * pmem region discovery (from labels)