Linux CXL
 help / color / mirror / Atom feed
From: Gregory Price <gregory.price@memverge.com>
To: Hao Xiang <hao.xiang@bytedance.com>
Cc: "Ho-Ren (Jack) Chuang" <horenchuang@bytedance.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
	"Gregory Price" <gourry.memverge@gmail.com>,
	"Fan Ni" <fan.ni@samsung.com>, "Ira Weiny" <ira.weiny@intel.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"David Hildenbrand" <david@redhat.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	qemu-devel@nongnu.org, "Ho-Ren (Jack) Chuang" <horenc@vt.edu>,
	linux-cxl@vger.kernel.org
Subject: Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'
Date: Tue, 9 Jan 2024 17:13:21 -0500	[thread overview]
Message-ID: <ZZ3FAegN/ezu9Ghy@memverge.com> (raw)
In-Reply-To: <CAAYibXjcVu-13h9jHpt2fZ3wmjz-_Dyj+WfUh0AEpid87GLriQ@mail.gmail.com>

On Tue, Jan 09, 2024 at 01:27:28PM -0800, Hao Xiang wrote:
> On Tue, Jan 9, 2024 at 11:58 AM Gregory Price
> <gregory.price@memverge.com> wrote:
> >
> > If you drop this line:
> >
> > -numa node,memdev=vmem0,nodeid=1
> 
> We tried this as well and it works after going through the cxlcli
> process and created the devdax device. The problem is that without the
> "nodeid=1" configuration, we cannot connect with the explicit per numa
> node latency/bandwidth configuration "-numa hmat-lb". I glanced at the
> code in hw/numa.c, parse_numa_hmat_lb() looks like the one passing the
> lb information to VM's hmat.
>

Yeah, this is what Jonathan was saying - right now there isn't a good
way (in QEMU) to pass the hmat/cdat stuff down through the device.
Needs to be plumbed out.

In the meantime: You should just straight up drop the cxl device from
your QEMU config.  It doesn't actually get you anything.

> From what I understand so far, the guest kernel will dynamically
> create a numa node after a cxl devdax device is created. That means we
> don't know the numa node until after VM boot. 2. QEMU can only
> statically parse the lb information to the VM at boot time. How do we
> connect these two things?

during boot, the kernel discovers all the memory regions exposed to
bios. In this qemu configuration you have defined:

region 0: CPU + DRAM node
region 1: DRAM only node
region 2: CXL Fixed Memory Window (the last line of the cxl stuff)

The kernel reads this information on boot and reserves 1 numa node for
each of these regions.

The kernel then automatically brings up regions 0 and 1 in nodes 0 and 1
respectively.

Node2 sits dormant until you go through the cxl-cli startup sequence.


What you're asking for is for the QEMU team to plumb hmat/cdat
information down through the type3 device.  I *think* that is presently
possible with a custom CDAT file - but Jonathan probably has more
details on that.  You'll have to go digging for answers on that one.


Now - even if you did that - the current state of the cxl-type3 device
is *not what you want* because your memory accesses will be routed
through the read/write functions in the emulated device.

What Jonathan and I discussed on the other thread is how you might go
about slimming this down to allow pass-through of the memory without the
need for all the fluff.  This is a non-trivial refactor of the existing
device, so i would not expect that any time soon.

At the end of the day, quickest way to get-there-from-here is to just
drop the cxl related lines from your QEMU config, and keep everything
else.

> 
> Assuming that the same issue applies to a physical server with CXL.
> Were you able to see a host kernel getting the correct lb information
> for a CXL devdax device?
> 

Yes, if you bring up a CXL device via cxl-cli on real hardware, the
subsequent numa node ends up in the "lower tier" of the memory-tiering
infrastructure.

~Gregory

  reply	other threads:[~2024-01-09 22:13 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-01  7:53 [QEMU-devel][RFC PATCH 0/1] Introduce HostMemType for 'memory-backend-*' Ho-Ren (Jack) Chuang
2024-01-01  7:53 ` [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram' Ho-Ren (Jack) Chuang
2024-01-02 10:29   ` Philippe Mathieu-Daudé
2024-01-02 13:03   ` David Hildenbrand
2024-01-06  0:45     ` [External] " Hao Xiang
2024-01-03 21:56   ` Gregory Price
2024-01-06  5:59     ` [External] " Hao Xiang
2024-01-08 17:15       ` Gregory Price
2024-01-08 22:47         ` Hao Xiang
2024-01-09  1:05           ` Hao Xiang
2024-01-09  1:13             ` Gregory Price
2024-01-09 19:33               ` Hao Xiang
2024-01-09 19:57                 ` Gregory Price
2024-01-09 21:27                   ` Hao Xiang
2024-01-09 22:13                     ` Gregory Price [this message]
2024-01-09 23:55                       ` Hao Xiang
2024-01-10 14:31                         ` Jonathan Cameron
2024-01-12 15:32   ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZZ3FAegN/ezu9Ghy@memverge.com \
    --to=gregory.price@memverge.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=david@redhat.com \
    --cc=eblake@redhat.com \
    --cc=eduardo@habkost.net \
    --cc=fan.ni@samsung.com \
    --cc=gourry.memverge@gmail.com \
    --cc=hao.xiang@bytedance.com \
    --cc=horenc@vt.edu \
    --cc=horenchuang@bytedance.com \
    --cc=imammedo@redhat.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox