From: Gregory Price <gregory.price@memverge.com>
To: Hao Xiang <hao.xiang@bytedance.com>
Cc: "Ho-Ren (Jack) Chuang" <horenchuang@bytedance.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
"Ben Widawsky" <ben.widawsky@intel.com>,
"Gregory Price" <gourry.memverge@gmail.com>,
"Fan Ni" <fan.ni@samsung.com>, "Ira Weiny" <ira.weiny@intel.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"David Hildenbrand" <david@redhat.com>,
"Igor Mammedov" <imammedo@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Eduardo Habkost" <eduardo@habkost.net>,
qemu-devel@nongnu.org, "Ho-Ren (Jack) Chuang" <horenc@vt.edu>,
linux-cxl@vger.kernel.org
Subject: Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'
Date: Mon, 8 Jan 2024 12:15:06 -0500
Message-ID: <ZZwtmiucNXxmrZ7S@memverge.com>
In-Reply-To: <CAAYibXjZ0HSCqMrzXGv62cMLncS_81R3e1uNV5Fu4CPm0zAtYw@mail.gmail.com>
On Fri, Jan 05, 2024 at 09:59:19PM -0800, Hao Xiang wrote:
> On Wed, Jan 3, 2024 at 1:56 PM Gregory Price <gregory.price@memverge.com> wrote:
> >
> > For a variety of performance reasons, this will not work the way you
> > want it to. You are essentially telling QEMU to map the vmem0 into a
> > virtual cxl device, and now any memory accesses to that memory region
> > will end up going through the cxl-type3 device logic - which is an IO
> > path from the perspective of QEMU.
>
> I didn't understand exactly how the virtual cxl-type3 device works. I
> thought it would go with the same "guest virtual address -> guest
> physical address -> host physical address" translation totally done by
> CPU. But if it is going through an emulation path handled by virtual
> cxl-type3, I agree the performance would be bad. Do you know why
> accessing memory on a virtual cxl-type3 device can't go with the
> nested page table translation?
>
Because a byte-access on CXL memory can have checks on it that must be
emulated by the virtual device, and because there are caching
implications that have to be emulated as well.

The cxl device you are using is an emulated CXL device - not a
virtualization interface. Nuanced difference: the emulated device has
to emulate *everything* that a CXL device does.

What you want is passthrough / managed access to a real device -
virtualization. This is not the way to accomplish that. A better way
to accomplish that is to simply pass the memory through as a static numa
node as I described.
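
For illustration, the static-numa-node setup can be sketched as a small
Python helper that only assembles (and does not launch) a QEMU argument
list. This is a rough sketch under assumptions, not a tested
configuration: the host node number, the sizes, the latency/bandwidth
figures, the two-CPU topology and the helper name
build_qemu_numa_args are all placeholders chosen for the example.

```python
import shlex


def build_qemu_numa_args(host_node=1, fast_gib=4, slow_gib=4,
                         slow_latency_ns=300, slow_bandwidth="20G"):
    """Return QEMU arguments exposing host NUMA node `host_node` to the
    guest as an ordinary CPU-less NUMA node described via HMAT.

    All numeric values are illustrative placeholders, not measurements.
    """
    args = [
        "-accel", "kvm",
        "-machine", "hmat=on",
        "-m", f"{fast_gib + slow_gib}G",
        "-smp", "2,sockets=2,maxcpus=2",
        # Guest node 0: regular memory, no host binding.
        "-object", f"memory-backend-ram,id=fast,size={fast_gib}G",
        # Guest node 1: backed by the host node that holds the CXL memory.
        "-object", ("memory-backend-ram,id=slow,"
                    f"size={slow_gib}G,host-nodes={host_node},policy=bind"),
        "-numa", "node,nodeid=0,memdev=fast",
        "-numa", "node,nodeid=1,memdev=slow,initiator=0",
        "-numa", "cpu,node-id=0,socket-id=0",
        "-numa", "cpu,node-id=0,socket-id=1",
    ]
    # (latency in ns, bandwidth) per guest target node; placeholder values.
    perf = {0: (100, "100G"), 1: (slow_latency_ns, slow_bandwidth)}
    for target, (latency, bandwidth) in perf.items():
        common = f"hmat-lb,initiator=0,target={target},hierarchy=memory"
        args += ["-numa", f"{common},data-type=access-latency,latency={latency}"]
        args += ["-numa", f"{common},data-type=access-bandwidth,bandwidth={bandwidth}"]
    return args


if __name__ == "__main__":
    # Print a command line for inspection; disks, kernel, etc. are omitted.
    print("qemu-system-x86_64 " + shlex.join(build_qemu_numa_args()))
```

With this shape the guest only sees two NUMA nodes and their HMAT
attributes; no cxl-type3 device is involved, so accesses to the second
node are ordinary guest memory accesses.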
>
> When we had a discussion with Intel, they told us to not use the KVM
> option in QEMU while using virtual cxl type3 device. That's probably
> related to the issue you described here? We enabled KVM though but
> haven't seen the crash yet.
>
The crash really only happens, IIRC, if code ends up hosted in that
memory. I forget the exact scenario, but the working theory is it has
to do with the way instruction caches are managed with KVM and this
device.
> >
> > You're better off just using the `host-nodes` field of host-memory
> > and passing bandwidth/latency attributes through via `-numa hmat-lb`
>
> We tried this but it doesn't work from end to end right now. I
> described the issue in another fork of this thread.
>
> >
> > In that scenario, the guest software doesn't even need to know CXL
> > exists at all, it can just read the attributes of the numa node
> > that QEMU created for it.
>
> We thought about this before. But the current kernel implementation
> requires a devdax device to be probed and recognized as a slow tier
> (by reading the memory attributes). I don't think this can be done via
> the path you described. Have you tried this before?
>
Right, because the memory tiering component lumps the nodes together.

Better idea: Fix the memory tiering component.

I cc'd you on another patch line that is discussing something relevant
to this.

https://lore.kernel.org/linux-mm/87fs00njft.fsf@yhuang6-desk2.ccr.corp.intel.com/T/#m32d58f8cc607aec942995994a41b17ff711519c8

The point is: There's no need for this to be a dax device at all, there
is no need for the guest to even know what is providing the memory, or
for the guest to have any management access to the memory. It just
wants the memory and the ability to tier it.

So we should fix the memory tiering component to work with this
workflow.
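
To make "the guest just reads the attributes of the numa node" concrete,
here is a minimal Python sketch of what guest userspace could look at.
It assumes the guest kernel exposes the HMAT-derived values under
/sys/devices/system/node/nodeN/access0/initiators/ (latency in
nanoseconds, bandwidth in MB/s per the kernel's numaperf documentation).
The function name read_node_perf and the sysfs_root parameter are
made up for the example, and this is not how the kernel's tiering code
itself obtains the data.

```python
from pathlib import Path

PERF_ATTRS = ("read_latency", "write_latency",
              "read_bandwidth", "write_bandwidth")


def read_node_perf(node, access_class=0,
                   sysfs_root="/sys/devices/system/node"):
    """Return the performance attributes the kernel reports for one
    NUMA node, as a dict of attribute name to integer value.

    Attributes that are not present (for example when no HMAT data was
    provided for the node) are simply left out of the result.
    """
    base = (Path(sysfs_root) / f"node{node}"
            / f"access{access_class}" / "initiators")
    attrs = {}
    for name in PERF_ATTRS:
        path = base / name
        if path.is_file():
            attrs[name] = int(path.read_text().strip())
    return attrs


if __name__ == "__main__":
    # Compare guest nodes 0 and 1; prints empty dicts if nothing is exposed.
    for node in (0, 1):
        print(node, read_node_perf(node))
```

Nothing here requires a dax device or any knowledge of CXL in the guest;
the node with the higher latency and lower bandwidth is the candidate for
the slow tier.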
~Gregory