From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: Sajjan Rao <sajjanr@gmail.com>
Cc: Gregory Price <gregory.price@memverge.com>,
Dimitrios Palyvos <dimitrios.palyvos@zptcorp.com>,
<linux-cxl@vger.kernel.org>, <qemu-devel@nongnu.org>,
<richard.henderson@linaro.org>
Subject: Crash with CXL + TCG on 8.2: Was Re: qemu cxl memory expander shows numa_node -1
Date: Thu, 1 Feb 2024 13:04:38 +0000
Message-ID: <20240201130438.00001384@Huawei.com>
In-Reply-To: <CAAg4ParQKj9FUe0DRX0Wmk1KT0bnxx2F7W=ic38781j7eVz+OQ@mail.gmail.com>
On Tue, 30 Jan 2024 13:50:18 +0530
Sajjan Rao <sajjanr@gmail.com> wrote:
> Hi Jonathan,
>
> The QEMU command line in the original email has been corrected back in
> August 2023 based on the subsequent responses.
>
> My current QEMU command line reads like below. As you can see I am not
> assigning numa to the CXL memory object.
>
> qemu-system-x86_64 \
> -hda /var/lib/libvirt/images/CXL-Test_1.qcow2 \
> -machine type=q35,nvdimm=on,cxl=on \
> -accel tcg,thread=single \
> -m 4G \
> -smp cpus=4 \
> -object memory-backend-ram,size=4G,id=m0 \
> -object memory-backend-ram,size=256M,id=cxl-mem1 \
> -object memory-backend-ram,size=256M,id=cxl-mem2 \
> -numa node,memdev=m0,cpus=0-3,nodeid=0 \
> -netdev user,id=net0,net=192.168.0.0/24,dhcpstart=192.168.0.9,hostfwd=tcp::2222-:22 \
> -device virtio-net-pci,netdev=net0 \
> -device pxb-cxl,bus_nr=2,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
> -device cxl-rp,port=0,bus=cxl.1,id=cxl_rp_port0,chassis=0,slot=2 \
> -device cxl-upstream,bus=cxl_rp_port0,id=us0,addr=0.0,multifunction=on, \
> -device cxl-switch-mailbox-cci,bus=cxl_rp_port0,addr=0.2,target=us0 \
> -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
> -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=8 \
> -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem1,id=cxl-vmem1 \
> -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem2,id=cxl-vmem2 \
> -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=512M,cxl-fmw.0.interleave-granularity=2k \
> -D /tmp/qemu.log \
> -nographic
>
> Until I moved to Qemu version 8.2 recently, I was able to create
> regions and run linux native commands on CXL memory using
> #numactl --membind <cxl NUMA#> top
>
> You had advised me to turn off KVM and use tcg since the membind
> command will run code out of CXL memory which is not supported. By
> disabling KVM the membind command worked fine.
> However with Qemu version 8.2 the same membind command results in a
> kernel hard crash.
Just to check: is it the kernel that crashes, or QEMU?
I've probably replicated it, and it seems to be QEMU that is going down
with a TCG issue.
Bisection is underway. This may take a while.
Our use of TCG with what QEMU thinks of as IO memory is unusual,
so we tend to run into corners no one else cares about.
Richard, +CC on the off chance you can guess what has happened and save
me a bisection run.
x86 machine pretty much as described above
root@localhost:~/devmem2# numactl --membind=1 touch a
qemu: fatal: cpu_io_recompile: could not find TB for pc=(nil)
RAX=00d6b969c0000000 RBX=ff294696c0044440 RCX=0000000000000028 RDX=0000000000000000
RSI=0000000000000275 RDI=0000000000000000 RBP=0000000490000000 RSP=ff4f8767805d3d20
R8 =0000000000000000 R9 =ff4f8767805d3cdc R10=0000000000000000 R11=0000000000000040
R12=ff294696c0044980 R13=0000000000000000 R14=ff294696c51d0000 R15=0000000000000000
RIP=ffffffff9d270fed RFL=00000007 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 00000000 00000000
CS =0010 0000000000000000 ffffffff 00af9b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00cf9300 DPL=0 DS [-WA]
DS =0000 0000000000000000 00000000 00000000
FS =0000 0000000000000000 00000000 00000000
GS =0000 ff2946973bc00000 00000000 00000000
LDT=0000 0000000000000000 00000000 00008200 DPL=0 LDT
TR =0040 fffffe37d29e7000 00004087 00008900 DPL=0 TSS64-avl
GDT= fffffe37d29e5000 0000007f
IDT= fffffe0000000000 00000fff
CR0=80050033 CR2=00007f2972bdc450 CR3=0000000490000000 CR4=00751ef0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00d6b969c0000000 CCD=0000000490000000 CCO=ADDQ
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
YMM00=0000000000000000 0000000000000000 3a3a3a3a3a3a3a3a 3a3a3a3a3a3a3a3a
YMM01=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM02=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM03=0000000000000000 0000000000000000 00ff0000000000ff 0000000000000000
YMM04=0000000000000000 0000000000000000 5f796c7261655f63 62696c5f5f004554
YMM05=0000000000000000 0000000000000000 0000000000000000 0000000000000060
YMM06=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM07=0000000000000000 0000000000000000 0909090909090909 0909090909090909
YMM08=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM09=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM10=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM11=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM12=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM13=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM14=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM15=0000000000000000 0000000000000000 0000000000000000 0000000000000000
Jonathan
> I wanted to check if this is a known issue with 8.2 and is there a way
> around it.
>
> Thanks,
> Sajjan
>
> On Fri, Jan 26, 2024 at 10:42 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Fri, 26 Jan 2024 10:43:43 -0500
> > Gregory Price <gregory.price@memverge.com> wrote:
> >
> > > On Fri, Jan 26, 2024 at 12:39:26PM +0000, Jonathan Cameron wrote:
> > > > On Thu, 25 Jan 2024 13:45:09 +0530
> > > > Sajjan Rao <sajjanr@gmail.com> wrote:
> > > >
> > > > > Looks like something changed in QEMU 8.2 that broke running code out
> > > > > of CXL memory with KVM disabled.
> > > > > I used "numactl --membind 2 ls" as suggested by Dimitrios earlier,
> > > > > this worked for me until I updated to the latest QEMU.
> > > > >
> > > > > Is this a known issue? Or am I missing something?
> > > >
> > > > I'm confused on how the description below ever worked.
> > > > Assigning the underlying memdev=cxl-mem1 to a numa node isn't going
> > > > to correctly build the connections to the CFMWS PA range.
> > > >
> > >
> > > I've now seen 3-4 occasions where people have done this and run into
> > > trouble (for obvious reasons). Is there anything we can do to disallow
> > > the double-registering of a single memdev to both a numa node and a cxl
> > > device?
> > >
> > It would be novel for us to prevent people shooting themselves
> > in the foot ;) but I guess this should be fairly easy as the
> > numa node logic prevents the same one being used multiple times so can
> > copy how that is done.
> >
> > This should do the trick (very lightly tested).
> > It's end of day Friday here so a formal patch can wait for next week.
> >
> >
> > diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> > index f29346fae7..d4194bb757 100644
> > --- a/hw/mem/cxl_type3.c
> > +++ b/hw/mem/cxl_type3.c
> > @@ -827,6 +827,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> >              error_setg(errp, "volatile memdev must have backing device");
> >              return false;
> >          }
> > +        if (host_memory_backend_is_mapped(ct3d->hostvmem)) {
> > +            error_setg(errp, "memory backend %s can't be used multiple times.",
> > +                       object_get_canonical_path_component(OBJECT(ct3d->hostvmem)));
> > +            return false;
> > +        }
> >          memory_region_set_nonvolatile(vmr, false);
> >          memory_region_set_enabled(vmr, true);
> >          host_memory_backend_set_mapped(ct3d->hostvmem, true);
> > @@ -850,6 +855,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> >              error_setg(errp, "persistent memdev must have backing device");
> >              return false;
> >          }
> > +        if (host_memory_backend_is_mapped(ct3d->hostpmem)) {
> > +            error_setg(errp, "memory backend %s can't be used multiple times.",
> > +                       object_get_canonical_path_component(OBJECT(ct3d->hostpmem)));
> > +            return false;
> > +        }
> >          memory_region_set_nonvolatile(pmr, true);
> >          memory_region_set_enabled(pmr, true);
> >          host_memory_backend_set_mapped(ct3d->hostpmem, true);
> > @@ -880,6 +890,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> >              error_setg(errp, "dynamic capacity must have backing device");
> >              return false;
> >          }
> > +        if (host_memory_backend_is_mapped(ct3d->dc.host_dc)) {
> > +            error_setg(errp, "memory backend %s can't be used multiple times.",
> > +                       object_get_canonical_path_component(OBJECT(ct3d->dc.host_dc)));
> > +            return false;
> > +        }
> >          /* FIXME: set dc as nonvolatile for now */
> >          memory_region_set_nonvolatile(dc_mr, true);
> >          memory_region_set_enabled(dc_mr, true);
> >
> >
> >
> >
> >
> > > ~Gregory
> >
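
For anyone skimming the quoted patch: the guard it adds is a plain
"check the mapped flag before claiming the backend" pattern. Below is a
minimal standalone sketch of that idea, not QEMU code. ToyBackend and
toy_backend_claim() are hypothetical names made up for illustration; the
real helpers are the host_memory_backend_is_mapped() and
host_memory_backend_set_mapped() calls visible in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a memory backend; NOT QEMU's HostMemoryBackend. */
typedef struct {
    const char *id;   /* name given on the command line, e.g. "cxl-mem1" */
    bool mapped;      /* true once some consumer has claimed the backend */
} ToyBackend;

/*
 * Claim a backend for exclusive use by one consumer.
 * Returns true on success. If the backend is already claimed, returns
 * false, leaves the backend untouched, and writes a message into err
 * when a buffer is supplied.
 */
static bool toy_backend_claim(ToyBackend *backend, char *err, size_t err_len)
{
    assert(backend != NULL);

    if (backend->mapped) {
        if (err != NULL && err_len > 0) {
            snprintf(err, err_len,
                     "memory backend %s can't be used multiple times.",
                     backend->id);
        }
        return false;
    }

    backend->mapped = true;
    return true;
}
```

In this toy model, the first consumer (say a NUMA node) claims the
backend and succeeds; a second consumer (say a CXL type 3 device) asking
for the same backend gets the error instead of a silently shared mapping.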