Linux CXL
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Sajjan Rao <sajjanr@gmail.com>
Cc: Dimitrios Palyvos <dimitrios.palyvos@zptcorp.com>,
	<linux-cxl@vger.kernel.org>
Subject: Re: qemu cxl memory expander shows numa_node -1
Date: Wed, 23 Aug 2023 17:50:56 +0100
Message-ID: <20230823175056.00001a84@Huawei.com>
In-Reply-To: <CAAg4PaqgZcTXkWuys7FZjQdRChTkKj-ZnJQCdxpTMCxy4Hghow@mail.gmail.com>

On Wed, 23 Aug 2023 16:43:13 +0530
Sajjan Rao <sajjanr@gmail.com> wrote:

> Thank you Dimitrios. That worked!
> 
> On Mon, Aug 21, 2023 at 4:23 PM Dimitrios Palyvos
> <dimitrios.palyvos@zptcorp.com> wrote:
> >
> > Hi,
> >
> > Ah yes, I believe you need to enable the kernel config option
> > CONFIG_CXL_REGION_INVALIDATION_TEST for the region creation to work in
> > QEMU. The help entry of that config option gives more info on the why.
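
One way to check for that option in a guest kernel is sketched below. It assumes the kernel config is available as a plain-text file (for example /boot/config-$(uname -r), which not every distribution ships). config_enabled is a hypothetical helper name used only for illustration; it is not part of ndctl or the kernel tooling.

```shell
# Hypothetical helper: succeed if OPTION is set to y or m in a kernel
# config file.
# Usage: config_enabled FILE OPTION
config_enabled() {
    grep -Eq "^$2=(y|m)\$" "$1"
}

# Example call (assumes the distribution installs the config under /boot):
#   config_enabled "/boot/config-$(uname -r)" CONFIG_CXL_REGION_INVALIDATION_TEST \
#       && echo "option enabled"
```

If the option is missing, the kernel has to be rebuilt with it enabled; the helper only reports the current state.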
> >
> > Hope that helps!
> >
> > Kind regards,
> > Dimitris
> >
> >
> > On Mon, Aug 21, 2023 at 12:01 PM Sajjan Rao <sajjanr@gmail.com> wrote:  
> > >
> > > Hello Dimitrios,
> > >
> > > Thank you for the pointers. I have the 6.4.10 kernel and modified the
> > > qemu options, but now I see an error creating the region.
> > > Is there anything else I missed?
> > >
> > > [root@cxl-test /]# cxl create-region -d decoder0.0 -s 268435456 -t ram
> > > [ 4144.982608] cxl region0: Failed to synchronize CPU cache state
> > > > cxl region: create_region: region0: failed to commit decode: No such device or address
> > >
> > > Thanks,
> > > Sajjan
> > >
> > > -- qemu
> > >
> > > qemu-system-x86_64 \
> > >  -hda /var/lib/libvirt/images/CXL-Test_1.qcow2 \
> > >  -machine type=q35,cxl=on \
> > >  -m 4G \
> > >  -smp cpus=2 \
> > >  -accel tcg,thread=single \
> > >  -object memory-backend-ram,size=4G,id=m0 \
> > >  -object memory-backend-ram,size=256M,id=cxl-mem1 \
> > >  -numa node,memdev=m0,cpus=0-1,nodeid=0 \
> > >  -netdev user,id=net0,net=192.168.0.0/24,dhcpstart=192.168.0.9  \
> > >  -device virtio-net-pci,netdev=net0 \
> > >  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
> > >  -device cxl-rp,port=0,bus=cxl.1,id=cxl_rp_port0,chassis=0,slot=2 \
> > >  -device cxl-type3,bus=cxl_rp_port0,volatile-memdev=cxl-mem1,id=cxl-mem1 \
> > >  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G \
> > >  -nographic
> > >
> > > -----
> > >
> > > [root@cxl-test /]# uname -r
> > > 6.4.10-200.fc38.x86_64
> > > [root@cxl-test /]# cxl list
> > > [
> > >   {
> > >     "memdev":"mem0",
> > >     "ram_size":268435456,
> > >     "serial":0,
> > >     "host":"0000:0d:00.0"
> > >   }
> > > ]
> > > [root@cxl-test /]# cxl create-region -d decoder0.0 -s 268435456 -t ram
> > > [ 4144.982608] cxl region0: Failed to synchronize CPU cache state
> > > > cxl region: create_region: region0: failed to commit decode: No such device or address
> > >
> > > [root@cxl-test /]#
> > >
> > > On Fri, Aug 18, 2023 at 8:31 PM Dimitrios Palyvos
> > > <dimitrios.palyvos@zptcorp.com> wrote:  
> > > >
> > > > Hi,
> > > >
> > > > I am not an expert (and not 100% sure if that's what you want to do),
> > > > but here's one way to get your configuration to work:
> > > > 1. Disable KVM.

Just to second this: don't use KVM and expect it to work with CXL emulation
if you are using kmem to present the CXL memory as normal memory. KVM should
be fine only as long as you never run instructions resident in that memory.

If you do, it will crash in nasty ways due to various issues with instruction
emulation when code is running out of memory behind the emulated interleave
decoders.

So far we haven't cared enough to add the complexity that would be needed
to make that work.

TCG is the way to go for now.
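
For checking a command line against this advice, here is a minimal sketch. It assumes the only KVM spellings that matter are the ones seen in this thread: -enable-kvm, accel=kvm inside -machine or -M, and -accel kvm. qemu_uses_kvm is a hypothetical helper name, not a QEMU or ndctl tool.

```shell
# Hypothetical helper: succeed if a QEMU argument list asks for KVM.
# Recognises -enable-kvm, "-accel kvm[,...]" and "accel=kvm" given as one
# of the comma-separated properties of -machine / -M.
qemu_uses_kvm() {
    local prev="" arg
    for arg in "$@"; do
        case "$arg" in
            -enable-kvm) return 0 ;;
        esac
        case "$prev" in
            -accel)
                case "$arg" in kvm|kvm,*) return 0 ;; esac ;;
            -machine|-M)
                case ",$arg," in *,accel=kvm,*) return 0 ;; esac ;;
        esac
        prev="$arg"
    done
    return 1
}

# Example call:
#   qemu_uses_kvm -machine type=q35,cxl=on -accel tcg,thread=single \
#       || echo "no KVM requested"
```

The check is purely textual; it does not know which accelerator QEMU picks by default when none is named.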

Jonathan

> > > > 2. Remove the CXL NUMA node from the QEMU command.
> > > > 3. Use the ndctl utilities in the guest to initialize your CXL memory
> > > > and associated NUMA node.
> > > >
> > > > More specifically, I changed your QEMU command as follows:
> > > >
> > > >     qemu-system-x86_64 \
> > > >      -hda /var/lib/libvirt/images/CXL-Test_1.qcow2 \
> > > >      -machine type=q35,cxl=on \
> > > >      -m 8G \
> > > >      -smp cpus=8 \
> > > >      -object memory-backend-ram,size=4G,id=m0 \
> > > >      -object memory-backend-ram,size=4G,id=m1 \
> > > >      -object memory-backend-ram,size=2G,id=cxl-mem1 \
> > > >      -numa node,memdev=m0,cpus=0-3,nodeid=0 \
> > > >      -numa node,memdev=m1,cpus=4-7,nodeid=1 \
> > > >      -netdev user,id=net0,net=192.168.0.0/24,dhcpstart=192.168.0.9  \
> > > >      -device virtio-net-pci,netdev=net0 \
> > > >      -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
> > > >      -device cxl-rp,port=0,bus=cxl.1,id=cxl_rp_port0,chassis=0,slot=2 \
> > > >      -device cxl-type3,bus=cxl_rp_port0,volatile-memdev=cxl-mem1,id=cxl-mem1 \
> > > >      -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G \
> > > >      -nographic
> > > >
> > > > In the guest, install ndctl: https://github.com/pmem/ndctl
> > > >
> > > > After that, you should be able to see the CXL memory:
> > > >     root@cxl-img:~# cxl list
> > > >     [
> > > >       {
> > > >         "memdev":"mem0",
> > > >         "ram_size":2147483648,
> > > >         "serial":0,
> > > >         "host":"0000:0d:00.0"
> > > >       }
> > > >     ]
> > > >
> > > > And initialize it as RAM:
> > > >     root@cxl-img:~# cxl create-region -d decoder0.0 -s 2147483648 -t ram
> > > >     ...
> > > >
> > > > >     root@cxl-img:~# lsmem --output-all
> > > > >     RANGE                                  SIZE  STATE REMOVABLE BLOCK NODE  ZONES
> > > > >     0x0000000000000000-0x0000000007ffffff  128M online       yes     0    0   None
> > > > >     0x0000000008000000-0x000000007fffffff  1.9G online       yes  1-15    0  DMA32
> > > > >     0x0000000100000000-0x000000017fffffff    2G online       yes 32-47    0 Normal
> > > > >     0x0000000180000000-0x000000027fffffff    4G online       yes 48-79    1 Normal
> > > > >     0x0000000290000000-0x000000030fffffff    2G online       yes 82-97    2 Normal
> > > >
> > > >     Memory block size:       128M
> > > >     Total online memory:      10G
> > > >     Total offline memory:      0B
> > > >
> > > >
> > > >     root@cxl-img:~# cat /proc/iomem
> > > >     ...
> > > >     290000000-38fffffff : CXL Window 0
> > > >       290000000-30fffffff : region0
> > > >         290000000-30fffffff : dax0.0
> > > >           290000000-30fffffff : System RAM (kmem)
> > > >
> > > >
> > > > Then you can generate traffic in the CXL NUMA node, for example:
> > > >
> > > >     root@cxl-img:~# numactl --membind 2 ls
> > > >
> > > > Note: The above is with linux v6.4.11.
> > > >
> > > > Hope that helps!
> > > >
> > > > Kind regards,
> > > > Dimitris
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Aug 18, 2023 at 11:39 AM Sajjan Rao <sajjanr@gmail.com> wrote:  
> > > > >
> > > > > Hello,
> > > > >
> > > > > I have a qemu + cxl configuration coming up with one configured type 3
> > > > > device. My goal is to generate some cxl.mem traffic in this
> > > > > configuration.
> > > > > > However, the numa_node always shows as -1. I have tried various
> > > > > > qemu command line parameters, including explicitly setting numa_node
> > > > > > for cxl devices.
> > > > >
> > > > > Here is my qemu command line
> > > > > --------
> > > > > qemu-system-x86_64 \
> > > > >  -hda /var/lib/libvirt/images/CXL-Test_1.qcow2 \
> > > > >  -machine type=q35,accel=kvm,cxl=on \
> > > > >  -m 10G \
> > > > >  -smp cpus=8 \
> > > > >  -object memory-backend-ram,size=4G,id=m0 \
> > > > >  -object memory-backend-ram,size=4G,id=m1 \
> > > > >  -object memory-backend-ram,size=2G,id=cxl-mem1 \
> > > > >  -numa node,memdev=m0,cpus=0-1,nodeid=0 \
> > > > >  -numa node,memdev=m1,cpus=2-3,nodeid=1 \
> > > > >  -numa node,memdev=cxl-mem1,cpus=4-7,nodeid=2 \
> > > > >  -netdev user,id=net0,net=192.168.0.0/24,dhcpstart=192.168.0.9  \
> > > > >  -device virtio-net-pci,netdev=net0 \
> > > > >  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
> > > > >  -device cxl-rp,port=0,bus=cxl.1,id=cxl_rp_port0,chassis=0,slot=2 \
> > > > >  -device cxl-type3,bus=cxl_rp_port0,volatile-memdev=cxl-mem1,id=cxl-mem1 \
> > > > >  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G \
> > > > >  -enable-kvm \
> > > > >  -nographic
> > > > > -----
> > > > >
> > > > > I see that the cxl device is listed in lspci output
> > > > > ------
> > > > > #lspci | grep -i cxl
> > > > > 0d:00.0 CXL: Intel Corporation Device 0d93 (rev 01)
> > > > >
> > > > > #lspci -s 0d:00.0 -vvv | grep -i numa
> > > > > #
> > > > >
> > > > > -------
> > > > >
> > > > > sysfs output
> > > > > ----------
> > > > > #cat /sys/bus/cxl/devices/mem0/numa_node
> > > > > -1
> > > > > --------
> > > > >
> > > > > numactl output
> > > > >
> > > > > ------------------
> > > > > #numactl -H
> > > > > available: 3 nodes (0-2)
> > > > > node 0 cpus: 0 1
> > > > > node 0 size: 3910 MB
> > > > > node 0 free: 3776 MB
> > > > > node 1 cpus: 2 3
> > > > > node 1 size: 4031 MB
> > > > > node 1 free: 3927 MB
> > > > > node 2 cpus: 4 5 6 7
> > > > > node 2 size: 2011 MB
> > > > > node 2 free: 1785 MB
> > > > > node distances:
> > > > > node   0   1   2
> > > > >   0:  10  20  20
> > > > >   1:  20  10  20
> > > > >   2:  20  20  10
> > > > > -------------------
> > > > >
> > > > > > NUMA node 2 is expected to be mapped to the CXL device. I do see
> > > > > > some activity in numastat output, but it's unclear whether this is
> > > > > > really mapped to the CXL device, since the device itself reports
> > > > > > numa_node as -1 (expected to show 2).
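
To see what every CXL memdev reports, rather than only mem0, a sketch along these lines may help. It assumes the sysfs layout used above (/sys/bus/cxl/devices/memN/numa_node). cxl_memdev_nodes is a hypothetical helper name; it only prints what sysfs says and does not explain the value.

```shell
# Hypothetical helper: print "<memdev> <numa_node>" for every CXL memdev
# under a sysfs-style directory (default: /sys/bus/cxl/devices).
cxl_memdev_nodes() {
    local root="${1:-/sys/bus/cxl/devices}" dev
    for dev in "$root"/mem*; do
        [ -r "$dev/numa_node" ] || continue
        printf '%s %s\n' "${dev##*/}" "$(cat "$dev/numa_node")"
    done
}

# Example call in the guest:
#   cxl_memdev_nodes
```

The optional directory argument exists only so the helper can be tried against a scratch directory instead of the real sysfs tree.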
> > > > >
> > > > > Has anybody seen this behavior? Any help will be greatly appreciated.
> > > > >
> > > > > Thanks,
> > > > > Sajjan  
> > > >
> > > > --
> > > > > CONFIDENTIALITY NOTICE:
> > > > > The contents of this email message and any
> > > > > attachments are intended solely for the addressee(s) and may contain
> > > > > confidential and/or privileged information and may be legally protected
> > > > > from disclosure. If you are not the intended recipient of this message or
> > > > > their agent, or if this message has been addressed to you in error, please
> > > > > immediately alert the sender by reply email and then delete this message
> > > > > and any attachments. If you are not the intended recipient, you are hereby
> > > > > notified that any use, dissemination, copying, or storage of this message
> > > > > or its attachments is strictly prohibited.
> >
> >
> >
> > --
> >
> > > Dimitrios Palyvos-Giannas, PhD
> >
> > Software Engineer
> >
> > ZeroPoint Technologies
> >
> > Remove the waste.
> >
> > Release the power.
> >
> > --
> > **CONFIDENTIALITY NOTICE:*
> > *
> > *The contents of this email message and any
> > attachments are intended solely for the addressee(s) and may contain
> > confidential and/or privileged information and may be legally protected
> > from disclosure. If you are not the intended recipient of this message or
> > their agent, or if this message has been addressed to you in error, please
> > immediately alert the sender by reply email and then delete this message
> > and any attachments. If you are not the intended recipient, you are hereby
> > notified that any use, dissemination, copying, or storage of this message
> > or its attachments is strictly prohibited. *  

