From: Itaru Kitayama <itaru.kitayama@linux.dev>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: qemu-devel@nongnu.org, "Fan Ni" <fan.ni@samsung.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	mst@redhat.com, linux-cxl@vger.kernel.org, linuxarm@huawei.com,
	qemu-arm@nongnu.org,
	"Yuquan Wang" <wangyuquan1236@phytium.com.cn>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: Re: [PATCH v13 0/5] arm/virt: CXL support via pxb_cxl
Date: Fri, 16 May 2025 11:30:49 +0900
Message-ID: <1DF02466-C91E-461E-B35F-D42CEE9F040D@linux.dev>
In-Reply-To: <20250513111455.128266-1-Jonathan.Cameron@huawei.com>

Hi Jonathan,

> On May 13, 2025, at 20:14, Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> V13:
> - Make CXL fixed memory windows sysbus devices.
>  IIRC this was requested by Peter in one of the reviews a long time back
>  but at the time the motivation was less strong than it becomes with some
>  WiP patches for hotness monitoring and high performance direct connect
>  where we need a machine type independent way to iterate all the CXL
>  fixed memory windows. This is a convenient place to do it so drag that
>  work forward into this series.
> 
>  This allows us to drop the separate list and the machine specific
>  access code it required in favour of
>  object_child_foreach_recursive(object_get_root(),...)
>  One snag is that the ordering of multiple fixed memory windows in that
>  walk depends on the underlying g_hash_table iterations rather than the
>  order of creation. In the memory map layout and ACPI table creation we
>  need both stable and predictable ordering. Resolve this in a similar
>  fashion to object_class_get_list_sorted() by throwing them in a GSList
>  and sorting that. Only use this when a sorted list is needed.
> 
>  Dropped RFC as now I'm happy with this code and would like to get it
>  upstream!  Particularly as it is broken even today due to emscripten
>  related changes that stop us using g_slist_sort(). Easy fix though.
> 
> Note that we have an issue for CXL emulation in general and TCG which
> is being discussed in:
> https://lore.kernel.org/all/20250425183524.00000b28@huawei.com/
> (also affects some other platforms)
> 
> Until that is resolved, either rebase this back on 10.0 or just
> don't let code run out of it (don't use KMEM to expose it as normal
> memory, use DAX instead).
> 
> Previous cover letter.
> 
> Back in 2022, this series stalled on the absence of a solution to device
> tree support for PCI Expander Bridges (PXB) and we ended up only having
> x86 support upstream. I've been carrying the arm64 support out of tree
> since then, with occasional nasty surprises (e.g. UNIMP + DT issue seen
> a few weeks ago) and a fair number of fiddly rebases.
> gitlab.com/jic23/qemu cxl-<latest date>
> 
> A recent discussion with Peter Maydell indicated that there are various
> other ACPI only features now, so in general he might be more relaxed
> about DT support being necessary. The upcoming vSMMUv3 support would
> run into this problem as well.
> 
> I presented the background to the PXB issue at Linaro connect 2022. In
> short the issue is that PXBs steal MMIO space from the main PCI root
> bridge. The challenge is knowing how much to steal.
> 
> On ACPI platforms, we can rely on EDK2 to perform an enumeration and
> configuration of the PCI topology and QEMU can update the ACPI tables
> after EDK2 has done this when it can simply read the space used by the
> root ports. On device tree, there is no entity to figure out that
> enumeration so we don't know how to size the stolen region.
> 
> Three approaches were discussed:
> 1) Enumerating in QEMU. Horribly complex and the last thing we want is a
>   3rd enumeration implementation that ends up out of sync with EDK2 and
>   the kernel (there are frequent issues because of how those existing
>   implementations differ).
> 2) Figure out how to enumerate in kernel. I never put a huge amount of work
>   into this, but it seemed likely to involve a nasty dance with similar
>   very specific code to that EDK2 is carrying and would be very challenging
>   to upstream (given the lack of clarity on real use cases for PXBs and
>   DT).
> 3) Hack it based on the control we have which is bus numbers.
>   No one liked this but it worked :)
> 
> The other little wrinkle would be the need to define full bindings for CXL
> on DT + implement a fairly complex kernel stack as equivalent in ACPI
> involves a static table, CEDT, new runtime queries via _DSM and a description
> of various components. Doable, but so far there is no interest on physical
> platforms. Worth noting that for now, the QEMU CXL emulation is all about
> testing and developing the OS stack, not about virtualization (performance
> is terrible except in some very contrived situations!)
> 
> Back to posting as an RFC because there was some discussion of approach to
> modelling the devices that may need a bit of redesign.
> The discussion kind of died out on the back of DT issue and I doubt anyone
> can remember the details.
> 
> https://lore.kernel.org/qemu-devel/20220616141950.23374-1-Jonathan.Cameron@huawei.com/
> 
> There is only a very simple test in here, because my intent is not to
> duplicate what we have on x86, but just to do a smoke test that everything
> is hooked up.  In general we need much more comprehensive end to end CXL
> tests but that requires a reasonably stable guest software stack. A few
> people have expressed interest in working on that, but we aren't there yet.
> 
> Note that this series has a very different use case to that in the proposed
> SBSA-ref support:
> https://lore.kernel.org/qemu-devel/20250117034343.26356-1-wangyuquan1236@phytium.com.cn/
> 
> SBSA-ref is a good choice if you want a relatively simple mostly fixed
> configuration.  That works well with the limited host system
> discoverability etc as EDK2 can be built against a known configuration.
> 
> My interest with this support in arm/virt is to support host software stack
> development (we have a wide range of contributors, most of whom are working
> on emulation + the kernel support). I care about the weird corners. As such
> I need to be able to bring up variable numbers of host bridges, multiple CXL
> Fixed Memory Windows with varying characteristics (interleave etc), complex
> NUMA topologies with weird performance characteristics etc. We can do that
> on x86 upstream today, or my gitlab tree. Note that we need arm support
> for some arch specific features in the near future (cache flushing).
> Doing kernel development with this need for flexibility on SBSA-ref is not
> currently practical. SBSA-ref CXL support is an excellent thing, just
> not much use to me for this work.
> 
> Jonathan Cameron (5):
>  hw/cxl-host: Add an index field to CXLFixedMemoryWindow
>  hw/cxl: Make the CXL fixed memory windows devices.
>  hw/cxl-host: Allow split of establishing memory address and mmio
>    setup.
>  hw/arm/virt: Basic CXL enablement on pci_expander_bridge instances
>    pxb-cxl
>  qtest/cxl: Add aarch64 virt test for CXL
> 
> include/hw/arm/virt.h     |   4 +
> include/hw/cxl/cxl.h      |   4 +
> include/hw/cxl/cxl_host.h |   6 +-
> hw/acpi/cxl.c             |  83 +++++++++------
> hw/arm/virt-acpi-build.c  |  34 ++++++
> hw/arm/virt.c             |  29 +++++
> hw/cxl/cxl-host-stubs.c   |   8 +-
> hw/cxl/cxl-host.c         | 218 ++++++++++++++++++++++++++++++++------
> hw/i386/pc.c              |  51 ++++-----
> tests/qtest/cxl-test.c    |  59 ++++++++---
> tests/qtest/meson.build   |   1 +
> 11 files changed, 389 insertions(+), 108 deletions(-)
> 
> -- 
> 2.43.0
> 

With your series applied on top of upstream QEMU, the -drive option does not work with the sane CXL
setup (I use run_qemu.sh, maintained by Marc et al. at Intel); see below:

/home/realm/projects/qemu/build/qemu-system-aarch64 -machine virt,accel=tcg,cxl=on,highmem=on,compact-highmem=on,highmem-ecam=on,highmem-mmio=on -m 2048M,slots=0,maxmem=6144M -smp 2,sockets=1,cores=2,threads=1 -display none -nographic -drive if=pflash,format=raw,unit=0,file=AAVMF_CODE.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=AAVMF_VARS.fd -drive file=root.img,format=raw,media=disk -kernel mkosi.extra/boot/vmlinuz-6.15.0-rc4-00040-g128ad8fa385b -initrd mkosi.extra/boot/initramfs-6.15.0-rc4-00040-g128ad8fa385b.img -append selinux=0 audit=0 console=tty0 console=ttyS0 root=PARTUUID=14d6bae9-c917-435d-89ea-99af1fa4439a ignore_loglevel rw initcall_debug log_buf_len=20M memory_hotplug.memmap_on_memory=force cxl_acpi.dyndbg=+fplm cxl_pci.dyndbg=+fplm cxl_core.dyndbg=+fplm cxl_mem.dyndbg=+fplm cxl_pmem.dyndbg=+fplm cxl_port.dyndbg=+fplm cxl_region.dyndbg=+fplm cxl_test.dyndbg=+fplm cxl_mock.dyndbg=+fplm cxl_mock_mem.dyndbg=+fplm systemd.set_credential=agetty.autologin:root systemd.set_credential=login.noauth:yes -device e1000,netdev=net0,mac=52:54:00:12:34:56 -netdev user,id=net0,hostfwd=tcp::10022-:22 -cpu max -object memory-backend-file,id=cxl-mem0,share=on,mem-path=cxltest0.raw,size=256M -object memory-backend-file,id=cxl-mem1,share=on,mem-path=cxltest1.raw,size=256M -object memory-backend-file,id=cxl-mem2,share=on,mem-path=cxltest2.raw,size=256M -object memory-backend-file,id=cxl-mem3,share=on,mem-path=cxltest3.raw,size=256M -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=lsa0.raw,size=128K -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=lsa1.raw,size=128K -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=lsa2.raw,size=128K -object memory-backend-file,id=cxl-lsa3,share=on,mem-path=lsa3.raw,size=128K -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=53 -device pxb-cxl,id=cxl.1,bus=pcie.0,bus_nr=191 -device cxl-rp,id=hb0rp0,bus=cxl.0,chassis=0,slot=0,port=0 -device cxl-rp,id=hb0rp1,bus=cxl.0,chassis=0,slot=1,port=1 -device 
cxl-rp,id=hb1rp0,bus=cxl.1,chassis=0,slot=2,port=0 -device cxl-rp,id=hb1rp1,bus=cxl.1,chassis=0,slot=3,port=1 -device cxl-upstream,port=4,bus=hb0rp0,id=cxl-up0,multifunction=on,addr=0.0,sn=12345678 -device cxl-switch-mailbox-cci,bus=hb0rp0,addr=0.1,target=cxl-up0 -device cxl-upstream,port=4,bus=hb1rp0,id=cxl-up1,multifunction=on,addr=0.0,sn=12341234 -device cxl-switch-mailbox-cci,bus=hb1rp0,addr=0.1,target=cxl-up1 -device cxl-downstream,port=0,bus=cxl-up0,id=swport0,chassis=0,slot=4 -device cxl-downstream,port=1,bus=cxl-up0,id=swport1,chassis=0,slot=5 -device cxl-downstream,port=2,bus=cxl-up0,id=swport2,chassis=0,slot=6 -device cxl-downstream,port=3,bus=cxl-up0,id=swport3,chassis=0,slot=7 -device cxl-downstream,port=0,bus=cxl-up1,id=swport4,chassis=0,slot=8 -device cxl-downstream,port=1,bus=cxl-up1,id=swport5,chassis=0,slot=9 -device cxl-downstream,port=2,bus=cxl-up1,id=swport6,chassis=0,slot=10 -device cxl-downstream,port=3,bus=cxl-up1,id=swport7,chassis=0,slot=11 -device cxl-type3,bus=swport0,persistent-memdev=cxl-mem0,id=cxl-pmem0,lsa=cxl-lsa0 -device cxl-type3,bus=swport2,persistent-memdev=cxl-mem1,id=cxl-pmem1,lsa=cxl-lsa1 -device cxl-type3,bus=swport4,volatile-memdev=cxl-mem2,id=cxl-vmem2,lsa=cxl-lsa2 -device cxl-type3,bus=swport6,volatile-memdev=cxl-mem3,id=cxl-vmem3,lsa=cxl-lsa3 -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k,cxl-fmw.1.targets.0=cxl.0,cxl-fmw.1.targets.1=cxl.1,cxl-fmw.1.size=4G,cxl-fmw.1.interleave-granularity=8k -snapshot -object memory-backend-ram,id=mem0,size=2048M -numa node,nodeid=0,memdev=mem0, -numa cpu,node-id=0,socket-id=0 -numa dist,src=0,dst=0,val=10
qemu-system-aarch64: -drive file=root.img,format=raw,media=disk: PCI: Only PCI/PCIe bridges can be plugged into pxb-cxl

Plain upstream QEMU with the aarch64 virt machine handles the -drive option without an issue _without_ those CXL setup options added. I think the error was seen with your previous cxl-2025-03-20 branch.
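
One thing that might be worth trying is sketched below. It rests on an assumption I have not verified: that the PCI block device implicitly created for -drive is being auto-assigned to one of the pxb-cxl buses rather than to pcie.0. If that is the case, creating the disk device explicitly and pinning it to the root bus may avoid the error. The id "hd0" is just an illustrative name, and virtio-blk-pci is my guess at a suitable device model:

```shell
# Sketch only: replace the implicit "-drive file=root.img,format=raw,media=disk"
# with an explicit backend + device pair so the bus is not chosen automatically.
# "hd0" is a hypothetical id; pcie.0 is the root bus already named in the command above.
DRIVE_ARGS="-drive if=none,id=hd0,file=root.img,format=raw,media=disk"
DEVICE_ARGS="-device virtio-blk-pci,drive=hd0,bus=pcie.0"

# Print the replacement arguments; splice them into the full command line by hand.
echo "$DRIVE_ARGS $DEVICE_ARGS"
```

If the same error still appears with the device pinned to pcie.0, then the assumption above is wrong and the problem lies elsewhere.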

Thanks,
Itaru. 
