From: Gregory Price <gregory.price@memverge.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: qemu-devel@nongnu.org, Michael Tsirkin <mst@redhat.com>,
Ben Widawsky <bwidawsk@kernel.org>,
linux-cxl@vger.kernel.org, linuxarm@huawei.com,
Ira Weiny <ira.weiny@intel.com>,
Gregory Price <gourry.memverge@gmail.com>
Subject: Re: [PATCH 0/8] hw/cxl: CXL emulation cleanups and minor fixes for upstream
Date: Thu, 12 Jan 2023 17:46:27 -0500
Message-ID: <Y8CNw/fZT5fZJZcK@memverge.com>
In-Reply-To: <20230112172130.0000391b@Huawei.com>
On Thu, Jan 12, 2023 at 05:21:30PM +0000, Jonathan Cameron wrote:
> On Thu, 12 Jan 2023 10:39:17 -0500
> Gregory Price <gregory.price@memverge.com> wrote:
>
> > On Wed, Jan 11, 2023 at 02:24:32PM +0000, Jonathan Cameron via wrote:
> > > Gregory's patches were posted as part of his work on adding volatile support.
> > > https://lore.kernel.org/linux-cxl/20221006233702.18532-1-gregory.price@memverge.com/
> > > https://lore.kernel.org/linux-cxl/20221128150157.97724-2-gregory.price@memverge.com/
> > > I might propose this for upstream inclusion this cycle, but testing is
> > > currently limited by lack of suitable kernel support.
> >
> > fwiw the testing i've done suggests the problem isn't necessarily the
> > implementation so much as either the EFI support or the ACPI tables.
> >
> > For example, we see memory expanders come up no problem and turn into
> > volatile memory on real hardware, with the same kernels with just a few
> > commands. My gut feeling is that either a mailbox command is missing or
> > that the ACPI tables are missing/significantly different.
> >
> > I haven't been able to investigate further at this point, but that's my
> > current state with the volatile type-3 device testing.
>
> My assumption was that all shipping hardware platforms were doing the
> enumeration and bring up of memory expanders in the BIOS / firmware.
> Those are then presented to the OS already set up exactly as if they were
> normal memory. We could do the same on QEMU but that means a lot of
> work in EDK2. Note that it makes no sense to do the enumeration and
> creation of ACPI tables in QEMU itself though could hack it like that.
> This stuff is done in firmware because that enables it for legacy
> OSes. Everything is more or less presented to the OS like you would
> present RAM (EFI memory map, ACPI tables etc).
>
> Firmware enumeration doesn't typically support hotplug, so if we add
> support for hotplug of volatile memory type 3 devices to the kernel
> we will also be able to do 'cold plug' and have the kernel bring them up
> in a similar fashion to what we do for non-volatile (for non volatile there
> is typically no real support in firmware as there is a bunch of policy to
> deal with that doesn't belong in firmware). (simplifying heavily ;)
>
> So I don't think we are missing anything in the emulation, just in the
> software layers above it. Could be wrong though ;)
>
> Jonathan
>
>
I'm not so sure something is missing so much as something seems
incorrect in either the ACPI table structure definitions, the mailbox,
or even the DOE emulation.

I took your branch and reverted to just prior to the volatile patch
reference: 59a59ef725699e0efb3e9e31a7f8d246de7286ed

QEMU configuration for boot (please let me know if something is wrong):
sudo /opt/qemu-cxl/bin/qemu-system-x86_64 \
-drive file=/var/lib/libvirt/images/cxl.qcow2,format=qcow2,index=0,media=disk,id=hd \
-m 2G,slots=4,maxmem=4G \
-smp 4 \
-machine type=q35,accel=kvm,cxl=on \
-enable-kvm \
-nographic \
-device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \
-device cxl-rp,id=rp0,bus=cxl.0,chassis=0,slot=0 \
-object memory-backend-file,pmem=true,id=cxl-mem0,mem-path=/tmp/cxl-mem0,size=1G \
-object memory-backend-file,pmem=true,id=lsa0,mem-path=/tmp/cxl-lsa0,size=1G \
-device cxl-type3,bus=rp0,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 \
-M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=1G
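
As a rough cross-check of the ids in that command line, here is a minimal Python sketch that verifies every bus=, memdev=, lsa= and cxl-fmw target refers to an id defined elsewhere in the same invocation. The helper names (parse_props, dangling_refs) are hypothetical, and treating pcie.0 as a built-in bus of the q35 machine is an assumption of the sketch. It only checks naming consistency, not whether QEMU accepts the options.

```python
import re

# Device/object arguments copied from the command line above.
ARGS = [
    "-device", "pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52",
    "-device", "cxl-rp,id=rp0,bus=cxl.0,chassis=0,slot=0",
    "-object", "memory-backend-file,pmem=true,id=cxl-mem0,mem-path=/tmp/cxl-mem0,size=1G",
    "-object", "memory-backend-file,pmem=true,id=lsa0,mem-path=/tmp/cxl-lsa0,size=1G",
    "-device", "cxl-type3,bus=rp0,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0",
    "-M", "cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=1G",
]

# Keys whose value names another object (assumption: only these matter here).
REF_KEYS = ("bus", "memdev", "lsa")
FMW_TARGET = re.compile(r"cxl-fmw\.\d+\.targets\.\d+")


def parse_props(value):
    """Split 'type,k=v,k=v' into a dict; a bare first token is stored as 'type'."""
    props = {}
    for i, tok in enumerate(value.split(",")):
        if "=" in tok:
            key, val = tok.split("=", 1)
            props[key] = val
        elif i == 0:
            props["type"] = tok
    return props


def dangling_refs(args, builtin_ids=("pcie.0",)):
    """Return (option, key, value) for each reference to an id that is never defined."""
    pairs = [(args[i], parse_props(args[i + 1])) for i in range(0, len(args), 2)]
    defined = set(builtin_ids)
    for _, props in pairs:
        if "id" in props:
            defined.add(props["id"])
    missing = []
    for opt, props in pairs:
        for key, val in props.items():
            is_ref = key in REF_KEYS or FMW_TARGET.fullmatch(key)
            if is_ref and val not in defined:
                missing.append((opt, key, val))
    return missing


print(dangling_refs(ARGS))
```

An empty list means every reference resolves to an id defined in the same invocation.
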
After boot we find:
[root@fedora ~]# ls /sys/bus/cxl/devices/
decoder0.0 decoder2.0 mem0 pmem0 root0
decoder1.0 endpoint2 nvdimm-bridge0 port1
[root@fedora ~]# ls -al /sys/bus/dax/devices/
total 0
drwxr-xr-x. 2 root root 0 Jan 12 22:44 .
drwxr-xr-x. 4 root root 0 Jan 12 22:44 ..
During boot, I am seeing three separate call traces, all of which appear
to be related to PCI DOE and/or getting CDAT information.
[ 3.916900] Call Trace:
[ 3.916906] <TASK>
[ 3.931217] pci_doe_submit_task+0x5d/0xd0
[ 3.936609] pci_doe_discovery+0xb4/0x100
[ 3.936627] ? pci_doe_xa_destroy+0x10/0x10
[ 3.942675] pcim_doe_create_mb+0x219/0x290
[ 3.950506] cxl_pci_probe+0x192/0x430
[ 3.960248] local_pci_probe+0x41/0x80
[ 3.966564] pci_device_probe+0xb3/0x220
[ 3.966579] really_probe+0xde/0x380
[ 3.966583] ? pm_runtime_barrier+0x50/0x90
[ 3.969158] __driver_probe_device+0x78/0x170
[ 3.969167] driver_probe_device+0x1f/0x90
[ 3.978264] __driver_attach_async_helper+0x5c/0xe0
[ 3.983953] async_run_entry_fn+0x30/0x130
[ 3.991084] process_one_work+0x294/0x5b0
[ 4.004458] worker_thread+0x4f/0x3a0
[ 4.012612] ? process_one_work+0x5b0/0x5b0
[ 4.019114] kthread+0xf5/0x120
[ 4.025133] ? kthread_complete_and_exit+0x20/0x20
[ 4.031327] ret_from_fork+0x22/0x30
[ 4.038969] </TASK>
[ 16.047704] pci_doe_submit_task+0x5d/0xd0
[ 16.047713] cxl_cdat_get_length+0xb8/0x110
[ 16.047779] ? dvsec_range_allowed+0x60/0x60
[ 16.047803] read_cdat_data+0xaf/0x1a0
[ 16.047814] cxl_port_probe+0x80/0x120
[ 16.047824] cxl_bus_probe+0x17/0x50
[ 16.047830] really_probe+0xde/0x380
[ 16.047835] ? pm_runtime_barrier+0x50/0x90
[ 16.047843] __driver_probe_device+0x78/0x170
[ 16.047851] driver_probe_device+0x1f/0x90
[ 16.047858] __device_attach_driver+0x85/0x110
[ 16.047881] ? driver_allows_async_probing+0x70/0x70
[ 16.047884] bus_for_each_drv+0x7a/0xb0
[ 16.047896] __device_attach+0xb3/0x1d0
[ 16.047907] bus_probe_device+0x9f/0xc0
[ 16.047913] device_add+0x41e/0x9b0
[ 16.047918] ? kobject_set_name_vargs+0x6d/0x90
[ 16.047928] ? dev_set_name+0x4b/0x60
[ 16.047944] devm_cxl_add_port+0x27b/0x3b0
[ 16.047970] devm_cxl_add_endpoint+0x82/0x130
[ 16.047982] cxl_mem_probe+0xc4/0x11d [cxl_mem]
[ 16.047997] cxl_bus_probe+0x17/0x50
[ 16.048003] really_probe+0xde/0x380
[ 16.048007] ? pm_runtime_barrier+0x50/0x90
[ 16.048014] __driver_probe_device+0x78/0x170
[ 16.048022] driver_probe_device+0x1f/0x90
[ 16.048029] __driver_attach+0xd5/0x1d0
[ 16.048036] ? __device_attach_driver+0x110/0x110
[ 16.048040] bus_for_each_dev+0x76/0xa0
[ 16.048051] bus_add_driver+0x1b1/0x200
[ 16.048061] driver_register+0x89/0xe0
[ 16.048066] ? 0xffffffffc056e000
[ 16.048070] do_one_initcall+0x6e/0x320
[ 16.048091] do_init_module+0x4a/0x200
[ 16.048099] __do_sys_init_module+0x16a/0x1a0
[ 16.048132] do_syscall_64+0x5b/0x80
[ 16.048138] ? lock_is_held_type+0xe8/0x140
[ 16.048148] ? asm_exc_page_fault+0x22/0x30
[ 16.048156] ? lockdep_hardirqs_on+0x7d/0x100
[ 16.048162] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 16.054601] pci_doe_submit_task+0x5d/0xd0
[ 16.054610] cxl_cdat_read_table.isra.0+0x141/0x190
[ 16.054660] ? dvsec_range_allowed+0x60/0x60
[ 16.054685] read_cdat_data+0xfc/0x1a0
[ 16.054695] cxl_port_probe+0x80/0x120
[ 16.054706] cxl_bus_probe+0x17/0x50
[ 16.054712] really_probe+0xde/0x380
[ 16.054717] ? pm_runtime_barrier+0x50/0x90
[ 16.054725] __driver_probe_device+0x78/0x170
[ 16.054733] driver_probe_device+0x1f/0x90
[ 16.054739] __device_attach_driver+0x85/0x110
[ 16.054747] ? driver_allows_async_probing+0x70/0x70
[ 16.054751] bus_for_each_drv+0x7a/0xb0
[ 16.054767] __device_attach+0xb3/0x1d0
[ 16.054782] bus_probe_device+0x9f/0xc0
[ 16.054791] device_add+0x41e/0x9b0
[ 16.054798] ? kobject_set_name_vargs+0x6d/0x90
[ 16.054811] ? dev_set_name+0x4b/0x60
[ 16.054831] devm_cxl_add_port+0x27b/0x3b0
[ 16.054843] devm_cxl_add_endpoint+0x82/0x130
[ 16.054854] cxl_mem_probe+0xc4/0x11d [cxl_mem]
[ 16.054869] cxl_bus_probe+0x17/0x50
[ 16.054875] really_probe+0xde/0x380
[ 16.054879] ? pm_runtime_barrier+0x50/0x90
[ 16.054887] __driver_probe_device+0x78/0x170
[ 16.054894] driver_probe_device+0x1f/0x90
[ 16.054901] __driver_attach+0xd5/0x1d0
[ 16.054908] ? __device_attach_driver+0x110/0x110
[ 16.054912] bus_for_each_dev+0x76/0xa0
[ 16.054923] bus_add_driver+0x1b1/0x200
[ 16.055204] driver_register+0x89/0xe0
[ 16.055211] ? 0xffffffffc056e000
[ 16.055215] do_one_initcall+0x6e/0x320
[ 16.055237] do_init_module+0x4a/0x200
[ 16.055245] __do_sys_init_module+0x16a/0x1a0
[ 16.055277] do_syscall_64+0x5b/0x80
[ 16.055283] ? lock_is_held_type+0xe8/0x140
[ 16.055294] ? asm_exc_page_fault+0x22/0x30
[ 16.055301] ? lockdep_hardirqs_on+0x7d/0x100
[ 16.055307] entry_SYSCALL_64_after_hwframe+0x63/0xcd
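
To back up the claim that all three traces go through PCI DOE, here is a minimal Python sketch that pulls the immediate caller of pci_doe_submit_task out of a dmesg excerpt. The function name doe_callers and the frame regex are assumptions for illustration, and SAMPLE is a handful of lines copied from the traces above rather than the full log. Frames prefixed with '?' are skipped because the unwinder marks them as unreliable.

```python
import re

# One stack frame in dmesg output: "[ ts] [? ]symbol+0xoff/0xsize"
FRAME = re.compile(r"\[\s*\d+\.\d+\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# A few lines copied from the three traces above.
SAMPLE = """\
[ 3.931217] pci_doe_submit_task+0x5d/0xd0
[ 3.936609] pci_doe_discovery+0xb4/0x100
[ 3.936627] ? pci_doe_xa_destroy+0x10/0x10
[ 16.047704] pci_doe_submit_task+0x5d/0xd0
[ 16.047713] cxl_cdat_get_length+0xb8/0x110
[ 16.047779] ? dvsec_range_allowed+0x60/0x60
[ 16.054601] pci_doe_submit_task+0x5d/0xd0
[ 16.054610] cxl_cdat_read_table.isra.0+0x141/0x190
"""


def doe_callers(log, target="pci_doe_submit_task"):
    """For each reliable `target` frame, return the next reliable frame (its caller)."""
    frames = []
    for line in log.splitlines():
        m = FRAME.search(line)
        if m and not m.group(1):  # drop '?' frames
            frames.append(m.group(2))
    return [frames[i + 1] for i, name in enumerate(frames[:-1]) if name == target]


for caller in doe_callers(SAMPLE):
    print(caller)
```

On the sample this lists pci_doe_discovery, cxl_cdat_get_length and cxl_cdat_read_table.isra.0, i.e. one DOE discovery path and two CDAT read paths.
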