From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: Thomas Huth <thuth@redhat.com>
Cc: "Alex Bennée" <alex.bennee@linaro.org>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Peter Maydell" <peter.maydell@linaro.org>,
"Ben Widawsky" <ben.widawsky@intel.com>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
qemu-devel@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
Date: Mon, 3 Apr 2023 17:12:32 +0100 [thread overview]
Message-ID: <20230403171232.000020bb@huawei.com> (raw)
In-Reply-To: <0ce0c313-61af-213f-96f6-16ab5f137b0f@redhat.com>
On Fri, 24 Mar 2023 11:17:50 +0100
Thomas Huth <thuth@redhat.com> wrote:
> On 07/12/2022 14.26, Thomas Huth wrote:
> > On 07/12/2022 14.21, Jonathan Cameron wrote:
> >> On Mon, 05 Dec 2022 14:59:39 +0000
> >> Alex Bennée <alex.bennee@linaro.org> wrote:
> >>
> >>> Jonathan Cameron via <qemu-devel@nongnu.org> writes:
> >>>
> >>>> On Mon, 5 Dec 2022 10:54:03 +0000
> >>>> Jonathan Cameron via <qemu-devel@nongnu.org> wrote:
> >>>>> On Sun, 4 Dec 2022 08:23:55 +0100
> >>>>> Thomas Huth <thuth@redhat.com> wrote:
> >>>>>> On 04/11/2022 07.47, Thomas Huth wrote:
> >>>>>>> On 16/06/2022 18.57, Michael S. Tsirkin wrote:
> >>>>>>>> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >>>>>>>>
> >>>>>>>> Emulation of a simple CXL Switch downstream port.
> >>>>>>>> The Device ID has been allocated for this use.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >>>>>>>> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
> >>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>>>>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> >>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>>>>>>> ---
> >>>>>>>> hw/cxl/cxl-host.c | 43 +++++-
> >>>>>>>> hw/pci-bridge/cxl_downstream.c | 249
> >>>>>>>> +++++++++++++++++++++++++++++++++
> >>>>>>>> hw/pci-bridge/meson.build | 2 +-
> >>>>>>>> 3 files changed, 291 insertions(+), 3 deletions(-)
> >>>>>>>> create mode 100644 hw/pci-bridge/cxl_downstream.c
> >>>>>>>
> >>>>>>> Hi!
> >>>>>>>
> >>>>>>> There is a memory problem somewhere in this new device. I can make QEMU
> >>>>>>> crash by running something like this:
> >>>>>>>
> >>>>>>> $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
> >>>>>>> -display none -monitor stdio
> >>>>>>> QEMU 7.1.50 monitor - type 'help' for more information
> >>>>>>> (qemu) device_add cxl-downstream
> >>>>>>> ./qemu/qom/object.c:1188:5: runtime error: member access within
> >>>>>>> misaligned
> >>>>>>> address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8
> >>>>>>> byte
> >>>>>>> alignment
> >>>>>>> 0x3b3b3b3b3b3b3b3b: note: pointer points here
> >>>>>>> <memory cannot be printed>
> >>>>>>> Bus error (core dumped)
> >>>>>>>
> >>>>>>> Could you have a look if you've got some spare minutes?
> >>>>>>
> >>>>>> Ping! Jonathan, Michael, any news on this bug?
> >>>>>>
> >>>>>> (this breaks one of my local tests, that's why it's annoying for me)
> >>>>> Sorry, my email filters ate your earlier message.
> >>>>>
> >>>>> Looking into this now. I'll note that it also happens on
> >>>>> device_add xio3130-downstream so not specific to this new device.
> >>>>>
> >>>>> So far all I've managed to do is track it to something rcu related
> >>>>> as failing in a call to drain_call_rcu() in qmp_device_add()
> >>>>>
> >>>>> Will continue digging.
> >>>>
> >>>> Assuming I'm seeing the same thing...
> >>>>
> >>>> Problem is g_free() on the PCIBridge windows:
> >>>> https://elixir.bootlin.com/qemu/latest/source/hw/pci/pci_bridge.c#L235
> >>>>
> >>>> Is called before we get an rcu_call() to flatview_destroy() as a
> >>>> result of the final call of flatview_unref() in
> >>>> address_space_set_flatview()
> >>>> so we get a use after free.
> >>>>
> >>>> As to what the fix is... Suggestions welcome!
> >>>
> >>> It sounds like this is the wrong place to free the value then. I guess
> >>> the PCI aliases into &w->alias_io() don't get dealt with until RCU
> >>> clean-up time.
> >>>
> >>> I *think* using g_free_rcu() should be enough to ensure the free occurs
> >>> after the rest of the RCU cleanups but maybe you should only be cleaning
> >>> up the windows at device unrealize time? Is this a dynamic piece of
> >>> memory which gets updated during the lifetime of the device?
> >>
> >> There is unfortunately code that swaps it for an updated structure
> >> in pci_bridge_update_mappings()
> >>
> >>>
> >>> If the memory is being cleared with RCU then the access to the base
> >>> pointer should be done with the appropriate qatomic_rcu_[set|read]
> >>> functions.
> >>>
> >>
> >> I'm annoyingly snowed under this week with other things, but hopefully
> >> can get to in a few days. Note we are looking at an old problem
> >> here, just one that's happening for an additional device, not sure
> >> if that really affects urgency of fix though.
> >
> > It's too late now for QEMU 7.2 anyway, so there is no hurry, I think.
>
> I'm still seeing problems with this device, I guess it's still the
> same issue:
>
> $ valgrind -q ./qemu-system-x86_64 -M x-remote -monitor stdio
> ...
> QEMU 7.2.91 monitor - type 'help' for more information
> (qemu) device_add cxl-downstream,id=c1
> ==46154== Thread 2:
> ==46154== Invalid read of size 8
> ==46154== at 0x7A0A0B: memory_region_unref (memory.c:1798)
> ==46154== by 0x7A0A0B: flatview_destroy (memory.c:298)
> ==46154== by 0x9A5E32: call_rcu_thread (rcu.c:284)
> ==46154== by 0x99CC19: qemu_thread_start (qemu-thread-posix.c:541)
> ==46154== by 0xB6A31C9: start_thread (in /usr/lib64/libpthread-2.28.so)
> ==46154== by 0xB8F4E72: clone (in /usr/lib64/libc-2.28.so)
> ==46154== Address 0x1a2a95c0 is 64 bytes inside a block of size 1,632 free'd
> ==46154== at 0x4C39A93: free (vg_replace_malloc.c:872)
> ==46154== by 0x79B62B1: g_free (in /usr/lib64/libglib-2.0.so.0.5600.4)
> ==46154== by 0x55E06F: cxl_dsp_realize (cxl_downstream.c:201)
> ==46154== by 0x5523AC: pci_qdev_realize (pci.c:2098)
> ==46154== by 0x833A1E: device_set_realized (qdev.c:510)
> ==46154== by 0x836DC5: property_set_bool (object.c:2285)
> ==46154== by 0x838FA2: object_property_set (object.c:1420)
> ==46154== by 0x83C01E: object_property_set_qobject (qom-qobject.c:28)
> ==46154== by 0x839213: object_property_set_bool (object.c:1489)
> ==46154== by 0x5F9787: qdev_device_add_from_qdict (qdev-monitor.c:714)
> ==46154== by 0x5F98F0: qdev_device_add (qdev-monitor.c:733)
> ==46154== by 0x5F99E1: qmp_device_add (qdev-monitor.c:855)
> ==46154== Block was alloc'd at
> ==46154== at 0x4C37135: malloc (vg_replace_malloc.c:381)
> ==46154== by 0x79B61A5: g_malloc (in /usr/lib64/libglib-2.0.so.0.5600.4)
> ==46154== by 0x553072: pci_bridge_region_init (pci_bridge.c:191)
> ==46154== by 0x553575: pci_bridge_initfn (pci_bridge.c:388)
> ==46154== by 0x55E032: cxl_dsp_realize (cxl_downstream.c:147)
> ==46154== by 0x5523AC: pci_qdev_realize (pci.c:2098)
> ==46154== by 0x833A1E: device_set_realized (qdev.c:510)
> ==46154== by 0x836DC5: property_set_bool (object.c:2285)
> ==46154== by 0x838FA2: object_property_set (object.c:1420)
> ==46154== by 0x83C01E: object_property_set_qobject (qom-qobject.c:28)
> ==46154== by 0x839213: object_property_set_bool (object.c:1489)
> ==46154== by 0x5F9787: qdev_device_add_from_qdict (qdev-monitor.c:714)
>
> Could we get a fix for QEMU 8.0 ?
The original option of moving over to g_free_rcu() and need to then protect
all accesses to br->windows was make a mess of things and as far as I can
immediately spot seems to be unnecessary.
Unfortunately I don't understand why the window handling is complex in the
first place. The following patch just embeds the structure
directly in the PCI Bridge and seems to avoid the issue you have reported.
As the code to update it on the fly is protected anyway by
memory_region_transaction_begin() I don't think we care about ensuring the
exposed windows are consistent whilst we update them.
If someone else could sanity check my logic that would be great.
Thanks,
Jonathan
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index dd5af508f9..698fd01ae6 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -184,11 +184,11 @@ static void pci_bridge_init_vga_aliases(PCIBridge *br, PCIBus *parent,
}
}
-static PCIBridgeWindows *pci_bridge_region_init(PCIBridge *br)
+static void pci_bridge_region_init(PCIBridge *br)
{
PCIDevice *pd = PCI_DEVICE(br);
PCIBus *parent = pci_get_bus(pd);
- PCIBridgeWindows *w = g_new(PCIBridgeWindows, 1);
+ PCIBridgeWindows *w = &br->windows;
uint16_t cmd = pci_get_word(pd->config + PCI_COMMAND);
pci_bridge_init_alias(br, &w->alias_pref_mem,
@@ -211,8 +211,6 @@ static PCIBridgeWindows *pci_bridge_region_init(PCIBridge *br)
cmd & PCI_COMMAND_IO);
pci_bridge_init_vga_aliases(br, parent, w->alias_vga);
-
- return w;
}
static void pci_bridge_region_del(PCIBridge *br, PCIBridgeWindows *w)
@@ -234,19 +232,17 @@ static void pci_bridge_region_cleanup(PCIBridge *br, PCIBridgeWindows *w)
object_unparent(OBJECT(&w->alias_vga[QEMU_PCI_VGA_IO_LO]));
object_unparent(OBJECT(&w->alias_vga[QEMU_PCI_VGA_IO_HI]));
object_unparent(OBJECT(&w->alias_vga[QEMU_PCI_VGA_MEM]));
- g_free(w);
}
void pci_bridge_update_mappings(PCIBridge *br)
{
- PCIBridgeWindows *w = br->windows;
-
+ PCIBridgeWindows *w = &br->windows;
/* Make updates atomic to: handle the case of one VCPU updating the bridge
* while another accesses an unaffected region. */
memory_region_transaction_begin();
- pci_bridge_region_del(br, br->windows);
+ pci_bridge_region_del(br, w);
pci_bridge_region_cleanup(br, w);
- br->windows = pci_bridge_region_init(br);
+ pci_bridge_region_init(br);
memory_region_transaction_commit();
}
@@ -385,7 +381,7 @@ void pci_bridge_initfn(PCIDevice *dev, const char *typename)
sec_bus->address_space_io = &br->address_space_io;
memory_region_init(&br->address_space_io, OBJECT(br), "pci_bridge_io",
4 * GiB);
- br->windows = pci_bridge_region_init(br);
+ pci_bridge_region_init(br);
QLIST_INIT(&sec_bus->child);
QLIST_INSERT_HEAD(&parent->child, sec_bus, sibling);
}
@@ -396,8 +392,8 @@ void pci_bridge_exitfn(PCIDevice *pci_dev)
PCIBridge *s = PCI_BRIDGE(pci_dev);
assert(QLIST_EMPTY(&s->sec_bus.child));
QLIST_REMOVE(&s->sec_bus, sibling);
- pci_bridge_region_del(s, s->windows);
- pci_bridge_region_cleanup(s, s->windows);
+ pci_bridge_region_del(s, &s->windows);
+ pci_bridge_region_cleanup(s, &s->windows);
/* object_unparent() is called automatically during device deletion */
}
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index 81a058bb2c..b650748a39 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -30,6 +30,7 @@
#include "hw/pci/pci_bus.h"
#include "hw/cxl/cxl.h"
#include "qom/object.h"
+#include "qemu/rcu.h"
typedef struct PCIBridgeWindows PCIBridgeWindows;
@@ -39,8 +40,11 @@ typedef struct PCIBridgeWindows PCIBridgeWindows;
* as subregions.
*/
struct PCIBridgeWindows {
+// struct rcu_head rcu;
MemoryRegion alias_pref_mem;
+ // uint8_t pad;
MemoryRegion alias_mem;
+ // uint8_t pad2;
MemoryRegion alias_io;
/*
* When bridge control VGA forwarding is enabled, bridges will
@@ -73,7 +77,7 @@ struct PCIBridge {
MemoryRegion address_space_mem;
MemoryRegion address_space_io;
- PCIBridgeWindows *windows;
+ PCIBridgeWindows windows;
pci_map_irq_fn map_irq;
const char *bus_name;
>
> Thomas
>
next prev parent reply other threads:[~2023-04-03 16:13 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 01/10] pci-bridge/cxl_upstream: Add a CXL switch upstream port Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port Michael S. Tsirkin
2022-11-04 6:47 ` Thomas Huth
2022-12-04 7:23 ` Thomas Huth
2022-12-05 10:54 ` Jonathan Cameron via
2022-12-05 12:45 ` Jonathan Cameron via
2022-12-05 14:59 ` Alex Bennée
2022-12-07 13:21 ` Jonathan Cameron via
2022-12-07 13:26 ` Thomas Huth
2023-03-24 10:17 ` Thomas Huth
2023-04-03 16:12 ` Jonathan Cameron via [this message]
2022-06-16 16:57 ` [PULL 03/10] docs/cxl: Add switch documentation Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 04/10] virtio/vhost-user: Fix wrong vhost notifier GPtrArray size Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 05/10] virtio-iommu: Add bypass mode support to assigned device Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 06/10] virtio-iommu: Use recursive lock to avoid deadlock Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 07/10] virtio-iommu: Add an assert check in translate routine Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 08/10] crypto: Introduce RSA algorithm Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 09/10] vhost: also check queue state in the vhost_dev_set_log error routine Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 10/10] acpi/erst: fix fallthrough code upon validation failure Michael S. Tsirkin
2022-06-16 20:46 ` [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Richard Henderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230403171232.000020bb@huawei.com \
--to=qemu-devel@nongnu.org \
--cc=Jonathan.Cameron@huawei.com \
--cc=alex.bennee@linaro.org \
--cc=ben.widawsky@intel.com \
--cc=marcel.apfelbaum@gmail.com \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.