From: "Michael S. Tsirkin" <mst@redhat.com>
To: xuyandong <xuyandong2@huawei.com>
Cc: "marcel@redhat.com" <marcel@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Zhanghailiang <zhang.zhanghailiang@huawei.com>,
"wangxin (U)" <wangxinxin.wang@huawei.com>,
"Huangweidong (C)" <weidong.huang@huawei.com>
Subject: Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug
Date: Mon, 10 Dec 2018 23:25:34 -0500 [thread overview]
Message-ID: <20181210232256-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <7CECC2DFC21538489F72729DF5EFB4D908AC7A04@DGGEMM501-MBX.china.huawei.com>
On Tue, Dec 11, 2018 at 03:51:09AM +0000, xuyandong wrote:
> > On Tue, Dec 11, 2018 at 02:55:43AM +0000, xuyandong wrote:
> > > On Tue, Dec 11, 2018 at 01:47:37AM +0000, xuyandong wrote:
> > > > > On Sat, Dec 08, 2018 at 11:58:59AM +0000, xuyandong wrote:
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > In our test, we configured VM with several pci-bridges and
> > > > > > > > > a virtio-net nic been attached with bus 4,
> > > > > > > > >
> > > > > > > > > After VM is startup, We ping this nic from host to judge
> > > > > > > > > if it is working normally. Then, we hot add pci devices to
> > > > > > > > > this VM with bus
> > > > 0.
> > > > > > > > >
> > > > > > > > > We found the virtio-net NIC in bus 4 is not working (can
> > > > > > > > > not
> > > > > > > > > connect) occasionally, as it kick virtio backend failure with error
> > below:
> > > > > > > > >
> > > > > > > > > Unassigned mem write 00000000fc803004 = 0x1
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > memory-region: pci_bridge_pci
> > > > > > > > >
> > > > > > > > > 0000000000000000-ffffffffffffffff (prio 0, RW):
> > > > > > > > > pci_bridge_pci
> > > > > > > > >
> > > > > > > > > 00000000fc800000-00000000fc803fff (prio 1, RW):
> > > > > > > > > virtio-pci
> > > > > > > > >
> > > > > > > > > 00000000fc800000-00000000fc800fff (prio 0, RW):
> > > > > > > > > virtio-pci-common
> > > > > > > > >
> > > > > > > > > 00000000fc801000-00000000fc801fff (prio 0, RW):
> > > > > > > > > virtio-pci-isr
> > > > > > > > >
> > > > > > > > > 00000000fc802000-00000000fc802fff (prio 0, RW):
> > > > > > > > > virtio-pci-device
> > > > > > > > >
> > > > > > > > > 00000000fc803000-00000000fc803fff (prio 0, RW):
> > > > > > > > > virtio-pci-notify <- io mem unassigned
> > > > > > > > >
> > > > > > > > > …
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We caught an exceptional address changing while this
> > > > > > > > > problem happened, show as
> > > > > > > > > follow:
> > > > > > > > >
> > > > > > > > > Before pci_bridge_update_mappings:
> > > > > > > > >
> > > > > > > > > 00000000fc000000-00000000fc1fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc000000-00000000fc1fffff
> > > > > > > > >
> > > > > > > > > 00000000fc200000-00000000fc3fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc200000-00000000fc3fffff
> > > > > > > > >
> > > > > > > > > 00000000fc400000-00000000fc5fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc400000-00000000fc5fffff
> > > > > > > > >
> > > > > > > > > 00000000fc600000-00000000fc7fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc600000-00000000fc7fffff
> > > > > > > > >
> > > > > > > > > 00000000fc800000-00000000fc9fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fc800000-00000000fc9fffff
> > > > > > > > > <- correct Adress Spce
> > > > > > > > >
> > > > > > > > > 00000000fca00000-00000000fcbfffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fca00000-00000000fcbfffff
> > > > > > > > >
> > > > > > > > > 00000000fcc00000-00000000fcdfffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fcc00000-00000000fcdfffff
> > > > > > > > >
> > > > > > > > > 00000000fce00000-00000000fcffffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_pref_mem @pci_bridge_pci
> > > > > > > > > 00000000fce00000-00000000fcffffff
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > After pci_bridge_update_mappings:
> > > > > > > > >
> > > > > > > > > 00000000fda00000-00000000fdbfffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fda00000-00000000fdbfffff
> > > > > > > > >
> > > > > > > > > 00000000fdc00000-00000000fddfffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fdc00000-00000000fddfffff
> > > > > > > > >
> > > > > > > > > 00000000fde00000-00000000fdffffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fde00000-00000000fdffffff
> > > > > > > > >
> > > > > > > > > 00000000fe000000-00000000fe1fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe000000-00000000fe1fffff
> > > > > > > > >
> > > > > > > > > 00000000fe200000-00000000fe3fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe200000-00000000fe3fffff
> > > > > > > > >
> > > > > > > > > 00000000fe400000-00000000fe5fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe400000-00000000fe5fffff
> > > > > > > > >
> > > > > > > > > 00000000fe600000-00000000fe7fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe600000-00000000fe7fffff
> > > > > > > > >
> > > > > > > > > 00000000fe800000-00000000fe9fffff (prio 1, RW):
> > > > > > > > > alias pci_bridge_mem @pci_bridge_pci
> > > > > > > > > 00000000fe800000-00000000fe9fffff
> > > > > > > > >
> > > > > > > > > fffffffffc800000-fffffffffc800000 (prio 1, RW):
> > > > > > > > > alias
> > > > > > pci_bridge_pref_mem
> > > > > > > > > @pci_bridge_pci fffffffffc800000-fffffffffc800000 <- Exceptional
> > > > Adress
> > > > > > > > Space
> > > > > > > >
> > > > > > > > This one is empty though right?
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We have figured out why this address becomes this value,
> > > > > > > > > according to pci spec, pci driver can get BAR address
> > > > > > > > > size by writing 0xffffffff to
> > > > > > > > >
> > > > > > > > > the pci register firstly, and then read back the value from this
> > register.
> > > > > > > >
> > > > > > > >
> > > > > > > > OK however as you show below the BAR being sized is the BAR
> > > > > > > > if a bridge. Are you then adding a bridge device by hotplug?
> > > > > > >
> > > > > > > No, I just simply hot plugged a VFIO device to Bus 0, another
> > > > > > > interesting phenomenon is If I hot plug the device to other
> > > > > > > bus, this doesn't
> > > > > > happened.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > We didn't handle this value specially while process pci
> > > > > > > > > write in qemu, the function call stack is:
> > > > > > > > >
> > > > > > > > > Pci_bridge_dev_write_config
> > > > > > > > >
> > > > > > > > > -> pci_bridge_write_config
> > > > > > > > >
> > > > > > > > > -> pci_default_write_config (we update the config[address]
> > > > > > > > > -> value here to
> > > > > > > > > fffffffffc800000, which should be 0xfc800000 )
> > > > > > > > >
> > > > > > > > > -> pci_bridge_update_mappings
> > > > > > > > >
> > > > > > > > > ->pci_bridge_region_del(br, br->windows);
> > > > > > > > >
> > > > > > > > > -> pci_bridge_region_init
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > -> pci_bridge_init_alias (here pci_bridge_get_base, we use
> > > > > > > > > -> the
> > > > > > > > > wrong value
> > > > > > > > > fffffffffc800000)
> > > > > > > > >
> > > > > > > > > ->
> > > > > > > > > memory_region_transaction_commit
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > So, as we can see, we use the wrong base address in qemu
> > > > > > > > > to update the memory regions, though, we update the base
> > > > > > > > > address to
> > > > > > > > >
> > > > > > > > > The correct value after pci driver in VM write the
> > > > > > > > > original value back, the virtio NIC in bus 4 may still
> > > > > > > > > sends net packets concurrently with
> > > > > > > > >
> > > > > > > > > The wrong memory region address.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We have tried to skip the memory region update action in
> > > > > > > > > qemu while detect pci write with 0xffffffff value, and it
> > > > > > > > > does work, but
> > > > > > > > >
> > > > > > > > > This seems to be not gently.
> > > > > > > >
> > > > > > > > For sure. But I'm still puzzled as to why does Linux try to
> > > > > > > > size the BAR of the bridge while a device behind it is used.
> > > > > > > >
> > > > > > > > Can you pls post your QEMU command line?
> > > > > > >
> > > > > > > My QEMU command line:
> > > > > > > /root/xyd/qemu-system-x86_64 -name
> > > > > > > guest=Linux,debug-threads=on -S -object
> > > > > > > secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/dom
> > > > > > > ain-
> > > > > > > 194-
> > > > > > > Linux/master-key.aes -machine
> > > > > > > pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
> > > > > > > host,+kvm_pv_eoi -bios /usr/share/OVMF/OVMF.fd -m
> > > > > > > size=4194304k,slots=256,maxmem=33554432k -realtime mlock=off
> > > > > > > -smp
> > > > > > > 20,sockets=20,cores=1,threads=1 -numa
> > > > > > > node,nodeid=0,cpus=0-4,mem=1024 -numa
> > > > > > > node,nodeid=1,cpus=5-9,mem=1024 -numa
> > > > > > > node,nodeid=2,cpus=10-14,mem=1024 -numa
> > > > > > > node,nodeid=3,cpus=15-19,mem=1024 -uuid
> > > > > > > 34a588c7-b0f2-4952-b39c-47fae3411439 -no-user-config
> > > > > > > -nodefaults -chardev
> > > > > > > socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-194-Li
> > > > > > > nux/
> > > > > > > moni
> > > > > > > tor.sock,server,nowait -mon
> > > > > > > chardev=charmonitor,id=monitor,mode=control -rtc base=utc
> > > > > > > -no-hpet -global kvm-pit.lost_tick_policy=delay -no-shutdown
> > > > > > > -boot strict=on -device
> > > > > > > pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x8 -device
> > > > > > > pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x9 -device
> > > > > > > pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0xa -device
> > > > > > > pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0xb -device
> > > > > > > pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0xc -device
> > > > > > > piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> > > > > > > usb-ehci,id=usb1,bus=pci.0,addr=0x10 -device
> > > > > > > nec-usb-xhci,id=usb2,bus=pci.0,addr=0x11 -device
> > > > > > > virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
> > > > > > > virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device
> > > > > > > virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x5 -device
> > > > > > > virtio-scsi-pci,id=scsi3,bus=pci.0,addr=0x6 -device
> > > > > > > virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
> > > > > > > file=/mnt/sdb/xml/centos_74_x64_uefi.raw,format=raw,if=none,id
> > > > > > > =dri
> > > > > > > ve-v
> > > > > > > irtio-disk0,cache=none -device
> > > > > > > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-
> > > > > > > disk
> > > > > > > 0,id
> > > > > > > =virtio-disk0,bootindex=1 -drive
> > > > > > > if=none,id=drive-ide0-1-1,readonly=on,cache=none -device
> > > > > > > ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1
> > > > > > > -netdev
> > > > > > > tap,fd=35,id=hostnet0 -device
> > > > > > > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:5d:8b,b
> > > > > > > us=p
> > > > > > > ci.4
> > > > > > > ,addr=0x1 -chardev pty,id=charserial0 -device
> > > > > > > isa-serial,chardev=charserial0,id=serial0 -device
> > > > > > > usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device
> > > > > > > cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x12 -device
> > > > > > > virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xd -msg
> > > > > > > timestamp=on
> > > > > > >
> > > > > > > I am also very curious about this issue, in the linux kernel
> > > > > > > code, maybe double
> > > > > > check in function pci_bridge_check_ranges triggered this problem.
> > > > > >
> > > > > > If you can get the stacktrace in Linux when it tries to write
> > > > > > this fffff value, that would be quite helpful.
> > > > > >
> > > > >
> > > > > After I add mdelay(100) in function pci_bridge_check_ranges, this
> > > > > phenomenon is easier to reproduce, below is my modify in kernel:
> > > > > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > > > > index cb389277..86e232d 100644
> > > > > --- a/drivers/pci/setup-bus.c
> > > > > +++ b/drivers/pci/setup-bus.c
> > > > > @@ -27,7 +27,7 @@
> > > > > #include <linux/slab.h>
> > > > > #include <linux/acpi.h>
> > > > > #include "pci.h"
> > > > > -
> > > > > +#include <linux/delay.h>
> > > > > unsigned int pci_flags;
> > > > >
> > > > > struct pci_dev_resource {
> > > > > @@ -787,6 +787,9 @@ static void pci_bridge_check_ranges(struct
> > > > > pci_bus
> > > > *bus)
> > > > > pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32,
> > > > > 0xffffffff);
> > > > > pci_read_config_dword(bridge,
> > > > > PCI_PREF_BASE_UPPER32, &tmp);
> > > > > + mdelay(100);
> > > > > + printk(KERN_ERR "sleep\n");
> > > > > + dump_stack();
> > > > > if (!tmp)
> > > > > b_res[2].flags &= ~IORESOURCE_MEM_64;
> > > > > pci_write_config_dword(bridge,
> > > > > PCI_PREF_BASE_UPPER32,
> > > > >
> > > >
> > > > OK!
> > > > I just sent a Linux patch that should help.
> > > > I would appreciate it if you will give it a try and if that helps
> > > > reply to it with a
> > > > Tested-by: tag.
> > > >
> > >
> > > I tested this patch and it works fine on my machine.
> > >
> > > But I have another question, if we only fix this problem in the
> > > kernel, the Linux version that has been released does not work well on the
> > virtualization platform.
> > > Is there a way to fix this problem in the backend?
> >
> > There could we a way to work around this.
> > Does below help?
>
> I am sorry to tell you, I tested this patch and it doesn't work fine.
>
> >
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
> > 236a20eaa8..7834cac4b0 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -551,7 +551,7 @@ static void build_append_pci_bus_devices(Aml
> > *parent_scope, PCIBus *bus,
> >
> > aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM")));
> > aml_append(method,
> > - aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device Check */)
> > + aml_call2("DVNT", aml_name("PCIU"), aml_int(4) /* Device
> > + Check Light */)
> > );
> > aml_append(method,
> > aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject Request */)
Oh I see, another bug:
case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT event\n");
/* TBD: Exactly what does 'light' mean? */
break;
And then e.g. acpi_generic_hotplug_event(struct acpi_device *adev, u32 type)
and friends all just ignore this event type.
--
MST
next prev parent reply other threads:[~2018-12-11 4:25 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-08 11:58 [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug xuyandong
2018-12-09 14:26 ` Michael S. Tsirkin
2018-12-10 1:59 ` xuyandong
2018-12-10 2:22 ` Michael S. Tsirkin
2018-12-10 3:12 ` xuyandong
2018-12-10 18:14 ` Michael S. Tsirkin
2018-12-11 1:47 ` xuyandong
2018-12-11 2:17 ` Michael S. Tsirkin
2018-12-11 2:55 ` xuyandong
2018-12-11 3:38 ` Michael S. Tsirkin
2018-12-11 3:51 ` xuyandong
2018-12-11 4:04 ` Michael S. Tsirkin
2018-12-11 4:25 ` Michael S. Tsirkin [this message]
2019-01-07 14:37 ` xuyandong
2019-01-07 15:06 ` Michael S. Tsirkin
2019-01-07 15:28 ` xuyandong
2019-01-07 16:24 ` Michael S. Tsirkin
2019-01-07 14:51 ` xuyandong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181210232256-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=marcel@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=wangxinxin.wang@huawei.com \
--cc=weidong.huang@huawei.com \
--cc=xuyandong2@huawei.com \
--cc=zhang.zhanghailiang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.