From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=55042 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PxAha-0007YO-SD for qemu-devel@nongnu.org; Tue, 08 Mar 2011 23:08:48 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PxAhZ-0003Pm-Bg for qemu-devel@nongnu.org; Tue, 08 Mar 2011 23:08:46 -0500 Received: from e4.ny.us.ibm.com ([32.97.182.144]:44304) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PxAhZ-0003Nr-7Q for qemu-devel@nongnu.org; Tue, 08 Mar 2011 23:08:45 -0500 Received: from d01dlp01.pok.ibm.com (d01dlp01.pok.ibm.com [9.56.224.56]) by e4.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id p293nAS7012350 for ; Tue, 8 Mar 2011 22:49:10 -0500 Received: from d01relay05.pok.ibm.com (d01relay05.pok.ibm.com [9.56.227.237]) by d01dlp01.pok.ibm.com (Postfix) with ESMTP id 222E738C8038 for ; Tue, 8 Mar 2011 23:08:21 -0500 (EST) Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay05.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p2948JOi231914 for ; Tue, 8 Mar 2011 23:08:23 -0500 Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p2948Jkx013749 for ; Wed, 9 Mar 2011 01:08:19 -0300 Date: Tue, 8 Mar 2011 22:08:15 -0600 From: Ryan Harper Subject: Re: [Qemu-devel] [PATCH] `qdev_free` when unplug a pci device Message-ID: <20110309040814.GM23238@us.ibm.com> References: <1298396180-23734-1-git-send-email-wdauchy@gmail.com> <20110223025001.GC19727@valinux.co.jp> <4D6B0DF8.5000407@cn.fujitsu.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4D6B0DF8.5000407@cn.fujitsu.com> List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Wen Congyang Cc: Isaku Yamahata , Gerd Hoffmann , Markus Armbruster , William Dauchy , qemu-devel@nongnu.org * Wen Congyang [2011-02-27 20:56]: > Hi Markus Armbruster > > At 02/23/2011 04:30 PM, Markus Armbruster Write: > > Isaku Yamahata writes: > > > > > > > > > I don't think this patch is correct. Let me explain. > > > > Device hot unplug is *not* guaranteed to succeed. > > > > For some buses, such as USB, it always succeeds immediately, i.e. when > > the device_del monitor command finishes, the device is gone. Live is > > good. > > > > But for PCI, device_del merely initiates the ACPI unplug rain dance. It > > doesn't wait for the dance to complete. Why? The dance can take an > > unpredictable amount of time, including forever. > > > > Problem: Subsequent device_add can fail if it reuses the qdev ID or PCI > > slot, and the unplug has not yet completed (race condition), or it > > failed. Yes, Virginia, PCI hotplug *can* fail. > > > > When unplug succeeds, the qdev is automatically destroyed. > > pciej_write() does that for PIIX4. Looks like pcie_cap_slot_event() > > does it for PCIE. > > I got a similar problem. When I unplug a pci device by hand, it works > as expected, and I can hotplug it again. But when I use a srcipt to > do the same thing, sometimes it failed. I think I may find another bug. > > Steps to reproduce this bug: > 1. cat ./test-e1000.sh # RHEL6RC is domain name > #! /bin/bash > > while true; do > virsh attach-interface RHEL6RC network default --mac 52:54:00:1f:db:c7 --model e1000 > if [[ $? -ne 0 ]]; then > break > fi > virsh detach-interface RHEL6RC network --mac 52:54:00:1f:db:c7 > if [[ $? -ne 0 ]]; then > break > fi > sleep 5 How do you know that the guest has responded at this point before you attempt to attach again at the top of the loop. Any attach/detach requires the guest to respond to the request and it may not respond at all. > done > > 2. ./test-e1000.sh > Interface attached successfully > > Interface detached successfully > > Interface attached successfully > > Interface detached successfully > > error: Failed to attach interface > error: operation failed: adding e1000,netdev=hostnet1,id=net1,mac=52:54:00:1f:db:c7,bus=pci.0,addr=0x8 device failed: Duplicate ID 'net1' for device > > > I use ftrace to trace whether sci interrupt is passed to guest OS, here is log: > # cat trace | grep 'irq=9' > -0 [000] 118342.634772: irq_handler_entry: irq=9 name=acpi > -0 [000] 118342.634831: irq_handler_exit: irq=9 ret=handled > -0 [000] 118342.693216: irq_handler_entry: irq=9 name=acpi > -0 [000] 118342.693254: irq_handler_exit: irq=9 ret=handled > -0 [000] 118347.737689: irq_handler_entry: irq=9 name=acpi > -0 [000] 118347.737738: irq_handler_exit: irq=9 ret=handled > According to the log, we only receive 3 sci interrupt, and one is lost. > > I enable piix4's debug and 1 line printf into piix4_device_hotplug: > printf("slot: %d, up: %d, down: %d, state: %d\n", slot, s->pci0_status.up, s->pci0_status.down, state); > here is the log: > ======== > PM: mapping to 0xb000 > PM readw port=0x0004 val=0x0000 > ... > gpe write afe2 <== 0 > gpe write afe0 <== 255 > gpe write afe3 <== 0 > gpe write afe1 <== 255 > PM readw port=0x0000 val=0x0001 > PM readw port=0x0002 val=0x0000 > gpe read afe0 == 0 > gpe read afe2 == 0 > gpe read afe1 == 0 > gpe read afe3 == 0 > PM writew port=0x0000 val=0x0020 > PM readw port=0x0002 val=0x0000 > PM writew port=0x0002 val=0x0020 > PM readw port=0x0002 val=0x0020 > gpe write afe2 <== 255 > gpe write afe3 <== 255 > ... > slot: 6, up: 0, down: 0, state: 1 <==== we attach interface the first time > PM readw port=0x0000 val=0x0001 > PM readw port=0x0002 val=0x0120 > gpe read afe0 == 2 > gpe read afe2 == ff > gpe read afe2 == ff > gpe write afe2 <== 253 > gpe read afe1 == 0 > gpe read afe3 == ff > pcihotplug read ae00 == 40 > pcihotplug read ae04 == 0 > ... > gpe write afe0 <== 2 > gpe write afe2 <== 255 > slot: 6, up: 64, down: 0, state: 0 <===== we detach interface the first time > PM readw port=0x0000 val=0x0001 > PM readw port=0x0002 val=0x0120 > gpe read afe0 == 2 > gpe read afe2 == ff > gpe read afe2 == ff > gpe write afe2 <== 253 > gpe read afe1 == 0 > gpe read afe3 == ff > pcihotplug read ae00 == 0 > pcihotplug read ae04 == 40 > ... > pciej write ae08 <== 64 <=== we will call qdev_free() > pciej write ae08 <== 64 > gpe write afe0 <== 2 > gpe write afe2 <== 255 > slot: 7, up: 0, down: 64, state: 1 <=== we attach interface the second time. > PM readw port=0x0000 val=0x0001 > PM readw port=0x0002 val=0x0120 > gpe read afe0 == 2 > gpe read afe2 == ff > gpe read afe2 == ff > gpe write afe2 <== 253 > gpe read afe1 == 0 > gpe read afe3 == ff > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > pcihotplug read ae00 == 80 > pcihotplug read ae04 == 0 > slot: 7, up: 128, down: 0, state: 0 <=== we detach interface the second time > gpe write afe0 <== 2 <=== write 2 to afe0 will cause sci interupt to be lost > gpe write afe2 <== 255 > =========== > > Here is short log, and show the different between the first time and second time: > =========== > gpe write afe0 <== 2 > gpe write afe2 <== 255 > slot: 6, up: 64, down: 0, state: 0 <===== we detach interface the first time > .... > slot: 7, up: 128, down: 0, state: 0 <=== we detach interface the second time > gpe write afe0 <== 2 <=== write 2 to afe0 will cause sci interupt to be lost > gpe write afe2 <== 255 > > =========== > > Does this behavior is a normal behavior? > > The following patch can fix this problem. > > >From 3c149575b52261b8da9a73d5c4ebdfedce9866c1 Mon Sep 17 00:00:00 2001 > From: Wen Congyang > Date: Mon, 28 Feb 2011 10:33:45 +0800 > Subject: [PATCH] pend hotplug event until last event is handled by guest OS > > --- > hw/acpi_piix4.c | 51 ++++++++++++++++++++++++++++++++++++++++++++------- > 1 files changed, 44 insertions(+), 7 deletions(-) > > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c > index b5a2762..4ff3475 100644 > --- a/hw/acpi_piix4.c > +++ b/hw/acpi_piix4.c > @@ -74,6 +74,7 @@ typedef struct PIIX4PMState { > /* for pci hotplug */ > struct gpe_regs gpe; > struct pci_status pci0_status; > + struct pci_status pci0_status_pending; > uint32_t pci0_hotplug_enable; > } PIIX4PMState; > > @@ -518,6 +519,19 @@ static void gpe_reset_val(uint16_t *cur, int addr, uint32_t val) > *cur = (*cur & (0xff << (8 - shift))) | (x1 << shift); > } > > +static void raise_pending_hotplug(PIIX4PMState *s) > +{ > + if (s->pci0_status_pending.up || s->pci0_status_pending.down) { > + s->gpe.sts |= PIIX4_PCI_HOTPLUG_STATUS; > + s->pci0_status.up = s->pci0_status_pending.up; > + s->pci0_status.down = s->pci0_status_pending.down; > + s->pci0_status_pending.up = 0; > + s->pci0_status_pending.down = 0; > + > + pm_update_sci(s); > + } > +} > + > static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val) > { > PIIX4PMState *s = opaque; > @@ -539,6 +553,11 @@ static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val) > pm_update_sci(s); > > PIIX4_DPRINTF("gpe write %x <== %d\n", addr, val); > + > + if (!(g->sts & PIIX4_PCI_HOTPLUG_STATUS) && g->en & PIIX4_PCI_HOTPLUG_STATUS) { > + /* check and reraise the pending hotplug event */ > + raise_pending_hotplug(s); > + } > } > > static uint32_t pcihotplug_read(void *opaque, uint32_t addr) > @@ -639,12 +658,22 @@ static void enable_device(PIIX4PMState *s, int slot) > s->pci0_status.up |= (1 << slot); > } > > +static void pend_enable_device(PIIX4PMState *s, int slot) > +{ > + s->pci0_status_pending.up |= (1 << slot); > +} > + > static void disable_device(PIIX4PMState *s, int slot) > { > s->gpe.sts |= PIIX4_PCI_HOTPLUG_STATUS; > s->pci0_status.down |= (1 << slot); > } > > +static void pend_disable_device(PIIX4PMState *s, int slot) > +{ > + s->pci0_status_pending.down |= (1 << slot); > +} > + > static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, > PCIHotplugState state) > { > @@ -659,15 +688,23 @@ static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, > return 0; > } > > - s->pci0_status.up = 0; > - s->pci0_status.down = 0; > - if (state == PCI_HOTPLUG_ENABLED) { > - enable_device(s, slot); > + if (s->gpe.sts & PIIX4_PCI_HOTPLUG_STATUS) { > + if (state == PCI_HOTPLUG_ENABLED) { > + pend_enable_device(s, slot); > + } else { > + pend_disable_device(s, slot); > + } > } else { > - disable_device(s, slot); > - } > + s->pci0_status.up = 0; > + s->pci0_status.down = 0; > + if (state == PCI_HOTPLUG_ENABLED) { > + enable_device(s, slot); > + } else { > + disable_device(s, slot); > + } > > - pm_update_sci(s); > + pm_update_sci(s); > + } > > return 0; > } > -- > 1.7.1 > > > > > Your patch adds a *second* qdev_free(). No good. > > > > > -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx ryanh@us.ibm.com