From: Isaku Yamahata <yamahata@valinux.co.jp>
To: Wen Congyang <wency@cn.fujitsu.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
William Dauchy <wdauchy@gmail.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] `qdev_free` when unplug a pci device
Date: Tue, 1 Mar 2011 18:49:20 +0900 [thread overview]
Message-ID: <20110301094920.GL14584@valinux.co.jp> (raw)
In-Reply-To: <4D6CA0F8.8040905@cn.fujitsu.com>
On Tue, Mar 01, 2011 at 03:32:08PM +0800, Wen Congyang wrote:
> > The issue is, Qemu injected sci interrupt into guest,
> > but before the guest completes to handled it,
> > users/qemu can start to inject the next sci event triggered by hot plug/unplug.
> > Thus qemu loses sci interrupt and up/down status.
>
> Yes, you are right. But I still do not understand why introducing pending bits
> can not solve the issue?
Maybe your patch works.
The difference is to introduce new variables or new state machine.
Anyway I suppose that the eventual solution is to switch to
express native hotplug which is much more reliable and doesn't rely
on ACPI. (note: I'm very biased here.)
thanks
>
> Thanks
> Wen Congyang
>
> >
> > thanks
> >>
> >>>
> >>> - on power-on (0, 0)
> >>> - on hot plug (0, 0) -> (1, 0)
> >>> if other combination -> error
> >>> - on hot unpug (1, 0) or (0, 0) -> (0, 1)
> >>> (0, 0) is for cold plugged devices.
> >>> (1, 0) is for hot plugged devices.
> >>> if other combination -> error
> >>> - on pciej_write(write on PCI_EJ_BASE)
> >>> (0, 1) -> (0, 0) and qdev_free()
> >>> if other combination -> ignore
> >>>
> >>> When enabling sci, those bits are checked and raise sci if necessary.
> >>>
> >>> Any comments on this?
> >>> Anyway the current pci hotplug-related commands are somewhat broken,
> >>> and needs clean up with multifunction hotplug support which is
> >>> on my todo list.
> >>>
> >>> thanks
> >>>
> >>> On Mon, Feb 28, 2011 at 10:52:40AM +0800, Wen Congyang wrote:
> >>>> Hi Markus Armbruster
> >>>>
> >>>> At 02/23/2011 04:30 PM, Markus Armbruster Write:
> >>>>> Isaku Yamahata <yamahata@valinux.co.jp> writes:
> >>>>>
> >>>>
> >>>> <snip>
> >>>>
> >>>>>
> >>>>> I don't think this patch is correct. Let me explain.
> >>>>>
> >>>>> Device hot unplug is *not* guaranteed to succeed.
> >>>>>
> >>>>> For some buses, such as USB, it always succeeds immediately, i.e. when
> >>>>> the device_del monitor command finishes, the device is gone. Live is
> >>>>> good.
> >>>>>
> >>>>> But for PCI, device_del merely initiates the ACPI unplug rain dance. It
> >>>>> doesn't wait for the dance to complete. Why? The dance can take an
> >>>>> unpredictable amount of time, including forever.
> >>>>>
> >>>>> Problem: Subsequent device_add can fail if it reuses the qdev ID or PCI
> >>>>> slot, and the unplug has not yet completed (race condition), or it
> >>>>> failed. Yes, Virginia, PCI hotplug *can* fail.
> >>>>>
> >>>>> When unplug succeeds, the qdev is automatically destroyed.
> >>>>> pciej_write() does that for PIIX4. Looks like pcie_cap_slot_event()
> >>>>> does it for PCIE.
> >>>>
> >>>> I got a similar problem. When I unplug a pci device by hand, it works
> >>>> as expected, and I can hotplug it again. But when I use a srcipt to
> >>>> do the same thing, sometimes it failed. I think I may find another bug.
> >>>>
> >>>> Steps to reproduce this bug:
> >>>> 1. cat ./test-e1000.sh # RHEL6RC is domain name
> >>>> #! /bin/bash
> >>>>
> >>>> while true; do
> >>>> virsh attach-interface RHEL6RC network default --mac 52:54:00:1f:db:c7 --model e1000
> >>>> if [[ $? -ne 0 ]]; then
> >>>> break
> >>>> fi
> >>>> virsh detach-interface RHEL6RC network --mac 52:54:00:1f:db:c7
> >>>> if [[ $? -ne 0 ]]; then
> >>>> break
> >>>> fi
> >>>> sleep 5
> >>>> done
> >>>>
> >>>> 2. ./test-e1000.sh
> >>>> Interface attached successfully
> >>>>
> >>>> Interface detached successfully
> >>>>
> >>>> Interface attached successfully
> >>>>
> >>>> Interface detached successfully
> >>>>
> >>>> error: Failed to attach interface
> >>>> error: operation failed: adding e1000,netdev=hostnet1,id=net1,mac=52:54:00:1f:db:c7,bus=pci.0,addr=0x8 device failed: Duplicate ID 'net1' for device
> >>>>
> >>>>
> >>>> I use ftrace to trace whether sci interrupt is passed to guest OS, here is log:
> >>>> # cat trace | grep 'irq=9'
> >>>> <idle>-0 [000] 118342.634772: irq_handler_entry: irq=9 name=acpi
> >>>> <idle>-0 [000] 118342.634831: irq_handler_exit: irq=9 ret=handled
> >>>> <idle>-0 [000] 118342.693216: irq_handler_entry: irq=9 name=acpi
> >>>> <idle>-0 [000] 118342.693254: irq_handler_exit: irq=9 ret=handled
> >>>> <idle>-0 [000] 118347.737689: irq_handler_entry: irq=9 name=acpi
> >>>> <idle>-0 [000] 118347.737738: irq_handler_exit: irq=9 ret=handled
> >>>> According to the log, we only receive 3 sci interrupt, and one is lost.
> >>>>
> >>>> I enable piix4's debug and 1 line printf into piix4_device_hotplug:
> >>>> printf("slot: %d, up: %d, down: %d, state: %d\n", slot, s->pci0_status.up, s->pci0_status.down, state);
> >>>> here is the log:
> >>>> ========
> >>>> PM: mapping to 0xb000
> >>>> PM readw port=0x0004 val=0x0000
> >>>> ...
> >>>> gpe write afe2 <== 0
> >>>> gpe write afe0 <== 255
> >>>> gpe write afe3 <== 0
> >>>> gpe write afe1 <== 255
> >>>> PM readw port=0x0000 val=0x0001
> >>>> PM readw port=0x0002 val=0x0000
> >>>> gpe read afe0 == 0
> >>>> gpe read afe2 == 0
> >>>> gpe read afe1 == 0
> >>>> gpe read afe3 == 0
> >>>> PM writew port=0x0000 val=0x0020
> >>>> PM readw port=0x0002 val=0x0000
> >>>> PM writew port=0x0002 val=0x0020
> >>>> PM readw port=0x0002 val=0x0020
> >>>> gpe write afe2 <== 255
> >>>> gpe write afe3 <== 255
> >>>> ...
> >>>> slot: 6, up: 0, down: 0, state: 1 <==== we attach interface the first time
> >>>> PM readw port=0x0000 val=0x0001
> >>>> PM readw port=0x0002 val=0x0120
> >>>> gpe read afe0 == 2
> >>>> gpe read afe2 == ff
> >>>> gpe read afe2 == ff
> >>>> gpe write afe2 <== 253
> >>>> gpe read afe1 == 0
> >>>> gpe read afe3 == ff
> >>>> pcihotplug read ae00 == 40
> >>>> pcihotplug read ae04 == 0
> >>>> ...
> >>>> gpe write afe0 <== 2
> >>>> gpe write afe2 <== 255
> >>>> slot: 6, up: 64, down: 0, state: 0 <===== we detach interface the first time
> >>>> PM readw port=0x0000 val=0x0001
> >>>> PM readw port=0x0002 val=0x0120
> >>>> gpe read afe0 == 2
> >>>> gpe read afe2 == ff
> >>>> gpe read afe2 == ff
> >>>> gpe write afe2 <== 253
> >>>> gpe read afe1 == 0
> >>>> gpe read afe3 == ff
> >>>> pcihotplug read ae00 == 0
> >>>> pcihotplug read ae04 == 40
> >>>> ...
> >>>> pciej write ae08 <== 64 <=== we will call qdev_free()
> >>>> pciej write ae08 <== 64
> >>>> gpe write afe0 <== 2
> >>>> gpe write afe2 <== 255
> >>>> slot: 7, up: 0, down: 64, state: 1 <=== we attach interface the second time.
> >>>> PM readw port=0x0000 val=0x0001
> >>>> PM readw port=0x0002 val=0x0120
> >>>> gpe read afe0 == 2
> >>>> gpe read afe2 == ff
> >>>> gpe read afe2 == ff
> >>>> gpe write afe2 <== 253
> >>>> gpe read afe1 == 0
> >>>> gpe read afe3 == ff
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> pcihotplug read ae00 == 80
> >>>> pcihotplug read ae04 == 0
> >>>> slot: 7, up: 128, down: 0, state: 0 <=== we detach interface the second time
> >>>> gpe write afe0 <== 2 <=== write 2 to afe0 will cause sci interupt to be lost
> >>>> gpe write afe2 <== 255
> >>>> ===========
> >>>>
> >>>> Here is short log, and show the different between the first time and second time:
> >>>> ===========
> >>>> gpe write afe0 <== 2
> >>>> gpe write afe2 <== 255
> >>>> slot: 6, up: 64, down: 0, state: 0 <===== we detach interface the first time
> >>>> ....
> >>>> slot: 7, up: 128, down: 0, state: 0 <=== we detach interface the second time
> >>>> gpe write afe0 <== 2 <=== write 2 to afe0 will cause sci interupt to be lost
> >>>> gpe write afe2 <== 255
> >>>>
> >>>> ===========
> >>>>
> >>>> Does this behavior is a normal behavior?
> >>>>
> >>>> The following patch can fix this problem.
> >>>>
> >>>> From 3c149575b52261b8da9a73d5c4ebdfedce9866c1 Mon Sep 17 00:00:00 2001
> >>>> From: Wen Congyang <wency@cn.fujitsu.com>
> >>>> Date: Mon, 28 Feb 2011 10:33:45 +0800
> >>>> Subject: [PATCH] pend hotplug event until last event is handled by guest OS
> >>>>
> >>>> ---
> >>>> hw/acpi_piix4.c | 51 ++++++++++++++++++++++++++++++++++++++++++++-------
> >>>> 1 files changed, 44 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> >>>> index b5a2762..4ff3475 100644
> >>>> --- a/hw/acpi_piix4.c
> >>>> +++ b/hw/acpi_piix4.c
> >>>> @@ -74,6 +74,7 @@ typedef struct PIIX4PMState {
> >>>> /* for pci hotplug */
> >>>> struct gpe_regs gpe;
> >>>> struct pci_status pci0_status;
> >>>> + struct pci_status pci0_status_pending;
> >>>> uint32_t pci0_hotplug_enable;
> >>>> } PIIX4PMState;
> >>>>
> >>>> @@ -518,6 +519,19 @@ static void gpe_reset_val(uint16_t *cur, int addr, uint32_t val)
> >>>> *cur = (*cur & (0xff << (8 - shift))) | (x1 << shift);
> >>>> }
> >>>>
> >>>> +static void raise_pending_hotplug(PIIX4PMState *s)
> >>>> +{
> >>>> + if (s->pci0_status_pending.up || s->pci0_status_pending.down) {
> >>>> + s->gpe.sts |= PIIX4_PCI_HOTPLUG_STATUS;
> >>>> + s->pci0_status.up = s->pci0_status_pending.up;
> >>>> + s->pci0_status.down = s->pci0_status_pending.down;
> >>>> + s->pci0_status_pending.up = 0;
> >>>> + s->pci0_status_pending.down = 0;
> >>>> +
> >>>> + pm_update_sci(s);
> >>>> + }
> >>>> +}
> >>>> +
> >>>> static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val)
> >>>> {
> >>>> PIIX4PMState *s = opaque;
> >>>> @@ -539,6 +553,11 @@ static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val)
> >>>> pm_update_sci(s);
> >>>>
> >>>> PIIX4_DPRINTF("gpe write %x <== %d\n", addr, val);
> >>>> +
> >>>> + if (!(g->sts & PIIX4_PCI_HOTPLUG_STATUS) && g->en & PIIX4_PCI_HOTPLUG_STATUS) {
> >>>> + /* check and reraise the pending hotplug event */
> >>>> + raise_pending_hotplug(s);
> >>>> + }
> >>>> }
> >>>>
> >>>> static uint32_t pcihotplug_read(void *opaque, uint32_t addr)
> >>>> @@ -639,12 +658,22 @@ static void enable_device(PIIX4PMState *s, int slot)
> >>>> s->pci0_status.up |= (1 << slot);
> >>>> }
> >>>>
> >>>> +static void pend_enable_device(PIIX4PMState *s, int slot)
> >>>> +{
> >>>> + s->pci0_status_pending.up |= (1 << slot);
> >>>> +}
> >>>> +
> >>>> static void disable_device(PIIX4PMState *s, int slot)
> >>>> {
> >>>> s->gpe.sts |= PIIX4_PCI_HOTPLUG_STATUS;
> >>>> s->pci0_status.down |= (1 << slot);
> >>>> }
> >>>>
> >>>> +static void pend_disable_device(PIIX4PMState *s, int slot)
> >>>> +{
> >>>> + s->pci0_status_pending.down |= (1 << slot);
> >>>> +}
> >>>> +
> >>>> static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
> >>>> PCIHotplugState state)
> >>>> {
> >>>> @@ -659,15 +688,23 @@ static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
> >>>> return 0;
> >>>> }
> >>>>
> >>>> - s->pci0_status.up = 0;
> >>>> - s->pci0_status.down = 0;
> >>>> - if (state == PCI_HOTPLUG_ENABLED) {
> >>>> - enable_device(s, slot);
> >>>> + if (s->gpe.sts & PIIX4_PCI_HOTPLUG_STATUS) {
> >>>> + if (state == PCI_HOTPLUG_ENABLED) {
> >>>> + pend_enable_device(s, slot);
> >>>> + } else {
> >>>> + pend_disable_device(s, slot);
> >>>> + }
> >>>> } else {
> >>>> - disable_device(s, slot);
> >>>> - }
> >>>> + s->pci0_status.up = 0;
> >>>> + s->pci0_status.down = 0;
> >>>> + if (state == PCI_HOTPLUG_ENABLED) {
> >>>> + enable_device(s, slot);
> >>>> + } else {
> >>>> + disable_device(s, slot);
> >>>> + }
> >>>>
> >>>> - pm_update_sci(s);
> >>>> + pm_update_sci(s);
> >>>> + }
> >>>>
> >>>> return 0;
> >>>> }
> >>>> --
> >>>> 1.7.1
> >>>>
> >>>>>
> >>>>> Your patch adds a *second* qdev_free(). No good.
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
--
yamahata
next prev parent reply other threads:[~2011-03-01 9:49 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-02-22 17:36 [Qemu-devel] [PATCH] `qdev_free` when unplug a pci device William Dauchy
2011-02-23 2:50 ` Isaku Yamahata
2011-02-23 8:30 ` Markus Armbruster
2011-02-23 9:32 ` William Dauchy
2011-02-28 2:52 ` Wen Congyang
2011-03-01 4:11 ` Isaku Yamahata
2011-03-01 6:58 ` Wen Congyang
2011-03-01 7:13 ` Isaku Yamahata
2011-03-01 7:32 ` Wen Congyang
2011-03-01 9:49 ` Isaku Yamahata [this message]
2011-03-09 4:08 ` Ryan Harper
2011-03-09 5:04 ` Wen Congyang
2011-03-09 6:12 ` Ryan Harper
2011-03-09 7:19 ` Wen Congyang
2011-03-10 4:31 ` Ryan Harper
2011-03-10 5:28 ` Wen Congyang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110301094920.GL14584@valinux.co.jp \
--to=yamahata@valinux.co.jp \
--cc=armbru@redhat.com \
--cc=kraxel@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=wdauchy@gmail.com \
--cc=wency@cn.fujitsu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).