qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup
@ 2012-04-05  5:51 Alex Williamson
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 1/5] acpi_piix4: Disallow write to up/down PCI hotplug registers Alex Williamson
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Alex Williamson @ 2012-04-05  5:51 UTC (permalink / raw)
  To: qemu-devel, mst
  Cc: aliguori, gleb, jbaron, yamahata, alex.williamson, kraxel,
	pbonzini, imammedo, aurelien

We've been batting this one back and forth.  This series includes
several of the cleanups and specification clarifications from my
series awhile back.  Patch 5 is my proposed alternative to
Michael's PCI hotplug race fix.  Since that version I added slot
present tracking so we can be a little more strategic about which
slots we ask the guest to check.  The approach for that path is
described in the commit log.  I tested hotplug on both Linux an
Windows guests (XP + 2k8), intentionally trying to do back to back
device_add and device_del to get a race, but couldn't (I did
however get a glibc double free that seems unrelated to this
series).

Long term I'd like to deprecate the up/down PCI hotplug interface
and move to a new model.  I think perhaps we should define 3 new
registers:

1) Device present in slot bitmap (foundation already in 5/5 here)
2) Virtual slot power state bitmap
3) Requested slot date bitmap

With these we should be able to do proper _STA, _PS0, and _PS3.
We'd maintain the eject register for _EJ0, but deprecate up/down.

For a device_add, the 'present' & 'request' bitmaps gets set for
the slot by qemu, gpe.sts set and SCI interrupt sent.  The OSPM
can simply compare requested state vs power state and send the
appropriate notify.  Same general idea for removal.  _STA can be
formed by looking at the 'present' and 'power' bitmaps, and
_PS0/3 perform writes to the power register to indicate the virtual
state.  I think this provides a better register ownership model
(qemu owns present/request, OSPM owns power) and also allows us
to differentiate between a guest initiated remove and a platform
(vm manager) initiated remove so we can avoid changing the vm
state in the former case.  I'll try to send a patch for this out
soon, but let me know if there are any concerns or suggestions for
a redesign.  Thanks,

Alex

---

Alex Williamson (5):
      acpi_piix4: Fix PCI hotplug race
      acpi_piix4: Remove PCI_RMV_BASE write code
      acpi_piix4: Use pci_get/set_byte
      acpi_piix4: Only allow writes to PCI hotplug eject register
      acpi_piix4: Disallow write to up/down PCI hotplug registers


 docs/specs/acpi_pci_hotplug.txt |    8 +-
 hw/acpi_piix4.c                 |  178 ++++++++++++++++++++-------------------
 2 files changed, 95 insertions(+), 91 deletions(-)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH 1/5] acpi_piix4: Disallow write to up/down PCI hotplug registers
  2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
@ 2012-04-05  5:51 ` Alex Williamson
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register Alex Williamson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Alex Williamson @ 2012-04-05  5:51 UTC (permalink / raw)
  To: qemu-devel, mst
  Cc: aliguori, gleb, jbaron, yamahata, alex.williamson, kraxel,
	pbonzini, imammedo, aurelien

These are never written, clarify spec, remove access.  Split
reads into two smaller functions.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 docs/specs/acpi_pci_hotplug.txt |    4 ++--
 hw/acpi_piix4.c                 |   42 ++++++++++++---------------------------
 2 files changed, 15 insertions(+), 31 deletions(-)

diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
index f0f74a7..1e2c8a2 100644
--- a/docs/specs/acpi_pci_hotplug.txt
+++ b/docs/specs/acpi_pci_hotplug.txt
@@ -15,14 +15,14 @@ PCI slot injection notification pending (IO port 0xae00-0xae03, 4-byte access):
 Slot injection notification pending. One bit per slot.
 
 Read by ACPI BIOS GPE.1 handler to notify OS of injection
-events.
+events.  Read-only.
 
 PCI slot removal notification (IO port 0xae04-0xae07, 4-byte access):
 -----------------------------------------------------
 Slot removal notification pending. One bit per slot.
 
 Read by ACPI BIOS GPE.1 handler to notify OS of removal
-events.
+events.  Read-only.
 
 PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
 ----------------------------------------
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 0c77730..44d1423 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -41,7 +41,8 @@
 #define GPE_BASE 0xafe0
 #define PROC_BASE 0xaf00
 #define GPE_LEN 4
-#define PCI_BASE 0xae00
+#define PCI_UP_BASE 0xae00
+#define PCI_DOWN_BASE 0xae04
 #define PCI_EJ_BASE 0xae08
 #define PCI_RMV_BASE 0xae0c
 
@@ -468,38 +469,22 @@ static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val)
     PIIX4_DPRINTF("gpe write %x <== %d\n", addr, val);
 }
 
-static uint32_t pcihotplug_read(void *opaque, uint32_t addr)
+static uint32_t pci_up_read(void *opaque, uint32_t addr)
 {
-    uint32_t val = 0;
-    struct pci_status *g = opaque;
-    switch (addr) {
-        case PCI_BASE:
-            val = g->up;
-            break;
-        case PCI_BASE + 4:
-            val = g->down;
-            break;
-        default:
-            break;
-    }
+    PIIX4PMState *s = opaque;
+    uint32_t val = s->pci0_status.up;
 
-    PIIX4_DPRINTF("pcihotplug read %x == %x\n", addr, val);
+    PIIX4_DPRINTF("pci_up_read %x\n", val);
     return val;
 }
 
-static void pcihotplug_write(void *opaque, uint32_t addr, uint32_t val)
+static uint32_t pci_down_read(void *opaque, uint32_t addr)
 {
-    struct pci_status *g = opaque;
-    switch (addr) {
-        case PCI_BASE:
-            g->up = val;
-            break;
-        case PCI_BASE + 4:
-            g->down = val;
-            break;
-   }
+    PIIX4PMState *s = opaque;
+    uint32_t val = s->pci0_status.down;
 
-    PIIX4_DPRINTF("pcihotplug write %x <== %d\n", addr, val);
+    PIIX4_DPRINTF("pci_down_read %x\n", val);
+    return val;
 }
 
 static uint32_t pciej_read(void *opaque, uint32_t addr)
@@ -545,7 +530,6 @@ static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
 
 static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
 {
-    struct pci_status *pci0_status = &s->pci0_status;
     int i = 0, cpus = smp_cpus;
 
     while (cpus > 0) {
@@ -560,8 +544,8 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
     register_ioport_write(PROC_BASE, 32, 1, gpe_writeb, s);
     register_ioport_read(PROC_BASE, 32, 1,  gpe_readb, s);
 
-    register_ioport_write(PCI_BASE, 8, 4, pcihotplug_write, pci0_status);
-    register_ioport_read(PCI_BASE, 8, 4,  pcihotplug_read, pci0_status);
+    register_ioport_read(PCI_UP_BASE, 4, 4, pci_up_read, s);
+    register_ioport_read(PCI_DOWN_BASE, 4, 4, pci_down_read, s);
 
     register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, bus);
     register_ioport_read(PCI_EJ_BASE, 4, 4,  pciej_read, bus);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 1/5] acpi_piix4: Disallow write to up/down PCI hotplug registers Alex Williamson
@ 2012-04-05  5:51 ` Alex Williamson
  2012-04-05  8:21   ` Igor Mammedov
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 3/5] acpi_piix4: Use pci_get/set_byte Alex Williamson
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Alex Williamson @ 2012-04-05  5:51 UTC (permalink / raw)
  To: qemu-devel, mst
  Cc: aliguori, gleb, jbaron, yamahata, alex.williamson, kraxel,
	pbonzini, imammedo, aurelien

This is never read.  We can also derive bus from the write handler,
making this more inline with the other callbacks.  Note that
pciej_write was actually called with (PCIBus *)dev->bus, which is
cast as a void* allowing us to pretend it's a BusState*.  Fix this
so we don't depend on the BusState location within PCIBus.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 docs/specs/acpi_pci_hotplug.txt |    2 +-
 hw/acpi_piix4.c                 |   14 ++++----------
 2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
index 1e2c8a2..1e61d19 100644
--- a/docs/specs/acpi_pci_hotplug.txt
+++ b/docs/specs/acpi_pci_hotplug.txt
@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
 ----------------------------------------
 
 Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
-Reads return 0.
+Read-only.
 
 PCI removability status (IO port 0xae0c-0xae0f, 4-byte access):
 -----------------------------------------------
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 44d1423..6ee832a 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -487,15 +487,11 @@ static uint32_t pci_down_read(void *opaque, uint32_t addr)
     return val;
 }
 
-static uint32_t pciej_read(void *opaque, uint32_t addr)
-{
-    PIIX4_DPRINTF("pciej read %x\n", addr);
-    return 0;
-}
-
 static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
 {
-    BusState *bus = opaque;
+    PIIX4PMState *s = opaque;
+    PCIDevice *dev = &s->dev;
+    BusState *bus = qdev_get_parent_bus(&dev->qdev);
     DeviceState *qdev, *next;
     int slot = ffs(val) - 1;
 
@@ -507,7 +503,6 @@ static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
         }
     }
 
-
     PIIX4_DPRINTF("pciej write %x <== %d\n", addr, val);
 }
 
@@ -547,8 +542,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
     register_ioport_read(PCI_UP_BASE, 4, 4, pci_up_read, s);
     register_ioport_read(PCI_DOWN_BASE, 4, 4, pci_down_read, s);
 
-    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, bus);
-    register_ioport_read(PCI_EJ_BASE, 4, 4,  pciej_read, bus);
+    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, s);
 
     register_ioport_write(PCI_RMV_BASE, 4, 4, pcirmv_write, s);
     register_ioport_read(PCI_RMV_BASE, 4, 4,  pcirmv_read, s);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH 3/5] acpi_piix4: Use pci_get/set_byte
  2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 1/5] acpi_piix4: Disallow write to up/down PCI hotplug registers Alex Williamson
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register Alex Williamson
@ 2012-04-05  5:51 ` Alex Williamson
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 4/5] acpi_piix4: Remove PCI_RMV_BASE write code Alex Williamson
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Alex Williamson @ 2012-04-05  5:51 UTC (permalink / raw)
  To: qemu-devel, mst
  Cc: aliguori, gleb, jbaron, yamahata, alex.williamson, kraxel,
	pbonzini, imammedo, aurelien

Remove stray direct access

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/acpi_piix4.c |   53 +++++++++++++++++++++++++++--------------------------
 1 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 6ee832a..c1b1532 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -173,11 +173,12 @@ static const IORangeOps pm_iorange_ops = {
 static void apm_ctrl_changed(uint32_t val, void *arg)
 {
     PIIX4PMState *s = arg;
+    PCIDevice *dev = &s->dev;
 
     /* ACPI specs 3.0, 4.7.2.5 */
     acpi_pm1_cnt_update(&s->ar, val == ACPI_ENABLE, val == ACPI_DISABLE);
 
-    if (s->dev.config[0x5b] & (1 << 1)) {
+    if (pci_get_byte(dev->config + 0x5b) & (1 << 1)) {
         if (s->smi_irq) {
             qemu_irq_raise(s->smi_irq);
         }
@@ -191,10 +192,11 @@ static void acpi_dbg_writel(void *opaque, uint32_t addr, uint32_t val)
 
 static void pm_io_space_update(PIIX4PMState *s)
 {
+    PCIDevice *dev = &s->dev;
     uint32_t pm_io_base;
 
-    if (s->dev.config[0x80] & 1) {
-        pm_io_base = le32_to_cpu(*(uint32_t *)(s->dev.config + 0x40));
+    if (pci_get_byte(dev->config + 0x80) & 1) {
+        pm_io_base = pci_get_long(dev->config + 0x40);
         pm_io_base &= 0xffc0;
 
         /* XXX: need to improve memory and ioport allocation */
@@ -297,16 +299,16 @@ static void piix4_update_hotplug(PIIX4PMState *s)
 static void piix4_reset(void *opaque)
 {
     PIIX4PMState *s = opaque;
-    uint8_t *pci_conf = s->dev.config;
+    PCIDevice *dev = &s->dev;
 
-    pci_conf[0x58] = 0;
-    pci_conf[0x59] = 0;
-    pci_conf[0x5a] = 0;
-    pci_conf[0x5b] = 0;
+    pci_set_byte(dev->config + 0x58, 0);
+    pci_set_byte(dev->config + 0x59, 0);
+    pci_set_byte(dev->config + 0x5a, 0);
+    pci_set_byte(dev->config + 0x5b, 0);
 
     if (s->kvm_enabled) {
         /* Mark SMM as already inited (until KVM supports SMM). */
-        pci_conf[0x5B] = 0x02;
+        pci_set_byte(dev->config + 0x5B, 0x02);
     }
     piix4_update_hotplug(s);
 }
@@ -322,13 +324,14 @@ static void piix4_powerdown(void *opaque, int irq, int power_failing)
 static void piix4_pm_machine_ready(Notifier *n, void *opaque)
 {
     PIIX4PMState *s = container_of(n, PIIX4PMState, machine_ready);
-    uint8_t *pci_conf;
+    PCIDevice *dev = &s->dev;
 
-    pci_conf = s->dev.config;
-    pci_conf[0x5f] = (isa_is_ioport_assigned(0x378) ? 0x80 : 0) | 0x10;
-    pci_conf[0x63] = 0x60;
-    pci_conf[0x67] = (isa_is_ioport_assigned(0x3f8) ? 0x08 : 0) |
-	(isa_is_ioport_assigned(0x2f8) ? 0x90 : 0);
+    pci_set_byte(dev->config + 0x5f,
+                 (isa_is_ioport_assigned(0x378) ? 0x80 : 0) | 0x10);
+    pci_set_byte(dev->config + 0x63, 0x60);
+    pci_set_byte(dev->config + 0x67,
+                 (isa_is_ioport_assigned(0x3f8) ? 0x08 : 0) |
+                 (isa_is_ioport_assigned(0x2f8) ? 0x90 : 0));
 
 }
 
@@ -337,18 +340,16 @@ static PIIX4PMState *global_piix4_pm_state; /* cpu hotadd */
 static int piix4_pm_initfn(PCIDevice *dev)
 {
     PIIX4PMState *s = DO_UPCAST(PIIX4PMState, dev, dev);
-    uint8_t *pci_conf;
 
     /* for cpu hotadd */
     global_piix4_pm_state = s;
 
-    pci_conf = s->dev.config;
-    pci_conf[0x06] = 0x80;
-    pci_conf[0x07] = 0x02;
-    pci_conf[0x09] = 0x00;
-    pci_conf[0x3d] = 0x01; // interrupt pin 1
+    pci_set_byte(dev->config + 0x06, 0x80);
+    pci_set_byte(dev->config + 0x07, 0x02);
+    pci_set_byte(dev->config + 0x09, 0x00);
+    pci_set_byte(dev->config + 0x3d, 0x01); /* interrupt pin 1 */
 
-    pci_conf[0x40] = 0x01; /* PM io base read only bit */
+    pci_set_byte(dev->config + 0x40, 0x01); /* PM io base read only bit */
 
     /* APM */
     apm_init(&s->apm, apm_ctrl_changed, s);
@@ -358,14 +359,14 @@ static int piix4_pm_initfn(PCIDevice *dev)
     if (s->kvm_enabled) {
         /* Mark SMM as already inited to prevent SMM from running.  KVM does not
          * support SMM mode. */
-        pci_conf[0x5B] = 0x02;
+        pci_set_byte(dev->config + 0x5B, 0x02);
     }
 
     /* XXX: which specification is used ? The i82731AB has different
        mappings */
-    pci_conf[0x90] = s->smb_io_base | 1;
-    pci_conf[0x91] = s->smb_io_base >> 8;
-    pci_conf[0xd2] = 0x09;
+    pci_set_byte(dev->config + 0x90, s->smb_io_base | 1);
+    pci_set_byte(dev->config + 0x91, s->smb_io_base >> 8);
+    pci_set_byte(dev->config + 0xd2, 0x09);
     register_ioport_write(s->smb_io_base, 64, 1, smb_ioport_writeb, &s->smb);
     register_ioport_read(s->smb_io_base, 64, 1, smb_ioport_readb, &s->smb);
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH 4/5] acpi_piix4: Remove PCI_RMV_BASE write code
  2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
                   ` (2 preceding siblings ...)
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 3/5] acpi_piix4: Use pci_get/set_byte Alex Williamson
@ 2012-04-05  5:51 ` Alex Williamson
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 5/5] acpi_piix4: Fix PCI hotplug race Alex Williamson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Alex Williamson @ 2012-04-05  5:51 UTC (permalink / raw)
  To: qemu-devel, mst
  Cc: aliguori, gleb, jbaron, yamahata, alex.williamson, kraxel,
	pbonzini, imammedo, aurelien

Achieves the same result with less code.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 docs/specs/acpi_pci_hotplug.txt |    2 +-
 hw/acpi_piix4.c                 |    6 ------
 2 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
index 1e61d19..44a28da 100644
--- a/docs/specs/acpi_pci_hotplug.txt
+++ b/docs/specs/acpi_pci_hotplug.txt
@@ -34,4 +34,4 @@ PCI removability status (IO port 0xae0c-0xae0f, 4-byte access):
 -----------------------------------------------
 
 Used by ACPI BIOS _RMV method to indicate removability status to OS. One
-bit per slot.
+bit per slot.  Read-only
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index c1b1532..a6fe260 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -514,11 +514,6 @@ static uint32_t pcirmv_read(void *opaque, uint32_t addr)
     return s->pci0_hotplug_enable;
 }
 
-static void pcirmv_write(void *opaque, uint32_t addr, uint32_t val)
-{
-    return;
-}
-
 extern const char *global_cpu_model;
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
@@ -545,7 +540,6 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
 
     register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, s);
 
-    register_ioport_write(PCI_RMV_BASE, 4, 4, pcirmv_write, s);
     register_ioport_read(PCI_RMV_BASE, 4, 4,  pcirmv_read, s);
 
     pci_bus_hotplug(bus, piix4_device_hotplug, &s->dev.qdev);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Qemu-devel] [PATCH 5/5] acpi_piix4: Fix PCI hotplug race
  2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
                   ` (3 preceding siblings ...)
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 4/5] acpi_piix4: Remove PCI_RMV_BASE write code Alex Williamson
@ 2012-04-05  5:51 ` Alex Williamson
  2012-04-05 10:03 ` [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Michael S. Tsirkin
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Alex Williamson @ 2012-04-05  5:51 UTC (permalink / raw)
  To: qemu-devel, mst
  Cc: aliguori, gleb, jbaron, yamahata, alex.williamson, kraxel,
	pbonzini, imammedo, aurelien

As Michael Tsirkin demonstrated, current PCI hotplug is vulnerable
to a few races.  The first is a race with other hotplug operations
because we clear the up & down registers at each event.  If a new
event comes before the last is processed, up/down is cleared and
the event is lost.

To fix this for the down register, we create a life cycle for
the event request that starts with the hot unplug request in
piix4_device_hotplug() and ends when the device is ejected.
This allows us to mask and clear individual bits, preserving them
against races.  For the up register, we have no clear end point
for when the event is finished.  We could modify the BIOS to
acknowledge the bit and clear it, but this creates BIOS compatibiliy
issues without offering a complete solution.  Instead we note that
gratuitous ACPI device checks are not harmful, which allows us to
issue a device check for every slot.  We know which slots are present
and we know which slots are hotpluggable, so we can easily reduce
this to a more manageable set for the guest.

The other race Michael noted was that an unplug request followed
by reset may also lose the eject notification, which may also
result in the eject request being lost which a subsequent add
or remove.  Once we're in reset, the device is unused and we can
flush the queue of device removals ourselves.  Previously if a
device_del was issued to a guest without ACPI PCI hotplug support,
it was necessary to shutdown the guest to recover the device.
With this, a guest reboot is sufficient.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/acpi_piix4.c |   71 ++++++++++++++++++++++++++++++++++++++++---------------
 1 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index a6fe260..8ab4053 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -54,7 +54,7 @@ struct gpe_regs {
 };
 
 struct pci_status {
-    uint32_t up;
+    uint32_t up; /* deprecated, maintained for migration compatibility */
     uint32_t down;
 };
 
@@ -77,6 +77,7 @@ typedef struct PIIX4PMState {
     struct gpe_regs gpe_cpu;
     struct pci_status pci0_status;
     uint32_t pci0_hotplug_enable;
+    uint32_t pci0_slot_device_present;
 } PIIX4PMState;
 
 static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s);
@@ -214,6 +215,17 @@ static void pm_write_config(PCIDevice *d,
         pm_io_space_update((PIIX4PMState *)d);
 }
 
+static void vmstate_pci_status_pre_save(void *opaque)
+{
+    struct pci_status *pci0_status = opaque;
+    PIIX4PMState *s = container_of(pci0_status, PIIX4PMState, pci0_status);
+
+    /* We no longer track up, so build a safe value for migrating
+     * to a version that still does... of course these might get lost
+     * by an old buggy implementation, but we try. */
+    pci0_status->up = s->pci0_slot_device_present & s->pci0_hotplug_enable;
+}
+
 static int vmstate_acpi_post_load(void *opaque, int version_id)
 {
     PIIX4PMState *s = opaque;
@@ -249,6 +261,7 @@ static const VMStateDescription vmstate_pci_status = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
+    .pre_save = vmstate_pci_status_pre_save,
     .fields      = (VMStateField []) {
         VMSTATE_UINT32(up, struct pci_status),
         VMSTATE_UINT32(down, struct pci_status),
@@ -277,13 +290,38 @@ static const VMStateDescription vmstate_acpi = {
     }
 };
 
+static void acpi_piix_eject_slot(PIIX4PMState *s, unsigned slots)
+{
+    DeviceState *qdev, *next;
+    BusState *bus = qdev_get_parent_bus(&s->dev.qdev);
+    int slot = ffs(slots) - 1;
+
+    /* Mark request as complete */
+    s->pci0_status.down &= ~(1U << slot);
+
+    QTAILQ_FOREACH_SAFE(qdev, &bus->children, sibling, next) {
+        PCIDevice *dev = PCI_DEVICE(qdev);
+        PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
+        if (PCI_SLOT(dev->devfn) == slot && !pc->no_hotplug) {
+            s->pci0_slot_device_present &= ~(1U << slot);
+            qdev_free(qdev);
+        }
+    }
+}
+
 static void piix4_update_hotplug(PIIX4PMState *s)
 {
     PCIDevice *dev = &s->dev;
     BusState *bus = qdev_get_parent_bus(&dev->qdev);
     DeviceState *qdev, *next;
 
+    /* Execute any pending removes during reset */
+    while (s->pci0_status.down) {
+        acpi_piix_eject_slot(s, s->pci0_status.down);
+    }
+
     s->pci0_hotplug_enable = ~0;
+    s->pci0_slot_device_present = 0;
 
     QTAILQ_FOREACH_SAFE(qdev, &bus->children, sibling, next) {
         PCIDevice *pdev = PCI_DEVICE(qdev);
@@ -291,8 +329,10 @@ static void piix4_update_hotplug(PIIX4PMState *s)
         int slot = PCI_SLOT(pdev->devfn);
 
         if (pc->no_hotplug) {
-            s->pci0_hotplug_enable &= ~(1 << slot);
+            s->pci0_hotplug_enable &= ~(1U << slot);
         }
+
+        s->pci0_slot_device_present |= (1U << slot);
     }
 }
 
@@ -473,7 +513,11 @@ static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val)
 static uint32_t pci_up_read(void *opaque, uint32_t addr)
 {
     PIIX4PMState *s = opaque;
-    uint32_t val = s->pci0_status.up;
+    uint32_t val;
+
+    /* Manufacture an "up" value to cause a device check on any hotplug
+     * slot with a device.  Extra device checks are harmless. */
+    val = s->pci0_slot_device_present & s->pci0_hotplug_enable;
 
     PIIX4_DPRINTF("pci_up_read %x\n", val);
     return val;
@@ -490,19 +534,7 @@ static uint32_t pci_down_read(void *opaque, uint32_t addr)
 
 static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
 {
-    PIIX4PMState *s = opaque;
-    PCIDevice *dev = &s->dev;
-    BusState *bus = qdev_get_parent_bus(&dev->qdev);
-    DeviceState *qdev, *next;
-    int slot = ffs(val) - 1;
-
-    QTAILQ_FOREACH_SAFE(qdev, &bus->children, sibling, next) {
-        PCIDevice *dev = PCI_DEVICE(qdev);
-        PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
-        if (PCI_SLOT(dev->devfn) == slot && !pc->no_hotplug) {
-            qdev_free(qdev);
-        }
-    }
+    acpi_piix_eject_slot(opaque, val);
 
     PIIX4_DPRINTF("pciej write %x <== %d\n", addr, val);
 }
@@ -590,13 +622,13 @@ void qemu_system_cpu_hot_add(int cpu, int state)
 static void enable_device(PIIX4PMState *s, int slot)
 {
     s->ar.gpe.sts[0] |= PIIX4_PCI_HOTPLUG_STATUS;
-    s->pci0_status.up |= (1 << slot);
+    s->pci0_slot_device_present |= (1U << slot);
 }
 
 static void disable_device(PIIX4PMState *s, int slot)
 {
     s->ar.gpe.sts[0] |= PIIX4_PCI_HOTPLUG_STATUS;
-    s->pci0_status.down |= (1 << slot);
+    s->pci0_status.down |= (1U << slot);
 }
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
@@ -610,11 +642,10 @@ static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
      * it is present on boot, no hotplug event is necessary. We do send an
      * event when the device is disabled later. */
     if (state == PCI_COLDPLUG_ENABLED) {
+        s->pci0_slot_device_present |= (1U << slot);
         return 0;
     }
 
-    s->pci0_status.up = 0;
-    s->pci0_status.down = 0;
     if (state == PCI_HOTPLUG_ENABLED) {
         enable_device(s, slot);
     } else {

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register Alex Williamson
@ 2012-04-05  8:21   ` Igor Mammedov
  2012-04-05  9:04     ` Michael S. Tsirkin
  2012-04-05 14:28     ` Alex Williamson
  0 siblings, 2 replies; 23+ messages in thread
From: Igor Mammedov @ 2012-04-05  8:21 UTC (permalink / raw)
  To: Alex Williamson
  Cc: aliguori, gleb, mst, jbaron, qemu-devel, yamahata, kraxel,
	pbonzini, aurelien

On 04/05/2012 07:51 AM, Alex Williamson wrote:
> This is never read.  We can also derive bus from the write handler,
> making this more inline with the other callbacks.  Note that
> pciej_write was actually called with (PCIBus *)dev->bus, which is
> cast as a void* allowing us to pretend it's a BusState*.  Fix this
> so we don't depend on the BusState location within PCIBus.
>
> Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> ---
>
>   docs/specs/acpi_pci_hotplug.txt |    2 +-
>   hw/acpi_piix4.c                 |   14 ++++----------
>   2 files changed, 5 insertions(+), 11 deletions(-)
>
> diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> index 1e2c8a2..1e61d19 100644
> --- a/docs/specs/acpi_pci_hotplug.txt
> +++ b/docs/specs/acpi_pci_hotplug.txt
> @@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
>   ----------------------------------------
>
>   Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> -Reads return 0.
> +Read-only.
Write-only perhaps?

>
>   PCI removability status (IO port 0xae0c-0xae0f, 4-byte access):
>   -----------------------------------------------
> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> index 44d1423..6ee832a 100644
> --- a/hw/acpi_piix4.c
> +++ b/hw/acpi_piix4.c
> @@ -487,15 +487,11 @@ static uint32_t pci_down_read(void *opaque, uint32_t addr)
>       return val;
>   }
>
> -static uint32_t pciej_read(void *opaque, uint32_t addr)
> -{
> -    PIIX4_DPRINTF("pciej read %x\n", addr);
> -    return 0;
> -}
> -
what will happen if bios tries to read from reg?

>   static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
>   {
> -    BusState *bus = opaque;
> +    PIIX4PMState *s = opaque;
> +    PCIDevice *dev =&s->dev;
> +    BusState *bus = qdev_get_parent_bus(&dev->qdev);
>       DeviceState *qdev, *next;
>       int slot = ffs(val) - 1;
>
> @@ -507,7 +503,6 @@ static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
>           }
>       }
>
> -
>       PIIX4_DPRINTF("pciej write %x<== %d\n", addr, val);
>   }
>
> @@ -547,8 +542,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
>       register_ioport_read(PCI_UP_BASE, 4, 4, pci_up_read, s);
>       register_ioport_read(PCI_DOWN_BASE, 4, 4, pci_down_read, s);
>
> -    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, bus);
> -    register_ioport_read(PCI_EJ_BASE, 4, 4,  pciej_read, bus);
> +    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, s);
>
>       register_ioport_write(PCI_RMV_BASE, 4, 4, pcirmv_write, s);
>       register_ioport_read(PCI_RMV_BASE, 4, 4,  pcirmv_read, s);
>

-- 
-----
  Igor

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  8:21   ` Igor Mammedov
@ 2012-04-05  9:04     ` Michael S. Tsirkin
  2012-04-05  9:12       ` Gleb Natapov
  2012-04-05 14:28     ` Alex Williamson
  1 sibling, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2012-04-05  9:04 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: aliguori, gleb, jbaron, qemu-devel, yamahata, Alex Williamson,
	kraxel, pbonzini, aurelien

On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> On 04/05/2012 07:51 AM, Alex Williamson wrote:
> >This is never read.  We can also derive bus from the write handler,
> >making this more inline with the other callbacks.  Note that
> >pciej_write was actually called with (PCIBus *)dev->bus, which is
> >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> >so we don't depend on the BusState location within PCIBus.
> >
> >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> >---
> >
> >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> >  hw/acpi_piix4.c                 |   14 ++++----------
> >  2 files changed, 5 insertions(+), 11 deletions(-)
> >
> >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> >index 1e2c8a2..1e61d19 100644
> >--- a/docs/specs/acpi_pci_hotplug.txt
> >+++ b/docs/specs/acpi_pci_hotplug.txt
> >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> >  ----------------------------------------
> >
> >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> >-Reads return 0.
> >+Read-only.
> Write-only perhaps?

Yes, let's also specify what happens in practice.
I think it is 'Guest should never read this register, in practice
0 is returned'.

> >
> >  PCI removability status (IO port 0xae0c-0xae0f, 4-byte access):
> >  -----------------------------------------------
> >diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> >index 44d1423..6ee832a 100644
> >--- a/hw/acpi_piix4.c
> >+++ b/hw/acpi_piix4.c
> >@@ -487,15 +487,11 @@ static uint32_t pci_down_read(void *opaque, uint32_t addr)
> >      return val;
> >  }
> >
> >-static uint32_t pciej_read(void *opaque, uint32_t addr)
> >-{
> >-    PIIX4_DPRINTF("pciej read %x\n", addr);
> >-    return 0;
> >-}
> >-
> what will happen if bios tries to read from reg?

I think it currently gets 0.

> >  static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
> >  {
> >-    BusState *bus = opaque;
> >+    PIIX4PMState *s = opaque;
> >+    PCIDevice *dev =&s->dev;
> >+    BusState *bus = qdev_get_parent_bus(&dev->qdev);
> >      DeviceState *qdev, *next;
> >      int slot = ffs(val) - 1;
> >
> >@@ -507,7 +503,6 @@ static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
> >          }
> >      }
> >
> >-
> >      PIIX4_DPRINTF("pciej write %x<== %d\n", addr, val);
> >  }
> >
> >@@ -547,8 +542,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
> >      register_ioport_read(PCI_UP_BASE, 4, 4, pci_up_read, s);
> >      register_ioport_read(PCI_DOWN_BASE, 4, 4, pci_down_read, s);
> >
> >-    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, bus);
> >-    register_ioport_read(PCI_EJ_BASE, 4, 4,  pciej_read, bus);
> >+    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, s);
> >
> >      register_ioport_write(PCI_RMV_BASE, 4, 4, pcirmv_write, s);
> >      register_ioport_read(PCI_RMV_BASE, 4, 4,  pcirmv_read, s);
> >
> 
> -- 
> -----
>  Igor

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  9:04     ` Michael S. Tsirkin
@ 2012-04-05  9:12       ` Gleb Natapov
  2012-04-05  9:37         ` Michael S. Tsirkin
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2012-04-05  9:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: aliguori, jbaron, qemu-devel, yamahata, Alex Williamson, kraxel,
	pbonzini, Igor Mammedov, aurelien

On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > >This is never read.  We can also derive bus from the write handler,
> > >making this more inline with the other callbacks.  Note that
> > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > >so we don't depend on the BusState location within PCIBus.
> > >
> > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > >---
> > >
> > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > >  hw/acpi_piix4.c                 |   14 ++++----------
> > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > >
> > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > >index 1e2c8a2..1e61d19 100644
> > >--- a/docs/specs/acpi_pci_hotplug.txt
> > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > >  ----------------------------------------
> > >
> > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > >-Reads return 0.
> > >+Read-only.
> > Write-only perhaps?
> 
> Yes, let's also specify what happens in practice.
No we shouldn't.

> I think it is 'Guest should never read this register, in practice
> 0 is returned'.
> 
In practice kitten die for each read. Unspecified behaviour is
unspecified.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  9:12       ` Gleb Natapov
@ 2012-04-05  9:37         ` Michael S. Tsirkin
  2012-04-05  9:40           ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2012-04-05  9:37 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: aliguori, jbaron, qemu-devel, yamahata, Alex Williamson, kraxel,
	pbonzini, Igor Mammedov, aurelien

On Thu, Apr 05, 2012 at 12:12:37PM +0300, Gleb Natapov wrote:
> On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > > >This is never read.  We can also derive bus from the write handler,
> > > >making this more inline with the other callbacks.  Note that
> > > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > > >so we don't depend on the BusState location within PCIBus.
> > > >
> > > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > > >---
> > > >
> > > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > > >  hw/acpi_piix4.c                 |   14 ++++----------
> > > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > > >
> > > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > > >index 1e2c8a2..1e61d19 100644
> > > >--- a/docs/specs/acpi_pci_hotplug.txt
> > > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > >  ----------------------------------------
> > > >
> > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > > >-Reads return 0.
> > > >+Read-only.
> > > Write-only perhaps?
> > 
> > Yes, let's also specify what happens in practice.
> No we shouldn't.
> 
> > I think it is 'Guest should never read this register, in practice
> > 0 is returned'.
> > 
> In practice kitten die for each read. Unspecified behaviour is
> unspecified.


Why, what are you worried about? I just want to document what we do.

The reason I want to specify behaviour on read is because down
the road we might want to return something here. Our lives
will be easier if we have a document which we can read
and figure out what old qemu did.


> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  9:37         ` Michael S. Tsirkin
@ 2012-04-05  9:40           ` Gleb Natapov
  2012-04-05  9:53             ` Michael S. Tsirkin
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2012-04-05  9:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: aliguori, jbaron, qemu-devel, yamahata, Alex Williamson, kraxel,
	pbonzini, Igor Mammedov, aurelien

On Thu, Apr 05, 2012 at 12:37:01PM +0300, Michael S. Tsirkin wrote:
> On Thu, Apr 05, 2012 at 12:12:37PM +0300, Gleb Natapov wrote:
> > On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > > > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > > > >This is never read.  We can also derive bus from the write handler,
> > > > >making this more inline with the other callbacks.  Note that
> > > > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > > > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > > > >so we don't depend on the BusState location within PCIBus.
> > > > >
> > > > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > > > >---
> > > > >
> > > > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > > > >  hw/acpi_piix4.c                 |   14 ++++----------
> > > > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > > > >
> > > > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > > > >index 1e2c8a2..1e61d19 100644
> > > > >--- a/docs/specs/acpi_pci_hotplug.txt
> > > > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > > > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > > >  ----------------------------------------
> > > > >
> > > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > > > >-Reads return 0.
> > > > >+Read-only.
> > > > Write-only perhaps?
> > > 
> > > Yes, let's also specify what happens in practice.
> > No we shouldn't.
> > 
> > > I think it is 'Guest should never read this register, in practice
> > > 0 is returned'.
> > > 
> > In practice kitten die for each read. Unspecified behaviour is
> > unspecified.
> 
> 
> Why, what are you worried about? I just want to document what we do.
> 
You are making undefined behaviour to be defined one.

> The reason I want to specify behaviour on read is because down
> the road we might want to return something here. Our lives
> will be easier if we have a document which we can read
> and figure out what old qemu did.
> 
You can do all that only if behaviour is undefined. If it is defined you
can't change it. Our lives will be easier if we will leave undefined
behaviour undefined.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  9:40           ` Gleb Natapov
@ 2012-04-05  9:53             ` Michael S. Tsirkin
  2012-04-05 10:08               ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2012-04-05  9:53 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: aliguori, jbaron, qemu-devel, yamahata, Alex Williamson, kraxel,
	pbonzini, Igor Mammedov, aurelien

On Thu, Apr 05, 2012 at 12:40:00PM +0300, Gleb Natapov wrote:
> On Thu, Apr 05, 2012 at 12:37:01PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Apr 05, 2012 at 12:12:37PM +0300, Gleb Natapov wrote:
> > > On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> > > > On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > > > > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > > > > >This is never read.  We can also derive bus from the write handler,
> > > > > >making this more inline with the other callbacks.  Note that
> > > > > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > > > > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > > > > >so we don't depend on the BusState location within PCIBus.
> > > > > >
> > > > > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > > > > >---
> > > > > >
> > > > > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > > > > >  hw/acpi_piix4.c                 |   14 ++++----------
> > > > > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > > > > >
> > > > > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > > > > >index 1e2c8a2..1e61d19 100644
> > > > > >--- a/docs/specs/acpi_pci_hotplug.txt
> > > > > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > > > > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > > > >  ----------------------------------------
> > > > > >
> > > > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > > > > >-Reads return 0.
> > > > > >+Read-only.
> > > > > Write-only perhaps?
> > > > 
> > > > Yes, let's also specify what happens in practice.
> > > No we shouldn't.
> > > 
> > > > I think it is 'Guest should never read this register, in practice
> > > > 0 is returned'.
> > > > 
> > > In practice kitten die for each read. Unspecified behaviour is
> > > unspecified.
> > 
> > 
> > Why, what are you worried about? I just want to document what we do.
> > 
> You are making undefined behaviour to be defined one.
> 
> > The reason I want to specify behaviour on read is because down
> > the road we might want to return something here. Our lives
> > will be easier if we have a document which we can read
> > and figure out what old qemu did.
> > 
> You can do all that only if behaviour is undefined. If it is defined you
> can't change it.

We are doing just that constantly, just be careful.
Documenting what happens will make it easier.

> Our lives will be easier if we will leave undefined
> behaviour undefined.

So yes live it undefined for guests but document what happens
for ourselves. Let's make it explicit, say
'current implementation returns 0 this can
 change at any time without notice.'

I want to go further. For up/down I would like to
document pas behaviour as well
'past implementations made the registers
read-write, writing there would clobber
outstanding hotplug requests.
current implementation makes the register read-only,
writes are discarded.
'

Just undefined is vague. If someone bothered
documenting the current undefined behavour
with registers being writeable instead of
undefined, then people would detect this
as a bug and this would have been fixed.


> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup
  2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
                   ` (4 preceding siblings ...)
  2012-04-05  5:51 ` [Qemu-devel] [PATCH 5/5] acpi_piix4: Fix PCI hotplug race Alex Williamson
@ 2012-04-05 10:03 ` Michael S. Tsirkin
  2012-04-05 11:32 ` Gleb Natapov
  2012-04-05 12:31 ` Michael S. Tsirkin
  7 siblings, 0 replies; 23+ messages in thread
From: Michael S. Tsirkin @ 2012-04-05 10:03 UTC (permalink / raw)
  To: Alex Williamson
  Cc: aliguori, gleb, jbaron, qemu-devel, yamahata, kraxel, pbonzini,
	imammedo, aurelien

On Wed, Apr 04, 2012 at 11:51:00PM -0600, Alex Williamson wrote:
> We've been batting this one back and forth.  This series includes
> several of the cleanups and specification clarifications from my
> series awhile back.  Patch 5 is my proposed alternative to
> Michael's PCI hotplug race fix.  Since that version I added slot
> present tracking so we can be a little more strategic about which
> slots we ask the guest to check.  The approach for that path is
> described in the commit log.

So I would have organized it differently,
with a bugfix first and the rework on top
(e.g. make life easier if we want to cherry-pick this change
to the stable branch) and there's a bit of
discussion about documentation but I'm prepared to apply
this as is too if you prefer.

For the series:
Acked-by: Michael S. Tsirkin <mst@redhat.com>

>  I tested hotplug on both Linux an
> Windows guests (XP + 2k8), intentionally trying to do back to back
> device_add and device_del to get a race, but couldn't (I did
> however get a glibc double free that seems unrelated to this
> series).

Reproducible?


> Long term I'd like to deprecate the up/down PCI hotplug interface
> and move to a new model.  I think perhaps we should define 3 new
> registers:
> 
> 1) Device present in slot bitmap (foundation already in 5/5 here)
> 2) Virtual slot power state bitmap
> 3) Requested slot date bitmap
> 
> With these we should be able to do proper _STA, _PS0, and _PS3.
> We'd maintain the eject register for _EJ0, but deprecate up/down.
> 
> For a device_add, the 'present' & 'request' bitmaps gets set for
> the slot by qemu, gpe.sts set and SCI interrupt sent.  The OSPM
> can simply compare requested state vs power state and send the
> appropriate notify.  Same general idea for removal.  _STA can be
> formed by looking at the 'present' and 'power' bitmaps, and
> _PS0/3 perform writes to the power register to indicate the virtual
> state.  I think this provides a better register ownership model
> (qemu owns present/request, OSPM owns power) and also allows us
> to differentiate between a guest initiated remove and a platform
> (vm manager) initiated remove so we can avoid changing the vm
> state in the former case.  I'll try to send a patch for this out
> soon, but let me know if there are any concerns or suggestions for
> a redesign.  Thanks,
> 
> Alex
> 
> ---
> 
> Alex Williamson (5):
>       acpi_piix4: Fix PCI hotplug race
>       acpi_piix4: Remove PCI_RMV_BASE write code
>       acpi_piix4: Use pci_get/set_byte
>       acpi_piix4: Only allow writes to PCI hotplug eject register
>       acpi_piix4: Disallow write to up/down PCI hotplug registers
> 
> 
>  docs/specs/acpi_pci_hotplug.txt |    8 +-
>  hw/acpi_piix4.c                 |  178 ++++++++++++++++++++-------------------
>  2 files changed, 95 insertions(+), 91 deletions(-)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  9:53             ` Michael S. Tsirkin
@ 2012-04-05 10:08               ` Gleb Natapov
  2012-04-05 10:20                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2012-04-05 10:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: aliguori, jbaron, qemu-devel, yamahata, Alex Williamson, kraxel,
	pbonzini, Igor Mammedov, aurelien

On Thu, Apr 05, 2012 at 12:53:55PM +0300, Michael S. Tsirkin wrote:
> On Thu, Apr 05, 2012 at 12:40:00PM +0300, Gleb Natapov wrote:
> > On Thu, Apr 05, 2012 at 12:37:01PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Apr 05, 2012 at 12:12:37PM +0300, Gleb Natapov wrote:
> > > > On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > > > > > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > > > > > >This is never read.  We can also derive bus from the write handler,
> > > > > > >making this more inline with the other callbacks.  Note that
> > > > > > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > > > > > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > > > > > >so we don't depend on the BusState location within PCIBus.
> > > > > > >
> > > > > > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > > > > > >---
> > > > > > >
> > > > > > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > > > > > >  hw/acpi_piix4.c                 |   14 ++++----------
> > > > > > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > > > > > >
> > > > > > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > > > > > >index 1e2c8a2..1e61d19 100644
> > > > > > >--- a/docs/specs/acpi_pci_hotplug.txt
> > > > > > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > > > > > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > > > > >  ----------------------------------------
> > > > > > >
> > > > > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > > > > > >-Reads return 0.
> > > > > > >+Read-only.
> > > > > > Write-only perhaps?
> > > > > 
> > > > > Yes, let's also specify what happens in practice.
> > > > No we shouldn't.
> > > > 
> > > > > I think it is 'Guest should never read this register, in practice
> > > > > 0 is returned'.
> > > > > 
> > > > In practice kitten die for each read. Unspecified behaviour is
> > > > unspecified.
> > > 
> > > 
> > > Why, what are you worried about? I just want to document what we do.
> > > 
> > You are making undefined behaviour to be defined one.
> > 
> > > The reason I want to specify behaviour on read is because down
> > > the road we might want to return something here. Our lives
> > > will be easier if we have a document which we can read
> > > and figure out what old qemu did.
> > > 
> > You can do all that only if behaviour is undefined. If it is defined you
> > can't change it.
> 
> We are doing just that constantly, just be careful.
> Documenting what happens will make it easier.
> 
You keeping saying that it keeps not making sense to me.

> > Our lives will be easier if we will leave undefined
> > behaviour undefined.
> 
> So yes live it undefined for guests but document what happens
> for ourselves. Let's make it explicit, say
> 'current implementation returns 0 this can
>  change at any time without notice.'
> 
Current behaviour is documented in the current code. If the purpose of
the document is to define ABI for a guest then the only thing that make
sense to specify is that behaviour is undefined. Actually saying
"register is write only" is enough for everyone to understand that reads
are undefined. Look at HW specs. There is no "in practice read will do
this and that" near write only register description.

> I want to go further. For up/down I would like to
> document pas behaviour as well
> 'past implementations made the registers
> read-write, writing there would clobber
> outstanding hotplug requests.
> current implementation makes the register read-only,
> writes are discarded.
> '
Documenting things that were undocumented and were used make sense,
but then you can't change how they behave if you believe that there is
a code that relies on old behaviour.

> 
> Just undefined is vague. If someone bothered
> documenting the current undefined behavour
> with registers being writeable instead of
> undefined, then people would detect this
> as a bug and this would have been fixed.
> 
I have no idea what are you trying to say here.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05 10:08               ` Gleb Natapov
@ 2012-04-05 10:20                 ` Michael S. Tsirkin
  2012-04-05 10:48                   ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2012-04-05 10:20 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: aliguori, jbaron, qemu-devel, yamahata, Alex Williamson, kraxel,
	pbonzini, Igor Mammedov, aurelien

On Thu, Apr 05, 2012 at 01:08:06PM +0300, Gleb Natapov wrote:
> On Thu, Apr 05, 2012 at 12:53:55PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Apr 05, 2012 at 12:40:00PM +0300, Gleb Natapov wrote:
> > > On Thu, Apr 05, 2012 at 12:37:01PM +0300, Michael S. Tsirkin wrote:
> > > > On Thu, Apr 05, 2012 at 12:12:37PM +0300, Gleb Natapov wrote:
> > > > > On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > > > > > > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > > > > > > >This is never read.  We can also derive bus from the write handler,
> > > > > > > >making this more inline with the other callbacks.  Note that
> > > > > > > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > > > > > > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > > > > > > >so we don't depend on the BusState location within PCIBus.
> > > > > > > >
> > > > > > > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > > > > > > >---
> > > > > > > >
> > > > > > > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > > > > > > >  hw/acpi_piix4.c                 |   14 ++++----------
> > > > > > > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > > > > > > >
> > > > > > > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > > > > > > >index 1e2c8a2..1e61d19 100644
> > > > > > > >--- a/docs/specs/acpi_pci_hotplug.txt
> > > > > > > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > > > > > > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > > > > > >  ----------------------------------------
> > > > > > > >
> > > > > > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > > > > > > >-Reads return 0.
> > > > > > > >+Read-only.
> > > > > > > Write-only perhaps?
> > > > > > 
> > > > > > Yes, let's also specify what happens in practice.
> > > > > No we shouldn't.
> > > > > 
> > > > > > I think it is 'Guest should never read this register, in practice
> > > > > > 0 is returned'.
> > > > > > 
> > > > > In practice kitten die for each read. Unspecified behaviour is
> > > > > unspecified.
> > > > 
> > > > 
> > > > Why, what are you worried about? I just want to document what we do.
> > > > 
> > > You are making undefined behaviour to be defined one.
> > > 
> > > > The reason I want to specify behaviour on read is because down
> > > > the road we might want to return something here. Our lives
> > > > will be easier if we have a document which we can read
> > > > and figure out what old qemu did.
> > > > 
> > > You can do all that only if behaviour is undefined. If it is defined you
> > > can't change it.
> > 
> > We are doing just that constantly, just be careful.
> > Documenting what happens will make it easier.
> > 
> You keeping saying that it keeps not making sense to me.
> 
> > > Our lives will be easier if we will leave undefined
> > > behaviour undefined.
> > 
> > So yes live it undefined for guests but document what happens
> > for ourselves. Let's make it explicit, say
> > 'current implementation returns 0 this can
> >  change at any time without notice.'
> > 
> Current behaviour is documented in the current code. If the purpose of
> the document is to define ABI for a guest then the only thing that make
> sense to specify is that behaviour is undefined. Actually saying
> "register is write only" is enough for everyone to understand that reads
> are undefined. Look at HW specs. There is no "in practice read will do
> this and that" near write only register description.

This is because hardware is hardware, you do not
keep developing it. We keep editing what our hardware
does so it makes sense documenting what it does
even if it is not guest visible.

> > I want to go further. For up/down I would like to
> > document pas behaviour as well
> > 'past implementations made the registers
> > read-write, writing there would clobber
> > outstanding hotplug requests.
> > current implementation makes the register read-only,
> > writes are discarded.
> > '
> Documenting things that were undocumented and were used make sense,
> but then you can't change how they behave if you believe that there is
> a code that relies on old behaviour.
> > 
> > Just undefined is vague. If someone bothered
> > documenting the current undefined behavour
> > with registers being writeable instead of
> > undefined, then people would detect this
> > as a bug and this would have been fixed.
> > 
> I have no idea what are you trying to say here.

I am trying to say that besides guest visible behaviour I want
to document, separately, non guest visible behaviour.
For example what registers *that guest should never read*
actually do on read.

This is not different from code comments really,
I just want them in a central place because this
is guest triggerable.
Why is this good? Makes it easier to do things like security
audit, or develop new features in a compatible way.


> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05 10:20                 ` Michael S. Tsirkin
@ 2012-04-05 10:48                   ` Gleb Natapov
  2012-04-05 15:12                     ` Alex Williamson
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2012-04-05 10:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: aliguori, jbaron, qemu-devel, yamahata, Alex Williamson, kraxel,
	pbonzini, Igor Mammedov, aurelien

On Thu, Apr 05, 2012 at 01:20:03PM +0300, Michael S. Tsirkin wrote:
> On Thu, Apr 05, 2012 at 01:08:06PM +0300, Gleb Natapov wrote:
> > On Thu, Apr 05, 2012 at 12:53:55PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Apr 05, 2012 at 12:40:00PM +0300, Gleb Natapov wrote:
> > > > On Thu, Apr 05, 2012 at 12:37:01PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Apr 05, 2012 at 12:12:37PM +0300, Gleb Natapov wrote:
> > > > > > On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > > > > > > > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > > > > > > > >This is never read.  We can also derive bus from the write handler,
> > > > > > > > >making this more inline with the other callbacks.  Note that
> > > > > > > > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > > > > > > > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > > > > > > > >so we don't depend on the BusState location within PCIBus.
> > > > > > > > >
> > > > > > > > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > > > > > > > >---
> > > > > > > > >
> > > > > > > > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > > > > > > > >  hw/acpi_piix4.c                 |   14 ++++----------
> > > > > > > > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > > > > > > > >
> > > > > > > > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > >index 1e2c8a2..1e61d19 100644
> > > > > > > > >--- a/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > > > > > > >  ----------------------------------------
> > > > > > > > >
> > > > > > > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > > > > > > > >-Reads return 0.
> > > > > > > > >+Read-only.
> > > > > > > > Write-only perhaps?
> > > > > > > 
> > > > > > > Yes, let's also specify what happens in practice.
> > > > > > No we shouldn't.
> > > > > > 
> > > > > > > I think it is 'Guest should never read this register, in practice
> > > > > > > 0 is returned'.
> > > > > > > 
> > > > > > In practice kitten die for each read. Unspecified behaviour is
> > > > > > unspecified.
> > > > > 
> > > > > 
> > > > > Why, what are you worried about? I just want to document what we do.
> > > > > 
> > > > You are making undefined behaviour to be defined one.
> > > > 
> > > > > The reason I want to specify behaviour on read is because down
> > > > > the road we might want to return something here. Our lives
> > > > > will be easier if we have a document which we can read
> > > > > and figure out what old qemu did.
> > > > > 
> > > > You can do all that only if behaviour is undefined. If it is defined you
> > > > can't change it.
> > > 
> > > We are doing just that constantly, just be careful.
> > > Documenting what happens will make it easier.
> > > 
> > You keeping saying that it keeps not making sense to me.
> > 
> > > > Our lives will be easier if we will leave undefined
> > > > behaviour undefined.
> > > 
> > > So yes live it undefined for guests but document what happens
> > > for ourselves. Let's make it explicit, say
> > > 'current implementation returns 0 this can
> > >  change at any time without notice.'
> > > 
> > Current behaviour is documented in the current code. If the purpose of
> > the document is to define ABI for a guest then the only thing that make
> > sense to specify is that behaviour is undefined. Actually saying
> > "register is write only" is enough for everyone to understand that reads
> > are undefined. Look at HW specs. There is no "in practice read will do
> > this and that" near write only register description.
> 
> This is because hardware is hardware, you do not
> keep developing it. We keep editing what our hardware
> does so it makes sense documenting what it does
> even if it is not guest visible.
> 
Oh yes! Intel stopped developing cpu just after 8008 :)

> > > I want to go further. For up/down I would like to
> > > document pas behaviour as well
> > > 'past implementations made the registers
> > > read-write, writing there would clobber
> > > outstanding hotplug requests.
> > > current implementation makes the register read-only,
> > > writes are discarded.
> > > '
> > Documenting things that were undocumented and were used make sense,
> > but then you can't change how they behave if you believe that there is
> > a code that relies on old behaviour.
> > > 
> > > Just undefined is vague. If someone bothered
> > > documenting the current undefined behavour
> > > with registers being writeable instead of
> > > undefined, then people would detect this
> > > as a bug and this would have been fixed.
> > > 
> > I have no idea what are you trying to say here.
> 
> I am trying to say that besides guest visible behaviour I want
> to document, separately, non guest visible behaviour.
Document it some other place. Hmm, how about doing in the code?

> For example what registers *that guest should never read*
> actually do on read.
> 
Looks at the code. Or are you going to describe everything QEMU does
internally in plane english? If not, why making an exception for acpi
code?

> This is not different from code comments really,
> I just want them in a central place because this
> is guest triggerable.
This is not guest triggerable. We do not care about guests that triggers
it and we state it explicitly in documentation that is targeted for
guest developers.

> Why is this good? Makes it easier to do things like security
> audit, or develop new features in a compatible way.
> 
I do not see how it helps for both of those. I do see how it harmful for
the later.

Look the only reason this spec exists is because there is no real HW we
emulate here. We do not write such specs for HW that have spec already.
So we design our own HW and we document it. The only reason I will look
at this spec is to check that my changes to the code does not modify
guest visible behaviour in a way that guest cannot cope with. I will not
look at the spec to check what current code does, it does not make
sense. In short the spec should describe what code should do, not what
it does and as such there is not place for "current code does that"
there.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup
  2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
                   ` (5 preceding siblings ...)
  2012-04-05 10:03 ` [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Michael S. Tsirkin
@ 2012-04-05 11:32 ` Gleb Natapov
  2012-04-05 12:31 ` Michael S. Tsirkin
  7 siblings, 0 replies; 23+ messages in thread
From: Gleb Natapov @ 2012-04-05 11:32 UTC (permalink / raw)
  To: Alex Williamson
  Cc: aliguori, mst, jbaron, qemu-devel, yamahata, kraxel, pbonzini,
	imammedo, aurelien

On Wed, Apr 04, 2012 at 11:51:00PM -0600, Alex Williamson wrote:
> We've been batting this one back and forth.  This series includes
> several of the cleanups and specification clarifications from my
> series awhile back.  Patch 5 is my proposed alternative to
> Michael's PCI hotplug race fix.  Since that version I added slot
> present tracking so we can be a little more strategic about which
> slots we ask the guest to check.  The approach for that path is
> described in the commit log.  I tested hotplug on both Linux an
> Windows guests (XP + 2k8), intentionally trying to do back to back
> device_add and device_del to get a race, but couldn't (I did
> however get a glibc double free that seems unrelated to this
> series).
> 
> Long term I'd like to deprecate the up/down PCI hotplug interface
> and move to a new model.  I think perhaps we should define 3 new
> registers:
> 
> 1) Device present in slot bitmap (foundation already in 5/5 here)
> 2) Virtual slot power state bitmap
> 3) Requested slot date bitmap
> 
> With these we should be able to do proper _STA, _PS0, and _PS3.
> We'd maintain the eject register for _EJ0, but deprecate up/down.
> 
> For a device_add, the 'present' & 'request' bitmaps gets set for
> the slot by qemu, gpe.sts set and SCI interrupt sent.  The OSPM
> can simply compare requested state vs power state and send the
> appropriate notify.  Same general idea for removal.  _STA can be
> formed by looking at the 'present' and 'power' bitmaps, and
> _PS0/3 perform writes to the power register to indicate the virtual
> state.  I think this provides a better register ownership model
> (qemu owns present/request, OSPM owns power) and also allows us
> to differentiate between a guest initiated remove and a platform
> (vm manager) initiated remove so we can avoid changing the vm
> state in the former case.  I'll try to send a patch for this out
> soon, but let me know if there are any concerns or suggestions for
> a redesign.  Thanks,
> 
> Alex
> 
Acked-by: Gleb Natapov <gleb@redhat.com>

> ---
> 
> Alex Williamson (5):
>       acpi_piix4: Fix PCI hotplug race
>       acpi_piix4: Remove PCI_RMV_BASE write code
>       acpi_piix4: Use pci_get/set_byte
>       acpi_piix4: Only allow writes to PCI hotplug eject register
>       acpi_piix4: Disallow write to up/down PCI hotplug registers
> 
> 
>  docs/specs/acpi_pci_hotplug.txt |    8 +-
>  hw/acpi_piix4.c                 |  178 ++++++++++++++++++++-------------------
>  2 files changed, 95 insertions(+), 91 deletions(-)

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup
  2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
                   ` (6 preceding siblings ...)
  2012-04-05 11:32 ` Gleb Natapov
@ 2012-04-05 12:31 ` Michael S. Tsirkin
  2012-04-05 15:14   ` Alex Williamson
  7 siblings, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2012-04-05 12:31 UTC (permalink / raw)
  To: Alex Williamson
  Cc: aliguori, gleb, jbaron, qemu-devel, yamahata, kraxel, pbonzini,
	imammedo, aurelien

Which commit is this series against?
When trying to apply to my tree, I get conflicts.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05  8:21   ` Igor Mammedov
  2012-04-05  9:04     ` Michael S. Tsirkin
@ 2012-04-05 14:28     ` Alex Williamson
  1 sibling, 0 replies; 23+ messages in thread
From: Alex Williamson @ 2012-04-05 14:28 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: aliguori, gleb, mst, jbaron, qemu-devel, yamahata, kraxel,
	pbonzini, aurelien

On Thu, 2012-04-05 at 10:21 +0200, Igor Mammedov wrote:
> On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > This is never read.  We can also derive bus from the write handler,
> > making this more inline with the other callbacks.  Note that
> > pciej_write was actually called with (PCIBus *)dev->bus, which is
> > cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > so we don't depend on the BusState location within PCIBus.
> >
> > Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > ---
> >
> >   docs/specs/acpi_pci_hotplug.txt |    2 +-
> >   hw/acpi_piix4.c                 |   14 ++++----------
> >   2 files changed, 5 insertions(+), 11 deletions(-)
> >
> > diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > index 1e2c8a2..1e61d19 100644
> > --- a/docs/specs/acpi_pci_hotplug.txt
> > +++ b/docs/specs/acpi_pci_hotplug.txt
> > @@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> >   ----------------------------------------
> >
> >   Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > -Reads return 0.
> > +Read-only.
> Write-only perhaps?

Oops, copy/paste.  Will fix.

> >
> >   PCI removability status (IO port 0xae0c-0xae0f, 4-byte access):
> >   -----------------------------------------------
> > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> > index 44d1423..6ee832a 100644
> > --- a/hw/acpi_piix4.c
> > +++ b/hw/acpi_piix4.c
> > @@ -487,15 +487,11 @@ static uint32_t pci_down_read(void *opaque, uint32_t addr)
> >       return val;
> >   }
> >
> > -static uint32_t pciej_read(void *opaque, uint32_t addr)
> > -{
> > -    PIIX4_DPRINTF("pciej read %x\n", addr);
> > -    return 0;
> > -}
> > -
> what will happen if bios tries to read from reg?

-1.  Thanks,

Alex

> >   static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
> >   {
> > -    BusState *bus = opaque;
> > +    PIIX4PMState *s = opaque;
> > +    PCIDevice *dev =&s->dev;
> > +    BusState *bus = qdev_get_parent_bus(&dev->qdev);
> >       DeviceState *qdev, *next;
> >       int slot = ffs(val) - 1;
> >
> > @@ -507,7 +503,6 @@ static void pciej_write(void *opaque, uint32_t addr, uint32_t val)
> >           }
> >       }
> >
> > -
> >       PIIX4_DPRINTF("pciej write %x<== %d\n", addr, val);
> >   }
> >
> > @@ -547,8 +542,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
> >       register_ioport_read(PCI_UP_BASE, 4, 4, pci_up_read, s);
> >       register_ioport_read(PCI_DOWN_BASE, 4, 4, pci_down_read, s);
> >
> > -    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, bus);
> > -    register_ioport_read(PCI_EJ_BASE, 4, 4,  pciej_read, bus);
> > +    register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, s);
> >
> >       register_ioport_write(PCI_RMV_BASE, 4, 4, pcirmv_write, s);
> >       register_ioport_read(PCI_RMV_BASE, 4, 4,  pcirmv_read, s);
> >
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05 10:48                   ` Gleb Natapov
@ 2012-04-05 15:12                     ` Alex Williamson
  2012-04-05 15:38                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Williamson @ 2012-04-05 15:12 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: aliguori, Michael S. Tsirkin, jbaron, qemu-devel, yamahata,
	kraxel, pbonzini, Igor Mammedov, aurelien

On Thu, 2012-04-05 at 13:48 +0300, Gleb Natapov wrote:
> On Thu, Apr 05, 2012 at 01:20:03PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Apr 05, 2012 at 01:08:06PM +0300, Gleb Natapov wrote:
> > > On Thu, Apr 05, 2012 at 12:53:55PM +0300, Michael S. Tsirkin wrote:
> > > > On Thu, Apr 05, 2012 at 12:40:00PM +0300, Gleb Natapov wrote:
> > > > > On Thu, Apr 05, 2012 at 12:37:01PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Thu, Apr 05, 2012 at 12:12:37PM +0300, Gleb Natapov wrote:
> > > > > > > On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> > > > > > > > On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > > > > > > > > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > > > > > > > > >This is never read.  We can also derive bus from the write handler,
> > > > > > > > > >making this more inline with the other callbacks.  Note that
> > > > > > > > > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > > > > > > > > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > > > > > > > > >so we don't depend on the BusState location within PCIBus.
> > > > > > > > > >
> > > > > > > > > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > > > > > > > > >---
> > > > > > > > > >
> > > > > > > > > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > > > > > > > > >  hw/acpi_piix4.c                 |   14 ++++----------
> > > > > > > > > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > > > > > > > > >
> > > > > > > > > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > > >index 1e2c8a2..1e61d19 100644
> > > > > > > > > >--- a/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > > > > > > > >  ----------------------------------------
> > > > > > > > > >
> > > > > > > > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > > > > > > > > >-Reads return 0.
> > > > > > > > > >+Read-only.
> > > > > > > > > Write-only perhaps?
> > > > > > > > 
> > > > > > > > Yes, let's also specify what happens in practice.
> > > > > > > No we shouldn't.
> > > > > > > 
> > > > > > > > I think it is 'Guest should never read this register, in practice
> > > > > > > > 0 is returned'.
> > > > > > > > 
> > > > > > > In practice kitten die for each read. Unspecified behaviour is
> > > > > > > unspecified.
> > > > > > 
> > > > > > 
> > > > > > Why, what are you worried about? I just want to document what we do.
> > > > > > 
> > > > > You are making undefined behaviour to be defined one.
> > > > > 
> > > > > > The reason I want to specify behaviour on read is because down
> > > > > > the road we might want to return something here. Our lives
> > > > > > will be easier if we have a document which we can read
> > > > > > and figure out what old qemu did.
> > > > > > 
> > > > > You can do all that only if behaviour is undefined. If it is defined you
> > > > > can't change it.
> > > > 
> > > > We are doing just that constantly, just be careful.
> > > > Documenting what happens will make it easier.
> > > > 
> > > You keeping saying that it keeps not making sense to me.
> > > 
> > > > > Our lives will be easier if we will leave undefined
> > > > > behaviour undefined.
> > > > 
> > > > So yes live it undefined for guests but document what happens
> > > > for ourselves. Let's make it explicit, say
> > > > 'current implementation returns 0 this can
> > > >  change at any time without notice.'
> > > > 
> > > Current behaviour is documented in the current code. If the purpose of
> > > the document is to define ABI for a guest then the only thing that make
> > > sense to specify is that behaviour is undefined. Actually saying
> > > "register is write only" is enough for everyone to understand that reads
> > > are undefined. Look at HW specs. There is no "in practice read will do
> > > this and that" near write only register description.
> > 
> > This is because hardware is hardware, you do not
> > keep developing it. We keep editing what our hardware
> > does so it makes sense documenting what it does
> > even if it is not guest visible.
> > 
> Oh yes! Intel stopped developing cpu just after 8008 :)
> 
> > > > I want to go further. For up/down I would like to
> > > > document pas behaviour as well
> > > > 'past implementations made the registers
> > > > read-write, writing there would clobber
> > > > outstanding hotplug requests.
> > > > current implementation makes the register read-only,
> > > > writes are discarded.
> > > > '
> > > Documenting things that were undocumented and were used make sense,
> > > but then you can't change how they behave if you believe that there is
> > > a code that relies on old behaviour.
> > > > 
> > > > Just undefined is vague. If someone bothered
> > > > documenting the current undefined behavour
> > > > with registers being writeable instead of
> > > > undefined, then people would detect this
> > > > as a bug and this would have been fixed.
> > > > 
> > > I have no idea what are you trying to say here.
> > 
> > I am trying to say that besides guest visible behaviour I want
> > to document, separately, non guest visible behaviour.
> Document it some other place. Hmm, how about doing in the code?
> 
> > For example what registers *that guest should never read*
> > actually do on read.
> > 
> Looks at the code. Or are you going to describe everything QEMU does
> internally in plane english? If not, why making an exception for acpi
> code?
> 
> > This is not different from code comments really,
> > I just want them in a central place because this
> > is guest triggerable.
> This is not guest triggerable. We do not care about guests that triggers
> it and we state it explicitly in documentation that is targeted for
> guest developers.
> 
> > Why is this good? Makes it easier to do things like security
> > audit, or develop new features in a compatible way.
> > 
> I do not see how it helps for both of those. I do see how it harmful for
> the later.
> 
> Look the only reason this spec exists is because there is no real HW we
> emulate here. We do not write such specs for HW that have spec already.
> So we design our own HW and we document it. The only reason I will look
> at this spec is to check that my changes to the code does not modify
> guest visible behaviour in a way that guest cannot cope with. I will not
> look at the spec to check what current code does, it does not make
> sense. In short the spec should describe what code should do, not what
> it does and as such there is not place for "current code does that"
> there.

Wow, missed quite a discussion on this overnight.  Based on the patch,
obviously I agree that we should not be trying to define undefined
behavior.  Up/down was clearly broken.  Any kind of read, modify, write
from the guest can race with qemu, so I think it's justified to change
the behavior.  The hotplug capable register was always read-only, so I'm
really not changing anything there except clearly defining it.  The
eject register supports read for absolutely no reason, so I'd like to
remove that support before anyone actually depends on it and then we
have to carry a return 0 read function on forever.  Alternatively, we
could take this opportunity to define the read side of the eject
register as a hotplug implementation version or features register.  Any
preference for that?  Thanks,

Alex

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup
  2012-04-05 12:31 ` Michael S. Tsirkin
@ 2012-04-05 15:14   ` Alex Williamson
  2012-04-05 15:39     ` Michael S. Tsirkin
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Williamson @ 2012-04-05 15:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: aliguori, gleb, jbaron, qemu-devel, yamahata, kraxel, pbonzini,
	imammedo, aurelien

On Thu, 2012-04-05 at 15:31 +0300, Michael S. Tsirkin wrote:
> Which commit is this series against?
> When trying to apply to my tree, I get conflicts.

Sorry, it's off qemu-kvm, it was too late when I sent this.  Let me
re-order to send the fix separately so we can continue to discuss the
doc/undefined behavior.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register
  2012-04-05 15:12                     ` Alex Williamson
@ 2012-04-05 15:38                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 23+ messages in thread
From: Michael S. Tsirkin @ 2012-04-05 15:38 UTC (permalink / raw)
  To: Alex Williamson
  Cc: aliguori, Gleb Natapov, jbaron, qemu-devel, yamahata, kraxel,
	pbonzini, Igor Mammedov, aurelien

On Thu, Apr 05, 2012 at 09:12:31AM -0600, Alex Williamson wrote:
> On Thu, 2012-04-05 at 13:48 +0300, Gleb Natapov wrote:
> > On Thu, Apr 05, 2012 at 01:20:03PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Apr 05, 2012 at 01:08:06PM +0300, Gleb Natapov wrote:
> > > > On Thu, Apr 05, 2012 at 12:53:55PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Apr 05, 2012 at 12:40:00PM +0300, Gleb Natapov wrote:
> > > > > > On Thu, Apr 05, 2012 at 12:37:01PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Apr 05, 2012 at 12:12:37PM +0300, Gleb Natapov wrote:
> > > > > > > > On Thu, Apr 05, 2012 at 12:04:44PM +0300, Michael S. Tsirkin wrote:
> > > > > > > > > On Thu, Apr 05, 2012 at 10:21:18AM +0200, Igor Mammedov wrote:
> > > > > > > > > > On 04/05/2012 07:51 AM, Alex Williamson wrote:
> > > > > > > > > > >This is never read.  We can also derive bus from the write handler,
> > > > > > > > > > >making this more inline with the other callbacks.  Note that
> > > > > > > > > > >pciej_write was actually called with (PCIBus *)dev->bus, which is
> > > > > > > > > > >cast as a void* allowing us to pretend it's a BusState*.  Fix this
> > > > > > > > > > >so we don't depend on the BusState location within PCIBus.
> > > > > > > > > > >
> > > > > > > > > > >Signed-off-by: Alex Williamson<alex.williamson@redhat.com>
> > > > > > > > > > >---
> > > > > > > > > > >
> > > > > > > > > > >  docs/specs/acpi_pci_hotplug.txt |    2 +-
> > > > > > > > > > >  hw/acpi_piix4.c                 |   14 ++++----------
> > > > > > > > > > >  2 files changed, 5 insertions(+), 11 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > >diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > > > >index 1e2c8a2..1e61d19 100644
> > > > > > > > > > >--- a/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > > > >+++ b/docs/specs/acpi_pci_hotplug.txt
> > > > > > > > > > >@@ -28,7 +28,7 @@ PCI device eject (IO port 0xae08-0xae0b, 4-byte access):
> > > > > > > > > > >  ----------------------------------------
> > > > > > > > > > >
> > > > > > > > > > >  Used by ACPI BIOS _EJ0 method to request device removal. One bit per slot.
> > > > > > > > > > >-Reads return 0.
> > > > > > > > > > >+Read-only.
> > > > > > > > > > Write-only perhaps?
> > > > > > > > > 
> > > > > > > > > Yes, let's also specify what happens in practice.
> > > > > > > > No we shouldn't.
> > > > > > > > 
> > > > > > > > > I think it is 'Guest should never read this register, in practice
> > > > > > > > > 0 is returned'.
> > > > > > > > > 
> > > > > > > > In practice kitten die for each read. Unspecified behaviour is
> > > > > > > > unspecified.
> > > > > > > 
> > > > > > > 
> > > > > > > Why, what are you worried about? I just want to document what we do.
> > > > > > > 
> > > > > > You are making undefined behaviour to be defined one.
> > > > > > 
> > > > > > > The reason I want to specify behaviour on read is because down
> > > > > > > the road we might want to return something here. Our lives
> > > > > > > will be easier if we have a document which we can read
> > > > > > > and figure out what old qemu did.
> > > > > > > 
> > > > > > You can do all that only if behaviour is undefined. If it is defined you
> > > > > > can't change it.
> > > > > 
> > > > > We are doing just that constantly, just be careful.
> > > > > Documenting what happens will make it easier.
> > > > > 
> > > > You keeping saying that it keeps not making sense to me.
> > > > 
> > > > > > Our lives will be easier if we will leave undefined
> > > > > > behaviour undefined.
> > > > > 
> > > > > So yes live it undefined for guests but document what happens
> > > > > for ourselves. Let's make it explicit, say
> > > > > 'current implementation returns 0 this can
> > > > >  change at any time without notice.'
> > > > > 
> > > > Current behaviour is documented in the current code. If the purpose of
> > > > the document is to define ABI for a guest then the only thing that make
> > > > sense to specify is that behaviour is undefined. Actually saying
> > > > "register is write only" is enough for everyone to understand that reads
> > > > are undefined. Look at HW specs. There is no "in practice read will do
> > > > this and that" near write only register description.
> > > 
> > > This is because hardware is hardware, you do not
> > > keep developing it. We keep editing what our hardware
> > > does so it makes sense documenting what it does
> > > even if it is not guest visible.
> > > 
> > Oh yes! Intel stopped developing cpu just after 8008 :)
> > 
> > > > > I want to go further. For up/down I would like to
> > > > > document pas behaviour as well
> > > > > 'past implementations made the registers
> > > > > read-write, writing there would clobber
> > > > > outstanding hotplug requests.
> > > > > current implementation makes the register read-only,
> > > > > writes are discarded.
> > > > > '
> > > > Documenting things that were undocumented and were used make sense,
> > > > but then you can't change how they behave if you believe that there is
> > > > a code that relies on old behaviour.
> > > > > 
> > > > > Just undefined is vague. If someone bothered
> > > > > documenting the current undefined behavour
> > > > > with registers being writeable instead of
> > > > > undefined, then people would detect this
> > > > > as a bug and this would have been fixed.
> > > > > 
> > > > I have no idea what are you trying to say here.
> > > 
> > > I am trying to say that besides guest visible behaviour I want
> > > to document, separately, non guest visible behaviour.
> > Document it some other place. Hmm, how about doing in the code?
> > 
> > > For example what registers *that guest should never read*
> > > actually do on read.
> > > 
> > Looks at the code. Or are you going to describe everything QEMU does
> > internally in plane english? If not, why making an exception for acpi
> > code?
> > 
> > > This is not different from code comments really,
> > > I just want them in a central place because this
> > > is guest triggerable.
> > This is not guest triggerable. We do not care about guests that triggers
> > it and we state it explicitly in documentation that is targeted for
> > guest developers.
> > 
> > > Why is this good? Makes it easier to do things like security
> > > audit, or develop new features in a compatible way.
> > > 
> > I do not see how it helps for both of those. I do see how it harmful for
> > the later.
> > 
> > Look the only reason this spec exists is because there is no real HW we
> > emulate here. We do not write such specs for HW that have spec already.
> > So we design our own HW and we document it. The only reason I will look
> > at this spec is to check that my changes to the code does not modify
> > guest visible behaviour in a way that guest cannot cope with. I will not
> > look at the spec to check what current code does, it does not make
> > sense. In short the spec should describe what code should do, not what
> > it does and as such there is not place for "current code does that"
> > there.
> 
> Wow, missed quite a discussion on this overnight.  Based on the patch,
> obviously I agree that we should not be trying to define undefined
> behavior.  Up/down was clearly broken.  Any kind of read, modify, write
> from the guest can race with qemu, so I think it's justified to change
> the behavior.  The hotplug capable register was always read-only, so I'm
> really not changing anything there except clearly defining it.  The
> eject register supports read for absolutely no reason, so I'd like to
> remove that support before anyone actually depends on it and then we
> have to carry a return 0 read function on forever.  Alternatively, we
> could take this opportunity to define the read side of the eject
> register as a hotplug implementation version or features register.  Any
> preference for that?  Thanks,
> 
> Alex

I prefer it to be a version or features register.
As we are compatible with old BIOS we can keep it
at 0 for now, exactly as it was, right?


-- 
MST

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup
  2012-04-05 15:14   ` Alex Williamson
@ 2012-04-05 15:39     ` Michael S. Tsirkin
  0 siblings, 0 replies; 23+ messages in thread
From: Michael S. Tsirkin @ 2012-04-05 15:39 UTC (permalink / raw)
  To: Alex Williamson
  Cc: aliguori, gleb, jbaron, qemu-devel, yamahata, kraxel, pbonzini,
	imammedo, aurelien

On Thu, Apr 05, 2012 at 09:14:34AM -0600, Alex Williamson wrote:
> On Thu, 2012-04-05 at 15:31 +0300, Michael S. Tsirkin wrote:
> > Which commit is this series against?
> > When trying to apply to my tree, I get conflicts.
> 
> Sorry, it's off qemu-kvm, it was too late when I sent this.  Let me
> re-order to send the fix separately so we can continue to discuss the
> doc/undefined behavior.  Thanks,
> 
> Alex

Let's keep returning 0 on eject however so that
it can serve as a versioning register.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2012-04-05 15:39 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-04-05  5:51 [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Alex Williamson
2012-04-05  5:51 ` [Qemu-devel] [PATCH 1/5] acpi_piix4: Disallow write to up/down PCI hotplug registers Alex Williamson
2012-04-05  5:51 ` [Qemu-devel] [PATCH 2/5] acpi_piix4: Only allow writes to PCI hotplug eject register Alex Williamson
2012-04-05  8:21   ` Igor Mammedov
2012-04-05  9:04     ` Michael S. Tsirkin
2012-04-05  9:12       ` Gleb Natapov
2012-04-05  9:37         ` Michael S. Tsirkin
2012-04-05  9:40           ` Gleb Natapov
2012-04-05  9:53             ` Michael S. Tsirkin
2012-04-05 10:08               ` Gleb Natapov
2012-04-05 10:20                 ` Michael S. Tsirkin
2012-04-05 10:48                   ` Gleb Natapov
2012-04-05 15:12                     ` Alex Williamson
2012-04-05 15:38                       ` Michael S. Tsirkin
2012-04-05 14:28     ` Alex Williamson
2012-04-05  5:51 ` [Qemu-devel] [PATCH 3/5] acpi_piix4: Use pci_get/set_byte Alex Williamson
2012-04-05  5:51 ` [Qemu-devel] [PATCH 4/5] acpi_piix4: Remove PCI_RMV_BASE write code Alex Williamson
2012-04-05  5:51 ` [Qemu-devel] [PATCH 5/5] acpi_piix4: Fix PCI hotplug race Alex Williamson
2012-04-05 10:03 ` [Qemu-devel] [PATCH 0/5] PCI hotplug fixes/cleanup Michael S. Tsirkin
2012-04-05 11:32 ` Gleb Natapov
2012-04-05 12:31 ` Michael S. Tsirkin
2012-04-05 15:14   ` Alex Williamson
2012-04-05 15:39     ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).