[Qemu-devel] -drive werror=stop can cause state change handlers run out of order

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] -drive werror=stop can cause state change handlers run out of order
@ 2009-07-23 21:49 Markus Armbruster
  2009-07-27 18:44 ` [Qemu-devel] " Markus Armbruster
  2009-07-27 18:51 ` Markus Armbruster
  0 siblings, 2 replies; 5+ messages in thread
From: Markus Armbruster @ 2009-07-23 21:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: Gleb Natapov

Consider the following scenario[1]:

0. The virtual IDE drive goes south.  All future writes return errors.

1. Something encounters a write error, and duly stops the VM with
   vm_stop().

2. vm_stop() calls vm_state_notify(0).

3. vm_state_notify() runs the callbacks in list vm_change_state_head.
   It contains ide_dma_restart_cb() installed by bmdma_map()[2].  It
   also contains audio_vm_change_state_handler() installed by
   audio_init().

4. audio_vm_change_state_handler() stops audio stuff.

5. User continues VM with monitor command "c".  This runs vm_start().

6. vm_start() calls vm_state_notify(1).

7. vm_state_notify() runs the callbacks in vm_change_state_head.

8. Say ide_dma_restart_cb() happens to come first.  It does its work,
   runs into a write error, and duly stops the VM with vm_stop().

9. vm_stop() runs vm_state_notify(0).

10. vm_state_notify() runs the callbacks in vm_change_state_head.

11. audio_vm_change_state_handler() stops audio stuff.  Which isn't
   running.

12. vm_stop() finishes, ide_dma_restart_cb() finishes, step 7's
   vm_state_notify() resumes running handlers.

13. audio_vm_change_state_handler() starts audio stuff.  Oopsie.

What happens here is that when a VM state change handler changes VM
state, other VM state change handlers can see the state transitions out
of order.

I showed this to Gleb, and he suggested to have ide_dma_restart_cb()[3]
set up a bottom half to retry writes.  I'm not familiar with the block
code, so I figure I ask here before I try it: Is that the way to fix
this?

[1] Note: I didn't actually reproduce it in this form with upstream
code.

[2] Actually two of them, for the IDE device's bmdma[0] and bmdma[1],
but that doesn't matter.

[3] Same for SCSI and virtio-blk.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Qemu-devel] Re: -drive werror=stop can cause state change handlers run out of order
  2009-07-23 21:49 [Qemu-devel] -drive werror=stop can cause state change handlers run out of order Markus Armbruster
@ 2009-07-27 18:44 ` Markus Armbruster
  2009-07-27 18:51 ` Markus Armbruster
  1 sibling, 0 replies; 5+ messages in thread
From: Markus Armbruster @ 2009-07-27 18:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Gleb Natapov

I reproduced the problem as follows.

The appended patch does two things:

1. It lets me inject write errors by setting variable inject_write_error
   in gdb.  Crude, but works.

2. It shows what audio_vm_change_state_handler() does.

Apply it and start a guest with sound and werror=stop under gdb, say

    $ gdb --args qemu -drive file=f10.qcow2,werror=stop -soundhw ac97 -monitor unix:monitor,server,nowait

Let it boot to single user mode (just to speed things up).  When
everything's nicely quiet, do

    (gdb) set inject_write_error=1000
    (gdb) c

Trigger a write.  Running "sync" works for me.  Guest stops, printing

    injecting write error on ide0-hd0
    audio_vm_change_state_handler voice disable

Connect to the monitor and resume the guest ("c").  Guest stops,
printing

    injecting write error on ide0-hd0
    audio_vm_change_state_handler voice disable
    injecting write error on ide0-hd0
    audio_vm_change_state_handler voice enable

You see that audio_vm_change_state_handler() are run in the wrong order,
and voice is thus left enabled rather than disabled.



diff --git a/audio/audio.c b/audio/audio.c
index 694a83e..64d079f 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -1630,6 +1630,9 @@ static void audio_vm_change_state_handler (void *opaque, int running,
     HWVoiceIn *hwi = NULL;
     int op = running ? VOICE_ENABLE : VOICE_DISABLE;
 
+    printf("audio_vm_change_state_handler voice %s\n",
+           running ? "enable" : "disable");
+
     s->vm_running = running;
     while ((hwo = audio_pcm_hw_find_any_enabled_out (hwo))) {
         hwo->pcm_ops->ctl_out (hwo, op);
diff --git a/hw/ide.c b/hw/ide.c
index 1e56786..98d0315 100644
--- a/hw/ide.c
+++ b/hw/ide.c
@@ -1060,6 +1061,8 @@ static void ide_sector_write_timer_cb(void *opaque)
     ide_set_irq(s);
 }
 
+int inject_write_error;
+
 static void ide_sector_write(IDEState *s)
 {
     int64_t sector_num;
@@ -1075,6 +1078,12 @@ static void ide_sector_write(IDEState *s)
         n = s->req_nb_sectors;
     ret = bdrv_write(s->bs, sector_num, s->io_buffer, n);
 
+    if (inject_write_error) {
+        printf("injecting write error on %s\n", s->bs->device_name);
+        ret = -EIO;
+        inject_write_error--;
+    }
+
     if (ret != 0) {
         if (ide_handle_write_error(s, -ret, BM_STATUS_PIO_RETRY))
             return;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Qemu-devel] Re: -drive werror=stop can cause state change handlers run out of order
  2009-07-23 21:49 [Qemu-devel] -drive werror=stop can cause state change handlers run out of order Markus Armbruster
  2009-07-27 18:44 ` [Qemu-devel] " Markus Armbruster
@ 2009-07-27 18:51 ` Markus Armbruster
  2009-07-28 18:33   ` [Qemu-devel] [PATCH] Fix VM state change handlers running " Markus Armbruster
  1 sibling, 1 reply; 5+ messages in thread
From: Markus Armbruster @ 2009-07-27 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Gleb Natapov

Sketch of a possible fix.  Please review carefully, as I'm not exactly
and expert on the block code.

If this is the way to go, I'll complete it (SCSI & virtio-blk) and
submit it properly.


diff --git a/hw/ide.c b/hw/ide.c
index 1e56786..1fe6116 100644
--- a/hw/ide.c
+++ b/hw/ide.c
@@ -501,6 +501,7 @@ typedef struct BMDMAState {
     QEMUIOVector qiov;
     int64_t sector_num;
     uint32_t nsector;
+    QEMUBH *bh;
 } BMDMAState;
 
 typedef struct PCIIDEState {
@@ -1109,11 +1118,13 @@ static void ide_sector_write(IDEState *s)
     }
 }
 
-static void ide_dma_restart_cb(void *opaque, int running, int reason)
+static void ide_dma_restart_bh(void *opaque)
 {
     BMDMAState *bm = opaque;
-    if (!running)
-        return;
+    qemu_bh_delete(bm->bh);
+    bm->bh = NULL;
+    if (!vm_running)
+        return;                 /* FIXME can this happen? */
     if (bm->status & BM_STATUS_DMA_RETRY) {
         bm->status &= ~BM_STATUS_DMA_RETRY;
         ide_dma_restart(bm->ide_if);
@@ -1123,6 +1134,17 @@ static void ide_dma_restart_cb(void *opaque, int running, int reason)
     }
 }
 
+static void ide_dma_restart_cb(void *opaque, int running, int reason)
+{
+    BMDMAState *bm = opaque;
+    if (!running)
+        return;
+    if (!bm->bh) {
+        bm->bh = qemu_bh_new(ide_dma_restart_bh, bm);
+        qemu_bh_schedule(bm->bh);
+    }
+}
+
 static void ide_write_dma_cb(void *opaque, int ret)
 {
     BMDMAState *bm = opaque;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PATCH] Fix VM state change handlers running out of order
  2009-07-27 18:51 ` Markus Armbruster
@ 2009-07-28 18:33   ` Markus Armbruster
  2009-07-29  7:27     ` [Qemu-devel] " Gleb Natapov
  0 siblings, 1 reply; 5+ messages in thread
From: Markus Armbruster @ 2009-07-28 18:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Gleb Natapov

When a VM state change handler changes VM state, other VM state change
handlers can see the state transitions out of order.

bmdma_map(), scsi_disk_init() and virtio_blk_init() install VM state
change handlers to restart DMA.  These handlers can vm_stop() by
running into a write error on a drive with werror=stop.  This throws
the VM state change handler callback into disarray.  Here's an example
case I observed:

0. The virtual IDE drive goes south.  All future writes return errors.

1. Something encounters a write error, and duly stops the VM with
   vm_stop().

2. vm_stop() calls vm_state_notify(0).

3. vm_state_notify() runs the callbacks in list vm_change_state_head.
   It contains ide_dma_restart_cb() installed by bmdma_map().  It also
   contains audio_vm_change_state_handler() installed by audio_init().

4. audio_vm_change_state_handler() stops audio stuff.

5. User continues VM with monitor command "c".  This runs vm_start().

6. vm_start() calls vm_state_notify(1).

7. vm_state_notify() runs the callbacks in vm_change_state_head.

8. ide_dma_restart_cb() happens to come first.  It does its work, runs
   into a write error, and duly stops the VM with vm_stop().

9. vm_stop() runs vm_state_notify(0).

10. vm_state_notify() runs the callbacks in vm_change_state_head.

11. audio_vm_change_state_handler() stops audio stuff.  Which isn't
   running.

12. vm_stop() finishes, ide_dma_restart_cb() finishes, step 7's
   vm_state_notify() resumes running handlers.

13. audio_vm_change_state_handler() starts audio stuff.  Oopsie.

Fix this by moving the actual write from each VM state change handler
into a new bottom half (suggested by Gleb Natapov).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 hw/ide.c        |   22 +++++++++++++++++++---
 hw/scsi-disk.c  |   21 ++++++++++++++++++---
 hw/virtio-blk.c |   20 +++++++++++++++++---
 3 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/hw/ide.c b/hw/ide.c
index 1e56786..2eea438 100644
--- a/hw/ide.c
+++ b/hw/ide.c
@@ -501,6 +501,7 @@ typedef struct BMDMAState {
     QEMUIOVector qiov;
     int64_t sector_num;
     uint32_t nsector;
+    QEMUBH *bh;
 } BMDMAState;
 
 typedef struct PCIIDEState {
@@ -1109,11 +1110,13 @@ static void ide_sector_write(IDEState *s)
     }
 }
 
-static void ide_dma_restart_cb(void *opaque, int running, int reason)
+static void ide_dma_restart_bh(void *opaque)
 {
     BMDMAState *bm = opaque;
-    if (!running)
-        return;
+
+    qemu_bh_delete(bm->bh);
+    bm->bh = NULL;
+
     if (bm->status & BM_STATUS_DMA_RETRY) {
         bm->status &= ~BM_STATUS_DMA_RETRY;
         ide_dma_restart(bm->ide_if);
@@ -1123,6 +1126,19 @@ static void ide_dma_restart_cb(void *opaque, int running, int reason)
     }
 }
 
+static void ide_dma_restart_cb(void *opaque, int running, int reason)
+{
+    BMDMAState *bm = opaque;
+
+    if (!running)
+        return;
+
+    if (!bm->bh) {
+        bm->bh = qemu_bh_new(ide_dma_restart_bh, bm);
+        qemu_bh_schedule(bm->bh);
+    }
+}
+
 static void ide_write_dma_cb(void *opaque, int ret)
 {
     BMDMAState *bm = opaque;
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 8b6426f..5b825c9 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -74,6 +74,7 @@ struct SCSIDeviceState
     scsi_completionfn completion;
     void *opaque;
     char drive_serial_str[21];
+    QEMUBH *bh;
 };
 
 /* Global pool of SCSIRequest structures.  */
@@ -308,12 +309,13 @@ static int scsi_write_data(SCSIDevice *d, uint32_t tag)
     return 0;
 }
 
-static void scsi_dma_restart_cb(void *opaque, int running, int reason)
+static void scsi_dma_restart_bh(void *opaque)
 {
     SCSIDeviceState *s = opaque;
     SCSIRequest *r = s->requests;
-    if (!running)
-        return;
+
+    qemu_bh_delete(s->bh);
+    s->bh = NULL;
 
     while (r) {
         if (r->status & SCSI_REQ_STATUS_RETRY) {
@@ -324,6 +326,19 @@ static void scsi_dma_restart_cb(void *opaque, int running, int reason)
     }
 }
 
+static void scsi_dma_restart_cb(void *opaque, int running, int reason)
+{
+    SCSIDeviceState *s = opaque;
+
+    if (!running)
+        return;
+
+    if (!s->bh) {
+        s->bh = qemu_bh_new(scsi_dma_restart_bh, s);
+        qemu_bh_schedule(s->bh);
+    }
+}
+
 /* Return a pointer to the data buffer.  */
 static uint8_t *scsi_get_buf(SCSIDevice *d, uint32_t tag)
 {
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 2beff52..5036b5b 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -26,6 +26,7 @@ typedef struct VirtIOBlock
     VirtQueue *vq;
     void *rq;
     char serial_str[BLOCK_SERIAL_STRLEN + 1];
+    QEMUBH *bh;
 } VirtIOBlock;
 
 static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
@@ -302,13 +303,13 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
      */
 }
 
-static void virtio_blk_dma_restart_cb(void *opaque, int running, int reason)
+static void virtio_blk_dma_restart_bh(void *opaque)
 {
     VirtIOBlock *s = opaque;
     VirtIOBlockReq *req = s->rq;
 
-    if (!running)
-        return;
+    qemu_bh_delete(s->bh);
+    s->bh = NULL;
 
     s->rq = NULL;
 
@@ -318,6 +319,19 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running, int reason)
     }
 }
 
+static void virtio_blk_dma_restart_cb(void *opaque, int running, int reason)
+{
+    VirtIOBlock *s = opaque;
+
+    if (!running)
+        return;
+
+    if (!s->bh) {
+        s->bh = qemu_bh_new(virtio_blk_dma_restart_bh, s);
+        qemu_bh_schedule(s->bh);
+    }
+}
+
 static void virtio_blk_reset(VirtIODevice *vdev)
 {
     /*
-- 
1.6.2.5

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Qemu-devel] Re: [PATCH] Fix VM state change handlers running out of order
  2009-07-28 18:33   ` [Qemu-devel] [PATCH] Fix VM state change handlers running " Markus Armbruster
@ 2009-07-29  7:27     ` Gleb Natapov
  0 siblings, 0 replies; 5+ messages in thread
From: Gleb Natapov @ 2009-07-29  7:27 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-devel

On Tue, Jul 28, 2009 at 02:33:41PM -0400, Markus Armbruster wrote:
> Fix this by moving the actual write from each VM state change handler
> into a new bottom half (suggested by Gleb Natapov).
> 
This is exactly what I meant. Thanks.

> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>

> ---
>  hw/ide.c        |   22 +++++++++++++++++++---
>  hw/scsi-disk.c  |   21 ++++++++++++++++++---
>  hw/virtio-blk.c |   20 +++++++++++++++++---
>  3 files changed, 54 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/ide.c b/hw/ide.c
> index 1e56786..2eea438 100644
> --- a/hw/ide.c
> +++ b/hw/ide.c
> @@ -501,6 +501,7 @@ typedef struct BMDMAState {
>      QEMUIOVector qiov;
>      int64_t sector_num;
>      uint32_t nsector;
> +    QEMUBH *bh;
>  } BMDMAState;
>  
>  typedef struct PCIIDEState {
> @@ -1109,11 +1110,13 @@ static void ide_sector_write(IDEState *s)
>      }
>  }
>  
> -static void ide_dma_restart_cb(void *opaque, int running, int reason)
> +static void ide_dma_restart_bh(void *opaque)
>  {
>      BMDMAState *bm = opaque;
> -    if (!running)
> -        return;
> +
> +    qemu_bh_delete(bm->bh);
> +    bm->bh = NULL;
> +
>      if (bm->status & BM_STATUS_DMA_RETRY) {
>          bm->status &= ~BM_STATUS_DMA_RETRY;
>          ide_dma_restart(bm->ide_if);
> @@ -1123,6 +1126,19 @@ static void ide_dma_restart_cb(void *opaque, int running, int reason)
>      }
>  }
>  
> +static void ide_dma_restart_cb(void *opaque, int running, int reason)
> +{
> +    BMDMAState *bm = opaque;
> +
> +    if (!running)
> +        return;
> +
> +    if (!bm->bh) {
> +        bm->bh = qemu_bh_new(ide_dma_restart_bh, bm);
> +        qemu_bh_schedule(bm->bh);
> +    }
> +}
> +
>  static void ide_write_dma_cb(void *opaque, int ret)
>  {
>      BMDMAState *bm = opaque;
> diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
> index 8b6426f..5b825c9 100644
> --- a/hw/scsi-disk.c
> +++ b/hw/scsi-disk.c
> @@ -74,6 +74,7 @@ struct SCSIDeviceState
>      scsi_completionfn completion;
>      void *opaque;
>      char drive_serial_str[21];
> +    QEMUBH *bh;
>  };
>  
>  /* Global pool of SCSIRequest structures.  */
> @@ -308,12 +309,13 @@ static int scsi_write_data(SCSIDevice *d, uint32_t tag)
>      return 0;
>  }
>  
> -static void scsi_dma_restart_cb(void *opaque, int running, int reason)
> +static void scsi_dma_restart_bh(void *opaque)
>  {
>      SCSIDeviceState *s = opaque;
>      SCSIRequest *r = s->requests;
> -    if (!running)
> -        return;
> +
> +    qemu_bh_delete(s->bh);
> +    s->bh = NULL;
>  
>      while (r) {
>          if (r->status & SCSI_REQ_STATUS_RETRY) {
> @@ -324,6 +326,19 @@ static void scsi_dma_restart_cb(void *opaque, int running, int reason)
>      }
>  }
>  
> +static void scsi_dma_restart_cb(void *opaque, int running, int reason)
> +{
> +    SCSIDeviceState *s = opaque;
> +
> +    if (!running)
> +        return;
> +
> +    if (!s->bh) {
> +        s->bh = qemu_bh_new(scsi_dma_restart_bh, s);
> +        qemu_bh_schedule(s->bh);
> +    }
> +}
> +
>  /* Return a pointer to the data buffer.  */
>  static uint8_t *scsi_get_buf(SCSIDevice *d, uint32_t tag)
>  {
> diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
> index 2beff52..5036b5b 100644
> --- a/hw/virtio-blk.c
> +++ b/hw/virtio-blk.c
> @@ -26,6 +26,7 @@ typedef struct VirtIOBlock
>      VirtQueue *vq;
>      void *rq;
>      char serial_str[BLOCK_SERIAL_STRLEN + 1];
> +    QEMUBH *bh;
>  } VirtIOBlock;
>  
>  static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
> @@ -302,13 +303,13 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>       */
>  }
>  
> -static void virtio_blk_dma_restart_cb(void *opaque, int running, int reason)
> +static void virtio_blk_dma_restart_bh(void *opaque)
>  {
>      VirtIOBlock *s = opaque;
>      VirtIOBlockReq *req = s->rq;
>  
> -    if (!running)
> -        return;
> +    qemu_bh_delete(s->bh);
> +    s->bh = NULL;
>  
>      s->rq = NULL;
>  
> @@ -318,6 +319,19 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running, int reason)
>      }
>  }
>  
> +static void virtio_blk_dma_restart_cb(void *opaque, int running, int reason)
> +{
> +    VirtIOBlock *s = opaque;
> +
> +    if (!running)
> +        return;
> +
> +    if (!s->bh) {
> +        s->bh = qemu_bh_new(virtio_blk_dma_restart_bh, s);
> +        qemu_bh_schedule(s->bh);
> +    }
> +}
> +
>  static void virtio_blk_reset(VirtIODevice *vdev)
>  {
>      /*
> -- 
> 1.6.2.5

--
			Gleb.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-07-29  7:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-23 21:49 [Qemu-devel] -drive werror=stop can cause state change handlers run out of order Markus Armbruster
2009-07-27 18:44 ` [Qemu-devel] " Markus Armbruster
2009-07-27 18:51 ` Markus Armbruster
2009-07-28 18:33   ` [Qemu-devel] [PATCH] Fix VM state change handlers running " Markus Armbruster
2009-07-29  7:27     ` [Qemu-devel] " Gleb Natapov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).