qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/2] ide: convert pio code path to asynchronous I/O
@ 2012-03-29  9:31 Stefan Hajnoczi
  2012-03-29  9:31 ` [Qemu-devel] [PATCH v2 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
  2012-03-29  9:31 ` [Qemu-devel] [PATCH v2 2/2] ide: convert ide_sector_write() " Stefan Hajnoczi
  0 siblings, 2 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2012-03-29  9:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Richard Davies, Chris Webb, Stefan Hajnoczi,
	Michael Tokarev, Zhi Yong Wu, Paolo Bonzini

IDE PIO mode is currently implemented using synchronous I/O functions.  There's
no need to do this because the IDE interface is actually designed with polling
and interrupts in mind - we can do asynchronous I/O and let the guest know when
the operation has completed.  The benefit of asynchronous I/O is that the guest
can continue executing code and is more responsive.

The second aim of this conversion is to avoid calling bdrv_read()/bdrv_write()
since they do not work with I/O throttling.  This means guests should now boot
IDE drives successfully when I/O throttling is enabled.

Note that ATAPI is not converted yet and still uses bdrv_read() in two
locations.  A future patch will have to convert ATAPI so CD-ROMs also do
asynchronous I/O.

I have tested both Windows 7 Home Premium and Red Hat Enterprise Linux 6.0
guests with these patches.  In Windows, use the device manager to disable DMA
on the IDE channels.  Under recent Linux kernels, use the libata.dma=0 kernel
parameter.

Chris and Richard: Please test this to confirm that it fixes the hang you
reported.

v2:
 * Keep aiocb and cancel request on reset [mjt]

Stefan Hajnoczi (2):
  ide: convert ide_sector_read() to asynchronous I/O
  ide: convert ide_sector_write() to asynchronous I/O

 hw/ide/core.c     |  137 +++++++++++++++++++++++++++++++++++++---------------
 hw/ide/internal.h |    3 +
 2 files changed, 100 insertions(+), 40 deletions(-)

-- 
1.7.9.1

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Qemu-devel] [PATCH v2 1/2] ide: convert ide_sector_read() to asynchronous I/O
  2012-03-29  9:31 [Qemu-devel] [PATCH v2 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
@ 2012-03-29  9:31 ` Stefan Hajnoczi
  2012-03-29  9:31 ` [Qemu-devel] [PATCH v2 2/2] ide: convert ide_sector_write() " Stefan Hajnoczi
  1 sibling, 0 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2012-03-29  9:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Richard Davies, Chris Webb, Stefan Hajnoczi,
	Michael Tokarev, Zhi Yong Wu, Paolo Bonzini

The IDE PIO interface currently uses bdrv_read() to perform reads
synchronously.  Synchronous I/O in the vcpu thread is bad because it
prevents the guest from executing code - it makes the guest
unresponsive.

This patch converts IDE PIO to use bdrv_aio_readv().  We simply need to
use the BUSY_STAT status so the guest knows to wait while we are busy.

The only external user of ide_sector_read() is restart behavior on I/O
errors and it is not affected by this change.  We still need to restart
I/O in the same way.

Migration is also unaffected if I understand the code correctly.  We
continue to use the same transfer function and the BUSY_STAT status
should never be migrated since we flush I/O before migrating device
state.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/ide/core.c     |   76 ++++++++++++++++++++++++++++++++++++++--------------
 hw/ide/internal.h |    3 ++
 2 files changed, 58 insertions(+), 21 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 4d568ac..47bc958 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -457,40 +457,68 @@ static void ide_rw_error(IDEState *s) {
     ide_set_irq(s->bus);
 }
 
+static void ide_sector_read_cb(void *opaque, int ret)
+{
+    IDEState *s = opaque;
+    int n;
+
+    s->pio_aiocb = NULL;
+    s->status &= ~BUSY_STAT;
+
+    bdrv_acct_done(s->bs, &s->acct);
+    if (ret != 0) {
+        if (ide_handle_rw_error(s, -ret, BM_STATUS_PIO_RETRY |
+                                BM_STATUS_RETRY_READ)) {
+            return;
+        }
+    }
+
+    n = s->nsector;
+    if (n > s->req_nb_sectors) {
+        n = s->req_nb_sectors;
+    }
+
+    /* Allow the guest to read the io_buffer */
+    ide_transfer_start(s, s->io_buffer, n * BDRV_SECTOR_SIZE, ide_sector_read);
+
+    ide_set_irq(s->bus);
+
+    ide_set_sector(s, ide_get_sector(s) + n);
+    s->nsector -= n;
+}
+
 void ide_sector_read(IDEState *s)
 {
     int64_t sector_num;
-    int ret, n;
+    int n;
 
     s->status = READY_STAT | SEEK_STAT;
     s->error = 0; /* not needed by IDE spec, but needed by Windows */
     sector_num = ide_get_sector(s);
     n = s->nsector;
+
     if (n == 0) {
-        /* no more sector to read from disk */
         ide_transfer_stop(s);
-    } else {
+        return;
+    }
+
+    s->status |= BUSY_STAT;
+
+    if (n > s->req_nb_sectors) {
+        n = s->req_nb_sectors;
+    }
+
 #if defined(DEBUG_IDE)
-        printf("read sector=%" PRId64 "\n", sector_num);
+    printf("sector=%" PRId64 "\n", sector_num);
 #endif
-        if (n > s->req_nb_sectors)
-            n = s->req_nb_sectors;
 
-        bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
-        ret = bdrv_read(s->bs, sector_num, s->io_buffer, n);
-        bdrv_acct_done(s->bs, &s->acct);
-        if (ret != 0) {
-            if (ide_handle_rw_error(s, -ret,
-                BM_STATUS_PIO_RETRY | BM_STATUS_RETRY_READ))
-            {
-                return;
-            }
-        }
-        ide_transfer_start(s, s->io_buffer, 512 * n, ide_sector_read);
-        ide_set_irq(s->bus);
-        ide_set_sector(s, sector_num + n);
-        s->nsector -= n;
-    }
+    s->iov.iov_base = s->io_buffer;
+    s->iov.iov_len  = n * BDRV_SECTOR_SIZE;
+    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
+
+    bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
+    s->pio_aiocb = bdrv_aio_readv(s->bs, sector_num, &s->qiov, n,
+                                  ide_sector_read_cb, s);
 }
 
 static void dma_buf_commit(IDEState *s)
@@ -1750,6 +1778,12 @@ static void ide_reset(IDEState *s)
 #ifdef DEBUG_IDE
     printf("ide: reset\n");
 #endif
+
+    if (s->pio_aiocb) {
+        bdrv_aio_cancel(s->pio_aiocb);
+        s->pio_aiocb = NULL;
+    }
+
     if (s->drive_kind == IDE_CFATA)
         s->mult_sectors = 0;
     else
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index c808a0d..8f9259b 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -383,6 +383,9 @@ struct IDEState {
     int cd_sector_size;
     int atapi_dma; /* true if dma is requested for the packet cmd */
     BlockAcctCookie acct;
+    BlockDriverAIOCB *pio_aiocb;
+    struct iovec iov;
+    QEMUIOVector qiov;
     /* ATA DMA state */
     int io_buffer_size;
     QEMUSGList sg;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [Qemu-devel] [PATCH v2 2/2] ide: convert ide_sector_write() to asynchronous I/O
  2012-03-29  9:31 [Qemu-devel] [PATCH v2 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
  2012-03-29  9:31 ` [Qemu-devel] [PATCH v2 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
@ 2012-03-29  9:31 ` Stefan Hajnoczi
  1 sibling, 0 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2012-03-29  9:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Richard Davies, Chris Webb, Stefan Hajnoczi,
	Michael Tokarev, Zhi Yong Wu, Paolo Bonzini

The IDE PIO write sector code path uses bdrv_write() and hence can make
the guest unresponsive while the I/O request is in progress.  This patch
converts ide_sector_write() to use bdrv_aio_writev() by using the
BUSY_STAT bit to tell the guest that the request is in progress.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/ide/core.c |   61 +++++++++++++++++++++++++++++++++++++++-----------------
 1 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 47bc958..dc21e7c 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -673,40 +673,39 @@ static void ide_sector_write_timer_cb(void *opaque)
     ide_set_irq(s->bus);
 }
 
-void ide_sector_write(IDEState *s)
+static void ide_sector_write_cb(void *opaque, int ret)
 {
-    int64_t sector_num;
-    int ret, n, n1;
-
-    s->status = READY_STAT | SEEK_STAT;
-    sector_num = ide_get_sector(s);
-#if defined(DEBUG_IDE)
-    printf("write sector=%" PRId64 "\n", sector_num);
-#endif
-    n = s->nsector;
-    if (n > s->req_nb_sectors)
-        n = s->req_nb_sectors;
+    IDEState *s = opaque;
+    int n;
 
-    bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
-    ret = bdrv_write(s->bs, sector_num, s->io_buffer, n);
     bdrv_acct_done(s->bs, &s->acct);
 
+    s->pio_aiocb = NULL;
+    s->status &= ~BUSY_STAT;
+
     if (ret != 0) {
-        if (ide_handle_rw_error(s, -ret, BM_STATUS_PIO_RETRY))
+        if (ide_handle_rw_error(s, -ret, BM_STATUS_PIO_RETRY)) {
             return;
+        }
     }
 
+    n = s->nsector;
+    if (n > s->req_nb_sectors) {
+        n = s->req_nb_sectors;
+    }
     s->nsector -= n;
     if (s->nsector == 0) {
         /* no more sectors to write */
         ide_transfer_stop(s);
     } else {
-        n1 = s->nsector;
-        if (n1 > s->req_nb_sectors)
+        int n1 = s->nsector;
+        if (n1 > s->req_nb_sectors) {
             n1 = s->req_nb_sectors;
-        ide_transfer_start(s, s->io_buffer, 512 * n1, ide_sector_write);
+        }
+        ide_transfer_start(s, s->io_buffer, n1 * BDRV_SECTOR_SIZE,
+                           ide_sector_write);
     }
-    ide_set_sector(s, sector_num + n);
+    ide_set_sector(s, ide_get_sector(s) + n);
 
     if (win2k_install_hack && ((++s->irq_count % 16) == 0)) {
         /* It seems there is a bug in the Windows 2000 installer HDD
@@ -722,6 +721,30 @@ void ide_sector_write(IDEState *s)
     }
 }
 
+void ide_sector_write(IDEState *s)
+{
+    int64_t sector_num;
+    int n;
+
+    s->status = READY_STAT | SEEK_STAT | BUSY_STAT;
+    sector_num = ide_get_sector(s);
+#if defined(DEBUG_IDE)
+    printf("sector=%" PRId64 "\n", sector_num);
+#endif
+    n = s->nsector;
+    if (n > s->req_nb_sectors) {
+        n = s->req_nb_sectors;
+    }
+
+    s->iov.iov_base = s->io_buffer;
+    s->iov.iov_len  = n * BDRV_SECTOR_SIZE;
+    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
+
+    bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
+    s->pio_aiocb = bdrv_aio_writev(s->bs, sector_num, &s->qiov, n,
+                                   ide_sector_write_cb, s);
+}
+
 static void ide_flush_cb(void *opaque, int ret)
 {
     IDEState *s = opaque;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-03-29  9:31 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-29  9:31 [Qemu-devel] [PATCH v2 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
2012-03-29  9:31 ` [Qemu-devel] [PATCH v2 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
2012-03-29  9:31 ` [Qemu-devel] [PATCH v2 2/2] ide: convert ide_sector_write() " Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).