All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: qemu-devel@nongnu.org
Subject: [Qemu-devel] fix bdrv_read/write_em and qemu_aio_flush
Date: Thu, 28 May 2009 18:33:10 +0200	[thread overview]
Message-ID: <20090528163309.GJ20464@random.random> (raw)

Hello,

the debug code in my ide_dma_cancel patch (not yet included upstream)
made us notice that when qemu_aio_flush returns, there are still
pending aio commands that waits to complete. Auditing the code I found
strange stuff like the fact qemu_aio_waits does nothing if there's an
unrelated (no aio related) bh executed. And I think I found the reason
of why there was still pending aio when qemu_aio_flush because
qemu_aio_wait does a lot more than wait, it can start aio, and if the
previous ->io_flush returned zero, the loop ends and ->io_flush isn't
repeated. The fact an unrelated bh can make qemu_aio_wait a noop seems
troublesome for all callers that aren't calling qemu_aio_wait in a
loop like qemu_aio_flush, so I preferred to change those callers to a
safer qemu_aio_flush in case the bh executed generates more pending
I/O. What you think about this patch against qemu git?

------------
Subject: fix bdrv_read/write_em and qemu_aio_flush
From: Andrea Arcangeli <aarcange@redhat.com>

qemu_aio_wait() was used to start aio through qemu_bh_poll(), like in case of
qcow2 reads from holes. The bh is global. I can't see how it can be possibly
safe to make qemu_aio_wait a noop, if any unrelated bh was pending and had to
be executed. 

In addition qemu_aio_wait by invoking the bh, could execute new aio callbacks
that would issue more aio commands during their completion handlers, breaking
the invariant that qemu_aio_poll returns only when all ->io_flush methods
returns 0 (which lead to a failure in my fix to ide_dma_cancel that has debug
code to catch that, code that isn't yet in qemu upstream).

	->io_flush returns 0
	qemu_aio_wait()
	qemu_aio_bh() -> qcow_aio_read_bh -> ide_read_dma_cb -> ide_dma_submit_check
	break the loop despite ->io_flush wouldn't return 0

I also changed most aio_wait callers to call aio_flush instead, this will
ensure that even a read from hole submitted with bdrv_aio_readv will be
complete by the time aio_flush returns.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---

diff --git a/aio.c b/aio.c
index 11fbb6c..b304852 100644
--- a/aio.c
+++ b/aio.c
@@ -103,6 +103,12 @@ void qemu_aio_flush(void)
     do {
         ret = 0;
 
+	/*
+	 * If there are pending emulated aio start them so
+	 * flush will be able to return 1.
+	 */
+	qemu_bh_poll();
+
         LIST_FOREACH(node, &aio_handlers, node) {
             ret |= node->io_flush(node->opaque);
         }
@@ -115,9 +121,6 @@ void qemu_aio_wait(void)
 {
     int ret;
 
-    if (qemu_bh_poll())
-        return;
-
     do {
         AioHandler *node;
         fd_set rdfds, wrfds;
diff --git a/block.c b/block.c
index c80d4b9..78088a9 100644
--- a/block.c
+++ b/block.c
@@ -1487,7 +1487,7 @@ static int bdrv_read_em(BlockDriverState *bs, int64_t sector_num,
         return -1;
 
     while (async_ret == NOT_DONE) {
-        qemu_aio_wait();
+        qemu_aio_flush();
     }
 
     return async_ret;
@@ -1510,7 +1510,7 @@ static int bdrv_write_em(BlockDriverState *bs, int64_t sector_num,
     if (acb == NULL)
         return -1;
     while (async_ret == NOT_DONE) {
-        qemu_aio_wait();
+        qemu_aio_flush();
     }
     return async_ret;
 }
diff --git a/qemu-io.c b/qemu-io.c
index f0a17b9..0d52074 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -153,7 +153,7 @@ static int do_aio_readv(QEMUIOVector *qiov, int64_t offset, int *total)
 		return -EIO;
 
 	while (async_ret == NOT_DONE)
-		qemu_aio_wait();
+		qemu_aio_flush();
 
 	*total = qiov->size;
 	return async_ret < 0 ? async_ret : 1;
@@ -170,7 +170,7 @@ static int do_aio_writev(QEMUIOVector *qiov, int64_t offset, int *total)
 		return -EIO;
 
 	while (async_ret == NOT_DONE)
-		qemu_aio_wait();
+		qemu_aio_flush();
 
 	*total = qiov->size;
 	return async_ret < 0 ? async_ret : 1;

             reply	other threads:[~2009-05-28 16:33 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-28 16:33 Andrea Arcangeli [this message]
2009-05-30 10:08 ` [Qemu-devel] fix bdrv_read/write_em and qemu_aio_flush Christoph Hellwig
2009-05-30 12:17   ` Andrea Arcangeli
2009-06-04 11:26     ` [Qemu-devel] [PATCH] fix qemu_aio_flush Andrea Arcangeli
2009-06-04 11:51       ` [Qemu-devel] " Kevin Wolf
2009-06-05 15:57       ` [Qemu-devel] " Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090528163309.GJ20464@random.random \
    --to=aarcange@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.