Message-ID: <5553369C.9010907@redhat.com>
Date: Wed, 13 May 2015 13:33:48 +0200
From: Paolo Bonzini <pbonzini@redhat.com>
References: <1431538099-3286-1-git-send-email-famz@redhat.com> <1431538099-3286-12-git-send-email-famz@redhat.com> <555326ED.3050609@redhat.com> <20150513110843.GB30644@ad.nay.redhat.com>
In-Reply-To: <20150513110843.GB30644@ad.nay.redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
To: Fam Zheng
Cc: Kevin Wolf, qemu-block@nongnu.org, armbru@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org, mreitz@redhat.com, Stefan Hajnoczi

On 13/05/2015 13:08, Fam Zheng wrote:
> > I think this isn't enough. It's the callers of bdrv_drain and
> > bdrv_drain_all that need to block before drain and unblock before
> > aio_context_release.
>
> Which callers do you mean? qmp_transaction is covered in this series.

All of them. :(

In some cases it may be unnecessary. I'm thinking of bdrv_set_aio_context
and mig_save_device_dirty, but that's probably the exception rather
than the rule.

In some cases, like bdrv_snapshot_delete, the change is trivial. The
only somewhat more complex case is probably block/mirror.c. There,
the aio_context_release happens through block_job_sleep_ns or
qemu_coroutine_yield.
I guess the change would look something like this sketch:

diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..dcfede0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -513,16 +513,29 @@ static void coroutine_fn mirror_run(void *opaque)
 
         if (cnt == 0 && should_complete) {
             /* The dirty bitmap is not updated while operations are pending.
-             * If we're about to exit, wait for pending operations before
-             * calling bdrv_get_dirty_count(bs), or we may exit while the
+             * If we're about to exit, wait for pending operations and
+             * recheck bdrv_get_dirty_count(bs), or we may exit while the
              * source has dirty data to copy!
              *
              * Note that I/O can be submitted by the guest while
-             * mirror_populate runs.
+             * the rest of mirror_populate runs, but we lock it out here.
              */
             trace_mirror_before_drain(s, cnt);
+
+            // ... block ...
             bdrv_drain(bs);
             cnt = bdrv_get_dirty_count(s->dirty_bitmap);
+
+            if (cnt == 0) {
+                /* The two disks are in sync.  Exit and report successful
+                 * completion.
+                 */
+                assert(s->synced && QLIST_EMPTY(&bs->tracked_requests));
+                s->common.cancelled = false;
+                break;  // ... and unblock somewhere else...
+            }
+
+            // ... unblock ...
         }
 
         ret = 0;
@@ -535,13 +549,6 @@ static void coroutine_fn mirror_run(void *opaque)
         } else if (!should_complete) {
             delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
             block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
-        } else if (cnt == 0) {
-            /* The two disks are in sync.  Exit and report successful
-             * completion.
-             */
-            assert(QLIST_EMPTY(&bs->tracked_requests));
-            s->common.cancelled = false;
-            break;
         }
         last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     }

It can be the topic of a separate series. But this patch brings a
false sense of security (either the blocker is unnecessary, or it
needs to last after bdrv_drain returns), so I think it should be
dropped.

Paolo