Message-ID: <5553369C.9010907@redhat.com>
Date: Wed, 13 May 2015 13:33:48 +0200
From: Paolo Bonzini <pbonzini@redhat.com>
References: <1431538099-3286-1-git-send-email-famz@redhat.com> <1431538099-3286-12-git-send-email-famz@redhat.com> <555326ED.3050609@redhat.com> <20150513110843.GB30644@ad.nay.redhat.com>
In-Reply-To: <20150513110843.GB30644@ad.nay.redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
To: Fam Zheng
Cc: Kevin Wolf, qemu-block@nongnu.org, armbru@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org, mreitz@redhat.com, Stefan Hajnoczi

On 13/05/2015 13:08, Fam Zheng wrote:
> > I think this isn't enough. It's the callers of bdrv_drain and
> > bdrv_drain_all that need to block before drain and unblock before
> > aio_context_release.
>
> Which callers do you mean? qmp_transaction is covered in this series.

All of them. :(

In some cases it may be unnecessary. I'm thinking of bdrv_set_aio_context
and mig_save_device_dirty, but that's probably the exception rather
than the rule.

In some cases, like bdrv_snapshot_delete, the change is trivial. The
only somewhat more complex case is probably block/mirror.c. There,
the aio_context_release happens through block_job_sleep_ns or
qemu_coroutine_yield.
I guess the change would look something like this sketch:

diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..dcfede0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -513,16 +513,29 @@ static void coroutine_fn mirror_run(void *opaque)
 
         if (cnt == 0 && should_complete) {
             /* The dirty bitmap is not updated while operations are pending.
-             * If we're about to exit, wait for pending operations before
-             * calling bdrv_get_dirty_count(bs), or we may exit while the
+             * If we're about to exit, wait for pending operations and
+             * recheck bdrv_get_dirty_count(bs), or we may exit while the
              * source has dirty data to copy!
              *
              * Note that I/O can be submitted by the guest while
-             * mirror_populate runs.
+             * the rest of mirror_populate runs, but we lock it out here.
              */
             trace_mirror_before_drain(s, cnt);
+
+            // ... block ...
             bdrv_drain(bs);
             cnt = bdrv_get_dirty_count(s->dirty_bitmap);
+
+            if (cnt == 0) {
+                /* The two disks are in sync.  Exit and report successful
+                 * completion.
+                 */
+                assert(s->synced && QLIST_EMPTY(&bs->tracked_requests));
+                s->common.cancelled = false;
+                break;  // ... and unblock somewhere else...
+            }
+
+            // ... unblock ...
         }
 
         ret = 0;
@@ -535,13 +549,6 @@ static void coroutine_fn mirror_run(void *opaque)
         } else if (!should_complete) {
             delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
             block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
-        } else if (cnt == 0) {
-            /* The two disks are in sync.  Exit and report successful
-             * completion.
-             */
-            assert(QLIST_EMPTY(&bs->tracked_requests));
-            s->common.cancelled = false;
-            break;
         }
         last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     }

It can be the topic of a separate series. But this patch brings a
false sense of security (either the blocker is unnecessary, or it
needs to last after bdrv_drain returns), so I think it should be
dropped.

Paolo