Date: Wed, 13 May 2015 23:17:15 +0800
From: Fam Zheng
To: Paolo Bonzini
Cc: Kevin Wolf, qemu-block@nongnu.org, armbru@redhat.com, jcody@redhat.com,
 qemu-devel@nongnu.org, mreitz@redhat.com, Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all
Message-ID: <20150513151715.GH30644@ad.nay.redhat.com>
In-Reply-To: <5553369C.9010907@redhat.com>
References: <1431538099-3286-1-git-send-email-famz@redhat.com>
 <1431538099-3286-12-git-send-email-famz@redhat.com>
 <555326ED.3050609@redhat.com>
 <20150513110843.GB30644@ad.nay.redhat.com>
 <5553369C.9010907@redhat.com>

On Wed, 05/13 13:33, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 13:08, Fam Zheng wrote:
> > > I think this isn't enough. It's the callers of bdrv_drain and
> > > bdrv_drain_all that need to block before drain and unblock before
> > > aio_context_release.
> > 
> > Which callers do you mean? qmp_transaction is covered in this series.
> 
> All of them. :(
> 
> In some cases it may be unnecessary. I'm thinking of bdrv_set_aio_context,
> and mig_save_device_dirty, but that's probably the exception rather than
> the rule.
> 
> In some cases, like bdrv_snapshot_delete, the change is trivial.
> 
> The only somewhat more complex case is probably block/mirror.c. There,
> the aio_context_release happens through block_job_sleep_ns or
> qemu_coroutine_yield. I guess the change would look something like
> this sketch:
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 58f391a..dcfede0 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -513,16 +513,29 @@ static void coroutine_fn mirror_run(void *opaque)
> 
>          if (cnt == 0 && should_complete) {
>              /* The dirty bitmap is not updated while operations are pending.
> -             * If we're about to exit, wait for pending operations before
> -             * calling bdrv_get_dirty_count(bs), or we may exit while the
> +             * If we're about to exit, wait for pending operations and
> +             * recheck bdrv_get_dirty_count(bs), or we may exit while the
>               * source has dirty data to copy!
>               *
>               * Note that I/O can be submitted by the guest while
> -             * mirror_populate runs.
> +             * the rest of mirror_populate runs, but we lock it out here.
>               */
>              trace_mirror_before_drain(s, cnt);
> +
> +            // ... block ...
>              bdrv_drain(bs);
>              cnt = bdrv_get_dirty_count(s->dirty_bitmap);
> +
> +            if (cnt == 0) {
> +                /* The two disks are in sync.  Exit and report successful
> +                 * completion.
> +                 */
> +                assert(s->synced && QLIST_EMPTY(&bs->tracked_requests));
> +                s->common.cancelled = false;
> +                break;  // ... and unblock somewhere else...
> +            }
> +
> +            // ... unblock ...
>          }
> 
>          ret = 0;
> @@ -535,13 +549,6 @@ static void coroutine_fn mirror_run(void *opaque)
>          } else if (!should_complete) {
>              delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
>              block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
> -        } else if (cnt == 0) {
> -            /* The two disks are in sync.  Exit and report successful
> -             * completion.
> -             */
> -            assert(QLIST_EMPTY(&bs->tracked_requests));
> -            s->common.cancelled = false;
> -            break;
>          }
>          last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      }
> 
> 
> It can be the topic of a separate series. But this patch brings a
> false sense of security (either the blocker is unnecessary, or it
> needs to last until after bdrv_drain returns), so I think it should
> be dropped.

Doesn't this let bdrv_drain_all return while virtio-blk-dataplane is under
a high workload, in the places where you say "the blocker is unnecessary"?

Fam
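For context, a minimal sketch of the "trivial" shape Paolo describes for
callers like bdrv_snapshot_delete: block device IO, drain, do the work that
needs a quiesced BDS, and only then unblock. It uses QEMU's existing
bdrv_op_block()/bdrv_op_unblock() helpers; the "device IO" blocker type is
the one this series proposes (written here as BLOCK_OP_TYPE_DEVICE_IO), and
the wrapper function name is made up for illustration.

/*
 * Sketch only, not code from the series.  Shows the block/drain/unblock
 * bracketing that a caller would need; includes are for orientation.
 */
#include "block/block.h"
#include "qapi/error.h"

static void drain_with_device_io_blocked(BlockDriverState *bs)
{
    Error *blocker = NULL;

    error_setg(&blocker, "device IO is blocked while the BDS is drained");

    /* Keep the device model from submitting new requests... */
    bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);

    /* ...wait for everything that is already in flight... */
    bdrv_drain(bs);

    /*
     * ...do the work that relies on a quiesced BDS here (e.g. deleting a
     * snapshot), and only then let device IO resume.
     */
    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
    error_free(blocker);
}

Paolo's objection is visible in this shape: if the blocker is set and cleared
inside bdrv_drain()/bdrv_drain_all() themselves, as in this patch, the guest
can submit new requests again as soon as the drain returns, before the caller
has finished its critical section, so the blocker has to span the caller's
whole block/drain/work sequence rather than just the drain.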