Date: Wed, 9 Nov 2016 17:01:31 -0500 (EST)
From: Paolo Bonzini
Message-ID: <1985014055.11762735.1478728891749.JavaMail.zimbra@redhat.com>
In-Reply-To: <20161109183826.GG6315@localhost.localdomain>
References: <20161109162008.27287-2-pbonzini@redhat.com> <20161109183826.GG6315@localhost.localdomain>
Subject: Re: [Qemu-devel] [PATCH for-2.8] mirror: do not flush every time the disks are synced
To: Jeff Cody
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, jsnow@redhat.com

----- Original Message -----
> From: "Jeff Cody"
> To: "Paolo Bonzini"
> Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, jsnow@redhat.com
> Sent: Wednesday, November 9, 2016 7:38:26 PM
> Subject: Re: [PATCH for-2.8] mirror: do not flush every time the disks are synced
>
> On Wed, Nov 09, 2016 at 05:20:08PM +0100, Paolo Bonzini wrote:
> > This puts a huge strain on the disks when there are many concurrent
> > migrations.  With this patch we only flush twice: just before issuing
> > the event, and just before pivoting to the destination.  If management
> > completes the job close to the BLOCK_JOB_READY event, the cost of
> > the second flush should be small anyway.
> >
> > Signed-off-by: Paolo Bonzini
> > ---
> >  block/mirror.c | 40 +++++++++++++++++++++++++--------------
> >  1 file changed, 25 insertions(+), 15 deletions(-)
> >
> > diff --git a/block/mirror.c b/block/mirror.c
> > index b2c1fb8..3ec281c 100644
> > --- a/block/mirror.c
> > +++ b/block/mirror.c
> > @@ -615,6 +615,20 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
> >      return 0;
> >  }
> >
> > +/* Called when going out of the streaming phase to flush the bulk of the
> > + * data to the medium, or just before completing.
> > + */
> > +static int mirror_flush(MirrorBlockJob *s)
> > +{
> > +    int ret = blk_flush(s->target);
> > +    if (ret < 0) {
> > +        if (mirror_error_action(s, false, -ret) == BLOCK_ERROR_ACTION_REPORT) {
> > +            s->ret = ret;
> > +        }
> > +    }
> > +    return ret;
> > +}
> > +
> >  static void coroutine_fn mirror_run(void *opaque)
> >  {
> >      MirrorBlockJob *s = opaque;
> > @@ -727,27 +741,23 @@ static void coroutine_fn mirror_run(void *opaque)
> >              should_complete = false;
> >              if (s->in_flight == 0 && cnt == 0) {
> >                  trace_mirror_before_flush(s);
> > -                ret = blk_flush(s->target);
> > -                if (ret < 0) {
> > -                    if (mirror_error_action(s, false, -ret) ==
> > -                        BLOCK_ERROR_ACTION_REPORT) {
> > -                        goto immediate_exit;
> > +                if (!s->synced) {
> > +                    if (mirror_flush(s) < 0) {
> > +                        /* Go check s->ret. */
> > +                        continue;
>
> I think this would be easier to follow if, rather than popping back up
> to the top of the loop to do error checking, we just did the error
> cleanup like normal, e.g.:
>
> > @@ -765,7 +775,7 @@ static void coroutine_fn mirror_run(void *opaque)
> >
> >          bdrv_drained_begin(bs);
> >          cnt = bdrv_get_dirty_count(s->dirty_bitmap);
> > -        if (cnt > 0) {
> > +        if (cnt > 0 || mirror_flush(s) < 0) {
> >              bdrv_drained_end(bs);
> >              continue;
>
> Bah, that continue paradigm is used here from before this patch.  Could
> I convince you to change this too?
I tried, but I could not really write it in a way that looked nice, so I
at least made both calls to mirror_flush look the same. :(

The problem is that the cnt > 0 case really needs to be a
bdrv_drained_end + continue, while the mirror_flush failure case would be
a bdrv_drained_end + goto immediate_exit.

Paolo