From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60816)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1bMy5x-0004uE-6j
	for qemu-devel@nongnu.org; Tue, 12 Jul 2016 09:51:33 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1bMy5s-000148-8S
	for qemu-devel@nongnu.org; Tue, 12 Jul 2016 09:51:28 -0400
Date: Tue, 12 Jul 2016 15:51:08 +0200
From: Kevin Wolf
Message-ID: <20160712135108.GC4478@noname.redhat.com>
References: <1468316175-11522-1-git-send-email-den@openvz.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1468316175-11522-1-git-send-email-den@openvz.org>
Subject: Re: [Qemu-devel] [PATCH 1/1] mirror: double performance of the bulk stage if the disc is full
To: "Denis V. Lunev"
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Vladimir Sementsov-Ogievskiy, Stefan Hajnoczi, Fam Zheng,
	Max Reitz, Jeff Cody, Eric Blake

On 12.07.2016 at 11:36, Denis V. Lunev wrote:
> From: Vladimir Sementsov-Ogievskiy
>
> Mirror can do up to 16 in-flight requests, but actually on full copy
> (the whole source disk is non-zero) in-flight is always 1. This happens
> as the request is not limited in size: the data occupies maximum available
> capacity of s->buf.
>
> The patch limits the size of the request to some artificial constant
> (1 Mb here), which is not that big or small. This effectively enables
> back parallelism in mirror code as it was designed.
>
> The result is important: the time to migrate 10 Gb disk is reduced from
> ~350 sec to 170 sec.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy
> Signed-off-by: Denis V. Lunev
> CC: Stefan Hajnoczi
> CC: Fam Zheng
> CC: Kevin Wolf
> CC: Max Reitz
> CC: Jeff Cody
> CC: Eric Blake
> ---
>  block/mirror.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/block/mirror.c b/block/mirror.c
> index 4fe127e..53d3bcd 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -23,7 +23,9 @@
>
>  #define SLICE_TIME    100000000ULL /* ns */
>  #define MAX_IN_FLIGHT 16
> -#define DEFAULT_MIRROR_BUF_SIZE (10 << 20)
> +#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
> +#define DEFAULT_MIRROR_BUF_SIZE \
> +    (MAX_IN_FLIGHT * MAX_IO_SECTORS * BDRV_SECTOR_SIZE)
>
>  /* The mirroring buffer is a list of granularity-sized chunks.
>   * Free chunks are organized in a list.
> @@ -387,7 +389,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
>                                        nb_chunks * sectors_per_chunk,
>                                        &io_sectors, &file);
>          if (ret < 0) {
> -            io_sectors = nb_chunks * sectors_per_chunk;
> +            io_sectors = MIN(nb_chunks * sectors_per_chunk, MAX_IO_SECTORS);
> +        } else if (ret & BDRV_BLOCK_DATA) {
> +            io_sectors = MIN(io_sectors, MAX_IO_SECTORS);
>          }

Would it make sense to consider the actual buffer size? If we have
s->buf_size / 16 > 1 MB, then this is wasting buffer space.

On the other hand, there is probably a minimum size below which a single
larger buffer performs better than two concurrent small ones. Which size
that is, is hard to tell, though.

If we assume that 1 MB is a good default (should we do some more testing
to find the sweet spot?), we could write this as:

    io_sectors = MIN(io_sectors,
                     MAX((s->buf_size / BDRV_SECTOR_SIZE) / MAX_IN_FLIGHT,
                         MAX_IO_SECTORS));

Kevin
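
For illustration, the clamp suggested above can be sketched as a
standalone C function. This is only a sketch under stated assumptions,
not code from block/mirror.c: clamp_io_sectors() is a hypothetical
helper, buf_size stands in for s->buf_size, and the MIN/MAX macros are
defined locally. The constants follow the quoted patch, with
BDRV_SECTOR_BITS set to 9 (512-byte sectors) as in QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Constants as in the quoted patch; 512-byte sectors as in QEMU. */
#define BDRV_SECTOR_BITS 9
#define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)
#define MAX_IN_FLIGHT    16
#define MAX_IO_SECTORS   ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 MB = 2048 sectors */

/* Local stand-ins for the MIN/MAX macros used in the mail. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/*
 * Hypothetical helper (not part of block/mirror.c): cap one request at
 * the larger of 1 MB and an equal 1/MAX_IN_FLIGHT share of the buffer.
 * buf_size is in bytes, io_sectors and the result are in sectors.
 */
int64_t clamp_io_sectors(int64_t io_sectors, uint64_t buf_size)
{
    /* Share of the buffer available to each in-flight request. */
    int64_t per_request =
        (int64_t)((buf_size / BDRV_SECTOR_SIZE) / MAX_IN_FLIGHT);

    assert(io_sectors >= 0);
    return MIN(io_sectors, MAX(per_request, (int64_t)MAX_IO_SECTORS));
}
```

With the 16 MB default buffer from the patch, the cap stays at 2048
sectors (1 MB). With a 64 MB buffer it rises to 8192 sectors (4 MB), so
the extra buffer space is used. With a buffer smaller than 16 MB the cap
remains 1 MB, so fewer than 16 requests fit in the buffer at once.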