From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:55509)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1QMcuF-0007hn-Tc for qemu-devel@nongnu.org;
    Wed, 18 May 2011 05:19:04 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1QMcuE-0005eM-Qw for qemu-devel@nongnu.org;
    Wed, 18 May 2011 05:19:03 -0400
Received: from mail-fx0-f45.google.com ([209.85.161.45]:54591)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1QMcuE-0005eH-Md for qemu-devel@nongnu.org;
    Wed, 18 May 2011 05:19:02 -0400
Received: by fxm2 with SMTP id 2so1195775fxm.4 for ;
    Wed, 18 May 2011 02:19:01 -0700 (PDT)
Message-ID: <4DD38F03.7020209@gmail.com>
Date: Wed, 18 May 2011 13:18:59 +0400
From: Dmitry Konishchev
MIME-Version: 1.0
References:
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH] [qemu-img] CPU consuming optimization
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Stefan Hajnoczi
Cc: Kevin Wolf , stanislav.ievlev@gmail.com, qemu-devel@nongnu.org

On 18.05.2011 11:57, Stefan Hajnoczi wrote:
> Yes, optimizing is_not_zero() is good. The only additional thing I
> suggest is adding a comment before the function to document the length
> constraint.

OK, fixed.

On 18.05.2011 12:05, Kevin Wolf wrote:
> A future bdrv_is_allocated() patch must make sure that the conversion
> falls back to a simple is_not_zero() when a backing file is used.

Thanks, I'll take this into account.

Signed-off-by: Dmitry Konishchev
---
 qemu-img.c |   30 +++++++++++++++++++++++++++---
 1 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index e825123..7665c2f 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -496,14 +496,38 @@ static int img_commit(int argc, char **argv)
     return 0;
 }
 
+/*
+ * Checks whether the sector is not a zero sector.
+ *
+ * Attention! The len must be a multiple of 4 * sizeof(long) due to
+ * restriction of optimizations in this function.
+ */
 static int is_not_zero(const uint8_t *sector, int len)
 {
+    /*
+     * Use long as the biggest available internal data type that fits into the
+     * CPU register and unroll the loop to smooth out the effect of memory
+     * latency.
+     */
+
     int i;
-    len >>= 2;
-    for(i = 0;i < len; i++) {
-        if (((uint32_t *)sector)[i] != 0)
+    len /= sizeof(long);
+
+    long d0;
+    long d1;
+    long d2;
+    long d3;
+
+    for(i = 0; i < len; i += 4) {
+        d0 = ((const long*) sector)[i + 0];
+        d1 = ((const long*) sector)[i + 1];
+        d2 = ((const long*) sector)[i + 2];
+        d3 = ((const long*) sector)[i + 3];
+
+        if (d0 || d1 || d2 || d3)
             return 1;
     }
+
     return 0;
 }
 
-- 
1.7.4.1
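
For illustration only (this is not part of the patch): the length constraint documented above exists because the unrolled loop always reads four longs at a time. A minimal standalone sketch of the same technique that accepts any length could finish with a byte-wise tail loop. The name is_not_zero_any_len is hypothetical and does not exist in qemu-img.c, and the use of memcpy() to load the words is my assumption for sidestepping alignment and aliasing questions, not something the patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of qemu-img.c.
 *
 * Returns 1 if any byte in buf[0..len) is non-zero, 0 otherwise.
 * Unlike is_not_zero() in the patch, len may be any value.
 */
static int is_not_zero_any_len(const uint8_t *buf, size_t len)
{
    const size_t chunk = 4 * sizeof(long);
    size_t i = 0;

    /* Unrolled main loop: examine four long-sized words per iteration. */
    for (; i + chunk <= len; i += chunk) {
        long d[4];

        /* Copy instead of casting, so buf needs no particular alignment. */
        memcpy(d, buf + i, sizeof(d));

        if (d[0] | d[1] | d[2] | d[3]) {
            return 1;
        }
    }

    /* Tail: the bytes left over when len is not a multiple of chunk. */
    for (; i < len; i++) {
        if (buf[i] != 0) {
            return 1;
        }
    }

    return 0;
}
```

With a 512-byte sector the tail loop never runs, so the hot path should look much like the patched function; whether the memcpy() costs anything measurable depends on the compiler and would need to be benchmarked.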