From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:35143)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1UKRGw-0008Tz-JJ
    for qemu-devel@nongnu.org; Tue, 26 Mar 2013 06:38:32 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from )
    id 1UKRGr-0002QR-Rq
    for qemu-devel@nongnu.org; Tue, 26 Mar 2013 06:38:30 -0400
Received: from mx1.redhat.com ([209.132.183.28]:34275)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1UKRGr-0002QN-Jc
    for qemu-devel@nongnu.org; Tue, 26 Mar 2013 06:38:25 -0400
From: Juan Quintela
In-Reply-To: <1364291919-19563-4-git-send-email-pl@kamp.de> (Peter Lieven's
    message of "Tue, 26 Mar 2013 10:58:32 +0100")
References: <1364291919-19563-1-git-send-email-pl@kamp.de>
    <1364291919-19563-4-git-send-email-pl@kamp.de>
Date: Tue, 26 Mar 2013 11:38:27 +0100
Message-ID: <87k3ouo38c.fsf@elfo.elfo>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [PATCHv5 03/10] cutils: add a function to find
    non-zero content in a buffer
Reply-To: quintela@redhat.com
To: Peter Lieven
Cc: Orit Wasserman, Paolo Bonzini, qemu-devel@nongnu.org, Stefan Hajnoczi

Peter Lieven wrote:
> this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
> optimized function that searches for non-zero content in a
> buffer.
>
> the function starts full unrolling only after the first few chunks have
> been checked one by one. analyzing real memory page data has revealed
> that non-zero pages are non-zero within the first 256-512 bits in
> most cases. as this function is also heavily used to check for zero memory
> pages this tweak has been made to avoid the high setup costs of the fully
> unrolled check for non-zero pages.
>
> due to the optimizations used in the function there are restrictions
> on buffer address and search length.
> the function can_use_buffer_find_nonzero_content() can be used to
> check if the function can be used safely.
>
> Signed-off-by: Peter Lieven
> ---
>  include/qemu-common.h | 13 ++++++++++++
>  util/cutils.c         | 55 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 68 insertions(+)
>
> diff --git a/include/qemu-common.h b/include/qemu-common.h
> index 9022646..7c7c244 100644
> --- a/include/qemu-common.h
> +++ b/include/qemu-common.h
> @@ -472,4 +472,17 @@ void hexdump(const char *buf, FILE *fp, const char *prefix, size_t size);
>  #define ALL_EQ(v1, v2) ((v1) == (v2))
>  #endif
>
> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
> +static inline bool
> +can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
> +{
> +    if (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> +                * sizeof(VECTYPE)) == 0
> +            && ((uintptr_t) buf) % sizeof(VECTYPE) == 0) {
> +        return true;
> +    }
> +    return false;
> +}
> +size_t buffer_find_nonzero_offset(const void *buf, size_t len);
> +
>  #endif
> diff --git a/util/cutils.c b/util/cutils.c
> index 1439da4..0314a18 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -143,6 +143,61 @@ int qemu_fdatasync(int fd)
>  }
>
>  /*
> + * Searches for an area with non-zero content in a buffer
> + *
> + * Attention! The len must be a multiple of
> + * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * sizeof(VECTYPE)
> + * and addr must be a multiple of sizeof(VECTYPE) due to
> + * restriction of optimizations in this function.
> + *
> + * can_use_buffer_find_nonzero_offset() can be used to check
> + * these requirements.
> + *
> + * The return value is the offset of the non-zero area rounded
> + * down to a multiple of sizeof(VECTYPE) for the first
> + * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR chunks and down to
> + * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * sizeof(VECTYPE)
> + * afterwards.
> + *
> + * If the buffer is all zero the return value is equal to len.
> + */
> +
> +size_t buffer_find_nonzero_offset(const void *buf, size_t len)
> +{
> +    VECTYPE *p = (VECTYPE *)buf;
> +    VECTYPE zero = ZERO_SPLAT;

If you have to resplit it anyways, what about changing this to:

-    VECTYPE *p = (VECTYPE *)buf;
-    VECTYPE zero = ZERO_SPLAT;
+    const VECTYPE *p = buf;
+    const VECTYPE zero = ZERO_SPLAT;
     size_t i;

From the "I hate casts" film?

Thanks, Juan.
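
PS: to make the suggestion concrete, here is a minimal portable sketch of the behaviour described in the comment above, written with the cast-free const pointer. It is not the body of the patch. The names vec_t, UNROLL_FACTOR, can_use_find_nonzero() and find_nonzero_offset() are made up for illustration, and a plain uint64_t stands in for VECTYPE so it builds without SSE2/Altivec headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 64-bit word stands in for QEMU's VECTYPE. */
typedef uint64_t vec_t;
/* Hypothetical stand-in for BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR. */
#define UNROLL_FACTOR 8

/* Same two conditions as the helper in the patch: length is a multiple
 * of one unrolled block, and the buffer is aligned to the chunk size. */
static bool can_use_find_nonzero(const void *buf, size_t len)
{
    return len % (UNROLL_FACTOR * sizeof(vec_t)) == 0
        && (uintptr_t)buf % sizeof(vec_t) == 0;
}

static size_t find_nonzero_offset(const void *buf, size_t len)
{
    /* const void * converts to const vec_t * implicitly in C: no cast. */
    const vec_t *p = buf;
    size_t n = len / sizeof(vec_t);
    size_t i;

    assert(can_use_find_nonzero(buf, len));

    /* Check the first UNROLL_FACTOR chunks one at a time, so a page that
     * is non-zero near its start is rejected cheaply. */
    for (i = 0; i < UNROLL_FACTOR && i < n; i++) {
        if (p[i] != 0) {
            return i * sizeof(vec_t);
        }
    }

    /* Afterwards test UNROLL_FACTOR chunks per iteration by OR-ing them;
     * the result is rounded down to the start of the block. */
    for (; i < n; i += UNROLL_FACTOR) {
        vec_t acc = 0;
        for (size_t j = 0; j < UNROLL_FACTOR; j++) {
            acc |= p[i + j];
        }
        if (acc != 0) {
            return i * sizeof(vec_t);
        }
    }

    return len; /* buffer is all zero */
}
```

A caller would test can_use_find_nonzero(buf, len) first and fall back to a byte-wise scan when it fails. A page counts as zero when the returned offset equals len.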