From: Paolo Bonzini
Date: Mon, 11 Mar 2013 15:14:46 +0100
Message-ID: <513DE6D6.9000105@redhat.com>
In-Reply-To: <513DDFA3.1020308@dlhnet.de>
References: <513DDFA3.1020308@dlhnet.de>
To: Peter Lieven
Cc: Orit Wasserman, "qemu-devel@nongnu.org", Corentin Chary
Subject: Re: [Qemu-devel] [RFC] find_next_bit optimizations

On 11/03/2013 14:44, Peter Lieven wrote:
> Hi,
>
> I have long had a few VMs that are very hard to migrate because of a
> lot of memory I/O. I found that finding the next dirty bit
> seemed to be one of the culprits (apart from removing locking, which
> Paolo is working on).
>
> I have the following proposal, which seems to help a lot in my case. I just
> wanted to have some feedback.
> I applied the same unrolling idea as in buffer_is_zero().
>
> Peter
>
> --- a/util/bitops.c
> +++ b/util/bitops.c
> @@ -24,12 +24,13 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
>      const unsigned long *p = addr + BITOP_WORD(offset);
>      unsigned long result = offset & ~(BITS_PER_LONG-1);
>      unsigned long tmp;
> +    unsigned long d0,d1,d2,d3;
>
>      if (offset >= size) {
>          return size;
>      }
>      size -= result;
> -    offset %= BITS_PER_LONG;
> +    offset &= (BITS_PER_LONG-1);
>      if (offset) {
>          tmp = *(p++);
>          tmp &= (~0UL << offset);
> @@ -43,6 +44,18 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
>          result += BITS_PER_LONG;
>      }
>      while (size & ~(BITS_PER_LONG-1)) {
> +        while (!(size & (4*BITS_PER_LONG-1))) {

This really means

    if (!(size & (4*BITS_PER_LONG-1))) {
        while (1) {
            ...
        }
    }

because the subtraction will not change the result of the "while" loop
condition.

What you want is probably "while (size & ~(4*BITS_PER_LONG-1))", which
in turn means "while (size >= 4*BITS_PER_LONG)".

Please change both while loops to use a ">=" condition; it's easier to
read.

Paolo

> +            d0 = *p;
> +            d1 = *(p+1);
> +            d2 = *(p+2);
> +            d3 = *(p+3);
> +            if (d0 || d1 || d2 || d3) {
> +                break;
> +            }
> +            p+=4;
> +            result += 4*BITS_PER_LONG;
> +            size -= 4*BITS_PER_LONG;
> +        }
>          if ((tmp = *(p++))) {
>              goto found_middle;
>          }
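
To illustrate the suggestion above, here is a minimal, self-contained sketch of how the function might look with both loops written as ">=" conditions and the four-word loop hoisted in front of the single-word loop. This is not the patch that was eventually applied. The name find_next_bit_sketch, the local BITS_PER_LONG and BITOP_WORD definitions, and the portable ctz_ulong helper are assumptions made for this example, and the tail handling (found_first/found_middle) is assumed to follow the usual shape of this routine, since it lies outside the quoted hunks.

```c
#include <assert.h>
#include <limits.h>

/* Local stand-ins for the QEMU macros (assumed definitions). */
#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))
#define BITOP_WORD(nr) ((nr) / BITS_PER_LONG)

/* Hypothetical helper: index of the lowest set bit; w must be non-zero. */
static unsigned long ctz_ulong(unsigned long w)
{
    unsigned long n = 0;

    assert(w != 0);
    while (!(w & 1UL)) {
        w >>= 1;
        n++;
    }
    return n;
}

/* Return the index of the first set bit at or after offset, or size. */
unsigned long find_next_bit_sketch(const unsigned long *addr,
                                   unsigned long size,
                                   unsigned long offset)
{
    const unsigned long *p = addr + BITOP_WORD(offset);
    unsigned long result = offset & ~(BITS_PER_LONG - 1);
    unsigned long tmp;

    if (offset >= size) {
        return size;
    }
    size -= result;
    offset &= BITS_PER_LONG - 1;
    if (offset) {
        /* Partial first word: mask off the bits below offset. */
        tmp = *p++;
        tmp &= ~0UL << offset;
        if (size < BITS_PER_LONG) {
            goto found_first;
        }
        if (tmp) {
            goto found_middle;
        }
        size -= BITS_PER_LONG;
        result += BITS_PER_LONG;
    }
    /* Unrolled scan: skip four all-zero words per iteration. The ">="
     * condition changes as size shrinks, so the loop terminates. */
    while (size >= 4 * BITS_PER_LONG) {
        if (p[0] || p[1] || p[2] || p[3]) {
            break;  /* the single-word loop below locates the exact word */
        }
        p += 4;
        result += 4 * BITS_PER_LONG;
        size -= 4 * BITS_PER_LONG;
    }
    while (size >= BITS_PER_LONG) {
        if ((tmp = *p++)) {
            goto found_middle;
        }
        result += BITS_PER_LONG;
        size -= BITS_PER_LONG;
    }
    if (!size) {
        return result;
    }
    tmp = *p;

found_first:
    /* Partial last word: keep only the low "size" bits. */
    tmp &= ~0UL >> (BITS_PER_LONG - size);
    if (tmp == 0UL) {
        return result + size;
    }
found_middle:
    return result + ctz_ulong(tmp);
}
```

With the original "!(size & (4*BITS_PER_LONG-1))" test, subtracting 4*BITS_PER_LONG never changes the low bits being tested, so the loop could only exit through the break. The ">=" form also stops once fewer than four whole words remain.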