From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Blake
Date: Thu, 22 Oct 2015 09:55:09 -0600
Subject: Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection
To: Bernhard Voelker, Pádraig Brady, Paolo Bonzini, coreutils@gnu.org
Cc: Rusty Russell, "qemu-devel@nongnu.org"
Message-ID: <562906DD.5040501@redhat.com>
In-Reply-To: <5629050C.20607@bernhard-voelker.de>
References: <1445522453-14450-1-git-send-email-P@draigBrady.com> <5628F4BC.2040502@redhat.com> <5628F634.6040809@redhat.com> <5628FE20.80802@draigBrady.com> <5629050C.20607@bernhard-voelker.de>

On 10/22/2015 09:47 AM, Bernhard Voelker wrote:
>> Also I suspect the extra conditions involved in using longs
>> for just the first 16 bytes would outweigh the benefits?
>> I.E. the first simple loop probably breaks early, and if not
>> has the added benefit of "priming the pumps" for the subsequent memcmp().
>
> what about spending some 16 bytes of memory and do the memcmp on the whole
> buffer?
>
>   static unsigned char p[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
>   return 0 == memcmp (p, buf, bufsize);

Won't work over the whole bufsize for anything larger than 16 unless you
do repeated memcmp()s.  Or are you suggesting that the first 16-byte head
validation be done against a static buffer via one memcmp(), followed by
the other overlap-self memcmp() for the rest of the buffer?

But I suspect that for short lengths, it is more efficient to do an
unrolled loop than to make a function call (where the function call
itself will probably just do an unrolled loop on the short length).  You
want the short case to be fast, and the real speedup comes by delegating
as much of the long case as possible to the system memcmp() optimizations.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
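
For readers following the thread, here is a minimal sketch in C of the
two-stage check discussed above: a plain loop over the first 16 bytes,
then a single overlapping memcmp() of the buffer against itself for the
rest. The function name is_all_nul_sketch and the HEAD constant are
hypothetical names chosen for illustration; this is not the code from the
patch under review, only one way the idea could be written.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical head length; the thread talks about "the first 16 bytes". */
enum { HEAD = 16 };

/* Illustrative helper (an assumption, not the patch itself).
   Return true if all LENGTH bytes starting at BUF are NUL.  */
static bool
is_all_nul_sketch (const void *buf, size_t length)
{
  const unsigned char *p = buf;
  size_t head = length < HEAD ? length : HEAD;

  /* Short case: a simple loop that stops at the first nonzero byte,
     so typical non-NUL data never reaches memcmp() at all.  */
  for (size_t i = 0; i < head; i++)
    if (p[i] != 0)
      return false;

  if (length <= HEAD)
    return true;

  /* Long case: bytes 0..HEAD-1 are known to be zero.  Comparing the
     buffer with itself shifted by HEAD checks p[i] == p[i + HEAD] for
     every i < length - HEAD, so by induction every remaining byte must
     be zero too.  memcmp() only reads, so the overlap is harmless.  */
  return memcmp (p, p + HEAD, length - HEAD) == 0;
}
```

The design choice mirrors the argument in the reply: the short case avoids
a function call, and the long case hands nearly all of the work to
whatever optimized memcmp() the system provides.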