From: Richard Henderson
Date: Tue, 23 Aug 2016 21:17:52 -0700
Message-Id: <1472012279-20581-1-git-send-email-rth@twiddle.net>
Subject: [Qemu-devel] [PATCH 0/7] Improve buffer_is_zero
To: qemu-devel@nongnu.org
Cc: vijay.kilari@gmail.com, qemu-arm@nongnu.org, pbonzini@redhat.com, peter.maydell@linaro.org

Patches 1-3 remove the use of ifunc from the implementation.

Patch 5 adjusts the x86 implementation a bit more to take advantage
of ptest (in sse4.1) and unaligned accesses (in avx1).

Patches 2 and 6 are the result of my conversation with Vijaya Kumar
with respect to ThunderX.

Patch 7 is the result of seeing some really really horrible code
produced for ppc64le (gcc 4.9 and mainline).

This has had limited testing.  What I don't know is the best way to
benchmark this -- the only way I know to trigger this is via the
console, by hand, which doesn't make for reasonable timing.


r~


Richard Henderson (7):
  cutils: Remove SPLAT macro
  cutils: Export only buffer_is_zero
  cutils: Rearrange buffer_is_zero acceleration
  cutils: Add generic prefetch
  cutils: Rewrite x86 buffer zero checking
  cutils: Rewrite aarch64 buffer zero checking
  cutils: Rewrite ppc buffer zero checking

 configure             |  21 +-
 include/qemu/cutils.h |   2 -
 migration/ram.c       |   2 +-
 migration/rdma.c      |   5 +-
 util/cutils.c         | 526 +++++++++++++++++++++++++++++++++--------------------
 5 files changed, 352 insertions(+), 204 deletions(-)

-- 
2.7.4
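
On the benchmarking question, one possible approach is a standalone harness that calls the routine in a loop outside of QEMU. The following is only a minimal sketch under stated assumptions, not the code from this series: ref_buffer_is_zero and bench_ns_per_call are hypothetical names, the checker is a plain portable word-at-a-time loop standing in for the accelerated versions, and timing uses ISO C clock() (CPU time), which is coarse and needs a large iteration count.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical reference checker, NOT the util/cutils.c implementation:
 * scans 8 bytes at a time, then finishes the tail byte by byte. */
static bool ref_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t w;

    for (; len >= sizeof(w); p += sizeof(w), len -= sizeof(w)) {
        memcpy(&w, p, sizeof(w));   /* memcpy avoids alignment assumptions */
        if (w != 0) {
            return false;
        }
    }
    for (; len > 0; p++, len--) {
        if (*p != 0) {
            return false;
        }
    }
    return true;
}

typedef bool (*zero_fn)(const void *, size_t);

/* Hypothetical harness: average CPU nanoseconds per call of fn over
 * iters calls (iters must be > 0).  *hits receives the number of calls
 * that returned true, so results can be sanity-checked. */
static double bench_ns_per_call(zero_fn fn, const void *buf, size_t len,
                                unsigned iters, unsigned *hits)
{
    /* Reading the pointer through a volatile object forces a real
     * indirect call on every iteration instead of a hoisted one. */
    zero_fn volatile call = fn;
    unsigned n = 0;
    clock_t t0, t1;

    t0 = clock();
    for (unsigned i = 0; i < iters; i++) {
        n += call(buf, len);
    }
    t1 = clock();

    *hits = n;
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / iters;
}
```

A candidate implementation with the same signature could be passed as fn alongside the reference, using an all-zero buffer for the worst case and a buffer with an early nonzero byte for the early-exit case, and the per-call figures compared across buffer sizes.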