From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45996)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1aOQxj-0005be-RV for qemu-devel@nongnu.org;
	Wed, 27 Jan 2016 09:20:49 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1aOQxg-0000oI-JS for qemu-devel@nongnu.org;
	Wed, 27 Jan 2016 09:20:47 -0500
Received: from mx1.redhat.com ([209.132.183.28]:34855)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1aOQxg-0000nl-Ep for qemu-devel@nongnu.org;
	Wed, 27 Jan 2016 09:20:44 -0500
References: <1453880034-25076-1-git-send-email-liang.z.li@intel.com>
From: Paolo Bonzini
Message-ID: <56A8D235.8040705@redhat.com>
Date: Wed, 27 Jan 2016 15:20:37 +0100
MIME-Version: 1.0
In-Reply-To: <1453880034-25076-1-git-send-email-liang.z.li@intel.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v5 0/2] add avx2 instruction optimization
To: Liang Li, qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, quintela@redhat.com, dgilbert@redhat.com,
	stefanha@redhat.com, amit.shah@redhat.com, rth@twiddle.net

On 27/01/2016 08:33, Liang Li wrote:
> buffer_find_nonzero_offset() is a hot function during live migration.
> It currently uses SSE2 instructions for optimization. On platforms that
> support AVX2, using AVX2 instructions instead improves the performance
> of zero page checking by about 30% compared to SSE2.
> Live migration can be faster with this optimization: the test results
> show that for an idle guest with 8GB of RAM, this patch shortens the
> total live migration time by about 6%.
>
> This patch uses the ifunc mechanism to select the proper function at
> run time: on platforms that support AVX2, the AVX2 instructions are
> executed; otherwise, the original instructions are executed.
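
For anyone following the thread, the run-time selection described in the
paragraph above can be sketched roughly as below. This is only a minimal
illustration and not the code from the patch: find_nonzero_offset(),
find_nonzero_avx2(), find_nonzero_generic() and pick_find_nonzero() are
hypothetical names, the fallback is a plain byte loop rather than the SSE2
code, and a function pointer resolved on first call stands in for the ifunc
attribute so that the sketch also builds where ifunc is unavailable.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the AVX2 path is only built with GCC-compatible
 * compilers targeting x86-64; everything else uses the fallback. */
#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_AVX2_PATH 1
#endif

typedef size_t find_fn(const void *buf, size_t len);

/* Portable fallback: offset of the first nonzero byte, or len if none. */
static size_t find_nonzero_generic(const void *buf, size_t len)
{
    const unsigned char *p = (const unsigned char *)buf;
    size_t i;

    for (i = 0; i < len; i++) {
        if (p[i]) {
            return i;
        }
    }
    return len;
}

#ifdef HAVE_AVX2_PATH
/* AVX2 variant: the target attribute enables AVX2 for this function
 * only, so the rest of the file still builds without -mavx2. */
static __attribute__((target("avx2")))
size_t find_nonzero_avx2(const void *buf, size_t len)
{
    const unsigned char *p = (const unsigned char *)buf;
    size_t i = 0;

    /* Skip 32-byte chunks that are entirely zero. */
    for (; i + 32 <= len; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(p + i));
        if (!_mm256_testz_si256(v, v)) {
            break;
        }
    }
    /* Pin down the exact byte in the nonzero chunk or in the tail. */
    for (; i < len; i++) {
        if (p[i]) {
            return i;
        }
    }
    return len;
}
#endif

/* Resolver: plays the role an ifunc resolver would play. */
static find_fn *pick_find_nonzero(void)
{
#ifdef HAVE_AVX2_PATH
    if (__builtin_cpu_supports("avx2")) {
        return find_nonzero_avx2;
    }
#endif
    return find_nonzero_generic;
}

/* Public entry point (hypothetical name). The choice is made once, on
 * the first call; this simple caching is not synchronized. */
size_t find_nonzero_offset(const void *buf, size_t len)
{
    static find_fn *impl;

    assert(buf != NULL);
    if (!impl) {
        impl = pick_find_nonzero();
    }
    return impl(buf, len);
}
```

With GCC on a target that supports it, the resolver could instead be
attached to the public symbol with __attribute__((ifunc("..."))), so that
the dynamic linker makes the choice when the symbol is resolved and the
pointer check on every call goes away.
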
>
> With this patch, the QEMU binary can run on platforms both with and
> without AVX2 support.
>
> Compilers that do not support AVX2 or the ifunc attribute can still
> build the source code successfully.
>
> v4 -> v5 changes:
>  * Enhance the ifunc attribute detection (Paolo's suggestion)
>
> v3 -> v4 changes:
>  * Use the GCC #pragma to make things simple (Paolo's suggestion)
>  * Put avx2 related code in cutils.c (Richard's suggestion)
>  * Change configure to detect the ifunc and avx2 attributes together
>
> v2 -> v3 changes:
>  * Detect the ifunc attribute support (Paolo's suggestion)
>  * Use the ifunc attribute instead of the inline asm (Richard's suggestion)
>  * Change configure (Juan's suggestion)
>
> Liang Li (2):
>   configure: detect ifunc and avx2 attribute
>   cutils: add avx2 instruction optimization
>
>  configure             |  21 +++++++++
>  include/qemu-common.h |   8 +---
>  util/cutils.c         | 118 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 136 insertions(+), 11 deletions(-)

Reviewed-by: Paolo Bonzini