From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57647)
	by lists.gnu.org with esmtp (Exim 4.71)
	id 1Zw4zh-0006Ng-NQ
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:13:38 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	id 1Zw4zd-00014N-In
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:13:37 -0500
Received: from mx1.redhat.com ([209.132.183.28]:34852)
	by eggs.gnu.org with esmtp (Exim 4.71)
	id 1Zw4zd-00014I-Dv
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:13:33 -0500
From: Juan Quintela
In-Reply-To: (Liang Z. Li's message of "Tue, 10 Nov 2015 05:48:15 +0000")
References: <1447123907-26750-1-git-send-email-liang.z.li@intel.com>
	<564167C4.2060702@redhat.com>
Date: Tue, 10 Nov 2015 10:13:28 +0100
Message-ID: <87h9ku8bev.fsf@emacs.mitica>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization
Reply-To: quintela@redhat.com
To: "Li, Liang Z"
Cc: "amit.shah@redhat.com", "pbonzini@redhat.com",
	"qemu-devel@nongnu.org", "mst@redhat.com"

"Li, Liang Z" wrote:
>> Rather than trying to cater to multiple assembly instruction implementations
>> ourselves, have you tried taking the ideas in this earlier thread?
>> https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05298.html
>>
>> Ideally, libc's memcmp() will already be using the most efficient assembly
>> instructions without us having to reproduce the work of picking the instructions
>> that work best.
>>
>
> Eric, thanks for you information. I didn't notice that discussion before.
>
>
> I rewrite the buffer_find_nonzero_offset() with the 'bool memeqzero4_paolo length'
> then write a test program to check a large amount of zero pages, and
> use the 'time' to
> recode the time takes by different optimization.
> Test result is like this:
>
> SSE2:
> ---------------------------------
>           | test 1  | test 2
> ---------------------------------
> Time (S): | 13.696  | 13.533
> ---------------------------------
>
>
> AVX2:
> ---------------------------------
>           | test 1  | test 2
> ---------------------------------
> Time (S): | 10.583  | 10.306
> ---------------------------------
>
> memeqzero4_paolo:
> ---------------------------------
>           | test 1  | test 2
> ---------------------------------
> Time (S): |  9.718  |  9.817
> ---------------------------------
>
>
> Paolo's implementation has the best performance. It seems that we can
> remove the SSE2 related Intrinsics.

How should I understand that comment? That you are about to send an
email to remove the sse2 support and that I can forget about this patch?

Thanks, Juan.

>
> Liang
>> --
>> Eric Blake   eblake redhat com    +1-919-301-3266
>> Libvirt virtualization library http://libvirt.org
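
For readers of the archive, the idea quoted above (let libc's memcmp() do
the heavy lifting instead of hand-picked SSE2/AVX2 intrinsics) can be
sketched roughly as below. This is only an illustration under stated
assumptions: it is not the memeqzero4_paolo code that was benchmarked, and
the name buffer_is_zero_sketch is made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from QEMU, not the code measured in this thread).
 * Returns true if all len bytes of buf are zero.
 */
static bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    assert(buf != NULL || len == 0);

    if (len == 0) {
        return true;    /* an empty buffer has no nonzero byte */
    }

    /*
     * If the first byte is zero and every byte equals the byte that
     * follows it, then every byte is zero. memcmp() only reads, so
     * comparing the buffer with itself shifted by one byte is fine,
     * and libc is free to use whatever vector instructions it likes.
     */
    return p[0] == 0 && memcmp(p, p + 1, len - 1) == 0;
}
```

A caller would pass one page at a time, e.g.
buffer_is_zero_sketch(page, 4096). Whether this beats explicit intrinsics
depends on the libc and the CPU, so it would need to be measured in the
same way as the numbers quoted above.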