From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57647)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <quintela@redhat.com>) id 1Zw4zh-0006Ng-NQ
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:13:38 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <quintela@redhat.com>) id 1Zw4zd-00014N-In
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:13:37 -0500
Received: from mx1.redhat.com ([209.132.183.28]:34852)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <quintela@redhat.com>) id 1Zw4zd-00014I-Dv
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:13:33 -0500
From: Juan Quintela <quintela@redhat.com>
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E019A2935@shsmsx102.ccr.corp.intel.com>
	(Liang Z. Li's message of "Tue, 10 Nov 2015 05:48:15 +0000")
References: <1447123907-26750-1-git-send-email-liang.z.li@intel.com>
	<564167C4.2060702@redhat.com>
	<F2CBF3009FA73547804AE4C663CAB28E019A2935@shsmsx102.ccr.corp.intel.com>
Date: Tue, 10 Nov 2015 10:13:28 +0100
Message-ID: <87h9ku8bev.fsf@emacs.mitica>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization
Reply-To: quintela@redhat.com
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: "amit.shah@redhat.com" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "mst@redhat.com" <mst@redhat.com>

"Li, Liang Z" <liang.z.li@intel.com> wrote:
>> Rather than trying to cater to multiple assembly instruction implementations
>> ourselves, have you tried taking the ideas in this earlier thread?
>> https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05298.html
>> 
>> Ideally, libc's memcmp() will already be using the most efficient assembly
>> instructions without us having to reproduce the work of picking the instructions
>> that work best.
>> 
>
> Eric, thanks for your information. I hadn't noticed that discussion before.
>
>
> I rewrote buffer_find_nonzero_offset() using memeqzero4_paolo(), then
> wrote a test program that checks a large number of zero pages, and
> used 'time' to record the time taken by each optimization. The test
> results are:
>
> Implementation    | Test 1 (s) | Test 2 (s)
> ------------------+------------+-----------
> SSE2              |     13.696 |     13.533
> AVX2              |     10.583 |     10.306
> memeqzero4_paolo  |      9.718 |      9.817
>
>
> Paolo's implementation has the best performance. It seems that we can
> remove the SSE2-related intrinsics.

How should I understand that comment?  That you are about to send an
email to remove the sse2 support and that I can forget about this patch?
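
For context, the memcmp()-based idea from the thread Eric linked can be
sketched roughly as below. This is only an illustration under stated
assumptions: the name buffer_is_zero_sketch and the 16-byte head size are
hypothetical, and this is not Paolo's actual memeqzero4_paolo code nor
anything currently in QEMU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not QEMU code): returns true if buf[0..len) is
 * all zero.  The heavy lifting is left to libc's memcmp(), which is
 * expected to pick efficient instructions for the host CPU.
 */
static bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    const size_t head = 16;   /* bytes checked by hand; arbitrary choice */
    size_t i;

    /* Check the first few bytes one at a time. */
    for (i = 0; i < len && i < head; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    if (len <= head) {
        return true;
    }

    /*
     * The head is known to be zero.  Comparing the buffer against itself
     * shifted by `head` bytes checks p[i] == p[i + head] for every i, so
     * by induction every later byte must be zero as well.
     */
    return memcmp(p, p + head, len - head) == 0;
}
```

A caller would pass one page at a time, e.g.
buffer_is_zero_sketch(page, 4096). Whether this beats hand-written
SSE2/AVX2 code depends on the libc in use, so it would need to be
measured the same way as the numbers quoted above.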

Thanks, Juan.


>
> Liang
>> --
>> Eric Blake   eblake redhat com    +1-919-301-3266
>> Libvirt virtualization library http://libvirt.org