From: Juan Quintela
To: "Li, Liang Z"
Cc: "amit.shah@redhat.com", Paolo Bonzini, "qemu-devel@nongnu.org", "mst@redhat.com"
Subject: Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization
Date: Thu, 12 Nov 2015 12:34:07 +0100
Message-ID: <87tworqwnk.fsf@emacs.mitica>
In-Reply-To: (Liang Z. Li's message of "Thu, 12 Nov 2015 09:53:40 +0000")
References: <1447123907-26750-1-git-send-email-liang.z.li@intel.com> <564167C4.2060702@redhat.com> <87h9ku8bev.fsf@emacs.mitica> <5641BA7B.4050108@redhat.com> <56445141.2070907@redhat.com> <5644561C.3060208@redhat.com> <56445FAB.80906@redhat.com>
Reply-To: quintela@redhat.com

"Li, Liang Z" wrote:
>> On 12/11/2015 10:40, Li, Liang Z wrote:
>> > I migrated an idle guest with 8GB of RAM; I think most of its pages are zero pages.
>> >
>> > I used your new code:
>> > -------------------------------------------------
>> > unsigned long *p = ...
>> > if (p[0] || p[1] || p[2] || p[3]
>> >     || memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
>> >     return BUFFER_NOT_ZERO;
>> > else
>> >     return BUFFER_ZERO;
>> > ---------------------------------------------------
>> > and the result is almost the same. I also tried checking 8 and 16 longs
>> > of data at the beginning, with the same result.
>>
>> Interesting... Well, all I can say is that I applaud you for testing
>> your hypothesis with the benchmark.
>>
>> Probably the setup cost of memcmp is too high, because the testing loop is
>> already very optimized.
>>
>> Please submit the AVX2 version if it helps!

I read the emails in the wrong order. Forget about my other email.

Sorry, Juan.

>
> Yes, the AVX2 version really helps. I have already submitted it; could
> you help to review it?
>
> I am curious about the original intention behind adding the SSE2 intrinsics;
> was it for the same reason?
>
> I even suspect the VM may impact the 'memcmp()' performance. Is that possible?
>
> Liang
>
>> Paolo
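
For readers following the thread, the memcmp() trick quoted above can be written out as a complete function. This is only a minimal sketch under stated assumptions, not QEMU's actual code: the name buffer_is_zero_memcmp and the HEAD_BYTES constant are hypothetical, and the head is checked byte by byte (rather than as four unsigned longs) so that the sketch makes no alignment assumptions about the buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant: size of the prefix that is checked directly,
 * matching the "first four longs" of the quoted snippet. */
#define HEAD_BYTES (4 * sizeof(unsigned long))

/* Hypothetical helper, not QEMU's implementation.
 * Returns true if all `size` bytes of `buf` are zero. */
static bool buffer_is_zero_memcmp(const void *buf, size_t size)
{
    const unsigned char *b = buf;
    size_t head = size < HEAD_BYTES ? size : HEAD_BYTES;

    /* Step 1: check the short prefix directly. */
    for (size_t i = 0; i < head; i++) {
        if (b[i] != 0) {
            return false;
        }
    }
    if (size == head) {
        return true; /* the whole buffer fit inside the prefix */
    }

    /* Step 2: compare the buffer with itself shifted by `head` bytes.
     * If b[head + i] == b[i] for every i and the prefix is zero, then
     * every byte is zero by induction. memcmp only reads, so the
     * overlapping ranges are fine. */
    return memcmp(b + head, b, size - head) == 0;
}
```

A caller would use it as buffer_is_zero_memcmp(page, page_size). Whether this beats a hand-unrolled or SIMD loop depends on the memcmp() implementation and on the setup cost mentioned above, so it needs to be measured, as was done in this thread.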