From: Richard Henderson
To: "Li, Liang Z", "qemu-devel@nongnu.org"
Cc: "mst@redhat.com", "quintela@redhat.com", "dgilbert@redhat.com", "stefanha@redhat.com", "amit.shah@redhat.com", "pbonzini@redhat.com"
Subject: Re: [Qemu-devel] [v3 1/3] cutils: add avx2 instruction optimization
Date: Wed, 9 Dec 2015 06:57:05 -0800
Message-ID: <56684141.5020504@twiddle.net>
References: <1449576535-3369-1-git-send-email-liang.z.li@intel.com> <1449576535-3369-2-git-send-email-liang.z.li@intel.com> <566700CA.1080408@twiddle.net>

On 12/09/2015 01:32 AM, Li, Liang Z wrote:
> I think you means the ' __attribute__((target("avx2")))', I have tried this way, the issue here is:
> without the ' -mavx2' option for gcc, there are compiling error: '__m256i undeclared', the __attribute__((target("avx2")))
> can't solve this issue. Any idea?

You're right that you can't use the normal __m256i, as it doesn't get
declared. But you can define the same type within the function itself.
Which is a simple matter of

    typedef long long __m256i __attribute__((vector_size(32)));

From there, you might as well rely on other gcc extensions to instead write

    __m256i tmp0 = p[i + 0] | p[i + 1];

rather than obfuscating the code with AVX2_VEC_OR.

r~
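
To make the suggestion above concrete, here is a minimal sketch of how the pieces might fit together when the file is built without -mavx2. It is not the code from the patch series: the names vec256, buffer_is_zero, buffer_is_zero_avx2 and buffer_is_zero_scalar are hypothetical, the runtime check via __builtin_cpu_supports is an assumption about how a caller would dispatch, and the may_alias attribute is added here so the vector type can legally read an arbitrary byte buffer. The AVX2 path assumes a 32-byte-aligned buffer whose length is a multiple of 64.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar fallback (hypothetical helper) for hosts without AVX2. */
static bool buffer_is_zero_scalar(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i]) {
            return false;
        }
    }
    return true;
}

#if defined(__x86_64__) || defined(__i386__)
/*
 * Only this function is compiled for AVX2; the rest of the file keeps the
 * default target.  Assumes buf is 32-byte aligned and len is a multiple
 * of 64 (two vectors per iteration).
 */
static __attribute__((target("avx2")))
bool buffer_is_zero_avx2(const void *buf, size_t len)
{
    /* Local stand-in for __m256i, which is not declared without -mavx2. */
    typedef long long vec256 __attribute__((vector_size(32), may_alias));

    const vec256 *p = buf;
    size_t n = len / sizeof(vec256);

    for (size_t i = 0; i < n; i += 2) {
        /* GCC vector extension: plain '|' works element-wise on vectors. */
        vec256 tmp0 = p[i + 0] | p[i + 1];

        /* Vector subscripting reads the four 64-bit lanes. */
        if (tmp0[0] | tmp0[1] | tmp0[2] | tmp0[3]) {
            return false;
        }
    }
    return true;
}
#endif

/* Hypothetical entry point: pick the AVX2 path only if the CPU has it. */
static bool buffer_is_zero(const void *buf, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        return buffer_is_zero_avx2(buf, len);
    }
#endif
    return buffer_is_zero_scalar(buf, len);
}
```

A caller would pass an aligned buffer, for example one declared with __attribute__((aligned(32))), and call buffer_is_zero(buf, sizeof(buf)). Keeping the typedef inside the target("avx2") function means no 256-bit vector value is ever passed to or returned from code compiled for the default target.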