From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42339)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1XhXqP-0007HZ-5L
	for qemu-devel@nongnu.org; Fri, 24 Oct 2014 01:55:34 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1XhXqG-0004ND-3l
	for qemu-devel@nongnu.org; Fri, 24 Oct 2014 01:55:25 -0400
Received: from mail-wg0-x22f.google.com ([2a00:1450:400c:c00::22f]:60277)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1XhXqF-0004N8-T8
	for qemu-devel@nongnu.org; Fri, 24 Oct 2014 01:55:16 -0400
Received: by mail-wg0-f47.google.com with SMTP id x13so327441wgg.18
	for <qemu-devel@nongnu.org>; Thu, 23 Oct 2014 22:55:15 -0700 (PDT)
Sender: Paolo Bonzini <paolo.bonzini@gmail.com>
Message-ID: <5449E9BE.9050900@redhat.com>
Date: Fri, 24 Oct 2014 07:55:10 +0200
From: Paolo Bonzini <pbonzini@redhat.com>
MIME-Version: 1.0
References: <1414033363-31032-1-git-send-email-chao.p.peng@linux.intel.com>
	<20141023194923.GA25413@thinpad.lan.raisama.net>
	<20141024012716.GB3135@pengc-linux.bj.intel.com>
In-Reply-To: <20141024012716.GB3135@pengc-linux.bj.intel.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH] target-i386: add Intel AVX-512 support
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Chao Peng <chao.p.peng@linux.intel.com>, Eduardo Habkost <ehabkost@redhat.com>
Cc: kvm@vger.kernel.org, "Michael S. Tsirkin" <mst@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>, qemu-devel@nongnu.org, Vadim Rozenfeld <vrozenfe@redhat.com>, Laszlo Ersek <lersek@redhat.com>, =?windows-1252?Q?Andreas_F=E4rber?= <afaerber@suse.de>


On 10/24/2014 03:27 AM, Chao Peng wrote:
> On Thu, Oct 23, 2014 at 05:49:23PM -0200, Eduardo Habkost wrote:
>> On Thu, Oct 23, 2014 at 11:02:43AM +0800, Chao Peng wrote:
>> [...]
>>> @@ -707,6 +714,24 @@ typedef union {
>>>  } XMMReg;
>>>  
>>>  typedef union {
>>> +    uint8_t _b[32];
>>> +    uint16_t _w[16];
>>> +    uint32_t _l[8];
>>> +    uint64_t _q[4];
>>> +    float32 _s[8];
>>> +    float64 _d[4];
>>> +} YMMReg;
>>> +
>>> +typedef union {
>>> +    uint8_t _b[64];
>>> +    uint16_t _w[32];
>>> +    uint32_t _l[16];
>>> +    uint64_t _q[8];
>>> +    float32 _s[16];
>>> +    float64 _d[8];
>>> +} ZMMReg;
>>> +
>>> +typedef union {
>>>      uint8_t _b[8];
>>>      uint16_t _w[4];
>>>      uint32_t _l[2];
>>> @@ -725,6 +750,20 @@ typedef struct BNDCSReg {
>>>  } BNDCSReg;
>>>  
>>>  #ifdef HOST_WORDS_BIGENDIAN
>>> +#define ZMM_B(n) _b[63 - (n)]
>>> +#define ZMM_W(n) _w[31 - (n)]
>>> +#define ZMM_L(n) _l[15 - (n)]
>>> +#define ZMM_S(n) _s[15 - (n)]
>>> +#define ZMM_Q(n) _q[7 - (n)]
>>> +#define ZMM_D(n) _d[7 - (n)]
>>> +
>>> +#define YMM_B(n) _b[31 - (n)]
>>> +#define YMM_W(n) _w[15 - (n)]
>>> +#define YMM_L(n) _l[7 - (n)]
>>> +#define YMM_S(n) _s[7 - (n)]
>>> +#define YMM_Q(n) _q[3 - (n)]
>>> +#define YMM_D(n) _d[3 - (n)]
>>> +
>>>  #define XMM_B(n) _b[15 - (n)]
>>>  #define XMM_W(n) _w[7 - (n)]
>>>  #define XMM_L(n) _l[3 - (n)]
>>> @@ -737,6 +776,20 @@ typedef struct BNDCSReg {
>>>  #define MMX_L(n) _l[1 - (n)]
>>>  #define MMX_S(n) _s[1 - (n)]
>>>  #else
>>> +#define ZMM_B(n) _b[n]
>>> +#define ZMM_W(n) _w[n]
>>> +#define ZMM_L(n) _l[n]
>>> +#define ZMM_S(n) _s[n]
>>> +#define ZMM_Q(n) _q[n]
>>> +#define ZMM_D(n) _d[n]
>>> +
>>> +#define YMM_B(n) _b[n]
>>> +#define YMM_W(n) _w[n]
>>> +#define YMM_L(n) _l[n]
>>> +#define YMM_S(n) _s[n]
>>> +#define YMM_Q(n) _q[n]
>>> +#define YMM_D(n) _d[n]
>>> +
>>
>> I am probably failing to see some future use case for these data
>> structures, but why all the extra complexity here, if only ZMM_Q and
>> YMM_Q are used in the code, and the only place affected by the
>> ordering of the YMMReg and ZMMReg array elements is the memcpy() calls in
>> kvm_{put,get}_xsave(), where the data always has the same layout?
>>
> 
> Thanks Eduardo, then I feel comfortable dropping most of these macros and
> keeping only YMM_Q/ZMM_Q. As there is no actual benefit to the ordering, I
> will also make these two endianness-insensitive.
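
For context, what the accessor macros provide is an index that means the
same thing on any host: ZMM_B(8 * n + k) is always byte k (counting from
the least significant) of ZMM_Q(n).  A minimal sketch of that property,
assuming GCC/Clang's __BYTE_ORDER__ in place of HOST_WORDS_BIGENDIAN and a
trimmed-down ZMMReg with only the _b and _q views; zmm_views_agree() is a
hypothetical helper and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Trimmed-down copy of the ZMMReg union from the patch:
 * only the byte and quadword views are kept. */
typedef union {
    uint8_t  _b[64];
    uint64_t _q[8];
} ZMMReg;

/* Assumption: __BYTE_ORDER__ (GCC/Clang) stands in for
 * QEMU's HOST_WORDS_BIGENDIAN. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define ZMM_B(n) _b[63 - (n)]
#define ZMM_Q(n) _q[7 - (n)]
#else
#define ZMM_B(n) _b[n]
#define ZMM_Q(n) _q[n]
#endif

/* Hypothetical helper: returns 1 if, for every quadword n and byte k,
 * ZMM_B(8 * n + k) is byte k of ZMM_Q(n); returns 0 otherwise. */
static int zmm_views_agree(void)
{
    ZMMReg r;

    assert(sizeof(ZMMReg) == 64);
    memset(&r, 0, sizeof(r));

    /* Give every byte of the register a distinct value: the byte with
     * significance k inside quadword n holds 8 * n + k. */
    for (int n = 0; n < 8; n++) {
        r.ZMM_Q(n) = 0x0706050403020100ULL + 0x0808080808080808ULL * n;
    }

    for (int n = 0; n < 8; n++) {
        for (int k = 0; k < 8; k++) {
            uint8_t expect = (uint8_t)(r.ZMM_Q(n) >> (8 * k));
            if (r.ZMM_B(8 * n + k) != expect) {
                return 0;
            }
        }
    }
    return 1;
}
```

The check is written so that it should hold on either host byte order;
without the reversed indices in the big-endian branch the two views would
disagree on a big-endian host.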

I think we can keep the macros.  The actual cleanup would be to have a
single member for the 32 512-bit ZMM registers, instead of splitting
xmm/ymmh/zmmh/zmm_hi16.  This will get rid of the YMM_* and ZMM_*
registers.  However, we could then no longer use simple memcpy()s to
marshal the state in and out of the XSAVE data.  We can do it in 2.2.
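
To make the trade-off concrete: with one 512-bit member per register, each
register has to be assembled from the separate components rather than
copied one whole component at a time.  A rough sketch, in which SplitState,
UnifiedZMM and gather_zmm() are all hypothetical stand-ins (not the real
XSAVE offsets or QEMU fields), and quadword 0 is taken to be the least
significant one, i.e. the big-endian index reversal is ignored:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NB_ZMM 32

/* Hypothetical unified register: one 512-bit value as eight quadwords. */
typedef struct {
    uint64_t q[8];
} UnifiedZMM;

/* Hypothetical split layout mirroring the xmm/ymmh/zmmh/zmm_hi16 split.
 * Real XSAVE offsets and padding are deliberately not modelled. */
typedef struct {
    uint64_t xmm[16][2];      /* bits 127:0   of registers 0..15  */
    uint64_t ymmh[16][2];     /* bits 255:128 of registers 0..15  */
    uint64_t zmmh[16][4];     /* bits 511:256 of registers 0..15  */
    uint64_t zmm_hi16[16][8]; /* all 512 bits of registers 16..31 */
} SplitState;

/* Assemble 32 unified registers from the split components.  Each of the
 * low 16 registers needs three separate copies, which is why a single
 * memcpy() per component no longer works with a unified member. */
static void gather_zmm(UnifiedZMM dst[NB_ZMM], const SplitState *src)
{
    assert(dst != NULL && src != NULL);

    for (int i = 0; i < 16; i++) {
        memcpy(&dst[i].q[0], src->xmm[i],  sizeof(src->xmm[i]));
        memcpy(&dst[i].q[2], src->ymmh[i], sizeof(src->ymmh[i]));
        memcpy(&dst[i].q[4], src->zmmh[i], sizeof(src->zmmh[i]));
    }
    for (int i = 16; i < NB_ZMM; i++) {
        memcpy(dst[i].q, src->zmm_hi16[i - 16], sizeof(dst[i].q));
    }
}
```

The opposite direction (scattering back into the split areas) would be the
same loops with source and destination swapped.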

Paolo