Re: [Qemu-devel] [PATCH] target-i386: add Intel AVX-512 support

From: Paolo Bonzini <pbonzini@redhat.com>
To: Chao Peng <chao.p.peng@linux.intel.com>,
	Eduardo Habkost <ehabkost@redhat.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Marcelo Tosatti" <mtosatti@redhat.com>,
	"Vadim Rozenfeld" <vrozenfe@redhat.com>,
	"Laszlo Ersek" <lersek@redhat.com>,
	"Andreas Färber" <afaerber@suse.de>
Subject: Re: [Qemu-devel] [PATCH] target-i386: add Intel AVX-512 support
Date: Fri, 24 Oct 2014 07:55:10 +0200
Message-ID: <5449E9BE.9050900@redhat.com>
In-Reply-To: <20141024012716.GB3135@pengc-linux.bj.intel.com>



On 10/24/2014 03:27 AM, Chao Peng wrote:
> On Thu, Oct 23, 2014 at 05:49:23PM -0200, Eduardo Habkost wrote:
>> On Thu, Oct 23, 2014 at 11:02:43AM +0800, Chao Peng wrote:
>> [...]
>>> @@ -707,6 +714,24 @@ typedef union {
>>>  } XMMReg;
>>>  
>>>  typedef union {
>>> +    uint8_t _b[32];
>>> +    uint16_t _w[16];
>>> +    uint32_t _l[8];
>>> +    uint64_t _q[4];
>>> +    float32 _s[8];
>>> +    float64 _d[4];
>>> +} YMMReg;
>>> +
>>> +typedef union {
>>> +    uint8_t _b[64];
>>> +    uint16_t _w[32];
>>> +    uint32_t _l[16];
>>> +    uint64_t _q[8];
>>> +    float32 _s[16];
>>> +    float64 _d[8];
>>> +} ZMMReg;
>>> +
>>> +typedef union {
>>>      uint8_t _b[8];
>>>      uint16_t _w[4];
>>>      uint32_t _l[2];
>>> @@ -725,6 +750,20 @@ typedef struct BNDCSReg {
>>>  } BNDCSReg;
>>>  
>>>  #ifdef HOST_WORDS_BIGENDIAN
>>> +#define ZMM_B(n) _b[63 - (n)]
>>> +#define ZMM_W(n) _w[31 - (n)]
>>> +#define ZMM_L(n) _l[15 - (n)]
>>> +#define ZMM_S(n) _s[15 - (n)]
>>> +#define ZMM_Q(n) _q[7 - (n)]
>>> +#define ZMM_D(n) _d[7 - (n)]
>>> +
>>> +#define YMM_B(n) _b[31 - (n)]
>>> +#define YMM_W(n) _w[15 - (n)]
>>> +#define YMM_L(n) _l[7 - (n)]
>>> +#define YMM_S(n) _s[7 - (n)]
>>> +#define YMM_Q(n) _q[3 - (n)]
>>> +#define YMM_D(n) _d[3 - (n)]
>>> +
>>>  #define XMM_B(n) _b[15 - (n)]
>>>  #define XMM_W(n) _w[7 - (n)]
>>>  #define XMM_L(n) _l[3 - (n)]
>>> @@ -737,6 +776,20 @@ typedef struct BNDCSReg {
>>>  #define MMX_L(n) _l[1 - (n)]
>>>  #define MMX_S(n) _s[1 - (n)]
>>>  #else
>>> +#define ZMM_B(n) _b[n]
>>> +#define ZMM_W(n) _w[n]
>>> +#define ZMM_L(n) _l[n]
>>> +#define ZMM_S(n) _s[n]
>>> +#define ZMM_Q(n) _q[n]
>>> +#define ZMM_D(n) _d[n]
>>> +
>>> +#define YMM_B(n) _b[n]
>>> +#define YMM_W(n) _w[n]
>>> +#define YMM_L(n) _l[n]
>>> +#define YMM_S(n) _s[n]
>>> +#define YMM_Q(n) _q[n]
>>> +#define YMM_D(n) _d[n]
>>> +
>>
>> I am probably failing to see some future use case of those data
>> structures, but: why all the extra complexity here, if only ZMM_Q and
>> YMM_Q are being used in the code, and the only places affected by the
>> ordering of YMMReg and ZMMReg array elements are the memcpy() calls in
>> kvm_{put,get}_xsave(), where the data always has the same layout?
>>
> 
> Thanks Eduardo, then I feel comfortable dropping most of these macros
> and keeping only YMM_Q/ZMM_Q. As there is no actual benefit from the
> ordering, I will also make these two endianness-insensitive.
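
To make the question above concrete, here is a minimal sketch (not code
from the patch) of what the two index mappings do. The _LE/_BE suffixes
and the ymm_set_q()/ymm_get_q() helpers are hypothetical, introduced only
so that both variants can be compiled on one host. Logical reads through
the accessors agree under either mapping; only the raw storage order of
_q[] differs, which matters solely when the raw bytes are copied to a
buffer with a fixed layout.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down copy of the YMMReg union from the quoted patch. */
typedef union {
    uint8_t  _b[32];
    uint64_t _q[4];
} YMMReg;

/* Both index mappings from the patch. The _LE/_BE names are
 * hypothetical; the patch selects one via HOST_WORDS_BIGENDIAN. */
#define YMM_Q_LE(n) _q[n]
#define YMM_Q_BE(n) _q[3 - (n)]

/* Store logical quadword n using the chosen mapping. */
static void ymm_set_q(YMMReg *r, int n, uint64_t v, int reversed)
{
    assert(n >= 0 && n < 4);
    if (reversed) {
        r->YMM_Q_BE(n) = v;
    } else {
        r->YMM_Q_LE(n) = v;
    }
}

/* Load logical quadword n using the chosen mapping. */
static uint64_t ymm_get_q(const YMMReg *r, int n, int reversed)
{
    assert(n >= 0 && n < 4);
    return reversed ? r->YMM_Q_BE(n) : r->YMM_Q_LE(n);
}
```

As long as every access goes through the same mapping, callers cannot
tell the two apart; a memcpy() of the whole union can.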

I think we can keep the macros.  The actual cleanup would be to have a
single member for the 32 512-bit ZMM registers, instead of splitting
xmm/ymmh/zmmh/zmm_hi16.  This will get rid of the YMM_* and ZMM_*
registers.  However, we could then no longer use simple memcpy()s to
marshal the registers into and out of the XSAVE data.  We can do it in 2.2.
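
A minimal sketch of what that marshalling might look like, under stated
assumptions: the XSaveSplit struct, its field names and the two helper
functions are hypothetical stand-ins, not QEMU's actual XSAVE handling,
and each register is assumed to be stored as raw bytes in guest
(little-endian) order. In a real XSAVE area the component offsets are
enumerated by CPUID leaf 0xD rather than fixed by a C struct.

```c
#include <stdint.h>
#include <string.h>

/* One full 512-bit register, kept as raw bytes in guest order. */
typedef struct {
    uint8_t b[64];
} ZMMReg;

/* Hypothetical stand-in for the split XSAVE components. */
typedef struct {
    uint8_t xmm[16][16];      /* bits 127:0   of ZMM0-15 */
    uint8_t ymmh[16][16];     /* bits 255:128 of ZMM0-15 */
    uint8_t zmmh[16][32];     /* bits 511:256 of ZMM0-15 */
    uint8_t hi16_zmm[16][64]; /* all 512 bits of ZMM16-31 */
} XSaveSplit;

/* Scatter the unified register file into the split components. */
static void zmm_to_xsave(XSaveSplit *x, const ZMMReg regs[32])
{
    for (int i = 0; i < 16; i++) {
        memcpy(x->xmm[i],      &regs[i].b[0],   16);
        memcpy(x->ymmh[i],     &regs[i].b[16],  16);
        memcpy(x->zmmh[i],     &regs[i].b[32],  32);
        memcpy(x->hi16_zmm[i], regs[i + 16].b,  64);
    }
}

/* Gather the split components back into the unified register file. */
static void xsave_to_zmm(ZMMReg regs[32], const XSaveSplit *x)
{
    for (int i = 0; i < 16; i++) {
        memcpy(&regs[i].b[0],  x->xmm[i],      16);
        memcpy(&regs[i].b[16], x->ymmh[i],     16);
        memcpy(&regs[i].b[32], x->zmmh[i],     32);
        memcpy(regs[i + 16].b, x->hi16_zmm[i], 64);
    }
}
```

The per-register loop is the cost of the unified layout: one memcpy()
per XSAVE component is no longer enough for ZMM0-15, because each
register's pieces live in three different components.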

Paolo
