All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gleb Natapov <gleb@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, ehabkost@redhat.com, kvm@vger.kernel.org
Subject: Re: [PATCH uq/master 2/2] KVM: make XSAVE support more robust
Date: Mon, 9 Sep 2013 12:18:04 +0300	[thread overview]
Message-ID: <20130909091804.GN17294@redhat.com> (raw)
In-Reply-To: <522D8C2E.1030601@redhat.com>

On Mon, Sep 09, 2013 at 10:51:58AM +0200, Paolo Bonzini wrote:
> Il 08/09/2013 13:52, Gleb Natapov ha scritto:
> > On Thu, Sep 05, 2013 at 03:06:22PM +0200, Paolo Bonzini wrote:
> >> QEMU moves state from CPUArchState to struct kvm_xsave and back when it
> >> invokes the KVM_*_XSAVE ioctls.  Because it doesn't treat the XSAVE
> >> region as an opaque blob, it might be impossible to set some state on
> >> the destination if migrating to an older version.
> >>
> >> This patch blocks migration if it finds that unsupported bits are set
> >> in the XSTATE_BV header field.  To make this work robustly, QEMU should
> >> only report in env->xstate_bv those fields that will actually end up
> >> in the migration stream.
> > 
> > We usually handle host cpu differences in cpuid layer, not by trying to
> > validate migration data.
> 
> Actually we do both.  QEMU for example detects invalid subsections and
> blocks migration, and CPU differences also result in subsections that
> the destination does not know.
> 
That's different from what you do here though. If xstate_bv was in its
separate subsection things would be easier, but it is not.

> But as far as QEMU is concerned, setting an unknown bit in XSTATE_BV is
> not a CPU difference, it is simply invalid migration data.
> 
> > i.e CPUID.0D should be configurable and
> > management should be able to query QEMU what is supported and prevent
> > migration attempt accordingly.
> 
> Management is already able to query QEMU of what is supported, because
> new XSAVE state is always attached to new CPUID bits in leaves other
> than 0Dh (e.g. EAX=07h, ECX=0h returns AVX512 and MPX support in EBX).
> QEMU should compute 0Dh data based on those bits indeed.
If it is computable from other data even better, easier for us.

> 
> However, KVM_GET/SET_XSAVE should still return all values supported by
> the hypervisor, independent of the supported CPUID bits.
> 
Why?

> >>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> ---
> >>  target-i386/kvm.c     | 3 ++-
> >>  target-i386/machine.c | 4 ++++
> >>  2 files changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> >> index 749aa09..df08a4b 100644
> >> --- a/target-i386/kvm.c
> >> +++ b/target-i386/kvm.c
> >> @@ -1291,7 +1291,8 @@ static int kvm_get_xsave(X86CPU *cpu)
> >>              sizeof env->fpregs);
> >>      memcpy(env->xmm_regs, &xsave->region[XSAVE_XMM_SPACE],
> >>              sizeof env->xmm_regs);
> >> -    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
> >> +    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] &
> >> +            XSTATE_SUPPORTED;
> > Don't we just drop state here that will not be restored on the
> > destination and destination will not be able to tell since we masked
> > unsupported bits?
> 
> A well-behaved guest should not have modified that state anyway, since:
> 
> * the source and destination machines should have the same CPU
> 
> * since the destination QEMU does not support the feature, the source
> should have masked it as well
> 
> * the guest should always probe CPUID before using a feature
> 
The I fail to see what is the purpose of the patch. I see two cases:
1. Each extended state has separate CPUID bit (is this guarantied?)
  - In this case, as you say, by matching CPUID on src and dst we guaranty
    that migration data is good.
2. There is a state that is advertised in CPUID.0D, but does not have
   any regular "feature" CPUID associated with it.
 - In this case this patch will drop valid state that needs to be
   restored.

> There will be only one change for well-behaved guests with this patch
> (and the change will be invisible to them since they behave well).
> After the patch, KVM_SET_XSAVE will set the extended states to the
> processor-reset state instead of all-zeros.  However, all
> currently-defined states have a processor-reset state that is equal to
> all-zeroes, so this change is theoretical.
> 
> In fact, perhaps even XSTATE_SUPPORTED is not restrictive enough here,
> and we should hide all features that are not visible in CPUID.  It is
> okay, however, to test it in cpu_post_load.
The kernel should not even return state that is not visible in CPUID.

> 
> Paolo
> 
> >>      memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
> >>              sizeof env->ymmh_regs);
> >>      return 0;
> >> diff --git a/target-i386/machine.c b/target-i386/machine.c
> >> index dc81cde..9e2cfcf 100644
> >> --- a/target-i386/machine.c
> >> +++ b/target-i386/machine.c
> >> @@ -278,6 +278,10 @@ static int cpu_post_load(void *opaque, int version_id)
> >>      CPUX86State *env = &cpu->env;
> >>      int i;
> >>  
> >> +    if (env->xstate_bv & ~XSTATE_SUPPORTED) {
> >> +        return -EINVAL;
> >> +    }
> >> + 
> >>      /*
> >>       * Real mode guest segments register DPL should be zero.
> >>       * Older KVM version were setting it wrongly.
> >> -- 
> >> 1.8.3.1
> > 
> > --
> > 			Gleb.
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 

--
			Gleb.

WARNING: multiple messages have this Message-ID (diff)
From: Gleb Natapov <gleb@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, ehabkost@redhat.com
Subject: Re: [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust
Date: Mon, 9 Sep 2013 12:18:04 +0300	[thread overview]
Message-ID: <20130909091804.GN17294@redhat.com> (raw)
In-Reply-To: <522D8C2E.1030601@redhat.com>

On Mon, Sep 09, 2013 at 10:51:58AM +0200, Paolo Bonzini wrote:
> Il 08/09/2013 13:52, Gleb Natapov ha scritto:
> > On Thu, Sep 05, 2013 at 03:06:22PM +0200, Paolo Bonzini wrote:
> >> QEMU moves state from CPUArchState to struct kvm_xsave and back when it
> >> invokes the KVM_*_XSAVE ioctls.  Because it doesn't treat the XSAVE
> >> region as an opaque blob, it might be impossible to set some state on
> >> the destination if migrating to an older version.
> >>
> >> This patch blocks migration if it finds that unsupported bits are set
> >> in the XSTATE_BV header field.  To make this work robustly, QEMU should
> >> only report in env->xstate_bv those fields that will actually end up
> >> in the migration stream.
> > 
> > We usually handle host cpu differences in cpuid layer, not by trying to
> > validate migration data.
> 
> Actually we do both.  QEMU for example detects invalid subsections and
> blocks migration, and CPU differences also result in subsections that
> the destination does not know.
> 
That's different from what you do here though. If xstate_bv was in its
separate subsection things would be easier, but it is not.

> But as far as QEMU is concerned, setting an unknown bit in XSTATE_BV is
> not a CPU difference, it is simply invalid migration data.
> 
> > i.e CPUID.0D should be configurable and
> > management should be able to query QEMU what is supported and prevent
> > migration attempt accordingly.
> 
> Management is already able to query QEMU of what is supported, because
> new XSAVE state is always attached to new CPUID bits in leaves other
> than 0Dh (e.g. EAX=07h, ECX=0h returns AVX512 and MPX support in EBX).
> QEMU should compute 0Dh data based on those bits indeed.
If it is computable from other data even better, easier for us.

> 
> However, KVM_GET/SET_XSAVE should still return all values supported by
> the hypervisor, independent of the supported CPUID bits.
> 
Why?

> >>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> ---
> >>  target-i386/kvm.c     | 3 ++-
> >>  target-i386/machine.c | 4 ++++
> >>  2 files changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> >> index 749aa09..df08a4b 100644
> >> --- a/target-i386/kvm.c
> >> +++ b/target-i386/kvm.c
> >> @@ -1291,7 +1291,8 @@ static int kvm_get_xsave(X86CPU *cpu)
> >>              sizeof env->fpregs);
> >>      memcpy(env->xmm_regs, &xsave->region[XSAVE_XMM_SPACE],
> >>              sizeof env->xmm_regs);
> >> -    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
> >> +    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] &
> >> +            XSTATE_SUPPORTED;
> > Don't we just drop state here that will not be restored on the
> > destination and destination will not be able to tell since we masked
> > unsupported bits?
> 
> A well-behaved guest should not have modified that state anyway, since:
> 
> * the source and destination machines should have the same CPU
> 
> * since the destination QEMU does not support the feature, the source
> should have masked it as well
> 
> * the guest should always probe CPUID before using a feature
> 
The I fail to see what is the purpose of the patch. I see two cases:
1. Each extended state has separate CPUID bit (is this guarantied?)
  - In this case, as you say, by matching CPUID on src and dst we guaranty
    that migration data is good.
2. There is a state that is advertised in CPUID.0D, but does not have
   any regular "feature" CPUID associated with it.
 - In this case this patch will drop valid state that needs to be
   restored.

> There will be only one change for well-behaved guests with this patch
> (and the change will be invisible to them since they behave well).
> After the patch, KVM_SET_XSAVE will set the extended states to the
> processor-reset state instead of all-zeros.  However, all
> currently-defined states have a processor-reset state that is equal to
> all-zeroes, so this change is theoretical.
> 
> In fact, perhaps even XSTATE_SUPPORTED is not restrictive enough here,
> and we should hide all features that are not visible in CPUID.  It is
> okay, however, to test it in cpu_post_load.
The kernel should not even return state that is not visible in CPUID.

> 
> Paolo
> 
> >>      memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
> >>              sizeof env->ymmh_regs);
> >>      return 0;
> >> diff --git a/target-i386/machine.c b/target-i386/machine.c
> >> index dc81cde..9e2cfcf 100644
> >> --- a/target-i386/machine.c
> >> +++ b/target-i386/machine.c
> >> @@ -278,6 +278,10 @@ static int cpu_post_load(void *opaque, int version_id)
> >>      CPUX86State *env = &cpu->env;
> >>      int i;
> >>  
> >> +    if (env->xstate_bv & ~XSTATE_SUPPORTED) {
> >> +        return -EINVAL;
> >> +    }
> >> + 
> >>      /*
> >>       * Real mode guest segments register DPL should be zero.
> >>       * Older KVM version were setting it wrongly.
> >> -- 
> >> 1.8.3.1
> > 
> > --
> > 			Gleb.
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 

--
			Gleb.

  reply	other threads:[~2013-09-09  9:18 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-05 13:06 [PATCH uq/master 0/2] KVM: issues with XSAVE support Paolo Bonzini
2013-09-05 13:06 ` [Qemu-devel] " Paolo Bonzini
2013-09-05 13:06 ` [PATCH uq/master 1/2] x86: fix migration from pre-version 12 Paolo Bonzini
2013-09-05 13:06   ` [Qemu-devel] " Paolo Bonzini
2013-09-08 11:40   ` Gleb Natapov
2013-09-08 11:40     ` [Qemu-devel] " Gleb Natapov
2013-09-09  8:31     ` Paolo Bonzini
2013-09-09  8:31       ` [Qemu-devel] " Paolo Bonzini
2013-09-09  9:03       ` Gleb Natapov
2013-09-09  9:03         ` [Qemu-devel] " Gleb Natapov
2013-09-09  9:53         ` Paolo Bonzini
2013-09-09  9:53           ` [Qemu-devel] " Paolo Bonzini
2013-09-09 10:54           ` Gleb Natapov
2013-09-09 10:54             ` [Qemu-devel] " Gleb Natapov
2013-09-09 10:58             ` Gleb Natapov
2013-09-09 10:58               ` [Qemu-devel] " Gleb Natapov
2013-09-09 11:07             ` Paolo Bonzini
2013-09-09 11:07               ` [Qemu-devel] " Paolo Bonzini
2013-09-09 11:28               ` Gleb Natapov
2013-09-09 11:28                 ` [Qemu-devel] " Gleb Natapov
2013-09-09 11:46                 ` Paolo Bonzini
2013-09-09 11:46                   ` [Qemu-devel] " Paolo Bonzini
2013-09-09 12:00                   ` Gleb Natapov
2013-09-09 12:00                     ` [Qemu-devel] " Gleb Natapov
2013-09-05 13:06 ` [PATCH uq/master 2/2] KVM: make XSAVE support more robust Paolo Bonzini
2013-09-05 13:06   ` [Qemu-devel] " Paolo Bonzini
2013-09-08 11:52   ` Gleb Natapov
2013-09-08 11:52     ` [Qemu-devel] " Gleb Natapov
2013-09-09  8:51     ` Paolo Bonzini
2013-09-09  8:51       ` [Qemu-devel] " Paolo Bonzini
2013-09-09  9:18       ` Gleb Natapov [this message]
2013-09-09  9:18         ` Gleb Natapov
2013-09-09  9:50         ` Paolo Bonzini
2013-09-09  9:50           ` [Qemu-devel] " Paolo Bonzini
2013-09-09 10:41           ` Gleb Natapov
2013-09-09 10:41             ` [Qemu-devel] " Gleb Natapov
2013-09-09 12:00             ` Paolo Bonzini
2013-09-09 12:00               ` [Qemu-devel] " Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130909091804.GN17294@redhat.com \
    --to=gleb@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.