* [Qemu-devel] [PATCH uq/master 0/2] KVM: issues with XSAVE support
@ 2013-09-05 13:06 Paolo Bonzini
  2013-09-05 13:06 ` [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12 Paolo Bonzini
  2013-09-05 13:06 ` [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust Paolo Bonzini
  0 siblings, 2 replies; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-05 13:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, gleb, kvm

This series fixes two migration bugs concerning KVM's XSAVE ioctls,
both found by code inspection (the second in fact is just theoretical
until AVX512 or MPX support is added to KVM).

Please review.

Paolo Bonzini (2):
  x86: fix migration from pre-version 12
  KVM: make XSAVE support more robust

 target-i386/cpu.c     | 1 +
 target-i386/cpu.h     | 5 +++++
 target-i386/kvm.c     | 3 ++-
 target-i386/machine.c | 4 ++++
 4 files changed, 12 insertions(+), 1 deletion(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread
* [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12
  2013-09-05 13:06 [Qemu-devel] [PATCH uq/master 0/2] KVM: issues with XSAVE support Paolo Bonzini
@ 2013-09-05 13:06 ` Paolo Bonzini
  2013-09-08 11:40   ` Gleb Natapov
  2013-09-05 13:06 ` [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust Paolo Bonzini
  1 sibling, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-05 13:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, gleb, kvm

On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
and not restore anything.

Since FP and SSE data are always valid, set them in xstate_bv at reset
time.  In fact, that value is the same that KVM_GET_XSAVE returns on
pre-XSAVE hosts.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target-i386/cpu.c | 1 +
 target-i386/cpu.h | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index c36345e..ac83106 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2386,6 +2386,7 @@ static void x86_cpu_reset(CPUState *s)
     env->fpuc = 0x37f;
 
     env->mxcsr = 0x1f80;
+    env->xstate_bv = XSTATE_FP | XSTATE_SSE;
 
     env->pat = 0x0007040600070406ULL;
     env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 5723eff..a153078 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,6 +380,11 @@
 
 #define MSR_VM_HSAVE_PA                 0xc0010117
 
+#define XSTATE_SUPPORTED (XSTATE_FP|XSTATE_SSE|XSTATE_YMM)
+#define XSTATE_FP                       1
+#define XSTATE_SSE                      2
+#define XSTATE_YMM                      4
+
 /* CPUID feature words */
 typedef enum FeatureWord {
     FEAT_1_EDX,         /* CPUID[1].EDX */
-- 
1.8.3.1
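[Editorial sketch, not part of the original message.] The effect of the
one-line reset change can be illustrated with a small stand-alone model.
Everything prefixed with demo_ or Demo is hypothetical and is not QEMU
code. The version threshold of 12 is taken from the patch subject, on the
assumption that streams older than version 12 simply do not carry
xstate_bv, so the field keeps whatever value reset gave it.

```c
#include <assert.h>
#include <stdint.h>

/* Component bits as defined by the patch (values 1, 2 and 4). */
#define XSTATE_FP   (1ULL << 0)  /* x87 state */
#define XSTATE_SSE  (1ULL << 1)  /* XMM registers and MXCSR */
#define XSTATE_YMM  (1ULL << 2)  /* upper halves of the YMM registers */

/* Hypothetical stand-in for the relevant part of CPUX86State. */
typedef struct {
    uint64_t xstate_bv;
} DemoCPUState;

/* Mirrors the line the patch adds to x86_cpu_reset(). */
void demo_cpu_reset(DemoCPUState *env)
{
    env->xstate_bv = XSTATE_FP | XSTATE_SSE;
}

/*
 * Hypothetical loader: an old stream (assumed: version < 12) has no
 * xstate_bv field, so the reset-time value is left untouched.
 */
void demo_load(DemoCPUState *env, int version_id, uint64_t streamed_bv)
{
    if (version_id >= 12) {
        env->xstate_bv = streamed_bv;
    }
}

/* Nonzero if a later restore would treat the FP and SSE data as valid. */
int demo_fp_sse_valid(const DemoCPUState *env)
{
    uint64_t need = XSTATE_FP | XSTATE_SSE;
    return (env->xstate_bv & need) == need;
}
```

With the reset value in place, loading an old stream leaves FP and SSE
marked valid. Without it, xstate_bv would stay 0 and the FP/SSE data that
did arrive in the stream would be discarded on restore.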
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12
  2013-09-05 13:06 ` [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12 Paolo Bonzini
@ 2013-09-08 11:40   ` Gleb Natapov
  2013-09-09  8:31     ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2013-09-08 11:40 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost

On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
> and not restore anything.
> 
XRSTOR restores FP/SSE state to reset state if no bits are set in
xstate_bv. This is what should happen on reset, no?

> Since FP and SSE data are always valid, set them in xstate_bv at reset
> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
> pre-XSAVE hosts.
It is needed for migration from a non-XSAVE host to an XSAVE host.

> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target-i386/cpu.c | 1 +
>  target-i386/cpu.h | 5 +++++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index c36345e..ac83106 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2386,6 +2386,7 @@ static void x86_cpu_reset(CPUState *s)
>      env->fpuc = 0x37f;
>  
>      env->mxcsr = 0x1f80;
> +    env->xstate_bv = XSTATE_FP | XSTATE_SSE;
>  
>      env->pat = 0x0007040600070406ULL;
>      env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index 5723eff..a153078 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -380,6 +380,11 @@
>  
>  #define MSR_VM_HSAVE_PA                 0xc0010117
>  
> +#define XSTATE_SUPPORTED (XSTATE_FP|XSTATE_SSE|XSTATE_YMM)
Supported by whom? By QEMU? We should filter unsupported bits from
CPUID.0D then too.

> +#define XSTATE_FP                       1
> +#define XSTATE_SSE                      2
> +#define XSTATE_YMM                      4
> +
>  /* CPUID feature words */
>  typedef enum FeatureWord {
>      FEAT_1_EDX,         /* CPUID[1].EDX */
> -- 
> 1.8.3.1
> 

--
			Gleb.
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12
  2013-09-08 11:40   ` Gleb Natapov
@ 2013-09-09  8:31     ` Paolo Bonzini
  2013-09-09  9:03       ` Gleb Natapov
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-09  8:31 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel, kvm, ehabkost

On 08/09/2013 13:40, Gleb Natapov wrote:
> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
>> and not restore anything.
>>
> XRSTOR restores FP/SSE state to reset state if no bits are set in
> xstate_bv. This is what should happen on reset, no?

Yes.  The problem happens on the migration destination when XSAVE data is
not transmitted.  FP/SSE data is transmitted and must be restored, but
xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
state.  The vcpu then loses the values that were set in the migration data.

>> Since FP and SSE data are always valid, set them in xstate_bv at reset
>> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
>> pre-XSAVE hosts.
> It is needed for migration between non xsave host to xsave host.

Yes, and this patch does the same for migration from non-XSAVE QEMU
to XSAVE QEMU.

In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
xstate_bv when XSAVE is not available.  Instead, it should reset the
FXSAVE data to processor-reset values (except for MXCSR which always
comes from XRSTOR data), i.e. to all-zeros except for the x87 control
and tag words.  It should also check reserved bits of MXCSR.

>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  target-i386/cpu.c | 1 +
>>  target-i386/cpu.h | 5 +++++
>>  2 files changed, 6 insertions(+)
>>
>> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
>> index c36345e..ac83106 100644
>> --- a/target-i386/cpu.c
>> +++ b/target-i386/cpu.c
>> @@ -2386,6 +2386,7 @@ static void x86_cpu_reset(CPUState *s)
>>      env->fpuc = 0x37f;
>>  
>>      env->mxcsr = 0x1f80;
>> +    env->xstate_bv = XSTATE_FP | XSTATE_SSE;
>>  
>>      env->pat = 0x0007040600070406ULL;
>>      env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
>> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
>> index 5723eff..a153078 100644
>> --- a/target-i386/cpu.h
>> +++ b/target-i386/cpu.h
>> @@ -380,6 +380,11 @@
>>  
>>  #define MSR_VM_HSAVE_PA                 0xc0010117
>>  
>> +#define XSTATE_SUPPORTED (XSTATE_FP|XSTATE_SSE|XSTATE_YMM)
> Supported by whom? By QEMU? We should filter unsupported bits from
> CPUID.0D then too.

Yes.  QEMU unmarshals information from the XSAVE region and back, so it
cannot support MPX or AVX-512 yet (even if KVM did).  Separate bug, though.

Paolo

> 
>> +#define XSTATE_FP                       1
>> +#define XSTATE_SSE                      2
>> +#define XSTATE_YMM                      4
>> +
>>  /* CPUID feature words */
>>  typedef enum FeatureWord {
>>      FEAT_1_EDX,         /* CPUID[1].EDX */
>> -- 
>> 1.8.3.1
>>
> 
> --
> 			Gleb.
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12 2013-09-09 8:31 ` Paolo Bonzini @ 2013-09-09 9:03 ` Gleb Natapov 2013-09-09 9:53 ` Paolo Bonzini 0 siblings, 1 reply; 19+ messages in thread From: Gleb Natapov @ 2013-09-09 9:03 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote: > Il 08/09/2013 13:40, Gleb Natapov ha scritto: > > On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote: > >> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv, > >> and not restore anything. > >> > > XRSTOR restores FP/SSE state to reset state if no bits are set in > > xstate_bv. This is what should happen on reset, no? > > Yes. The problem happens on the migration destination when XSAVE data is > not transmitted. FP/SSE data is transmitted and must be restored, but > xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset > state. The vcpu then loses the values that were set in the migration data. > > >> Since FP and SSE data are always valid, set them in xstate_bv at reset > >> time. In fact, that value is the same that KVM_GET_XSAVE returns on > >> pre-XSAVE hosts. > > It is needed for migration between non xsave host to xsave host. > > Yes, and this patch does the same for migration between non-XSAVE QEMU > and XSAVE QEMU. > Can such migration happen? The commit that added xsave support (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id. > In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores > xstate_bv when XSAVE is not available. Instead, it should reset the > FXSAVE data to processor-reset values (except for MXCSR which always > comes from XRSTOR data), i.e. to all-zeros except for the x87 control > and tag words. It should also check reserved bits of MXCSR. > I do not see why. 
> >> > >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > >> --- > >> target-i386/cpu.c | 1 + > >> target-i386/cpu.h | 5 +++++ > >> 2 files changed, 6 insertions(+) > >> > >> diff --git a/target-i386/cpu.c b/target-i386/cpu.c > >> index c36345e..ac83106 100644 > >> --- a/target-i386/cpu.c > >> +++ b/target-i386/cpu.c > >> @@ -2386,6 +2386,7 @@ static void x86_cpu_reset(CPUState *s) > >> env->fpuc = 0x37f; > >> > >> env->mxcsr = 0x1f80; > >> + env->xstate_bv = XSTATE_FP | XSTATE_SSE; > >> > >> env->pat = 0x0007040600070406ULL; > >> env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT; > >> diff --git a/target-i386/cpu.h b/target-i386/cpu.h > >> index 5723eff..a153078 100644 > >> --- a/target-i386/cpu.h > >> +++ b/target-i386/cpu.h > >> @@ -380,6 +380,11 @@ > >> > >> #define MSR_VM_HSAVE_PA 0xc0010117 > >> > >> +#define XSTATE_SUPPORTED (XSTATE_FP|XSTATE_SSE|XSTATE_YMM) > > Supported by whom? By QEMU? We should filer unsupported bits from CPUID.0D then too. > > Yes. QEMU unmarshals information from the XSAVE region and back, so it > cannot support MPX or AVX-512 yet (even if KVM were). Separate bug, though. > IMO this is the main issue here, not separate bug. If we gonna let guest use CPU state QEMU does not support we gonna have a bad time. -- Gleb. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12
  2013-09-09  9:03       ` Gleb Natapov
@ 2013-09-09  9:53         ` Paolo Bonzini
  2013-09-09 10:54           ` Gleb Natapov
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-09  9:53 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel, kvm, ehabkost

On 09/09/2013 11:03, Gleb Natapov wrote:
> On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote:
>> On 08/09/2013 13:40, Gleb Natapov wrote:
>>> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
>>>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
>>>> and not restore anything.
>>>>
>>> XRSTOR restores FP/SSE state to reset state if no bits are set in
>>> xstate_bv. This is what should happen on reset, no?
>>
>> Yes.  The problem happens on the migration destination when XSAVE data is
>> not transmitted.  FP/SSE data is transmitted and must be restored, but
>> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
>> state.  The vcpu then loses the values that were set in the migration data.
>>
>>>> Since FP and SSE data are always valid, set them in xstate_bv at reset
>>>> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
>>>> pre-XSAVE hosts.
>>> It is needed for migration between non xsave host to xsave host.
>>
>> Yes, and this patch does the same for migration between non-XSAVE QEMU
>> and XSAVE QEMU.
>>
> Can such migration happen? The commit that added xsave support
> (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id.

Yes, old->new migration can happen.  New->old of course cannot.

>> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
>> xstate_bv when XSAVE is not available.  Instead, it should reset the
>> FXSAVE data to processor-reset values (except for MXCSR which always
>> comes from XRSTOR data), i.e. to all-zeros except for the x87 control
>> and tag words.  It should also check reserved bits of MXCSR.
> > I do not see why. Because otherwise it behaves in a subtly different manner for XSAVE and non-XSAVE hosts. >> Yes. QEMU unmarshals information from the XSAVE region and back, so it >> cannot support MPX or AVX-512 yet (even if KVM were). Separate bug, though. >> > IMO this is the main issue here, not separate bug. If we gonna let guest > use CPU state QEMU does not support we gonna have a bad time. We cannot force the guest not to use a feature; all we can do is hide the CPUID bits so that a well-behaved guest will not use it. QEMU does hide CPUID bits for non-supported XSAVE states, except for "-cpu host". So this will not be a problem except with "-cpu host". Paolo ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12 2013-09-09 9:53 ` Paolo Bonzini @ 2013-09-09 10:54 ` Gleb Natapov 2013-09-09 10:58 ` Gleb Natapov 2013-09-09 11:07 ` Paolo Bonzini 0 siblings, 2 replies; 19+ messages in thread From: Gleb Natapov @ 2013-09-09 10:54 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost On Mon, Sep 09, 2013 at 11:53:45AM +0200, Paolo Bonzini wrote: > Il 09/09/2013 11:03, Gleb Natapov ha scritto: > > On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote: > >> Il 08/09/2013 13:40, Gleb Natapov ha scritto: > >>> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote: > >>>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv, > >>>> and not restore anything. > >>>> > >>> XRSTOR restores FP/SSE state to reset state if no bits are set in > >>> xstate_bv. This is what should happen on reset, no? > >> > >> Yes. The problem happens on the migration destination when XSAVE data is > >> not transmitted. FP/SSE data is transmitted and must be restored, but > >> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset > >> state. The vcpu then loses the values that were set in the migration data. > >> > >>>> Since FP and SSE data are always valid, set them in xstate_bv at reset > >>>> time. In fact, that value is the same that KVM_GET_XSAVE returns on > >>>> pre-XSAVE hosts. > >>> It is needed for migration between non xsave host to xsave host. > >> > >> Yes, and this patch does the same for migration between non-XSAVE QEMU > >> and XSAVE QEMU. > >> > > Can such migration happen? The commit that added xsave support > > (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id. > > Yes, old->new migration can happen. New->old of course cannot. > I see. I am fine with the patch, but please drop defines that are not used in the patch itself. > >> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores > >> xstate_bv when XSAVE is not available. 
Instead, it should reset the > >> FXSAVE data to processor-reset values (except for MXCSR which always > >> comes from XRSTOR data), i.e. to all-zeros except for the x87 control > >> and tag words. It should also check reserved bits of MXCSR. > > > > I do not see why. > > Because otherwise it behaves in a subtly different manner for XSAVE and > non-XSAVE hosts. I do not see how. Can you elaborate? > > >> Yes. QEMU unmarshals information from the XSAVE region and back, so it > >> cannot support MPX or AVX-512 yet (even if KVM were). Separate bug, though. > >> > > IMO this is the main issue here, not separate bug. If we gonna let guest > > use CPU state QEMU does not support we gonna have a bad time. > > We cannot force the guest not to use a feature; all we can do is hide Of course we can't, this is correct for other features too, but this is guest's problem. > the CPUID bits so that a well-behaved guest will not use it. QEMU does > hide CPUID bits for non-supported XSAVE states, except for "-cpu host". > So this will not be a problem except with "-cpu host". > -- Gleb. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12 2013-09-09 10:54 ` Gleb Natapov @ 2013-09-09 10:58 ` Gleb Natapov 2013-09-09 11:07 ` Paolo Bonzini 1 sibling, 0 replies; 19+ messages in thread From: Gleb Natapov @ 2013-09-09 10:58 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost On Mon, Sep 09, 2013 at 01:54:50PM +0300, Gleb Natapov wrote: > On Mon, Sep 09, 2013 at 11:53:45AM +0200, Paolo Bonzini wrote: > > Il 09/09/2013 11:03, Gleb Natapov ha scritto: > > > On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote: > > >> Il 08/09/2013 13:40, Gleb Natapov ha scritto: > > >>> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote: > > >>>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv, > > >>>> and not restore anything. > > >>>> > > >>> XRSTOR restores FP/SSE state to reset state if no bits are set in > > >>> xstate_bv. This is what should happen on reset, no? > > >> > > >> Yes. The problem happens on the migration destination when XSAVE data is > > >> not transmitted. FP/SSE data is transmitted and must be restored, but > > >> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset > > >> state. The vcpu then loses the values that were set in the migration data. > > >> > > >>>> Since FP and SSE data are always valid, set them in xstate_bv at reset > > >>>> time. In fact, that value is the same that KVM_GET_XSAVE returns on > > >>>> pre-XSAVE hosts. > > >>> It is needed for migration between non xsave host to xsave host. > > >> > > >> Yes, and this patch does the same for migration between non-XSAVE QEMU > > >> and XSAVE QEMU. > > >> > > > Can such migration happen? The commit that added xsave support > > > (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id. > > > > Yes, old->new migration can happen. New->old of course cannot. > > > I see. I am fine with the patch, but please drop defines that are not > used in the patch itself. 
> 
BTW, a migration question: will xstate_bv not be zeroed by the migration
code in the old->new case?

--
			Gleb.
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12
  2013-09-09 10:54           ` Gleb Natapov
  2013-09-09 10:58             ` Gleb Natapov
@ 2013-09-09 11:07             ` Paolo Bonzini
  2013-09-09 11:28               ` Gleb Natapov
  1 sibling, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-09 11:07 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel, kvm, ehabkost

On 09/09/2013 12:54, Gleb Natapov wrote:
> On Mon, Sep 09, 2013 at 11:53:45AM +0200, Paolo Bonzini wrote:
>> On 09/09/2013 11:03, Gleb Natapov wrote:
>>> On Mon, Sep 09, 2013 at 10:31:15AM +0200, Paolo Bonzini wrote:
>>>> On 08/09/2013 13:40, Gleb Natapov wrote:
>>>>> On Thu, Sep 05, 2013 at 03:06:21PM +0200, Paolo Bonzini wrote:
>>>>>> On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
>>>>>> and not restore anything.
>>>>>>
>>>>> XRSTOR restores FP/SSE state to reset state if no bits are set in
>>>>> xstate_bv. This is what should happen on reset, no?
>>>>
>>>> Yes.  The problem happens on the migration destination when XSAVE data is
>>>> not transmitted.  FP/SSE data is transmitted and must be restored, but
>>>> xstate_bv is zero and KVM_SET_XSAVE restores FP/SSE state to reset
>>>> state.  The vcpu then loses the values that were set in the migration data.
>>>>
>>>>>> Since FP and SSE data are always valid, set them in xstate_bv at reset
>>>>>> time.  In fact, that value is the same that KVM_GET_XSAVE returns on
>>>>>> pre-XSAVE hosts.
>>>>> It is needed for migration between non xsave host to xsave host.
>>>>
>>>> Yes, and this patch does the same for migration between non-XSAVE QEMU
>>>> and XSAVE QEMU.
>>>>
>>> Can such migration happen? The commit that added xsave support
>>> (f1665b21f16c5dc0ac37de60233a4975aff31193) changed vmstate version id.
>>
>> Yes, old->new migration can happen.  New->old of course cannot.
>>
> I see. I am fine with the patch, but please drop defines that are not
> used in the patch itself.

Ok.
(For the "BTW" question, xstate_bv will not be zeroed; it will keep
the default value.)

>>>> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
>>>> xstate_bv when XSAVE is not available.  Instead, it should reset the
>>>> FXSAVE data to processor-reset values (except for MXCSR which always
>>>> comes from XRSTOR data), i.e. to all-zeros except for the x87 control
>>>> and tag words.  It should also check reserved bits of MXCSR.
>>>
>>> I do not see why.
>>
>> Because otherwise it behaves in a subtly different manner for XSAVE and
>> non-XSAVE hosts.
>
> I do not see how. Can you elaborate?

Suppose userspace calls KVM_SET_XSAVE with XSTATE_BV=0.

On an XSAVE host, when the guest FPU state is loaded KVM will do an
XRSTOR.  The XRSTOR will restore the FPU state to default values.

On a non-XSAVE host, when the guest FPU state is loaded KVM will do an
FXRSTOR.  The FXRSTOR will load the FPU state from the first 512 bytes of
the block that was passed to KVM_SET_XSAVE.

This is not a problem because userspace will usually pass to
KVM_SET_XSAVE only something that it got from KVM_GET_XSAVE, and
KVM_GET_XSAVE will never set XSTATE_BV=0.  However, KVM_SET_XSAVE is
supposed to emulate XSAVE/XRSTOR if it is not available, and it is
failing to emulate this detail.

>>>> Yes. QEMU unmarshals information from the XSAVE region and back, so it
>>>> cannot support MPX or AVX-512 yet (even if KVM were). Separate bug, though.
>>>>
>>> IMO this is the main issue here, not separate bug. If we gonna let guest
>>> use CPU state QEMU does not support we gonna have a bad time.
>>
>> We cannot force the guest not to use a feature; all we can do is hide
>
> Of course we can't, this is correct for other features too, but this is
> guest's problem.

Ok, then we agree that QEMU doesn't have a problem?  The XSAVE data will
always be "fresh" as long as the guest obeys CPUID bits it receives, and
the CPUID bits that QEMU passes will never enable XSAVE data that QEMU
does not support.

Paolo
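[Editorial sketch, not part of the original message.] The mismatch
described in this message can be modelled in a few lines of plain C. This
is a simplified model under stated assumptions, not KVM code: all demo_
and Demo names are hypothetical, only the x87 control word is tracked, and
the only layout facts relied on are that the control word sits at byte 0
of the legacy FXSAVE area and that XSTATE_BV occupies the first 8 bytes of
the XSAVE header at byte offset 512.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XSTATE_FP         (1ULL << 0)
#define XSTATE_SSE        (1ULL << 1)

#define FCW_OFFSET        0       /* x87 control word in the legacy area */
#define XSTATE_BV_OFFSET  512     /* start of the XSAVE header */
#define FCW_RESET         0x037f  /* x87 control word after initialization */

/* Legacy 512-byte area followed by the 64-byte XSAVE header. */
typedef struct {
    uint8_t region[576];
} DemoXsave;

/* Build a buffer holding the given control word and XSTATE_BV. */
void demo_fill(DemoXsave *buf, uint16_t fcw, uint64_t xstate_bv)
{
    memset(buf, 0, sizeof *buf);
    memcpy(&buf->region[FCW_OFFSET], &fcw, sizeof fcw);
    memcpy(&buf->region[XSTATE_BV_OFFSET], &xstate_bv, sizeof xstate_bv);
}

/*
 * Control word the guest ends up with after the buffer is loaded.
 * XSAVE host: XRSTOR initializes any component whose XSTATE_BV bit is 0.
 * Non-XSAVE host: FXRSTOR loads the legacy area and never looks at
 * XSTATE_BV.
 */
uint16_t demo_restore_fcw(const DemoXsave *buf, int host_has_xsave)
{
    uint64_t bv;
    uint16_t fcw;

    memcpy(&bv, &buf->region[XSTATE_BV_OFFSET], sizeof bv);
    memcpy(&fcw, &buf->region[FCW_OFFSET], sizeof fcw);

    if (host_has_xsave && !(bv & XSTATE_FP)) {
        return FCW_RESET;
    }
    return fcw;
}
```

Feeding the same buffer with XSTATE_BV=0 to both paths gives the reset
control word on the XSAVE host and the buffer's control word on the
non-XSAVE host, which is the difference in semantics under discussion.
With the FP bit set, both paths agree.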
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12
  2013-09-09 11:07             ` Paolo Bonzini
@ 2013-09-09 11:28               ` Gleb Natapov
  2013-09-09 11:46                 ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2013-09-09 11:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost

On Mon, Sep 09, 2013 at 01:07:37PM +0200, Paolo Bonzini wrote:
> >>>> In fact, another bug is that kvm_vcpu_ioctl_x86_set_xsave ignores
> >>>> xstate_bv when XSAVE is not available.  Instead, it should reset the
> >>>> FXSAVE data to processor-reset values (except for MXCSR which always
> >>>> comes from XRSTOR data), i.e. to all-zeros except for the x87 control
> >>>> and tag words.  It should also check reserved bits of MXCSR.
> >>>
> >>> I do not see why.
> >>
> >> Because otherwise it behaves in a subtly different manner for XSAVE and
> >> non-XSAVE hosts.
> >
> > I do not see how. Can you elaborate?
> 
> Suppose userspace calls KVM_SET_XSAVE with XSTATE_BV=0.
> 
> On an XSAVE host, when the guest FPU state is loaded KVM will do an
> XRSTOR.  The XRSTOR will restore the FPU state to default values.
> 
> On a non-XSAVE host, when the guest FPU state is loaded KVM will do an
> FXRSTOR.  The FXRSTOR will load the FPU state from the first 512 bytes of
> the block that was passed to KVM_SET_XSAVE.
> 
> This is not a problem because userspace will usually pass to
> KVM_SET_XSAVE only something that it got from KVM_GET_XSAVE, and
> KVM_GET_XSAVE will never set XSTATE_BV=0.  However, KVM_SET_XSAVE is
> supposed to emulate XSAVE/XRSTOR if it is not available, and it is
> failing to emulate this detail.
> 
You are trying to be bug-for-bug compatible :) XSTATE_BV can be zero only
if the FPU state is the reset one, otherwise the guest will not survive.
KVM_SET_XSAVE is not supposed to emulate XSAVE/XRSTOR; it is not an
emulator function. It would be better to outlaw a zero value for
XSTATE_BV altogether, but we cannot do it because current QEMU uses it.

> >>>> Yes. QEMU unmarshals information from the XSAVE region and back, so it
> >>>> cannot support MPX or AVX-512 yet (even if KVM were). Separate bug, though.
> >>>>
> >>> IMO this is the main issue here, not separate bug. If we gonna let guest
> >>> use CPU state QEMU does not support we gonna have a bad time.
> >>
> >> We cannot force the guest not to use a feature; all we can do is hide
> >
> > Of course we can't, this is correct for other features too, but this is
> > guest's problem.
> 
> Ok, then we agree that QEMU doesn't have a problem?  The XSAVE data will
Which problem exactly? The problems I see are: 1. We do not support
MPX and AVX-512 (but this is probably not the problem you meant :)). 2. 0D
data is not consistent with features. The guest may not expect it and do
stupid things.

> always be "fresh" as long as the guest obeys CPUID bits it receives, and
> the CPUID bits that QEMU passes will never enable XSAVE data that QEMU
> does not support.
> 

--
			Gleb.
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12
  2013-09-09 11:28               ` Gleb Natapov
@ 2013-09-09 11:46                 ` Paolo Bonzini
  2013-09-09 12:00                   ` Gleb Natapov
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-09 11:46 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel, kvm, ehabkost

On 09/09/2013 13:28, Gleb Natapov wrote:
>> On an XSAVE host, when the guest FPU state is loaded KVM will do an
>> XRSTOR.  The XRSTOR will restore the FPU state to default values.
>>
>> On a non-XSAVE host, when the guest FPU state is loaded KVM will do an
>> FXRSTOR.  The FXRSTOR will load the FPU state from the first 512 bytes of
>> the block that was passed to KVM_SET_XSAVE.
>>
>> This is not a problem because userspace will usually pass to
>> KVM_SET_XSAVE only something that it got from KVM_GET_XSAVE, and
>> KVM_GET_XSAVE will never set XSTATE_BV=0.  However, KVM_SET_XSAVE is
>> supposed to emulate XSAVE/XRSTOR if it is not available, and it is
>> failing to emulate this detail.
>>
> You are trying to be bug to bug compatible :) XSTATE_BV can be zero only
> if FPU state is reset one, otherwise the guest will not survive.

Yes.

> KVM_SET_XSAVE
> is not supposed to emulate XSAVE/XRSTOR, it is not emulator function. It
> is better to outlaw zero value for XSTATE_BV at all, but we cannot do it
> because current QEMU uses it.

I agree it'd be better to forbid it.  If the mismatch in semantics does
not bother you, I won't fix it.  It slightly bothers me. :)

>>>>>> Yes. QEMU unmarshals information from the XSAVE region and back, so it
>>>>>> cannot support MPX or AVX-512 yet (even if KVM were). Separate bug, though.
>>>>>>
>>>>> IMO this is the main issue here, not separate bug. If we gonna let guest
>>>>> use CPU state QEMU does not support we gonna have a bad time.
>>>>
>>>> We cannot force the guest not to use a feature; all we can do is hide
>>>
>>> Of course we can't, this is correct for other features too, but this is
>>> guest's problem.
>>
>> Ok, then we agree that QEMU doesn't have a problem?  The XSAVE data will
>
> Which problem exactly. The problems I see is that 1. We do not support
> MPX and AVX-512 (but this is probably not the problem you meant :)) 2. 0D
> data is not consistent with features. Guest may not expect it and do stupid
> things.

It is not a problem to unmarshal information out of KVM_GET_XSAVE data
(and back).  If the guest does stupid things, it's a bug in an
ill-behaving guest.

On the other hand, I agree that passthrough of host 0xD data is bad and
I will fix it.

Paolo

>> always be "fresh" as long as the guest obeys CPUID bits it receives, and
>> the CPUID bits that QEMU passes will never enable XSAVE data that QEMU
>> does not support.
>>
> 
> --
> 			Gleb.
> 
* Re: [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12
  2013-09-09 11:46                 ` Paolo Bonzini
@ 2013-09-09 12:00                   ` Gleb Natapov
  0 siblings, 0 replies; 19+ messages in thread
From: Gleb Natapov @ 2013-09-09 12:00 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost

On Mon, Sep 09, 2013 at 01:46:49PM +0200, Paolo Bonzini wrote:
> >>>>>> Yes. QEMU unmarshals information from the XSAVE region and back, so it
> >>>>>> cannot support MPX or AVX-512 yet (even if KVM were). Separate bug, though.
> >>>>>>
> >>>>> IMO this is the main issue here, not separate bug. If we gonna let guest
> >>>>> use CPU state QEMU does not support we gonna have a bad time.
> >>>>
> >>>> We cannot force the guest not to use a feature; all we can do is hide
> >>>
> >>> Of course we can't, this is correct for other features too, but this is
> >>> guest's problem.
> >>
> >> Ok, then we agree that QEMU doesn't have a problem?  The XSAVE data will
> >
> > Which problem exactly. The problems I see is that 1. We do not support
> > MPX and AVX-512 (but this is probably not the problem you meant :)) 2. 0D
> > data is not consistent with features. Guest may not expect it and do stupid
> > things.
> 
> It is not a problem to unmarshal information out of KVM_GET_XSAVE data
> (and back).  If the guest does stupid things, it's a bug in an
> ill-behaving guest.
> 
You know I am first in line to blame the guest for everything :) (who
needs guests anyway), but in this case I didn't mean that the guest does
something illegal. If we advertise support for some XSAVE state in the 0D
leaf, the guest is within its rights to draw conclusions from that which
we may not expect. It may check the corresponding feature bit and crash
if it is not present, for instance.

> On the other hand, I agree that passthrough of host 0xD data is bad and
> will fix it.
> 
Thanks!

--
			Gleb.
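[Editorial sketch, not part of the original message.] The agreed
follow-up, not passing host CPUID leaf 0xD data straight through, could
look roughly like the filter below. This is only an illustration under
assumptions: the demo_ and Demo names are hypothetical, the thread does
not show the eventual fix, and only sub-leaf 0 is considered, where EAX
and EDX report the low and high 32 bits of the supported state-component
bitmap. The size fields in EBX and ECX would also need recomputing, which
is left out here.

```c
#include <assert.h>
#include <stdint.h>

#define XSTATE_FP         (1ULL << 0)
#define XSTATE_SSE        (1ULL << 1)
#define XSTATE_YMM        (1ULL << 2)
#define XSTATE_SUPPORTED  (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)

/* Hypothetical container for the four CPUID output registers. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
} DemoCpuidRegs;

/*
 * Clear every state-component bit that userspace cannot marshal, so the
 * guest is never told about XSAVE state that would be lost on migration.
 */
void demo_filter_cpuid_0d(DemoCpuidRegs *regs)
{
    regs->eax &= (uint32_t)XSTATE_SUPPORTED;          /* components 0..31 */
    regs->edx &= (uint32_t)(XSTATE_SUPPORTED >> 32);  /* components 32..63 */
}
```

A host reporting 0xe7 in EAX (FP, SSE, YMM plus the three AVX-512
components) would be presented to the guest as 0x07.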
* [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust
  2013-09-05 13:06 [Qemu-devel] [PATCH uq/master 0/2] KVM: issues with XSAVE support Paolo Bonzini
  2013-09-05 13:06 ` [Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12 Paolo Bonzini
@ 2013-09-05 13:06 ` Paolo Bonzini
  2013-09-08 11:52 ` Gleb Natapov
  1 sibling, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-05 13:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: ehabkost, gleb, kvm

QEMU moves state from CPUArchState to struct kvm_xsave and back when it
invokes the KVM_*_XSAVE ioctls. Because it doesn't treat the XSAVE
region as an opaque blob, it might be impossible to set some state on
the destination if migrating to an older version.

This patch blocks migration if it finds that unsupported bits are set
in the XSTATE_BV header field. To make this work robustly, QEMU should
only report in env->xstate_bv those fields that will actually end up
in the migration stream.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target-i386/kvm.c     | 3 ++-
 target-i386/machine.c | 4 ++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 749aa09..df08a4b 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1291,7 +1291,8 @@ static int kvm_get_xsave(X86CPU *cpu)
             sizeof env->fpregs);
     memcpy(env->xmm_regs, &xsave->region[XSAVE_XMM_SPACE],
             sizeof env->xmm_regs);
-    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
+    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] &
+        XSTATE_SUPPORTED;
     memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
             sizeof env->ymmh_regs);
     return 0;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index dc81cde..9e2cfcf 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -278,6 +278,10 @@ static int cpu_post_load(void *opaque, int version_id)
     CPUX86State *env = &cpu->env;
     int i;
 
+    if (env->xstate_bv & ~XSTATE_SUPPORTED) {
+        return -EINVAL;
+    }
+
     /*
      * Real mode guest segments register DPL should be zero.
      * Older KVM version were setting it wrongly.
-- 
1.8.3.1

^ permalink raw reply related [flat|nested] 19+ messages in thread
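[Editorial illustration, not part of the posted patch.] The two hunks above boil down to a mask on the source side and a validity check on the destination side. A minimal standalone sketch of that logic follows, assuming the standard XSAVE layout in which the header starts at byte offset 512 and XSTATE_BV is its first 8 bytes. The helper names xsave_read_xstate_bv and check_incoming_xstate_bv are hypothetical and do not exist in QEMU; the XSTATE_* values are the ones defined in patch 1/2.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Component bits as defined in patch 1/2 of this series. */
#define XSTATE_FP   1
#define XSTATE_SSE  2
#define XSTATE_YMM  4
#define XSTATE_SUPPORTED (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)

/* The XSAVE header follows the 512-byte legacy FXSAVE region;
 * XSTATE_BV is the first 64-bit field of that header. */
#define XSAVE_HDR_OFFSET 512

/* Source side (hypothetical helper): read XSTATE_BV from a raw XSAVE
 * area and keep only the components that will be put in the stream. */
uint64_t xsave_read_xstate_bv(const uint8_t *xsave_area)
{
    uint64_t bv;

    memcpy(&bv, xsave_area + XSAVE_HDR_OFFSET, sizeof bv);
    return bv & XSTATE_SUPPORTED;
}

/* Destination side (hypothetical helper): refuse incoming state that
 * names a component this build does not know how to restore. */
int check_incoming_xstate_bv(uint64_t xstate_bv)
{
    if (xstate_bv & ~(uint64_t)XSTATE_SUPPORTED) {
        return -EINVAL;
    }
    return 0;
}
```

Note that the source-side mask silently discards unknown components, which is exactly the behaviour questioned in the review that follows.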
* Re: [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust
  2013-09-05 13:06 ` [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust Paolo Bonzini
@ 2013-09-08 11:52 ` Gleb Natapov
  2013-09-09 8:51 ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2013-09-08 11:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost

On Thu, Sep 05, 2013 at 03:06:22PM +0200, Paolo Bonzini wrote:
> QEMU moves state from CPUArchState to struct kvm_xsave and back when it
> invokes the KVM_*_XSAVE ioctls. Because it doesn't treat the XSAVE
> region as an opaque blob, it might be impossible to set some state on
> the destination if migrating to an older version.
>
> This patch blocks migration if it finds that unsupported bits are set
> in the XSTATE_BV header field. To make this work robustly, QEMU should
> only report in env->xstate_bv those fields that will actually end up
> in the migration stream.
We usually handle host CPU differences in the CPUID layer, not by trying
to validate migration data, i.e. CPUID.0D should be configurable and
management should be able to query QEMU about what is supported and
prevent the migration attempt accordingly.

>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target-i386/kvm.c     | 3 ++-
>  target-i386/machine.c | 4 ++++
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 749aa09..df08a4b 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -1291,7 +1291,8 @@ static int kvm_get_xsave(X86CPU *cpu)
>              sizeof env->fpregs);
>      memcpy(env->xmm_regs, &xsave->region[XSAVE_XMM_SPACE],
>              sizeof env->xmm_regs);
> -    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
> +    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] &
> +        XSTATE_SUPPORTED;
Don't we just drop state here that will not be restored on the
destination, and the destination will not be able to tell since we
masked the unsupported bits?

>      memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
>              sizeof env->ymmh_regs);
>      return 0;
> diff --git a/target-i386/machine.c b/target-i386/machine.c
> index dc81cde..9e2cfcf 100644
> --- a/target-i386/machine.c
> +++ b/target-i386/machine.c
> @@ -278,6 +278,10 @@ static int cpu_post_load(void *opaque, int version_id)
>      CPUX86State *env = &cpu->env;
>      int i;
>
> +    if (env->xstate_bv & ~XSTATE_SUPPORTED) {
> +        return -EINVAL;
> +    }
> +
>      /*
>       * Real mode guest segments register DPL should be zero.
>       * Older KVM version were setting it wrongly.
> --
> 1.8.3.1

--
Gleb.

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust
  2013-09-08 11:52 ` Gleb Natapov
@ 2013-09-09 8:51 ` Paolo Bonzini
  2013-09-09 9:18 ` Gleb Natapov
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-09 8:51 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel, kvm, ehabkost

On 08/09/2013 13:52, Gleb Natapov wrote:
> On Thu, Sep 05, 2013 at 03:06:22PM +0200, Paolo Bonzini wrote:
>> QEMU moves state from CPUArchState to struct kvm_xsave and back when it
>> invokes the KVM_*_XSAVE ioctls. Because it doesn't treat the XSAVE
>> region as an opaque blob, it might be impossible to set some state on
>> the destination if migrating to an older version.
>>
>> This patch blocks migration if it finds that unsupported bits are set
>> in the XSTATE_BV header field. To make this work robustly, QEMU should
>> only report in env->xstate_bv those fields that will actually end up
>> in the migration stream.
>
> We usually handle host CPU differences in the CPUID layer, not by trying
> to validate migration data,

Actually we do both. QEMU for example detects invalid subsections and
blocks migration, and CPU differences also result in subsections that
the destination does not know.

But as far as QEMU is concerned, setting an unknown bit in XSTATE_BV is
not a CPU difference, it is simply invalid migration data.

> i.e. CPUID.0D should be configurable and
> management should be able to query QEMU about what is supported and
> prevent the migration attempt accordingly.

Management is already able to query QEMU about what is supported, because
new XSAVE state is always attached to new CPUID bits in leaves other
than 0Dh (e.g. EAX=07h, ECX=0h returns AVX512 and MPX support in EBX).
QEMU should compute 0Dh data based on those bits indeed.

However, KVM_GET/SET_XSAVE should still return all values supported by
the hypervisor, independent of the supported CPUID bits.

>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  target-i386/kvm.c     | 3 ++-
>>  target-i386/machine.c | 4 ++++
>>  2 files changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>> index 749aa09..df08a4b 100644
>> --- a/target-i386/kvm.c
>> +++ b/target-i386/kvm.c
>> @@ -1291,7 +1291,8 @@ static int kvm_get_xsave(X86CPU *cpu)
>>              sizeof env->fpregs);
>>      memcpy(env->xmm_regs, &xsave->region[XSAVE_XMM_SPACE],
>>              sizeof env->xmm_regs);
>> -    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
>> +    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] &
>> +        XSTATE_SUPPORTED;
> Don't we just drop state here that will not be restored on the
> destination, and the destination will not be able to tell since we
> masked the unsupported bits?

A well-behaved guest should not have modified that state anyway, since:

* the source and destination machines should have the same CPU

* since the destination QEMU does not support the feature, the source
  should have masked it as well

* the guest should always probe CPUID before using a feature

There will be only one change for well-behaved guests with this patch
(and the change will be invisible to them since they behave well).
After the patch, KVM_SET_XSAVE will set the extended states to the
processor-reset state instead of all-zeros. However, all
currently-defined states have a processor-reset state that is equal to
all-zeros, so this change is theoretical.

In fact, perhaps even XSTATE_SUPPORTED is not restrictive enough here,
and we should hide all features that are not visible in CPUID. It is
okay, however, to test it in cpu_post_load.

Paolo

^ permalink raw reply [flat|nested] 19+ messages in thread
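[Editorial illustration, not part of the thread.] The message above suggests that CPUID leaf 0Dh data can be derived from ordinary feature bits. A rough sketch of such a derivation is below; it is not QEMU code, and the function name xstate_mask_from_cpuid is hypothetical. It assumes the architectural bit positions documented in the Intel SDM: AVX is CPUID.1:ECX bit 28, MPX and AVX512F are CPUID.(EAX=07h,ECX=0):EBX bits 14 and 16, and the XCR0 components are numbered 0 (x87), 1 (SSE), 2 (YMM), 3-4 (MPX bound registers and BNDCSR) and 5-7 (AVX-512 opmask, ZMM_Hi256, Hi16_ZMM).

```c
#include <stdint.h>

/* Feature bits (Intel SDM numbering). */
#define CPUID_1_ECX_AVX        (1u << 28)
#define CPUID_7_0_EBX_MPX      (1u << 14)
#define CPUID_7_0_EBX_AVX512F  (1u << 16)

/* XCR0 / XSTATE_BV component bits. */
#define XSTATE_FP         (1u << 0)
#define XSTATE_SSE        (1u << 1)
#define XSTATE_YMM        (1u << 2)
#define XSTATE_BNDREGS    (1u << 3)
#define XSTATE_BNDCSR     (1u << 4)
#define XSTATE_OPMASK     (1u << 5)
#define XSTATE_ZMM_HI256  (1u << 6)
#define XSTATE_HI16_ZMM   (1u << 7)

/* Hypothetical helper: compute the component mask that would be
 * reported in CPUID.(EAX=0Dh,ECX=0):EAX from the guest-visible
 * feature words, instead of passing through host values. */
uint32_t xstate_mask_from_cpuid(uint32_t cpuid_1_ecx, uint32_t cpuid_7_0_ebx)
{
    /* x87 and SSE state are always present in an XSAVE area. */
    uint32_t mask = XSTATE_FP | XSTATE_SSE;

    if (cpuid_1_ecx & CPUID_1_ECX_AVX) {
        mask |= XSTATE_YMM;
    }
    if (cpuid_7_0_ebx & CPUID_7_0_EBX_MPX) {
        mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
    }
    if (cpuid_7_0_ebx & CPUID_7_0_EBX_AVX512F) {
        mask |= XSTATE_OPMASK | XSTATE_ZMM_HI256 | XSTATE_HI16_ZMM;
    }
    return mask;
}
```

With a mask computed this way, two hosts configured with the same guest CPUID would agree on which XSAVE components exist, regardless of what the underlying hardware supports.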
* Re: [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust
  2013-09-09 8:51 ` Paolo Bonzini
@ 2013-09-09 9:18 ` Gleb Natapov
  2013-09-09 9:50 ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2013-09-09 9:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost

On Mon, Sep 09, 2013 at 10:51:58AM +0200, Paolo Bonzini wrote:
> On 08/09/2013 13:52, Gleb Natapov wrote:
> > On Thu, Sep 05, 2013 at 03:06:22PM +0200, Paolo Bonzini wrote:
> >> QEMU moves state from CPUArchState to struct kvm_xsave and back when it
> >> invokes the KVM_*_XSAVE ioctls. Because it doesn't treat the XSAVE
> >> region as an opaque blob, it might be impossible to set some state on
> >> the destination if migrating to an older version.
> >>
> >> This patch blocks migration if it finds that unsupported bits are set
> >> in the XSTATE_BV header field. To make this work robustly, QEMU should
> >> only report in env->xstate_bv those fields that will actually end up
> >> in the migration stream.
> >
> > We usually handle host CPU differences in the CPUID layer, not by trying
> > to validate migration data,
>
> Actually we do both. QEMU for example detects invalid subsections and
> blocks migration, and CPU differences also result in subsections that
> the destination does not know.
>
That's different from what you do here though. If xstate_bv was in its
own separate subsection things would be easier, but it is not.

> But as far as QEMU is concerned, setting an unknown bit in XSTATE_BV is
> not a CPU difference, it is simply invalid migration data.
>
> > i.e. CPUID.0D should be configurable and
> > management should be able to query QEMU about what is supported and
> > prevent the migration attempt accordingly.
>
> Management is already able to query QEMU about what is supported, because
> new XSAVE state is always attached to new CPUID bits in leaves other
> than 0Dh (e.g. EAX=07h, ECX=0h returns AVX512 and MPX support in EBX).
> QEMU should compute 0Dh data based on those bits indeed.
If it is computable from other data, even better: easier for us.

>
> However, KVM_GET/SET_XSAVE should still return all values supported by
> the hypervisor, independent of the supported CPUID bits.
>
Why?

> >>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> ---
> >>  target-i386/kvm.c     | 3 ++-
> >>  target-i386/machine.c | 4 ++++
> >>  2 files changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> >> index 749aa09..df08a4b 100644
> >> --- a/target-i386/kvm.c
> >> +++ b/target-i386/kvm.c
> >> @@ -1291,7 +1291,8 @@ static int kvm_get_xsave(X86CPU *cpu)
> >>              sizeof env->fpregs);
> >>      memcpy(env->xmm_regs, &xsave->region[XSAVE_XMM_SPACE],
> >>              sizeof env->xmm_regs);
> >> -    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
> >> +    env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] &
> >> +        XSTATE_SUPPORTED;
> > Don't we just drop state here that will not be restored on the
> > destination, and the destination will not be able to tell since we
> > masked the unsupported bits?
>
> A well-behaved guest should not have modified that state anyway, since:
>
> * the source and destination machines should have the same CPU
>
> * since the destination QEMU does not support the feature, the source
>   should have masked it as well
>
> * the guest should always probe CPUID before using a feature
>
Then I fail to see what the purpose of the patch is. I see two cases:
1. Each extended state has a separate CPUID bit (is this guaranteed?)
 - In this case, as you say, by matching CPUID on src and dst we guarantee
   that migration data is good.
2. There is a state that is advertised in CPUID.0D, but does not have
   any regular "feature" CPUID bit associated with it.
 - In this case this patch will drop valid state that needs to be
   restored.

> There will be only one change for well-behaved guests with this patch
> (and the change will be invisible to them since they behave well).
> After the patch, KVM_SET_XSAVE will set the extended states to the
> processor-reset state instead of all-zeros. However, all
> currently-defined states have a processor-reset state that is equal to
> all-zeros, so this change is theoretical.
>
> In fact, perhaps even XSTATE_SUPPORTED is not restrictive enough here,
> and we should hide all features that are not visible in CPUID. It is
> okay, however, to test it in cpu_post_load.
The kernel should not even return state that is not visible in CPUID.

--
Gleb.

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust
  2013-09-09 9:18 ` Gleb Natapov
@ 2013-09-09 9:50 ` Paolo Bonzini
  2013-09-09 10:41 ` Gleb Natapov
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-09 9:50 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel, kvm, ehabkost

On 09/09/2013 11:18, Gleb Natapov wrote:
> On Mon, Sep 09, 2013 at 10:51:58AM +0200, Paolo Bonzini wrote:
>> On 08/09/2013 13:52, Gleb Natapov wrote:
>>> On Thu, Sep 05, 2013 at 03:06:22PM +0200, Paolo Bonzini wrote:
>>>> QEMU moves state from CPUArchState to struct kvm_xsave and back when it
>>>> invokes the KVM_*_XSAVE ioctls. Because it doesn't treat the XSAVE
>>>> region as an opaque blob, it might be impossible to set some state on
>>>> the destination if migrating to an older version.
>>>>
>>>> This patch blocks migration if it finds that unsupported bits are set
>>>> in the XSTATE_BV header field. To make this work robustly, QEMU should
>>>> only report in env->xstate_bv those fields that will actually end up
>>>> in the migration stream.
>>>
>>> We usually handle host CPU differences in the CPUID layer, not by trying
>>> to validate migration data,
>>
>> Actually we do both. QEMU for example detects invalid subsections and
>> blocks migration, and CPU differences also result in subsections that
>> the destination does not know.
>>
> That's different from what you do here though. If xstate_bv was in its
> own separate subsection things would be easier, but it is not.

I agree. The same goes for YMM, if it was in its own separate subsection;
future XSAVE states will likely use subsections (whose presence is keyed
off bits in env->xstate_bv).

>> However, KVM_GET/SET_XSAVE should still return all values supported by
>> the hypervisor, independent of the supported CPUID bits.
>
> Why?

Because this is not talking to the guest, it is talking to userspace.

The VCPU state is more than what is visible to the guest, and returning
all of it seems more consistent with the rest of the KVM API. For
example, KVM_GET_FPU always returns SSE state even if the CPUID lacks
SSE and/or FXSR.

>> A well-behaved guest should not have modified that state anyway, since:
>>
>> * the source and destination machines should have the same CPU
>>
>> * since the destination QEMU does not support the feature, the source
>>   should have masked it as well
>>
>> * the guest should always probe CPUID before using a feature
>>
> Then I fail to see what the purpose of the patch is. I see two cases:
> 1. Each extended state has a separate CPUID bit (is this guaranteed?)

Not guaranteed, but it has always happened so far (AVX, AVX-512, MPX).

>  - In this case, as you say, by matching CPUID on src and dst we guarantee
>    that migration data is good.

But we don't match CPUID on source and destination. This is something that
the user should do, but it's better if we can test it too. Subsections
do that for us; I am, in some sense, emulating subsections for the XSAVE
states that are not stored in subsections.

>> In fact, perhaps even XSTATE_SUPPORTED is not restrictive enough here,
>> and we should hide all features that are not visible in CPUID. It is
>> okay, however, to test it in cpu_post_load.
>
> The kernel should not even return state that is not visible in CPUID.

That's an interesting point of view that I hadn't considered. But just
like you asked me why it should return state that is not visible in
CPUID, I'm asking you why it should not...

Paolo

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust
  2013-09-09 9:50 ` Paolo Bonzini
@ 2013-09-09 10:41 ` Gleb Natapov
  2013-09-09 12:00 ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2013-09-09 10:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kvm, ehabkost

On Mon, Sep 09, 2013 at 11:50:03AM +0200, Paolo Bonzini wrote:
> On 09/09/2013 11:18, Gleb Natapov wrote:
> > On Mon, Sep 09, 2013 at 10:51:58AM +0200, Paolo Bonzini wrote:
> >> On 08/09/2013 13:52, Gleb Natapov wrote:
> >>> On Thu, Sep 05, 2013 at 03:06:22PM +0200, Paolo Bonzini wrote:
> >>>> QEMU moves state from CPUArchState to struct kvm_xsave and back when it
> >>>> invokes the KVM_*_XSAVE ioctls. Because it doesn't treat the XSAVE
> >>>> region as an opaque blob, it might be impossible to set some state on
> >>>> the destination if migrating to an older version.
> >>>>
> >>>> This patch blocks migration if it finds that unsupported bits are set
> >>>> in the XSTATE_BV header field. To make this work robustly, QEMU should
> >>>> only report in env->xstate_bv those fields that will actually end up
> >>>> in the migration stream.
> >>>
> >>> We usually handle host CPU differences in the CPUID layer, not by trying
> >>> to validate migration data,
> >>
> >> Actually we do both. QEMU for example detects invalid subsections and
> >> blocks migration, and CPU differences also result in subsections that
> >> the destination does not know.
> >>
> > That's different from what you do here though. If xstate_bv was in its
> > own separate subsection things would be easier, but it is not.
>
> I agree. The same goes for YMM, if it was in its own separate subsection;
> future XSAVE states will likely use subsections (whose presence is keyed
> off bits in env->xstate_bv).
>
> >> However, KVM_GET/SET_XSAVE should still return all values supported by
> >> the hypervisor, independent of the supported CPUID bits.
> >
> > Why?
>
> Because this is not talking to the guest, it is talking to userspace.
>
> The VCPU state is more than what is visible to the guest, and returning
If a state does not affect the guest in any way, there is no reason to
migrate it.

> all of it seems more consistent with the rest of the KVM API. For
> example, KVM_GET_FPU always returns SSE state even if the CPUID lacks
> SSE and/or FXSR.
There are counterexamples too :) If the APIC is not created we do not
return fake information on GET_IRQCHIP. I think nobody expected FPU state
to grow indefinitely, so a fixed, inflexible API was introduced, but now
that the CPU has flexible extended state management it makes sense to
model it in the API too.

>
> >> A well-behaved guest should not have modified that state anyway, since:
> >>
> >> * the source and destination machines should have the same CPU
> >>
> >> * since the destination QEMU does not support the feature, the source
> >>   should have masked it as well
> >>
> >> * the guest should always probe CPUID before using a feature
> >>
> > Then I fail to see what the purpose of the patch is. I see two cases:
> > 1. Each extended state has a separate CPUID bit (is this guaranteed?)
>
> Not guaranteed, but it has always happened so far (AVX, AVX-512, MPX).
>
OK, so for now there is no need to make 0D configurable, but we need to
provide the correct one according to those flags, not pass through host
values.

> >  - In this case, as you say, by matching CPUID on src and dst we guarantee
> >    that migration data is good.
>
> But we don't match CPUID on source and destination. This is something that
Yes, I was saying that the management infrastructure already knows how to
handle it.

> the user should do, but it's better if we can test it too. Subsections
> do that for us; I am, in some sense, emulating subsections for the XSAVE
> states that are not stored in subsections.
We do not do it for other bits. It is currently possible to migrate to a
slightly different CPU without failure, and it may cause the guest to
crash, but we are not actively trying to catch those situations. Why is
XSAVE different?

>
> >> In fact, perhaps even XSTATE_SUPPORTED is not restrictive enough here,
> >> and we should hide all features that are not visible in CPUID. It is
> >> okay, however, to test it in cpu_post_load.
> >
> > The kernel should not even return state that is not visible in CPUID.
>
> That's an interesting point of view that I hadn't considered. But just
> like you asked me why it should return state that is not visible in
> CPUID, I'm asking you why it should not...
>
For a number of reasons. First, since the state is not used there is no
point in migrating it. Second, to make the interface more deterministic
for QEMU, i.e. QEMU configures only the features it supports and gets
exactly the same state from the kernel no matter what the host CPU is and
what the kernel version is. This patch will not be needed since the
kernel will do the job.

--
Gleb.

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust
  2013-09-09 10:41 ` Gleb Natapov
@ 2013-09-09 12:00 ` Paolo Bonzini
  0 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2013-09-09 12:00 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel, kvm, ehabkost

On 09/09/2013 12:41, Gleb Natapov wrote:
>>>> In fact, perhaps even XSTATE_SUPPORTED is not restrictive enough here,
>>>> and we should hide all features that are not visible in CPUID. It is
>>>> okay, however, to test it in cpu_post_load.
>>>
>>> The kernel should not even return state that is not visible in CPUID.
>>
>> That's an interesting point of view that I hadn't considered. But just
>> like you asked me why it should return state that is not visible in
>> CPUID, I'm asking you why it should not...
>>
> For a number of reasons. First, since the state is not used there is no
> point in migrating it. Second, to make the interface more deterministic
> for QEMU, i.e. QEMU configures only the features it supports and gets
> exactly the same state from the kernel no matter what the host CPU is and
> what the kernel version is. This patch will not be needed since the
> kernel will do the job.

Good reasons, thanks. Let's do it in the kernel then and avoid this
patch altogether.

Paolo

^ permalink raw reply [flat|nested] 19+ messages in thread