[PATCH 0/2] kvm: x86/cpu: Support guest MAXPHYADDR

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/2] kvm: x86/cpu: Support guest MAXPHYADDR < host MAXPHYADDR
@ 2020-06-19 15:53 Mohammed Gamal
  2020-06-19 15:53 ` [PATCH 1/2] kvm: Add support for KVM_CAP_HAS_SMALLER_MAXPHYADDR Mohammed Gamal
  2020-06-19 15:53 ` [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it Mohammed Gamal
  0 siblings, 2 replies; 15+ messages in thread
From: Mohammed Gamal @ 2020-06-19 15:53 UTC (permalink / raw)
  To: qemu-devel, pbonzini; +Cc: mtosatti, Mohammed Gamal, ehabkost, kvm, rth

This series adds support for KVM_CAP_HAS_SMALLER_MAXPHYADDR to QEMU.
Some processors might not handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR in
the expected manner. Hence, we added KVM_CAP_HAS_SMALLER_MAXPHYADDR to
KVM.
In this implementation KVM is queried for KVM_CAP_HAS_SMALLER_MAXPHYADDR
when setting vCPU physical bits, and if the CPU doesn't support 
KVM_CAP_HAS_SMALLER_MAXPHYADDR the ,phys-bits is ignore and host phyiscal
bits are used. A warning message is printed to the user.

Mohammed Gamal (2):
  kvm: Add support for KVM_CAP_HAS_SMALLER_MAXPHYADDR
  x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that
    don't support it

 linux-headers/linux/kvm.h |  1 +
 target/i386/cpu.c         | 11 +++++++++++
 target/i386/kvm.c         |  5 +++++
 target/i386/kvm_i386.h    |  1 +
 4 files changed, 18 insertions(+)

-- 
2.26.2

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] kvm: Add support for KVM_CAP_HAS_SMALLER_MAXPHYADDR
  2020-06-19 15:53 [PATCH 0/2] kvm: x86/cpu: Support guest MAXPHYADDR < host MAXPHYADDR Mohammed Gamal
@ 2020-06-19 15:53 ` Mohammed Gamal
  2020-06-19 15:53 ` [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it Mohammed Gamal
  1 sibling, 0 replies; 15+ messages in thread
From: Mohammed Gamal @ 2020-06-19 15:53 UTC (permalink / raw)
  To: qemu-devel, pbonzini; +Cc: mtosatti, Mohammed Gamal, ehabkost, kvm, rth

This adds the KVM_CAP_HAS_SMALLER_MAXPHYADDR capability
and a helper function to check for this capability.
This will allow QEMU to decide to what to do if the host
CPU can't handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR properly.

Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
---
 linux-headers/linux/kvm.h | 1 +
 target/i386/kvm.c         | 5 +++++
 target/i386/kvm_i386.h    | 1 +
 3 files changed, 7 insertions(+)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 9804495a46..9eb61a303f 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1017,6 +1017,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_VCPU_RESETS 179
 #define KVM_CAP_S390_PROTECTED 180
 #define KVM_CAP_PPC_SECURE_GUEST 181
+#define KVM_CAP_SMALLER_MAXPHYADDR 184
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index b3c13cb898..01100dbf20 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -136,6 +136,11 @@ bool kvm_has_smm(void)
     return kvm_check_extension(kvm_state, KVM_CAP_X86_SMM);
 }
 
+bool kvm_has_smaller_maxphyaddr(void)
+{
+    return kvm_check_extension(kvm_state, KVM_CAP_SMALLER_MAXPHYADDR);
+}
+
 bool kvm_has_adjust_clock_stable(void)
 {
     int ret = kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK);
diff --git a/target/i386/kvm_i386.h b/target/i386/kvm_i386.h
index 00bde7acaf..513f8eebbb 100644
--- a/target/i386/kvm_i386.h
+++ b/target/i386/kvm_i386.h
@@ -34,6 +34,7 @@
 
 bool kvm_allows_irq0_override(void);
 bool kvm_has_smm(void);
+bool kvm_has_smaller_maxphyaddr(void);
 bool kvm_has_adjust_clock_stable(void);
 bool kvm_has_exception_payload(void);
 void kvm_synchronize_all_tsc(void);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-06-19 15:53 [PATCH 0/2] kvm: x86/cpu: Support guest MAXPHYADDR < host MAXPHYADDR Mohammed Gamal
  2020-06-19 15:53 ` [PATCH 1/2] kvm: Add support for KVM_CAP_HAS_SMALLER_MAXPHYADDR Mohammed Gamal
@ 2020-06-19 15:53 ` Mohammed Gamal
  2020-06-19 16:25   ` Paolo Bonzini
  2020-07-08 17:16   ` Eduardo Habkost
  1 sibling, 2 replies; 15+ messages in thread
From: Mohammed Gamal @ 2020-06-19 15:53 UTC (permalink / raw)
  To: qemu-devel, pbonzini; +Cc: mtosatti, Mohammed Gamal, ehabkost, kvm, rth

If the CPU doesn't support GUEST_MAXPHYADDR < HOST_MAXPHYADDR we
let QEMU choose to use the host MAXPHYADDR and print a warning to the
user.

Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
---
 target/i386/cpu.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b1b311baa2..91c57117ce 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6589,6 +6589,17 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
             uint32_t host_phys_bits = x86_host_phys_bits();
             static bool warned;
 
+	    /*
+	     * If host doesn't support setting physical bits on the guest,
+	     * report it and return
+	     */
+	    if (cpu->phys_bits < host_phys_bits &&
+		!kvm_has_smaller_maxphyaddr()) {
+		warn_report("Host doesn't support setting smaller phys-bits."
+			    " Using host phys-bits\n");
+		cpu->phys_bits = host_phys_bits;
+	    }
+
             /* Print a warning if the user set it to a value that's not the
              * host value.
              */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-06-19 15:53 ` [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it Mohammed Gamal
@ 2020-06-19 16:25   ` Paolo Bonzini
  2020-07-08 17:16   ` Eduardo Habkost
  1 sibling, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2020-06-19 16:25 UTC (permalink / raw)
  To: Mohammed Gamal, qemu-devel; +Cc: mtosatti, ehabkost, kvm, rth

On 19/06/20 17:53, Mohammed Gamal wrote:
> If the CPU doesn't support GUEST_MAXPHYADDR < HOST_MAXPHYADDR we
> let QEMU choose to use the host MAXPHYADDR and print a warning to the
> user.
> 
> Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
> ---
>  target/i386/cpu.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b1b311baa2..91c57117ce 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6589,6 +6589,17 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>              uint32_t host_phys_bits = x86_host_phys_bits();
>              static bool warned;
>  
> +	    /*
> +	     * If host doesn't support setting physical bits on the guest,
> +	     * report it and return
> +	     */
> +	    if (cpu->phys_bits < host_phys_bits &&
> +		!kvm_has_smaller_maxphyaddr()) {
> +		warn_report("Host doesn't support setting smaller phys-bits."
> +			    " Using host phys-bits\n");
> +		cpu->phys_bits = host_phys_bits;
> +	    }
> +
>              /* Print a warning if the user set it to a value that's not the
>               * host value.
>               */
> 

You should remove the existing warning too.

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-06-19 15:53 ` [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it Mohammed Gamal
  2020-06-19 16:25   ` Paolo Bonzini
@ 2020-07-08 17:16   ` Eduardo Habkost
  2020-07-08 17:26     ` Daniel P. Berrangé
  1 sibling, 1 reply; 15+ messages in thread
From: Eduardo Habkost @ 2020-07-08 17:16 UTC (permalink / raw)
  To: Mohammed Gamal
  Cc: mtosatti, Pedro Principeza, kvm, libvir-list, Dann Frazier,
	Guilherme Piccoli, qemu-devel, Christian Ehrhardt,
	Dr. David Alan Gilbert, Gerd Hoffmann, pbonzini,
	Daniel P. Berrangé, Laszlo Ersek, fw, rth

(CCing libvir-list, and people who were included in the OVMF
thread[1])

[1] https://lore.kernel.org/qemu-devel/99779e9c-f05f-501b-b4be-ff719f140a88@canonical.com/

On Fri, Jun 19, 2020 at 05:53:44PM +0200, Mohammed Gamal wrote:
> If the CPU doesn't support GUEST_MAXPHYADDR < HOST_MAXPHYADDR we
> let QEMU choose to use the host MAXPHYADDR and print a warning to the
> user.
> 
> Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
> ---
>  target/i386/cpu.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b1b311baa2..91c57117ce 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6589,6 +6589,17 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>              uint32_t host_phys_bits = x86_host_phys_bits();
>              static bool warned;
>  
> +	    /*
> +	     * If host doesn't support setting physical bits on the guest,
> +	     * report it and return
> +	     */
> +	    if (cpu->phys_bits < host_phys_bits &&
> +		!kvm_has_smaller_maxphyaddr()) {
> +		warn_report("Host doesn't support setting smaller phys-bits."
> +			    " Using host phys-bits\n");
> +		cpu->phys_bits = host_phys_bits;
> +	    }
> +

This looks like a regression from existing behavior.  Today,
using smaller phys-bits doesn't crash most guests, and still
allows live migration to smaller hosts.  I agree using the host
phys-bits is probably a better default, but we shouldn't override
options set explicitly in the command line.

Also, it's important that we work with libvirt and management
software to ensure they have appropriate APIs to choose what to
do when a cluster has hosts with different MAXPHYADDR.

>              /* Print a warning if the user set it to a value that's not the
>               * host value.
>               */
> -- 
> 2.26.2
> 

-- 
Eduardo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-08 17:16   ` Eduardo Habkost
@ 2020-07-08 17:26     ` Daniel P. Berrangé
  2020-07-09  9:44       ` Gerd Hoffmann
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel P. Berrangé @ 2020-07-08 17:26 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mtosatti, Pedro Principeza, kvm, libvir-list, Dann Frazier,
	Guilherme Piccoli, Dr. David Alan Gilbert, Christian Ehrhardt,
	qemu-devel, Gerd Hoffmann, pbonzini, Mohammed Gamal, Laszlo Ersek,
	fw, rth

On Wed, Jul 08, 2020 at 01:16:21PM -0400, Eduardo Habkost wrote:
> (CCing libvir-list, and people who were included in the OVMF
> thread[1])
> 
> [1] https://lore.kernel.org/qemu-devel/99779e9c-f05f-501b-b4be-ff719f140a88@canonical.com/
> 
> On Fri, Jun 19, 2020 at 05:53:44PM +0200, Mohammed Gamal wrote:
> > If the CPU doesn't support GUEST_MAXPHYADDR < HOST_MAXPHYADDR we
> > let QEMU choose to use the host MAXPHYADDR and print a warning to the
> > user.
> > 
> > Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
> > ---
> >  target/i386/cpu.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index b1b311baa2..91c57117ce 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6589,6 +6589,17 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> >              uint32_t host_phys_bits = x86_host_phys_bits();
> >              static bool warned;
> >  
> > +	    /*
> > +	     * If host doesn't support setting physical bits on the guest,
> > +	     * report it and return
> > +	     */
> > +	    if (cpu->phys_bits < host_phys_bits &&
> > +		!kvm_has_smaller_maxphyaddr()) {
> > +		warn_report("Host doesn't support setting smaller phys-bits."
> > +			    " Using host phys-bits\n");
> > +		cpu->phys_bits = host_phys_bits;
> > +	    }
> > +
> 
> This looks like a regression from existing behavior.  Today,
> using smaller phys-bits doesn't crash most guests, and still
> allows live migration to smaller hosts.  I agree using the host
> phys-bits is probably a better default, but we shouldn't override
> options set explicitly in the command line.

Yeah, this looks like it would cause a behaviour change / regression
so looks questionable.

> Also, it's important that we work with libvirt and management
> software to ensure they have appropriate APIs to choose what to
> do when a cluster has hosts with different MAXPHYADDR.

There's been so many complex discussions that it is hard to have any
understanding of what we should be doing going forward. There's enough
problems wrt phys bits, that I think we would benefit from a doc that
outlines the big picture expectation for how to handle this in the
virt stack.

As mentioned in the thread quoted above, using host_phys_bits is a
obvious thing to do when the user requested "-cpu host".

The harder issue is how to handle other CPU models. I had suggested
we should try associating a phys bits value with them, which would
probably involve creating Client/Server variants for all our CPU
models which don't currently have it. I still think that's worth
exploring as a strategy and with versioned CPU models we should
be ok wrt back compatibility with that approach.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-08 17:26     ` Daniel P. Berrangé
@ 2020-07-09  9:44       ` Gerd Hoffmann
  2020-07-09  9:55         ` Mohammed Gamal
  2020-07-09 17:00         ` Jim Mattson
  0 siblings, 2 replies; 15+ messages in thread
From: Gerd Hoffmann @ 2020-07-09  9:44 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Guilherme Piccoli, Pedro Principeza, Eduardo Habkost, kvm,
	libvir-list, Dann Frazier, mtosatti, Dr. David Alan Gilbert,
	Christian Ehrhardt, qemu-devel, pbonzini, Mohammed Gamal,
	Laszlo Ersek, fw, rth

  Hi,

> > (CCing libvir-list, and people who were included in the OVMF
> > thread[1])
> > 
> > [1] https://lore.kernel.org/qemu-devel/99779e9c-f05f-501b-b4be-ff719f140a88@canonical.com/

> > Also, it's important that we work with libvirt and management
> > software to ensure they have appropriate APIs to choose what to
> > do when a cluster has hosts with different MAXPHYADDR.
> 
> There's been so many complex discussions that it is hard to have any
> understanding of what we should be doing going forward. There's enough
> problems wrt phys bits, that I think we would benefit from a doc that
> outlines the big picture expectation for how to handle this in the
> virt stack.

Well, the fundamental issue is not that hard actually.  We have three
cases:

(1) GUEST_MAXPHYADDR == HOST_MAXPHYADDR

    Everything is fine ;)

(2) GUEST_MAXPHYADDR < HOST_MAXPHYADDR

    Mostly fine.  Some edge cases, like different page fault errors for
    addresses above GUEST_MAXPHYADDR and below HOST_MAXPHYADDR.  Which I
    think Mohammed fixed in the kernel recently.

(3) GUEST_MAXPHYADDR > HOST_MAXPHYADDR

    Broken.  If the guest uses addresses above HOST_MAXPHYADDR everything
    goes south.

The (2) case isn't much of a problem.  We only need to figure whenever
we want qemu allow this unconditionally (current state) or only in case
the kernel fixes are present (state with this patch applied if I read it
correctly).

The (3) case is the reason why guest firmware never ever uses
GUEST_MAXPHYADDR and goes with very conservative heuristics instead,
which in turn leads to the consequences discussed at length in the
OVMF thread linked above.

Ideally we would simply outlaw (3), but it's hard for backward
compatibility reasons.  Second best solution is a flag somewhere
(msr, cpuid, ...) telling the guest firmware "you can use
GUEST_MAXPHYADDR, we guarantee it is <= HOST_MAXPHYADDR".

> As mentioned in the thread quoted above, using host_phys_bits is a
> obvious thing to do when the user requested "-cpu host".
> 
> The harder issue is how to handle other CPU models. I had suggested
> we should try associating a phys bits value with them, which would
> probably involve creating Client/Server variants for all our CPU
> models which don't currently have it. I still think that's worth
> exploring as a strategy and with versioned CPU models we should
> be ok wrt back compatibility with that approach.

Yep, better defaults for GUEST_MAXPHYADDR would be good too, but that
is a separate (although related) discussion.

take care,
  Gerd



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-09  9:44       ` Gerd Hoffmann
@ 2020-07-09  9:55         ` Mohammed Gamal
  2020-07-09 10:11           ` Paolo Bonzini
  2020-07-09 17:00         ` Jim Mattson
  1 sibling, 1 reply; 15+ messages in thread
From: Mohammed Gamal @ 2020-07-09  9:55 UTC (permalink / raw)
  To: Gerd Hoffmann, Daniel P. Berrangé
  Cc: Guilherme Piccoli, Pedro Principeza, Eduardo Habkost, kvm,
	libvir-list, Dann Frazier, mtosatti, Dr. David Alan Gilbert,
	Christian Ehrhardt, qemu-devel, pbonzini, Laszlo Ersek, fw, rth

On Thu, 2020-07-09 at 11:44 +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > > (CCing libvir-list, and people who were included in the OVMF
> > > thread[1])
> > > 
> > > [1] 
> > > https://lore.kernel.org/qemu-devel/99779e9c-f05f-501b-b4be-ff719f140a88@canonical.com/
> > > Also, it's important that we work with libvirt and management
> > > software to ensure they have appropriate APIs to choose what to
> > > do when a cluster has hosts with different MAXPHYADDR.
> > 
> > There's been so many complex discussions that it is hard to have
> > any
> > understanding of what we should be doing going forward. There's
> > enough
> > problems wrt phys bits, that I think we would benefit from a doc
> > that
> > outlines the big picture expectation for how to handle this in the
> > virt stack.
> 
> Well, the fundamental issue is not that hard actually.  We have three
> cases:
> 
> (1) GUEST_MAXPHYADDR == HOST_MAXPHYADDR
> 
>     Everything is fine ;)
> 
> (2) GUEST_MAXPHYADDR < HOST_MAXPHYADDR
> 
>     Mostly fine.  Some edge cases, like different page fault errors
> for
>     addresses above GUEST_MAXPHYADDR and below
> HOST_MAXPHYADDR.  Which I
>     think Mohammed fixed in the kernel recently.
> 
> (3) GUEST_MAXPHYADDR > HOST_MAXPHYADDR
> 
>     Broken.  If the guest uses addresses above HOST_MAXPHYADDR
> everything
>     goes south.
> 
> The (2) case isn't much of a problem.  We only need to figure
> whenever
> we want qemu allow this unconditionally (current state) or only in
> case
> the kernel fixes are present (state with this patch applied if I read
> it
> correctly).
> 
> The (3) case is the reason why guest firmware never ever uses
> GUEST_MAXPHYADDR and goes with very conservative heuristics instead,
> which in turn leads to the consequences discussed at length in the
> OVMF thread linked above.
> 
> Ideally we would simply outlaw (3), but it's hard for backward
> compatibility reasons.  Second best solution is a flag somewhere
> (msr, cpuid, ...) telling the guest firmware "you can use
> GUEST_MAXPHYADDR, we guarantee it is <= HOST_MAXPHYADDR".

Problem is GUEST_MAXPHYADDR > HOST_MAXPHYADDR is actually a supported
configuration on some setups. Namely when memory encryption is enabled
on AMD CPUs[1].

> 
> > As mentioned in the thread quoted above, using host_phys_bits is a
> > obvious thing to do when the user requested "-cpu host".
> > 
> > The harder issue is how to handle other CPU models. I had suggested
> > we should try associating a phys bits value with them, which would
> > probably involve creating Client/Server variants for all our CPU
> > models which don't currently have it. I still think that's worth
> > exploring as a strategy and with versioned CPU models we should
> > be ok wrt back compatibility with that approach.
> 
> Yep, better defaults for GUEST_MAXPHYADDR would be good too, but that
> is a separate (although related) discussion.
> 
> take care,
>   Gerd
> 
[1] - https://lkml.org/lkml/2020/6/19/2371



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-09  9:55         ` Mohammed Gamal
@ 2020-07-09 10:11           ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2020-07-09 10:11 UTC (permalink / raw)
  To: Mohammed Gamal, Gerd Hoffmann, Daniel P. Berrangé
  Cc: Guilherme Piccoli, Pedro Principeza, Eduardo Habkost, kvm,
	libvir-list, Dann Frazier, mtosatti, Dr. David Alan Gilbert,
	Christian Ehrhardt, qemu-devel, Laszlo Ersek, fw, rth

On 09/07/20 11:55, Mohammed Gamal wrote:
>> Ideally we would simply outlaw (3), but it's hard for backward
>> compatibility reasons.  Second best solution is a flag somewhere
>> (msr, cpuid, ...) telling the guest firmware "you can use
>> GUEST_MAXPHYADDR, we guarantee it is <= HOST_MAXPHYADDR".
> Problem is GUEST_MAXPHYADDR > HOST_MAXPHYADDR is actually a supported
> configuration on some setups. Namely when memory encryption is enabled
> on AMD CPUs[1].
> 

It's not that bad since there's two MAXPHYADDRs, the one in CPUID and
the one computed internally by the kernel.  GUEST_MAXPHYADDR greater
than the host CPUID maxphyaddr is never supported.

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-09  9:44       ` Gerd Hoffmann
  2020-07-09  9:55         ` Mohammed Gamal
@ 2020-07-09 17:00         ` Jim Mattson
  2020-07-09 19:13           ` Eduardo Habkost
  2020-07-10  7:21           ` Paolo Bonzini
  1 sibling, 2 replies; 15+ messages in thread
From: Jim Mattson @ 2020-07-09 17:00 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Daniel P. Berrangé, Eduardo Habkost, mtosatti,
	Pedro Principeza, kvm list, libvir-list, Dann Frazier,
	Guilherme Piccoli, Dr. David Alan Gilbert, Christian Ehrhardt,
	qemu-devel, Paolo Bonzini, Mohammed Gamal, Laszlo Ersek, fw, rth

On Thu, Jul 9, 2020 at 2:44 AM Gerd Hoffmann <kraxel@redhat.com> wrote:

> (2) GUEST_MAXPHYADDR < HOST_MAXPHYADDR
>
>     Mostly fine.  Some edge cases, like different page fault errors for
>     addresses above GUEST_MAXPHYADDR and below HOST_MAXPHYADDR.  Which I
>     think Mohammed fixed in the kernel recently.

Doesn't this require intercepting MOV-to-CR3 when the guest is in PAE
mode, so that the hypervisor can validate the high bits in the PDPTEs?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-09 17:00         ` Jim Mattson
@ 2020-07-09 19:13           ` Eduardo Habkost
  2020-07-10  7:22             ` Paolo Bonzini
  2020-07-10  7:21           ` Paolo Bonzini
  1 sibling, 1 reply; 15+ messages in thread
From: Eduardo Habkost @ 2020-07-09 19:13 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Guilherme Piccoli, Pedro Principeza, kvm list, libvir-list,
	Dann Frazier, mtosatti, Dr. David Alan Gilbert,
	Christian Ehrhardt, qemu-devel, Gerd Hoffmann, Paolo Bonzini,
	Daniel P. Berrangé, Mohammed Gamal, Laszlo Ersek, fw, rth

On Thu, Jul 09, 2020 at 10:00:59AM -0700, Jim Mattson wrote:
> On Thu, Jul 9, 2020 at 2:44 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> 
> > (2) GUEST_MAXPHYADDR < HOST_MAXPHYADDR
> >
> >     Mostly fine.  Some edge cases, like different page fault errors for
> >     addresses above GUEST_MAXPHYADDR and below HOST_MAXPHYADDR.  Which I
> >     think Mohammed fixed in the kernel recently.
> 
> Doesn't this require intercepting MOV-to-CR3 when the guest is in PAE
> mode, so that the hypervisor can validate the high bits in the PDPTEs?

If the fix has additional overhead, is the additional overhead
bad enough to warrant making it optional?  Most existing
GUEST_MAXPHYADDR < HOST_MAXPHYADDR guests already work today
without the fix.

-- 
Eduardo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-09 19:13           ` Eduardo Habkost
@ 2020-07-10  7:22             ` Paolo Bonzini
  2020-07-10 16:02               ` Eduardo Habkost
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2020-07-10  7:22 UTC (permalink / raw)
  To: Eduardo Habkost, Jim Mattson
  Cc: Guilherme Piccoli, Pedro Principeza, kvm list, libvir-list,
	Dann Frazier, mtosatti, Dr. David Alan Gilbert,
	Christian Ehrhardt, qemu-devel, Gerd Hoffmann,
	Daniel P. Berrangé, Mohammed Gamal, Laszlo Ersek, fw, rth

On 09/07/20 21:13, Eduardo Habkost wrote:
>> Doesn't this require intercepting MOV-to-CR3 when the guest is in PAE
>> mode, so that the hypervisor can validate the high bits in the PDPTEs?
> If the fix has additional overhead, is the additional overhead
> bad enough to warrant making it optional?  Most existing
> GUEST_MAXPHYADDR < HOST_MAXPHYADDR guests already work today
> without the fix.

The problematic case is when host maxphyaddr is 52.  That case wouldn't
work at all without the fix.

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-10  7:22             ` Paolo Bonzini
@ 2020-07-10 16:02               ` Eduardo Habkost
  2020-07-10 16:49                 ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Eduardo Habkost @ 2020-07-10 16:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Guilherme Piccoli, Pedro Principeza, kvm list, libvir-list,
	Dann Frazier, rth, mtosatti, Dr. David Alan Gilbert,
	Christian Ehrhardt, qemu-devel, Gerd Hoffmann,
	Daniel P. Berrangé, Mohammed Gamal, Laszlo Ersek, fw,
	Jim Mattson

On Fri, Jul 10, 2020 at 09:22:42AM +0200, Paolo Bonzini wrote:
> On 09/07/20 21:13, Eduardo Habkost wrote:
> >> Doesn't this require intercepting MOV-to-CR3 when the guest is in PAE
> >> mode, so that the hypervisor can validate the high bits in the PDPTEs?
> > If the fix has additional overhead, is the additional overhead
> > bad enough to warrant making it optional?  Most existing
> > GUEST_MAXPHYADDR < HOST_MAXPHYADDR guests already work today
> > without the fix.
> 
> The problematic case is when host maxphyaddr is 52.  That case wouldn't
> work at all without the fix.

What can QEMU do to do differentiate "can't work at all without
the fix" from "not the best idea, but will probably work"?

-- 
Eduardo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-10 16:02               ` Eduardo Habkost
@ 2020-07-10 16:49                 ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2020-07-10 16:49 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Guilherme Piccoli, Pedro Principeza, kvm list, libvir-list,
	Dann Frazier, rth, mtosatti, Dr. David Alan Gilbert,
	Christian Ehrhardt, qemu-devel, Gerd Hoffmann,
	Daniel P. Berrangé, Mohammed Gamal, Laszlo Ersek, fw,
	Jim Mattson

On 10/07/20 18:02, Eduardo Habkost wrote:
> On Fri, Jul 10, 2020 at 09:22:42AM +0200, Paolo Bonzini wrote:
>> On 09/07/20 21:13, Eduardo Habkost wrote:
>>>> Doesn't this require intercepting MOV-to-CR3 when the guest is in PAE
>>>> mode, so that the hypervisor can validate the high bits in the PDPTEs?
>>> If the fix has additional overhead, is the additional overhead
>>> bad enough to warrant making it optional?  Most existing
>>> GUEST_MAXPHYADDR < HOST_MAXPHYADDR guests already work today
>>> without the fix.
>>
>> The problematic case is when host maxphyaddr is 52.  That case wouldn't
>> work at all without the fix.
> 
> What can QEMU do to do differentiate "can't work at all without
> the fix" from "not the best idea, but will probably work"?

Blocking guest_maxphyaddr < host_maxphyaddr if maxphyaddr==52 would be a
good start.  However it would block the default configuration on IceLake
processors (which is why Mohammed looked at this thing in the first place).

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it
  2020-07-09 17:00         ` Jim Mattson
  2020-07-09 19:13           ` Eduardo Habkost
@ 2020-07-10  7:21           ` Paolo Bonzini
  1 sibling, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2020-07-10  7:21 UTC (permalink / raw)
  To: Jim Mattson, Gerd Hoffmann
  Cc: Guilherme Piccoli, Pedro Principeza, Eduardo Habkost, kvm list,
	libvir-list, Dann Frazier, mtosatti, Dr. David Alan Gilbert,
	Christian Ehrhardt, qemu-devel, Daniel P. Berrangé,
	Mohammed Gamal, Laszlo Ersek, fw, rth

On 09/07/20 19:00, Jim Mattson wrote:
>>
>>     Mostly fine.  Some edge cases, like different page fault errors for
>>     addresses above GUEST_MAXPHYADDR and below HOST_MAXPHYADDR.  Which I
>>     think Mohammed fixed in the kernel recently.
> Doesn't this require intercepting MOV-to-CR3 when the guest is in PAE
> mode, so that the hypervisor can validate the high bits in the PDPTEs?

In theory yes, but in practice it just means we'd use the AMD behavior
of loading guest PDPT entries on demand during address translation
(because the PDPT would point to nonexistent memory and cause an EPT
violation on the PDE).

Paolo



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-07-10 16:50 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-06-19 15:53 [PATCH 0/2] kvm: x86/cpu: Support guest MAXPHYADDR < host MAXPHYADDR Mohammed Gamal
2020-06-19 15:53 ` [PATCH 1/2] kvm: Add support for KVM_CAP_HAS_SMALLER_MAXPHYADDR Mohammed Gamal
2020-06-19 15:53 ` [PATCH 2/2] x86/cpu: Handle GUEST_MAXPHYADDR < HOST_MAXPHYADDR for hosts that don't support it Mohammed Gamal
2020-06-19 16:25   ` Paolo Bonzini
2020-07-08 17:16   ` Eduardo Habkost
2020-07-08 17:26     ` Daniel P. Berrangé
2020-07-09  9:44       ` Gerd Hoffmann
2020-07-09  9:55         ` Mohammed Gamal
2020-07-09 10:11           ` Paolo Bonzini
2020-07-09 17:00         ` Jim Mattson
2020-07-09 19:13           ` Eduardo Habkost
2020-07-10  7:22             ` Paolo Bonzini
2020-07-10 16:02               ` Eduardo Habkost
2020-07-10 16:49                 ` Paolo Bonzini
2020-07-10  7:21           ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).