Re: 2 CPU Conformance Issue in KVM/x86

From: Avi Kivity <avi.kivity@gmail.com>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	"kvm list" <kvm@vger.kernel.org>,
	"Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: 2 CPU Conformance Issue in KVM/x86
Date: Mon, 09 Mar 2015 21:19:29 +0200
Message-ID: <54FDF241.8080002@gmail.com>
In-Reply-To: <13DCF857-5591-4499-9B0D-4165268E9CE8@gmail.com>

On 03/09/2015 09:07 PM, Nadav Amit wrote:
> Avi Kivity <avi.kivity@gmail.com> wrote:
>
>> On 03/09/2015 07:51 PM, Nadav Amit wrote:
>>> Avi Kivity <avi.kivity@gmail.com> wrote:
>>>
>>>> On 03/03/2015 11:52 AM, Paolo Bonzini wrote:
>>>>>> In this
>>>>>> case, the VM might expect exceptions when PTE bits which are higher than the
>>>>>> maximum (reported) address width are set, and it would not get such
>>>>>> exceptions. This problem can easily be experienced by a small change to the
>>>>>> existing KVM unit-tests.
>>>>>>
>>>>>> There are many variants to this problem, and the only solution which I
>>>>>> consider complete is to report the maximum (52) physical address
>>>>>> width to the VM, configure the VM to exit on #PF with reserved-bit
>>>>>> error-codes, and then emulate these faulting instructions.
>>>>> Not even that would be a definitive solution.  If the guest tries to map
>>>>> RAM (e.g. a PCI BAR that is backed by RAM) above the host MAXPHYADDR,
>>>>> you would get EPT misconfiguration vmexits.
>>>>>
>>>>> I think there is no way to emulate physical address width correctly,
>>>>> except by disabling EPT.
>>>> Is the issue emulating a higher MAXPHYADDR on the guest than is available
>>>> on the host? I don't think there's any need to support that.
>>>>
>>>> Emulating a lower setting on the guest than is available on the host is, I
>>>> think, desirable. Whether it would work depends on the relative priority
>>>> of EPT misconfiguration exits vs. page table permission faults.
>>> Thanks for the feedback.
>>>
>>> Guest page-table permission faults take priority over EPT misconfiguration.
>>> KVM can even be set to trap page-table permission faults, at least in VT-x.
>>> Anyhow, I don’t think it is enough.
>> Why is it not enough? If you trap a permission fault, you can inject any exception error code you like.
> Because there is no real permission fault. In the following example, the VM
> expects one (VM’s MAXPHYADDR=40), but there isn’t (Host’s MAXPHYADDR=46), so
> the hypervisor cannot trap it. It can only trap all #PF, which is obviously
> too intrusive.

There are three cases:

1) The guest has marked the page as not present.  In this case, the
reserved-bit flag is not set in the error code and the guest should
receive its #PF.
2) The page is present and the permissions are sufficient.  In this
case, you will get an EPT misconfiguration and can proceed to inject a
#PF with the reserved-bit flag set.
3) The page is present but the permissions are not sufficient.  In this
case you can trap the fault via the VMCS PFEC_MASK/PFEC_MATCH fields and
inject a #PF to the guest.

So you can emulate it and only trap permission faults.  It's still too
expensive, though.
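
To make the three cases concrete, here is a rough sketch of the decision
in C.  The enum, the struct and classify() are names I made up for
illustration, not KVM code; only the PFERR_* bit values are
architectural.  It assumes that a present PTE carries a bit that is
reserved for the guest but valid for the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 #PF error-code bits. */
#define PFERR_PRESENT 0x1u
#define PFERR_WRITE   0x2u
#define PFERR_USER    0x4u
#define PFERR_RSVD    0x8u

/* Hypothetical labels for what the hypervisor sees in each case. */
enum exit_kind {
    EXIT_NONE,          /* case 1: hardware delivers #PF to the guest */
    EXIT_EPT_MISCONFIG, /* case 2: hypervisor injects the #PF itself */
    EXIT_TRAPPED_PF     /* case 3: #PF intercepted, error code rewritten */
};

struct outcome {
    enum exit_kind exit;
    uint32_t guest_error_code; /* error code the guest should observe */
};

/*
 * access: PFERR_WRITE and/or PFERR_USER, describing the faulting access.
 * Assumption: whenever the PTE is present, it has a bit set that is
 * reserved from the guest's point of view but valid on the host.
 */
static struct outcome classify(bool pte_present, bool perms_ok,
                               uint32_t access)
{
    struct outcome o;

    assert((access & ~(PFERR_WRITE | PFERR_USER)) == 0);

    if (!pte_present) {
        /* Case 1: not-present fault, no reserved-bit flag. */
        o.exit = EXIT_NONE;
        o.guest_error_code = access;
    } else if (perms_ok) {
        /* Case 2: the access reaches EPT and hits the misconfiguration. */
        o.exit = EXIT_EPT_MISCONFIG;
        o.guest_error_code = access | PFERR_PRESENT | PFERR_RSVD;
    } else {
        /* Case 3: permission fault; add the flag hardware left out. */
        o.exit = EXIT_TRAPPED_PF;
        o.guest_error_code = access | PFERR_PRESENT | PFERR_RSVD;
    }
    return o;
}
```

For a user-mode read of a present, supervisor-only page,
classify(true, false, PFERR_USER) lands in case 3 with error code 0xd,
where hardware alone would deliver 0x5.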


>>>   Here is an example
>>>
>>> My machine has a MAXPHYADDR of 46. I modified the kvm-unit-tests access test
>>> to set pte.45 instead of pte.51, which from the VM's point of view should make
>>> the #PF error code indicate that reserved bits are set (just as pte.51 does).
>>> Here is one error from the log:
>>>
>>> test pte.p pte.45 pde.p user: FAIL: error code 5 expected d
>>> Dump mapping: address: 123400000000
>>> ------L4: 304b007
>>> ------L3: 304c007
>>> ------L2: 304d001
>>> ------L1: 200002000001
>> This is with an ept misconfig programmed into that address, yes?
> A reserved bit in the PTE is set - from the VM point-of-view. If there
> wasn’t another cause for #PF, it would lead to EPT violation/misconfig.
>
>>> As you can see, the #PF should have had two reasons: reserved bits, and user
>>> access to supervisor only page. The error-code however does not indicate the
>>> reserved-bits are set.
>>>
>>> Note that KVM did not trap any exit on that faulting instruction, as
>>> otherwise it would try to emulate the instruction and assuming it is
>>> supported (and that the #PF was not on an instruction fetch), should be able
>>> to emulate the #PF correctly.
>>> [ The test actually crashes soon after this error due to these reasons. ]
>>>
>>> Anyhow, that is the reason for me to assume that having the maximum
>>> MAXPHYADDR is better.
>> Well, that doesn't work for the reasons Paolo noted.  The guest can have an ivshmem device attached, and map it above a host-supported virtual address, and suddenly it goes slow.
> I fully understand. That’s the reason I don’t have a reasonable solution.
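
For reference, the mismatch in the quoted example can be checked with a
few lines of C.  This is only a sketch: the helper names are mine, and
it looks solely at the physical-address bits MAXPHYADDR..51 of a 4K PTE,
ignoring large-page reserved bits and the NX bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bits maxphyaddr..51 of a paging-structure entry are reserved.
 * Hypothetical helper, not taken from KVM.
 */
static uint64_t rsvd_addr_bits(unsigned maxphyaddr)
{
    assert(maxphyaddr > 0 && maxphyaddr <= 52);

    uint64_t upto51 = (1ULL << 52) - 1;         /* bits 0..51 */
    uint64_t valid  = (1ULL << maxphyaddr) - 1; /* bits 0..maxphyaddr-1 */

    return upto51 & ~valid;
}

/* True if the entry sets an address bit that is reserved at this width. */
static bool pte_has_rsvd(uint64_t pte, unsigned maxphyaddr)
{
    return (pte & rsvd_addr_bits(maxphyaddr)) != 0;
}
```

With the L1 entry from the dump, 0x200002000001 (bit 45 set),
pte_has_rsvd() is true for a MAXPHYADDR of 40 and false for 46, which is
why the guest expects the reserved-bit flag and the host hardware never
raises it.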

I can't think of one with reasonable performance either.  Perhaps the
maintainers could raise the issue with Intel.  It looks academic, but it
can happen in real life -- KVM, for example, used to rely on reserved-bit
faults (it set all bits in the PTE, so it wouldn't have been caught by this).
