public inbox for kvm@vger.kernel.org
From: Avi Kivity <avi.kivity@gmail.com>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	"kvm list" <kvm@vger.kernel.org>,
	"Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: 2 CPU Conformance Issue in KVM/x86
Date: Mon, 09 Mar 2015 21:19:29 +0200
Message-ID: <54FDF241.8080002@gmail.com>
In-Reply-To: <13DCF857-5591-4499-9B0D-4165268E9CE8@gmail.com>

On 03/09/2015 09:07 PM, Nadav Amit wrote:
> Avi Kivity <avi.kivity@gmail.com> wrote:
>
>> On 03/09/2015 07:51 PM, Nadav Amit wrote:
>>> Avi Kivity <avi.kivity@gmail.com> wrote:
>>>
>>>> On 03/03/2015 11:52 AM, Paolo Bonzini wrote:
>>>>>> In this
>>>>>> case, the VM might expect exceptions when PTE bits which are higher than the
>>>>>> maximum (reported) address width are set, and it would not get such
>>>>>> exceptions. This problem can easily be reproduced with a small change to the
>>>>>> existing KVM unit-tests.
>>>>>>
>>>>>> There are many variants to this problem, and the only solution which I
>>>>>> consider complete is to report the maximum (52) physical address width
>>>>>> to the VM, configure the VM to exit on #PF with reserved-bit
>>>>>> error-codes, and then emulate these faulting instructions.
>>>>> Not even that would be a definitive solution.  If the guest tries to map
>>>>> RAM (e.g. a PCI BAR that is backed by RAM) above the host MAXPHYADDR,
>>>>> you would get EPT misconfiguration vmexits.
>>>>>
>>>>> I think there is no way to emulate physical address width correctly,
>>>>> except by disabling EPT.
>>>> Is the issue emulating a higher MAXPHYADDR on the guest than is available
>>>> on the host? I don't think there's any need to support that.
>>>>
>>>> Emulating a lower setting on the guest than is available on the host is, I
>>>> think, desirable. Whether it would work depends on the relative priority
>>>> of EPT misconfiguration exits vs. page table permission faults.
>>> Thanks for the feedback.
>>>
>>> Guest page-table permission faults take priority over EPT misconfiguration.
>>> KVM can even be set to trap page-table permission faults, at least in VT-x.
>>> Anyhow, I don’t think it is enough.
>> Why is it not enough? If you trap a permission fault, you can inject any exception error code you like.
> Because there is no real permission fault. In the following example, the VM
> expects one (VM’s MAXPHYADDR=40), but there isn’t (Host’s MAXPHYADDR=46), so
> the hypervisor cannot trap it. It can only trap all #PF, which is obviously
> too intrusive.

There are three cases:

1) The guest has marked the page as not present.  In this case the
reserved-bit flag is never reported, and the guest should receive its
#PF as is.
2) The page is present and the permissions are sufficient.  In this
case, you will get an EPT misconfiguration and can proceed to inject a
#PF with the reserved bit flag set.
3) The page is present but permissions are not sufficient.  In this case
you can trap the fault via the PFEC_MASK field in the VMCS and inject a
#PF, with the reserved bit flag added, to the guest.

So you can emulate it and only trap permission faults.  It's still too
expensive though.
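
The three cases above can be sketched as a small decision function.
This is only an illustration, not KVM code: the helper names
(pf_causes_vmexit, fault_seen_by_guest) are hypothetical, and it
assumes the guest PTE carries a bit that is reserved for the guest's
MAXPHYADDR but not for the host's, with EPT set up so that the
resulting guest-physical address triggers a misconfiguration exit.
The #PF error-code bit positions and the VT-x mask/match rule are the
architectural ones.

```python
# #PF error-code bits (x86 architectural definitions).
PFEC_P = 1 << 0     # fault on a present page (protection violation)
PFEC_W = 1 << 1     # write access
PFEC_U = 1 << 2     # user-mode access
PFEC_RSVD = 1 << 3  # reserved bit set in a paging-structure entry


def pf_causes_vmexit(pfec, mask, match, bitmap_bit14=True):
    """VT-x #PF filter: with exception-bitmap bit 14 set, a #PF exits
    only when (pfec & mask) == match; with it clear, the sense inverts."""
    return ((pfec & mask) == match) == bitmap_bit14


def fault_seen_by_guest(pte_present, perms_ok, access_bits):
    """Hypothetical helper: (path, error code) for an access through a
    guest PTE that has a guest-reserved bit set.

    access_bits holds only the W/U bits describing the access itself.
    The filter is programmed as mask = match = PFEC_P, i.e. trap only
    faults on present pages.
    """
    if not pte_present:
        # Case 1: hardware delivers the #PF directly, P=0, no RSVD.
        hw_pfec = access_bits
        assert not pf_causes_vmexit(hw_pfec, PFEC_P, PFEC_P)
        return ("hardware #PF, no exit", hw_pfec)
    if perms_ok:
        # Case 2: the walk succeeds, the EPT misconfiguration exit fires,
        # and the hypervisor injects a #PF with RSVD set.
        return ("EPT misconfig exit, inject #PF",
                access_bits | PFEC_P | PFEC_RSVD)
    # Case 3: hardware raises a permission #PF (P=1) without RSVD; the
    # filter turns it into an exit and the hypervisor adds RSVD.
    hw_pfec = access_bits | PFEC_P
    assert pf_causes_vmexit(hw_pfec, PFEC_P, PFEC_P)
    return ("trapped #PF, inject #PF", hw_pfec | PFEC_RSVD)


if __name__ == "__main__":
    for present, ok in [(False, True), (True, True), (True, False)]:
        path, code = fault_seen_by_guest(present, ok, PFEC_U)
        print(f"present={present} perms_ok={ok}: {path}, error code {code:#x}")
```

Case 3 is the expensive one: every permission fault on a present page
in the guest becomes an exit, whether or not a reserved bit is
involved.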


>>>   Here is an example
>>>
>>> My machine has MAXPHYADDR of 46. I modified kvm-unit-tests access test to
>>> set pte.45 instead of pte.51, which from the VM point-of-view should cause
>>> the #PF error-code to indicate that reserved bits are set (just as pte.51 does).
>>> Here is one error from the log:
>>>
>>> test pte.p pte.45 pde.p user: FAIL: error code 5 expected d
>>> Dump mapping: address: 123400000000
>>> ------L4: 304b007
>>> ------L3: 304c007
>>> ------L2: 304d001
>>> ------L1: 200002000001
>> This is with an ept misconfig programmed into that address, yes?
> A reserved bit in the PTE is set - from the VM point-of-view. If there
> wasn’t another cause for #PF, it would lead to EPT violation/misconfig.
>
>>> As you can see, the #PF should have had two reasons: reserved bits, and user
>>> access to supervisor only page. The error-code however does not indicate the
>>> reserved-bits are set.
>>>
>>> Note that KVM did not get an exit on that faulting instruction. Had it
>>> trapped one, it would have tried to emulate the instruction and, assuming
>>> the instruction is supported (and the #PF was not on an instruction fetch),
>>> it should have been able to emulate the #PF correctly.
>>> [ The test actually crashes soon after this error due to these reasons. ]
>>>
>>> Anyhow, that is the reason for me to assume that having the maximum
>>> MAXPHYADDR is better.
>> Well, that doesn't work for the reasons Paolo noted.  The guest can have an
>> ivshmem device attached, and map it above a host-supported physical address,
>> and suddenly it goes slow.
> I fully understand. That’s the reason I don’t have a reasonable solution.
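
For reference, the numbers in the quoted test log can be checked
mechanically.  The sketch below is illustrative only: the helper names
are made up, and the MAXPHYADDR values (40 for the guest, 46 for the
host) are the ones from the example quoted above.  It decodes the two
error codes and tests the L1 entry against the reserved address bits
for each width.

```python
# Architectural #PF error-code bits, lowest bit first.
PFEC_NAMES = [(1 << 0, "P"), (1 << 1, "W"), (1 << 2, "U"),
              (1 << 3, "RSVD"), (1 << 4, "I")]


def decode_pfec(code):
    """Return the names of the error-code bits set in `code`."""
    return [name for bit, name in PFEC_NAMES if code & bit]


def reserved_addr_bits(maxphyaddr):
    """Mask of PTE address bits maxphyaddr..51, which are reserved."""
    return ((1 << 52) - 1) & ~((1 << maxphyaddr) - 1)


L1_PTE = 0x200002000001  # "L1" entry from the quoted dump: pte.p + pte.45

if __name__ == "__main__":
    print("observed 0x5:", decode_pfec(0x5))
    print("expected 0xd:", decode_pfec(0xd))
    for who, width in [("guest", 40), ("host", 46)]:
        hit = L1_PTE & reserved_addr_bits(width)
        print(f"{who} MAXPHYADDR={width}: reserved bits hit = {hit:#x}")
```

With a width of 40, bit 45 falls inside the reserved range; with a
width of 46 it does not, which is why the hardware reports only the
user/supervisor violation.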

I can't think of one with reasonable performance either.  Perhaps the
maintainers could raise the issue with Intel.  It looks academic but it
can happen in real life -- KVM for example used to rely on reserved-bit
faults (it set all the bits in the PTE, so it would not have been hit by
this problem).
