From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH 5/24] Introduce vmcs12: a VMCS structure for L1
Date: Mon, 09 Aug 2010 23:24:43 -0400
Message-ID: <4C60C67B.9040107@redhat.com>
References: <1276431753-nyh@il.ibm.com> <201006131225.o5DCP79H012922@rice.haifa.ibm.com> <4C15E95D.9000300@redhat.com> <20100622145441.GA23496@fermat.math.technion.ac.il> <20100622165322.GA29629@fermat.math.technion.ac.il> <4C21C0B6.50404@redhat.com> <20100808150928.GA17166@fermat.math.technion.ac.il>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm-devel <kvm@vger.kernel.org>
To: "Nadav Har'El" <nyh@math.technion.ac.il>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:8658 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755493Ab0HJDZX (ORCPT <rfc822;kvm@vger.kernel.org>);
	Mon, 9 Aug 2010 23:25:23 -0400
In-Reply-To: <20100808150928.GA17166@fermat.math.technion.ac.il>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

  On 08/08/2010 11:09 AM, Nadav Har'El wrote:
>
>>> +page table (with bypass_guest_pf disabled).
>> Might as well remove this, since nvmx will not be merged with such a
>> gaping hole.
>>
>> In theory I ought to reject anything that doesn't comply with the spec.
>> In practice I'll accept deviations from the spec, so long as
>>
>> - those features aren't used by common guests
>> - when a guest attempts to use those features, kvm will issue a warning
> Ok, I plugged the big gaping hole and left a small invisible hole ;-)
>
> The situation now is that you no longer have to run kvm with bypass_guest_pf,
> not on L0 and not on L1. L1 guests will run normally, possibly with
> bypass_guest_pf enabled. However, when L2 guests run every page-fault will
> cause an exit - regardless of what L0 or L1 tried to define via
> PFEC_MASK, PFEC_MATCH and EB[pf].
>
> The reason why I said there is a "small hole" left is that now there is the
> possibility that we inject L1 with a page fault that it didn't expect to get.
> But in practice, this does not seem to cause any problems for either KVM
> or VMware Server.

Not nice, but acceptable.  Spurious page faults are accepted by guests 
since they're often the result of concurrent faults on the same address.

>> I don't think PFEC matching ought to present any implementation difficulty.
> Well, it is more complicated than it first appeared (at least to me).
> One problem is that there is no real way (at least none that I thought of)
> to "or" the pf-trapping desires of L0 and L1.

If they use the same "sense" (bit 14 of EXCEPTION_BITMAP), you can AND 
the two PFEC_MASKs, and drop any bits remaining where PFEC_MATCH is 
different.  Not worth it, probably.
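
A rough sketch of that merge, assuming both L0 and L1 set bit 14 (exit when
the masked PFEC equals PFEC_MATCH).  The struct and function names are made
up for illustration and are not from the patch set.  The merged filter fires
whenever either original filter would, and may also fire on faults neither
asked for:

```c
#include <stdint.h>

/* Hypothetical container for one hypervisor's #PF filter (EB bit 14 assumed set). */
struct pf_filter {
	uint32_t mask;   /* PFEC_MASK  */
	uint32_t match;  /* PFEC_MATCH */
};

/* Exit-on-match test: does this error code satisfy the filter? */
static int pf_filter_hits(struct pf_filter f, uint32_t pfec)
{
	return (pfec & f.mask) == f.match;
}

/*
 * Merge two exit-on-match filters into one that hits whenever a or b hits.
 * The result is conservative: it can hit on error codes neither one wanted.
 */
static struct pf_filter pf_filter_merge(struct pf_filter a, struct pf_filter b)
{
	struct pf_filter r;

	r.mask = a.mask & b.mask;        /* keep only bits both filters test */
	r.mask &= ~(a.match ^ b.match);  /* drop bits where the expected values differ */
	r.match = a.match & r.mask;      /* on the remaining bits a and b agree */
	return r;
}
```

The extra exits then have to be filtered in software, which is part of why
this is probably not worth it.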

>   I solved this by trapping all
> page faults, which is unfortunate. The second problem, related to the first
> one, is that when L0 gets a page fault while running L2, it is now quite
> difficult to figure out whether it should be injected into L1, i.e., whether
> L1 asked for this specific page-fault trap to happen. We need to check whether
> the page_fault_error_code matches the L1-specified pfec_mask and pfec_match
> (and eb.pf), but it's actually more complicated, because the
> page_fault_error_code we got from the processor refers to the shadow page
> tables, and we need to translate it back to what it would mean for L1's page
> tables.

You can recover the original PFEC by doing a walk_addr().
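
Once the PFEC is expressed in terms of L1's page tables, the check itself is
short.  A minimal sketch, with a hypothetical function name and plain integer
arguments standing in for the vmcs12 fields: a match against PFEC_MASK and
PFEC_MATCH causes an exit when EB[pf] is set, and a mismatch causes an exit
when EB[pf] is clear.

```c
#include <stdbool.h>
#include <stdint.h>

#define PF_VECTOR 14  /* #PF is exception vector 14, so EB[pf] is bit 14 */

/*
 * Hypothetical helper: would L1's settings turn this page fault into an
 * exit?  pfec must already be the error code as L1 would see it.
 */
static bool l1_wants_pf_exit(uint32_t exception_bitmap, uint32_t pfec_mask,
			     uint32_t pfec_match, uint32_t pfec)
{
	bool matches = (pfec & pfec_mask) == pfec_match;
	bool eb_pf = (exception_bitmap >> PF_VECTOR) & 1;

	/* EB[pf] set: exit on match.  EB[pf] clear: exit on mismatch. */
	return matches == eb_pf;
}
```

With PFEC_MASK and PFEC_MATCH both zero every error code matches, so EB[pf]
alone decides, which is the common configuration.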

> Doing this correctly would require me to spend quite a bit more time to
> understand exactly how the shadow page tables code works, and I am hesitant
> to do that now, when I know that common guest hypervisors work perfectly
> without fixing this issue, and when most people would rather use EPT and
> not shadow page tables anyway.
>
> In any case, I left a TODO in the code about this, so it won't be forgotten.

Sure.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.