From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 5/24] Introduce vmcs12: a VMCS structure for L1
Date: Mon, 09 Aug 2010 23:24:43 -0400
Message-ID: <4C60C67B.9040107@redhat.com>
References: <1276431753-nyh@il.ibm.com>
 <201006131225.o5DCP79H012922@rice.haifa.ibm.com>
 <4C15E95D.9000300@redhat.com>
 <20100622145441.GA23496@fermat.math.technion.ac.il>
 <20100622165322.GA29629@fermat.math.technion.ac.il>
 <4C21C0B6.50404@redhat.com>
 <20100808150928.GA17166@fermat.math.technion.ac.il>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm-devel
To: "Nadav Har'El"
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:8658 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1755493Ab0HJDZX (ORCPT ); Mon, 9 Aug 2010 23:25:23 -0400
In-Reply-To: <20100808150928.GA17166@fermat.math.technion.ac.il>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 08/08/2010 11:09 AM, Nadav Har'El wrote:
>
>>> +page table (with bypass_guest_pf disabled).
>> Might as well remove this, since nvmx will not be merged with such a
>> gaping hole.
>>
>> In theory I ought to reject anything that doesn't comply with the spec.
>> In practice I'll accept deviations from the spec, so long as
>>
>> - those features aren't used by common guests
>> - when the features are attempted to be used, kvm will issue a warning
> Ok, I plugged the big gaping hole and left a small invisible hole ;-)
>
> The situation now is that you no longer have to run kvm with bypass_guest_pf,
> not on L0 and not on L1. L1 guests will run normally, possibly with
> bypass_guest_pf enabled. However, when L2 guests run every page-fault will
> cause an exit - regardless of what L0 or L1 tried to define via
> PFEC_MASK, PFEC_MATCH and EB[pf].
>
> The reason why I said there is a "small hole" left is that now there is the
> possibility that we inject L1 with a page fault that it didn't expect to get.
> But in practice, this does not seem to cause any problems for either KVM
> or VMWare Server.

Not nice, but acceptable. Spurious page faults are accepted by guests
since they're often the result of concurrent faults on the same address.

>> I don't think PFEC matching ought to present any implementation difficulty.
> Well, it is more complicated than it first appeared (at least to me).
> One problem is that there is no real way (at least none that I thought of)
> to "or" the pf-trapping desires of L0 and L1.

If they use the same "sense" (bit 14 of EXCEPTION_BITMAP), you can AND
the two PFEC_MASKs, and drop any bits remaining where PFEC_MATCH is
different. Not worth it, probably.

> I solved this by trapping all
> page faults, which is unfortunate. The second problem, related to the first
> one, is that when L0 gets a page fault while running L2, it is now quite
> difficult to figure out whether it should be injected into L1, i.e., whether
> L1 asked for this specific page-fault trap to happen. We need to check whether
> the page_fault_error_code matches the L1-specified pfec_mask and pfec_match
> (and eb.pf), but it's actually more complicated, because the
> page_fault_error_code we got from the processor refers to the shadow page
> tables, and we need to translate it back to what it would mean for L1's page
> tables.

You can recover the original PFEC by doing a walk_addr().

> Doing this correctly would require me to spend quite a bit more time to
> understand exactly how the shadow page tables code works, and I hesitate
> whether I should do that now, when I know that common guest hypervisors
> work perfectly without fixing this issue, and when most people would rather
> use EPT and not shadow page tables anyway.
>
> In any case, I left a TODO in the code about this, so it won't be forgotten.

Sure.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
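
For reference, the check discussed above (does a given page-fault error code match what L1 asked to trap via PFEC_MASK, PFEC_MATCH and EB[pf]?) can be sketched roughly as below. This is only an illustration of the architectural rule, not code from the patch: the function name pf_causes_exit and the macro EB_PF_BIT are made up here, and the pfec passed in is assumed to already be the error code as L1 would see it (e.g. recovered with walk_addr()), not the one reported against the shadow page tables.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 14 of the exception bitmap controls page faults (#PF is vector 14). */
#define EB_PF_BIT (1u << 14)

/*
 * Hypothetical helper: returns true if a page fault with error code
 * 'pfec' should cause a VM exit under the given controls.
 *
 * If (pfec & mask) == match, bit 14 of the exception bitmap has its
 * normal meaning (1 = exit). If they differ, the meaning of bit 14 is
 * reversed. Both cases collapse into one equality test.
 */
bool pf_causes_exit(uint32_t exception_bitmap, uint32_t pfec_mask,
                    uint32_t pfec_match, uint32_t pfec)
{
    bool eb_pf = (exception_bitmap & EB_PF_BIT) != 0;
    bool matches = (pfec & pfec_mask) == pfec_match;

    return matches == eb_pf;
}
```

With mask and match both zero every error code matches, so EB[pf] alone decides, which is the "trap everything" or "trap nothing" configuration.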