From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gleb Natapov
Subject: Re: [PATCH v3 1/6] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state
Date: Tue, 3 Sep 2013 20:55:44 +0300
Message-ID: <20130903175544.GP10142@redhat.com>
References: <93ce2376292d9d6fc7a4f4d53919b0a07d4e7859.1375971992.git.jan.kiszka@siemens.com> <20130902082136.GM22899@redhat.com> <5224552D.3080700@siemens.com> <20130902093627.GO22899@redhat.com> <52262009.9090401@siemens.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Paolo Bonzini , kvm , Xiao Guangrong , Jun Nakajima , Yang Zhang , Arthur Chunqi Li
To: Jan Kiszka
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:43553 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758380Ab3ICRzw (ORCPT ); Tue, 3 Sep 2013 13:55:52 -0400
Content-Disposition: inline
In-Reply-To: <52262009.9090401@siemens.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Tue, Sep 03, 2013 at 07:44:41PM +0200, Jan Kiszka wrote:
> On 2013-09-02 11:36, Gleb Natapov wrote:
> > On Mon, Sep 02, 2013 at 11:06:53AM +0200, Jan Kiszka wrote:
> >> On 2013-09-02 10:21, Gleb Natapov wrote:
> >>> On Thu, Aug 08, 2013 at 04:26:28PM +0200, Jan Kiszka wrote:
> >>>> Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the
> >>> Not a typo :) That is what Avi asked for during initial nested VMX
> >>> review: http://markmail.org/message/hhidqyhbo2mrgxxc
> >>
> >> Yeah, should rephrase this.
> >>
> >>>
> >>> But there is at least one transition check that kvm_set_cr0() does that
> >>> should not be done during vmexit emulation, namely CS.L bit check, so I
> >>> tend to agree that kvm_set_cr0() is not appropriate here, at least not as
> >>> it is.
> >>
> >> kvm_set_cr0() is for emulating explicit guest changes. It is not the
> >> proper interface for implicit, vendor-dependent changes like this one.
> >>
> > Agree, the problem is that we do not have a proper interface for implicit
> > changes like this one (do not see why it is vendor-dependent, SVM also
> > restores host state in a similar way).
> >
> >>> But can we skip other checks kvm_set_cr0() does? For instance
> >>> what prevents us from loading CR0.PG = 1 EFER.LME = 1 and CR4.PAE = 0
> >>> during nested vmexit? What _should_ prevent it is vmentry check from
> >>> 26.2.4
> >>>
> >>> If the "host address-space size" VM-exit control is 1, the following
> >>> must hold:
> >>> - Bit 5 of the CR4 field (corresponding to CR4.PAE) is 1.
> >>>
> >>> But I do not see that we do that check on vmentry.
> >>>
> >>> What about NW/CD bit checks, or reserved bits checks? 27.5.1 says:
> >>> The following bits are not modified:
> >>> For CR0, ET, CD, NW; bits 63:32 (on processors that support Intel 64
> >>> architecture), 28:19, 17, and 15:6; and any bits that are fixed in
> >>> VMX operation (see Section 23.8).
> >>>
> >>> But again current vmexit code does not emulate this properly and just
> >>> sets everything from host_cr0. vmentry should also preserve all those
> >>> bits but it looks like it doesn't either.
> >>>
> >>
> >> Yes, there is surely more to improve. Do you think the lacking checks
> >> can cause troubles for L0, or is this just imprecise emulation that can
> >> be addressed separately?
> >>
> > The lacking checks may cause L0 to fail guest entry which will trigger
> > internal error. If it is exploitable by L0 userspace it is a serious
> > problem, if only L0 kernel can trigger it then less so. I remember Avi
> > was concerned that KVM code may depend on all registers to be consistent
> > otherwise it can be exploited, I cannot prove or disprove this theory
> > :), but if it is the case then even L0 kernel case is problematic.
>
> So how to proceed with this?
>
Looking at the set_sreg code, it looks like we can already create
inconsistent state there, so I will apply 1,2,4,6 of this series and hope
that the CR0 loading bugs I listed above will eventually be fixed on top :)

Can you rephrase the commit message for patch 1?

--
Gleb.
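
For reference, the two rules quoted above (the CR4.PAE check from 26.2.4 and the CR0 bits that 27.5.1 says are not modified) can be sketched in plain C as below. This is only an illustration under stated assumptions, not KVM code: the macro and function names are hypothetical, the bit positions for ET (4), NW (29) and CD (30) are the architectural ones, and the "bits that are fixed in VMX operation" part of 27.5.1 is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with bits hi..lo set (inclusive), for 0 <= lo <= hi <= 63. */
#define BIT_RANGE(hi, lo) (((~0ULL) >> (63 - (hi))) & ((~0ULL) << (lo)))

#define CR0_ET  (1ULL << 4)
#define CR0_NW  (1ULL << 29)
#define CR0_CD  (1ULL << 30)
#define CR4_PAE (1ULL << 5)

/*
 * CR0 bits that 27.5.1 lists as not modified on VM exit:
 * ET, CD, NW, bits 63:32, 28:19, 17 and 15:6.
 * Simplification (assumption): bits fixed in VMX operation are ignored.
 */
#define CR0_VMEXIT_PRESERVED_MASK \
	(CR0_ET | CR0_CD | CR0_NW | BIT_RANGE(63, 32) | \
	 BIT_RANGE(28, 19) | (1ULL << 17) | BIT_RANGE(15, 6))

/*
 * Hypothetical helper: keep the preserved bits from the current CR0
 * and take all remaining bits from the host-state CR0 field.
 */
uint64_t cr0_after_vmexit(uint64_t current_cr0, uint64_t host_cr0)
{
	return (current_cr0 & CR0_VMEXIT_PRESERVED_MASK) |
	       (host_cr0 & ~CR0_VMEXIT_PRESERVED_MASK);
}

/*
 * Hypothetical helper for the 26.2.4 rule: if the "host address-space
 * size" VM-exit control is 1, bit 5 (PAE) of the host CR4 field must be 1.
 */
bool host_cr4_pae_ok(bool host_addr_space_size, uint64_t host_cr4)
{
	return !host_addr_space_size || (host_cr4 & CR4_PAE) != 0;
}
```

With this mask, only PG, AM, WP, NE, TS, EM, MP and PE come from host_cr0; a vmentry check built on host_cr4_pae_ok() would reject the CR0.PG = 1, EFER.LME = 1, CR4.PAE = 0 combination before it can be loaded on nested vmexit.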