From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Huang, Kai"
Subject: Re: [intel-sgx-kernel-dev] [PATCH 08/10] kvm: vmx: add guest's IA32_SGXLEPUBKEYHASHn runtime switch support
Date: Wed, 7 Jun 2017 10:51:51 +1200
Message-ID: <90924786-c132-524e-177f-7af582d49c85@linux.intel.com>
References: <20170508052434.3627-1-kai.huang@linux.intel.com> <20170508052434.3627-9-kai.huang@linux.intel.com> <58dcdb2d-6894-b0a3-8d6f-2ab752fd6d22@linux.intel.com> <6ab7ec4e-e0fa-af47-11b2-f26edcb088fb@linux.intel.com> <596dc1ad-eac7-798d-72e5-665eb7f3f2e4@linux.intel.com> <0b4697b9-0976-c8ad-e26f-4ff683318486@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Kai Huang, Paolo Bonzini, Radim Krcmar, kvm list, "intel-sgx-kernel-dev@lists.01.org", haim.cohen@intel.com, Jarkko Sakkinen
To: Andy Lutomirski
Return-path:
Received: from mga06.intel.com ([134.134.136.31]:13956 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751419AbdFFWv7 (ORCPT); Tue, 6 Jun 2017 18:51:59 -0400
In-Reply-To:
Content-Language: en-US
Sender: kvm-owner@vger.kernel.org
List-ID:

On 6/7/2017 9:22 AM, Andy Lutomirski wrote:
> On Tue, Jun 6, 2017 at 1:52 PM, Huang, Kai wrote:
>>
>> On 5/18/2017 7:45 PM, Huang, Kai wrote:
>>>
>>> On 5/17/2017 12:09 PM, Andy Lutomirski wrote:
>>>>
>>>> On Mon, May 15, 2017 at 5:48 PM, Huang, Kai wrote:
>>>>>
>>>>> On 5/12/2017 6:11 PM, Andy Lutomirski wrote:
>>>>>>
>>>>>> On Thu, May 11, 2017 at 9:56 PM, Huang, Kai wrote:
>>>>>>>
>>>>>>> I am not sure whether writing to the 4 MSRs would be *extremely*
>>>>>>> slow, as when the vcpu is scheduled in, KVM is already doing
>>>>>>> vmcs_load, writing to several MSRs, etc.
>>>>>>
>>>>>> I'm speculating that these MSRs may be rather unoptimized and hence
>>>>>> unusually slow.
>>>>>>
>>>>>>>
>>>>>>>> Have a percpu variable that stores the current SGXLEPUBKEYHASH
>>>>>>>> along with whatever lock is needed (probably just a mutex). Users
>>>>>>>> of EINIT will take the mutex, compare the percpu variable to the
>>>>>>>> desired value, and, if it's different, do WRMSR and update the
>>>>>>>> percpu variable.
>>>>>>>>
>>>>>>>> KVM will implement writes to SGXLEPUBKEYHASH by updating its
>>>>>>>> in-memory state but *not* changing the MSRs. KVM will trap and
>>>>>>>> emulate EINIT to support the same handling as the host. There is
>>>>>>>> no action required at all on KVM guest entry and exit.
>>>>>>>
>>>>>>> This is doable, but the SGX driver needs to do those things and
>>>>>>> expose interfaces for KVM to use. In terms of the percpu data, it
>>>>>>> is nice to have, but I am not sure whether it is mandatory, as IMO
>>>>>>> EINIT is not even in the performance-critical path. We can simply
>>>>>>> read the old value out of the MSRs and compare whether the old
>>>>>>> equals the new.
>>>>>>
>>>>>> I think the SGX driver should probably live in arch/x86, and the
>>>>>> interface could be a simple percpu variable that is exported (from
>>>>>> the main kernel image, not from a module).
>>>>>>
>>>>>>>
>>>>>>>> FWIW, I think that KVM will, in the long run, want to trap EINIT
>>>>>>>> for other reasons: someone is going to want to implement policy
>>>>>>>> for what enclaves are allowed that applies to guests as well as
>>>>>>>> the host.
>>>>>>>
>>>>>>> I am not very convinced why "what enclaves are allowed" in the
>>>>>>> host would apply to the guest. Can you elaborate? I mean, in
>>>>>>> general, virtualization just focuses on emulating hardware
>>>>>>> behavior. If a native machine is able to run any LE, the virtual
>>>>>>> machine should be able to as well (of course, with the guest's
>>>>>>> IA32_FEATURE_CONTROL[bit 17] set).
>>>>>>
>>>>>> I strongly disagree. I can imagine two classes of sensible policies
>>>>>> for launch control:
>>>>>>
>>>>>> 1. Allow everything. This seems quite sensible to me.
>>>>>>
>>>>>> 2. Allow some things, and make sure that VMs have at least as
>>>>>> restrictive a policy as host root has. After all, what's the point
>>>>>> of restricting enclaves in the host if host code can simply spawn a
>>>>>> little VM to run otherwise-disallowed enclaves?
>>>>>
>>>>> What's the current SGX driver launch control policy? Yes, allowing
>>>>> everything works for KVM, so let's skip this. Are we going to
>>>>> support allowing several LEs, or just allowing one single LE? I know
>>>>> Jarkko is doing in-kernel LE stuff but I don't know the details.
>>>>>
>>>>> I am trying to find a way that we can both avoid breaking the host
>>>>> launch control policy and stay consistent with HW behavior (from the
>>>>> guest's view). Currently we can create a KVM guest with runtime
>>>>> change to IA32_SGXLEPUBKEYHASHn either enabled or disabled. I
>>>>> introduced a Qemu parameter 'lewr' for this purpose. Actually I
>>>>> introduced the below Qemu SGX parameters for creating a guest:
>>>>>
>>>>> -sgx epc=,lehash='SHA-256 hash',lewr
>>>>>
>>>>> where 'epc' specifies the guest's EPC size, 'lehash' specifies the
>>>>> (initial) value of the guest's IA32_SGXLEPUBKEYHASHn, and 'lewr'
>>>>> specifies whether the guest is allowed to change its
>>>>> IA32_SGXLEPUBKEYHASHn at runtime.
>>>>>
>>>>> If the host only allows one single LE to run, KVM can add a
>>>>> restriction that only allows creating a KVM guest with runtime
>>>>> change to IA32_SGXLEPUBKEYHASHn disabled, so that only the
>>>>> host-allowed (single) hash can be used by the guest. From the
>>>>> guest's view, it simply has IA32_FEATURE_CONTROL[bit 17] cleared and
>>>>> has IA32_SGXLEPUBKEYHASHn with its default value being the
>>>>> host-allowed (single) hash.
>>>>>
>>>>> If the host allows several LEs (but not everything), and if we
>>>>> create a guest with 'lewr', then the behavior is not consistent with
>>>>> HW behavior, as from the guest's view of its hardware, it should be
>>>>> able to run any LE, but we have to tell the guest that it is only
>>>>> allowed to change IA32_SGXLEPUBKEYHASHn to some specific values. One
>>>>> compromise solution is that we don't allow creating a guest with
>>>>> 'lewr' specified, and in the meantime only allow creating a guest
>>>>> with host-approved hashes specified in 'lehash'. This makes the
>>>>> guest's behavior consistent with HW behavior but only allows the
>>>>> guest to run one LE (the one specified by 'lehash' when the guest is
>>>>> created).
>>>>
>>>> I'm not sure I entirely agree, for a couple of reasons.
>>>>
>>>> 1. I wouldn't be surprised if the kernel ends up implementing a
>>>> policy in which it checks all enclaves (not just LEs) for
>>>> acceptability. In fact, if the kernel sticks with the "no LE at all
>>>> or just kernel-internal LE" approach, then checking enclaves directly
>>>> against some admin- or distro-provided signer list seems reasonable.
>>>> This type of policy can't be forwarded to a guest by restricting
>>>> allowed LE signers. But this is mostly speculation since AFAIK no one
>>>> has seriously proposed any particular policy support and the plan was
>>>> to not have this for the initial implementation.
>>>>
>>>> 2. While matching hardware behavior is nice in principle, there
>>>> doesn't seem to be useful hardware behavior to match here. If the
>>>> host had a list of five allowed LE signers, how exactly would it
>>>> restrict the MSRs? They're not written atomically, so you can't
>>>> directly tell what's being written.
>>>
>>> In this case I actually plan to just allow creating a guest with the
>>> guest's runtime change to IA32_SGXLEPUBKEYHASHn disabled (without
>>> 'lewr' specified). If 'lewr' is specified, creating the guest will
>>> fail.
And we only allow creating a guest with
>>> host-allowed hash values (with 'lehash=hash-value'), and if the
>>> 'hash-value' specified by 'lehash' is not allowed by the host, we
>>> also fail to create the guest.
>>>
>>> We can only allow creating a guest with 'lewr' specified when the
>>> host allows anything.
>>>
>>> But in this way, we are restricting the guest OS's ability to run
>>> LEs, as only one LE, the one specified by the 'lehash' parameter, can
>>> be run. But I think this won't hurt much, as multiple guests are
>>> still able to run different LEs?
>>>
>>>> Also, the only way to fail an MSR write is to send #GP, and Windows
>>>> (and even Linux) may not expect that. Linux doesn't panic due to #GP
>>>> on MSR writes these days, but you still get a big fat warning. I
>>>> wouldn't be at all surprised if Windows BSODs.
>>>
>>> We cannot allow writing some particular values to the MSRs
>>> successfully while injecting #GP when writing other values to the
>>> same MSRs. So #GP is not an option.
>>>
>>>> ENCLS[EINIT], on the other hand, returns an actual error code. I'm
>>>> not sure that a sensible error code exists
>>>> ("SGX_HYPERVISOR_SAID_NO?", perhaps),
>>>
>>> It looks like no such error code exists. And we cannot return such an
>>> error code to the guest, as it would only be valid if ENCLS were run
>>> in the hypervisor.
>>>
>>>> but SGX_INVALID_EINITTOKEN seems to mean, more or less, "the CPU
>>>> thinks you're not authorized to do this", so forcing that error code
>>>> could be entirely reasonable.
>>>>
>>>> If the host policy is to allow a list of LE signers, you could
>>>> return SGX_INVALID_EINITTOKEN if the guest tries to EINIT an LE that
>>>> isn't in the list.
>>>
>>> But this would be inconsistent with HW behavior. If the hash value in
>>> the guest's IA32_SGXLEPUBKEYHASHn matches the one passed to EINIT,
>>> EINIT is not supposed to return SGX_INVALID_EINITTOKEN.
>>>
>>> I think from the VMM's perspective, emulating HW behavior so that it
>>> is consistent with real HW behavior is very important.
>>>
>>> Paolo, would you provide your comments?
>>
>> Hi all,
>>
>> This has been quiet for a while and I'd like to start the discussion
>> again. Jarkko told me that currently he only supports one LE in the
>> SGX driver, but I am not sure whether he is going to extend that in
>> the future or not. I think this might also depend on requirements from
>> customers.
>>
>> Andy,
>>
>> If we only support one LE in the driver, then we can only support the
>> same LE for all KVM guests, according to your comments that the host
>> kernel launch control policy should also apply to KVM guests? Would
>> you comment more?
>
> I'm not at all convinced that, going forward, Linux's host-side launch
> control policy will be entirely contained in the LE. I'm also not
> convinced that non-Linux guests will function at all under this type
> of policy -- what if FreeBSD's LE is different for whatever reason?

I am not convinced either. I think we need Jarkko to elaborate on how the
host-side launch control policy is implemented, or whether there is any
policy at all. I also tried to read the SGX driver code but it looks like
I couldn't find any implementation regarding this.

Hi Jarkko,

Can you elaborate on this?

Thanks,
-Kai

>
>>
>> Jarkko,
>>
>> Could you help to clarify the whole launch control policy on the host
>> side so that we can have a better understanding together?
>>
>> Thanks,
>> -Kai
>>
>>>
>>>>
>>>> --Andy
>>>>
>>
>