Date: Mon, 6 Feb 2023 18:58:44 +0000
From: "Dr. David Alan Gilbert"
To: Christophe de Dinechin
Cc: "Michael S.
    Tsirkin", James Bottomley, "Reshetova, Elena", Leon Romanovsky,
    Greg Kroah-Hartman, "Shishkin, Alexander", "Shutemov, Kirill",
    "Kuppuswamy, Sathyanarayanan", "Kleen, Andi", "Hansen, Dave",
    Thomas Gleixner, Peter Zijlstra, "Wunner, Lukas", Mika Westerberg,
    Jason Wang, "Poimboe, Josh", "aarcange@redhat.com", Cfir Cohen,
    Marc Orr, "jbachmann@google.com", "pgonda@google.com",
    "keescook@chromium.org", James Morris, Michael Kelley, "Lange, Jon",
    "linux-coco@lists.linux.dev", Linux Kernel Mailing List,
    Kernel Hardening
Subject: Re: Linux guest kernel threat model for Confidential Computing
References: <220b0be95a8c733f0a6eeddc08e37977ee21d518.camel@linux.ibm.com>
    <261bc99edc43990eecb1aac4fe8005cedc495c20.camel@linux.ibm.com>
    <20230131123033-mutt-send-email-mst@kernel.org>
    <6BCC3285-ACA3-4E38-8811-1A91C9F03852@redhat.com>
    <20230201055412-mutt-send-email-mst@kernel.org>
    <4B78D161-2712-434A-8E6F-9D8BA468BB3A@redhat.com>
    <20230201105305-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-coco@lists.linux.dev

* Christophe de Dinechin (dinechin@redhat.com) wrote:
>
> On 2023-02-01 at 11:02 -05, "Michael S. Tsirkin" wrote...
> > On Wed, Feb 01, 2023 at 02:15:10PM +0100, Christophe de Dinechin Dupont de Dinechin wrote:
> >>
> >>
> >> > On 1 Feb 2023, at 12:01, Michael S. Tsirkin wrote:
> >> >
> >> > On Wed, Feb 01, 2023 at 11:52:27AM +0100, Christophe de Dinechin Dupont de Dinechin wrote:
> >> >>
> >> >>
> >> >>> On 31 Jan 2023, at 18:39, Michael S. Tsirkin wrote:
> >> >>>
> >> >>> On Tue, Jan 31, 2023 at 04:14:29PM +0100, Christophe de Dinechin wrote:
> >> >>>> Finally, security considerations that apply irrespective of whether the
> >> >>>> platform is confidential or not are also outside of the scope of this
> >> >>>> document. This includes topics ranging from timing attacks to social
> >> >>>> engineering.
> >> >>>
> >> >>> Why are timing attacks by hypervisor on the guest out of scope?
> >> >>
> >> >> Good point.
> >> >>
> >> >> I was thinking that mitigation against timing attacks is the same
> >> >> irrespective of the source of the attack. However, because the HV
> >> >> controls CPU time allocation, there are presumably attacks that
> >> >> are made much easier through the HV. Those should be listed.
> >> >
> >> > Not just that, also because it can and does emulate some devices.
> >> > For example, are disk encryption systems protected against timing of
> >> > disk accesses?
> >> > This is why some people keep saying "forget about emulated devices, require
> >> > passthrough, include devices in the trust zone".
> >> >
> >> >>>
> >> >>>>
> >> >>>>
> >> >>>> Feel free to comment and reword at will ;-)
> >> >>>>
> >> >>>>
> >> >>>> 3/ PCI-as-a-threat: where does that come from
> >> >>>>
> >> >>>> Isn't there a fundamental difference, from a threat model perspective,
> >> >>>> between a bad actor, say a rogue sysadmin dumping the guest memory (which CC
> >> >>>> should defeat) and compromised software feeding us bad data? I think there
> >> >>>> is: at least inside the TCB, we can detect bad software using measurements,
> >> >>>> and prevent it from running using attestation. In other words, we first
> >> >>>> check what we will run, then we run it. The security there is that we know
> >> >>>> what we are running. The trust we have in the software is from testing,
> >> >>>> reviewing or using it.
> >> >>>>
> >> >>>> This relies on a key aspect provided by TDX and SEV, which is that the
> >> >>>> software being measured is largely tamper-resistant thanks to memory
> >> >>>> encryption. In other words, after you have measured your guest software
> >> >>>> stack, the host or hypervisor cannot willy-nilly change it.
> >> >>>>
> >> >>>> So this brings me to the next question: is there any way we could offer the
> >> >>>> same kind of service for KVM and qemu? The measurement part seems relatively
> >> >>>> easy. The tamper-resistant part, on the other hand, seems quite difficult to
> >> >>>> me. But maybe someone else will have a brilliant idea?
> >> >>>>
> >> >>>> So I'm asking the question, because if you could somehow prove to the guest
> >> >>>> not only that it's running the right guest stack (as we can do today) but
> >> >>>> also a known host/KVM/hypervisor stack, we would also switch the potential
> >> >>>> issues with PCI, MSRs and the like from "malicious" to merely "bogus", and
> >> >>>> this is something which is evidently easier to deal with.
> >> >>>
> >> >>> Agree absolutely that's much easier.
> >> >>>
> >> >>>> I briefly discussed this with James, and he pointed out two interesting
> >> >>>> aspects of that question:
> >> >>>>
> >> >>>> 1/ In the CC world, we don't really care about *virtual* PCI devices. We
> >> >>>> care about either virtio devices, or physical ones being passed through
> >> >>>> to the guest. Let's assume physical ones can be trusted, see above.
> >> >>>> That leaves virtio devices. How much damage can a malicious virtio device
> >> >>>> do to the guest kernel, and can this lead to secrets being leaked?
> >> >>>>
> >> >>>> 2/ He was not as negative as I anticipated on the possibility of somehow
> >> >>>> being able to prevent tampering of the guest. One example he mentioned is
> >> >>>> a research paper [1] about running the hypervisor itself inside an
> >> >>>> "outer" TCB, using VMPLs on AMD. Maybe something similar can be achieved
> >> >>>> with TDX using secure enclaves or some other mechanism?
> >> >>>
> >> >>> Or even just secureboot based root of trust?
> >> >>
> >> >> You mean host secureboot? Or guest?
> >> >>
> >> >> If it’s host, then the problem is detecting malicious tampering with
> >> >> host code (whether it’s kernel or hypervisor).
> >> >
> >> > Host. Lots of existing systems do this. As an extreme, boot a RO disk,
> >> > limit which packages are allowed.
> >>
> >> Is that provable to the guest?
> >>
> >> Consider a cloud provider doing that: how do they prove to their guest:
> >>
> >> a) What firmware, kernel and kvm they run
> >>
> >> b) That what they booted cannot be maliciously modified, e.g. by a rogue
> >> device driver installed by a rogue sysadmin
> >>
> >> My understanding is that SecureBoot is only intended to prevent non-verified
> >> operating systems from booting. So the proof is given to the cloud provider,
> >> and the proof is that the system boots successfully.
> >
> > I think I should have said measured boot not secure boot.
>
> The problem again is how you prove to the guest that you are not lying?
>
> We know how to do that from a guest [1], but you will note that in the
> normal process, a trusted hardware component (e.g. the PSP for AMD SEV)
> proves the validity of the measurements of the TCB by encrypting it with an
> attestation signing key derived from some chip-unique secret. For AMD, this
> is called the VCEK, and TDX has something similar. In the case of SEV, this
> goes through firmware, and you have to tell the firmware each time you
> insert data in the original TCB (using SNP_LAUNCH_UPDATE). This is all tied
> to a VM execution context. I do not believe there is any provision to do the
> same thing to measure host data. And again, it would be somewhat pointless
> if there isn't also a mechanism to ensure the host data is not changed after
> the measurement.
>
> Now, I don't think it would be super-difficult to add a firmware service
> that would let the host do some kind of equivalent to PVALIDATE, setting
> some physical pages aside that then get measured and become inaccessible to
> the host. The PSP or similar could then integrate these measurements as part
> of the TCB, and the fact that the pages were "transferred" to this special
> invariant block would assure the guests that the code will not change after
> being measured.
>
> I am not aware that such a mechanism exists on any of the existing CC
> platforms. Please feel free to enlighten me if I'm wrong.
>
> [1] https://www.redhat.com/en/blog/understanding-confidential-containers-attestation-flow
> >
> >>
> >> After that, I think all bets are off. SecureBoot does little AFAICT
> >> to prevent malicious modifications of the running system by someone with
> >> root access, including deliberately loading a malicious kvm-zilog.ko
> >
> > So disable module loading then or don't allow root access?
>
> Who would do that?
>
> The problem is that we have a host and a tenant, and the tenant does not
> trust the host in principle. So it is not sufficient for the host to disable
> module loading or carefully control root access. It is also necessary to
> prove to the tenant(s) that this was done.
>
> >
> >>
> >> It does not mean it cannot be done, just that I don’t think we
> >> have the tools at the moment.
> >
> > Phones, chromebooks do this all the time ...
>
> Indeed, but there, this is to prove to the phone's real owner (which,
> surprise, is not the naive person who thought they'd get some kind of
> ownership by buying the phone) that the software running on the phone has
> not been replaced by some horribly jailbroken goo.
>
> In other words, the user of the phone gets no proof whatsoever of anything,
> except that the phone appears to work. This is somewhat the situation in the
> cloud today: the owners of the hardware get all sorts of useful checks, from
> SecureBoot to error-correction for memory or I/O devices. However, someone
> running in a VM on the cloud gets none of that, just like the user of your
> phone.

Assuming you do a measured boot, the host OS and firmware are measured
into the host TPM. People have thought in the past about triggering
attestations of the host from the guest; you could then have something
external attest the host and only release the keys to the guests' disks
if the attestation is correct, or have a key for the guests' disks held
in the host's TPM.

Dave

> --
> Cheers,
> Christophe de Dinechin (https://c3d.github.io)
> Theory of Incomplete Measurements (https://c3d.github.io/TIM)
>
>
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
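
A minimal sketch of the measured-boot idea in the reply above: host
components are folded into a PCR-style hash chain, and an external verifier
releases a guest disk key only when a fresh quote matches the expected
measurement. Everything named here is hypothetical and for illustration
only: the component strings, the keys, and the function names are made up,
and an HMAC stands in for the asymmetric signature a real TPM quote would
carry. It is not how any particular TPM stack or cloud provider implements
this.

```python
import hashlib
import hmac
from typing import Optional, Sequence

PCR_SIZE = 32  # bytes in a SHA-256 PCR

# Hypothetical host stacks; real measurements would cover actual binaries.
KNOWN_GOOD = [b"firmware-v1", b"host-kernel-v1", b"kvm-v1"]
TAMPERED = [b"firmware-v1", b"host-kernel-v1", b"kvm-evil"]


def extend(pcr: bytes, component: bytes) -> bytes:
    """TPM-style extend: new_pcr = SHA256(old_pcr || SHA256(component))."""
    return hashlib.sha256(pcr + hashlib.sha256(component).digest()).digest()


def measure_boot(components: Sequence[bytes]) -> bytes:
    """Fold every boot component, in order, into a single PCR value."""
    pcr = bytes(PCR_SIZE)  # PCRs start out as all zeroes
    for component in components:
        pcr = extend(pcr, component)
    return pcr


def quote(attestation_key: bytes, pcr: bytes, nonce: bytes) -> bytes:
    """Stand-in for a TPM quote over (pcr, nonce).

    ASSUMPTION: an HMAC replaces the asymmetric signature a real TPM
    would produce with its attestation key.
    """
    return hmac.new(attestation_key, pcr + nonce, hashlib.sha256).digest()


def release_disk_key(
    attestation_key: bytes,
    expected_pcr: bytes,
    reported_pcr: bytes,
    nonce: bytes,
    signature: bytes,
    disk_key: bytes,
) -> Optional[bytes]:
    """External verifier: hand out the disk key only if the quote is valid
    for this nonce and the reported PCR equals the expected measurement."""
    expected_sig = quote(attestation_key, reported_pcr, nonce)
    if not hmac.compare_digest(signature, expected_sig):
        return None  # forged or stale quote
    if not hmac.compare_digest(reported_pcr, expected_pcr):
        return None  # host booted something other than the known stack
    return disk_key


if __name__ == "__main__":
    ak = b"hypothetical-attestation-key"
    disk_key = b"hypothetical-guest-disk-key"
    nonce = b"fresh-verifier-nonce"
    golden = measure_boot(KNOWN_GOOD)
    for stack in (KNOWN_GOOD, TAMPERED):
        pcr = measure_boot(stack)
        sig = quote(ak, pcr, nonce)
        released = release_disk_key(ak, golden, pcr, nonce, sig, disk_key)
        print("released" if released else "withheld")
```

Note that this only covers what was measured at boot. It says nothing about
the host being modified afterwards, which is the objection raised earlier in
the thread.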