From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sean Christopherson
Date: Tue, 8 Aug 2023 14:13:26 -0700
Subject: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
In-Reply-To:
References: <20230718234512.1690985-13-seanjc@google.com>
Message-ID:
List-Id:
To: kvm-riscv@lists.infradead.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Mon, Aug 07, 2023, Ackerley Tng wrote:
> I'd like to propose an alternative to the refcounting approach between
> the gmem file and associated kvm, where we think of KVM's memslots as
> users of the gmem file.
>
> Instead of having the gmem file pin the VM (i.e. take a refcount on
> kvm), we could let memslot take a refcount on the gmem file when the
> memslots are configured.
>
> Here's a POC patch that flips the refcounting (and modified selftests in
> the next commit):
> https://github.com/googleprodkernel/linux-cc/commit/7f487b029b89b9f3e9b094a721bc0772f3c8c797
>
> One side effect of having the gmem file pin the VM is that now the gmem
> file becomes sort of a false handle on the VM:
>
> + Closing the file destroys the file pointers in the VM and invalidates
>   the pointers

Yeah, this is less than ideal. But it's also how things operate today. KVM
doesn't hold references to VMAs or files, e.g. if userspace munmap()s memory,
any and all SPTEs pointing at the memory are zapped. The only difference with
gmem is that KVM needs to explicitly invalidate file pointers, instead of that
happening behind the scenes (no more VMAs to find). Again, I agree the resulting
code is more complex than I would prefer, but from a userspace perspective I
don't see this as problematic.

> + Keeping the file open keeps the VM around in the kernel even though
>   the VM fd may already be closed.

That is perfectly ok. There is plenty of prior art, as well as plenty of ways
for userspace to shoot itself in the foot. E.g. open a stats fd for a vCPU and
the VM and all its vCPUs will be kept alive. And conceptually it's sound:
anything created in the scope of a VM _should_ pin the VM.

> I feel that memslots form a natural way of managing usage of the gmem
> file. When a memslot is created, it is using the file; hence we take a
> refcount on the gmem file, and as memslots are removed, we drop
> refcounts on the gmem file.

Yes and no. It's definitely more natural *if* the goal is to allow guest_memfd
memory to exist without being attached to a VM. But I'm not at all convinced
that we want to allow that, or that it has desirable properties. With TDX and
SNP in particular, I'm pretty sure that allowing memory to outlive the VM is
very undesirable (more below).

> The KVM pointer is shared among all the bindings in gmem's xarray, and we can
> enforce that a gmem file is used only with one VM:
>
> + When binding a memslot to the file, if a kvm pointer exists, it must
>   be the same kvm as the one in this binding
> + When the binding to the last memslot is removed from a file, NULL the
>   kvm pointer.

Nullifying the KVM pointer isn't sufficient, because without additional actions
userspace could extract data from a VM by deleting its memslots and then binding
the guest_memfd to an attacker-controlled VM. Or more likely with TDX and SNP,
induce badness by coercing KVM into mapping memory into a guest with the wrong
ASID/HKID.

I can think of three ways to handle that:

  (a) prevent a different VM from *ever* binding to the gmem instance
  (b) free/zero physical pages when unbinding
  (c) free/zero when binding to a different VM

Option (a) is easy, but that pretty much defeats the purpose of decoupling
guest_memfd from a VM.

Option (b) isn't hard to implement, but it screws up the lifecycle of the memory,
e.g. would require freeing/zeroing memory when a memslot is deleted. That isn't
necessarily a deal-breaker, but it runs counter to how KVM memslots currently
operate. Memslots are basically just weird page tables, e.g. deleting a memslot
doesn't have any impact on the underlying data in memory. TDX throws a wrench in
this as removing a page from the Secure EPT is effectively destructive to the
data (can't be mapped back into the VM without zeroing the data), but IMO that's
an oddity with TDX and not necessarily something we want to carry over to other
VM types.

There would also be performance implications (probably a non-issue in practice),
and weirdness if/when we get to sharing, linking and/or mmap()ing gmem. E.g. what
should happen if the last memslot (binding) is deleted, but there are outstanding
userspace mappings?

Option (c) is better from a lifecycle perspective, but it adds its own flavor of
complexity, e.g. the performant way to reclaim TDX memory requires the TDMR
(effectively the VM pointer), and so a deferred reclaim doesn't really work for
TDX. And I'm pretty sure it *can't* work for SNP, because RMP entries must not
outlive the VM; KVM can't reuse an ASID if there are pages assigned to that ASID
in the RMP, i.e. until all memory belonging to the VM has been fully freed.

> Could binding gmem files not on creation, but at memslot configuration
> time be sufficient and simpler?

After working through the flows, I think binding on-demand would simplify the
refcounting (stating the obvious), but complicate the lifecycle of the memory as
well as the contract between KVM and userspace, and would break the separation of
concerns between the inode (physical memory / data) and file (VM's view / mappings).
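
For readers following the thread, here is a toy userspace model of the
rebinding hazard and of options (a), (b) and (c) above. It is only a sketch
under loud assumptions: it is Python rather than kernel C, and every name in
it (ToyGmem, the policy strings, the two helper functions) is hypothetical.
None of it is KVM code or part of the patch series; it just replays the
"delete memslots, then bind to another VM" sequence against each policy.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyGmem:
    """Hypothetical stand-in for one guest_memfd instance (not a kernel struct)."""
    # "null_only": only NULL the owner on last unbind (the proposal as quoted)
    # "never_rebind": option (a); "zero_on_unbind": option (b);
    # "zero_on_rebind": option (c)
    policy: str
    pages: dict = field(default_factory=dict)  # offset -> contents
    owner: Optional[str] = None       # models the shared kvm pointer
    last_owner: Optional[str] = None  # most recent VM that was ever bound
    bindings: int = 0                 # number of memslots bound to the file

    def bind(self, vm: str) -> None:
        if self.owner is not None and self.owner != vm:
            raise PermissionError("file is in use by a different VM")
        changed_vm = self.last_owner not in (None, vm)
        if self.policy == "never_rebind" and changed_vm:
            raise PermissionError("option (a): a different VM may never bind")
        if self.policy == "zero_on_rebind" and changed_vm:
            self.pages.clear()  # option (c): scrub before a new VM sees pages
        self.owner = vm
        self.last_owner = vm
        self.bindings += 1

    def unbind(self) -> None:
        self.bindings -= 1
        if self.bindings == 0:
            self.owner = None  # "NULL the kvm pointer"
            if self.policy == "zero_on_unbind":
                self.pages.clear()  # option (b): scrub when last memslot goes

    def write(self, vm: str, offset: int, data: bytes) -> None:
        if vm != self.owner:
            raise PermissionError("not bound to this VM")
        self.pages[offset] = data

    def read(self, vm: str, offset: int) -> bytes:
        if vm != self.owner:
            raise PermissionError("not bound to this VM")
        return self.pages.get(offset, b"")


def attacker_reads_secret(policy: str) -> bool:
    """Victim writes, all memslots are deleted, then another VM tries to bind."""
    gmem = ToyGmem(policy)
    gmem.bind("victim")
    gmem.write("victim", 0, b"secret")
    gmem.unbind()
    try:
        gmem.bind("attacker")
    except PermissionError:
        return False
    return gmem.read("attacker", 0) == b"secret"


def data_survives_memslot_cycle(policy: str) -> bool:
    """Same VM deletes its last memslot and re-creates it: is the data intact?"""
    gmem = ToyGmem(policy)
    gmem.bind("victim")
    gmem.write("victim", 0, b"secret")
    gmem.unbind()
    gmem.bind("victim")
    return gmem.read("victim", 0) == b"secret"
```

In this model only the "null_only" policy lets the second VM read the first
VM's data. The second helper shows the lifecycle difference discussed above:
with "zero_on_unbind" (option (b)) a plain delete/re-create of a memslot by
the same VM loses the data, while "zero_on_rebind" (option (c)) preserves it.
The model deliberately ignores the TDX and SNP constraints (HKID/ASID reuse,
RMP entries) that make option (c) hard in practice.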
From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D3035171BF for ; Tue, 8 Aug 2023 21:13:28 +0000 (UTC) Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-563379fe16aso6292905a12.3 for ; Tue, 08 Aug 2023 14:13:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1691529208; x=1692134008; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=2U8op5NWCZxGer1aWuXuZ1d6J73Ge5+Hk7EgkvJJLAs=; b=EiApR7LgfgrJOIT4RH8vwTp7YvtN3WX8F65amcCZ5tUQpyv/ZsbnzElbxtLH3YiRAr Xco5OB//pxBTUJ1Hz9I9wbK8VFMOtBI25Tou7znxlR4xLwF5hwG16hAzm3VNuOm7qNVv RXaphvZexnvsjJwdUedz4miwZlrHjyPKGJ4tg41nXpc1Cr1tDeUR45j+cXRmHY2zeNeJ aD2FSNHH9s2X0AH/332mpiw5dSfBYPcqgRab/m5d5fFkQb+ZWPzwvwukMvExFid52TaS W3qOIOCRBt/UK+p15DAoWro/GC9Q6ni0EalNj49A0OSEsHwDkeyMMKxa0prAwizF/XXF JV1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691529208; x=1692134008; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=2U8op5NWCZxGer1aWuXuZ1d6J73Ge5+Hk7EgkvJJLAs=; b=jBy9PgiiM++J6Nj/hIyWnjYkfrK0oUWajRcQyVhyFQl/m8KEbmOMbGQv7q2FF0a7Lx k5kYSpx+JhnumtdSV4ZS+pxWyDstV2ybncngav6wXSgR94q09tY4z/gqc86NX8FmHEBi MYLex/poPQJgCTOtrgWLLQEDAh4+jGKMsipJ9J7Mb++x5HZM7CdlUB70wJe67+Loo8F7 76AtrYOsbgh3MeHhL9u3hKHkExZ3JZaji/uo1qsricHTMEoGuDx8viGaqtVBe6QfQoN0 Ovf08PYNewtvD2EDpnUAVih8+1v/U/vDIms1CbsgKB9aQSfFFjs7iUcmLZ1A0Q3Bl6Uv Qzkw== X-Gm-Message-State: AOJu0YyQeFblvyul+DylfZjhSRd010Q379sVqsRp+mLNAb/X24k4sE7/ 1fYN13tKFwPUSuYHqyaZg1bjGDbkMiA= X-Google-Smtp-Source: 
AGHT+IF3VhVvM3seA6Kd3G1yrwf0aF3MmO0cqr5SVZX1MHXlsabrjEav/SHlO1qK0PHGPVCVUrfGthX1EsA= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a63:b242:0:b0:563:e937:5e87 with SMTP id t2-20020a63b242000000b00563e9375e87mr12735pgo.5.1691529208021; Tue, 08 Aug 2023 14:13:28 -0700 (PDT) Date: Tue, 8 Aug 2023 14:13:26 -0700 In-Reply-To: Precedence: bulk X-Mailing-List: kvmarm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20230718234512.1690985-13-seanjc@google.com> Message-ID: Subject: Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory From: Sean Christopherson To: Ackerley Tng Cc: pbonzini@redhat.com, maz@kernel.org, oliver.upton@linux.dev, chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, willy@infradead.org, akpm@linux-foundation.org, paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org, chao.p.peng@linux.intel.com, tabba@google.com, jarkko@kernel.org, yu.c.zhang@linux.intel.com, vannapurve@google.com, mail@maciej.szmigiero.name, vbabka@suse.cz, david@redhat.com, qperret@google.com, michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable On Mon, Aug 07, 2023, Ackerley Tng wrote: > I=E2=80=99d like to propose an alternative to the refcounting approach be= tween > the gmem file and associated kvm, where we think of KVM=E2=80=99s memslot= s as > users of the gmem 
file. >=20 > Instead of having the gmem file pin the VM (i.e. take a refcount on > kvm), we could let memslot take a refcount on the gmem file when the > memslots are configured. >=20 > Here=E2=80=99s a POC patch that flips the refcounting (and modified selft= ests in > the next commit): > https://github.com/googleprodkernel/linux-cc/commit/7f487b029b89b9f3e9b09= 4a721bc0772f3c8c797 >=20 > One side effect of having the gmem file pin the VM is that now the gmem > file becomes sort of a false handle on the VM: >=20 > + Closing the file destroys the file pointers in the VM and invalidates > the pointers Yeah, this is less than ideal. But, it's also how things operate today. K= VM doesn't hold references to VMAs or files, e.g. if userspace munmap()s memor= y, any and all SPTEs pointing at the memory are zapped. The only difference w= ith gmem is that KVM needs to explicitly invalidate file pointers, instead of t= hat happening behind the scenes (no more VMAs to find). Again, I agree the res= ulting code is more complex than I would prefer, but from a userspace perspective = I don't see this as problematic. > + Keeping the file open keeps the VM around in the kernel even though > the VM fd may already be closed. That is perfectly ok. There is plenty of prior art, as well as plenty of w= ays for userspace to shoot itself in the foot. E.g. open a stats fd for a vCPU= and the VM and all its vCPUs will be kept alive. And conceptually it's sound, anything created in the scope of a VM _should_ pin the VM. > I feel that memslots form a natural way of managing usage of the gmem > file. When a memslot is created, it is using the file; hence we take a > refcount on the gmem file, and as memslots are removed, we drop > refcounts on the gmem file. Yes and no. It's definitely more natural *if* the goal is to allow guest_m= emfd memory to exist without being attached to a VM. But I'm not at all convinc= ed that we want to allow that, or that it has desirable properties. 
With TDX = and SNP in particuarly, I'm pretty sure that allowing memory to outlive the VM = is very underisable (more below). > The KVM pointer is shared among all the bindings in gmem=E2=80=99s xarray= , and we can > enforce that a gmem file is used only with one VM: >=20 > + When binding a memslot to the file, if a kvm pointer exists, it must > be the same kvm as the one in this binding > + When the binding to the last memslot is removed from a file, NULL the > kvm pointer. Nullifying the KVM pointer isn't sufficient, because without additional act= ions userspace could extract data from a VM by deleting its memslots and then bi= nding the guest_memfd to an attacker controlled VM. Or more likely with TDX and = SNP, induce badness by coercing KVM into mapping memory into a guest with the wr= ong ASID/HKID. I can think of three ways to handle that: (a) prevent a different VM from *ever* binding to the gmem instance (b) free/zero physical pages when unbinding (c) free/zero when binding to a different VM Option (a) is easy, but that pretty much defeats the purpose of decopuling guest_memfd from a VM. Option (b) isn't hard to implement, but it screws up the lifecycle of the m= emory, e.g. would require memory when a memslot is deleted. That isn't necessaril= y a deal-breaker, but it runs counter to how KVM memlots currently operate. Me= mslots are basically just weird page tables, e.g. deleting a memslot doesn't have = any impact on the underlying data in memory. TDX throws a wrench in this as re= moving a page from the Secure EPT is effectively destructive to the data (can't be= mapped back in to the VM without zeroing the data), but IMO that's an oddity with = TDX and not necessarily something we want to carry over to other VM types. There would also be performance implications (probably a non-issue in pract= ice), and weirdness if/when we get to sharing, linking and/or mmap()ing gmem. E.= g. 
what should happen if the last memslot (binding) is deleted, but there outstandi= ng userspace mappings? Option (c) is better from a lifecycle perspective, but it adds its own flav= or of complexity, e.g. the performant way to reclaim TDX memory requires the TDMR (effectively the VM pointer), and so a deferred relcaim doesn't really work= for TDX. And I'm pretty sure it *can't* work for SNP, because RMP entries must= not outlive the VM; KVM can't reuse an ASID if there are pages assigned to that= ASID in the RMP, i.e. until all memory belonging to the VM has been fully freed. > Could binding gmem files not on creation, but at memslot configuration > time be sufficient and simpler? After working through the flows, I think binding on-demand would simplify t= he refcounting (stating the obvious), but complicate the lifecycle of the memo= ry as well as the contract between KVM and userspace, and would break the separat= ion of concerns between the inode (physical memory / data) and file (VM's view / m= appings). 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 799D3C001E0 for ; Tue, 8 Aug 2023 21:13:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:Message-ID: References:Mime-Version:In-Reply-To:Date:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=8U0ysafd3EKS0ZNm4Daaaa/07AK/CmUHRVUm/x/c+vk=; b=39XkWchirGR7tIOHVY5hfSp2au 3qPHjvMtME1UTXCORnVGyxjg+zHLfraWt85Gl1CDqK9aLOsUYjiu0MpGcII6TTS46bKZwB0LPvNDX +FbF1fG5qIHTG1X5hJkPQ4O+CXaKAwqDvoRZxe3wQxoMp7wlfC/OukIX5WinjdRGaqt6ApEIRdGDk 5JWOzPZcCcK7itYF0y8q6MI73GG19IICvkvP5NRIYRF8RKLjCgoKDckYNuz3R+NLW6pd5p+2im4+s 0Gkwv6Xez0X2yBGRlQH59O6eGuE5DSCAsTcukz3w3WvnJ3AdM3xa3J35qFL4/4gAVRbGFX6g8K696 pQ4bTG4Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qTU1K-003UMe-0D; Tue, 08 Aug 2023 21:13:38 +0000 Received: from mail-pg1-x54a.google.com ([2607:f8b0:4864:20::54a]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qTU1F-003UKM-0f for linux-riscv@lists.infradead.org; Tue, 08 Aug 2023 21:13:35 +0000 Received: by mail-pg1-x54a.google.com with SMTP id 41be03b00d2f7-56463e0340cso6299734a12.2 for ; Tue, 08 Aug 2023 14:13:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1691529208; x=1692134008; h=content-transfer-encoding:cc:to:from:subject:message-id:references 
:mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=2U8op5NWCZxGer1aWuXuZ1d6J73Ge5+Hk7EgkvJJLAs=; b=EiApR7LgfgrJOIT4RH8vwTp7YvtN3WX8F65amcCZ5tUQpyv/ZsbnzElbxtLH3YiRAr Xco5OB//pxBTUJ1Hz9I9wbK8VFMOtBI25Tou7znxlR4xLwF5hwG16hAzm3VNuOm7qNVv RXaphvZexnvsjJwdUedz4miwZlrHjyPKGJ4tg41nXpc1Cr1tDeUR45j+cXRmHY2zeNeJ aD2FSNHH9s2X0AH/332mpiw5dSfBYPcqgRab/m5d5fFkQb+ZWPzwvwukMvExFid52TaS W3qOIOCRBt/UK+p15DAoWro/GC9Q6ni0EalNj49A0OSEsHwDkeyMMKxa0prAwizF/XXF JV1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691529208; x=1692134008; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=2U8op5NWCZxGer1aWuXuZ1d6J73Ge5+Hk7EgkvJJLAs=; b=MulMvl00QmSxSyx3wKKPi967wBI+zH+ezLN6yZ1s1djNaLUd9FP6KwIGZlLO9XRhSp Z0Fv3jdLzTPPlwHdPArYleyPeyS9R/hKyOVQdjmCQocMYovIG6plnWEaoZ5vrdrEEzy0 hxIbT0GEeIyxBqtbOtAD4drd5GFuDP277mjc7JeQCZerWN1s3zjKL+bxs+tViSC5tZhG B3C6CtTKUwLWHe/4O1kKV4AlV8WTrYPtyfYUOs8wIuj3N7rlFX2j4Nki7AzHJFoGzdSG NNGA8Gi5Yv6ODigJqD2dWZ8Uf+9NFz4Xu32vstmQ3evunZ6aFT9g1Ln3ykAkv5X4R4RN WHfQ== X-Gm-Message-State: AOJu0YwJg8aRAmrjia0rzjU6j+OInShfkZwX5H0VgmDgaIEDmYLVzzid pZ2VxgxFzMvZNI7e6KnXErKOfjwHbqY= X-Google-Smtp-Source: AGHT+IF3VhVvM3seA6Kd3G1yrwf0aF3MmO0cqr5SVZX1MHXlsabrjEav/SHlO1qK0PHGPVCVUrfGthX1EsA= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a63:b242:0:b0:563:e937:5e87 with SMTP id t2-20020a63b242000000b00563e9375e87mr12735pgo.5.1691529208021; Tue, 08 Aug 2023 14:13:28 -0700 (PDT) Date: Tue, 8 Aug 2023 14:13:26 -0700 In-Reply-To: Mime-Version: 1.0 References: <20230718234512.1690985-13-seanjc@google.com> Message-ID: Subject: Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory From: Sean Christopherson To: Ackerley Tng Cc: pbonzini@redhat.com, maz@kernel.org, oliver.upton@linux.dev, 
chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, willy@infradead.org, akpm@linux-foundation.org, paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org, chao.p.peng@linux.intel.com, tabba@google.com, jarkko@kernel.org, yu.c.zhang@linux.intel.com, vannapurve@google.com, mail@maciej.szmigiero.name, vbabka@suse.cz, david@redhat.com, qperret@google.com, michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230808_141333_248230_A9B8FE80 X-CRM114-Status: GOOD ( 29.58 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org T24gTW9uLCBBdWcgMDcsIDIwMjMsIEFja2VybGV5IFRuZyB3cm90ZToKPiBJ4oCZZCBsaWtlIHRv IHByb3Bvc2UgYW4gYWx0ZXJuYXRpdmUgdG8gdGhlIHJlZmNvdW50aW5nIGFwcHJvYWNoIGJldHdl ZW4KPiB0aGUgZ21lbSBmaWxlIGFuZCBhc3NvY2lhdGVkIGt2bSwgd2hlcmUgd2UgdGhpbmsgb2Yg S1ZN4oCZcyBtZW1zbG90cyBhcwo+IHVzZXJzIG9mIHRoZSBnbWVtIGZpbGUuCj4gCj4gSW5zdGVh ZCBvZiBoYXZpbmcgdGhlIGdtZW0gZmlsZSBwaW4gdGhlIFZNIChpLmUuIHRha2UgYSByZWZjb3Vu dCBvbgo+IGt2bSksIHdlIGNvdWxkIGxldCBtZW1zbG90IHRha2UgYSByZWZjb3VudCBvbiB0aGUg Z21lbSBmaWxlIHdoZW4gdGhlCj4gbWVtc2xvdHMgYXJlIGNvbmZpZ3VyZWQuCj4gCj4gSGVyZeKA 
mXMgYSBQT0MgcGF0Y2ggdGhhdCBmbGlwcyB0aGUgcmVmY291bnRpbmcgKGFuZCBtb2RpZmllZCBz ZWxmdGVzdHMgaW4KPiB0aGUgbmV4dCBjb21taXQpOgo+IGh0dHBzOi8vZ2l0aHViLmNvbS9nb29n bGVwcm9ka2VybmVsL2xpbnV4LWNjL2NvbW1pdC83ZjQ4N2IwMjliODliOWYzZTliMDk0YTcyMWJj MDc3MmYzYzhjNzk3Cj4gCj4gT25lIHNpZGUgZWZmZWN0IG9mIGhhdmluZyB0aGUgZ21lbSBmaWxl IHBpbiB0aGUgVk0gaXMgdGhhdCBub3cgdGhlIGdtZW0KPiBmaWxlIGJlY29tZXMgc29ydCBvZiBh IGZhbHNlIGhhbmRsZSBvbiB0aGUgVk06Cj4gCj4gKyBDbG9zaW5nIHRoZSBmaWxlIGRlc3Ryb3lz IHRoZSBmaWxlIHBvaW50ZXJzIGluIHRoZSBWTSBhbmQgaW52YWxpZGF0ZXMKPiAgIHRoZSBwb2lu dGVycwoKWWVhaCwgdGhpcyBpcyBsZXNzIHRoYW4gaWRlYWwuICBCdXQsIGl0J3MgYWxzbyBob3cg dGhpbmdzIG9wZXJhdGUgdG9kYXkuICBLVk0KZG9lc24ndCBob2xkIHJlZmVyZW5jZXMgdG8gVk1B cyBvciBmaWxlcywgZS5nLiBpZiB1c2Vyc3BhY2UgbXVubWFwKClzIG1lbW9yeSwKYW55IGFuZCBh bGwgU1BURXMgcG9pbnRpbmcgYXQgdGhlIG1lbW9yeSBhcmUgemFwcGVkLiAgVGhlIG9ubHkgZGlm ZmVyZW5jZSB3aXRoCmdtZW0gaXMgdGhhdCBLVk0gbmVlZHMgdG8gZXhwbGljaXRseSBpbnZhbGlk YXRlIGZpbGUgcG9pbnRlcnMsIGluc3RlYWQgb2YgdGhhdApoYXBwZW5pbmcgYmVoaW5kIHRoZSBz Y2VuZXMgKG5vIG1vcmUgVk1BcyB0byBmaW5kKS4gIEFnYWluLCBJIGFncmVlIHRoZSByZXN1bHRp bmcKY29kZSBpcyBtb3JlIGNvbXBsZXggdGhhbiBJIHdvdWxkIHByZWZlciwgYnV0IGZyb20gYSB1 c2Vyc3BhY2UgcGVyc3BlY3RpdmUgSQpkb24ndCBzZWUgdGhpcyBhcyBwcm9ibGVtYXRpYy4KCj4g KyBLZWVwaW5nIHRoZSBmaWxlIG9wZW4ga2VlcHMgdGhlIFZNIGFyb3VuZCBpbiB0aGUga2VybmVs IGV2ZW4gdGhvdWdoCj4gICB0aGUgVk0gZmQgbWF5IGFscmVhZHkgYmUgY2xvc2VkLgoKVGhhdCBp cyBwZXJmZWN0bHkgb2suICBUaGVyZSBpcyBwbGVudHkgb2YgcHJpb3IgYXJ0LCBhcyB3ZWxsIGFz IHBsZW50eSBvZiB3YXlzCmZvciB1c2Vyc3BhY2UgdG8gc2hvb3QgaXRzZWxmIGluIHRoZSBmb290 LiAgRS5nLiBvcGVuIGEgc3RhdHMgZmQgZm9yIGEgdkNQVSBhbmQKdGhlIFZNIGFuZCBhbGwgaXRz IHZDUFVzIHdpbGwgYmUga2VwdCBhbGl2ZS4gIEFuZCBjb25jZXB0dWFsbHkgaXQncyBzb3VuZCwK YW55dGhpbmcgY3JlYXRlZCBpbiB0aGUgc2NvcGUgb2YgYSBWTSBfc2hvdWxkXyBwaW4gdGhlIFZN LgoKPiBJIGZlZWwgdGhhdCBtZW1zbG90cyBmb3JtIGEgbmF0dXJhbCB3YXkgb2YgbWFuYWdpbmcg dXNhZ2Ugb2YgdGhlIGdtZW0KPiBmaWxlLiBXaGVuIGEgbWVtc2xvdCBpcyBjcmVhdGVkLCBpdCBp 
cyB1c2luZyB0aGUgZmlsZTsgaGVuY2Ugd2UgdGFrZSBhCj4gcmVmY291bnQgb24gdGhlIGdtZW0g ZmlsZSwgYW5kIGFzIG1lbXNsb3RzIGFyZSByZW1vdmVkLCB3ZSBkcm9wCj4gcmVmY291bnRzIG9u IHRoZSBnbWVtIGZpbGUuCgpZZXMgYW5kIG5vLiAgSXQncyBkZWZpbml0ZWx5IG1vcmUgbmF0dXJh bCAqaWYqIHRoZSBnb2FsIGlzIHRvIGFsbG93IGd1ZXN0X21lbWZkCm1lbW9yeSB0byBleGlzdCB3 aXRob3V0IGJlaW5nIGF0dGFjaGVkIHRvIGEgVk0uICBCdXQgSSdtIG5vdCBhdCBhbGwgY29udmlu Y2VkCnRoYXQgd2Ugd2FudCB0byBhbGxvdyB0aGF0LCBvciB0aGF0IGl0IGhhcyBkZXNpcmFibGUg cHJvcGVydGllcy4gIFdpdGggVERYIGFuZApTTlAgaW4gcGFydGljdWFybHksIEknbSBwcmV0dHkg c3VyZSB0aGF0IGFsbG93aW5nIG1lbW9yeSB0byBvdXRsaXZlIHRoZSBWTSBpcwp2ZXJ5IHVuZGVy aXNhYmxlIChtb3JlIGJlbG93KS4KCj4gVGhlIEtWTSBwb2ludGVyIGlzIHNoYXJlZCBhbW9uZyBh bGwgdGhlIGJpbmRpbmdzIGluIGdtZW3igJlzIHhhcnJheSwgYW5kIHdlIGNhbgo+IGVuZm9yY2Ug dGhhdCBhIGdtZW0gZmlsZSBpcyB1c2VkIG9ubHkgd2l0aCBvbmUgVk06Cj4gCj4gKyBXaGVuIGJp bmRpbmcgYSBtZW1zbG90IHRvIHRoZSBmaWxlLCBpZiBhIGt2bSBwb2ludGVyIGV4aXN0cywgaXQg bXVzdAo+ICAgYmUgdGhlIHNhbWUga3ZtIGFzIHRoZSBvbmUgaW4gdGhpcyBiaW5kaW5nCj4gKyBX aGVuIHRoZSBiaW5kaW5nIHRvIHRoZSBsYXN0IG1lbXNsb3QgaXMgcmVtb3ZlZCBmcm9tIGEgZmls ZSwgTlVMTCB0aGUKPiAgIGt2bSBwb2ludGVyLgoKTnVsbGlmeWluZyB0aGUgS1ZNIHBvaW50ZXIg aXNuJ3Qgc3VmZmljaWVudCwgYmVjYXVzZSB3aXRob3V0IGFkZGl0aW9uYWwgYWN0aW9ucwp1c2Vy c3BhY2UgY291bGQgZXh0cmFjdCBkYXRhIGZyb20gYSBWTSBieSBkZWxldGluZyBpdHMgbWVtc2xv dHMgYW5kIHRoZW4gYmluZGluZwp0aGUgZ3Vlc3RfbWVtZmQgdG8gYW4gYXR0YWNrZXIgY29udHJv bGxlZCBWTS4gIE9yIG1vcmUgbGlrZWx5IHdpdGggVERYIGFuZCBTTlAsCmluZHVjZSBiYWRuZXNz IGJ5IGNvZXJjaW5nIEtWTSBpbnRvIG1hcHBpbmcgbWVtb3J5IGludG8gYSBndWVzdCB3aXRoIHRo ZSB3cm9uZwpBU0lEL0hLSUQuCgpJIGNhbiB0aGluayBvZiB0aHJlZSB3YXlzIHRvIGhhbmRsZSB0 aGF0OgoKICAoYSkgcHJldmVudCBhIGRpZmZlcmVudCBWTSBmcm9tICpldmVyKiBiaW5kaW5nIHRv IHRoZSBnbWVtIGluc3RhbmNlCiAgKGIpIGZyZWUvemVybyBwaHlzaWNhbCBwYWdlcyB3aGVuIHVu YmluZGluZwogIChjKSBmcmVlL3plcm8gd2hlbiBiaW5kaW5nIHRvIGEgZGlmZmVyZW50IFZNCgpP cHRpb24gKGEpIGlzIGVhc3ksIGJ1dCB0aGF0IHByZXR0eSBtdWNoIGRlZmVhdHMgdGhlIHB1cnBv 
c2Ugb2YgZGVjb3B1bGluZwpndWVzdF9tZW1mZCBmcm9tIGEgVk0uCgpPcHRpb24gKGIpIGlzbid0 IGhhcmQgdG8gaW1wbGVtZW50LCBidXQgaXQgc2NyZXdzIHVwIHRoZSBsaWZlY3ljbGUgb2YgdGhl IG1lbW9yeSwKZS5nLiB3b3VsZCByZXF1aXJlIG1lbW9yeSB3aGVuIGEgbWVtc2xvdCBpcyBkZWxl dGVkLiAgVGhhdCBpc24ndCBuZWNlc3NhcmlseSBhCmRlYWwtYnJlYWtlciwgYnV0IGl0IHJ1bnMg Y291bnRlciB0byBob3cgS1ZNIG1lbWxvdHMgY3VycmVudGx5IG9wZXJhdGUuICBNZW1zbG90cwph cmUgYmFzaWNhbGx5IGp1c3Qgd2VpcmQgcGFnZSB0YWJsZXMsIGUuZy4gZGVsZXRpbmcgYSBtZW1z bG90IGRvZXNuJ3QgaGF2ZSBhbnkKaW1wYWN0IG9uIHRoZSB1bmRlcmx5aW5nIGRhdGEgaW4gbWVt b3J5LiAgVERYIHRocm93cyBhIHdyZW5jaCBpbiB0aGlzIGFzIHJlbW92aW5nCmEgcGFnZSBmcm9t IHRoZSBTZWN1cmUgRVBUIGlzIGVmZmVjdGl2ZWx5IGRlc3RydWN0aXZlIHRvIHRoZSBkYXRhIChj YW4ndCBiZSBtYXBwZWQKYmFjayBpbiB0byB0aGUgVk0gd2l0aG91dCB6ZXJvaW5nIHRoZSBkYXRh KSwgYnV0IElNTyB0aGF0J3MgYW4gb2RkaXR5IHdpdGggVERYIGFuZApub3QgbmVjZXNzYXJpbHkg c29tZXRoaW5nIHdlIHdhbnQgdG8gY2Fycnkgb3ZlciB0byBvdGhlciBWTSB0eXBlcy4KClRoZXJl IHdvdWxkIGFsc28gYmUgcGVyZm9ybWFuY2UgaW1wbGljYXRpb25zIChwcm9iYWJseSBhIG5vbi1p c3N1ZSBpbiBwcmFjdGljZSksCmFuZCB3ZWlyZG5lc3MgaWYvd2hlbiB3ZSBnZXQgdG8gc2hhcmlu ZywgbGlua2luZyBhbmQvb3IgbW1hcCgpaW5nIGdtZW0uICBFLmcuIHdoYXQKc2hvdWxkIGhhcHBl biBpZiB0aGUgbGFzdCBtZW1zbG90IChiaW5kaW5nKSBpcyBkZWxldGVkLCBidXQgdGhlcmUgb3V0 c3RhbmRpbmcgdXNlcnNwYWNlCm1hcHBpbmdzPwoKT3B0aW9uIChjKSBpcyBiZXR0ZXIgZnJvbSBh IGxpZmVjeWNsZSBwZXJzcGVjdGl2ZSwgYnV0IGl0IGFkZHMgaXRzIG93biBmbGF2b3Igb2YKY29t cGxleGl0eSwgZS5nLiB0aGUgcGVyZm9ybWFudCB3YXkgdG8gcmVjbGFpbSBURFggbWVtb3J5IHJl cXVpcmVzIHRoZSBURE1SCihlZmZlY3RpdmVseSB0aGUgVk0gcG9pbnRlciksIGFuZCBzbyBhIGRl ZmVycmVkIHJlbGNhaW0gZG9lc24ndCByZWFsbHkgd29yayBmb3IKVERYLiAgQW5kIEknbSBwcmV0 dHkgc3VyZSBpdCAqY2FuJ3QqIHdvcmsgZm9yIFNOUCwgYmVjYXVzZSBSTVAgZW50cmllcyBtdXN0 IG5vdApvdXRsaXZlIHRoZSBWTTsgS1ZNIGNhbid0IHJldXNlIGFuIEFTSUQgaWYgdGhlcmUgYXJl IHBhZ2VzIGFzc2lnbmVkIHRvIHRoYXQgQVNJRAppbiB0aGUgUk1QLCBpLmUuIHVudGlsIGFsbCBt ZW1vcnkgYmVsb25naW5nIHRvIHRoZSBWTSBoYXMgYmVlbiBmdWxseSBmcmVlZC4KCj4gQ291bGQg 
YmluZGluZyBnbWVtIGZpbGVzIG5vdCBvbiBjcmVhdGlvbiwgYnV0IGF0IG1lbXNsb3QgY29uZmln dXJhdGlvbgo+IHRpbWUgYmUgc3VmZmljaWVudCBhbmQgc2ltcGxlcj8KCkFmdGVyIHdvcmtpbmcg dGhyb3VnaCB0aGUgZmxvd3MsIEkgdGhpbmsgYmluZGluZyBvbi1kZW1hbmQgd291bGQgc2ltcGxp ZnkgdGhlCnJlZmNvdW50aW5nIChzdGF0aW5nIHRoZSBvYnZpb3VzKSwgYnV0IGNvbXBsaWNhdGUg dGhlIGxpZmVjeWNsZSBvZiB0aGUgbWVtb3J5IGFzCndlbGwgYXMgdGhlIGNvbnRyYWN0IGJldHdl ZW4gS1ZNIGFuZCB1c2Vyc3BhY2UsIGFuZCB3b3VsZCBicmVhayB0aGUgc2VwYXJhdGlvbiBvZgpj b25jZXJucyBiZXR3ZWVuIHRoZSBpbm9kZSAocGh5c2ljYWwgbWVtb3J5IC8gZGF0YSkgYW5kIGZp bGUgKFZNJ3MgdmlldyAvIG1hcHBpbmdzKS4KCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fCmxpbnV4LXJpc2N2IG1haWxpbmcgbGlzdApsaW51eC1yaXNjdkBs aXN0cy5pbmZyYWRlYWQub3JnCmh0dHA6Ly9saXN0cy5pbmZyYWRlYWQub3JnL21haWxtYW4vbGlz dGluZm8vbGludXgtcmlzY3YK From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 51FA1C04FDF for ; Tue, 8 Aug 2023 21:14:41 +0000 (UTC) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=EiApR7Lg; dkim-atps=neutral Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4RL5YC24jSz3c3X for ; Wed, 9 Aug 2023 07:14:39 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=EiApR7Lg; dkim-atps=neutral Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--seanjc.bounces.google.com (client-ip=2607:f8b0:4864:20::549; 
helo=mail-pg1-x549.google.com; envelope-from=3-k_szaykdm8dzv84x19916z.x97638fiaax-yzg63ded.9k6vwd.9c1@flex--seanjc.bounces.google.com; receiver=lists.ozlabs.org) Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4RL5Wy0M10z2yt6 for ; Wed, 9 Aug 2023 07:13:32 +1000 (AEST) Received: by mail-pg1-x549.google.com with SMTP id 41be03b00d2f7-56463e0340cso6299736a12.2 for ; Tue, 08 Aug 2023 14:13:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1691529208; x=1692134008; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=2U8op5NWCZxGer1aWuXuZ1d6J73Ge5+Hk7EgkvJJLAs=; b=EiApR7LgfgrJOIT4RH8vwTp7YvtN3WX8F65amcCZ5tUQpyv/ZsbnzElbxtLH3YiRAr Xco5OB//pxBTUJ1Hz9I9wbK8VFMOtBI25Tou7znxlR4xLwF5hwG16hAzm3VNuOm7qNVv RXaphvZexnvsjJwdUedz4miwZlrHjyPKGJ4tg41nXpc1Cr1tDeUR45j+cXRmHY2zeNeJ aD2FSNHH9s2X0AH/332mpiw5dSfBYPcqgRab/m5d5fFkQb+ZWPzwvwukMvExFid52TaS W3qOIOCRBt/UK+p15DAoWro/GC9Q6ni0EalNj49A0OSEsHwDkeyMMKxa0prAwizF/XXF JV1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691529208; x=1692134008; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=2U8op5NWCZxGer1aWuXuZ1d6J73Ge5+Hk7EgkvJJLAs=; b=QV1FY80QY6Vogaluw1cjZfcpDAY1xliwS4WVI+SZQ7fB6xVO1sg/EYpIF2kz+8TH5s 3y/wfCuozfWLHnbzyQ1PKY7jWWM6YTfYszj6sdFEcx8DSG0rqVJMcPFB8JIb1fB5k0gB tUNWbSPB/PkMA45nDIvOGN6lRvOBnYYQXMBpX+3r4rbgm+xUh0DtInZySKsyg/qvNWYm peBeG1LrDaTpPLJ1UttrNUS2h8uDH66zvlI5SPXtV8z/8XdbYwAP+1aYRtHgmFPRkJnH 
8YGwa+Ier2GJS5+anro50L5QfjZ2ihsFQlkSFv1wgD1Q8bjIiCs5ZhcJ5rk9ZvFv92UM hvIQ== X-Gm-Message-State: AOJu0YyhATG+VYDdvnj4HVzOqAFFTqo6Anccd84mlbXhJ5aUoPAQ8pPM rw+hkwYaJw28+ZjVExrpjdOghAlRquU= X-Google-Smtp-Source: AGHT+IF3VhVvM3seA6Kd3G1yrwf0aF3MmO0cqr5SVZX1MHXlsabrjEav/SHlO1qK0PHGPVCVUrfGthX1EsA= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a63:b242:0:b0:563:e937:5e87 with SMTP id t2-20020a63b242000000b00563e9375e87mr12735pgo.5.1691529208021; Tue, 08 Aug 2023 14:13:28 -0700 (PDT) Date: Tue, 8 Aug 2023 14:13:26 -0700 In-Reply-To: Mime-Version: 1.0 References: <20230718234512.1690985-13-seanjc@google.com> Message-ID: Subject: Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory From: Sean Christopherson To: Ackerley Tng Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kvm@vger.kernel.org, david@redhat.com, yu.c.zhang@linux.intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, chao.p.peng@linux.intel.com, linux-riscv@lists.infradead.org, isaku.yamahata@gmail.com, paul@paul-moore.com, maz@kernel.org, chenhuacai@kernel.org, jmorris@namei.org, willy@infradead.org, wei.w.wang@intel.com, tabba@google.com, jarkko@kernel.org, serge@hallyn.com, mail@maciej.szmigiero.name, aou@eecs.berkeley.edu, vbabka@suse.cz, michael.roth@amd.com, paul.walmsley@sifive.com, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, qperret@google.com, liam.merwick@oracle.com, linux-mips@vger.kernel.org, oliver.upton@linux.dev, linux-security-module@vger.kernel.org, palmer@dabbelt.com, kvm-riscv@lists.infradead.org, anup@brainfault.org, linux-fsdevel@vger.kernel.org, pbonzini@redhat.com, akpm@linux-foundation.org, 
vannapurve@google.com, linuxppc-dev@lists.ozlabs.org, kirill.shutemov@linux.intel.com Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" On Mon, Aug 07, 2023, Ackerley Tng wrote: > I=E2=80=99d like to propose an alternative to the refcounting approach be= tween > the gmem file and associated kvm, where we think of KVM=E2=80=99s memslot= s as > users of the gmem file. >=20 > Instead of having the gmem file pin the VM (i.e. take a refcount on > kvm), we could let memslot take a refcount on the gmem file when the > memslots are configured. >=20 > Here=E2=80=99s a POC patch that flips the refcounting (and modified selft= ests in > the next commit): > https://github.com/googleprodkernel/linux-cc/commit/7f487b029b89b9f3e9b09= 4a721bc0772f3c8c797 >=20 > One side effect of having the gmem file pin the VM is that now the gmem > file becomes sort of a false handle on the VM: >=20 > + Closing the file destroys the file pointers in the VM and invalidates > the pointers Yeah, this is less than ideal. But, it's also how things operate today. K= VM doesn't hold references to VMAs or files, e.g. if userspace munmap()s memor= y, any and all SPTEs pointing at the memory are zapped. The only difference w= ith gmem is that KVM needs to explicitly invalidate file pointers, instead of t= hat happening behind the scenes (no more VMAs to find). Again, I agree the res= ulting code is more complex than I would prefer, but from a userspace perspective = I don't see this as problematic. > + Keeping the file open keeps the VM around in the kernel even though > the VM fd may already be closed. That is perfectly ok. There is plenty of prior art, as well as plenty of w= ays for userspace to shoot itself in the foot. E.g. open a stats fd for a vCPU= and the VM and all its vCPUs will be kept alive. And conceptually it's sound, anything created in the scope of a VM _should_ pin the VM. 

> I feel that memslots form a natural way of managing usage of the gmem
> file. When a memslot is created, it is using the file; hence we take a
> refcount on the gmem file, and as memslots are removed, we drop
> refcounts on the gmem file.

Yes and no. It's definitely more natural *if* the goal is to allow guest_memfd
memory to exist without being attached to a VM. But I'm not at all convinced
that we want to allow that, or that it has desirable properties. With TDX and
SNP in particular, I'm pretty sure that allowing memory to outlive the VM is
very undesirable (more below).

> The KVM pointer is shared among all the bindings in gmem’s xarray, and we can
> enforce that a gmem file is used only with one VM:
>
> + When binding a memslot to the file, if a kvm pointer exists, it must
>   be the same kvm as the one in this binding
> + When the binding to the last memslot is removed from a file, NULL the
>   kvm pointer.

Nullifying the KVM pointer isn't sufficient, because without additional actions
userspace could extract data from a VM by deleting its memslots and then binding
the guest_memfd to an attacker-controlled VM. Or more likely with TDX and SNP,
induce badness by coercing KVM into mapping memory into a guest with the wrong
ASID/HKID.

I can think of three ways to handle that:

  (a) prevent a different VM from *ever* binding to the gmem instance
  (b) free/zero physical pages when unbinding
  (c) free/zero when binding to a different VM

Option (a) is easy, but that pretty much defeats the purpose of decoupling
guest_memfd from a VM.

Option (b) isn't hard to implement, but it screws up the lifecycle of the memory,
e.g. it would require freeing or zeroing memory when a memslot is deleted. That
isn't necessarily a deal-breaker, but it runs counter to how KVM memslots
currently operate. Memslots are basically just weird page tables, e.g. deleting
a memslot doesn't have any impact on the underlying data in memory. TDX throws a
wrench in this, as removing a page from the Secure EPT is effectively
destructive to the data (it can't be mapped back in to the VM without zeroing
the data), but IMO that's an oddity with TDX and not necessarily something we
want to carry over to other VM types.

There would also be performance implications (probably a non-issue in practice),
and weirdness if/when we get to sharing, linking and/or mmap()ing gmem. E.g. what
should happen if the last memslot (binding) is deleted, but there are
outstanding userspace mappings?

Option (c) is better from a lifecycle perspective, but it adds its own flavor of
complexity, e.g. the performant way to reclaim TDX memory requires the TDMR
(effectively the VM pointer), and so a deferred reclaim doesn't really work for
TDX. And I'm pretty sure it *can't* work for SNP, because RMP entries must not
outlive the VM; KVM can't reuse an ASID if there are pages assigned to that ASID
in the RMP, i.e. until all memory belonging to the VM has been fully freed.

> Could binding gmem files not on creation, but at memslot configuration
> time be sufficient and simpler?

After working through the flows, I think binding on-demand would simplify the
refcounting (stating the obvious), but complicate the lifecycle of the memory as
well as the contract between KVM and userspace, and would break the separation of
concerns between the inode (physical memory / data) and file (VM's view / mappings).
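
The rebinding problem and options (a), (b) and (c) above can be sketched as a
toy Python model. Everything here is hypothetical and for illustration only:
GuestMemfd, Policy and attack() are invented names, VMs are represented by
plain strings, and "pages" is a dict standing in for the physical memory backing
the file. It models only the data-leak scenario, not ASID/HKID or RMP state.

```python
from enum import Enum


class Policy(Enum):
    NAIVE = "only NULL the kvm pointer when the last binding is removed"
    BIND_ONCE = "(a) never let a different VM bind"
    ZERO_ON_UNBIND = "(b) free/zero pages when unbinding"
    ZERO_ON_REBIND = "(c) free/zero pages when a different VM binds"


class GuestMemfd:
    def __init__(self, policy):
        self.policy = policy
        self.pages = {}      # page index -> contents
        self.kvm = None      # VM currently bound, None if unbound
        self.owner = None    # last VM that bound; never cleared
        self.bindings = 0    # number of memslots bound to this file

    def bind(self, kvm):
        # A file may only be used by one VM at a time.
        if self.kvm is not None and self.kvm != kvm:
            raise PermissionError("already bound to another VM")
        if self.owner is not None and self.owner != kvm:
            if self.policy is Policy.BIND_ONCE:
                raise PermissionError("gmem instance belongs to another VM")
            if self.policy is Policy.ZERO_ON_REBIND:
                self.pages.clear()
        self.kvm = self.owner = kvm
        self.bindings += 1

    def unbind(self):
        self.bindings -= 1
        if self.bindings == 0:
            self.kvm = None  # the "NULL the kvm pointer" step
            if self.policy is Policy.ZERO_ON_UNBIND:
                self.pages.clear()


def attack(policy):
    """Return what a second VM can read from page 0, or None if nothing."""
    gmem = GuestMemfd(policy)
    gmem.bind("victim")
    gmem.pages[0] = "secret"
    gmem.unbind()            # userspace deletes the victim's memslot
    try:
        gmem.bind("attacker")
    except PermissionError:
        return None
    return gmem.pages.get(0)
```

In this model attack(Policy.NAIVE) returns the victim's data, while the other
three policies return None. The model also shows where each option pays its
cost: (b) destroys data on memslot deletion, and (c) needs to remember the
previous owner after it has been unbound.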