From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sean Christopherson
Date: Thu, 2 Nov 2023 10:37:29 -0700
Subject: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
In-Reply-To:
References: <20231027182217.3615211-1-seanjc@google.com> <20231027182217.3615211-17-seanjc@google.com> <6642c379-1023-4716-904f-4bbf076744c2@redhat.com>
Message-ID:
List-Id:
To: kvm-riscv@lists.infradead.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Thu, Nov 02, 2023, David Matlack wrote:
> On Thu, Nov 2, 2023 at 9:03 AM Sean Christopherson wrote:
> >
> > On Thu, Nov 02, 2023, Paolo Bonzini wrote:
> > > On 10/31/23 23:39, David Matlack wrote:
> > > > > > Maybe can you sketch out how you see this proposal being extensible to
> > > > > > using guest_memfd for shared mappings?
> > > > > For in-place conversions, e.g. pKVM, no additional guest_memfd is needed.  What's
> > > > > missing there is the ability to (safely) mmap() guest_memfd, e.g. KVM needs to
> > > > > ensure there are no outstanding references when converting back to private.
> > > > >
> > > > > For TDX/SNP, assuming we don't find a performant and robust way to do in-place
> > > > > conversions, a second fd+offset pair would be needed.
> > > > Is there a way to support non-in-place conversions within a single guest_memfd?
> > >
> > > For TDX/SNP, you could have a hook from KVM_SET_MEMORY_ATTRIBUTES to guest
> > > memory.  The hook would invalidate now-private parts if they have a VMA,
> > > causing a SIGSEGV/EFAULT if the host touches them.
> > >
> > > It would forbid mappings from multiple gfns to a single offset of the
> > > guest_memfd, because then the shared vs. private attribute would be tied to
> > > the offset.  This should not be a problem; for example, in the case of SNP,
> > > the RMP already requires a single mapping from host physical address to
> > > guest physical address.
> >
> > I don't see how this can work.
> > It's not an M:1 scenario (where M is multiple gfns),
> > it's a 1:N scenario (where N is multiple offsets).  The *gfn* doesn't change on
> > a conversion; what needs to change to do non-in-place conversion is the pfn, which
> > is effectively the guest_memfd+offset pair.
> >
> > So yes, we *could* support non-in-place conversions within a single guest_memfd,
> > but it would require a second offset,
>
> Why can't KVM free the existing page at guest_memfd+offset and
> allocate a new one when doing non-in-place conversions?

Oh, I see what you're suggesting.  Eww.

It's certainly possible, but it would largely defeat the purpose of adding
guest_memfd in the first place.

For TDX and SNP, the goal is to provide a simple, robust mechanism for isolating
guest private memory so that it's all but impossible for the host to access private
memory.  As things stand, memory for a given guest_memfd is either private or shared
(assuming we support a second guest_memfd per memslot).  I.e. there's no need to
track whether a given page/folio in the guest_memfd is private vs. shared.

We could use memory attributes, but that further complicates things when intrahost
migration (and potentially other multi-user scenarios) comes along, i.e. when KVM
supports linking multiple guest_memfd files to a single inode.  We'd have to ensure
that all "struct kvm" instances have identical PRIVATE attributes for a given
*offset* in the inode.  I'm not even sure how feasible that is for intrahost
migration, and that's the *easy* case, because IIRC it's already a hard requirement
that the source and destination have identical gfn=>guest_memfd bindings, i.e. KVM
can somewhat easily reason about gfn attributes.

But even then, that only helps with the actual migration of the VM, e.g. we'd still
have to figure out how to deal with .mmap() and other shared vs. private actions
when linking a new guest_memfd file against an existing inode.

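To make the "identical PRIVATE attributes for a given *offset*" requirement
concrete, here is a toy model in Python.  It is only a sketch, not KVM code:
ToyVm, private_by_offset() and conflicting_offsets() are hypothetical names
invented for illustration, and it assumes each VM maps at most one gfn to any
given inode offset.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVm:
    """Toy stand-in for one "struct kvm" bound to a shared guest_memfd inode."""
    gfn_to_offset: dict                              # gfn => page offset in the inode
    private_gfns: set = field(default_factory=set)   # gfns marked PRIVATE

    def private_by_offset(self):
        """Translate this VM's per-gfn attribute into a per-offset view."""
        return {off: gfn in self.private_gfns
                for gfn, off in self.gfn_to_offset.items()}


def conflicting_offsets(vms):
    """Return the inode offsets on which the VMs disagree about private vs. shared."""
    seen = {}
    conflicts = set()
    for vm in vms:
        for off, is_private in vm.private_by_offset().items():
            # The first VM to reference an offset sets the expectation.
            if seen.setdefault(off, is_private) != is_private:
                conflicts.add(off)
    return conflicts
```

With identical gfn=>offset bindings in the source and destination (the "easy"
case above), the check reduces to comparing gfn attributes.  In this toy, once
one VM marks a gfn private while the other still treats it as shared, the
offset backing that gfn shows up as a conflict.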
I haven't seen the pKVM patches for supporting .mmap(), so maybe this is already
a solved problem, but I'd honestly be quite surprised if it all works correctly
if/when KVM supports multiple files per inode.

And I don't see what value non-in-place conversions would add.  The value added
by in-place conversions, aside from the obvious preservation of data, which isn't
relevant to TDX/SNP, is that it doesn't require freeing and reallocating memory
to avoid double-allocating for private vs. shared.  That's especially nice
when hugepages are being used because reconstituting a hugepage "only" requires
zapping SPTEs.

But if KVM is freeing the private page, it's the same as punching a hole, probably
quite literally, when mapping the gfn as shared.  In every way I can think of, it's
worse.  E.g. it's more complex for KVM, and the PUNCH_HOLE => allocation operations
must be serialized.

Regarding double-allocating, I really, really think we should solve that in the
guest.  I.e. teach Linux-as-a-guest to aggressively convert at 2MiB granularity
and avoid 4KiB conversions.  4KiB conversions aren't just a memory utilization
problem, they're also a performance problem, e.g. they shatter hugepages (which KVM
doesn't yet support recovering) and increase TLB pressure for both stage-1 and
stage-2 mappings.
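
As a rough illustration of the 2MiB vs. 4KiB point, the sketch below counts
which 2MiB-aligned regions end up with a mix of private and shared 4KiB pages.
It is a toy model under stated assumptions, not guest or KVM code:
shattered_hugepages() is a hypothetical helper, all memory is assumed to start
out private, and each (gpa, length) pair is a byte range converted to shared.

```python
HUGE = 2 * 1024 * 1024   # 2MiB hugepage size
SMALL = 4 * 1024         # 4KiB base page size


def shattered_hugepages(conversions):
    """Return the 2MiB-aligned base addresses left partially shared.

    conversions: iterable of (gpa, length) byte ranges converted to shared.
    A 2MiB region counts as shattered when some, but not all, of its 4KiB
    pages were converted.
    """
    shared_pages = {}  # 2MiB region index => set of shared 4KiB page indices
    for gpa, length in conversions:
        first = gpa // SMALL
        last = (gpa + length + SMALL - 1) // SMALL   # exclusive, rounded up
        for page in range(first, last):
            shared_pages.setdefault(page * SMALL // HUGE, set()).add(page)

    pages_per_huge = HUGE // SMALL
    return sorted(idx * HUGE for idx, pages in shared_pages.items()
                  if len(pages) < pages_per_huge)
```

In this model, converting a single 4KiB page at 0x200000 leaves that 2MiB
region shattered, whereas converting the full, aligned 2MiB range leaves
nothing mixed.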
From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7B5C11D54B for ; Thu, 2 Nov 2023 17:37:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="lHMcZqBM" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-5788445ac04so864080a12.2 for ; Thu, 02 Nov 2023 10:37:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698946651; x=1699551451; darn=lists.linux.dev; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=uN1ssLkZHjDTrGZRWby6miGmTEsoKqfVN3YbqYw1LLA=; b=lHMcZqBMylOF5FJTlERs60/dfyfkW/gsoEvqM6HOx3dspDZbpY9mkm8zqGKoCOUNdo +zSLEBohdPzDyCWKEahn6tUFIP0KTedPbYS7cNiVD/rKMpjHzQgmPD9kleaQx6P0nRCQ JiaGJianHlT0sDRo6K/0l8YajkgPmXuMS8SRmMLaLPhd1ufNnoetoeKRgGrtqM/aLmyV gqctGyp2R3aUO/cwLt+2ibLgA5AMQIdnOYbwk4Fy+AhhaHnVrpV4TDjc+Gj4tvMZtjgS tlKlodH7wRLqWQkqcj3NPnjbb4RUqNuAthG5wSHsdG/58eDxIQoDmLH9Or7W9jQesMBC 4S+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698946651; x=1699551451; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=uN1ssLkZHjDTrGZRWby6miGmTEsoKqfVN3YbqYw1LLA=; b=B3HxQ5tyr7An8zn6pQBvk5XXUUK7Ag3xEj46Avb6rfR6sMKmgfBV83qAeq0jWJAokp ZnaXwWcr51hL/WNcu2WtoQmcjm2zF8rnInGtJOCy9ed6dvYguQAbI6KT8ZtK/QO7qPA6 
3SEd1Wmc0yfpAoWY8NZ5nt+sEBKs1w/sP91dnD263nsGZ1RlQSqzEPf3yj1VfPaVARDX aopAvmJbWfJH/vjT/s4yD539ODGBMepsqxJdhEpXB4UkyXgLWo0rmRaqqBR8onSP8Au9 CUalPLfunVZER+BJwfJJn+OhiPykJEG0cX0p4DDMPP98YGbB4CjXK278U0/mQx+Yj7Li xahQ== X-Gm-Message-State: AOJu0YzSa5tw1z3T/6q6vrFlW1lQ75pL9f/tl8o1LFKKgZd29kDYMySS R53X7E1sw4YihqrMbSoWGQZCRLgppxQ= X-Google-Smtp-Source: AGHT+IHO2BmoMwEG0Z4p7dUErvps7DbkHqeEBbdPqNa8vmvQCAafmC1+rihoFOkAVlqXxxzUlxPm8WkwUZU= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:903:25d2:b0:1cc:2ffe:5a27 with SMTP id jc18-20020a17090325d200b001cc2ffe5a27mr287356plb.9.1698946650780; Thu, 02 Nov 2023 10:37:30 -0700 (PDT) Date: Thu, 2 Nov 2023 10:37:29 -0700 In-Reply-To: Precedence: bulk X-Mailing-List: kvmarm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231027182217.3615211-1-seanjc@google.com> <20231027182217.3615211-17-seanjc@google.com> <6642c379-1023-4716-904f-4bbf076744c2@redhat.com> Message-ID: Subject: Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory From: Sean Christopherson To: David Matlack Cc: Paolo Bonzini , Marc Zyngier , Oliver Upton , Huacai Chen , Michael Ellerman , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Alexander Viro , Christian Brauner , "Matthew Wilcox (Oracle)" , Andrew Morton , kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li , Xu Yilun , Chao Peng , Fuad Tabba , Jarkko Sakkinen , Anish Moorthy , Yu Zhang , Isaku Yamahata , "=?utf-8?Q?Micka=C3=ABl_Sala=C3=BCn?=" , Vlastimil Babka , Vishal Annapurve , Ackerley Tng , Maciej Szmigiero , David Hildenbrand , Quentin Perret , Michael Roth , Wang , Liam Merwick , Isaku Yamahata , 
"Kirill A . Shutemov" Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable On Thu, Nov 02, 2023, David Matlack wrote: > On Thu, Nov 2, 2023 at 9:03=E2=80=AFAM Sean Christopherson wrote: > > > > On Thu, Nov 02, 2023, Paolo Bonzini wrote: > > > On 10/31/23 23:39, David Matlack wrote: > > > > > > Maybe can you sketch out how you see this proposal being extens= ible to > > > > > > using guest_memfd for shared mappings? > > > > > For in-place conversions, e.g. pKVM, no additional guest_memfd is= needed. What's > > > > > missing there is the ability to (safely) mmap() guest_memfd, e.g.= KVM needs to > > > > > ensure there are no outstanding references when converting back t= o private. > > > > > > > > > > For TDX/SNP, assuming we don't find a performant and robust way t= o do in-place > > > > > conversions, a second fd+offset pair would be needed. > > > > Is there a way to support non-in-place conversions within a single = guest_memfd? > > > > > > For TDX/SNP, you could have a hook from KVM_SET_MEMORY_ATTRIBUTES to = guest > > > memory. The hook would invalidate now-private parts if they have a V= MA, > > > causing a SIGSEGV/EFAULT if the host touches them. > > > > > > It would forbid mappings from multiple gfns to a single offset of the > > > guest_memfd, because then the shared vs. private attribute would be t= ied to > > > the offset. This should not be a problem; for example, in the case o= f SNP, > > > the RMP already requires a single mapping from host physical address = to > > > guest physical address. > > > > I don't see how this can work. It's not a M:1 scenario (where M is mul= tiple gfns), > > it's a 1:N scenario (wheren N is multiple offsets). The *gfn* doesn't = change on > > a conversion, what needs to change to do non-in-place conversion is the= pfn, which > > is effectively the guest_memfd+offset pair. 
> > > > So yes, we *could* support non-in-place conversions within a single gue= st_memfd, > > but it would require a second offset, >=20 > Why can't KVM free the existing page at guest_memfd+offset and > allocate a new one when doing non-in-place conversions? Oh, I see what you're suggesting. Eww. It's certainly possible, but it would largely defeat the purpose of why we = are adding guest_memfd in the first place. For TDX and SNP, the goal is to provide a simple, robust mechanism for isol= ating guest private memory so that it's all but impossible for the host to access= private memory. As things stand, memory for a given guest_memfd is either private = or shared (assuming we support a second guest_memfd per memslot). I.e. there's no ne= ed to track whether a given page/folio in the guest_memfd is private vs. shared. We could use memory attributes, but that further complicates things when in= trahost migration (and potentially other multi-user scenarios) comes along, i.e. wh= en KVM supports linking multiple guest_memfd files to a single inode. We'd have t= o ensure that all "struct kvm" instances have identical PRIVATE attributes for a giv= en *offset* in the inode. I'm not even sure how feasible that is for intrahos= t migration, and that's the *easy* case, because IIRC it's already a hard req= uirement that the source and destination have identical gnf=3D>guest_memfd bindings,= i.e. KVM can somewhat easily reason about gfn attributes. But even then, that only helps with the actual migration of the VM, e.g. we= 'd still have to figure out how to deal with .mmap() and other shared vs. private ac= tions when linking a new guest_memfd file against an existing inode. I haven't seen the pKVM patches for supporting .mmap(), so maybe this is al= ready a solved problem, but I'd honestly be quite surprised if it all works corre= ctly if/when KVM supports multiple files per inode. And I don't see what value non-in-place conversions would add. 
The value a= dded by in-place conversions, aside from the obvious preservation of data, which= isn't relevant to TDX/SNP, is that it doesn't require freeing and reallocating me= mory to avoid double-allocating for private vs. shared. That's especialy quite = nice when hugepages are being used because reconstituing a hugepage "only" requi= res zapping SPTEs. But if KVM is freeing the private page, it's the same as punching a hole, p= robably quite literally, when mapping the gfn as shared. In every way I can think = of, it's worse. E.g. it's more complex for KVM, and the PUNCH_HOLE =3D> allocation = operations must be serialized. Regarding double-allocating, I really, really think we should solve that in= the guest. I.e. teach Linux-as-a-guest to aggressively convert at 2MiB granula= rity and avoid 4KiB conversions. 4KiB conversions aren't just a memory utilizat= ion problem, they're also a performance problem, e.g. shatters hugepages (which= KVM doesn't yet support recovering) and increases TLB pressure for both stage-1= and stage-2 mappings. 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2A20AC4332F for ; Thu, 2 Nov 2023 17:37:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:Message-ID: References:Mime-Version:In-Reply-To:Date:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=XZsb7JeXTNIVY3T5vCnZ43qXP/Q7PZen+MpWJchRwzE=; b=u1YsXvtnF80V26tE0zPnFU7Fmr DG9u8LyN1ApvKR1tYfWCB1pWq1X40dGEOQXMK1GIjFBHr+LJDFBuhBPCp/QOBF9fBsOHU+zSRPxd6 mvgWWDNAEfQbPEgQD66DpGpEnfCxT77W2fUejKzuDWr4Y1+FstmvkfMtpaHTpY5PEjNmAPBM3vu6d M7zSGtwvob8Xa0oCYi2drW+GpO0z039gac5LNNBm9DaQkHFgMbvlHfxkeZk9249obHG0AdyXhb/F9 dwh/Ig+iIcFh/bYzUU4MkpOkVFv7Jr4g2V/aTWbt8ZOxriM0NtfAiyz8zJRPsK0jAFOAiInBNijij B8//6sWQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qybdS-009zRW-2x; Thu, 02 Nov 2023 17:37:38 +0000 Received: from mail-pl1-x64a.google.com ([2607:f8b0:4864:20::64a]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qybdP-009zNp-1V for linux-riscv@lists.infradead.org; Thu, 02 Nov 2023 17:37:37 +0000 Received: by mail-pl1-x64a.google.com with SMTP id d9443c01a7336-1cc2ebc3b3eso8917865ad.2 for ; Thu, 02 Nov 2023 10:37:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698946651; x=1699551451; darn=lists.infradead.org; 
h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=uN1ssLkZHjDTrGZRWby6miGmTEsoKqfVN3YbqYw1LLA=; b=BBdLtxJnaotx+0awbo42X18NDzSHHxFKba0JQH810CBqTgkLjk0H+VV/A2hW+/QNiA PUkSkBAkJNrJRIlA86J+08pRVb5ytptuLbuwso1wTH+49Am+rOKIegD7W//LZNeDJ04x AzCzDcwVRCLSBBi9dT8pn8kTU0RlopCYT1rRscYtTqvJ1J4dZYOL9HAn45yyVK19kba4 A6Atn82W1vz+uqEJfSJZVFzwY2HbvY5gcrQS3c6gcwjTgsak3Va/AB6o1SGYC6EbbVFi 4ckzYHVjxPmcTGM71G4YZpyvjz0XxgxrrBDwgeczJh6exNwwiVyo8h1e4TR1KoYUaE1w SEmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698946651; x=1699551451; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=uN1ssLkZHjDTrGZRWby6miGmTEsoKqfVN3YbqYw1LLA=; b=ouWxPJqOr+SSm/0LhiuK7TLW9hqMehp+vzDRmpYv833QJlrGlrlgk4AxIkPQCR0tXY 3u3qJk+p0H0l5q+MBf0oKQF/Oep18riODRe8DNd9DVOqJ7VSLXMJ4/FByrSajjiWAqJl sYz279eWDsj2k48FW6je/EB0j3jsNa31TxDd3F/WPatUvFknb3SelmK3H4mNesk177t4 0REJuFMcy0j1LYCNjPwgTXm0pnh2w92Ihk2WvfuER6YgR5VPuPzdugVPiTLxmZNRYmto en/mF3POP9v2WlUvn1UPCMTFjnhNRsqO7Jfje/zYjiXxyWMBcGow9RGW3zHS8eRWkbga apXw== X-Gm-Message-State: AOJu0YzQ4q4zHLyir2suK4faxmYeIHLzcqy/g+DQgQ0Cm32bk0XdAWpZ UaISX5cnecQiAERTPQPYOZDu7d0sy/s= X-Google-Smtp-Source: AGHT+IHO2BmoMwEG0Z4p7dUErvps7DbkHqeEBbdPqNa8vmvQCAafmC1+rihoFOkAVlqXxxzUlxPm8WkwUZU= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:903:25d2:b0:1cc:2ffe:5a27 with SMTP id jc18-20020a17090325d200b001cc2ffe5a27mr287356plb.9.1698946650780; Thu, 02 Nov 2023 10:37:30 -0700 (PDT) Date: Thu, 2 Nov 2023 10:37:29 -0700 In-Reply-To: Mime-Version: 1.0 References: <20231027182217.3615211-1-seanjc@google.com> <20231027182217.3615211-17-seanjc@google.com> <6642c379-1023-4716-904f-4bbf076744c2@redhat.com> Message-ID: Subject: Re: [PATCH v13 16/35] KVM: Add 
KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory From: Sean Christopherson To: David Matlack Cc: Paolo Bonzini , Marc Zyngier , Oliver Upton , Huacai Chen , Michael Ellerman , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Alexander Viro , Christian Brauner , "Matthew Wilcox (Oracle)" , Andrew Morton , kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li , Xu Yilun , Chao Peng , Fuad Tabba , Jarkko Sakkinen , Anish Moorthy , Yu Zhang , Isaku Yamahata , "=?utf-8?Q?Micka=C3=ABl_Sala=C3=BCn?=" , Vlastimil Babka , Vishal Annapurve , Ackerley Tng , Maciej Szmigiero , David Hildenbrand , Quentin Perret , Michael Roth , Wang , Liam Merwick , Isaku Yamahata , "Kirill A . Shutemov" X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231102_103735_503985_C96F833F X-CRM114-Status: GOOD ( 35.91 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org T24gVGh1LCBOb3YgMDIsIDIwMjMsIERhdmlkIE1hdGxhY2sgd3JvdGU6Cj4gT24gVGh1LCBOb3Yg MiwgMjAyMyBhdCA5OjAz4oCvQU0gU2VhbiBDaHJpc3RvcGhlcnNvbiA8c2VhbmpjQGdvb2dsZS5j b20+IHdyb3RlOgo+ID4KPiA+IE9uIFRodSwgTm92IDAyLCAyMDIzLCBQYW9sbyBCb256aW5pIHdy b3RlOgo+ID4gPiBPbiAxMC8zMS8yMyAyMzozOSwgRGF2aWQgTWF0bGFjayB3cm90ZToKPiA+ID4g PiA+ID4gTWF5YmUgY2FuIHlvdSBza2V0Y2ggb3V0IGhvdyB5b3Ugc2VlIHRoaXMgcHJvcG9zYWwg YmVpbmcgZXh0ZW5zaWJsZSB0bwo+ID4gPiA+ID4gPiB1c2luZyBndWVzdF9tZW1mZCBmb3Igc2hh 
cmVkIG1hcHBpbmdzPwo+ID4gPiA+ID4gRm9yIGluLXBsYWNlIGNvbnZlcnNpb25zLCBlLmcuIHBL Vk0sIG5vIGFkZGl0aW9uYWwgZ3Vlc3RfbWVtZmQgaXMgbmVlZGVkLiAgV2hhdCdzCj4gPiA+ID4g PiBtaXNzaW5nIHRoZXJlIGlzIHRoZSBhYmlsaXR5IHRvIChzYWZlbHkpIG1tYXAoKSBndWVzdF9t ZW1mZCwgZS5nLiBLVk0gbmVlZHMgdG8KPiA+ID4gPiA+IGVuc3VyZSB0aGVyZSBhcmUgbm8gb3V0 c3RhbmRpbmcgcmVmZXJlbmNlcyB3aGVuIGNvbnZlcnRpbmcgYmFjayB0byBwcml2YXRlLgo+ID4g PiA+ID4KPiA+ID4gPiA+IEZvciBURFgvU05QLCBhc3N1bWluZyB3ZSBkb24ndCBmaW5kIGEgcGVy Zm9ybWFudCBhbmQgcm9idXN0IHdheSB0byBkbyBpbi1wbGFjZQo+ID4gPiA+ID4gY29udmVyc2lv bnMsIGEgc2Vjb25kIGZkK29mZnNldCBwYWlyIHdvdWxkIGJlIG5lZWRlZC4KPiA+ID4gPiBJcyB0 aGVyZSBhIHdheSB0byBzdXBwb3J0IG5vbi1pbi1wbGFjZSBjb252ZXJzaW9ucyB3aXRoaW4gYSBz aW5nbGUgZ3Vlc3RfbWVtZmQ/Cj4gPiA+Cj4gPiA+IEZvciBURFgvU05QLCB5b3UgY291bGQgaGF2 ZSBhIGhvb2sgZnJvbSBLVk1fU0VUX01FTU9SWV9BVFRSSUJVVEVTIHRvIGd1ZXN0Cj4gPiA+IG1l bW9yeS4gIFRoZSBob29rIHdvdWxkIGludmFsaWRhdGUgbm93LXByaXZhdGUgcGFydHMgaWYgdGhl eSBoYXZlIGEgVk1BLAo+ID4gPiBjYXVzaW5nIGEgU0lHU0VHVi9FRkFVTFQgaWYgdGhlIGhvc3Qg dG91Y2hlcyB0aGVtLgo+ID4gPgo+ID4gPiBJdCB3b3VsZCBmb3JiaWQgbWFwcGluZ3MgZnJvbSBt dWx0aXBsZSBnZm5zIHRvIGEgc2luZ2xlIG9mZnNldCBvZiB0aGUKPiA+ID4gZ3Vlc3RfbWVtZmQs IGJlY2F1c2UgdGhlbiB0aGUgc2hhcmVkIHZzLiBwcml2YXRlIGF0dHJpYnV0ZSB3b3VsZCBiZSB0 aWVkIHRvCj4gPiA+IHRoZSBvZmZzZXQuICBUaGlzIHNob3VsZCBub3QgYmUgYSBwcm9ibGVtOyBm b3IgZXhhbXBsZSwgaW4gdGhlIGNhc2Ugb2YgU05QLAo+ID4gPiB0aGUgUk1QIGFscmVhZHkgcmVx dWlyZXMgYSBzaW5nbGUgbWFwcGluZyBmcm9tIGhvc3QgcGh5c2ljYWwgYWRkcmVzcyB0bwo+ID4g PiBndWVzdCBwaHlzaWNhbCBhZGRyZXNzLgo+ID4KPiA+IEkgZG9uJ3Qgc2VlIGhvdyB0aGlzIGNh biB3b3JrLiAgSXQncyBub3QgYSBNOjEgc2NlbmFyaW8gKHdoZXJlIE0gaXMgbXVsdGlwbGUgZ2Zu cyksCj4gPiBpdCdzIGEgMTpOIHNjZW5hcmlvICh3aGVyZW4gTiBpcyBtdWx0aXBsZSBvZmZzZXRz KS4gIFRoZSAqZ2ZuKiBkb2Vzbid0IGNoYW5nZSBvbgo+ID4gYSBjb252ZXJzaW9uLCB3aGF0IG5l ZWRzIHRvIGNoYW5nZSB0byBkbyBub24taW4tcGxhY2UgY29udmVyc2lvbiBpcyB0aGUgcGZuLCB3 aGljaAo+ID4gaXMgZWZmZWN0aXZlbHkgdGhlIGd1ZXN0X21lbWZkK29mZnNldCBwYWlyLgo+ID4K 
PiA+IFNvIHllcywgd2UgKmNvdWxkKiBzdXBwb3J0IG5vbi1pbi1wbGFjZSBjb252ZXJzaW9ucyB3 aXRoaW4gYSBzaW5nbGUgZ3Vlc3RfbWVtZmQsCj4gPiBidXQgaXQgd291bGQgcmVxdWlyZSBhIHNl Y29uZCBvZmZzZXQsCj4gCj4gV2h5IGNhbid0IEtWTSBmcmVlIHRoZSBleGlzdGluZyBwYWdlIGF0 IGd1ZXN0X21lbWZkK29mZnNldCBhbmQKPiBhbGxvY2F0ZSBhIG5ldyBvbmUgd2hlbiBkb2luZyBu b24taW4tcGxhY2UgY29udmVyc2lvbnM/CgpPaCwgSSBzZWUgd2hhdCB5b3UncmUgc3VnZ2VzdGlu Zy4gIEV3dy4KCkl0J3MgY2VydGFpbmx5IHBvc3NpYmxlLCBidXQgaXQgd291bGQgbGFyZ2VseSBk ZWZlYXQgdGhlIHB1cnBvc2Ugb2Ygd2h5IHdlIGFyZQphZGRpbmcgZ3Vlc3RfbWVtZmQgaW4gdGhl IGZpcnN0IHBsYWNlLgoKRm9yIFREWCBhbmQgU05QLCB0aGUgZ29hbCBpcyB0byBwcm92aWRlIGEg c2ltcGxlLCByb2J1c3QgbWVjaGFuaXNtIGZvciBpc29sYXRpbmcKZ3Vlc3QgcHJpdmF0ZSBtZW1v cnkgc28gdGhhdCBpdCdzIGFsbCBidXQgaW1wb3NzaWJsZSBmb3IgdGhlIGhvc3QgdG8gYWNjZXNz IHByaXZhdGUKbWVtb3J5LiAgQXMgdGhpbmdzIHN0YW5kLCBtZW1vcnkgZm9yIGEgZ2l2ZW4gZ3Vl c3RfbWVtZmQgaXMgZWl0aGVyIHByaXZhdGUgb3Igc2hhcmVkCihhc3N1bWluZyB3ZSBzdXBwb3J0 IGEgc2Vjb25kIGd1ZXN0X21lbWZkIHBlciBtZW1zbG90KS4gIEkuZS4gdGhlcmUncyBubyBuZWVk IHRvCnRyYWNrIHdoZXRoZXIgYSBnaXZlbiBwYWdlL2ZvbGlvIGluIHRoZSBndWVzdF9tZW1mZCBp cyBwcml2YXRlIHZzLiBzaGFyZWQuCgpXZSBjb3VsZCB1c2UgbWVtb3J5IGF0dHJpYnV0ZXMsIGJ1 dCB0aGF0IGZ1cnRoZXIgY29tcGxpY2F0ZXMgdGhpbmdzIHdoZW4gaW50cmFob3N0Cm1pZ3JhdGlv biAoYW5kIHBvdGVudGlhbGx5IG90aGVyIG11bHRpLXVzZXIgc2NlbmFyaW9zKSBjb21lcyBhbG9u ZywgaS5lLiB3aGVuIEtWTQpzdXBwb3J0cyBsaW5raW5nIG11bHRpcGxlIGd1ZXN0X21lbWZkIGZp bGVzIHRvIGEgc2luZ2xlIGlub2RlLiAgV2UnZCBoYXZlIHRvIGVuc3VyZQp0aGF0IGFsbCAic3Ry dWN0IGt2bSIgaW5zdGFuY2VzIGhhdmUgaWRlbnRpY2FsIFBSSVZBVEUgYXR0cmlidXRlcyBmb3Ig YSBnaXZlbgoqb2Zmc2V0KiBpbiB0aGUgaW5vZGUuICBJJ20gbm90IGV2ZW4gc3VyZSBob3cgZmVh c2libGUgdGhhdCBpcyBmb3IgaW50cmFob3N0Cm1pZ3JhdGlvbiwgYW5kIHRoYXQncyB0aGUgKmVh c3kqIGNhc2UsIGJlY2F1c2UgSUlSQyBpdCdzIGFscmVhZHkgYSBoYXJkIHJlcXVpcmVtZW50CnRo YXQgdGhlIHNvdXJjZSBhbmQgZGVzdGluYXRpb24gaGF2ZSBpZGVudGljYWwgZ25mPT5ndWVzdF9t ZW1mZCBiaW5kaW5ncywgaS5lLiBLVk0KY2FuIHNvbWV3aGF0IGVhc2lseSByZWFzb24gYWJvdXQg 
Z2ZuIGF0dHJpYnV0ZXMuCgpCdXQgZXZlbiB0aGVuLCB0aGF0IG9ubHkgaGVscHMgd2l0aCB0aGUg YWN0dWFsIG1pZ3JhdGlvbiBvZiB0aGUgVk0sIGUuZy4gd2UnZCBzdGlsbApoYXZlIHRvIGZpZ3Vy ZSBvdXQgaG93IHRvIGRlYWwgd2l0aCAubW1hcCgpIGFuZCBvdGhlciBzaGFyZWQgdnMuIHByaXZh dGUgYWN0aW9ucwp3aGVuIGxpbmtpbmcgYSBuZXcgZ3Vlc3RfbWVtZmQgZmlsZSBhZ2FpbnN0IGFu IGV4aXN0aW5nIGlub2RlLgoKSSBoYXZlbid0IHNlZW4gdGhlIHBLVk0gcGF0Y2hlcyBmb3Igc3Vw cG9ydGluZyAubW1hcCgpLCBzbyBtYXliZSB0aGlzIGlzIGFscmVhZHkKYSBzb2x2ZWQgcHJvYmxl bSwgYnV0IEknZCBob25lc3RseSBiZSBxdWl0ZSBzdXJwcmlzZWQgaWYgaXQgYWxsIHdvcmtzIGNv cnJlY3RseQppZi93aGVuIEtWTSBzdXBwb3J0cyBtdWx0aXBsZSBmaWxlcyBwZXIgaW5vZGUuCgpB bmQgSSBkb24ndCBzZWUgd2hhdCB2YWx1ZSBub24taW4tcGxhY2UgY29udmVyc2lvbnMgd291bGQg YWRkLiAgVGhlIHZhbHVlIGFkZGVkCmJ5IGluLXBsYWNlIGNvbnZlcnNpb25zLCBhc2lkZSBmcm9t IHRoZSBvYnZpb3VzIHByZXNlcnZhdGlvbiBvZiBkYXRhLCB3aGljaCBpc24ndApyZWxldmFudCB0 byBURFgvU05QLCBpcyB0aGF0IGl0IGRvZXNuJ3QgcmVxdWlyZSBmcmVlaW5nIGFuZCByZWFsbG9j YXRpbmcgbWVtb3J5CnRvIGF2b2lkIGRvdWJsZS1hbGxvY2F0aW5nIGZvciBwcml2YXRlIHZzLiBz aGFyZWQuICBUaGF0J3MgZXNwZWNpYWx5IHF1aXRlIG5pY2UKd2hlbiBodWdlcGFnZXMgYXJlIGJl aW5nIHVzZWQgYmVjYXVzZSByZWNvbnN0aXR1aW5nIGEgaHVnZXBhZ2UgIm9ubHkiIHJlcXVpcmVz CnphcHBpbmcgU1BURXMuCgpCdXQgaWYgS1ZNIGlzIGZyZWVpbmcgdGhlIHByaXZhdGUgcGFnZSwg aXQncyB0aGUgc2FtZSBhcyBwdW5jaGluZyBhIGhvbGUsIHByb2JhYmx5CnF1aXRlIGxpdGVyYWxs eSwgd2hlbiBtYXBwaW5nIHRoZSBnZm4gYXMgc2hhcmVkLiAgSW4gZXZlcnkgd2F5IEkgY2FuIHRo aW5rIG9mLCBpdCdzCndvcnNlLiAgRS5nLiBpdCdzIG1vcmUgY29tcGxleCBmb3IgS1ZNLCBhbmQg dGhlIFBVTkNIX0hPTEUgPT4gYWxsb2NhdGlvbiBvcGVyYXRpb25zCm11c3QgYmUgc2VyaWFsaXpl ZC4KClJlZ2FyZGluZyBkb3VibGUtYWxsb2NhdGluZywgSSByZWFsbHksIHJlYWxseSB0aGluayB3 ZSBzaG91bGQgc29sdmUgdGhhdCBpbiB0aGUKZ3Vlc3QuICBJLmUuIHRlYWNoIExpbnV4LWFzLWEt Z3Vlc3QgdG8gYWdncmVzc2l2ZWx5IGNvbnZlcnQgYXQgMk1pQiBncmFudWxhcml0eQphbmQgYXZv aWQgNEtpQiBjb252ZXJzaW9ucy4gIDRLaUIgY29udmVyc2lvbnMgYXJlbid0IGp1c3QgYSBtZW1v cnkgdXRpbGl6YXRpb24KcHJvYmxlbSwgdGhleSdyZSBhbHNvIGEgcGVyZm9ybWFuY2UgcHJvYmxl 
bSwgZS5nLiBzaGF0dGVycyBodWdlcGFnZXMgKHdoaWNoIEtWTQpkb2Vzbid0IHlldCBzdXBwb3J0 IHJlY292ZXJpbmcpIGFuZCBpbmNyZWFzZXMgVExCIHByZXNzdXJlIGZvciBib3RoIHN0YWdlLTEg YW5kCnN0YWdlLTIgbWFwcGluZ3MuCgpfX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fXwpsaW51eC1yaXNjdiBtYWlsaW5nIGxpc3QKbGludXgtcmlzY3ZAbGlzdHMu aW5mcmFkZWFkLm9yZwpodHRwOi8vbGlzdHMuaW5mcmFkZWFkLm9yZy9tYWlsbWFuL2xpc3RpbmZv L2xpbnV4LXJpc2N2Cg== From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D260AC4167B for ; Thu, 2 Nov 2023 17:38:30 +0000 (UTC) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20230601 header.b=Zp4Asiq4; dkim-atps=neutral Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4SLrh52zwbz3dBs for ; Fri, 3 Nov 2023 04:38:29 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20230601 header.b=Zp4Asiq4; dkim-atps=neutral Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--seanjc.bounces.google.com (client-ip=2607:f8b0:4864:20::64a; helo=mail-pl1-x64a.google.com; envelope-from=3wt5dzqykdfch3zc815dd5a3.1dba7cjmee1-23ka7hih.doaz0h.dg5@flex--seanjc.bounces.google.com; receiver=lists.ozlabs.org) Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client 
certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4SLrg31Fl2z3cR9 for ; Fri, 3 Nov 2023 04:37:33 +1100 (AEDT) Received: by mail-pl1-x64a.google.com with SMTP id d9443c01a7336-1cc335bcb47so8956095ad.1 for ; Thu, 02 Nov 2023 10:37:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698946651; x=1699551451; darn=lists.ozlabs.org; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=uN1ssLkZHjDTrGZRWby6miGmTEsoKqfVN3YbqYw1LLA=; b=Zp4Asiq4gqFaX0TPqe4sSp23wUCVeGs50Ej9CrsnRGIT0DovJVP8dBJIYz3fLmVlsx X7kfwSIEnZC81y/HSn859+hQd8zwCt6PgN12PzUIDomRPGo3K6HFKgQH/dI2zraIt1Pe NsTguNhLvTdSxyNVDUts5dNpi7vjmPjYYEEHxy+JmVOdGCXuUoB+ycFFkTxmtaqrdLON bTUPZnyV2IO/pqNMgPZ2p68txqFwPN7gpS3oROpMTo5D6lfBaUvsKpb0pT3A7jU5kYp9 JTwUh9UfJSo9JMbo82tguxiNhqLHUgOr/ibnM5pHDMKSQWYvTjyMRGqtD4Kqsktp6eAE NoAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698946651; x=1699551451; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=uN1ssLkZHjDTrGZRWby6miGmTEsoKqfVN3YbqYw1LLA=; b=FOQIqi0lPfmz8JCP5RVLuAiswxoCYqX4O0z/3tO04m9Qf9TUxSH+S2HM9IU8em7SD/ VYSczTfS5Qwj4ZUSDVIM8Zjmz8DMyEgX4JeO9pG9v0WQDrVnUzUCjWTwLbpnd3+vvqJK rFow+T+01AdKJ34TpA7OdJsh5Hhk0LipKOb4922aP95Bl12sMJ3U3z6QXAMJFYSeJjR1 ELEsB3AVpdgU+oJWFKjkcdKo7vAyQ8o8YuEkP5aE1qfxt/d3E/zbEjofanDOpBRyxmeF LvbLhgfDLPf/lCkJLl5U7Sfz/O/eGh2YkLGiXLGfjWFmvNFDcELOc0pmYPQYgBrm6T0G Lsiw== X-Gm-Message-State: AOJu0YxKU2ogGRYywZPO5ZXrTSAuj1w+rEfn6dY+CK9+ox9EUz8X5sM2 dJ/AmHvNFSgEM+lgMWRT1UY9wpbtBNg= X-Google-Smtp-Source: AGHT+IHO2BmoMwEG0Z4p7dUErvps7DbkHqeEBbdPqNa8vmvQCAafmC1+rihoFOkAVlqXxxzUlxPm8WkwUZU= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:903:25d2:b0:1cc:2ffe:5a27 with SMTP id 
jc18-20020a17090325d200b001cc2ffe5a27mr287356plb.9.1698946650780; Thu, 02 Nov 2023 10:37:30 -0700 (PDT)
Date: Thu, 2 Nov 2023 10:37:29 -0700
Mime-Version: 1.0
References: <20231027182217.3615211-1-seanjc@google.com>
 <20231027182217.3615211-17-seanjc@google.com>
 <6642c379-1023-4716-904f-4bbf076744c2@redhat.com>
Subject: Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
From: Sean Christopherson
To: David Matlack
Content-Type: text/plain; charset="utf-8"
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
Cc: kvm@vger.kernel.org, David Hildenbrand, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Chao Peng, linux-riscv@lists.infradead.org,
 Isaku Yamahata, Marc Zyngier, Huacai Chen, Xiaoyao Li,
 "Matthew Wilcox (Oracle)", Wang, Fuad Tabba, Yu Zhang, Maciej Szmigiero,
 Albert Ou, Vlastimil Babka, Michael Roth, Ackerley Tng, Alexander Viro,
 Paul Walmsley, kvmarm@lists.linux.dev,
 linux-arm-kernel@lists.infradead.org, Mickaël Salaün, Isaku Yamahata,
 Christian Brauner, Quentin Perret, Liam Merwick,
 linux-mips@vger.kernel.org, Oliver Upton, Jarkko Sakkinen, Palmer Dabbelt,
 "Kirill A . Shutemov", kvm-riscv@lists.infradead.org, Anup Patel,
 linux-fsdevel@vger.kernel.org, Paolo Bonzini, Andrew Morton,
 Vishal Annapurve, linuxppc-dev@lists.ozlabs.org, Xu Yilun, Anish Moorthy
Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

On Thu, Nov 02, 2023, David Matlack wrote:
> On Thu, Nov 2, 2023 at 9:03 AM Sean Christopherson wrote:
> >
> > On Thu, Nov 02, 2023, Paolo Bonzini wrote:
> > > On 10/31/23 23:39, David Matlack wrote:
> > > > > > Maybe can you sketch out how you see this proposal being extensible to
> > > > > > using guest_memfd for shared mappings?
> > > > > For in-place conversions, e.g. pKVM, no additional guest_memfd is needed.  What's
> > > > > missing there is the ability to (safely) mmap() guest_memfd, e.g. KVM needs to
> > > > > ensure there are no outstanding references when converting back to private.
> > > > >
> > > > > For TDX/SNP, assuming we don't find a performant and robust way to do in-place
> > > > > conversions, a second fd+offset pair would be needed.
> > > > Is there a way to support non-in-place conversions within a single guest_memfd?
> > >
> > > For TDX/SNP, you could have a hook from KVM_SET_MEMORY_ATTRIBUTES to guest
> > > memory.  The hook would invalidate now-private parts if they have a VMA,
> > > causing a SIGSEGV/EFAULT if the host touches them.
> > >
> > > It would forbid mappings from multiple gfns to a single offset of the
> > > guest_memfd, because then the shared vs. private attribute would be tied to
> > > the offset.  This should not be a problem; for example, in the case of SNP,
> > > the RMP already requires a single mapping from host physical address to
> > > guest physical address.
> >
> > I don't see how this can work.  It's not a M:1 scenario (where M is multiple gfns),
> > it's a 1:N scenario (where N is multiple offsets).  The *gfn* doesn't change on
> > a conversion, what needs to change to do non-in-place conversion is the pfn, which
> > is effectively the guest_memfd+offset pair.
> >
> > So yes, we *could* support non-in-place conversions within a single guest_memfd,
> > but it would require a second offset,
>
> Why can't KVM free the existing page at guest_memfd+offset and
> allocate a new one when doing non-in-place conversions?

Oh, I see what you're suggesting.  Eww.

It's certainly possible, but it would largely defeat the purpose of adding
guest_memfd in the first place.

For TDX and SNP, the goal is to provide a simple, robust mechanism for isolating
guest private memory so that it's all but impossible for the host to access
private memory.  As things stand, memory for a given guest_memfd is either
private or shared (assuming we support a second guest_memfd per memslot).  I.e.
there's no need to track whether a given page/folio in the guest_memfd is
private vs. shared.

We could use memory attributes, but that further complicates things when
intrahost migration (and potentially other multi-user scenarios) comes along,
i.e. when KVM supports linking multiple guest_memfd files to a single inode.
We'd have to ensure that all "struct kvm" instances have identical PRIVATE
attributes for a given *offset* in the inode.  I'm not even sure how feasible
that is for intrahost migration, and that's the *easy* case, because IIRC it's
already a hard requirement that the source and destination have identical
gfn=>guest_memfd bindings, i.e. KVM can somewhat easily reason about gfn
attributes.

But even then, that only helps with the actual migration of the VM, e.g. we'd
still have to figure out how to deal with .mmap() and other shared vs. private
actions when linking a new guest_memfd file against an existing inode.

I haven't seen the pKVM patches for supporting .mmap(), so maybe this is already
a solved problem, but I'd honestly be quite surprised if it all works correctly
if/when KVM supports multiple files per inode.

And I don't see what value non-in-place conversions would add.  The value added
by in-place conversions, aside from the obvious preservation of data, which
isn't relevant to TDX/SNP, is that it doesn't require freeing and reallocating
memory to avoid double-allocating for private vs. shared.  That's especially
nice when hugepages are being used because reconstituting a hugepage "only"
requires zapping SPTEs.

But if KVM is freeing the private page, it's the same as punching a hole,
probably quite literally, when mapping the gfn as shared.  In every way I can
think of, it's worse.  E.g. it's more complex for KVM, and the
PUNCH_HOLE => allocation operations must be serialized.

Regarding double-allocating, I really, really think we should solve that in the
guest.  I.e. teach Linux-as-a-guest to aggressively convert at 2MiB granularity
and avoid 4KiB conversions.  4KiB conversions aren't just a memory utilization
problem, they're also a performance problem, e.g. they shatter hugepages (which
KVM doesn't yet support recovering) and increase TLB pressure for both stage-1
and stage-2 mappings.
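
To make the 2MiB-granularity point a bit more concrete, here is a minimal,
hypothetical sketch of the arithmetic a guest could use to widen a conversion
request to 2MiB boundaries instead of converting individual 4KiB pages.  This
is not code from this series or from any guest kernel; struct gfn_range,
widen_to_2m() and nr_hugepages() are names invented purely for illustration,
and the sketch assumes 4KiB base pages (512 gfns per 2MiB region).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4KiB base pages, so one 2MiB region spans 512 gfns. */
#define GFNS_PER_2M 512ULL

/* Hypothetical helper type: a half-open range of guest frame numbers. */
struct gfn_range {
	uint64_t start;	/* first gfn, inclusive */
	uint64_t end;	/* one past the last gfn, exclusive */
};

/*
 * Widen [start, start + nr) so that both ends are 2MiB aligned, i.e. the
 * conversion covers whole hugepages and never splits one.
 */
static struct gfn_range widen_to_2m(uint64_t start, uint64_t nr)
{
	struct gfn_range r;

	assert(nr > 0);

	/* Round the start down and the end up to a multiple of 512 gfns. */
	r.start = start & ~(GFNS_PER_2M - 1);
	r.end = (start + nr + GFNS_PER_2M - 1) & ~(GFNS_PER_2M - 1);
	return r;
}

/* Number of 2MiB regions covered by an already-aligned range. */
static uint64_t nr_hugepages(struct gfn_range r)
{
	return (r.end - r.start) / GFNS_PER_2M;
}
```

For example, widen_to_2m(0x1234, 1) yields the range [0x1200, 0x1400), i.e. a
single 2MiB region is converted rather than one 4KiB page carved out of it.
Whether widening is acceptable for a given allocation is a guest policy
decision that this sketch does not address.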