From mboxrd@z Thu Jan  1 00:00:00 1970
From: Sean Christopherson
Date: Thu, 2 Nov 2023 09:03:42 -0700
Subject: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
In-Reply-To: <6642c379-1023-4716-904f-4bbf076744c2@redhat.com>
References: <20231027182217.3615211-1-seanjc@google.com>
 <20231027182217.3615211-17-seanjc@google.com>
 <6642c379-1023-4716-904f-4bbf076744c2@redhat.com>
Message-ID:
List-Id:
To: kvm-riscv@lists.infradead.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Thu, Nov 02, 2023, Paolo Bonzini wrote:
> On 10/31/23 23:39, David Matlack wrote:
> > > > Maybe can you sketch out how you see this proposal being extensible to
> > > > using guest_memfd for shared mappings?
> > > For in-place conversions, e.g. pKVM, no additional guest_memfd is needed.  What's
> > > missing there is the ability to (safely) mmap() guest_memfd, e.g. KVM needs to
> > > ensure there are no outstanding references when converting back to private.
> > >
> > > For TDX/SNP, assuming we don't find a performant and robust way to do in-place
> > > conversions, a second fd+offset pair would be needed.
> > Is there a way to support non-in-place conversions within a single guest_memfd?
>
> For TDX/SNP, you could have a hook from KVM_SET_MEMORY_ATTRIBUTES to guest
> memory.  The hook would invalidate now-private parts if they have a VMA,
> causing a SIGSEGV/EFAULT if the host touches them.
>
> It would forbid mappings from multiple gfns to a single offset of the
> guest_memfd, because then the shared vs. private attribute would be tied to
> the offset.  This should not be a problem; for example, in the case of SNP,
> the RMP already requires a single mapping from host physical address to
> guest physical address.

I don't see how this can work.  It's not an M:1 scenario (where M is multiple
gfns), it's a 1:N scenario (where N is multiple offsets).  The *gfn* doesn't
change on a conversion; what needs to change to do a non-in-place conversion is
the pfn, which is effectively the guest_memfd+offset pair.

So yes, we *could* support non-in-place conversions within a single guest_memfd,
but it would require a second offset, at which point it makes sense to add a
second file descriptor as well.  Userspace could still use a single guest_memfd
instance, i.e. pass in the same file descriptor but different offsets.
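
To make the 1:N point above concrete, here is a minimal toy model in C.  It is
only a sketch of the reasoning in the reply, not KVM code: every name in it
(toy_backing, toy_slot, toy_convert, toy_resolve, TOY_PAGE_SIZE) is hypothetical
and none of it is part of the KVM uAPI.  It assumes a slot that carries two
fd+offset pairs, one for private and one for shared memory, and shows that a
conversion leaves the gfn alone and only changes which fd+offset pair backs it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL	/* assumed page size for the illustration */

/* Hypothetical: one backing location, i.e. an fd+offset pair. */
struct toy_backing {
	int fd;			/* guest_memfd file descriptor (illustrative) */
	uint64_t offset;	/* byte offset of the slot's first page in that fd */
};

/* Hypothetical slot: one gfn range with two possible backings (1:N, N == 2). */
struct toy_slot {
	uint64_t base_gfn;
	uint64_t npages;
	struct toy_backing private_backing;
	struct toy_backing shared_backing;
	bool *is_private;	/* per-page attribute, npages entries */
};

static bool toy_slot_contains(const struct toy_slot *slot, uint64_t gfn)
{
	return gfn >= slot->base_gfn && gfn - slot->base_gfn < slot->npages;
}

/* Flip the private/shared attribute of one page; the gfn is untouched. */
static void toy_convert(struct toy_slot *slot, uint64_t gfn, bool to_private)
{
	assert(slot != NULL && toy_slot_contains(slot, gfn));
	slot->is_private[gfn - slot->base_gfn] = to_private;
}

/* Resolve a gfn to the fd+offset pair that currently backs it. */
static struct toy_backing toy_resolve(const struct toy_slot *slot, uint64_t gfn)
{
	assert(slot != NULL && toy_slot_contains(slot, gfn));

	uint64_t idx = gfn - slot->base_gfn;
	const struct toy_backing *b = slot->is_private[idx] ?
				      &slot->private_backing :
				      &slot->shared_backing;
	struct toy_backing out = {
		.fd = b->fd,
		.offset = b->offset + idx * TOY_PAGE_SIZE,
	};

	return out;
}
```

Setting private_backing and shared_backing to the same fd with different
offsets models the "single guest_memfd instance" case from the last paragraph
of the reply; using two different fds models the second file descriptor.  In
both cases toy_resolve() returns a different fd+offset pair for the same gfn
after toy_convert(), which is the 1:N relationship.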