Date: Thu, 20 Jun 2024 16:54:00 -0700
In-Reply-To: <20240620231133.GN2494510@nvidia.com>
References: <20240620135540.GG2494510@nvidia.com> <6d7b180a-9f80-43a4-a4cc-fd79a45d7571@redhat.com> <20240620142956.GI2494510@nvidia.com> <385a5692-ffc8-455e-b371-0449b828b637@redhat.com> <20240620163626.GK2494510@nvidia.com> <66a285fc-e54e-4247-8801-e7e17ad795a6@redhat.com> <20240620231133.GN2494510@nvidia.com>
Subject: Re: [PATCH RFC 0/5] mm/gup: Introduce exclusive GUP pinning
From: Sean Christopherson
To: Jason Gunthorpe
Cc: David Hildenbrand, Fuad Tabba, Christoph Hellwig, John Hubbard, Elliot Berman, Andrew Morton, Shuah Khan, Matthew Wilcox, maz@kernel.org, kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, pbonzini@redhat.com

On Thu, Jun 20, 2024, Jason Gunthorpe wrote:
> On Thu, Jun 20, 2024 at 01:30:29PM -0700, Sean Christopherson wrote:
> > I.e. except for blatant bugs, e.g. use-after-free, we need to be able to guarantee
> > with 100% accuracy that there are no outstanding mappings when converting a page
> > from shared=>private. Crossing our fingers and hoping that short-term GUP will
> > have gone away isn't enough.
>
> To be clear it is not crossing fingers. If the page refcount is 0 then
> there are no references to that memory anywhere at all. It is 100%
> certain.
>
> It may take time to reach zero, but when it does it is safe.

Yeah, we're on the same page, I just didn't catch the implicit (or maybe it was
explicitly stated earlier) "wait for the refcount to hit zero" part that David
already clarified.

> Many things rely on this property, including FSDAX.
>
> > For non-CoCo VMs, I expect we'll want to be much more permissive, but I think
> > they'll be a complete non-issue because there is no shared vs. private to worry
> > about. We can simply allow any and all userspace mappings for guest_memfd that is
> > attached to a "regular" VM, because a misbehaving userspace only loses whatever
> > hardening (or other benefits) was being provided by using guest_memfd. I.e. the
> > kernel and system at-large isn't at risk.
>
> It does seem to me like guest_memfd should really focus on the private
> aspect.
>
> If we need normal memfd enhancements of some kind to work better with
> KVM then that may be a better option than turning guest_memfd into
> memfd.

Heh, and then we'd end up turning memfd into guest_memfd. As I see it, being
able to safely map TDX/SNP/pKVM private memory is a happy side effect that is
possible because guest_memfd isn't subordinate to the primary MMU, but private
memory isn't the core identity of guest_memfd.

The thing that makes guest_memfd tick is that it's guest-first, i.e. it allows
mapping memory into the guest with more permissions/capabilities than the host.
E.g. access to private memory, hugepage mappings when the host is forced to use
small pages, RWX mappings when the host is limited to RO, etc.

We could do a subset of those for memfd, but I don't see the point, assuming we
allow mmap() on shared guest_memfd memory. Solving mmap() for VMs that do
private<=>shared conversions is the hard problem. Once that's done, we'll get
support for regular VMs along with the other benefits of guest_memfd for free
(or very close to free).
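
For illustration only, the "wait for the refcount to hit zero" rule for
shared=>private conversion can be modelled in a few lines of userspace Python.
This is a toy sketch, not kernel code: Page, get_ref(), put_ref() and
try_convert_to_private() are hypothetical names invented for this example, and
refusing new references on a private page is an assumption of the sketch rather
than something stated in the thread.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy stand-in for a guest_memfd page; not a kernel structure."""
    refcount: int = 0      # outstanding references (mappings, GUP pins)
    private: bool = False  # False = shared with the host, True = guest-private


def get_ref(page):
    """Take a reference, e.g. a userspace mapping or a short-term GUP pin."""
    # Assumption of this sketch: the host may not take references on
    # memory that has already been converted to private.
    if page.private:
        raise PermissionError("page is private to the guest")
    page.refcount += 1


def put_ref(page):
    """Drop a reference previously taken with get_ref()."""
    assert page.refcount > 0, "unbalanced put_ref()"
    page.refcount -= 1


def try_convert_to_private(page):
    """Convert shared=>private only when no references remain.

    Returns True on success, or False if references are still
    outstanding and the caller has to wait and retry.
    """
    if page.refcount != 0:
        return False
    page.private = True
    return True


page = Page()
get_ref(page)                              # a short-term pin is outstanding
blocked = try_convert_to_private(page)     # refused while refcount != 0
put_ref(page)                              # the pin goes away
converted = try_convert_to_private(page)   # now the conversion can proceed
```

The point of the model is that the conversion never guesses: it either observes
a refcount of zero and proceeds, or it reports that the caller must wait.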