From: David Hildenbrand <david@redhat.com>
To: Sean Christopherson <seanjc@google.com>,
Jason Gunthorpe <jgg@nvidia.com>
Cc: Fuad Tabba <tabba@google.com>,
Christoph Hellwig <hch@infradead.org>,
John Hubbard <jhubbard@nvidia.com>,
Elliot Berman <quic_eberman@quicinc.com>,
Andrew Morton <akpm@linux-foundation.org>,
Shuah Khan <shuah@kernel.org>,
Matthew Wilcox <willy@infradead.org>,
maz@kernel.org, kvm@vger.kernel.org,
linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
pbonzini@redhat.com
Subject: Re: [PATCH RFC 0/5] mm/gup: Introduce exclusive GUP pinning
Date: Fri, 21 Jun 2024 09:43:47 +0200
Message-ID: <96df0073-6fde-4252-a9cb-22eeb0a876bb@redhat.com>
In-Reply-To: <ZnTBGCeSN1u6wzLb@google.com>

On 21.06.24 01:54, Sean Christopherson wrote:
> On Thu, Jun 20, 2024, Jason Gunthorpe wrote:
>> On Thu, Jun 20, 2024 at 01:30:29PM -0700, Sean Christopherson wrote:
>>> I.e. except for blatant bugs, e.g. use-after-free, we need to be able to guarantee
>>> with 100% accuracy that there are no outstanding mappings when converting a page
>>> from shared=>private. Crossing our fingers and hoping that short-term GUP will
>>> have gone away isn't enough.
>>
>> To be clear it is not crossing fingers. If the page refcount is 0 then
>> there are no references to that memory anywhere at all. It is 100%
>> certain.
>>
>> It may take time to reach zero, but when it does it is safe.
>
> Yeah, we're on the same page, I just didn't catch the implicit (or maybe it was
> explicitly stated earlier) "wait for the refcount to hit zero" part that David
> already clarified.
>
>> Many things rely on this property, including FSDAX.
>>
>>> For non-CoCo VMs, I expect we'll want to be much more permissive, but I think
>>> they'll be a complete non-issue because there is no shared vs. private to worry
>>> about. We can simply allow any and all userspace mappings for guest_memfd that is
>>> attached to a "regular" VM, because a misbehaving userspace only loses whatever
>>> hardening (or other benefits) was being provided by using guest_memfd. I.e. the
>>> kernel and system at-large isn't at risk.
>>
>> It does seem to me like guest_memfd should really focus on the private
>> aspect.

We'll likely have to enter that domain for clean huge page support
and/or pKVM here either way.

Likely the future will see a mixture of things: some will use
guest_memfd only for the "private" parts and anon/shmem for the "shared"
parts, others will use guest_memfd for both.

>>
>> If we need normal memfd enhancements of some kind to work better with
>> KVM then that may be a better option than turning guest_memfd into
>> memfd.
>
> Heh, and then we'd end up turning memfd into guest_memfd. As I see it, being
> able to safely map TDX/SNP/pKVM private memory is a happy side effect that is
> possible because guest_memfd isn't subordinate to the primary MMU, but private
> memory isn't the core identity of guest_memfd.

Right.

>
> The thing that makes guest_memfd tick is that it's guest-first, i.e. allows mapping
> memory into the guest with more permissions/capabilities than the host. E.g. access
> to private memory, hugepage mappings when the host is forced to use small pages,
> RWX mappings when the host is limited to RO, etc.
>
> We could do a subset of those for memfd, but I don't see the point, assuming we
> allow mmap() on shared guest_memfd memory. Solving mmap() for VMs that do
> private<=>shared conversions is the hard problem to solve. Once that's done,
> we'll get support for regular VMs along with the other benefits of guest_memfd
> for free (or very close to free).

I suspect there would be pushback from Hugh if we tried to teach memfd
things it really shouldn't be doing.

I once shared the idea of having a guest_memfd+memfd pair (managed by
KVM or whatever more generic virt infrastructure), whereby we could move
folios back and forth and only the memfd pages can be mapped and
consequently pinned. Of course, we could only move full folios, which
implies some kind of option b) for handling larger memory chunks
(gigantic pages).

But I'm not sure whether that is really required, or whether it wouldn't
just be easier to let the guest_memfd be mapped, with only shared pages
being handed out.
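
To make that last option a bit more concrete, here is a rough userspace
model of it. This is not kernel code and not code from this series: the
struct, the enum, and every gmem_* function name are hypothetical, and the
single counter is a simplification that stands in for all references taken
via mappings and GUP. It only illustrates two rules discussed in this
thread: faults hand out shared pages only, and a shared=>private
conversion succeeds only once the reference count has dropped to zero.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of one guest_memfd page; not a kernel structure. */
enum gmem_state { GMEM_SHARED, GMEM_PRIVATE };

struct gmem_page {
	enum gmem_state state;
	int refcount;	/* assumption: counts mapping and GUP references only */
};

/* Fault path: hand the page out only while it is shared. */
static int gmem_fault(struct gmem_page *page)
{
	if (page->state != GMEM_SHARED)
		return -EFAULT;	/* a real fault handler would likely SIGBUS */
	page->refcount++;
	return 0;
}

/* Drop a reference previously taken by gmem_fault(). */
static void gmem_put(struct gmem_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * shared=>private: refuse while any reference is outstanding. The caller
 * is expected to unmap, wait for short-term users, and retry on -EAGAIN.
 */
static int gmem_make_private(struct gmem_page *page)
{
	if (page->state == GMEM_PRIVATE)
		return 0;
	if (page->refcount != 0)
		return -EAGAIN;
	page->state = GMEM_PRIVATE;
	return 0;
}

/* private=>shared: nothing on the host side can hold a reference yet. */
static void gmem_make_shared(struct gmem_page *page)
{
	page->state = GMEM_SHARED;
}
```

In this model, a page that was faulted in once cannot be made private
until gmem_put() has been called for it, and once it is private any
further gmem_fault() fails until the page is converted back to shared.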
--
Cheers,
David / dhildenb