From: Quentin Perret <qperret@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
	Elliot Berman <quic_eberman@quicinc.com>,
	Fuad Tabba <tabba@google.com>,
	Christoph Hellwig <hch@infradead.org>,
	John Hubbard <jhubbard@nvidia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shuah Khan <shuah@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	maz@kernel.org, kvm@vger.kernel.org,
	linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	pbonzini@redhat.com
Subject: Re: [PATCH RFC 0/5] mm/gup: Introduce exclusive GUP pinning
Date: Fri, 21 Jun 2024 09:25:10 +0000
Message-ID: <ZnVG9oZL4GT0uFy_@google.com>
In-Reply-To: <c05f2a97-5863-4da7-bfae-2d6873a62ebe@redhat.com>

On Friday 21 Jun 2024 at 10:02:08 (+0200), David Hildenbrand wrote:
> Thanks for the information. IMHO we really should try to find a common
> ground here, and FOLL_EXCLUSIVE is likely not it :)

That's OK, IMO at least :-).

> Thanks for reviving this discussion with your patch set!
> 
> pKVM is interested in in-place conversion, I believe there are valid use
> cases for in-place conversion for TDX and friends as well (as discussed, I
> think that might be a clean way to get huge/gigantic page support in).
> 
> This implies the option to:
> 
> 1) Have shared+private memory in guest_memfd
> 2) Be able to mmap shared parts
> 3) Be able to convert shared<->private in place
> 
> and later in my interest
> 
> 4) Have huge/gigantic page support in guest_memfd with the option of
>    converting individual subpages
> 
> We might not want to make use of that model for all of CC -- as you state,
> sometimes the destructive approach might be better performance wise -- but
> having that option doesn't sound crazy to me (and maybe would solve real
> issues as well).

Cool.

> After all, the common requirement here is that "private" pages are not
> mapped/pinned/accessible.
> 
> Sure, there might be cases like "pKVM can handle access to private pages in
> user page mappings", "AMD-SNP will not crash the host if writing to private
> pages" but these are not factors that really make a difference for a common
> solution.

Sure, there isn't much value in differentiating on these things. One
might argue that we could save one mmap() on the private->shared
conversion path by keeping all of guest_memfd mapped in userspace
including private memory, but that's most probably not worth the
effort of re-designing the whole thing just for that, so let's forget
that.

The ability to handle stage-2 faults in the kernel has implications in
other places however. It means we don't need to punch holes in the
kernel linear map when donating memory to a guest for example, even with
'crazy' access patterns like load_unaligned_zeropad(). So that's good.

> private memory: not mapped, not pinned
> shared memory: maybe mapped, maybe pinned
> granularity of conversion: single pages
> 
> Anything I am missing?

That looks good to me. And as discussed in previous threads, we have the
ambition of getting page-migration to work, including for private memory,
mostly to get kcompactd to work better when pVMs are running. Android
makes extensive use of compaction, and pVMs currently stick out like a
sore thumb.
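
To make the model quoted above concrete, here is a minimal userspace
sketch of the per-page rules (private: not mapped, not pinned; shared:
maybe mapped, maybe pinned; conversion one page at a time). Every name in
it (struct gmem_page, gmem_make_private(), and so on) is hypothetical and
exists neither in guest_memfd nor anywhere else in the kernel; it only
illustrates the invariant, not how it would actually be implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-page bookkeeping; none of these names exist in the kernel. */
enum gmem_page_state { GMEM_SHARED, GMEM_PRIVATE };

struct gmem_page {
	enum gmem_page_state state;
	unsigned int mapcount;	/* userspace mappings of this page */
	unsigned int pincount;	/* GUP-style pins on this page */
};

/* Only shared pages may be mapped into userspace. */
bool gmem_map(struct gmem_page *p)
{
	if (p->state != GMEM_SHARED)
		return false;
	p->mapcount++;
	return true;
}

void gmem_unmap(struct gmem_page *p)
{
	assert(p->mapcount > 0);
	p->mapcount--;
}

/* Only shared pages may be pinned. */
bool gmem_pin(struct gmem_page *p)
{
	if (p->state != GMEM_SHARED)
		return false;
	p->pincount++;
	return true;
}

void gmem_unpin(struct gmem_page *p)
{
	assert(p->pincount > 0);
	p->pincount--;
}

/* shared -> private, in place: refused while any mapping or pin remains. */
bool gmem_make_private(struct gmem_page *p)
{
	if (p->state != GMEM_SHARED || p->mapcount || p->pincount)
		return false;
	p->state = GMEM_PRIVATE;
	return true;
}

/* private -> shared, in place: a private page has nothing to tear down. */
bool gmem_make_shared(struct gmem_page *p)
{
	if (p->state != GMEM_PRIVATE)
		return false;
	p->state = GMEM_SHARED;
	return true;
}
```

In this sketch a page that is still mapped or pinned cannot be converted
to private, and a private page can be neither mapped nor pinned until it
has been converted back.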

We can trivially implement a hypercall to have pKVM swap a private
page with another without the guest having to know. The difficulty is
obviously hooking that into Linux, and I've personally not looked into it
properly, so that is clearly longer term. We don't want to take anybody
by surprise if there is a need for some added complexity in guest_memfd
to support this use-case though. I don't expect folks on the receiving
end of that to agree to it blindly without knowing _what_ this
complexity is FWIW. But at least our intentions are clear :-)
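
As a rough illustration of what "swap a private page without the guest
having to know" means, here is a toy userspace model. It assumes the swap
amounts to copying the contents to a new physical page and repointing the
guest's stage-2 entry; that is my simplification, not a description of
pKVM's hypercall interface. All names (struct toy_vm,
toy_swap_private_page(), the TOY_* constants) are made up for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE	64	/* tiny pages keep the example small */
#define TOY_NR_PAGES	4
#define TOY_NR_GFNS	2

/* Hypothetical toy model: physical memory plus a gfn -> pfn stage-2 table. */
struct toy_vm {
	unsigned char phys[TOY_NR_PAGES][TOY_PAGE_SIZE];
	size_t stage2[TOY_NR_GFNS];	/* guest frame -> physical frame */
};

/* What the guest observes when it reads one byte of a guest frame. */
unsigned char toy_guest_read(const struct toy_vm *vm, size_t gfn, size_t off)
{
	return vm->phys[vm->stage2[gfn]][off];
}

/*
 * Hypothetical "swap" operation: move the backing of @gfn to @new_pfn and
 * return the old pfn so the host can reuse it. A real hypervisor would also
 * have to keep the guest from touching the page during the copy; this toy
 * model is single-threaded and ignores that.
 */
size_t toy_swap_private_page(struct toy_vm *vm, size_t gfn, size_t new_pfn)
{
	size_t old_pfn = vm->stage2[gfn];

	memcpy(vm->phys[new_pfn], vm->phys[old_pfn], TOY_PAGE_SIZE);
	vm->stage2[gfn] = new_pfn;
	/* Scrub the old page so no guest data leaks back to the host. */
	memset(vm->phys[old_pfn], 0, TOY_PAGE_SIZE);
	return old_pfn;
}
```

The guest-visible contents at a given gfn are unchanged across the call,
which is the property that would let compaction move the backing page
underneath a running pVM.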

Thanks,
Quentin

