From: Sean Christopherson <seanjc@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>, Fuad Tabba <tabba@google.com>,
	 Christoph Hellwig <hch@infradead.org>,
	John Hubbard <jhubbard@nvidia.com>,
	 Elliot Berman <quic_eberman@quicinc.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Shuah Khan <shuah@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	maz@kernel.org,  kvm@vger.kernel.org,
	linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	 pbonzini@redhat.com
Subject: Re: [PATCH RFC 0/5] mm/gup: Introduce exclusive GUP pinning
Date: Thu, 20 Jun 2024 15:32:27 -0700
Message-ID: <ZnSt-_dkjStvT1WB@google.com>
In-Reply-To: <53d1e7c5-3e77-467b-be33-a618c3bb6cb3@redhat.com>

On Thu, Jun 20, 2024, David Hildenbrand wrote:
> On 20.06.24 22:30, Sean Christopherson wrote:
> > On Thu, Jun 20, 2024, David Hildenbrand wrote:
> > > On 20.06.24 18:36, Jason Gunthorpe wrote:
> > > > On Thu, Jun 20, 2024 at 04:45:08PM +0200, David Hildenbrand wrote:
> > > > 
> > > > > If we could disallow pinning any shared pages, that would make life a lot
> > > > > easier, but I think there were reasons for why we might require it. To
> > > > > convert shared->private, simply unmap that folio (only the shared parts
> > > > > could possibly be mapped) from all user page tables.
> > > > 
> > > > IMHO it should be reasonable to make it work like ZONE_MOVABLE and
> > > > FOLL_LONGTERM. Making a shared page private is really no different
> > > > from moving it.
> > > > 
> > > > And if you have built a VMM that uses VMA mapped shared pages and
> > > > short-term pinning then you should really also ensure that the VM is
> > > > aware when the pins go away. For instance if you are doing some virtio
> > > > thing with O_DIRECT pinning then the guest will know the pins are gone
> > > > when it observes virtio completions.
> > > > 
> > > > In this way making private is just like moving, we unmap the page and
> > > > then drive the refcount to zero, then move it.
> > > >
> > > Yes, but here is the catch: what if a single shared subpage of a large folio
> > > is (validly) longterm pinned and you want to convert another shared subpage
> > > to private?
> > > 
> > > Sure, we can unmap the whole large folio (including all shared parts) before
> > > the conversion, just like we would do for migration. But we cannot detect
> > > that nobody pinned that subpage that we want to convert to private.
> > > 
> > > Core-mm is not, and will not, track pins per subpage.
> > > 
> > > So I only see two options:
> > > 
> > > a) Disallow long-term pinning. That means, we can, with a bit of wait,
> > >     always convert subpages shared->private after unmapping them and
> > >     waiting for the short-term pin to go away. Not too bad, and we
> > >     already have other mechanisms that disallow long-term pinnings (especially
> > >     writable fs ones!).
> > 
> > I don't think disallowing _just_ long-term GUP will suffice, if we go the "disallow
> > GUP" route then I think it needs to disallow GUP, period.  Like the whole "GUP
> > writes to file-backed memory" issue[*], which I think you're alluding to, short-term
> > GUP is also problematic.  But unlike file-backed memory, for TDX and SNP (and I
> > think pKVM), a single rogue access has a high probability of being fatal to the
> > entire system.
> 
> Disallowing short-term should work, in theory, because the

By "short-term", I assume you mean "long-term"?  Or am I more lost than I realize?

> writes-to-fileback has different issues (the PIN is not the problem but the
> dirtying).
>
> It's more related to us not allowing long-term pins for FSDAX pages, because
> the lifetime of these pages is determined by the FS.
> 
> What we would do is
> 
> 1) Unmap the large folio completely and make any refaults block.
> -> No new pins can pop up
> 
> 2) If the folio is pinned, busy-wait until all the short-term pins are
>    gone.

This is the step that concerns me.  "Relatively short time" is, well, relative.
Hmm, though I suppose if userspace managed to map a shared page into something
that pins the page, and can't force an unpin, e.g. by stopping I/O?, then either
there's a host userspace bug or a guest bug, and so effectively hanging the vCPU
that is waiting for the conversion to complete is ok.

> 3) Safely convert the relevant subpage from shared -> private
> 
> Not saying it's the best approach, but it should be doable.
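
For readers following the thread, here is a minimal userspace sketch in C of
the three quoted steps.  It is a toy model, not kernel code: struct
model_folio, model_pin(), model_unpin() and model_convert_to_private() are
hypothetical names invented for illustration, and the poll limit exists only
so the model terminates (the thread describes a plain busy-wait).  The single
pincount for the whole folio reflects the point quoted above that core-mm does
not track pins per subpage.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_SUBPAGES 4	/* hypothetical folio size, in subpages */

struct model_folio {
	int pincount;				/* one counter for the whole folio */
	bool mapped;				/* mapped into user page tables? */
	bool refault_blocked;			/* step 1: refaults must block */
	bool is_private[MODEL_NR_SUBPAGES];	/* per-subpage shared/private state */
};

/* In this model a pin can only be taken through a live mapping. */
static bool model_pin(struct model_folio *f)
{
	assert(f != NULL);
	if (!f->mapped || f->refault_blocked)
		return false;
	f->pincount++;
	return true;
}

static void model_unpin(struct model_folio *f)
{
	assert(f != NULL);
	if (f->pincount > 0)
		f->pincount--;
}

/*
 * Steps 1-3 from the quoted proposal.  Returns true once the subpage is
 * private, false if the pins did not drain within max_polls iterations.
 * poll() stands in for "time passes"; it may be NULL.
 */
static bool model_convert_to_private(struct model_folio *f, size_t idx,
				     unsigned int max_polls,
				     void (*poll)(struct model_folio *))
{
	assert(f != NULL);
	if (idx >= MODEL_NR_SUBPAGES)
		return false;

	/* Step 1: unmap the whole folio and block refaults, so no new pins. */
	f->mapped = false;
	f->refault_blocked = true;

	/* Step 2: wait for existing pins; we cannot tell which subpage they hit. */
	for (unsigned int i = 0; f->pincount > 0; i++) {
		if (i == max_polls) {
			f->refault_blocked = false;	/* give up, stay shared */
			return false;
		}
		if (poll)
			poll(f);
	}

	/* Step 3: no pins remain anywhere in the folio, so convert the subpage. */
	f->is_private[idx] = true;
	f->refault_blocked = false;
	return true;
}
```

In this model, a pin that never goes away makes the conversion give up and the
subpage stays shared, while a pin that is dropped during the wait lets the same
call succeed.  A pin on any subpage stalls the conversion of every other
subpage in the folio, which is the concern raised about step 2.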
