linux-mm.kvack.org archive mirror
From: Jason Gunthorpe <jgg@nvidia.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Ackerley Tng <ackerleytng@google.com>,
	quic_eberman@quicinc.com, akpm@linux-foundation.org,
	david@redhat.com, kvm@vger.kernel.org,
	linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
	maz@kernel.org, pbonzini@redhat.com, shuah@kernel.org,
	tabba@google.com, willy@infradead.org, vannapurve@google.com,
	hch@infradead.org, rientjes@google.com, jhubbard@nvidia.com,
	qperret@google.com, smostafa@google.com, fvdl@google.com,
	hughd@google.com
Subject: Re: [PATCH RFC 0/5] mm/gup: Introduce exclusive GUP pinning
Date: Tue, 16 Jul 2024 17:11:03 -0300
Message-ID: <20240716201103.GE1482543@nvidia.com>
In-Reply-To: <ZpavP3K_xAMiu4kE@google.com>

On Tue, Jul 16, 2024 at 10:34:55AM -0700, Sean Christopherson wrote:
> On Tue, Jul 16, 2024, Jason Gunthorpe wrote:
> > On Tue, Jul 16, 2024 at 09:03:00AM -0700, Sean Christopherson wrote:
> > 
> > > > + To support huge pages, guest_memfd will take ownership of the hugepages, and
> > > >   provide interested parties (userspace, KVM, iommu) with pages to be used.
> > > >   + guest_memfd will track usage of (sub)pages, for both private and shared
> > > >     memory
> > > >   + Pages will be broken into smaller (probably 4K) chunks at creation time to
> > > >     simplify implementation (as opposed to splitting at runtime when private to
> > > >     shared conversion is requested by the guest)
> > > 
> > > FWIW, I doubt we'll ever release a version with mmap()+guest_memfd support that
> > > shatters pages at creation.  I can see it being an intermediate step, e.g. to
> > > prove correctness and provide a bisection point, but shattering hugepages at
> > > creation would effectively make hugepage support useless.
> > 
> > Why? If the private memory retains its contiguity separately but the
> > struct pages are removed from the vmemmap, what is the downside?
> 
> Oooh, you're talking about shattering only the host userspace mappings.  Now I
> understand why there was a bit of a disconnect, I was thinking you (hand-wavy
> everyone) were saying that KVM would immediately shatter its own mappings too.

Right, I'm imagining that guest_memfd keeps track of the physical ranges
in something else, like a maple tree, an xarray, or heck, a SW radix page
table perhaps. It does not use struct pages. Then it has, say, a
bitmap indicating which 4k granules are shared.

When KVM or the private world needs the physical addresses, it reads
them out of that structure, and it always sees perfectly physically
contiguous data regardless of any shared/private state.
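
Very roughly, and only as a userspace sketch of the bookkeeping I mean:
the names below (struct gmem_extent and the gmem_* helpers) are made up
for illustration and are not existing guest_memfd code. It assumes a
single physically contiguous extent with a flat bitmap; a real version
would hang many extents off a maple tree or xarray and would need
locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define GRANULE_SHIFT	12
#define GRANULE_SIZE	(1ULL << GRANULE_SHIFT)

/* Hypothetical: one physically contiguous range backing part of the file. */
struct gmem_extent {
	uint64_t file_off;	/* offset into the file, granule aligned */
	uint64_t phys;		/* physical address of the first byte */
	uint64_t len;		/* length in bytes, granule aligned */
	uint64_t *shared;	/* one bit per 4k granule, 1 = shared */
};

/* Set up an extent with every granule private. Returns 0 or -1. */
static inline int gmem_extent_init(struct gmem_extent *e, uint64_t file_off,
				   uint64_t phys, uint64_t len)
{
	uint64_t granules = len >> GRANULE_SHIFT;
	uint64_t words = (granules + 63) / 64;

	e->file_off = file_off;
	e->phys = phys;
	e->len = len;
	e->shared = calloc(words, sizeof(*e->shared));
	return e->shared ? 0 : -1;
}

/* Flip one granule between shared and private. Only the bitmap changes. */
static inline void gmem_set_shared(struct gmem_extent *e, uint64_t file_off,
				   bool shared)
{
	uint64_t idx = (file_off - e->file_off) >> GRANULE_SHIFT;
	uint64_t mask = 1ULL << (idx % 64);

	if (shared)
		e->shared[idx / 64] |= mask;
	else
		e->shared[idx / 64] &= ~mask;
}

static inline bool gmem_is_shared(const struct gmem_extent *e,
				  uint64_t file_off)
{
	uint64_t idx = (file_off - e->file_off) >> GRANULE_SHIFT;

	return e->shared[idx / 64] & (1ULL << (idx % 64));
}

/*
 * What KVM or the private world would use: the physical address comes
 * straight from the extent and never consults the shared bitmap, so the
 * range looks contiguous no matter how granules are marked.
 */
static inline uint64_t gmem_phys(const struct gmem_extent *e,
				 uint64_t file_off)
{
	return e->phys + (file_off - e->file_off);
}
```

The point of the split is that converting a granule touches only the
bitmap, while the physical lookup depends only on the extent itself.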

It is not so much "broken at creation time", but more that guest_memfd
does not use struct pages at all for private mappings, and thus we can
set up the unused struct pages however we like, including removing them
from the vmemmap or preconfiguring them as order-0 granules.

There is definitely some detailed data structure work here to allow
guest_memfd to manage all of this efficiently and be effective for both
the 4k and 1G cases.

Jason

