From: Jason Gunthorpe <jgg@nvidia.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>,
	Alistair Popple <apopple@nvidia.com>,
	Christoph Hellwig <hch@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, linux-stable@vger.kernel.org,
	Vivek Kasireddy <vivek.kasireddy@intel.com>,
	Dave Airlie <airlied@redhat.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Peter Xu <peterx@redhat.com>, Arnd Bergmann <arnd@arndb.de>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Dongwon Kim <dongwon.kim@intel.com>,
	Hugh Dickins <hughd@google.com>,
	Junxiao Chang <junxiao.chang@intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH] mm/gup: restore the ability to pin more than 2GB at a time
Date: Wed, 30 Oct 2024 21:25:52 -0300
Message-ID: <20241031002552.GB10193@nvidia.com>
In-Reply-To: <21ee9aff-a9d5-495c-9e5e-38e9d25b11cd@nvidia.com>

On Wed, Oct 30, 2024 at 05:17:25PM -0700, John Hubbard wrote:
> On 10/30/24 5:02 PM, Jason Gunthorpe wrote:
> > On Wed, Oct 30, 2024 at 11:34:49AM -0700, John Hubbard wrote:
> > 
> > >  From a very high level design perspective, it's not yet clear to me
> > > that there is either a "preferred" or "not recommended" aspect to
> > > pinning in batches vs. all at once here, as long as one stays
> > > below the type (int, long, unsigned...) limits of the API. Batching
> > > seems like what you do if the internal implementation is crippled
> > > and unable to meet its API requirements. So the fact that many
> > > callers do batching is sort of "tail wags dog".
> > 
> > No.. all things need to do batching because nothing should be storing
> > a linear struct page array that is so enormous. That is going to
> > create vmemap pressure that is not desirable.
> 
> Are we talking about the same allocation size here? It's not 2GB. It
> is enough folio pointers to cover 2GB of memory, so 4MB.

Is 2GB a hard limit? I was expecting this to be a range with upper
bounds of 100GB's, like for RDMA. Then it is 400MB, and yeah, that is
not great.

> That high level guidance makes sense, but here we are attempting only
> a 4MB physically contiguous allocation, and if larger than that, then
> it goes to vmalloc() which is merely virtually contiguous.

AFAIK any contiguous allocation beyond 4K basically doesn't work
reliably in a server environment due to fragmentation.

So you are always using the vmemap..

> I'm writing this because your adjectives make me suspect that you
> are referring to a 2GB allocation. But this is orders of magnitude
> smaller.

Even 4MB I would wonder about getting it split to PAGE_SIZE chunks
instead of vmemmap, but I don't know what it is being used for.

Jason
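
[Editorial note] The sizes traded in this message follow from simple
arithmetic: one pointer per base page of pinned memory. The sketch below
reproduces that calculation. It assumes 4 KiB pages and 8-byte pointers,
neither of which is stated in the thread, and the helper name
pointer_array_bytes is hypothetical. Under those assumptions 2 GiB of
pinned memory needs a 4 MiB array, and a 400 MiB array corresponds to
roughly 200 GiB pinned.

```python
# Assumptions (not stated in the thread): 4 KiB base pages, 64-bit pointers.
PAGE_SIZE = 4096
POINTER_SIZE = 8

GIB = 1 << 30
MIB = 1 << 20


def pointer_array_bytes(pinned_bytes, page_size=PAGE_SIZE,
                        pointer_size=POINTER_SIZE):
    """Bytes needed to hold one pointer per base page of pinned_bytes."""
    pages = -(-pinned_bytes // page_size)  # ceiling division
    return pages * pointer_size


if __name__ == "__main__":
    for gib in (2, 100, 200):
        size = pointer_array_bytes(gib * GIB)
        print(f"{gib} GiB pinned -> {size // MIB} MiB of pointers")
```

The array grows linearly with the pinned range, which is why the
discussion turns on whether 2GB is a hard upper limit for this caller.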


Thread overview: 25+ messages
2024-10-30  3:01 [PATCH] mm/gup: restore the ability to pin more than 2GB at a time John Hubbard
2024-10-30  3:02 ` kernel test robot
2024-10-30  3:10   ` John Hubbard
2024-10-30  4:21 ` Christoph Hellwig
2024-10-30  4:30   ` John Hubbard
2024-10-30  4:33     ` Christoph Hellwig
2024-10-30  4:39       ` John Hubbard
2024-10-30  4:42         ` Christoph Hellwig
2024-10-30  4:44           ` John Hubbard
2024-10-30  6:18             ` Alistair Popple
2024-10-30  6:50               ` John Hubbard
2024-10-30  8:34                 ` David Hildenbrand
2024-10-30  9:01                   ` David Hildenbrand
2024-10-30 18:34                     ` John Hubbard
2024-10-31  0:02                       ` Jason Gunthorpe
2024-10-31  0:17                         ` John Hubbard
2024-10-31  0:25                           ` Jason Gunthorpe [this message]
2024-10-31  0:47                             ` John Hubbard
2024-10-30 12:04                   ` Jason Gunthorpe
2024-10-30 17:25                     ` John Hubbard
2024-10-30 11:59           ` Jason Gunthorpe
2024-10-30 11:03         ` Vlastimil Babka
2024-10-30 17:29           ` John Hubbard
2024-10-30 17:42             ` Vlastimil Babka
2024-10-30 17:49               ` John Hubbard
