Re: alloc_pages_bulk()

From: Mel Gorman <mgorman@techsingularity.net>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Jakub Kicinski <kuba@kernel.org>
Subject: Re: alloc_pages_bulk()
Date: Mon, 15 Feb 2021 12:06:09 +0000
Message-ID: <20210215120608.GE3697@techsingularity.net>
In-Reply-To: <F3CD435E-905F-4262-B4DA-0C721A4235E1@oracle.com>

On Thu, Feb 11, 2021 at 04:20:31PM +0000, Chuck Lever wrote:
> > On Feb 11, 2021, at 4:12 AM, Mel Gorman <mgorman@techsingularity.net> wrote:
> > 
> > <SNIP>
> > 
> > Parameters to __rmqueue_pcplist are garbage as the parameter order changed.
> > I'm surprised it didn't blow up in a spectacular fashion. Again, this
> > hasn't been near any testing and passing a list with high orders to
> > free_pages_bulk() will corrupt lists too. Mostly it's a curiosity to see
> > if there is justification for reworking the allocator to fundamentally
> > deal in batches and then feed batches to pcp lists and the bulk allocator
> > while leaving the normal GFP API as single page "batches". While that
> > would be ideal, it's relatively high risk for regressions. There is still
> > some scope for adding a basic bulk allocator before considering a major
> > refactoring effort.
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index f8353ea7b977..8f3fe7de2cf7 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5892,7 +5892,7 @@ __alloc_pages_bulk_nodemask(gfp_t gfp_mask, unsigned int order,
> > 	pcp_list = &pcp->lists[migratetype];
> > 
> > 	while (nr_pages) {
> > -		page = __rmqueue_pcplist(zone, gfp_mask, migratetype,
> > +		page = __rmqueue_pcplist(zone, migratetype, alloc_flags,
> > 								pcp, pcp_list);
> > 		if (!page)
> > 			break;
> 
> The NFS server is considerably more stable now. Thank you!
> 

Thanks for testing!

> I confirmed that my patch is requesting and getting multiple pages.
> The new NFSD code and the API seem to be working as expected.
> 
> The results are stunning. Each svc_alloc_arg() call here allocates
> 65 pages to satisfy a 256KB NFS READ request.
> 
> Before:
> 
>             nfsd-972   [000]   584.513817: funcgraph_entry:      + 35.385 us  |  svc_alloc_arg();
>             nfsd-979   [002]   584.513870: funcgraph_entry:      + 29.051 us  |  svc_alloc_arg();
>             nfsd-980   [001]   584.513951: funcgraph_entry:      + 29.178 us  |  svc_alloc_arg();
>             nfsd-983   [000]   584.514014: funcgraph_entry:      + 29.211 us  |  svc_alloc_arg();
>             nfsd-976   [002]   584.514059: funcgraph_entry:      + 29.315 us  |  svc_alloc_arg();
>             nfsd-974   [001]   584.514127: funcgraph_entry:      + 29.237 us  |  svc_alloc_arg();
> 
> After:
> 
>             nfsd-977   [002]    87.049425: funcgraph_entry:        4.293 us   |  svc_alloc_arg();
>             nfsd-981   [000]    87.049478: funcgraph_entry:        4.059 us   |  svc_alloc_arg();
>             nfsd-988   [001]    87.049549: funcgraph_entry:        4.474 us   |  svc_alloc_arg();
>             nfsd-983   [003]    87.049612: funcgraph_entry:        3.819 us   |  svc_alloc_arg();
>             nfsd-976   [000]    87.049619: funcgraph_entry:        3.869 us   |  svc_alloc_arg();
>             nfsd-980   [002]    87.049738: funcgraph_entry:        4.124 us   |  svc_alloc_arg();
>             nfsd-975   [000]    87.049769: funcgraph_entry:        3.734 us   |  svc_alloc_arg();
> 

Uhhhh, that is much better than I expected given how lame the
implementation is. Sure -- it works, but it has more overhead than it
should with the downside that reducing it requires fairly deep surgery. It
may be enough to tidy this up to handle order-0 pages only to start with
and see how far it gets. That's a fairly trivial modification.

> There appears to be little cost change for single-page allocations
> using the bulk allocator (nr_pages=1):
> 
> Before:
> 
>             nfsd-985   [003]   572.324517: funcgraph_entry:        0.332 us   |  svc_alloc_arg();
>             nfsd-986   [001]   572.324531: funcgraph_entry:        0.311 us   |  svc_alloc_arg();
>             nfsd-985   [003]   572.324701: funcgraph_entry:        0.311 us   |  svc_alloc_arg();
>             nfsd-986   [001]   572.324727: funcgraph_entry:        0.424 us   |  svc_alloc_arg();
>             nfsd-985   [003]   572.324760: funcgraph_entry:        0.332 us   |  svc_alloc_arg();
>             nfsd-986   [001]   572.324786: funcgraph_entry:        0.390 us   |  svc_alloc_arg();
> 
> After:
> 
>             nfsd-989   [002]    75.043226: funcgraph_entry:        0.322 us   |  svc_alloc_arg();
>             nfsd-988   [001]    75.043436: funcgraph_entry:        0.368 us   |  svc_alloc_arg();
>             nfsd-989   [002]    75.043464: funcgraph_entry:        0.424 us   |  svc_alloc_arg();
>             nfsd-988   [001]    75.043490: funcgraph_entry:        0.317 us   |  svc_alloc_arg();
>             nfsd-989   [002]    75.043517: funcgraph_entry:        0.425 us   |  svc_alloc_arg();
>             nfsd-988   [001]    75.050025: funcgraph_entry:        0.407 us   |  svc_alloc_arg();
> 

That is not too surprising given that there would be some additional
overhead to manage a list of 1 page. I would hope that users of the bulk
allocator are not routinely calling it with nr_pages == 1.

-- 
Mel Gorman
SUSE Labs
