Re: [RFC] net: use atomic allocation for order-3 page allocation

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Chris Mason <clm@fb.com>
To: Eric Dumazet <eric.dumazet@gmail.com>, Shaohua Li <shli@fb.com>
Cc: <netdev@vger.kernel.org>, <davem@davemloft.net>,
	<Kernel-team@fb.com>, Eric Dumazet <edumazet@google.com>,
	David Rientjes <rientjes@google.com>, <linux-mm@kvack.org>
Subject: Re: [RFC] net: use atomic allocation for order-3 page allocation
Date: Thu, 11 Jun 2015 17:16:46 -0400	[thread overview]
Message-ID: <5579FABE.4050505@fb.com> (raw)
In-Reply-To: <1434055687.27504.51.camel@edumazet-glaptop2.roam.corp.google.com>

On 06/11/2015 04:48 PM, Eric Dumazet wrote:
> On Thu, 2015-06-11 at 13:24 -0700, Shaohua Li wrote:
>> We saw excessive memory compaction triggered by skb_page_frag_refill.
>> This causes performance issues. Commit 5640f7685831e0 introduces the
>> order-3 allocation to improve performance. But memory compaction has
>> high overhead. The benefit of order-3 allocation can't compensate the
>> overhead of memory compaction.
>>
>> This patch makes the order-3 page allocation atomic. If there is no
>> memory pressure and memory isn't fragmented, the alloction will still
>> success, so we don't sacrifice the order-3 benefit here. If the atomic
>> allocation fails, compaction will not be triggered and we will fallback
>> to order-0 immediately.
>>
>> The mellanox driver does similar thing, if this is accepted, we must fix
>> the driver too.
>>
>> Cc: Eric Dumazet <edumazet@google.com>
>> Signed-off-by: Shaohua Li <shli@fb.com>
>> ---
>>  net/core/sock.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 292f422..e9855a4 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -1883,7 +1883,7 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
>>  
>>  	pfrag->offset = 0;
>>  	if (SKB_FRAG_PAGE_ORDER) {
>> -		pfrag->page = alloc_pages(gfp | __GFP_COMP |
>> +		pfrag->page = alloc_pages((gfp & ~__GFP_WAIT) | __GFP_COMP |
>>  					  __GFP_NOWARN | __GFP_NORETRY,
>>  					  SKB_FRAG_PAGE_ORDER);
>>  		if (likely(pfrag->page)) {
> 
> This is not a specific networking issue, but mm one.
> 
> You really need to start a discussion with mm experts.
> 
> Your changelog does not exactly explains what _is_ the problem.
> 
> If the problem lies in mm layer, it might be time to fix it, instead of
> work around the bug by never triggering it from this particular point,
> which is a safe point where a process is willing to wait a bit.
> 
> Memory compaction is either working as intending, or not.
> 
> If we enabled it but never run it because it hurts, what is the point
> enabling it ?

networking is asking for 32KB, and the MM layer is doing what it can to
provide it.  Are the gains from getting 32KB contig bigger than the cost
of moving pages around if the MM has to actually go into compaction?
Should we start disk IO to give back 32KB contig?

I think we want to tell the MM to compact in the background and give
networking 32KB if it happens to have it available.  If not, fall back
to smaller allocations without doing anything expensive.

-chris

next prev parent reply	other threads:[~2015-06-11 21:16 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-11 20:24 [RFC] net: use atomic allocation for order-3 page allocation Shaohua Li
2015-06-11 20:48 ` Eric Dumazet
2015-06-11 21:16   ` Chris Mason [this message]
2015-06-11 21:22     ` Eric Dumazet
2015-06-11 21:45       ` Shaohua Li
2015-06-11 21:56         ` Eric Dumazet
2015-06-11 22:01           ` Shaohua Li
2015-06-11 22:18       ` Chris Mason
2015-06-11 22:55         ` Eric Dumazet
2015-06-11 21:35     ` Debabrata Banerjee
2015-06-11 22:18       ` David Miller
2015-06-12  9:25       ` Vlastimil Babka
2015-06-11 21:25   ` Debabrata Banerjee
2015-06-11 21:28     ` Debabrata Banerjee
2015-06-12  9:34       ` Vlastimil Babka
2015-06-11 22:53 ` [RFC v2] " Eric Dumazet
2015-06-11 23:32   ` Shaohua Li
2015-06-11 23:38     ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5579FABE.4050505@fb.com \
    --to=clm@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=linux-mm@kvack.org \
    --cc=netdev@vger.kernel.org \
    --cc=rientjes@google.com \
    --cc=shli@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).