Re: [PATCH] net: use __GFP_NORETRY for high order allocations

From: ebiederm@xmission.com (Eric W. Biederman)
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	David Rientjes <rientjes@google.com>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] net: use __GFP_NORETRY for high order allocations
Date: Thu, 06 Feb 2014 18:03:23 -0800
Message-ID: <87txcbisr8.fsf@xmission.com>
In-Reply-To: <1391712162.10160.8.camel@edumazet-glaptop2.roam.corp.google.com> (Eric Dumazet's message of "Thu, 06 Feb 2014 10:42:42 -0800")

Eric Dumazet <eric.dumazet@gmail.com> writes:

> From: Eric Dumazet <edumazet@google.com>
>
> sock_alloc_send_pskb() & sk_page_frag_refill()
> have a loop trying high order allocations to prepare
> skb with low number of fragments as this increases performance.
>
> Problem is that under memory pressure/fragmentation, this can
> trigger OOM while the intent was only to try the high order
> allocations, then fallback to order-0 allocations.
>
> We had various reports from unexpected regressions.
>
> According to David, setting __GFP_NORETRY should be fine,
> as the asynchronous compaction is still enabled, and this
> will prevent OOM from kicking as in :
>
> CFSClientEventm invoked oom-killer: gfp_mask=0x42d0, order=3, oom_adj=0,
> oom_score_adj=0, oom_score_badness=2 (enabled),memcg_scoring=disabled
> CFSClientEventm 
>
> Call Trace:
>  [<ffffffff8043766c>] dump_header+0xe1/0x23e
>  [<ffffffff80437a02>] oom_kill_process+0x6a/0x323
>  [<ffffffff80438443>] out_of_memory+0x4b3/0x50d
>  [<ffffffff8043a4a6>] __alloc_pages_may_oom+0xa2/0xc7
>  [<ffffffff80236f42>] __alloc_pages_nodemask+0x1002/0x17f0
>  [<ffffffff8024bd23>] alloc_pages_current+0x103/0x2b0
>  [<ffffffff8028567f>] sk_page_frag_refill+0x8f/0x160
>  [<ffffffff80295fa0>] tcp_sendmsg+0x560/0xee0
>  [<ffffffff802a5037>] inet_sendmsg+0x67/0x100
>  [<ffffffff80283c9c>] __sock_sendmsg_nosec+0x6c/0x90
>  [<ffffffff80283e85>] sock_sendmsg+0xc5/0xf0
>  [<ffffffff802847b6>] __sys_sendmsg+0x136/0x430
>  [<ffffffff80284ec8>] sys_sendmsg+0x88/0x110
>  [<ffffffff80711472>] system_call_fastpath+0x16/0x1b
> Out of Memory: Kill process 2856 (bash) score 9999 or sacrifice child
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: David Rientjes <rientjes@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

I have seen similar order-3 allocation failures as well, and reached the
same conclusion that __GFP_NORETRY was the solution.
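
For readers who want to see the fallback pattern in isolation, here is a
minimal userspace sketch. It is not kernel code: sim_alloc_pages(),
sim_has_free_block(), sim_refill(), sim_oom_kills and SIM_GFP_NORETRY are
hypothetical stand-ins invented for illustration, and the "fragmented
system" is simply assumed to have no free block above order 0. The sketch
only models the idea in the patch: a high-order attempt that fails fast lets
the caller step down to a smaller order without the OOM killer getting
involved.

```c
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_NORETRY; the value is arbitrary. */
#define SIM_GFP_NORETRY 0x1u

/* Counts how often the simulated allocator resorted to the OOM killer. */
static int sim_oom_kills;

/* Assumption: memory is fragmented, so only order-0 blocks are free. */
static bool sim_has_free_block(unsigned int order)
{
	return order == 0;
}

/* Stand-in for alloc_pages(): returns true when the allocation succeeds. */
static bool sim_alloc_pages(unsigned int flags, unsigned int order)
{
	if (sim_has_free_block(order))
		return true;
	if (flags & SIM_GFP_NORETRY)
		return false;	/* fail fast so the caller can fall back */
	sim_oom_kills++;	/* modelled: a failing attempt ends in an OOM kill */
	return false;
}

/*
 * Try max_order first and step down to order 0, passing
 * high_order_flags only for the order > 0 attempts.
 * Returns the order that was obtained, or -1 if nothing succeeded.
 */
static int sim_refill(unsigned int max_order, unsigned int high_order_flags)
{
	unsigned int order = max_order;

	for (;;) {
		unsigned int flags = order ? high_order_flags : 0;

		if (sim_alloc_pages(flags, order))
			return (int)order;
		if (order == 0)
			return -1;
		order--;
	}
}
```

In this model, sim_refill(3, 0) and sim_refill(3, SIM_GFP_NORETRY) both end
up with an order-0 page, but only the first call increments sim_oom_kills on
the way down.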

Eric
> ---
>  net/core/sock.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 0c127dcdf6a8..5b6a9431b017 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1775,7 +1775,9 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
>  			while (order) {
>  				if (npages >= 1 << order) {
>  					page = alloc_pages(sk->sk_allocation |
> -							   __GFP_COMP | __GFP_NOWARN,
> +							   __GFP_COMP |
> +							   __GFP_NOWARN |
> +							   __GFP_NORETRY,
>  							   order);
>  					if (page)
>  						goto fill_page;
> @@ -1845,7 +1847,7 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t prio)
>  		gfp_t gfp = prio;
>  
>  		if (order)
> -			gfp |= __GFP_COMP | __GFP_NOWARN;
> +			gfp |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY;
>  		pfrag->page = alloc_pages(gfp, order);
>  		if (likely(pfrag->page)) {
>  			pfrag->offset = 0;
>
>
