From: Rafael Aquini <aquini@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>,
Francois Romieu <romieu@fr.zoreil.com>,
atomlin@redhat.com, netdev@vger.kernel.org, davem@davemloft.net,
edumazet@google.com, pshelar@nicira.com, mst@redhat.com,
alexander.h.duyck@intel.com, riel@redhat.com,
sergei.shtylyov@cogentembedded.com, linux-kernel@vger.kernel.org
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets
Date: Tue, 28 May 2013 14:43:05 -0300 [thread overview]
Message-ID: <20130528174304.GD11614@optiplex.redhat.com> (raw)
In-Reply-To: <1369758577.3301.543.camel@edumazet-glaptop>
On Tue, May 28, 2013 at 09:29:37AM -0700, Eric Dumazet wrote:
> On Tue, 2013-05-28 at 13:15 -0300, Rafael Aquini wrote:
>
> > The real problem seems to be that more and more the network stack (drivers, perhaps)
> > is relying on chunks of contiguous page-blocks without a fallback mechanism to
> > order-0 page allocations. When memory gets fragmented, these alloc failures
> > start to pop up more often and they scare ordinary sysadmins out of their paints.
> >
>
> Where do you see that ?
>
> I see exactly the opposite trend.
>
> We have less and less buggy drivers, and we want to catch last
> offenders.
>
Perhaps the explanation is because we're looking into old stuff bad effects,
then. But just to list a few for your appreciation:
--------------------------------------------------------
Apr 23 11:25:31 217-IDC kernel: httpd: page allocation failure. order:1,
mode:0x20 Apr 23 11:25:31 217-IDC kernel: Pid: 19747, comm: httpd Not tainted
2.6.32-358.2.1.el6.x86_64 #1 Apr 23 11:25:31 217-IDC kernel: Call Trace: Apr 23
11:25:31 217-IDC kernel: <IRQ> [<ffffffff8112c207>] ?
__alloc_pages_nodemask+0x757/0x8d0 Apr 23 11:25:31 217-IDC kernel:
[<ffffffffa0337361>] ? bond_start_xmit+0x2f1/0x5d0 [bonding]
....
--------------------------------------------------------
Apr 4 18:51:32 exton kernel: swapper: page allocation failure. order:1,
mode:0x20
Apr 4 18:51:32 exton kernel: Pid: 0, comm: swapper Not tainted
2.6.32-279.19.1.el6.x86_64 #1
Apr 4 18:51:32 exton kernel: Call Trace:
Apr 4 18:51:32 exton kernel: <IRQ> [<ffffffff811231ff>] ?
__alloc_pages_nodemask+0x77f/0x940
Apr 4 18:51:32 exton kernel: [<ffffffff8115d1a2>] ? kmem_getpages+0x62/0x170
Apr 4 18:51:32 exton kernel: [<ffffffff8115ddba>] ? fallback_alloc+0x1ba/0x270
Apr 4 18:51:32 exton kernel: [<ffffffff8115d80f>] ? cache_grow+0x2cf/0x320
Apr 4 18:51:32 exton kernel: [<ffffffff8115db39>] ?
____cache_alloc_node+0x99/0x160
Apr 4 18:51:32 exton kernel: [<ffffffff8115ed00>] ?
kmem_cache_alloc_node_trace+0x90/0x200
Apr 4 18:51:32 exton kernel: [<ffffffff8115ef1d>] ? __kmalloc_node+0x4d/0x60
Apr 4 18:51:32 exton kernel: [<ffffffff8141ea1d>] ? __alloc_skb+0x6d/0x190
Apr 4 18:51:32 exton kernel: [<ffffffff8141eb5d>] ? dev_alloc_skb+0x1d/0x40
Apr 4 18:51:32 exton kernel: [<ffffffffa04f5f50>] ?
ipoib_cm_alloc_rx_skb+0x30/0x430 [ib_ipoib]
Apr 4 18:51:32 exton kernel: [<ffffffffa04f71ef>] ?
ipoib_cm_handle_rx_wc+0x29f/0x770 [ib_ipoib]
Apr 4 18:51:32 exton kernel: [<ffffffffa03c6a46>] ? mlx4_ib_poll_cq+0x2c6/0x7f0
[mlx4_ib]
....
--------------------------------------------------------
May 14 09:00:34 ifil03 kernel: swapper: page allocation failure. order:1,
mode:0x20
May 14 09:00:34 ifil03 kernel: Pid: 0, comm: swapper Not tainted
2.6.32-220.el6.x86_64 #1
May 14 09:00:34 ifil03 kernel: Call Trace:
May 14 09:00:34 ifil03 kernel: <IRQ> [<ffffffff81123f0f>] ?
__alloc_pages_nodemask+0x77f/0x940
May 14 09:00:34 ifil03 kernel: [<ffffffff8115ddc2>] ? kmem_getpages+0x62/0x170
May 14 09:00:34 ifil03 kernel: [<ffffffff8115e9da>] ? fallback_alloc+0x1ba/0x270
May 14 09:00:34 ifil03 kernel: [<ffffffff8115e42f>] ? cache_grow+0x2cf/0x320
May 14 09:00:34 ifil03 kernel: [<ffffffff8115e759>] ?
____cache_alloc_node+0x99/0x160
May 14 09:00:34 ifil03 kernel: [<ffffffff8115f53b>] ?
kmem_cache_alloc+0x11b/0x190
May 14 09:00:34 ifil03 kernel: [<ffffffff8141f528>] ? sk_prot_alloc+0x48/0x1c0
May 14 09:00:34 ifil03 kernel: [<ffffffff8141f7b2>] ? sk_clone+0x22/0x2e0
May 14 09:00:34 ifil03 kernel: [<ffffffff8146ca26>] ? inet_csk_clone+0x16/0xd0
May 14 09:00:34 ifil03 kernel: [<ffffffff814858f3>] ?
tcp_create_openreq_child+0x23/0x450
May 14 09:00:34 ifil03 kernel: [<ffffffff814832dd>] ?
tcp_v4_syn_recv_sock+0x4d/0x2a0
May 14 09:00:34 ifil03 kernel: [<ffffffff814856b1>] ? tcp_check_req+0x201/0x420
May 14 09:00:34 ifil03 kernel: [<ffffffff8147b166>] ?
tcp_rcv_state_process+0x116/0xa30
May 14 09:00:34 ifil03 kernel: [<ffffffff81482cfb>] ? tcp_v4_do_rcv+0x35b/0x430
May 14 09:00:34 ifil03 kernel: [<ffffffff81484471>] ? tcp_v4_rcv+0x4e1/0x860
May 14 09:00:34 ifil03 kernel: [<ffffffff814621fd>] ?
ip_local_deliver_finish+0xdd/0x2d0
May 14 09:00:34 ifil03 kernel: [<ffffffff81462488>] ? ip_local_deliver+0x98/0xa0
May 14 09:00:34 ifil03 kernel: [<ffffffff8146194d>] ? ip_rcv_finish+0x12d/0x440
May 14 09:00:34 ifil03 kernel: [<ffffffff8101bd86>] ?
intel_pmu_enable_all+0xa6/0x150
May 14 09:00:34 ifil03 kernel: [<ffffffff81461ed5>] ? ip_rcv+0x275/0x350
May 14 09:00:34 ifil03 kernel: [<ffffffff8142bedb>] ?
__netif_receive_skb+0x49b/0x6e0
May 14 09:00:34 ifil03 kernel: [<ffffffff8142df88>] ?
netif_receive_skb+0x58/0x60
May 14 09:00:34 ifil03 kernel: [<ffffffffa00a0a9e>] ?
vmxnet3_rq_rx_complete+0x36e/0x880 [vmxnet3]
....
--------------------------------------------------------
> > The big point of this change was to attempt to relief some of these warnings
> > which we believed as being useless, since the net stack would recover from it
> > by re-transmissions.
> > We might have misjudged the scenario, though. Perhaps a better approach would be
> > making the warning less verbose for all page-alloc failures. We could, perhaps,
> > only print a stack-dump out, if some debug flag is passed along, either as
> > reference, or by some CONFIG_DEBUG_ preprocessor directive.
>
>
> warn_alloc_failed() uses the standard DEFAULT_RATELIMIT_INTERVAL which
> is very small (5 * HZ)
>
> I would bump nopage_rs to somethin more reasonable, like one hour or one
> day.
>
Neat! Worth to try, no doubts about that. Aaron?
Cheers!
-- Rafael
next prev parent reply other threads:[~2013-05-28 17:43 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-26 20:45 [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets atomlin
2013-05-27 18:18 ` Rafael Aquini
2013-05-27 22:41 ` Francois Romieu
2013-05-28 16:00 ` Ben Greear
2013-05-28 16:15 ` Rafael Aquini
2013-05-28 16:19 ` Ben Greear
2013-05-28 16:29 ` Eric Dumazet
2013-05-28 17:43 ` Rafael Aquini [this message]
2013-05-28 17:46 ` Eric Dumazet
2013-05-28 18:03 ` Eric Dumazet
2013-05-28 18:19 ` Joe Perches
2013-05-28 21:32 ` Rik van Riel
2013-05-28 17:29 ` Joe Perches
2013-05-28 21:30 ` Rik van Riel
-- strict thread matches above, loose matches on Subject: below --
2013-05-26 20:19 atomlin
2013-05-27 17:39 ` Rik van Riel
2013-05-27 22:25 ` Eric Dumazet
2013-05-28 4:31 ` Joe Perches
2013-05-28 6:04 ` Eric Dumazet
2013-05-28 6:08 ` Joe Perches
2013-05-28 17:15 ` Debabrata Banerjee
2013-05-29 7:36 ` Rik van Riel
2013-05-27 17:56 ` Sergei Shtylyov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130528174304.GD11614@optiplex.redhat.com \
--to=aquini@redhat.com \
--cc=alexander.h.duyck@intel.com \
--cc=atomlin@redhat.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.dumazet@gmail.com \
--cc=greearb@candelatech.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=pshelar@nicira.com \
--cc=riel@redhat.com \
--cc=romieu@fr.zoreil.com \
--cc=sergei.shtylyov@cogentembedded.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.