From: Eric Dumazet <eric.dumazet@gmail.com>
To: Marc MERLIN <marc@merlins.org>
Cc: David Miller <davem@davemloft.net>,
Larry.Finger@lwfinger.net, bhutchings@solarflare.com,
linux-wireless@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: 3.7.8/amd64 full interrupt hangs due to iwlwifi under big nfs copies out
Date: Mon, 18 Feb 2013 21:17:13 -0800
Message-ID: <1361251033.19353.120.camel@edumazet-glaptop>
In-Reply-To: <20130219040557.GB4778@merlins.org>
On Mon, 2013-02-18 at 20:05 -0800, Marc MERLIN wrote:
> On Mon, Jul 16, 2012 at 06:21:57PM +0200, Eric Dumazet wrote:
> > > No, it's actually when I'm 'uploading' from my laptop to my server.
> > > One interesting thing is that my server is running lvm2 with snapshots,
> > > which makes writes slower than my laptop can push data over the network, so
> > > it's definitely causing buffers to fill up.
> > > I just did a download test and got 4.5MB/s sustained without problems.
> >
> > Hmm, NFS is apparently able to push a lot of data; try reducing
> > rsize/wsize to sane values, like 32K instead of 512K.
> >
> > gargamel:/mnt/dshelf2/ /net/gargamel/mnt/dshelf2 nfs4
> > rw,nosuid,nodev,relatime,vers=4.0,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.205.7,local_lock=none,addr=192.168.205.3 0 0
> >
> > You could trace svc_sock_setbufsize() and check how large
> > sk_sndbuf is set
>
> My apologies, I totally dropped the ball on this.
>
> So, the problem was still there in more recent kernels.
>
> TL;DR:
> - reducing nfs buffers removes the full hang
> - iwlwifi has a problem where lack of pages causes the whole machine to hang
> - NFS copies out, even with buffers down to 32K, are very wonky and cp does not
> return until over 2mn after the copy is actually finished.
> (I have a trace of what's hung in cp/nfs when this happens)
>
>
> Details:
>
> It's still pretty severe because whatever blocks doesn't just end up
> blocking disk IO, but actually blocking interrupts altogether since my mouse
> can't move for a minute or more until some buffer flushes.
>
> Below is the last trace I got during this (I can't do sysrq because I have a
> broken Lenovo T530 without a sysrq key, and typing doesn't really work when
> interrupts aren't firing).
>
> Not sure if it's useful. First chrome had an issue, and then iwlwifi did.
>
> chrome: page allocation failure: order:1, mode:0x4020
> Pid: 8730, comm: chrome Tainted: G O 3.7.8-amd64-preempt-20121226-fixwd #1
> Call Trace:
> <IRQ> [<ffffffff810d5f38>] warn_alloc_failed+0x117/0x12c
> [<ffffffff810d8cfd>] __alloc_pages_nodemask+0x66a/0x702
> [<ffffffff8108a948>] ? arch_local_irq_save+0x15/0x1b
> [<ffffffff811064af>] alloc_pages_current+0xcd/0xee
> [<ffffffffa039b579>] iwl_rx_allocate+0x8c/0x271 [iwlwifi]
> [<ffffffffa039c24e>] iwl_irq_tasklet+0x7e5/0x91c [iwlwifi]
> [<ffffffff8104805e>] tasklet_action+0x80/0xd2
> [<ffffffff81047c99>] __do_softirq+0xdf/0x1c5
> [<ffffffff814c1ed6>] ? _raw_spin_lock+0x1b/0x1f
> [<ffffffff810a7f37>] ? handle_irq_event+0x4d/0x62
> [<ffffffff814c7f5c>] call_softirq+0x1c/0x30
> [<ffffffff8101104e>] do_softirq+0x41/0x7f
> [<ffffffff81047e52>] irq_exit+0x3f/0xa7
> [<ffffffff81010d40>] do_IRQ+0x88/0x9f
> [<ffffffff814c246d>] common_interrupt+0x6d/0x6d
> <EOI> Mem-Info:
You could try loading iwlwifi with amsdu_size_8K set to 0 (disabled).
It should then, hopefully, use order-0 pages.
Some drivers can't fall back to low-order page allocations.
mlx4 is another example (it uses order-2 pages).