From: Marc MERLIN <marc@merlins.org>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Larry.Finger@lwfinger.net, bhutchings@solarflare.com,
	linux-wireless@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: 3.4.4/amd64 full interrupt hangs under big nfs copies
Date: Sun, 15 Jul 2012 14:59:35 -0700
Message-ID: <20120715215935.GF24420@merlins.org>
In-Reply-To: <20120411052733.GA17352@merlins.org>

On Tue, Apr 10, 2012 at 10:27:33PM -0700, Marc MERLIN wrote:
> On Tue, Apr 10, 2012 at 08:11:03AM +0200, Eric Dumazet wrote:
> > Please try following patch, as it solved the problem for me (no more
> > order-1 allocations in tx path)
> 
> I applied your patch to 3.3.1 and cannot reproduce the problem anymore.
> 
> I'll leave a big wireless copy running overnight just in case, but I think
> you fixed it.

Mmmh, so I'm running 3.4.4 and I had another full machine hang while copying
big files (gigabytes) over wireless via NFS.
The laptop recovered on its own after 5 minutes or so (the mouse cursor would
not even move) and I was able to kill -9 the process (Midnight Commander).
mc did not actually exit for another 4 minutes or so (i.e. it took that long
for the process to come out of its hung state in the kernel), but the machine
was usable during that time.
Note that copying the same data with scp works fine.
NFS mount looks like this:
gargamel:/mnt/dshelf2/ /net/gargamel/mnt/dshelf2 nfs4 rw,nosuid,nodev,relatime,vers=4.0,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.205.7,local_lock=none,addr=192.168.205.3 0 0

Unlike last time, there was nothing in the kernel logs, and more
annoyingly, ps -elf shows nothing but '-' in WCHAN for every process on
3.4.4 (procps-ng 3.3.3), which makes pointing the finger a bit harder.

My understanding is that if a user space call into a driver can shut off all
interrupts for an extended period of time (at least I think that is what
happened, since my mouse cursor would not move), that is still a kernel bug.

For what it's worth, copying 1GB of data in lots of small files does not
cause problems. It seems to be big files that trigger it, likely because
they fill a buffer somewhere while interrupts are disabled.

Do you have an idea of how I can find out where my mc process is stuck in
the kernel?
Should I reproduce with specific sysrq output?
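
One possible way to approach the question above, as a sketch only: once the
machine is responsive again but the process is still stuck, the kernel
exposes per-task state in /proc/<pid>/stat, the wait channel in
/proc/<pid>/wchan, and (as root, on kernels built with CONFIG_STACKTRACE)
the kernel stack in /proc/<pid>/stack. The script below lists every task in
uninterruptible sleep (state D) with that information. The function names
are illustrative and not taken from any tool mentioned in this thread, and
whether wchan shows anything useful on 3.4.4 is an assumption, given that
ps shows only '-'. For the sysrq route, "echo w > /proc/sysrq-trigger"
dumps blocked tasks to the kernel log.

```python
#!/usr/bin/env python3
"""Sketch: list tasks in uninterruptible sleep (state D) with wchan and stack."""
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the last ')' instead of splitting on whitespace.
    """
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_str), comm, state


def read_or_note(path):
    """Read a /proc file, returning a short note instead of raising."""
    try:
        with open(path) as f:
            return f.read().strip() or "-"
    except OSError as e:
        return f"<unreadable: {e.strerror}>"


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm, wchan, stack) for every task currently in state D."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        base = os.path.join(proc, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # the task exited while we were scanning
        if state == "D":
            yield (pid, comm,
                   read_or_note(os.path.join(base, "wchan")),
                   read_or_note(os.path.join(base, "stack")))


if __name__ == "__main__":
    for pid, comm, wchan, stack in blocked_tasks():
        print(f"{pid} {comm} wchan={wchan}\n{stack}\n")
```

Run as root while mc is still hung. If the stack file is unreadable or
empty, the sysrq 'w' dump in dmesg carries similar information.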

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  
