From: Doug Hughes <doug@will.to>
To: linux-kernel@vger.kernel.org
Subject: strange linux kernel NFS problem(s)
Date: Thu, 02 Dec 2010 21:40:35 -0500
Message-ID: <4CF858A3.2050202@will.to>


This is my first post here, but not my first problem of this kind. It
is, however, the first occurrence on a recent kernel (2.6.34) that
produced data useful enough to post and ask for advice:

Symptoms: the machine's load climbs, NFS mount processes hang, and
things (particularly NFS) stop working. ssh and IP connectivity still
work, as does ps.
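
One way to confirm the "high load but still responsive" symptom is to
look for tasks in uninterruptible sleep (state D), since Linux counts
those toward the load average. The following is only a minimal sketch,
not something that was run on the affected machine: the helper names
task_state and uninterruptible_tasks are made up for illustration, and
it only looks at thread-group leaders under /proc, not at every thread.

```python
import os


def task_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST ')' instead of splitting on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid = int(head.split(" ", 1)[0])
    comm = head.split("(", 1)[1]
    state = tail.split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = task_state(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

If most of the entries printed are mount.nfs, umount.nfs or other
NFS-related tasks, that would fit the picture of NFS being wedged while
the rest of the system keeps running.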

*general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
CPU 1
Modules linked in: nfs auth_rpcgss autofs4 i2c_dev i2c_core lockd sunrpc 
cachefiles fscache ipmi_si ipmi_devintf ipmi_msghandler ip6t_REJECT 
xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 video output battery 
ac parport_pc lp parport joydev button sr_mod pcspkr iTCO_wdt shpchp 
dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod usb_storage 
pata_acpi ata_piix ata_generic libata uhci_hcd ohci_hcd ehci_hcd [last 
unloaded: microcode]

Pid: 28573, comm: python2.5 Not tainted 2.6.34 #3 X7DWT/X7DWT
RIP: 0010:[<ffffffffa0292cdb>]  [<ffffffffa0292cdb>] 
nfs_release+0x64/0x94 [nfs]
RSP: 0018:ffff88041ccb9d58  EFLAGS: 00010246
RAX: ffff88041c47d160 RBX: ffff88041c47d1e8 RCX: ff88041c47d16088
RDX: ffff88042c593288 RSI: ffff88042c504e40 RDI: ffff88041c47d294
RBP: ffff88041ccb9d78 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000300000000 R11: 0000000000000000 R12: ffff88042c593240
R13: ffff88042c504e40 R14: ffff88041ea59ec0 R15: ffff8804273f55c0
FS:  0000000000000000(0000) GS:ffff880001840000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000003fd5c03350 CR3: 0000000001613000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process python2.5 (pid: 28573, threadinfo ffff88041ccb8000, task 
ffff8803e246adf0)
Stack:
  0000000300000000 ffff88042c504e40 ffff88041c47d1e8 ffff88041c47d1e8
<0> ffff88041ccb9d98 ffffffffa0290fc5 0000000000000010 ffff88042c504e40
<0> ffff88041ccb9dd8 ffffffff810a75b7 ffff88042caf3120 ffff88042c687768
Call Trace:
  [<ffffffffa0290fc5>] nfs_file_release+0x5c/0x61 [nfs]
  [<ffffffff810a75b7>] __fput+0xf6/0x1bf
  [<ffffffff810a78ba>] fput+0x15/0x17
  [<ffffffff8108ccff>] remove_vma+0x36/0x6c
  [<ffffffff8108ce54>] exit_mmap+0x11f/0x141
  [<ffffffff81030119>] mmput+0x2d/0xc3
  [<ffffffff81033e9f>] exit_mm+0x10b/0x118
  [<ffffffff81064b75>] ? audit_free+0x191/0x1c4
  [<ffffffff81035074>] do_exit+0x200/0x685
  [<ffffffff81035567>] do_group_exit+0x6e/0x98
  [<ffffffff810355a3>] sys_exit_group+0x12/0x16
  [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
Code: 11 e1 49 8d 54 24 48 49 8b 4c 24 48 48 8b 42 08 48 89 41 08 48 89 
08 48 8d 83 78 ff ff ff 48 8b 48 08 49 89 44 24 48 48 89 50 08 <48> 89 
11 48 89 4a 08 fe 83 ac 00 00 00 41 8b 75 38 4c 89 e7 81
RIP  [<ffffffffa0292cdb>] nfs_release+0x64/0x94 [nfs]
  RSP <ffff88041ccb9d58>
---[ end trace 1ac7372e162481b8 ]---
Fixing recursive fault but reboot is needed!
mount: server antonrootfs.d.stor.en.desres.deshaw.com not responding, 
timed out
[root@antonfe0002 ~]# uptime
  20:58:04 up 12 days,  1:05,  4 users,  load average: 20.98, 20.23, 18.99
*
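
As a rough cross-check of the register dump above: on x86-64 with
48-bit virtual addresses, a pointer is canonical only if bits 63..47
are all equal, and dereferencing a non-canonical address raises a
general protection fault rather than a page fault. The sketch below
(helper names is_canonical and non_canonical_registers are invented for
illustration) scans the general-purpose registers copied from the oops
and reports any that fail that test. Decoding the marked bytes
"<48> 89 11" by hand gives a 64-bit store of RDX through RCX, so a bad
RCX would be consistent with the fault, though this is a reading of the
dump and not a confirmed diagnosis.

```python
import re


def is_canonical(addr):
    """True if bits 63..47 of a 64-bit address are all 0 or all 1."""
    top = addr >> 47  # the 17 most significant bits
    return top == 0 or top == 0x1FFFF


# Matches "RAX: ffff..." style pairs; CR*/DR* are skipped on purpose
# because the name must start with R at a word boundary.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-fA-F]+)\b")


def non_canonical_registers(oops_text):
    """Return {register: hex string} for values that are non-canonical."""
    bad = {}
    for name, value in REG_RE.findall(oops_text):
        if not is_canonical(int(value, 16)):
            bad[name] = value
    return bad


# Register lines copied verbatim from the oops above.
SAMPLE = """\
RAX: ffff88041c47d160 RBX: ffff88041c47d1e8 RCX: ff88041c47d16088
RDX: ffff88042c593288 RSI: ffff88042c504e40 RDI: ffff88041c47d294
RBP: ffff88041ccb9d78 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000300000000 R11: 0000000000000000 R12: ffff88042c593240
R13: ffff88042c504e40 R14: ffff88041ea59ec0 R15: ffff8804273f55c0
"""

if __name__ == "__main__":
    print(non_canonical_registers(SAMPLE))
```

On the dump above this flags only RCX (ff88041c47d16088); every other
general-purpose register is either zero, a small integer, or a normal
ffff88... kernel address.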
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Nov20 ?        00:00:04 init [3]
root         2     0  0 Nov20 ?        00:00:00 [kthreadd]
root         3     2  0 Nov20 ?        00:00:00 [migration/0]
root         4     2  0 Nov20 ?        02:42:37 [ksoftirqd/0]
root         5     2  0 Nov20 ?        00:00:00 [migration/1]
root         6     2  3 Nov20 ?        10:04:25 [ksoftirqd/1]
root         7     2  0 Nov20 ?        00:00:00 [migration/2]
root         8     2  0 Nov20 ?        01:39:58 [ksoftirqd/2]
root         9     2  0 Nov20 ?        00:00:00 [migration/3]
root        10     2  4 Nov20 ?        13:28:17 [ksoftirqd/3]
root        11     2  0 Nov20 ?        00:00:00 [migration/4]
root        12     2  7 Nov20 ?        20:39:20 [ksoftirqd/4]
root        13     2  0 Nov20 ?        00:00:00 [migration/5]
root        14     2  0 Nov20 ?        00:06:39 [ksoftirqd/5]
root        15     2  0 Nov20 ?        00:00:00 [migration/6]
root        16     2  7 Nov20 ?        21:56:03 [ksoftirqd/6]
root        17     2  0 Nov20 ?        00:00:00 [migration/7]
root        18     2  1 Nov20 ?        03:06:59 [ksoftirqd/7]
root        19     2  0 Nov20 ?        00:00:06 [events/0]
root        20     2  0 Nov20 ?        00:00:22 [events/1]
root        21     2  0 Nov20 ?        00:00:09 [events/2]
root        22     2  0 Nov20 ?        00:00:08 [events/3]
root        23     2  0 Nov20 ?        00:00:05 [events/4]
root        24     2  0 Nov20 ?        00:00:33 [events/5]
root        25     2  0 Nov20 ?        00:00:07 [events/6]
root        26     2  0 Nov20 ?        00:00:12 [events/7]
root        27     2  0 Nov20 ?        00:00:00 [khelper]
root        32     2  0 Nov20 ?        00:00:00 [async/mgr]
root       175     2  0 Nov20 ?        00:00:00 [sync_supers]
root       177     2  0 Nov20 ?        00:00:00 [bdi-default]
root       178     2  0 Nov20 ?        00:00:00 [kintegrityd/0]
root       179     2  0 Nov20 ?        00:00:00 [kintegrityd/1]
root       180     2  0 Nov20 ?        00:00:00 [kintegrityd/2]
root       181     2  0 Nov20 ?        00:00:00 [kintegrityd/3]
root       182     2  0 Nov20 ?        00:00:00 [kintegrityd/4]
root       183     2  0 Nov20 ?        00:00:00 [kintegrityd/5]
root       184     2  0 Nov20 ?        00:00:00 [kintegrityd/6]
root       185     2  0 Nov20 ?        00:00:00 [kintegrityd/7]
root       186     2  0 Nov20 ?        00:00:00 [kblockd/0]
root       187     2  0 Nov20 ?        00:00:00 [kblockd/1]
root       188     2  0 Nov20 ?        00:00:00 [kblockd/2]
root       189     2  0 Nov20 ?        00:00:00 [kblockd/3]
root       190     2  0 Nov20 ?        00:00:00 [kblockd/4]
root       191     2  0 Nov20 ?        00:00:00 [kblockd/5]
root       192     2  0 Nov20 ?        00:00:00 [kblockd/6]
root       193     2  0 Nov20 ?        00:00:00 [kblockd/7]
root       195     2  0 Nov20 ?        00:00:00 [kacpid]
root       196     2  0 Nov20 ?        00:00:00 [kacpi_notify]
root       197     2  0 Nov20 ?        00:00:00 [kacpi_hotplug]
root       304     2  0 Nov20 ?        00:00:00 [khubd]
root       307     2  0 Nov20 ?        00:00:00 [kseriod]
root       416     2  0 Nov20 ?        00:00:00 [kswapd0]
root       417     2  0 Nov20 ?        00:00:00 [aio/0]
root       418     2  0 Nov20 ?        00:00:00 [aio/1]
root       419     2  0 Nov20 ?        00:00:00 [aio/2]
root       420     2  0 Nov20 ?        00:00:00 [aio/3]
root       421     2  0 Nov20 ?        00:00:00 [aio/4]
root       422     2  0 Nov20 ?        00:00:00 [aio/5]
root       423     2  0 Nov20 ?        00:00:00 [aio/6]
root       424     2  0 Nov20 ?        00:00:00 [aio/7]
root       426     2  0 Nov20 ?        00:00:00 [crypto/0]
root       427     2  0 Nov20 ?        00:00:00 [crypto/1]
root       428     2  0 Nov20 ?        00:00:00 [crypto/2]
root       429     2  0 Nov20 ?        00:00:00 [crypto/3]
root       430     2  0 Nov20 ?        00:00:00 [crypto/4]
root       431     2  0 Nov20 ?        00:00:00 [crypto/5]
root       432     2  0 Nov20 ?        00:00:00 [crypto/6]
root       433     2  0 Nov20 ?        00:00:00 [crypto/7]
root       635     2  0 Nov20 ?        00:00:00 [kpsmoused]
root       656     2  0 Nov20 ?        00:00:02 [edac-poller]
root       701     2  0 Nov20 ?        00:00:00 [usbhid_resumer]
root       713     2  0 Nov20 ?        00:00:00 [ata/0]
root       714     2  0 Nov20 ?        00:00:00 [ata/1]
root       715     2  0 Nov20 ?        00:00:00 [ata/2]
root       716     2  0 Nov20 ?        00:00:00 [ata/3]
root       717     2  0 Nov20 ?        00:00:00 [ata/4]
root       718     2  0 Nov20 ?        00:00:00 [ata/5]
root       719     2  0 Nov20 ?        00:00:00 [ata/6]
root       720     2  0 Nov20 ?        00:00:00 [ata/7]
root       721     2  0 Nov20 ?        00:00:00 [ata_aux]
root       724     2  0 Nov20 ?        00:00:00 [scsi_eh_0]
root       725     2  0 Nov20 ?        00:00:00 [scsi_eh_1]
root       733     2  0 Nov20 ?        00:00:00 [scsi_eh_2]
root       734     2  0 Nov20 ?        00:00:00 [usb-storage]
root       753     2  0 Nov20 ?        00:00:00 [kstriped]
root       759     2  0 Nov20 ?        00:00:00 [ksnapd]
root       763     2  0 Nov20 ?        00:33:13 [md3_raid1]
root       766     2  0 Nov20 ?        00:00:24 [md2_raid1]
root       769     2  0 Nov20 ?        00:00:46 [md1_raid1]
root       772     2  0 Nov20 ?        00:00:49 [md0_raid1]
root       777     2  0 Nov20 ?        00:00:00 [kjournald]
root       803     2  0 Nov20 ?        00:00:00 [kauditd]
root       840     1  0 Nov20 ?        00:00:03 /sbin/udevd -d
root      1450  3450  0 20:01 ?        00:00:00 crond
root      1451  1450  0 20:01 ?        00:00:00 /bin/bash 
/usr/bin/run-parts /et
root      1452  1451  0 20:01 ?        00:00:00 /bin/bash 
/etc/cron.hourly/mcelo
root      1453  1451  0 20:01 ?        00:00:00 awk -v 
progname=/etc/cron.hourly
root      1454  1452  0 20:01 ?        00:00:00 /usr/sbin/mcelog 
--ignorenodev -
0001001   2207  3393  0 20:10 ?        00:00:00 sshd: 0001001 [priv]
sshd      2208  2207  0 20:10 ?        00:00:00 sshd: 0001001 [net]
root      2210  3230  0 20:10 ?        00:00:00 /bin/mount -t nfs -s -o 
retry=10
root      2211  2210  0 20:10 ?        00:00:00 /sbin/mount.nfs fish1.nyc
root      2323     2  0 Nov20 ?        00:00:00 [kdmflush]
root      2358     2  0 Nov20 ?        00:00:00 [kjournald]
root      2359     2  0 Nov20 ?        00:00:01 [kjournald]
root      2585  3393  0 12:43 ?        00:00:00 sshd: 001002[priv]
001002    2590  2585  0 12:43 ?        00:00:00 sshd: 001002@pts/3
001002    2591  2590  0 12:43 pts/3    00:00:00 -bash
root      2740     2  0 17:53 ?        00:00:00 [kslowd000]
root      2933     1  0 Nov20 ?        00:00:00 auditd
root      2935  2933  0 Nov20 ?        00:00:00 /sbin/audispd
root      2962     2  0 Nov20 ?        00:26:41 [kipmi0]
root      2981     1  0 Nov20 ?        00:00:01 syslogd -m 0
root      2984     1  0 Nov20 ?        00:00:00 klogd -x
root      3019     1  0 Nov20 ?        00:00:00 cachefilesd
root      3031     1  0 Nov20 ?        00:01:50 irqbalance
rpc       3047     1  0 Nov20 ?        00:00:00 portmap
root      3073     2  0 Nov20 ?        00:00:00 [rpciod/0]
root      3074     2  0 Nov20 ?        00:00:00 [rpciod/1]
root      3075     2  0 Nov20 ?        00:00:00 [rpciod/2]
root      3076     2  0 Nov20 ?        00:00:00 [rpciod/3]
root      3077     2  0 Nov20 ?        00:00:00 [rpciod/4]
root      3078     2  0 Nov20 ?        00:00:00 [rpciod/5]
root      3079     2  0 Nov20 ?        00:00:00 [rpciod/6]
root      3080     2  0 Nov20 ?        00:00:00 [rpciod/7]
root      3086     1  0 Nov20 ?        00:00:00 rpc.statd
root      3135     1  0 Nov20 ?        00:00:02 mdadm --monitor --scan 
-f --pid-
root      3156     1  0 Nov20 ?        00:00:01 rpc.idmapd
root      3195     1  0 Nov20 ?        00:00:00 /usr/sbin/acpid
root      3230     1  0 Nov20 ?        00:02:33 automount
daemon    3318     1  0 Nov20 ?        00:00:35 /usr/sbin/munged
root      3333     1  0 Nov20 ?        00:02:07 /usr/sbin/snmpd -Lsd -Lf 
/dev/nu
distcc    3378     1  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
distcc    3379  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
root      3393     1  0 Nov20 ?        00:00:00 /usr/sbin/sshd
distcc    3412  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
distcc    3414  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
root      3450     1  0 Nov20 ?        00:00:01 crond
distcc    3459  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
root      3466     1  0 Nov20 ?        00:00:00 /opt/slurm/sbin/slurmd
postfix   3476     1  0 Nov20 ?        00:00:00 /usr/sbin/nullmailer-send
root      3496     1  0 Nov20 ?        00:00:00 /usr/sbin/atd
distcc    3564  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
distcc    3594  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
root      3596     1  0 Nov20 ?        00:00:00 /usr/sbin/smartd -q never
root      3599     1  0 Nov20 tty1     00:00:00 /sbin/mingetty tty1
root      3600     1  0 Nov20 tty2     00:00:00 /sbin/mingetty tty2
root      3601     1  0 Nov20 tty3     00:00:00 /sbin/mingetty tty3
root      3602     1  0 Nov20 tty4     00:00:00 /sbin/mingetty tty4
root      3603     1  0 Nov20 tty5     00:00:00 /sbin/mingetty tty5
root      3604     1  0 Nov20 tty6     00:00:00 /sbin/mingetty tty6
distcc    3618  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
distcc    3620  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
distcc    3623  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
distcc    3626  3378  0 Nov20 ?        00:00:00 /usr/bin/distccd 
--daemon --allo
root      3638     1  0 Nov20 ttyS1    00:00:00 /sbin/agetty -L ttyS1 
19200 vt10
root      3639     1  0 Nov20 ttyS0    00:00:00 /sbin/agetty -L ttyS0 
115200 vt1
root      3650     2  0 Nov20 ?        00:00:00 [nfsiod]
root      4782     1  0 Nov20 ?        00:00:33 /usr/bin/python 
/opt/rocks/bin/g
nobody    4824     1  0 Nov20 ?        00:00:35 /usr/sbin/gmond
root      5164  3393  0 20:48 ?        00:00:00 sshd: root@pts/8
001003    5211     1  0 20:48 ?        00:00:00 /usr/bin/xauth -q -
root      6264  3393  0 20:57 ?        00:00:00 sshd: root@pts/10
root      6274  6264  0 20:57 pts/10   00:00:00 -bash
root      6335  6274  0 20:58 pts/10   00:00:00 ps -ef
root      7138     2  0 Nov20 ?        00:00:00 [lockd]
001003    7607     1  0 17:55 ?        00:00:00 -bash
root      7890  3393  0 Nov20 ?        00:00:00 sshd: 001004 [priv]
001004    7898  7890  0 Nov20 ?        00:00:03 sshd: 001004@pts/0
001004    7899  7898  0 Nov20 pts/0    00:00:00 -tcsh
root     25087     2  0 16:12 ?        00:00:00 [kslowd001]
ntp      25923     1  0 05:38 ?        00:00:00 ntpd -u ntp:ntp -p 
/var/run/ntpd
root     27886  3393  0 Nov22 ?        00:00:00 sshd: 001005 [priv]
001005   27893 27886  0 Nov22 ?        00:00:02 sshd: 001005@pts/1
001005   27895 27893  0 Nov22 pts/1    00:00:00 -bash
001003   28573  7607  0 19:03 ?        00:00:00 [python2.5]
001003   29197     1  0 19:10 ?        00:00:00 -bash
001003   30030 29197 99 19:11 ?        01:46:10 python2.5 
/u/nyc/001003/lib/root
001003   30127     1  0 19:12 ?        00:00:00 /usr/bin/xauth -q -
001003   30149     1  0 19:12 ?        00:00:00 -bash
root     30181  3230  0 19:12 ?        00:00:00 /bin/mount -t nfs -s -o 
retry=10
root     30182 30181  0 19:12 ?        00:00:00 /sbin/mount.nfs host3.nyc
root     30245  3393  0 19:13 ?        00:00:00 sshd: root@pts/7
root     30353     1  0 19:14 ?        00:00:00 /sbin/umount.nfs 
/data/desrad-p
root     30504     1  0 19:16 ?        00:00:00 /sbin/umount.nfs 
/u/nyc/001008
root     31003  3230  0 19:22 ?        00:00:00 /bin/mount -t nfs -s -o 
retry=10
root     31004 31003  0 19:22 ?        00:00:00 /sbin/mount.nfs host3.nyc
root     31569     1  0 19:30 ?        00:00:00 /sbin/umount.nfs 
/proj/desrad-a
root     31632     1  0 19:31 ?        00:00:00 /sbin/umount.nfs 
/u/nyc/0001001
root     31653     1  0 19:31 ?        00:00:00 /sbin/umount.nfs 
/proj/desrad
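
To pull the NFS mount and umount helpers out of a listing like the one
above, something along these lines should work. It is a minimal sketch
that assumes the ps -ef column layout shown here; the names PS_LINE and
nfs_helpers are made up for illustration. Wrapped continuation lines
(where the mailer broke the CMD column) do not match the pattern and
are skipped, so commands may appear truncated.

```python
import re

# One ps -ef row: UID PID PPID C STIME TTY TIME CMD
PS_LINE = re.compile(
    r"^(?P<uid>\S+)\s+(?P<pid>\d+)\s+(?P<ppid>\d+)\s+(?P<c>\d+)\s+"
    r"(?P<stime>\S+)\s+(?P<tty>\S+)\s+(?P<time>\d+:\d+:\d+)\s+(?P<cmd>.*)$"
)


def nfs_helpers(ps_text):
    """Return (pid, start time, command) for NFS mount/umount processes."""
    found = []
    for line in ps_text.splitlines():
        m = PS_LINE.match(line)
        if m is None:
            continue  # header row or wrapped continuation line
        cmd = m.group("cmd").strip()
        # "mount.nfs" is also a substring of "umount.nfs", so this one
        # test covers both helpers; "-t nfs" catches /bin/mount itself.
        if "mount.nfs" in cmd or "-t nfs" in cmd:
            found.append((int(m.group("pid")), m.group("stime"), cmd))
    return found


# A few rows copied from the listing above.
SAMPLE = """\
UID        PID  PPID  C STIME TTY          TIME CMD
root      2210  3230  0 20:10 ?        00:00:00 /bin/mount -t nfs -s -o
retry=10
root      2211  2210  0 20:10 ?        00:00:00 /sbin/mount.nfs fish1.nyc
root      3230     1  0 Nov20 ?        00:02:33 automount
root     30353     1  0 19:14 ?        00:00:00 /sbin/umount.nfs
/data/desrad-p
"""

if __name__ == "__main__":
    for pid, stime, cmd in nfs_helpers(SAMPLE):
        print(pid, stime, cmd)
```

In the full listing, the helpers started between 19:12 and 20:10 are
still present when ps was run at 20:58, and the earliest of them start
a few minutes after the crashed python2.5 task (pid 28573, started
19:03). That ordering is suggestive, but the listing alone does not
prove the oops is what wedged them.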

