public inbox for linux-kernel@vger.kernel.org
* 2.6.21.14 NFS related oops
@ 2007-06-13 12:00 Maciej Soltysiak
  2007-06-13 19:17 ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Maciej Soltysiak @ 2007-06-13 12:00 UTC (permalink / raw)
  To: linux-kernel

Hi,

If anyone is interested, I got this oops while running a torrent
application (btdownloadcurses) that was writing directly to a NAS
mounted via NFSv3.

The client machine runs 2.6.21.14 and the share is mounted with options:
wsize=8192,rsize=8192,hard,intr,tcp

After that, the application hung. I am unable to cd into the mounted
NFS directory, unmount it (busy), or kill the app (kill -9 fails; the
process is in D state).

Best regards,
Maciej

BUG: unable to handle kernel paging request at virtual address 5018f248
 printing eip:
f0a93c94
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: binfmt_misc sit nfs lockd nfs_acl sunrpc w83627ehf i2c_isa i2c_viapro i2c_core via_agp agpgart rtc
CPU:    0
EIP:    0060:[<f0a93c94>]    Not tainted VLI
EFLAGS: 00010206   (2.6.20.14-cks1 #15)
EIP is at rpcauth_checkverf+0x34/0x70 [sunrpc]
eax: d2f4447c   ebx: c655d584   ecx: 00000000   edx: f0aa9f60
esi: e91ea640   edi: d2f44474   ebp: ede2f228   esp: e64b5eec
ds: 007b   es: 007b   ss: 0068
Process rpciod/0 (pid: 1005, ti=e64b4000 task=efe95a90 task.ti=e64b4000)
Stack: 00000286 ede2f8a0 ede2f8a0 00000286 c655d584 121d0da3 00000820 f0a8d7fd
       f0a93d60 f08bae07 00000286 c655d5cc 00000286 00000286 f08c0520 c655d584
       00000000 c655d5ec f0a93260 f0a9306f efe95a90 ee2d5740 e092ffb0 c034e11c
Call Trace:
 [<f0a8d7fd>] call_decode+0x27d/0x5e0 [sunrpc]
 [<f0a93d60>] rpcauth_unbindcred+0x20/0x60 [sunrpc]
 [<f08bae07>] nfs_readpage_result_full+0xf7/0x120 [nfs]
 [<f08c0520>] nfs3_xdr_readres+0x0/0x160 [nfs]
 [<f0a93260>] rpc_async_schedule+0x0/0x10 [sunrpc]
 [<f0a9306f>] __rpc_execute+0x5f/0x250 [sunrpc]
 [<c034e11c>] schedule+0x21c/0x450
 [<c01283aa>] run_workqueue+0x7a/0x110
 [<c0128a07>] worker_thread+0x137/0x160
 [<c01176b0>] default_wake_function+0x0/0x10
 [<c01288d0>] worker_thread+0x0/0x160
 [<c012b329>] kthread+0xa9/0xe0
 [<c012b280>] kthread+0x0/0xe0
 [<c0103a97>] kernel_thread_helper+0x7/0x10
 =======================
Code: 10 89 5c 24 10 89 c3 89 7c 24 18 89 d7 89 74 24 14 8b 70 28 75 1a 8b
4e 08 89 fa 89 d8 ff 51 18 8b 5c 24 10 83 74 24 14 8b 7c 24 <18> 83 c4 1c c3
89 74 24 0c 8b 40 10 8b 40 24 8b 40 10 8b 40 08
EIP: [<f0a93c94>] rpcauth_checkverf+0x34/0x70 [sunrpc] SS:ESP 0068:e64b5eec
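
For readers unfamiliar with the "Oops: 0002" line above: on 32-bit x86 the
number is the page-fault error code, and its low three bits describe the
access that faulted. The sketch below decodes those bits. The function name
and dictionary keys are made up for illustration and are not part of any
kernel tool.

```python
def decode_oops_error_code(code_text: str) -> dict:
    """Decode the low three bits of an x86 page-fault error code.

    code_text is the hexadecimal field printed after "Oops:" (e.g. "0002").
    The key names are illustrative only.
    """
    code = int(code_text, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "protection_fault": bool(code & 0x1),
        # bit 1: 0 = read access, 1 = write access
        "write": bool(code & 0x2),
        # bit 2: 0 = kernel mode, 1 = user mode
        "user_mode": bool(code & 0x4),
    }


if __name__ == "__main__":
    print(decode_oops_error_code("0002"))
```

For the report above, 0002 decodes as a kernel-mode write to a page that is
not present, which agrees with the "*pde = 00000000" line.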


* Re: 2.6.21.14 NFS related oops
@ 2007-06-20 10:35 Maciej Sołtysiak
  0 siblings, 0 replies; 7+ messages in thread
From: Maciej Sołtysiak @ 2007-06-20 10:35 UTC (permalink / raw)
  To: linux-kernel

 > > I'm running 2.6.21.5 now with slab debugging on, here's what I got
 > > about slab corruption:
 > >
 > > Slab corruption: skbuff_head_cache start=ef287b78, len=164
 > > Redzone: 0x5a2cf071/0x5a2cf071.
 > > Last user: [<c031710c>](kfree_skbmem+0x3c/0x90)
 > > 090: 6b 6b 6b 6b 6b 63 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
 > > Single bit error detected. Probably bad RAM.
 > > Run memtest86+ or a similar memory test tool.
 > > Prev obj: start=ef287ac8, len=164
 > > Redzone: 0x170fc2a5/0x170fc2a5.
 > > Last user: [<c031798b>](__alloc_skb+0x2b/0x100)
 > > 000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 > > 010: 00 00 00 00 e0 71 e6 ef 00 00 00 00 00 00 00 00
 > > Next obj: start=ef287c28, len=164
 > > Redzone: 0x170fc2a5/0x170fc2a5.
 > > Last user: [<c031798b>](__alloc_skb+0x2b/0x100)
 > > 000: 84 d0 85 c5 84 d0 85 c5 04 d0 85 c5 2c 0a 73 46
 > > 010: 6f cd 09 00 00 00 00 00 01 00 00 00 08 e5 72 ee
 > >
 > > How probable is that it is really a bad memory issue?
 > > Does this report say anything about which RAM chip I should
 > > investigate/replace ? I have 1x512MB+1x256MB
 > >
 > > Best Regards,
 > > Maciej
 >
 > I'd try doing as suggested above: run memtest86 on the computer for a
 > couple of hours and see what it tells you. That should hopefully give
 > you enough information to figure out which chips need replacing.

I am also getting BAD CRC errors on the disk that holds my swap
partition. Could slab debugging be reporting slab corruption not because
my RAM chips are bad, but because the swap partition has bad blocks? In
that case the whole problem might be related to the swap disk rather
than to RAM.

 > Cheers
 >   Trond
Regards,
Maciej
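
A note on the "Single bit error detected" line in the quoted slab report:
freed slab objects are filled with the poison byte 0x6b, and the dump shows
one byte reading 0x63 instead. The sketch below reproduces that kind of check
on the quoted dump line by XORing each byte with the poison value and testing
whether exactly one bit differs. It illustrates the idea only and is not the
kernel's code; the helper name is hypothetical.

```python
POISON_FREE = 0x6B  # byte pattern written over freed slab objects


def find_flipped_bytes(dump_line: str, poison: int = POISON_FREE) -> list:
    """Return (offset, value, xor_mask, is_single_bit) for each non-poison byte.

    dump_line looks like "090: 6b 6b ...": a hex offset, a colon, then bytes.
    """
    offset_text, _, hex_bytes = dump_line.partition(":")
    base = int(offset_text, 16)
    results = []
    for index, token in enumerate(hex_bytes.split()):
        value = int(token, 16)
        mask = value ^ poison
        if mask:
            # A mask with exactly one bit set is a power of two.
            is_single_bit = (mask & (mask - 1)) == 0
            results.append((base + index, value, mask, is_single_bit))
    return results


if __name__ == "__main__":
    line = "090: 6b 6b 6b 6b 6b 63 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b"
    for offset, value, mask, single in find_flipped_bytes(line):
        print(f"offset 0x{offset:03x}: 0x{value:02x} (xor 0x{mask:02x}, single bit: {single})")
```

Here the differing byte sits at offset 0x95 and differs from the poison value
by the mask 0x08, a single bit. That pattern is what prompts the RAM hint in
the report, although it does not by itself prove the memory is faulty.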



Thread overview: 7+ messages
2007-06-13 12:00 2.6.21.14 NFS related oops Maciej Soltysiak
2007-06-13 19:17 ` Trond Myklebust
2007-06-13 20:35   ` Chuck Ebbert
2007-06-14 15:34   ` Maciej Soltysiak
2007-06-16  9:26   ` Maciej Sołtysiak
2007-06-16 15:08     ` Trond Myklebust
  -- strict thread matches above, loose matches on Subject: below --
2007-06-20 10:35 Maciej Sołtysiak
