* [OOPS] nfsv4 in linux 2.6.13 (-ck7)
@ 2005-10-12 23:03 Gabriel A. Devenyi
2005-10-12 23:24 ` Chris Wright
2005-10-14 15:49 ` Trond Myklebust
0 siblings, 2 replies; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-12 23:03 UTC
To: linux-kernel
This oops seems to occur during heavy i/o load over nfsv4.
nfs-utils version 1.0.7
Oops follows; what other information is needed?
[kernel] Unable to handle kernel paging request at 0000000000100108 RIP:
[kernel] <ffffffff80185e98>{generic_drop_inode+56}
[kernel] PGD 34e3b067 PUD 34e68067 PMD 0
[kernel] CPU 0
[kernel] Modules linked in: nvidia snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd_util_mem snd_hwdep snd
[kernel] Pid: 179, comm: kswapd0 Tainted: P 2.6.13-ck7
[kernel] RIP: 0010:[<ffffffff80185e98>] <ffffffff80185e98>{generic_drop_inode+56}
[kernel] RSP: 0018:ffff81003fcd7b68 EFLAGS: 00010246
[kernel] RAX: 0000000000100100 RBX: ffff81001a58c950 RCX: 0000000000200200
[kernel] RDX: ffff81001a58c960 RSI: ffff81003eb84000 RDI: ffff81001a58c950
[kernel] RBP: ffff81001a58c950 R08: 00000000fffffffa R09: ffff81001a58ca68
[kernel] R10: 0000000000000001 R11: ffffffff80185e60 R12: 0000000000000000
[kernel] R13: ffff81001a58c7d0 R14: ffff81001a58c860 R15: ffff81003f1f5200
[kernel] FS: 0000000040800960(0000) GS:ffffffff80494800(0000) knlGS:0000000056160040
[kernel] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[kernel] CR2: 0000000000100108 CR3: 0000000034de7000 CR4: 00000000000006e0
[kernel] Process kswapd0 (pid: 179, threadinfo ffff81003fcd6000, task ffff81003fcb2760)
[kernel] Stack: ffff81003f1f5c00 ffffffff801d7a25 00000001803e8238 ffff81003fcd7c18
[kernel] ffffffffffffffff ffff81003fcd7c18 ffff81003fcd7c00 ffff81001a58c938
[kernel] 0000000000000000 0000000000000000
[kernel] Call Trace:<ffffffff801d7a25>{__nfs_revalidate_inode+261} <ffffffff8014e5df>{find_get_pages_tag+31}
[kernel] <ffffffff8015781a>{pagevec_lookup_tag+26} <ffffffff8014e00e>{wait_on_page_writeback_range+206}
[kernel] <ffffffff801f11ba>{nfs_do_return_delegation+42} <ffffffff801f12e5>{nfs_inode_return_delegation+197}
[kernel] <ffffffff801d8a10>{nfs4_clear_inode+32} <ffffffff80184cfe>{clear_inode+158}
[kernel] <ffffffff8018594e>{dispose_list+94} <ffffffff80185b82>{shrink_icache_memory+434}
[kernel] <ffffffff8015806b>{shrink_slab+219} <ffffffff80159517>{balance_pgdat+695}
[kernel] <ffffffff801597a8>{kswapd+312} <ffffffff80143b30>{autoremove_wake_function+0}
[kernel] <ffffffff80143b30>{autoremove_wake_function+0} <ffffffff8010f30e>{child_rip+8}
[kernel] <ffffffff80159670>{kswapd+0} <ffffffff8010f306>{child_rip+0}
[kernel]
[kernel] Code: 48 89 48 08 48 89 01 48 8b 05 aa 43 26 00 48 89 50 08 48 89
[kernel] RIP <ffffffff80185e98>{generic_drop_inode+56} RSP <ffff81003fcd7b68>
--
Gabriel A. Devenyi
ace@staticwave.ca
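A note for readers decoding the oops above: the fault address 0000000000100108 and the RAX/RCX values 0x100100 and 0x200200 match the poison constants that list_del() stored in a removed list entry in 2.6-era kernels (LIST_POISON1 and LIST_POISON2). That suggests a list entry was unlinked a second time after it had already been removed. The user-space C sketch below only illustrates the arithmetic; the struct and helper name are assumptions made for the example and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Poison values written by list_del() in 2.6-era kernels. */
#define LIST_POISON1 UINT64_C(0x00100100)  /* stored in entry->next */
#define LIST_POISON2 UINT64_C(0x00200200)  /* stored in entry->prev */

/* Stand-in for struct list_head on x86_64: two 8-byte pointers.
 * Fixed-width fields keep the offsets identical on any host. */
struct list_head64 {
	uint64_t next;
	uint64_t prev;
};

/* Hypothetical helper (not a kernel function): the first address an
 * unlink would write to if the entry was already poisoned, i.e. the
 * location of next->prev when next == LIST_POISON1. */
static uint64_t poisoned_unlink_fault_address(void)
{
	struct list_head64 entry = { LIST_POISON1, LIST_POISON2 };

	assert(offsetof(struct list_head64, prev) == 8);
	return entry.next + offsetof(struct list_head64, prev);
}
```

Under these assumptions the helper evaluates to 0x100108, the CR2 value in the report, and the value that would be stored there (0x200200) is what RCX holds.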
* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
2005-10-12 23:03 [OOPS] nfsv4 in linux 2.6.13 (-ck7) Gabriel A. Devenyi
@ 2005-10-12 23:24 ` Chris Wright
2005-10-12 23:27 ` Gabriel A. Devenyi
2005-10-14 15:49 ` Trond Myklebust
1 sibling, 1 reply; 10+ messages in thread
From: Chris Wright @ 2005-10-12 23:24 UTC
To: Gabriel A. Devenyi; +Cc: linux-kernel
* Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> This oops seems to occur during heavy i/o load over nfsv4.
>
> [kernel] Unable to handle kernel paging request at 0000000000100108 RIP:
> [kernel] <ffffffff80185e98>{generic_drop_inode+56}
There have been a couple of recent reports of this, and a fix is in the works.
See the recent thread here:
http://lkml.org/lkml/2005/9/25/44
> [kernel] Modules linked in: nvidia
                              ^^^^^^
> [kernel] Pid: 179, comm: kswapd0 Tainted: P 2.6.13-ck7
Tainted kernel. When sending bug reports, please be sure the bug happens
without a tainted kernel.
thanks,
-chris
* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
2005-10-12 23:24 ` Chris Wright
@ 2005-10-12 23:27 ` Gabriel A. Devenyi
2005-10-12 23:31 ` Chris Wright
0 siblings, 1 reply; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-12 23:27 UTC
To: Chris Wright; +Cc: linux-kernel
On October 12, 2005 19:24, Chris Wright wrote:
> * Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> > This oops seems to occur during heavy i/o load over nfsv4.
> >
> > [kernel] Unable to handle kernel paging request at 0000000000100108 RIP:
> > [kernel] <ffffffff80185e98>{generic_drop_inode+56}
>
> There have been a couple recent reports of this, and a fix is in the works.
>
> See the recent thread here:
>
> http://lkml.org/lkml/2005/9/25/44
>
> > [kernel] Modules linked in: nvidia
>                               ^^^^^^
> > [kernel] Pid: 179, comm: kswapd0 Tainted: P 2.6.13-ck7
>
> Tainted kernel, when sending bug reports please be sure bug happens
> w/out tainted kernel.
Of course, my apologies. However, this is a filesystem error; is it even conceivable that something such as the nvidia kernel driver could affect this?
> thanks,
> -chris
>
>
--
Gabriel A. Devenyi
ace@staticwave.ca
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
2005-10-12 23:27 ` Gabriel A. Devenyi
@ 2005-10-12 23:31 ` Chris Wright
2005-10-12 23:37 ` Gabriel A. Devenyi
0 siblings, 1 reply; 10+ messages in thread
From: Chris Wright @ 2005-10-12 23:31 UTC
To: Gabriel A. Devenyi; +Cc: Chris Wright, linux-kernel
* Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> Of course, my apologies, however, this is a fs error, is it even
> conceivable that something such as the nvidia kernel driver could
> affect this?
In this case it's not very likely, since others are seeing the same problem
under load. However, a binary module can corrupt any kernel memory.
So as a general rule all bets are off with a binary module loaded.
thanks,
-chris
* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
2005-10-12 23:31 ` Chris Wright
@ 2005-10-12 23:37 ` Gabriel A. Devenyi
2005-10-12 23:56 ` Chris Wright
0 siblings, 1 reply; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-12 23:37 UTC
To: Chris Wright; +Cc: linux-kernel
On October 12, 2005 19:31, Chris Wright wrote:
> In this case it's not very likely since others are seeing same problem
> under load. However, a binary module can corrupt any kernel memory.
> So as a general rule all bets are off with a binary module loaded.
Thanks, I'll keep that in mind for next time. With regard to the patch in the other thread,
should I try to patch the client, the server, or both?
--
Gabriel A. Devenyi
ace@staticwave.ca
* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
2005-10-12 23:37 ` Gabriel A. Devenyi
@ 2005-10-12 23:56 ` Chris Wright
2005-10-14 14:05 ` Gabriel A. Devenyi
0 siblings, 1 reply; 10+ messages in thread
From: Chris Wright @ 2005-10-12 23:56 UTC
To: Gabriel A. Devenyi; +Cc: Chris Wright, linux-kernel
* Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> Thanks, I'll keep that in mind for next time. With regards to the
> patch in the other thread, should I try and patch the client, the
> server or both?
Client side AFAIK. May want to check with the nfs folks to see if
they've got any specific testing they'd find useful.
thanks,
-chris
* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
2005-10-12 23:56 ` Chris Wright
@ 2005-10-14 14:05 ` Gabriel A. Devenyi
2005-10-14 18:13 ` Chris Wright
0 siblings, 1 reply; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-14 14:05 UTC
To: Chris Wright; +Cc: linux-kernel
Chris Wright wrote:
> Client side AFAIK. May want to check with the nfs folks to see if
> they've got any specific testing they'd find useful.
Well, the patch seems to have cleared this problem up. Do you happen to
know where the NFS folks can be located so I can provide further
testing/feedback? Thanks.
--
Gabriel A. Devenyi
ace@staticwave.ca
* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
2005-10-14 14:05 ` Gabriel A. Devenyi
@ 2005-10-14 18:13 ` Chris Wright
0 siblings, 0 replies; 10+ messages in thread
From: Chris Wright @ 2005-10-14 18:13 UTC
To: Gabriel A. Devenyi; +Cc: Chris Wright, linux-kernel
* Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> Chris Wright wrote:
> >Client side AFAIK. May want to check with the nfs folks to see if
> >they've got any specific testing they'd find useful.
>
> Well the patch seems to have cleared this problem up, do you happen to
> know where the NFS folks can be located so I can provide further
> testing/feedback? Thanks.
They're at nfsv4@linux-nfs.org
thanks,
-chris
* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
2005-10-12 23:03 [OOPS] nfsv4 in linux 2.6.13 (-ck7) Gabriel A. Devenyi
2005-10-12 23:24 ` Chris Wright
@ 2005-10-14 15:49 ` Trond Myklebust
2005-10-15 17:23 ` Gabriel A. Devenyi
1 sibling, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2005-10-14 15:49 UTC
To: Gabriel A. Devenyi; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2984 bytes --]
On Wed, 12.10.2005 at 19:03 (-0400), Gabriel A. Devenyi wrote:
> This oops seems to occur during heavy i/o load over nfsv4.
>
> nfs-utils version 1.0.7
>
> OOps follows, what other information is needed?
>
> [kernel] Unable to handle kernel paging request at 0000000000100108 RIP:
> [kernel] <ffffffff80185e98>{generic_drop_inode+56}
> [kernel] PGD 34e3b067 PUD 34e68067 PMD 0
> [kernel] CPU 0
> [kernel] Modules linked in: nvidia snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd_util_mem snd_hwdep snd
> [kernel] Pid: 179, comm: kswapd0 Tainted: P 2.6.13-ck7
> [kernel] RIP: 0010:[<ffffffff80185e98>] <ffffffff80185e98>{generic_drop_inode+56}
> [kernel] RSP: 0018:ffff81003fcd7b68 EFLAGS: 00010246
> [kernel] RAX: 0000000000100100 RBX: ffff81001a58c950 RCX: 0000000000200200
> [kernel] RDX: ffff81001a58c960 RSI: ffff81003eb84000 RDI: ffff81001a58c950
> [kernel] RBP: ffff81001a58c950 R08: 00000000fffffffa R09: ffff81001a58ca68
> [kernel] R10: 0000000000000001 R11: ffffffff80185e60 R12: 0000000000000000
> [kernel] R13: ffff81001a58c7d0 R14: ffff81001a58c860 R15: ffff81003f1f5200
> [kernel] FS: 0000000040800960(0000) GS:ffffffff80494800(0000) knlGS:0000000056160040
> [kernel] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [kernel] CR2: 0000000000100108 CR3: 0000000034de7000 CR4: 00000000000006e0
> [kernel] Process kswapd0 (pid: 179, threadinfo ffff81003fcd6000, task ffff81003fcb2760)
> [kernel] Stack: ffff81003f1f5c00 ffffffff801d7a25 00000001803e8238 ffff81003fcd7c18
> [kernel] ffffffffffffffff ffff81003fcd7c18 ffff81003fcd7c00 ffff81001a58c938
> [kernel] 0000000000000000 0000000000000000
> [kernel] Call Trace:<ffffffff801d7a25>{__nfs_revalidate_inode+261} <ffffffff8014e5df>{find_get_pages_tag+31}
> [kernel] <ffffffff8015781a>{pagevec_lookup_tag+26} <ffffffff8014e00e>{wait_on_page_writeback_range+206}
> [kernel] <ffffffff801f11ba>{nfs_do_return_delegation+42} <ffffffff801f12e5>{nfs_inode_return_delegation+197}
> [kernel] <ffffffff801d8a10>{nfs4_clear_inode+32} <ffffffff80184cfe>{clear_inode+158}
> [kernel] <ffffffff8018594e>{dispose_list+94} <ffffffff80185b82>{shrink_icache_memory+434}
> [kernel] <ffffffff8015806b>{shrink_slab+219} <ffffffff80159517>{balance_pgdat+695}
> [kernel] <ffffffff801597a8>{kswapd+312} <ffffffff80143b30>{autoremove_wake_function+0}
> [kernel] <ffffffff80143b30>{autoremove_wake_function+0} <ffffffff8010f30e>{child_rip+8}
> [kernel] <ffffffff80159670>{kswapd+0} <ffffffff8010f306>{child_rip+0}
> [kernel]
> [kernel] Code: 48 89 48 08 48 89 01 48 8b 05 aa 43 26 00 48 89 50 08 48 89
> [kernel] RIP <ffffffff80185e98>{generic_drop_inode+56} RSP <ffff81003fcd7b68>
Does the attached patch fix it for you?
Cheers,
Trond
[-- Attachment #2: linux-2.6.14-00-fix_iput.dif --]
[-- Type: text/plain, Size: 919 bytes --]
NFS: Fix Oopsable/unnecessary i_count manipulations in nfs_wait_on_inode()
Oopsable since nfs_wait_on_inode() can get called as part of iput_final().
Unnecessary since the caller had better be damned sure that the inode won't
disappear from underneath it anyway.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---
inode.c | 2 --
1 files changed, 2 deletions(-)
Index: linux-2.6.14-rc4/fs/nfs/inode.c
===================================================================
--- linux-2.6.14-rc4.orig/fs/nfs/inode.c
+++ linux-2.6.14-rc4/fs/nfs/inode.c
@@ -877,12 +877,10 @@ static int nfs_wait_on_inode(struct inod
 	sigset_t oldmask;
 	int error;
 
-	atomic_inc(&inode->i_count);
 	rpc_clnt_sigmask(clnt, &oldmask);
 	error = wait_on_bit_lock(&nfsi->flags, NFS_INO_REVALIDATING,
 					nfs_wait_schedule, TASK_INTERRUPTIBLE);
 	rpc_clnt_sigunmask(clnt, &oldmask);
-	iput(inode);
 	return error;
 }
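
To illustrate the changelog's point, here is a toy user-space model of why taking and dropping a reference inside the wait is dangerous once the inode's count has already reached zero: the nested put sees the count hit zero again and starts teardown a second time. Every name below is hypothetical and none of it is kernel code; it is a sketch of the re-entrancy only, under the assumption that teardown begins whenever the count drops to zero.

```c
#include <assert.h>

/* Toy stand-in for an inode; not the kernel's struct inode. */
struct toy_inode {
	int count;                  /* models inode->i_count */
	int teardowns;              /* times final teardown was entered */
	int take_ref_while_waiting; /* 1 = pre-patch behaviour, 0 = patched */
};

static void toy_iput(struct toy_inode *inode);

/* Models the wait: pre-patch, it pins the inode around the wait. */
static void toy_wait_on_inode(struct toy_inode *inode)
{
	if (inode->take_ref_while_waiting) {
		inode->count++;     /* models atomic_inc(&inode->i_count) */
		/* ... the wait itself would happen here ... */
		toy_iput(inode);    /* models iput(inode) */
	}
}

/* Models final teardown, which ends up waiting on the inode. */
static void toy_final_teardown(struct toy_inode *inode)
{
	inode->teardowns++;
	if (inode->teardowns > 1)
		return; /* second entry: stop here instead of crashing */
	toy_wait_on_inode(inode);
}

/* Models dropping a reference: the last put triggers teardown. */
static void toy_iput(struct toy_inode *inode)
{
	assert(inode->count > 0);
	if (--inode->count == 0)
		toy_final_teardown(inode);
}

/* Drop the last reference once and report how often teardown ran. */
static int count_teardowns(int take_ref_while_waiting)
{
	struct toy_inode inode = {
		.count = 1,
		.teardowns = 0,
		.take_ref_while_waiting = take_ref_while_waiting,
	};

	toy_iput(&inode);
	return inode.teardowns;
}
```

In this model, count_teardowns(1) enters teardown twice, while count_teardowns(0), which corresponds to the patched function with the two lines removed, enters it once.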