public inbox for linux-kernel@vger.kernel.org
* [OOPS] nfsv4 in linux 2.6.13 (-ck7)
@ 2005-10-12 23:03 Gabriel A. Devenyi
  2005-10-12 23:24 ` Chris Wright
  2005-10-14 15:49 ` Trond Myklebust
  0 siblings, 2 replies; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-12 23:03 UTC (permalink / raw)
  To: linux-kernel

This oops seems to occur during heavy i/o load over nfsv4.

nfs-utils version 1.0.7

Oops follows. What other information is needed?

 [kernel] Unable to handle kernel paging request at 0000000000100108 RIP:
 [kernel] <ffffffff80185e98>{generic_drop_inode+56}
 [kernel] PGD 34e3b067 PUD 34e68067 PMD 0
 [kernel] CPU 0
 [kernel] Modules linked in: nvidia snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd_util_mem snd_hwdep snd
 [kernel] Pid: 179, comm: kswapd0 Tainted: P      2.6.13-ck7
 [kernel] RIP: 0010:[<ffffffff80185e98>] <ffffffff80185e98>{generic_drop_inode+56}
 [kernel] RSP: 0018:ffff81003fcd7b68  EFLAGS: 00010246
 [kernel] RAX: 0000000000100100 RBX: ffff81001a58c950 RCX: 0000000000200200
 [kernel] RDX: ffff81001a58c960 RSI: ffff81003eb84000 RDI: ffff81001a58c950
 [kernel] RBP: ffff81001a58c950 R08: 00000000fffffffa R09: ffff81001a58ca68
 [kernel] R10: 0000000000000001 R11: ffffffff80185e60 R12: 0000000000000000
 [kernel] R13: ffff81001a58c7d0 R14: ffff81001a58c860 R15: ffff81003f1f5200
 [kernel] FS:  0000000040800960(0000) GS:ffffffff80494800(0000) knlGS:0000000056160040
 [kernel] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
 [kernel] CR2: 0000000000100108 CR3: 0000000034de7000 CR4: 00000000000006e0
 [kernel] Process kswapd0 (pid: 179, threadinfo ffff81003fcd6000, task ffff81003fcb2760)
 [kernel] Stack: ffff81003f1f5c00 ffffffff801d7a25 00000001803e8238 ffff81003fcd7c18
 [kernel]        ffffffffffffffff ffff81003fcd7c18 ffff81003fcd7c00 ffff81001a58c938
 [kernel]        0000000000000000 0000000000000000
 [kernel] Call Trace:<ffffffff801d7a25>{__nfs_revalidate_inode+261} <ffffffff8014e5df>{find_get_pages_tag+31}
 [kernel]        <ffffffff8015781a>{pagevec_lookup_tag+26} <ffffffff8014e00e>{wait_on_page_writeback_range+206}
 [kernel]        <ffffffff801f11ba>{nfs_do_return_delegation+42} <ffffffff801f12e5>{nfs_inode_return_delegation+197}
 [kernel]        <ffffffff801d8a10>{nfs4_clear_inode+32} <ffffffff80184cfe>{clear_inode+158}
 [kernel]        <ffffffff8018594e>{dispose_list+94} <ffffffff80185b82>{shrink_icache_memory+434}
 [kernel]        <ffffffff8015806b>{shrink_slab+219} <ffffffff80159517>{balance_pgdat+695}
 [kernel]        <ffffffff801597a8>{kswapd+312} <ffffffff80143b30>{autoremove_wake_function+0}
 [kernel]        <ffffffff80143b30>{autoremove_wake_function+0} <ffffffff8010f30e>{child_rip+8}
 [kernel]        <ffffffff80159670>{kswapd+0} <ffffffff8010f306>{child_rip+0}
 [kernel]
 [kernel] Code: 48 89 48 08 48 89 01 48 8b 05 aa 43 26 00 48 89 50 08 48 89
 [kernel] RIP <ffffffff80185e98>{generic_drop_inode+56} RSP <ffff81003fcd7b68>


-- 
Gabriel A. Devenyi
ace@staticwave.ca

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-12 23:03 [OOPS] nfsv4 in linux 2.6.13 (-ck7) Gabriel A. Devenyi
@ 2005-10-12 23:24 ` Chris Wright
  2005-10-12 23:27   ` Gabriel A. Devenyi
  2005-10-14 15:49 ` Trond Myklebust
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Wright @ 2005-10-12 23:24 UTC (permalink / raw)
  To: Gabriel A. Devenyi; +Cc: linux-kernel

* Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> This oops seems to occur during heavy i/o load over nfsv4.
> 
>  [kernel] Unable to handle kernel paging request at 0000000000100108 RIP:
>  [kernel] <ffffffff80185e98>{generic_drop_inode+56}

There have been a couple recent reports of this, and a fix is in the works.

See the recent thread here:

http://lkml.org/lkml/2005/9/25/44

>  [kernel] Modules linked in: nvidia
                               ^^^^^^
>  [kernel] Pid: 179, comm: kswapd0 Tainted: P      2.6.13-ck7

Tainted kernel. When sending bug reports, please make sure the bug
also happens without a tainted kernel.

thanks,
-chris


* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-12 23:24 ` Chris Wright
@ 2005-10-12 23:27   ` Gabriel A. Devenyi
  2005-10-12 23:31     ` Chris Wright
  0 siblings, 1 reply; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-12 23:27 UTC (permalink / raw)
  To: Chris Wright; +Cc: linux-kernel

On October 12, 2005 19:24, Chris Wright wrote:
> * Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> > This oops seems to occur during heavy i/o load over nfsv4.
> > 
> >  [kernel] Unable to handle kernel paging request at 0000000000100108 RIP:
> >  [kernel] <ffffffff80185e98>{generic_drop_inode+56}
> 
> There have been a couple recent reports of this, and a fix is in the works.
> 
> See the recent thread here:
> 
> http://lkml.org/lkml/2005/9/25/44
> 
> >  [kernel] Modules linked in: nvidia
>                                ^^^^^^
> >  [kernel] Pid: 179, comm: kswapd0 Tainted: P      2.6.13-ck7
> 
> Tainted kernel. When sending bug reports, please make sure the bug
> also happens without a tainted kernel.

Of course, my apologies. However, this is a filesystem error; is it even conceivable that something such as the nvidia kernel driver could affect this?


> thanks,
> -chris
> 
> 

-- 
Gabriel A. Devenyi
ace@staticwave.ca


* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-12 23:27   ` Gabriel A. Devenyi
@ 2005-10-12 23:31     ` Chris Wright
  2005-10-12 23:37       ` Gabriel A. Devenyi
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Wright @ 2005-10-12 23:31 UTC (permalink / raw)
  To: Gabriel A. Devenyi; +Cc: Chris Wright, linux-kernel

* Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> Of course, my apologies. However, this is a filesystem error; is it
> even conceivable that something such as the nvidia kernel driver
> could affect this?

In this case it's not very likely, since others are seeing the same problem
under load.  However, a binary module can corrupt any kernel memory.
So as a general rule all bets are off with a binary module loaded.

thanks,
-chris


* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-12 23:31     ` Chris Wright
@ 2005-10-12 23:37       ` Gabriel A. Devenyi
  2005-10-12 23:56         ` Chris Wright
  0 siblings, 1 reply; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-12 23:37 UTC (permalink / raw)
  To: Chris Wright; +Cc: linux-kernel

On October 12, 2005 19:31, Chris Wright wrote:
> In this case it's not very likely, since others are seeing the same problem
> under load.  However, a binary module can corrupt any kernel memory.
> So as a general rule all bets are off with a binary module loaded.

Thanks, I'll keep that in mind for next time. With regard to the patch in the other thread,
should I try to patch the client, the server, or both?

-- 
Gabriel A. Devenyi
ace@staticwave.ca


* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-12 23:37       ` Gabriel A. Devenyi
@ 2005-10-12 23:56         ` Chris Wright
  2005-10-14 14:05           ` Gabriel A. Devenyi
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Wright @ 2005-10-12 23:56 UTC (permalink / raw)
  To: Gabriel A. Devenyi; +Cc: Chris Wright, linux-kernel

* Gabriel A. Devenyi (ace@staticwave.ca) wrote:

> Thanks, I'll keep that in mind for next time. With regard to the
> patch in the other thread, should I try to patch the client, the
> server, or both?

Client side AFAIK.  May want to check with the nfs folks to see if
they've got any specific testing they'd find useful.

thanks,
-chris


* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-12 23:56         ` Chris Wright
@ 2005-10-14 14:05           ` Gabriel A. Devenyi
  2005-10-14 18:13             ` Chris Wright
  0 siblings, 1 reply; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-14 14:05 UTC (permalink / raw)
  To: Chris Wright; +Cc: linux-kernel

Chris Wright wrote:
> Client side AFAIK.  May want to check with the nfs folks to see if
> they've got any specific testing they'd find useful.

Well, the patch seems to have cleared this problem up. Do you happen to
know where the NFS folks can be located so I can provide further
testing/feedback? Thanks.


-- 
Gabriel A. Devenyi
ace@staticwave.ca


* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-12 23:03 [OOPS] nfsv4 in linux 2.6.13 (-ck7) Gabriel A. Devenyi
  2005-10-12 23:24 ` Chris Wright
@ 2005-10-14 15:49 ` Trond Myklebust
  2005-10-15 17:23   ` Gabriel A. Devenyi
  1 sibling, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2005-10-14 15:49 UTC (permalink / raw)
  To: Gabriel A. Devenyi; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2984 bytes --]

On Wed, 12.10.2005 at 19:03 (-0400), Gabriel A. Devenyi wrote:
> This oops seems to occur during heavy i/o load over nfsv4.
> 
> nfs-utils version 1.0.7
> 
> Oops follows. What other information is needed?
> 
>  [kernel] Unable to handle kernel paging request at 0000000000100108 RIP:
>  [kernel] <ffffffff80185e98>{generic_drop_inode+56}
>  [kernel] PGD 34e3b067 PUD 34e68067 PMD 0
>  [kernel] CPU 0
>  [kernel] Modules linked in: nvidia snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd_util_mem snd_hwdep snd
>  [kernel] Pid: 179, comm: kswapd0 Tainted: P      2.6.13-ck7
>  [kernel] RIP: 0010:[<ffffffff80185e98>] <ffffffff80185e98>{generic_drop_inode+56}
>  [kernel] RSP: 0018:ffff81003fcd7b68  EFLAGS: 00010246
>  [kernel] RAX: 0000000000100100 RBX: ffff81001a58c950 RCX: 0000000000200200
>  [kernel] RDX: ffff81001a58c960 RSI: ffff81003eb84000 RDI: ffff81001a58c950
>  [kernel] RBP: ffff81001a58c950 R08: 00000000fffffffa R09: ffff81001a58ca68
>  [kernel] R10: 0000000000000001 R11: ffffffff80185e60 R12: 0000000000000000
>  [kernel] R13: ffff81001a58c7d0 R14: ffff81001a58c860 R15: ffff81003f1f5200
>  [kernel] FS:  0000000040800960(0000) GS:ffffffff80494800(0000) knlGS:0000000056160040
>  [kernel] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>  [kernel] CR2: 0000000000100108 CR3: 0000000034de7000 CR4: 00000000000006e0
>  [kernel] Process kswapd0 (pid: 179, threadinfo ffff81003fcd6000, task ffff81003fcb2760)
>  [kernel] Stack: ffff81003f1f5c00 ffffffff801d7a25 00000001803e8238 ffff81003fcd7c18
>  [kernel]        ffffffffffffffff ffff81003fcd7c18 ffff81003fcd7c00 ffff81001a58c938
>  [kernel]        0000000000000000 0000000000000000
>  [kernel] Call Trace:<ffffffff801d7a25>{__nfs_revalidate_inode+261} <ffffffff8014e5df>{find_get_pages_tag+31}
>  [kernel]        <ffffffff8015781a>{pagevec_lookup_tag+26} <ffffffff8014e00e>{wait_on_page_writeback_range+206}
>  [kernel]        <ffffffff801f11ba>{nfs_do_return_delegation+42} <ffffffff801f12e5>{nfs_inode_return_delegation+197}
>  [kernel]        <ffffffff801d8a10>{nfs4_clear_inode+32} <ffffffff80184cfe>{clear_inode+158}
>  [kernel]        <ffffffff8018594e>{dispose_list+94} <ffffffff80185b82>{shrink_icache_memory+434}
>  [kernel]        <ffffffff8015806b>{shrink_slab+219} <ffffffff80159517>{balance_pgdat+695}
>  [kernel]        <ffffffff801597a8>{kswapd+312} <ffffffff80143b30>{autoremove_wake_function+0}
>  [kernel]        <ffffffff80143b30>{autoremove_wake_function+0} <ffffffff8010f30e>{child_rip+8}
>  [kernel]        <ffffffff80159670>{kswapd+0} <ffffffff8010f306>{child_rip+0}
>  [kernel]
>  [kernel] Code: 48 89 48 08 48 89 01 48 8b 05 aa 43 26 00 48 89 50 08 48 89
>  [kernel] RIP <ffffffff80185e98>{generic_drop_inode+56} RSP <ffff81003fcd7b68>

Does the attached patch fix it for you?

Cheers,
  Trond


[-- Attachment #2: linux-2.6.14-00-fix_iput.dif --]
[-- Type: text/plain, Size: 919 bytes --]

NFS: Fix Oopsable/unnecessary i_count manipulations in nfs_wait_on_inode()

 Oopsable since nfs_wait_on_inode() can get called as part of iput_final().

 Unnecessary since the caller had better be damned sure that the inode won't
 disappear from underneath it anyway.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---
 inode.c |    2 --
 1 files changed, 2 deletions(-)

Index: linux-2.6.14-rc4/fs/nfs/inode.c
===================================================================
--- linux-2.6.14-rc4.orig/fs/nfs/inode.c
+++ linux-2.6.14-rc4/fs/nfs/inode.c
@@ -877,12 +877,10 @@ static int nfs_wait_on_inode(struct inod
 	sigset_t oldmask;
 	int error;
 
-	atomic_inc(&inode->i_count);
 	rpc_clnt_sigmask(clnt, &oldmask);
 	error = wait_on_bit_lock(&nfsi->flags, NFS_INO_REVALIDATING,
 					nfs_wait_schedule, TASK_INTERRUPTIBLE);
 	rpc_clnt_sigunmask(clnt, &oldmask);
-	iput(inode);
 
 	return error;
 }
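
A minimal userspace sketch of the failure mode the patch description refers to, under stated assumptions. It is a toy model, not kernel code: toy_inode, toy_dispose, toy_put, toy_wait_on_inode and simulate_teardown are hypothetical names, and the teardown path is heavily simplified. The two poison constants are the LIST_POISON1/LIST_POISON2 values from include/linux/list.h, which appear to match RAX (0x100100) and RCX (0x200200) in the oops quoted above, with CR2 equal to RAX + 8. The model shows how taking and dropping a reference on an inode that is already being disposed of (i_count == 0) makes the final put run again and attempt a second list removal.

```c
#include <stdint.h>

/* Poison values written by list_del(); same numbers as in the oops registers. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x00100100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x00200200)

struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for struct inode; only the fields the model needs. */
struct toy_inode {
    int i_count;               /* plain int instead of atomic_t */
    struct list_head i_list;
    int take_ref_in_teardown;  /* 1 = pre-patch behaviour, 0 = patched */
    int poisoned_del;          /* set when a removal sees poisoned pointers */
};

/* Like list_del(), but records the poisoned case instead of faulting. */
static void toy_list_del(struct toy_inode *inode)
{
    struct list_head *entry = &inode->i_list;

    if (entry->next == LIST_POISON1 || entry->prev == LIST_POISON2) {
        /* The real kernel would dereference the poison here and oops. */
        inode->poisoned_del = 1;
        return;
    }
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

/* Simplified iput(): dropping the last reference unlinks the inode. */
static void toy_put(struct toy_inode *inode)
{
    if (--inode->i_count == 0)
        toy_list_del(inode);
}

/* Simplified nfs_wait_on_inode(): the pre-patch version pins the inode. */
static void toy_wait_on_inode(struct toy_inode *inode)
{
    if (inode->take_ref_in_teardown) {
        inode->i_count++;   /* 0 -> 1 on an inode already being torn down */
        toy_put(inode);     /* 1 -> 0: the final put runs a second time */
    }
}

/* Simplified disposal path: unlink first, then run the teardown hook. */
static void toy_dispose(struct toy_inode *inode)
{
    toy_list_del(inode);        /* first, legitimate removal */
    toy_wait_on_inode(inode);   /* reached via the clear_inode() path */
}

/* Returns 1 if a removal hit poisoned pointers, 0 otherwise. */
int simulate_teardown(int take_ref_in_teardown)
{
    struct list_head head;
    struct toy_inode inode = {0};

    /* Two-element circular list: head <-> inode. */
    head.next = head.prev = &inode.i_list;
    inode.i_list.next = inode.i_list.prev = &head;
    inode.take_ref_in_teardown = take_ref_in_teardown;

    toy_dispose(&inode);
    return inode.poisoned_del;
}
```

Calling simulate_teardown(1) models the pre-patch function and reports the poisoned removal; simulate_teardown(0) models the patched function, where no reference is taken during teardown and the second removal never happens.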


* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-14 14:05           ` Gabriel A. Devenyi
@ 2005-10-14 18:13             ` Chris Wright
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Wright @ 2005-10-14 18:13 UTC (permalink / raw)
  To: Gabriel A. Devenyi; +Cc: Chris Wright, linux-kernel

* Gabriel A. Devenyi (ace@staticwave.ca) wrote:
> Chris Wright wrote:
> >Client side AFAIK.  May want to check with the nfs folks to see if
> >they've got any specific testing they'd find useful.
> 
> Well, the patch seems to have cleared this problem up. Do you happen to
> know where the NFS folks can be located so I can provide further
> testing/feedback? Thanks.

They're at nfsv4@linux-nfs.org

thanks,
-chris


* Re: [OOPS] nfsv4 in linux 2.6.13 (-ck7)
  2005-10-14 15:49 ` Trond Myklebust
@ 2005-10-15 17:23   ` Gabriel A. Devenyi
  0 siblings, 0 replies; 10+ messages in thread
From: Gabriel A. Devenyi @ 2005-10-15 17:23 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-kernel

On October 14, 2005 11:49, Trond Myklebust wrote:
> Does the attached patch fix it for you?
> 
> Cheers,
>   Trond

I found that this patch had already fixed my problem:
http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.13-1/linux-2.6.13-001-NFS_ALL_MODIFIED.dif
I applied your patch on top, and everything seems to be working fine.

-- 
Gabriel A. Devenyi
ace@staticwave.ca

