Re: nfsd oops with 2.6.5-rc2-mm4

All of lore.kernel.org
 help / color / mirror / Atom feed

* Re: nfsd oops with 2.6.5-rc2-mm4
@ 2004-03-27 23:17 Dieter Stueken
  0 siblings, 0 replies; 10+ messages in thread
From: Dieter Stueken @ 2004-03-27 23:17 UTC (permalink / raw)
  To: linux-kernel

Frank Denis wrote:

> Version 3. I can give version 2 a try but there will probably
> a significant loss of performance :(

probably better than a total loss of performance :-)

> Unfortunately this is a production environment and I can hardly
> switch the Solaris box off in order to make it sure that it is
 > triggering the bug.

Same here. I set up a NFS-server for test and let the Sun
mount the data in a temporary directory to verify the situation.
The test-server may crash, but the sun will survive. After you
stopped the test on the sun, you might have to reboot your
test-server to be able to unmount the disk on the sun again.

>> I reverted to 2.6.2-rc2-mm3, the server didn't crash after 6 hours. Crossing fingers... 
> 
> Pointless.
> 
> 2.6.2-rc2-mm3 just crashed the same way.

Yes, all 2.6.x seem to be affected. My good old 2.4.19 runs since 6 month.
I started to scan the 2.5.x patches for changes on nfsd, to find by which
one this kind of trouble was caused....

Dieter.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: nfsd oops with 2.6.5-rc2-mm4
@ 2004-03-27 19:10 Dieter Stueken
  2004-03-27 20:03 ` Frank Denis
  0 siblings, 1 reply; 10+ messages in thread
From: Dieter Stueken @ 2004-03-27 19:10 UTC (permalink / raw)
  To: linux-kernel

Frank wrote:
> I got a reproducible oops after a few minutes with a 2.6.5-rc2-mm4 kernel.
> ...
> [<c01e6a8d>] nfsd_readdir+0x69/0xe8

was some exported directory quite big? Try running a "find /mnt/..." on
the client to see when exactly it fails. I observe similar behavior
when reading huge directories of some 1000 entries. What nfs version
are you using? You may try "mount -o nfsvers=2 ..." or 3.

My own Oops seems to be reproducible when using a Sun (2.8) as
client, only. It did not occur when using nfsV2. I also failed
to reproduce the bug when mounting by an other Linux client.
So may be we observe two different bugs here.

With that instability observed, I won't/can't switch to 2.6.x for
my system in production :-( Can I help somehow? Making a TCP-dump?
Try some patches?

Dieter.
-- 
Dieter Stüken, con terra GmbH, Münster
     stueken@conterra.de
     http://www.conterra.de/
     (0)251-7474-501

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: nfsd oops with 2.6.5-rc2-mm4
  2004-03-27 19:10 Dieter Stueken
@ 2004-03-27 20:03 ` Frank Denis
  2004-03-27 20:40   ` Frank Denis
  0 siblings, 1 reply; 10+ messages in thread
From: Frank Denis @ 2004-03-27 20:03 UTC (permalink / raw)
  To: linux-kernel

Dieter Stueken wrote:
> was some exported directory quite big?

Yes, some exported directories contains a lot of small files.

 > What nfs version
> are you using? You may try "mount -o nfsvers=2 ..." or 3.

Version 3. I can give version 2 a try but there will probably a 
significant loss of performance :(

> My own Oops seems to be reproducible when using a Sun (2.8) as
> client, only.

There actually 10 clients, 9 are Linux 2.6.2-rc2-mm2, 1 is indeed a 
Solaris 2.8 box.

> It did not occur when using nfsV2. I also failed
> to reproduce the bug when mounting by an other Linux client.

Unfortunately this is a production environment and I can hardly switch 
the Solaris box off in order to make it sure that it is triggering the bug.

> So may be we observe two different bugs here.

Your situation looks similar.

I reverted to 2.6.2-rc2-mm3, the server didn't crash after 6 hours. 
Crossing fingers...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: nfsd oops with 2.6.5-rc2-mm4
  2004-03-27 20:03 ` Frank Denis
@ 2004-03-27 20:40   ` Frank Denis
  0 siblings, 0 replies; 10+ messages in thread
From: Frank Denis @ 2004-03-27 20:40 UTC (permalink / raw)
  To: linux-kernel

Frank Denis wrote:
> I reverted to 2.6.2-rc2-mm3, the server didn't crash after 6 hours. 
> Crossing fingers...

Pointless.

2.6.2-rc2-mm3 just crashed the same way.


Unable to handle kernel NULL pointer dereference at virtual address 00000004
  printing eip:
c029fb79
*pde = 00000000
Oops: 0002 [#1]
SMP
CPU:    0
EIP:    0060:[<c029fb79>]    Not tainted VLI
EFLAGS: 00010287   (2.6.5-rc2-mm3)
EIP is at do_tcp_sendpages+0x197/0xa79
eax: f5b75108   ebx: f42c0980   ecx: 00000008   edx: 00000000
esi: 00000001   edi: f5b75100   ebp: f61bddec   esp: f62c5e34
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 3325, threadinfo=f62c5000 task=f7727670)
Stack: 000000d0 000000d0 f62c5e60 00000000 15270000 c01e6991 f5b75110 
f61bdc64
        00000008 00000000 00000000 00000000 000005b4 00007530 00000000 
f61bdc00
        00000008 00000000 c02a04e3 f61bdc00 f62c5eac 00000000 00000008 
00000000
Call Trace:
  [<c01e6991>] nfsd_readdir+0x69/0xe8
  [<c02a04e3>] tcp_sendpage+0x88/0x96
  [<c02d8ccd>] svc_sendto+0x16a/0x29e
  [<c01ecfd9>] encode_post_op_attr+0x1c9/0x241
  [<c02d9d39>] svc_tcp_sendto+0x53/0xa8
  [<c02da4f1>] svc_send+0xb9/0xfc
  [<c02dc17c>] svcauth_unix_release+0x57/0x59
  [<c02d8189>] svc_process+0x184/0x60e
  [<c02e2d9c>] common_interrupt+0x18/0x20
  [<c01e0ce9>] nfsd+0x1ea/0x3b6
  [<c01e0aff>] nfsd+0x0/0x3b6
  [<c0104e01>] kernel_thread_helper+0x5/0xb

^ permalink raw reply	[flat|nested] 10+ messages in thread

* nfsd oops with 2.6.5-rc2-mm4
@ 2004-03-27 13:07 Jedi/Sector One
  2004-03-28  2:30 ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Jedi/Sector One @ 2004-03-27 13:07 UTC (permalink / raw)
  To: linux-kernel

  Hello.
  
  I got a reproducible oops after a few minutes with a 2.6.5-rc2-mm4 kernel.
  
  /etc/exports :
/mnt/data 10.42.42.0/24(rw,async,no_subtree_check,root_squash,
                        anonuid=10000,anongid=10000)

  Clients are 2.6.5-rc2-mm2 kernels, filesystem is ReiserFS 3, data=writeback.
  Exports are mounted with tcp,nolock,soft,timeo=600,retrans=2,actimeo=30,
rsize=32768,wsize=32768.

  Once the oops has happened, no client can access the mount point any more.
  
  Unable to handle kernel NULL pointer dereference at virtual address 00000004
 printing eip:
c029fd35
*pde = 00000000
Oops: 0002 [#1]
SMP
CPU:    0
EIP:    0060:[<c029fd35>]    Not tainted VLI
EFLAGS: 00010287   (2.6.5-rc2-mm4)
EIP is at do_tcp_sendpages+0x197/0xa79
eax: d1d24108   ebx: f5e3fd80   ecx: 00000008   edx: 00000000
esi: 00000001   edi: d1d24100   ebp: f72391ec   esp: f6283e34
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 3330, threadinfo=f6283000 task=f62962b0)
Stack: 000000d0 000000d0 00000000 00000000 15270000 c01e6a8d d1d24110 f7239064
       00000008 00000000 00000000 00000000 000005b4 00007530 00000000 f7239000
       00000008 00000000 c02a069f f7239000 f6283eac 00000000 00000008 00000000
Call Trace:
 [<c01e6a8d>] nfsd_readdir+0x69/0xe8
 [<c02a069f>] tcp_sendpage+0x88/0x96
 [<c02d8ed4>] svc_sendto+0x16a/0x29e
 [<c01ed0d5>] encode_post_op_attr+0x1c9/0x241
 [<c02d9f40>] svc_tcp_sendto+0x53/0xa8
 [<c02da6f8>] svc_send+0xb9/0xfc
 [<c02dc384>] svcauth_unix_release+0x57/0x59
 [<c02d838c>] svc_process+0x187/0x611
 [<c01e0de5>] nfsd+0x1ea/0x3b6
 [<c01e0bfb>] nfsd+0x0/0x3b6
 [<c0104e01>] kernel_thread_helper+0x5/0xb

Code: 4c 24 20 85 f6 74 17 8d 04 f7 8d 50 08 89 54 24 18 8b 54 24 28 3b 50 08 0f 84 80 08 00 00 83 fe 11 0f 87 25 04 00 00 8b 54 24 28 <f0> ff 42 04 8b 7c 24 28 8b 83 98 00 00 00 8d 04 f0 89 78 10 8d

  Best regards,
  
       -Frank.
       

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: nfsd oops with 2.6.5-rc2-mm4
  2004-03-27 13:07 Jedi/Sector One
@ 2004-03-28  2:30 ` Linus Torvalds
  2004-03-28  2:48   ` Andrew Morton
       [not found]   ` <1080511633.5553.29.camel@lade.trondhjem.org>
  0 siblings, 2 replies; 10+ messages in thread
From: Linus Torvalds @ 2004-03-28  2:30 UTC (permalink / raw)
  To: David S. Miller, Trond Myklebust, Neil Brown, Andrew Morton; +Cc: netdev


This oops is on a 

	lock incl 0x4(%edx)

and as far as I can tell, it's from do_tcp_sendpages():

		....

                i = skb_shinfo(skb)->nr_frags;
                if (can_coalesce(skb, i, page, offset)) {
                        skb_shinfo(skb)->frags[i - 1].size += copy;
                } else if (i < MAX_SKB_FRAGS) {
*********		get_page(page);			***************
                        fill_page_desc(skb, i, page, offset, copy);
                } else {
                        tcp_mark_push(tp, skb);
                        goto new_segment;
                }
		...

where "page" is NULL.

The caller seems to be svc_sendto()->tcp_sendpage()->do_tcp_sendpages()  
(the other addresses seem to be stale crud on the stack), which doesn't
look like it has changed lately. Unless there are changes in this area in
-mm..

Any ideas?

		Linus

-----
Jedi/Sector One <fdenis@skyrock.fr>
On Sat, 27 Mar 2004, Jedi/Sector One wrote:
>
>   Hello.
>   
>   I got a reproducible oops after a few minutes with a 2.6.5-rc2-mm4 kernel.
>   
>   /etc/exports :
> /mnt/data 10.42.42.0/24(rw,async,no_subtree_check,root_squash,
>                         anonuid=10000,anongid=10000)
> 
>   Clients are 2.6.5-rc2-mm2 kernels, filesystem is ReiserFS 3, data=writeback.
>   Exports are mounted with tcp,nolock,soft,timeo=600,retrans=2,actimeo=30,
> rsize=32768,wsize=32768.
> 
>   Once the oops has happened, no client can access the mount point any more.
>   
>   Unable to handle kernel NULL pointer dereference at virtual address 00000004
>  printing eip:
> c029fd35
> *pde = 00000000
> Oops: 0002 [#1]
> SMP
> CPU:    0
> EIP:    0060:[<c029fd35>]    Not tainted VLI
> EFLAGS: 00010287   (2.6.5-rc2-mm4)
> EIP is at do_tcp_sendpages+0x197/0xa79
> eax: d1d24108   ebx: f5e3fd80   ecx: 00000008   edx: 00000000
> esi: 00000001   edi: d1d24100   ebp: f72391ec   esp: f6283e34
> ds: 007b   es: 007b   ss: 0068
> Process nfsd (pid: 3330, threadinfo=f6283000 task=f62962b0)
> Stack: 000000d0 000000d0 00000000 00000000 15270000 c01e6a8d d1d24110 f7239064
>        00000008 00000000 00000000 00000000 000005b4 00007530 00000000 f7239000
>        00000008 00000000 c02a069f f7239000 f6283eac 00000000 00000008 00000000
> Call Trace:
>  [<c01e6a8d>] nfsd_readdir+0x69/0xe8
>  [<c02a069f>] tcp_sendpage+0x88/0x96
>  [<c02d8ed4>] svc_sendto+0x16a/0x29e
>  [<c01ed0d5>] encode_post_op_attr+0x1c9/0x241
>  [<c02d9f40>] svc_tcp_sendto+0x53/0xa8
>  [<c02da6f8>] svc_send+0xb9/0xfc
>  [<c02dc384>] svcauth_unix_release+0x57/0x59
>  [<c02d838c>] svc_process+0x187/0x611
>  [<c01e0de5>] nfsd+0x1ea/0x3b6
>  [<c01e0bfb>] nfsd+0x0/0x3b6
>  [<c0104e01>] kernel_thread_helper+0x5/0xb
> 
> Code: 4c 24 20 85 f6 74 17 8d 04 f7 8d 50 08 89 54 24 18 8b 54 24 28 3b 50 08 0f 84 80 08 00 00 83 fe 11 0f 87 25 04 00 00 8b 54 24 28 <f0> ff 42 04 8b 7c 24 28 8b 83 98 00 00 00 8d 04 f0 89 78 10 8d
> 
>   Best regards,
>   
>        -Frank.
>        
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: nfsd oops with 2.6.5-rc2-mm4
  2004-03-28  2:30 ` Linus Torvalds
@ 2004-03-28  2:48   ` Andrew Morton
  2004-03-28  8:05     ` Frank Denis
       [not found]   ` <1080511633.5553.29.camel@lade.trondhjem.org>
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2004-03-28  2:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: davem, trond.myklebust, neilb, netdev, fdenis

Linus Torvalds <torvalds@osdl.org> wrote:
>
> 
> This oops is on a 
> 
> 	lock incl 0x4(%edx)
> 
> and as far as I can tell, it's from do_tcp_sendpages():
> 
> 		....
> 
>                 i = skb_shinfo(skb)->nr_frags;
>                 if (can_coalesce(skb, i, page, offset)) {
>                         skb_shinfo(skb)->frags[i - 1].size += copy;
>                 } else if (i < MAX_SKB_FRAGS) {
> *********		get_page(page);			***************
>                         fill_page_desc(skb, i, page, offset, copy);
>                 } else {
>                         tcp_mark_push(tp, skb);
>                         goto new_segment;
>                 }
> 		...
> 
> where "page" is NULL.
> 
> The caller seems to be svc_sendto()->tcp_sendpage()->do_tcp_sendpages()  
> (the other addresses seem to be stale crud on the stack), which doesn't
> look like it has changed lately. Unless there are changes in this area in
> -mm..

There are some knfsd patches in -mm.

This one might be the cuplrit:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.5-rc2/2.6.5-rc2-mm4/broken-out/knfsd-03-auth_error-formatting-fix.patch

Frank, if you have time it would be interesting to try reverting that (and
the other knfsd-* patches), see if the crash goes away.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: nfsd oops with 2.6.5-rc2-mm4
  2004-03-28  2:48   ` Andrew Morton
@ 2004-03-28  8:05     ` Frank Denis
  0 siblings, 0 replies; 10+ messages in thread
From: Frank Denis @ 2004-03-28  8:05 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, davem, trond.myklebust, neilb, netdev

Le dimanche 28 Mars 2004 04:48, Andrew Morton a écrit :
> This one might be the cuplrit:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.5-rc2/2.6
>.5-rc2-mm4/broken-out/knfsd-03-auth_error-formatting-fix.patch

Bad pick, same oops when that one is reverted.

Let's try other ones...

^ permalink raw reply	[flat|nested] 10+ messages in thread

[parent not found: <1080511633.5553.29.camel@lade.trondhjem.org>]

* Re: nfsd oops with 2.6.5-rc2-mm4
       [not found]   ` <1080511633.5553.29.camel@lade.trondhjem.org>
@ 2004-03-29  1:03     ` Neil Brown
  2004-03-29  2:44     ` Linus Torvalds
  1 sibling, 0 replies; 10+ messages in thread
From: Neil Brown @ 2004-03-29  1:03 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Bruce Allan, Linus Torvalds, David S Miller, Andrew Morton,
	netdev

On Sunday March 28, trond.myklebust@fys.uio.no wrote:
> På lau , 27/03/2004 klokka 21:30, skreiv Linus Torvalds:
> > This oops is on a 
> > 
> > 	lock incl 0x4(%edx)
> > 
> > and as far as I can tell, it's from do_tcp_sendpages():
> > 
> > 		....
> > 
> >                 i = skb_shinfo(skb)->nr_frags;
> >                 if (can_coalesce(skb, i, page, offset)) {
> >                         skb_shinfo(skb)->frags[i - 1].size += copy;
> >                 } else if (i < MAX_SKB_FRAGS) {
> > *********		get_page(page);			***************
> >                         fill_page_desc(skb, i, page, offset, copy);
> >                 } else {
> >                         tcp_mark_push(tp, skb);
> >                         goto new_segment;
> >                 }
> > 		...
> > 
> > where "page" is NULL.
> > 
> > The caller seems to be svc_sendto()->tcp_sendpage()->do_tcp_sendpages()  
> > (the other addresses seem to be stale crud on the stack), which doesn't
> > look like it has changed lately. Unless there are changes in this area in
> > -mm..
> > 
> > Any ideas?
> 
> I'm not really that familiar with Neil's code, but looking at
> encode_entry(), shouldn't it test for the buffer overflow condition
> whereby pn >= cd->rqstp->rq_resused, and exit if it does? Neil?

No... and yes... sort of....

It is (should be) an invariant that cd->buffer is always inside one
of the pages in rq_respages.  If a particular entry won't fit, it is
not added and nfserr_toosmall is returned there.

I say "should" because there is a '<=' which should be a '<', so it is
possible that if the entry fits exactly in the page, cd->buffer will
be left pointing off the end of the page, and so the invariant won't
be true.

While this won't happen often, it is more likely to happen with larger
directories, which seems to be a common theme.

There is another problem in this code - num_entry_words isn't set
properly if there is a problem getting the filehandle for a file.

I'm not sure if this has been a problem or not.


Anyway, here is the patch.  I don't have a test case that breaks
without this patch, but a large directory can be listed from Solaris
(which uses readdir-plus) with this patch, so I don't think I've
broken anything.

NeilBrown

###Comments for Changeset
Fix bugs introduced by recent improvements to readdir_plus

1/ make sure cd->buffer is always inside a page - previously if an
  entry fit perfectly in the remainder of a page, cd->buffer would 
  end up pointing past the end of that page.
2/ make sure num_entry_words is always correct, even on the error
  path.

 ----------- Diffstat output ------------
 ./fs/nfsd/nfs3xdr.c |   16 ++++++----------
 1 files changed, 6 insertions(+), 10 deletions(-)

diff ./fs/nfsd/nfs3xdr.c~current~ ./fs/nfsd/nfs3xdr.c
--- ./fs/nfsd/nfs3xdr.c~current~	2004-03-29 10:10:25.000000000 +1000
+++ ./fs/nfsd/nfs3xdr.c	2004-03-29 10:26:21.000000000 +1000
@@ -884,10 +884,11 @@ encode_entry(struct readdir_cd *ccd, con
 		if (plus) {
 			struct svc_fh	fh;
 
-			if (compose_entry_fh(cd, &fh, name, namlen) > 0)
-				goto noexec;
-
-			p = encode_entryplus_baggage(cd, p, &fh);
+			if (compose_entry_fh(cd, &fh, name, namlen) > 0) {
+				*p++ = 0;
+				*p++ = 0;
+			} else
+				p = encode_entryplus_baggage(cd, p, &fh);
 		}
 		num_entry_words = p - cd->buffer;
 	} else if (cd->rqstp->rq_respages[pn+1] != NULL) {
@@ -916,7 +917,7 @@ encode_entry(struct readdir_cd *ccd, con
 		/* determine entry word length and lengths to go in pages */
 		num_entry_words = p1 - tmp;
 		len1 = curr_page_addr + PAGE_SIZE - (caddr_t)cd->buffer;
-		if ((num_entry_words << 2) <= len1) {
+		if ((num_entry_words << 2) < len1) {
 			/* the actual number of words in the entry is less
 			 * than elen and can still fit in the current page
 			 */
@@ -945,16 +946,11 @@ encode_entry(struct readdir_cd *ccd, con
 		return -EINVAL;
 	}
 
-out:
 	cd->buflen -= num_entry_words;
 	cd->buffer = p;
 	cd->common.err = nfs_ok;
 	return 0;
 
-noexec:
-	*p++ = 0;
-	*p++ = 0;
-	goto out;
 }
 
 int

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: nfsd oops with 2.6.5-rc2-mm4
       [not found]   ` <1080511633.5553.29.camel@lade.trondhjem.org>
  2004-03-29  1:03     ` Neil Brown
@ 2004-03-29  2:44     ` Linus Torvalds
  1 sibling, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2004-03-29  2:44 UTC (permalink / raw)
  To: Trond Myklebust, Frank Denis
  Cc: David S Miller, Neil Brown, Andrew Morton, netdev



On Sun, 28 Mar 2004, Trond Myklebust wrote:
> 
> I'm not really that familiar with Neil's code, but looking at
> encode_entry(), shouldn't it test for the buffer overflow condition
> whereby pn >= cd->rqstp->rq_resused, and exit if it does? Neil?

Ok, I took Neils patch, which touches the same function, but looks very 
different.

Frank, can you test the current top-of-BK tree (or if not a BK user, try
the snapshot tomorrow that should get build at 2AM tonight). It would be 
good to get this thing confirmed.

		Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2004-03-29  2:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-03-27 23:17 nfsd oops with 2.6.5-rc2-mm4 Dieter Stueken
  -- strict thread matches above, loose matches on Subject: below --
2004-03-27 19:10 Dieter Stueken
2004-03-27 20:03 ` Frank Denis
2004-03-27 20:40   ` Frank Denis
2004-03-27 13:07 Jedi/Sector One
2004-03-28  2:30 ` Linus Torvalds
2004-03-28  2:48   ` Andrew Morton
2004-03-28  8:05     ` Frank Denis
     [not found]   ` <1080511633.5553.29.camel@lade.trondhjem.org>
2004-03-29  1:03     ` Neil Brown
2004-03-29  2:44     ` Linus Torvalds

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.