* 2.5: NFS troubles
@ 2003-04-06 12:06 Felipe Alfaro Solana
2003-04-06 12:58 ` Trond Myklebust
0 siblings, 1 reply; 12+ messages in thread
From: Felipe Alfaro Solana @ 2003-04-06 12:06 UTC (permalink / raw)
To: LKML
[-- Attachment #1: Type: text/plain, Size: 1575 bytes --]
Hello,
I'm testing 2.5.66-bk11 on my NFS server running RH9. When I run the
"find" command on the NFS share from my client computer, it hangs
forever after a while, and it always hangs at *exactly* the same place.
However, if I boot the NFS server into RH9's standard kernel (2.4.20),
the "find" command works as expected and completes without any hangs
or delays.
My NFS server (hostname glass) has a dedicated partition, mounted
under /data and formatted as ext3.
/etc/exports is
/data 192.168.0.100(rw,no_root_squash)
/etc/fstab is
/dev/hda3 /data ext3 defaults,noatime 1 2
The NFS share is listed in my client computer's /etc/fstab as
glass:/data /net/glass_data nfs noauto,users,soft 0 0
The client computer is running 2.5.66-mm1. I have attached the following
files, compressed with bzip2 as they are really big:
local-find.bz2 is the result of running the find command locally on the
NFS server (not using NFS), so you can see the whole list of files that
should show up.
nfs-find.bz2 is the list of files that actually show up when the find
command runs over the NFS share (on my client computer) before it
hangs.
nfs-tcpdump.bz2 is a partial tcpdump output generated while the "find"
command is running over the NFS share.
nfs-strace.bz2 is the output of "strace -p <pid_of_find_command>" on the
find command that I ran on my client computer.
Any ideas on what's happening here?
Thank you very much!
________________________________________________________________________
Linux Registered User #287198
[-- Attachment #2: local-find.bz2 --]
[-- Type: application/x-bzip, Size: 15321 bytes --]
[-- Attachment #3: nfs-find.bz2 --]
[-- Type: application/x-bzip, Size: 546 bytes --]
[-- Attachment #4: nfs-strace.bz2 --]
[-- Type: application/x-bzip, Size: 22290 bytes --]
[-- Attachment #5: nfs-tcpdump.bz2 --]
[-- Type: application/x-bzip, Size: 736 bytes --]
* Re: 2.5: NFS troubles
2003-04-06 12:06 2.5: NFS troubles Felipe Alfaro Solana
@ 2003-04-06 12:58 ` Trond Myklebust
2003-04-07 0:18 ` Andrew Morton
0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2003-04-06 12:58 UTC (permalink / raw)
To: Felipe Alfaro Solana; +Cc: LKML
>>>>> " " == Felipe Alfaro Solana <felipe_alfaro@linuxmail.org> writes:
> Hello, I'm testing 2.5.66-bk11 on my NFS server running
> RH9. When I run the "find" command on the NFS share from my
> client computer, it hangs forever after a while, but it always
> hangs *exactly* at the same place every time. However, if I
> boot into NFS with RH9's standard kernel (2.4.20), the "find"
> command works as expected and is able to complete without any
> hangs or delays.
> My NFS server (hostname glass) has a whole ext3 partition -
> mounted under /data - formatted as ext3.
The 2.5.66 ext3 code still has some issues with respect to NFS readdir
cookies.
Cheers,
Trond
* Re: 2.5: NFS troubles
2003-04-06 12:58 ` Trond Myklebust
@ 2003-04-07 0:18 ` Andrew Morton
2003-04-07 0:27 ` Robert Love
2003-04-07 8:58 ` Felipe Alfaro Solana
0 siblings, 2 replies; 12+ messages in thread
From: Andrew Morton @ 2003-04-07 0:18 UTC (permalink / raw)
To: Trond Myklebust; +Cc: felipe_alfaro, linux-kernel
Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> >>>>> " " == Felipe Alfaro Solana <felipe_alfaro@linuxmail.org> writes:
>
> > Hello, I'm testing 2.5.66-bk11 on my NFS server running
> > RH9. When I run the "find" command on the NFS share from my
> > client computer, it hangs forever after a while, but it always
> > hangs *exactly* at the same place every time. However, if I
> > boot into NFS with RH9's standard kernel (2.4.20), the "find"
> > command works as expected and is able to complete without any
> > hangs or delays.
>
> > My NFS server (hostname glass) has a whole ext3 partition -
> > mounted under /data - formatted as ext3.
>
> The 2.5.66 ext3 code still has some issues with respect to NFS readdir
> cookies.
It might do. I have Ted's htree/NFS fixes in there though.
Felipe, please do
dumpe2fs /dev/hdXX | grep features
if it shows dir_index then it might be an ext3 problem. If not then it is
probably an NFS problem.
If it does have dir_index set then please run
tune2fs -O ^dir_index /dev/hdXX
and reboot and retest.
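For anyone scripting the check above, here is a minimal sketch in Python. It
assumes the output of `dumpe2fs -h` contains a line beginning with
"Filesystem features:"; the helper name `has_dir_index` and the sample text
are illustrative only and are not part of e2fsprogs.

```python
def has_dir_index(dumpe2fs_output: str) -> bool:
    """Return True if the 'Filesystem features:' line lists dir_index."""
    for line in dumpe2fs_output.splitlines():
        if line.startswith("Filesystem features:"):
            # Everything after the first colon is a whitespace-separated list.
            features = line.split(":", 1)[1].split()
            return "dir_index" in features
    # No features line found: treat as "not set" rather than guessing.
    return False


# Hypothetical sample, shaped like the features line from dumpe2fs -h.
SAMPLE = "Filesystem features:      has_journal dir_index filetype sparse_super\n"
print(has_dir_index(SAMPLE))
```

Feed it the text captured from `dumpe2fs -h /dev/hdXX`. If it reports that
dir_index is set, the `tune2fs -O ^dir_index` step above is the one to try.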
* Re: 2.5: NFS troubles
2003-04-07 0:18 ` Andrew Morton
@ 2003-04-07 0:27 ` Robert Love
2003-04-07 9:01 ` Felipe Alfaro Solana
` (2 more replies)
2003-04-07 8:58 ` Felipe Alfaro Solana
1 sibling, 3 replies; 12+ messages in thread
From: Robert Love @ 2003-04-07 0:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: Trond Myklebust, felipe_alfaro, linux-kernel
On Sun, 2003-04-06 at 20:18, Andrew Morton wrote:
> if it shows dir_index then it might be an ext3 problem. If not then it is
> probably an NFS problem.
Nah, it's not an ext3 problem (at least not with htree).
I am seeing this same problem, starting recently, with a 2.5 client and
a 2.4 server. Both are ext3 but neither has htree, and the problem is
new.
I have not yet figured out whether it's the 2.5 kernel on the client or
the newly-upgraded Red Hat 9 on the server... but I suspect the 2.5
kernel on the client.
Robert Love
* Re: 2.5: NFS troubles
2003-04-07 0:18 ` Andrew Morton
2003-04-07 0:27 ` Robert Love
@ 2003-04-07 8:58 ` Felipe Alfaro Solana
1 sibling, 0 replies; 12+ messages in thread
From: Felipe Alfaro Solana @ 2003-04-07 8:58 UTC (permalink / raw)
To: Andrew Morton; +Cc: Trond Myklebust, LKML
On Mon, 2003-04-07 at 02:18, Andrew Morton wrote:
> Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > The 2.5.66 ext3 code still has some issues with respect to NFS readdir
> > cookies.
>
> It might do. I have Ted's htree/NFS fixes in there though.
>
> Felipe, please do
>
> dumpe2fs /dev/hdXX | grep features
>
> if it shows dir_index then it might be an ext3 problem. If not then it is
> probably an NFS problem.
>
> If it does have dir_index set then please run
>
> tune2fs -O ^dir_index /dev/hdXX
>
> and reboot and retest.
Wonderful, Andrew... You were right. Disabling H/Tree indexes solved the
problem! Anything else?
PS: I previously solved the problem by mounting the filesystem as ext2
instead, but now it seems to be working pretty well with ext3 (at least,
I can't reproduce the hang I described in my previous message).
Thanks!
________________________________________________________________________
Linux Registered User #287198
* Re: 2.5: NFS troubles
2003-04-07 0:27 ` Robert Love
@ 2003-04-07 9:01 ` Felipe Alfaro Solana
2003-04-07 9:13 ` Andrew Morton
2003-04-07 9:39 ` Trond Myklebust
2003-04-07 13:23 ` Trond Myklebust
2 siblings, 1 reply; 12+ messages in thread
From: Felipe Alfaro Solana @ 2003-04-07 9:01 UTC (permalink / raw)
To: Robert Love; +Cc: Andrew Morton, Trond Myklebust, LKML
On Mon, 2003-04-07 at 02:27, Robert Love wrote:
> On Sun, 2003-04-06 at 20:18, Andrew Morton wrote:
>
> > if it shows dir_index then it might be an ext3 problem. If not then it is
> > probably an NFS problem.
>
> Nah, it's not an ext3 problem (at least not with htree).
But it could be an interaction problem between NFS and ext3. I did what
Andrew suggested (disabling dir_index) and it solved my problems.
I don't think it's a client problem, since I can't reproduce it with
2.4+ext3, 2.5.66+ext2, or 2.5.66+ext3 without dir_index, but I can
reproduce it with 2.5.66+ext3+dir_index.
________________________________________________________________________
Linux Registered User #287198
* Re: 2.5: NFS troubles
2003-04-07 9:01 ` Felipe Alfaro Solana
@ 2003-04-07 9:13 ` Andrew Morton
2003-04-07 9:24 ` Felipe Alfaro Solana
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2003-04-07 9:13 UTC (permalink / raw)
To: Felipe Alfaro Solana; +Cc: rml, trond.myklebust, linux-kernel
Felipe Alfaro Solana <felipe_alfaro@linuxmail.org> wrote:
>
> On Mon, 2003-04-07 at 02:27, Robert Love wrote:
> > On Sun, 2003-04-06 at 20:18, Andrew Morton wrote:
> >
> > > if it shows dir_index then it might be an ext3 problem. If not then it is
> > > probably an NFS problem.
> >
> > Nah, it's not an ext3 problem (at least not with htree).
>
> But it could be an interaction problem between NFS and ext3. I did what
> Andrew pointed (disabling dir_index) and it solved my problems.
>
> I don't think it's a client problem, since I can't reproduce with
> 2.4+ext3, 2.5.66+ext2 and 2.5.66+ext3-dir_index, but is reproducible
> with 2.5.66+ext3+dir_index.
>
Well, it could still be an NFS problem. Turning off htree on the server could
cause filenames to be returned in a different order (or is that illegal?), or
change the timing, or some such.
If Robert is seeing it on non-htree servers then we'd need to see that fixed
up before deciding if there is also an(other) htree bug.
First thing we need to do is to debug it. Trond would have a better idea of
how to set about that than I. Possibly a tcpdump of the traffic?
* Re: 2.5: NFS troubles
2003-04-07 9:13 ` Andrew Morton
@ 2003-04-07 9:24 ` Felipe Alfaro Solana
2003-04-07 21:58 ` Trond Myklebust
0 siblings, 1 reply; 12+ messages in thread
From: Felipe Alfaro Solana @ 2003-04-07 9:24 UTC (permalink / raw)
To: Andrew Morton; +Cc: rml, trond.myklebust, LKML
On Mon, 2003-04-07 at 11:13, Andrew Morton wrote:
> Felipe Alfaro Solana <felipe_alfaro@linuxmail.org> wrote:
> >
> > I don't think it's a client problem, since I can't reproduce with
> > 2.4+ext3, 2.5.66+ext2 and 2.5.66+ext3-dir_index, but is reproducible
> > with 2.5.66+ext3+dir_index.
> >
>
> Well it could still be an NFS problem. Turning off htree on the server could
> cause filenames to be returned in a different order (or is that illegal?) or
> changed timing or such.
>
> If Robert is seeing it on non-htree servers then we'd need to see that fixed
> up before deciding if there is also an(other) htree bug.
>
> First thing we need to do is to debug it. Trond would have a better idea of
> how to set about that than I. Possibly a tcpdump of the traffic?
I sent a tcpdump and strace as attachments in my original message ;-)
Do you have it handy? Should I send it again?...
________________________________________________________________________
Linux Registered User #287198
* Re: 2.5: NFS troubles
2003-04-07 0:27 ` Robert Love
2003-04-07 9:01 ` Felipe Alfaro Solana
@ 2003-04-07 9:39 ` Trond Myklebust
2003-04-07 13:23 ` Trond Myklebust
2 siblings, 0 replies; 12+ messages in thread
From: Trond Myklebust @ 2003-04-07 9:39 UTC (permalink / raw)
To: Robert Love; +Cc: Andrew Morton, Trond Myklebust, felipe_alfaro, linux-kernel
>>>>> " " == Robert Love <rml@tech9.net> writes:
> I have not yet figured out whether it's the 2.5 kernel on the
> client or the newly-upgraded Red Hat 9 on the server... but I
> suspect the 2.5 kernel on the client.
There is a problem with generic reads under 2.5. Now that I think I've
got the resource management under control, I've started to look into
it.
Cheers,
Trond
* Re: 2.5: NFS troubles
2003-04-07 0:27 ` Robert Love
2003-04-07 9:01 ` Felipe Alfaro Solana
2003-04-07 9:39 ` Trond Myklebust
@ 2003-04-07 13:23 ` Trond Myklebust
2003-04-07 15:17 ` Robert Love
2 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2003-04-07 13:23 UTC (permalink / raw)
To: Robert Love, Siim Vahtre; +Cc: linux-kernel, NFS maillist
OK. I've managed to squash the NFS read corruption problems that I had
on my 2.5.x client setup with the following patch.
Since the two of you reported what appears to be the same problem,
would you mind trying it out?
The fix basically tightens up consistency checks in the process of
reading the skb (which is done in the sk->data_ready() callback).
Cheers,
Trond
diff -u --recursive --new-file linux-2.5.66-10-nr_dirty/net/sunrpc/xprt.c linux-2.5.66-11-fix_read/net/sunrpc/xprt.c
--- linux-2.5.66-10-nr_dirty/net/sunrpc/xprt.c 2003-03-27 18:34:08.000000000 +0100
+++ linux-2.5.66-11-fix_read/net/sunrpc/xprt.c 2003-04-07 15:15:29.000000000 +0200
@@ -625,7 +625,8 @@
 {
 	if (len > desc->count)
 		len = desc->count;
-	skb_copy_bits(desc->skb, desc->offset, to, len);
+	if (skb_copy_bits(desc->skb, desc->offset, to, len))
+		return 0;
 	desc->count -= len;
 	desc->offset += len;
 	return len;
@@ -669,11 +670,15 @@
 		csum2 = skb_checksum(skb, desc.offset, skb->len - desc.offset, 0);
 		desc.csum = csum_block_add(desc.csum, csum2, desc.offset);
 	}
+	if (desc.count)
+		return -1;
 	if ((unsigned short)csum_fold(desc.csum))
 		return -1;
 	return 0;
 no_checksum:
 	xdr_partial_copy_from_skb(xdr, 0, &desc, skb_read_bits);
+	if (desc.count)
+		return -1;
 	return 0;
 }
@@ -750,7 +755,8 @@
 {
 	if (len > desc->count)
 		len = desc->count;
-	skb_copy_bits(desc->skb, desc->offset, p, len);
+	if (skb_copy_bits(desc->skb, desc->offset, p, len))
+		return 0;
 	desc->offset += len;
 	desc->count -= len;
 	return len;
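
The two checks the patch adds can be modelled in user space. The sketch below
is only an analogy in Python, not kernel code: `ReadDesc`, `read_bits` and
`copy_message` are hypothetical names, a plain bytes object stands in for the
skb, and an out-of-range slice stands in for skb_copy_bits() returning an
error. It shows the same idea: a failed copy reports zero bytes instead of
pretending to succeed, and the caller rejects the message if any expected
bytes are still outstanding.

```python
from dataclasses import dataclass


@dataclass
class ReadDesc:
    data: bytes   # stands in for desc->skb
    offset: int   # current read position, like desc->offset
    count: int    # bytes still expected, like desc->count


def read_bits(desc: ReadDesc, length: int) -> bytes:
    """Copy up to `length` bytes; return b"" if the source cannot supply them."""
    length = min(length, desc.count)
    end = desc.offset + length
    if end > len(desc.data):
        # Models a failing copy: report nothing copied and leave the
        # descriptor untouched, as the patched helpers do with `return 0`.
        return b""
    chunk = desc.data[desc.offset:end]
    desc.offset = end
    desc.count -= length
    return chunk


def copy_message(data: bytes, offset: int, expected: int, chunk_size: int = 4):
    """Return the payload, or None if fewer than `expected` bytes were read."""
    desc = ReadDesc(data, offset, expected)
    out = bytearray()
    while desc.count:
        chunk = read_bits(desc, chunk_size)
        if not chunk:
            break
        out += chunk
    if desc.count:
        # Models the added `if (desc.count) return -1;` check.
        return None
    return bytes(out)


print(copy_message(b"abcdefgh", 2, 6))
print(copy_message(b"abcdefgh", 2, 10))
```

The first call has enough data and returns the six-byte payload; the second
asks for more than the buffer holds, so it is rejected rather than returned
as a short read.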
* Re: 2.5: NFS troubles
2003-04-07 13:23 ` Trond Myklebust
@ 2003-04-07 15:17 ` Robert Love
0 siblings, 0 replies; 12+ messages in thread
From: Robert Love @ 2003-04-07 15:17 UTC (permalink / raw)
To: trond.myklebust; +Cc: Siim Vahtre, linux-kernel, NFS maillist
On Mon, 2003-04-07 at 09:23, Trond Myklebust wrote:
> OK. I've managed to squash the NFS read corruption problems that I had
> on my 2.5.x client setup with the following patch.
> Since the two of you reported what appears to be the same problem,
> would you mind trying it out?
This fixes it for me. No errors, no corruption.
I did a verify of the md5sums of all of the Red Hat 9 RPM packages over
NFS. I had random failures (in different packages each time) before.
I just did it twice to be sure -- it works.
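A rough sketch of this kind of check in Python, for anyone who wants to repeat
it. The tool actually used for the verification above is not stated, so this
is an assumption throughout: `md5_of`, `failed_files` and `verify_twice` are
hypothetical helpers, and `expected` is a mapping from file name to a
known-good MD5 digest that you would have to supply from a trusted source.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def failed_files(root, expected):
    """Return the sorted names under `root` whose digest differs from `expected`."""
    return sorted(
        name for name, good in expected.items()
        if md5_of(Path(root) / name) != good
    )


def verify_twice(root, expected):
    """Run the whole check two times and return both failure lists."""
    return [failed_files(root, expected) for _ in range(2)]
```

Point `root` at the NFS mount. Two empty lists mean both passes were clean;
failures that move between files from one pass to the next would match the
random corruption described above.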
Thank you, Trond.
Robert Love
* Re: 2.5: NFS troubles
2003-04-07 9:24 ` Felipe Alfaro Solana
@ 2003-04-07 21:58 ` Trond Myklebust
0 siblings, 0 replies; 12+ messages in thread
From: Trond Myklebust @ 2003-04-07 21:58 UTC (permalink / raw)
To: Felipe Alfaro Solana; +Cc: Andrew Morton, rml, LKML
>>>>> " " == Felipe Alfaro Solana <felipe_alfaro@linuxmail.org> writes:
>> If Robert is seeing it on non-htree servers then we'd need to
>> see that fixed up before deciding if there is also an(other)
>> htree bug.
> I sent a tcpdump and strace as attachments in my original
> message ;-) Do you have it handy? Should I send it again?...
Robert was seeing a very different read-related bug (not related to
the htree bug). I posted a fix for it earlier this afternoon, and he
has already confirmed that it fixed his problem.
Cheers,
Trond