All of lore.kernel.org
 help / color / mirror / Atom feed
* htree+NFS (NFS client bug?)
@ 2002-11-26 23:44 Jeremy Fitzhardinge
  2002-11-27  3:26 ` [NFS] " Trond Myklebust
  2002-11-27 13:33 ` Theodore Ts'o
  0 siblings, 2 replies; 30+ messages in thread
From: Jeremy Fitzhardinge @ 2002-11-26 23:44 UTC (permalink / raw)
  To: Ext2 devel, NFS maillist; +Cc: Linux Kernel

I'm having problems with exporting htree ext3 filesystems over NFS. 
Quite often, if a program on the client side is reading a directory it
will simply stop and consume a vast amount of memory.  It seems that the
readdir never terminates, and endlessly returns the same dir entries
again and again.

This seems to be happening entirely on the NFS client side; after the
initial readdir and a couple of getattrs, there's no further network
traffic.

It looks to me like some sort of problem managing the NFS readdir
cookies, but it isn't clear to me whether this is the NFS server/ext3
generating bad cookies, or the NFS client handling them wrongly.

Both the client and server machines are running 2.4.20-rc1, with Ted's
extfs-update-2.4.20-rc1 patch (dating from Nov 8th or so).  Is there a
more up to date patch?

This is the network traffic:

# tcpdump -s1500 -vvv -ieth0 host server and port 2049
tcpdump: listening on eth0
15:13:39.784538 client.3095412229 > server.nfs: 140 lookup fh Unknown/1 "coregrind-" (DF) (ttl 64, id 0, len 168)
15:13:39.784814 server.nfs > client.3095412229: reply ok 128 lookup fh Unknown/1 DIR 40775 ids 1043/1043 sz 12288 nlink 3 rdev ffffffff fsid 301 nodeid 47976 a/m/ctime 1038043539.000000 1038351994.000000 1038351994.000000  (DF) (ttl 64, id 0, len 156)
15:13:39.785247 client.3112189445 > server.nfs: 124 getattr fh Unknown/1 (DF) (ttl 64, id 0, len 152)
15:13:39.785434 server.nfs > client.3112189445: reply ok 96 getattr DIR 40775 ids 1043/1043 sz 12288  (DF) (ttl 64, id 0, len 124)
15:13:39.785938 client.3128966661 > server.nfs: 132 readdir fh Unknown/1 4096 bytes @ 0 (DF) (ttl 64, id 0, len 160)
15:13:39.786939 server.nfs > client.3128966661: reply ok 1472 readdir offset 1 size 293238  eof (frag 48814:1480@0+) (ttl 64, len 1500)
15:14:50.727869 client.3246407173 > server.nfs: 116 getattr fh 0,1/16777984 (DF) (ttl 64, id 0, len 144)
15:14:50.728127 server.nfs > client.3246407173: reply ok 96 getattr DIR 40755 ids 0/0 sz 4096  (DF) (ttl 64, id 0, len 124)

For this command:
$ strace -s1500 ls -f .
execve("/bin/ls", ["ls", "-f", "."], [/* 122 vars */]) = 0
[...]
lstat64(".", {st_mode=S_IFDIR|0775, st_size=12288, ...}) = 0
open("/dev/null", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = -1 ENOTDIR (Not a directory)
open(".", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0775, st_size=12288, ...}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
getdents64(3, /* 94 entries */, 4096)   = 4096
brk(0x8059000)                          = 0x8059000
getdents64(3, /* 13 entries */, 4096)   = 496
brk(0x805f000)                          = 0x805f000
getdents64(3, /* 94 entries */, 4096)   = 4096
brk(0x806b000)                          = 0x806b000
getdents64(3, /* 13 entries */, 4096)   = 496
getdents64(3, /* 94 entries */, 4096)   = 4096
getdents64(3, /* 13 entries */, 4096)   = 496
getdents64(3, /* 94 entries */, 4096)   = 4096
brk(0x8082000)                          = 0x8082000
getdents64(3, /* 13 entries */, 4096)   = 496
getdents64(3, /* 94 entries */, 4096)   = 4096
getdents64(3, /* 13 entries */, 4096)   = 496
getdents64(3, /* 94 entries */, 4096)   = 4096
getdents64(3, /* 13 entries */, 4096)   = 496
getdents64(3, /* 94 entries */, 4096)   = 4096
getdents64(3, /* 13 entries */, 4096)   = 496
getdents64(3, /* 94 entries */, 4096)   = 4096
[...]

A simple program which does an opendir/while(readdir != NULL)/closedir
loop shows this:
$ readdir .
name=".", ino=293238
name="..", ino=298200
name="vg_symtypes.h~", ino=163580
name="vg_include.h", ino=163476
name=".deps", ino=293390
name="vg_to_ucode.c~47-chained-bb", ino=293901
name="vg_from_ucode.c~47-chained-bb", ino=293896
name="vg_memory.c", ino=293462
name="vg_symtypes.c~", ino=163620
name="vg_constants.h~48-chained-indirect", ino=293925
name="valgrind", ino=293394
name="vg_needs.c", ino=293431
name="vg_from_ucode.i", ino=293860
name="vg_symtypes.h~44-symbolic-addr", ino=293651
name="vg_libpthread.c~11-timedwait-rel", ino=293628
name="vg_libpthread_unimp.c", ino=293999
name="vg_constants.h~47-chained-bb", ino=293904
name="vg_libpthread.c~03-accept-nonblock", ino=293652
name="vg_transtab.c~47-chained-bb", ino=293900
name="vg_include.h~48-chained-indirect", ino=293939
name="vg_symtypes.c~44-symbolic-addr", ino=293639
name="vg_main.c~14-hg-mmap-magic-virgin", ino=293941
name="vg_symtab2-test.c", ino=293903
name="vg_startup.S", ino=298417
name="vg_main.c~51-kill-inceip", ino=294086
name="vg_default.c", ino=293439
name="vg_translate.c~01-partial-mul", ino=293615
name="vg_dispatch.S~48-chained-indirect", ino=293856
name="vg_scheduler.c~47-chained-bb", ino=293890
name="vg_clientfuncs.c", ino=298391
name="Makefile.in", ino=298437
name="gmon.out", ino=293911
name="vg_main.c~47-chained-bb", ino=293897
name="vg_procselfmaps.c", ino=298414
name="vg_symtab2.h", ino=163541
name="vg_main.c~48-chained-indirect", ino=293946
name="vg_translate.c~51-kill-inceip", ino=294087
name="vg_symtypes.h", ino=163543
name="vg_symtab_stabs.c", ino=163530
name="dosyms", ino=298389
name="vg_constants.h", ino=163142
name="vg_demangle.c", ino=298395
name="vg_messages.c", ino=293882
name="vg_include.h~50-fast-cond", ino=294050
name="vg_to_ucode.c", ino=163552
name="Makefile~", ino=293393
name="vg_symtab2.c~44-symbolic-addr", ino=293650
name="vg_dispatch.S~47-chained-bb", ino=293878
name="vg_libpthread.c", ino=163595
name="vg_kerneliface.h", ino=163597
name="vg_from_ucode.c~48-chained-indirect", ino=293880
name="vg_from_ucode.c", ino=163451
name="vg_from_ucode.c~49-no-inceip", ino=294060
name="vg_malloc2.c", ino=163531
name="vg_instrument.c", ino=298403
name="vg_signals.c~28-sigtrap", ino=293637
name="vg_main.c.rej", ino=163538
name="vg_symtab2-test", ino=293691
name="vg_errcontext.c", ino=163586
name="vg_translate.c", ino=163539
name="vg_symtypes.c", ino=163647
name="out.pid11357", ino=293908
name="vg_execontext.c", ino=293436
name="Makefile", ino=293504
name="vg_main.c", ino=163523
name="vg_errcontext.c~47-chained-bb", ino=293898
name="vg_signals.c", ino=163536
name="vg_symtab2.c~offset-dehack", ino=293949
name="vg_from_ucode.c~01-partial-mul", ino=293605
name="Makefile.am~44-symbolic-addr", ino=293671
name="vg_main.c.orig", ino=163519
name="vg_symtab2.c", ino=163599
name="vg_kerneliface.h~28-sigtrap", ino=293638
name="vg_ldt.c", ino=298405
name="vg_from_ucode.c~00-lazy-fp", ino=293269
name="vg_transtab.c", ino=163622
name="vg_mylibc.c~12-vgprof", ino=293636
name="vg_from_ucode.c~50-fast-cond", ino=293862
name="vg_symtab2.h~44-symbolic-addr", ino=293851
name="vg_symtab_dwarf.c~44-symbolic-addr", ino=293858
name="vg_unsafe.h", ino=298424
name="vg_include.h~51-kill-inceip", ino=294052
name="vg_symtab_stabs.c~44-symbolic-addr", ino=293842
name="vg_mylibc.c", ino=163177
name="vg_dummy_profile.c", ino=298397
name="vg_main.c~50-fast-cond", ino=293956
name="vg_mylibc.c~09-rdtsc-calibration", ino=293625
name="vg_include.h~47-chained-bb", ino=293879
name="vg_translate.c~47-chained-bb", ino=293894
name="vg_libpthread.c~46-fix-writeable_or_erring-proto", ino=293863
name="vg_syscall.S", ino=298419
name="vg_from_ucode.c~51-kill-inceip", ino=294051
name="vg_mylibc.c~28-sigtrap", ino=298763
name="vg_dispatch.S", ino=163511
name="vg_symtab_dwarf.c", ino=163448
name="vg_intercept.c", ino=293614
name="vg_symtab_stabs.c~", ino=163548
name="vg_clientmalloc.c", ino=298392
name="vg_valgrinq_dummy.c", ino=298425
name="vg_scheduler.c", ino=163617
name="vg_to_ucode.c~01-partial-mul", ino=293607
name="Makefile.am", ino=163594
name="vg_helpers.S", ino=298401
name="out.pid11334", ino=293907
name="vg_syscalls.c", ino=293575
name="valgrind.in", ino=298390
name="vg_libpthread.vs", ino=298407
name=".", ino=293238
name="..", ino=298200
name="vg_symtypes.h~", ino=163580
name="vg_include.h", ino=163476
name=".deps", ino=293390
name="vg_to_ucode.c~47-chained-bb", ino=293901
name="vg_from_ucode.c~47-chained-bb", ino=293896
name="vg_memory.c", ino=293462
name="vg_symtypes.c~", ino=163620
name="vg_constants.h~48-chained-indirect", ino=293925
name="valgrind", ino=293394
name="vg_needs.c", ino=293431
name="vg_from_ucode.i", ino=293860
name="vg_symtypes.h~44-symbolic-addr", ino=293651
name="vg_libpthread.c~11-timedwait-rel", ino=293628
name="vg_libpthread_unimp.c", ino=293999
name="vg_constants.h~47-chained-bb", ino=293904
name="vg_libpthread.c~03-accept-nonblock", ino=293652
name="vg_transtab.c~47-chained-bb", ino=293900
name="vg_include.h~48-chained-indirect", ino=293939
name="vg_symtypes.c~44-symbolic-addr", ino=293639
name="vg_main.c~14-hg-mmap-magic-virgin", ino=293941
name="vg_symtab2-test.c", ino=293903
name="vg_startup.S", ino=298417
name="vg_main.c~51-kill-inceip", ino=294086
name="vg_default.c", ino=293439
name="vg_translate.c~01-partial-mul", ino=293615
name="vg_dispatch.S~48-chained-indirect", ino=293856
name="vg_scheduler.c~47-chained-bb", ino=293890
name="vg_clientfuncs.c", ino=298391
name="Makefile.in", ino=298437
name="gmon.out", ino=293911
name="vg_main.c~47-chained-bb", ino=293897
name="vg_procselfmaps.c", ino=298414
name="vg_symtab2.h", ino=163541
name="vg_main.c~48-chained-indirect", ino=293946
name="vg_translate.c~51-kill-inceip", ino=294087
name="vg_symtypes.h", ino=163543
name="vg_symtab_stabs.c", ino=163530
name="dosyms", ino=298389
name="vg_constants.h", ino=163142
name="vg_demangle.c", ino=298395
name="vg_messages.c", ino=293882
name="vg_include.h~50-fast-cond", ino=294050
name="vg_to_ucode.c", ino=163552
name="Makefile~", ino=293393
name="vg_symtab2.c~44-symbolic-addr", ino=293650
name="vg_dispatch.S~47-chained-bb", ino=293878
name="vg_libpthread.c", ino=163595
name="vg_kerneliface.h", ino=163597
name="vg_from_ucode.c~48-chained-indirect", ino=293880
name="vg_from_ucode.c", ino=163451
name="vg_from_ucode.c~49-no-inceip", ino=294060
name="vg_malloc2.c", ino=163531
name="vg_instrument.c", ino=298403
name="vg_signals.c~28-sigtrap", ino=293637
name="vg_main.c.rej", ino=163538
name="vg_symtab2-test", ino=293691
name="vg_errcontext.c", ino=163586
name="vg_translate.c", ino=163539
name="vg_symtypes.c", ino=163647
name="out.pid11357", ino=293908
name="vg_execontext.c", ino=293436
name="Makefile", ino=293504
name="vg_main.c", ino=163523
name="vg_errcontext.c~47-chained-bb", ino=293898
name="vg_signals.c", ino=163536
name="vg_symtab2.c~offset-dehack", ino=293949
name="vg_from_ucode.c~01-partial-mul", ino=293605
name="Makefile.am~44-symbolic-addr", ino=293671
name="vg_main.c.orig", ino=163519
name="vg_symtab2.c", ino=163599
name="vg_kerneliface.h~28-sigtrap", ino=293638
name="vg_ldt.c", ino=298405
name="vg_from_ucode.c~00-lazy-fp", ino=293269
name="vg_transtab.c", ino=163622
name="vg_mylibc.c~12-vgprof", ino=293636
name="vg_from_ucode.c~50-fast-cond", ino=293862
name="vg_symtab2.h~44-symbolic-addr", ino=293851
name="vg_symtab_dwarf.c~44-symbolic-addr", ino=293858
name="vg_unsafe.h", ino=298424
name="vg_include.h~51-kill-inceip", ino=294052
name="vg_symtab_stabs.c~44-symbolic-addr", ino=293842
name="vg_mylibc.c", ino=163177
name="vg_dummy_profile.c", ino=298397
name="vg_main.c~50-fast-cond", ino=293956
name="vg_mylibc.c~09-rdtsc-calibration", ino=293625
name="vg_include.h~47-chained-bb", ino=293879
name="vg_translate.c~47-chained-bb", ino=293894
name="vg_libpthread.c~46-fix-writeable_or_erring-proto", ino=293863
name="vg_syscall.S", ino=298419
name="vg_from_ucode.c~51-kill-inceip", ino=294051
name="vg_mylibc.c~28-sigtrap", ino=298763
[...etc...]


	J

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: htree+NFS (NFS client bug?)
@ 2002-11-27 17:26 Jeremy Fitzhardinge
  0 siblings, 0 replies; 30+ messages in thread
From: Jeremy Fitzhardinge @ 2002-11-27 17:26 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext2 devel, NFS maillist, Linux Kernel List

[-- Attachment #1: Type: text/plain, Size: 1137 bytes --]

On Wed, 2002-11-27 at 05:33, Theodore Ts'o wrote: 
> Well, even if the NFS server is generating bad cookies (and that may
> be possible), the NFS client should be more robust and not spin in a
> loop forever.
> 
> Can you send me a directory list (from the server) of the directory in
> question?  Also, can you send me the output of dumpe2fs -h on the
> filesystem.  I'll need the later to get the seed for the htree hash,
> so I can try replicating this on my end.

The hash seed is 80f11989-89cc-4046-80d8-4aef72a7f80f (extracted with
tune2fs -l).  The listing is attached.

> Also, have you tried running e2fsck on filesystem on the server?  It
> would be very interesting to confirm whether or not the filesystem is
> consistent.

This happens very regularly.  Each time it does I create a new directory
and move everything over, which presumably works because it rehashes
everything and/or changes the alignment of particular direntries with
respect to the NFS reply packets.  What I'm saying is that if the
filesystem is getting into an inconsistent state, it is doing so at a
very high rate.  I'll check it anyway... all OK.

	J

[-- Attachment #2: Type: text/plain, Size: 2181 bytes --]

.
..
vg_symtypes.h~
vg_include.h
.deps
vg_to_ucode.c~47-chained-bb
vg_from_ucode.c~47-chained-bb
vg_memory.c
vg_symtypes.c~
vg_constants.h~48-chained-indirect
valgrind
vg_needs.c
vg_from_ucode.i
vg_symtypes.h~44-symbolic-addr
vg_libpthread.c~11-timedwait-rel
vg_libpthread_unimp.c
vg_constants.h~47-chained-bb
vg_libpthread.c~03-accept-nonblock
vg_transtab.c~47-chained-bb
vg_include.h~48-chained-indirect
vg_symtypes.c~44-symbolic-addr
vg_main.c~14-hg-mmap-magic-virgin
vg_symtab2-test.c
vg_startup.S
vg_main.c~51-kill-inceip
vg_default.c
vg_translate.c~01-partial-mul
vg_dispatch.S~48-chained-indirect
vg_scheduler.c~47-chained-bb
vg_clientfuncs.c
Makefile.in
gmon.out
vg_main.c~47-chained-bb
vg_procselfmaps.c
vg_symtab2.h
vg_main.c~48-chained-indirect
vg_translate.c~51-kill-inceip
vg_symtypes.h
vg_symtab_stabs.c
dosyms
vg_constants.h
vg_demangle.c
vg_messages.c
vg_include.h~50-fast-cond
vg_to_ucode.c
Makefile~
vg_symtab2.c~44-symbolic-addr
vg_dispatch.S~47-chained-bb
vg_libpthread.c
vg_kerneliface.h
vg_from_ucode.c~48-chained-indirect
vg_from_ucode.c
vg_from_ucode.c~49-no-inceip
vg_malloc2.c
vg_instrument.c
vg_signals.c~28-sigtrap
vg_main.c.rej
vg_symtab2-test
vg_errcontext.c
vg_translate.c
vg_symtypes.c
out.pid11357
vg_execontext.c
Makefile
vg_main.c
vg_errcontext.c~47-chained-bb
vg_signals.c
vg_symtab2.c~offset-dehack
vg_from_ucode.c~01-partial-mul
Makefile.am~44-symbolic-addr
vg_main.c.orig
vg_symtab2.c
vg_kerneliface.h~28-sigtrap
vg_ldt.c
vg_from_ucode.c~00-lazy-fp
vg_transtab.c
vg_mylibc.c~12-vgprof
vg_from_ucode.c~50-fast-cond
vg_symtab2.h~44-symbolic-addr
vg_symtab_dwarf.c~44-symbolic-addr
vg_unsafe.h
vg_include.h~51-kill-inceip
vg_symtab_stabs.c~44-symbolic-addr
vg_mylibc.c
vg_dummy_profile.c
vg_main.c~50-fast-cond
vg_mylibc.c~09-rdtsc-calibration
vg_include.h~47-chained-bb
vg_translate.c~47-chained-bb
vg_libpthread.c~46-fix-writeable_or_erring-proto
vg_syscall.S
vg_from_ucode.c~51-kill-inceip
vg_mylibc.c~28-sigtrap
vg_dispatch.S
vg_symtab_dwarf.c
vg_intercept.c
vg_symtab_stabs.c~
vg_clientmalloc.c
vg_valgrinq_dummy.c
vg_scheduler.c
vg_to_ucode.c~01-partial-mul
Makefile.am
vg_helpers.S
out.pid11334
vg_syscalls.c
valgrind.in
vg_libpthread.vs

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2002-11-28 20:00 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-11-26 23:44 htree+NFS (NFS client bug?) Jeremy Fitzhardinge
2002-11-27  3:26 ` [NFS] " Trond Myklebust
2002-11-27  2:59   ` [Ext2-devel] " chrisl
2002-11-27  8:58   ` Jeremy Fitzhardinge
2002-11-27 15:00     ` Stephen C. Tweedie
2002-11-27 15:00       ` [Ext2-devel] " Stephen C. Tweedie
2002-11-27 20:25       ` Trond Myklebust
2002-11-27 20:55         ` Stephen C. Tweedie
2002-11-27 20:55           ` [Ext2-devel] " Stephen C. Tweedie
2002-11-27 22:44           ` Trond Myklebust
2002-11-28 16:41             ` Stephen C. Tweedie
2002-11-28 16:41               ` [Ext2-devel] " Stephen C. Tweedie
2002-11-28 16:58               ` Trond Myklebust
2002-11-28 16:58                 ` [Ext2-devel] " Trond Myklebust
2002-11-28 17:09                 ` Stephen C. Tweedie
2002-11-28 17:09                   ` [Ext2-devel] " Stephen C. Tweedie
2002-11-28 17:57                   ` Trond Myklebust
2002-11-28 17:57                     ` [Ext2-devel] " Trond Myklebust
2002-11-28 16:44           ` Stephen C. Tweedie
2002-11-28 16:44             ` [Ext2-devel] " Stephen C. Tweedie
2002-11-28 17:13             ` Stephen C. Tweedie
2002-11-28 17:13               ` [Ext2-devel] " Stephen C. Tweedie
2002-11-28 17:44               ` Trond Myklebust
2002-11-28 17:44                 ` [Ext2-devel] " Trond Myklebust
2002-11-28 20:00               ` Jeremy Fitzhardinge
2002-11-28  2:07       ` Jeremy Fitzhardinge
2002-11-28  2:46         ` Trond Myklebust
2002-11-27 13:33 ` Theodore Ts'o
2002-11-27 20:42   ` Trond Myklebust
  -- strict thread matches above, loose matches on Subject: below --
2002-11-27 17:26 Jeremy Fitzhardinge

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.