* Nfsd crashes/oops in 2.6.16-rc5
@ 2006-03-08 15:00 Bas van der Vlies
2006-03-09 7:43 ` Nfsd/gfs " Bas van der Vlies
0 siblings, 1 reply; 2+ messages in thread
From: Bas van der Vlies @ 2006-03-08 15:00 UTC (permalink / raw)
To: linux-kernel
uname: 2.6.16-rc5
libc: libc-2.3.2.so
Debian: Sarge
SMP system: 2 CPU's
On our 4-node GFS cluster we use nfs to export the GFS filesystems to
our 640-node cluster. On our fileserver nodes we get an nfs crash/oops.
We tried several kernels and the crashes/oopses are the same. We have now
installed 2.6.16-rc5 and here is the oops:
Unable to handle kernel NULL pointer dereference at virtual address 00000038
printing eip:
f89a4be3
*pde = 37809001
*pte = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg
ide_floppy ide_cd cdrom qla2xxx siimage piix e1000 gfs lock_harness dm_mod
CPU: 0
EIP: 0060:[<f89a4be3>] Tainted: GF VLI
EFLAGS: 00010246 (2.6.16-rc5-sara3 #1)
EIP is at gfs_create+0x6f/0x153 [gfs]
eax: 00000000 ebx: ffffffef ecx: f27d0d98 edx: ffffffef
esi: f2f84690 edi: f8b93000 ebp: f34a5e98 esp: f34a5e20
ds: 007b es: 007b ss: 0068
Process nfsd (pid: 8973, threadinfo=f34a4000 task=f3462a70)
Stack: <0>f092a530 00000001 f34a5e48 00000000 f34a5e84 f89a6628 f34a5e48
ee1fc324 00000003 00000000 f34a5e48 f34a5e48 00000000 f3462a70 00000003
f34a5e5c f34a5e5c f27d0d98 f3462a70 00000001 00000020 00000000 000000c2
00000000
Call Trace:
[<c0103599>] show_stack_log_lvl+0xad/0xb5
[<c01036db>] show_registers+0x10d/0x176
[<c01038ad>] die+0xf2/0x16d
[<c010f668>] do_page_fault+0x3dd/0x57a
[<c010322f>] error_code+0x4f/0x54
[<c01585f2>] vfs_create+0x6a/0xa7
[<c0195e1c>] nfsd_create_v3+0x2b1/0x48a
[<c019af2f>] nfsd3_proc_create+0x116/0x123
[<c019229f>] nfsd_dispatch+0xbe/0x17f
[<c02e0a52>] svc_process+0x381/0x5c7
[<c019208c>] nfsd+0x18d/0x2e2
[<c0100ed9>] kernel_thread_helper+0x5/0xb
Code: 94 50 8b 45 0c ff 75 10 83 c0 1c 6a 01 89 45 88 50 8d 45 c4 50 e8
70 08 ff ff 83 c4 14 89 c3 85 c0 74 48 83 f8 ef 75 33 8b 45 14 <80> 78 38
00 78 2a 8d 45 94 50 8d 45 c4 6a 00 ff 75 88 50 e8 3c
BUG: nfsd/8973, lock held at task exit time!
[ee1fc398] {inode_init_once}
.. held by: nfsd: 8973 [f3462a70, 115]
... acquired at: nfsd_create_v3+0x127/0x48a
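
A note on reading the dump above, as a rough aid rather than a full analysis: ebx and edx both hold ffffffef, which read as a signed 32-bit value is -17, i.e. -EEXIST on Linux. The faulting byte sequence `<80> 78 38 00` looks like a byte compare at offset 0x38 from eax, and eax is 0, which matches the fault address 00000038. The small C sketch below only shows the register-to-errno conversion; the helper names are made up for illustration and are not taken from any kernel source.

```c
#include <stdint.h>
#include <errno.h>

/*
 * Reinterpret a raw 32-bit register value from an oops dump as a signed
 * number, the way a kernel function's negative-errno return value is read.
 * Written with plain arithmetic so the result does not rely on
 * implementation-defined unsigned-to-signed conversion.
 * (Hypothetical helper, for illustration only.)
 */
static int32_t reg_as_signed(uint32_t raw)
{
    if (raw <= (uint32_t)INT32_MAX)
        return (int32_t)raw;
    return -(int32_t)(UINT32_MAX - raw) - 1;
}

/* True when the register value is -EEXIST ("file already exists"). */
static int reg_is_minus_eexist(uint32_t raw)
{
    return reg_as_signed(raw) == -EEXIST;
}
```

With the value from the dump, `reg_as_signed(0xffffffefu)` gives -17, so the create path had apparently just been told that the file already exists when it went on to dereference the NULL pointer.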
--
--
********************************************************************
* *
* Bas van der Vlies e-mail: basv@sara.nl *
* SARA - Academic Computing Services phone: +31 20 592 8012 *
* Kruislaan 415 fax: +31 20 6683167 *
* 1098 SJ Amsterdam *
* *
********************************************************************
* Re: Nfsd/gfs crashes/oops in 2.6.16-rc5
2006-03-08 15:00 Nfsd crashes/oops in 2.6.16-rc5 Bas van der Vlies
@ 2006-03-09 7:43 ` Bas van der Vlies
0 siblings, 0 replies; 2+ messages in thread
From: Bas van der Vlies @ 2006-03-09 7:43 UTC (permalink / raw)
To: linux-kernel
Bas van der Vlies wrote:
> uname: 2.6.16-rc5
> libc: libc-2.3.2.so
> Debian: Sarge
> SMP system: 2 CPU's
>
> On our 4-node GFS cluster we use nfs to export the GFS filesystems to
> our 640-node cluster. On our fileserver nodes we get an nfs crash/oops.
> We tried several kernels and the crashes/oopses are the same. We have now
> installed 2.6.16-rc5 and here is the oops:
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000038
> printing eip:
> f89a4be3
> *pde = 37809001
> *pte = 00000000
> Oops: 0000 [#1]
> SMP
> Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg
> ide_floppy ide_cd cdrom qla2xxx siimage piix e1000 gfs lock_harness dm_mod
> CPU: 0
> EIP: 0060:[<f89a4be3>] Tainted: GF VLI
> EFLAGS: 00010246 (2.6.16-rc5-sara3 #1)
> EIP is at gfs_create+0x6f/0x153 [gfs]
It is a GFS crash, and just for the record the GFS guys have made a fix
in the CVS STABLE branch:
CVSROOT: /cvs/cluster
Module name: cluster
Branch: STABLE
Changes by: bmarzins@sourceware.org 2006-03-08 20:47:09
Modified files:
gfs-kernel/src/gfs: ops_inode.c
Log message:
Really gross hack!!!
This is a workaround for one of the bugs that got lumped into 166701. It
breaks POSIX behavior in a corner case to avoid crashing... It's icky.
When NFS opens a file with O_CREAT, the kernel nfs daemon checks to see
if the file exists. If it does, nfsd does the *right thing* (either
opens the file, or if the file was opened with O_EXCL, returns an
error). If the file doesn't exist, it passes the request down to the
underlying file system. Unfortunately, since nfs *knows* that the file
doesn't exist, it doesn't bother to pass a nameidata structure, which
would include the intent information. However, since gfs is a cluster
file system, the file could have been created on another node after nfs
checks for it. If this is the case, gfs needs the intent information to
do the *right thing*. It panics when it finds a NULL pointer instead
of the nameidata. Now, instead of panicking, if gfs finds a NULL
nameidata pointer, it assumes that the file was not created with O_EXCL.
This assumption could be wrong, with the result that an application
could think that it has created a new file when, in fact, it has opened
an existing one.
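
To make the log message above a bit more concrete, here is a minimal user-space sketch of the decision it describes. It is not the actual change to ops_inode.c: `struct nameidata_sketch`, `create_sketch` and the `exists_on_cluster` parameter are hypothetical stand-ins, and only the NULL-pointer handling is modelled.

```c
#include <stddef.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical stand-in for the kernel's nameidata structure; only the
 * open-intent flags mentioned in the log message are modelled.
 */
struct nameidata_sketch {
    int intent_open_flags;
};

/*
 * Hypothetical model of the create path on a cluster file system.
 *
 * exists_on_cluster: non-zero when another node created the file after
 *                    nfsd checked for it.
 * nd:                NULL when the caller is nfsd, which passes no intent.
 *
 * Returns 0 when the file is created or the existing file is opened,
 * and -EEXIST when an exclusive create hits an existing file.
 */
static int create_sketch(const struct nameidata_sketch *nd,
                         int exists_on_cluster)
{
    int excl;

    if (!exists_on_cluster)
        return 0; /* normal case: the file is created */

    /*
     * Without this NULL check, reading the flags through nd would fault.
     * The workaround assumes "not exclusive" when no intent is available.
     */
    excl = nd ? (nd->intent_open_flags & O_EXCL) != 0 : 0;

    return excl ? -EEXIST : 0; /* 0 here means the existing file is opened */
}
```

In this sketch, `create_sketch(NULL, 1)` returns 0 even if the NFS client originally asked for an exclusive create, which is exactly the corner case the log message warns about: the application may believe it created a new file when it actually opened an existing one.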