* Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: John Dalbec @ 2003-01-31 14:41 UTC
To: reiserfs-list
I'm trying to port Chris's data-logging patches to the Red Hat
2.4.18-19.7.x kernel. My first effort works fine on my workstation with
ReiserFS and NFS, but not on the production server:
> Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
> Jan 31 05:47:28 mail03 kernel: ------------[ cut here ]------------
> Jan 31 05:47:28 mail03 kernel: kernel BUG at journal.c:429!
> Jan 31 05:47:28 mail03 kernel: invalid operand: 0000
> Jan 31 05:47:28 mail03 kernel: nfsd lockd sunrpc pcnet32 mii ipt_state ipt_LOG ip_conntrack_ftp iptable_mangle iptable_nat ip_conntrack iptable_filter ip_tables st reiserfs usb-ohci usbcore
> Jan 31 05:47:28 mail03 kernel: CPU: 0
> Jan 31 05:47:29 mail03 kernel: EIP: 0010:[pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1844766/96] Not tainted
> Jan 31 05:47:29 mail03 kernel: EIP: 0010:[<e08c09e2>] Not tainted
> Jan 31 05:47:29 mail03 kernel: EFLAGS: 00010246
> Jan 31 05:47:29 mail03 kernel:
> Jan 31 05:47:29 mail03 kernel: EIP is at reiserfs_check_lock_depth [reiserfs] 0x22 (2.4.18-19.7.x.ysubigmem)
> Jan 31 05:47:29 mail03 kernel: eax: 00000000 ebx: df857800 ecx: 00000001 edx: c030a648
> Jan 31 05:47:29 mail03 kernel: esi: 00002108 edi: d42f5d74 ebp: df857800 esp: d42f5cd4
> Jan 31 05:47:29 mail03 kernel: ds: 0018 es: 0018 ss: 0018
> Jan 31 05:47:30 mail03 kernel: Process nfsd (pid: 1465, stackpage=d42f5000)
> Jan 31 05:47:30 mail03 kernel: Stack: e08ccd80 e08cc303 e08bb10a e08cc303 00000000 00000000 00000000 00000000
> Jan 31 05:47:30 mail03 kernel: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> Jan 31 05:47:30 mail03 kernel: 00000000 00000000 00000000 00000000 00000000 00001000 00000004 d0cc9880
> Jan 31 05:47:30 mail03 kernel: Call Trace: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1794688/96] MAX_KEY [reiserfs] 0xda0 (0xd42f5cd4))
> Jan 31 05:47:30 mail03 kernel: Call Trace: [<e08ccd80>] MAX_KEY [reiserfs] 0xda0 (0xd42f5cd4))
> Jan 31 05:47:30 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1797373/96] MAX_KEY [reiserfs] 0x323 (0xd42f5cd8))
> Jan 31 05:47:31 mail03 kernel: [<e08cc303>] MAX_KEY [reiserfs] 0x323 (0xd42f5cd8))
> Jan 31 05:47:31 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1867510/96] search_by_key [reiserfs] 0x5a (0xd42f5cdc))
> Jan 31 05:47:31 mail03 kernel: [<e08bb10a>] search_by_key [reiserfs] 0x5a (0xd42f5cdc))
> Jan 31 05:47:31 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1797373/96] MAX_KEY [reiserfs] 0x323 (0xd42f5ce0))
> Jan 31 05:47:31 mail03 kernel: [<e08cc303>] MAX_KEY [reiserfs] 0x323 (0xd42f5ce0))
> Jan 31 05:47:31 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926756/96] reiserfs_read_inode2 [reiserfs] 0x6c (0xd42f5d40))
> Jan 31 05:47:31 mail03 kernel: [<e08ac99c>] reiserfs_read_inode2 [reiserfs] 0x6c (0xd42f5d40))
> Jan 31 05:47:31 mail03 kernel: [get_new_inode+187/352] get_new_inode [kernel] 0xbb (0xd42f5dc0))
> Jan 31 05:47:31 mail03 kernel: [<c015a56b>] get_new_inode [kernel] 0xbb (0xd42f5dc0))
> Jan 31 05:47:31 mail03 kernel: [iget4+217/240] iget4 [kernel] 0xd9 (0xd42f5dec))
> Jan 31 05:47:31 mail03 kernel: [<c015a7d9>] iget4 [kernel] 0xd9 (0xd42f5dec))
> Jan 31 05:47:32 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926608/96] reiserfs_find_actor [reiserfs] 0x0 (0xd42f5dfc))
> Jan 31 05:47:32 mail03 kernel: [<e08aca30>] reiserfs_find_actor [reiserfs] 0x0 (0xd42f5dfc))
> Jan 31 05:47:32 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926540/96] reiserfs_iget [reiserfs] 0x24 (0xd42f5e24))
> Jan 31 05:47:32 mail03 kernel: [<e08aca74>] reiserfs_iget [reiserfs] 0x24 (0xd42f5e24))
> Jan 31 05:47:32 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926608/96] reiserfs_find_actor [reiserfs] 0x0 (0xd42f5e30))
> Jan 31 05:47:32 mail03 kernel: [<e08aca30>] reiserfs_find_actor [reiserfs] 0x0 (0xd42f5e30))
> Jan 31 05:47:32 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926354/96] reiserfs_fh_to_dentry [reiserfs] 0x6e (0xd42f5e44))
> Jan 31 05:47:32 mail03 kernel: [<e08acb2e>] reiserfs_fh_to_dentry [reiserfs] 0x6e (0xd42f5e44))
> Jan 31 05:47:33 mail03 kernel: [<e0b0af98>] nfsd_get_dentry [nfsd] 0x28 (0xd42f5e80))
> Jan 31 05:47:33 mail03 kernel: [<e0b0b3ff>] find_fh_dentry [nfsd] 0x3f (0xd42f5ea4))
> Jan 31 05:47:33 mail03 kernel: [<e0b0ba11>] fh_verify [nfsd] 0x271 (0xd42f5ee0))
> Jan 31 05:47:33 mail03 kernel: [<e0afb7ff>] svc_udp_recvfrom [sunrpc] 0x19f (0xd42f5f0c))
> Jan 31 05:47:33 mail03 kernel: [<e0b123c7>] nfsd3_proc_getattr [nfsd] 0x97 (0xd42f5f3c))
> Jan 31 05:47:33 mail03 kernel: [<e0b1b204>] nfsd_procedures3 [nfsd] 0x24 (0xd42f5f54))
> Jan 31 05:47:33 mail03 kernel: [<e0b09647>] nfsd_dispatch [nfsd] 0xb7 (0xd42f5f60))
> Jan 31 05:47:34 mail03 kernel: [<e0b1ab9c>] nfsd_version3 [nfsd] 0x0 (0xd42f5f78))
> Jan 31 05:47:34 mail03 kernel: [<e0afad23>] svc_process_Rsmp_720c86cf [sunrpc] 0x363 (0xd42f5f7c))
> Jan 31 05:47:34 mail03 kernel: [<e0b1b204>] nfsd_procedures3 [nfsd] 0x24 (0xd42f5f98))
> Jan 31 05:47:34 mail03 kernel: [<e0b1ab9c>] nfsd_version3 [nfsd] 0x0 (0xd42f5f9c))
> Jan 31 05:47:34 mail03 kernel: [<e0b1abbc>] nfsd_program [nfsd] 0x0 (0xd42f5fa0))
> Jan 31 05:47:34 mail03 kernel: [<e0b09420>] nfsd [nfsd] 0x240 (0xd42f5fbc))
> Jan 31 05:47:34 mail03 kernel: [<e0b091e0>] nfsd [nfsd] 0x0 (0xd42f5fe8))
> Jan 31 05:47:34 mail03 kernel: [kernel_thread+38/48] kernel_thread [kernel] 0x26 (0xd42f5ff0))
> Jan 31 05:47:34 mail03 kernel: [<c0107286>] kernel_thread [kernel] 0x26 (0xd42f5ff0))
> Jan 31 05:47:34 mail03 kernel: [<e0b091e0>] nfsd [nfsd] 0x0 (0xd42f5ff8))
> Jan 31 05:47:34 mail03 kernel:
> Jan 31 05:47:34 mail03 kernel:
> Jan 31 05:47:34 mail03 kernel: Code: 0f 0b ad 01 a4 cd 8c e0 59 58 c3 8d 76 00 31 c0 c3 8d b6 00
> Jan 31 05:47:39 mail03 kernel: search_by_key called without kernel lock held
> Jan 31 05:47:39 mail03 kernel: ------------[ cut here ]------------
> Jan 31 05:47:39 mail03 kernel: kernel BUG at journal.c:429!
> Jan 31 05:47:39 mail03 kernel: invalid operand: 0000
> Jan 31 05:47:39 mail03 kernel: nfsd lockd sunrpc pcnet32 mii ipt_state ipt_LOG ip_conntrack_ftp iptable_mangle iptable_nat ip_conntrack iptable_filter ip_tables st reiserfs usb-ohci usbcore
> Jan 31 05:47:39 mail03 kernel: CPU: 0
> Jan 31 05:47:39 mail03 kernel: EIP: 0010:[pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1844766/96] Not tainted
> Jan 31 05:47:39 mail03 kernel: EIP: 0010:[<e08c09e2>] Not tainted
> Jan 31 05:47:39 mail03 kernel: EFLAGS: 00010246
> Jan 31 05:47:39 mail03 kernel:
> Jan 31 05:47:39 mail03 kernel: EIP is at reiserfs_check_lock_depth [reiserfs] 0x22 (2.4.18-19.7.x.ysubigmem)
> Jan 31 05:47:39 mail03 kernel: eax: 00000000 ebx: dc770800 ecx: 00000000 edx: dbd49f44
> Jan 31 05:47:39 mail03 kernel: esi: 000028e5 edi: d4317d74 ebp: dc770800 esp: d4317cd4
> Jan 31 05:47:40 mail03 kernel: ds: 0018 es: 0018 ss: 0018
> Jan 31 05:47:40 mail03 kernel: Process nfsd (pid: 1460, stackpage=d4317000)
> Jan 31 05:47:40 mail03 kernel: Stack: e08ccd80 e08cc303 e08bb10a e08cc303 00000000 00000000 00000000 00000000
> Jan 31 05:47:40 mail03 kernel: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> Jan 31 05:47:40 mail03 kernel: 00000000 00000000 00000000 00000000 00000000 00001000 00000004 ce496880
> Jan 31 05:47:40 mail03 kernel: Call Trace: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1794688/96] MAX_KEY [reiserfs] 0xda0 (0xd4317cd4))
> Jan 31 05:47:40 mail03 kernel: Call Trace: [<e08ccd80>] MAX_KEY [reiserfs] 0xda0 (0xd4317cd4))
> Jan 31 05:47:40 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1797373/96] MAX_KEY [reiserfs] 0x323 (0xd4317cd8))
> Jan 31 05:47:41 mail03 kernel: [<e08cc303>] MAX_KEY [reiserfs] 0x323 (0xd4317cd8))
> Jan 31 05:47:41 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1867510/96] search_by_key [reiserfs] 0x5a (0xd4317cdc))
> Jan 31 05:47:41 mail03 kernel: [<e08bb10a>] search_by_key [reiserfs] 0x5a (0xd4317cdc))
> Jan 31 05:47:41 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1797373/96] MAX_KEY [reiserfs] 0x323 (0xd4317ce0))
> Jan 31 05:47:41 mail03 kernel: [<e08cc303>] MAX_KEY [reiserfs] 0x323 (0xd4317ce0))
> Jan 31 05:47:41 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926756/96] reiserfs_read_inode2 [reiserfs] 0x6c (0xd4317d40))
> Jan 31 05:47:41 mail03 kernel: [<e08ac99c>] reiserfs_read_inode2 [reiserfs] 0x6c (0xd4317d40))
> Jan 31 05:47:41 mail03 kernel: [get_new_inode+187/352] get_new_inode [kernel] 0xbb (0xd4317dc0))
> Jan 31 05:47:42 mail03 kernel: [<c015a56b>] get_new_inode [kernel] 0xbb (0xd4317dc0))
> Jan 31 05:47:42 mail03 kernel: [iget4+217/240] iget4 [kernel] 0xd9 (0xd4317dec))
> Jan 31 05:47:42 mail03 kernel: [<c015a7d9>] iget4 [kernel] 0xd9 (0xd4317dec))
> Jan 31 05:47:42 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926608/96] reiserfs_find_actor [reiserfs] 0x0 (0xd4317dfc))
> Jan 31 05:47:42 mail03 kernel: [<e08aca30>] reiserfs_find_actor [reiserfs] 0x0 (0xd4317dfc))
> Jan 31 05:47:42 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926540/96] reiserfs_iget [reiserfs] 0x24 (0xd4317e24))
> Jan 31 05:47:43 mail03 kernel: [<e08aca74>] reiserfs_iget [reiserfs] 0x24 (0xd4317e24))
> Jan 31 05:47:43 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926608/96] reiserfs_find_actor [reiserfs] 0x0 (0xd4317e30))
> Jan 31 05:47:44 mail03 kernel: [<e08aca30>] reiserfs_find_actor [reiserfs] 0x0 (0xd4317e30))
> Jan 31 05:47:44 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926354/96] reiserfs_fh_to_dentry [reiserfs] 0x6e (0xd4317e44))
> Jan 31 05:47:44 mail03 kernel: [<e08acb2e>] reiserfs_fh_to_dentry [reiserfs] 0x6e (0xd4317e44))
> Jan 31 05:47:44 mail03 kernel: [<e0b0af98>] nfsd_get_dentry [nfsd] 0x28 (0xd4317e80))
> Jan 31 05:47:44 mail03 kernel: [<e0b0b3ff>] find_fh_dentry [nfsd] 0x3f (0xd4317ea4))
> Jan 31 05:47:44 mail03 kernel: [<e0b0ba11>] fh_verify [nfsd] 0x271 (0xd4317ee0))
> Jan 31 05:47:44 mail03 kernel: [<e0afb7ff>] svc_udp_recvfrom [sunrpc] 0x19f (0xd4317f0c))
> Jan 31 05:47:44 mail03 kernel: [<e0b123c7>] nfsd3_proc_getattr [nfsd] 0x97 (0xd4317f3c))
> Jan 31 05:47:44 mail03 kernel: [<e0b1b204>] nfsd_procedures3 [nfsd] 0x24 (0xd4317f54))
> Jan 31 05:47:45 mail03 kernel: [<e0b09647>] nfsd_dispatch [nfsd] 0xb7 (0xd4317f60))
> Jan 31 05:47:45 mail03 kernel: [<e0b1ab9c>] nfsd_version3 [nfsd] 0x0 (0xd4317f78))
> Jan 31 05:47:45 mail03 kernel: [<e0afad23>] svc_process_Rsmp_720c86cf [sunrpc] 0x363 (0xd4317f7c))
> Jan 31 05:47:45 mail03 kernel: [<e0b1b204>] nfsd_procedures3 [nfsd] 0x24 (0xd4317f98))
> Jan 31 05:47:45 mail03 kernel: [<e0b1ab9c>] nfsd_version3 [nfsd] 0x0 (0xd4317f9c))
> Jan 31 05:47:45 mail03 kernel: [<e0b1abbc>] nfsd_program [nfsd] 0x0 (0xd4317fa0))
> Jan 31 05:47:45 mail03 kernel: [<e0b09420>] nfsd [nfsd] 0x240 (0xd4317fbc))
> Jan 31 05:47:45 mail03 kernel: [<e0b091e0>] nfsd [nfsd] 0x0 (0xd4317fe8))
> Jan 31 05:47:45 mail03 kernel: [kernel_thread+38/48] kernel_thread [kernel] 0x26 (0xd4317ff0))
> Jan 31 05:47:45 mail03 kernel: [<c0107286>] kernel_thread [kernel] 0x26 (0xd4317ff0))
> Jan 31 05:47:45 mail03 kernel: [<e0b091e0>] nfsd [nfsd] 0x0 (0xd4317ff8))
> Jan 31 05:47:45 mail03 kernel:
> Jan 31 05:47:45 mail03 kernel:
> Jan 31 05:47:46 mail03 kernel: Code: 0f 0b ad 01 a4 cd 8c e0 59 58 c3 8d 76 00 31 c0 c3 8d b6 00
Does this look familiar to anyone? Any suggestions as to what I need to
change?
Thanks,
John Dalbec
* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: Chris Mason @ 2003-01-31 14:55 UTC
To: John Dalbec; +Cc: reiserfs-list
On Fri, 2003-01-31 at 09:41, John Dalbec wrote:
> I'm trying to port Chris's data-logging patches to the Red Hat
> 2.4.18-19.7.x kernel. My first effort works fine on my workstation with
> ReiserFS and NFS, but not on the production server:
>
> > Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
This is a debugging check that shows our search_by_key function was
called without first taking the big kernel lock, and the trace below
shows it happened during a call to reiserfs_read_inode2. So, what you
need to do is put lock_kernel() calls into reiserfs_read_inode2, or more
likely into reiserfs_lookup.
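For readers following along, here is a minimal userspace sketch of the kind of check described above. It is only a model under stated assumptions: struct task, its lock_depth field (-1 meaning "BKL not held"), and every function below are stand-ins written for illustration, not the kernel or reiserfs sources, and the model returns an error where the real check ends in BUG().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a task: lock_depth is -1 when the
 * big kernel lock is not held and >= 0 while it is held. */
struct task { int lock_depth; };

static void lock_kernel(struct task *t)
{
    t->lock_depth++;
}

static void unlock_kernel(struct task *t)
{
    assert(t->lock_depth >= 0);   /* unlocking without holding is a bug */
    t->lock_depth--;
}

/* Model of the debugging check: 0 if the BKL is held, -1 otherwise.
 * (The real check oopses instead of returning an error.) */
static int check_lock_depth(const struct task *t, const char *caller)
{
    if (t->lock_depth < 0) {
        fprintf(stderr, "%s called without kernel lock held\n", caller);
        return -1;
    }
    return 0;
}

/* Stand-in for search_by_key: it only performs the lock check. */
static int search_by_key(struct task *t)
{
    return check_lock_depth(t, "search_by_key");
}

/* Stand-in for read_inode2 with a lock_kernel()/unlock_kernel() pair
 * wrapped around the search, as suggested above. */
static int read_inode2_locked(struct task *t)
{
    int ret;

    lock_kernel(t);
    ret = search_by_key(t);
    unlock_kernel(t);
    return ret;
}
```

Calling search_by_key() on a task whose lock_depth is -1 trips the check, while read_inode2_locked() passes it and leaves lock_depth where it started.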
But, as new kernels come out, I don't usually back port data logging
fixes to the old kernels. So the 2.4.18 data logging code is missing a
number of fixes the later code has. Which version of the data logging
code are you running on?
-chris
* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: John Dalbec @ 2003-01-31 15:43 UTC
To: Chris Mason; +Cc: reiserfs-list
Chris Mason wrote:
> On Fri, 2003-01-31 at 09:41, John Dalbec wrote:
>
>>I'm trying to port Chris's data-logging patches to the Red Hat
>>2.4.18-19.7.x kernel. My first effort works fine on my workstation with
>>ReiserFS and NFS, but not on the production server:
>>
>>
>>>Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
>>
>
> This is a debugging check that shows our search_by_key function was
> called without first taking the big kernel lock, and the trace below
> shows it happened during a call to reiserfs_read_inode2. So, what you
> need to do is put lock_kernel() calls into reiserfs_read_inode2, or more
> likely into reiserfs_lookup.
I don't see reiserfs_lookup in the stack trace, and it already calls
reiserfs_check_lock_depth. Why would I need lock_kernel there?
Red Hat's low-latency patch puts a conditional_schedule at the top of
search_by_key. Would that cause the kernel lock to be dropped? I see this
comment in the code:
/* The function is NOT SCHEDULE-SAFE! */
>
> But, as new kernels come out, I don't usually back port data logging
> fixes to the old kernels. So the 2.4.18 data logging code is missing a
> number of fixes the later code has. Which version of the data logging
> code are you running on?
>
> -chris
>
>
>
I started from your 2.4.20 patches and modified them to fit. The kernel
RPM I'm using starts from 2.4.19-rc1-ac1 and adds patches on top of
that. I've applied ReiserFS-pending patches 1-6 and 13 and your quota
patch for 2.4.19, followed by the data logging.
Thanks,
John
* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: Chris Mason @ 2003-01-31 16:06 UTC
To: John Dalbec; +Cc: reiserfs-list
On Fri, 2003-01-31 at 10:43, John Dalbec wrote:
> Chris Mason wrote:
> > On Fri, 2003-01-31 at 09:41, John Dalbec wrote:
> >
> >>I'm trying to port Chris's data-logging patches to the Red Hat
> >>2.4.18-19.7.x kernel. My first effort works fine on my workstation with
> >>ReiserFS and NFS, but not on the production server:
> >>
> >>
> >>>Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
> >>
> >
> > This is a debugging check that shows our search_by_key function was
> > called without first taking the big kernel lock, and the trace below
> > shows it happened during a call to reiserfs_read_inode2. So, what you
> > need to do is put lock_kernel() calls into reiserfs_read_inode2, or more
> > likely into reiserfs_lookup.
>
> I don't see reiserfs_lookup in the stack trace, and it already calls
> reiserfs_check_lock_depth. Why would I need lock_kernel there?
> Red Hat's low-latency patch puts a conditional_schedule at the top of
> search_by_key. Would that cause the kernel lock to be dropped? I see
> /* The function is NOT SCHEDULE-SAFE! */
>
Must be the iget4 path then, probably triggered by nfs. The locking
rules say the BKL is supposed to be held when you call read_inode. The
fix would be either finding the caller or just adding lock_kernel
calls to reiserfs_read_inode2. It is safe to nest them, so adding them
won't cause problems.
The BKL is dropped during a schedule, but taken again before returning
control to the calling function, so that low latency patch probably
isn't causing problems. I'm assuming they are using Andrew Morton's low
latency patch, which doesn't cause problems.
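The two properties mentioned here (nested lock_kernel calls are safe, and a reschedule drops and retakes the BKL without changing the nesting depth) can be sketched with a small userspace model. Everything below is an illustrative assumption: kernel_flag_held, struct task, and model_schedule are invented names, and real scheduling and spinning are reduced to flag updates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the single global lock behind the BKL. */
static bool kernel_flag_held;

/* Hypothetical per-task state: -1 means "BKL not held". */
struct task { int lock_depth; };

static void lock_kernel(struct task *t)
{
    if (++t->lock_depth == 0) {       /* only the outermost call takes it */
        assert(!kernel_flag_held);
        kernel_flag_held = true;
    }
}

static void unlock_kernel(struct task *t)
{
    assert(t->lock_depth >= 0);
    if (--t->lock_depth < 0)          /* only the outermost call drops it */
        kernel_flag_held = false;
}

/* Model of a reschedule: the lock is released while the task is off the
 * CPU and retaken before control returns; lock_depth is not modified. */
static void model_schedule(struct task *t)
{
    if (t->lock_depth >= 0)
        kernel_flag_held = false;     /* release on the way out */

    /* ... other tasks would run here ... */

    if (t->lock_depth >= 0)
        kernel_flag_held = true;      /* reacquire before returning */
}
```

In this model an inner lock_kernel()/unlock_kernel() pair inside a caller that already holds the lock only moves the counter, and a depth check made after model_schedule() returns sees the same value as before it.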
-chris
* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: John Dalbec @ 2003-01-31 21:28 UTC
To: Chris Mason; +Cc: reiserfs-list
Chris Mason wrote:
> On Fri, 2003-01-31 at 10:43, John Dalbec wrote:
>
>>Chris Mason wrote:
>>
>>>On Fri, 2003-01-31 at 09:41, John Dalbec wrote:
>>>
>>>
>>>>I'm trying to port Chris's data-logging patches to the Red Hat
>>>>2.4.18-19.7.x kernel. My first effort works fine on my workstation with
>>>>ReiserFS and NFS, but not on the production server:
>>>>
>>>>
>>>>
>>>>>Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
>>>>
>>>This is a debugging check that shows our search_by_key function was
>>>called without first taking the big kernel lock, and the trace below
>>>shows it happened during a call to reiserfs_read_inode2. So, what you
>>>need to do is put lock_kernel() calls into reiserfs_read_inode2, or more
>>>likely into reiserfs_lookup.
>>
>>I don't see reiserfs_lookup in the stack trace, and it already calls
>>reiserfs_check_lock_depth. Why would I need lock_kernel there?
>>Red Hat's low-latency patch puts a conditional_schedule at the top of
>>search_by_key. Would that cause the kernel lock to be dropped? I see
>>/* The function is NOT SCHEDULE-SAFE! */
>>
>
>
> Must be the iget4 path then, probably triggered by nfs. The locking
> rules say the BKL is supposed to be held when you call read_inode. The
> fix would be either finding the caller or just adding lock_kernel
> calls to reiserfs_read_inode2. It is safe to nest them, so adding them
> won't cause problems.
>
> The BKL is dropped during a schedule, but taken again before returning
> control to the calling function, so that low latency patch probably
> isn't causing problems. I'm assuming they are using Andrew Morton's low
> latency patch, which doesn't cause problems.
>
> -chris
>
>
>
The immediate caller is the "ReiserFS specific hack" in
fs/inode.c:get_inode signed <mason@suse.com>. Is the BKL supposed to be
held when get_inode is called? I don't see it documented either way in
Documentation/filesystems/Locking. It looks easier to add lock_kernel
calls there than in read_inode2 (assuming they're supposed to last
through the entire function call).
John
* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: Chris Mason @ 2003-02-03 16:41 UTC
To: John Dalbec; +Cc: reiserfs-list
On Fri, 2003-01-31 at 16:28, John Dalbec wrote:
> The immediate caller is the "ReiserFS specific hack" in
> fs/inode.c:get_inode signed <mason@suse.com>. Is the BKL supposed to be
> held when get_inode is called?
Traditionally, the BKL is supposed to be held when iget or iget4 is
called. RedHat might have patches that do away with that and simply
missed reiserfs, but it is more likely they have a patch to reduce BKL
use in NFS that missed the iget4 case.
So your two basic choices are taking the BKL in reiserfs_read_inode2, or
going into the nfsd source and putting lock_kernel()/unlock_kernel()
around the iget4 call. You
might want to double check to see if their source had the BKL in
reiserfs_read_inode2 before you started the data logging port.
If not, you should be able to reproduce the oops on an unmodified redhat
kernel (compiled with SMP on), and I'd appreciate it if you could send
them a bug report as well.
-chris
* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: John Dalbec @ 2003-02-04 18:01 UTC
To: Chris Mason; +Cc: reiserfs-list
Chris Mason wrote:
> On Fri, 2003-01-31 at 16:28, John Dalbec wrote:
>
>
>>The immediate caller is the "ReiserFS specific hack" in
>>fs/inode.c:get_inode signed <mason@suse.com>. Is the BKL supposed to be
>>held when get_inode is called?
>
>
> Traditionally, the BKL is supposed to be held when iget or iget4 is
> called. RedHat might have patches that do away with that and simply
> missed reiserfs, but it is more likely they have a patch to reduce BKL
> use in NFS that missed the iget4 case.
After inspecting nfsfh.c I think the "correct" fix is to grab the BKL in
find_fh_dentry before calling nfsd_get_dentry. There are other calls to
nfsd_get_dentry later in find_fh_dentry but the BKL is already held at
that point.
--- linux/fs/nfsd/nfsfh.c.orig Tue Feb 4 10:55:36 2003
+++ linux/fs/nfsd/nfsfh.c Tue Feb 4 10:58:19 2003
@@ -410,7 +410,9 @@
 	 */
  retry:
 	down(&sb->s_nfsd_free_path_sem);
+	lock_kernel();
 	result = nfsd_get_dentry(sb, datap, len, fhtype, 0);
+	unlock_kernel();
 	if (IS_ERR(result)
 	    || !(result->d_flags & DCACHE_NFSD_DISCONNECTED)
 	    || (!S_ISDIR(result->d_inode->i_mode) && ! needpath)) {
>
> So your two basic choices are adding the BKL to reiserfs_read_inode2, or
> going into the nfsd source and putting them around the iget4 call. You
> might want to double check to see if their source had the BKL in
> reiserfs_read_inode2 before you started the data logging port.
>
> If not, you should be able to reproduce the oops on an unmodified redhat
> kernel (compiled with SMP on), and I'd appreciate it if you could send
> them a bug report as well.
>
> -chris
>
>
>
* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: John Dalbec @ 2003-02-05 14:24 UTC
To: Chris Mason; +Cc: reiserfs-list
Chris Mason wrote:
> So your two basic choices are adding the BKL to reiserfs_read_inode2, or
> going into the nfsd source and putting them around the iget4 call. You
> might want to double check to see if their source had the BKL in
> reiserfs_read_inode2 before you started the data logging port.
It did not.
>
> If not, you should be able to reproduce the oops on an unmodified redhat
> kernel (compiled with SMP on), and I'd appreciate it if you could send
> them a bug report as well.
I cannot reproduce the oops on an unmodified Red Hat kernel because they
don't call reiserfs_check_lock_depth in search_by_key. That was added
by your data-logging patch.
>
> -chris
>
>
>
* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
From: Chris Mason @ 2003-02-05 16:00 UTC
To: John Dalbec; +Cc: reiserfs-list
On Wed, 2003-02-05 at 09:24, John Dalbec wrote:
> >
> > If not, you should be able to reproduce the oops on an unmodified redhat
> > kernel (compiled with SMP on), and I'd appreciate it if you could send
> > them a bug report as well.
>
> I cannot reproduce the oops on an unmodified Red Hat kernel because they
> don't call reiserfs_check_lock_depth in search_by_key. That was added
> by your data-logging patch.
>
Heh, I added a bunch of those during the 2.4.0-test porting, I guess I
hit an odd bug during data logging coding and thought I needed a few
more.
Even though the check is only introduced in the data logging code, the
non-data logging code needs the BKL in search_by_key all the time. So,
if you could please forward the details to redhat I'd appreciate it.
-chris