* Trying to port data-logging to RH 2.4.18-19.7.x kernel
@ 2003-01-31 14:41 John Dalbec
  2003-01-31 14:55 ` Chris Mason
  0 siblings, 1 reply; 9+ messages in thread
From: John Dalbec @ 2003-01-31 14:41 UTC (permalink / raw)
  To: reiserfs-list

I'm trying to port Chris's data-logging patches to the Red Hat 
2.4.18-19.7.x kernel.  My first effort works fine on my workstation with 
ReiserFS and NFS, but not on the production server:

> Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
> Jan 31 05:47:28 mail03 kernel: ------------[ cut here ]------------
> Jan 31 05:47:28 mail03 kernel: kernel BUG at journal.c:429!
> Jan 31 05:47:28 mail03 kernel: invalid operand: 0000
> Jan 31 05:47:28 mail03 kernel: nfsd lockd sunrpc pcnet32 mii ipt_state ipt_LOG ip_conntrack_ftp iptable_mangle iptable_nat ip_conntrack iptable_filter ip_tables st reiserfs usb-ohci usbcore
> Jan 31 05:47:28 mail03 kernel: CPU:    0
> Jan 31 05:47:29 mail03 kernel: EIP:    0010:[pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1844766/96]    Not tainted
> Jan 31 05:47:29 mail03 kernel: EIP:    0010:[<e08c09e2>]    Not tainted
> Jan 31 05:47:29 mail03 kernel: EFLAGS: 00010246
> Jan 31 05:47:29 mail03 kernel: 
> Jan 31 05:47:29 mail03 kernel: EIP is at reiserfs_check_lock_depth [reiserfs] 0x22 (2.4.18-19.7.x.ysubigmem)
> Jan 31 05:47:29 mail03 kernel: eax: 00000000   ebx: df857800   ecx: 00000001   edx: c030a648
> Jan 31 05:47:29 mail03 kernel: esi: 00002108   edi: d42f5d74   ebp: df857800   esp: d42f5cd4
> Jan 31 05:47:29 mail03 kernel: ds: 0018   es: 0018   ss: 0018
> Jan 31 05:47:30 mail03 kernel: Process nfsd (pid: 1465, stackpage=d42f5000)
> Jan 31 05:47:30 mail03 kernel: Stack: e08ccd80 e08cc303 e08bb10a e08cc303 00000000 00000000 00000000 00000000 
> Jan 31 05:47:30 mail03 kernel:        00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
> Jan 31 05:47:30 mail03 kernel:        00000000 00000000 00000000 00000000 00000000 00001000 00000004 d0cc9880 
> Jan 31 05:47:30 mail03 kernel: Call Trace: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1794688/96] MAX_KEY [reiserfs] 0xda0 (0xd42f5cd4))
> Jan 31 05:47:30 mail03 kernel: Call Trace: [<e08ccd80>] MAX_KEY [reiserfs] 0xda0 (0xd42f5cd4))
> Jan 31 05:47:30 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1797373/96] MAX_KEY [reiserfs] 0x323 (0xd42f5cd8))
> Jan 31 05:47:31 mail03 kernel: [<e08cc303>] MAX_KEY [reiserfs] 0x323 (0xd42f5cd8))
> Jan 31 05:47:31 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1867510/96] search_by_key [reiserfs] 0x5a (0xd42f5cdc))
> Jan 31 05:47:31 mail03 kernel: [<e08bb10a>] search_by_key [reiserfs] 0x5a (0xd42f5cdc))
> Jan 31 05:47:31 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1797373/96] MAX_KEY [reiserfs] 0x323 (0xd42f5ce0))
> Jan 31 05:47:31 mail03 kernel: [<e08cc303>] MAX_KEY [reiserfs] 0x323 (0xd42f5ce0))
> Jan 31 05:47:31 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926756/96] reiserfs_read_inode2 [reiserfs] 0x6c (0xd42f5d40))
> Jan 31 05:47:31 mail03 kernel: [<e08ac99c>] reiserfs_read_inode2 [reiserfs] 0x6c (0xd42f5d40))
> Jan 31 05:47:31 mail03 kernel: [get_new_inode+187/352] get_new_inode [kernel] 0xbb (0xd42f5dc0))
> Jan 31 05:47:31 mail03 kernel: [<c015a56b>] get_new_inode [kernel] 0xbb (0xd42f5dc0))
> Jan 31 05:47:31 mail03 kernel: [iget4+217/240] iget4 [kernel] 0xd9 (0xd42f5dec))
> Jan 31 05:47:31 mail03 kernel: [<c015a7d9>] iget4 [kernel] 0xd9 (0xd42f5dec))
> Jan 31 05:47:32 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926608/96] reiserfs_find_actor [reiserfs] 0x0 (0xd42f5dfc))
> Jan 31 05:47:32 mail03 kernel: [<e08aca30>] reiserfs_find_actor [reiserfs] 0x0 (0xd42f5dfc))
> Jan 31 05:47:32 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926540/96] reiserfs_iget [reiserfs] 0x24 (0xd42f5e24))
> Jan 31 05:47:32 mail03 kernel: [<e08aca74>] reiserfs_iget [reiserfs] 0x24 (0xd42f5e24))
> Jan 31 05:47:32 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926608/96] reiserfs_find_actor [reiserfs] 0x0 (0xd42f5e30))
> Jan 31 05:47:32 mail03 kernel: [<e08aca30>] reiserfs_find_actor [reiserfs] 0x0 (0xd42f5e30))
> Jan 31 05:47:32 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926354/96] reiserfs_fh_to_dentry [reiserfs] 0x6e (0xd42f5e44))
> Jan 31 05:47:32 mail03 kernel: [<e08acb2e>] reiserfs_fh_to_dentry [reiserfs] 0x6e (0xd42f5e44))
> Jan 31 05:47:33 mail03 kernel: [<e0b0af98>] nfsd_get_dentry [nfsd] 0x28 (0xd42f5e80))
> Jan 31 05:47:33 mail03 kernel: [<e0b0b3ff>] find_fh_dentry [nfsd] 0x3f (0xd42f5ea4))
> Jan 31 05:47:33 mail03 kernel: [<e0b0ba11>] fh_verify [nfsd] 0x271 (0xd42f5ee0))
> Jan 31 05:47:33 mail03 kernel: [<e0afb7ff>] svc_udp_recvfrom [sunrpc] 0x19f (0xd42f5f0c))
> Jan 31 05:47:33 mail03 kernel: [<e0b123c7>] nfsd3_proc_getattr [nfsd] 0x97 (0xd42f5f3c))
> Jan 31 05:47:33 mail03 kernel: [<e0b1b204>] nfsd_procedures3 [nfsd] 0x24 (0xd42f5f54))
> Jan 31 05:47:33 mail03 kernel: [<e0b09647>] nfsd_dispatch [nfsd] 0xb7 (0xd42f5f60))
> Jan 31 05:47:34 mail03 kernel: [<e0b1ab9c>] nfsd_version3 [nfsd] 0x0 (0xd42f5f78))
> Jan 31 05:47:34 mail03 kernel: [<e0afad23>] svc_process_Rsmp_720c86cf [sunrpc] 0x363 (0xd42f5f7c))
> Jan 31 05:47:34 mail03 kernel: [<e0b1b204>] nfsd_procedures3 [nfsd] 0x24 (0xd42f5f98))
> Jan 31 05:47:34 mail03 kernel: [<e0b1ab9c>] nfsd_version3 [nfsd] 0x0 (0xd42f5f9c))
> Jan 31 05:47:34 mail03 kernel: [<e0b1abbc>] nfsd_program [nfsd] 0x0 (0xd42f5fa0))
> Jan 31 05:47:34 mail03 kernel: [<e0b09420>] nfsd [nfsd] 0x240 (0xd42f5fbc))
> Jan 31 05:47:34 mail03 kernel: [<e0b091e0>] nfsd [nfsd] 0x0 (0xd42f5fe8))
> Jan 31 05:47:34 mail03 kernel: [kernel_thread+38/48] kernel_thread [kernel] 0x26 (0xd42f5ff0))
> Jan 31 05:47:34 mail03 kernel: [<c0107286>] kernel_thread [kernel] 0x26 (0xd42f5ff0))
> Jan 31 05:47:34 mail03 kernel: [<e0b091e0>] nfsd [nfsd] 0x0 (0xd42f5ff8))
> Jan 31 05:47:34 mail03 kernel: 
> Jan 31 05:47:34 mail03 kernel: 
> Jan 31 05:47:34 mail03 kernel: Code: 0f 0b ad 01 a4 cd 8c e0 59 58 c3 8d 76 00 31 c0 c3 8d b6 00 
> Jan 31 05:47:39 mail03 kernel:  search_by_key called without kernel lock held
> Jan 31 05:47:39 mail03 kernel: ------------[ cut here ]------------
> Jan 31 05:47:39 mail03 kernel: kernel BUG at journal.c:429!
> Jan 31 05:47:39 mail03 kernel: invalid operand: 0000
> Jan 31 05:47:39 mail03 kernel: nfsd lockd sunrpc pcnet32 mii ipt_state ipt_LOG ip_conntrack_ftp iptable_mangle iptable_nat ip_conntrack iptable_filter ip_tables st reiserfs usb-ohci usbcore
> Jan 31 05:47:39 mail03 kernel: CPU:    0
> Jan 31 05:47:39 mail03 kernel: EIP:    0010:[pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1844766/96]    Not tainted
> Jan 31 05:47:39 mail03 kernel: EIP:    0010:[<e08c09e2>]    Not tainted
> Jan 31 05:47:39 mail03 kernel: EFLAGS: 00010246
> Jan 31 05:47:39 mail03 kernel: 
> Jan 31 05:47:39 mail03 kernel: EIP is at reiserfs_check_lock_depth [reiserfs] 0x22 (2.4.18-19.7.x.ysubigmem)
> Jan 31 05:47:39 mail03 kernel: eax: 00000000   ebx: dc770800   ecx: 00000000   edx: dbd49f44
> Jan 31 05:47:39 mail03 kernel: esi: 000028e5   edi: d4317d74   ebp: dc770800   esp: d4317cd4
> Jan 31 05:47:40 mail03 kernel: ds: 0018   es: 0018   ss: 0018
> Jan 31 05:47:40 mail03 kernel: Process nfsd (pid: 1460, stackpage=d4317000)
> Jan 31 05:47:40 mail03 kernel: Stack: e08ccd80 e08cc303 e08bb10a e08cc303 00000000 00000000 00000000 00000000 
> Jan 31 05:47:40 mail03 kernel:        00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
> Jan 31 05:47:40 mail03 kernel:        00000000 00000000 00000000 00000000 00000000 00001000 00000004 ce496880 
> Jan 31 05:47:40 mail03 kernel: Call Trace: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1794688/96] MAX_KEY [reiserfs] 0xda0 (0xd4317cd4))
> Jan 31 05:47:40 mail03 kernel: Call Trace: [<e08ccd80>] MAX_KEY [reiserfs] 0xda0 (0xd4317cd4))
> Jan 31 05:47:40 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1797373/96] MAX_KEY [reiserfs] 0x323 (0xd4317cd8))
> Jan 31 05:47:41 mail03 kernel: [<e08cc303>] MAX_KEY [reiserfs] 0x323 (0xd4317cd8))
> Jan 31 05:47:41 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1867510/96] search_by_key [reiserfs] 0x5a (0xd4317cdc))
> Jan 31 05:47:41 mail03 kernel: [<e08bb10a>] search_by_key [reiserfs] 0x5a (0xd4317cdc))
> Jan 31 05:47:41 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1797373/96] MAX_KEY [reiserfs] 0x323 (0xd4317ce0))
> Jan 31 05:47:41 mail03 kernel: [<e08cc303>] MAX_KEY [reiserfs] 0x323 (0xd4317ce0))
> Jan 31 05:47:41 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926756/96] reiserfs_read_inode2 [reiserfs] 0x6c (0xd4317d40))
> Jan 31 05:47:41 mail03 kernel: [<e08ac99c>] reiserfs_read_inode2 [reiserfs] 0x6c (0xd4317d40))
> Jan 31 05:47:41 mail03 kernel: [get_new_inode+187/352] get_new_inode [kernel] 0xbb (0xd4317dc0))
> Jan 31 05:47:42 mail03 kernel: [<c015a56b>] get_new_inode [kernel] 0xbb (0xd4317dc0))
> Jan 31 05:47:42 mail03 kernel: [iget4+217/240] iget4 [kernel] 0xd9 (0xd4317dec))
> Jan 31 05:47:42 mail03 kernel: [<c015a7d9>] iget4 [kernel] 0xd9 (0xd4317dec))
> Jan 31 05:47:42 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926608/96] reiserfs_find_actor [reiserfs] 0x0 (0xd4317dfc))
> Jan 31 05:47:42 mail03 kernel: [<e08aca30>] reiserfs_find_actor [reiserfs] 0x0 (0xd4317dfc))
> Jan 31 05:47:42 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926540/96] reiserfs_iget [reiserfs] 0x24 (0xd4317e24))
> Jan 31 05:47:43 mail03 kernel: [<e08aca74>] reiserfs_iget [reiserfs] 0x24 (0xd4317e24))
> Jan 31 05:47:43 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926608/96] reiserfs_find_actor [reiserfs] 0x0 (0xd4317e30))
> Jan 31 05:47:44 mail03 kernel: [<e08aca30>] reiserfs_find_actor [reiserfs] 0x0 (0xd4317e30))
> Jan 31 05:47:44 mail03 kernel: [pcnet32:__insmod_pcnet32_O/lib/modules/2.4.18-19.7.x.ysubigmem/kern+-1926354/96] reiserfs_fh_to_dentry [reiserfs] 0x6e (0xd4317e44))
> Jan 31 05:47:44 mail03 kernel: [<e08acb2e>] reiserfs_fh_to_dentry [reiserfs] 0x6e (0xd4317e44))
> Jan 31 05:47:44 mail03 kernel: [<e0b0af98>] nfsd_get_dentry [nfsd] 0x28 (0xd4317e80))
> Jan 31 05:47:44 mail03 kernel: [<e0b0b3ff>] find_fh_dentry [nfsd] 0x3f (0xd4317ea4))
> Jan 31 05:47:44 mail03 kernel: [<e0b0ba11>] fh_verify [nfsd] 0x271 (0xd4317ee0))
> Jan 31 05:47:44 mail03 kernel: [<e0afb7ff>] svc_udp_recvfrom [sunrpc] 0x19f (0xd4317f0c))
> Jan 31 05:47:44 mail03 kernel: [<e0b123c7>] nfsd3_proc_getattr [nfsd] 0x97 (0xd4317f3c))
> Jan 31 05:47:44 mail03 kernel: [<e0b1b204>] nfsd_procedures3 [nfsd] 0x24 (0xd4317f54))
> Jan 31 05:47:45 mail03 kernel: [<e0b09647>] nfsd_dispatch [nfsd] 0xb7 (0xd4317f60))
> Jan 31 05:47:45 mail03 kernel: [<e0b1ab9c>] nfsd_version3 [nfsd] 0x0 (0xd4317f78))
> Jan 31 05:47:45 mail03 kernel: [<e0afad23>] svc_process_Rsmp_720c86cf [sunrpc] 0x363 (0xd4317f7c))
> Jan 31 05:47:45 mail03 kernel: [<e0b1b204>] nfsd_procedures3 [nfsd] 0x24 (0xd4317f98))
> Jan 31 05:47:45 mail03 kernel: [<e0b1ab9c>] nfsd_version3 [nfsd] 0x0 (0xd4317f9c))
> Jan 31 05:47:45 mail03 kernel: [<e0b1abbc>] nfsd_program [nfsd] 0x0 (0xd4317fa0))
> Jan 31 05:47:45 mail03 kernel: [<e0b09420>] nfsd [nfsd] 0x240 (0xd4317fbc))
> Jan 31 05:47:45 mail03 kernel: [<e0b091e0>] nfsd [nfsd] 0x0 (0xd4317fe8))
> Jan 31 05:47:45 mail03 kernel: [kernel_thread+38/48] kernel_thread [kernel] 0x26 (0xd4317ff0))
> Jan 31 05:47:45 mail03 kernel: [<c0107286>] kernel_thread [kernel] 0x26 (0xd4317ff0))
> Jan 31 05:47:45 mail03 kernel: [<e0b091e0>] nfsd [nfsd] 0x0 (0xd4317ff8))
> Jan 31 05:47:45 mail03 kernel: 
> Jan 31 05:47:45 mail03 kernel: 
> Jan 31 05:47:46 mail03 kernel: Code: 0f 0b ad 01 a4 cd 8c e0 59 58 c3 8d 76 00 31 c0 c3 8d b6 00 

Does this look familiar to anyone?  Any suggestions as to what I need to 
change?
Thanks,
John Dalbec



* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
  2003-01-31 14:41 Trying to port data-logging to RH 2.4.18-19.7.x kernel John Dalbec
@ 2003-01-31 14:55 ` Chris Mason
  2003-01-31 15:43   ` John Dalbec
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Mason @ 2003-01-31 14:55 UTC (permalink / raw)
  To: John Dalbec; +Cc: reiserfs-list

On Fri, 2003-01-31 at 09:41, John Dalbec wrote:
> I'm trying to port Chris's data-logging patches to the Red Hat 
> 2.4.18-19.7.x kernel.  My first effort works fine on my workstation with 
> ReiserFS and NFS, but not on the production server:
> 
> > Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held

This is a debugging check that shows our search_by_key function was
called without first taking the big kernel lock, and the trace below
shows it happened during a call to reiserfs_read_inode2.  So, what you
need to do is put lock_kernel() calls into reiserfs_read_inode2, or more
likely into reiserfs_lookup.
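
For illustration, the debugging check described above can be modelled in
userspace.  This is a minimal sketch, not the ReiserFS source: it assumes
the 2.4 convention that a task's lock_depth is -1 while the big kernel
lock is not held, and the names task_model and check_lock_depth_model are
hypothetical.  Where the real check hits BUG(), the model only returns 0.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the per-task BKL depth (assumption: -1 == not held). */
struct task_model {
    int lock_depth;
};

/*
 * Hypothetical model of the lock-depth check.
 * Returns 1 if the caller holds the lock, 0 if it does not.
 * The in-kernel check would print the same message and then BUG().
 */
static int check_lock_depth_model(const struct task_model *t, const char *caller)
{
    if (t->lock_depth < 0) {
        fprintf(stderr, "%s called without kernel lock held\n", caller);
        return 0;
    }
    return 1;
}
```

A task that never took the lock fails the check; a task at depth 0 or
deeper passes it.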

But, as new kernels come out, I don't usually back port data logging
fixes to the old kernels.  So the 2.4.18 data logging code is missing a
number of fixes the later code has.  Which version of the data logging
code are you running on?

-chris




* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
  2003-01-31 14:55 ` Chris Mason
@ 2003-01-31 15:43   ` John Dalbec
  2003-01-31 16:06     ` Chris Mason
  0 siblings, 1 reply; 9+ messages in thread
From: John Dalbec @ 2003-01-31 15:43 UTC (permalink / raw)
  To: Chris Mason; +Cc: reiserfs-list

Chris Mason wrote:
> On Fri, 2003-01-31 at 09:41, John Dalbec wrote:
> 
>>I'm trying to port Chris's data-logging patches to the Red Hat 
>>2.4.18-19.7.x kernel.  My first effort works fine on my workstation with 
>>ReiserFS and NFS, but not on the production server:
>>
>>
>>>Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
>>
> 
> This is a debugging check that shows our search_by_key function was
> called without first taking the big kernel lock, and the trace below
> shows it happened during a call to reiserfs_read_inode2.  So, what you
> need to do is put lock_kernel() calls into reiserfs_read_inode2, or more
> likely into reiserfs_lookup.

I don't see reiserfs_lookup in the stack trace, and it already calls 
reiserfs_check_lock_depth.  Why would I need lock_kernel there?
Red Hat's low-latency patch puts a conditional_schedule at the top of 
search_by_key.  Would that cause the kernel lock to be dropped?  I see
/* The function is NOT SCHEDULE-SAFE! */

> 
> But, as new kernels come out, I don't usually back port data logging
> fixes to the old kernels.  So the 2.4.18 data logging code is missing a
> number of fixes the later code has.  Which version of the data logging
> code are you running on?
> 
> -chris
> 
> 
> 

I started from your 2.4.20 patches and modified them to fit.  The kernel 
RPM I'm using starts from 2.4.19-rc1-ac1 and adds patches on top of 
that.  I've applied ReiserFS-pending patches 1-6 and 13 and your quota 
patch for 2.4.19, followed by the data logging.
Thanks,
John





* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
  2003-01-31 15:43   ` John Dalbec
@ 2003-01-31 16:06     ` Chris Mason
  2003-01-31 21:28       ` John Dalbec
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Mason @ 2003-01-31 16:06 UTC (permalink / raw)
  To: John Dalbec; +Cc: reiserfs-list

On Fri, 2003-01-31 at 10:43, John Dalbec wrote:
> Chris Mason wrote:
> > On Fri, 2003-01-31 at 09:41, John Dalbec wrote:
> > 
> >>I'm trying to port Chris's data-logging patches to the Red Hat 
> >>2.4.18-19.7.x kernel.  My first effort works fine on my workstation with 
> >>ReiserFS and NFS, but not on the production server:
> >>
> >>
> >>>Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
> >>
> > 
> > This is a debugging check that shows our search_by_key function was
> > called without first taking the big kernel lock, and the trace below
> > shows it happened during a call to reiserfs_read_inode2.  So, what you
> > need to do is put lock_kernel() calls into reiserfs_read_inode2, or more
> > likely into reiserfs_lookup.
> 
> I don't see reiserfs_lookup in the stack trace, and it already calls 
> reiserfs_check_lock_depth.  Why would I need lock_kernel there?
> Red Hat's low-latency patch puts a conditional_schedule at the top of 
> search_by_key.  Would that cause the kernel lock to be dropped?  I see
> /* The function is NOT SCHEDULE-SAFE! */
> 

Must be the iget4 path then, probably triggered by nfs.  The locking
rules say the BKL is supposed to be held when you call read_inode.  The
fix would be either finding the caller or just adding lock_kernel
calls to reiserfs_read_inode2.  It is safe to nest them, so adding them
won't cause problems.
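
Why nesting is harmless can be sketched with a small userspace model.
It assumes the 2.4 scheme in which lock_kernel() only takes the
underlying spinlock when the per-task depth goes from -1 to 0, and
unlock_kernel() only releases it when the depth drops back below 0.
The struct and function names here are hypothetical stand-ins, not
kernel code.

```c
#include <assert.h>

/* Per-task view of the BKL (assumption: depth -1 means "not held"). */
struct bkl_model {
    int lock_depth;  /* nesting depth, -1 when not held */
    int spin_held;   /* 1 while the underlying spinlock would be held */
};

/* Only the outermost call takes the spinlock. */
static void lock_kernel_model(struct bkl_model *t)
{
    if (++t->lock_depth == 0)
        t->spin_held = 1;
}

/* Only the outermost unlock releases the spinlock. */
static void unlock_kernel_model(struct bkl_model *t)
{
    if (--t->lock_depth < 0)
        t->spin_held = 0;
}

/*
 * Stand-in for a read_inode2 that takes the lock itself.
 * Returns what a lock-depth check inside it would see (1 == held).
 */
static int read_inode2_model(struct bkl_model *t)
{
    int held;

    lock_kernel_model(t);
    held = (t->lock_depth >= 0);
    unlock_kernel_model(t);
    return held;
}
```

Whether or not the caller already holds the lock, the check inside sees
it held, and the caller's state is unchanged on return.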

The BKL is dropped during a schedule, but taken again before returning
control to the calling function, so that low latency patch probably
isn't causing problems.  I'm assuming they are using Andrew Morton's low
latency patch, which doesn't cause problems.
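
The drop-and-retake behaviour can be modelled the same way.  This is a
rough userspace sketch under the assumption that the scheduler releases
the lock for any task with lock_depth >= 0 and reacquires it before the
task runs again, leaving lock_depth itself untouched.  The names
sched_task and schedule_model are hypothetical.

```c
#include <assert.h>

/* Per-task view of the BKL (assumption: depth -1 means "not held"). */
struct sched_task {
    int lock_depth;  /* nesting depth, preserved across a schedule */
    int spin_held;   /* 1 while the underlying spinlock would be held */
};

/*
 * Model of one trip through the scheduler.  *seen_while_asleep records
 * whether the task held the spinlock while it was switched out.
 */
static void schedule_model(struct sched_task *t, int *seen_while_asleep)
{
    if (t->lock_depth >= 0)
        t->spin_held = 0;          /* released on the way out */

    *seen_while_asleep = t->spin_held;

    if (t->lock_depth >= 0)
        t->spin_held = 1;          /* retaken before returning to the caller */
}
```

From the caller's point of view the lock is held both before and after
the call, so a lock-depth check placed after a conditional schedule still
passes if the lock was held on entry.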

-chris




* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
  2003-01-31 16:06     ` Chris Mason
@ 2003-01-31 21:28       ` John Dalbec
  2003-02-03 16:41         ` Chris Mason
  0 siblings, 1 reply; 9+ messages in thread
From: John Dalbec @ 2003-01-31 21:28 UTC (permalink / raw)
  To: Chris Mason; +Cc: reiserfs-list

Chris Mason wrote:
> On Fri, 2003-01-31 at 10:43, John Dalbec wrote:
> 
>>Chris Mason wrote:
>>
>>>On Fri, 2003-01-31 at 09:41, John Dalbec wrote:
>>>
>>>
>>>>I'm trying to port Chris's data-logging patches to the Red Hat 
>>>>2.4.18-19.7.x kernel.  My first effort works fine on my workstation with 
>>>>ReiserFS and NFS, but not on the production server:
>>>>
>>>>
>>>>
>>>>>Jan 31 05:47:28 mail03 kernel: search_by_key called without kernel lock held
>>>>
>>>This is a debugging check that shows our search_by_key function was
>>>called without first taking the big kernel lock, and the trace below
>>>shows it happened during a call to reiserfs_read_inode2.  So, what you
>>>need to do is put lock_kernel() calls into reiserfs_read_inode2, or more
>>>likely into reiserfs_lookup.
>>
>>I don't see reiserfs_lookup in the stack trace, and it already calls 
>>reiserfs_check_lock_depth.  Why would I need lock_kernel there?
>>Red Hat's low-latency patch puts a conditional_schedule at the top of 
>>search_by_key.  Would that cause the kernel lock to be dropped?  I see
>>/* The function is NOT SCHEDULE-SAFE! */
>>
> 
> 
> Must be the iget4 path then, probably triggered by nfs.  The locking
> rules say the BKL is supposed to be held when you call read_inode.  The
> fix would be either finding the caller or just adding lock_kernel
> calls to reiserfs_read_inode2.  It is safe to nest them, so adding them
> won't cause problems.
> 
> The BKL is dropped during a schedule, but taken again before returning
> control to the calling function, so that low latency patch probably
> isn't causing problems.  I'm assuming they are using Andrew Morton's low
> latency patch, which doesn't cause problems.
> 
> -chris
> 
> 
> 

The immediate caller is the "ReiserFS specific hack" in 
fs/inode.c:get_inode signed <mason@suse.com>.  Is the BKL supposed to be 
held when get_inode is called?  I don't see it documented either way in 
Documentation/filesystems/Locking.  It looks easier to add lock_kernel 
calls there than in read_inode2 (assuming they're supposed to last 
through the entire function call).
John



* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
  2003-01-31 21:28       ` John Dalbec
@ 2003-02-03 16:41         ` Chris Mason
  2003-02-04 18:01           ` John Dalbec
  2003-02-05 14:24           ` John Dalbec
  0 siblings, 2 replies; 9+ messages in thread
From: Chris Mason @ 2003-02-03 16:41 UTC (permalink / raw)
  To: John Dalbec; +Cc: reiserfs-list

On Fri, 2003-01-31 at 16:28, John Dalbec wrote:

> The immediate caller is the "ReiserFS specific hack" in 
> fs/inode.c:get_inode signed <mason@suse.com>.  Is the BKL supposed to be 
> held when get_inode is called?  

Traditionally, the BKL is supposed to be held when iget or iget4 is
called.  RedHat might have patches that do away with that and simply
missed reiserfs, but it is more likely they have a patch to reduce BKL
use in NFS that missed the iget4 case.

So your two basic choices are adding the BKL to reiserfs_read_inode2, or
going into the nfsd source and putting them around the iget4 call.  You
might want to double check to see if their source had the BKL in
reiserfs_read_inode2 before you started the data logging port.   

If not, you should be able to reproduce the oops on an unmodified redhat
kernel (compiled with SMP on), and I'd appreciate it if you could send
them a bug report as well.

-chris




* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
  2003-02-03 16:41         ` Chris Mason
@ 2003-02-04 18:01           ` John Dalbec
  2003-02-05 14:24           ` John Dalbec
  1 sibling, 0 replies; 9+ messages in thread
From: John Dalbec @ 2003-02-04 18:01 UTC (permalink / raw)
  To: Chris Mason; +Cc: reiserfs-list

Chris Mason wrote:
> On Fri, 2003-01-31 at 16:28, John Dalbec wrote:
> 
> 
>>The immediate caller is the "ReiserFS specific hack" in 
>>fs/inode.c:get_inode signed <mason@suse.com>.  Is the BKL supposed to be 
>>held when get_inode is called?  
> 
> 
> Traditionally, the BKL is supposed to be held when iget or iget4 is
> called.  RedHat might have patches that do away with that and simply
> missed reiserfs, but it is more likely they have a patch to reduce BKL
> use in NFS that missed the iget4 case.

After inspecting nfsfh.c I think the "correct" fix is to grab the BKL in 
find_fh_dentry before calling nfsd_get_dentry.  There are other calls to 
nfsd_get_dentry later in find_fh_dentry but the BKL is already held at 
that point.

--- linux/fs/nfsd/nfsfh.c.orig  Tue Feb  4 10:55:36 2003
+++ linux/fs/nfsd/nfsfh.c       Tue Feb  4 10:58:19 2003
@@ -410,7 +410,9 @@
          */
   retry:
         down(&sb->s_nfsd_free_path_sem);
+       lock_kernel();
         result = nfsd_get_dentry(sb, datap, len, fhtype, 0);
+       unlock_kernel();
         if (IS_ERR(result)
             || !(result->d_flags & DCACHE_NFSD_DISCONNECTED)
             || (!S_ISDIR(result->d_inode->i_mode) && ! needpath)) {

> 
> So your two basic choices are adding the BKL to reiserfs_read_inode2, or
> going into the nfsd source and putting them around the iget4 call.  You
> might want to double check to see if their source had the BKL in
> reiserfs_read_inode2 before you started the data logging port.   
> 
> If not, you should be able to reproduce the oops on an unmodified redhat
> kernel (compiled with SMP on), and I'd appreciate it if you could send
> them a bug report as well.
> 
> -chris
> 
> 
> 





* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
  2003-02-03 16:41         ` Chris Mason
  2003-02-04 18:01           ` John Dalbec
@ 2003-02-05 14:24           ` John Dalbec
  2003-02-05 16:00             ` Chris Mason
  1 sibling, 1 reply; 9+ messages in thread
From: John Dalbec @ 2003-02-05 14:24 UTC (permalink / raw)
  To: Chris Mason; +Cc: reiserfs-list

Chris Mason wrote:
> So your two basic choices are adding the BKL to reiserfs_read_inode2, or
> going into the nfsd source and putting them around the iget4 call.  You
> might want to double check to see if their source had the BKL in
> reiserfs_read_inode2 before you started the data logging port.

It did not.

> 
> If not, you should be able to reproduce the oops on an unmodified redhat
> kernel (compiled with SMP on), and I'd appreciate it if you could send
> them a bug report as well.

I cannot reproduce the oops on an unmodified Red Hat kernel because they 
don't call reiserfs_check_lock_depth in search_by_key.  That was added 
by your data-logging patch.

> 
> -chris
> 
> 
> 





* Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
  2003-02-05 14:24           ` John Dalbec
@ 2003-02-05 16:00             ` Chris Mason
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Mason @ 2003-02-05 16:00 UTC (permalink / raw)
  To: John Dalbec; +Cc: reiserfs-list

On Wed, 2003-02-05 at 09:24, John Dalbec wrote:

> > 
> > If not, you should be able to reproduce the oops on an unmodified redhat
> > kernel (compiled with SMP on), and I'd appreciate it if you could send
> > them a bug report as well.
> 
> I cannot reproduce the oops on an unmodified Red Hat kernel because they 
> don't call reiserfs_check_lock_depth in search_by_key.  That was added 
> by your data-logging patch.
> 

Heh, I added a bunch of those during the 2.4.0-test porting, I guess I
hit an odd bug during data logging coding and thought I needed a few
more.

Even though the check is only introduced in the data logging code, the
non-data logging code needs the BKL in search_by_key all the time.  So,
if you could please forward the details to redhat I'd appreciate it.

-chris


