* list corruption in locks_start_grace with 2.6.28-rc3
@ 2008-11-12 16:15 Jeff Moyer
[not found] ` <x498wrokjj8.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Jeff Moyer @ 2008-11-12 16:15 UTC (permalink / raw)
To: linux-nfs
Hi,
I'm doing some testing which involves roughly the following:
o mount a file system on the server
o start the nfs service
- mount the nfs-exported file system from a client
- perform a dd from the client
- umount the nfs-exported file system from a client
o stop the nfs service
o unmount the file system on the server
After several iterations of this, varying the number of nfsd threads
started, I get the attached backtrace. I've reproduced it twice, now.
Let me know if I can be of further help.
Cheers,
Jeff
------------[ cut here ]------------
WARNING: at lib/list_debug.c:26 __list_add+0x42/0x87()
list_add corruption. next->prev should be prev (ffffffffa044fcc0), but was ffffffffa0494b50. (next=ffffffffa0494b50).
Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs bridge stp bnep rfcomm l2cap bluetooth sunrpc iptable_filter ip_tables ip6table_filter ip6_tables x_tables ipv6 loop dm_round_robin dm_multipath sg sd_mod crc_t10dif qla2xxx ipmi_si ide_cd_mod pcspkr ipmi_msghandler bnx2 cdrom scsi_transport_fc serio_raw shpchp button cciss scsi_mod dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 7076, comm: rpc.nfsd Tainted: G W 2.6.28-rc3 #51
Call Trace:
[<ffffffff8023ca74>] warn_slowpath+0xae/0xd5
[<ffffffff802b036e>] ? kfree_debugcheck+0x11/0x2c
[<ffffffff802b036e>] ? kfree_debugcheck+0x11/0x2c
[<ffffffff802b036e>] ? kfree_debugcheck+0x11/0x2c
[<ffffffff802b040a>] ? cache_free_debugcheck+0x81/0x238
[<ffffffffa0058ee1>] ? ext3_htree_free_dir_info+0x19/0x1d [ext3]
[<ffffffff8032c2b4>] ? selinux_file_free_security+0x1e/0x20
[<ffffffff802b903d>] ? __fput+0x19e/0x1ab
[<ffffffff803638e3>] __list_add+0x42/0x87
[<ffffffffa0446629>] locks_start_grace+0x2e/0x41 [lockd]
[<ffffffffa04764d7>] nfs4_state_start+0x94/0x106 [nfsd]
[<ffffffffa045e5df>] nfsd_svc+0x6a/0x111 [nfsd]
[<ffffffffa045f2d9>] ? write_threads+0x0/0xb4 [nfsd]
[<ffffffffa045f343>] write_threads+0x6a/0xb4 [nfsd]
[<ffffffff804e30b5>] ? _spin_unlock+0x30/0x4b
[<ffffffff802d12d1>] ? simple_transaction_get+0x53/0xbf
[<ffffffff802d1320>] ? simple_transaction_get+0xa2/0xbf
[<ffffffffa045ea0c>] nfsctl_transaction_write+0x4c/0x7e [nfsd]
[<ffffffff802b6362>] ? fd_install+0x30/0x5f
[<ffffffff802b85b4>] vfs_write+0xae/0x137
[<ffffffff802b8701>] sys_write+0x47/0x70
[<ffffffff8020c00b>] system_call_fastpath+0x16/0x1b
---[ end trace 4eaa2a86a8e2da22 ]---
^ permalink raw reply	[flat|nested] 10+ messages in thread

[parent not found: <x498wrokjj8.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>]
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  [not found] ` <x498wrokjj8.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
@ 2008-11-20 20:37   ` J. Bruce Fields
  2008-11-21 15:28     ` Jeff Moyer
  0 siblings, 1 reply; 10+ messages in thread
From: J. Bruce Fields @ 2008-11-20 20:37 UTC (permalink / raw)
To: Jeff Moyer; +Cc: linux-nfs

On Wed, Nov 12, 2008 at 11:15:23AM -0500, Jeff Moyer wrote:
> Hi,
>
> I'm doing some testing which involves roughly the following:
>
> o mount a file system on the server
> o start the nfs service
>   - mount the nfs-exported file system from a client
>   - perform a dd from the client
>   - umount the nfs-exported file system from a client
> o stop the nfs service
> o unmount the file system on the server
>
> After several iterations of this, varying the number of nfsd threads
> started, I get the attached backtrace.  I've reproduced it twice, now.
>
> Let me know if I can be of further help.

Apologies for the delay, and thanks for the report.  Does the following
help?  (Untested).

--b.

commit 77b810d52cc07212c79848b98bb992f0541afefb
Author: J. Bruce Fields <bfields@citi.umich.edu>
Date:   Thu Nov 20 14:36:17 2008 -0600

    nfsd: clean up grace period on early exit

    If nfsd was shut down before the grace period ended, we could end up
    with a freed object still on grace_list.  Thanks to Jeff Moyer for
    reporting the resulting list corruption warnings.

    Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
    Cc: Jeff Moyer <jmoyer@redhat.com>

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index c631a83..56b0767 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -181,6 +181,7 @@ lockd(void *vrqstp)
 	}
 	flush_signals(current);
 	cancel_delayed_work_sync(&grace_period_end);
+	locks_end_grace(&lockd_manager);
 	if (nlmsvc_ops)
 		nlmsvc_invalidate_all();
 	nlm_shutdown_hosts();
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b0bebc5..1a052ac 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3261,6 +3261,7 @@ nfs4_state_shutdown(void)
 {
 	cancel_rearming_delayed_workqueue(laundry_wq, &laundromat_work);
 	destroy_workqueue(laundry_wq);
+	locks_end_grace(&nfsd4_manager);
 	nfs4_lock_state();
 	nfs4_release_reclaim();
 	__nfs4_state_shutdown();

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  2008-11-20 20:37 ` J. Bruce Fields
@ 2008-11-21 15:28   ` Jeff Moyer
  [not found]     ` <x49ej15gktp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Moyer @ 2008-11-21 15:28 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs

"J. Bruce Fields" <bfields@fieldses.org> writes:

> On Wed, Nov 12, 2008 at 11:15:23AM -0500, Jeff Moyer wrote:
>> Hi,
>>
>> I'm doing some testing which involves roughly the following:
>>
>> o mount a file system on the server
>> o start the nfs service
>>   - mount the nfs-exported file system from a client
>>   - perform a dd from the client
>>   - umount the nfs-exported file system from a client
>> o stop the nfs service
>> o unmount the file system on the server
>>
>> After several iterations of this, varying the number of nfsd threads
>> started, I get the attached backtrace.  I've reproduced it twice, now.
>>
>> Let me know if I can be of further help.
>
> Apologies for the delay, and thanks for the report.  Does the following
> help?  (Untested).

I get a new and different backtrace with this patch applied.  ;)
I'm testing with 2.6.28-rc5, fyi.

static inline void __module_get(struct module *module)
{
	if (module) {
		BUG_ON(module_refcount(module) == 0);   <------------
		local_inc(&module->ref[get_cpu()].count);
		put_cpu();
	}
}

Called from net/sunrpc/svcexport.c:svc_recv:687

	} else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
		struct svc_xprt *newxpt;
		newxpt = xprt->xpt_ops->xpo_accept(xprt);
		if (newxpt) {
			/*
			 * We know this module_get will succeed because the
			 * listener holds a reference too
			 */
			__module_get(newxpt->xpt_class->xcl_owner);

Cheers,
Jeff

------------[ cut here ]------------
kernel BUG at include/linux/module.h:394!
invalid opcode: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:04.6/local_cpus
Dumping ftrace buffer:
   (ftrace buffer empty)
CPU 0
Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs bridge stp bnep rfcomm l2cap bluetooth sunrpc iptable_filter ip_tables ip6table_filter ip6_tables x_tables ipv6 loop dm_round_robin dm_multipath sg sd_mod crc_t10dif ide_cd_mod cdrom bnx2 serio_raw ipmi_si pcspkr qla2xxx ipmi_msghandler scsi_transport_fc button dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod shpchp cciss scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 5733, comm: nfsd Tainted: G        W  2.6.28-rc5 #56
RIP: 0010:[<ffffffffa03695a5>]  [<ffffffffa03695a5>] svc_recv+0x41f/0x7a1 [sunrpc]
RSP: 0018:ffff8802148b3e60  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffffa0382600 RCX: 0000000000000000
RDX: 0000000000007f80 RSI: ffff8802148b3d70 RDI: ffffffffa0382600
RBP: ffff8802148b3ef0 R08: ffff88021a1f1048 R09: 0000000000000000
R10: 0000000000000000 R11: ffff8802148b3c30 R12: ffff8802195f9a50
R13: ffff8802195f8000 R14: ffff88021c5904f0 R15: ffff8802299b45b0
FS:  0000000000000000(0000) GS:ffffffff80855a00(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f4d05281000 CR3: 000000021f5cc000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process nfsd (pid: 5733, threadinfo ffff8802148b2000, task ffff8802194c8040)
Stack:
 00000000195f8000 0000000000000000 000000000036ee80 ffff88021ed49930
 ffff8802191eb3d8 ffffffffa0487118 0000000000000000 ffff8802194c8040
 ffffffff80238e07 0000000000100100 0000000000200200 ffffffff804e2471
Call Trace:
 [<ffffffff80238e07>] ? default_wake_function+0x0/0xf
 [<ffffffff804e2471>] ? __mutex_unlock_slowpath+0x11e/0x127
 [<ffffffffa045d773>] nfsd+0xed/0x295 [nfsd]
 [<ffffffffa045d686>] ? nfsd+0x0/0x295 [nfsd]
 [<ffffffffa045d686>] ? nfsd+0x0/0x295 [nfsd]
 [<ffffffff802506a8>] kthread+0x49/0x76
 [<ffffffff8020d1d9>] child_rip+0xa/0x11
 [<ffffffff8020c6c8>] ? restore_args+0x0/0x30
 [<ffffffff8025065f>] ? kthread+0x0/0x76
 [<ffffffff8020d1cf>] ? child_rip+0x0/0x11
Code: 08 4c 89 f7 ff 50 08 48 85 c0 49 89 c7 0f 84 71 01 00 00 48 8b 00 48 8b 58 08 48 85 db 74 4e 48 89 df e8 17 9b ef df 85 c0 75 04 <0f> 0b eb fe bf 01 00 00 00 e8 9c d6 17 e0 e8 d4 a4 ff df 89 c0
RIP  [<ffffffffa03695a5>] svc_recv+0x41f/0x7a1 [sunrpc]
 RSP <ffff8802148b3e60>
---[ end trace 4eaa2a86a8e2da22 ]---

^ permalink raw reply	[flat|nested] 10+ messages in thread
[parent not found: <x49ej15gktp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>]
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  [not found] ` <x49ej15gktp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
@ 2008-11-22  0:54   ` J. Bruce Fields
  2008-11-22 14:22     ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: J. Bruce Fields @ 2008-11-22  0:54 UTC (permalink / raw)
To: Jeff Moyer; +Cc: linux-nfs

On Fri, Nov 21, 2008 at 10:28:18AM -0500, Jeff Moyer wrote:
> "J. Bruce Fields" <bfields@fieldses.org> writes:
>
> > On Wed, Nov 12, 2008 at 11:15:23AM -0500, Jeff Moyer wrote:
> >> Hi,
> >>
> >> I'm doing some testing which involves roughly the following:
> >>
> >> o mount a file system on the server
> >> o start the nfs service
> >>   - mount the nfs-exported file system from a client
> >>   - perform a dd from the client
> >>   - umount the nfs-exported file system from a client
> >> o stop the nfs service
> >> o unmount the file system on the server
> >>
> >> After several iterations of this, varying the number of nfsd threads
> >> started, I get the attached backtrace.  I've reproduced it twice, now.
> >>
> >> Let me know if I can be of further help.
> >
> > Apologies for the delay, and thanks for the report.  Does the following
> > help?  (Untested).
>
> I get a new and different backtrace with this patch applied.  ;)
> I'm testing with 2.6.28-rc5, fyi.

Thanks for the testing....

> static inline void __module_get(struct module *module)
> {
> 	if (module) {
> 		BUG_ON(module_refcount(module) == 0);   <------------
> 		local_inc(&module->ref[get_cpu()].count);
> 		put_cpu();
> 	}
> }
>
> Called from net/sunrpc/svcexport.c:svc_recv:687

You meant svc_xprt.c.  OK.

> } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
> 	struct svc_xprt *newxpt;
> 	newxpt = xprt->xpt_ops->xpo_accept(xprt);
> 	if (newxpt) {
> 		/*
> 		 * We know this module_get will succeed because the
> 		 * listener holds a reference too
> 		 */

So clearly the assumption stated in the comment is wrong.

I can't see any relationship between this and the previous bug, but
perhaps it was covering this up somehow.

> 		__module_get(newxpt->xpt_class->xcl_owner);

I don't see the problem yet, but I'll look some more....

--b.

> Cheers,
> Jeff
>
> ------------[ cut here ]------------
> kernel BUG at include/linux/module.h:394!
> invalid opcode: 0000 [#1] PREEMPT SMP
> last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:04.6/local_cpus
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> CPU 0
> Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs bridge stp bnep rfcomm l2cap bluetooth sunrpc iptable_filter ip_tables ip6table_filter ip6_tables x_tables ipv6 loop dm_round_robin dm_multipath sg sd_mod crc_t10dif ide_cd_mod cdrom bnx2 serio_raw ipmi_si pcspkr qla2xxx ipmi_msghandler scsi_transport_fc button dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod shpchp cciss scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
> Pid: 5733, comm: nfsd Tainted: G        W  2.6.28-rc5 #56
> RIP: 0010:[<ffffffffa03695a5>]  [<ffffffffa03695a5>] svc_recv+0x41f/0x7a1 [sunrpc]
> RSP: 0018:ffff8802148b3e60  EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffffffffa0382600 RCX: 0000000000000000
> RDX: 0000000000007f80 RSI: ffff8802148b3d70 RDI: ffffffffa0382600
> RBP: ffff8802148b3ef0 R08: ffff88021a1f1048 R09: 0000000000000000
> R10: 0000000000000000 R11: ffff8802148b3c30 R12: ffff8802195f9a50
> R13: ffff8802195f8000 R14: ffff88021c5904f0 R15: ffff8802299b45b0
> FS:  0000000000000000(0000) GS:ffffffff80855a00(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00007f4d05281000 CR3: 000000021f5cc000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process nfsd (pid: 5733, threadinfo ffff8802148b2000, task ffff8802194c8040)
> Stack:
>  00000000195f8000 0000000000000000 000000000036ee80 ffff88021ed49930
>  ffff8802191eb3d8 ffffffffa0487118 0000000000000000 ffff8802194c8040
>  ffffffff80238e07 0000000000100100 0000000000200200 ffffffff804e2471
> Call Trace:
>  [<ffffffff80238e07>] ? default_wake_function+0x0/0xf
>  [<ffffffff804e2471>] ? __mutex_unlock_slowpath+0x11e/0x127
>  [<ffffffffa045d773>] nfsd+0xed/0x295 [nfsd]
>  [<ffffffffa045d686>] ? nfsd+0x0/0x295 [nfsd]
>  [<ffffffffa045d686>] ? nfsd+0x0/0x295 [nfsd]
>  [<ffffffff802506a8>] kthread+0x49/0x76
>  [<ffffffff8020d1d9>] child_rip+0xa/0x11
>  [<ffffffff8020c6c8>] ? restore_args+0x0/0x30
>  [<ffffffff8025065f>] ? kthread+0x0/0x76
>  [<ffffffff8020d1cf>] ? child_rip+0x0/0x11
> Code: 08 4c 89 f7 ff 50 08 48 85 c0 49 89 c7 0f 84 71 01 00 00 48 8b 00 48 8b 58 08 48 85 db 74 4e 48 89 df e8 17 9b ef df 85 c0 75 04 <0f> 0b eb fe bf 01 00 00 00 e8 9c d6 17 e0 e8 d4 a4 ff df 89 c0
> RIP  [<ffffffffa03695a5>] svc_recv+0x41f/0x7a1 [sunrpc]
>  RSP <ffff8802148b3e60>
> ---[ end trace 4eaa2a86a8e2da22 ]---

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  2008-11-22  0:54 ` J. Bruce Fields
@ 2008-11-22 14:22   ` Jeff Layton
  [not found]     ` <20081122092237.1ab81cdb-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2008-11-22 14:22 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Jeff Moyer, linux-nfs

On Fri, 21 Nov 2008 19:54:00 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Fri, Nov 21, 2008 at 10:28:18AM -0500, Jeff Moyer wrote:
> > "J. Bruce Fields" <bfields@fieldses.org> writes:
> >
> > > On Wed, Nov 12, 2008 at 11:15:23AM -0500, Jeff Moyer wrote:
> > >> Hi,
> > >>
> > >> I'm doing some testing which involves roughly the following:
> > >>
> > >> o mount a file system on the server
> > >> o start the nfs service
> > >>   - mount the nfs-exported file system from a client
> > >>   - perform a dd from the client
> > >>   - umount the nfs-exported file system from a client
> > >> o stop the nfs service
> > >> o unmount the file system on the server
> > >>
> > >> After several iterations of this, varying the number of nfsd threads
> > >> started, I get the attached backtrace.  I've reproduced it twice, now.
> > >>
> > >> Let me know if I can be of further help.
> > >
> > > Apologies for the delay, and thanks for the report.  Does the following
> > > help?  (Untested).
> >
> > I get a new and different backtrace with this patch applied.  ;)
> > I'm testing with 2.6.28-rc5, fyi.
>
> Thanks for the testing....
>
> > static inline void __module_get(struct module *module)
> > {
> > 	if (module) {
> > 		BUG_ON(module_refcount(module) == 0);   <------------
> > 		local_inc(&module->ref[get_cpu()].count);
> > 		put_cpu();
> > 	}
> > }
> >
> > Called from net/sunrpc/svcexport.c:svc_recv:687
>
> You meant svc_xprt.c.  OK.
>
> > } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
> > 	struct svc_xprt *newxpt;
> > 	newxpt = xprt->xpt_ops->xpo_accept(xprt);
> > 	if (newxpt) {
> > 		/*
> > 		 * We know this module_get will succeed because the
> > 		 * listener holds a reference too
> > 		 */
>
> So clearly the assumption stated in the comment is wrong.
>
> I can't see any relationship between this and the previous bug, but
> perhaps it was covering this up somehow.
>
> > 		__module_get(newxpt->xpt_class->xcl_owner);
>
> I don't see the problem yet, but I'll look some more....
>

FWIW, I've noticed some problems with refcounting when starting and
stopping nfsd.  When you bring it up and take it back down again
repeatedly (i.e. run "rpc.nfsd 1" and "rpc.nfsd 0"), you'll lose 2
sunrpc module refs on each cycle.

I suspect the problem Jeff is hitting is due to that.  Maybe he was just
reliably crashing before it got to 0 before.  It's on my to-do list once
I get some other things off my plate.  If someone wants to track it down
first, be my guest :)

I have a little more info in this RHBZ, but haven't had time to nail it
down yet:

https://bugzilla.redhat.com/show_bug.cgi?id=464123#c10

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread
[parent not found: <20081122092237.1ab81cdb-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>]
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  [not found] ` <20081122092237.1ab81cdb-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
@ 2008-11-22 20:48   ` Tom Tucker
  2008-11-22 23:52     ` Tom Tucker
  0 siblings, 1 reply; 10+ messages in thread
From: Tom Tucker @ 2008-11-22 20:48 UTC (permalink / raw)
To: Jeff Layton; +Cc: J. Bruce Fields, Jeff Moyer, linux-nfs

So I think I know what's going on here. The svc_create_xprt function
takes a reference on the module that implements the transport and
svc_xprt_free releases it.

The svc_xprt_free function is called from svc_xprt_put when the kref
goes to zero. nfsd and other services will put any transports they've
created when unloaded.

The issue is that the "built in" transports of TCP and UDP are not
created with svc_create_xprt and therefore the initial transport module
reference is not taken. So when services exit, the sunrpc module
reference count is getting incorrectly decremented (twice), once for TCP
and once for UDP.

What I don't know is what changed to cause this to happen. These
transports have always been created by svc_addsock and that hasn't
changed. Maybe xcl_owner was NULL for these transports initially?

I'll dig around and see what I can find out.

Tom

Jeff Layton wrote:
> On Fri, 21 Nov 2008 19:54:00 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
>
>> On Fri, Nov 21, 2008 at 10:28:18AM -0500, Jeff Moyer wrote:
>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>
>>>> On Wed, Nov 12, 2008 at 11:15:23AM -0500, Jeff Moyer wrote:
>>>>> Hi,
>>>>>
>>>>> I'm doing some testing which involves roughly the following:
>>>>>
>>>>> o mount a file system on the server
>>>>> o start the nfs service
>>>>>   - mount the nfs-exported file system from a client
>>>>>   - perform a dd from the client
>>>>>   - umount the nfs-exported file system from a client
>>>>> o stop the nfs service
>>>>> o unmount the file system on the server
>>>>>
>>>>> After several iterations of this, varying the number of nfsd threads
>>>>> started, I get the attached backtrace.  I've reproduced it twice, now.
>>>>>
>>>>> Let me know if I can be of further help.
>>>> Apologies for the delay, and thanks for the report.  Does the following
>>>> help?  (Untested).
>>> I get a new and different backtrace with this patch applied.  ;)
>>> I'm testing with 2.6.28-rc5, fyi.
>> Thanks for the testing....
>>
>>> static inline void __module_get(struct module *module)
>>> {
>>> 	if (module) {
>>> 		BUG_ON(module_refcount(module) == 0);   <------------
>>> 		local_inc(&module->ref[get_cpu()].count);
>>> 		put_cpu();
>>> 	}
>>> }
>>>
>>> Called from net/sunrpc/svcexport.c:svc_recv:687
>> You meant svc_xprt.c.  OK.
>>
>>> } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
>>> 	struct svc_xprt *newxpt;
>>> 	newxpt = xprt->xpt_ops->xpo_accept(xprt);
>>> 	if (newxpt) {
>>> 		/*
>>> 		 * We know this module_get will succeed because the
>>> 		 * listener holds a reference too
>>> 		 */
>> So clearly the assumption stated in the comment is wrong.
>>
>> I can't see any relationship between this and the previous bug, but
>> perhaps it was covering this up somehow.
>>
>>> 		__module_get(newxpt->xpt_class->xcl_owner);
>> I don't see the problem yet, but I'll look some more....
>>
>
> FWIW, I've noticed some problems with refcounting when starting and
> stopping nfsd. When you bring it up and take it back down again
> repeatedly (i.e. run "rpc.nfsd 1" and "rpc.nfsd 0"), you'll lose 2
> sunrpc module refs on each cycle.
>
> I suspect the problem Jeff is hitting is due to that. Maybe he was just
> reliably crashing before it got to 0 before. It's on my to-do list once
> I get some other things off my plate. If someone wants to track it down
> first, be my guest :)
>
> I have a little more info in this RHBZ, but haven't had time to nail it
> down yet:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=464123#c10
>

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  2008-11-22 20:48 ` Tom Tucker
@ 2008-11-22 23:52   ` Tom Tucker
  2008-11-23  1:00     ` Jeff Layton
  2008-11-24 15:26     ` Jeff Moyer
  0 siblings, 2 replies; 10+ messages in thread
From: Tom Tucker @ 2008-11-22 23:52 UTC (permalink / raw)
To: Jeff Layton; +Cc: J. Bruce Fields, Jeff Moyer, linux-nfs

Jeff M/L:

Could you guys confirm that this patch fixes the problem? I was
able to reproduce Jeff Layton's problem and this patch resolved
the under-reference condition for me.

Thanks,

The svc_addsock function adds transport instances without taking a
reference on the sunrpc.ko module, however, the generic transport
destruction code drops a reference when the transport instance is
destroyed. Add a try_module_get call to the svc_addsock function for
transports added by this function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
---

 net/sunrpc/svcsock.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 95293f5..a1951dc 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1183,7 +1183,11 @@ int svc_addsock(struct svc_serv *serv,
 	else if (so->state > SS_UNCONNECTED)
 		err = -EISCONN;
 	else {
-		svsk = svc_setup_socket(serv, so, &err, SVC_SOCK_DEFAULTS);
+		if (!try_module_get(THIS_MODULE))
+			err = -ENOENT;
+		else
+			svsk = svc_setup_socket(serv, so, &err,
+						SVC_SOCK_DEFAULTS);
 		if (svsk) {
 			struct sockaddr_storage addr;
 			struct sockaddr *sin = (struct sockaddr *)&addr;
@@ -1196,7 +1200,8 @@ int svc_addsock(struct svc_serv *serv,
 			spin_unlock_bh(&serv->sv_lock);
 			svc_xprt_received(&svsk->sk_xprt);
 			err = 0;
-		}
+		} else
+			module_put(THIS_MODULE);
 	}
 	if (err) {
 		sockfd_put(so);

Tom Tucker wrote:
>
> So I think I know what's going on here. The svc_create_xprt function
> takes a reference on the module that implements the transport and
> svc_xprt_free releases it.
>
> The svc_xprt_free function is called from svc_xprt_put when the kref
> goes to zero. nfsd and other services will put any transports they've
> created when unloaded.
>
> The issue is that the "built in" transports of TCP and UDP are not
> created with svc_create_xprt and therefore the initial transport
> module reference is not taken. So when services exit, the sunrpc
> module reference count is getting incorrectly decremented (twice),
> once for TCP and once for UDP.
>
> What I don't know is what changed to cause this to happen. These
> transports have always been created by svc_addsock and that hasn't
> changed. Maybe xcl_owner was NULL for these transports initially?
>
> I'll dig around and see what I can find out.
>
> Tom
>
> Jeff Layton wrote:
>> On Fri, 21 Nov 2008 19:54:00 -0500
>> "J. Bruce Fields" <bfields@fieldses.org> wrote:
>>
>>> On Fri, Nov 21, 2008 at 10:28:18AM -0500, Jeff Moyer wrote:
>>>> "J. Bruce Fields" <bfields@fieldses.org> writes:
>>>>
>>>>> On Wed, Nov 12, 2008 at 11:15:23AM -0500, Jeff Moyer wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm doing some testing which involves roughly the following:
>>>>>>
>>>>>> o mount a file system on the server
>>>>>> o start the nfs service
>>>>>>   - mount the nfs-exported file system from a client
>>>>>>   - perform a dd from the client
>>>>>>   - umount the nfs-exported file system from a client
>>>>>> o stop the nfs service
>>>>>> o unmount the file system on the server
>>>>>>
>>>>>> After several iterations of this, varying the number of nfsd threads
>>>>>> started, I get the attached backtrace.  I've reproduced it twice,
>>>>>> now.
>>>>>>
>>>>>> Let me know if I can be of further help.
>>>>> Apologies for the delay, and thanks for the report.  Does the
>>>>> following
>>>>> help?  (Untested).
>>>> I get a new and different backtrace with this patch applied.  ;)
>>>> I'm testing with 2.6.28-rc5, fyi.
>>> Thanks for the testing....
>>>
>>>> static inline void __module_get(struct module *module)
>>>> {
>>>> 	if (module) {
>>>> 		BUG_ON(module_refcount(module) == 0);   <------------
>>>> 		local_inc(&module->ref[get_cpu()].count);
>>>> 		put_cpu();
>>>> 	}
>>>> }
>>>>
>>>> Called from net/sunrpc/svcexport.c:svc_recv:687
>>> You meant svc_xprt.c.  OK.
>>>
>>>> } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
>>>> 	struct svc_xprt *newxpt;
>>>> 	newxpt = xprt->xpt_ops->xpo_accept(xprt);
>>>> 	if (newxpt) {
>>>> 		/*
>>>> 		 * We know this module_get will succeed because the
>>>> 		 * listener holds a reference too
>>>> 		 */
>>> So clearly the assumption stated in the comment is wrong.
>>>
>>> I can't see any relationship between this and the previous bug, but
>>> perhaps it was covering this up somehow.
>>>
>>>> 		__module_get(newxpt->xpt_class->xcl_owner);
>>> I don't see the problem yet, but I'll look some more....
>>>
>>
>> FWIW, I've noticed some problems with refcounting when starting and
>> stopping nfsd. When you bring it up and take it back down again
>> repeatedly (i.e. run "rpc.nfsd 1" and "rpc.nfsd 0"), you'll lose 2
>> sunrpc module refs on each cycle.
>>
>> I suspect the problem Jeff is hitting is due to that. Maybe he was just
>> reliably crashing before it got to 0 before. It's on my to-do list once
>> I get some other things off my plate. If someone wants to track it down
>> first, be my guest :)
>>
>> I have a little more info in this RHBZ, but haven't had time to nail it
>> down yet:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=464123#c10
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  2008-11-22 23:52 ` Tom Tucker
@ 2008-11-23  1:00   ` Jeff Layton
  2008-11-24 15:26   ` Jeff Moyer
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2008-11-23  1:00 UTC (permalink / raw)
To: Tom Tucker; +Cc: J. Bruce Fields, Jeff Moyer, linux-nfs

On Sat, 22 Nov 2008 17:52:13 -0600
Tom Tucker <tom@opengridcomputing.com> wrote:

> Jeff M/L:
>
> Could you guys confirm that this patch fixes the problem? I was
> able to reproduce Jeff Layton's problem and this patch resolved
> the under-reference condition for me.
>
> Thanks,

Thanks Tom,

Confirmed. That patch does seem to resolve the refcount imbalance when
restarting nfsd.

Cheers,
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  2008-11-22 23:52 ` Tom Tucker
  2008-11-23  1:00 ` Jeff Layton
@ 2008-11-24 15:26   ` Jeff Moyer
  [not found]     ` <x49wsetdu24.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff Moyer @ 2008-11-24 15:26 UTC (permalink / raw)
To: Tom Tucker; +Cc: Jeff Layton, J. Bruce Fields, linux-nfs

Tom Tucker <tom@opengridcomputing.com> writes:

> Jeff M/L:
>
> Could you guys confirm that this patch fixes the problem? I was able
> to reproduce Jeff Layton's problem and this patch resolved
> the under-reference condition for me.

With this patch and the one that Bruce posted, I no longer see any
problems.  Bruce, you can add a Tested-by: Jeff Moyer
<jmoyer@redhat.com> to your patch submission.

Thanks, guys!

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 10+ messages in thread
[parent not found: <x49wsetdu24.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>]
* Re: list corruption in locks_start_grace with 2.6.28-rc3
  [not found] ` <x49wsetdu24.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
@ 2008-11-24 16:16   ` J. Bruce Fields
  0 siblings, 0 replies; 10+ messages in thread
From: J. Bruce Fields @ 2008-11-24 16:16 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Tom Tucker, Jeff Layton, linux-nfs

On Mon, Nov 24, 2008 at 10:26:11AM -0500, Jeff Moyer wrote:
> Tom Tucker <tom@opengridcomputing.com> writes:
>
> > Jeff M/L:
> >
> > Could you guys confirm that this patch fixes the problem? I was able
> > to reproduce Jeff Layton's problem and this patch resolved
> > the under-reference condition for me.
>
> With this patch and the one that Bruce posted, I no longer see any
> problems.  Bruce, you can add a Tested-by: Jeff Moyer
> <jmoyer@redhat.com> to your patch submission.
>
> Thanks, guys!

Done, thanks!  Patches pending for 2.6.28 available from the for-2.6.28
branch at:

git://linux-nfs.org/~bfields/linux.git for-2.6.28

--b.

^ permalink raw reply	[flat|nested] 10+ messages in thread
end of thread, other threads:[~2008-11-24 16:16 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-11-12 16:15 list corruption in locks_start_grace with 2.6.28-rc3 Jeff Moyer
[not found] ` <x498wrokjj8.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2008-11-20 20:37 ` J. Bruce Fields
2008-11-21 15:28 ` Jeff Moyer
[not found] ` <x49ej15gktp.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2008-11-22 0:54 ` J. Bruce Fields
2008-11-22 14:22 ` Jeff Layton
[not found] ` <20081122092237.1ab81cdb-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-11-22 20:48 ` Tom Tucker
2008-11-22 23:52 ` Tom Tucker
2008-11-23 1:00 ` Jeff Layton
2008-11-24 15:26 ` Jeff Moyer
[not found] ` <x49wsetdu24.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2008-11-24 16:16 ` J. Bruce Fields