* [BUG] v2.6.38-rc3+ BUG when calling destroy_inodecache at module unload
@ 2011-02-03 18:51 Boaz Harrosh
  2011-02-04  8:36 ` Tao Ma
  0 siblings, 1 reply; 5+ messages in thread

From: Boaz Harrosh @ 2011-02-03 18:51 UTC (permalink / raw)
To: Nick Piggin, linux-fsdevel

Last good kernel was 2.6.37.

I'm doing a "mount" then "unmount". I think root is the only inode created.
rmmod is called immediately after "unmount", within a script.

If I only do the unmount and manually call "modprobe --remove exofs" after a
short while, all is fine.

I get:

slab error in kmem_cache_destroy(): cache `exofs_inode_cache': Can't free all objects
Call Trace:
77dfde08: [<6007e9a6>] kmem_cache_destroy+0x82/0xca
77dfde38: [<7c1fa3da>] exit_exofs+0x1a/0x1c [exofs]
77dfde48: [<60054c10>] sys_delete_module+0x1b9/0x217
77dfdee8: [<60014d60>] handle_syscall+0x58/0x70
77dfdf08: [<60024163>] userspace+0x2dd/0x38a
77dfdfc8: [<600126af>] fork_handler+0x62/0x69

The UML kernel also crashes after this message, with:

Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc cryptomgr aead crc32c crypto_hash crypto_algapi iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_mod binfmt_misc [last unloaded: libosd]
Pid: 6, comm: rcu_kthread Not tainted 2.6.38-rc3+
RIP: 0033:[<000000007c1fa0e7>]
RSP: 000000007943be18 EFLAGS: 00010246
RAX: 000000007943a000 RBX: 000000007937bb80 RCX: 0000000000000095
RDX: 000000007937c8b8 RSI: 0000000077fb6c80 RDI: 000000007937bb80
RBP: 000000007943be40 R08: 000000007943be10 R09: 000000007943a000
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000795123e0
R13: 0000000000000001 R14: 0000000000000000 R15: 000000000000000a
Call Trace:
602678f8: [<600144ed>] segv+0x70/0x212
60267928: [<6001cd9e>] ubd_intr+0x72/0xdf
60267988: [<601b778e>] _raw_spin_unlock_irqrestore+0x18/0x1c
602679d8: [<600146ee>] segv_handler+0x5f/0x65
60267a08: [<60021488>] sig_handler_common+0x84/0x98
60267ab0: [<60130926>] strncpy+0xf/0x27
60267b38: [<600215ce>] sig_handler+0x30/0x3b
60267b58: [<60021800>] handle_signal+0x6d/0xa3
60267ba8: [<60023180>] hard_handler+0x10/0x14

Kernel panic - not syncing: Segfault with no mm
Call Trace:
602677f8: [<601b52b1>] panic+0xea/0x1e6
60267818: [<6007e299>] kmem_cache_free+0x54/0x5f
60267850: [<6005342e>] __module_text_address+0xd/0x53
60267868: [<6005347d>] is_module_text_address+0x9/0x11
60267878: [<6004290c>] __kernel_text_address+0x65/0x6b
60267880: [<60023180>] hard_handler+0x10/0x14
60267898: [<6001345e>] show_trace+0x8e/0x95
602678c8: [<60026c40>] show_regs+0x2b/0x2f
602678f8: [<60014577>] segv+0xfa/0x212
60267928: [<6001cd9e>] ubd_intr+0x72/0xdf
60267988: [<601b778e>] _raw_spin_unlock_irqrestore+0x18/0x1c
602679d8: [<600146ee>] segv_handler+0x5f/0x65
60267a08: [<60021488>] sig_handler_common+0x84/0x98
60267ab0: [<60130926>] strncpy+0xf/0x27
60267b38: [<600215ce>] sig_handler+0x30/0x3b
60267b58: [<60021800>] handle_signal+0x6d/0xa3
60267ba8: [<60023180>] hard_handler+0x10/0x14

Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc cryptomgr aead crc32c crypto_hash crypto_algapi iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_mod binfmt_misc [last unloaded: libosd]
Pid: 6, comm: rcu_kthread Not tainted 2.6.38-rc3+
RIP: 0033:[<0000003ea3832ad7>]
RSP: 00007fff63338e38 EFLAGS: 00000202
RAX: 0000000000000000 RBX: 0000000000000219 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000013 RDI: 0000000000000219
RBP: 00007fff63338e70 R08: 0000000000000000 R09: 00007fff63338e70
R10: 00007fff63338be0 R11: 0000000000000202 R12: 0000000000000215
R13: 00007fe54ee756a8 R14: 00007fff63339090 R15: 00007fff63339928
Call Trace:
60267788: [<6001485b>] panic_exit+0x2f/0x45
602677a8: [<60048ad6>] notifier_call_chain+0x32/0x5e
602677e8: [<60048b24>] atomic_notifier_call_chain+0x13/0x15
602677f8: [<601b52cc>] panic+0x105/0x1e6
60267818: [<6007e299>] kmem_cache_free+0x54/0x5f
60267850: [<6005342e>] __module_text_address+0xd/0x53
60267868: [<6005347d>] is_module_text_address+0x9/0x11
60267878: [<6004290c>] __kernel_text_address+0x65/0x6b
60267880: [<60023180>] hard_handler+0x10/0x14
60267898: [<6001345e>] show_trace+0x8e/0x95
602678c8: [<60026c40>] show_regs+0x2b/0x2f
602678f8: [<60014577>] segv+0xfa/0x212
60267928: [<6001cd9e>] ubd_intr+0x72/0xdf
60267988: [<601b778e>] _raw_spin_unlock_irqrestore+0x18/0x1c
602679d8: [<600146ee>] segv_handler+0x5f/0x65
60267a08: [<60021488>] sig_handler_common+0x84/0x98
60267ab0: [<60130926>] strncpy+0xf/0x27
60267b38: [<600215ce>] sig_handler+0x30/0x3b
60267b58: [<60021800>] handle_signal+0x6d/0xa3
60267ba8: [<60023180>] hard_handler+0x10/0x14

Thanks
Boaz
* Re: [BUG] v2.6.38-rc3+ BUG when calling destroy_inodecache at module unload
  2011-02-03 18:51 [BUG] v2.6.38-rc3+ BUG when calling destroy_inodecache at module unload Boaz Harrosh
@ 2011-02-04  8:36 ` Tao Ma
  2011-02-04 19:15   ` Chris Mason
  0 siblings, 1 reply; 5+ messages in thread

From: Tao Ma @ 2011-02-04 8:36 UTC (permalink / raw)
To: Boaz Harrosh; +Cc: Nick Piggin, linux-fsdevel, ext4 development

On 02/04/2011 02:51 AM, Boaz Harrosh wrote:
> Last good Kernel was 2.6.37
> I'm doing a "mount" then "unmount". I think root is the only created inode.
> rmmod is called immediately after "unmount" within a script
>
> if I only do unmount and manually call "modprobe --remove exofs" after a small while
> all is fine.
>
> I get:
> slab error in kmem_cache_destroy(): cache `exofs_inode_cache': Can't free all objects
> Call Trace:
> 77dfde08: [<6007e9a6>] kmem_cache_destroy+0x82/0xca
> 77dfde38: [<7c1fa3da>] exit_exofs+0x1a/0x1c [exofs]
> 77dfde48: [<60054c10>] sys_delete_module+0x1b9/0x217
> 77dfdee8: [<60014d60>] handle_syscall+0x58/0x70
> 77dfdf08: [<60024163>] userspace+0x2dd/0x38a
> 77dfdfc8: [<600126af>] fork_handler+0x62/0x69
>
I also get a similar error when testing ext4, and a bug is open for it:

https://bugzilla.kernel.org/show_bug.cgi?id=27652

I have done some simple investigation for ext4, and it looks as if the new
*fs_i_callback path doesn't free the inode back to *fs_inode_cache
immediately, so the old unload logic can destroy the inode cache before all
of the inode objects have been freed.

Since more than one filesystem is affected by this, we may need to find a
fix in the VFS.

Regards,
Tao

> The UML Kernel also crashes after this message, with:
>
> Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc cryptomgr aead crc32c crypto_hash crypto_algapi iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_mod binfmt_misc [last unloaded: libosd]
> Pid: 6, comm: rcu_kthread Not tainted 2.6.38-rc3+
> RIP: 0033:[<000000007c1fa0e7>]
> RSP: 000000007943be18 EFLAGS: 00010246
> RAX: 000000007943a000 RBX: 000000007937bb80 RCX: 0000000000000095
> RDX: 000000007937c8b8 RSI: 0000000077fb6c80 RDI: 000000007937bb80
> RBP: 000000007943be40 R08: 000000007943be10 R09: 000000007943a000
> R10: 0000000000000000 R11: 0000000000000000 R12: 00000000795123e0
> R13: 0000000000000001 R14: 0000000000000000 R15: 000000000000000a
> Call Trace:
> 602678f8: [<600144ed>] segv+0x70/0x212
> 60267928: [<6001cd9e>] ubd_intr+0x72/0xdf
> 60267988: [<601b778e>] _raw_spin_unlock_irqrestore+0x18/0x1c
> 602679d8: [<600146ee>] segv_handler+0x5f/0x65
> 60267a08: [<60021488>] sig_handler_common+0x84/0x98
> 60267ab0: [<60130926>] strncpy+0xf/0x27
> 60267b38: [<600215ce>] sig_handler+0x30/0x3b
> 60267b58: [<60021800>] handle_signal+0x6d/0xa3
> 60267ba8: [<60023180>] hard_handler+0x10/0x14
>
> Kernel panic - not syncing: Segfault with no mm
> Call Trace:
> 602677f8: [<601b52b1>] panic+0xea/0x1e6
> 60267818: [<6007e299>] kmem_cache_free+0x54/0x5f
> 60267850: [<6005342e>] __module_text_address+0xd/0x53
> 60267868: [<6005347d>] is_module_text_address+0x9/0x11
> 60267878: [<6004290c>] __kernel_text_address+0x65/0x6b
> 60267880: [<60023180>] hard_handler+0x10/0x14
> 60267898: [<6001345e>] show_trace+0x8e/0x95
> 602678c8: [<60026c40>] show_regs+0x2b/0x2f
> 602678f8: [<60014577>] segv+0xfa/0x212
> 60267928: [<6001cd9e>] ubd_intr+0x72/0xdf
> 60267988: [<601b778e>] _raw_spin_unlock_irqrestore+0x18/0x1c
> 602679d8: [<600146ee>] segv_handler+0x5f/0x65
> 60267a08: [<60021488>] sig_handler_common+0x84/0x98
> 60267ab0: [<60130926>] strncpy+0xf/0x27
> 60267b38: [<600215ce>] sig_handler+0x30/0x3b
> 60267b58: [<60021800>] handle_signal+0x6d/0xa3
> 60267ba8: [<60023180>] hard_handler+0x10/0x14
>
> Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc cryptomgr aead crc32c crypto_hash crypto_algapi iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_mod binfmt_misc [last unloaded: libosd]
> Pid: 6, comm: rcu_kthread Not tainted 2.6.38-rc3+
> RIP: 0033:[<0000003ea3832ad7>]
> RSP: 00007fff63338e38 EFLAGS: 00000202
> RAX: 0000000000000000 RBX: 0000000000000219 RCX: ffffffffffffffff
> RDX: 0000000000000000 RSI: 0000000000000013 RDI: 0000000000000219
> RBP: 00007fff63338e70 R08: 0000000000000000 R09: 00007fff63338e70
> R10: 00007fff63338be0 R11: 0000000000000202 R12: 0000000000000215
> R13: 00007fe54ee756a8 R14: 00007fff63339090 R15: 00007fff63339928
> Call Trace:
> 60267788: [<6001485b>] panic_exit+0x2f/0x45
> 602677a8: [<60048ad6>] notifier_call_chain+0x32/0x5e
> 602677e8: [<60048b24>] atomic_notifier_call_chain+0x13/0x15
> 602677f8: [<601b52cc>] panic+0x105/0x1e6
> 60267818: [<6007e299>] kmem_cache_free+0x54/0x5f
> 60267850: [<6005342e>] __module_text_address+0xd/0x53
> 60267868: [<6005347d>] is_module_text_address+0x9/0x11
> 60267878: [<6004290c>] __kernel_text_address+0x65/0x6b
> 60267880: [<60023180>] hard_handler+0x10/0x14
> 60267898: [<6001345e>] show_trace+0x8e/0x95
> 602678c8: [<60026c40>] show_regs+0x2b/0x2f
> 602678f8: [<60014577>] segv+0xfa/0x212
> 60267928: [<6001cd9e>] ubd_intr+0x72/0xdf
> 60267988: [<601b778e>] _raw_spin_unlock_irqrestore+0x18/0x1c
> 602679d8: [<600146ee>] segv_handler+0x5f/0x65
> 60267a08: [<60021488>] sig_handler_common+0x84/0x98
> 60267ab0: [<60130926>] strncpy+0xf/0x27
> 60267b38: [<600215ce>] sig_handler+0x30/0x3b
> 60267b58: [<60021800>] handle_signal+0x6d/0xa3
> 60267ba8: [<60023180>] hard_handler+0x10/0x14
>
> Thanks
> Boaz
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
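[Editor's sketch] The "*fs_i_callback" pattern Tao describes is the RCU-delayed inode free that entered 2.6.38 (commit fa0d7e3d, "fs: icache RCU free inodes"). The outline below is illustrative only — the `foofs_*` names and the `FOOFS_I()` container helper are hypothetical stand-ins, not the literal exofs or ext4 source:

```c
/* Sketch of the 2.6.38-era RCU-delayed inode free. Hypothetical
 * foofs names; FOOFS_I() is the usual container_of-style helper. */
static struct kmem_cache *foofs_inode_cachep;

static void foofs_i_callback(struct rcu_head *head)
{
	struct inode *inode = container_of(head, struct inode, i_rcu);
	/* Runs only after an RCU grace period -- possibly well after
	 * umount has completed. */
	kmem_cache_free(foofs_inode_cachep, FOOFS_I(inode));
}

static void foofs_destroy_inode(struct inode *inode)
{
	/* The object is merely queued here; the free is deferred. */
	call_rcu(&inode->i_rcu, foofs_i_callback);
}

static void __exit exit_foofs(void)
{
	unregister_filesystem(&foofs_type);
	/* If rmmod runs before the queued foofs_i_callback()s fire,
	 * the cache still holds objects here and kmem_cache_destroy()
	 * complains -- the "Can't free all objects" error above. */
	kmem_cache_destroy(foofs_inode_cachep);
}
```

This is why the bug only shows up when rmmod follows umount immediately: given "a small while", the pending callbacks drain and the cache is empty.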
* Re: [BUG] v2.6.38-rc3+ BUG when calling destroy_inodecache at module unload
  2011-02-04  8:36 ` Tao Ma
@ 2011-02-04 19:15   ` Chris Mason
  2011-02-08 14:45     ` Boaz Harrosh
  0 siblings, 1 reply; 5+ messages in thread

From: Chris Mason @ 2011-02-04 19:15 UTC (permalink / raw)
To: Tao Ma; +Cc: Boaz Harrosh, Nick Piggin, linux-fsdevel, ext4 development

Excerpts from Tao Ma's message of 2011-02-04 03:36:59 -0500:
> On 02/04/2011 02:51 AM, Boaz Harrosh wrote:
> > Last good Kernel was 2.6.37
> > I'm doing a "mount" then "unmount". I think root is the only created inode.
> > rmmod is called immediately after "unmount" within a script
> >
> > if I only do unmount and manually call "modprobe --remove exofs" after a small while
> > all is fine.
> >
> > I get:
> > slab error in kmem_cache_destroy(): cache `exofs_inode_cache': Can't free all objects
> > Call Trace:
> > 77dfde08: [<6007e9a6>] kmem_cache_destroy+0x82/0xca
> > 77dfde38: [<7c1fa3da>] exit_exofs+0x1a/0x1c [exofs]
> > 77dfde48: [<60054c10>] sys_delete_module+0x1b9/0x217
> > 77dfdee8: [<60014d60>] handle_syscall+0x58/0x70
> > 77dfdf08: [<60024163>] userspace+0x2dd/0x38a
> > 77dfdfc8: [<600126af>] fork_handler+0x62/0x69
> >
> I also get a similar error when testing ext4 and a bug is opened there.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=27652
>
> And I have done some simple investigation for ext4 and It looks as if now with the new *fs_i_callback doesn't free the inode to *fs_inode_cache immediately. So the old logic will destroy the inode cache before we free all the inode object.
>
> Since there are more than one fs affected by this, we may need to find a way in the VFS.

Sounds like we just need a synchronize_rcu call before we delete the
cache?

-chris
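[Editor's sketch] Applied at a filesystem's module-exit path, Chris's suggestion would look roughly like this (hypothetical `foofs` names; as the follow-ups later in the thread report, this alone did not cure the crash):

```c
/* Sketch of the proposed ordering: let an RCU grace period elapse
 * between unregistering the filesystem and destroying its inode
 * cache. Hypothetical foofs names, not a real driver. */
static void __exit exit_foofs(void)
{
	unregister_filesystem(&foofs_type);
	synchronize_rcu();	/* wait out one grace period... */
	kmem_cache_destroy(foofs_inode_cachep);	/* ...then drop the cache */
}
```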
* Re: [BUG] v2.6.38-rc3+ BUG when calling destroy_inodecache at module unload
  2011-02-04 19:15 ` Chris Mason
@ 2011-02-08 14:45   ` Boaz Harrosh
  2011-02-08 15:25     ` Tao Ma
  0 siblings, 1 reply; 5+ messages in thread

From: Boaz Harrosh @ 2011-02-08 14:45 UTC (permalink / raw)
To: Chris Mason, Nick Piggin, Al Viro
Cc: Tao Ma, linux-fsdevel, ext4 development, Andrew Morton, Rafael J. Wysocki

On 02/04/2011 09:15 PM, Chris Mason wrote:
> Excerpts from Tao Ma's message of 2011-02-04 03:36:59 -0500:
>> On 02/04/2011 02:51 AM, Boaz Harrosh wrote:
>>> Last good Kernel was 2.6.37
>>> I'm doing a "mount" then "unmount". I think root is the only created inode.
>>> rmmod is called immediately after "unmount" within a script
>>>
>>> if I only do unmount and manually call "modprobe --remove exofs" after a small while
>>> all is fine.
>>>
>>> I get:
>>> slab error in kmem_cache_destroy(): cache `exofs_inode_cache': Can't free all objects
>>> Call Trace:
>>> 77dfde08: [<6007e9a6>] kmem_cache_destroy+0x82/0xca
>>> 77dfde38: [<7c1fa3da>] exit_exofs+0x1a/0x1c [exofs]
>>> 77dfde48: [<60054c10>] sys_delete_module+0x1b9/0x217
>>> 77dfdee8: [<60014d60>] handle_syscall+0x58/0x70
>>> 77dfdf08: [<60024163>] userspace+0x2dd/0x38a
>>> 77dfdfc8: [<600126af>] fork_handler+0x62/0x69
>>>
>> I also get a similar error when testing ext4 and a bug is opened there.
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=27652
>>
>> And I have done some simple investigation for ext4 and It looks as if now with the new *fs_i_callback doesn't free the inode to *fs_inode_cache immediately. So the old logic will destroy the inode cache before we free all the inode object.
>>
>> Since there are more than one fs affected by this, we may need to find a way in the VFS.
>
> Sounds like we just need a synchronize_rcu call before we delete the
> cache?
>
> -chris

Hi Al, Nick.

Al, please look into this issue. Absolutely all filesystems should be
affected. Tao Ma has attempted the fix below, but it does not help: I see the
exact same trace with his patch applied.

If you unmount and immediately rmmod the filesystem, it will crash because of
those RCU-freed objects from umount, like the root inode. Nick is not
responding. I'd try to fix it myself, but I don't know how.

---
> From: Tao Ma <boyu.mt@taobao.com>
>
> In fa0d7e3, we use rcu free inode instead of freeing the inode
> directly. It causes a problem when we rmmod immediately after
> we umount the volume[1].
>
> So we need to call synchronize_rcu after we kill_sb so that
> the inode is freed before we do rmmod. The idea is inspired
> by Chris Mason[2]. I tested with ext4 by umount+rmmod and it
> doesn't show any error by now.
>
> 1. http://marc.info/?l=linux-fsdevel&m=129680863330185&w=2
> 2. http://marc.info/?l=linux-fsdevel&m=129684698713709&w=2
>
> Cc: Nick Piggin <npiggin@kernel.dk>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Chris Mason <chris.mason@oracle.com>
> Cc: Boaz Harrosh <bharrosh@panasas.com>
> Signed-off-by: Tao Ma <boyu.mt@taobao.com>
> ---
>  fs/super.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 74e149e..315bce9 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -177,6 +177,13 @@ void deactivate_locked_super(struct super_block *s)
>  	struct file_system_type *fs = s->s_type;
>  	if (atomic_dec_and_test(&s->s_active)) {
>  		fs->kill_sb(s);
> +		/*
> +		 * We need to synchronize rcu here so that
> +		 * the delayed rcu inode free can be executed
> +		 * before we put_super.
> +		 * https://bugzilla.kernel.org/show_bug.cgi?id=27652
> +		 */
> +		synchronize_rcu();
>  		put_filesystem(fs);
>  		put_super(s);
>  	} else {
> --
> 1.6.3.GIT

Thanks
Boaz
* Re: [BUG] v2.6.38-rc3+ BUG when calling destroy_inodecache at module unload
  2011-02-08 14:45 ` Boaz Harrosh
@ 2011-02-08 15:25   ` Tao Ma
  0 siblings, 0 replies; 5+ messages in thread

From: Tao Ma @ 2011-02-08 15:25 UTC (permalink / raw)
To: Boaz Harrosh
Cc: Chris Mason, Nick Piggin, Al Viro, linux-fsdevel, ext4 development, Andrew Morton, Rafael J. Wysocki

Hi Boaz,

On 02/08/2011 10:45 PM, Boaz Harrosh wrote:
> On 02/04/2011 09:15 PM, Chris Mason wrote:
>> Excerpts from Tao Ma's message of 2011-02-04 03:36:59 -0500:
>>> On 02/04/2011 02:51 AM, Boaz Harrosh wrote:
>>>> Last good Kernel was 2.6.37
>>>> I'm doing a "mount" then "unmount". I think root is the only created inode.
>>>> rmmod is called immediately after "unmount" within a script
>>>>
>>>> if I only do unmount and manually call "modprobe --remove exofs" after a small while
>>>> all is fine.
>>>>
>>>> I get:
>>>> slab error in kmem_cache_destroy(): cache `exofs_inode_cache': Can't free all objects
>>>> Call Trace:
>>>> 77dfde08: [<6007e9a6>] kmem_cache_destroy+0x82/0xca
>>>> 77dfde38: [<7c1fa3da>] exit_exofs+0x1a/0x1c [exofs]
>>>> 77dfde48: [<60054c10>] sys_delete_module+0x1b9/0x217
>>>> 77dfdee8: [<60014d60>] handle_syscall+0x58/0x70
>>>> 77dfdf08: [<60024163>] userspace+0x2dd/0x38a
>>>> 77dfdfc8: [<600126af>] fork_handler+0x62/0x69
>>>>
>>> I also get a similar error when testing ext4 and a bug is opened there.
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=27652
>>>
>>> And I have done some simple investigation for ext4 and It looks as if now with the new *fs_i_callback doesn't free the inode to *fs_inode_cache immediately. So the old logic will destroy the inode cache before we free all the inode object.
>>>
>>> Since there are more than one fs affected by this, we may need to find a way in the VFS.
>> Sounds like we just need a synchronize_rcu call before we delete the
>> cache?
>>
>> -chris
> Hi Al, Nick.
>
> Al please look into this issue. Absolutely all filesystems should be affected.
> Tao Ma has attempted the below fix, but it does not help. Exact same trace
> with his patch applied.

I am on vacation, so I couldn't reach my test box today. I did some simple
tracing yesterday, and it looks as though ext4_i_callback can still run
after synchronize_rcu is called. So the reason may be:
1. synchronize_rcu doesn't work as we expected, or
2. the inode-free RCU path doesn't work as Nick expected.

I will go to the office tomorrow and do more testing and debugging there.
Hopefully I will find out more.

> If you unmount and immediately rmmod the filesystem it will crash because of
> those RCU freed objects at umount, like the root inode. Nick is not responding,
> I'd try to fix it, but I don't know how.

I raised the error to Nick on Jan. 19, about 3 weeks ago:

http://marc.info/?l=linux-ext4&m=129542001031750&w=2

But it seems that he is quite busy these days. It is still rc3 and we have a
lot of time before the final release, so no panic here. ;)

Finally, I did try to fix it recently, but my attempt doesn't work. :( I will
continue to work on it until Al or Nick responds with a proper patch. :)

Regards,
Tao
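[Editor's sketch] One plausible reading of Tao's observation, consistent with documented RCU semantics (though not something this thread itself establishes): synchronize_rcu() only waits for a grace period to elapse; it does not guarantee that callbacks already queued with call_rcu() have been invoked, so a pending ext4_i_callback can still fire afterwards. Waiting for the callbacks themselves is what rcu_barrier() provides. A sketch of a destroy-cache helper built on that guarantee, with hypothetical `foofs` names:

```c
/* Sketch contrasting the two RCU wait primitives:
 *
 *   synchronize_rcu() -- returns once all pre-existing RCU read-side
 *     critical sections have finished; previously queued call_rcu()
 *     callbacks may still be pending on some CPU's list.
 *   rcu_barrier()     -- returns only after all call_rcu() callbacks
 *     queued before the call have actually run.
 *
 * Hypothetical foofs names. */
static void foofs_destroy_inodecache(void)
{
	rcu_barrier();	/* drain every pending foofs_i_callback() */
	kmem_cache_destroy(foofs_inode_cachep);	/* cache is truly empty now */
}
```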