* Re: [BUG 3.12.rc4] Oops: unable to handle kernel paging request during shutdown
       [not found] ` <CA+55aFyb72qoZ1Tjpb+=q-6+GmwoOXjfntY_zZnf300gg3d1Hg@mail.gmail.com>
@ 2013-10-27 20:39 ` Linus Torvalds
  2013-10-30 18:04   ` Pablo Neira Ayuso
  0 siblings, 1 reply; 2+ messages in thread
From: Linus Torvalds @ 2013-10-27 20:39 UTC
To: Thomas Gleixner, Pablo Neira Ayuso, Patrick McHardy, Jozsef Kadlecsik, David Miller
Cc: Knut Petersen, Ingo Molnar, Paul McKenney, Frédéric Weisbecker, Greg KH, linux-kernel, Network Development, netfilter-devel

On Sun, Oct 27, 2013 at 8:20 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Appended is a warning I get with DEBUG_TIMER_OBJECTS. Seems to be a
> device-mapper issue.

.. and here's another one. This time it looks like nf_conntrack_free()
is freeing something that has a delayed work in it (again, likely an
embedded 'struct kobject'). Looks like it is the

    kmem_cache_destroy(net->ct.nf_conntrack_cachep);

that triggers this. Which probably means that there are still slab
entries on that slab cache or something, but I didn't dig any deeper..

David? Patrick? Pablo? Jozsef? Any ideas? This was immediately preceded by

[ 1136.316280] kobject: 'nf_conntrack_ffff8800b74d0000' (ffff8801196fac78): kobject_uevent_env
[ 1136.316287] kobject: 'nf_conntrack_ffff8800b74d0000' (ffff8801196fac78): fill_kobj_path: path = '/kernel/slab/nf_conntrack_ffff8800b74d0000'
[ 1136.316331] kobject: 'nf_conntrack_ffff8800b74d0000' (ffff8801196fac78): kobject_release, parent (null) (delayed)

and I think it's that delayed "kobject_release()" that triggers this.

Notice that kobject_release() can be delayed *without* the magic
kobject debugging option by simply having a reference count on it from
some external source. So this particular issue is probably triggered
by my extra debug options in this case (I'm running with all those
nasty "try to find bad object freeing" options, and doing module
unloading etc), but can happen without it (it's just very hard to
trigger in practice without the debug options).

               Linus

---
[ 1136.321294] ------------[ cut here ]------------
[ 1136.321305] WARNING: CPU: 2 PID: 2483 at lib/debugobjects.c:260 debug_print_object+0x83/0xa0()
[ 1136.321311] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
[ 1136.321313] Modules linked in: fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth ebtable_nat ebt
[ 1136.321357] mfd_core mei snd_page_alloc snd_timer snd soundcore sony_laptop rfkill uinput dm_crypt crc32_pclmul crc32c_intel i915 i2c_algo_bit drm_kms_h
[ 1136.321371] CPU: 2 PID: 2483 Comm: kworker/u8:2 Tainted: G W 3.12.0-rc6-00331-ga2ff82065b5b #2
[ 1136.321373] Hardware name: Sony Corporation SVP11213CXB/VAIO, BIOS R0270V7 05/17/2013
[ 1136.321378] Workqueue: netns cleanup_net
[ 1136.321380] 0000000000000009 ffff8800a86cdbc8 ffffffff8160d4a2 ffff8800a86cdc10
[ 1136.321384] ffff8800a86cdc00 ffffffff810514e8 ffff8800b745f848 ffffffff81c365e0
[ 1136.321387] ffffffff819f9133 ffffffff81f34750 0000000000000001 ffff8800a86cdc60
[ 1136.321390] Call Trace:
[ 1136.321398] [<ffffffff8160d4a2>] dump_stack+0x45/0x56
[ 1136.321405] [<ffffffff810514e8>] warn_slowpath_common+0x78/0xa0
[ 1136.321410] [<ffffffff81051557>] warn_slowpath_fmt+0x47/0x50
[ 1136.321414] [<ffffffff812f8883>] debug_print_object+0x83/0xa0
[ 1136.321420] [<ffffffff8106aa90>] ? execute_in_process_context+0x90/0x90
[ 1136.321424] [<ffffffff812f99fb>] debug_check_no_obj_freed+0x20b/0x250
[ 1136.321429] [<ffffffff8112e7f2>] ? kmem_cache_destroy+0x92/0x100
[ 1136.321433] [<ffffffff8115d945>] kmem_cache_free+0x125/0x210
[ 1136.321436] [<ffffffff8112e7f2>] kmem_cache_destroy+0x92/0x100
[ 1136.321443] [<ffffffffa046b806>] nf_conntrack_cleanup_net_list+0x126/0x160 [nf_conntrack]
[ 1136.321449] [<ffffffffa046c43d>] nf_conntrack_pernet_exit+0x6d/0x80 [nf_conntrack]
[ 1136.321453] [<ffffffff81511cc3>] ops_exit_list.isra.3+0x53/0x60
[ 1136.321457] [<ffffffff815124f0>] cleanup_net+0x100/0x1b0
[ 1136.321460] [<ffffffff8106b31e>] process_one_work+0x18e/0x430
[ 1136.321463] [<ffffffff8106bf49>] worker_thread+0x119/0x390
[ 1136.321467] [<ffffffff8106be30>] ? manage_workers.isra.23+0x2a0/0x2a0
[ 1136.321470] [<ffffffff8107210b>] kthread+0xbb/0xc0
[ 1136.321472] [<ffffffff81072050>] ? kthread_create_on_node+0x110/0x110
[ 1136.321477] [<ffffffff8161b8fc>] ret_from_fork+0x7c/0xb0
[ 1136.321479] [<ffffffff81072050>] ? kthread_create_on_node+0x110/0x110
[ 1136.321481] ---[ end trace 25f53c192da70825 ]---

* Re: [BUG 3.12.rc4] Oops: unable to handle kernel paging request during shutdown
  2013-10-27 20:39 ` [BUG 3.12.rc4] Oops: unable to handle kernel paging request during shutdown Linus Torvalds
@ 2013-10-30 18:04   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 2+ messages in thread
From: Pablo Neira Ayuso @ 2013-10-30 18:04 UTC
To: Linus Torvalds
Cc: Thomas Gleixner, Patrick McHardy, Jozsef Kadlecsik, David Miller, Knut Petersen, Ingo Molnar, Paul McKenney, Frédéric Weisbecker, Greg KH, linux-kernel, Network Development, netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 2059 bytes --]

On Sun, Oct 27, 2013 at 08:39:47PM +0000, Linus Torvalds wrote:
> On Sun, Oct 27, 2013 at 8:20 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Appended is a warning I get with DEBUG_TIMER_OBJECTS. Seems to be a
> > device-mapper issue.
>
> .. and here's another one. This time it looks like nf_conntrack_free()
> is freeing something that has a delayed work in it (again, likely an
> embedded 'struct kobject'). Looks like it is the
>
>     kmem_cache_destroy(net->ct.nf_conntrack_cachep);
>
> that triggers this. Which probably means that there are still slab
> entries on that slab cache or something, but I didn't dig any deeper..
>
> David? Patrick? Pablo? Jozsef? Any ideas? This was immediately preceded by
>
> [ 1136.316280] kobject: 'nf_conntrack_ffff8800b74d0000'
> (ffff8801196fac78): kobject_uevent_env
> [ 1136.316287] kobject: 'nf_conntrack_ffff8800b74d0000'
> (ffff8801196fac78): fill_kobj_path: path =
> '/kernel/slab/nf_conntrack_ffff8800b74d0000'
> [ 1136.316331] kobject: 'nf_conntrack_ffff8800b74d0000'
> (ffff8801196fac78): kobject_release, parent (null) (delayed)
>
> and I think it's that delayed "kobject_release()" that triggers this.
>
> Notice that kobject_release() can be delayed *without* the magic
> kobject debugging option by simply having a reference count on it from
> some external source. So this particular issue is probably triggered
> by my extra debug options in this case (I'm running with all those
> nasty "try to find bad object freeing" options, and doing module
> unloading etc), but can happen without it (it's just very hard to
> trigger in practice without the debug options).

nf_conntrack_free() is decrementing our object counter
(net->ct.count) before releasing the object. That counter is used in
the nf_conntrack_cleanup_net_list path to check if it's time to
kmem_cache_destroy our cache of conntrack objects. I think we have a
race there that should be easier to trigger (although still hard) with
CONFIG_DEBUG_OBJECTS_FREE as object releases become slower.

[-- Attachment #2: linus.patch --]
[-- Type: text/x-diff, Size: 528 bytes --]

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5d892fe..d60cf16 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -764,9 +764,10 @@ void nf_conntrack_free(struct nf_conn *ct)
 	struct net *net = nf_ct_net(ct);
 
 	nf_ct_ext_destroy(ct);
-	atomic_dec(&net->ct.count);
 	nf_ct_ext_free(ct);
 	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
+	smp_mb__before_atomic_dec();
+	atomic_dec(&net->ct.count);
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_free);

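The ordering argument in the reply above can be modelled in a few lines of userspace C. This is only a single-threaded sketch under stated assumptions, not kernel code: `struct cache_model`, `cleanup_poll()`, `free_object()` and `race_observed()` are hypothetical names invented for illustration. The model lets the cleanup path run between the two steps of a free, which stands in for the worst-case interleaving; it says nothing about memory ordering, which the patch handles separately with `smp_mb__before_atomic_dec()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a per-namespace object cache.
 * None of these names exist in the kernel. */
struct cache_model {
	int count;      /* plays the role of net->ct.count */
	int allocated;  /* objects still held by the cache */
	bool destroyed; /* cache has been torn down */
	bool bug;       /* cache was destroyed with objects still in it */
};

/* Cleanup side: destroy the cache as soon as the counter reads zero. */
static void cleanup_poll(struct cache_model *c)
{
	if (c->count == 0 && !c->destroyed) {
		c->destroyed = true;
		if (c->allocated != 0)
			c->bug = true;
	}
}

/* Free one object, letting the cleanup path run between the two steps
 * to model the worst-case interleaving. */
static void free_object(struct cache_model *c, bool dec_first)
{
	assert(c->count > 0 && c->allocated > 0);

	if (dec_first) {
		c->count--;       /* old order: counter drops first */
		cleanup_poll(c);  /* cleanup sees zero too early */
		c->allocated--;
	} else {
		c->allocated--;   /* patched order: release the object first */
		cleanup_poll(c);  /* counter is still non-zero here */
		c->count--;
	}
	cleanup_poll(c);
}

/* Returns true if the cache was destroyed while an object was
 * still allocated. */
bool race_observed(bool dec_first)
{
	struct cache_model c = { .count = 1, .allocated = 1 };

	free_object(&c, dec_first);
	return c.bug;
}
```

In this model, `race_observed(true)` reports the premature destroy and `race_observed(false)` does not, which matches the reasoning behind moving the decrement after `kmem_cache_free()`.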