* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages [not found] <bug-189181-27@https.bugzilla.kernel.org/> @ 2016-11-29 22:56 ` Andrew Morton 2016-11-30 17:00 ` Michal Hocko 0 siblings, 1 reply; 9+ messages in thread From: Andrew Morton @ 2016-11-29 22:56 UTC (permalink / raw) To: Mel Gorman, Johannes Weiner, Michal Hocko Cc: bugzilla-daemon, linux-mm, marmarek (switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Sat, 26 Nov 2016 15:10:16 +0000 bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=189181 > > Bug ID: 189181 > Summary: BUG: unable to handle kernel NULL pointer dereference > in mem_cgroup_node_nr_lru_pages > Product: Memory Management > Version: 2.5 > Kernel Version: 4.8.10 > Hardware: Intel > OS: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: Slab Allocator > Assignee: akpm@linux-foundation.org > Reporter: marmarek@mimuw.edu.pl > Regression: No > > Created attachment 245931 > --> https://bugzilla.kernel.org/attachment.cgi?id=245931&action=edit > Full console log > > Shortly after system startup sometimes (about 1/30 times) I get this: > > [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at > 0000000000000400 > [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40 > [ 15.665225] PGD 0 > [ 15.665230] Oops: 0000 [#1] SMP > [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O) > xen_blkback xenfs xen_privcmd xen_blkfront > [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O > 4.8.10-12.pvops.qubes.x86_64 #1 > [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000 > [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>] > mem_cgroup_node_nr_lru_pages+0x20/0x40 > [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293 > [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX: > 0000000000000002 > [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI: > 0000000000000000 > [ 15.665322] RBP: ffff88001186bc70 R08: 28f5c28f5c28f5c3 R09: > 0000000000000000 > [ 15.665327] R10: 0000000000006c34 R11: 0000000000000333 R12: > 00000000000001f6 > [ 15.665332] R13: ffffffff81c6f6a0 R14: 0000000000000000 R15: > 0000000000000000 > [ 15.665343] FS: 0000000000000000(0000) GS:ffff880013c00000(0000) > knlGS:ffff880013d00000 > [ 15.665351] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 15.665358] CR2: 0000000000000400 CR3: 00000000122f2000 CR4: > 0000000000042660 > [ 15.665366] Stack: > [ 15.665371] ffff88001186bc98 ffffffff811e0dda 00000000000002eb > 0000000000000080 > [ 15.665384] ffffffff81c6f6a0 ffff88001186bd70 ffffffff811c36d9 > 0000000000000000 > [ 15.665397] ffff88001186bcb0 ffff88001186bcb0 ffff88001186bcc0 > 000000000000abc5 > [ 15.665410] Call Trace: > [ 15.665419] [<ffffffff811e0dda>] count_shadow_nodes+0x9a/0xa0 > [ 15.665428] [<ffffffff811c36d9>] shrink_slab.part.42+0x119/0x3e0 > [ 15.666049] [<ffffffff811c83ec>] shrink_node+0x22c/0x320 > [ 15.666049] [<ffffffff811c928c>] kswapd+0x32c/0x700 > [ 15.666049] [<ffffffff811c8f60>] ? mem_cgroup_shrink_node+0x180/0x180 > [ 15.666049] [<ffffffff810c1b08>] kthread+0xd8/0xf0 > [ 15.666049] [<ffffffff817a3abf>] ret_from_fork+0x1f/0x40 > [ 15.666049] [<ffffffff810c1a30>] ? kthread_create_on_node+0x190/0x190 > [ 15.666049] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 3b 35 dd > eb b1 00 55 48 89 e5 73 2c 89 d2 31 c9 31 c0 4c 63 ce 48 0f a3 ca 73 13 <4a> 8b > b4 cf 00 04 00 00 41 89 c8 4a 03 > 84 c6 80 00 00 00 83 c1 > [ 15.666049] RIP [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40 > [ 15.666049] RSP <ffff88001186bc70> > [ 15.666049] CR2: 0000000000000400 > [ 15.666049] ---[ end trace 100494b9edbdfc4d ]--- > > After this, there is another "unable to handle kerneel paging request" I guess > because of do_exit in kswapd0, then a lot of soft lockups and system is > unusable (see full log attached). > > This is running in PV domU on Xen 4.7.0 (the same also happens on Xen 4.6.3). > Same happens on 4.8.7 too. Previously it was working on v4.4.31 without any > problem. > > -- > You are receiving this mail because: > You are the assignee for the bug. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages 2016-11-29 22:56 ` [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages Andrew Morton @ 2016-11-30 17:00 ` Michal Hocko 2016-11-30 18:16 ` Johannes Weiner 0 siblings, 1 reply; 9+ messages in thread From: Michal Hocko @ 2016-11-30 17:00 UTC (permalink / raw) To: Andrew Morton Cc: Mel Gorman, Johannes Weiner, bugzilla-daemon, linux-mm, marmarek, Vladimir Davydov [CC Vladimir] On Tue 29-11-16 14:56:54, Andrew Morton wrote: > > (switched to email. Please respond via emailed reply-to-all, not via the > bugzilla web interface). > > On Sat, 26 Nov 2016 15:10:16 +0000 bugzilla-daemon@bugzilla.kernel.org wrote: > > > https://bugzilla.kernel.org/show_bug.cgi?id=189181 > > > > Bug ID: 189181 > > Summary: BUG: unable to handle kernel NULL pointer dereference > > in mem_cgroup_node_nr_lru_pages > > Product: Memory Management > > Version: 2.5 > > Kernel Version: 4.8.10 > > Hardware: Intel > > OS: Linux > > Tree: Mainline > > Status: NEW > > Severity: normal > > Priority: P1 > > Component: Slab Allocator > > Assignee: akpm@linux-foundation.org > > Reporter: marmarek@mimuw.edu.pl > > Regression: No > > > > Created attachment 245931 > > --> https://bugzilla.kernel.org/attachment.cgi?id=245931&action=edit > > Full console log > > > > Shortly after system startup sometimes (about 1/30 times) I get this: > > > > [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at > > 0000000000000400 > > [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40 > > [ 15.665225] PGD 0 > > [ 15.665230] Oops: 0000 [#1] SMP > > [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT > > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE > > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i > > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl > > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel > > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O) > > xen_blkback xenfs xen_privcmd xen_blkfront > > [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O > > 4.8.10-12.pvops.qubes.x86_64 #1 > > [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000 > > [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>] > > mem_cgroup_node_nr_lru_pages+0x20/0x40 > > [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293 > > [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX: > > 0000000000000002 > > [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI: > > 0000000000000000 I cannot generate a similar code to yours but the above suggests that we are getting NULL memcg. This would suggest a global reclaim and count_shadow_nodes misinterprets that because it does if (memcg_kmem_enabled()) { pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, LRU_ALL_FILE); } else { pages = node_page_state(NODE_DATA(sc->nid), NR_ACTIVE_FILE) + node_page_state(NODE_DATA(sc->nid), NR_INACTIVE_FILE); } this might be a race with kmem enabling AFAICS. Anyaway I believe that the above check needs to ne extended for the sc->memcg != NULL diff --git a/mm/workingset.c b/mm/workingset.c index 617475f529f4..0f07522c5c0e 100644 --- a/mm/workingset.c +++ b/mm/workingset.c @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker, shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc); local_irq_enable(); - if (memcg_kmem_enabled()) { + if (memcg_kmem_enabled() && sc->memcg) { pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, LRU_ALL_FILE); } else { Or am I missing something? [Keeping the rest of the email for the reference] > > [ 15.665322] RBP: ffff88001186bc70 R08: 28f5c28f5c28f5c3 R09: > > 0000000000000000 > > [ 15.665327] R10: 0000000000006c34 R11: 0000000000000333 R12: > > 00000000000001f6 > > [ 15.665332] R13: ffffffff81c6f6a0 R14: 0000000000000000 R15: > > 0000000000000000 > > [ 15.665343] FS: 0000000000000000(0000) GS:ffff880013c00000(0000) > > knlGS:ffff880013d00000 > > [ 15.665351] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 15.665358] CR2: 0000000000000400 CR3: 00000000122f2000 CR4: > > 0000000000042660 > > [ 15.665366] Stack: > > [ 15.665371] ffff88001186bc98 ffffffff811e0dda 00000000000002eb > > 0000000000000080 > > [ 15.665384] ffffffff81c6f6a0 ffff88001186bd70 ffffffff811c36d9 > > 0000000000000000 > > [ 15.665397] ffff88001186bcb0 ffff88001186bcb0 ffff88001186bcc0 > > 000000000000abc5 > > [ 15.665410] Call Trace: > > [ 15.665419] [<ffffffff811e0dda>] count_shadow_nodes+0x9a/0xa0 > > [ 15.665428] [<ffffffff811c36d9>] shrink_slab.part.42+0x119/0x3e0 > > [ 15.666049] [<ffffffff811c83ec>] shrink_node+0x22c/0x320 > > [ 15.666049] [<ffffffff811c928c>] kswapd+0x32c/0x700 > > [ 15.666049] [<ffffffff811c8f60>] ? mem_cgroup_shrink_node+0x180/0x180 > > [ 15.666049] [<ffffffff810c1b08>] kthread+0xd8/0xf0 > > [ 15.666049] [<ffffffff817a3abf>] ret_from_fork+0x1f/0x40 > > [ 15.666049] [<ffffffff810c1a30>] ? kthread_create_on_node+0x190/0x190 > > [ 15.666049] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 3b 35 dd > > eb b1 00 55 48 89 e5 73 2c 89 d2 31 c9 31 c0 4c 63 ce 48 0f a3 ca 73 13 <4a> 8b > > b4 cf 00 04 00 00 41 89 c8 4a 03 > > 84 c6 80 00 00 00 83 c1 > > [ 15.666049] RIP [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40 > > [ 15.666049] RSP <ffff88001186bc70> > > [ 15.666049] CR2: 0000000000000400 > > [ 15.666049] ---[ end trace 100494b9edbdfc4d ]--- > > > > After this, there is another "unable to handle kerneel paging request" I guess > > because of do_exit in kswapd0, then a lot of soft lockups and system is > > unusable (see full log attached). > > > > This is running in PV domU on Xen 4.7.0 (the same also happens on Xen 4.6.3). > > Same happens on 4.8.7 too. Previously it was working on v4.4.31 without any > > problem. > > > > -- > > You are receiving this mail because: > > You are the assignee for the bug. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages 2016-11-30 17:00 ` Michal Hocko @ 2016-11-30 18:16 ` Johannes Weiner 2016-11-30 18:30 ` Michal Hocko 2016-12-01 10:07 ` Vladimir Davydov 0 siblings, 2 replies; 9+ messages in thread From: Johannes Weiner @ 2016-11-30 18:16 UTC (permalink / raw) To: Michal Hocko Cc: Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, marmarek, Vladimir Davydov Hi Michael, On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote: > > > [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at > > > 0000000000000400 > > > [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40 > > > [ 15.665225] PGD 0 > > > [ 15.665230] Oops: 0000 [#1] SMP > > > [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT > > > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE > > > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i > > > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl > > > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel > > > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O) > > > xen_blkback xenfs xen_privcmd xen_blkfront > > > [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O > > > 4.8.10-12.pvops.qubes.x86_64 #1 > > > [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000 > > > [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>] > > > mem_cgroup_node_nr_lru_pages+0x20/0x40 > > > [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293 > > > [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX: > > > 0000000000000002 > > > [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI: > > > 0000000000000000 > > I cannot generate a similar code to yours but the above suggests that we > are getting NULL memcg. This would suggest a global reclaim and > count_shadow_nodes misinterprets that because it does > > if (memcg_kmem_enabled()) { > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, > LRU_ALL_FILE); > } else { > pages = node_page_state(NODE_DATA(sc->nid), NR_ACTIVE_FILE) + > node_page_state(NODE_DATA(sc->nid), NR_INACTIVE_FILE); > } > > this might be a race with kmem enabling AFAICS. Anyaway I believe that > the above check needs to ne extended for the sc->memcg != NULL Yep, my locally built code looks very different from the report, but it's clear that memcg is NULL. I didn't see the race you mention, but it makes sense to me: shrink_slab() is supposed to filter memcg-aware shrinkers based on whether we have a memcg or not, but it only does it when kmem accounting is enabled; if it's disabled, the shrinker should also use its non-memcg behavior. However, nothing prevents a memcg with kmem from onlining between the filter and the shrinker run. > diff --git a/mm/workingset.c b/mm/workingset.c > index 617475f529f4..0f07522c5c0e 100644 > --- a/mm/workingset.c > +++ b/mm/workingset.c > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker, > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc); > local_irq_enable(); > > - if (memcg_kmem_enabled()) { > + if (memcg_kmem_enabled() && sc->memcg) { > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, > LRU_ALL_FILE); > } else { If we do that, I'd remove the racy memcg_kmem_enabled() check altogether and just check for whether we have a memcg or not. What do you think, Vladimir? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages 2016-11-30 18:16 ` Johannes Weiner @ 2016-11-30 18:30 ` Michal Hocko 2016-12-01 0:33 ` Balbir Singh 2016-12-01 2:24 ` Marek Marczykowski-Górecki 2016-12-01 10:07 ` Vladimir Davydov 1 sibling, 2 replies; 9+ messages in thread From: Michal Hocko @ 2016-11-30 18:30 UTC (permalink / raw) To: Johannes Weiner Cc: Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, marmarek, Vladimir Davydov On Wed 30-11-16 13:16:53, Johannes Weiner wrote: > Hi Michael, > > On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote: [...] > > diff --git a/mm/workingset.c b/mm/workingset.c > > index 617475f529f4..0f07522c5c0e 100644 > > --- a/mm/workingset.c > > +++ b/mm/workingset.c > > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker, > > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc); > > local_irq_enable(); > > > > - if (memcg_kmem_enabled()) { > > + if (memcg_kmem_enabled() && sc->memcg) { > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, > > LRU_ALL_FILE); > > } else { > > If we do that, I'd remove the racy memcg_kmem_enabled() check > altogether and just check for whether we have a memcg or not. But that would make this a memcg aware shrinker even when kmem is not enabled... But now that I am looking into the code shrink_slab: if (memcg_kmem_enabled() && !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE)) continue; this should be taken care of already. So sc->memcg should be indeed sufficient. So unless I am missing something I will respin my local patch and post it later after the reporter has some time to test the current one. > What do you think, Vladimir? -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages 2016-11-30 18:30 ` Michal Hocko @ 2016-12-01 0:33 ` Balbir Singh 2016-12-01 2:24 ` Marek Marczykowski-Górecki 1 sibling, 0 replies; 9+ messages in thread From: Balbir Singh @ 2016-12-01 0:33 UTC (permalink / raw) To: Michal Hocko, Johannes Weiner Cc: Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, marmarek, Vladimir Davydov On 01/12/16 05:30, Michal Hocko wrote: > On Wed 30-11-16 13:16:53, Johannes Weiner wrote: >> Hi Michael, >> >> On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote: > [...] >>> diff --git a/mm/workingset.c b/mm/workingset.c >>> index 617475f529f4..0f07522c5c0e 100644 >>> --- a/mm/workingset.c >>> +++ b/mm/workingset.c >>> @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker, >>> shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc); >>> local_irq_enable(); >>> >>> - if (memcg_kmem_enabled()) { >>> + if (memcg_kmem_enabled() && sc->memcg) { >>> pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, >>> LRU_ALL_FILE); >>> } else { >> >> If we do that, I'd remove the racy memcg_kmem_enabled() check >> altogether and just check for whether we have a memcg or not. > > But that would make this a memcg aware shrinker even when kmem is not > enabled... > > But now that I am looking into the code > shrink_slab: > if (memcg_kmem_enabled() && > !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE)) > continue; > > this should be taken care of already. So sc->memcg should be indeed > sufficient. So unless I am missing something I will respin my local > patch and post it later after the reporter has some time to test the > current one. > I did a quick dis-assembly of the code R9 and RDI are NULL and the instruction seems to be mov rsi, [rdi+r9*8+0x400] RDI is NULL, sc->memcg is NULL, which indicates global reclaim The check referred to earlier /* * If kernel memory accounting is disabled, we ignore * SHRINKER_MEMCG_AWARE flag and call all shrinkers * passing NULL for memcg. */ if (memcg_kmem_enabled() && !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE)) continue; So we do pass NULL for memcg A check for sc->memcg should be enough in count_shadow_nodes and a VM_BUG_ON for memcg == NULL in mem_cgroup_node_nr_lru_pages would be nice Balbir Singh. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages 2016-11-30 18:30 ` Michal Hocko 2016-12-01 0:33 ` Balbir Singh @ 2016-12-01 2:24 ` Marek Marczykowski-Górecki 2016-12-01 7:02 ` Michal Hocko 1 sibling, 1 reply; 9+ messages in thread From: Marek Marczykowski-Górecki @ 2016-12-01 2:24 UTC (permalink / raw) To: Michal Hocko Cc: Johannes Weiner, Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, Vladimir Davydov [-- Attachment #1: Type: text/plain, Size: 1719 bytes --] On Wed, Nov 30, 2016 at 07:30:17PM +0100, Michal Hocko wrote: > On Wed 30-11-16 13:16:53, Johannes Weiner wrote: > > Hi Michael, > > > > On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote: > [...] > > > diff --git a/mm/workingset.c b/mm/workingset.c > > > index 617475f529f4..0f07522c5c0e 100644 > > > --- a/mm/workingset.c > > > +++ b/mm/workingset.c > > > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker, > > > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc); > > > local_irq_enable(); > > > > > > - if (memcg_kmem_enabled()) { > > > + if (memcg_kmem_enabled() && sc->memcg) { > > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, > > > LRU_ALL_FILE); > > > } else { > > > > If we do that, I'd remove the racy memcg_kmem_enabled() check > > altogether and just check for whether we have a memcg or not. > > But that would make this a memcg aware shrinker even when kmem is not > enabled... > > But now that I am looking into the code > shrink_slab: > if (memcg_kmem_enabled() && > !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE)) > continue; > > this should be taken care of already. So sc->memcg should be indeed > sufficient. So unless I am missing something I will respin my local > patch and post it later after the reporter has some time to test the > current one. The above patch seems to help. At least the problem haven't occurred for the last ~40 VM startups. > > > What do you think, Vladimir? > -- Pozdrawiam / Best Regards, Marek Marczykowski-Górecki | RLU #390519 marmarek at staszic waw pl | xmpp:marmarek at staszic waw pl [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages 2016-12-01 2:24 ` Marek Marczykowski-Górecki @ 2016-12-01 7:02 ` Michal Hocko 2016-12-01 10:58 ` Marek Marczykowski-Górecki 0 siblings, 1 reply; 9+ messages in thread From: Michal Hocko @ 2016-12-01 7:02 UTC (permalink / raw) To: Marek Marczykowski-Górecki Cc: Johannes Weiner, Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, Vladimir Davydov On Thu 01-12-16 03:24:54, Marek Marczykowski-Gorecki wrote: > On Wed, Nov 30, 2016 at 07:30:17PM +0100, Michal Hocko wrote: > > On Wed 30-11-16 13:16:53, Johannes Weiner wrote: > > > Hi Michael, > > > > > > On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote: > > [...] > > > > diff --git a/mm/workingset.c b/mm/workingset.c > > > > index 617475f529f4..0f07522c5c0e 100644 > > > > --- a/mm/workingset.c > > > > +++ b/mm/workingset.c > > > > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker, > > > > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc); > > > > local_irq_enable(); > > > > > > > > - if (memcg_kmem_enabled()) { > > > > + if (memcg_kmem_enabled() && sc->memcg) { > > > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, > > > > LRU_ALL_FILE); > > > > } else { > > > > > > If we do that, I'd remove the racy memcg_kmem_enabled() check > > > altogether and just check for whether we have a memcg or not. > > > > But that would make this a memcg aware shrinker even when kmem is not > > enabled... > > > > But now that I am looking into the code > > shrink_slab: > > if (memcg_kmem_enabled() && > > !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE)) > > continue; > > > > this should be taken care of already. So sc->memcg should be indeed > > sufficient. So unless I am missing something I will respin my local > > patch and post it later after the reporter has some time to test the > > current one. > > The above patch seems to help. At least the problem haven't occurred for > the last ~40 VM startups. I will consider this as Tested-by: Marek Marczykowski-Gorecki <marmarek@mimuw.edu.pl> OK? Thanks for the report and testing! -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages 2016-12-01 7:02 ` Michal Hocko @ 2016-12-01 10:58 ` Marek Marczykowski-Górecki 0 siblings, 0 replies; 9+ messages in thread From: Marek Marczykowski-Górecki @ 2016-12-01 10:58 UTC (permalink / raw) To: Michal Hocko Cc: Johannes Weiner, Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, Vladimir Davydov [-- Attachment #1: Type: text/plain, Size: 2108 bytes --] On Thu, Dec 01, 2016 at 08:02:13AM +0100, Michal Hocko wrote: > On Thu 01-12-16 03:24:54, Marek Marczykowski-Górecki wrote: > > On Wed, Nov 30, 2016 at 07:30:17PM +0100, Michal Hocko wrote: > > > On Wed 30-11-16 13:16:53, Johannes Weiner wrote: > > > > Hi Michael, > > > > > > > > On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote: > > > [...] > > > > > diff --git a/mm/workingset.c b/mm/workingset.c > > > > > index 617475f529f4..0f07522c5c0e 100644 > > > > > --- a/mm/workingset.c > > > > > +++ b/mm/workingset.c > > > > > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker, > > > > > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc); > > > > > local_irq_enable(); > > > > > > > > > > - if (memcg_kmem_enabled()) { > > > > > + if (memcg_kmem_enabled() && sc->memcg) { > > > > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, > > > > > LRU_ALL_FILE); > > > > > } else { > > > > > > > > If we do that, I'd remove the racy memcg_kmem_enabled() check > > > > altogether and just check for whether we have a memcg or not. > > > > > > But that would make this a memcg aware shrinker even when kmem is not > > > enabled... > > > > > > But now that I am looking into the code > > > shrink_slab: > > > if (memcg_kmem_enabled() && > > > !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE)) > > > continue; > > > > > > this should be taken care of already. So sc->memcg should be indeed > > > sufficient. So unless I am missing something I will respin my local > > > patch and post it later after the reporter has some time to test the > > > current one. > > > > The above patch seems to help. At least the problem haven't occurred for > > the last ~40 VM startups. > > I will consider this as > Tested-by: Marek Marczykowski-Górecki <marmarek@mimuw.edu.pl> > > OK? Thanks for the report and testing! Yes. -- Pozdrawiam / Best Regards, Marek Marczykowski-Górecki | RLU #390519 marmarek at staszic waw pl | xmpp:marmarek at staszic waw pl [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages 2016-11-30 18:16 ` Johannes Weiner 2016-11-30 18:30 ` Michal Hocko @ 2016-12-01 10:07 ` Vladimir Davydov 1 sibling, 0 replies; 9+ messages in thread From: Vladimir Davydov @ 2016-12-01 10:07 UTC (permalink / raw) To: Johannes Weiner Cc: Michal Hocko, Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, marmarek On Wed, Nov 30, 2016 at 01:16:53PM -0500, Johannes Weiner wrote: > Hi Michael, > > On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote: > > > > [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at > > > > 0000000000000400 > > > > [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40 > > > > [ 15.665225] PGD 0 > > > > [ 15.665230] Oops: 0000 [#1] SMP > > > > [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT > > > > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE > > > > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i > > > > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl > > > > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel > > > > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O) > > > > xen_blkback xenfs xen_privcmd xen_blkfront > > > > [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O > > > > 4.8.10-12.pvops.qubes.x86_64 #1 > > > > [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000 > > > > [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>] > > > > mem_cgroup_node_nr_lru_pages+0x20/0x40 > > > > [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293 > > > > [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX: > > > > 0000000000000002 > > > > [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI: > > > > 0000000000000000 > > > > I cannot generate a similar code to yours but the above suggests that we > > are getting NULL memcg. This would suggest a global reclaim and > > count_shadow_nodes misinterprets that because it does > > > > if (memcg_kmem_enabled()) { > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, > > LRU_ALL_FILE); > > } else { > > pages = node_page_state(NODE_DATA(sc->nid), NR_ACTIVE_FILE) + > > node_page_state(NODE_DATA(sc->nid), NR_INACTIVE_FILE); > > } > > > > this might be a race with kmem enabling AFAICS. Anyaway I believe that > > the above check needs to ne extended for the sc->memcg != NULL > > Yep, my locally built code looks very different from the report, but > it's clear that memcg is NULL. I didn't see the race you mention, but > it makes sense to me: shrink_slab() is supposed to filter memcg-aware > shrinkers based on whether we have a memcg or not, but it only does it > when kmem accounting is enabled; if it's disabled, the shrinker should > also use its non-memcg behavior. However, nothing prevents a memcg > with kmem from onlining between the filter and the shrinker run. Yeah, I think the issue can be easily reproduced by triggering the reclaimer while running mkdir/rmdir on a memory cgroup directory in a loop provided no other memory cgroup exists in the system. > > > diff --git a/mm/workingset.c b/mm/workingset.c > > index 617475f529f4..0f07522c5c0e 100644 > > --- a/mm/workingset.c > > +++ b/mm/workingset.c > > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker, > > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc); > > local_irq_enable(); > > > > - if (memcg_kmem_enabled()) { > > + if (memcg_kmem_enabled() && sc->memcg) { > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid, > > LRU_ALL_FILE); > > } else { > > If we do that, I'd remove the racy memcg_kmem_enabled() check > altogether and just check for whether we have a memcg or not. Agree. BTW this is how the other memcg-aware shrinker, list_lru, works - see list_lru_shrink_count -> memcg_cache_id. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-12-01 10:58 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <bug-189181-27@https.bugzilla.kernel.org/> 2016-11-29 22:56 ` [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages Andrew Morton 2016-11-30 17:00 ` Michal Hocko 2016-11-30 18:16 ` Johannes Weiner 2016-11-30 18:30 ` Michal Hocko 2016-12-01 0:33 ` Balbir Singh 2016-12-01 2:24 ` Marek Marczykowski-Górecki 2016-12-01 7:02 ` Michal Hocko 2016-12-01 10:58 ` Marek Marczykowski-Górecki 2016-12-01 10:07 ` Vladimir Davydov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).