* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
[not found] <bug-189181-27@https.bugzilla.kernel.org/>
@ 2016-11-29 22:56 ` Andrew Morton
2016-11-30 17:00 ` Michal Hocko
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2016-11-29 22:56 UTC (permalink / raw)
To: Mel Gorman, Johannes Weiner, Michal Hocko
Cc: bugzilla-daemon, linux-mm, marmarek
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Sat, 26 Nov 2016 15:10:16 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=189181
>
> Bug ID: 189181
> Summary: BUG: unable to handle kernel NULL pointer dereference
> in mem_cgroup_node_nr_lru_pages
> Product: Memory Management
> Version: 2.5
> Kernel Version: 4.8.10
> Hardware: Intel
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Slab Allocator
> Assignee: akpm@linux-foundation.org
> Reporter: marmarek@mimuw.edu.pl
> Regression: No
>
> Created attachment 245931
> --> https://bugzilla.kernel.org/attachment.cgi?id=245931&action=edit
> Full console log
>
> Shortly after system startup sometimes (about 1/30 times) I get this:
>
> [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000400
> [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> [ 15.665225] PGD 0
> [ 15.665230] Oops: 0000 [#1] SMP
> [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT
> nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE
> nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i
> pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl
> x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel
> ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O)
> xen_blkback xenfs xen_privcmd xen_blkfront
> [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O
> 4.8.10-12.pvops.qubes.x86_64 #1
> [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000
> [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>]
> mem_cgroup_node_nr_lru_pages+0x20/0x40
> [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293
> [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX:
> 0000000000000002
> [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI:
> 0000000000000000
> [ 15.665322] RBP: ffff88001186bc70 R08: 28f5c28f5c28f5c3 R09:
> 0000000000000000
> [ 15.665327] R10: 0000000000006c34 R11: 0000000000000333 R12:
> 00000000000001f6
> [ 15.665332] R13: ffffffff81c6f6a0 R14: 0000000000000000 R15:
> 0000000000000000
> [ 15.665343] FS: 0000000000000000(0000) GS:ffff880013c00000(0000)
> knlGS:ffff880013d00000
> [ 15.665351] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 15.665358] CR2: 0000000000000400 CR3: 00000000122f2000 CR4:
> 0000000000042660
> [ 15.665366] Stack:
> [ 15.665371] ffff88001186bc98 ffffffff811e0dda 00000000000002eb
> 0000000000000080
> [ 15.665384] ffffffff81c6f6a0 ffff88001186bd70 ffffffff811c36d9
> 0000000000000000
> [ 15.665397] ffff88001186bcb0 ffff88001186bcb0 ffff88001186bcc0
> 000000000000abc5
> [ 15.665410] Call Trace:
> [ 15.665419] [<ffffffff811e0dda>] count_shadow_nodes+0x9a/0xa0
> [ 15.665428] [<ffffffff811c36d9>] shrink_slab.part.42+0x119/0x3e0
> [ 15.666049] [<ffffffff811c83ec>] shrink_node+0x22c/0x320
> [ 15.666049] [<ffffffff811c928c>] kswapd+0x32c/0x700
> [ 15.666049] [<ffffffff811c8f60>] ? mem_cgroup_shrink_node+0x180/0x180
> [ 15.666049] [<ffffffff810c1b08>] kthread+0xd8/0xf0
> [ 15.666049] [<ffffffff817a3abf>] ret_from_fork+0x1f/0x40
> [ 15.666049] [<ffffffff810c1a30>] ? kthread_create_on_node+0x190/0x190
> [ 15.666049] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 3b 35 dd
> eb b1 00 55 48 89 e5 73 2c 89 d2 31 c9 31 c0 4c 63 ce 48 0f a3 ca 73 13 <4a> 8b
> b4 cf 00 04 00 00 41 89 c8 4a 03
> 84 c6 80 00 00 00 83 c1
> [ 15.666049] RIP [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> [ 15.666049] RSP <ffff88001186bc70>
> [ 15.666049] CR2: 0000000000000400
> [ 15.666049] ---[ end trace 100494b9edbdfc4d ]---
>
> After this, there is another "unable to handle kernel paging request" I guess
> because of do_exit in kswapd0, then a lot of soft lockups and system is
> unusable (see full log attached).
>
> This is running in PV domU on Xen 4.7.0 (the same also happens on Xen 4.6.3).
> Same happens on 4.8.7 too. Previously it was working on v4.4.31 without any
> problem.
>
> --
> You are receiving this mail because:
> You are the assignee for the bug.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
2016-11-29 22:56 ` [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages Andrew Morton
@ 2016-11-30 17:00 ` Michal Hocko
2016-11-30 18:16 ` Johannes Weiner
0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2016-11-30 17:00 UTC (permalink / raw)
To: Andrew Morton
Cc: Mel Gorman, Johannes Weiner, bugzilla-daemon, linux-mm, marmarek,
Vladimir Davydov
[CC Vladimir]
On Tue 29-11-16 14:56:54, Andrew Morton wrote:
>
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Sat, 26 Nov 2016 15:10:16 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
>
> > https://bugzilla.kernel.org/show_bug.cgi?id=189181
> >
> > Bug ID: 189181
> > Summary: BUG: unable to handle kernel NULL pointer dereference
> > in mem_cgroup_node_nr_lru_pages
> > Product: Memory Management
> > Version: 2.5
> > Kernel Version: 4.8.10
> > Hardware: Intel
> > OS: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: normal
> > Priority: P1
> > Component: Slab Allocator
> > Assignee: akpm@linux-foundation.org
> > Reporter: marmarek@mimuw.edu.pl
> > Regression: No
> >
> > Created attachment 245931
> > --> https://bugzilla.kernel.org/attachment.cgi?id=245931&action=edit
> > Full console log
> >
> > Shortly after system startup sometimes (about 1/30 times) I get this:
> >
> > [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000400
> > [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> > [ 15.665225] PGD 0
> > [ 15.665230] Oops: 0000 [#1] SMP
> > [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT
> > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE
> > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i
> > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl
> > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel
> > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O)
> > xen_blkback xenfs xen_privcmd xen_blkfront
> > [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O
> > 4.8.10-12.pvops.qubes.x86_64 #1
> > [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000
> > [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>]
> > mem_cgroup_node_nr_lru_pages+0x20/0x40
> > [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293
> > [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX:
> > 0000000000000002
> > [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI:
> > 0000000000000000
I cannot generate code similar to yours, but the above suggests that we
are getting a NULL memcg. This would suggest a global reclaim, and
count_shadow_nodes misinterprets that because it does

	if (memcg_kmem_enabled()) {
		pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
						     LRU_ALL_FILE);
	} else {
		pages = node_page_state(NODE_DATA(sc->nid), NR_ACTIVE_FILE) +
			node_page_state(NODE_DATA(sc->nid), NR_INACTIVE_FILE);
	}

This might be a race with kmem enabling AFAICS. Anyway, I believe that
the above check needs to be extended with sc->memcg != NULL:
diff --git a/mm/workingset.c b/mm/workingset.c
index 617475f529f4..0f07522c5c0e 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
 	shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
 	local_irq_enable();
 
-	if (memcg_kmem_enabled()) {
+	if (memcg_kmem_enabled() && sc->memcg) {
 		pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
 						     LRU_ALL_FILE);
 	} else {
Or am I missing something?
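For readers following along, the effect of the one-line change can be
modelled outside the kernel. The following is a minimal userspace sketch,
not the kernel code: struct mem_cgroup, struct shrink_control,
node_file_pages and count_file_pages are simplified stand-ins invented for
illustration, and the kmem_enabled argument stands in for
memcg_kmem_enabled().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 2

/* Simplified stand-in for the kernel's struct mem_cgroup (assumption). */
struct mem_cgroup {
	unsigned long file_pages[MAX_NODES];	/* per-node file LRU pages */
};

/* Simplified stand-in for struct shrink_control (assumption). */
struct shrink_control {
	struct mem_cgroup *memcg;	/* NULL means global reclaim */
	int nid;
};

/* Stand-in for the global per-node counters read via node_page_state(). */
static unsigned long node_file_pages[MAX_NODES] = { 1000, 2000 };

/* Stand-in for mem_cgroup_node_nr_lru_pages(): dereferences memcg. */
static unsigned long memcg_node_file_pages(struct mem_cgroup *memcg, int nid)
{
	assert(memcg != NULL);	/* the kernel would oops here instead */
	return memcg->file_pages[nid];
}

/*
 * Models the patched branch in count_shadow_nodes(): use the memcg
 * counters only when a memcg was actually passed in, otherwise fall
 * back to the node-wide counters.
 */
static unsigned long count_file_pages(const struct shrink_control *sc,
				      bool kmem_enabled)
{
	assert(sc->nid >= 0 && sc->nid < MAX_NODES);

	if (kmem_enabled && sc->memcg)
		return memcg_node_file_pages(sc->memcg, sc->nid);

	return node_file_pages[sc->nid];
}
```

In this model a global-reclaim call (memcg == NULL) returns the node-wide
count whether or not kmem_enabled is set, so the NULL pointer is never
dereferenced.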
[Keeping the rest of the email for the reference]
> > [ 15.665322] RBP: ffff88001186bc70 R08: 28f5c28f5c28f5c3 R09:
> > 0000000000000000
> > [ 15.665327] R10: 0000000000006c34 R11: 0000000000000333 R12:
> > 00000000000001f6
> > [ 15.665332] R13: ffffffff81c6f6a0 R14: 0000000000000000 R15:
> > 0000000000000000
> > [ 15.665343] FS: 0000000000000000(0000) GS:ffff880013c00000(0000)
> > knlGS:ffff880013d00000
> > [ 15.665351] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 15.665358] CR2: 0000000000000400 CR3: 00000000122f2000 CR4:
> > 0000000000042660
> > [ 15.665366] Stack:
> > [ 15.665371] ffff88001186bc98 ffffffff811e0dda 00000000000002eb
> > 0000000000000080
> > [ 15.665384] ffffffff81c6f6a0 ffff88001186bd70 ffffffff811c36d9
> > 0000000000000000
> > [ 15.665397] ffff88001186bcb0 ffff88001186bcb0 ffff88001186bcc0
> > 000000000000abc5
> > [ 15.665410] Call Trace:
> > [ 15.665419] [<ffffffff811e0dda>] count_shadow_nodes+0x9a/0xa0
> > [ 15.665428] [<ffffffff811c36d9>] shrink_slab.part.42+0x119/0x3e0
> > [ 15.666049] [<ffffffff811c83ec>] shrink_node+0x22c/0x320
> > [ 15.666049] [<ffffffff811c928c>] kswapd+0x32c/0x700
> > [ 15.666049] [<ffffffff811c8f60>] ? mem_cgroup_shrink_node+0x180/0x180
> > [ 15.666049] [<ffffffff810c1b08>] kthread+0xd8/0xf0
> > [ 15.666049] [<ffffffff817a3abf>] ret_from_fork+0x1f/0x40
> > [ 15.666049] [<ffffffff810c1a30>] ? kthread_create_on_node+0x190/0x190
> > [ 15.666049] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 3b 35 dd
> > eb b1 00 55 48 89 e5 73 2c 89 d2 31 c9 31 c0 4c 63 ce 48 0f a3 ca 73 13 <4a> 8b
> > b4 cf 00 04 00 00 41 89 c8 4a 03
> > 84 c6 80 00 00 00 83 c1
> > [ 15.666049] RIP [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> > [ 15.666049] RSP <ffff88001186bc70>
> > [ 15.666049] CR2: 0000000000000400
> > [ 15.666049] ---[ end trace 100494b9edbdfc4d ]---
> >
> > After this, there is another "unable to handle kernel paging request" I guess
> > because of do_exit in kswapd0, then a lot of soft lockups and system is
> > unusable (see full log attached).
> >
> > This is running in PV domU on Xen 4.7.0 (the same also happens on Xen 4.6.3).
> > Same happens on 4.8.7 too. Previously it was working on v4.4.31 without any
> > problem.
> >
> > --
> > You are receiving this mail because:
> > You are the assignee for the bug.
--
Michal Hocko
SUSE Labs
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
2016-11-30 17:00 ` Michal Hocko
@ 2016-11-30 18:16 ` Johannes Weiner
2016-11-30 18:30 ` Michal Hocko
2016-12-01 10:07 ` Vladimir Davydov
0 siblings, 2 replies; 9+ messages in thread
From: Johannes Weiner @ 2016-11-30 18:16 UTC (permalink / raw)
To: Michal Hocko
Cc: Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, marmarek,
Vladimir Davydov
Hi Michael,
On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote:
> > > [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at
> > > 0000000000000400
> > > [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> > > [ 15.665225] PGD 0
> > > [ 15.665230] Oops: 0000 [#1] SMP
> > > [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT
> > > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE
> > > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i
> > > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl
> > > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel
> > > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O)
> > > xen_blkback xenfs xen_privcmd xen_blkfront
> > > [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O
> > > 4.8.10-12.pvops.qubes.x86_64 #1
> > > [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000
> > > [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>]
> > > mem_cgroup_node_nr_lru_pages+0x20/0x40
> > > [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293
> > > [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX:
> > > 0000000000000002
> > > [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI:
> > > 0000000000000000
>
> I cannot generate a similar code to yours but the above suggests that we
> are getting NULL memcg. This would suggest a global reclaim and
> count_shadow_nodes misinterprets that because it does
>
> if (memcg_kmem_enabled()) {
> pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> LRU_ALL_FILE);
> } else {
> pages = node_page_state(NODE_DATA(sc->nid), NR_ACTIVE_FILE) +
> node_page_state(NODE_DATA(sc->nid), NR_INACTIVE_FILE);
> }
>
> this might be a race with kmem enabling AFAICS. Anyway I believe that
> the above check needs to be extended for the sc->memcg != NULL
Yep, my locally built code looks very different from the report, but
it's clear that memcg is NULL. I didn't see the race you mention, but
it makes sense to me: shrink_slab() is supposed to filter memcg-aware
shrinkers based on whether we have a memcg or not, but it only does it
when kmem accounting is enabled; if it's disabled, the shrinker should
also use its non-memcg behavior. However, nothing prevents a memcg
with kmem from onlining between the filter and the shrinker run.
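The window described above can be replayed deterministically in a small
userspace model. This is only a sketch under stated assumptions:
kmem_enabled stands in for memcg_kmem_enabled(), shrinker_is_skipped
mirrors the filter in shrink_slab(), and shrinker_takes_memcg_path mirrors
the unpatched test in count_shadow_nodes(). None of these helper names
exist in the kernel, and the flag value is assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SHRINKER_MEMCG_AWARE (1 << 1)	/* value assumed for illustration */

struct mem_cgroup;	/* opaque, only the pointer value matters here */

/* Stand-in for the state behind memcg_kmem_enabled(). */
static bool kmem_enabled;

/* Mirrors the filter in shrink_slab(): true means "skip this shrinker". */
static bool shrinker_is_skipped(const struct mem_cgroup *memcg,
				unsigned long flags)
{
	return kmem_enabled &&
	       !!memcg != !!(flags & SHRINKER_MEMCG_AWARE);
}

/* Mirrors the unpatched test in count_shadow_nodes(). */
static bool shrinker_takes_memcg_path(void)
{
	return kmem_enabled;
}

/*
 * Replays the suspected interleaving for global reclaim (memcg == NULL)
 * of a memcg-aware shrinker. Returns true when the shrinker would go on
 * to dereference the NULL memcg.
 */
static bool replay_race(bool kmem_onlines_in_window)
{
	const struct mem_cgroup *memcg = NULL;

	kmem_enabled = false;
	if (shrinker_is_skipped(memcg, SHRINKER_MEMCG_AWARE))
		return false;		/* filtered out, shrinker never runs */

	if (kmem_onlines_in_window)
		kmem_enabled = true;	/* first kmem-enabled memcg onlines */

	return shrinker_takes_memcg_path() && memcg == NULL;
}
```

In this model replay_race(true) returns true, meaning the unpatched
shrinker would reach mem_cgroup_node_nr_lru_pages() with a NULL memcg,
while replay_race(false) returns false.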
> diff --git a/mm/workingset.c b/mm/workingset.c
> index 617475f529f4..0f07522c5c0e 100644
> --- a/mm/workingset.c
> +++ b/mm/workingset.c
> @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
> shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
> local_irq_enable();
>
> - if (memcg_kmem_enabled()) {
> + if (memcg_kmem_enabled() && sc->memcg) {
> pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> LRU_ALL_FILE);
> } else {
If we do that, I'd remove the racy memcg_kmem_enabled() check
altogether and just check for whether we have a memcg or not.
What do you think, Vladimir?
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
2016-11-30 18:16 ` Johannes Weiner
@ 2016-11-30 18:30 ` Michal Hocko
2016-12-01 0:33 ` Balbir Singh
2016-12-01 2:24 ` Marek Marczykowski-Górecki
2016-12-01 10:07 ` Vladimir Davydov
1 sibling, 2 replies; 9+ messages in thread
From: Michal Hocko @ 2016-11-30 18:30 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, marmarek,
Vladimir Davydov
On Wed 30-11-16 13:16:53, Johannes Weiner wrote:
> Hi Michael,
>
> On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote:
[...]
> > diff --git a/mm/workingset.c b/mm/workingset.c
> > index 617475f529f4..0f07522c5c0e 100644
> > --- a/mm/workingset.c
> > +++ b/mm/workingset.c
> > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
> > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
> > local_irq_enable();
> >
> > - if (memcg_kmem_enabled()) {
> > + if (memcg_kmem_enabled() && sc->memcg) {
> > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> > LRU_ALL_FILE);
> > } else {
>
> If we do that, I'd remove the racy memcg_kmem_enabled() check
> altogether and just check for whether we have a memcg or not.
But that would make this a memcg-aware shrinker even when kmem is not
enabled...

But now that I am looking into the code, in shrink_slab:

		if (memcg_kmem_enabled() &&
		    !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE))
			continue;

this should be taken care of already, so checking sc->memcg should indeed
be sufficient. Unless I am missing something, I will respin my local
patch and post it later, after the reporter has had some time to test the
current one.
> What do you think, Vladimir?
--
Michal Hocko
SUSE Labs
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
2016-11-30 18:30 ` Michal Hocko
@ 2016-12-01 0:33 ` Balbir Singh
2016-12-01 2:24 ` Marek Marczykowski-Górecki
1 sibling, 0 replies; 9+ messages in thread
From: Balbir Singh @ 2016-12-01 0:33 UTC (permalink / raw)
To: Michal Hocko, Johannes Weiner
Cc: Andrew Morton, Mel Gorman, bugzilla-daemon, linux-mm, marmarek,
Vladimir Davydov
On 01/12/16 05:30, Michal Hocko wrote:
> On Wed 30-11-16 13:16:53, Johannes Weiner wrote:
>> Hi Michael,
>>
>> On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote:
> [...]
>>> diff --git a/mm/workingset.c b/mm/workingset.c
>>> index 617475f529f4..0f07522c5c0e 100644
>>> --- a/mm/workingset.c
>>> +++ b/mm/workingset.c
>>> @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
>>> shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
>>> local_irq_enable();
>>>
>>> - if (memcg_kmem_enabled()) {
>>> + if (memcg_kmem_enabled() && sc->memcg) {
>>> pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
>>> LRU_ALL_FILE);
>>> } else {
>>
>> If we do that, I'd remove the racy memcg_kmem_enabled() check
>> altogether and just check for whether we have a memcg or not.
>
> But that would make this a memcg aware shrinker even when kmem is not
> enabled...
>
> But now that I am looking into the code
> shrink_slab:
> if (memcg_kmem_enabled() &&
> !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE))
> continue;
>
> this should be taken care of already. So sc->memcg should be indeed
> sufficient. So unless I am missing something I will respin my local
> patch and post it later after the reporter has some time to test the
> current one.
>
I did a quick disassembly of the code.

R9 and RDI are both 0 and the faulting instruction seems to be

	mov rsi, [rdi+r9*8+0x400]

RDI is NULL, so sc->memcg is NULL, which indicates global reclaim.

The check referred to earlier:

		/*
		 * If kernel memory accounting is disabled, we ignore
		 * SHRINKER_MEMCG_AWARE flag and call all shrinkers
		 * passing NULL for memcg.
		 */
		if (memcg_kmem_enabled() &&
		    !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE))
			continue;

So we do pass NULL for memcg.

A check for sc->memcg should be enough in count_shadow_nodes, and a VM_BUG_ON
for memcg == NULL in mem_cgroup_node_nr_lru_pages would be nice.
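As a cross-check of the disassembly above, the faulting address can be
recomputed from the register dump. This is a minimal sketch, assuming the
operand really decodes as [rdi+r9*8+0x400]; effective_address is a
hypothetical helper written for illustration, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * x86-64 effective address for a [base + index*scale + disp] operand.
 * Hypothetical helper for illustration only.
 */
static uint64_t effective_address(uint64_t base, uint64_t index,
				  unsigned int scale, int32_t disp)
{
	assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
	return base + index * scale + (uint64_t)(int64_t)disp;
}
```

With RDI = 0 and R9 = 0 as in the oops, effective_address(0, 0, 8, 0x400)
evaluates to 0x400, which matches the value reported in CR2.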
Balbir Singh.
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
2016-11-30 18:30 ` Michal Hocko
2016-12-01 0:33 ` Balbir Singh
@ 2016-12-01 2:24 ` Marek Marczykowski-Górecki
2016-12-01 7:02 ` Michal Hocko
1 sibling, 1 reply; 9+ messages in thread
From: Marek Marczykowski-Górecki @ 2016-12-01 2:24 UTC (permalink / raw)
To: Michal Hocko
Cc: Johannes Weiner, Andrew Morton, Mel Gorman, bugzilla-daemon,
linux-mm, Vladimir Davydov
On Wed, Nov 30, 2016 at 07:30:17PM +0100, Michal Hocko wrote:
> On Wed 30-11-16 13:16:53, Johannes Weiner wrote:
> > Hi Michael,
> >
> > On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote:
> [...]
> > > diff --git a/mm/workingset.c b/mm/workingset.c
> > > index 617475f529f4..0f07522c5c0e 100644
> > > --- a/mm/workingset.c
> > > +++ b/mm/workingset.c
> > > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
> > > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
> > > local_irq_enable();
> > >
> > > - if (memcg_kmem_enabled()) {
> > > + if (memcg_kmem_enabled() && sc->memcg) {
> > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> > > LRU_ALL_FILE);
> > > } else {
> >
> > If we do that, I'd remove the racy memcg_kmem_enabled() check
> > altogether and just check for whether we have a memcg or not.
>
> But that would make this a memcg aware shrinker even when kmem is not
> enabled...
>
> But now that I am looking into the code
> shrink_slab:
> if (memcg_kmem_enabled() &&
> !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE))
> continue;
>
> this should be taken care of already. So sc->memcg should be indeed
> sufficient. So unless I am missing something I will respin my local
> patch and post it later after the reporter has some time to test the
> current one.
The above patch seems to help. At least the problem hasn't occurred for
the last ~40 VM startups.
>
> > What do you think, Vladimir?
>
--
Pozdrawiam / Best Regards,
Marek Marczykowski-Górecki | RLU #390519
marmarek at staszic waw pl | xmpp:marmarek at staszic waw pl
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
2016-12-01 2:24 ` Marek Marczykowski-Górecki
@ 2016-12-01 7:02 ` Michal Hocko
2016-12-01 10:58 ` Marek Marczykowski-Górecki
0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2016-12-01 7:02 UTC (permalink / raw)
To: Marek Marczykowski-Górecki
Cc: Johannes Weiner, Andrew Morton, Mel Gorman, bugzilla-daemon,
linux-mm, Vladimir Davydov
On Thu 01-12-16 03:24:54, Marek Marczykowski-Górecki wrote:
> On Wed, Nov 30, 2016 at 07:30:17PM +0100, Michal Hocko wrote:
> > On Wed 30-11-16 13:16:53, Johannes Weiner wrote:
> > > Hi Michael,
> > >
> > > On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote:
> > [...]
> > > > diff --git a/mm/workingset.c b/mm/workingset.c
> > > > index 617475f529f4..0f07522c5c0e 100644
> > > > --- a/mm/workingset.c
> > > > +++ b/mm/workingset.c
> > > > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
> > > > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
> > > > local_irq_enable();
> > > >
> > > > - if (memcg_kmem_enabled()) {
> > > > + if (memcg_kmem_enabled() && sc->memcg) {
> > > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> > > > LRU_ALL_FILE);
> > > > } else {
> > >
> > > If we do that, I'd remove the racy memcg_kmem_enabled() check
> > > altogether and just check for whether we have a memcg or not.
> >
> > But that would make this a memcg aware shrinker even when kmem is not
> > enabled...
> >
> > But now that I am looking into the code
> > shrink_slab:
> > if (memcg_kmem_enabled() &&
> > !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE))
> > continue;
> >
> > this should be taken care of already. So sc->memcg should be indeed
> > sufficient. So unless I am missing something I will respin my local
> > patch and post it later after the reporter has some time to test the
> > current one.
>
> > The above patch seems to help. At least the problem hasn't occurred for
> the last ~40 VM startups.
I will consider this as
Tested-by: Marek Marczykowski-Górecki <marmarek@mimuw.edu.pl>
OK? Thanks for the report and testing!
--
Michal Hocko
SUSE Labs
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
2016-11-30 18:16 ` Johannes Weiner
2016-11-30 18:30 ` Michal Hocko
@ 2016-12-01 10:07 ` Vladimir Davydov
1 sibling, 0 replies; 9+ messages in thread
From: Vladimir Davydov @ 2016-12-01 10:07 UTC (permalink / raw)
To: Johannes Weiner
Cc: Michal Hocko, Andrew Morton, Mel Gorman, bugzilla-daemon,
linux-mm, marmarek
On Wed, Nov 30, 2016 at 01:16:53PM -0500, Johannes Weiner wrote:
> Hi Michael,
>
> On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote:
> > > > [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at
> > > > 0000000000000400
> > > > [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> > > > [ 15.665225] PGD 0
> > > > [ 15.665230] Oops: 0000 [#1] SMP
> > > > [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT
> > > > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE
> > > > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i
> > > > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl
> > > > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel
> > > > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O)
> > > > xen_blkback xenfs xen_privcmd xen_blkfront
> > > > [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O
> > > > 4.8.10-12.pvops.qubes.x86_64 #1
> > > > [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000
> > > > [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>]
> > > > mem_cgroup_node_nr_lru_pages+0x20/0x40
> > > > [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293
> > > > [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX:
> > > > 0000000000000002
> > > > [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI:
> > > > 0000000000000000
> >
> > I cannot generate a similar code to yours but the above suggests that we
> > are getting NULL memcg. This would suggest a global reclaim and
> > count_shadow_nodes misinterprets that because it does
> >
> > if (memcg_kmem_enabled()) {
> > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> > LRU_ALL_FILE);
> > } else {
> > pages = node_page_state(NODE_DATA(sc->nid), NR_ACTIVE_FILE) +
> > node_page_state(NODE_DATA(sc->nid), NR_INACTIVE_FILE);
> > }
> >
> > this might be a race with kmem enabling AFAICS. Anyway I believe that
> > the above check needs to be extended for the sc->memcg != NULL
>
> Yep, my locally built code looks very different from the report, but
> it's clear that memcg is NULL. I didn't see the race you mention, but
> it makes sense to me: shrink_slab() is supposed to filter memcg-aware
> shrinkers based on whether we have a memcg or not, but it only does it
> when kmem accounting is enabled; if it's disabled, the shrinker should
> also use its non-memcg behavior. However, nothing prevents a memcg
> with kmem from onlining between the filter and the shrinker run.
Yeah, I think the issue can be easily reproduced by triggering the
reclaimer while running mkdir/rmdir on a memory cgroup directory in a
loop provided no other memory cgroup exists in the system.
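The mkdir/rmdir half of such a reproducer might look like the sketch
below. The helper name churn_cgroup_dir and the example path in the note
that follows are assumptions (the path depends on where the memory
controller is mounted), and generating reclaim pressure at the same time
is left out.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Repeatedly creates and removes one directory. On a memory cgroup
 * mount this onlines and offlines a memory cgroup on every iteration.
 * Returns the number of completed create/remove cycles.
 */
static int churn_cgroup_dir(const char *path, int iterations)
{
	int done = 0;

	assert(path != NULL);

	for (int i = 0; i < iterations; i++) {
		if (mkdir(path, 0755) != 0)
			break;		/* e.g. EACCES or EEXIST */
		if (rmdir(path) != 0)
			break;
		done++;
	}
	return done;
}
```

For example, churn_cgroup_dir("/sys/fs/cgroup/memory/repro", 100000) run
as root, where that mount point is an assumption about a cgroup v1 setup.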
>
> > diff --git a/mm/workingset.c b/mm/workingset.c
> > index 617475f529f4..0f07522c5c0e 100644
> > --- a/mm/workingset.c
> > +++ b/mm/workingset.c
> > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
> > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
> > local_irq_enable();
> >
> > - if (memcg_kmem_enabled()) {
> > + if (memcg_kmem_enabled() && sc->memcg) {
> > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> > LRU_ALL_FILE);
> > } else {
>
> If we do that, I'd remove the racy memcg_kmem_enabled() check
> altogether and just check for whether we have a memcg or not.
Agree. BTW this is how the other memcg-aware shrinker, list_lru, works -
see list_lru_shrink_count -> memcg_cache_id.
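The list_lru pattern mentioned above can be sketched in userspace as
follows. This is a simplified model, not the kernel source: the structures
are cut down, MAX_MEMCGS is an arbitrary bound, and the helpers only
imitate the shape of memcg_cache_id() and the per-memcg list lookup, where
a NULL memcg yields index -1 and selects the global list with no separate
kmem-enabled test.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MEMCGS 4	/* arbitrary bound for the sketch */

/* Cut-down stand-in for struct mem_cgroup (assumption). */
struct mem_cgroup {
	int kmemcg_id;	/* index into per-memcg arrays */
};

struct lru_one {
	long nr_items;
};

/* Cut-down stand-in for a per-node list_lru (assumption). */
struct lru_node {
	struct lru_one global;			/* used when there is no memcg */
	struct lru_one per_memcg[MAX_MEMCGS];
};

/* Imitates memcg_cache_id(): a NULL memcg maps to -1. */
static int memcg_id_or_none(const struct mem_cgroup *memcg)
{
	return memcg ? memcg->kmemcg_id : -1;
}

/* Picks the per-memcg list for a valid index, else the global list. */
static struct lru_one *lru_for_index(struct lru_node *nlru, int idx)
{
	if (idx >= 0 && idx < MAX_MEMCGS)
		return &nlru->per_memcg[idx];
	return &nlru->global;
}

/* Counts items for the given memcg, or globally when memcg is NULL. */
static long lru_count(struct lru_node *nlru, const struct mem_cgroup *memcg)
{
	return lru_for_index(nlru, memcg_id_or_none(memcg))->nr_items;
}
```

The point of the design is that the decision hinges only on the pointer
that was passed in, so there is no second, racy read of global state.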
* Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
2016-12-01 7:02 ` Michal Hocko
@ 2016-12-01 10:58 ` Marek Marczykowski-Górecki
0 siblings, 0 replies; 9+ messages in thread
From: Marek Marczykowski-Górecki @ 2016-12-01 10:58 UTC (permalink / raw)
To: Michal Hocko
Cc: Johannes Weiner, Andrew Morton, Mel Gorman, bugzilla-daemon,
linux-mm, Vladimir Davydov
On Thu, Dec 01, 2016 at 08:02:13AM +0100, Michal Hocko wrote:
> On Thu 01-12-16 03:24:54, Marek Marczykowski-Górecki wrote:
> > On Wed, Nov 30, 2016 at 07:30:17PM +0100, Michal Hocko wrote:
> > > On Wed 30-11-16 13:16:53, Johannes Weiner wrote:
> > > > Hi Michael,
> > > >
> > > > On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote:
> > > [...]
> > > > > diff --git a/mm/workingset.c b/mm/workingset.c
> > > > > index 617475f529f4..0f07522c5c0e 100644
> > > > > --- a/mm/workingset.c
> > > > > +++ b/mm/workingset.c
> > > > > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
> > > > > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
> > > > > local_irq_enable();
> > > > >
> > > > > - if (memcg_kmem_enabled()) {
> > > > > + if (memcg_kmem_enabled() && sc->memcg) {
> > > > > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> > > > > LRU_ALL_FILE);
> > > > > } else {
> > > >
> > > > If we do that, I'd remove the racy memcg_kmem_enabled() check
> > > > altogether and just check for whether we have a memcg or not.
> > >
> > > But that would make this a memcg aware shrinker even when kmem is not
> > > enabled...
> > >
> > > But now that I am looking into the code
> > > shrink_slab:
> > > if (memcg_kmem_enabled() &&
> > > !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE))
> > > continue;
> > >
> > > this should be taken care of already. So sc->memcg should be indeed
> > > sufficient. So unless I am missing something I will respin my local
> > > patch and post it later after the reporter has some time to test the
> > > current one.
> >
> > > The above patch seems to help. At least the problem hasn't occurred for
> > the last ~40 VM startups.
>
> I will consider this as
> Tested-by: Marek Marczykowski-Górecki <marmarek@mimuw.edu.pl>
>
> OK? Thanks for the report and testing!
Yes.
--
Pozdrawiam / Best Regards,
Marek Marczykowski-Górecki | RLU #390519
marmarek at staszic waw pl | xmpp:marmarek at staszic waw pl