Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>,
	Johannes Weiner <hannes@cmpxchg.org>,
	bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org,
	marmarek@mimuw.edu.pl, Vladimir Davydov <vdavydov.dev@gmail.com>
Subject: Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
Date: Wed, 30 Nov 2016 18:00:40 +0100	[thread overview]
Message-ID: <20161130170040.GJ18432@dhcp22.suse.cz> (raw)
In-Reply-To: <20161129145654.c48bebbd684edcd6f64a03fe@linux-foundation.org>

[CC Vladimir]

On Tue 29-11-16 14:56:54, Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Sat, 26 Nov 2016 15:10:16 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=189181
> > 
> >             Bug ID: 189181
> >            Summary: BUG: unable to handle kernel NULL pointer dereference
> >                     in mem_cgroup_node_nr_lru_pages
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 4.8.10
> >           Hardware: Intel
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Slab Allocator
> >           Assignee: akpm@linux-foundation.org
> >           Reporter: marmarek@mimuw.edu.pl
> >         Regression: No
> > 
> > Created attachment 245931
> >   --> https://bugzilla.kernel.org/attachment.cgi?id=245931&action=edit
> > Full console log
> > 
> > Shortly after system startup sometimes (about 1/30 times) I get this:
> > 
> > [   15.665196] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000400
> > [   15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> > [   15.665225] PGD 0 
> > [   15.665230] Oops: 0000 [#1] SMP
> > [   15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT
> > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE
> > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i
> > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl
> > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel
> > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O) 
> > xen_blkback xenfs xen_privcmd xen_blkfront
> > [   15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G           O   
> > 4.8.10-12.pvops.qubes.x86_64 #1
> > [   15.665292] task: ffff880011863b00 task.stack: ffff880011868000
> > [   15.665297] RIP: e030:[<ffffffff8122d520>]  [<ffffffff8122d520>]
> > mem_cgroup_node_nr_lru_pages+0x20/0x40
> > [   15.665307] RSP: e02b:ffff88001186bc70  EFLAGS: 00010293
> > [   15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX:
> > 0000000000000002
> > [   15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI:
> > 0000000000000000

I cannot generate a similar code to yours but the above suggests that we
are getting NULL memcg. This would suggest a global reclaim and
count_shadow_nodes misinterprets that because it does

	if (memcg_kmem_enabled()) {
		pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
						     LRU_ALL_FILE);
	} else {
		pages = node_page_state(NODE_DATA(sc->nid), NR_ACTIVE_FILE) +
			node_page_state(NODE_DATA(sc->nid), NR_INACTIVE_FILE);
	}

this might be a race with kmem enabling AFAICS. Anyaway I believe that
the above check needs to ne extended for the sc->memcg != NULL

diff --git a/mm/workingset.c b/mm/workingset.c
index 617475f529f4..0f07522c5c0e 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
 	shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
 	local_irq_enable();
 
-	if (memcg_kmem_enabled()) {
+	if (memcg_kmem_enabled() && sc->memcg) {
 		pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
 						     LRU_ALL_FILE);
 	} else {

Or am I missing something?

[Keeping the rest of the email for the reference]

> > [   15.665322] RBP: ffff88001186bc70 R08: 28f5c28f5c28f5c3 R09:
> > 0000000000000000
> > [   15.665327] R10: 0000000000006c34 R11: 0000000000000333 R12:
> > 00000000000001f6
> > [   15.665332] R13: ffffffff81c6f6a0 R14: 0000000000000000 R15:
> > 0000000000000000
> > [   15.665343] FS:  0000000000000000(0000) GS:ffff880013c00000(0000)
> > knlGS:ffff880013d00000
> > [   15.665351] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   15.665358] CR2: 0000000000000400 CR3: 00000000122f2000 CR4:
> > 0000000000042660
> > [   15.665366] Stack:
> > [   15.665371]  ffff88001186bc98 ffffffff811e0dda 00000000000002eb
> > 0000000000000080
> > [   15.665384]  ffffffff81c6f6a0 ffff88001186bd70 ffffffff811c36d9
> > 0000000000000000
> > [   15.665397]  ffff88001186bcb0 ffff88001186bcb0 ffff88001186bcc0
> > 000000000000abc5
> > [   15.665410] Call Trace:
> > [   15.665419]  [<ffffffff811e0dda>] count_shadow_nodes+0x9a/0xa0
> > [   15.665428]  [<ffffffff811c36d9>] shrink_slab.part.42+0x119/0x3e0
> > [   15.666049]  [<ffffffff811c83ec>] shrink_node+0x22c/0x320
> > [   15.666049]  [<ffffffff811c928c>] kswapd+0x32c/0x700
> > [   15.666049]  [<ffffffff811c8f60>] ? mem_cgroup_shrink_node+0x180/0x180
> > [   15.666049]  [<ffffffff810c1b08>] kthread+0xd8/0xf0
> > [   15.666049]  [<ffffffff817a3abf>] ret_from_fork+0x1f/0x40
> > [   15.666049]  [<ffffffff810c1a30>] ? kthread_create_on_node+0x190/0x190
> > [   15.666049] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 3b 35 dd
> > eb b1 00 55 48 89 e5 73 2c 89 d2 31 c9 31 c0 4c 63 ce 48 0f a3 ca 73 13 <4a> 8b
> > b4 cf 00 04 00 00 41 89 c8 4a 03
> >  84 c6 80 00 00 00 83 c1 
> > [   15.666049] RIP  [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> > [   15.666049]  RSP <ffff88001186bc70>
> > [   15.666049] CR2: 0000000000000400
> > [   15.666049] ---[ end trace 100494b9edbdfc4d ]---
> > 
> > After this, there is another "unable to handle kerneel paging request" I guess
> > because of do_exit in kswapd0, then a lot of soft lockups and system is
> > unusable (see full log attached).
> > 
> > This is running in PV domU on Xen 4.7.0 (the same also happens on Xen 4.6.3).
> > Same happens on 4.8.7 too. Previously it was working on v4.4.31 without any
> > problem.
> > 
> > -- 
> > You are receiving this mail because:
> > You are the assignee for the bug.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2016-11-30 17:00 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-189181-27@https.bugzilla.kernel.org/>
2016-11-29 22:56 ` [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages Andrew Morton
2016-11-30 17:00   ` Michal Hocko [this message]
2016-11-30 18:16     ` Johannes Weiner
2016-11-30 18:30       ` Michal Hocko
2016-12-01  0:33         ` Balbir Singh
2016-12-01  2:24         ` Marek Marczykowski-Górecki
2016-12-01  7:02           ` Michal Hocko
2016-12-01 10:58             ` Marek Marczykowski-Górecki
2016-12-01 10:07       ` Vladimir Davydov

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:617475f529f dfblob:0f07522c5c0 )
 OR (
bs:"Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161130170040.GJ18432@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bugzilla-daemon@bugzilla.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-mm@kvack.org \
    --cc=marmarek@mimuw.edu.pl \
    --cc=mgorman@techsingularity.net \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).