From: Vladimir Davydov <vdavydov@tarantool.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@techsingularity.net>,
bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org,
marmarek@mimuw.edu.pl
Subject: Re: [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages
Date: Thu, 1 Dec 2016 13:07:17 +0300
Message-ID: <20161201100717.GA13790@esperanza>
In-Reply-To: <20161130181653.GA30558@cmpxchg.org>
On Wed, Nov 30, 2016 at 01:16:53PM -0500, Johannes Weiner wrote:
> Hi Michael,
>
> On Wed, Nov 30, 2016 at 06:00:40PM +0100, Michal Hocko wrote:
> > > > [ 15.665196] BUG: unable to handle kernel NULL pointer dereference at
> > > > 0000000000000400
> > > > [ 15.665213] IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
> > > > [ 15.665225] PGD 0
> > > > [ 15.665230] Oops: 0000 [#1] SMP
> > > > [ 15.665235] Modules linked in: fuse xt_nat xen_netback xt_REDIRECT
> > > > nf_nat_redirect ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE
> > > > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_i
> > > > pv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack intel_rapl
> > > > x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel
> > > > ghash_clmulni_intel pcspkr dummy_hcd udc_core u2mfn(O)
> > > > xen_blkback xenfs xen_privcmd xen_blkfront
> > > > [ 15.665285] CPU: 0 PID: 60 Comm: kswapd0 Tainted: G O
> > > > 4.8.10-12.pvops.qubes.x86_64 #1
> > > > [ 15.665292] task: ffff880011863b00 task.stack: ffff880011868000
> > > > [ 15.665297] RIP: e030:[<ffffffff8122d520>] [<ffffffff8122d520>]
> > > > mem_cgroup_node_nr_lru_pages+0x20/0x40
> > > > [ 15.665307] RSP: e02b:ffff88001186bc70 EFLAGS: 00010293
> > > > [ 15.665311] RAX: 0000000000000000 RBX: ffff88001186bd20 RCX:
> > > > 0000000000000002
> > > > [ 15.665317] RDX: 000000000000000c RSI: 0000000000000000 RDI:
> > > > 0000000000000000
> >
> > I cannot generate a similar code to yours but the above suggests that we
> > are getting NULL memcg. This would suggest a global reclaim and
> > count_shadow_nodes misinterprets that because it does
> >
> > if (memcg_kmem_enabled()) {
> > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> > LRU_ALL_FILE);
> > } else {
> > pages = node_page_state(NODE_DATA(sc->nid), NR_ACTIVE_FILE) +
> > node_page_state(NODE_DATA(sc->nid), NR_INACTIVE_FILE);
> > }
> >
> > this might be a race with kmem enabling AFAICS. Anyway I believe that
> > the above check needs to be extended for the sc->memcg != NULL
>
> Yep, my locally built code looks very different from the report, but
> it's clear that memcg is NULL. I didn't see the race you mention, but
> it makes sense to me: shrink_slab() is supposed to filter memcg-aware
> shrinkers based on whether we have a memcg or not, but it only does it
> when kmem accounting is enabled; if it's disabled, the shrinker should
> also use its non-memcg behavior. However, nothing prevents a memcg
> with kmem from onlining between the filter and the shrinker run.
Yeah, I think the issue can be easily reproduced by triggering the
reclaimer while running mkdir/rmdir on a memory cgroup directory in a
loop, provided no other memory cgroup exists in the system.
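
A minimal sketch of the mkdir/rmdir half of such a reproducer, in plain
userspace C. The function name churn_dir() and any path passed to it are
hypothetical; the sketch only creates and removes a directory in a loop
and does not trigger the reclaimer, which has to be provoked separately
(for example by memory pressure).

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create and remove the directory at @path @iterations times.
 * On a memory cgroup mount, each mkdir onlines a new memcg and each
 * rmdir offlines it again (assumption: @path lies inside such a mount).
 *
 * Returns the number of completed mkdir+rmdir cycles, or -1 as soon as
 * either call fails.
 */
long churn_dir(const char *path, long iterations)
{
	long done = 0;

	assert(path != NULL);

	for (long i = 0; i < iterations; i++) {
		if (mkdir(path, 0755) != 0) {
			perror("mkdir");
			return -1;
		}
		if (rmdir(path) != 0) {
			perror("rmdir");
			return -1;
		}
		done++;
	}
	return done;
}
```

Calling churn_dir() with a large iteration count on a directory under
the memory cgroup mount point (commonly /sys/fs/cgroup/memory on cgroup
v1 setups, which is an assumption about the system), while no other
memory cgroup exists, is one way to keep a memcg going online and
offline. This has not been verified against the reported kernel.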
>
> > diff --git a/mm/workingset.c b/mm/workingset.c
> > index 617475f529f4..0f07522c5c0e 100644
> > --- a/mm/workingset.c
> > +++ b/mm/workingset.c
> > @@ -348,7 +348,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
> > shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
> > local_irq_enable();
> >
> > - if (memcg_kmem_enabled()) {
> > + if (memcg_kmem_enabled() && sc->memcg) {
> > pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> > LRU_ALL_FILE);
> > } else {
>
> If we do that, I'd remove the racy memcg_kmem_enabled() check
> altogether and just check for whether we have a memcg or not.
Agreed. BTW, this is how the other memcg-aware shrinker, list_lru, works -
see list_lru_shrink_count -> memcg_cache_id.
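
To make the suggested check concrete, here is a small userspace model of
the decision, not the kernel code itself. The struct definitions are
stand-ins with only the fields needed here, and file_lru_pages,
node_file_pages and count_file_pages() are hypothetical names introduced
for illustration. The point is that the branch depends only on whether
sc->memcg is set, so a NULL memcg (global reclaim) can never reach the
per-memcg path.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct mem_cgroup (illustrative only). */
struct mem_cgroup {
	unsigned long file_lru_pages;	/* hypothetical per-memcg file LRU size */
};

/* Stand-in for the kernel's struct shrink_control (illustrative only). */
struct shrink_control {
	int nid;			/* NUMA node being reclaimed */
	struct mem_cgroup *memcg;	/* NULL during global reclaim */
};

/*
 * Pick the page count to scale against.  @node_file_pages models the
 * node-wide active + inactive file counters.  No "is kmem accounting
 * enabled" test is involved, so there is no window in which a NULL
 * memcg gets dereferenced.
 */
unsigned long count_file_pages(const struct shrink_control *sc,
			       unsigned long node_file_pages)
{
	assert(sc != NULL);

	if (sc->memcg)
		return sc->memcg->file_lru_pages;	/* memcg reclaim */

	return node_file_pages;				/* global reclaim */
}
```

With a shrink_control whose memcg is NULL the function falls back to the
node-wide count; with a memcg set it returns that memcg's count.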
Thread overview: 9+ messages
[not found] <bug-189181-27@https.bugzilla.kernel.org/>
2016-11-29 22:56 ` [Bug 189181] New: BUG: unable to handle kernel NULL pointer dereference in mem_cgroup_node_nr_lru_pages Andrew Morton
2016-11-30 17:00 ` Michal Hocko
2016-11-30 18:16 ` Johannes Weiner
2016-11-30 18:30 ` Michal Hocko
2016-12-01 0:33 ` Balbir Singh
2016-12-01 2:24 ` Marek Marczykowski-Górecki
2016-12-01 7:02 ` Michal Hocko
2016-12-01 10:58 ` Marek Marczykowski-Górecki
2016-12-01 10:07 ` Vladimir Davydov [this message]