From: Michal Hocko <mhocko@suse.cz>
To: Hugh Dickins <hughd@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>,
Dave Airlie <airlied@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
"intel-gfx@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>, Tejun Heo <tj@kernel.org>,
Vladimir Davydov <vdavydov@parallels.com>,
Jet Chen <jet.chen@intel.com>, Felipe Balbi <balbi@ti.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org
Subject: [PATCH] memcg, shmem: fix shmem migration to use lrucare. (was: Re: [Intel-gfx] memcontrol.c BUG)
Date: Mon, 2 Feb 2015 16:00:51 +0100
Message-ID: <20150202150050.GD4583@dhcp22.suse.cz>
In-Reply-To: <alpine.LSU.2.11.1501291751170.1761@eggly.anvils>
On Thu 29-01-15 18:04:15, Hugh Dickins wrote:
> On Wed, 28 Jan 2015, Michal Hocko wrote:
> > On Wed 28-01-15 08:48:52, Chris Wilson wrote:
> > > On Wed, Jan 28, 2015 at 08:13:06AM +1000, Dave Airlie wrote:
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1165369
> > > >
> > > > Nov 18 09:23:22 elissa.gathman.org kernel: page:f5e36a40 count:2
> > > > mapcount:0 mapping: (null) index:0x0
> > > > Nov 18 09:23:22 elissa.gathman.org kernel: page flags:
> > > > 0x80090029(locked|uptodate|lru|swapcache|swapbacked)
> > > > Nov 18 09:23:22 elissa.gathman.org kernel: page dumped because:
> > > > VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))
> > > > Nov 18 09:23:23 elissa.gathman.org kernel: ------------[ cut here ]------------
> > > > Nov 18 09:23:23 elissa.gathman.org kernel: kernel BUG at mm/memcontrol.c:6733!
> >
> > I guess this matches the following bugon in your kernel:
> > VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage);
> >
> > so the oldpage is on the LRU list already. I am completely unfamiliar
> > with 965GM but is the page perhaps shared with somebody with a different
> > gfp mask requirement (e.g. userspace accessing the memory via mmap)? So
> > the other (racing) caller didn't need to move the page and put it on
> > LRU.
>
> It would be surprising (but not impossible) for oldpage not to be on
> the LRU already: it's a swapin readahead page that has every right to
> be on LRU,
True, thanks for pointing this out.
> but turns out to have been allocated from an unsuitable zone,
> once we discover that it's needed in one of these odd hardware-limited
> mappings. (Whereas newpage is newly allocated and not yet on LRU.)
>
> >
> > If yes we need to tell shmem_replace_page to do the lrucare handling.
>
> Absolutely, thanks Michal. It would also be good to change the comment
> on mem_cgroup_migrate() in mm/memcontrol.c, from "@lrucare: both pages..."
> to "@lrucare: either or both pages..." - though I certainly won't pretend
> that the corrected wording would have prevented this bug creeping in!
Yes, I have updated the wording.
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 339e06639956..e3cdc1a16c0f 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -1013,7 +1013,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> > */
> > oldpage = newpage;
> > } else {
> > - mem_cgroup_migrate(oldpage, newpage, false);
> > + mem_cgroup_migrate(oldpage, newpage, true);
> > lru_cache_add_anon(newpage);
> > *pagep = newpage;
> > }
>
> Acked-by: Hugh Dickins <hughd@google.com>
Thanks! The full patch is below. I wasn't sure who was the one to report
the issue so I hope the credits are right. I have marked the patch for
stable because some people are running with VM debugging enabled. AFAICS
the issue is not so harmful without debugging on because the stale
oldpage would be removed from the LRU list eventually.
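
For readers following the thread, here is a rough userspace sketch of why
the lrucare argument matters. It is a toy model only, not the kernel code:
struct toy_page, toy_memcg_migrate() and their fields are hypothetical
names invented for illustration, and the function returns false where the
real mem_cgroup_migrate() would trip
VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the state discussed here. */
struct toy_page {
	bool on_lru;	/* models PageLRU(page) */
	bool charged;	/* models the page holding a memcg charge */
};

/*
 * Toy model of the charge transfer done during page replacement.
 * Returns false where a kernel with VM debugging enabled would BUG:
 * the caller claimed no LRU handling is needed (lrucare == false)
 * but one of the pages is already on the LRU.
 */
static bool toy_memcg_migrate(struct toy_page *oldpage,
			      struct toy_page *newpage, bool lrucare)
{
	assert(oldpage && newpage);

	if (!lrucare && (oldpage->on_lru || newpage->on_lru))
		return false;

	/* Move the charge from the old page to the new one. */
	if (oldpage->charged) {
		oldpage->charged = false;
		newpage->charged = true;
	}
	return true;
}
```

In this model a swapin readahead page (on_lru set) passed with
lrucare == false is rejected, which corresponds to the reported BUG, while
the same call with lrucare == true transfers the charge.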
---