From: Vladimir Davydov <vdavydov@parallels.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>,
	"Suzuki K. Poulose" <Suzuki.Poulose@arm.com>,
	linux-mm@kvack.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Will Deacon <Will.Deacon@arm.com>
Subject: Re: [PATCH cgroup/for-3.19-fixes] cgroup: implement cgroup_subsys->unbind() callback
Date: Mon, 12 Jan 2015 11:01:14 +0300
Message-ID: <20150112080114.GE2110@esperanza>
In-Reply-To: <20150111205543.GA5480@phnom.home.cmpxchg.org>

On Sun, Jan 11, 2015 at 03:55:43PM -0500, Johannes Weiner wrote:
> On Sat, Jan 10, 2015 at 04:43:16PM -0500, Tejun Heo wrote:
> > > May be, we should kill the ref counter to the memory controller root in
> > > cgroup_kill_sb only if there is no children at all, neither online nor
> > > offline.
> > 
> > Ah, thanks for the analysis, but I really wanna avoid making hierarchy
> > destruction conditions opaque to userland.  This is userland visible
> > behavior.  It shouldn't be determined by kernel internals invisible
> > outside.  This patch adds ss->unbind() which memcg can hook into to
> > kick off draining of residual refs.  If this would work, I'll add this
> > patch to cgroup/for-3.19-fixes, possibly with stable cc'd.
> 
> How about this ->unbind() for memcg?
> 
> From d527ba1dbfdb58e1f7c7c4ee12b32ef2e5461990 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Sun, 11 Jan 2015 10:29:05 -0500
> Subject: [patch] mm: memcontrol: zap outstanding cache/swap references during
>  unbind
> 
> Cgroup core assumes that any outstanding css references after
> offlining are temporary in nature, and e.g. mount waits for them to
> disappear and release the root cgroup.  But leftover page cache and
> swapout records in an offlined memcg are only dropped when the pages
> get reclaimed under pressure or the swapped out pages get faulted in
> from other cgroups, and so those cgroup operations can hang forever.
> 
> Implement the ->unbind() callback to actively get rid of outstanding
> references when cgroup core wants them gone.  Swap out records are
> deleted, such that the swap-in path will charge those pages to the
> faulting task.  Page cache pages are moved to the root memory cgroup.

... and kmem pages are ignored. I reckon we could reparent them (I
submitted a patch set for that some time ago), but it is going to be
tricky and will complicate the regular kmem charge/uncharge paths as
well as list_lru_add/del. I don't think that cost is acceptable if all
we want is reparenting on unmount, is it?
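
To make the gap concrete, here is a toy userspace model of the drain
described in the quoted changelog. It is only a sketch, not kernel
code: toy_memcg, toy_unbind() and toy_total_refs() are hypothetical
names invented for illustration, and plain counters stand in for css
references. Swap records are dropped, page cache is moved to the root
group, and kmem is left alone, as in the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: none of these names exist in the kernel. */
enum toy_ref_kind {
	TOY_REF_PAGE_CACHE,
	TOY_REF_SWAP,
	TOY_REF_KMEM,
	TOY_REF_KINDS
};

struct toy_memcg {
	long refs[TOY_REF_KINDS];	/* outstanding references by kind */
};

/* Total references still pinning the group. */
long toy_total_refs(const struct toy_memcg *cg)
{
	long sum = 0;

	for (int k = 0; k < TOY_REF_KINDS; k++)
		sum += cg->refs[k];
	return sum;
}

/* Drain an offlined group the way the quoted changelog describes. */
void toy_unbind(struct toy_memcg *dead, struct toy_memcg *root)
{
	assert(dead != NULL && root != NULL && dead != root);

	/* Swap records are deleted; swap-in would charge the faulting task. */
	dead->refs[TOY_REF_SWAP] = 0;

	/* Page cache pages are moved to the root group. */
	root->refs[TOY_REF_PAGE_CACHE] += dead->refs[TOY_REF_PAGE_CACHE];
	dead->refs[TOY_REF_PAGE_CACHE] = 0;

	/* TOY_REF_KMEM is deliberately left untouched. */
}
```

In this model, a dead group holding 3 page cache, 2 swap and 4 kmem
references still has toy_total_refs() == 4 after toy_unbind(), so
anything waiting for the count to reach zero would keep waiting.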

Come to think of it, I wonder how many users actually want to mount a
different subset of controllers after unmount, because we could
perfectly well allow mounting the same subset again, even if it
includes memcg. BTW, AFAIU the unified hierarchy won't have this
problem at all, because IIRC it mounts all controllers by definition.
So do we really need to fix this in such a complicated manner for a
setup that's going to be deprecated anyway?

Thanks,
Vladimir
