From: Michal Hocko <mhocko@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>, Tejun Heo <tj@kernel.org>,
linux-mm@kvack.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [patch v2] mm: memcontrol: do not iterate uninitialized memcgs
Date: Thu, 25 Sep 2014 13:43:39 +0200
Message-ID: <20140925114339.GD12090@dhcp22.suse.cz>
In-Reply-To: <20140925024054.GA4888@cmpxchg.org>
On Wed 24-09-14 22:40:55, Johannes Weiner wrote:
> Argh, buggy css_put() against the root. Hand grenades, everywhere.
> Update:
>
> ---
> From 9b0b4d72d71cd8acd7aaa58d2006c751decc8739 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Wed, 24 Sep 2014 22:00:20 -0400
> Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs
>
> The cgroup iterators yield css objects that have not yet gone through
> css_online(), but they are not complete memcgs at this point and so
> the memcg iterators should not return them. d8ad30559715 ("mm/memcg:
> iteration skip memcgs not yet fully initialized") set out to implement
> exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
> not meet the ordering requirements for memcg, and so we still may see
> partially initialized memcgs from the iterators.
I do not see how this would happen. CSS_ONLINE is set after the css_online
callback returns, and mem_cgroup_css_online ends the core initialization
with mutex_unlock, which should provide sufficient memory ordering
(kmem is not covered by that, but it is covered by activate_kmem_mutex,
and kmem.tcp by proto_list_mutex). So the worst thing that might happen
is that we miss an already initialized memcg, but that shouldn't matter
because such a memcg contains neither tasks nor memory. memcg_has_children
doesn't rely on our iterators, so the important parts will not miss anything.
So I do not see any bug right now. The flag abuse is another story, and I
do agree we should use proper memcg-specific synchronization here, as
explained by Tejun in the other email.
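To make the "we miss an already initialized memcg" case concrete, here is
a minimal userspace sketch of an iterator that skips groups that are not
online yet. It is only an illustration: struct toy_group, toy_tryget_online(),
toy_iter() and toy_walk_count() are hypothetical names, not kernel code, and
the refcount is a plain int because the sketch is single-threaded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; not the kernel's struct mem_cgroup. */
struct toy_group {
	const char *name;
	bool online;	/* models "css_online() has completed" */
	int refs;	/* models the css reference count */
};

/* Models css_tryget_online(): take a reference only if the group is online. */
static bool toy_tryget_online(struct toy_group *g)
{
	if (!g->online)
		return false;
	g->refs++;
	return true;
}

/*
 * Return the next online group at or after index *pos, or NULL at the end.
 * Groups that are not online yet are skipped, never returned.
 */
static struct toy_group *toy_iter(struct toy_group *groups, size_t n,
				  size_t *pos)
{
	while (*pos < n) {
		struct toy_group *g = &groups[(*pos)++];

		if (toy_tryget_online(g))
			return g;
	}
	return NULL;
}

/* Count how many groups one full walk visits, dropping each reference. */
static int toy_walk_count(struct toy_group *groups, size_t n)
{
	size_t pos = 0;
	int visited = 0;
	struct toy_group *g;

	while ((g = toy_iter(groups, n, &pos)) != NULL) {
		visited++;
		g->refs--;	/* models css_put() */
	}
	return visited;
}
```

With three groups of which the middle one is still offline, one walk visits
two groups; once the middle one is marked online, the next walk visits all
three. A skipped group is simply absent from that walk, which is harmless
as long as it holds neither tasks nor memory.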
> The cgroup core can not reasonably provide a clear answer on whether
> the object around the css has been fully initialized, as that depends
> on controller-specific locking and lifetime rules. Thus, introduce a
> memcg-specific flag that is set after the memcg has been initialized
> in css_online(), and read before mem_cgroup_iter() callers access the
> memcg members.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
With updated changelog
Acked-by: Michal Hocko <mhocko@suse.cz>
> Cc: <stable@vger.kernel.org> [3.12+]
This is not necessary IMO
> ---
> mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++-----
> 1 file changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 306b6470784c..bafdac0f724e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -292,6 +292,9 @@ struct mem_cgroup {
> /* vmpressure notifications */
> struct vmpressure vmpressure;
>
> + /* css_online() has been completed */
> + bool initialized;
> +
> /*
> * the counter to account for mem+swap usage.
> */
> @@ -1090,10 +1093,23 @@ skip_node:
> * skipping css reference should be safe.
> */
> if (next_css) {
> - if ((next_css == &root->css) ||
> - ((next_css->flags & CSS_ONLINE) &&
> - css_tryget_online(next_css)))
> - return mem_cgroup_from_css(next_css);
> + struct mem_cgroup *memcg = mem_cgroup_from_css(next_css);
> +
> + if (next_css == &root->css)
> + return memcg;
> +
> + if (css_tryget_online(next_css)) {
> + if (memcg->initialized) {
> + /*
> + * Make sure the caller's accesses to
> + * the memcg members are issued after
> + * we see this flag set.
> + */
> + smp_rmb();
> + return memcg;
> + }
> + css_put(next_css);
> + }
>
> prev_css = next_css;
> goto skip_node;
> @@ -5413,6 +5429,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
> {
> struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> struct mem_cgroup *parent = mem_cgroup_from_css(css->parent);
> + int ret;
>
> if (css->id > MEM_CGROUP_ID_MAX)
> return -ENOSPC;
> @@ -5449,7 +5466,16 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
> }
> mutex_unlock(&memcg_create_mutex);
>
> - return memcg_init_kmem(memcg, &memory_cgrp_subsys);
> + ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
> + if (ret)
> + return ret;
> +
> + /* Make sure the initialization is visible before the flag */
> + smp_wmb();
> +
> + memcg->initialized = true;
> +
> + return 0;
> }
>
> /*
> --
> 2.1.0
>
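For reference, the flag and barrier pairing in the quoted patch can be
modelled in userspace with C11 atomics. This is a sketch under assumptions,
not the kernel implementation: struct toy_memcg and its fields are made up,
and atomic_thread_fence() with release/acquire ordering stands in for
smp_wmb()/smp_rmb() (the C11 fences are somewhat stronger than the kernel
primitives, which only order stores against stores and loads against loads).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memcg; field names are illustrative only. */
struct toy_memcg {
	int limit;		/* plain field set up during "css_online" */
	int swappiness;		/* another plain field */
	atomic_bool initialized;
};

/* Put the object into its "allocated but not yet online" state. */
static void toy_prepare(struct toy_memcg *m)
{
	m->limit = 0;
	m->swappiness = 0;
	atomic_init(&m->initialized, false);
}

/* Writer side: models the tail of mem_cgroup_css_online() in the patch. */
static void toy_online(struct toy_memcg *m)
{
	m->limit = 4096;
	m->swappiness = 60;

	/* Stands in for smp_wmb(): initialization is visible before the flag. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&m->initialized, true, memory_order_relaxed);
}

/*
 * Reader side: models the check added to the iterator. Returns true and
 * fills *limit and *swappiness only if the object has been initialized;
 * otherwise the members are not touched at all.
 */
static bool toy_try_read(struct toy_memcg *m, int *limit, int *swappiness)
{
	if (!atomic_load_explicit(&m->initialized, memory_order_relaxed))
		return false;

	/* Stands in for smp_rmb(): member reads come after the flag read. */
	atomic_thread_fence(memory_order_acquire);
	*limit = m->limit;
	*swappiness = m->swappiness;
	return true;
}
```

A reader that calls toy_try_read() before toy_online() has run gets false
and never looks at the members; a reader that sees the flag set is
guaranteed to see the values stored before the release fence, even when
writer and reader run on different CPUs.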
--
Michal Hocko
SUSE Labs
Thread overview: 9+ messages
2014-09-25 2:31 [patch] mm: memcontrol: do not iterate uninitialized memcgs Johannes Weiner
2014-09-25 2:40 ` [patch v2] " Johannes Weiner
2014-09-25 11:43 ` Michal Hocko [this message]
2014-09-25 13:54 ` Johannes Weiner
2014-09-25 14:11 ` Michal Hocko
2014-09-25 2:57 ` [patch] " Tejun Heo
2014-09-25 13:43 ` Johannes Weiner
2014-09-25 14:23 ` Michal Hocko
2014-09-26 13:39 ` Peter Zijlstra