From: Johannes Weiner <hannes@cmpxchg.org>
To: Michal Hocko <mhocko@suse.cz>
Cc: linux-mm@kvack.org, Oleg Nesterov <oleg@redhat.com>,
Tejun Heo <tj@kernel.org>,
Vladimir Davydov <vdavydov@parallels.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
Greg Thelen <gthelen@google.com>
Subject: Re: [RFC 3/3] memcg: get rid of mm_struct::owner
Date: Tue, 26 May 2015 13:20:19 -0400 [thread overview]
Message-ID: <20150526172019.GA12926@cmpxchg.org> (raw)
In-Reply-To: <20150526151149.GJ14681@dhcp22.suse.cz>
On Tue, May 26, 2015 at 05:11:49PM +0200, Michal Hocko wrote:
> On Tue 26-05-15 10:10:11, Johannes Weiner wrote:
> > On Tue, May 26, 2015 at 01:50:06PM +0200, Michal Hocko wrote:
> > > @@ -104,7 +105,12 @@ static inline bool mm_match_cgroup(struct mm_struct *mm,
> > > bool match = false;
> > >
> > > rcu_read_lock();
> > > - task_memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
> > > + /*
> > > + * rcu_dereference would be better but mem_cgroup is not a complete
> > > + * type here
> > > + */
> > > + task_memcg = READ_ONCE(mm->memcg);
> > > + smp_read_barrier_depends();
> > > if (task_memcg)
> > > match = mem_cgroup_is_descendant(task_memcg, memcg);
> > > rcu_read_unlock();
> >
> > This function has only one user in rmap. If you inline it there, you
> > can use rcu_dereference() and get rid of the specialness & comment.
>
> I am not sure I understand. struct mem_cgroup is defined in
> mm/memcontrol.c so mm/rmap.c will not see it. Or do you suggest pulling
> struct mem_cgroup out into a header with all the dependencies?
Yes, I think that would be preferable. It's weird that we have such
a major data structure that is used all over the mm-code but only in
the shape of pointers to an incomplete type. It forces a bad style of
code that uses uninlinable callbacks and accessors for even the most
basic things. There are a few functions in memcontrol.c that could
instead be static inlines or should even be implemented as part of the
code that is using them, such as mem_cgroup_get_lru_size(),
mem_cgroup_is_descendant(), mem_cgroup_inactive_anon_is_low(),
mem_cgroup_lruvec_online(), mem_cgroup_swappiness(),
mem_cgroup_select_victim_node(), mem_cgroup_update_page_stat(), and
mem_cgroup_events(). Your new functions fall into the same category.
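To illustrate the point, here is a minimal userspace sketch (not kernel code) of what becomes possible once the full type is visible in a header. The struct layouts below are simplified, hypothetical stand-ins, and the function bodies are models of the idea rather than the real implementations; in the kernel the pointer load would be an rcu_dereference() under rcu_read_lock().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel definitions. */
struct mem_cgroup {
	struct mem_cgroup *parent;	/* NULL for a root group */
	int swappiness;
};

struct mm_struct {
	struct mem_cgroup *memcg;
};

/* With the complete type visible, a trivial accessor can be inlined. */
static inline int mem_cgroup_swappiness(const struct mem_cgroup *memcg)
{
	return memcg->swappiness;
}

/* Walk up the hierarchy: true if memcg is root or sits below it. */
static inline bool mem_cgroup_is_descendant(const struct mem_cgroup *memcg,
					    const struct mem_cgroup *root)
{
	for (; memcg; memcg = memcg->parent)
		if (memcg == root)
			return true;
	return false;
}

/* The caller can dereference mm->memcg directly, no out-of-line helper. */
static inline bool mm_match_cgroup(const struct mm_struct *mm,
				   const struct mem_cgroup *memcg)
{
	/* Kernel code would use rcu_dereference() here (assumption). */
	const struct mem_cgroup *task_memcg = mm->memcg;

	return task_memcg && mem_cgroup_is_descendant(task_memcg, memcg);
}
```

With only a forward declaration of struct mem_cgroup, none of these bodies could live in a header, which is why each of them ends up as an out-of-line call into memcontrol.c.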
> @@ -486,29 +486,13 @@ void mm_set_memcg(struct mm_struct *mm, struct mem_cgroup *memcg)
> void mm_drop_memcg(struct mm_struct *mm)
> {
> /*
> - * This is the last reference to mm so nobody can see
> - * this memcg
> + * We could reset mm->memcg, but the mm goes away as this is the
> + * last reference.
> */
> if (mm->memcg)
> css_put(&mm->memcg->css);
> }
This function is supposed to be an API call to disassociate a mm from
its memcg, but it actually doesn't do that and will leave a dangling
pointer based on assumptions it makes about how and when the caller
invokes it. That's bad. It's a subtle optimization with dependencies
spread across two moving parts. The result is very fragile code which
will break things in non-obvious ways when the caller changes later on.
And what's left standing is silly too: a memcg-specific API to call
css_put(), even though struct cgroup_subsys_state and css_put() are
public API already.
Both these things are a negative side effect of struct mem_cgroup
being semi-private. Memcg pointers are everywhere, yet we need a
public interface indirection for every simple dereference.
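As a rough illustration of the alternative, a drop helper that really disassociates the mm could look like the userspace sketch below. The types and the css_put() model are simplified, hypothetical stand-ins (the kernel's css_put() operates on a percpu refcount, not a plain int), and the sketch ignores concurrency entirely.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only. */
struct cgroup_subsys_state {
	int refcnt;
};

struct mem_cgroup {
	struct cgroup_subsys_state css;
};

struct mm_struct {
	struct mem_cgroup *memcg;
};

/* Model of dropping one reference; the real css_put() does more. */
static void css_put(struct cgroup_subsys_state *css)
{
	css->refcnt--;
}

/* Disassociate mm from its memcg without relying on caller timing. */
static void mm_drop_memcg(struct mm_struct *mm)
{
	struct mem_cgroup *memcg = mm->memcg;

	mm->memcg = NULL;	/* no dangling pointer left behind */
	if (memcg)
		css_put(&memcg->css);
}
```

Written this way, the function stays correct if a caller later invokes it before the final mm reference goes away, and calling it twice is harmless.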
> @@ -5252,10 +5236,15 @@ static void mem_cgroup_move_task(struct cgroup_subsys_state *css,
>
> if (mm) {
> /*
> - * Commit to a new memcg. mc.to points to the destination
> - * memcg even when the current charges are not moved.
> + * Commit to the target memcg even when we do not move
> + * charges.
> */
> - mm_move_memcg(mm, mc.to);
> + struct mem_cgroup *old_memcg = READ_ONCE(mm->memcg);
> + struct mem_cgroup *new_memcg = mem_cgroup_from_css(css);
> +
> + mm_set_memcg(mm, new_memcg);
> + if (old_memcg)
> + css_put(&old_memcg->css);
"Commit" is a problematic choice of words because of its existing
meaning in memcg of associating a page with a pre-reserved charge.
I'm not sure a comment is actually necessary here. Reassigning
mm->memcg when moving a process is pretty straightforward IMO.
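For reference, the reassignment boils down to the pattern sketched below in plain userspace C. mm_reassign_memcg() is a hypothetical name, the types are simplified stand-ins, and the reference counting is modelled with a plain int; the sketch assumes the set step takes its own reference on the new memcg, and it does not attempt to model the READ_ONCE() or any locking from the patch.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only. */
struct cgroup_subsys_state {
	int refcnt;
};

struct mem_cgroup {
	struct cgroup_subsys_state css;
};

struct mm_struct {
	struct mem_cgroup *memcg;
};

static void css_get(struct cgroup_subsys_state *css)
{
	css->refcnt++;
}

static void css_put(struct cgroup_subsys_state *css)
{
	css->refcnt--;
}

/* Hypothetical helper: point mm at new_memcg and release the old one. */
static void mm_reassign_memcg(struct mm_struct *mm,
			      struct mem_cgroup *new_memcg)
{
	struct mem_cgroup *old_memcg = mm->memcg;

	if (new_memcg)
		css_get(&new_memcg->css);	/* take the new reference first */
	mm->memcg = new_memcg;
	if (old_memcg)
		css_put(&old_memcg->css);	/* then drop the old one */
}
```

Taking the new reference before dropping the old one keeps the count from touching zero when both pointers refer to the same memcg.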