From: Lord Glauber Costa of Sealand <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org
Subject: Re: [PATCH v4 2/6] memcg: split part of memcg creation to css_online
Date: Mon, 28 Jan 2013 12:35:20 +0400
Message-ID: <51063848.6070004@parallels.com>
In-Reply-To: <20130125155249.402c40dd.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>

On 01/26/2013 03:52 AM, Andrew Morton wrote:
> On Tue, 22 Jan 2013 17:47:37 +0400
> Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:
> 
>> This patch is a preparatory work for later locking rework to get rid of
>> big cgroup lock from memory controller code.
> 
> Is this complete?  From my reading, the patch is also a bugfix.  It
> prevents stale tunable values from getting installed into new children?
> 
No, it is not a bug fix. All of this used to be protected by the cgroup
lock under the hood: we don't take it in memcg code, but cgroup core
holds it for us.

Yes, this is ugly. But it is one of the very problems this patchset is
trying to get rid of =p

>> The memory controller uses some tunables to adjust its operation. Those
>> tunables are inherited from parent to children upon children
>> initialization. For most of them, the value cannot be changed after the
>> parent has a new child.
>>
>> cgroup core splits initialization in two phases: css_alloc and css_online.
>> After css_alloc, the memory allocation and basic initialization are
>> done. But the new group is not yet visible anywhere, not even for cgroup
>> core code. It is only somewhere between css_alloc and css_online that it
>> is inserted into the internal children lists. Copying tunable values in
>> css_alloc will lead to inconsistent values: the children will copy the
>> old parent values, which can change between the copy and the moment in
>> which the group is linked to any data structure that can indicate the
>> presence of children.
> 
> That describes the problem, but not the fix.  Don't we need something
> like "therefore move the propagation of tunables into the css_online
> handler".
> 
> What remains unclear is how we prevent races during the operation of
> the css_online handler.  Suppose mem_cgroup_css_online() is
> mid-execution and userspace comes in and starts modifying the parent's
> tunables?
> 

At this point, the very same old cgroup_lock() prevents it, since it is
still present. In a later patch, we will need the memcg mutex around the
assignments.

IOW, the picture looks a bit like:

css_alloc() -> cgroup_internal_datastructure_update -> css_online()

This is all protected by the cgroup_lock(). So at this point, wherever
we do those assignments, we're safe. When we move to local locking, the
situation changes. Assigning in css_alloc would leave an unlocked window
in which the assignment has been made but the cgroup does not yet show
up in the internal data structures. During that window the tests for
the presence of children will not see the new group, and the parent's
tunable values will be allowed to change.
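
To make the window concrete, here is a minimal single-threaded sketch
that replays the interleaving by hand. It is not memcg code: struct
group, group_alloc(), group_link(), group_online(), group_set_tunable()
and replay() are hypothetical names made up for this illustration, a
plain counter stands in for the internal children lists, and locking is
left out on purpose so that the ordering problem is the only thing on
display.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a memcg; not the kernel structure. */
struct group {
	struct group *parent;
	int nr_children;	/* stands in for the internal children lists */
	int tunable;		/* some setting inherited from the parent */
};

/* The test guarding a tunable write: refuse once children are visible. */
static int group_set_tunable(struct group *g, int val)
{
	if (g->nr_children > 0)
		return -EBUSY;
	g->tunable = val;
	return 0;
}

/* Phase 1: allocation and basic init; the child is not yet visible. */
static void group_alloc(struct group *child, struct group *parent,
			bool copy_here)
{
	child->parent = parent;
	child->nr_children = 0;
	child->tunable = copy_here ? parent->tunable : 0;
}

/* Done by the core between the two phases: the child becomes visible. */
static void group_link(struct group *child)
{
	child->parent->nr_children++;
}

/* Phase 2: the child is already linked, so the parent value is pinned. */
static void group_online(struct group *child, bool copy_here)
{
	if (copy_here)
		child->tunable = child->parent->tunable;
}

/*
 * Replay one creation with a parent write landing in the window between
 * alloc and link. Returns true if the child ends up matching the parent.
 */
static bool replay(bool copy_in_alloc)
{
	struct group parent = { .parent = NULL, .nr_children = 0, .tunable = 1 };
	struct group child;
	int ret;

	group_alloc(&child, &parent, copy_in_alloc);
	ret = group_set_tunable(&parent, 2);	/* child not linked: allowed */
	assert(ret == 0);
	group_link(&child);
	group_online(&child, !copy_in_alloc);
	ret = group_set_tunable(&parent, 3);	/* child linked: refused */
	assert(ret == -EBUSY);
	return child.tunable == parent.tunable;
}
```

With the copy done in the alloc step, replay(true) leaves the child
holding the stale value 1 while the parent has moved to 2. With the copy
done in the online step, replay(false) ends with both at 2, because by
then the child is linked and further writes to the parent are refused.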

