From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Glauber Costa <glommer@parallels.com>
Cc: Tejun Heo <tj@kernel.org>,
netdev@vger.kernel.org, cgroups@vger.kernel.org,
Li Zefan <lizefan@huawei.com>, David Miller <davem@davemloft.net>,
devel@openvz.org
Subject: Re: [PATCH v2 5/5] decrement static keys on real destroy time
Date: Wed, 25 Apr 2012 09:22:37 +0900 [thread overview]
Message-ID: <4F9743CD.9030209@jp.fujitsu.com> (raw)
In-Reply-To: <4F969176.8010804@parallels.com>
(2012/04/24 20:41), Glauber Costa wrote:
> On 04/23/2012 11:40 PM, KAMEZAWA Hiroyuki wrote:
>> (2012/04/24 4:37), Glauber Costa wrote:
>>
>>> We call the destroy function when a cgroup starts to be removed,
>>> such as by a rmdir event.
>>>
>>> However, because of our reference counters, some objects are still
>>> inflight. Right now, we are decrementing the static_keys at destroy()
>>> time, meaning that if we get rid of the last static_key reference,
>>> some objects will still have charges, but the code to properly
>>> uncharge them won't be run.
>>>
>>> This becomes a problem specially if it is ever enabled again, because
>>> now new charges will be added to the staled charges making keeping
>>> it pretty much impossible.
>>>
>>> We just need to be careful with the static branch activation:
>>> since there is no particular preferred order of their activation,
>>> we need to make sure that we only start using it after all
>>> call sites are active. This is achieved by having a per-memcg
>>> flag that is only updated after static_key_slow_inc() returns.
>>> At this time, we are sure all sites are active.
>>>
>>> This is made per-memcg, not global, for a reason:
>>> it also has the effect of making socket accounting more
>>> consistent. The first memcg to be limited will trigger static_key()
>>> activation, therefore, accounting. But all the others will then be
>>> accounted no matter what. After this patch, only limited memcgs
>>> will have its sockets accounted.
>>>
>>> [v2: changed a tcp limited flag for a generic proto limited flag ]
>>> [v3: update the current active flag only after the static_key update ]
>>>
>>> Signed-off-by: Glauber Costa<glommer@parallels.com>
>>
>>
>> Acked-by: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>>
>> A small request below.
>>
>> <snip>
>>
>>
>>> + * ->activated needs to be written after the static_key update.
>>> + * This is what guarantees that the socket activation function
>>> + * is the last one to run. See sock_update_memcg() for details,
>>> + * and note that we don't mark any socket as belonging to this
>>> + * memcg until that flag is up.
>>> + *
>>> + * We need to do this, because static_keys will span multiple
>>> + * sites, but we can't control their order. If we mark a socket
>>> + * as accounted, but the accounting functions are not patched in
>>> + * yet, we'll lose accounting.
>>> + *
>>> + * We never race with the readers in sock_update_memcg(), because
>>> + * when this value change, the code to process it is not patched in
>>> + * yet.
>>> + */
>>> + mutex_lock(&tcp_set_limit_mutex);
>>
>>
>> Could you explain for what this mutex is in above comment ?
>>
> This is explained at the site where the mutex is defined.
> If you still want me to mention it here, or maybe expand the explanation
> there, I surely can.
>
Ah, I think it's better to mention one more complicated race.
Let me explain.
Assume we don't have tcp_set_limit_mutex. And jump_label is not activated yet
i.e. memcg_socket_limit_enabled->count == 0.
When a user updates limit of 2 cgroups at once, following happens.
CPU A CPU B
if (cg_proto->activated) if (cg->proto_activated)
static_key_inc() static_key_inc()
=> set counter 0->1 => set counter 1->2, return immediately.
=> hold mutex => cg_proto->activated = true.
=> overwrite jmps.
Then, without mutex, activated/active may be set 'true' before the end
of jump_label modification.
Thanks,
-Kame
prev parent reply other threads:[~2012-04-25 0:24 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-23 19:37 [PATCH v2 0/5] Fix problem with static_key decrement Glauber Costa
[not found] ` <1335209867-1831-1-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-23 19:37 ` [PATCH v2 1/5] don't attach a task to a dead cgroup Glauber Costa
[not found] ` <1335209867-1831-2-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-24 2:20 ` KAMEZAWA Hiroyuki
2012-04-23 19:37 ` [PATCH v2 2/5] blkcg: protect blkcg->policy_list Glauber Costa
2012-04-23 19:37 ` [PATCH v2 3/5] change number_of_cpusets to an atomic Glauber Costa
[not found] ` <1335209867-1831-4-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-24 2:25 ` KAMEZAWA Hiroyuki
2012-04-24 15:02 ` Christoph Lameter
[not found] ` <alpine.DEB.2.00.1204241001050.26005-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
2012-04-24 16:15 ` Glauber Costa
[not found] ` <4F96D1A8.6040604-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-24 16:24 ` Christoph Lameter
[not found] ` <alpine.DEB.2.00.1204241120130.26005-sBS69tsa9Uj/9pzu0YdTqQ@public.gmane.org>
2012-04-24 16:30 ` Glauber Costa
[not found] ` <4F96D50D.4020804-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-24 18:27 ` Christoph Lameter
2012-04-23 19:37 ` [PATCH v2 4/5] don't take cgroup_mutex in destroy() Glauber Costa
[not found] ` <1335209867-1831-5-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-24 2:31 ` KAMEZAWA Hiroyuki
[not found] ` <4F96109A.8000907-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2012-04-24 11:42 ` Glauber Costa
[not found] ` <4F9691A8.1070106-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-25 8:01 ` Li Zefan
2012-04-23 19:37 ` [PATCH v2 5/5] decrement static keys on real destroy time Glauber Costa
[not found] ` <1335209867-1831-6-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-24 2:40 ` KAMEZAWA Hiroyuki
[not found] ` <4F9612B9.7050705-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2012-04-24 11:41 ` Glauber Costa
2012-04-25 0:22 ` KAMEZAWA Hiroyuki [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F9743CD.9030209@jp.fujitsu.com \
--to=kamezawa.hiroyu@jp.fujitsu.com \
--cc=cgroups@vger.kernel.org \
--cc=davem@davemloft.net \
--cc=devel@openvz.org \
--cc=glommer@parallels.com \
--cc=lizefan@huawei.com \
--cc=netdev@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).