* Is not locking task_lock in cgroup_fork() safe?
@ 2012-10-08 2:00 Tejun Heo
2012-10-08 2:01 ` Tejun Heo
` (3 more replies)
0 siblings, 4 replies; 28+ messages in thread
From: Tejun Heo @ 2012-10-08 2:00 UTC (permalink / raw)
To: Frederic Weisbecker, Li Zefan,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
Hello, Frederic.
7e381b0eb1 ("cgroup: Drop task_lock(parent) on cgroup_fork()") removed
task_lock from cgroup_fork citing that current->cgroups can't change
due to threadgroup_change locking; however, threadgroup_change locking
is used only during CLONE_THREAD forking. If @current is forking a
new process, there's nothing preventing someone else to migrate the
parent while forking is in progress and delete the css_set it
currently is using. Am I confused somewhere?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
2012-10-08 2:00 Is not locking task_lock in cgroup_fork() safe? Tejun Heo
@ 2012-10-08 2:01 ` Tejun Heo
2012-10-08 5:46 ` Li Zefan
2012-10-08 12:58 ` Frederic Weisbecker
2012-10-08 12:48 ` Frederic Weisbecker
` (2 subsequent siblings)
3 siblings, 2 replies; 28+ messages in thread
From: Tejun Heo @ 2012-10-08 2:01 UTC (permalink / raw)
To: Frederic Weisbecker, Li Zefan,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
On Mon, Oct 08, 2012 at 11:00:00AM +0900, Tejun Heo wrote:
> 7e381b0eb1 ("cgroup: Drop task_lock(parent) on cgroup_fork()") removed
> task_lock from cgroup_fork citing that current->cgroups can't change
> due to threadgroup_change locking; however, threadgroup_change locking
> is used only during CLONE_THREAD forking. If @current is forking a
> new process, there's nothing preventing someone else to migrate the
> parent while forking is in progress and delete the css_set it
> currently is using. Am I confused somewhere?
Also, please note that task_lock is likely to be hot on local CPU at
that point and avoiding it there might not really buy much.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
2012-10-08 2:01 ` Tejun Heo
@ 2012-10-08 5:46 ` Li Zefan
[not found] ` <507268AA.8050509-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2012-10-08 12:58 ` Frederic Weisbecker
1 sibling, 1 reply; 28+ messages in thread
From: Li Zefan @ 2012-10-08 5:46 UTC (permalink / raw)
To: Tejun Heo
Cc: Frederic Weisbecker,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
On 2012/10/8 10:01, Tejun Heo wrote:
> On Mon, Oct 08, 2012 at 11:00:00AM +0900, Tejun Heo wrote:
>> 7e381b0eb1 ("cgroup: Drop task_lock(parent) on cgroup_fork()") removed
>> task_lock from cgroup_fork citing that current->cgroups can't change
>> due to threadgroup_change locking; however, threadgroup_change locking
>> is used only during CLONE_THREAD forking. If @current is forking a
>> new process, there's nothing preventing someone else to migrate the
>> parent while forking is in progress and delete the css_set it
>> currently is using. Am I confused somewhere?
You're right. threadgroup lock is held unconditionally in attach_task_py_pid(),
but it's held only for CLONE_THREAD in fork path, which I guess I overlooked
when reviewing the patch.
>
> Also, please note that task_lock is likely to be hot on local CPU at
> that point and avoiding it there might not really buy much.
>
Reverting that commit should be fine.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <507268AA.8050509-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2012-10-08 6:57 ` Tejun Heo
2012-10-16 19:34 ` Tejun Heo
0 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-08 6:57 UTC (permalink / raw)
To: Li Zefan
Cc: Frederic Weisbecker,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
Hello, Li, Frederic.
On Mon, Oct 08, 2012 at 01:46:18PM +0800, Li Zefan wrote:
> You're right. threadgroup lock is held unconditionally in attach_task_py_pid(),
> but it's held only for CLONE_THREAD in fork path, which I guess I overlooked
> when reviewing the patch.
>
> > Also, please note that task_lock is likely to be hot on local CPU at
> > that point and avoiding it there might not really buy much.
>
> Reverting that commit should be fine.
There are other commits which perform similar optimization
7e3aa30ac8 ("cgroup: Remove task_lock() from cgroup_post_fork()")
c84cdf75cc ("cgroup: Remove unnecessary task_lock before fetching css_set on migration")
Are they wrong too?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
2012-10-08 2:00 Is not locking task_lock in cgroup_fork() safe? Tejun Heo
2012-10-08 2:01 ` Tejun Heo
@ 2012-10-08 12:48 ` Frederic Weisbecker
[not found] ` <CAFTL4hzXWtzp7megsCAEuak5=_2SWmp9age-+wrpyQAU4BRZ0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-19 0:59 ` [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()" Tejun Heo
2012-10-19 0:59 ` [PATCH cgroup/for-3.7-fixes 2/2] Revert "cgroup: Drop task_lock(parent) on cgroup_fork()" Tejun Heo
3 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-08 12:48 UTC (permalink / raw)
To: Tejun Heo
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
2012/10/8 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> Hello, Frederic.
>
> 7e381b0eb1 ("cgroup: Drop task_lock(parent) on cgroup_fork()") removed
> task_lock from cgroup_fork citing that current->cgroups can't change
> due to threadgroup_change locking; however, threadgroup_change locking
> is used only during CLONE_THREAD forking. If @current is forking a
> new process, there's nothing preventing someone else to migrate the
> parent while forking is in progress and delete the css_set it
> currently is using. Am I confused somewhere?
Yeah I missed this one.
Now the whole cgroup_attach_task() is clusteracy without the
threadgroup lock anyway:
* The PF_EXITING check is racy (we are neither holding tsk->flags nor
threagroup lock).
* The cgrp == oldcgrp is racy (exit() can change the oldcgrp anytime.
* can_attach / attach / cancel_attach can race against fork/exit (and
post_fork if you consider those interested in cgroup task link like
the freezer. But that is racy in any case already even with
threadgroup lock)
It has been designed to be called under that lock. So I suspect the
best, at least for now, is to threadgroup lock from
cgroup_attach_task_all(). And also make cgroup_attach_task() static to
avoid future unsafe callers.
There is no harm yet because the only user of it calls that with
current as the "task" parameter, in a place that is
not in the middle of a fork. So no need to worry about some stable backport.
Also, looking at cgroup_attach_task_all(), what guarantee do we have
that "from" is not concurrently exiting and removing its cgrp. Which
is a separate problem. But we probably need to do some css_set_get()
before playing with it.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
2012-10-08 2:01 ` Tejun Heo
2012-10-08 5:46 ` Li Zefan
@ 2012-10-08 12:58 ` Frederic Weisbecker
1 sibling, 0 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-08 12:58 UTC (permalink / raw)
To: Tejun Heo
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
2012/10/8 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> On Mon, Oct 08, 2012 at 11:00:00AM +0900, Tejun Heo wrote:
>> 7e381b0eb1 ("cgroup: Drop task_lock(parent) on cgroup_fork()") removed
>> task_lock from cgroup_fork citing that current->cgroups can't change
>> due to threadgroup_change locking; however, threadgroup_change locking
>> is used only during CLONE_THREAD forking. If @current is forking a
>> new process, there's nothing preventing someone else to migrate the
>> parent while forking is in progress and delete the css_set it
>> currently is using. Am I confused somewhere?
>
> Also, please note that task_lock is likely to be hot on local CPU at
> that point and avoiding it there might not really buy much.
Locks are expensive anyway. And I think cgroup has enough of them in
that hot path that fork is (threadgroup_lock, css_set_lock, ...)
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <CAFTL4hzXWtzp7megsCAEuak5=_2SWmp9age-+wrpyQAU4BRZ0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-10-16 19:33 ` Tejun Heo
[not found] ` <20121016193341.GD16166-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-16 19:33 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Hey, Frederic.
On Mon, Oct 08, 2012 at 02:48:58PM +0200, Frederic Weisbecker wrote:
> Yeah I missed this one.
> Now the whole cgroup_attach_task() is clusteracy without the
Clusteracy?
> threadgroup lock anyway:
>
> * The PF_EXITING check is racy (we are neither holding tsk->flags nor
> threagroup lock).
PF_EXITING is *always* protected by threadgroup_change_begin/end().
> * The cgrp == oldcgrp is racy (exit() can change the oldcgrp anytime.
So, as long as this happens after PF_EXITING check, it should be safe.
> * can_attach / attach / cancel_attach can race against fork/exit (and
> post_fork if you consider those interested in cgroup task link like
> the freezer. But that is racy in any case already even with
> threadgroup lock)
Against exit, no. Against forking a new process, can they? If so, we
need to fix it.
> It has been designed to be called under that lock. So I suspect the
Ummm.... threadgroup_lock is a recent addition so things couldn't have
been designed to be called under that lock. threadgroup_lock protects
the *threadgroup* - creating a new task in the same process or a task
of the process exiting. It doesn't do anything about other processes.
In fact, the lock itself is per-process.
> best, at least for now, is to threadgroup lock from
> cgroup_attach_task_all(). And also make cgroup_attach_task() static to
> avoid future unsafe callers.
Oh, from that call path, sure. Can someone teach me why we need that
one at all? I think we're confusing each other here. I was talking
about the usual migration path not protected against forking a new
process.
> There is no harm yet because the only user of it calls that with
> current as the "task" parameter, in a place that is
> not in the middle of a fork. So no need to worry about some stable backport.
>
> Also, looking at cgroup_attach_task_all(), what guarantee do we have
> that "from" is not concurrently exiting and removing its cgrp. Which
> is a separate problem. But we probably need to do some css_set_get()
> before playing with it.
I really don't know. Why isn't it locking the threadgroup to begin
with?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
2012-10-08 6:57 ` Tejun Heo
@ 2012-10-16 19:34 ` Tejun Heo
[not found] ` <20121016193428.GE16166-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-16 19:34 UTC (permalink / raw)
To: Li Zefan
Cc: Frederic Weisbecker,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
On Mon, Oct 08, 2012 at 03:57:52PM +0900, Tejun Heo wrote:
> Hello, Li, Frederic.
>
> On Mon, Oct 08, 2012 at 01:46:18PM +0800, Li Zefan wrote:
> > You're right. threadgroup lock is held unconditionally in attach_task_py_pid(),
> > but it's held only for CLONE_THREAD in fork path, which I guess I overlooked
> > when reviewing the patch.
> >
> > > Also, please note that task_lock is likely to be hot on local CPU at
> > > that point and avoiding it there might not really buy much.
> >
> > Reverting that commit should be fine.
>
> There are other commits which perform similar optimization
>
> 7e3aa30ac8 ("cgroup: Remove task_lock() from cgroup_post_fork()")
> c84cdf75cc ("cgroup: Remove unnecessary task_lock before fetching css_set on migration")
>
> Are they wrong too?
Frederic, Li, Ping?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <20121016193428.GE16166-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-17 7:26 ` Li Zefan
0 siblings, 0 replies; 28+ messages in thread
From: Li Zefan @ 2012-10-17 7:26 UTC (permalink / raw)
To: Tejun Heo
Cc: Frederic Weisbecker,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
于 2012/10/17 3:34, Tejun Heo 写道:
> On Mon, Oct 08, 2012 at 03:57:52PM +0900, Tejun Heo wrote:
>> Hello, Li, Frederic.
>>
>> On Mon, Oct 08, 2012 at 01:46:18PM +0800, Li Zefan wrote:
>>> You're right. threadgroup lock is held unconditionally in attach_task_py_pid(),
>>> but it's held only for CLONE_THREAD in fork path, which I guess I overlooked
>>> when reviewing the patch.
>>>
>>>> Also, please note that task_lock is likely to be hot on local CPU at
>>>> that point and avoiding it there might not really buy much.
>>>
>>> Reverting that commit should be fine.
>>
>> There are other commits which perform similar optimization
>>
>> 7e3aa30ac8 ("cgroup: Remove task_lock() from cgroup_post_fork()")
>> c84cdf75cc ("cgroup: Remove unnecessary task_lock before fetching css_set on migration")
>>
>> Are they wrong too?
>
> Frederic, Li, Ping?
>
Yes, they're wrong. I sugguest we revert those commits, and then fix cgroup_attach_task_all()
and others if still any.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <20121016193341.GD16166-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-18 14:50 ` Frederic Weisbecker
[not found] ` <CAFTL4hzo_w7HTgC9ApTk113X8WdZSpV+D+VSEe=604YEJFmKsg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-18 14:50 UTC (permalink / raw)
To: Tejun Heo
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA, LKML
2012/10/16 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> Hey, Frederic.
>
> On Mon, Oct 08, 2012 at 02:48:58PM +0200, Frederic Weisbecker wrote:
>> Yeah I missed this one.
>> Now the whole cgroup_attach_task() is clusteracy without the
>
> Clusteracy?
>
>> threadgroup lock anyway:
>>
>> * The PF_EXITING check is racy (we are neither holding tsk->flags nor
>> threagroup lock).
>
> PF_EXITING is *always* protected by threadgroup_change_begin/end().
>
>> * The cgrp == oldcgrp is racy (exit() can change the oldcgrp anytime.
>
> So, as long as this happens after PF_EXITING check, it should be safe.
>
>> * can_attach / attach / cancel_attach can race against fork/exit (and
>> post_fork if you consider those interested in cgroup task link like
>> the freezer. But that is racy in any case already even with
>> threadgroup lock)
>
> Against exit, no. Against forking a new process, can they? If so, we
> need to fix it.
>
>> It has been designed to be called under that lock. So I suspect the
>
> Ummm.... threadgroup_lock is a recent addition so things couldn't have
> been designed to be called under that lock. threadgroup_lock protects
> the *threadgroup* - creating a new task in the same process or a task
> of the process exiting. It doesn't do anything about other processes.
> In fact, the lock itself is per-process.
>
>> best, at least for now, is to threadgroup lock from
>> cgroup_attach_task_all(). And also make cgroup_attach_task() static to
>> avoid future unsafe callers.
>
> Oh, from that call path, sure. Can someone teach me why we need that
> one at all? I think we're confusing each other here. I was talking
> about the usual migration path not protected against forking a new
> process.
Ah right I was confused. Hmm, indeed we have a race here on
cgroup_fork(). How about using css_try_get() in cgroup_fork() and
refetch the parent's css until we succeed? This requires rcu_read_lock
though, and freeing the css_set under RCU.
Don't know which is better.
Different problem but I really would like we sanitize the cgroup hooks
in fork. There is cgroup_fork(), cgroup_post_fork() which takes that
big css_set_lock, plus the big threadgroup lock... I hope we can
simplify the mess there.
>
>> There is no harm yet because the only user of it calls that with
>> current as the "task" parameter, in a place that is
>> not in the middle of a fork. So no need to worry about some stable backport.
>>
>> Also, looking at cgroup_attach_task_all(), what guarantee do we have
>> that "from" is not concurrently exiting and removing its cgrp. Which
>> is a separate problem. But we probably need to do some css_set_get()
>> before playing with it.
>
> I really don't know. Why isn't it locking the threadgroup to begin
> with?
No idea, sounds like something to fix.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <CAFTL4hzo_w7HTgC9ApTk113X8WdZSpV+D+VSEe=604YEJFmKsg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-10-18 20:07 ` Tejun Heo
[not found] ` <20121018200705.GG13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-18 20:07 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML
Hello, Frederic.
On Thu, Oct 18, 2012 at 04:50:59PM +0200, Frederic Weisbecker wrote:
> Ah right I was confused. Hmm, indeed we have a race here on
> cgroup_fork(). How about using css_try_get() in cgroup_fork() and
> refetch the parent's css until we succeed? This requires rcu_read_lock
> though, and freeing the css_set under RCU.
>
> Don't know which is better.
For now, I'll revert the patches and cc stable. Let's think about
improving it later.
> Different problem but I really would like we sanitize the cgroup hooks
> in fork. There is cgroup_fork(), cgroup_post_fork() which takes that
> big css_set_lock, plus the big threadgroup lock... I hope we can
> simplify the mess there.
Oh yeah, I've been looking at that one too. There are a few problems
in that area. I think all we need is clearing ->cgroups to NULL on
copy_process() and all the rest can be moved to cgroup_post_fork().
I'd also like to make it very explicit that migration can't happen
before post_fork is complete.
> > I really don't know. Why isn't it locking the threadgroup to begin
> > with?
>
> No idea, sounds like something to fix.
Alrighty.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <20121018200705.GG13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-18 20:53 ` Frederic Weisbecker
[not found] ` <CAFTL4hy7g4e11OUOyoihrEU8hiVgZoV1=141UtUpj9a72SNs_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-18 20:53 UTC (permalink / raw)
To: Tejun Heo
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML
2012/10/18 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> Hello, Frederic.
>
> On Thu, Oct 18, 2012 at 04:50:59PM +0200, Frederic Weisbecker wrote:
>> Ah right I was confused. Hmm, indeed we have a race here on
>> cgroup_fork(). How about using css_try_get() in cgroup_fork() and
>> refetch the parent's css until we succeed? This requires rcu_read_lock
>> though, and freeing the css_set under RCU.
>>
>> Don't know which is better.
>
> For now, I'll revert the patches and cc stable. Let's think about
> improving it later.
Ok for reverting in cgroup_fork(). Is it necessary for the
cgroup_post_fork() thing? I don't immediately see any race involved
there.
>> Different problem but I really would like we sanitize the cgroup hooks
>> in fork. There is cgroup_fork(), cgroup_post_fork() which takes that
>> big css_set_lock, plus the big threadgroup lock... I hope we can
>> simplify the mess there.
>
> Oh yeah, I've been looking at that one too. There are a few problems
> in that area. I think all we need is clearing ->cgroups to NULL on
> copy_process() and all the rest can be moved to cgroup_post_fork().
> I'd also like to make it very explicit that migration can't happen
> before post_fork is complete.
Sounds right.
>
>> > I really don't know. Why isn't it locking the threadgroup to begin
>> > with?
>>
>> No idea, sounds like something to fix.
>
> Alrighty.
Ok thanks.
> Thanks.
>
> --
> tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <CAFTL4hy7g4e11OUOyoihrEU8hiVgZoV1=141UtUpj9a72SNs_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-10-19 0:38 ` Tejun Heo
[not found] ` <20121019003835.GE13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-19 0:38 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA, LKML
Hello, Frederic.
On Thu, Oct 18, 2012 at 10:53:47PM +0200, Frederic Weisbecker wrote:
> > For now, I'll revert the patches and cc stable. Let's think about
> > improving it later.
>
> Ok for reverting in cgroup_fork(). Is it necessary for the
> cgroup_post_fork() thing? I don't immediately see any race involved
> there.
Even if there isn't an actual race, the comment is dead wrong. I'm
reverting the following three patches. Let's try again later.
7e381b0eb1 ("cgroup: Drop task_lock(parent) on cgroup_fork()")
7e3aa30ac8 ("cgroup: Remove task_lock() from cgroup_post_fork()")
c84cdf75cc ("cgroup: Remove unnecessary task_lock before fetching css_set on migration")
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <20121019003835.GE13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-19 0:58 ` Tejun Heo
[not found] ` <20121019005801.GF13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-19 0:58 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML
Hello, again.
On Thu, Oct 18, 2012 at 05:38:35PM -0700, Tejun Heo wrote:
> Even if there isn't an actual race, the comment is dead wrong. I'm
> reverting the following three patches. Let's try again later.
>
> 7e381b0eb1 ("cgroup: Drop task_lock(parent) on cgroup_fork()")
> 7e3aa30ac8 ("cgroup: Remove task_lock() from cgroup_post_fork()")
So, after some more looking, I think the following is correct and
doesn't need to be reverted. It's depending on threadgroup locking
from migration path to synchronize against exit path which is always
performed.
> c84cdf75cc ("cgroup: Remove unnecessary task_lock before fetching css_set on migration")
Frederic, were you trying to say that the above commit is correct?
Li, do you agree?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
2012-10-08 2:00 Is not locking task_lock in cgroup_fork() safe? Tejun Heo
2012-10-08 2:01 ` Tejun Heo
2012-10-08 12:48 ` Frederic Weisbecker
@ 2012-10-19 0:59 ` Tejun Heo
[not found] ` <20121019005922.GG13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-19 0:59 ` [PATCH cgroup/for-3.7-fixes 2/2] Revert "cgroup: Drop task_lock(parent) on cgroup_fork()" Tejun Heo
3 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-19 0:59 UTC (permalink / raw)
To: Frederic Weisbecker, Li Zefan,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
From d935a5d6832a264ce52f4257e176f4f96cbaf048 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Date: Thu, 18 Oct 2012 17:40:30 -0700
This reverts commit 7e3aa30ac8c904a706518b725c451bb486daaae9.
The commit incorrectly assumed that fork path always performed
threadgroup_change_begin/end() and depended on that for
synchronization against task exit and cgroup migration paths instead
of explicitly grabbing task_lock().
threadgroup_change is not locked when forking a new process (as
opposed to a new thread in the same process) and even if it were it
wouldn't be effective as different processes use different threadgroup
locks.
Revert the incorrect optimization.
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
LKML-Reference: <20121008020000.GB2575@localhost>
Cc: Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
kernel/cgroup.c | 15 +++------------
1 files changed, 3 insertions(+), 12 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index d1739fc..2990dc7 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4894,19 +4894,10 @@ void cgroup_post_fork(struct task_struct *child)
*/
if (use_task_css_set_links) {
write_lock(&css_set_lock);
- if (list_empty(&child->cg_list)) {
- /*
- * It's safe to use child->cgroups without task_lock()
- * here because we are protected through
- * threadgroup_change_begin() against concurrent
- * css_set change in cgroup_task_migrate(). Also
- * the task can't exit at that point until
- * wake_up_new_task() is called, so we are protected
- * against cgroup_exit() setting child->cgroup to
- * init_css_set.
- */
+ task_lock(child);
+ if (list_empty(&child->cg_list))
list_add(&child->cg_list, &child->cgroups->tasks);
- }
+ task_unlock(child);
write_unlock(&css_set_lock);
}
}
--
1.7.7.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH cgroup/for-3.7-fixes 2/2] Revert "cgroup: Drop task_lock(parent) on cgroup_fork()"
2012-10-08 2:00 Is not locking task_lock in cgroup_fork() safe? Tejun Heo
` (2 preceding siblings ...)
2012-10-19 0:59 ` [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()" Tejun Heo
@ 2012-10-19 0:59 ` Tejun Heo
[not found] ` <20121019005951.GH13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
3 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-19 0:59 UTC (permalink / raw)
To: Frederic Weisbecker, Li Zefan,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
From c8b27924a8b6fd74066088f1cf07c256bbc6ed74 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Date: Thu, 18 Oct 2012 17:52:07 -0700
This reverts commit 7e381b0eb1e1a9805c37335562e8dc02e7d7848c.
The commit incorrectly assumed that fork path always performed
threadgroup_change_begin/end() and depended on that for
synchronization against task exit and cgroup migration paths instead
of explicitly grabbing task_lock().
threadgroup_change is not locked when forking a new process (as
opposed to a new thread in the same process) and even if it were it
wouldn't be effective as different processes use different threadgroup
locks.
Revert the incorrect optimization.
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
LKML-Reference: <20121008020000.GB2575@localhost>
Cc: Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
kernel/cgroup.c | 23 ++++++-----------------
1 files changed, 6 insertions(+), 17 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2990dc7..f24f724 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4814,31 +4814,20 @@ static const struct file_operations proc_cgroupstats_operations = {
*
* A pointer to the shared css_set was automatically copied in
* fork.c by dup_task_struct(). However, we ignore that copy, since
- * it was not made under the protection of RCU, cgroup_mutex or
- * threadgroup_change_begin(), so it might no longer be a valid
- * cgroup pointer. cgroup_attach_task() might have already changed
- * current->cgroups, allowing the previously referenced cgroup
- * group to be removed and freed.
- *
- * Outside the pointer validity we also need to process the css_set
- * inheritance between threadgoup_change_begin() and
- * threadgoup_change_end(), this way there is no leak in any process
- * wide migration performed by cgroup_attach_proc() that could otherwise
- * miss a thread because it is too early or too late in the fork stage.
+ * it was not made under the protection of RCU or cgroup_mutex, so
+ * might no longer be a valid cgroup pointer. cgroup_attach_task() might
+ * have already changed current->cgroups, allowing the previously
+ * referenced cgroup group to be removed and freed.
*
* At the point that cgroup_fork() is called, 'current' is the parent
* task, and the passed argument 'child' points to the child task.
*/
void cgroup_fork(struct task_struct *child)
{
- /*
- * We don't need to task_lock() current because current->cgroups
- * can't be changed concurrently here. The parent obviously hasn't
- * exited and called cgroup_exit(), and we are synchronized against
- * cgroup migration through threadgroup_change_begin().
- */
+ task_lock(current);
child->cgroups = current->cgroups;
get_css_set(child->cgroups);
+ task_unlock(current);
INIT_LIST_HEAD(&child->cg_list);
}
--
1.7.7.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: Is not locking task_lock in cgroup_fork() safe?
[not found] ` <20121019005801.GF13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-19 8:50 ` Li Zefan
0 siblings, 0 replies; 28+ messages in thread
From: Li Zefan @ 2012-10-19 8:50 UTC (permalink / raw)
To: Tejun Heo
Cc: Frederic Weisbecker,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA, LKML
On 2012/10/19 8:58, Tejun Heo wrote:
> Hello, again.
>
> On Thu, Oct 18, 2012 at 05:38:35PM -0700, Tejun Heo wrote:
>> Even if there isn't an actual race, the comment is dead wrong. I'm
>> reverting the following three patches. Let's try again later.
>>
>> 7e381b0eb1 ("cgroup: Drop task_lock(parent) on cgroup_fork()")
>> 7e3aa30ac8 ("cgroup: Remove task_lock() from cgroup_post_fork()")
>
> So, after some more looking, I think the following is correct and
> doesn't need to be reverted. It's depending on threadgroup locking
> from migration path to synchronize against exit path which is always
> performed.
>
>> c84cdf75cc ("cgroup: Remove unnecessary task_lock before fetching css_set on migration")
>
> Frederic, were you trying to say that the above commit is correct?
> Li, do you agree?
>
This one does look innocent.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <20121019005922.GG13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-19 8:51 ` Li Zefan
2012-10-19 13:35 ` Frederic Weisbecker
1 sibling, 0 replies; 28+ messages in thread
From: Li Zefan @ 2012-10-19 8:51 UTC (permalink / raw)
To: Tejun Heo
Cc: Frederic Weisbecker,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
On 2012/10/19 8:59, Tejun Heo wrote:
>>From d935a5d6832a264ce52f4257e176f4f96cbaf048 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Date: Thu, 18 Oct 2012 17:40:30 -0700
>
> This reverts commit 7e3aa30ac8c904a706518b725c451bb486daaae9.
>
> The commit incorrectly assumed that fork path always performed
> threadgroup_change_begin/end() and depended on that for
> synchronization against task exit and cgroup migration paths instead
> of explicitly grabbing task_lock().
>
> threadgroup_change is not locked when forking a new process (as
> opposed to a new thread in the same process) and even if it were it
> wouldn't be effective as different processes use different threadgroup
> locks.
>
> Revert the incorrect optimization.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> LKML-Reference: <20121008020000.GB2575@localhost>
> Cc: Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Acked-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 2/2] Revert "cgroup: Drop task_lock(parent) on cgroup_fork()"
[not found] ` <20121019005951.GH13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-19 8:51 ` Li Zefan
2012-10-19 13:45 ` Frederic Weisbecker
1 sibling, 0 replies; 28+ messages in thread
From: Li Zefan @ 2012-10-19 8:51 UTC (permalink / raw)
To: Tejun Heo
Cc: Frederic Weisbecker,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
On 2012/10/19 8:59, Tejun Heo wrote:
>>From c8b27924a8b6fd74066088f1cf07c256bbc6ed74 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Date: Thu, 18 Oct 2012 17:52:07 -0700
>
> This reverts commit 7e381b0eb1e1a9805c37335562e8dc02e7d7848c.
>
> The commit incorrectly assumed that fork path always performed
> threadgroup_change_begin/end() and depended on that for
> synchronization against task exit and cgroup migration paths instead
> of explicitly grabbing task_lock().
>
> threadgroup_change is not locked when forking a new process (as
> opposed to a new thread in the same process) and even if it were it
> wouldn't be effective as different processes use different threadgroup
> locks.
>
> Revert the incorrect optimization.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> LKML-Reference: <20121008020000.GB2575@localhost>
> Cc: Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Acked-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <20121019005922.GG13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-19 8:51 ` Li Zefan
@ 2012-10-19 13:35 ` Frederic Weisbecker
[not found] ` <CAFTL4hz82==b3ioSMhbKzh0CN1ivR7RQMKKMFFWu5PHPjg=Bfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-19 13:35 UTC (permalink / raw)
To: Tejun Heo
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
2012/10/18 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> From d935a5d6832a264ce52f4257e176f4f96cbaf048 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Date: Thu, 18 Oct 2012 17:40:30 -0700
>
> This reverts commit 7e3aa30ac8c904a706518b725c451bb486daaae9.
>
> The commit incorrectly assumed that fork path always performed
> threadgroup_change_begin/end() and depended on that for
> synchronization against task exit and cgroup migration paths instead
> of explicitly grabbing task_lock().
>
> threadgroup_change is not locked when forking a new process (as
> opposed to a new thread in the same process) and even if it were it
> wouldn't be effective as different processes use different threadgroup
> locks.
>
> Revert the incorrect optimization.
Ok but there is still no good reason to task_lock() there. But the
comment is indeed wrong, how about fixing it instead? I can send you
a patch for that.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> LKML-Reference: <20121008020000.GB2575@localhost>
> Cc: Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> ---
> kernel/cgroup.c | 15 +++------------
> 1 files changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index d1739fc..2990dc7 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4894,19 +4894,10 @@ void cgroup_post_fork(struct task_struct *child)
> */
> if (use_task_css_set_links) {
> write_lock(&css_set_lock);
> - if (list_empty(&child->cg_list)) {
> - /*
> - * It's safe to use child->cgroups without task_lock()
> - * here because we are protected through
> - * threadgroup_change_begin() against concurrent
> - * css_set change in cgroup_task_migrate(). Also
> - * the task can't exit at that point until
> - * wake_up_new_task() is called, so we are protected
> - * against cgroup_exit() setting child->cgroup to
> - * init_css_set.
> - */
> + task_lock(child);
> + if (list_empty(&child->cg_list))
> list_add(&child->cg_list, &child->cgroups->tasks);
> - }
> + task_unlock(child);
> write_unlock(&css_set_lock);
> }
> }
> --
> 1.7.7.3
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 2/2] Revert "cgroup: Drop task_lock(parent) on cgroup_fork()"
[not found] ` <20121019005951.GH13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-19 8:51 ` Li Zefan
@ 2012-10-19 13:45 ` Frederic Weisbecker
1 sibling, 0 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-19 13:45 UTC (permalink / raw)
To: Tejun Heo
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
2012/10/18 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> From c8b27924a8b6fd74066088f1cf07c256bbc6ed74 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Date: Thu, 18 Oct 2012 17:52:07 -0700
>
> This reverts commit 7e381b0eb1e1a9805c37335562e8dc02e7d7848c.
>
> The commit incorrectly assumed that fork path always performed
> threadgroup_change_begin/end() and depended on that for
> synchronization against task exit and cgroup migration paths instead
> of explicitly grabbing task_lock().
>
> threadgroup_change is not locked when forking a new process (as
> opposed to a new thread in the same process) and even if it were it
> wouldn't be effective as different processes use different threadgroup
> locks.
>
> Revert the incorrect optimization.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> LKML-Reference: <20121008020000.GB2575@localhost>
(bitterly) Acked-by: Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> :-)
> Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> ---
> kernel/cgroup.c | 23 ++++++-----------------
> 1 files changed, 6 insertions(+), 17 deletions(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 2990dc7..f24f724 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4814,31 +4814,20 @@ static const struct file_operations proc_cgroupstats_operations = {
> *
> * A pointer to the shared css_set was automatically copied in
> * fork.c by dup_task_struct(). However, we ignore that copy, since
> - * it was not made under the protection of RCU, cgroup_mutex or
> - * threadgroup_change_begin(), so it might no longer be a valid
> - * cgroup pointer. cgroup_attach_task() might have already changed
> - * current->cgroups, allowing the previously referenced cgroup
> - * group to be removed and freed.
> - *
> - * Outside the pointer validity we also need to process the css_set
> - * inheritance between threadgoup_change_begin() and
> - * threadgoup_change_end(), this way there is no leak in any process
> - * wide migration performed by cgroup_attach_proc() that could otherwise
> - * miss a thread because it is too early or too late in the fork stage.
> + * it was not made under the protection of RCU or cgroup_mutex, so
> + * might no longer be a valid cgroup pointer. cgroup_attach_task() might
> + * have already changed current->cgroups, allowing the previously
> + * referenced cgroup group to be removed and freed.
> *
> * At the point that cgroup_fork() is called, 'current' is the parent
> * task, and the passed argument 'child' points to the child task.
> */
> void cgroup_fork(struct task_struct *child)
> {
> - /*
> - * We don't need to task_lock() current because current->cgroups
> - * can't be changed concurrently here. The parent obviously hasn't
> - * exited and called cgroup_exit(), and we are synchronized against
> - * cgroup migration through threadgroup_change_begin().
> - */
> + task_lock(current);
> child->cgroups = current->cgroups;
> get_css_set(child->cgroups);
> + task_unlock(current);
> INIT_LIST_HEAD(&child->cg_list);
> }
>
> --
> 1.7.7.3
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <CAFTL4hz82==b3ioSMhbKzh0CN1ivR7RQMKKMFFWu5PHPjg=Bfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-10-19 19:38 ` Tejun Heo
[not found] ` <20121019193808.GL13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-19 19:38 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA
On Fri, Oct 19, 2012 at 09:35:26AM -0400, Frederic Weisbecker wrote:
> 2012/10/18 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> > From d935a5d6832a264ce52f4257e176f4f96cbaf048 Mon Sep 17 00:00:00 2001
> > From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> > Date: Thu, 18 Oct 2012 17:40:30 -0700
> >
> > This reverts commit 7e3aa30ac8c904a706518b725c451bb486daaae9.
> >
> > The commit incorrectly assumed that fork path always performed
> > threadgroup_change_begin/end() and depended on that for
> > synchronization against task exit and cgroup migration paths instead
> > of explicitly grabbing task_lock().
> >
> > threadgroup_change is not locked when forking a new process (as
> > opposed to a new thread in the same process) and even if it were it
> > wouldn't be effective as different processes use different threadgroup
> > locks.
> >
> > Revert the incorrect optimization.
>
> Ok but there is still no good reason to task_lock() there. But the
> comment is indeed wrong, how about fixing it instead? I can send you
> a patch for that.
For -stable, I think it's better to revert. If you want to remove
task_lock, let's do it for 3.8.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <20121019193808.GL13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-19 19:44 ` Frederic Weisbecker
[not found] ` <CAFTL4hwQ6Ntn5GJwj=jiO2p3GdwhEMp0MyR8dgUj_Lx0U4kNqg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-19 19:44 UTC (permalink / raw)
To: Tejun Heo
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA, LKML
2012/10/19 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> On Fri, Oct 19, 2012 at 09:35:26AM -0400, Frederic Weisbecker wrote:
>> 2012/10/18 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
>> > From d935a5d6832a264ce52f4257e176f4f96cbaf048 Mon Sep 17 00:00:00 2001
>> > From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>> > Date: Thu, 18 Oct 2012 17:40:30 -0700
>> >
>> > This reverts commit 7e3aa30ac8c904a706518b725c451bb486daaae9.
>> >
>> > The commit incorrectly assumed that fork path always performed
>> > threadgroup_change_begin/end() and depended on that for
>> > synchronization against task exit and cgroup migration paths instead
>> > of explicitly grabbing task_lock().
>> >
>> > threadgroup_change is not locked when forking a new process (as
>> > opposed to a new thread in the same process) and even if it were it
>> > wouldn't be effective as different processes use different threadgroup
>> > locks.
>> >
>> > Revert the incorrect optimization.
>>
>> Ok but there is still no good reason to task_lock() there. But the
>> comment is indeed wrong, how about fixing it instead? I can send you
>> a patch for that.
>
> For -stable, I think it's better to revert. If you want to remove
> task_lock, let's do it for 3.8.
I don't think that a wrong comment justifies a patch to stable.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <CAFTL4hwQ6Ntn5GJwj=jiO2p3GdwhEMp0MyR8dgUj_Lx0U4kNqg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-10-19 21:07 ` Tejun Heo
[not found] ` <20121019210738.GA1180-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-19 21:07 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA, LKML
Hello, Frederic.
On Fri, Oct 19, 2012 at 03:44:20PM -0400, Frederic Weisbecker wrote:
> > For -stable, I think it's better to revert. If you want to remove
> > task_lock, let's do it for 3.8.
>
> I don't think that a wrong comment justifies a patch to stable.
I'm not really sure whether it's safe or not. It seems all usages are
protected by write locking css_set_lock but maybe I'm missing
something and as the commit is born out of confusion, I'm very
inclined to revert it by default. Are you sure this one is safe?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <20121019210738.GA1180-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-10-20 18:21 ` Frederic Weisbecker
[not found] ` <CAFTL4hy+vrvJKrc1Y2FW44k=LBi72H=34337xALpbtG_3u5O7w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-20 18:21 UTC (permalink / raw)
To: Tejun Heo
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML
2012/10/19 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> Hello, Frederic.
>
> On Fri, Oct 19, 2012 at 03:44:20PM -0400, Frederic Weisbecker wrote:
>> > For -stable, I think it's better to revert. If you want to remove
>> > task_lock, let's do it for 3.8.
>>
>> I don't think that a wrong comment justifies a patch to stable.
>
> I'm not really sure whether it's safe or not. It seems all usages are
> protected by write locking css_set_lock but maybe I'm missing
> something and as the commit is born out of confusion, I'm very
> inclined to revert it by default. Are you sure this one is safe?
Thinking about it further, one scenario is worrying me but it
eventually looks safe but by accident.
CPU 0
CPU 1
cgroup_task_migrate {
task_lock(p)
rcu_assign_pointer(tsk->cgroups, newcg);
task_unlock(tsk);
write_lock(&css_set_lock);
if (!list_empty(&tsk->cg_list))
list_move(&tsk->cg_list, &newcg->tasks);
write_unlock(&css_set_lock);
write_lock(&css_set_lock);
put_css_set(oldcg);
list_add(&child->cg_list, &child->cgroups->tasks); (1)
On (1), child->cgroups should have the value of newcg and not oldcg
due to the memory ordering implied by the locking of css_set_lock. Now
I can't guarantee that because I'm no memory ordering expert. And even
if it's safe, it's so very non obvious that I now agree with you:
let's revert the patch and restart with a better base by gathering
all the cgroup fork code in the current cgroup_post_fork place.
Thanks.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <CAFTL4hy+vrvJKrc1Y2FW44k=LBi72H=34337xALpbtG_3u5O7w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-10-20 18:23 ` Frederic Weisbecker
2012-10-20 22:37 ` Tejun Heo
1 sibling, 0 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-20 18:23 UTC (permalink / raw)
To: Tejun Heo
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML
2012/10/20 Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
> 2012/10/19 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
>> Hello, Frederic.
>>
>> On Fri, Oct 19, 2012 at 03:44:20PM -0400, Frederic Weisbecker wrote:
>>> > For -stable, I think it's better to revert. If you want to remove
>>> > task_lock, let's do it for 3.8.
>>>
>>> I don't think that a wrong comment justifies a patch to stable.
>>
>> I'm not really sure whether it's safe or not. It seems all usages are
>> protected by write locking css_set_lock but maybe I'm missing
>> something and as the commit is born out of confusion, I'm very
>> inclined to revert it by default. Are you sure this one is safe?
>
> Thinking about it further, one scenario is worrying me but it
> eventually looks safe but by accident.
>
> CPU 0
> CPU 1
>
> cgroup_task_migrate {
> task_lock(p)
> rcu_assign_pointer(tsk->cgroups, newcg);
> task_unlock(tsk);
>
> write_lock(&css_set_lock);
> if (!list_empty(&tsk->cg_list))
> list_move(&tsk->cg_list, &newcg->tasks);
> write_unlock(&css_set_lock);
>
> write_lock(&css_set_lock);
> put_css_set(oldcg);
> list_add(&child->cg_list, &child->cgroups->tasks); (1)
gmail mangled everything :(
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <CAFTL4hy+vrvJKrc1Y2FW44k=LBi72H=34337xALpbtG_3u5O7w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-20 18:23 ` Frederic Weisbecker
@ 2012-10-20 22:37 ` Tejun Heo
[not found] ` <20121020223709.GA5626-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
1 sibling, 1 reply; 28+ messages in thread
From: Tejun Heo @ 2012-10-20 22:37 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML
Hello, Frederic.
On Sat, Oct 20, 2012 at 02:21:43PM -0400, Frederic Weisbecker wrote:
> CPU 0
> CPU 1
>
> cgroup_task_migrate {
> task_lock(p)
> rcu_assign_pointer(tsk->cgroups, newcg);
> task_unlock(tsk);
>
> write_lock(&css_set_lock);
> if (!list_empty(&tsk->cg_list))
> list_move(&tsk->cg_list, &newcg->tasks);
> write_unlock(&css_set_lock);
>
> write_lock(&css_set_lock);
> put_css_set(oldcg);
> list_add(&child->cg_list, &child->cgroups->tasks); (1)
Man, that's confusing. :)
> On (1), child->cgroups should have the value of newcg and not oldcg
> due to the memory ordering implied by the locking of css_set_lock. Now
> I can't guarantee that because I'm no memory ordering expert. And even
> if it's safe, it's so very non obvious that I now agree with you:
> let's revert the patch and restart with a better base by gathering
> all the cgroup fork code in the current cgroup_post_fork place.
Aye aye, let's move everything to cgroup_post_fork() and then we don't
have to worry about grabbing task_lock multiple times.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
[not found] ` <20121020223709.GA5626-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
@ 2012-10-22 9:30 ` Frederic Weisbecker
0 siblings, 0 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2012-10-22 9:30 UTC (permalink / raw)
To: Tejun Heo
Cc: Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA, LKML
2012/10/21 Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> Hello, Frederic.
>
> On Sat, Oct 20, 2012 at 02:21:43PM -0400, Frederic Weisbecker wrote:
>> CPU 0
>> CPU 1
>>
>> cgroup_task_migrate {
>> task_lock(p)
>> rcu_assign_pointer(tsk->cgroups, newcg);
>> task_unlock(tsk);
>>
>> write_lock(&css_set_lock);
>> if (!list_empty(&tsk->cg_list))
>> list_move(&tsk->cg_list, &newcg->tasks);
>> write_unlock(&css_set_lock);
>>
>> write_lock(&css_set_lock);
>> put_css_set(oldcg);
>> list_add(&child->cg_list, &child->cgroups->tasks); (1)
>
> Man, that's confusing. :)
Sorry and I'm currently stuck in some airport and too lazy to reorder
the above lines :)
>
>> On (1), child->cgroups should have the value of newcg and not oldcg
>> due to the memory ordering implied by the locking of css_set_lock. Now
>> I can't guarantee that because I'm no memory ordering expert. And even
>> if it's safe, it's so very non obvious that I now agree with you:
>> let's revert the patch and restart with a better base by gathering
>> all the cgroup fork code in the current cgroup_post_fork place.
>
> Aye aye, let's move everything to cgroup_post_fork() and then we don't
> have to worry about grabbing task_lock multiple times.
Agreed. and Acked-by: Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2012-10-22 9:30 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-08 2:00 Is not locking task_lock in cgroup_fork() safe? Tejun Heo
2012-10-08 2:01 ` Tejun Heo
2012-10-08 5:46 ` Li Zefan
[not found] ` <507268AA.8050509-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2012-10-08 6:57 ` Tejun Heo
2012-10-16 19:34 ` Tejun Heo
[not found] ` <20121016193428.GE16166-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-17 7:26 ` Li Zefan
2012-10-08 12:58 ` Frederic Weisbecker
2012-10-08 12:48 ` Frederic Weisbecker
[not found] ` <CAFTL4hzXWtzp7megsCAEuak5=_2SWmp9age-+wrpyQAU4BRZ0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-16 19:33 ` Tejun Heo
[not found] ` <20121016193341.GD16166-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-18 14:50 ` Frederic Weisbecker
[not found] ` <CAFTL4hzo_w7HTgC9ApTk113X8WdZSpV+D+VSEe=604YEJFmKsg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-18 20:07 ` Tejun Heo
[not found] ` <20121018200705.GG13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-18 20:53 ` Frederic Weisbecker
[not found] ` <CAFTL4hy7g4e11OUOyoihrEU8hiVgZoV1=141UtUpj9a72SNs_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-19 0:38 ` Tejun Heo
[not found] ` <20121019003835.GE13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-19 0:58 ` Tejun Heo
[not found] ` <20121019005801.GF13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-19 8:50 ` Li Zefan
2012-10-19 0:59 ` [PATCH cgroup/for-3.7-fixes 1/2] Revert "cgroup: Remove task_lock() from cgroup_post_fork()" Tejun Heo
[not found] ` <20121019005922.GG13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-19 8:51 ` Li Zefan
2012-10-19 13:35 ` Frederic Weisbecker
[not found] ` <CAFTL4hz82==b3ioSMhbKzh0CN1ivR7RQMKKMFFWu5PHPjg=Bfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-19 19:38 ` Tejun Heo
[not found] ` <20121019193808.GL13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-19 19:44 ` Frederic Weisbecker
[not found] ` <CAFTL4hwQ6Ntn5GJwj=jiO2p3GdwhEMp0MyR8dgUj_Lx0U4kNqg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-19 21:07 ` Tejun Heo
[not found] ` <20121019210738.GA1180-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-20 18:21 ` Frederic Weisbecker
[not found] ` <CAFTL4hy+vrvJKrc1Y2FW44k=LBi72H=34337xALpbtG_3u5O7w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-20 18:23 ` Frederic Weisbecker
2012-10-20 22:37 ` Tejun Heo
[not found] ` <20121020223709.GA5626-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2012-10-22 9:30 ` Frederic Weisbecker
2012-10-19 0:59 ` [PATCH cgroup/for-3.7-fixes 2/2] Revert "cgroup: Drop task_lock(parent) on cgroup_fork()" Tejun Heo
[not found] ` <20121019005951.GH13370-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-10-19 8:51 ` Li Zefan
2012-10-19 13:45 ` Frederic Weisbecker
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).