From: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: anjana vk <anjanvk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: cgroup attach task - slogging cpu
Date: Fri, 4 Oct 2013 15:02:07 +0200
Message-ID: <20131004130207.GA9338@redhat.com>
In-Reply-To: <CALPf4Tz+Gf_Q7wKKBufCc1mtV1qVPVrOW0S1qhHxfOv6pJa2Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Hello Anjana,
On 10/04, anjana vk wrote:
>
> I saw an issue of CPU slogging posted in
> https://lkml.org/lkml/2013/8/28/283, and require your valuable
> suggestion on a very similar issue I am facing.

Not sure I understand, but just in case: yes, the lockless
while_each_thread() is buggy and should be fixed (it should actually
die eventually). And a lot of the current users of while_each_thread()
are themselves buggy and need fixing no matter what we do with
while_each_thread().
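To illustrate the failure mode, here is a minimal userspace sketch, not
kernel code. struct thread, count_threads() and unlink_thread() are
hypothetical names; the macro only mirrors the shape of the kernel's
while_each_thread(); and the unlink step assumes an exiting thread is
dropped from the list while keeping its own next pointer (as
list_del_rcu() would leave it). The step bound exists only so the sketch
can report a runaway walk instead of spinning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: only the thread-list link. */
struct thread {
	struct thread *next;	/* next thread in the circular group list */
};

/* Same shape as the kernel macro: advance, then compare with the start. */
#define while_each_thread(g, t) while (((t) = (t)->next) != (g))

/*
 * Walk the group starting at g. Returns the number of loop bodies
 * executed, or -1 once max_steps is exceeded. The real loop has no
 * such bound, so the -1 case corresponds to spinning forever.
 */
static int count_threads(struct thread *g, int max_steps)
{
	struct thread *t = g;
	int steps = 0;

	assert(g != NULL && max_steps > 0);
	do {
		if (++steps > max_steps)
			return -1;
	} while_each_thread(g, t);

	return steps;
}

/*
 * Model an exit (assumption: list_del_rcu()-like behaviour). The rest
 * of the group stops pointing at "dead", but dead->next is left intact.
 */
static void unlink_thread(struct thread *prev, struct thread *dead)
{
	prev->next = dead->next;
}
```

With three threads a, b, c linked in a ring, count_threads(&a, 100)
returns 3. After unlink_thread(&c, &a) the same call returns -1: the
walk cycles through b and c and never sees a again, so the termination
test can never succeed.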

Oh, and just in case: I am (slowly) working on this, but have not
finished it yet, sorry.

But I do not really understand cgroup_attach_task(), and I am not
sure you observed the same problem. Adding Tejun.

> We are facing the issue of cpu slogging in the function
> cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
> bool threadgroup)
> in the do-while loop which iterates over the threadgroup.
>
> rcu_read_lock();
> do {
> struct task_and_cgroup ent;
>
> /* @tsk either already exited or can't exit until the end */
> if (tsk->flags & PF_EXITING)
> continue;
>
> /* as per above, nr_threads may decrease, but not increase. */
> BUG_ON(i >= group_size);
> ent.task = tsk;
> ent.cgrp = task_cgroup_from_root(tsk, root);
> /* nothing to do if this task is already in the cgroup */
> if (ent.cgrp == cgrp)
> continue;
> /*
> * saying GFP_ATOMIC has no effect here because we did prealloc
> * earlier, but it's good form to communicate our expectations.
> */
> retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
> BUG_ON(retval != 0);
> i++;
>
> if (!threadgroup)
> break;
> } while_each_thread(leader, tsk);
>
> Problem Observed:
> In this loop, in case of a single thread, and argument “bool
> threadgroup” as “false” and
> if(ent.cgrp == cgrp) is true
> we will continue to the next thread instead of breaking out of the loop.
But in this case the loop should be terminated by while_each_thread's
check?
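The question above rests on how "continue" behaves in a do/while loop:
it jumps to the loop condition, so the advance-and-compare step of
while_each_thread() still runs. A small userspace sketch of the quoted
loop, assuming a consistent list (no concurrent exit); struct thread,
already_in_cgroup and collect() are hypothetical names, not kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct. */
struct thread {
	struct thread *next;		/* circular thread-group list */
	bool already_in_cgroup;		/* models ent.cgrp == cgrp */
};

/*
 * Models the shape of the quoted loop. Returns how many threads would
 * be stored in the array; *passes counts executed loop bodies.
 */
static int collect(struct thread *leader, bool threadgroup, int *passes)
{
	struct thread *tsk = leader;
	int i = 0;

	assert(leader != NULL && passes != NULL);
	*passes = 0;
	do {
		(*passes)++;
		if (tsk->already_in_cgroup)
			continue;	/* goes to the while (...) test below */
		i++;
		if (!threadgroup)
			break;
	} while ((tsk = tsk->next) != leader);

	return i;
}
```

For a single-thread group whose only task is already attached, collect()
returns 0 after one pass, because the condition sees the leader again
and stops. For a ring a, b, c where only a is already attached and
threadgroup is false, the continue moves on to b, and b is collected on
the second pass. That is the "continue to the next thread" behaviour
described in the report.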
Again, again, while_each_thread() is wrong, and we need
--- x/kernel/cgroup.c
+++ x/kernel/cgroup.c
@@ -2034,7 +2034,7 @@ static int cgroup_attach_task(struct cgr
* take an rcu_read_lock.
*/
rcu_read_lock();
- do {
+ for_each_thread(leader->signal, tsk) {
struct task_and_cgroup ent;
/* @tsk either already exited or can't exit until the end */
@@ -2058,7 +2058,7 @@ static int cgroup_attach_task(struct cgr
if (!threadgroup)
break;
- } while_each_thread(leader, tsk);
+ }
rcu_read_unlock();
/* remember the number of threads in the array for later. */
group_size = i;
after we add for_each_thread() and friends. But I am not sure we need
another "threadgroup" check.
> Possible Solution and Doubt:
> When a break condition was added as shown in the patch attached, the
> cpu sluggishness was not occurring.
> Can you please provide your suggestions, if this is the right way to
> fix the above mentioned issue.
> Also, if a fix is already in for this, can you please guide me to that.
>
> Thanks and Regards
> Anjana
>
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 1f53387..cae8416 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2002,7 +2002,9 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
> ent.task = tsk;
> ent.cgrp = task_cgroup_from_root(tsk, root);
> /* nothing to do if this task is already in the cgroup */
> - if (ent.cgrp == cgrp)
> + if (ent.cgrp == cgrp && !threadgroup)
> + break;
> + else if(ent.cgrp == cgrp)
> continue;
> /*
> * saying GFP_ATOMIC has no effect here because we did prealloc
> @@ -2199,7 +2201,6 @@ retry_find_task:
> ret = cgroup_attach_task(cgrp, tsk, threadgroup);
>
> threadgroup_unlock(tsk);
> -
> put_task_struct(tsk);
> out_unlock_cgroup:
> mutex_unlock(&cgroup_mutex);