cgroups.vger.kernel.org archive mirror
From: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: anjana vk <anjanvk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: cgroup attach task - slogging cpu
Date: Fri, 4 Oct 2013 15:02:07 +0200
Message-ID: <20131004130207.GA9338@redhat.com>
In-Reply-To: <CALPf4Tz+Gf_Q7wKKBufCc1mtV1qVPVrOW0S1qhHxfOv6pJa2Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Hello Anjana,

On 10/04, anjana vk wrote:
>
> I saw an issue of CPU slogging posted in
> https://lkml.org/lkml/2013/8/28/283, and require your valuable
> suggestion on a very similar issue I am facing.

Not sure I understand, but just in case: yes, the lockless
while_each_thread() is buggy and should be fixed (it should actually
die eventually). And a lot of the current users of while_each_thread()
are themselves buggy and need fixes no matter what we do with
while_each_thread.
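
To illustrate why the lockless walk is fragile, here is a minimal
userspace sketch, not kernel code. It assumes the usual failure mode:
the thread the walk started from is unlinked in the middle of the walk,
so the "!= g" termination test can never succeed. struct thread,
ring_unlink(), walk_ring() and demo_walk() are hypothetical names made
up for this illustration, and the step limit exists only so that the
demo returns instead of spinning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the circular thread list. */
struct thread {
	int pid;
	struct thread *next;	/* circular ring */
};

/*
 * Unlink @victim from the ring but leave victim->next intact,
 * loosely mimicking what an RCU-style list removal leaves behind.
 */
static void ring_unlink(struct thread *member, struct thread *victim)
{
	struct thread *t = member;

	while (t->next != victim)
		t = t->next;
	t->next = victim->next;
}

/*
 * Walk in the style of do { } while_each_thread(g, t): stop once the
 * next thread is @g again. If @victim is non-NULL it is unlinked
 * during the first step. Returns the number of steps taken, or -1
 * when @limit is exceeded (the walk would never terminate).
 */
static int walk_ring(struct thread *g, struct thread *victim, int limit)
{
	struct thread *t = g;
	int visited = 0;

	do {
		if (++visited > limit)
			return -1;
		if (victim) {
			ring_unlink(t, victim);
			victim = NULL;
		}
	} while ((t = t->next) != g);

	return visited;
}

/* Build a three-thread ring and walk it, optionally unlinking the start. */
static int demo_walk(int unlink_start)
{
	struct thread a = { 1, NULL }, b = { 2, NULL }, c = { 3, NULL };

	a.next = &b;
	b.next = &c;
	c.next = &a;
	return walk_ring(&a, unlink_start ? &a : NULL, 16);
}
```

With the ring left alone, demo_walk(0) visits all three threads. With
the starting thread unlinked, demo_walk(1) runs into the step limit
and returns -1; without such a limit the loop would never end.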

Oh, and just in case: I am (slowly) working on this, but have not
finished it yet, sorry.

But I do not really understand cgroup_attach_task(), and I am not
sure you observed the same problem. Adding Tejun to Cc.

> We are facing the issue of cpu slogging in the function
>     cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
>                        bool threadgroup)
> in the do-while loop which iterates over the threadgroup.
>
> 	rcu_read_lock();
> 	do {
> 		struct task_and_cgroup ent;
>
> 		/* @tsk either already exited or can't exit until the end */
> 		if (tsk->flags & PF_EXITING)
> 			continue;
>
> 		/* as per above, nr_threads may decrease, but not increase. */
> 		BUG_ON(i >= group_size);
> 		ent.task = tsk;
> 		ent.cgrp = task_cgroup_from_root(tsk, root);
> 		/* nothing to do if this task is already in the cgroup */
> 		if (ent.cgrp == cgrp)
> 			continue;
> 		/*
> 		 * saying GFP_ATOMIC has no effect here because we did prealloc
> 		 * earlier, but it's good form to communicate our expectations.
> 		 */
> 		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
> 		BUG_ON(retval != 0);
> 		i++;
>
> 		if (!threadgroup)
> 			break;
> 	} while_each_thread(leader, tsk);
>
> Problem Observed:
> In this loop, when the thread group has a single thread, the argument
> "bool threadgroup" is false, and (ent.cgrp == cgrp) is true, we
> continue to the next thread instead of breaking out of the loop.

But in this case the loop should be terminated by while_each_thread's
check?
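
A small userspace sketch of the control flow in question, not kernel
code: a "continue" inside a do-while jumps straight to the loop
condition, so it skips the "if (!threadgroup) break;" at the bottom of
the body and lets the condition advance to the next thread. struct
thread, its already_in_cgroup flag (standing in for ent.cgrp == cgrp),
collect() and the demo_*() helpers are all hypothetical names for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a thread in a circular ring. */
struct thread {
	int pid;
	bool already_in_cgroup;	/* stands in for ent.cgrp == cgrp */
	struct thread *next;	/* circular ring */
};

/* Collect the pids that would be migrated; returns how many. */
static int collect(struct thread *leader, bool threadgroup, int *out, int cap)
{
	struct thread *tsk = leader;
	int i = 0;

	do {
		if (tsk->already_in_cgroup)
			continue;	/* jumps to the while condition below */
		if (i < cap)
			out[i++] = tsk->pid;
		if (!threadgroup)
			break;
	} while ((tsk = tsk->next) != leader);

	return i;
}

/* One thread, already in the cgroup: the loop condition ends the walk. */
static int demo_single(void)
{
	struct thread only = { 1, true, NULL };
	int out[4];

	only.next = &only;
	return collect(&only, false, out, 4);
}

/*
 * Two threads, the first already in the cgroup, threadgroup == false.
 * Returns the pid that was collected, or -1 if the count is not 1.
 */
static int demo_pair(void)
{
	struct thread a = { 1, true, NULL }, b = { 2, false, NULL };
	int out[4];

	a.next = &b;
	b.next = &a;
	return collect(&a, false, out, 4) == 1 ? out[0] : -1;
}
```

In the single-thread case the condition sees the leader again and the
loop ends with nothing collected. With a second thread in the ring,
the same "continue" moves on to pid 2 and collects it even though only
one task was requested.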

Again, while_each_thread() is wrong, and we need

	--- x/kernel/cgroup.c
	+++ x/kernel/cgroup.c
	@@ -2034,7 +2034,7 @@ static int cgroup_attach_task(struct cgr
		 * take an rcu_read_lock.
		 */
		rcu_read_lock();
	-	do {
	+	for_each_thread(leader->signal, tsk) {
			struct task_and_cgroup ent;
	 
			/* @tsk either already exited or can't exit until the end */
	@@ -2058,7 +2058,7 @@ static int cgroup_attach_task(struct cgr
	 
			if (!threadgroup)
				break;
	-	} while_each_thread(leader, tsk);
	+	}
		rcu_read_unlock();
		/* remember the number of threads in the array for later. */
		group_size = i;

after we add for_each_thread/etc. But I am not sure we need another
"threadgroup" check.
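
As a rough userspace illustration of the idea, assuming a helper that
walks a list anchored in a shared per-group object rather than in any
one thread: the walk ends at the end of the list, so it terminates
even if a thread, including the first one, is unlinked on the way.
struct signal_sketch, for_each_thread_sketch(), list_unlink() and
demo_walk() are hypothetical names and are not the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a thread on a plain list. */
struct thread {
	int pid;
	struct thread *next;	/* NULL-terminated, not circular */
};

/* Hypothetical per-group object that owns the list head. */
struct signal_sketch {
	struct thread *thread_head;
};

/* Walk from the shared head to the end of the list. */
#define for_each_thread_sketch(sig, t) \
	for ((t) = (sig)->thread_head; (t) != NULL; (t) = (t)->next)

/* Unlink @victim but leave victim->next intact for in-flight walkers. */
static void list_unlink(struct signal_sketch *sig, struct thread *victim)
{
	struct thread **pp = &sig->thread_head;

	while (*pp && *pp != victim)
		pp = &(*pp)->next;
	if (*pp)
		*pp = victim->next;
}

/*
 * Walk a three-thread list, optionally unlinking the first thread
 * during the first step. Returns the number of steps, or -1 if the
 * walk exceeds a safety limit.
 */
static int demo_walk(int unlink_first)
{
	struct thread a = { 1, NULL }, b = { 2, NULL }, c = { 3, NULL };
	struct signal_sketch sig;
	struct thread *t;
	int visited = 0;

	a.next = &b;
	b.next = &c;
	sig.thread_head = &a;

	for_each_thread_sketch(&sig, t) {
		if (unlink_first && visited == 0)
			list_unlink(&sig, &a);
		if (++visited > 16)
			return -1;
	}
	return visited;
}
```

Both demo_walk(0) and demo_walk(1) finish after three steps, because
the end of the walk no longer depends on reaching one particular
thread again.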

> Possible Solution and Doubt:
> When a break condition was added as shown in the attached patch, the
> cpu sluggishness no longer occurred.
> Can you please advise whether this is the right way to fix the
> above-mentioned issue?
> Also, if a fix is already in for this, can you please point me to it?
>
> Thanks and Regards
> Anjana
>
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 1f53387..cae8416 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2002,7 +2002,9 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
>  		ent.task = tsk;
>  		ent.cgrp = task_cgroup_from_root(tsk, root);
>  		/* nothing to do if this task is already in the cgroup */
> -		if (ent.cgrp == cgrp)
> +		if (ent.cgrp == cgrp && !threadgroup)
> +			break;
> +		else if(ent.cgrp == cgrp)
>  			continue;
>  		/*
>  		 * saying GFP_ATOMIC has no effect here because we did prealloc
> @@ -2199,7 +2201,6 @@ retry_find_task:
>  	ret = cgroup_attach_task(cgrp, tsk, threadgroup);
>  
>  	threadgroup_unlock(tsk);
> -
>  	put_task_struct(tsk);
>  out_unlock_cgroup:
>  	mutex_unlock(&cgroup_mutex);
