From: Bharata B Rao <bharata@linux.vnet.ibm.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel@vger.kernel.org,
	Dhaval Giani <dhaval@linux.vnet.ibm.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	Gautham R Shenoy <ego@in.ibm.com>,
	Srivatsa Vaddagiri <vatsa@in.ibm.com>,
	Ingo Molnar <mingo@elte.hu>, Pavel Emelyanov <xemul@openvz.org>,
	Herbert Poetzl <herbert@13thfloor.at>,
	Avi Kivity <avi@redhat.com>, Chris Friesen <cfriesen@nortel.com>,
	Paul Menage <menage@google.com>,
	Mike Waychison <mikew@google.com>
Subject: Re: [RFC v2 PATCH 4/8] sched: Enforce hard limits by throttling
Date: Wed, 14 Oct 2009 17:20:03 +0530
Message-ID: <20091014115003.GA3540@in.ibm.com>
In-Reply-To: <1255511864.8392.370.camel@twins>

On Wed, Oct 14, 2009 at 11:17:44AM +0200, Peter Zijlstra wrote:
> On Wed, 2009-10-14 at 09:11 +0530, Bharata B Rao wrote:
> > On Tue, Oct 13, 2009 at 04:27:00PM +0200, Peter Zijlstra wrote:
> > > On Wed, 2009-09-30 at 18:22 +0530, Bharata B Rao wrote:
> > > 
> > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > index 0f1ea4a..77ace43 100644
> > > > --- a/include/linux/sched.h
> > > > +++ b/include/linux/sched.h
> > > > @@ -1024,7 +1024,7 @@ struct sched_domain;
> > > >  struct sched_class {
> > > >     const struct sched_class *next;
> > > >  
> > > > -   void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup);
> > > > +   int (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup);
> > > >     void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep);
> > > >     void (*yield_task) (struct rq *rq);
> > > >  
> > > 
> > > I really hate this, it uglfies all the enqueue code in a horrid way
> > > (which is most of this patch).
> > > 
> > > Why can't we simply enqueue the task on a throttled group just like rt?
> > 
> > We do enqueue a task to its group even if the group is throttled. However such
> > throttled groups are not enqueued further. In such scenarios, even though the
> > task enqueue to its parent group succeeded, it really didn't add any task to
> > the cpu runqueue (rq). So we need to identify this condition and don't
> > increment rq->running. That is why this return value is needed.
> 
> I would still consider those tasks running, the fact that they don't get
> to run is a different matter.

OK, I realize that is how rt treats them as well. My thinking was that
rq->running should be updated when tasks go off the runqueue due to
throttling. A task in a throttled group is still present on its group's
cfs_rq, but it doesn't contribute to the CPU load because the throttled
group entity isn't on any cfs_rq. rq->running feeds a few load balancing
metrics, and they could go wrong if rq->running isn't up to date.

Do you still think we shouldn't update rq->running? If so, I can get rid
of this return value change.
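
To make the accounting question concrete, here is a minimal sketch in plain
C. It is not kernel code: struct toy_group, struct toy_rq and the toy_*()
helpers are hypothetical names invented for illustration, and the model
collapses the hierarchy to a single group. Variant A only models the idea
behind the int return value (count a task in rq->running only if it became
visible at the rq level); variant B models the rt-like behaviour of always
counting it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the scheduler structures; NOT the kernel definitions. */
struct toy_group {
	bool throttled;	/* group has used up its runtime for this period */
	int nr_tasks;	/* tasks queued on the group's own cfs_rq */
};

struct toy_rq {
	int running;	/* plays the role of the rq->running counter */
};

/*
 * The task is always queued on its group. The return value says whether
 * the enqueue propagated to the rq: 1 if it did, 0 if the group is
 * throttled and therefore not enqueued any further.
 */
static int toy_enqueue_task(struct toy_group *g)
{
	assert(g);
	g->nr_tasks++;
	return g->throttled ? 0 : 1;
}

/* Variant A: only count tasks that actually reached the rq. */
static void toy_activate_task_a(struct toy_rq *rq, struct toy_group *g)
{
	if (toy_enqueue_task(g))
		rq->running++;
}

/* Variant B (rt-like): a queued task counts as running, throttled or not. */
static void toy_activate_task_b(struct toy_rq *rq, struct toy_group *g)
{
	(void)toy_enqueue_task(g);
	rq->running++;
}
```

With one throttled group, variant A leaves the counter at 0 after an
enqueue while variant B raises it to 1; in both cases the task sits on the
group's own queue. The difference between the two counters is exactly what
the load balancing metrics would see.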

> 
> This added return value really utterly craps up the code and I'm not
> going to take it.

OK :) I will work on making the patches more acceptable in future iterations.

> 
> What I'm not seeing is why all this code looks so very much different
> from the rt bits.

The throttling code here looks different from rt for the following reasons:

- As I mentioned earlier, I update rq->running during throttling, which
is not done in rt afaics.
- There are special conditions to prevent movement of tasks into and out
of throttled groups during load balancing and migration.
- rt dequeues the throttled entity by walking the entity hierarchy from
update_curr_rt(). I found it difficult to do the same in cfs because
update_curr() is called from many different places, including places where
we are already walking the entity hierarchy. A second walk of the hierarchy
(in update_curr()) in the middle of an ongoing hierarchy walk didn't look
all that good. So I resorted to just marking the entity as throttled in
update_curr() and doing the dequeuing later, from put_prev_entity().
Is this acceptable?
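
The mark-now, dequeue-later scheme from the last point can be sketched
roughly as below. This is a self-contained toy model, not the patch:
struct toy_entity, its fields and the toy_*() functions are hypothetical,
and the runtime accounting is reduced to a single counter per entity.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy scheduling entity; NOT the kernel's struct sched_entity. */
struct toy_entity {
	struct toy_entity *parent;	/* NULL at the top of the hierarchy */
	long runtime_left;		/* remaining budget (made-up unit) */
	bool limited;			/* true if a hard limit applies */
	bool throttled;			/* set when the budget is exhausted */
	bool on_rq;			/* still queued on the parent's cfs_rq */
};

/*
 * Step 1: charge runtime. When the budget runs out, only mark the entity.
 * No hierarchy walk and no dequeue happen here, so this is safe to call
 * from inside another walk.
 */
static void toy_update_curr(struct toy_entity *se, long delta)
{
	assert(delta >= 0);
	if (!se->limited)
		return;
	se->runtime_left -= delta;
	if (se->runtime_left <= 0)
		se->throttled = true;
}

/* Step 2: the deferred dequeue. Returns true if the entity was dequeued. */
static bool toy_put_prev_entity(struct toy_entity *se)
{
	if (se->throttled && se->on_rq) {
		se->on_rq = false;
		return true;
	}
	return false;
}

/* The single bottom-up walk; returns how many entities got dequeued. */
static int toy_put_prev_task(struct toy_entity *se, long delta)
{
	int dequeued = 0;

	for (; se; se = se->parent) {
		toy_update_curr(se, delta);
		if (toy_put_prev_entity(se))
			dequeued++;
	}
	return dequeued;
}
```

The point of the split is that toy_update_curr() never touches the queues,
so the only place where an entity leaves its parent's queue is the walk
that is already in progress in toy_put_prev_task().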

Regards,
Bharata.

