From: Bharata B Rao <bharata@linux.vnet.ibm.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel@vger.kernel.org,
Dhaval Giani <dhaval@linux.vnet.ibm.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
Gautham R Shenoy <ego@in.ibm.com>,
Srivatsa Vaddagiri <vatsa@in.ibm.com>,
Ingo Molnar <mingo@elte.hu>, Pavel Emelyanov <xemul@openvz.org>,
Herbert Poetzl <herbert@13thfloor.at>,
Avi Kivity <avi@redhat.com>, Chris Friesen <cfriesen@nortel.com>,
Paul Menage <menage@google.com>,
Mike Waychison <mikew@google.com>
Subject: Re: [RFC v2 PATCH 4/8] sched: Enforce hard limits by throttling
Date: Wed, 14 Oct 2009 17:20:03 +0530
Message-ID: <20091014115003.GA3540@in.ibm.com>
In-Reply-To: <1255511864.8392.370.camel@twins>
On Wed, Oct 14, 2009 at 11:17:44AM +0200, Peter Zijlstra wrote:
> On Wed, 2009-10-14 at 09:11 +0530, Bharata B Rao wrote:
> > On Tue, Oct 13, 2009 at 04:27:00PM +0200, Peter Zijlstra wrote:
> > > On Wed, 2009-09-30 at 18:22 +0530, Bharata B Rao wrote:
> > >
> > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > index 0f1ea4a..77ace43 100644
> > > > --- a/include/linux/sched.h
> > > > +++ b/include/linux/sched.h
> > > > @@ -1024,7 +1024,7 @@ struct sched_domain;
> > > > struct sched_class {
> > > > const struct sched_class *next;
> > > >
> > > > - void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup);
> > > > + int (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup);
> > > > void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep);
> > > > void (*yield_task) (struct rq *rq);
> > > >
> > >
> > > I really hate this, it uglifies all the enqueue code in a horrid way
> > > (which is most of this patch).
> > >
> > > Why can't we simply enqueue the task on a throttled group just like rt?
> >
> > We do enqueue a task to its group even if the group is throttled. However, such
> > throttled groups are not enqueued further. In such scenarios, even though the
> > task enqueue to its parent group succeeded, it really didn't add any task to
> > the cpu runqueue (rq). So we need to identify this condition and avoid
> > incrementing rq->running. That is why this return value is needed.
>
> I would still consider those tasks running, the fact that they don't get
> to run is a different matter.
OK, I realize that's how rt considers them too. I thought that we should
update rq->running when tasks go off the runqueue due to throttling. When a
task is throttled, it is certainly still present on its group's cfs_rq, but it
doesn't contribute to the CPU load because the throttled group entity isn't
on any cfs_rq. rq->running is used to derive a few load balancing metrics, and
they might go wrong if rq->running isn't up to date.
Do you still think we shouldn't update rq->running? If so, I can get rid
of this return value change.
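To make the two accounting choices concrete, here is a minimal user-space sketch. It is only a toy model: struct toy_group and both helper functions are made-up names for illustration, not kernel code and not code from this patch. It compares counting every queued task as running (the rt view) with skipping tasks that belong to throttled groups (what this patch does for rq->running).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these are hypothetical types, not the kernel's. */
struct toy_group {
	int nr_tasks;    /* tasks queued on the group's own cfs_rq */
	bool throttled;  /* group entity is not on any parent cfs_rq */
};

/* rt-style view: every queued task counts as running, throttled or not. */
static int running_rt_style(const struct toy_group *groups, int n)
{
	int i, total = 0;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		total += groups[i].nr_tasks;
	return total;
}

/* View taken in this patch: tasks of throttled groups are not counted. */
static int running_skip_throttled(const struct toy_group *groups, int n)
{
	int i, total = 0;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		if (!groups[i].throttled)
			total += groups[i].nr_tasks;
	return total;
}
```

With one unthrottled group holding 3 tasks and one throttled group holding 5, the first helper returns 8 and the second returns 3. That gap is what a load balancing metric derived from the count would see.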
>
> This added return value really utterly craps up the code and I'm not
> going to take it.
OK :) I will work towards making these changes more acceptable in future iterations.
>
> What I'm not seeing is why all this code looks so very much different
> from the rt bits.
The throttling code here looks different from rt for the following reasons:
- As I mentioned earlier, I update rq->running during throttling, which
is not done in rt afaics.
- There are special conditions to prevent movement of tasks in and out
of the throttled groups during load balancing and migration.
- rt dequeues the throttled entity by walking the entity hierarchy from
update_curr_rt(). But I found it difficult to do the same in cfs because
update_curr() is called from many different places, including places where
we are already walking the entity hierarchy. A second walk of the hierarchy
(in update_curr()) while we are in the middle of a hierarchy walk didn't look
all that good. So I resorted to just marking the entity as throttled in
update_curr() and doing the dequeuing later from put_prev_entity().
Isn't this acceptable?
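The mark-now, dequeue-later scheme from the last point can be sketched in user space as follows. This is a simplified illustration under my own assumptions: struct toy_entity, its fields, and the toy_ functions are hypothetical stand-ins, not the actual sched_entity or the code in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy entity; field names are illustrative, not the kernel's. */
struct toy_entity {
	unsigned long long runtime; /* time consumed in the current period */
	unsigned long long quota;   /* time allowed per period */
	bool throttled;             /* limit hit, dequeue still pending */
	bool on_rq;                 /* queued on the parent's runqueue */
};

/*
 * Stand-in for update_curr(): it can be reached in the middle of a
 * hierarchy walk, so it only sets a flag and never dequeues.
 */
static void toy_update_curr(struct toy_entity *se, unsigned long long delta)
{
	assert(se != NULL);
	se->runtime += delta;
	if (se->runtime >= se->quota)
		se->throttled = true;
}

/*
 * Stand-in for put_prev_entity(): the entity is being switched out,
 * so the deferred dequeue is done here.
 */
static void toy_put_prev_entity(struct toy_entity *se)
{
	assert(se != NULL);
	if (se->throttled && se->on_rq)
		se->on_rq = false;
}
```

Between the two calls the entity is marked throttled but still queued, which is the window this approach accepts in exchange for avoiding a nested hierarchy walk.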
Regards,
Bharata.