From: Peter Zijlstra <peterz@infradead.org>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@kernel.org>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Paul Turner <pjt@google.com>,
Benjamin Segall <bsegall@google.com>,
Steven Rostedt <rostedt@goodmis.org>,
Mike Galbraith <bitbucket@online.de>
Subject: Re: [PATCH 7/9] sched/fair: Optimize cgroup pick_next_task_fair
Date: Thu, 30 Jan 2014 13:37:10 +0100
Message-ID: <20140130123710.GA2936@laptop.programming.kicks-ass.net>
In-Reply-To: <CAKfTPtAgR5xdr-Vj14t+RRVoLACrvTsHw_mXgimUDKsuNBUKJA@mail.gmail.com>
On Thu, Jan 30, 2014 at 01:18:09PM +0100, Vincent Guittot wrote:
> On 28 January 2014 18:16, Peter Zijlstra <peterz@infradead.org> wrote:
>
> [snip]
>
> >
> > @@ -4662,9 +4682,86 @@ static void check_preempt_wakeup(struct
> > static struct task_struct *
> > pick_next_task_fair(struct rq *rq, struct task_struct *prev)
> > {
> > - struct task_struct *p;
> > struct cfs_rq *cfs_rq = &rq->cfs;
> > struct sched_entity *se;
> > + struct task_struct *p;
> > +
> > +#ifdef CONFIG_FAIR_GROUP_SCHED
> > + if (!cfs_rq->nr_running)
> > + return NULL;
>
> Couldn't you move the test above out of the CONFIG_FAIR_GROUP_SCHED
> and remove the same test that is done after the simple label
No, we have to check it twice, because ...
>
> > +
> > + if (prev->sched_class != &fair_sched_class)
> > + goto simple;
> > +
> > + /*
> > + * Because of the set_next_buddy() in dequeue_task_fair() it is rather
> > + * likely that a next task is from the same cgroup as the current.
> > + *
> > + * Therefore attempt to avoid putting and setting the entire cgroup
> > + * hierarchy, only change the part that actually changes.
> > + */
> > +
> > + do {
> > + struct sched_entity *curr = cfs_rq->curr;
> > +
> > + /*
> > + * Since we got here without doing put_prev_entity() we also
> > + * have to consider cfs_rq->curr. If it is still a runnable
> > + * entity, update_curr() will update its vruntime, otherwise
> > + * forget we've ever seen it.
> > + */
> > + if (curr && curr->on_rq)
> > + update_curr(cfs_rq);
> > + else
> > + curr = NULL;
> > +
> > + /*
> > + * This call to check_cfs_rq_runtime() will do the throttle and
> > + * dequeue its entity in the parent(s). Therefore the 'simple'
> > + * nr_running test will indeed be correct.
> > + */
> > + if (unlikely(check_cfs_rq_runtime(cfs_rq)))
> > + goto simple;
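... here, as the comment above says, check_cfs_rq_runtime() can throttle
the cfs_rq and dequeue its entity in the parent(s), so nr_running could
have been modified by the time we reach the 'simple' label.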
> > + se = pick_next_entity(cfs_rq, curr);
> > + cfs_rq = group_cfs_rq(se);
> > + } while (cfs_rq);
> > +
> > + p = task_of(se);
> > +
> > + /*
> > + * Since we haven't yet done put_prev_entity and if the selected task
> > + * is a different task than we started out with, try and touch the
> > + * least amount of cfs_rqs.
> > + */
> > + if (prev != p) {
> > + struct sched_entity *pse = &prev->se;
> > +
> > + while (!(cfs_rq = is_same_group(se, pse))) {
> > + int se_depth = se->depth;
> > + int pse_depth = pse->depth;
> > +
> > + if (se_depth <= pse_depth) {
> > + put_prev_entity(cfs_rq_of(pse), pse);
> > + pse = parent_entity(pse);
> > + }
> > + if (se_depth >= pse_depth) {
> > + set_next_entity(cfs_rq_of(se), se);
> > + se = parent_entity(se);
> > + }
> > + }
> > +
> > + put_prev_entity(cfs_rq, pse);
> > + set_next_entity(cfs_rq, se);
> > + }
> > +
> > + if (hrtick_enabled(rq))
> > + hrtick_start_fair(rq, p);
> > +
> > + return p;
> > +simple:
> > + cfs_rq = &rq->cfs;
> > +#endif
> >
> > if (!cfs_rq->nr_running)
> > return NULL;
And therefore this test needs to stay.
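
To make the ordering concrete, here is a minimal userspace sketch of the
two checks. It is a toy model, not the kernel code: struct toy_cfs_rq,
toy_check_runtime(), toy_pick() and the over_quota flag are all
hypothetical names made up for illustration, and the throttle only
touches the immediate parent, whereas the real check_cfs_rq_runtime()
path is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a CFS runqueue; NOT the kernel's struct cfs_rq. */
struct toy_cfs_rq {
	unsigned int nr_running;	/* runnable entities queued here */
	struct toy_cfs_rq *child;	/* group runqueue one level down, or NULL */
	bool over_quota;		/* hypothetical: bandwidth exhausted */
};

/*
 * Models the side effect described in the patch comment: throttling a
 * group runqueue dequeues its entity from the parent, so the parent's
 * nr_running drops. Returns true if a throttle happened.
 */
bool toy_check_runtime(struct toy_cfs_rq *parent, struct toy_cfs_rq *cfs_rq)
{
	if (!cfs_rq->over_quota)
		return false;
	if (parent && parent->nr_running > 0)
		parent->nr_running--;
	return true;
}

/* Returns 1 if a task would be picked, 0 for the "return NULL" case. */
int toy_pick(struct toy_cfs_rq *root)
{
	struct toy_cfs_rq *parent = NULL, *cfs_rq = root;

	if (!root->nr_running)		/* first test: nothing queued at all */
		return 0;

	while (cfs_rq) {
		if (toy_check_runtime(parent, cfs_rq))
			goto simple;	/* nr_running may have changed */
		parent = cfs_rq;
		cfs_rq = cfs_rq->child;
	}
	return 1;			/* fast path found something to run */

simple:
	if (!root->nr_running)		/* second test: re-check after throttle */
		return 0;
	return 1;
}
```

With a root whose only entity is a group that is over quota, the first
test passes (nr_running == 1), the throttle drops it to 0, and only the
second test catches that there is nothing left to pick.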