From: "John Kacur" <jkacur@gmail.com>
To: "Peter Zijlstra" <a.p.zijlstra@chello.nl>
Cc: "Darren Hart" <dvhltc@us.ibm.com>,
linux-rt-users@vger.kernel.org,
"Steven Rostedt" <rostedt@goodmis.org>,
"Clark Williams" <williams@redhat.com>,
"Ingo Molnar" <mingo@elte.hu>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Dmitry Adamushko" <dmitry.adamushko@gmail.com>,
"Gregory Haskins" <ghaskins@novell.com>
Subject: Re: [RFC][PATCH] fix SCHED_FIFO spec violation (backport)
Date: Tue, 15 Jul 2008 11:19:31 +0200
Message-ID: <520f0cf10807150219y2cafd0cbqed1550f5bc8ad4cf@mail.gmail.com>
In-Reply-To: <1216108996.12595.133.camel@twins>

On Tue, Jul 15, 2008 at 10:03 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Sat, 2008-07-05 at 08:18 -0700, Darren Hart wrote:
>> On Fri, 2008-07-04 at 15:08 +0200, John Kacur wrote:
>
>> > On Fri, Jul 4, 2008 at 12:41 AM, Darren Hart <dvhltc@us.ibm.com> wrote:
>> > > Enqueue deprioritized RT tasks to head of prio array
>> > >
>> > > This patch backports Peter Z's enqueue to head of prio array on
>> > > de-prioritization to 2.6.24.7-rt14 which doesn't have the
>> > > enqueue_rt_entity and associated changes.
>> > >
>> > > I've run several long running real-time java benchmarks and it's
>> > > holding so far. Steven, please consider this patch for inclusion
>> > > in the next 2.6.24.7-rtX release.
>> > >
>> > > Peter, I didn't include your Signed-off-by as only about half your
>> > > original patch applied to 2.6.24.7-rt14. If you're happy with this
>> > > version, would you also sign off?
>> > >
>> > > Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
>> > >
>> > >
>> > > ---
>> > > Index: linux-2.6.24.7-ibmrt2.6-view/include/linux/sched.h
>> > > ===================================================================
>> > > --- linux-2.6.24.7-ibmrt2.6-view.orig/include/linux/sched.h
>> > > +++ linux-2.6.24.7-ibmrt2.6-view/include/linux/sched.h
>> > > @@ -897,11 +897,16 @@ struct uts_namespace;
>> > > struct rq;
>> > > struct sched_domain;
>> > >
>> > > +#define ENQUEUE_WAKEUP 0x01
>> > > +#define ENQUEUE_HEAD 0x02
>> > > +
>> > > +#define DEQUEUE_SLEEP 0x01
>> > > +
>> >
>> > Question: is ENQUEUE_WAKEUP equal to DEQUEUE_SLEEP by design or
>> > coincidence?
>>
>> Coincidence. The ENQUEUE_* flags are only to be used with the
>> enqueue_task* methods, while the DEQUEUE_* flags are for dequeue_task*.
>> Note that the conversion of sleep to the DEQUEUE_SLEEP flag isn't really
>> necessary as there is only the one flag, but it makes the calls
>> parallel, which I suspect was Peter's intention (but I speculate here).
>
> Indeed.
>
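
To make the point about the two flag namespaces concrete, here is a minimal user-space sketch. The flag values are copied from the patch above; the toy_* names and masks are hypothetical and exist only for illustration, they are not part of the kernel or of the patch. Each decoder only ever looks at its own flag set, so the fact that ENQUEUE_WAKEUP and DEQUEUE_SLEEP share the value 0x01 never matters.

```c
#include <assert.h>

/* Flag values copied from the quoted patch. */
#define ENQUEUE_WAKEUP 0x01
#define ENQUEUE_HEAD   0x02

#define DEQUEUE_SLEEP  0x01

/* Hypothetical masks: the flags each side is allowed to receive. */
#define TOY_ENQUEUE_MASK (ENQUEUE_WAKEUP | ENQUEUE_HEAD)
#define TOY_DEQUEUE_MASK (DEQUEUE_SLEEP)

/* Hypothetical decoded form of an enqueue request. */
struct toy_enqueue_req {
	int wakeup;	/* 1 if the task is being woken up */
	int head;	/* 1 if the task should go to the head of its queue */
};

/* Stand-in for the enqueue side: interprets ENQUEUE_* flags only. */
static struct toy_enqueue_req toy_decode_enqueue(int flags)
{
	struct toy_enqueue_req req;

	assert((flags & ~TOY_ENQUEUE_MASK) == 0);	/* reject unknown bits */
	req.wakeup = !!(flags & ENQUEUE_WAKEUP);
	req.head = !!(flags & ENQUEUE_HEAD);
	return req;
}

/* Stand-in for the dequeue side: interprets DEQUEUE_* flags only. */
static int toy_decode_dequeue(int flags)
{
	assert((flags & ~TOY_DEQUEUE_MASK) == 0);	/* reject unknown bits */
	return !!(flags & DEQUEUE_SLEEP);
}
```

Calling toy_decode_enqueue(ENQUEUE_HEAD) yields wakeup == 0 and head == 1, while toy_decode_dequeue(DEQUEUE_SLEEP) yields 1; neither function can be handed the other side's flags without the caller mixing the names up by hand.
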
>> > The renaming of wakeup and sleep to flags makes it at
>> > least superficially seem like they overlap. Since a large part of the
>> > patch is renaming, it might be easier to understand if the renaming
>> > was done as a separate patch, but on the other hand, that is probably
>> > just a PITA. :)
>>
>> Seems a small enough patch to be all in one to me. If others object
>> I'll split it out, but again, I tried to keep the backport as close to
>> Peter's original patch as possible.
>
> I just hacked together what was needed to post out an RFC, never made it
> a 'pretty' series ;-)
>
> One could go the extra length and make it multiple patches, but it's not
> too big, so I'm not sure we should worry about that.
>
>> >
>> > > struct sched_class {
>> > > const struct sched_class *next;
>> > >
>> > > - void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup);
>> > > - void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep);
>> > > + void (*enqueue_task) (struct rq *rq, struct task_struct *p, int flags);
>> > > + void (*dequeue_task) (struct rq *rq, struct task_struct *p, int flags);
>> > > void (*yield_task) (struct rq *rq);
>> > > int (*select_task_rq)(struct task_struct *p, int sync);
>> > >
>
>> > > Index: linux-2.6.24.7-ibmrt2.6-view/kernel/sched_fair.c
>> > > ===================================================================
>> > > --- linux-2.6.24.7-ibmrt2.6-view.orig/kernel/sched_fair.c
>> > > +++ linux-2.6.24.7-ibmrt2.6-view/kernel/sched_fair.c
>> > > @@ -756,10 +756,11 @@ static inline struct sched_entity *paren
>> > > * increased. Here we update the fair scheduling stats and
>> > > * then put the task into the rbtree:
>> > > */
>> > > -static void enqueue_task_fair(struct rq *rq, struct task_struct *p, int wakeup)
>> > > +static void enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>> > > {
>> > > struct cfs_rq *cfs_rq;
>> > > struct sched_entity *se = &p->se;
>> > > + int wakeup = flags & ENQUEUE_WAKEUP;
>> >
>> > Minor nit: was it necessary to create a new int? Why not just flags &=
>> > ENQUEUE_WAKEUP, plus subsequent renaming where necessary?
>
> As Darren already said, the rename of wakeup/sleep to flags is needed
> because of the multiple value thing and consistency.
>
>> Well... I've copied the entire function here for context:
>>
>> static void enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>> {
>> 	struct cfs_rq *cfs_rq;
>> 	struct sched_entity *se = &p->se;
>> 	int wakeup = flags & ENQUEUE_WAKEUP;
>>
>> 	for_each_sched_entity(se) {
>> 		if (se->on_rq)
>> 			break;
>> 		cfs_rq = cfs_rq_of(se);
>> 		enqueue_entity(cfs_rq, se, wakeup);
>> 		wakeup = 1;
>> 	}
>> }
>>
>> Note that "wakeup = 1;" applies to all but the initial entity, which
>> uses the flag that was passed in. So if this is correct behavior, then
>> the new integer seems like a reasonable approach to me. Note that
>> dequeue_task_fair has a parallel implementation.
>>
>> Peter, can you explain why only the first entity is subject to the value
>> of the passed-in flags in these two functions? I understand this was
>> the original behavior as well.
>
> group scheduling :-)
>
> So, if you haven't run away by now, I'll continue explaining.
>
> The current group scheduler is a hierarchy of schedulers. What
> enqueue_task_fair() does is enqueue the task 'somewhere' in that
> hierarchy, and then walk up the hierarchy checking whether each parent
> is already enqueued (the if (se->on_rq) bit). If so, it can stop, since
> all further parents will then be enqueued too. However, if the
> parent is not already enqueued, it must be enqueued as well.
>
> So what we do is dequeue (with sleep set) a scheduler from its parent
> scheduler when there are no more tasks to run, and enqueue it again
> (with wakeup set) when a new 'task' arrives. Note that a 'task' can be
> a so-called super-task, which is essentially another scheduler.
>
>
Hey Peter, thanks for the excellent explanations. I think Darren is
waiting for a Signed-off-by from you, since it is a backport of your work.
I was just reviewing it for fun. Darren, if you want to add a
Reviewed-by: John Kacur <jkacur at gmail dot com>
to the patch, please feel free to, but it is not necessary.
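
For anyone reading along, the hierarchy walk Peter describes can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: struct toy_entity and the toy_* functions are hypothetical stand-ins for struct sched_entity, enqueue_entity() and enqueue_task_fair(), and the explicit parent-pointer loop stands in for for_each_sched_entity(). It shows the two behaviours discussed above: the walk stops at the first parent that is already enqueued, and only the first entity sees the caller's wakeup value while every parent enqueued on the way up is treated as a wakeup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-in for struct sched_entity. */
struct toy_entity {
	struct toy_entity *parent;	/* NULL at the top of the hierarchy */
	int on_rq;			/* non-zero once enqueued with its parent */
	int enqueued_as_wakeup;		/* wakeup value used when it was enqueued */
};

/* Stand-in for enqueue_entity(): only records what happened. */
static void toy_enqueue_entity(struct toy_entity *se, int wakeup)
{
	se->on_rq = 1;
	se->enqueued_as_wakeup = wakeup;
}

/*
 * Mirrors the loop in the quoted enqueue_task_fair(): enqueue the task's
 * entity, then walk up through its parents until one is already enqueued.
 * Returns how many entities were enqueued by this call.
 */
static int toy_enqueue_task(struct toy_entity *se, int wakeup)
{
	int count = 0;

	assert(se != NULL);
	for (; se; se = se->parent) {
		if (se->on_rq)
			break;		/* all further parents are enqueued too */
		toy_enqueue_entity(se, wakeup);
		wakeup = 1;		/* a parent gaining a child is a wakeup */
		count++;
	}
	return count;
}
```

With a three-level chain (task, group, root) where only root is already enqueued, toy_enqueue_task(&task, 0) enqueues two entities: the task with wakeup 0 and the group with wakeup 1, and root is left untouched. A second call on the same task enqueues nothing.
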