Re: [RFC][PATCH] fix SCHED_FIFO spec violation (backport)

linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "John Kacur" <jkacur@gmail.com>
To: "Darren Hart" <dvhltc@us.ibm.com>
Cc: linux-rt-users@vger.kernel.org,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Clark Williams" <williams@redhat.com>,
	"Ingo Molnar" <mingo@elte.hu>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Dmitry Adamushko" <dmitry.adamushko@gmail.com>,
	"Gregory Haskins" <ghaskins@novell.com>
Subject: Re: [RFC][PATCH] fix SCHED_FIFO spec violation (backport)
Date: Mon, 7 Jul 2008 13:00:49 +0200	[thread overview]
Message-ID: <520f0cf10807070400l67c14367hff52f7b16c7ee757@mail.gmail.com> (raw)
In-Reply-To: <1215271090.18491.20.camel@Aeon>

[-- Attachment #1: Type: text/plain, Size: 8039 bytes --]

On Sat, Jul 5, 2008 at 5:18 PM, Darren Hart <dvhltc@us.ibm.com> wrote:
> On Fri, 2008-07-04 at 15:08 +0200, John Kacur wrote:
>> Since we're always being admonished to do code-review, and I needed to
>> waste some time, here are a few comments, to what looks largely like a
>> nice patch to me.
>
> Hi John, thanks for the review.  Note that this is a backport of Peter'z
> git patch, so I'll try to address some of your rationale questions
> below, but the short answer is "I tried to match Peter's implementation
> as closely as possible."  Regarding the patch applying, it worked for
> me... see below.
>
>>
>> On Fri, Jul 4, 2008 at 12:41 AM, Darren Hart <dvhltc@us.ibm.com> wrote:
>> > Enqueue deprioritized RT tasks to head of prio array
>> >
>> > This patch backports Peter Z's enqueue to head of prio array on
>> > de-prioritization to 2.6.24.7-rt14 which doesn't have the
>> > enqueue_rt_entity and associated changes.
>> >
>> > I've run several long running real-time java benchmarks and it's
>> > holding so far.  Steven, please consider this patch for inclusion
>> > in the next 2.6.24.7-rtX release.
>> >
>> > Peter, I didn't include your Signed-off-by as only about half your
>> > original patch applied to 2.6.24.7-r14.  If you're happy with this
>> > version, would you also sign off?
>> >
>> > Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
>> >
>> >
>> > ---
>> > Index: linux-2.6.24.7-ibmrt2.6-view/include/linux/sched.h
>> > ===================================================================
>> > --- linux-2.6.24.7-ibmrt2.6-view.orig/include/linux/sched.h
>> > +++ linux-2.6.24.7-ibmrt2.6-view/include/linux/sched.h
>> > @@ -897,11 +897,16 @@ struct uts_namespace;
>> >  struct rq;
>> >  struct sched_domain;
>> >
>> > +#define ENQUEUE_WAKEUP 0x01
>> > +#define ENQUEUE_HEAD   0x02
>> > +
>> > +#define DEQUEUE_SLEEP  0x01
>> > +
>>
>> Question: is ENQUEUE_WAKEUP equal to DEQUEUE_SLEEP by design or
>> coincidence?
>
> Coincidence.  The ENQUEUE_* flags are only to be used with the
> enqueue_task* methods, while the DEQUEUE_* flags are for deqeue_task*.
> Note that the conversion of sleep to the DEQUEUE_SLEEP flag isn't really
> necessary as there is only the one flag, but it makes the calls
> parallel, which I suspect was Peter's intention (but I speculate here).
>
>> The renaming of wakeup and sleep to flags makes it at
>> least superficially seem like they overlap. Since a large part of the
>> patch is renaming, it might be easier to understand if the renaming
>> was done as a separate patch, but on the other hand, that is probably
>> just a PITA. :)
>
> Seems a small enough patch to be all in one to me.  If others object
> I'll split it out, but again, I tried to keep the backport as close to
> Peter's original patch as possible.
----SNIP-----
I'm not sure the renaming is necessary at all, since "wakeup" and
"sleep" seem more descriptive than "flags". If you skip the renaming
then the size of the patch is reduced to half it's current size.

>> Minor nit: was it necessary to create a new int, why not just flags &=
>> ENQUEUE_WAKEUP, plus subsequent renaming where necessary.
>>
>>
>
>
> Well... I've copied the entire function here for context:
>
> static void enqueue_task_fair(struct rq *rq, struct task_struct *p, int
> flags)
> {
>        struct cfs_rq *cfs_rq;
>        struct sched_entity *se = &p->se;
>        int wakeup = flags & ENQUEUE_WAKEUP;
>
>        for_each_sched_entity(se) {
>                if (se->on_rq)
>                        break;
>                cfs_rq = cfs_rq_of(se);
>                enqueue_entity(cfs_rq, se, wakeup);
>                wakeup = 1;
>        }
> }
>
> Note that "wakeup = 1;" for all but the initial entity which uses the
> flag that was passed in.  So if this is correct behavior, then the new
> integer seems like a reasonable approach to me.  Note that the
> dequeue_task_fair has a parallel implementation.
>
> Peter, can you explain why only the first entity is subject to the value
> of the passed-in flags in these two functions.  I understand this was
> the orginal behavior as well.

I wonder if the only reason the extra int was created was just as a
quick way to deal with the renaming of the argument but not changing
the code in the function. If we skip the renmaing once again, then it
becomes slightly simpler.

Ok, so Peter Z did the hard work of creating the original patch, and
Darren Hart did the hard work of backporting it, I am doing the
relatively simple task of refractoring it, this reduces Darren's patch
in size by at least half. Maybe this alternate patch would be more
readily acceptable by Steve et al? OTOH Note that Darren extensively
tested his version of the patch, and I just did a compile and boot
test.

Including both and inlined and attachment until I can figure out how
to avoid line wrap in gmail.

Signed-off-by: John Kacur <jkacur@gmail.com>

Index: linux-2.6.24.7-rt14/include/linux/sched.h
===================================================================
--- linux-2.6.24.7-rt14.orig/include/linux/sched.h
+++ linux-2.6.24.7-rt14/include/linux/sched.h
@@ -897,6 +897,11 @@ struct uts_namespace;
 struct rq;
 struct sched_domain;

+#define ENQUEUE_WAKEUP	0x01
+#define ENQUEUE_HEAD	0x02
+
+#define DEQUEUE_SLEEP	0x01
+
 struct sched_class {
 	const struct sched_class *next;

Index: linux-2.6.24.7-rt14/kernel/sched.c
===================================================================
--- linux-2.6.24.7-rt14.orig/kernel/sched.c
+++ linux-2.6.24.7-rt14/kernel/sched.c
@@ -1759,7 +1759,7 @@ out_activate:
 	else
 		schedstat_inc(p, se.nr_wakeups_remote);
 	update_rq_clock(rq);
-	activate_task(rq, p, 1);
+	activate_task(rq, p, ENQUEUE_WAKEUP);
 	check_preempt_curr(rq, p);
 	success = 1;

@@ -3968,7 +3968,7 @@ asmlinkage void __sched __schedule(void)
 			prev->state = TASK_RUNNING;
 		} else {
 			touch_softlockup_watchdog();
-			deactivate_task(rq, prev, 1);
+			deactivate_task(rq, prev, DEQUEUE_SLEEP);
 		}
 		switch_count = &prev->nvcsw;
 	}
@@ -4431,7 +4431,7 @@ EXPORT_SYMBOL(sleep_on_timeout);
 void task_setprio(struct task_struct *p, int prio)
 {
 	unsigned long flags;
-	int oldprio, prev_resched, on_rq, running;
+	int oldprio, prev_resched, on_rq, running, down;
 	struct rq *rq;
 	const struct sched_class *prev_class = p->sched_class;

@@ -4472,6 +4472,7 @@ void task_setprio(struct task_struct *p,
 	else
 		p->sched_class = &fair_sched_class;

+	down = (prio > p->prio) ? ENQUEUE_HEAD : 0;
 	p->prio = prio;

 //	trace_special_pid(p->pid, __PRIO(oldprio), PRIO(p));
@@ -4480,7 +4481,7 @@ void task_setprio(struct task_struct *p,
 	if (running)
 		p->sched_class->set_curr_task(rq);
 	if (on_rq) {
-		enqueue_task(rq, p, 0);
+		enqueue_task(rq, p, down);
 		check_class_changed(rq, p, prev_class, oldprio, running);
 	}
 //	trace_special(prev_resched, _need_resched(), 0);
Index: linux-2.6.24.7-rt14/kernel/sched_fair.c
===================================================================
--- linux-2.6.24.7-rt14.orig/kernel/sched_fair.c
+++ linux-2.6.24.7-rt14/kernel/sched_fair.c
@@ -764,6 +764,7 @@ static void enqueue_task_fair(struct rq
 {
 	struct cfs_rq *cfs_rq;
 	struct sched_entity *se = &p->se;
+	wakeup &= ENQUEUE_WAKEUP;

 	for_each_sched_entity(se) {
 		if (se->on_rq)
@@ -783,6 +784,7 @@ static void dequeue_task_fair(struct rq
 {
 	struct cfs_rq *cfs_rq;
 	struct sched_entity *se = &p->se;
+	sleep &= DEQUEUE_SLEEP;

 	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
Index: linux-2.6.24.7-rt14/kernel/sched_rt.c
===================================================================
--- linux-2.6.24.7-rt14.orig/kernel/sched_rt.c
+++ linux-2.6.24.7-rt14/kernel/sched_rt.c
@@ -185,7 +185,11 @@ static void enqueue_task_rt(struct rq *r
 {
 	struct rt_prio_array *array = &rq->rt.active;

-	list_add_tail(&p->run_list, array->queue + p->prio);
+	if (unlikely(wakeup & ENQUEUE_HEAD))
+		list_add(&p->run_list, array->queue + p->prio);
+ 	else
+		list_add_tail(&p->run_list, array->queue + p->prio);
+
 	__set_bit(p->prio, array->bitmap);
 	inc_rt_tasks(p, rq);

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: pandd.patch --]
[-- Type: text/x-patch; name=pandd.patch, Size: 3215 bytes --]

Signed-off-by: John Kacur <jkacur@gmail.com>

Index: linux-2.6.24.7-rt14/include/linux/sched.h
===================================================================
--- linux-2.6.24.7-rt14.orig/include/linux/sched.h
+++ linux-2.6.24.7-rt14/include/linux/sched.h
@@ -897,6 +897,11 @@ struct uts_namespace;
 struct rq;
 struct sched_domain;
 
+#define ENQUEUE_WAKEUP	0x01
+#define ENQUEUE_HEAD	0x02
+
+#define DEQUEUE_SLEEP	0x01
+
 struct sched_class {
 	const struct sched_class *next;
 
Index: linux-2.6.24.7-rt14/kernel/sched.c
===================================================================
--- linux-2.6.24.7-rt14.orig/kernel/sched.c
+++ linux-2.6.24.7-rt14/kernel/sched.c
@@ -1759,7 +1759,7 @@ out_activate:
 	else
 		schedstat_inc(p, se.nr_wakeups_remote);
 	update_rq_clock(rq);
-	activate_task(rq, p, 1);
+	activate_task(rq, p, ENQUEUE_WAKEUP);
 	check_preempt_curr(rq, p);
 	success = 1;
 
@@ -3968,7 +3968,7 @@ asmlinkage void __sched __schedule(void)
 			prev->state = TASK_RUNNING;
 		} else {
 			touch_softlockup_watchdog();
-			deactivate_task(rq, prev, 1);
+			deactivate_task(rq, prev, DEQUEUE_SLEEP);
 		}
 		switch_count = &prev->nvcsw;
 	}
@@ -4431,7 +4431,7 @@ EXPORT_SYMBOL(sleep_on_timeout);
 void task_setprio(struct task_struct *p, int prio)
 {
 	unsigned long flags;
-	int oldprio, prev_resched, on_rq, running;
+	int oldprio, prev_resched, on_rq, running, down;
 	struct rq *rq;
 	const struct sched_class *prev_class = p->sched_class;
 
@@ -4472,6 +4472,7 @@ void task_setprio(struct task_struct *p,
 	else
 		p->sched_class = &fair_sched_class;
 
+	down = (prio > p->prio) ? ENQUEUE_HEAD : 0;
 	p->prio = prio;
 
 //	trace_special_pid(p->pid, __PRIO(oldprio), PRIO(p));
@@ -4480,7 +4481,7 @@ void task_setprio(struct task_struct *p,
 	if (running)
 		p->sched_class->set_curr_task(rq);
 	if (on_rq) {
-		enqueue_task(rq, p, 0);
+		enqueue_task(rq, p, down);
 		check_class_changed(rq, p, prev_class, oldprio, running);
 	}
 //	trace_special(prev_resched, _need_resched(), 0);
Index: linux-2.6.24.7-rt14/kernel/sched_fair.c
===================================================================
--- linux-2.6.24.7-rt14.orig/kernel/sched_fair.c
+++ linux-2.6.24.7-rt14/kernel/sched_fair.c
@@ -764,6 +764,7 @@ static void enqueue_task_fair(struct rq
 {
 	struct cfs_rq *cfs_rq;
 	struct sched_entity *se = &p->se;
+	wakeup &= ENQUEUE_WAKEUP;
 
 	for_each_sched_entity(se) {
 		if (se->on_rq)
@@ -783,6 +784,7 @@ static void dequeue_task_fair(struct rq
 {
 	struct cfs_rq *cfs_rq;
 	struct sched_entity *se = &p->se;
+	sleep &= DEQUEUE_SLEEP;
 
 	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
Index: linux-2.6.24.7-rt14/kernel/sched_rt.c
===================================================================
--- linux-2.6.24.7-rt14.orig/kernel/sched_rt.c
+++ linux-2.6.24.7-rt14/kernel/sched_rt.c
@@ -185,7 +185,11 @@ static void enqueue_task_rt(struct rq *r
 {
 	struct rt_prio_array *array = &rq->rt.active;
 
-	list_add_tail(&p->run_list, array->queue + p->prio);
+	if (unlikely(wakeup & ENQUEUE_HEAD))
+		list_add(&p->run_list, array->queue + p->prio);
+ 	else
+		list_add_tail(&p->run_list, array->queue + p->prio);
+
 	__set_bit(p->prio, array->bitmap);
 	inc_rt_tasks(p, rq);

next prev parent reply	other threads:[~2008-07-07 11:00 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-03 22:41 [RFC][PATCH] fix SCHED_FIFO spec violation (backport) Darren Hart
2008-07-04 13:08 ` John Kacur
2008-07-05 15:18   ` Darren Hart
2008-07-07 11:00     ` John Kacur [this message]
2008-07-07 15:24       ` Darren Hart
2008-07-15  8:03     ` Peter Zijlstra
2008-07-15  9:19       ` John Kacur
2008-07-15 10:01 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=520f0cf10807070400l67c14367hff52f7b16c7ee757@mail.gmail.com \
    --to=jkacur@gmail.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=dmitry.adamushko@gmail.com \
    --cc=dvhltc@us.ibm.com \
    --cc=ghaskins@novell.com \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).