* [RFC PATCH -rt] Priority preemption latency
@ 2006-06-10  0:01 Darren Hart
  2006-06-10  6:48 ` Ingo Molnar
  0 siblings, 1 reply; 15+ messages in thread
From: Darren Hart @ 2006-06-10  0:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Thomas Gleixner

We have run into a situation where a lower priority RT task will run after
a higher priority RT task is awoken.  We are running the test case
available here:

  http://linux.dvhart.com/tests/prio-preempt/

For all required files, including librt.h, download the tarball here:

  http://linux.dvhart.com/tests/tests.tar.bz2

The test case returns non-zero on failure, so just run it in a loop,
exiting on failure.  The following is a patch by Mike Kravetz that seems
to resolve the problem.  Thoughts, comments?

Thanks,

--
Darren Hart
IBM Linux Technology Center

I've been looking into the priority preemption issues with [the
prio-preempt test case].  My 'theory' is that when RT tasks are awakened
they are not always put on the 'best' runqueue.  As a result, the
rt_overload functionality has to be engaged to get the task to a CPU it
can run on.  This 'delay' in getting the task on a CPU is the 'bug'
exposed by the testcase, as a lower priority task is allowed to run during
this delay.

In the testcase, awakened worker tasks should all run on the same CPU, as
the others are busy running other higher priority tasks.  But, from the
scheduler's point of view, these other CPUs might be a better place for
the awakened task because they are 'less loaded'.

My quick and dirty patch (below) is to the try_to_wake_up path.  When
awakening an RT task, don't send it to a remote CPU (determined to be less
loaded) unless it can preempt the task running on the remote CPU.  In such
cases the task is added to the current CPU's runqueue.  I've been
successfully running the C testcase for a while with this patch applied.

I'm not sure if there is a 'complete' solution to this problem without a
redesign of 'global' RT scheduling on SMP.
The rt_overload mechanism does not guarantee strict priority ordering (as
evidenced by this bug).  Perhaps the best solution we can hope for with
the current mechanism is to make the scheduler be smarter with RT task
placement on SMP.  This would at least minimize the need for rt_overload.

--
Mike

diff -Naupr linux-rayrt12.1-r357/kernel/sched.c linux-rayrt12.1-r357.work/kernel/sched.c
--- linux-rayrt12.1-r357/kernel/sched.c	2006-05-27 00:43:35.000000000 +0000
+++ linux-rayrt12.1-r357.work/kernel/sched.c	2006-06-08 22:41:20.000000000 +0000
@@ -1543,6 +1543,17 @@ static int try_to_wake_up(task_t *p, uns
 		}
 	}
 
+	/*
+	 * XXX Don't send RT task elsewhere unless it can preempt current
+	 * XXX on other CPU.  Better yet would be for awakened RT tasks to
+	 * XXX examine this (and all other) CPU(s) to see what is the best
+	 * XXX fit.  For example there is no check here to see if the
+	 * XXX currently running task can be preempted (which would be the
+	 * XXX ideal case).
+	 */
+	if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq))
+		goto out_set_cpu;
+
 	new_cpu = cpu; /* Could not wake to this_cpu.  Wake to cpu instead */
 
 out_set_cpu:
 	new_cpu = wake_idle(new_cpu, p);

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-10  0:01 [RFC PATCH -rt] Priority preemption latency Darren Hart
@ 2006-06-10  6:48 ` Ingo Molnar
  2006-06-10  7:21   ` Ingo Molnar
  2006-06-11  5:49   ` Darren Hart
  0 siblings, 2 replies; 15+ messages in thread
From: Ingo Molnar @ 2006-06-10  6:48 UTC (permalink / raw)
  To: Darren Hart; +Cc: linux-kernel, Thomas Gleixner

* Darren Hart <dvhltc@us.ibm.com> wrote:

> @@ -1543,6 +1543,17 @@ static int try_to_wake_up(task_t *p, uns
> 		}
> 	}
>
> +	/*
> +	 * XXX Don't send RT task elsewhere unless it can preempt current
> +	 * XXX on other CPU.  Better yet would be for awakened RT tasks to
> +	 * XXX examine this (and all other) CPU(s) to see what is the best
> +	 * XXX fit.  For example there is no check here to see if the
> +	 * XXX currently running task can be preempted (which would be the
> +	 * XXX ideal case).
> +	 */
> +	if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq))
> +		goto out_set_cpu;
> +

Great testcase!  Note that we already do RT-overload wakeups further below:

	/*
	 * If a newly woken up RT task cannot preempt the
	 * current (RT) task then try to find another
	 * CPU it can preempt:
	 */
	if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
		smp_send_reschedule_allbutself();
		rt_overload_wakeup++;
	}

What I think happened is that the above logic misses the case you have
described in detail, and doesn't mark the current CPU for rescheduling.

I.e. it sends an IPI to all _other_ CPUs, including the 'wrong target' -
but it doesn't mark the current CPU for reschedule - hence if the current
CPU is the only right target we might fail to handle this task!

Could you try the (untested) patch below, does it solve your testcase too?

	Ingo

Index: linux-rt.q/kernel/sched.c
===================================================================
--- linux-rt.q.orig/kernel/sched.c
+++ linux-rt.q/kernel/sched.c
@@ -1587,11 +1587,15 @@ out_set_cpu:
 	} else {
 		/*
 		 * If a newly woken up RT task cannot preempt the
-		 * current (RT) task then try to find another
-		 * CPU it can preempt:
+		 * current (RT) task (on a target runqueue) then try
+		 * to find another CPU it can preempt:
 		 */
 		if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
 			smp_send_reschedule_allbutself();
+			/*
+			 * Trigger a reschedule check on the current CPU too:
+			 */
+			set_tsk_need_resched(current);
 			rt_overload_wakeup++;
 		}
 	}

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-10  6:48 ` Ingo Molnar
@ 2006-06-10  7:21   ` Ingo Molnar
  2006-06-11  5:49   ` Darren Hart
  1 sibling, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2006-06-10  7:21 UTC (permalink / raw)
  To: Darren Hart; +Cc: linux-kernel, Thomas Gleixner

* Ingo Molnar <mingo@elte.hu> wrote:

> could you try the (untested) patch below, does it solve your testcase
> too?

Find updated patch below:

 - fix small race: use this_rq->curr, not 'current'.

 - optimize the case where the current CPU could be preempted, and do
   not send IPIs.

 - integrate the RT-overload global statistics counters into
   schedstats.  This should make things more scalable.

	Ingo

Index: linux-rt.q/kernel/sched.c
===================================================================
--- linux-rt.q.orig/kernel/sched.c
+++ linux-rt.q/kernel/sched.c
@@ -292,6 +292,11 @@ struct runqueue {
 	/* try_to_wake_up() stats */
 	unsigned long ttwu_cnt;
 	unsigned long ttwu_local;
+
+	/* RT-overload stats: */
+	unsigned long rto_schedule;
+	unsigned long rto_wakeup;
+	unsigned long rto_pulled;
 #endif
 };
 
@@ -426,7 +431,7 @@ static inline void task_rq_unlock(runque
  * bump this up when changing the output format or the meaning of an existing
  * format, so that tools can adapt (or abort)
  */
-#define SCHEDSTAT_VERSION 12
+#define SCHEDSTAT_VERSION 13
 
 static int show_schedstat(struct seq_file *seq, void *v)
 {
@@ -443,13 +448,14 @@ static int show_schedstat(struct seq_fil
 
 		/* runqueue-specific stats */
 		seq_printf(seq,
-		    "cpu%d %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu",
+		    "cpu%d %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu",
 		    cpu, rq->yld_both_empty,
 		    rq->yld_act_empty, rq->yld_exp_empty, rq->yld_cnt,
 		    rq->sched_switch, rq->sched_cnt, rq->sched_goidle,
 		    rq->ttwu_cnt, rq->ttwu_local,
 		    rq->rq_sched_info.cpu_time,
-		    rq->rq_sched_info.run_delay, rq->rq_sched_info.pcnt);
+		    rq->rq_sched_info.run_delay, rq->rq_sched_info.pcnt,
+		    rq->rto_schedule, rq->rto_wakeup, rq->rto_pulled);
 
 		seq_printf(seq, "\n");
 
@@ -640,9 +646,7 @@ static inline void sched_info_switch(tas
 #define sched_info_switch(t, next)	do { } while (0)
 #endif /* CONFIG_SCHEDSTATS */
 
-int rt_overload_schedule, rt_overload_wakeup, rt_overload_pulled;
-
-__cacheline_aligned_in_smp atomic_t rt_overload;
+static __cacheline_aligned_in_smp atomic_t rt_overload;
 
 static inline void inc_rt_tasks(task_t *p, runqueue_t *rq)
 {
@@ -1312,7 +1316,7 @@ static void balance_rt_tasks(runqueue_t
 		if (p && (p->prio < next->prio)) {
 			WARN_ON(p == src_rq->curr);
 			WARN_ON(!p->array);
-			rt_overload_pulled++;
+			schedstat_inc(this_rq, rto_pulled);
 
 			set_task_cpu(p, this_cpu);
 
@@ -1469,9 +1473,9 @@ static inline int wake_idle(int cpu, tas
 static int try_to_wake_up(task_t *p, unsigned int state, int sync, int mutex)
 {
 	int cpu, this_cpu, success = 0;
+	runqueue_t *this_rq, *rq;
 	unsigned long flags;
 	long old_state;
-	runqueue_t *rq;
 #ifdef CONFIG_SMP
 	unsigned long load, this_load;
 	struct sched_domain *sd, *this_sd = NULL;
@@ -1587,12 +1591,34 @@ out_set_cpu:
 	} else {
 		/*
 		 * If a newly woken up RT task cannot preempt the
-		 * current (RT) task then try to find another
-		 * CPU it can preempt:
+		 * current (RT) task (on a target runqueue) then try
+		 * to find another CPU it can preempt:
 		 */
 		if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
-			smp_send_reschedule_allbutself();
-			rt_overload_wakeup++;
+			this_rq = cpu_rq(this_cpu);
+			/*
+			 * Special-case: the task on this CPU can be
+			 * preempted.  In that case there's no need to
+			 * trigger reschedules on other CPUs, we can
+			 * mark the current task for reschedule.
+			 *
+			 * (Note that it's safe to access this_rq without
+			 * extra locking in this particular case, because
+			 * we are on the current CPU.)
+			 */
+			if (TASK_PREEMPTS_CURR(p, this_rq))
+				set_tsk_need_resched(this_rq->curr);
+			else
+				/*
+				 * Neither the intended target runqueue
+				 * nor the current CPU can take this task.
+				 * Trigger a reschedule on all other CPUs
+				 * nevertheless, maybe one of them can take
+				 * this task:
+				 */
+				smp_send_reschedule_allbutself();
+
+			schedstat_inc(this_rq, rto_wakeup);
 		}
 	}
 
@@ -1951,7 +1977,7 @@ static inline void finish_task_switch(ru
 	 * then kick other CPUs, they might run it:
 	 */
 	if (unlikely(rt_task(current) && prev->array && rt_task(prev))) {
-		rt_overload_schedule++;
+		schedstat_inc(rq, rto_schedule);
 		smp_send_reschedule_allbutself();
 	}
 #endif

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-10  6:48 ` Ingo Molnar
  2006-06-10  7:21   ` Ingo Molnar
@ 2006-06-11  5:49   ` Darren Hart
  2006-06-11  5:58     ` Ingo Molnar
  2006-06-11  6:24     ` Darren Hart
  1 sibling, 2 replies; 15+ messages in thread
From: Darren Hart @ 2006-06-11  5:49 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 2092 bytes --]

On Friday 09 June 2006 23:48, Ingo Molnar wrote:
> * Darren Hart <dvhltc@us.ibm.com> wrote:
> > @@ -1543,6 +1543,17 @@ static int try_to_wake_up(task_t *p, uns
> > 		}
> > 	}
> >
> > +	/*
> > +	 * XXX Don't send RT task elsewhere unless it can preempt current
> > +	 * XXX on other CPU.  Better yet would be for awakened RT tasks to
> > +	 * XXX examine this (and all other) CPU(s) to see what is the best
> > +	 * XXX fit.  For example there is no check here to see if the
> > +	 * XXX currently running task can be preempted (which would be the
> > +	 * XXX ideal case).
> > +	 */
> > +	if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq))
> > +		goto out_set_cpu;
> > +
>
> Great testcase!  Note that we already do RT-overload wakeups further
> below:
>
> 	/*
> 	 * If a newly woken up RT task cannot preempt the
> 	 * current (RT) task then try to find another
> 	 * CPU it can preempt:
> 	 */
> 	if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
> 		smp_send_reschedule_allbutself();
> 		rt_overload_wakeup++;
> 	}
>
> What I think happened is that the above logic misses the case you have
> described in detail, and doesn't mark the current CPU for rescheduling.
>
> I.e. it sends an IPI to all _other_ CPUs, including the 'wrong target' -
> but it doesn't mark the current CPU for reschedule - hence if the current
> CPU is the only right target we might fail to handle this task!
>
> Could you try the (untested) patch below, does it solve your testcase
> too?

Thanks for the updated patch!  It wouldn't quite build (proc_misc.c still
referenced the old rt_overload_* variables; fixup patch attached removing
those print statements).  I have it running on a 4-way Opteron box running
prio-preempt in a timed while loop, exiting only on failure.  It's been
running fine for several minutes - usually fails in under a minute.  We'll
see how it's doing in the morning :-)

--
Darren Hart
IBM Linux Technology Center
Realtime Linux Team

[-- Attachment #2: rto-proc_misc-fixup.patch --]
[-- Type: text/x-diff, Size: 1129 bytes --]

diff -Nurp linux-2.6.16/fs/proc/proc_misc.c linux-2.6.16-fixup/fs/proc/proc_misc.c
--- linux-2.6.16/fs/proc/proc_misc.c	2006-06-10 16:48:23.000000000 -0700
+++ linux-2.6.16-fixup/fs/proc/proc_misc.c	2006-06-10 17:00:55.000000000 -0700
@@ -569,19 +569,11 @@ static int show_stat(struct seq_file *p,
 	{
 		unsigned long nr_uninterruptible_cpu(int cpu);
 		extern int pi_initialized;
-		extern int rt_overload_schedule,
-			rt_overload_wakeup, rt_overload_pulled;
 		unsigned long rt_nr_running_cpu(int cpu);
 		extern atomic_t rt_overload;
 		int i;
 
-		seq_printf(p, "rt_overload_schedule: %d\n",
-			rt_overload_schedule);
-		seq_printf(p, "rt_overload_wakeup: %d\n",
-			rt_overload_wakeup);
-		seq_printf(p, "rt_overload_pulled: %d\n",
-			rt_overload_pulled);
 		seq_printf(p, "pi_init: %d\n", pi_initialized);
 
 		seq_printf(p, "nr_running(): %ld\n",
 			nr_running());
@@ -593,8 +585,6 @@ static int show_stat(struct seq_file *p,
 		for_each_cpu(i)
 			seq_printf(p, "rt_nr_running(%d): %ld\n",
 				i, rt_nr_running_cpu(i));
-		seq_printf(p, "rt_overload: %d\n", atomic_read(&rt_overload));
-
 	}
 #endif

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-11  5:49   ` Darren Hart
@ 2006-06-11  5:58     ` Ingo Molnar
  2006-06-11  6:37       ` Ingo Molnar
  1 sibling, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2006-06-11  5:58 UTC (permalink / raw)
  To: Darren Hart; +Cc: linux-kernel, Thomas Gleixner

* Darren Hart <dvhltc@us.ibm.com> wrote:

> Thanks for the updated patch!  It wouldn't quite build (proc_misc.c
> still referenced the old rt_overload_* variables; fixup patch attached
> removing those print statements). [...]

Doh, forgot to send those changes ...

> [...]  I have it running on a 4-way Opteron box running prio-preempt
> in a timed while loop, exiting only on failure.  It's been running
> fine for several minutes - usually fails in under a minute.  We'll see
> how it's doing in the morning :-)

Meanwhile I've released -rt3 with the fix (and the procfs change)
included.

I slept on it meanwhile, and I think the safest would be to also do the
attached patch on top of -rt3.  This makes the 'kick other CPUs' logic
totally unconditional - so whatever happens, the wakeup code will notice
if an RT task is in trouble finding the right CPU.  Under -rt3 we'd only
run into this branch if we stayed on the same CPU - but there can be
cases where we hit your scenario even in precisely such a case.  It's
rare but possible.
	Ingo

Index: linux-rt.q/kernel/sched.c
===================================================================
--- linux-rt.q.orig/kernel/sched.c
+++ linux-rt.q/kernel/sched.c
@@ -1588,38 +1588,37 @@ out_set_cpu:
 
 		this_cpu = smp_processor_id();
 		cpu = task_cpu(p);
-	} else {
+	}
+	/*
+	 * If a newly woken up RT task cannot preempt the
+	 * current (RT) task (on a target runqueue) then try
+	 * to find another CPU it can preempt:
+	 */
+	if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
+		this_rq = cpu_rq(this_cpu);
 		/*
-		 * If a newly woken up RT task cannot preempt the
-		 * current (RT) task (on a target runqueue) then try
-		 * to find another CPU it can preempt:
+		 * Special-case: the task on this CPU can be
+		 * preempted.  In that case there's no need to
+		 * trigger reschedules on other CPUs, we can
+		 * mark the current task for reschedule.
+		 *
+		 * (Note that it's safe to access this_rq without
+		 * extra locking in this particular case, because
+		 * we are on the current CPU.)
		 */
-		if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
-			this_rq = cpu_rq(this_cpu);
+		if (TASK_PREEMPTS_CURR(p, this_rq))
+			set_tsk_need_resched(this_rq->curr);
+		else
			/*
-			 * Special-case: the task on this CPU can be
-			 * preempted.  In that case there's no need to
-			 * trigger reschedules on other CPUs, we can
-			 * mark the current task for reschedule.
-			 *
-			 * (Note that it's safe to access this_rq without
-			 * extra locking in this particular case, because
-			 * we are on the current CPU.)
+			 * Neither the intended target runqueue
+			 * nor the current CPU can take this task.
+			 * Trigger a reschedule on all other CPUs
+			 * nevertheless, maybe one of them can take
+			 * this task:
			 */
-			if (TASK_PREEMPTS_CURR(p, this_rq))
-				set_tsk_need_resched(this_rq->curr);
-			else
-				/*
-				 * Neither the intended target runqueue
-				 * nor the current CPU can take this task.
-				 * Trigger a reschedule on all other CPUs
-				 * nevertheless, maybe one of them can take
-				 * this task:
-				 */
-				smp_send_reschedule_allbutself();
+			smp_send_reschedule_allbutself();
 
-			schedstat_inc(this_rq, rto_wakeup);
-		}
+		schedstat_inc(this_rq, rto_wakeup);
 	}
 
 out_activate:

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-11  5:58     ` Ingo Molnar
@ 2006-06-11  6:37       ` Ingo Molnar
  0 siblings, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2006-06-11  6:37 UTC (permalink / raw)
  To: Darren Hart; +Cc: linux-kernel, Thomas Gleixner

* Ingo Molnar <mingo@elte.hu> wrote:

> I slept on it meanwhile, and I think the safest would be to also do
> the attached patch on top of -rt3.  This makes the 'kick other CPUs'
> logic totally unconditional - so whatever happens, the wakeup code will
> notice if an RT task is in trouble finding the right CPU.  Under -rt3
> we'd only run into this branch if we stayed on the same CPU - but
> there can be cases where we hit your scenario even in precisely such a
> case.  It's rare but possible.

Just to elaborate on that possibility a bit more, it's this portion of
the wakeup code that could cause problems:

	new_cpu = wake_idle(new_cpu, p);

	if (new_cpu != cpu) {
		set_task_cpu(p, new_cpu);
		task_rq_unlock(rq, &flags);

		/* might preempt at this point */

		rq = task_rq_lock(p, &flags);
		old_state = p->state;
		if (!(old_state & state))

At the 'might preempt' point the kernel can go to any other CPU.  The
stock kernel does not care, because it's really rare and wakeup
placement of tasks is a statistical thing to it.  But for RT it's
important to get it right all the time, so my patch moves the
RT-overload check from the else branch to an unconditional place.  (I
doubt you can trigger it normally, but it's a possibility nevertheless.)

(I'll do a patch for the upstream scheduler to not re-enable interrupts
at this point [it's a waste of cycles], but even if we couldn't go to
another CPU, the whole scheduling scenario might change while we are
trying to acquire the runqueue lock, so it's still beneficial to have
the RT-overload check unconditional.)

	Ingo

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-11  5:49   ` Darren Hart
  2006-06-11  5:58     ` Ingo Molnar
@ 2006-06-11  6:24     ` Darren Hart
  2006-06-11  7:36       ` Ingo Molnar
  1 sibling, 1 reply; 15+ messages in thread
From: Darren Hart @ 2006-06-11  6:24 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Thomas Gleixner

> Thanks for the updated patch!  It wouldn't quite build (proc_misc.c still
> referenced the old rt_overload_* variables; fixup patch attached removing
> those print statements).  I have it running on a 4-way Opteron box running
> prio-preempt in a timed while loop, exiting only on failure.  It's been
> running fine for several minutes - usually fails in under a minute.  We'll
> see how it's doing in the morning :-)

Well, it failed in about 14 minutes.  The machine was under heavy load
running another benchmark.  I have removed the secondary benchmark and am
running prio-preempt alone on the same 4-way Opteron box.  Will post again
when I know more...

--
Darren Hart
IBM Linux Technology Center
Realtime Linux Team

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-11  6:24     ` Darren Hart
@ 2006-06-11  7:36       ` Ingo Molnar
  2006-06-12 15:38         ` Darren Hart
  2006-06-12 20:12         ` Mike Kravetz
  0 siblings, 2 replies; 15+ messages in thread
From: Ingo Molnar @ 2006-06-11  7:36 UTC (permalink / raw)
  To: Darren Hart; +Cc: linux-kernel, Thomas Gleixner

* Darren Hart <dvhltc@us.ibm.com> wrote:

> > Thanks for the updated patch!  It wouldn't quite build (proc_misc.c
> > still referenced the old rt_overload_* variables; fixup patch attached
> > removing those print statements).  I have it running on a 4-way
> > Opteron box running prio-preempt in a timed while loop, exiting only
> > on failure.  It's been running fine for several minutes - usually
> > fails in under a minute.  We'll see how it's doing in the morning :-)
>
> Well, it failed in about 14 minutes.  The machine was under heavy load
> running another benchmark.  I have removed the secondary benchmark and
> am running prio-preempt alone on the same 4-way Opteron box.  Will
> post again when I know more...

OK - could you try the patch from today (re-attached below)?  Maybe that
theoretical scenario I mentioned is only theoretical in theory ;-)

	Ingo

Index: linux-rt.q/kernel/sched.c
===================================================================
--- linux-rt.q.orig/kernel/sched.c
+++ linux-rt.q/kernel/sched.c
@@ -1588,38 +1588,37 @@ out_set_cpu:
 
 		this_cpu = smp_processor_id();
 		cpu = task_cpu(p);
-	} else {
+	}
+	/*
+	 * If a newly woken up RT task cannot preempt the
+	 * current (RT) task (on a target runqueue) then try
+	 * to find another CPU it can preempt:
+	 */
+	if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
+		this_rq = cpu_rq(this_cpu);
 		/*
-		 * If a newly woken up RT task cannot preempt the
-		 * current (RT) task (on a target runqueue) then try
-		 * to find another CPU it can preempt:
+		 * Special-case: the task on this CPU can be
+		 * preempted.  In that case there's no need to
+		 * trigger reschedules on other CPUs, we can
+		 * mark the current task for reschedule.
+		 *
+		 * (Note that it's safe to access this_rq without
+		 * extra locking in this particular case, because
+		 * we are on the current CPU.)
		 */
-		if (rt_task(p) && !TASK_PREEMPTS_CURR(p, rq)) {
-			this_rq = cpu_rq(this_cpu);
+		if (TASK_PREEMPTS_CURR(p, this_rq))
+			set_tsk_need_resched(this_rq->curr);
+		else
			/*
-			 * Special-case: the task on this CPU can be
-			 * preempted.  In that case there's no need to
-			 * trigger reschedules on other CPUs, we can
-			 * mark the current task for reschedule.
-			 *
-			 * (Note that it's safe to access this_rq without
-			 * extra locking in this particular case, because
-			 * we are on the current CPU.)
+			 * Neither the intended target runqueue
+			 * nor the current CPU can take this task.
+			 * Trigger a reschedule on all other CPUs
+			 * nevertheless, maybe one of them can take
+			 * this task:
			 */
-			if (TASK_PREEMPTS_CURR(p, this_rq))
-				set_tsk_need_resched(this_rq->curr);
-			else
-				/*
-				 * Neither the intended target runqueue
-				 * nor the current CPU can take this task.
-				 * Trigger a reschedule on all other CPUs
-				 * nevertheless, maybe one of them can take
-				 * this task:
-				 */
-				smp_send_reschedule_allbutself();
+			smp_send_reschedule_allbutself();
 
-			schedstat_inc(this_rq, rto_wakeup);
-		}
+		schedstat_inc(this_rq, rto_wakeup);
 	}
 
 out_activate:

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-11  7:36       ` Ingo Molnar
@ 2006-06-12 15:38         ` Darren Hart
  2006-06-15 21:06           ` Mike Kravetz
  1 sibling, 1 reply; 15+ messages in thread
From: Darren Hart @ 2006-06-12 15:38 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Thomas Gleixner

On Sunday 11 June 2006 00:36, Ingo Molnar wrote:
> OK - could you try the patch from today (re-attached below)?  Maybe that
> theoretical scenario I mentioned is only theoretical in theory ;-)
>
> 	Ingo

I started running this version of the patch with prio-preempt in a loop
over 10 hours ago, and it's still running.  This seems to be the right
fix.  Thanks!

--
Darren Hart
IBM Linux Technology Center
Realtime Linux Team

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-12 15:38         ` Darren Hart
@ 2006-06-15 21:06           ` Mike Kravetz
  2006-06-15 22:13             ` Darren Hart
  ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Mike Kravetz @ 2006-06-15 21:06 UTC (permalink / raw)
  To: Darren Hart; +Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, John Stultz

[-- Attachment #1: Type: text/plain, Size: 1026 bytes --]

On Mon, Jun 12, 2006 at 08:38:24AM -0700, Darren Hart wrote:
> I started running this version of the patch with prio-preempt in a loop
> over 10 hours ago, and it's still running.  This seems to be the right fix.

Unfortunately, this test did eventually fail in our environment.  John
Stultz added the concept of 'interrupter threads' to the testcase.  These
high priority RT interrupter threads occasionally wake up and run for a
short period of time.  Since these new threads are higher priority than
any other, they theoretically should not impact the testcase.  This is
because the primary purpose of the testcase is to ensure that lower
priority tasks do not run while higher priority tasks are waiting for a
CPU.

After adding these interrupter threads, the testcase fails (on a system
here) about 50% of the time.  An updated version of prio-preempt.c is
attached.  It needs the same headers/makefile/etc. as originally provided
by Darren.

Any help figuring out what is happening here would be appreciated.

--
Mike

[-- Attachment #2: prio-preempt.c.new --]
[-- Type: text/plain, Size: 5860 bytes --]

/*   Filename: prio-preempt.c
 *     Author: Dinakar Guniguntala <dino@us.ibm.com>
 * Description: Test priority preemption
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2006
 *
 * 2006-Jun-01: Initial version by Dinakar Guniguntala
 *              Changes from John Stultz and Vivek Pallantla
 */

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <time.h>
#include <pthread.h>
#include <sched.h>
#include <errno.h>
#include <sys/syscall.h>
#include "librt.h"

#define NUM_WORKERS	27
#define CHECK_LIMIT	1

volatile int busy_threads = 0;
volatile int test_over = 0;
volatile int threads_running = 0;
static int rt_threads = 0;
static pthread_mutex_t bmutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mutex[NUM_WORKERS+1];
static pthread_cond_t cond[NUM_WORKERS+1];
static int t_after_wait[NUM_WORKERS];

#define ADD_INTERRUPTER_THREADS
#ifdef ADD_INTERRUPTER_THREADS
void *int_thread(void *arg)
{
	int j, a = 0;

	while (!test_over) {
		/* do some busy work */
		if (!(a % 4))
			a = a * 3;
		else if (!(a % 6))
			a = a / 2;
		else
			a++;
		usleep(10);
	}
	return (void *)a;
}
#endif

void *busy_thread(void *arg)
{
	struct sched_param sched_param;
	int policy, mypri = 0, tid;

	tid = (int)(((struct thread *)arg)->arg);

	if (pthread_getschedparam(pthread_self(), &policy, &sched_param) != 0) {
		printf("ERR: Couldn't get pthread info \n");
	} else {
		mypri = sched_param.sched_priority;
	}

	pthread_mutex_lock(&bmutex);
	busy_threads++;
	printf("Busy Thread %d(%d): Running...\n", tid, mypri);
	pthread_mutex_unlock(&bmutex);

	/* TODO: Add sched set affinity here */

	/* Busy loop */
	while (!test_over);

	printf("Busy Thread %d(%d): Exiting\n", tid, mypri);
	return NULL;
}

void *worker_thread(void *arg)
{
	struct sched_param sched_param;
	int policy, rc, mypri = 0, tid, times = 0;
	tid = (int)(((struct thread *)arg)->arg);
	nsec_t pstart, pend;

	if (pthread_getschedparam(pthread_self(), &policy, &sched_param) != 0) {
		printf("ERR: Couldn't get pthread info \n");
	} else {
		mypri = sched_param.sched_priority;
	}

	/* check in */
	pthread_mutex_lock(&bmutex);
	threads_running++;
	pthread_mutex_unlock(&bmutex);

	/* block */
	rc = pthread_mutex_lock(&mutex[tid]);
	rc = pthread_cond_wait(&cond[tid], &mutex[tid]);
	rc = pthread_mutex_unlock(&mutex[tid]);

	printf("%llu: Thread %d(%d) wakes up from sleep \n",
		rt_gettime(), tid, mypri);

	/* check if we're the last thread */
	if (tid == NUM_WORKERS-1) {
		t_after_wait[tid] = 1;
		pthread_mutex_lock(&bmutex);
		threads_running--;
		pthread_mutex_unlock(&bmutex);
		return NULL;
	}

	/* Signal next thread */
	rc = pthread_mutex_lock(&mutex[tid+1]);
	rc = pthread_cond_signal(&cond[tid+1]);
	printf("%llu: Thread %d(%d): Sent signal (%d) to (%d)\n",
		rt_gettime(), tid, mypri, rc, tid+1);
	pstart = pend = rt_gettime();
	rc = pthread_mutex_unlock(&mutex[tid+1]);

	printf("%llu: Thread %d(%d) setting it's bit \n",
		rt_gettime(), tid, mypri);
	t_after_wait[tid] = 1;

	while (t_after_wait[tid+1] != 1) {
		pend = rt_gettime();
		times++;
	}

	if (times >= CHECK_LIMIT) {
		printf("Thread %d(%d): Non-Preempt limit reached. %llu ns max latency\n",
			tid, mypri, pend-pstart);
		exit(1);
	}

	/* check out */
	pthread_mutex_lock(&bmutex);
	threads_running--;
	pthread_mutex_unlock(&bmutex);

	return NULL;
}

void *master_thread(void *arg)
{
	int i, pri_boost;

#ifdef ADD_INTERRUPTER_THREADS
	/* start interrupter threads */
	pri_boost = 90;
	for (i = 0; i < rt_threads; i++) {
		create_fifo_thread(int_thread, (void *)0,
			sched_get_priority_min(SCHED_FIFO) + pri_boost);
	}
#endif

	/* start the (N-1) busy threads */
	pri_boost = 80;
	for (i = rt_threads; i > 1; i--) {
		create_fifo_thread(busy_thread, (void *)i,
			sched_get_priority_min(SCHED_FIFO) + pri_boost);
	}

	/* make sure children are started */
	while (busy_threads < (rt_threads-1))
		sleep(1);
	printf("Busy threads created!\n");

	/* start NUM_WORKERS worker threads */
	for (i = 0, pri_boost = 10; i < NUM_WORKERS; i++, pri_boost += 2) {
		pthread_mutex_init(&mutex[i], NULL);
		pthread_cond_init(&cond[i], NULL);
		create_fifo_thread(worker_thread, (void *)i,
			sched_get_priority_min(SCHED_FIFO) + pri_boost);
	}
	printf("Worker threads created\n");

	/* Let the worker threads wait on the cond vars */
	while (threads_running < NUM_WORKERS)
		sleep(2);

	printf("Signaling first thread\n");
	pthread_mutex_lock(&mutex[0]);
	pthread_cond_signal(&cond[0]);
	pthread_mutex_unlock(&mutex[0]);

	while (threads_running)
		sleep(20);
	test_over = 1;
	return NULL;
}

int parse_args(int c, char *v)
{
	int handled = 1;

	switch (c) {
	case 'n':
		rt_threads = atoi(v);
		break;
	default:
		handled = 0;
		break;
	}
	return handled;
}

int main(int argc, char *argv[])
{
	int pri_boost;

	rt_init("n:", parse_args, argc, argv);

	if (rt_threads == 0) {
		printf("Usage: %s -n num_rt_threads\n", argv[0]);
		exit(1);
	}

	pri_boost = 81;
	create_fifo_thread(master_thread, (void *)0,
		sched_get_priority_min(SCHED_FIFO) + pri_boost);

	/* wait for threads to complete */
	join_threads();
	return 0;
}

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-15 21:06           ` Mike Kravetz
@ 2006-06-15 22:13             ` Darren Hart
  2006-06-15 23:05             ` Esben Nielsen
  2006-07-11 16:48             ` Mike Kravetz
  2 siblings, 0 replies; 15+ messages in thread
From: Darren Hart @ 2006-06-15 22:13 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, John Stultz

On Thursday 15 June 2006 14:06, Mike Kravetz wrote:
> On Mon, Jun 12, 2006 at 08:38:24AM -0700, Darren Hart wrote:
> > I started running this version of the patch with prio-preempt in a loop
> > over 10 hours ago, and it's still running.  This seems to be the right
> > fix.
>
> Unfortunately, this test did eventually fail in our environment.
> John Stultz added the concept of 'interrupter threads' to the testcase.
> These high priority RT interrupter threads occasionally wake up and
> run for a short period of time.  Since these new threads are higher
> priority than any other, they theoretically should not impact the
> testcase.  This is because the primary purpose of the testcase is to
> ensure that lower priority tasks do not run while higher priority tasks
> are waiting for a CPU.
>
> After adding these interrupter threads, the testcase fails (on a system
> here) about 50% of the time.  An updated version of prio-preempt.c is
> attached.  It needs the same headers/makefile/etc. as originally
> provided by Darren.

Thanks for the update, Mike!  I have incorporated the interrupter threads
using a command line argument, -i; they are disabled by default.  Grab the
latest here:

  http://linux.dvhart.com/tests/

To run the test on a 4-way machine with interrupter threads enabled, use:

  $ ./prio-preempt -n 4 -i

> Any help figuring out what is happening here would be appreciated.

--
Darren Hart
IBM Linux Technology Center
Realtime Linux Team
Phone: 503 578 3185
T/L:   775 3185

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-15 21:06 ` Mike Kravetz
  2006-06-15 22:13 ` Darren Hart
@ 2006-06-15 23:05 ` Esben Nielsen
  2006-06-15 22:48 ` Mike Kravetz
  2006-07-11 16:48 ` Mike Kravetz
  2 siblings, 1 reply; 15+ messages in thread
From: Esben Nielsen @ 2006-06-15 23:05 UTC (permalink / raw)
To: Mike Kravetz
Cc: Darren Hart, Ingo Molnar, linux-kernel, Thomas Gleixner, John Stultz

On Thu, 15 Jun 2006, Mike Kravetz wrote:
> On Mon, Jun 12, 2006 at 08:38:24AM -0700, Darren Hart wrote:
>> I started running this version of the patch with prio-preempt in a loop
>> over 10 hours ago, and it's still running. This seems to be the right fix.
>
> Unfortunately, this test did eventually fail over in our environment.
> John Stultz added the concept of 'interrupter threads' to the testcase.
> These high priority RT interrupter threads occasionally wake up and
> run for a short period of time. Since these new threads are higher
> priority than any other, they theoretically should not impact the
> testcase. This is because the primary purpose of the testcase is to
> ensure that lower priority tasks do not run while higher priority tasks
> are waiting for a CPU.
>
> After adding these interrupter threads, the testcase fails (on a system
> here) about 50% of the time. An updated version of prio-preempt.c is
> attached. It needs the same headers/makefile/etc as originally provided
> by Darren.
>
> Any help figuring out what is happening here would be appreciated.

Well, I can't say I understand what your test is supposed to test. But I
do know that having printf() inside something which is supposed to be RT
is a big no-no, as it can block. What if the first printf() after
pthread_cond_wait() is blocking? The interrupter threads probably use a
lot of CPU overall, and they have priority 90. They might slow down the
output device for printf() so much that some queue becomes full and
printf() must block.
Esben

> --
> Mike
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-15 23:05 ` Esben Nielsen
@ 2006-06-15 22:48 ` Mike Kravetz
  0 siblings, 0 replies; 15+ messages in thread
From: Mike Kravetz @ 2006-06-15 22:48 UTC (permalink / raw)
To: Esben Nielsen
Cc: Darren Hart, Ingo Molnar, linux-kernel, Thomas Gleixner, John Stultz

On Fri, Jun 16, 2006 at 12:05:21AM +0100, Esben Nielsen wrote:
> Well, I can't say I understand what your test is supposed to test. But I
> know having printf() inside something which is supposed to be RT is a big
> no-no, as it can block.

You are correct. John's modifications to the testcase actually removed
the printfs from the threads we are concerned with. I left them in
because they were part of the 'original' testcase, and I wanted to
introduce as few changes as possible. In hindsight, that was a mistake.
With the printfs commented out (in the worker threads) the testcase
still fails at about the same rate. Darren, can you comment those out
in the version on your site? An updated copy is attached.

--
Mike

/* Filename:	prio-preempt.c
 * Author:	Dinakar Guniguntala <dino@us.ibm.com>
 * Description:	Test priority preemption
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2006
 *
 * 2006-Jun-01: Initial version by Dinakar Guniguntala
 *		Changes from John Stultz and Vivek Pallantla
 */

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <time.h>
#include <pthread.h>
#include <sched.h>
#include <errno.h>
#include <sys/syscall.h>
#include "librt.h"

#define NUM_WORKERS 27
#define CHECK_LIMIT 1

volatile int busy_threads = 0;
volatile int test_over = 0;
volatile int threads_running = 0;
static int rt_threads = 0;
static int int_threads = 0;
static pthread_mutex_t bmutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mutex[NUM_WORKERS+1];
static pthread_cond_t cond[NUM_WORKERS+1];
static int t_after_wait[NUM_WORKERS];

void *int_thread(void* arg)
{
	int a = 0;

	while (!test_over) {
		/* do some busy work */
		if (!(a%4))
			a = a * 3;
		else if (!(a%6))
			a = a / 2;
		else
			a++;
		usleep(10);
	}
	return (void*)a;
}

void *busy_thread(void* arg)
{
	struct sched_param sched_param;
	int policy, mypri = 0, tid;

	tid = (int) (((struct thread *) arg)->arg);
	if (pthread_getschedparam(pthread_self(), &policy, &sched_param) != 0) {
		printf("ERR: Couldn't get pthread info\n");
	} else {
		mypri = sched_param.sched_priority;
	}

	pthread_mutex_lock(&bmutex);
	busy_threads++;
	printf("Busy Thread %d(%d): Running...\n", tid, mypri);
	pthread_mutex_unlock(&bmutex);

	/* TODO: Add sched set affinity here */

	/* Busy loop */
	while (!test_over);

	printf("Busy Thread %d(%d): Exiting\n", tid, mypri);
	return NULL;
}

void *worker_thread(void* arg)
{
	struct sched_param sched_param;
	int policy, rc, mypri = 0, tid, times = 0;
	nsec_t pstart, pend;

	tid = (int) (((struct thread *) arg)->arg);
	if (pthread_getschedparam(pthread_self(), &policy, &sched_param) != 0) {
		printf("ERR: Couldn't get pthread info\n");
	} else {
		mypri = sched_param.sched_priority;
	}

	/* check in */
	pthread_mutex_lock(&bmutex);
	threads_running++;
	pthread_mutex_unlock(&bmutex);

	/* block */
	rc = pthread_mutex_lock(&mutex[tid]);
	rc = pthread_cond_wait(&cond[tid], &mutex[tid]);
	rc = pthread_mutex_unlock(&mutex[tid]);

	/* printf("%llu: Thread %d(%d) wakes up from sleep\n",
		  rt_gettime(), tid, mypri); */

	/* check if we're the last thread */
	if (tid == NUM_WORKERS-1) {
		t_after_wait[tid] = 1;
		pthread_mutex_lock(&bmutex);
		threads_running--;
		pthread_mutex_unlock(&bmutex);
		return NULL;
	}

	/* Signal next thread */
	rc = pthread_mutex_lock(&mutex[tid+1]);
	rc = pthread_cond_signal(&cond[tid+1]);
	/* printf("%llu: Thread %d(%d): Sent signal (%d) to (%d)\n",
		  rt_gettime(), tid, mypri, rc, tid+1); */
	pstart = pend = rt_gettime();
	rc = pthread_mutex_unlock(&mutex[tid+1]);

	/* printf("%llu: Thread %d(%d) setting its bit\n",
		  rt_gettime(), tid, mypri); */
	t_after_wait[tid] = 1;

	while (t_after_wait[tid+1] != 1) {
		pend = rt_gettime();
		times++;
	}

	if (times >= CHECK_LIMIT) {
		printf("Thread %d(%d): Non-Preempt limit reached. "
		       "%llu ns max latency\n", tid, mypri, pend-pstart);
		exit(1);
	}

	/* check out */
	pthread_mutex_lock(&bmutex);
	threads_running--;
	pthread_mutex_unlock(&bmutex);

	return NULL;
}

void *master_thread(void* arg)
{
	int i, pri_boost;

	/* start interrupter threads */
	if (int_threads) {
		pri_boost = 90;
		for (i = 0; i < rt_threads; i++) {
			create_fifo_thread(int_thread, (void*)0,
					   sched_get_priority_min(SCHED_FIFO) + pri_boost);
		}
	}

	/* start the (N-1) busy threads */
	pri_boost = 80;
	for (i = rt_threads; i > 1; i--) {
		create_fifo_thread(busy_thread, (void*)i,
				   sched_get_priority_min(SCHED_FIFO) + pri_boost);
	}

	/* make sure children are started */
	while (busy_threads < (rt_threads-1))
		sleep(1);
	printf("Busy threads created!\n");

	/* start NUM_WORKERS worker threads */
	for (i = 0, pri_boost = 10; i < NUM_WORKERS; i++, pri_boost += 2) {
		pthread_mutex_init(&mutex[i], NULL);
		pthread_cond_init(&cond[i], NULL);
		create_fifo_thread(worker_thread, (void*)i,
				   sched_get_priority_min(SCHED_FIFO) + pri_boost);
	}
	printf("Worker threads created\n");

	/* Let the worker threads wait on the cond vars */
	while (threads_running < NUM_WORKERS)
		sleep(2);

	printf("Signaling first thread\n");
	pthread_mutex_lock(&mutex[0]);
	pthread_cond_signal(&cond[0]);
	pthread_mutex_unlock(&mutex[0]);

	while (threads_running)
		sleep(20);
	test_over = 1;

	return NULL;
}

int parse_args(int c, char *v)
{
	int handled = 1;

	switch (c) {
	case 'n':
		rt_threads = atoi(v);
		break;
	case 'i':
		int_threads = 1;
		break;
	default:
		handled = 0;
		break;
	}
	return handled;
}

int main(int argc, char* argv[])
{
	int pri_boost;

	rt_init("n:i", parse_args, argc, argv);

	if (rt_threads == 0) {
		printf("Usage: %s -n num_rt_threads [-i]\n", argv[0]);
		exit(1);
	}

	pri_boost = 81;
	create_fifo_thread(master_thread, (void*)0,
			   sched_get_priority_min(SCHED_FIFO) + pri_boost);

	/* wait for threads to complete */
	join_threads();

	return 0;
}
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-15 21:06 ` Mike Kravetz
  2006-06-15 22:13 ` Darren Hart
  2006-06-15 23:05 ` Esben Nielsen
@ 2006-07-11 16:48 ` Mike Kravetz
  2 siblings, 0 replies; 15+ messages in thread
From: Mike Kravetz @ 2006-07-11 16:48 UTC (permalink / raw)
To: Ingo Molnar, Darren Hart
Cc: linux-kernel, Darren Hart, Thomas Gleixner, John Stultz

Hi Ingo,

As you can see from the note below, we are still having issues with this
testcase. As a reminder, Darren's mail with a pointer to the new
testcase is at:

http://www.uwsg.indiana.edu/hypermail/linux/kernel/0606.1/0729.html

When run with the '-i' option it fails on a regular basis (approx 50% of
runs on a 4 way). Any insight into what may be happening would be
appreciated.

Thanks,
--
Mike

On Thu, Jun 15, 2006 at 02:06:58PM -0700, Mike Kravetz wrote:
> On Mon, Jun 12, 2006 at 08:38:24AM -0700, Darren Hart wrote:
> > I started running this version of the patch with prio-preempt in a loop
> > over 10 hours ago, and it's still running. This seems to be the right fix.
>
> Unfortunately, this test did eventually fail over in our environment.
> John Stultz added the concept of 'interrupter threads' to the testcase.
> These high priority RT interrupter threads occasionally wake up and
> run for a short period of time. Since these new threads are higher
> priority than any other, they theoretically should not impact the
> testcase. This is because the primary purpose of the testcase is to
> ensure that lower priority tasks do not run while higher priority tasks
> are waiting for a CPU.
>
> After adding these interrupter threads, the testcase fails (on a system
> here) about 50% of the time. An updated version of prio-preempt.c is
> attached. It needs the same headers/makefile/etc as originally provided
> by Darren.
* Re: [RFC PATCH -rt] Priority preemption latency
  2006-06-11  7:36 ` Ingo Molnar
  2006-06-12 15:38 ` Darren Hart
@ 2006-06-12 20:12 ` Mike Kravetz
  1 sibling, 0 replies; 15+ messages in thread
From: Mike Kravetz @ 2006-06-12 20:12 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Darren Hart, linux-kernel, Thomas Gleixner

On Sun, Jun 11, 2006 at 09:36:09AM +0200, Ingo Molnar wrote:
> ok - could you try the patch from today (re-attached below)? Maybe that
> theoretical scenario i mentioned is only theoretical in theory ;-)

Thanks for the patch! I just started looking at the RT scheduling code
and am trying to get a good understanding of what is happening.

In the case where the task on the current CPU cannot be preempted, we
send IPIs to all other CPUs and they end up running the
balance_rt_tasks() routine to try to schedule the task. Is that correct?
If the newly awakened task can preempt more than one currently running
RT task, is it possible for tasks to 'bounce around' a bit before we end
up running the top N priority RT tasks (where N is the number of CPUs)?

In the case where the newly awakened task can preempt the task on the
current CPU, shouldn't we also trigger reschedules on other CPUs?
Perhaps we do somewhere and I am missing it. I am thinking of the case
where the current CPU was running an RT task with lower priority than
the awakened task, but another CPU is running an even lower priority RT
task. In this case, we really want the newly awakened task to preempt
the lower priority task on the remote CPU.

Thanks,
--
Mike
end of thread, other threads: [~2006-07-11 16:47 UTC | newest]

Thread overview: 15+ messages
2006-06-10  0:01 [RFC PATCH -rt] Priority preemption latency Darren Hart
2006-06-10  6:48 ` Ingo Molnar
2006-06-10  7:21 ` Ingo Molnar
2006-06-11  5:49 ` Darren Hart
2006-06-11  5:58 ` Ingo Molnar
2006-06-11  6:37 ` Ingo Molnar
2006-06-11  6:24 ` Darren Hart
2006-06-11  7:36 ` Ingo Molnar
2006-06-12 15:38 ` Darren Hart
2006-06-15 21:06 ` Mike Kravetz
2006-06-15 22:13 ` Darren Hart
2006-06-15 23:05 ` Esben Nielsen
2006-06-15 22:48 ` Mike Kravetz
2006-07-11 16:48 ` Mike Kravetz
2006-06-12 20:12 ` Mike Kravetz