Scheduling behaviour of 23-rc4-rt1 on my intel centrino Duo.


* Scheduling behaviour of 23-rc4-rt1 on my intel centrino Duo.
@ 2007-09-26  2:35 Girish kathalagiri
  2007-10-04 22:10 ` Darren Hart
  0 siblings, 1 reply; 5+ messages in thread
From: Girish kathalagiri @ 2007-09-26  2:35 UTC
  To: linux-rt-users

Hi,
I am running Linux 2.6.21-rc4-rt1 on my IBM ThinkPad T60 (I have
attached the cpuinfo).
When I run hourglass (http://www.cs.utah.edu/~regehr/hourglass/)
with 6 threads, each at priority RTHIGH (max priority - 2),
only one thread (thread 0) seems to run on cpu#0; all the other
threads seem to compete with each other on cpu#1.

Command: ./hourglass -n 6 -a -d 60s -w CPU -i HR -p RTHIGH
Here is part of the output from the hourglass test:
----------------------------------------------------------
thread 0 will use high-res timers
thread 1 will use high-res timers
thread 2 will use high-res timers
thread 3 will use high-res timers
thread 4 will use high-res timers
thread 5 will use high-res timers
thread 0 will have priority RTHIGH
thread 1 will have priority RTHIGH
thread 2 will have priority RTHIGH
thread 3 will have priority RTHIGH
thread 4 will have priority RTHIGH
thread 5 will have priority RTHIGH
8.010864 MB allocated for trace records
Hourglass 1.0.1b : 6 threads; 60.000000 seconds; 1828.999688 MHz timestamp counter
max gap is 63980 cycles
this test will last for  60.000000 seconds
numthreads: 6
work done by thrd 0 : 430394547
work done by thrd 1 : 42208618
work done by thrd 2 : 86995526
work done by thrd 3 : 86819711
work done by thrd 4 : 86887798
work done by thrd 5 : 86101724

thread 0 recorded 60.040444 seconds (99.999946 %)
thread 1 recorded 12.037532 seconds (20.049028 %)
thread 2 recorded 12.030190 seconds (20.036800 %)
thread 3 recorded 12.005801 seconds (19.996179 %)
thread 4 recorded 12.015195 seconds (20.011824 %)
thread 5 recorded 11.907206 seconds (19.831964 %)
-------------------------------------------------------------------------------

Shouldn't the threads be scheduled fairly across the two CPUs, with,
say, 3 threads competing for cpu#0 and the other 3 for cpu#1?
Or am I missing something here?

-- 
Thanks
   Giri


* Re: Scheduling behaviour of 23-rc4-rt1 on my intel centrino Duo.
  2007-09-26  2:35 Scheduling behaviour of 23-rc4-rt1 on my intel centrino Duo Girish kathalagiri
@ 2007-10-04 22:10 ` Darren Hart
  2007-10-05  6:40   ` Girish kathalagiri
  0 siblings, 1 reply; 5+ messages in thread
From: Darren Hart @ 2007-10-04 22:10 UTC
  To: Girish kathalagiri; +Cc: linux-rt-users

On Tuesday 25 September 2007 19:35:35 Girish kathalagiri wrote:
> Hi,
> I am running Linux 2.6.21-rc4-rt1 on my IBM ThinkPad T60 (I have
> attached the cpuinfo).
> When I run hourglass (http://www.cs.utah.edu/~regehr/hourglass/)
> with 6 threads, each at priority RTHIGH (max priority - 2),
> only one thread (thread 0) seems to run on cpu#0; all the other
> threads seem to compete with each other on cpu#1.
>
> command : ./hourglass -n 6 -a -d 60s -w CPU -i HR -p RTHIGH
>
> Shouldn't the threads be scheduled fairly across the two CPUs, with,
> say, 3 threads competing for cpu#0 and the other 3 for cpu#1?
> Or am I missing something here?

1) How are you determining which CPUs these threads spend their time on?
2) RTHIGH doesn't do anything for us.  What is the value of the SCHED_FIFO 
priority those threads run at?  Is it the same for every one?

Note that depending on exactly what those threads do (I am unfamiliar with the 
hourglass testcase) it isn't unreasonable for them to all run on the same CPU 
if their runnable/sleeping states happen to line up just right.  It is also 
very possible that they bounce around from CPU to CPU in rapid succession if 
the runnable/sleeping windows overlap in exactly the wrong way :-)
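
One direct way to answer question 1 is to have each thread ask the kernel which CPU it is on. The sketch below is not part of hourglass; it assumes a glibc that provides sched_getcpu(), and count_cpu_changes() is a hypothetical helper name. It samples the current CPU repeatedly and counts how often the answer changes, which would reveal the kind of bouncing described above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for sched_getcpu() in glibc */
#endif
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: sample the current CPU `samples` times and count
 * how often it differs from the previous sample.
 * Returns the number of observed CPU changes, or -1 if sched_getcpu() fails. */
int count_cpu_changes(int samples)
{
    assert(samples > 0);

    int changes = 0;
    int last = sched_getcpu();
    if (last < 0)
        return -1;

    for (int i = 1; i < samples; i++) {
        int cpu = sched_getcpu();
        if (cpu < 0)
            return -1;
        if (cpu != last) {
            changes++;
            last = cpu;
        }
    }
    return changes;
}
```

A thread that stays on one CPU for the whole sampling window reports 0; anything larger means the scheduler migrated it at least that many times.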


-- 
Darren Hart
IBM Linux Technology Center
Realtime Linux Team


* Re: Scheduling behaviour of 23-rc4-rt1 on my intel centrino Duo.
  2007-10-04 22:10 ` Darren Hart
@ 2007-10-05  6:40   ` Girish kathalagiri
  2007-10-05 16:06     ` Darren Hart
  0 siblings, 1 reply; 5+ messages in thread
From: Girish kathalagiri @ 2007-10-05  6:40 UTC
  To: Darren Hart; +Cc: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 2156 bytes --]

Hi Darren,

> 1) How are you determining which CPUs these threads spend their time on?

The hourglass test stores an execution trace. The trace is
generated as the threads run: each time a thread detects a gap
in its execution, it records a trace entry containing the start and
end times of the continuous block of CPU time that the thread received.
The output looks something like this:
tracerec: 3 4569.581302 4669.572704 99.991402 400.007546
tracerec: 1 4669.581304 4769.572370 99.991066 400.007679
tracerec: 2 4769.581030 4869.572168 99.991138 400.007727
tracerec: 4 4869.580624 4969.571912 99.991288 400.007546
(columns: thread id, start time and end time of the gap-free block of
CPU time, duration of the interval, and the time elapsed since the end
of the thread's previous block; all times in ms)
The values are plotted and the graph is attached.
Thread 0 runs all the time, while threads 1-5 get equal time slices.
Hence thread 0 is hogging one CPU completely and threads 1-5 are
sharing the other CPU.
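
The trace format described above can be checked mechanically. The following is a minimal sketch, not hourglass code: struct tracerec, parse_tracerec() and expected_gap_ms() are hypothetical names, and the field order is taken from the column description in this message. It also captures the arithmetic behind the last column: if n equal-priority threads take turns on one CPU with a fixed slice, each thread waits (n - 1) slices between its own runs, so 5 threads with the roughly 100 ms slices seen in the trace give a gap of about 400 ms.

```c
#include <assert.h>
#include <stdio.h>

/* One hourglass trace record; all times are in milliseconds. */
struct tracerec {
    int thread;       /* thread id */
    double start;     /* start of the gap-free block of CPU time */
    double end;       /* end of that block */
    double duration;  /* end - start */
    double gap;       /* time since the end of this thread's previous block */
};

/* Parse one "tracerec: ..." line. Returns 1 on success, 0 otherwise. */
int parse_tracerec(const char *line, struct tracerec *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "tracerec: %d %lf %lf %lf %lf",
                  &out->thread, &out->start, &out->end,
                  &out->duration, &out->gap) == 5;
}

/* Wait between a thread's own slices when n equal-priority threads
 * take turns on a single CPU with a fixed time slice. */
double expected_gap_ms(int n, double slice_ms)
{
    return (n - 1) * slice_ms;
}
```

Feeding it the records for threads 1-5 and comparing each gap with expected_gap_ms(5, 100.0) is one way to confirm that exactly five threads are rotating on the same CPU.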

> 2) RTHIGH doesn't do anything for us.  What is the value of the SCHED_FIFO
> priority those threads run at?  Is it the same for every one?

All the threads are running at an RT priority of 97 (maxrt prio - 2).
code:

 max_realtime_pri = sched_get_priority_max (SCHED_RR);

and

case RTHIGH:
    param->sched_priority = max_realtime_pri - 2;
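
For reference, the two fragments above can be put together as a small self-contained sketch. rthigh_priority() and rthigh_param() are hypothetical names, not functions from hourglass. On Linux the real-time priority range is 1 to 99 for both SCHED_RR and SCHED_FIFO, which is where the value 97 comes from.

```c
#include <assert.h>
#include <sched.h>

/* Hypothetical helper mirroring the RTHIGH case quoted above:
 * two levels below the maximum priority of the given policy. */
int rthigh_priority(int policy)
{
    int max_realtime_pri = sched_get_priority_max(policy);
    assert(max_realtime_pri != -1);   /* -1 means the policy is invalid */
    return max_realtime_pri - 2;
}

/* Build the sched_param that would be handed to sched_setscheduler()
 * or pthread_setschedparam() for an RTHIGH thread. */
struct sched_param rthigh_param(int policy)
{
    struct sched_param param = { .sched_priority = rthigh_priority(policy) };
    return param;
}
```

Actually applying the parameter to a thread requires real-time privileges, so the sketch stops at computing it.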

> Note that depending on exactly what those threads do (I am unfamiliar with the
> hourglass testcase) it isn't unreasonable for them to all run on the same CPU
> if their runnable/sleeping states happen to line up just right.  It is also
> very possible that they bounce around from CPU to CPU in rapid succession if
> the runnable/sleeping windows overlap in exactly the wrong way :-)
>
All each thread does is hog the CPU whenever it can. I have also
attached the test result and the graph for a 10 s run. The graph
basically shows the context switches that happened among
threads 1-5 (round robin). Thread 0 does not record any context switch
because it runs the whole time:
thread 0 recorded 10.039664 seconds (99.999746 %)


-- 
Thanks
   Giri

[-- Attachment #2: hourglass.out --]
[-- Type: application/octet-stream, Size: 5234 bytes --]

thread 0 will run workload CPU
thread 1 will run workload CPU
thread 2 will run workload CPU
thread 3 will run workload CPU
thread 4 will run workload CPU
thread 5 will run workload CPU
thread 0 will use high-res timers
thread 1 will use high-res timers
thread 2 will use high-res timers
thread 3 will use high-res timers
thread 4 will use high-res timers
thread 5 will use high-res timers
thread 0 will have priority RTHIGH
thread 1 will have priority RTHIGH
thread 2 will have priority RTHIGH
thread 3 will have priority RTHIGH
thread 4 will have priority RTHIGH
thread 5 will have priority RTHIGH
8.010864 MB allocated for trace records
Hourglass 1.0.1b : 6 threads; 10.000000 seconds; 1829.032872 MHz timestamp counter
max gap is 64015 cycles
this test will last for 10.000000 seconds
numthreads: 6
work done by thrd 0 : 72029073
work done by thrd 1 : 7162093
work done by thrd 2 : 14712526
work done by thrd 3 : 13930769
work done by thrd 4 : 14596841
work done by thrd 5 : 14327441
there were 67 out of 300000 trace records; 0.022333 % used.
in cycles, trace start 7468911785159, end 7487274707715, duration 18362922556
trace duration 10.039690 seconds
thread 0 recorded 10.039664 seconds (99.999746 %)
thread 1 recorded 2.041019 seconds (20.329499 %)
thread 2 recorded 2.033881 seconds (20.258403 %)
thread 3 recorded 1.925891 seconds (19.182774 %)
thread 4 recorded 2.017882 seconds (20.099050 %)
thread 5 recorded 1.980695 seconds (19.728645 %)
total thread time 20.039032 seconds

time slots in ms: thread start end duration gap:
tracerec: 1 39.456393 880.581229 841.124836 0.000000
tracerec: 2 880.592608 1714.579263 833.986656 0.000000
tracerec: 4 1714.589860 2532.577469 817.987609 0.000000
tracerec: 5 2532.587891 3343.575650 810.987759 0.000000
tracerec: 3 3343.586175 4169.573755 825.987580 0.000000
tracerec: 1 4169.582356 4269.573626 99.991270 3289.001127
tracerec: 2 4269.582136 4369.573304 99.991168 2555.002872
tracerec: 4 4369.581549 4469.573078 99.991529 1837.004080
tracerec: 5 4469.581552 4569.572743 99.991192 1126.005901
tracerec: 3 4569.581302 4669.572704 99.991402 400.007546
tracerec: 1 4669.581304 4769.572370 99.991066 400.007679
tracerec: 2 4769.581030 4869.572168 99.991138 400.007727
tracerec: 4 4869.580624 4969.571912 99.991288 400.007546
tracerec: 5 4969.580302 5069.571758 99.991457 400.007558
tracerec: 3 5069.579992 5169.571508 99.991517 400.007288
tracerec: 1 5169.579976 5269.571180 99.991204 400.007606
tracerec: 2 5269.579889 5369.570990 99.991102 400.007721
tracerec: 4 5369.579350 5469.570903 99.991553 400.007438
tracerec: 5 5469.579268 5569.570581 99.991312 400.007510
tracerec: 3 5569.579043 5669.570373 99.991330 400.007534
tracerec: 1 5669.578871 5769.570033 99.991162 400.007691
tracerec: 2 5769.578735 5869.569957 99.991222 400.007745
tracerec: 4 5869.578395 5969.569791 99.991396 400.007492
tracerec: 5 5969.578295 6069.569493 99.991198 400.007715
tracerec: 3 6069.577877 6169.569388 99.991511 400.007504
tracerec: 1 6169.577849 6269.569084 99.991234 400.007817
tracerec: 2 6269.577828 6369.568972 99.991144 400.007871
tracerec: 4 6369.577831 6469.568572 99.990741 400.008039
tracerec: 5 6469.577238 6569.568346 99.991108 400.007745
tracerec: 3 6569.576892 6669.568162 99.991270 400.007504
tracerec: 1 6669.576684 6769.567846 99.991162 400.007600
tracerec: 2 6769.576578 6869.567722 99.991144 400.007606
tracerec: 4 6869.576286 6969.567472 99.991186 400.007715
tracerec: 5 6969.575964 7069.567367 99.991402 400.007618
tracerec: 3 7069.575949 7169.566978 99.991030 400.007787
tracerec: 1 7169.575536 7269.566746 99.991210 400.007691
tracerec: 2 7269.575491 7369.566526 99.991036 400.007769
tracerec: 4 7369.575367 7469.566397 99.991030 400.007895
tracerec: 5 7469.574943 7569.566237 99.991294 400.007576
tracerec: 3 7569.574819 7669.565981 99.991162 400.007841
tracerec: 1 7669.574533 7769.565653 99.991120 400.007787
tracerec: 2 7769.574193 7869.565397 99.991204 400.007666
tracerec: 4 7869.574033 7969.565237 99.991204 400.007636
tracerec: 5 7969.573771 8069.564969 99.991198 400.007534
tracerec: 3 8069.573660 8169.564834 99.991174 400.007679
tracerec: 1 8169.573512 8269.564626 99.991114 400.007859
tracerec: 2 8269.572925 8369.564382 99.991457 400.007528
tracerec: 4 8369.573012 8469.564204 99.991192 400.007775
tracerec: 5 8469.572702 8569.563960 99.991258 400.007733
tracerec: 3 8569.572801 8669.563662 99.990861 400.007967
tracerec: 1 8669.572376 8769.563274 99.990897 400.007751
tracerec: 2 8769.571970 8869.563240 99.991270 400.007588
tracerec: 4 8869.571756 8969.563044 99.991288 400.007552
tracerec: 5 8969.571608 9069.562776 99.991168 400.007648
tracerec: 3 9069.571515 9169.562532 99.991018 400.007853
tracerec: 1 9169.571084 9269.562264 99.991180 400.007811
tracerec: 2 9269.571087 9369.562165 99.991078 400.007847
tracerec: 4 9369.570735 9469.561921 99.991186 400.007691
tracerec: 5 9469.570431 9569.561605 99.991174 400.007654
tracerec: 3 9569.570181 9669.561355 99.991174 400.007648
tracerec: 1 9669.569931 9769.561111 99.991180 400.007666
tracerec: 2 9769.569837 9869.561059 99.991222 0.000000
tracerec: 4 9869.569611 9969.560719 99.991108 0.000000
tracerec: 5 9969.569265 10039.372537 69.803272 0.000000
tracerec: 3 10039.689739 10039.689739 0.000000 0.000000

[-- Attachment #3: hourglass.png --]
[-- Type: image/png, Size: 42665 bytes --]


* Re: Scheduling behaviour of 23-rc4-rt1 on my intel centrino Duo.
  2007-10-05  6:40   ` Girish kathalagiri
@ 2007-10-05 16:06     ` Darren Hart
  2007-10-05 16:48       ` Darren Hart
  0 siblings, 1 reply; 5+ messages in thread
From: Darren Hart @ 2007-10-05 16:06 UTC
  To: Girish kathalagiri; +Cc: linux-rt-users

On Thursday 04 October 2007 23:40:53 Girish kathalagiri wrote:
> Hi Darren,
>
> > 1) How are you determining which CPUs these threads spend their time on?
>
> The hourglass test stores an execution trace. The trace is
> generated as the threads run: each time a thread detects a gap
> in its execution, it records a trace entry containing the start and
> end times of the continuous block of CPU time that the thread received.
> The output looks something like this:
> tracerec: 3 4569.581302 4669.572704 99.991402 400.007546
> tracerec: 1 4669.581304 4769.572370 99.991066 400.007679
> tracerec: 2 4769.581030 4869.572168 99.991138 400.007727
> tracerec: 4 4869.580624 4969.571912 99.991288 400.007546
> (columns: thread id, start time and end time of the gap-free block of
> CPU time, duration of the interval, and the time elapsed since the end
> of the thread's previous block; all times in ms)
> The values are plotted and the graph is attached.
> Thread 0 runs all the time, while threads 1-5 get equal time slices.
> Hence thread 0 is hogging one CPU completely and threads 1-5 are
> sharing the other CPU.

OK, looking at your output, I'd have to agree (although your plot doesn't show 
thread 0 running at all, I presume it should have a gray bar that is a full 
10 seconds long?).

I usually deal with SCHED_FIFO threads, so I'm going to take a look at the 
SCHED_RR behavior to see how it's implemented.

--Darren

-- 
Darren Hart
IBM Linux Technology Center
Realtime Linux Team


* Re: Scheduling behaviour of 23-rc4-rt1 on my intel centrino Duo.
  2007-10-05 16:06     ` Darren Hart
@ 2007-10-05 16:48       ` Darren Hart
  0 siblings, 0 replies; 5+ messages in thread
From: Darren Hart @ 2007-10-05 16:48 UTC
  To: Girish kathalagiri; +Cc: linux-rt-users

On Friday 05 October 2007 09:06:11 Darren Hart wrote:
> OK, looking at your output, I'd have to agree (although your plot doesn't
> show thread 0 running at all, I presume it should have a gray bar that is a
> full 10 seconds long?).
>
> I usually deal with SCHED_FIFO threads, so I'm going to take a look at the
> SCHED_RR behavior to see how it's implemented.


I think I understand what is going on.  When the 6 threads are initially
scheduled, T0 is placed on CPU 0 and T[1-5] are placed on CPU 1.  Since they
are very high priority, and all the same priority, they are never preempted,
and they never "wake up", so they miss the opportunities for the rt_pulled
code to act (it wouldn't pull them anyway, since their priority is equal to,
not greater than, that of the threads on the other runqueues).  The scheduler
doesn't prefer one task of a given priority over any other of the same
priority.  When their time slices expire, they get sent to the back of the
current runqueue, and the next SCHED_RR thread on the local runqueue is
scheduled to run (relying on the FIFO behavior of the array for ordering).
Since there is no way to do global RR ordering across all runqueues without
some nasty logic looking at the last time each was scheduled, the scheduler
appears to guarantee only a local-runqueue RR policy.  If you want to better
balance a SCHED_RR application, I suggest you look into CPU pinning.
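
As a rough illustration of the CPU pinning suggested above, here is a minimal sketch. It is not something hourglass is shown to do in this thread; cpu_for_thread() and pin_self_to_cpu() are hypothetical helpers. It relies on the Linux behaviour that sched_setaffinity() with a pid of 0 applies to the calling thread, and it distributes thread indices round-robin so that 6 threads on 2 CPUs end up 3 and 3.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for CPU_ZERO/CPU_SET and sched_setaffinity() */
#endif
#include <assert.h>
#include <errno.h>
#include <sched.h>

/* Hypothetical placement policy: spread thread indices round-robin
 * over the available CPUs. */
int cpu_for_thread(int thread_index, int ncpus)
{
    assert(thread_index >= 0 && ncpus > 0);
    return thread_index % ncpus;
}

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, -1 with errno set on failure. */
int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 refers to the calling thread on Linux */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Each worker would call pin_self_to_cpu(cpu_for_thread(i, 2)) once at startup, before it begins spinning. With that placement each local runqueue holds three SCHED_RR threads, so each thread should get roughly a third of a CPU instead of the 100 % / 20 % split reported earlier.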

Regards,

Darren Hart

-- 
Darren Hart
IBM Linux Technology Center
Realtime Linux Team
