public inbox for linux-kernel@vger.kernel.org
* CFS Scheduler and real-time tasks
From: Török Edvin @ 2007-05-17 19:14 UTC
  To: mingo; +Cc: linux-kernel

Hi,

I have tested your new CFS scheduler (v2.6.21.1-v12) on a Dell
Inspiron 6400 (Intel Core Duo @ 1.66 GHz, SMP).

I tested with the bash for-loop and with the massive_intr.c workload you
suggested here http://lkml.org/lkml/2007/5/14/227 , then with a
modified version of massive_intr.c.
My modifications: add a new task that changes work_msecs after each
iteration and also increases its sleep time, and run this task with
real-time static priority 1. The purpose isn't to test how much CPU
time the RT process gets (it should get all the CPU while it runs,
right?), but to test whether an RT-priority task can 'unbalance' the
non-RT tasks.
[the diff is at the end of my mail; it should be applied to
massive_intr.c from http://lkml.org/lkml/2007/3/26/319]
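
A minimal sketch of requesting the real-time priority described above,
with the return value checked. The helper names try_set_fifo() and
back_to_other() are hypothetical and not part of massive_intr.c; only
sched_setscheduler() and the SCHED_* constants are real. On a typical
Linux setup an unprivileged process is refused with EPERM unless its
RLIMIT_RTPRIO allows the request, so an unchecked call can silently
leave the task under the normal policy.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: try to switch the calling process to SCHED_FIFO
 * at the given static priority.
 * Returns 0 on success, or a negative errno value on failure.
 */
static int try_set_fifo(int prio)
{
	struct sched_param sp;

	/* On Linux, SCHED_FIFO static priorities start at 1. */
	assert(prio > 0);

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = prio;
	if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
		int err = errno;

		fprintf(stderr, "sched_setscheduler(SCHED_FIFO, %d): %s\n",
			prio, strerror(err));
		return -err;
	}
	return 0;
}

/* Return to the default policy; SCHED_OTHER requires priority 0. */
static int back_to_other(void)
{
	struct sched_param sp;

	memset(&sp, 0, sizeof(sp));
	return sched_setscheduler(0, SCHED_OTHER, &sp) < 0 ? -errno : 0;
}
```

Calling try_set_fifo(1) before the work loop and back_to_other() after
it mirrors what the RT job in the diff does, but makes a refused
request visible on stderr.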

In short, my results are: CFS is much more precise than the standard
scheduler (on my system, with my workload), but:
- the precision of CFS on my system is worse than on your system
- the precision of CFS decreases if real-time tasks are run

Ingo, you have obtained much more accurate results (your massive_intr
output has a max error of 1, whereas my error is 20 - 40).
Is there anything I can change in my kernel to increase the precision
of CFS?
[I have tried running the test with both the ondemand and performance
governors, with the same results]

Is there something I am doing wrong, or is the precision decrease
expected, when running real-time tasks too?

I have run the test using 'Linux localhost 2.6.21-gentoo', gentoo 2007.0.

Results of bash-for-loop:
 7730 edwin     20   0  2888  552  256 R   20  0.1   0:05.69 bash
 7731 edwin     20   0  2888  552  256 R   19  0.1   0:05.68 bash
 7732 edwin     20   0  2888  552  256 R   19  0.1   0:05.69 bash
 7728 edwin     20   0  2888  552  256 R   19  0.1   0:05.70 bash
 7729 edwin     20   0  2888  552  256 R   19  0.1   0:05.67 bash
 7733 edwin     20   0  2888  552  256 R   19  0.1   0:05.58 bash
 7734 edwin     20   0  2888  552  256 R   19  0.1   0:05.57 bash
 7735 edwin     20   0  2888  552  256 R   19  0.1   0:05.58 bash
 7736 edwin     20   0  2888  552  256 R   19  0.1   0:05.58 bash
 7737 edwin     20   0  2888  552  256 R   19  0.1   0:05.58 bash

Very precise indeed, the default scheduler produces lots of different
timings there.

Running the original massive_intr.c, I can see that the maximum error is ~20.
[gcc massive_intr_orig.c -o mo -lrt]

edwin@localhost ~ $ ./mo 9 10
025633  00000182
025630  00000182
025627  00000185
025628  00000187
025634  00000194
025632  00000181
025626  00000195
025629  00000181
025631  00000194
edwin@localhost ~ $ ./mo 9 10
025840  00000175
025845  00000175
025842  00000172
025848  00000174
025841  00000205
025847  00000207
025846  00000208
025844  00000173
025843  00000207
edwin@localhost ~ $ ./mo 9 10
026318  00000216
026319  00000212
026324  00000221
026322  00000218
026320  00000213
026321  00000213
026323  00000212
026317  00000194
026316  00000220
edwin@localhost ~ $ ./mo 9 10
026646  00000230
026645  00000219
026653  00000227
026651  00000233
026650  00000228
026647  00000228
026648  00000217
026652  00000208
026649  00000230

Running my version of massive_intr.c:
[gcc massive_intr_new.c -o massiv_new -lrt]

./massiv_new 9 10
027216  00000205
027218  00000202
027220  00000227
027217  00000205
027219  00000228
027222  00000226
027215  00000227
027221  00000206

RT JOB:027223   00000061

edwin@localhost ~ $ ./massiv_new 9 10
027547  00000196
027542  00000205
027546  00000223
027544  00000198
027541  00000224
027548  00000218
027545  00000226
027543  00000202

RT JOB:027549   00000059


Everything is fine so far. Now I run it with root privileges, so that
the real-time task actually gets to run with RT priority:
localhost edwin # ./massiv_new 9 10
027957  00000158
027954  00000172
027956  00000163
027955  00000146
027958  00000140
027960  00000184
027959  00000181
027961  00000157

RT JOB:027962   00000187

localhost edwin # ./massiv_new 9 10
028333  00000177
028335  00000159
028332  00000185
028337  00000136
028336  00000156
028334  00000145
028338  00000175
028339  00000162

RT JOB:028340   00000183

The error is ~40, which is double the error when running with no RT task.
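
A small sketch of how the "error" figures can be checked by hand,
assuming "error" means the spread between the largest and the smallest
per-process loop count in one run (the thread does not define it
precisely). The helper name count_spread() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: absolute spread (max - min) of the per-process
 * loop counts printed by one massive_intr run.
 */
static long count_spread(const long *counts, size_t n)
{
	long min, max;
	size_t i;

	assert(n > 0);
	min = max = counts[0];
	for (i = 1; i < n; i++) {
		if (counts[i] < min)
			min = counts[i];
		if (counts[i] > max)
			max = counts[i];
	}
	return max - min;
}
```

Fed with the eight non-RT counts of the first run as root, this gives
44; for the first unprivileged run it gives 26.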


Best regards,
Edwin

--- massiv_orig.c	2007-05-17 21:20:21.000000000 +0000
+++ massive_intr.c	2007-05-17 21:39:31.000000000 +0000
@@ -43,6 +43,7 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <sys/mman.h>
+#include <sched.h>
 #include <sys/wait.h>
 #include <fcntl.h>
 #include <unistd.h>
@@ -56,6 +57,8 @@

 #define WORK_MSECS	8
 #define SLEEP_MSECS	1
+#define RT_MSECS_INCREMENT 1
+#define RT_MSECS_MAX	 50

 #define MAX_PROC	1024
 #define SAMPLE_COUNT	1000000000
@@ -113,6 +116,74 @@
 	after = tv[1].tv_sec*USECS_PER_SEC+tv[1].tv_usec;
 	return SAMPLE_COUNT/(after - before)*USECS_PER_MSEC;
 }
+static void *test_job_rt(void *arg)
+{
+	int l = (int)arg;
+	int count = 0;
+	time_t current;
+	sigset_t sigset;
+	struct sigaction sa;
+	struct timespec ts0 = { 0, NSECS_PER_MSEC*SLEEP_MSECS};
+	struct timespec ts1 = { 0, NSECS_PER_MSEC*RT_MSECS_MAX};
+	struct timespec ts_delta = { 0, NSECS_PER_MSEC*RT_MSECS_INCREMENT};
+	struct timespec ts = ts0;
+	struct sched_param sp;
+	int work_msecs = WORK_MSECS;
+
+	sa.sa_handler = sighandler;
+	if (sigemptyset(&sa.sa_mask) < 0) {
+		warn("sigemptyset() failed");
+		abnormal_exit();
+	}
+	sa.sa_flags = 0;
+	if (sigaction(SIGUSR1, &sa, NULL) < 0) {
+		warn("sigaction() failed");
+		abnormal_exit();
+	}
+	if (sigemptyset(&sigset) < 0) {
+		warn("sigemptyset() failed");
+		abnormal_exit();
+	}
+	sigsuspend(&sigset);
+	if (errno != EINTR) {
+		warn("sigsuspend() failed");
+		abnormal_exit();
+	}
+	sp.sched_priority = 1;
+
+	sched_setscheduler(0, SCHED_FIFO, &sp);
+	/* main loop */
+	do {
+		loopfnc(work_msecs*l);
+		if (nanosleep(&ts, NULL) < 0) {
+			warn("nanosleep() failed");
+			abnormal_exit();
+		}
+		count++;
+		if (time(&current) == -1) {
+			warn("time() failed");
+			abnormal_exit();
+		}
+		ts.tv_nsec += ts_delta.tv_nsec;
+		if(ts.tv_nsec > ts1.tv_nsec)
+			ts = ts0;
+		work_msecs += RT_MSECS_INCREMENT;
+		if(work_msecs > RT_MSECS_MAX)
+			work_msecs = WORK_MSECS;
+	} while (difftime(current, *first) < runtime);
+	sp.sched_priority = 0;
+	sched_setscheduler(0, SCHED_OTHER, &sp);
+	if (sem_wait(printsem) < 0) {
+		warn("sem_wait() failed");
+		abnormal_exit();
+	}
+	printf("\nRT JOB:%06d\t%08d\n\n", getpid(), count);
+	if (sem_post(printsem) < 0) {
+		warn("sem_post() failed");
+		abnormal_exit();
+	}
+	exit(EXIT_SUCCESS);
+}
 static void *test_job(void *arg)
 {
 	int l = (int)arg;
@@ -237,8 +308,12 @@
 					warn("kill() failed");
 			goto err_sem;
 		}
-		if (pid[i] == 0)
-			test_job((void *)c);
+		if (pid[i] == 0) {
+			if(i == nproc-1)
+				test_job_rt((void *)c);
+			else
+				test_job((void*)c);
+		}
 	}
 	if (sigemptyset(&sigset) < 0) {
 		warn("sigemptyset() failed");
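
The ramp that the RT job in the diff above applies to its work and
sleep times can be restated on its own. This is only a sketch: the
struct rt_step and rt_step_next() names are hypothetical, and the times
are kept in whole milliseconds instead of the timespec used in the
patch. The constants are the ones from the diff.

```c
#include <assert.h>

#define WORK_MSECS          8
#define SLEEP_MSECS         1
#define RT_MSECS_INCREMENT  1
#define RT_MSECS_MAX        50

/* Hypothetical state of the RT job between two iterations. */
struct rt_step {
	int work_msecs;   /* busy-loop time of the next iteration */
	int sleep_msecs;  /* nanosleep() time of the next iteration */
};

/*
 * Advance both values by one increment; each wraps back to its own
 * starting value once it exceeds RT_MSECS_MAX.
 */
static struct rt_step rt_step_next(struct rt_step s)
{
	assert(s.work_msecs > 0 && s.sleep_msecs > 0);

	s.sleep_msecs += RT_MSECS_INCREMENT;
	if (s.sleep_msecs > RT_MSECS_MAX)
		s.sleep_msecs = SLEEP_MSECS;

	s.work_msecs += RT_MSECS_INCREMENT;
	if (s.work_msecs > RT_MSECS_MAX)
		s.work_msecs = WORK_MSECS;

	return s;
}
```

Starting from { WORK_MSECS, SLEEP_MSECS }, the work time wraps after 43
steps and the sleep time after 50, so the two drift out of phase and
the RT load on the CPU keeps changing.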


* Re: CFS Scheduler and real-time tasks
From: Ingo Molnar @ 2007-05-17 19:23 UTC
  To: Török Edvin; +Cc: mingo, linux-kernel


* Török Edvin <edwintorok@gmail.com> wrote:

> Is there something I am doing wrong, or is the precision decrease 
> expected, when running real-time tasks too?

yeah: real-time tasks are ueber-tasks that just get all CPU time 
immediately. I guess we could make SCHED_RR be covered by CFS accounting 
eventually.

> Very precise indeed, the default scheduler produces lots of different 
> timings there.

note: you ran it on a dual-core system and the SMP load-balancer needs 
time to gain precision in the non-RT test - so i'd suggest to test it 
much longer than 10 seconds: 300 seconds at least.

	Ingo


* Re: CFS Scheduler and real-time tasks
From: Török Edvin @ 2007-05-17 20:10 UTC
  To: Ingo Molnar; +Cc: mingo, linux-kernel

On 5/17/07, Ingo Molnar <mingo@elte.hu> wrote:
>
> note: you ran it on a dual-core system and the SMP load-balancer needs
> time to gain precision in the non-RT test - so i'd suggest to test it
> much longer than 10 seconds: 300 seconds at least.

Thanks, running for 300 seconds shows only a 2% error.
Even running an I/O workload alongside the massive_intr test keeps
the error the same.
BTW, the system is more 'responsive'. For example, starting
applications while running the tests, switching windows, and moving
windows are all much "smoother" than with the default scheduler.

[while true;do cp /usr/portage/distfiles/linux-2.6.21.tar.bz2 . &&
sync && rm linux-2.6.21.tar.bz2 && sync; done]

./massiv_new 9 300
020860  00017782
020858  00017774

RT JOB:020865   00003785

020859  00017048
020862  00017387
020855  00017494
020861  00017245
020863  00017492
020864  00017122
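
A sketch of one way to arrive at a percentage like the 2% above,
assuming it means the largest deviation of any non-RT task from the
mean loop count (the thread does not spell out the formula). The helper
name max_dev_percent() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: largest deviation of any count from the mean,
 * expressed as a percentage of the mean.
 */
static double max_dev_percent(const long *counts, size_t n)
{
	double mean = 0.0, worst = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		mean += (double)counts[i];
	mean /= (double)n;

	for (i = 0; i < n; i++) {
		double d = (double)counts[i] - mean;

		if (d < 0.0)
			d = -d;
		if (d > worst)
			worst = d;
	}
	return 100.0 * worst / mean;
}
```

For the eight non-RT counts listed above this comes out at roughly
2.1%.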


Best regards,
Edwin


* Re: CFS Scheduler and real-time tasks
From: Ingo Molnar @ 2007-05-17 20:13 UTC
  To: Török Edvin; +Cc: mingo, linux-kernel


* Török Edvin <edwintorok@gmail.com> wrote:

> On 5/17/07, Ingo Molnar <mingo@elte.hu> wrote:
> >
> >note: you ran it on a dual-core system and the SMP load-balancer needs
> >time to gain precision in the non-RT test - so i'd suggest to test it
> >much longer than 10 seconds: 300 seconds at least.
> 
> Thanks, running for 300 seconds shows only a 2% error. Even running an
> I/O workload while running the massive_intr test keeps the same error.

hey, cool! Also try -v13 - it should be even tighter.

> BTW, the system is more 'responsive'. For example starting 
> applications while running the tests, switching windows, moving 
> windows is much "smoother" than on default scheduler.

yeah :)

> [while true;do cp /usr/portage/distfiles/linux-2.6.21.tar.bz2 . && 
> sync && rm linux-2.6.21.tar.bz2 && sync; done]
> 
> ./massiv_new 9 300
> 020860  00017782
> 020858  00017774
> 
> RT JOB:020865   00003785
> 
> 020859  00017048
> 020862  00017387
> 020855  00017494
> 020861  00017245
> 020863  00017492
> 020864  00017122

looks pretty good!

	Ingo


* Re: CFS Scheduler and real-time tasks
From: Török Edvin @ 2007-05-18 21:07 UTC
  To: Ingo Molnar; +Cc: mingo, linux-kernel

On 5/17/07, Ingo Molnar <mingo@elte.hu> wrote:
> hey, cool! Also try -v13 - it should be even tighter.

I tried -v13. However the scheduling "error" is now 10% (vs 2% with -v12).

I also noticed strange behaviour with CPU hotplug. I offlined cpu1
(echo 0 >/sys/devices/system/cpu/cpu1/online), and the typing speed on
my terminal decreased noticeably. I could hardly type anything there.
The system was otherwise idle (running Xorg and gnome-terminal).
As soon as I started the massive_intr.c workload, the keyboard typing
speed became normal again. If a workload is already running when I
offline cpu1, the keyboard stays fine afterwards.
The entire system also froze once when I offlined cpu1, but I couldn't
reproduce this.

I don't know how -v12 behaved wrt CPU hotplug.

> > ./massiv_new 9 300
> > 020860  00017782
> > 020858  00017774
> >
> > RT JOB:020865   00003785
> >
> > 020859  00017048
> > 020862  00017387
> > 020855  00017494
> > 020861  00017245
> > 020863  00017492
> > 020864  00017122

Results with -v13:

edwin@localhost ~ $ ./mo 9 300
006203  00006071
006198  00005712
006201  00006088
006196  00006208
006200  00006166
006195  00006306
006199  00006403
006197  00006288
006202  00006237

./massiv_new 9 300
013803  00010073
013801  00009889
013804  00010303
013806  00009953
013800  00010010
013802  00010171
013807  00009956
013805  00010037

RT JOB:013808   00002230

Interesting, massiv_new has better precision than massiv_orig.

Just in case it helps, here's my /proc/sched_debug (once with the
system idle, and once later while running the test):

Sched Debug Version: v0.02
now at 1187357967816 nsecs

cpu: 0
  .nr_running            : 0
  .raw_weighted_load     : 0
  .nr_switches           : 555712
  .nr_load_updates       : 221754
  .nr_uninterruptible    : 4294967190
  .jiffies               : 221839
  .next_balance          : 221835
  .curr->pid             : 0
  .clock                 : 755701854416
  .prev_clock_raw        : 871997978419
  .clock_warps           : 0
  .clock_unstable_events : 97254
  .clock_max_delta       : 4749235
  .fair_clock            : 184031607224
  .prev_fair_clock       : 0
  .exec_clock            : 749010316890
  .prev_exec_clock       : 0
  .wait_runtime          : 59753136705
  .wait_runtime_overruns : 2689
  .wait_runtime_underruns: 1141
  .cpu_load[0]           : 0
  .cpu_load[1]           : 0
  .cpu_load[2]           : 0
  .cpu_load[3]           : 0
  .cpu_load[4]           : 0
  .wait_runtime_rq_sum   : 0

runnable tasks:
            task   PID        tree-key         delta       waiting  switches  prio        sum-exec        sum-wait       sum-sleep    wait-overrun   wait-underrun
------------------------------------------------------------------------------------------------------------------------------------------------------------------

cpu: 1
  .nr_running            : 1
  .raw_weighted_load     : 1024
  .nr_switches           : 378755
  .nr_load_updates       : 202398
  .nr_uninterruptible    : 106
  .jiffies               : 221839
  .next_balance          : 221830
  .curr->pid             : 20421
  .clock                 : 794219468967
  .prev_clock_raw        : 874002443396
  .clock_warps           : 0
  .clock_unstable_events : 116368
  .clock_max_delta       : 4531475
  .fair_clock            : 175306014212
  .prev_fair_clock       : 0
  .exec_clock            : 718501418488
  .prev_exec_clock       : 0
  .wait_runtime          : -59753140134
  .wait_runtime_overruns : 1245
  .wait_runtime_underruns: 1604
  .cpu_load[0]           : 0
  .cpu_load[1]           : 0
  .cpu_load[2]           : 0
  .cpu_load[3]           : 0
  .cpu_load[4]           : 0
  .wait_runtime_rq_sum   : -34602

runnable tasks:
            task   PID        tree-key         delta       waiting  switches  prio        sum-exec        sum-wait       sum-sleep    wait-overrun   wait-underrun
------------------------------------------------------------------------------------------------------------------------------------------------------------------
R            cat 20421    175306048814         34602        -34602         2   120           69205          -34602             853               0               0
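
A side note on the .nr_uninterruptible value 4294967190 shown for
cpu 0: my reading (an assumption, not something stated in this thread)
is that it is a per-CPU counter that went negative and is printed as an
unsigned 32-bit number, so it reads as -106 and cancels the 106 on
cpu 1. The sketch below only demonstrates that arithmetic; the helper
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reinterpret an unsigned 32-bit value as two's-complement signed. */
static int64_t as_signed32(uint32_t v)
{
	if (v > (uint32_t)INT32_MAX)
		return (int64_t)v - ((int64_t)UINT32_MAX + 1);
	return (int64_t)v;
}

/* Sum the per-CPU values after decoding each one as signed. */
static int64_t sum_uninterruptible(const uint32_t *per_cpu, size_t n)
{
	int64_t sum = 0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += as_signed32(per_cpu[i]);
	return sum;
}
```

With the two values from the dump, { 4294967190, 106 }, the first
decodes to -106 and the sum is 0, which fits an otherwise idle system.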


And sched_debug output while running the test:

Sched Debug Version: v0.02
now at 1261993795754 nsecs

cpu: 0
  .nr_running            : 5
  .raw_weighted_load     : 5120
  .nr_switches           : 588021
  .nr_load_updates       : 232164
  .nr_uninterruptible    : 4294967190
  .jiffies               : 240498
  .next_balance          : 240508
  .curr->pid             : 21096
  .clock                 : 770190363479
  .prev_clock_raw        : 895084968848
  .clock_warps           : 0
  .clock_unstable_events : 125214
  .clock_max_delta       : 0
  .fair_clock            : 190695995910
  .prev_fair_clock       : 0
  .exec_clock            : 763286081256
  .prev_exec_clock       : 0
  .wait_runtime          : 68703429598
  .wait_runtime_overruns : 2978
  .wait_runtime_underruns: 1273
  .cpu_load[0]           : 5120
  .cpu_load[1]           : 5119
  .cpu_load[2]           : 5110
  .cpu_load[3]           : 5077
  .cpu_load[4]           : 5037
  .wait_runtime_rq_sum   : -54771343

runnable tasks:
            task   PID        tree-key         delta       waiting  switches  prio        sum-exec        sum-wait       sum-sleep    wait-overrun   wait-underrun
------------------------------------------------------------------------------------------------------------------------------------------------------------------
R             mo 21096    190703231293       7235383      -7235383       445   120      1813432308        -8250501       211806979               0               1
              mo 21093    190704201003       8205093      -8472612       450   120      1818737228       -12394371       206140729               0               6
              mo 21094    190704732208       8736298      -9665761       464   120      1816895620       -12925451       204333757               0               4
              mo 21092    190704971831       8975921     -15574122       456   120      1817099155       -16680247       208648471               0               2
              mo 21091    190705631487       9635577     -13823465       457   120      1816983212       -16160000       204693203               0               4

cpu: 1
  .nr_running            : 5
  .raw_weighted_load     : 5120
  .nr_switches           : 415103
  .nr_load_updates       : 211674
  .nr_uninterruptible    : 106
  .jiffies               : 240498
  .next_balance          : 240516
  .curr->pid             : 21291
  .clock                 : 806335506481
  .prev_clock_raw        : 897090740058
  .clock_warps           : 0
  .clock_unstable_events : 150133
  .clock_max_delta       : 0
  .fair_clock            : 179777580248
  .prev_fair_clock       : 0
  .exec_clock            : 730473748729
  .prev_exec_clock       : 0
  .wait_runtime          : -68769955037
  .wait_runtime_overruns : 1250
  .wait_runtime_underruns: 1885
  .cpu_load[0]           : 4096
  .cpu_load[1]           : 4293
  .cpu_load[2]           : 5044
  .cpu_load[3]           : 7236
  .cpu_load[4]           : 8445
  .wait_runtime_rq_sum   : -11785269

runnable tasks:
            task   PID        tree-key         delta       waiting  switches  prio        sum-exec        sum-wait       sum-sleep    wait-overrun   wait-underrun
------------------------------------------------------------------------------------------------------------------------------------------------------------------
R            cat 21291    179777631414         51166        -51166         2   120           66773          -51166           53218               0               0
              mo 21095    179778195659        615411       -799325       737   120      2088825958         -799325       201402067               0               0
              mo 21090    179778267640        687392       -711781       760   120      2094378519         -711781       208607795               0               0
              mo 21097    179781627293       4047045      -5262328       789   120      2090034861        -5262328       202105919               0               0
              mo 21089    179782321359       4741111      -4960669       757   120      2094677840        -4960669       202830299               0               0


* Re: CFS Scheduler and real-time tasks
From: Török Edvin @ 2007-06-16  7:21 UTC
  To: linux-kernel

On 5/19/07, Török Edvin <edwintorok@gmail.com> wrote:
> I tried -v13. However the scheduling "error" is now 10% (vs 2% with -v12).
>
> I also noticed strange behaviour with CPU hotplug. I offlined cpu1
> (echo 0 >/sys/devices/system/cpu/cpu1/online), and the typing speed on
> my terminal decreased noticeably. I could hardly type something there.

Both of these have been fixed in cfs-v17. SMP load balancing is
accurate, and I noticed no problems with CPU hotplug.

Thanks,
Edwin
