public inbox for linux-kernel@vger.kernel.org
* 2.5.2-pre performance degradation on an old 486
@ 2002-01-05  0:51 Mikael Pettersson
  2002-01-05  8:25 ` Matthias Hanisch
  0 siblings, 1 reply; 20+ messages in thread
From: Mikael Pettersson @ 2002-01-05  0:51 UTC (permalink / raw)
  To: axboe, torvalds; +Cc: linux-kernel

When running 2.5.2-pre7 on my old for-testing-only 486(*),
file-system accesses seem to come in distinct bursts preceded
by lengthy pauses. Overall performance is down quite significantly
compared to 2.4.18pre1 and 2.2.20pre2. To measure it I ran two
simple tests:

Test 1: time to boot the kernel, from hitting enter at the LILO
prompt to getting a login prompt
Test 2: time to "rm -rf" a clean linux-2.4.17 source tree, using
the newly booted kernel (no other access to the tree before that,
so it wasn't cached in any way, and the machine was otherwise idle)
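For reference, Test 2 boils down to timing an rm -rf over a cold tree. A rough synthetic re-creation (the scratch path is made up; the real test used a pristine linux-2.4.17 source tree on a freshly booted, otherwise idle machine):

```shell
# Build a small throwaway tree and time its removal; the original test
# used a full linux-2.4.17 source tree (~10k files) instead.
mkdir -p /tmp/rmtest
for i in $(seq 1 200); do
    mkdir -p "/tmp/rmtest/dir$i"
    echo data > "/tmp/rmtest/dir$i/file"
done
sync                     # flush writes so the rm timing is not skewed
time rm -rf /tmp/rmtest
```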

		Test 1		Test 2
2.2.21pre2:	71 sec		 75 sec
2.4.18pre1:	64 sec		 72 sec
2.5.2-pre7:	97 sec		251 sec

I haven't noticed any slowdowns on my other boxes, so I didn't
do any measurements on them. On the 486 it's very very obvious.

/Mikael

(*) 100MHz 486DX4, 28MB ram, no L2 cache, two old and slow IDE disks,
small custom no-nonsense RedHat 7.2, kernels compiled with gcc 2.95.3.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-05  0:51 2.5.2-pre performance degradation on an old 486 Mikael Pettersson
@ 2002-01-05  8:25 ` Matthias Hanisch
  2002-01-05 23:10   ` Davide Libenzi
  0 siblings, 1 reply; 20+ messages in thread
From: Matthias Hanisch @ 2002-01-05  8:25 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: axboe, torvalds, linux-kernel, Davide Libenzi

On Sat, 5 Jan 2002, Mikael Pettersson wrote:

> When running 2.5.2-pre7 on my old for-testing-only 486(*),
> file-system accesses seem to come in distinct bursts preceded
> by lengthy pauses. Overall performance is down quite significantly
> compared to 2.4.18pre1 and 2.2.20pre2. To measure it I ran two
> simple tests:
> 
> Test 1: time to boot the kernel, from hitting enter at the LILO
> prompt to getting a login prompt
> Test 2: time to "rm -rf" a clean linux-2.4.17 source tree, using
> the newly booted kernel (no other access to the tree before that,
> so it wasn't cached in any way, and the machine was otherwise idle)
> 
> 		Test 1		Test 2
> 2.2.21pre2:	71 sec		 75 sec
> 2.4.18pre1:	64 sec		 72 sec
> 2.5.2-pre7:	97 sec		251 sec
> 
> I haven't noticed any slowdowns on my other boxes, so I didn't
> do any measurements on them. On the 486 it's very very obvious.

This is exactly what I see with my old 486 box. It started with
2.5.2-pre3, which contained two major items:

- bio changes from Jens
- scheduler changes from Davide

Surprisingly, backing out the bio changes didn't help. Backing out the
scheduler changes from Davide did!!

Maybe the problem lies somewhere in between, because it is often I/O
related: e.g. the first call of ldconfig is horribly slow, as is e2fsck.

But I also see system hiccups from time to time, where console switching
does not work for 1 second on an idle box.


> (*) 100MHz 486DX4, 28MB ram, no L2 cache, two old and slow IDE disks,
> small custom no-nonsense RedHat 7.2, kernels compiled with gcc 2.95.3.

Is this ISA (maybe it has something to do with ISA bouncing)? Mine is:

486 DX/2 ISA, Adaptec 1542, two slow scsi disks and a self-made
slackware-based system.

Can you also back out the scheduler changes to verify this? I have a
backout patch for 2.5.2-pre6, if you don't want to do it yourself.

Regards,
	Matze (trying 2.5.2-pre8 now)





* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-05  8:25 ` Matthias Hanisch
@ 2002-01-05 23:10   ` Davide Libenzi
  2002-01-06 10:21     ` Jens Axboe
  2002-01-07  7:22     ` 2.5.2-pre performance degradation on an old 486 Matthias Hanisch
  0 siblings, 2 replies; 20+ messages in thread
From: Davide Libenzi @ 2002-01-05 23:10 UTC (permalink / raw)
  To: Matthias Hanisch; +Cc: Mikael Pettersson, axboe, Linus Torvalds, lkml

On Sat, 5 Jan 2002, Matthias Hanisch wrote:

> On Sat, 5 Jan 2002, Mikael Pettersson wrote:
>
> > When running 2.5.2-pre7 on my old for-testing-only 486(*),
> > file-system accesses seem to come in distinct bursts preceded
> > by lengthy pauses. Overall performance is down quite significantly
> > compared to 2.4.18pre1 and 2.2.20pre2. To measure it I ran two
> > simple tests:
> >
> > Test 1: time to boot the kernel, from hitting enter at the LILO
> > prompt to getting a login prompt
> > Test 2: time to "rm -rf" a clean linux-2.4.17 source tree, using
> > the newly booted kernel (no other access to the tree before that,
> > so it wasn't cached in any way, and the machine was otherwise idle)
> >
> > 		Test 1		Test 2
> > 2.2.21pre2:	71 sec		 75 sec
> > 2.4.18pre1:	64 sec		 72 sec
> > 2.5.2-pre7:	97 sec		251 sec
> >
> > I haven't noticed any slowdowns on my other boxes, so I didn't
> > do any measurements on them. On the 486 it's very very obvious.
>
> This is exactly what I see with my old 486 box. It started with
> 2.5.2-pre3, which contained two major items:
>
> - bio changes from Jens
> - scheduler changes from Davide
>
> Surprisingly, backing out the bio changes didn't help. Backing out the
> scheduler changes from Davide did!!
>
> Maybe the problem lies somewhere in between, because it is often I/O
> related: e.g. the first call of ldconfig is horribly slow, as is e2fsck.
>
> But I also see system hiccups from time to time, where console switching
> does not work for 1 second on an idle box.
>
>
> > (*) 100MHz 486DX4, 28MB ram, no L2 cache, two old and slow IDE disks,
> > small custom no-nonsense RedHat 7.2, kernels compiled with gcc 2.95.3.
>
> Is this ISA (maybe it has something to do with ISA bouncing)? Mine is:
>
> 486 DX/2 ISA, Adaptec 1542, two slow scsi disks and a self-made
> slackware-based system.
>
> Can you also back out the scheduler changes to verify this? I have a
> backout patch for 2.5.2-pre6, if you don't want to do it yourself.

There must be some part of the kernel that assumes a certain scheduler
behavior. There was a guy who reported bad  hdparm  performance and I
tried it. While running  hdparm -t  my system shows a context-switch
load of 20-30 and an IRQ load of about 100-110.
The scheduler itself, even if you coded it in Visual Basic, could not
cause this with such loads.
Did you try to profile the kernel?
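(For anyone wanting to reproduce those numbers: the kernel keeps running totals of context switches and interrupts in /proc/stat, so a rough per-run rate can be had by sampling before and after the workload. The device path below is just an example.)

```shell
# Sample context-switch and interrupt totals from /proc/stat, run the
# workload (e.g. "hdparm -t /dev/hda"), then sample again and diff.
grep -E '^(ctxt|intr)' /proc/stat | awk '{print $1, $2}'
```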




- Davide




* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-05 23:10   ` Davide Libenzi
@ 2002-01-06 10:21     ` Jens Axboe
  2002-01-06 10:33       ` Andre Hedrick
  2002-01-06 23:59       ` [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 ) Davide Libenzi
  2002-01-07  7:22     ` 2.5.2-pre performance degradation on an old 486 Matthias Hanisch
  1 sibling, 2 replies; 20+ messages in thread
From: Jens Axboe @ 2002-01-06 10:21 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Matthias Hanisch, Mikael Pettersson, Linus Torvalds, lkml

On Sat, Jan 05 2002, Davide Libenzi wrote:
> > > (*) 100MHz 486DX4, 28MB ram, no L2 cache, two old and slow IDE disks,
> > > small custom no-nonsense RedHat 7.2, kernels compiled with gcc 2.95.3.
> >
> > Is this ISA (maybe it has something to do with ISA bouncing)? Mine is:
> >
> > 486 DX/2 ISA, Adaptec 1542, two slow scsi disks and a self-made
> > slackware-based system.
> >
> > Can you also back out the scheduler changes to verify this? I have a
> > backout patch for 2.5.2-pre6, if you don't want to do it yourself.
> 
> There must be some part of the kernel that assumes a certain scheduler
> behavior. There was a guy who reported bad  hdparm  performance and I
> tried it. While running  hdparm -t  my system shows a context-switch
> load of 20-30 and an IRQ load of about 100-110.
> The scheduler itself, even if you coded it in Visual Basic, could not
> cause this with such loads.
> Did you try to profile the kernel?

Davide,

If this is caused by ISA bounce problems, then you should be able to
reproduce it by doing something along these lines:

[ drivers/ide/ide-dma.c ]

ide_toggle_bounce()
{
	...

+	addr = BLK_BOUNCE_ISA;
	blk_queue_bounce_limit(&drive->queue, addr);
}

That's a pseudo-diff; just add the addr = line. Then compare performance
with and without your scheduler changes.

-- 
Jens Axboe



* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-06 10:21     ` Jens Axboe
@ 2002-01-06 10:33       ` Andre Hedrick
  2002-01-06 23:59       ` [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 ) Davide Libenzi
  1 sibling, 0 replies; 20+ messages in thread
From: Andre Hedrick @ 2002-01-06 10:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Davide Libenzi, Matthias Hanisch, Mikael Pettersson,
	Linus Torvalds, lkml

On Sun, 6 Jan 2002, Jens Axboe wrote:

> On Sat, Jan 05 2002, Davide Libenzi wrote:
> > > > (*) 100MHz 486DX4, 28MB ram, no L2 cache, two old and slow IDE disks,
> > > > small custom no-nonsense RedHat 7.2, kernels compiled with gcc 2.95.3.
> > >
> > > Is this ISA (maybe it has something to do with ISA bouncing)? Mine is:
> > >
> > > 486 DX/2 ISA, Adaptec 1542, two slow scsi disks and a self-made
> > > slackware-based system.
> > >
> > > Can you also back out the scheduler changes to verify this? I have a
> > > backout patch for 2.5.2-pre6, if you don't want to do it yourself.
> > 
> > There must be some part of the kernel that assumes a certain scheduler
> > behavior. There was a guy who reported bad  hdparm  performance and I
> > tried it. While running  hdparm -t  my system shows a context-switch
> > load of 20-30 and an IRQ load of about 100-110.
> > The scheduler itself, even if you coded it in Visual Basic, could not
> > cause this with such loads.
> > Did you try to profile the kernel?
> 
> Davide,
> 
> If this is caused by ISA bounce problems, then you should be able to
> reproduce it by doing something along these lines:
> 
> [ drivers/ide/ide-dma.c ]
> 
> ide_toggle_bounce()
> {
> 	...
> 
> +	addr = BLK_BOUNCE_ISA;
> 	blk_queue_bounce_limit(&drive->queue, addr);
> }

Jens, how about getting a hardware list together? I have prime2/3 ISA DMA
cards, I'm just not ready to test in 2.5.

Regards,


Andre Hedrick
CEO/President, LAD Storage Consulting Group
Linux ATA Development
Linux Disk Certification Project



* [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-06 10:21     ` Jens Axboe
  2002-01-06 10:33       ` Andre Hedrick
@ 2002-01-06 23:59       ` Davide Libenzi
  2002-01-07  1:38         ` Andrea Arcangeli
  2002-01-07  7:32         ` Jens Axboe
  1 sibling, 2 replies; 20+ messages in thread
From: Davide Libenzi @ 2002-01-06 23:59 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Matthias Hanisch, Mikael Pettersson, Linus Torvalds, lkml

On Sun, 6 Jan 2002, Jens Axboe wrote:

> On Sat, Jan 05 2002, Davide Libenzi wrote:
> > > > (*) 100MHz 486DX4, 28MB ram, no L2 cache, two old and slow IDE disks,
> > > > small custom no-nonsense RedHat 7.2, kernels compiled with gcc 2.95.3.
> > >
> > > Is this ISA (maybe it has something to do with ISA bouncing)? Mine is:
> > >
> > > 486 DX/2 ISA, Adaptec 1542, two slow scsi disks and a self-made
> > > slackware-based system.
> > >
> > > Can you also back out the scheduler changes to verify this? I have a
> > > backout patch for 2.5.2-pre6, if you don't want to do it yourself.
> >
> > There must be some part of the kernel that assumes a certain scheduler
> > behavior. There was a guy who reported bad  hdparm  performance and I
> > tried it. While running  hdparm -t  my system shows a context-switch
> > load of 20-30 and an IRQ load of about 100-110.
> > The scheduler itself, even if you coded it in Visual Basic, could not
> > cause this with such loads.
> > Did you try to profile the kernel?
>
> Davide,
>
> If this is caused by ISA bounce problems, then you should be able to
> reproduce it by doing something along these lines:
>
> [ drivers/ide/ide-dma.c ]
>
> ide_toggle_bounce()
> {
> 	...
>
> +	addr = BLK_BOUNCE_ISA;
> 	blk_queue_bounce_limit(&drive->queue, addr);
> }
>
> That's a pseudo-diff; just add the addr = line. Then compare performance
> with and without your scheduler changes.

I fail to understand how the scheduler code can influence this;
there's basically nothing inside blk_queue_bounce_limit().
I made this patch for Andrea; it's the 2.5.2 scheduler code ported to
2.4.18-pre1. Could someone give it a try on old 486s?




- Davide





diff -Nru linux-2.4.18-pre1.vanilla/arch/alpha/kernel/process.c linux-2.4.18-pre1.tsss/arch/alpha/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/alpha/kernel/process.c	Sun Sep 30 12:26:08 2001
+++ linux-2.4.18-pre1.tsss/arch/alpha/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -75,7 +75,6 @@
 {
 	/* An endless idle loop with no priority at all.  */
 	current->nice = 20;
-	current->counter = -100;

 	while (1) {
 		/* FIXME -- EV6 and LCA45 know how to power down
diff -Nru linux-2.4.18-pre1.vanilla/arch/arm/kernel/process.c linux-2.4.18-pre1.tsss/arch/arm/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/arm/kernel/process.c	Sun Sep 30 12:26:08 2001
+++ linux-2.4.18-pre1.tsss/arch/arm/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -85,7 +85,6 @@
 	/* endless idle loop with no priority at all */
 	init_idle();
 	current->nice = 20;
-	current->counter = -100;

 	while (1) {
 		void (*idle)(void) = pm_idle;
diff -Nru linux-2.4.18-pre1.vanilla/arch/cris/kernel/process.c linux-2.4.18-pre1.tsss/arch/cris/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/cris/kernel/process.c	Fri Nov  9 13:58:02 2001
+++ linux-2.4.18-pre1.tsss/arch/cris/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -119,7 +119,6 @@
 int cpu_idle(void *unused)
 {
 	while(1) {
-		current->counter = -100;
 		schedule();
 	}
 }
diff -Nru linux-2.4.18-pre1.vanilla/arch/i386/kernel/process.c linux-2.4.18-pre1.tsss/arch/i386/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/i386/kernel/process.c	Thu Oct  4 18:42:54 2001
+++ linux-2.4.18-pre1.tsss/arch/i386/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -125,7 +125,6 @@
 	/* endless idle loop with no priority at all */
 	init_idle();
 	current->nice = 20;
-	current->counter = -100;

 	while (1) {
 		void (*idle)(void) = pm_idle;
diff -Nru linux-2.4.18-pre1.vanilla/arch/ia64/kernel/process.c linux-2.4.18-pre1.tsss/arch/ia64/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/ia64/kernel/process.c	Fri Nov  9 14:26:17 2001
+++ linux-2.4.18-pre1.tsss/arch/ia64/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -114,8 +114,6 @@
 	/* endless idle loop with no priority at all */
 	init_idle();
 	current->nice = 20;
-	current->counter = -100;
-

 	while (1) {
 #ifdef CONFIG_SMP
diff -Nru linux-2.4.18-pre1.vanilla/arch/m68k/kernel/process.c linux-2.4.18-pre1.tsss/arch/m68k/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/m68k/kernel/process.c	Sun Sep 30 12:26:08 2001
+++ linux-2.4.18-pre1.tsss/arch/m68k/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -81,7 +81,6 @@
 	/* endless idle loop with no priority at all */
 	init_idle();
 	current->nice = 20;
-	current->counter = -100;
 	idle();
 }

diff -Nru linux-2.4.18-pre1.vanilla/arch/mips/kernel/process.c linux-2.4.18-pre1.tsss/arch/mips/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/mips/kernel/process.c	Sun Sep  9 10:43:01 2001
+++ linux-2.4.18-pre1.tsss/arch/mips/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -36,7 +36,6 @@
 {
 	/* endless idle loop with no priority at all */
 	current->nice = 20;
-	current->counter = -100;
 	init_idle();

 	while (1) {
diff -Nru linux-2.4.18-pre1.vanilla/arch/mips64/kernel/process.c linux-2.4.18-pre1.tsss/arch/mips64/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/mips64/kernel/process.c	Fri Feb  9 11:29:44 2001
+++ linux-2.4.18-pre1.tsss/arch/mips64/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -34,7 +34,6 @@
 	/* endless idle loop with no priority at all */
 	init_idle();
 	current->nice = 20;
-	current->counter = -100;
 	while (1) {
 		while (!current->need_resched)
 			if (wait_available)
diff -Nru linux-2.4.18-pre1.vanilla/arch/parisc/kernel/process.c linux-2.4.18-pre1.tsss/arch/parisc/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/parisc/kernel/process.c	Fri Feb  9 11:29:44 2001
+++ linux-2.4.18-pre1.tsss/arch/parisc/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -71,7 +71,6 @@
 	/* endless idle loop with no priority at all */
 	init_idle();
 	current->nice = 20;
-	current->counter = -100;

 	while (1) {
 		while (!current->need_resched) {
diff -Nru linux-2.4.18-pre1.vanilla/arch/ppc/8260_io/uart.c linux-2.4.18-pre1.tsss/arch/ppc/8260_io/uart.c
--- linux-2.4.18-pre1.vanilla/arch/ppc/8260_io/uart.c	Sat Jan  5 19:34:50 2002
+++ linux-2.4.18-pre1.tsss/arch/ppc/8260_io/uart.c	Sat Jan  5 19:38:57 2002
@@ -1732,7 +1732,7 @@
 		printk("lsr = %d (jiff=%lu)...", lsr, jiffies);
 #endif
 		current->state = TASK_INTERRUPTIBLE;
-/*		current->counter = 0;	 make us low-priority */
+/*		current->dyn_prio = 0;	 make us low-priority */
 		schedule_timeout(char_time);
 		if (signal_pending(current))
 			break;
diff -Nru linux-2.4.18-pre1.vanilla/arch/ppc/8xx_io/uart.c linux-2.4.18-pre1.tsss/arch/ppc/8xx_io/uart.c
--- linux-2.4.18-pre1.vanilla/arch/ppc/8xx_io/uart.c	Sat Jan  5 19:34:50 2002
+++ linux-2.4.18-pre1.tsss/arch/ppc/8xx_io/uart.c	Sat Jan  5 19:38:57 2002
@@ -1798,7 +1798,7 @@
 		printk("lsr = %d (jiff=%lu)...", lsr, jiffies);
 #endif
 		current->state = TASK_INTERRUPTIBLE;
-/*		current->counter = 0;	 make us low-priority */
+/*		current->dyn_prio = 0;	 make us low-priority */
 		schedule_timeout(char_time);
 		if (signal_pending(current))
 			break;
diff -Nru linux-2.4.18-pre1.vanilla/arch/ppc/kernel/idle.c linux-2.4.18-pre1.tsss/arch/ppc/kernel/idle.c
--- linux-2.4.18-pre1.vanilla/arch/ppc/kernel/idle.c	Sat Jan  5 19:34:50 2002
+++ linux-2.4.18-pre1.tsss/arch/ppc/kernel/idle.c	Sat Jan  5 19:38:57 2002
@@ -54,7 +54,6 @@

 	/* endless loop with no priority at all */
 	current->nice = 20;
-	current->counter = -100;
 	init_idle();
 	for (;;) {
 #ifdef CONFIG_SMP
diff -Nru linux-2.4.18-pre1.vanilla/arch/s390/kernel/process.c linux-2.4.18-pre1.tsss/arch/s390/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/s390/kernel/process.c	Sat Jan  5 19:34:51 2002
+++ linux-2.4.18-pre1.tsss/arch/s390/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -57,7 +57,6 @@
 	/* endless idle loop with no priority at all */
         init_idle();
 	current->nice = 20;
-	current->counter = -100;
 	wait_psw.mask = _WAIT_PSW_MASK;
 	wait_psw.addr = (unsigned long) &&idle_wakeup | 0x80000000L;
 	while(1) {
diff -Nru linux-2.4.18-pre1.vanilla/arch/s390x/kernel/process.c linux-2.4.18-pre1.tsss/arch/s390x/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/s390x/kernel/process.c	Sat Jan  5 19:34:51 2002
+++ linux-2.4.18-pre1.tsss/arch/s390x/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -57,7 +57,6 @@
 	/* endless idle loop with no priority at all */
         init_idle();
 	current->nice = 20;
-	current->counter = -100;
 	wait_psw.mask = _WAIT_PSW_MASK;
 	wait_psw.addr = (unsigned long) &&idle_wakeup;
 	while(1) {
diff -Nru linux-2.4.18-pre1.vanilla/arch/sh/kernel/process.c linux-2.4.18-pre1.tsss/arch/sh/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/sh/kernel/process.c	Mon Oct 15 13:36:48 2001
+++ linux-2.4.18-pre1.tsss/arch/sh/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -41,7 +41,6 @@
 	/* endless idle loop with no priority at all */
 	init_idle();
 	current->nice = 20;
-	current->counter = -100;

 	while (1) {
 		if (hlt_counter) {
diff -Nru linux-2.4.18-pre1.vanilla/arch/sparc/kernel/process.c linux-2.4.18-pre1.tsss/arch/sparc/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/sparc/kernel/process.c	Fri Dec 21 09:41:53 2001
+++ linux-2.4.18-pre1.tsss/arch/sparc/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -61,7 +61,6 @@

 	/* endless idle loop with no priority at all */
 	current->nice = 20;
-	current->counter = -100;
 	init_idle();

 	for (;;) {
@@ -110,7 +109,6 @@
 {
 	/* endless idle loop with no priority at all */
 	current->nice = 20;
-	current->counter = -100;
 	init_idle();

 	while(1) {
diff -Nru linux-2.4.18-pre1.vanilla/arch/sparc64/kernel/process.c linux-2.4.18-pre1.tsss/arch/sparc64/kernel/process.c
--- linux-2.4.18-pre1.vanilla/arch/sparc64/kernel/process.c	Fri Dec 21 09:41:53 2001
+++ linux-2.4.18-pre1.tsss/arch/sparc64/kernel/process.c	Sat Jan  5 19:38:57 2002
@@ -54,7 +54,6 @@

 	/* endless idle loop with no priority at all */
 	current->nice = 20;
-	current->counter = -100;
 	init_idle();

 	for (;;) {
@@ -84,7 +83,6 @@
 int cpu_idle(void)
 {
 	current->nice = 20;
-	current->counter = -100;
 	init_idle();

 	while(1) {
diff -Nru linux-2.4.18-pre1.vanilla/drivers/net/slip.c linux-2.4.18-pre1.tsss/drivers/net/slip.c
--- linux-2.4.18-pre1.vanilla/drivers/net/slip.c	Sun Sep 30 12:26:07 2001
+++ linux-2.4.18-pre1.tsss/drivers/net/slip.c	Sat Jan  5 19:38:57 2002
@@ -1394,7 +1394,7 @@
 		 */
 		do {
 			if (busy) {
-				current->counter = 0;
+				current->time_slice = 0;
 				schedule();
 			}

diff -Nru linux-2.4.18-pre1.vanilla/fs/proc/array.c linux-2.4.18-pre1.tsss/fs/proc/array.c
--- linux-2.4.18-pre1.vanilla/fs/proc/array.c	Thu Oct 11 09:00:01 2001
+++ linux-2.4.18-pre1.tsss/fs/proc/array.c	Sat Jan  5 19:38:57 2002
@@ -335,8 +335,7 @@

 	/* scale priority and nice values from timeslices to -20..20 */
 	/* to make it look like a "normal" Unix priority/nice value  */
-	priority = task->counter;
-	priority = 20 - (priority * 10 + DEF_COUNTER / 2) / DEF_COUNTER;
+	priority = task->dyn_prio;
 	nice = task->nice;

 	read_lock(&tasklist_lock);
diff -Nru linux-2.4.18-pre1.vanilla/include/linux/sched.h linux-2.4.18-pre1.tsss/include/linux/sched.h
--- linux-2.4.18-pre1.vanilla/include/linux/sched.h	Fri Dec 21 09:42:03 2001
+++ linux-2.4.18-pre1.tsss/include/linux/sched.h	Sat Jan  5 19:56:14 2002
@@ -150,6 +150,7 @@
 extern void update_process_times(int user);
 extern void update_one_process(struct task_struct *p, unsigned long user,
 			       unsigned long system, int cpu);
+extern void expire_task(struct task_struct *p);

 #define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
 extern signed long FASTCALL(schedule_timeout(signed long timeout));
@@ -300,7 +301,7 @@
  * all fields in a single cacheline that are needed for
  * the goodness() loop in schedule().
  */
-	long counter;
+	unsigned long dyn_prio;
 	long nice;
 	unsigned long policy;
 	struct mm_struct *mm;
@@ -319,7 +320,9 @@
 	 * that's just fine.)
 	 */
 	struct list_head run_list;
-	unsigned long sleep_time;
+	long time_slice;
+	/* recalculation loop checkpoint */
+	unsigned long rcl_last;

 	struct task_struct *next_task, *prev_task;
 	struct mm_struct *active_mm;
@@ -446,8 +449,9 @@
  */
 #define _STK_LIM	(8*1024*1024)

-#define DEF_COUNTER	(10*HZ/100)	/* 100 ms time slice */
-#define MAX_COUNTER	(20*HZ/100)
+#define MAX_DYNPRIO	40
+#define DEF_TSLICE	(5 * HZ / 100)
+#define MAX_TSLICE	(20 * HZ / 100)
 #define DEF_NICE	(0)


@@ -468,14 +472,16 @@
     addr_limit:		KERNEL_DS,					\
     exec_domain:	&default_exec_domain,				\
     lock_depth:		-1,						\
-    counter:		DEF_COUNTER,					\
+    dyn_prio:		0,					\
     nice:		DEF_NICE,					\
     policy:		SCHED_OTHER,					\
     mm:			NULL,						\
     active_mm:		&init_mm,					\
     cpus_runnable:	-1,						\
     cpus_allowed:	-1,						\
-    run_list:		LIST_HEAD_INIT(tsk.run_list),			\
+    run_list:		{ NULL, NULL },			\
+    rcl_last:		0,					\
+    time_slice:		DEF_TSLICE,					\
     next_task:		&tsk,						\
     prev_task:		&tsk,						\
     p_opptr:		&tsk,						\
@@ -876,7 +882,6 @@
 static inline void del_from_runqueue(struct task_struct * p)
 {
 	nr_running--;
-	p->sleep_time = jiffies;
 	list_del(&p->run_list);
 	p->run_list.next = NULL;
 }
diff -Nru linux-2.4.18-pre1.vanilla/kernel/exit.c linux-2.4.18-pre1.tsss/kernel/exit.c
--- linux-2.4.18-pre1.vanilla/kernel/exit.c	Sat Jan  5 19:34:51 2002
+++ linux-2.4.18-pre1.tsss/kernel/exit.c	Sat Jan  5 19:38:57 2002
@@ -62,9 +62,9 @@
 		 * timeslices, because any timeslice recovered here
 		 * was given away by the parent in the first place.)
 		 */
-		current->counter += p->counter;
-		if (current->counter >= MAX_COUNTER)
-			current->counter = MAX_COUNTER;
+		current->time_slice += p->time_slice;
+		if (current->time_slice > MAX_TSLICE)
+			current->time_slice = MAX_TSLICE;
 		p->pid = 0;
 		free_task_struct(p);
 	} else {
diff -Nru linux-2.4.18-pre1.vanilla/kernel/fork.c linux-2.4.18-pre1.tsss/kernel/fork.c
--- linux-2.4.18-pre1.vanilla/kernel/fork.c	Wed Nov 21 10:18:42 2001
+++ linux-2.4.18-pre1.tsss/kernel/fork.c	Sat Jan  5 19:38:57 2002
@@ -682,9 +682,9 @@
 	 * more scheduling fairness. This is only important in the first
 	 * timeslice, on the long run the scheduling behaviour is unchanged.
 	 */
-	p->counter = (current->counter + 1) >> 1;
-	current->counter >>= 1;
-	if (!current->counter)
+	p->time_slice = (current->time_slice + 1) >> 1;
+	current->time_slice >>= 1;
+	if (!current->time_slice)
 		current->need_resched = 1;

 	/*
diff -Nru linux-2.4.18-pre1.vanilla/kernel/sched.c linux-2.4.18-pre1.tsss/kernel/sched.c
--- linux-2.4.18-pre1.vanilla/kernel/sched.c	Fri Dec 21 09:42:04 2001
+++ linux-2.4.18-pre1.tsss/kernel/sched.c	Sat Jan  5 19:52:29 2002
@@ -51,24 +51,16 @@
  * NOTE! The unix "nice" value influences how long a process
  * gets. The nice value ranges from -20 to +19, where a -20
  * is a "high-priority" task, and a "+10" is a low-priority
- * task.
- *
- * We want the time-slice to be around 50ms or so, so this
- * calculation depends on the value of HZ.
+ * task. The default time slice for zero-nice tasks will be 37ms.
  */
-#if HZ < 200
-#define TICK_SCALE(x)	((x) >> 2)
-#elif HZ < 400
-#define TICK_SCALE(x)	((x) >> 1)
-#elif HZ < 800
-#define TICK_SCALE(x)	(x)
-#elif HZ < 1600
-#define TICK_SCALE(x)	((x) << 1)
-#else
-#define TICK_SCALE(x)	((x) << 2)
-#endif
+#define NICE_RANGE	40
+#define MIN_NICE_TSLICE	10000
+#define MAX_NICE_TSLICE	90000
+#define TASK_TIMESLICE(p)	((int) ts_table[19 - (p)->nice])
+
+static unsigned char ts_table[NICE_RANGE];

-#define NICE_TO_TICKS(nice)	(TICK_SCALE(20-(nice))+1)
+#define MM_AFFINITY_BONUS	1


 /*
@@ -94,6 +86,8 @@

 static LIST_HEAD(runqueue_head);

+static unsigned long rcl_curr = 0;
+
 /*
  * We align per-CPU scheduling data on cacheline boundaries,
  * to prevent cacheline ping-pong.
@@ -165,10 +159,11 @@
 		 * Don't do any other calculations if the time slice is
 		 * over..
 		 */
-		weight = p->counter;
-		if (!weight)
-			goto out;
-
+		if (!p->time_slice)
+			return 0;
+
+		weight = p->dyn_prio + 1;
+
 #ifdef CONFIG_SMP
 		/* Give a largish advantage to the same processor...   */
 		/* (this is equivalent to penalizing other processors) */
@@ -178,7 +173,7 @@

 		/* .. and a slight advantage to the current MM */
 		if (p->mm == this_mm || !p->mm)
-			weight += 1;
+			weight += MM_AFFINITY_BONUS;
 		weight += 20 - p->nice;
 		goto out;
 	}
@@ -324,6 +319,9 @@
  */
 static inline void add_to_runqueue(struct task_struct * p)
 {
+	p->dyn_prio += rcl_curr - p->rcl_last;
+	p->rcl_last = rcl_curr;
+	if (p->dyn_prio > MAX_DYNPRIO) p->dyn_prio = MAX_DYNPRIO;
 	list_add(&p->run_list, &runqueue_head);
 	nr_running++;
 }
@@ -536,6 +534,19 @@
 	__schedule_tail(prev);
 }

+void expire_task(struct task_struct *p)
+{
+	if (unlikely(!p->time_slice))
+		goto need_resched;
+
+	if (!--p->time_slice) {
+		if (p->dyn_prio)
+			p->dyn_prio--;
+	need_resched:
+		p->need_resched = 1;
+	}
+}
+
 /*
  *  'schedule()' is the scheduler function. It's a very simple and nice
  * scheduler: it's not perfect, but certainly works for most things.
@@ -578,20 +589,20 @@

 	/* move an exhausted RR process to be last.. */
 	if (unlikely(prev->policy == SCHED_RR))
-		if (!prev->counter) {
-			prev->counter = NICE_TO_TICKS(prev->nice);
+		if (!prev->time_slice) {
+			prev->time_slice = TASK_TIMESLICE(prev);
 			move_last_runqueue(prev);
 		}

 	switch (prev->state) {
-		case TASK_INTERRUPTIBLE:
-			if (signal_pending(prev)) {
-				prev->state = TASK_RUNNING;
-				break;
-			}
-		default:
-			del_from_runqueue(prev);
-		case TASK_RUNNING:;
+	case TASK_INTERRUPTIBLE:
+		if (signal_pending(prev)) {
+			prev->state = TASK_RUNNING;
+			break;
+		}
+	default:
+		del_from_runqueue(prev);
+	case TASK_RUNNING:;
 	}
 	prev->need_resched = 0;

@@ -616,14 +627,12 @@

 	/* Do we need to re-calculate counters? */
 	if (unlikely(!c)) {
-		struct task_struct *p;
-
-		spin_unlock_irq(&runqueue_lock);
-		read_lock(&tasklist_lock);
-		for_each_task(p)
-			p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
-		read_unlock(&tasklist_lock);
-		spin_lock_irq(&runqueue_lock);
+		++rcl_curr;
+		list_for_each(tmp, &runqueue_head) {
+			p = list_entry(tmp, struct task_struct, run_list);
+			p->time_slice = TASK_TIMESLICE(p);
+			p->rcl_last = rcl_curr;
+		}
 		goto repeat_schedule;
 	}

@@ -1056,17 +1065,17 @@
 	nr_pending--;
 #endif
 	if (nr_pending) {
+		struct task_struct *ctsk = current;
 		/*
 		 * This process can only be rescheduled by us,
 		 * so this is safe without any locking.
 		 */
-		if (current->policy == SCHED_OTHER)
-			current->policy |= SCHED_YIELD;
-		current->need_resched = 1;
-
-		spin_lock_irq(&runqueue_lock);
-		move_last_runqueue(current);
-		spin_unlock_irq(&runqueue_lock);
+		if (ctsk->policy == SCHED_OTHER)
+			ctsk->policy |= SCHED_YIELD;
+		ctsk->need_resched = 1;
+
+		ctsk->time_slice = 0;
+		++ctsk->dyn_prio;
 	}
 	return 0;
 }
@@ -1115,7 +1124,7 @@
 	read_lock(&tasklist_lock);
 	p = find_process_by_pid(pid);
 	if (p)
-		jiffies_to_timespec(p->policy & SCHED_FIFO ? 0 : NICE_TO_TICKS(p->nice),
+		jiffies_to_timespec(p->policy & SCHED_FIFO ? 0 : TASK_TIMESLICE(p),
 				    &t);
 	read_unlock(&tasklist_lock);
 	if (p)
@@ -1306,9 +1315,10 @@

 	if (current != &init_task && task_on_runqueue(current)) {
 		printk("UGH! (%d:%d) was on the runqueue, removing.\n",
-			smp_processor_id(), current->pid);
+			   smp_processor_id(), current->pid);
 		del_from_runqueue(current);
 	}
+	current->dyn_prio = 0;
 	sched_data->curr = current;
 	sched_data->last_schedule = get_cycles();
 	clear_bit(current->processor, &wait_init_idle);
@@ -1316,6 +1326,18 @@

 extern void init_timervecs (void);

+static void fill_tslice_map(void)
+{
+	int i;
+
+	for (i = 0; i < NICE_RANGE; i++) {
+		ts_table[i] = ((MIN_NICE_TSLICE +
+						((MAX_NICE_TSLICE -
+						  MIN_NICE_TSLICE) / (NICE_RANGE - 1)) * i) * HZ) / 1000000;
+		if (!ts_table[i]) ts_table[i] = 1;
+	}
+}
+
 void __init sched_init(void)
 {
 	/*
@@ -1329,6 +1351,8 @@

 	for(nr = 0; nr < PIDHASH_SZ; nr++)
 		pidhash[nr] = NULL;
+
+	fill_tslice_map();

 	init_timervecs();

diff -Nru linux-2.4.18-pre1.vanilla/kernel/timer.c linux-2.4.18-pre1.tsss/kernel/timer.c
--- linux-2.4.18-pre1.vanilla/kernel/timer.c	Mon Oct  8 10:41:41 2001
+++ linux-2.4.18-pre1.tsss/kernel/timer.c	Sat Jan  5 19:38:57 2002
@@ -583,10 +583,7 @@

 	update_one_process(p, user_tick, system, cpu);
 	if (p->pid) {
-		if (--p->counter <= 0) {
-			p->counter = 0;
-			p->need_resched = 1;
-		}
+		expire_task(p);
 		if (p->nice > 0)
 			kstat.per_cpu_nice[cpu] += user_tick;
 		else
diff -Nru linux-2.4.18-pre1.vanilla/mm/oom_kill.c linux-2.4.18-pre1.tsss/mm/oom_kill.c
--- linux-2.4.18-pre1.vanilla/mm/oom_kill.c	Sat Nov  3 17:05:25 2001
+++ linux-2.4.18-pre1.tsss/mm/oom_kill.c	Sat Jan  5 19:38:57 2002
@@ -149,7 +149,8 @@
 	 * all the memory it needs. That way it should be able to
 	 * exit() and clear out its resources quickly...
 	 */
-	p->counter = 5 * HZ;
+	p->time_slice = 2 * MAX_TSLICE;
+	p->dyn_prio = MAX_DYNPRIO + 1;
 	p->flags |= PF_MEMALLOC | PF_MEMDIE;

 	/* This process has hardware access, be more careful. */




* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
@ 2002-01-07  1:33 Mikael Pettersson
  2002-01-07  2:36 ` Davide Libenzi
  0 siblings, 1 reply; 20+ messages in thread
From: Mikael Pettersson @ 2002-01-07  1:33 UTC (permalink / raw)
  To: davidel; +Cc: axboe, linux-kernel, mjh, torvalds

On Sun, 6 Jan 2002 15:59:05 -0800 (PST), Davide Libenzi wrote:
>I made this patch for Andrea and it's the scheduler code for 2.4.18-pre1
>Could someone give it a try on old 486s

Done. On my '93 vintage 486, 2.4.18p1 + your scheduler results in very
bursty I/O and poor performance, just like I reported for 2.5.2-pre7.

/Mikael


* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-06 23:59       ` [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 ) Davide Libenzi
@ 2002-01-07  1:38         ` Andrea Arcangeli
  2002-01-07 14:35           ` J.A. Magallon
  2002-01-07  7:32         ` Jens Axboe
  1 sibling, 1 reply; 20+ messages in thread
From: Andrea Arcangeli @ 2002-01-07  1:38 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Jens Axboe, Matthias Hanisch, Mikael Pettersson, Linus Torvalds,
	lkml

On Sun, Jan 06, 2002 at 03:59:05PM -0800, Davide Libenzi wrote:
> On Sun, 6 Jan 2002, Jens Axboe wrote:
> 
> > On Sat, Jan 05 2002, Davide Libenzi wrote:
> > > > > (*) 100MHz 486DX4, 28MB ram, no L2 cache, two old and slow IDE disks,
> > > > > small custom no-nonsense RedHat 7.2, kernels compiled with gcc 2.95.3.
> > > >
> > > > Is this ISA (maybe it has something to do with ISA bouncing)? Mine is:
> > > >
> > > > 486 DX/2 ISA, Adaptec 1542, two slow scsi disks and a self-made
> > > > slackware-based system.
> > > >
> > > > Can you also backout the scheduler changes to verify this? I have a
> > > > backout patch for 2.5.2-pre6, if you don't want to do this for yourself.
> > >
> > > There must be some part of the kernel that assumes a certain scheduler
> > > behavior. There was a guy who reported bad  hdparm  performance, and I
> > > tried it. While running  hdparm -t  my system has a context-switch rate
> > > of 20-30 and an irq load of about 100-110.
> > > The scheduler itself, even if you code it in Visual Basic, cannot cause
> > > this with such loads.
> > > Did you try to profile the kernel?
> >
> > Davide,
> >
> > If this is caused by ISA bounce problems, then you should be able to
> > reproduce by doing something ala
> >
> > [ drivers/ide/ide-dma.c ]
> >
> > ide_toggle_bounce()
> > {
> > 	...
> >
> > +	addr = BLK_BOUNCE_ISA;
> > 	blk_queue_bounce_limit(&drive->queue, addr);
> > }
> >
> > pseudo-diff, just add the addr = line. Now compare performance with and
> > without your scheduler changes.
> 
> I fail to understand where the scheduler code can influence this.
> There's basically nothing inside blk_queue_bounce_limit()
> I made this patch for Andrea and it's the scheduler code for 2.4.18-pre1
> Could someone give it a try on old 486s

yes please (feel free to CC me on the answers), I'd really like to
reduce the scheduler O(N) overhead to the number of the running tasks,
rather than doing the recalculate all over the processes in the machine.
O(1) scheduler would be even better of course, but the below would
ensure not to hurt the 1 task running case, and it's way simpler to
check for correctness (so it's easier to include it as a start).

Andrea
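Andrea's point can be sketched in a few lines of userspace C. This is a minimal illustration under simplifying assumptions (flat arrays instead of the kernel's task lists, and a made-up refill formula), not the real 2.4 code: the recalculate pass walks every task in the system, while a runqueue-only pass touches just the runnable ones.

```c
/* Sketch of the O(N) recalculate problem: 2.4's schedule() refills
 * the timeslice of EVERY task in the system (for_each_task) once all
 * runnable counters hit zero, even when only one task is runnable.
 * Arrays and the refill formula below are simplified stand-ins. */
#include <assert.h>

struct task {
    int counter;   /* remaining timeslice ticks */
    int nice;
};

/* refill one task's timeslice (simplified 2.4-style decay formula) */
void refill(struct task *p)
{
    p->counter = (p->counter >> 1) + (20 - p->nice);
}

/* O(total tasks): walk the whole task list, as 2.4 does today */
int recalc_all(struct task *tasks, int ntasks)
{
    int visited = 0;
    for (int i = 0; i < ntasks; i++, visited++)
        refill(&tasks[i]);
    return visited;
}

/* O(runnable tasks): walk only the runqueue, as Andrea suggests */
int recalc_runnable(struct task **runqueue, int nrunnable)
{
    int visited = 0;
    for (int i = 0; i < nrunnable; i++, visited++)
        refill(runqueue[i]);
    return visited;
}
```

With 2000 mostly sleeping tasks and one runnable, the first version does 2000 refills per recalculation and the second does one, which is exactly the overhead reduction Andrea asks for.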


* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-07  1:33 [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 ) Mikael Pettersson
@ 2002-01-07  2:36 ` Davide Libenzi
  2002-01-07  7:33   ` Jens Axboe
  0 siblings, 1 reply; 20+ messages in thread
From: Davide Libenzi @ 2002-01-07  2:36 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: Jens Axboe, lkml, mjh, Linus Torvalds

On Mon, 7 Jan 2002, Mikael Pettersson wrote:

> On Sun, 6 Jan 2002 15:59:05 -0800 (PST), Davide Libenzi wrote:
> >I made this patch for Andrea and it's the scheduler code for 2.4.18-pre1
> >Could someone give it a try on old 486s
>
> Done. On my '93 vintage 486, 2.4.18p1 + your scheduler results in very
> bursty I/O and poor performance, just like I reported for 2.5.2-pre7.

Can you try some changes that I'll tell you about?




- Davide




* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-05 23:10   ` Davide Libenzi
  2002-01-06 10:21     ` Jens Axboe
@ 2002-01-07  7:22     ` Matthias Hanisch
  2002-01-07 16:43       ` Linus Torvalds
  2002-01-07 18:31       ` Davide Libenzi
  1 sibling, 2 replies; 20+ messages in thread
From: Matthias Hanisch @ 2002-01-07  7:22 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Matthias Hanisch, Mikael Pettersson, axboe, Linus Torvalds, lkml

On Sat, 5 Jan 2002, Davide Libenzi wrote:

> There must be some part of the kernel that assumes a certain scheduler
> behavior. There was a guy who reported bad  hdparm  performance, and I
> tried it. While running  hdparm -t  my system has a context-switch rate
> of 20-30 and an irq load of about 100-110.

This guy was me, I believe (just with my office email address :).


> The scheduler itself, even if you code it in Visual Basic, cannot cause
> this with such loads.
> Did you try to profile the kernel?

To answer your question, I wanted to profile 2.5.2-pre8 against
2.5.2-pre8-old-scheduler. _Fortunately_ I made a mistake and forgot to
back out the following chunk of the patch.

--- v2.5.1/linux/arch/i386/kernel/process.c     Thu Oct  4 18:42:54 2001
+++ linux/arch/i386/kernel/process.c    Thu Dec 27 08:21:28 2001
@@ -125,7 +125,6 @@
        /* endless idle loop with no priority at all */
        init_idle();
        current->nice = 20;
-       current->counter = -100;
 
        while (1) {
                void (*idle)(void) = pm_idle;

So it seems that removing this line from the kernel sources, even with the
old scheduler, causes this unresponsive behavior. This chunk also looks a
little strange: in most (all?) of the other chunks, "counter" got replaced
with "dyn_prio", not removed completely.

I'll verify this tonight (have to earn some money first :). I'll also do
some profiling.

Mikael, if you have time, maybe you can try applying only this chunk of
the patch (or only removing the line) to a clean 2.4.18-pre1 and report
the behavior.


Davide, regarding your question in the other mail:

> Can you try some changes that I'll tell you about?

Please forward them to me as well. Sometimes it takes me a little longer,
because there is also life outside LKML, but I want to get this understood
and fixed, so I'll try to help you as much as I can.


Regards,
	Matze




* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-06 23:59       ` [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 ) Davide Libenzi
  2002-01-07  1:38         ` Andrea Arcangeli
@ 2002-01-07  7:32         ` Jens Axboe
  2002-01-07 18:10           ` Davide Libenzi
  1 sibling, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2002-01-07  7:32 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Matthias Hanisch, Mikael Pettersson, Linus Torvalds, lkml

On Sun, Jan 06 2002, Davide Libenzi wrote:
> > Davide,
> >
> > If this is caused by ISA bounce problems, then you should be able to
> > reproduce by doing something ala
> >
> > [ drivers/ide/ide-dma.c ]
> >
> > ide_toggle_bounce()
> > {
> > 	...
> >
> > +	addr = BLK_BOUNCE_ISA;
> > 	blk_queue_bounce_limit(&drive->queue, addr);
> > }
> >
> > pseudo-diff, just add the addr = line. Now compare performance with and
> > without your scheduler changes.
> 
> I fail to understand where the scheduler code can influence this.
> There's basically nothing inside blk_queue_bounce_limit()

Eh, of course not; no time will be spent inside blk_queue_bounce_limit(). I
don't think you looked very long at this :-)

The point is that ISA bouncing will spend some time scheduling, waiting
for memory to become available in the __GFP_DMA zone.

-- 
Jens Axboe
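Jens's scenario can be illustrated with a toy model (all names and numbers here are illustrative, not the real block-layer or mm code): any buffer above the 16MB ISA DMA limit must be bounced through a page from the small __GFP_DMA zone, and when that zone runs dry the allocator sleeps until something is freed.

```c
/* Toy model of ISA bouncing: pages above the 16MB ISA DMA limit get
 * copied through a bounce page from the tiny __GFP_DMA zone, and an
 * empty zone means sleeping (scheduling) until a page is freed.
 * Illustrative only; not the real block-layer or mm code. */
#include <assert.h>

#define ISA_DMA_LIMIT (16ULL * 1024 * 1024)   /* top of the ISA DMA zone */

/* does a buffer at this physical address need a bounce page? */
int needs_bounce(unsigned long long phys_addr)
{
    return phys_addr >= ISA_DMA_LIMIT;
}

/* simulate grabbing a bounce page: spin in "schedule()" while the
 * zone is empty, pretending reclaim frees a page after three passes */
int alloc_bounce_page(int *dma_free_pages)
{
    int schedules = 0;
    while (*dma_free_pages == 0) {
        schedules++;              /* stands in for schedule() */
        if (schedules >= 3)
            *dma_free_pages = 1;  /* pretend kswapd freed a page */
    }
    (*dma_free_pages)--;
    return schedules;             /* time lost waiting for DMA memory */
}
```

On a 28MB box every page between 16MB and 28MB would take the needs_bounce() path, so a DMA-zone shortage would show up as scheduling stalls; whether that is what the 486 actually hit is exactly what the suggested experiment is meant to test.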



* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-07  2:36 ` Davide Libenzi
@ 2002-01-07  7:33   ` Jens Axboe
  2002-01-07 18:12     ` Davide Libenzi
  0 siblings, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2002-01-07  7:33 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Mikael Pettersson, lkml, mjh, Linus Torvalds

On Sun, Jan 06 2002, Davide Libenzi wrote:
> On Mon, 7 Jan 2002, Mikael Pettersson wrote:
> 
> > On Sun, 6 Jan 2002 15:59:05 -0800 (PST), Davide Libenzi wrote:
> > >I made this patch for Andrea and it's the scheduler code for 2.4.18-pre1
> > >Could someone give it a try on old 486s
> >
> > Done. On my '93 vintage 486, 2.4.18p1 + your scheduler results in very
> > bursty I/O and poor performance, just like I reported for 2.5.2-pre7.
> 
> Can you try some changes that I'll tell you about?

Did you _try_ the ISA bounce trick to reproduce locally?

-- 
Jens Axboe



* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-07  1:38         ` Andrea Arcangeli
@ 2002-01-07 14:35           ` J.A. Magallon
  2002-01-07 14:37             ` Andrea Arcangeli
  0 siblings, 1 reply; 20+ messages in thread
From: J.A. Magallon @ 2002-01-07 14:35 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Davide Libenzi, Jens Axboe, Matthias Hanisch, Mikael Pettersson,
	Linus Torvalds, lkml


On 20020107 Andrea Arcangeli wrote:
>
>yes please (feel free to CC me on the answers), I'd really like to
>reduce the scheduler O(N) overhead to the number of the running tasks,
>rather than doing the recalculate all over the processes in the machine.
>O(1) scheduler would be even better of course, but the below would
>ensure not to hurt the 1 task running case, and it's way simpler to
>check for correctness (so it's easier to include it as a start).
>

It looks like you all are going to turn the scheduler upside-down.
Hmm, as a non-kernel-hacker observer from the outside world, could I
make a suggestion?
Would it be easy to split the thing into steps:
- Move from a single queue to per-CPU queues, keeping just the same
  algorithm that is running now for per-queue scheduling.
- Get that running for 2.4.18 and 2.5.2.
- Then start to play with the per-queue scheduling algorithm:
	* better O(n)
	* O(1)
	* O(1) with different queues for RT and non-RT
	etc...

Is it easy enough, or are the two steps so related that they cannot be
split?

Thanks.

(a Linux user who tries experimental kernels and has seen them grow
like mushrooms in recent weeks...)

-- 
J.A. Magallon                           #  Let the source be with you...        
mailto:jamagallon@able.es
Mandrake Linux release 8.2 (Cooker) for i586
Linux werewolf 2.4.18-pre1-beo #1 SMP Fri Jan 4 02:25:59 CET 2002 i686


* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-07 14:35           ` J.A. Magallon
@ 2002-01-07 14:37             ` Andrea Arcangeli
  0 siblings, 0 replies; 20+ messages in thread
From: Andrea Arcangeli @ 2002-01-07 14:37 UTC (permalink / raw)
  To: J.A. Magallon
  Cc: Davide Libenzi, Jens Axboe, Matthias Hanisch, Mikael Pettersson,
	Linus Torvalds, lkml

On Mon, Jan 07, 2002 at 03:35:33PM +0100, J.A. Magallon wrote:
> 
> On 20020107 Andrea Arcangeli wrote:
> >
> >yes please (feel free to CC me on the answers), I'd really like to
> >reduce the scheduler O(N) overhead to the number of the running tasks,
> >rather than doing the recalculate all over the processes in the machine.
> >O(1) scheduler would be even better of course, but the below would
> >ensure not to hurt the 1 task running case, and it's way simpler to
> >check for correctness (so it's easier to include it as a start).
> >
> 
> It looks like you all are going to turn the scheduler upside-down.
> Hmm, as a non-kernel-hacker observer from the world outside, could I
> make a suggestion ?
> Is it easy to split the thing in steps:
> - Move from single-queue to per-cpu-queue, with just the same algorithm
>   that is running now for per-queue scheduling.

I don't care much about SMP (I don't think the scheduler's SMP scalability
is bad enough to require this change in 2.4); I'd only like a UP (or SMP
as well, of course) box not to walk a linked list of 2k tasks during a
reschedule when only 1 task is running all the time.

> - Get that running for 2.4.18 and 2.5.2
> - Then start to play with the per-queue scheduling algorithm:
> 	* better O(n)
> 	* O(1)
> 	* O(1) with different queues for RT and non RT
> 	etc...
> 
> Is it easy enough or are both steps so related that can not be split ?
> 
> Thanks.
> 
> (a linux user that tries experimental kernels and is seeing them grow
> like mushrooms in latest weeks...)
> 
> -- 
> J.A. Magallon                           #  Let the source be with you...        
> mailto:jamagallon@able.es
> Mandrake Linux release 8.2 (Cooker) for i586
> Linux werewolf 2.4.18-pre1-beo #1 SMP Fri Jan 4 02:25:59 CET 2002 i686


Andrea


* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-07  7:22     ` 2.5.2-pre performance degradation on an old 486 Matthias Hanisch
@ 2002-01-07 16:43       ` Linus Torvalds
  2002-01-07 18:31       ` Davide Libenzi
  1 sibling, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2002-01-07 16:43 UTC (permalink / raw)
  To: Matthias Hanisch; +Cc: Davide Libenzi, Mikael Pettersson, axboe, lkml


On Mon, 7 Jan 2002, Matthias Hanisch wrote:
>
> To answer your question, I wanted to profile 2.5.2-pre8 against
> 2.5.2-pre8-old-scheduler. _Fortunately_ I made a mistake and forgot to
> back out the following chunk of the patch.
>
> --- v2.5.1/linux/arch/i386/kernel/process.c     Thu Oct  4 18:42:54 2001
> +++ linux/arch/i386/kernel/process.c    Thu Dec 27 08:21:28 2001
> @@ -125,7 +125,6 @@
>         /* endless idle loop with no priority at all */
>         init_idle();
>         current->nice = 20;
> -       current->counter = -100;
>
>         while (1) {
>                 void (*idle)(void) = pm_idle;

Hey, that would do it. It looks like the idle task ends up being a
_normal_ process (just nice'd down), so it will get real CPU time instead
of only getting scheduled when nothing else is runnable.

Davide, I think the bounce buffer is a red herring; it's simply that we're
wasting time in idle.

		Linus
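Linus's explanation can be made concrete with a simplified goodness() (the real 2.4 function also adds memory-map and CPU-affinity bonuses, so this is a sketch, not the kernel's code): parking the idle task at counter = -100 keeps its weight permanently negative, so any runnable task wins; without that line, idle gets refilled like an ordinary niced process and starts winning real CPU time.

```c
/* Simplified goodness(): a task's scheduling weight is roughly its
 * remaining counter plus a static-priority bonus.  The idle task is
 * pinned at counter = -100 precisely so this weight stays negative
 * and idle only runs when nothing else is runnable.  Sketch only;
 * the real 2.4 goodness() has extra mm/CPU bonuses. */
#include <assert.h>

int goodness(int counter, int nice)
{
    if (counter == 0)
        return 0;                 /* timeslice exhausted */
    return counter + 20 - nice;   /* simplified weight */
}
```

With counter = -100 and nice = 20, idle weighs in at -100 and loses to any task with even one tick left; drop the -100 and a refilled idle task weighs the same as other nice-20 work, which matches the pauses seen on the 486.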



* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-07  7:32         ` Jens Axboe
@ 2002-01-07 18:10           ` Davide Libenzi
  0 siblings, 0 replies; 20+ messages in thread
From: Davide Libenzi @ 2002-01-07 18:10 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Matthias Hanisch, Mikael Pettersson, Linus Torvalds, lkml

On Mon, 7 Jan 2002, Jens Axboe wrote:

> On Sun, Jan 06 2002, Davide Libenzi wrote:
> > > Davide,
> > >
> > > If this is caused by ISA bounce problems, then you should be able to
> > > reproduce by doing something ala
> > >
> > > [ drivers/ide/ide-dma.c ]
> > >
> > > ide_toggle_bounce()
> > > {
> > > 	...
> > >
> > > +	addr = BLK_BOUNCE_ISA;
> > > 	blk_queue_bounce_limit(&drive->queue, addr);
> > > }
> > >
> > > pseudo-diff, just add the addr = line. Now compare performance with and
> > > without your scheduler changes.
> >
> > I fail to understand where the scheduler code can influence this.
> > There's basically nothing inside blk_queue_bounce_limit()
>
> Eh of course not, no time will be spent inside blk_queue_bounce_limit. I
> don't think you looked very long at this :-)
>
> The point is that ISA bouncing will spend some time scheduling waiting
> for available memory in the __GFP_DMA zone.

I looked, and I already pointed this out to Linus.
The memory pool creation ends up calling alloc_pages(), and there could
be a race.
I haven't had the time for experiments yet.



- Davide




* Re: [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 )
  2002-01-07  7:33   ` Jens Axboe
@ 2002-01-07 18:12     ` Davide Libenzi
  0 siblings, 0 replies; 20+ messages in thread
From: Davide Libenzi @ 2002-01-07 18:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Mikael Pettersson, lkml, mjh, Linus Torvalds

On Mon, 7 Jan 2002, Jens Axboe wrote:

> On Sun, Jan 06 2002, Davide Libenzi wrote:
> > On Mon, 7 Jan 2002, Mikael Pettersson wrote:
> >
> > > On Sun, 6 Jan 2002 15:59:05 -0800 (PST), Davide Libenzi wrote:
> > > >I made this patch for Andrea and it's the scheduler code for 2.4.18-pre1
> > > >Could someone give it a try on old 486s
> > >
> > > Done. On my '93 vintage 486, 2.4.18p1 + your scheduler results in very
> > > bursty I/O and poor performance, just like I reported for 2.5.2-pre7.
> >
> > Can you try some changes that i'll tell you ?
>
> Did you _try_ the ISA bounce trick to reproduce locally?

I'll try it today, even though I think one of the guys who had problems
already pointed it out.




- Davide




* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-07  7:22     ` 2.5.2-pre performance degradation on an old 486 Matthias Hanisch
  2002-01-07 16:43       ` Linus Torvalds
@ 2002-01-07 18:31       ` Davide Libenzi
  2002-01-07 21:43         ` Matthias Hanisch
  1 sibling, 1 reply; 20+ messages in thread
From: Davide Libenzi @ 2002-01-07 18:31 UTC (permalink / raw)
  To: Matthias Hanisch; +Cc: Mikael Pettersson, Jens Axboe, Linus Torvalds, lkml

On Mon, 7 Jan 2002, Matthias Hanisch wrote:

> On Sat, 5 Jan 2002, Davide Libenzi wrote:
>
> > There must be some part of the kernel that assumes a certain scheduler
> > behavior. There was a guy who reported bad  hdparm  performance, and I
> > tried it. While running  hdparm -t  my system has a context-switch rate
> > of 20-30 and an irq load of about 100-110.
>
> This guy was me, IMHO (just with my office email address :).
>
>
> > The scheduler itself, even if you code it in Visual Basic, cannot cause
> > this with such loads.
> > Did you try to profile the kernel?
>
> To answer your question, I wanted to profile 2.5.2-pre8 against
> 2.5.2-pre8-old-scheduler. _Fortunately_ I made a mistake and forgot to
> back out the following chunk of the patch.
>
> --- v2.5.1/linux/arch/i386/kernel/process.c     Thu Oct  4 18:42:54 2001
> +++ linux/arch/i386/kernel/process.c    Thu Dec 27 08:21:28 2001
> @@ -125,7 +125,6 @@
>         /* endless idle loop with no priority at all */
>         init_idle();
>         current->nice = 20;
> -       current->counter = -100;

In sched.c::init_idle() :

current->dyn_prio = -100;

Let me know.




- Davide




* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-07 18:31       ` Davide Libenzi
@ 2002-01-07 21:43         ` Matthias Hanisch
  2002-01-07 22:17           ` Davide Libenzi
  0 siblings, 1 reply; 20+ messages in thread
From: Matthias Hanisch @ 2002-01-07 21:43 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Matthias Hanisch, Mikael Pettersson, Jens Axboe, Linus Torvalds,
	lkml

On Mon, 7 Jan 2002, Davide Libenzi wrote:

> In sched.c::init_idle() :
> 
> current->dyn_prio = -100;
> 
> Let me know.

Ahem. I already added the same line at the beginning of cpu_idle() in
arch/i386/kernel/process.c, which brought back the old performance. Your
patch should be analogous, but cleaner.

So: Bingo!!!!

I just wonder why only two people with slow machines saw this behavior...

Now 2.5.2 can come :)

Regards,
	Matze




* Re: 2.5.2-pre performance degradation on an old 486
  2002-01-07 21:43         ` Matthias Hanisch
@ 2002-01-07 22:17           ` Davide Libenzi
  0 siblings, 0 replies; 20+ messages in thread
From: Davide Libenzi @ 2002-01-07 22:17 UTC (permalink / raw)
  To: Matthias Hanisch; +Cc: Mikael Pettersson, Jens Axboe, Linus Torvalds, lkml

On Mon, 7 Jan 2002, Matthias Hanisch wrote:

> On Mon, 7 Jan 2002, Davide Libenzi wrote:
>
> > In sched.c::init_idle() :
> >
> > current->dyn_prio = -100;
> >
> > Let me know.
>
> Ahem. I already added the same line at the beginning of cpu_idle() in
> arch/i386/kernel/process.c, which brought back the old performance. Your
> patch should be analogous, but cleaner.
>
> So: Bingo!!!!
>
> I just wonder why only two people with slow machines saw this behavior...
>
> Now 2.5.2 can come :)

The problem is that slow machines show a different dyn_prio distribution.
What happened was that if a process with dyn_prio == 0 was woken up while
the idle task was running, preemption_goodness() failed to kick out the
idle task (also with dyn_prio == 0) because of the strict > 0 check.




- Davide
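The tie can be shown in two lines (the function names mirror the kernel's, but the bodies are reduced to the bare comparison; a sketch, not the actual scheduler code): preemption requires a strictly positive goodness difference, so a woken task whose weight equals the running idle task's never preempts it.

```c
/* The strict ">" tie: a wakeup preempts the current task only when
 * the goodness difference is strictly positive.  If a woken task and
 * the idle task both sit at weight 0, the difference is 0 and idle
 * keeps running until the next timer tick.  Simplified bodies. */
#include <assert.h>

int preemption_goodness(int curr_weight, int woken_weight)
{
    return woken_weight - curr_weight;
}

int should_preempt(int curr_weight, int woken_weight)
{
    return preemption_goodness(curr_weight, woken_weight) > 0; /* strict */
}
```

should_preempt(0, 0) is false, which is exactly the dyn_prio == 0 vs. idle tie described above; keeping idle's weight pinned below any runnable task's (as the removed counter = -100 line did) breaks the tie.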




Thread overview: 20+ messages
2002-01-05  0:51 2.5.2-pre performance degradation on an old 486 Mikael Pettersson
2002-01-05  8:25 ` Matthias Hanisch
2002-01-05 23:10   ` Davide Libenzi
2002-01-06 10:21     ` Jens Axboe
2002-01-06 10:33       ` Andre Hedrick
2002-01-06 23:59       ` [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 ) Davide Libenzi
2002-01-07  1:38         ` Andrea Arcangeli
2002-01-07 14:35           ` J.A. Magallon
2002-01-07 14:37             ` Andrea Arcangeli
2002-01-07  7:32         ` Jens Axboe
2002-01-07 18:10           ` Davide Libenzi
2002-01-07  7:22     ` 2.5.2-pre performance degradation on an old 486 Matthias Hanisch
2002-01-07 16:43       ` Linus Torvalds
2002-01-07 18:31       ` Davide Libenzi
2002-01-07 21:43         ` Matthias Hanisch
2002-01-07 22:17           ` Davide Libenzi
  -- strict thread matches above, loose matches on Subject: below --
2002-01-07  1:33 [patch] 2.5.2 scheduler code for 2.4.18-pre1 ( was 2.5.2-pre performance degradation on an old 486 ) Mikael Pettersson
2002-01-07  2:36 ` Davide Libenzi
2002-01-07  7:33   ` Jens Axboe
2002-01-07 18:12     ` Davide Libenzi
