public inbox for linux-kernel@vger.kernel.org
* [PATCH] 2.5.17 fix for running a SMP kernel on a UP box
@ 2002-05-21 21:52 Greg KH
  2002-05-22  2:36 ` James Cleverdon
  2002-05-22 17:10 ` Jack F. Vogel
  0 siblings, 2 replies; 3+ messages in thread
From: Greg KH @ 2002-05-21 21:52 UTC (permalink / raw)
  To: mingo, linux-kernel


I can't seem to run a SMP 2.5.17 kernel on a UP machine; it locks up
during the boot process.  In talking to Jack Vogel, he suggested I make
the following patch, which seems to solve the problem for me.  From
looking at the code, I have no idea why this seems to work, so there
is probably a better fix out there.

Any suggestions?

thanks,

greg k-h



diff -Nru a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
--- a/arch/i386/kernel/io_apic.c	Tue May 21 14:47:06 2002
+++ b/arch/i386/kernel/io_apic.c	Tue May 21 14:47:06 2002
@@ -205,7 +205,7 @@
 } ____cacheline_aligned irq_balance_t;
 
 static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
-			= { [ 0 ... NR_IRQS-1 ] = { 1, 0 } };
+			= { [ 0 ... NR_IRQS-1 ] = { 0, 0 } };
 
 extern unsigned long irq_affinity [NR_IRQS];
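
A note on the initializer being changed: [ 0 ... NR_IRQS-1 ] is GCC's
range-designator extension, so the patch changes the starting cpu field of
every irq_balance entry from 1 to 0.  On a UP box presumably only CPU 0 is
present, so the old default names a CPU that is not there.  The sketch below
is only an illustration: the small NR_IRQS value and the all_start_on()
helper are hypothetical, the struct layout is assumed from the patch
context, and the cacheline alignment attributes are left out.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS 16 /* hypothetical small value, chosen for the example */

/* Assumed layout, taken from the patch context: cpu first, then timestamp. */
typedef struct {
	unsigned int cpu;
	unsigned long timestamp;
} irq_balance_t;

/* GNU C range designator: every element receives the same initializer. */
static irq_balance_t before_patch[NR_IRQS] = { [0 ... NR_IRQS - 1] = { 1, 0 } };
static irq_balance_t after_patch[NR_IRQS]  = { [0 ... NR_IRQS - 1] = { 0, 0 } };

/* Hypothetical helper: returns 1 if every entry starts out on the given CPU. */
static int all_start_on(const irq_balance_t *table, size_t n, unsigned int cpu)
{
	assert(table != NULL);
	for (size_t i = 0; i < n; i++)
		if (table[i].cpu != cpu)
			return 0;
	return 1;
}
```

With these definitions, all_start_on(before_patch, NR_IRQS, 1) and
all_start_on(after_patch, NR_IRQS, 0) both hold, which is the whole effect
of the one-line change.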
 


* Re: [PATCH] 2.5.17 fix for running a SMP kernel on a UP box
  2002-05-21 21:52 [PATCH] 2.5.17 fix for running a SMP kernel on a UP box Greg KH
@ 2002-05-22  2:36 ` James Cleverdon
  2002-05-22 17:10 ` Jack F. Vogel
  1 sibling, 0 replies; 3+ messages in thread
From: James Cleverdon @ 2002-05-22  2:36 UTC (permalink / raw)
  To: Greg KH, mingo, linux-kernel


I looked at this with Jack Vogel, and I can't figure out how it goes
wrong either.  However, the code in move() that uses the CPU number is a bit
strange: it jumps into the middle of a loop, which programming style books
generally consider bug-prone.  What about eliminating the goto with
something like the attached patch?
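
One possible reading of the hang, which nobody in this thread confirms, is
that with a starting CPU of 1 and only one CPU in the system the search
never revisits curr_cpu, so search_idle is never cleared and the loop spins
for as long as CPU 0 is not idle enough.  The sketch below reproduces the
stepping logic of move() with a step budget so that it returns -1 instead of
hanging.  Everything outside the loop body is an assumption: IRQ_ALLOWED is a
guessed stand-in for the kernel macro, idle_enough() is a stub that always
reports a busy CPU, and num_cpus and max_steps are parameters added for the
example.

```c
#include <assert.h>
#include <stdbool.h>

/* Guessed stand-in for the kernel macro, which the thread does not show. */
#define IRQ_ALLOWED(cpu, mask) (((1UL << (cpu)) & (mask)) != 0)

/* Assumption for the example: no CPU is ever "idle enough". */
static bool idle_enough(int cpu)
{
	(void)cpu;
	return false;
}

/*
 * Same stepping logic as move(), written as a plain loop with a step
 * budget. Returns the chosen CPU, or -1 if the budget runs out, which
 * stands in for the search never terminating.
 */
static int move_bounded(int curr_cpu, unsigned long allowed_mask,
			int direction, int num_cpus, int max_steps)
{
	int search_idle = 1;
	int cpu = curr_cpu;

	assert(num_cpus > 0);
	for (int step = 0; step < max_steps; step++) {
		if (direction == 1) {
			if (++cpu >= num_cpus)
				cpu = 0;
		} else {
			if (--cpu == -1)
				cpu = num_cpus - 1;
		}
		if (IRQ_ALLOWED(cpu, allowed_mask) &&
		    (!search_idle || idle_enough(cpu)))
			return cpu;
		/* Only a full lap back to curr_cpu turns off the idle search. */
		if (cpu == curr_cpu)
			search_idle = 0;
	}
	return -1;
}
```

Under these assumptions, move_bounded(1, 1UL, 1, 1, 1000) runs out of steps,
while move_bounded(0, 1UL, 1, 1, 1000) returns 0 on the second pass.  That
would be consistent with the { 0, 0 } initializer avoiding the lockup, but
it is only a model of the loop and not a diagnosis of the real hardware.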

On Tuesday 21 May 2002 02:52 pm, Greg KH wrote:
> I can't seem to run a SMP 2.5.17 kernel on a UP machine, it locks up
> during the boot process.  In talking to Jack Vogel, he suggested I make
> the following patch, which seems to solve the problem for me.  In
> looking at the code, I have no idea of why this seems to work, so there
> probably is a better fix out there.
> 
> Any suggestions?
> 
> thanks,
> 
> greg k-h
> 
> 
> 
> diff -Nru a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
> --- a/arch/i386/kernel/io_apic.c	Tue May 21 14:47:06 2002
> +++ b/arch/i386/kernel/io_apic.c	Tue May 21 14:47:06 2002
> @@ -205,7 +205,7 @@
>  } ____cacheline_aligned irq_balance_t;
>  
>  static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
> -			= { [ 0 ... NR_IRQS-1 ] = { 1, 0 } };
> +			= { [ 0 ... NR_IRQS-1 ] = { 0, 0 } };
>  
>  extern unsigned long irq_affinity [NR_IRQS];
>  
> -


-- 
James Cleverdon
IBM xSeries Linux Solutions
{jamesclv(Unix, preferred), cleverdj(Notes)} at us dot ibm dot com

[-- Attachment #2: irq_balance_move.patch2 --]
[-- Type: text/x-diff, Size: 1892 bytes --]

*** linux/arch/i386/kernel/io_apic.c.df	Mon May 20 22:07:36 2002
--- linux/arch/i386/kernel/io_apic.c	Tue May 21 18:41:54 2002
***************
*** 203,213 ****
  	unsigned int cpu;
  	unsigned long timestamp;
  } ____cacheline_aligned irq_balance_t;
  
  static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
! 			= { [ 0 ... NR_IRQS-1 ] = { 1, 0 } };
  
  extern unsigned long irq_affinity [NR_IRQS];
  
  #endif
  
--- 203,213 ----
  	unsigned int cpu;
  	unsigned long timestamp;
  } ____cacheline_aligned irq_balance_t;
  
  static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
! 			= { [ 0 ... NR_IRQS-1 ] = { 0, 0 } };
  
  extern unsigned long irq_affinity [NR_IRQS];
  
  #endif
  
***************
*** 220,248 ****
  static unsigned long move(int curr_cpu, unsigned long allowed_mask, unsigned long now, int direction)
  {
  	int search_idle = 1;
  	int cpu = curr_cpu;
  
! 	goto inside;
! 
! 	do {
! 		if (unlikely(cpu == curr_cpu))
! 			search_idle = 0;
! inside:
! 		if (direction == 1) {
! 			cpu++;
! 			if (cpu >= smp_num_cpus)
  				cpu = 0;
  		} else {
! 			cpu--;
! 			if (cpu == -1)
! 				cpu = smp_num_cpus-1;
  		}
! 	} while (!IRQ_ALLOWED(cpu,allowed_mask) ||
! 			(search_idle && !IDLE_ENOUGH(cpu,now)));
! 
! 	return cpu;
  }
  
  static inline void balance_irq(int irq)
  {
  #if CONFIG_SMP
--- 220,242 ----
  static unsigned long move(int curr_cpu, unsigned long allowed_mask, unsigned long now, int direction)
  {
  	int search_idle = 1;
  	int cpu = curr_cpu;
  
! 	for (;;) {
! 		if (direction) {
! 			if (++cpu >= smp_num_cpus)
  				cpu = 0;
  		} else {
! 			if (--cpu < 0)
! 				cpu = smp_num_cpus - 1;
  		}
! 		if (IRQ_ALLOWED(cpu, allowed_mask) && (!search_idle || IDLE_ENOUGH(cpu, now)))
! 			return cpu;
! 		if (unlikely(cpu == curr_cpu))
! 			search_idle = 0;
! 	}
  }
  
  static inline void balance_irq(int irq)
  {
  #if CONFIG_SMP


* Re: [PATCH] 2.5.17 fix for running a SMP kernel on a UP box
  2002-05-21 21:52 [PATCH] 2.5.17 fix for running a SMP kernel on a UP box Greg KH
  2002-05-22  2:36 ` James Cleverdon
@ 2002-05-22 17:10 ` Jack F. Vogel
  1 sibling, 0 replies; 3+ messages in thread
From: Jack F. Vogel @ 2002-05-22 17:10 UTC (permalink / raw)
  To: Greg KH, mingo, linux-kernel; +Cc: jfv

On Tuesday 21 May 2002 02:52 pm, Greg KH wrote:
> I can't seem to run a SMP 2.5.17 kernel on a UP machine, it locks up
> during the boot process.  In talking to Jack Vogel, he suggested I make
> the following patch, which seems to solve the problem for me.  In
> looking at the code, I have no idea of why this seems to work, so there
> probably is a better fix out there.
>
> Any suggestions?
>
> thanks,
>
> greg k-h

I should add one bit of information. I originally saw this problem on a
UP IBM Netvista machine; it's the same box Greg had it happen on.

However, I have a 933MHz PIII HP box at home and it does not
have the problem.

Since we are limited in the variety of machines to test on, I am not
sure about this, but I believe it's only going to occur on UP systems
with an IOAPIC.

If you apply the irq_balance patch to a 2.4.* kernel you can recreate
the same hang; in fact, it was on 2.4.18 that I first ran into it.

I realize it's kinda a corner case, running an SMP kernel on a
subset of UP machines, but hey, I figure it's supposed to work :)

>
> diff -Nru a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
> --- a/arch/i386/kernel/io_apic.c	Tue May 21 14:47:06 2002
> +++ b/arch/i386/kernel/io_apic.c	Tue May 21 14:47:06 2002
> @@ -205,7 +205,7 @@
>  } ____cacheline_aligned irq_balance_t;
>
>  static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
> -			= { [ 0 ... NR_IRQS-1 ] = { 1, 0 } };
> +			= { [ 0 ... NR_IRQS-1 ] = { 0, 0 } };
>
>  extern unsigned long irq_affinity [NR_IRQS];

Cheers,

-- 
Jack F. Vogel
IBM  Linux Solutions
jfv@us.ibm.com  (work)
jfv@Bluesong.NET (home)
