* Re: [PATCH] 2.5.17 fix for running a SMP kernel on a UP box
2002-05-21 21:52 [PATCH] 2.5.17 fix for running a SMP kernel on a UP box Greg KH
@ 2002-05-22 2:36 ` James Cleverdon
2002-05-22 17:10 ` Jack F. Vogel
1 sibling, 0 replies; 3+ messages in thread
From: James Cleverdon @ 2002-05-22 2:36 UTC
To: Greg KH, mingo, linux-kernel
I was looking at this with Jack Vogel and I can't figure out how it goes
wrong, either. However, the code in move() that uses the cpu number is a bit
strange. Jumping into the middle of a loop is generally considered bug-prone
by programming style books. What about eliminating the goto by using
something like the attached patch?
On Tuesday 21 May 2002 02:52 pm, Greg KH wrote:
> I can't seem to run a SMP 2.5.17 kernel on a UP machine, it locks up
> during the boot process. In talking to Jack Vogel, he suggested I make
> the following patch, which seems to solve the problem for me. In
> looking at the code, I have no idea of why this seems to work, so there
> probably is a better fix out there.
>
> Any suggestions?
>
> thanks,
>
> greg k-h
>
>
>
> diff -Nru a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
> --- a/arch/i386/kernel/io_apic.c Tue May 21 14:47:06 2002
> +++ b/arch/i386/kernel/io_apic.c Tue May 21 14:47:06 2002
> @@ -205,7 +205,7 @@
> } ____cacheline_aligned irq_balance_t;
>
> static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
> - = { [ 0 ... NR_IRQS-1 ] = { 1, 0 } };
> + = { [ 0 ... NR_IRQS-1 ] = { 0, 0 } };
>
> extern unsigned long irq_affinity [NR_IRQS];
>
> -
--
James Cleverdon
IBM xSeries Linux Solutions
{jamesclv(Unix, preferred), cleverdj(Notes)} at us dot ibm dot com
[-- Attachment #2: irq_balance_move.patch2 --]
[-- Type: text/x-diff, Size: 1892 bytes --]
*** linux/arch/i386/kernel/io_apic.c.df Mon May 20 22:07:36 2002
--- linux/arch/i386/kernel/io_apic.c Tue May 21 18:41:54 2002
***************
*** 203,213 ****
unsigned int cpu;
unsigned long timestamp;
} ____cacheline_aligned irq_balance_t;
static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
! = { [ 0 ... NR_IRQS-1 ] = { 1, 0 } };
extern unsigned long irq_affinity [NR_IRQS];
#endif
--- 203,213 ----
unsigned int cpu;
unsigned long timestamp;
} ____cacheline_aligned irq_balance_t;
static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
! = { [ 0 ... NR_IRQS-1 ] = { 0, 0 } };
extern unsigned long irq_affinity [NR_IRQS];
#endif
***************
*** 220,248 ****
static unsigned long move(int curr_cpu, unsigned long allowed_mask, unsigned long now, int direction)
{
int search_idle = 1;
int cpu = curr_cpu;
! goto inside;
!
! do {
! if (unlikely(cpu == curr_cpu))
! search_idle = 0;
! inside:
! if (direction == 1) {
! cpu++;
! if (cpu >= smp_num_cpus)
cpu = 0;
} else {
! cpu--;
! if (cpu == -1)
! cpu = smp_num_cpus-1;
}
! } while (!IRQ_ALLOWED(cpu,allowed_mask) ||
! (search_idle && !IDLE_ENOUGH(cpu,now)));
!
! return cpu;
}
static inline void balance_irq(int irq)
{
#if CONFIG_SMP
--- 220,242 ----
static unsigned long move(int curr_cpu, unsigned long allowed_mask, unsigned long now, int direction)
{
int search_idle = 1;
int cpu = curr_cpu;
! for (;;) {
! if (direction) {
! if (++cpu >= smp_num_cpus)
cpu = 0;
} else {
! if (--cpu < 0)
! cpu = smp_num_cpus - 1;
}
! if (IRQ_ALLOWED(cpu, allowed_mask) && (!search_idle || IDLE_ENOUGH(cpu, now)))
! return cpu;
! if (unlikely(cpu == curr_cpu))
! search_idle = 0;
! }
}
static inline void balance_irq(int irq)
{
#if CONFIG_SMP
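
To compare the two loop shapes in the patch above, here is a small user-space
model in C. It is only a sketch: irq_allowed() and idle_enough() are
hypothetical stand-ins for IRQ_ALLOWED and IDLE_ENOUGH (their definitions are
not shown in this thread), ncpus stands in for smp_num_cpus, and the model
assumes no CPU is ever idle enough, which forces the fallback path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for IRQ_ALLOWED: one bit per CPU in the mask. */
static bool irq_allowed(int cpu, unsigned long allowed_mask)
{
    return (allowed_mask >> cpu) & 1UL;
}

/* Hypothetical stand-in for IDLE_ENOUGH: model a busy box, never idle. */
static bool idle_enough(int cpu, unsigned long now)
{
    (void)cpu;
    (void)now;
    return false;
}

/* Both models need a valid start CPU and at least one allowed CPU. */
static void check_args(int curr_cpu, int ncpus, unsigned long allowed_mask)
{
    assert(ncpus > 0 && ncpus < 32);
    assert(curr_cpu >= 0 && curr_cpu < ncpus);
    assert((allowed_mask & ((1UL << ncpus) - 1)) != 0);
}

/* Control flow of the original move(): enters the do-while in the middle. */
static int move_goto(int curr_cpu, int ncpus, unsigned long allowed_mask,
                     unsigned long now, int direction)
{
    int search_idle = 1;
    int cpu = curr_cpu;

    check_args(curr_cpu, ncpus, allowed_mask);
    goto inside;

    do {
        if (cpu == curr_cpu)
            search_idle = 0;    /* full lap done, stop insisting on idle */
inside:
        if (direction == 1) {
            cpu++;
            if (cpu >= ncpus)
                cpu = 0;
        } else {
            cpu--;
            if (cpu == -1)
                cpu = ncpus - 1;
        }
    } while (!irq_allowed(cpu, allowed_mask) ||
             (search_idle && !idle_enough(cpu, now)));

    return cpu;
}

/* Control flow of the proposed move(): same steps, no goto. */
static int move_loop(int curr_cpu, int ncpus, unsigned long allowed_mask,
                     unsigned long now, int direction)
{
    int search_idle = 1;
    int cpu = curr_cpu;

    check_args(curr_cpu, ncpus, allowed_mask);
    for (;;) {
        if (direction) {
            if (++cpu >= ncpus)
                cpu = 0;
        } else {
            if (--cpu < 0)
                cpu = ncpus - 1;
        }
        if (irq_allowed(cpu, allowed_mask) &&
            (!search_idle || idle_enough(cpu, now)))
            return cpu;
        if (cpu == curr_cpu)
            search_idle = 0;
    }
}
```

Under these assumptions the two functions pick the same CPU for any in-range
starting CPU: each one steps, tests the exit condition, and only then clears
search_idle once it is back at the starting CPU. Both rely on the starting CPU
being a valid CPU number.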
* Re: [PATCH] 2.5.17 fix for running a SMP kernel on a UP box
From: Jack F. Vogel @ 2002-05-22 17:10 UTC
To: Greg KH, mingo, linux-kernel; +Cc: jfv
On Tuesday 21 May 2002 02:52 pm, Greg KH wrote:
> I can't seem to run a SMP 2.5.17 kernel on a UP machine, it locks up
> during the boot process. In talking to Jack Vogel, he suggested I make
> the following patch, which seems to solve the problem for me. In
> looking at the code, I have no idea of why this seems to work, so there
> probably is a better fix out there.
>
> Any suggestions?
>
> thanks,
>
> greg k-h
I should add one bit of information. I originally saw this problem on a
UP IBM Netvista machine; it's the same box Greg had it happen on.
However, I have a 933MHz PIII HP box at home and it does not
have the problem.
Since we are limited in the variety of machines to test on I am not
sure about this, but I believe it's only going to occur on UP systems
with an IOAPIC.
If you apply the irq_balance patch to a 2.4.* kernel you can recreate
the same hang; in fact it was on 2.4.18 that I first ran into it.
I realize it's kinda a corner case, running an SMP kernel on a
subset of UP machines, but hey, I figure it's supposed to work :)
>
> diff -Nru a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
> --- a/arch/i386/kernel/io_apic.c Tue May 21 14:47:06 2002
> +++ b/arch/i386/kernel/io_apic.c Tue May 21 14:47:06 2002
> @@ -205,7 +205,7 @@
> } ____cacheline_aligned irq_balance_t;
>
> static irq_balance_t irq_balance[NR_IRQS] __cacheline_aligned
> - = { [ 0 ... NR_IRQS-1 ] = { 1, 0 } };
> + = { [ 0 ... NR_IRQS-1 ] = { 0, 0 } };
>
> extern unsigned long irq_affinity [NR_IRQS];
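
One possible reading of why the initializer change helps, sketched as a
user-space C model of the move() loop from the 2.5.17 code. Everything here is
an assumption made for illustration: ncpus stands in for smp_num_cpus and is
taken to be 1 on a UP box, irq_allowed() and idle_enough() are hypothetical
stand-ins for IRQ_ALLOWED and IDLE_ENOUGH, the only CPU is taken to be busy,
and MAX_STEPS is an artificial cap so that the model returns -1 where the
real loop would keep spinning.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_STEPS 1000  /* artificial cap, not part of the kernel code */

/* Hypothetical stand-in for IRQ_ALLOWED: one bit per CPU in the mask. */
static bool irq_allowed(int cpu, unsigned long allowed_mask)
{
    return (allowed_mask >> cpu) & 1UL;
}

/* Hypothetical stand-in for IDLE_ENOUGH: assume the CPU is never idle. */
static bool idle_enough(int cpu, unsigned long now)
{
    (void)cpu;
    (void)now;
    return false;
}

/*
 * Same control flow as the original move(), but returns -1 after
 * MAX_STEPS iterations instead of looping forever.
 */
static int move_capped(int curr_cpu, int ncpus, unsigned long allowed_mask,
                       unsigned long now, int direction)
{
    int search_idle = 1;
    int cpu = curr_cpu;
    int steps = 0;

    assert(ncpus > 0);
    goto inside;

    do {
        /* Only reachable when cpu wraps back to the starting value. */
        if (cpu == curr_cpu)
            search_idle = 0;
inside:
        if (++steps > MAX_STEPS)
            return -1;
        if (direction == 1) {
            cpu++;
            if (cpu >= ncpus)
                cpu = 0;
        } else {
            cpu--;
            if (cpu == -1)
                cpu = ncpus - 1;
        }
    } while (!irq_allowed(cpu, allowed_mask) ||
             (search_idle && !idle_enough(cpu, now)));

    return cpu;
}
```

With these assumptions, move_capped(1, 1, 1UL, 0UL, 1) gives up and returns
-1: the starting value 1 (the old initializer) is outside the range 0..ncpus-1,
so cpu always wraps to 0, never equals curr_cpu again, and search_idle is never
cleared. Starting from 0 (the new initializer), move_capped(0, 1, 1UL, 0UL, 1)
returns 0 on the second pass. Whether this is the whole story on the real
hardware is not established in this thread.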
Cheers,
--
Jack F. Vogel
IBM Linux Solutions
jfv@us.ibm.com (work)
jfv@Bluesong.NET (home)