public inbox for linux-kernel@vger.kernel.org
* Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
       [not found] <1266357790-8962-1-git-send-email-trenn@suse.de>
@ 2010-02-17 16:05 ` Jiri Bohac
  0 siblings, 0 replies; 11+ messages in thread
From: Jiri Bohac @ 2010-02-17 16:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: mingo, Yinghai Lu, akpm, jbohac

On Tue, Feb 16, 2010 at 11:03:10PM +0100, Thomas Renninger wrote:
> with MAX_LOOPS something like 1E9 this would leave plenty of time for the
> pending IRQs to be cleared and would still cause at most a second of delay
> if the loop were to lock up for whatever reason.
...
> +	int i, j, acked = 0, max_loops = 0x1E9;

I meant 1E9 == 1000000000 (decimal), not 0x1E9, just to give the kernel a
chance to boot (with a delay) if something is completely wrong.
0x1E9 is only 489, which might be too small.
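
To make the difference concrete, here is a small user-space sketch (not kernel code). The macro names and the helper limit_ratio() are made up for illustration; only the two values come from the discussion above.

```c
#include <assert.h>
#include <stdint.h>

/* The draft patch wrote the limit as a hex literal. */
#define MAX_LOOPS_HEX 0x1E9
/* The intended limit: 1E9 in scientific notation, i.e. one billion. */
#define MAX_LOOPS_DEC UINT64_C(1000000000)

/* Checked at compile time: 0x1E9 = 1*256 + 14*16 + 9 = 489. */
static_assert(MAX_LOOPS_HEX == 489, "0x1E9 is 489 decimal");

/* How many times larger the intended limit is than the hex one. */
static uint64_t limit_ratio(void)
{
	return MAX_LOOPS_DEC / MAX_LOOPS_HEX;
}
```

With the hex literal the loop would give up after a few hundred iterations, roughly two million times earlier than intended.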

I also think a warning should be printed if max_loops reaches
zero:

> +	    max_loops--;
> +        } while (queued && max_loops > 0);

+	WARN_ON(!max_loops);

Thanks,

-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, SUSE CZ



* [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
@ 2010-02-23 11:51 Thomas Renninger
  2010-02-23 12:01 ` Thomas Renninger
  2010-02-23 12:03 ` Avi Kivity
  0 siblings, 2 replies; 11+ messages in thread
From: Thomas Renninger @ 2010-02-23 11:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kerstin Jonsson, jbohac, Yinghai Lu, akpm, mingo,
	Thomas Renninger

From: Kerstin Jonsson <kerstin.jonsson@ericsson.com>

When the SMP kernel decides to crash_kexec(), the local APICs may have
pending interrupts in their vector tables.
The setup routine for the local APIC has a deficient mechanism for
clearing these interrupts: it only safely handles interrupts that have
already been dispatched to the local core for servicing (the ISR
register); it doesn't consider lower-priority queued interrupts stored
in the IRR register.

If you have more than one pending interrupt within the same 32-bit word
in the LAPIC vector table registers, you may find yourself entering the
IO APIC setup with pending interrupts left in the LAPIC. This is a
situation for which the IO APIC setup is not prepared. Depending on
which interrupt vectors are stuck in the APIC tables, your
system may show various degrees of malfunction.
That was why check_timer() failed on our system: the
timer interrupts were blocked by pending interrupts from the old kernel
when routed through the IO APIC.
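
The failure mode described above can be reproduced in a deliberately simplified user-space model. Everything here is an assumption made for illustration: struct toy_lapic, toy_eoi() and the other helpers are hypothetical, priority is compared by plain vector number, and a queued vector is promoted to in-service immediately on EOI. Real hardware only does that once the CPU accepts the interrupt, so treat this as a sketch of the loop structure, not of the LAPIC.

```c
#include <assert.h>
#include <stdint.h>

#define NR_WORDS 8	/* 256 vectors in 32-bit words, like APIC_ISR_NR */

struct toy_lapic {
	uint32_t isr[NR_WORDS];	/* in-service vectors */
	uint32_t irr[NR_WORDS];	/* requested (queued) vectors */
};

/* Highest vector set in a 256-bit register, or -1 if none. */
static int highest_vector(const uint32_t *reg)
{
	for (int i = NR_WORDS - 1; i >= 0; i--)
		for (int j = 31; j >= 0; j--)
			if (reg[i] & (UINT32_C(1) << j))
				return i * 32 + j;
	return -1;
}

/* Toy EOI: retire the highest in-service vector, then promote the
 * highest queued vector if it outranks what is still in service. */
static void toy_eoi(struct toy_lapic *l)
{
	int v = highest_vector(l->isr);
	if (v >= 0)
		l->isr[v / 32] &= ~(UINT32_C(1) << (v % 32));
	int p = highest_vector(l->irr);
	if (p > highest_vector(l->isr)) {
		l->irr[p / 32] &= ~(UINT32_C(1) << (p % 32));
		l->isr[p / 32] |= UINT32_C(1) << (p % 32);
	}
}

/* Old shape: one EOI per bit found in a per-word snapshot of the ISR. */
static int ack_isr_pass(struct toy_lapic *l)
{
	int acked = 0;
	for (int i = NR_WORDS - 1; i >= 0; i--) {
		uint32_t value = l->isr[i];
		for (int j = 31; j >= 0; j--)
			if (value & (UINT32_C(1) << j)) {
				toy_eoi(l);
				acked++;
			}
	}
	return acked;
}

/* Patched shape: repeat the pass while the IRR had anything queued. */
static int ack_until_idle(struct toy_lapic *l, int max_acked)
{
	int acked = 0;
	uint32_t queued;

	assert(max_acked > 0);
	do {
		queued = 0;
		for (int i = NR_WORDS - 1; i >= 0; i--)
			queued |= l->irr[i];
		acked += ack_isr_pass(l);
	} while (queued && acked <= max_acked);
	return acked;
}

/* Nonzero if any vector is still in service or queued. */
static int toy_busy(const struct toy_lapic *l)
{
	return highest_vector(l->isr) >= 0 || highest_vector(l->irr) >= 0;
}
```

In this model, with vector 0x31 in service and vectors 0x30 and 0x29 queued in the same 32-bit word, the single pass issues one EOI and leaves two vectors behind, because the promoted bit is not in the snapshot it is iterating over. The repeated pass drains all three.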

Additional comment from Jiri Bohac:
==============
If this should go into a stable release,
I'd add some kind of limit on the number of iterations, just to be safe from
hard-to-debug lock-ups:

+if (loops++  > MAX_LOOPS) {
+        printk("LAPIC pending clean-up");
+        break;
+}
 while (queued);

with MAX_LOOPS something like 1E9 this would leave plenty of time for the
pending IRQs to be cleared and would still cause at most a second of delay
if the loop were to lock up for whatever reason.
==============

From trenn@suse.de:
Merged Jiri's suggestion into the patch.
Also made max_loops depend on cpu_khz. I am not sure how long an apic_read
takes; since the APIC is on the CPU it may take only one cycle, in which case
we would wait about 1 sec before reaching the WARN_ON(..) case?

CC: jbohac@novell.com
CC: "Yinghai Lu" <yinghai@kernel.org>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: "Kerstin Jonsson" <kerstin.jonsson@ericsson.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 arch/x86/kernel/apic/apic.c |   34 +++++++++++++++++++++++++---------
 1 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 3987e44..912dd59 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -51,6 +51,7 @@
 #include <asm/smp.h>
 #include <asm/mce.h>
 #include <asm/kvm_para.h>
+#include <asm/tsc.h>
 
 unsigned int num_processors;
 
@@ -1151,8 +1152,8 @@ static void __cpuinit lapic_setup_esr(void)
  */
 void __cpuinit setup_local_APIC(void)
 {
-	unsigned int value;
-	int i, j;
+	unsigned int value, queued;
+	int i, j, acked = 0, max_loops = cpu_khz * 1000;
 
 	if (disable_apic) {
 		arch_disable_smp_support();
@@ -1204,13 +1205,28 @@ void __cpuinit setup_local_APIC(void)
 	 * the interrupt. Hence a vector might get locked. It was noticed
 	 * for timer irq (vector 0x31). Issue an extra EOI to clear ISR.
 	 */
-	for (i = APIC_ISR_NR - 1; i >= 0; i--) {
-		value = apic_read(APIC_ISR + i*0x10);
-		for (j = 31; j >= 0; j--) {
-			if (value & (1<<j))
-				ack_APIC_irq();
-		}
-	}
+        do {
+            queued = 0;
+            for (i = APIC_ISR_NR - 1; i >= 0; i--)
+                queued |= apic_read(APIC_IRR + i*0x10);
+
+            for (i = APIC_ISR_NR - 1; i >= 0; i--) {
+                value = apic_read(APIC_ISR + i*0x10);
+                for (j = 31; j >= 0; j--) {
+                    if (value & (1<<j)) {
+                        ack_APIC_irq();
+                        acked++;
+                    }
+                }
+            }
+            if (acked > 256) {
+                printk(KERN_ERR "LAPIC pending interrupts after %d EOI\n",
+		       acked);
+                break;
+            }
+	    max_loops--;
+        } while (queued && max_loops > 0);
+	WARN_ON(!max_loops);
 
 	/*
 	 * Now that we are all set up, enable the APIC
-- 
1.6.3



* Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
  2010-02-23 11:51 [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec Thomas Renninger
@ 2010-02-23 12:01 ` Thomas Renninger
  2010-02-23 12:03 ` Avi Kivity
  1 sibling, 0 replies; 11+ messages in thread
From: Thomas Renninger @ 2010-02-23 12:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Kerstin Jonsson, jbohac, Yinghai Lu, akpm, mingo

On Tuesday 23 February 2010 12:51:25 Thomas Renninger wrote:
> From: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
...
> +	int i, j, acked = 0, max_loops = cpu_khz * 1000;
Grmpfl, an unsigned long for max_loops is probably a better idea...
What do you think: could this one get picked up then, or is there anything
else to improve?
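
A user-space sketch of why the int is a problem, assuming cpu_khz is a 32-bit unsigned value and int is 32 bits wide. The helper names are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* cpu_khz * 1000 evaluated in 32-bit unsigned arithmetic: wraps modulo 2^32. */
static uint32_t budget_u32(uint32_t cpu_khz)
{
	return (uint32_t)(cpu_khz * UINT32_C(1000));
}

/* The same product in 64 bits: cannot wrap for any 32-bit cpu_khz. */
static uint64_t budget_u64(uint32_t cpu_khz)
{
	return (uint64_t)cpu_khz * 1000;
}

/* Nonzero if the true product is representable in a signed int. */
static int budget_fits_int(uint32_t cpu_khz)
{
	assert(cpu_khz > 0);
	return budget_u64(cpu_khz) <= (uint64_t)INT_MAX;
}
```

Under these assumptions a 2 GHz part (cpu_khz = 2000000) still fits in an int, while a 3 GHz part does not. Note that where unsigned long is 32 bits wide (as on 32-bit x86), it would still wrap above roughly 4.29 GHz.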

Thanks,

      Thomas


* Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
  2010-02-23 11:51 [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec Thomas Renninger
  2010-02-23 12:01 ` Thomas Renninger
@ 2010-02-23 12:03 ` Avi Kivity
  2010-02-26 19:47   ` Kerstin Jonsson
  1 sibling, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2010-02-23 12:03 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: linux-kernel, Kerstin Jonsson, jbohac, Yinghai Lu, akpm, mingo

On 02/23/2010 01:51 PM, Thomas Renninger wrote:
> From: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
...
>
>  From trenn@suse.de:
> Merged Jiri suggestion into the patch.
> Also made the max_loops depend on cpu_khz. Not sure how long an apic_read
> takes, as it is on the CPU it may only be one cycle and we now wait 1 sec
> in WARN_ON(..) case?
>
>    

An apic_read() can take a couple of microseconds when running 
virtualized, so this loop may run for hours.  On the other hand, 
virtualized hardware is unlikely to misbehave.

Still I recommend using a clocksource (tsc would do) and not a loop count.
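
A rough back-of-the-envelope check of the "hours" estimate, as a user-space sketch. The assumptions are mine: a 3 GHz CPU, 2 us per virtualized apic_read(), 16 register reads per loop iteration (8 IRR words plus 8 ISR words), and a budget of cpu_khz * 1000 iterations that is actually honoured. worst_case_seconds() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define READS_PER_PASS 16	/* 8 IRR words + 8 ISR words per iteration */

/* Whole seconds spent if every iteration of the budget is used up.
 * Only meant for realistic inputs; huge values could overflow 64 bits. */
static uint64_t worst_case_seconds(uint32_t cpu_khz, uint32_t ns_per_read)
{
	assert(cpu_khz > 0 && ns_per_read > 0);
	uint64_t iterations = (uint64_t)cpu_khz * 1000;
	uint64_t total_ns = iterations * READS_PER_PASS * ns_per_read;

	return total_ns / UINT64_C(1000000000);
}
```

For cpu_khz = 3000000 and 2000 ns per read this gives 96000 seconds, a bit under 27 hours, which is why a bound measured in elapsed time is more robust than one measured in iterations.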

-- 
error compiling committee.c: too many arguments to function



* RE: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
  2010-02-23 12:03 ` Avi Kivity
@ 2010-02-26 19:47   ` Kerstin Jonsson
  2010-03-08 10:18     ` Avi Kivity
  0 siblings, 1 reply; 11+ messages in thread
From: Kerstin Jonsson @ 2010-02-26 19:47 UTC (permalink / raw)
  To: Avi Kivity, Thomas Renninger
  Cc: linux-kernel@vger.kernel.org, jbohac@novell.com, Yinghai Lu,
	akpm@linux-foundation.org, mingo@elte.hu

> From: Avi Kivity [avi@redhat.com]
> Sent: Tuesday, February 23, 2010 1:03 PM
...
>
> An apic_read() can take a couple of microseconds when running
> virtualized, so this loop may run for hours.  On the other hand,
> virtualized hardware is unlikely to misbehave.
>
> Still I recommend using a clocksource (tsc would do) and not a loop count.
>
Is it feasible to distinguish between real and virtual targets,
i.e. to somehow detect that the target is a virtual machine and adapt accordingly?
There may be other cases as well in which one would benefit from taking the
target type into consideration, e.g. when estimating a reasonable number of cycles
for a specific operation.



* Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
  2010-02-26 19:47   ` Kerstin Jonsson
@ 2010-03-08 10:18     ` Avi Kivity
  0 siblings, 0 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-08 10:18 UTC (permalink / raw)
  To: Kerstin Jonsson
  Cc: Thomas Renninger, linux-kernel@vger.kernel.org, jbohac@novell.com,
	Yinghai Lu, akpm@linux-foundation.org, mingo@elte.hu

On 02/26/2010 09:47 PM, Kerstin Jonsson wrote:
>>> From: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
...
>>
>> An apic_read() can take a couple of microseconds when running
>> virtualized, so this loop may run for hours.  On the other hand,
>> virtualized hardware is unlikely to misbehave.
>>
>> Still I recommend using a clocksource (tsc would do) and not a loop count.
>
> Is it possible/thinkable to distinguish between real and virtual targets?
> I.e. to somehow detect that the target is a virtual machine and adapt accordingly.
> There may be other cases as well, in which one would benefit from taking
> target type into consideration when e.g. estimating the reasonable number of cycles
> for a specific operation

It's possible (cpuid hypervisor bit), but I don't think it's a good 
idea.  Splitting up code paths doubles the chance of bugs.  Much better 
to find something that works both ways.

-- 
error compiling committee.c: too many arguments to function



* [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
@ 2010-03-08 11:17 Thomas Renninger
  2010-03-08 11:26 ` Avi Kivity
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Thomas Renninger @ 2010-03-08 11:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kerstin Jonsson, jbohac, Yinghai Lu, akpm, mingo, Avi Kivity,
	Thomas Renninger

From: Kerstin Jonsson <kerstin.jonsson@ericsson.com>

When the SMP kernel decides to crash_kexec(), the local APICs may have
pending interrupts in their vector tables.
The setup routine for the local APIC has a deficient mechanism for
clearing these interrupts: it only safely handles interrupts that have
already been dispatched to the local core for servicing (the ISR
register); it doesn't consider lower-priority queued interrupts stored
in the IRR register.

If you have more than one pending interrupt within the same 32-bit word
in the LAPIC vector table registers, you may find yourself entering the
IO APIC setup with pending interrupts left in the LAPIC. This is a
situation for which the IO APIC setup is not prepared. Depending on
which interrupt vectors are stuck in the APIC tables, your
system may show various degrees of malfunction.
That was why check_timer() failed on our system: the
timer interrupts were blocked by pending interrupts from the old kernel
when routed through the IO APIC.

Additional comment from Jiri Bohac:
==============
If this should go into a stable release,
I'd add some kind of limit on the number of iterations, just to be safe from
hard-to-debug lock-ups:

+if (loops++  > MAX_LOOPS) {
+        printk("LAPIC pending clean-up");
+        break;
+}
 while (queued);

with MAX_LOOPS something like 1E9 this would leave plenty of time for the
pending IRQs to be cleared and would still cause at most a second of delay
if the loop were to lock up for whatever reason.
==============

From trenn@suse.de:
V2: Use the TSC, if available, to bail out after 1 sec, because virtualized
    apic_read calls may take rather long (suggested by: Avi Kivity <avi@redhat.com>).
    If no TSC is available, bail out quickly after cpu_khz iterations. If we broke
    out too early and still have irqs pending (which should never happen?), we still
    get a WARN_ON...


CC: jbohac@novell.com
CC: "Yinghai Lu" <yinghai@kernel.org>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: "Kerstin Jonsson" <kerstin.jonsson@ericsson.com>
CC: "Avi Kivity" <avi@redhat.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 arch/x86/kernel/apic/apic.c |   42 +++++++++++++++++++++++++++++++++---------
 1 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 3987e44..93cdb2a 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -51,6 +51,7 @@
 #include <asm/smp.h>
 #include <asm/mce.h>
 #include <asm/kvm_para.h>
+#include <asm/tsc.h>
 
 unsigned int num_processors;
 
@@ -1151,8 +1152,12 @@ static void __cpuinit lapic_setup_esr(void)
  */
 void __cpuinit setup_local_APIC(void)
 {
-	unsigned int value;
-	int i, j;
+	unsigned int value, queued;
+	int i, j, acked = 0;
+	unsigned long long tsc = 0, ntsc, max_loops = cpu_khz;
+
+	if (cpu_has_tsc)
+		rdtscll(ntsc);
 
 	if (disable_apic) {
 		arch_disable_smp_support();
@@ -1204,13 +1209,32 @@ void __cpuinit setup_local_APIC(void)
 	 * the interrupt. Hence a vector might get locked. It was noticed
 	 * for timer irq (vector 0x31). Issue an extra EOI to clear ISR.
 	 */
-	for (i = APIC_ISR_NR - 1; i >= 0; i--) {
-		value = apic_read(APIC_ISR + i*0x10);
-		for (j = 31; j >= 0; j--) {
-			if (value & (1<<j))
-				ack_APIC_irq();
-		}
-	}
+        do {
+            queued = 0;
+            for (i = APIC_ISR_NR - 1; i >= 0; i--)
+                queued |= apic_read(APIC_IRR + i*0x10);
+
+            for (i = APIC_ISR_NR - 1; i >= 0; i--) {
+                value = apic_read(APIC_ISR + i*0x10);
+                for (j = 31; j >= 0; j--) {
+                    if (value & (1<<j)) {
+                        ack_APIC_irq();
+                        acked++;
+                    }
+                }
+            }
+            if (acked > 256) {
+                printk(KERN_ERR "LAPIC pending interrupts after %d EOI\n",
+		       acked);
+                break;
+            }
+	    if (cpu_has_tsc) {
+		    rdtscll(ntsc);
+		    max_loops = (cpu_khz << 10) - (ntsc - tsc);
+	    } else
+		    max_loops--;
+        } while (queued && max_loops > 0);
+	WARN_ON(!max_loops);
 
 	/*
 	 * Now that we are all set up, enable the APIC
-- 
1.6.3



* Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
  2010-03-08 11:17 Thomas Renninger
@ 2010-03-08 11:26 ` Avi Kivity
  2010-03-08 11:26 ` Thomas Renninger
  2010-03-08 11:34 ` Cyrill Gorcunov
  2 siblings, 0 replies; 11+ messages in thread
From: Avi Kivity @ 2010-03-08 11:26 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: linux-kernel, Kerstin Jonsson, jbohac, Yinghai Lu, akpm, mingo

On 03/08/2010 01:17 PM, Thomas Renninger wrote:
> From: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
...
>
>  From trenn@suse.de:
> V2: Use tsc if avail to bail out after 1 sec due to possible virtual apic_read
>      calls which may take rather long (suggested by: Avi Kivity<avi@redhat.com>)
>      If no tsc is available bail out quickly after cpu_khz, if we broke out too
>      early and still have irqs pending (which should never happen?) we still
>      get a WARN_ON...
>
>
>
> @@ -1151,8 +1152,12 @@ static void __cpuinit lapic_setup_esr(void)
>    */
>   void __cpuinit setup_local_APIC(void)
>   {
> -	unsigned int value;
> -	int i, j;
> +	unsigned int value, queued;
> +	int i, j, acked = 0;
> +	unsigned long long tsc = 0, ntsc, max_loops = cpu_khz;
> +
> +	if (cpu_has_tsc)
> +		rdtscll(ntsc);
>
>
>    

...

> +	    if (cpu_has_tsc) {
> +		    rdtscll(ntsc);
> +		    max_loops = (cpu_khz << 10) - (ntsc - tsc);
>    

Since max_loops is unsigned, this can never go negative; it wraps around to
a huge positive value instead.

> +	    } else
> +		    max_loops--;
> +        } while (queued && max_loops > 0);
> +	WARN_ON(!max_loops);
>    

So the loop never terminates unless queued becomes false.
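
The wraparound is easy to see in a user-space sketch. The helper names below are hypothetical; budget, start and now stand in for (cpu_khz << 10), tsc and ntsc from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned arithmetic wraps modulo 2^64 instead of going negative. */
static_assert((uint64_t)0 - 1 == UINT64_MAX, "unsigned subtraction wraps");

/* Shape used in the patch: "remaining" budget as an unsigned value. */
static uint64_t remaining_unsigned(uint64_t budget, uint64_t start, uint64_t now)
{
	return budget - (now - start);
}

/* One possible alternative: compare elapsed time against the budget. */
static int budget_expired(uint64_t budget, uint64_t start, uint64_t now)
{
	return now - start >= budget;
}
```

Once 150 cycles have elapsed against a budget of 100, remaining_unsigned() returns a value just below 2^64 rather than -50, so a "> 0" test keeps passing. Comparing the elapsed count against the budget avoids the subtraction that can wrap.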

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
  2010-03-08 11:17 Thomas Renninger
  2010-03-08 11:26 ` Avi Kivity
@ 2010-03-08 11:26 ` Thomas Renninger
  2010-03-08 11:34 ` Cyrill Gorcunov
  2 siblings, 0 replies; 11+ messages in thread
From: Thomas Renninger @ 2010-03-08 11:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Kerstin Jonsson, jbohac, Yinghai Lu, akpm, mingo, Avi Kivity

On Monday 08 March 2010 12:17:10 Thomas Renninger wrote:
> From: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
> 
> When the SMP kernel decides to crash_kexec() the local APICs may have
> pending interrupts in their vector tables.
...
Sorry, the indentation is totally messed up; I will resend another version
that has been run through checkpatch.
If this is an acceptable approach, it would be great if Kerstin
could try it out and test the "break out after 1 sec" condition and
whether everything still works as expected...

Thanks,

  Thomas


* Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
  2010-03-08 11:17 Thomas Renninger
  2010-03-08 11:26 ` Avi Kivity
  2010-03-08 11:26 ` Thomas Renninger
@ 2010-03-08 11:34 ` Cyrill Gorcunov
  2010-03-08 11:40   ` Thomas Renninger
  2 siblings, 1 reply; 11+ messages in thread
From: Cyrill Gorcunov @ 2010-03-08 11:34 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: linux-kernel, Kerstin Jonsson, jbohac, Yinghai Lu, akpm, mingo,
	Avi Kivity

On Mon, Mar 08, 2010 at 12:17:10PM +0100, Thomas Renninger wrote:
...
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 3987e44..93cdb2a 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -51,6 +51,7 @@
>  #include <asm/smp.h>
>  #include <asm/mce.h>
>  #include <asm/kvm_para.h>
> +#include <asm/tsc.h>
>  
>  unsigned int num_processors;
>  
> @@ -1151,8 +1152,12 @@ static void __cpuinit lapic_setup_esr(void)
>   */
>  void __cpuinit setup_local_APIC(void)
>  {
> -	unsigned int value;
> -	int i, j;
> +	unsigned int value, queued;
> +	int i, j, acked = 0;
> +	unsigned long long tsc = 0, ntsc, max_loops = cpu_khz;
> +
> +	if (cpu_has_tsc)
> +		rdtscll(ntsc);

Perhaps rdtscll(tsc)?

>  
>  	if (disable_apic) {
>  		arch_disable_smp_support();
> @@ -1204,13 +1209,32 @@ void __cpuinit setup_local_APIC(void)
>  	 * the interrupt. Hence a vector might get locked. It was noticed
>  	 * for timer irq (vector 0x31). Issue an extra EOI to clear ISR.
>  	 */
> -	for (i = APIC_ISR_NR - 1; i >= 0; i--) {
> -		value = apic_read(APIC_ISR + i*0x10);
> -		for (j = 31; j >= 0; j--) {
> -			if (value & (1<<j))
> -				ack_APIC_irq();
> -		}
> -	}
> +        do {
> +            queued = 0;
> +            for (i = APIC_ISR_NR - 1; i >= 0; i--)
> +                queued |= apic_read(APIC_IRR + i*0x10);
> +
> +            for (i = APIC_ISR_NR - 1; i >= 0; i--) {
> +                value = apic_read(APIC_ISR + i*0x10);
> +                for (j = 31; j >= 0; j--) {
> +                    if (value & (1<<j)) {
> +                        ack_APIC_irq();
> +                        acked++;
> +                    }
> +                }
> +            }
> +            if (acked > 256) {
> +                printk(KERN_ERR "LAPIC pending interrupts after %d EOI\n",
> +		       acked);
> +                break;
> +            }
> +	    if (cpu_has_tsc) {
> +		    rdtscll(ntsc);
> +		    max_loops = (cpu_khz << 10) - (ntsc - tsc);

Where is tsc modified? Does it remain tsc = 0 all the time?
Or did I miss the snippet where it is set?

> +	    } else
> +		    max_loops--;
> +        } while (queued && max_loops > 0);
> +	WARN_ON(!max_loops);
>  
>  	/*
>  	 * Now that we are all set up, enable the APIC
> -- 
> 1.6.3
> 
	-- Cyrill


* Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec
  2010-03-08 11:34 ` Cyrill Gorcunov
@ 2010-03-08 11:40   ` Thomas Renninger
  0 siblings, 0 replies; 11+ messages in thread
From: Thomas Renninger @ 2010-03-08 11:40 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: linux-kernel, Kerstin Jonsson, jbohac, Yinghai Lu, akpm, mingo,
	Avi Kivity

On Monday 08 March 2010 12:34:52 Cyrill Gorcunov wrote:
> On Mon, Mar 08, 2010 at 12:17:10PM +0100, Thomas Renninger wrote:
> ...
> > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> > index 3987e44..93cdb2a 100644
> > +	if (cpu_has_tsc)
> > +		rdtscll(ntsc);
> 
> Perhaps rdtscll(tsc)?
Oh dear..., I played with this to wipe out:
warning: ‘tsc’ may be used uninitialized in this function

> > +	    if (cpu_has_tsc) {
> > +		    rdtscll(ntsc);
> > +		    max_loops = (cpu_khz << 10) - (ntsc - tsc);
> 
> Where is tsc modified? It remains tsc = 0 all the time?
> Or I miss the snippet where it is set?
Yes, you are right... I will fix this in the next version. Thanks a lot for looking at this,

    Thomas
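
For reference, a user-space sketch of the loop shape the review comments point towards: read the start timestamp once, before the loop, and bail out when the elapsed count reaches the budget. This is not the kernel patch. run_bounded(), struct fake_hw and the two fake_* callbacks are hypothetical stand-ins for the TSC read and the ack pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*cycles_fn)(void *ctx);	/* stand-in for reading the TSC */
typedef int (*step_fn)(void *ctx);		/* returns nonzero while work remains */

/* Returns 0 once step() reports no work left, -1 if the budget ran out. */
static int run_bounded(step_fn step, cycles_fn now, void *ctx, uint64_t budget)
{
	assert(step != NULL && now != NULL);
	const uint64_t start = now(ctx);	/* captured once, before the loop */

	do {
		if (!step(ctx))
			return 0;
	} while (now(ctx) - start < budget);
	return -1;
}

/* Fake hardware for demonstration: each clock read costs 10 "cycles". */
struct fake_hw {
	uint64_t cycles;
	int pending;
};

static uint64_t fake_now(void *ctx)
{
	struct fake_hw *hw = ctx;

	hw->cycles += 10;
	return hw->cycles;
}

/* Retire one pending item per call; nonzero while items remain. */
static int fake_step(void *ctx)
{
	struct fake_hw *hw = ctx;

	if (hw->pending > 0)
		hw->pending--;
	return hw->pending > 0;
}
```

With three pending items and a budget of 1000 fake cycles the loop drains and returns 0; with a million pending items and a budget of 100 it gives up and returns -1, which is where a caller could emit its warning.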
