* [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU
@ 2015-01-15 18:49 Aaro Koskinen
2015-01-15 18:49 ` [PATCH 2/2] MIPS: fix kernel lockup or crash after CPU offline/online Aaro Koskinen
2015-01-15 19:36 ` [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU David Daney
0 siblings, 2 replies; 6+ messages in thread
From: Aaro Koskinen @ 2015-01-15 18:49 UTC (permalink / raw)
To: Ralf Baechle, David Daney, linux-mips, linux-kernel
Cc: Hemmo Nieminen, stable, Aaro Koskinen
octeon_cpu_disable() will unconditionally enable interrupts when called
with interrupts disabled. Fix that.
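The difference between the two patterns can be sketched in a small userspace model. This is only an illustration, not kernel code: the `model_*` and `run_with_*` names are hypothetical, and a single boolean stands in for the CPU's interrupt-enable bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one flag stands in for the CPU's interrupt-enable bit. */
static bool model_irqs_enabled;

static void model_irq_disable(void) { model_irqs_enabled = false; }
static void model_irq_enable(void)  { model_irqs_enabled = true; }

/* Remember the current state, then disable (models local_irq_save()). */
static bool model_irq_save(void)
{
	bool flags = model_irqs_enabled;

	model_irqs_enabled = false;
	return flags;
}

/* Put back whatever state was saved (models local_irq_restore()). */
static void model_irq_restore(bool flags) { model_irqs_enabled = flags; }

/* disable/enable pair: returns the interrupt state the caller sees afterwards. */
static bool run_with_disable_enable(bool enabled_on_entry)
{
	model_irqs_enabled = enabled_on_entry;
	model_irq_disable();
	assert(!model_irqs_enabled);	/* critical section runs with IRQs off */
	model_irq_enable();		/* enables regardless of the entry state */
	return model_irqs_enabled;
}

/* save/restore pair: returns the interrupt state the caller sees afterwards. */
static bool run_with_save_restore(bool enabled_on_entry)
{
	bool flags;

	model_irqs_enabled = enabled_on_entry;
	flags = model_irq_save();
	assert(!model_irqs_enabled);	/* critical section runs with IRQs off */
	model_irq_restore(flags);	/* returns to the entry state */
	return model_irqs_enabled;
}
```

In this model, `run_with_disable_enable(false)` returns true, meaning a caller that entered with interrupts disabled gets them back enabled, while `run_with_save_restore(false)` returns false.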
The patch fixes the following crash when offlining a CPU:
[ 93.818785] ------------[ cut here ]------------
[ 93.823421] WARNING: CPU: 1 PID: 10 at kernel/smp.c:231 flush_smp_call_function_queue+0x1c4/0x1d0()
[ 93.836215] Modules linked in:
[ 93.839287] CPU: 1 PID: 10 Comm: migration/1 Not tainted 3.19.0-rc4-octeon-los_b5f0 #1
[ 93.847212] Stack : 0000000000000001 ffffffff81b2cf90 0000000000000004 ffffffff81630000
0000000000000000 0000000000000000 0000000000000000 000000000000004a
0000000000000006 ffffffff8117e550 0000000000000000 0000000000000000
ffffffff81b30000 ffffffff81b26808 8000000032c77748 ffffffff81627e07
ffffffff81595ec8 ffffffff81b26808 000000000000000a 0000000000000001
0000000000000001 0000000000000003 0000000010008ce1 ffffffff815030c8
8000000032cbbb38 ffffffff8113d42c 0000000010008ce1 ffffffff8117f36c
8000000032c77300 8000000032cbba50 0000000000000001 ffffffff81503984
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 ffffffff81121668 0000000000000000 0000000000000000
...
[ 93.912819] Call Trace:
[ 93.915273] [<ffffffff81121668>] show_stack+0x68/0x80
[ 93.920335] [<ffffffff81503984>] dump_stack+0x6c/0x90
[ 93.925395] [<ffffffff8113d58c>] warn_slowpath_common+0x94/0xd8
[ 93.931324] [<ffffffff811a402c>] flush_smp_call_function_queue+0x1c4/0x1d0
[ 93.938208] [<ffffffff811a4128>] hotplug_cfd+0xf0/0x108
[ 93.943444] [<ffffffff8115bacc>] notifier_call_chain+0x5c/0xb8
[ 93.949286] [<ffffffff8113d704>] cpu_notify+0x24/0x60
[ 93.954348] [<ffffffff81501738>] take_cpu_down+0x38/0x58
[ 93.959670] [<ffffffff811b343c>] multi_cpu_stop+0x154/0x180
[ 93.965250] [<ffffffff811b3768>] cpu_stopper_thread+0xd8/0x160
[ 93.971093] [<ffffffff8115ea4c>] smpboot_thread_fn+0x1ec/0x1f8
[ 93.976936] [<ffffffff8115ab04>] kthread+0xd4/0xf0
[ 93.981735] [<ffffffff8111c4f0>] ret_from_kernel_thread+0x14/0x1c
[ 93.987835]
[ 93.989326] ---[ end trace c9e3815ee655bda9 ]---
[ 93.993951] Kernel bug detected[#1]:
[ 93.997533] CPU: 1 PID: 10 Comm: migration/1 Tainted: G W 3.19.0-rc4-octeon-los_b5f0 #1
[ 94.006591] task: 8000000032c77300 ti: 8000000032cb8000 task.ti: 8000000032cb8000
[ 94.014081] $ 0 : 0000000000000000 0000000010000ce1 0000000000000001 ffffffff81620000
[ 94.022146] $ 4 : 8000000002c72ac0 0000000000000000 00000000000001a7 ffffffff813b06f0
[ 94.030210] $ 8 : ffffffff813b20d8 0000000000000000 0000000000000000 ffffffff81630000
[ 94.038275] $12 : 0000000000000087 0000000000000000 0000000000000086 0000000000000000
[ 94.046339] $16 : ffffffff81623168 0000000000000001 0000000000000000 0000000000000008
[ 94.054405] $20 : 0000000000000001 0000000000000001 0000000000000001 0000000000000003
[ 94.062470] $24 : 0000000000000038 ffffffff813b7f10
[ 94.070536] $28 : 8000000032cb8000 8000000032cbbc20 0000000010008ce1 ffffffff811bcaf4
[ 94.078601] Hi : 0000000000f188e8
[ 94.082179] Lo : d4fdf3b646c09d55
[ 94.085760] epc : ffffffff811bc9d0 irq_work_run_list+0x8/0xf8
[ 94.091686] Tainted: G W
[ 94.095613] ra : ffffffff811bcaf4 irq_work_run+0x34/0x60
[ 94.101192] Status: 10000ce3 KX SX UX KERNEL EXL IE
[ 94.106235] Cause : 40808034
[ 94.109119] PrId : 000d9301 (Cavium Octeon II)
[ 94.113653] Modules linked in:
[ 94.116721] Process migration/1 (pid: 10, threadinfo=8000000032cb8000, task=8000000032c77300, tls=0000000000000000)
[ 94.127168] Stack : 8000000002c74c80 ffffffff811a4128 0000000000000001 ffffffff81635720
fffffffffffffff2 ffffffff8115bacc 80000000320fbce0 80000000320fbca4
80000000320fbc80 0000000000000002 0000000000000004 ffffffff8113d704
80000000320fbce0 ffffffff81501738 0000000000000003 ffffffff811b343c
8000000002c72aa0 8000000002c72aa8 ffffffff8159cae8 ffffffff8159caa0
ffffffff81650000 80000000320fbbf0 80000000320fbc80 ffffffff811b32e8
0000000000000000 ffffffff811b3768 ffffffff81622b80 ffffffff815148a8
8000000032c77300 8000000002c73e80 ffffffff815148a8 8000000032c77300
ffffffff81622b80 ffffffff815148a8 8000000032c77300 ffffffff81503f48
ffffffff8115ea0c ffffffff81620000 0000000000000000 ffffffff81174d64
...
[ 94.192771] Call Trace:
[ 94.195222] [<ffffffff811bc9d0>] irq_work_run_list+0x8/0xf8
[ 94.200802] [<ffffffff811bcaf4>] irq_work_run+0x34/0x60
[ 94.206036] [<ffffffff811a4128>] hotplug_cfd+0xf0/0x108
[ 94.211269] [<ffffffff8115bacc>] notifier_call_chain+0x5c/0xb8
[ 94.217111] [<ffffffff8113d704>] cpu_notify+0x24/0x60
[ 94.222171] [<ffffffff81501738>] take_cpu_down+0x38/0x58
[ 94.227491] [<ffffffff811b343c>] multi_cpu_stop+0x154/0x180
[ 94.233072] [<ffffffff811b3768>] cpu_stopper_thread+0xd8/0x160
[ 94.238914] [<ffffffff8115ea4c>] smpboot_thread_fn+0x1ec/0x1f8
[ 94.244757] [<ffffffff8115ab04>] kthread+0xd4/0xf0
[ 94.249555] [<ffffffff8111c4f0>] ret_from_kernel_thread+0x14/0x1c
[ 94.255654]
[ 94.257146]
Code: a2423c40 40026000 30420001 <00020336> dc820000 10400037 00000000 0000010f 0000010f
[ 94.267183] ---[ end trace c9e3815ee655bdaa ]---
[ 94.271804] Fatal exception: panic in 5 seconds
Reported-by: Hemmo Nieminen <hemmo.nieminen@iki.fi>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: stable@vger.kernel.org
---
arch/mips/cavium-octeon/smp.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/mips/cavium-octeon/smp.c b/arch/mips/cavium-octeon/smp.c
index ecd903d..9673c5b 100644
--- a/arch/mips/cavium-octeon/smp.c
+++ b/arch/mips/cavium-octeon/smp.c
@@ -231,6 +231,7 @@ DEFINE_PER_CPU(int, cpu_state);
static int octeon_cpu_disable(void)
{
unsigned int cpu = smp_processor_id();
+ unsigned long flags;
if (cpu == 0)
return -EBUSY;
@@ -240,9 +241,9 @@ static int octeon_cpu_disable(void)
set_cpu_online(cpu, false);
cpu_clear(cpu, cpu_callin_map);
- local_irq_disable();
+ local_irq_save(flags);
octeon_fixup_irqs();
- local_irq_enable();
+ local_irq_restore(flags);
flush_cache_all();
local_flush_tlb_all();
--
2.2.0
* [PATCH 2/2] MIPS: fix kernel lockup or crash after CPU offline/online
2015-01-15 18:49 [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU Aaro Koskinen
@ 2015-01-15 18:49 ` Aaro Koskinen
2015-01-15 19:36 ` [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU David Daney
1 sibling, 0 replies; 6+ messages in thread
From: Aaro Koskinen @ 2015-01-15 18:49 UTC (permalink / raw)
To: Ralf Baechle, David Daney, linux-mips, linux-kernel
Cc: Hemmo Nieminen, stable, Aaro Koskinen
From: Hemmo Nieminen <hemmo.nieminen@iki.fi>
Because a printk() invocation can cause, e.g., a TLB miss, printk() must
not be called before the exception handlers have been properly initialized.
This can happen, e.g., when netconsole has been loaded as a kernel module
and the TLB has been cleared while a CPU was offline.
Call cpu_report() in start_secondary() only after the exception handlers
have been initialized to fix this.
Without the patch, the kernel will randomly either lock up or crash
after a CPU is onlined when the console driver is a module.
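The ordering problem can be sketched with a simplified userspace model. This is an assumption-laden illustration, not the kernel's code: the `model_*` names are hypothetical, `model_trap_init()` stands in for the whole handler setup, and `model_report()` stands in for a cpu_report() whose printk() always faults.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state for one secondary CPU coming online. */
static bool model_handlers_ready;	/* exception handlers installed? */
static int model_unhandled_faults;	/* faults taken with no handler */

/* Stands in for the exception handler setup done during bring-up. */
static void model_trap_init(void)
{
	assert(!model_handlers_ready);	/* set up once per bring-up */
	model_handlers_ready = true;
}

/*
 * Stands in for cpu_report(): assume its printk() takes a fault
 * (e.g. a TLB miss) that is only survivable once handlers exist.
 */
static void model_report(void)
{
	if (!model_handlers_ready)
		model_unhandled_faults++;
}

/* Run the bring-up in either order and count the unhandled faults. */
static int model_boot(bool report_first)
{
	model_handlers_ready = false;
	model_unhandled_faults = 0;

	if (report_first)
		model_report();		/* ordering before the patch */
	model_trap_init();
	if (!report_first)
		model_report();		/* ordering after the patch */

	return model_unhandled_faults;
}
```

Here `model_boot(true)` returns 1 and `model_boot(false)` returns 0, which mirrors why the report call is moved after the initialization.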
Signed-off-by: Hemmo Nieminen <hemmo.nieminen@iki.fi>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: stable@vger.kernel.org
---
arch/mips/kernel/smp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index c94c4e9..1c0d8c5 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -123,10 +123,10 @@ asmlinkage void start_secondary(void)
unsigned int cpu;
cpu_probe();
- cpu_report();
per_cpu_trap_init(false);
mips_clockevent_init();
mp_ops->init_secondary();
+ cpu_report();
/*
* XXX parity protection should be folded in here when it's converted
--
2.2.0
* Re: [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU
2015-01-15 18:49 [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU Aaro Koskinen
2015-01-15 18:49 ` [PATCH 2/2] MIPS: fix kernel lockup or crash after CPU offline/online Aaro Koskinen
@ 2015-01-15 19:36 ` David Daney
2015-01-15 19:53 ` Aaro Koskinen
1 sibling, 1 reply; 6+ messages in thread
From: David Daney @ 2015-01-15 19:36 UTC (permalink / raw)
To: Aaro Koskinen
Cc: Ralf Baechle, David Daney, linux-mips, linux-kernel,
Hemmo Nieminen, stable
On 01/15/2015 10:49 AM, Aaro Koskinen wrote:
> octeon_cpu_disable() will unconditionally enable interrupts when called
> with interrupts disabled. Fix that.
interrupts are always disabled here, so...
[...]
>
> Reported-by: Hemmo Nieminen <hemmo.nieminen@iki.fi>
> Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
> Cc: stable@vger.kernel.org
NACK!
> ---
> arch/mips/cavium-octeon/smp.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/mips/cavium-octeon/smp.c b/arch/mips/cavium-octeon/smp.c
> index ecd903d..9673c5b 100644
> --- a/arch/mips/cavium-octeon/smp.c
> +++ b/arch/mips/cavium-octeon/smp.c
> @@ -231,6 +231,7 @@ DEFINE_PER_CPU(int, cpu_state);
> static int octeon_cpu_disable(void)
> {
> unsigned int cpu = smp_processor_id();
> + unsigned long flags;
>
> if (cpu == 0)
> return -EBUSY;
> @@ -240,9 +241,9 @@ static int octeon_cpu_disable(void)
>
> set_cpu_online(cpu, false);
> cpu_clear(cpu, cpu_callin_map);
> - local_irq_disable();
> + local_irq_save(flags);
Just remove this...
> octeon_fixup_irqs();
> - local_irq_enable();
> + local_irq_restore(flags);
... and this.
>
> flush_cache_all();
> local_flush_tlb_all();
>
You can add an Acked-by me if you do that.
David Daney.
* Re: [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU
2015-01-15 19:36 ` [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU David Daney
@ 2015-01-15 19:53 ` Aaro Koskinen
2015-01-15 20:10 ` David Daney
0 siblings, 1 reply; 6+ messages in thread
From: Aaro Koskinen @ 2015-01-15 19:53 UTC (permalink / raw)
To: David Daney
Cc: Ralf Baechle, David Daney, linux-mips, linux-kernel,
Hemmo Nieminen, stable
Hi,
On Thu, Jan 15, 2015 at 11:36:12AM -0800, David Daney wrote:
> On 01/15/2015 10:49 AM, Aaro Koskinen wrote:
> >octeon_cpu_disable() will unconditionally enable interrupts when called
> >with interrupts disabled. Fix that.
>
> interrupts are always disabled here, so...
Is that also true for all the currently supported stable kernels...?
Or should I just drop Cc: stable from this patch?
> Just remove this...
>
> > octeon_fixup_irqs();
> >- local_irq_enable();
> >+ local_irq_restore(flags);
>
> ... and this.
>
> >
> > flush_cache_all();
> > local_flush_tlb_all();
> >
>
> You can add an Acked-by me if you do that.
Ok, I will do that.
A.
* Re: [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU
2015-01-15 19:53 ` Aaro Koskinen
@ 2015-01-15 20:10 ` David Daney
2015-01-15 20:24 ` Aaro Koskinen
0 siblings, 1 reply; 6+ messages in thread
From: David Daney @ 2015-01-15 20:10 UTC (permalink / raw)
To: Aaro Koskinen
Cc: Ralf Baechle, David Daney, linux-mips, linux-kernel,
Hemmo Nieminen, stable
On 01/15/2015 11:53 AM, Aaro Koskinen wrote:
> Hi,
>
> On Thu, Jan 15, 2015 at 11:36:12AM -0800, David Daney wrote:
>> On 01/15/2015 10:49 AM, Aaro Koskinen wrote:
>>> octeon_cpu_disable() will unconditionally enable interrupts when called
>>> with interrupts disabled. Fix that.
>>
>> interrupts are always disabled here, so...
>
> Is that also true for all the currently supported stable kernels...?
I haven't done extensive research recently, but I have removed that pair
of local_irq_disable/local_irq_enable in our SDK kernel based on 3.10.
> Or should I just drop Cc: stable from this patch?
>
>> Just remove this...
>>
>>> octeon_fixup_irqs();
>>> - local_irq_enable();
>>> + local_irq_restore(flags);
>>
>> ... and this.
>>
>>>
>>> flush_cache_all();
>>> local_flush_tlb_all();
>>>
>>
>> You can add an Acked-by me if you do that.
>
> Ok, I will do that.
>
> A.
>
>
* Re: [PATCH 1/2] MIPS: OCTEON: fix kernel crash when offlining a CPU
2015-01-15 20:10 ` David Daney
@ 2015-01-15 20:24 ` Aaro Koskinen
0 siblings, 0 replies; 6+ messages in thread
From: Aaro Koskinen @ 2015-01-15 20:24 UTC (permalink / raw)
To: David Daney
Cc: Ralf Baechle, David Daney, linux-mips, linux-kernel,
Hemmo Nieminen, stable
Hi,
On Thu, Jan 15, 2015 at 12:10:08PM -0800, David Daney wrote:
> On 01/15/2015 11:53 AM, Aaro Koskinen wrote:
> >Hi,
> >
> >On Thu, Jan 15, 2015 at 11:36:12AM -0800, David Daney wrote:
> >>On 01/15/2015 10:49 AM, Aaro Koskinen wrote:
> >>>octeon_cpu_disable() will unconditionally enable interrupts when called
> >>>with interrupts disabled. Fix that.
> >>
> >>interrupts are always disabled here, so...
> >
> >Is that also true for all the currently supported stable kernels...?
>
> I haven't done extensive research recently, but I have removed that pair of
> local_irq_disable/local_irq_enable in our SDK kernel based on 3.10
OK, I'll add the stable tag only for 3.18, as that is the oldest kernel I
can test with at the moment.
A.