* [PATCH] Add call to non-crashing cores through IPI
@ 2010-11-19 10:08 Per Fransson
  2010-11-22  7:24 ` Mika Westerberg
  0 siblings, 1 reply; 23+ messages in thread
From: Per Fransson @ 2010-11-19 10:08 UTC (permalink / raw)
  To: linux-arm-kernel
When kexec is used to start a crash kernel the other cores
are notified. These non-crashing cores will save their state
in the crash notes and then do nothing.
Signed-off-by: Per Fransson <per.xx.fransson@stericsson.com>
---
 arch/arm/kernel/machine_kexec.c |   28 ++++++++++++++++++++++++++++
 1 files changed, 28 insertions(+), 0 deletions(-)
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 3a8fd51..56683db 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -23,6 +23,8 @@ extern unsigned long kexec_indirection_page;
 extern unsigned long kexec_mach_type;
 extern unsigned long kexec_boot_atags;
 
+static atomic_t waiting_for_crash_ipi;
+
 /*
  * Provide a dummy crash_notes definition while crash dump arrives to arm.
  * This prevents breakage of crash_notes attribute in kernel/ksysfs.c.
@@ -37,8 +39,34 @@ void machine_kexec_cleanup(struct kimage *image)
 {
 }
 
+void machine_crash_nonpanic_core(void *unused)
+{
+	struct pt_regs regs;
+
+	crash_setup_regs(®s, NULL);
+	printk(KERN_EMERG "CPU %u will stop doing anything useful since another CPU has crashed\n",
+	       smp_processor_id());
+	crash_save_cpu(®s, smp_processor_id());
+	flush_cache_all();
+
+	atomic_dec(&waiting_for_crash_ipi);
+	while (1)
+		cpu_relax();
+}
+
 void machine_crash_shutdown(struct pt_regs *regs)
 {
+	unsigned long msecs;
+
+	atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1);
+
+	local_irq_enable();
+	smp_call_function(machine_crash_nonpanic_core, NULL, false);
+	msecs = 1000; /* Wait at most a second for the other cpus to stop */
+	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
+		mdelay(1);
+		msecs--;
+	}
 	local_irq_disable();
 	crash_save_cpu(regs, smp_processor_id());
 
-- 
1.7.2.2
^ permalink raw reply related	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-19 10:08 [PATCH] Add call to non-crashing cores through IPI Per Fransson
@ 2010-11-22  7:24 ` Mika Westerberg
  2010-11-22  9:47   ` Maxim Uvarov
  2010-11-22  9:53   ` Per Fransson
  0 siblings, 2 replies; 23+ messages in thread
From: Mika Westerberg @ 2010-11-22  7:24 UTC (permalink / raw)
  To: linux-arm-kernel
On Fri, Nov 19, 2010 at 11:08:30AM +0100, Per Fransson wrote:
> When kexec is used to start a crash kernel the other cores
> are notified. These non-crashing cores will save their state
> in the crash notes and then do nothing.
> 
> Signed-off-by: Per Fransson <per.xx.fransson@stericsson.com>
> ---
>  arch/arm/kernel/machine_kexec.c |   28 ++++++++++++++++++++++++++++
>  1 files changed, 28 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
> index 3a8fd51..56683db 100644
> --- a/arch/arm/kernel/machine_kexec.c
> +++ b/arch/arm/kernel/machine_kexec.c
> @@ -23,6 +23,8 @@ extern unsigned long kexec_indirection_page;
>  extern unsigned long kexec_mach_type;
>  extern unsigned long kexec_boot_atags;
>  
> +static atomic_t waiting_for_crash_ipi;
> +
>  /*
>   * Provide a dummy crash_notes definition while crash dump arrives to arm.
>   * This prevents breakage of crash_notes attribute in kernel/ksysfs.c.
> @@ -37,8 +39,34 @@ void machine_kexec_cleanup(struct kimage *image)
>  {
>  }
>  
> +void machine_crash_nonpanic_core(void *unused)
> +{
> +	struct pt_regs regs;
> +
> +	crash_setup_regs(®s, NULL);
> +	printk(KERN_EMERG "CPU %u will stop doing anything useful since another CPU has crashed\n",
> +	       smp_processor_id());
> +	crash_save_cpu(®s, smp_processor_id());
> +	flush_cache_all();
> +
> +	atomic_dec(&waiting_for_crash_ipi);
> +	while (1)
> +		cpu_relax();
> +}
> +
>  void machine_crash_shutdown(struct pt_regs *regs)
>  {
> +	unsigned long msecs;
> +
> +	atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1);
> +
> +	local_irq_enable();
I wonder whether it is good idea to enable interrupts here? What
if we came here from an interrupt handler with interrupts already
disabled?
I guess you did this because smp_call_function() needs to have
interrupts enabled, right?
As we still need to make sure that all the secondary CPUs are
stopped, should we do just something like:
	smp_send_stop();
and then in ipi_cpu_stop() we check whether oops_in_progress is set
and save the cpu state before entering that never-ending loop?
Regards,
MW
> +	smp_call_function(machine_crash_nonpanic_core, NULL, false);
> +	msecs = 1000; /* Wait at most a second for the other cpus to stop */
> +	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
> +		mdelay(1);
> +		msecs--;
> +	}
>  	local_irq_disable();
>  	crash_save_cpu(regs, smp_processor_id());
>  
> -- 
> 1.7.2.2
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22  7:24 ` Mika Westerberg
@ 2010-11-22  9:47   ` Maxim Uvarov
  2010-11-22  9:53   ` Per Fransson
  1 sibling, 0 replies; 23+ messages in thread
From: Maxim Uvarov @ 2010-11-22  9:47 UTC (permalink / raw)
  To: linux-arm-kernel
2010/11/22 Mika Westerberg <mika.westerberg@iki.fi>:
> On Fri, Nov 19, 2010 at 11:08:30AM +0100, Per Fransson wrote:
>> When kexec is used to start a crash kernel the other cores
>> are notified. These non-crashing cores will save their state
>> in the crash notes and then do nothing.
>>
>> Signed-off-by: Per Fransson <per.xx.fransson@stericsson.com>
>> ---
>> ?arch/arm/kernel/machine_kexec.c | ? 28 ++++++++++++++++++++++++++++
>> ?1 files changed, 28 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
>> index 3a8fd51..56683db 100644
>> --- a/arch/arm/kernel/machine_kexec.c
>> +++ b/arch/arm/kernel/machine_kexec.c
>> @@ -23,6 +23,8 @@ extern unsigned long kexec_indirection_page;
>> ?extern unsigned long kexec_mach_type;
>> ?extern unsigned long kexec_boot_atags;
>>
>> +static atomic_t waiting_for_crash_ipi;
>> +
>> ?/*
>> ? * Provide a dummy crash_notes definition while crash dump arrives to arm.
>> ? * This prevents breakage of crash_notes attribute in kernel/ksysfs.c.
>> @@ -37,8 +39,34 @@ void machine_kexec_cleanup(struct kimage *image)
>> ?{
>> ?}
>>
>> +void machine_crash_nonpanic_core(void *unused)
>> +{
>> + ? ? struct pt_regs regs;
>> +
>> + ? ? crash_setup_regs(®s, NULL);
>> + ? ? printk(KERN_EMERG "CPU %u will stop doing anything useful since another CPU has crashed\n",
>> + ? ? ? ? ? ?smp_processor_id());
>> + ? ? crash_save_cpu(®s, smp_processor_id());
>> + ? ? flush_cache_all();
>> +
>> + ? ? atomic_dec(&waiting_for_crash_ipi);
>> + ? ? while (1)
>> + ? ? ? ? ? ? cpu_relax();
>> +}
>> +
>> ?void machine_crash_shutdown(struct pt_regs *regs)
>> ?{
>> + ? ? unsigned long msecs;
>> +
>> + ? ? atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1);
>> +
>> + ? ? local_irq_enable();
>
> I wonder whether it is good idea to enable interrupts here? What
> if we came here from an interrupt handler with interrupts already
> disabled?
>
> I guess you did this because smp_call_function() needs to have
> interrupts enabled, right?
>
> As we still need to make sure that all the secondary CPUs are
> stopped, should we do just something like:
>
> ? ? ? ?smp_send_stop();
>
> and then in ipi_cpu_stop() we check whether oops_in_progress is set
> and save the cpu state before entering that never-ending loop?
>
> Regards,
> MW
>
This is absolutely true. In kexec case interrupts are enables so you
can use smp_call_function() but in crash dump case interrupts are
disables due place where crash was occur (null pointer deference, bad
memory area or something like that). Enabling interrupts only for
smp_call_function()  looks ugly and can be the cause of more serious
bugs. Please add CRASH_DUMP IPI for this,
Best regards,
Maxim.
>> + ? ? smp_call_function(machine_crash_nonpanic_core, NULL, false);
>> + ? ? msecs = 1000; /* Wait at most a second for the other cpus to stop */
>> + ? ? while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
>> + ? ? ? ? ? ? mdelay(1);
>> + ? ? ? ? ? ? msecs--;
>> + ? ? }
>> ? ? ? local_irq_disable();
>> ? ? ? crash_save_cpu(regs, smp_processor_id());
>>
>> --
>> 1.7.2.2
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> _______________________________________________
> kexec mailing list
> kexec at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>
-- 
Best regards,
Maxim Uvarov
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22  7:24 ` Mika Westerberg
  2010-11-22  9:47   ` Maxim Uvarov
@ 2010-11-22  9:53   ` Per Fransson
  2010-11-22 10:47     ` Russell King - ARM Linux
  1 sibling, 1 reply; 23+ messages in thread
From: Per Fransson @ 2010-11-22  9:53 UTC (permalink / raw)
  To: linux-arm-kernel
>
> I wonder whether it is good idea to enable interrupts here? What
> if we came here from an interrupt handler with interrupts already
> disabled?
>
> I guess you did this because smp_call_function() needs to have
> interrupts enabled, right?
>
Yup, that's why I did it.
> As we still need to make sure that all the secondary CPUs are
> stopped, should we do just something like:
>
> ? ? ? ?smp_send_stop();
>
> and then in ipi_cpu_stop() we check whether oops_in_progress is set
> and save the cpu state before entering that never-ending loop?
But we still need to wait for it to complete, right? Will smp_send_stop()
handle the requirements of the underlying ipi regardless of the initial state?
Regards,
Per
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22  9:53   ` Per Fransson
@ 2010-11-22 10:47     ` Russell King - ARM Linux
  2010-11-22 11:27       ` Russell King - ARM Linux
  0 siblings, 1 reply; 23+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 10:47 UTC (permalink / raw)
  To: linux-arm-kernel
On Mon, Nov 22, 2010 at 10:53:00AM +0100, Per Fransson wrote:
> >
> > I wonder whether it is good idea to enable interrupts here? What
> > if we came here from an interrupt handler with interrupts already
> > disabled?
> >
> > I guess you did this because smp_call_function() needs to have
> > interrupts enabled, right?
> >
> 
> Yup, that's why I did it.
> 
> > As we still need to make sure that all the secondary CPUs are
> > stopped, should we do just something like:
> >
> > ? ? ? ?smp_send_stop();
> >
> > and then in ipi_cpu_stop() we check whether oops_in_progress is set
> > and save the cpu state before entering that never-ending loop?
> 
> But we still need to wait for it to complete, right? Will smp_send_stop()
> handle the requirements of the underlying ipi regardless of the initial state?
In a crashing state, any kind of IPIs are not guaranteed.  The other
CPUs may be in an IRQ-protected region waiting for a lock which the
crashing CPU already holds, and so the IPI won't be received.
However, we do need smp_send_stop() to wait for a limited time for the
other CPUs to respond to the request.  We should also do something more
intelligent than spinning in a while(1) loop.
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 10:47     ` Russell King - ARM Linux
@ 2010-11-22 11:27       ` Russell King - ARM Linux
  2010-11-22 13:07         ` Santosh Shilimkar
  2010-11-22 14:31         ` Per Fransson
  0 siblings, 2 replies; 23+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 11:27 UTC (permalink / raw)
  To: linux-arm-kernel
On Mon, Nov 22, 2010 at 10:47:40AM +0000, Russell King - ARM Linux wrote:
> However, we do need smp_send_stop() to wait for a limited time for the
> other CPUs to respond to the request.
ARM: smp: make smp_send_stop() wait for secondary CPUs to stop
Wait up to one second for secondary CPUs to respond to a request to
stop.  This avoids the sender CPU continuing and possibly destroying
state before the recipients have had a chance to respond to the stop.
However, if the recipients have crashed, we won't hang the sender
CPU indefinitely.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/kernel/smp.c |    6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 5d0dc16..32a2158 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -589,10 +589,16 @@ void smp_send_reschedule(int cpu)
 
 void smp_send_stop(void)
 {
+	unsigned long timeout;
 	cpumask_t mask = cpu_online_map;
 	cpu_clear(smp_processor_id(), mask);
 	if (!cpus_empty(mask))
 		send_ipi_message(&mask, IPI_CPU_STOP);
+
+	/* Wait up to one second for other CPUs to stop */
+	timeout = USEC_PER_SEC;
+	while (num_online_cpus() > 1 && timeout--)
+		udelay(1);
 }
 
 /*
^ permalink raw reply related	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 11:27       ` Russell King - ARM Linux
@ 2010-11-22 13:07         ` Santosh Shilimkar
  2010-11-22 14:21           ` Russell King - ARM Linux
  2010-11-22 14:31         ` Per Fransson
  1 sibling, 1 reply; 23+ messages in thread
From: Santosh Shilimkar @ 2010-11-22 13:07 UTC (permalink / raw)
  To: linux-arm-kernel
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux at arm.linux.org.uk]
> Sent: Monday, November 22, 2010 4:58 PM
> To: Per Fransson; Shilimkar, Santosh
> Cc: Mika Westerberg; kexec at lists.infradead.org; linux-arm-
> kernel at lists.infradead.org
> Subject: Re: [PATCH] Add call to non-crashing cores through IPI
>
> On Mon, Nov 22, 2010 at 10:47:40AM +0000, Russell King - ARM Linux
wrote:
> > However, we do need smp_send_stop() to wait for a limited time for the
> > other CPUs to respond to the request.
>
> ARM: smp: make smp_send_stop() wait for secondary CPUs to stop
>
> Wait up to one second for secondary CPUs to respond to a request to
> stop.  This avoids the sender CPU continuing and possibly destroying
> state before the recipients have had a chance to respond to the stop.
> However, if the recipients have crashed, we won't hang the sender
> CPU indefinitely.
>
Just tried this patch thinking it might fix the reboot issue
reported earlier. But it doesn't help.
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  arch/arm/kernel/smp.c |    6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 5d0dc16..32a2158 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -589,10 +589,16 @@ void smp_send_reschedule(int cpu)
>
>  void smp_send_stop(void)
>  {
> +	unsigned long timeout;
>  	cpumask_t mask = cpu_online_map;
>  	cpu_clear(smp_processor_id(), mask);
>  	if (!cpus_empty(mask))
>  		send_ipi_message(&mask, IPI_CPU_STOP);
> +
> +	/* Wait up to one second for other CPUs to stop */
> +	timeout = USEC_PER_SEC;
> +	while (num_online_cpus() > 1 && timeout--)
> +		udelay(1);
>  }
>
>  /*
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 13:07         ` Santosh Shilimkar
@ 2010-11-22 14:21           ` Russell King - ARM Linux
  2010-11-22 17:25             ` Santosh Shilimkar
  0 siblings, 1 reply; 23+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 14:21 UTC (permalink / raw)
  To: linux-arm-kernel
On Mon, Nov 22, 2010 at 06:37:09PM +0530, Santosh Shilimkar wrote:
> > ARM: smp: make smp_send_stop() wait for secondary CPUs to stop
> >
> > Wait up to one second for secondary CPUs to respond to a request to
> > stop.  This avoids the sender CPU continuing and possibly destroying
> > state before the recipients have had a chance to respond to the stop.
> > However, if the recipients have crashed, we won't hang the sender
> > CPU indefinitely.
> >
> Just tried this patch thinking it might fix the reboot issue
> reported earlier. But it doesn't help.
I was hoping that it might change the behaviour slightly, but if not
the patch can't do any harm.
The reboot stuff works fine on Versatile Express with Cortex A9, so
I suspect your problem might be specific to OMAP.
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 11:27       ` Russell King - ARM Linux
  2010-11-22 13:07         ` Santosh Shilimkar
@ 2010-11-22 14:31         ` Per Fransson
  2010-11-22 14:40           ` Russell King - ARM Linux
  1 sibling, 1 reply; 23+ messages in thread
From: Per Fransson @ 2010-11-22 14:31 UTC (permalink / raw)
  To: linux-arm-kernel
On Mon, Nov 22, 2010 at 12:27 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Nov 22, 2010 at 10:47:40AM +0000, Russell King - ARM Linux wrote:
>> However, we do need smp_send_stop() to wait for a limited time for the
>> other CPUs to respond to the request.
>
> ARM: smp: make smp_send_stop() wait for secondary CPUs to stop
>
> Wait up to one second for secondary CPUs to respond to a request to
> stop. ?This avoids the sender CPU continuing and possibly destroying
> state before the recipients have had a chance to respond to the stop.
> However, if the recipients have crashed, we won't hang the sender
> CPU indefinitely.
>
The point of the crash kernel functionality is to make it possible to grab a
snapshot of the system at the time of the crash. smp_send_stop() will
take the other cores offline, which makes the snapshot differ from the
crash state more than it has to. To be more concrete, any core dump
analysis tool which reads the cpu_online_mask to determine the number
of cpus in use will get an incorrect picture.
Regards,
Per
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 14:31         ` Per Fransson
@ 2010-11-22 14:40           ` Russell King - ARM Linux
  2010-11-22 17:03             ` Per Fransson
  0 siblings, 1 reply; 23+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 14:40 UTC (permalink / raw)
  To: linux-arm-kernel
On Mon, Nov 22, 2010 at 03:31:24PM +0100, Per Fransson wrote:
> On Mon, Nov 22, 2010 at 12:27 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Mon, Nov 22, 2010 at 10:47:40AM +0000, Russell King - ARM Linux wrote:
> >> However, we do need smp_send_stop() to wait for a limited time for the
> >> other CPUs to respond to the request.
> >
> > ARM: smp: make smp_send_stop() wait for secondary CPUs to stop
> >
> > Wait up to one second for secondary CPUs to respond to a request to
> > stop. ?This avoids the sender CPU continuing and possibly destroying
> > state before the recipients have had a chance to respond to the stop.
> > However, if the recipients have crashed, we won't hang the sender
> > CPU indefinitely.
> >
> 
> The point of the crash kernel functionality is to make it possible to grab a
> snapshot of the system at the time of the crash. smp_send_stop() will
> take the other cores offline, which makes the snapshot differ from the
> crash state more than it has to. To be more concrete, any core dump
> analysis tool which reads the cpu_online_mask to determine the number
> of cpus in use will get an incorrect picture.
Well, you can't go around randomly enabling interrupts to call functions
that require interrupts to be enabled, so I guess it's not possible to
save the state of the other cores.
I guess you're going to have to come up with another solution.
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 14:40           ` Russell King - ARM Linux
@ 2010-11-22 17:03             ` Per Fransson
  2010-11-22 17:12               ` Catalin Marinas
  0 siblings, 1 reply; 23+ messages in thread
From: Per Fransson @ 2010-11-22 17:03 UTC (permalink / raw)
  To: linux-arm-kernel
On the other hand, do we really need to enable the interrupts before
performing an ipi?
The important thing must be for them to be enabled on the
receiving/callee cores. It's not
as if we want the crashing core to be interruptable, is it?
I tried it and it seems to work fine.
/Per
On Mon, Nov 22, 2010 at 3:40 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Nov 22, 2010 at 03:31:24PM +0100, Per Fransson wrote:
>> On Mon, Nov 22, 2010 at 12:27 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Mon, Nov 22, 2010 at 10:47:40AM +0000, Russell King - ARM Linux wrote:
>> >> However, we do need smp_send_stop() to wait for a limited time for the
>> >> other CPUs to respond to the request.
>> >
>> > ARM: smp: make smp_send_stop() wait for secondary CPUs to stop
>> >
>> > Wait up to one second for secondary CPUs to respond to a request to
>> > stop. ?This avoids the sender CPU continuing and possibly destroying
>> > state before the recipients have had a chance to respond to the stop.
>> > However, if the recipients have crashed, we won't hang the sender
>> > CPU indefinitely.
>> >
>>
>> The point of the crash kernel functionality is to make it possible to grab a
>> snapshot of the system at the time of the crash. smp_send_stop() will
>> take the other cores offline, which makes the snapshot differ from the
>> crash state more than it has to. To be more concrete, any core dump
>> analysis tool which reads the cpu_online_mask to determine the number
>> of cpus in use will get an incorrect picture.
>
> Well, you can't go around randomly enabling interrupts to call functions
> that require interrupts to be enabled, so I guess it's not possible to
> save the state of the other cores.
>
> I guess you're going to have to come up with another solution.
>
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 17:03             ` Per Fransson
@ 2010-11-22 17:12               ` Catalin Marinas
  2010-11-22 20:09                 ` Maxim Uvarov
  0 siblings, 1 reply; 23+ messages in thread
From: Catalin Marinas @ 2010-11-22 17:12 UTC (permalink / raw)
  To: linux-arm-kernel
On 22 November 2010 17:03, Per Fransson <per.fransson.ml@gmail.com> wrote:
> On the other hand, do we really need to enable the interrupts before
> performing an ipi?
> The important thing must be for them to be enabled on the
> receiving/callee cores. It's not
> as if we want the crashing core to be interruptable, is it?
That's for the situation when the other core tries to send an IPI at
the same time. Unlikely but both cores may deadlock.
I actually had a patch to allow this which was really complicated and
it still failed in a real-world scenario which I could foresee.
-- 
Catalin
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 14:21           ` Russell King - ARM Linux
@ 2010-11-22 17:25             ` Santosh Shilimkar
  0 siblings, 0 replies; 23+ messages in thread
From: Santosh Shilimkar @ 2010-11-22 17:25 UTC (permalink / raw)
  To: linux-arm-kernel
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux at arm.linux.org.uk]
> Sent: Monday, November 22, 2010 7:51 PM
> To: Santosh Shilimkar
> Cc: Per Fransson; Mika Westerberg; kexec at lists.infradead.org; linux-arm-
> kernel at lists.infradead.org
> Subject: Re: [PATCH] Add call to non-crashing cores through IPI
>
> On Mon, Nov 22, 2010 at 06:37:09PM +0530, Santosh Shilimkar wrote:
> > > ARM: smp: make smp_send_stop() wait for secondary CPUs to stop
> > >
> > > Wait up to one second for secondary CPUs to respond to a request to
> > > stop.  This avoids the sender CPU continuing and possibly destroying
> > > state before the recipients have had a chance to respond to the
stop.
> > > However, if the recipients have crashed, we won't hang the sender
> > > CPU indefinitely.
> > >
> > Just tried this patch thinking it might fix the reboot issue
> > reported earlier. But it doesn't help.
>
> I was hoping that it might change the behaviour slightly, but if not
> the patch can't do any harm.
>
> The reboot stuff works fine on Versatile Express with Cortex A9, so
> I suspect your problem might be specific to OMAP.
Interesting.
I thought it's a common issue. Will have a look at it more
closely
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 17:12               ` Catalin Marinas
@ 2010-11-22 20:09                 ` Maxim Uvarov
  2010-11-22 22:14                   ` Per Fransson
  2010-11-23  9:14                   ` Catalin Marinas
  0 siblings, 2 replies; 23+ messages in thread
From: Maxim Uvarov @ 2010-11-22 20:09 UTC (permalink / raw)
  To: linux-arm-kernel
2010/11/22 Catalin Marinas <catalin.marinas@arm.com>:
> On 22 November 2010 17:03, Per Fransson <per.fransson.ml@gmail.com> wrote:
>> On the other hand, do we really need to enable the interrupts before
>> performing an ipi?
>> The important thing must be for them to be enabled on the
>> receiving/callee cores. It's not
>> as if we want the crashing core to be interruptable, is it?
>
> That's for the situation when the other core tries to send an IPI at
> the same time. Unlikely but both cores may deadlock.
>
Am I right that deadlock can occur only if the same IPI was sent? And
there should
be no situation when CRASH_DUMP IPI was sent from different processors at
the same time.
> I actually had a patch to allow this which was really complicated and
> it still failed in a real-world scenario which I could foresee.
>
> --
> Catalin
>
> _______________________________________________
> kexec mailing list
> kexec at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>
-- 
Best regards,
Maxim Uvarov
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 20:09                 ` Maxim Uvarov
@ 2010-11-22 22:14                   ` Per Fransson
  2010-11-22 22:30                     ` Per Fransson
  2010-11-23  9:14                   ` Catalin Marinas
  1 sibling, 1 reply; 23+ messages in thread
From: Per Fransson @ 2010-11-22 22:14 UTC (permalink / raw)
  To: linux-arm-kernel
>>
>> That's for the situation when the other core tries to send an IPI at
>> the same time. Unlikely but both cores may deadlock.
>>
>
> Am I right that deadlock can occur only if the same IPI was sent? And
> there should
> be no situation when CRASH_DUMP IPI was sent from different processors at
> the same time.
>
I hope you are right, but I can't see why we couldn't get a crash IPI
from several
processors at the same time, unfortunately.
Even if such a thing can happen however, the timeout in my original suggestion
(ignoring the interrupt enabling) should keep us from deadlocking indefinitely.
/Per
>> I actually had a patch to allow this which was really complicated and
>> it still failed in a real-world scenario which I could foresee.
>>
>> --
>> Catalin
>>
>> _______________________________________________
>> kexec mailing list
>> kexec at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
>>
>
>
>
> --
> Best regards,
> Maxim Uvarov
>
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 22:14                   ` Per Fransson
@ 2010-11-22 22:30                     ` Per Fransson
  0 siblings, 0 replies; 23+ messages in thread
From: Per Fransson @ 2010-11-22 22:30 UTC (permalink / raw)
  To: linux-arm-kernel
On Mon, Nov 22, 2010 at 11:14 PM, Per Fransson
<per.fransson.ml@gmail.com> wrote:
>>>
>>> That's for the situation when the other core tries to send an IPI at
>>> the same time. Unlikely but both cores may deadlock.
>>>
>>
>> Am I right that deadlock can occur only if the same IPI was sent? And
>> there should
>> be no situation when CRASH_DUMP IPI was sent from different processors at
>> the same time.
>>
>
> I hope you are right, but I can't see why we couldn't get a crash IPI
> from several
> processors at the same time, unfortunately.
>
I'll just follow up my own post here. On second thought, crash_kexec()
takes the kexec_mutex,
so simultaneous crash IPIs probably couldn't happen.
/Per
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-22 20:09                 ` Maxim Uvarov
  2010-11-22 22:14                   ` Per Fransson
@ 2010-11-23  9:14                   ` Catalin Marinas
  2010-11-23 10:57                     ` Per Fransson
  1 sibling, 1 reply; 23+ messages in thread
From: Catalin Marinas @ 2010-11-23  9:14 UTC (permalink / raw)
  To: linux-arm-kernel
On 22 November 2010 20:09, Maxim Uvarov <muvarov@gmail.com> wrote:
> 2010/11/22 Catalin Marinas <catalin.marinas@arm.com>:
>> On 22 November 2010 17:03, Per Fransson <per.fransson.ml@gmail.com> wrote:
>>> On the other hand, do we really need to enable the interrupts before
>>> performing an ipi?
>>> The important thing must be for them to be enabled on the
>>> receiving/callee cores. It's not
>>> as if we want the crashing core to be interruptable, is it?
>>
>> That's for the situation when the other core tries to send an IPI at
>> the same time. Unlikely but both cores may deadlock.
>>
>
> Am I right that deadlock can occur only if the same IPI was sent?
Not necessarily. It depends on whether the CPU issuing the IPI needs
to wait for the completion of the cross-call. If you don't need to
wait, you can send the IPI with the interrupts disabled (the platform
smp_cross_call is already called with interrupts disabled).
-- 
Catalin
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-23  9:14                   ` Catalin Marinas
@ 2010-11-23 10:57                     ` Per Fransson
  2010-11-23 12:37                       ` Catalin Marinas
  0 siblings, 1 reply; 23+ messages in thread
From: Per Fransson @ 2010-11-23 10:57 UTC (permalink / raw)
  To: linux-arm-kernel
>> Am I right that deadlock can occur only if the same IPI was sent?
>
> Not necessarily. It depends on whether the CPU issuing the IPI needs
> to wait for the completion of the cross-call. If you don't need to
> wait, you can send the IPI with the interrupts disabled (the platform
> smp_cross_call is already called with interrupts disabled).
>
Well, smp_call_function() only optionally waits. As long as we choose not to,
using that function to ask the other cores to save their states and idle, should
be ok, right? In that case we don't need another ipi_msg_type and we can do
it with the interrupts of the ipi caller disabled.
/Per
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-23 10:57                     ` Per Fransson
@ 2010-11-23 12:37                       ` Catalin Marinas
  2010-11-23 12:41                         ` Mika Westerberg
  0 siblings, 1 reply; 23+ messages in thread
From: Catalin Marinas @ 2010-11-23 12:37 UTC (permalink / raw)
  To: linux-arm-kernel
On 23 November 2010 10:57, Per Fransson <per.fransson.ml@gmail.com> wrote:
>>> Am I right that deadlock can occur only if the same IPI was sent?
>>
>> Not necessarily. It depends on whether the CPU issuing the IPI needs
>> to wait for the completion of the cross-call. If you don't need to
>> wait, you can send the IPI with the interrupts disabled (the platform
>> smp_cross_call is already called with interrupts disabled).
>>
>
> Well, smp_call_function() only optionally waits. As long as we choose not to,
> using that function to ask the other cores to save their states and idle, should
> be ok, right? In that case we don't need another ipi_msg_type and we can do
> it with the interrupts of the ipi caller disabled.
I think that should work but we still have a WARN_ON_ONCE in the
generic smp_call_function_*() if interrupts are disabled.
-- 
Catalin
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-23 12:37                       ` Catalin Marinas
@ 2010-11-23 12:41                         ` Mika Westerberg
  2010-11-23 12:48                           ` Catalin Marinas
  0 siblings, 1 reply; 23+ messages in thread
From: Mika Westerberg @ 2010-11-23 12:41 UTC (permalink / raw)
  To: linux-arm-kernel
On Tue, Nov 23, 2010 at 12:37:04PM +0000, Catalin Marinas wrote:
> On 23 November 2010 10:57, Per Fransson <per.fransson.ml@gmail.com> wrote:
> >>> Am I right that deadlock can occur only if the same IPI was sent?
> >>
> >> Not necessarily. It depends on whether the CPU issuing the IPI needs
> >> to wait for the completion of the cross-call. If you don't need to
> >> wait, you can send the IPI with the interrupts disabled (the platform
> >> smp_cross_call is already called with interrupts disabled).
> >>
> >
> > Well, smp_call_function() only optionally waits. As long as we choose not to,
> > using that function to ask the other cores to save their states and idle, should
> > be ok, right? In that case we don't need another ipi_msg_type and we can do
> > it with the interrupts of the ipi caller disabled.
> 
> I think that should work but we still have a WARN_ON_ONCE in the
> generic smp_call_function_*() if interrupts are disabled.
But when we are crashing we have oops_in_progress set:
        WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
                     && !oops_in_progress);
so this warning is never printed, right?
MW
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-23 12:41                         ` Mika Westerberg
@ 2010-11-23 12:48                           ` Catalin Marinas
  2010-11-23 13:01                             ` Maxim Uvarov
  0 siblings, 1 reply; 23+ messages in thread
From: Catalin Marinas @ 2010-11-23 12:48 UTC (permalink / raw)
  To: linux-arm-kernel
On Tue, 2010-11-23 at 12:41 +0000, Mika Westerberg wrote:
> On Tue, Nov 23, 2010 at 12:37:04PM +0000, Catalin Marinas wrote:
> > On 23 November 2010 10:57, Per Fransson <per.fransson.ml@gmail.com> wrote:
> > >>> Am I right that deadlock can occur only if the same IPI was sent?
> > >>
> > >> Not necessarily. It depends on whether the CPU issuing the IPI needs
> > >> to wait for the completion of the cross-call. If you don't need to
> > >> wait, you can send the IPI with the interrupts disabled (the platform
> > >> smp_cross_call is already called with interrupts disabled).
> > >>
> > >
> > > Well, smp_call_function() only optionally waits. As long as we choose not to,
> > > using that function to ask the other cores to save their states and idle, should
> > > be ok, right? In that case we don't need another ipi_msg_type and we can do
> > > it with the interrupts of the ipi caller disabled.
> >
> > I think that should work but we still have a WARN_ON_ONCE in the
> > generic smp_call_function_*() if interrupts are disabled.
> 
> But when we are crashing we have oops_in_progress set:
> 
>         WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>                      && !oops_in_progress);
> 
> so this warning is never printed, right?
Right. So just make sure the caller doesn't set the 'wait' argument.
-- 
Catalin
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-23 12:48                           ` Catalin Marinas
@ 2010-11-23 13:01                             ` Maxim Uvarov
  2010-11-23 16:31                               ` Per Fransson
  0 siblings, 1 reply; 23+ messages in thread
From: Maxim Uvarov @ 2010-11-23 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
2010/11/23 Catalin Marinas <catalin.marinas@arm.com>:
> On Tue, 2010-11-23 at 12:41 +0000, Mika Westerberg wrote:
>> On Tue, Nov 23, 2010 at 12:37:04PM +0000, Catalin Marinas wrote:
>> > On 23 November 2010 10:57, Per Fransson <per.fransson.ml@gmail.com> wrote:
>> > >>> Am I right that deadlock can occur only if the same IPI was sent?
>> > >>
>> > >> Not necessarily. It depends on whether the CPU issuing the IPI needs
>> > >> to wait for the completion of the cross-call. If you don't need to
>> > >> wait, you can send the IPI with the interrupts disabled (the platform
>> > >> smp_cross_call is already called with interrupts disabled).
>> > >>
>> > >
>> > > Well, smp_call_function() only optionally waits. As long as we choose not to,
>> > > using that function to ask the other cores to save their states and idle, should
>> > > be ok, right? In that case we don't need another ipi_msg_type and we can do
>> > > it with the interrupts of the ipi caller disabled.
>> >
>> > I think that should work but we still have a WARN_ON_ONCE in the
>> > generic smp_call_function_*() if interrupts are disabled.
>>
>> But when we are crashing we have oops_in_progress set:
>>
>> ? ? ? ? WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>> ? ? ? ? ? ? ? ? ? ? ?&& !oops_in_progress);
>>
>> so this warning is never printed, right?
>
> Right. So just make sure the caller doesn't set the 'wait' argument.
>
Please double check it. In powerpc crashdump  case  oops_in_progress is not set.
Maxim.
> --
> Catalin
>
>
>
-- 
Best regards,
Maxim Uvarov
^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH] Add call to non-crashing cores through IPI
  2010-11-23 13:01                             ` Maxim Uvarov
@ 2010-11-23 16:31                               ` Per Fransson
  0 siblings, 0 replies; 23+ messages in thread
From: Per Fransson @ 2010-11-23 16:31 UTC (permalink / raw)
  To: linux-arm-kernel
>>> But when we are crashing we have oops_in_progress set:
>>>
>>> ? ? ? ? WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>>> ? ? ? ? ? ? ? ? ? ? ?&& !oops_in_progress);
>>>
>>> so this warning is never printed, right?
>>
>> Right. So just make sure the caller doesn't set the 'wait' argument.
>>
>
> Please double check it. In powerpc crashdump ?case ?oops_in_progress is not set.
>
I tried this. Didn't get the warning.
/Per
^ permalink raw reply	[flat|nested] 23+ messages in thread
end of thread, other threads:[~2010-11-23 16:31 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-19 10:08 [PATCH] Add call to non-crashing cores through IPI Per Fransson
2010-11-22  7:24 ` Mika Westerberg
2010-11-22  9:47   ` Maxim Uvarov
2010-11-22  9:53   ` Per Fransson
2010-11-22 10:47     ` Russell King - ARM Linux
2010-11-22 11:27       ` Russell King - ARM Linux
2010-11-22 13:07         ` Santosh Shilimkar
2010-11-22 14:21           ` Russell King - ARM Linux
2010-11-22 17:25             ` Santosh Shilimkar
2010-11-22 14:31         ` Per Fransson
2010-11-22 14:40           ` Russell King - ARM Linux
2010-11-22 17:03             ` Per Fransson
2010-11-22 17:12               ` Catalin Marinas
2010-11-22 20:09                 ` Maxim Uvarov
2010-11-22 22:14                   ` Per Fransson
2010-11-22 22:30                     ` Per Fransson
2010-11-23  9:14                   ` Catalin Marinas
2010-11-23 10:57                     ` Per Fransson
2010-11-23 12:37                       ` Catalin Marinas
2010-11-23 12:41                         ` Mika Westerberg
2010-11-23 12:48                           ` Catalin Marinas
2010-11-23 13:01                             ` Maxim Uvarov
2010-11-23 16:31                               ` Per Fransson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).