public inbox for linux-arm-kernel@lists.infradead.org
* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
@ 2011-03-08  9:53 saeed bishara
  2011-03-08 13:33 ` Catalin Marinas
  0 siblings, 1 reply; 12+ messages in thread
From: saeed bishara @ 2011-03-08  9:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,
    The lockup below happens on my SMP system, which doesn't support hw
tlb broadcast.
    After some debugging I found that the mask used inside
smp_call_function_many(), which points to mm_cpumask, gets changed
asynchronously, apparently by reset_context() called from an IPI
that was issued by another cpu.
    When I disable interrupts in smp_call_function_many() around the
code that uses the mask, the issue disappears.
    Also, reverting the patch "ARM: 5905/1: ARM: Global ASID
allocation on SMP" eliminates this specific bug.



BUG: soft lockup - CPU#0 stuck for 61s! [aptitude:1721]
Modules linked in:
Pid: 1721, comm:             aptitude
CPU: 0    Not tainted  (2.6.35.9-00005-g106dd76 #4)
PC is at csd_lock_wait+0x14/0x28
LR is at smp_call_function_many+0x1f0/0x21c
pc : [<c01b70d8>]    lr : [<c01b7804>]    psr: 20000113
sp : e19b5d30  ip : e19b5d40  fp : e19b5d3c
r10: c05560e0  r9 : 00da3000  r8 : 00000001
r7 : c002ea00  r6 : c0dd1a00  r5 : 00000000  r4 : c0dd1a18
r3 : 00000000  r2 : 00000e00  r1 : 00000004  r0 : c0dd1a00
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 00c5387d  Table: 21b6001a  DAC: 00000015
[<c014e5c0>] (show_regs+0x0/0x50) from [<c01bdc64>] (softlockup_tick+0x160/0x1c8)
 r4:e19b5ce8 r3:c002db44
[<c01bdb04>] (softlockup_tick+0x0/0x1c8) from [<c019c330>] (run_local_timers+0x1c/0x20)
[<c019c314>] (run_local_timers+0x0/0x20) from [<c019c368>] (update_process_times+0x34/0x58)
[<c019c334>] (update_process_times+0x0/0x58) from [<c01b33d4>] (tick_periodic+0xe8/0x114)
 r6:c0dd0050 r5:c05a3dc0 r4:c0dd0050 r3:20000113
[<c01b32ec>] (tick_periodic+0x0/0x114) from [<c01b342c>] (tick_handle_periodic+0x2c/0xd4)
[<c01b3400>] (tick_handle_periodic+0x0/0xd4) from [<c01529cc>] (ipi_timer+0x3c/0x4c)
 r7:c002ea00 r6:80000020 r5:c05a3dc0 r4:c0dd0050
[<c0152990>] (ipi_timer+0x0/0x4c) from [<c002f3fc>] (do_local_timer+0x58/0x88)
 r4:00000000 r3:00003687
[<c002f3a4>] (do_local_timer+0x0/0x88) from [<c0043f74>] (__irq_svc+0x34/0x100)
Exception stack(0xe19b5ce8 to 0xe19b5d30)
5ce0:                   c0dd1a00 00000004 00000e00 00000000 c0dd1a18 00000000
5d00: c0dd1a00 c002ea00 00000001 00da3000 c05560e0 e19b5d3c e19b5d40 e19b5d30
5d20: c01b7804 c01b70d8 20000113 ffffffff
 r5:fbb21000 r4:ffffffff
[<c01b70c4>] (csd_lock_wait+0x0/0x28) from [<c01b7804>] (smp_call_function_many+0x1f0/0x21c)
[<c01b7614>] (smp_call_function_many+0x0/0x21c) from [<c0152bac>] (T.319+0x2c/0x6c)
[<c0152b80>] (T.319+0x0/0x6c) from [<c0152ca8>] (flush_tlb_page+0x40/0xa4)
 r6:41fb1000 r5:e1b906a8 r4:00000000 r3:00000000
[<c0152c68>] (flush_tlb_page+0x0/0xa4) from [<c01e653c>] (do_wp_page+0x3c0/0x7cc)
[<c01e617c>] (do_wp_page+0x0/0x7cc) from [<c01e735c>] (handle_mm_fault+0x6e4/0x7a8)
[<c01e6c78>] (handle_mm_fault+0x0/0x7a8) from [<c0046014>] (do_page_fault+0x130/0x2f8)
[<c0045ee4>] (do_page_fault+0x0/0x2f8) from [<c002f590>] (do_DataAbort+0x3c/0xa0)
[<c002f554>] (do_DataAbort+0x0/0xa0) from [<c00443c4>] (ret_from_exception+0x0/0x10)


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-08  9:53 SMP soft lockup on smp_call_function_many when doing flush_tlb_page saeed bishara
@ 2011-03-08 13:33 ` Catalin Marinas
  2011-03-08 15:28   ` saeed bishara
  0 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2011-03-08 13:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2011-03-08 at 09:53 +0000, saeed bishara wrote:
>     The lockup below happens on my SMP system, which doesn't support hw
> tlb broadcast.
>     After some debugging I found that the mask used inside
> smp_call_function_many(), which points to mm_cpumask, gets changed
> asynchronously, apparently by reset_context() called from an IPI
> that was issued by another cpu.
>     When I disable interrupts in smp_call_function_many() around the
> code that uses the mask, the issue disappears.
>     Also, reverting the patch "ARM: 5905/1: ARM: Global ASID
> allocation on SMP" eliminates this specific bug.

It looks like smp_call_function_many() does all the checks on the mask
argument that it received and decides that there are CPUs to call (fair
enough). But in the meantime reset_context() via IPI clears this mask
leaving only the current CPU. Once the IPI was handled,
smp_call_function_many() copies this mask to data->cpumask and clears
the current CPU leaving an empty mask. It then waits for the other
(none) CPUs to clear the csd lock which would never happen as data->refs
is 0.
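
The sequence described above can be modelled in a few lines of
user-space C. This is only a toy sketch under stated assumptions: a
32-bit integer stands in for the cpumask, the two mask arguments stand
for the mask as seen before and after the IPI, and every race_* name is
hypothetical rather than a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint32_t race_cpumask;

/* Count set bits, playing the role of cpumask_weight(). */
static int race_weight(race_cpumask m)
{
	int n = 0;

	for (; m; m >>= 1)
		n += (int)(m & 1u);
	return n;
}

/*
 * Model the late copy: take the shared mask as it looks after the IPI,
 * clear the calling CPU, and return what data->refs would be set to.
 */
static int race_refs_after_copy(race_cpumask mask_after_ipi, int this_cpu)
{
	race_cpumask copy = mask_after_ipi & ~(1u << this_cpu);

	return race_weight(copy);
}

/*
 * Returns 1 when the early checks saw a remote CPU in the mask but the
 * later copy ends up empty, i.e. the caller would wait on a csd lock
 * that no other CPU is going to release.
 */
static int race_would_hang(race_cpumask mask_before_ipi,
			   race_cpumask mask_after_ipi, int this_cpu)
{
	int saw_remote = (mask_before_ipi & ~(1u << this_cpu)) != 0;

	return saw_remote &&
	       race_refs_after_copy(mask_after_ipi, this_cpu) == 0;
}

/* Small self-check of the two interesting cases. */
static void race_demo(void)
{
	/* CPUs 0 and 1 in the mask, then the IPI leaves only CPU 0. */
	assert(race_would_hang(0x3, 0x1, 0) == 1);
	/* Mask unchanged: one remote CPU remains, so the wait can finish. */
	assert(race_would_hang(0x3, 0x3, 0) == 0);
}
```

In this model the hang condition is exactly "a remote CPU was seen
earlier, but the reference count computed from the later copy is zero",
which is the case an early return on a zero count would catch.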

There are probably a few ways to fix this. My preferred method is to
modify the generic code (I'll add a proper commit log later and post it
on LKML if it works):


diff --git a/kernel/smp.c b/kernel/smp.c
index 9910744..a79454f 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -499,6 +499,10 @@ void smp_call_function_many(const struct cpumask *mask,
 	smp_wmb();
 
 	atomic_set(&data->refs, cpumask_weight(data->cpumask));
+	if (unlikely(!atomic_read(&data->refs))) {
+		csd_unlock(&data->csd);
+		return;
+	}
 
 	raw_spin_lock_irqsave(&call_function.lock, flags);
 	/*


An alternative would be to copy the cpumask to a local variable in
on_each_cpu_mask(), though the workaround above would cover other cases
that we haven't spotted yet. Also, the smp_call_function_many()
description doesn't state that the cpumask should not be modified.


diff --git a/arch/arm/kernel/smp_tlb.c b/arch/arm/kernel/smp_tlb.c
index 8f57f32..1717dec 100644
--- a/arch/arm/kernel/smp_tlb.c
+++ b/arch/arm/kernel/smp_tlb.c
@@ -16,10 +16,13 @@
 static void on_each_cpu_mask(void (*func)(void *), void *info, int wait,
 	const struct cpumask *mask)
 {
+	struct cpumask call_mask;
+
 	preempt_disable();
 
+	cpumask_copy(&call_mask, mask);
 	smp_call_function_many(mask, func, info, wait);
-	if (cpumask_test_cpu(smp_processor_id(), mask))
+	if (cpumask_test_cpu(smp_processor_id(), &call_mask))
 		func(info);
 
 	preempt_enable();
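
The snapshot idea can be illustrated outside the kernel. The sketch
below is a user-space toy, not the ARM code: a 32-bit integer stands in
for the cpumask and all snap_* names are hypothetical. It assumes the
point of the copy is that both decisions (which remote CPUs to call and
whether to run locally) are taken from one read of the shared mask,
which matches the later remark in this thread that the call itself
should use call_mask as well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint32_t snap_cpumask;

/* What the caller decided to do, derived from a single snapshot. */
struct snap_result {
	snap_cpumask remote_targets;	/* CPUs that would receive the IPI */
	int ran_locally;		/* 1 if func(info) would run on this CPU */
};

/*
 * Read the shared mask exactly once, then base both decisions on the
 * local copy so that a concurrent writer cannot make them disagree.
 */
static struct snap_result snap_on_each_cpu_mask(const volatile snap_cpumask *shared,
						int this_cpu)
{
	snap_cpumask snapshot = *shared;	/* the only read of the shared mask */
	struct snap_result r;

	r.remote_targets = snapshot & ~(1u << this_cpu);
	r.ran_locally = (int)((snapshot >> this_cpu) & 1u);
	return r;
}

/* Small self-check: CPUs 0 and 2 in the mask, caller is CPU 0. */
static void snap_demo(void)
{
	snap_cpumask shared = 0x5;
	struct snap_result r = snap_on_each_cpu_mask(&shared, 0);

	assert(r.remote_targets == 0x4);
	assert(r.ran_locally == 1);
}
```

Note that a plain struct copy is not atomic for masks wider than one
word, so this only models the intent, not the memory-ordering details.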


-- 
Catalin


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-08 13:33 ` Catalin Marinas
@ 2011-03-08 15:28   ` saeed bishara
  2011-03-08 16:49     ` saeed bishara
  0 siblings, 1 reply; 12+ messages in thread
From: saeed bishara @ 2011-03-08 15:28 UTC (permalink / raw)
  To: linux-arm-kernel

> It looks like smp_call_function_many() does all the checks on the mask
> argument that it received and decides that there are CPUs to call (fair
> enough). But in the meantime reset_context() via IPI clears this mask
> leaving only the current CPU. Once the IPI was handled,
> smp_call_function_many() copies this mask to data->cpumask and clears
> the current CPU leaving an empty mask. It then waits for the other
> (none) CPUs to clear the csd lock which would never happen as data->refs
> is 0.
>
> There are probably a few ways to fix this. My preferred method is to
> modify the generic code (I'll add a proper commit log later and post it
> on LKML if it works):
>
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 9910744..a79454f 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -499,6 +499,10 @@ void smp_call_function_many(const struct cpumask *mask,
>        smp_wmb();
>
>        atomic_set(&data->refs, cpumask_weight(data->cpumask));
> +       if (unlikely(!atomic_read(&data->refs))) {
> +               csd_unlock(&data->csd);
> +               return;
> +       }
I don't think this is safe: if the mask gets cleared after cpu has been
set to a valid value and before next_cpu is calculated, the code will
go to the fast path (smp_call_function_single).
>
>        raw_spin_lock_irqsave(&call_function.lock, flags);
>        /*
>
>
> An alternative would be to copy the cpumask to a local variable in
> on_each_cpu_mask(), though the workaround above would cover other cases
> that we haven't spotted yet. Also, the smp_call_function_many()
> description doesn't state that the cpumask should not be modified.
>
>
> diff --git a/arch/arm/kernel/smp_tlb.c b/arch/arm/kernel/smp_tlb.c
> index 8f57f32..1717dec 100644
> --- a/arch/arm/kernel/smp_tlb.c
> +++ b/arch/arm/kernel/smp_tlb.c
> @@ -16,10 +16,13 @@
>  static void on_each_cpu_mask(void (*func)(void *), void *info, int wait,
>        const struct cpumask *mask)
>  {
> +       struct cpumask call_mask;
> +
>        preempt_disable();
>
> +       cpumask_copy(&call_mask, mask);
>        smp_call_function_many(mask, func, info, wait);
 I'll check this one, but the mask here should be call_mask.
> -       if (cpumask_test_cpu(smp_processor_id(), mask))
> +       if (cpumask_test_cpu(smp_processor_id(), &call_mask))
>                func(info);
>
>        preempt_enable();
>
>
> --
> Catalin
>
>
>


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-08 15:28   ` saeed bishara
@ 2011-03-08 16:49     ` saeed bishara
  2011-03-08 16:59       ` Catalin Marinas
  0 siblings, 1 reply; 12+ messages in thread
From: saeed bishara @ 2011-03-08 16:49 UTC (permalink / raw)
  To: linux-arm-kernel

>>        atomic_set(&data->refs, cpumask_weight(data->cpumask));
>> +       if (unlikely(!atomic_read(&data->refs))) {
>> +               csd_unlock(&data->csd);
>> +               return;
>> +       }
> I don't think this is safe: if the mask gets cleared after cpu has been
> set to a valid value and before next_cpu is calculated, the code will
> go to the fast path (smp_call_function_single).
I was wrong, this is actually not a problem, taking the fast path will
not hang the process.
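
To make the scenario concrete, here is a user-space toy of the path
selection being discussed. It is loosely modelled on how I understand
the cpu/next_cpu lookup in smp_call_function_many() to work in kernels
of that era, so treat the exact shape as an assumption; all path_*
names are hypothetical. The two mask arguments represent the shared
mask as read for cpu and, later, for next_cpu.

```c
#include <assert.h>
#include <stdint.h>

enum { PATH_NR_CPUS = 4 };

enum path_choice { PATH_NOTHING, PATH_SINGLE, PATH_MANY };

/* Next CPU set in mask strictly after prev (use -1 for "first"). */
static int path_next(uint32_t mask, int prev)
{
	for (int cpu = prev + 1; cpu < PATH_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return PATH_NR_CPUS;	/* plays the role of nr_cpu_ids */
}

/*
 * Pick a target CPU from the first read of the mask, then look for a
 * second target in the (possibly changed) second read.
 */
static enum path_choice path_choose(uint32_t mask_first_read,
				    uint32_t mask_second_read, int this_cpu)
{
	int cpu, next_cpu;

	cpu = path_next(mask_first_read, -1);
	if (cpu == this_cpu)
		cpu = path_next(mask_first_read, cpu);
	if (cpu >= PATH_NR_CPUS)
		return PATH_NOTHING;	/* no remote CPU at all */

	next_cpu = path_next(mask_second_read, cpu);
	if (next_cpu == this_cpu)
		next_cpu = path_next(mask_second_read, next_cpu);
	if (next_cpu >= PATH_NR_CPUS)
		return PATH_SINGLE;	/* fast path: one remote CPU */

	return PATH_MANY;
}

/* Small self-check: the mask shrinks between the two reads. */
static void path_demo(void)
{
	/* CPU 1 chosen first, then the mask is reduced to CPU 0 only. */
	assert(path_choose(0x3, 0x1, 0) == PATH_SINGLE);
}
```

In this model a mask that is cleared between the two reads simply ends
up on the single-CPU path for the CPU that was already chosen, rather
than on the multi-CPU path with an empty target set.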

>> An alternative would be to copy the cpumask to a local variable in
>> on_each_cpu_mask(), though the workaround above would cover other cases
>> that we haven't spotted yet. Also, the smp_call_function_many()
>> description doesn't state that the cpumask should not be modified.
>>
>>
>> diff --git a/arch/arm/kernel/smp_tlb.c b/arch/arm/kernel/smp_tlb.c
>> index 8f57f32..1717dec 100644
>> --- a/arch/arm/kernel/smp_tlb.c
>> +++ b/arch/arm/kernel/smp_tlb.c
>> @@ -16,10 +16,13 @@
>>  static void on_each_cpu_mask(void (*func)(void *), void *info, int wait,
>>        const struct cpumask *mask)
>>  {
>> +       struct cpumask call_mask;
>> +
>>        preempt_disable();
>>
>> +       cpumask_copy(&call_mask, mask);
>>        smp_call_function_many(mask, func, info, wait);
>  I'll check this one, but the mask here should be call_mask.
>> -       if (cpumask_test_cpu(smp_processor_id(), mask))
>> +       if (cpumask_test_cpu(smp_processor_id(), &call_mask))
>>                func(info);
>>
This patch increases my system's instability: for some reason, the
call_function_data gets corrupted while
generic_smp_call_function_interrupt() is running.
saeed


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-08 16:49     ` saeed bishara
@ 2011-03-08 16:59       ` Catalin Marinas
  2011-03-09 15:22         ` saeed bishara
  0 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2011-03-08 16:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2011-03-08 at 16:49 +0000, saeed bishara wrote:
> >>        atomic_set(&data->refs, cpumask_weight(data->cpumask));
> >> +       if (unlikely(!atomic_read(&data->refs))) {
> >> +               csd_unlock(&data->csd);
> >> +               return;
> >> +       }
> > I don't think this is safe: if the mask gets cleared after cpu has been
> > set to a valid value and before next_cpu is calculated, the code will
> > go to the fast path (smp_call_function_single).
> I was wrong, this is actually not a problem, taking the fast path will
> not hang the process.

Did you get a chance to try this patch?

> >> An alternative would be to copy the cpumask to a local variable in
> >> on_each_cpu_mask(), though the workaround above would cover other cases
> >> that we haven't spotted yet. Also, the smp_call_function_many()
> >> description doesn't state that the cpumask should not be modified.
> >>
> >>
> >> diff --git a/arch/arm/kernel/smp_tlb.c b/arch/arm/kernel/smp_tlb.c
> >> index 8f57f32..1717dec 100644
> >> --- a/arch/arm/kernel/smp_tlb.c
> >> +++ b/arch/arm/kernel/smp_tlb.c
> >> @@ -16,10 +16,13 @@
> >>  static void on_each_cpu_mask(void (*func)(void *), void *info, int wait,
> >>        const struct cpumask *mask)
> >>  {
> >> +       struct cpumask call_mask;
> >> +
> >>        preempt_disable();
> >>
> >> +       cpumask_copy(&call_mask, mask);
> >>        smp_call_function_many(mask, func, info, wait);
> >  I'll check this one, but the mask here should be call_mask.
> >> -       if (cpumask_test_cpu(smp_processor_id(), mask))
> >> +       if (cpumask_test_cpu(smp_processor_id(), &call_mask))
> >>                func(info);
> >>
> This patch increases my system's instability: for some reason, the
> call_function_data gets corrupted while
> generic_smp_call_function_interrupt() is running.

Strange, the patch only copies the cpumask.

-- 
Catalin


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-08 16:59       ` Catalin Marinas
@ 2011-03-09 15:22         ` saeed bishara
  2011-03-09 15:50           ` Catalin Marinas
  0 siblings, 1 reply; 12+ messages in thread
From: saeed bishara @ 2011-03-09 15:22 UTC (permalink / raw)
  To: linux-arm-kernel

>
> Did you get a chance to try this patch?
yes, it works fine
>
>> >> An alternative would be to copy the cpumask to a local variable in
>> >> on_each_cpu_mask(), though the workaround above would cover other cases
>> >> that we haven't spotted yet. Also, the smp_call_function_many()
>> >> description doesn't state that the cpumask should not be modified.
>> >>
>> >>
>> >> diff --git a/arch/arm/kernel/smp_tlb.c b/arch/arm/kernel/smp_tlb.c
>> >> index 8f57f32..1717dec 100644
>> >> --- a/arch/arm/kernel/smp_tlb.c
>> >> +++ b/arch/arm/kernel/smp_tlb.c
>> >> @@ -16,10 +16,13 @@
>> >>  static void on_each_cpu_mask(void (*func)(void *), void *info, int wait,
>> >>        const struct cpumask *mask)
>> >>  {
>> >> +       struct cpumask call_mask;
>> >> +
>> >>        preempt_disable();
>> >>
>> >> +       cpumask_copy(&call_mask, mask);
>> >>        smp_call_function_many(mask, func, info, wait);
>> >  I'll check this one, but the mask here should be call_mask.
>> >> -       if (cpumask_test_cpu(smp_processor_id(), mask))
>> >> +       if (cpumask_test_cpu(smp_processor_id(), &call_mask))
>> >>                func(info);
>> >>
>> This patch increases my system's instability: for some reason, the
>> call_function_data gets corrupted while
>> generic_smp_call_function_interrupt() is running.
>
> Strange, the patch only copies the cpumask.
This patch also works fine; it seems the instability was due to the code
that I added in order to accelerate the failure.

saeed


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-09 15:22         ` saeed bishara
@ 2011-03-09 15:50           ` Catalin Marinas
  2011-03-30  5:18             ` George G. Davis
  0 siblings, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2011-03-09 15:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2011-03-09 at 15:22 +0000, saeed bishara wrote:
> > Did you get a chance to try this patch?
> yes, it works fine

Thanks for trying. I'll add a commit log and post.

-- 
Catalin


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-09 15:50           ` Catalin Marinas
@ 2011-03-30  5:18             ` George G. Davis
  2011-03-30  8:17               ` saeed bishara
  2011-03-30  8:37               ` Catalin Marinas
  0 siblings, 2 replies; 12+ messages in thread
From: George G. Davis @ 2011-03-30  5:18 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Wed, Mar 09, 2011 at 03:50:48PM +0000, Catalin Marinas wrote:
> On Wed, 2011-03-09 at 15:22 +0000, saeed bishara wrote:
> > > Did you get a chance to try this patch?
> > yes, it works fine
> 
> Thanks for trying. I'll add a commit log and post.

Based on your comments here [1], it looks like your fix for this [2] is
superseded by [3] which is already applied [4].  Can you confirm?

TIA!

--
Regards,
George

[1] https://lkml.org/lkml/2011/3/15/429
[2] https://lkml.org/lkml/2011/3/15/296
[3] https://lkml.org/lkml/2011/3/15/315
[4] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=723aae2
> 
> -- 
> Catalin
> 
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-30  5:18             ` George G. Davis
@ 2011-03-30  8:17               ` saeed bishara
  2011-03-31  1:05                 ` George G. Davis
  2011-03-30  8:37               ` Catalin Marinas
  1 sibling, 1 reply; 12+ messages in thread
From: saeed bishara @ 2011-03-30  8:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Mar 30, 2011 at 7:18 AM, George G. Davis <gdavis@mvista.com> wrote:
> Hi,
>
> On Wed, Mar 09, 2011 at 03:50:48PM +0000, Catalin Marinas wrote:
>> On Wed, 2011-03-09 at 15:22 +0000, saeed bishara wrote:
>> > > Did you get a chance to try this patch?
>> > yes, it works fine
>>
>> Thanks for trying. I'll add a commit log and post.
>
> Based on your comments here [1], it looks like your fix for this [2] is
> superseded by [3] which is already applied [4].  Can you confirm?
[4] fixed the bug that I reported here. Since my kernel is based on
2.6.35.9, I backported several patches besides that one; here is
the list:
0001-generic-ipi-Fix-deadlock-in-__smp_call_function_sing.patch
0002-kernel-smp.c-fix-smp_call_function_many-SMP-race.patch
0003-kernel-smp.c-consolidate-writes-in-smp_call_function.patch
0004-call_function_many-fix-list-delete-vs-add-race.patch
0005-call_function_many-add-missing-ordering.patch
0006-smp_call_function_many-handle-concurrent-clearing-of.patch

With those patches I don't see the IPI lockup issue any more.
saeed
>
> TIA!
>
> --
> Regards,
> George
>
> [1] https://lkml.org/lkml/2011/3/15/429
> [2] https://lkml.org/lkml/2011/3/15/296
> [3] https://lkml.org/lkml/2011/3/15/315
> [4] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=723aae2


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-30  5:18             ` George G. Davis
  2011-03-30  8:17               ` saeed bishara
@ 2011-03-30  8:37               ` Catalin Marinas
  2011-03-31  1:05                 ` George G. Davis
  1 sibling, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2011-03-30  8:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 30 March 2011 06:18, George G. Davis <gdavis@mvista.com> wrote:
> On Wed, Mar 09, 2011 at 03:50:48PM +0000, Catalin Marinas wrote:
>> On Wed, 2011-03-09 at 15:22 +0000, saeed bishara wrote:
>> > > Did you get a chance to try this patch?
>> > yes, it works fine
>>
>> Thanks for trying. I'll add a commit log and post.
>
> Based on your comments here [1], it looks like your fix for this [2] is
> superseded by [3] which is already applied [4].  Can you confirm?

Yes, it's been fixed by the commit you mentioned.

-- 
Catalin


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-30  8:17               ` saeed bishara
@ 2011-03-31  1:05                 ` George G. Davis
  0 siblings, 0 replies; 12+ messages in thread
From: George G. Davis @ 2011-03-31  1:05 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Wed, Mar 30, 2011 at 10:17:38AM +0200, saeed bishara wrote:
> On Wed, Mar 30, 2011 at 7:18 AM, George G. Davis <gdavis@mvista.com> wrote:
> > Hi,
> >
> > On Wed, Mar 09, 2011 at 03:50:48PM +0000, Catalin Marinas wrote:
> >> On Wed, 2011-03-09 at 15:22 +0000, saeed bishara wrote:
> >> > > Did you get a chance to try this patch?
> >> > yes, it works fine
> >>
> >> Thanks for trying. I'll add a commit log and post.
> >
> > Based on your comments here [1], it looks like your fix for this [2] is
> > superseded by [3] which is already applied [4].  Can you confirm?
> [4] fixed the bug that I reported here. Since my kernel is based on
> 2.6.35.9, I backported several patches besides that one; here is
> the list:
> 0001-generic-ipi-Fix-deadlock-in-__smp_call_function_sing.patch
> 0002-kernel-smp.c-fix-smp_call_function_many-SMP-race.patch
> 0003-kernel-smp.c-consolidate-writes-in-smp_call_function.patch
> 0004-call_function_many-fix-list-delete-vs-add-race.patch
> 0005-call_function_many-add-missing-ordering.patch
> 0006-smp_call_function_many-handle-concurrent-clearing-of.patch
> 
> With those patches I don't see the IPI lockup issue any more.
> saeed

Thanks!

--
Regards,
George

> >
> > TIA!
> >
> > --
> > Regards,
> > George
> >
> > [1] https://lkml.org/lkml/2011/3/15/429
> > [2] https://lkml.org/lkml/2011/3/15/296
> > [3] https://lkml.org/lkml/2011/3/15/315
> > [4] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=723aae2


* SMP soft lockup on smp_call_function_many when doing flush_tlb_page
  2011-03-30  8:37               ` Catalin Marinas
@ 2011-03-31  1:05                 ` George G. Davis
  0 siblings, 0 replies; 12+ messages in thread
From: George G. Davis @ 2011-03-31  1:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Mar 30, 2011 at 09:37:51AM +0100, Catalin Marinas wrote:
> On 30 March 2011 06:18, George G. Davis <gdavis@mvista.com> wrote:
> > On Wed, Mar 09, 2011 at 03:50:48PM +0000, Catalin Marinas wrote:
> >> On Wed, 2011-03-09 at 15:22 +0000, saeed bishara wrote:
> >> > > Did you get a chance to try this patch?
> >> > yes, it works fine
> >>
> >> Thanks for trying. I'll add a commit log and post.
> >
> > Based on your comments here [1], it looks like your fix for this [2] is
> > superseded by [3] which is already applied [4].  Can you confirm?
> 
> Yes, it's been fixed by the commit you mentioned.

Thanks!

--
Regards,
George

> 
> -- 
> Catalin


Thread overview: 12+ messages
2011-03-08  9:53 SMP soft lockup on smp_call_function_many when doing flush_tlb_page saeed bishara
2011-03-08 13:33 ` Catalin Marinas
2011-03-08 15:28   ` saeed bishara
2011-03-08 16:49     ` saeed bishara
2011-03-08 16:59       ` Catalin Marinas
2011-03-09 15:22         ` saeed bishara
2011-03-09 15:50           ` Catalin Marinas
2011-03-30  5:18             ` George G. Davis
2011-03-30  8:17               ` saeed bishara
2011-03-31  1:05                 ` George G. Davis
2011-03-30  8:37               ` Catalin Marinas
2011-03-31  1:05                 ` George G. Davis
