ARM11MPcore: tlb_ops_need_broadcast causes deadlock


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
@ 2012-03-22 12:24 EXTERNAL Waechtler Peter (Fa. TCP, CM-AI/PJ-CF31)
  2012-03-23 17:30 ` Will Deacon
  0 siblings, 1 reply; 12+ messages in thread
From: EXTERNAL Waechtler Peter (Fa. TCP, CM-AI/PJ-CF31) @ 2012-03-22 12:24 UTC (permalink / raw)
  To: linux-arm-kernel

Here we are again with another issue on ARM11mpcore (2 cores for Linux):

In relatively rare circumstances the system soft-locks up:

cpuA                                                                cpuB

kswapd searches for pages to reclaim
via shrink_zone
page_referenced
page_referenced_one
    page_check_address(&ptl)   <- ptl gets locked!
    ptep_clear_flush_young_notify                                   jump to the "innocent" page
                                                                        IRQS OFF
                                                                        do_DataAbort-> handle_mm_fault
                                                                            handle_pte_fault (inlined)
                                                                            ptl = pte_lockptr(mm, pmd);
                                                                            spin_lock(ptl);

        flush_tlb_page
            tlb_ops_need_broadcast
            on_each_cpu_mask(ipi_flush_tlb_page, with WAIT)
                csd_lock_wait()
                                          DEADLOCK, IPI on cpuB does not finish because IRQs are OFF
    pte_unmap_unlock(pte, ptl);

And here is some explanation:

Every now and then, pages are marked inaccessible in the hardware PTE
(page table entry) so that the VM subsystem can check if the page is
accessed at all. If it's frequently accessed it will become a "young" page.
Under memory pressure "old" pages will be the first to get evicted.

The kswapd kernel thread goes through a list of pages to check if they
were accessed in a given interval and mark our target page as young.

The cpuB executes some user code hitting that page and because the PTE
is marked "inaccessible", so that the attempt can be stored, it results
in a page fault.

Unluckily the kswapd calls tlb_flush and that is configured to inform all
cpus about that change via IPIs. cpuB is in a user abort handler (__dabt_usr)
and the disaster takes its course:

To check whether a Thumb instruction caused the fault, the abort handler
accesses the page, resulting in another fault - but now entering the svc abort handler
(__dabt_svc), and that turns off interrupts!

That leads to cpuA waiting in csd_lock_wait for the IPI to signal its end of execution
(via csd->flags), but that does not happen because IRQs are off on cpuB. cpuB is
stuck in the page fault handler, spinning to get mm->page_table_lock, which is
still held by cpuA while it waits for the IPIs to finish.
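
The cycle can be made explicit with a minimal user-space model in C. This is
only a sketch: struct cpu_state, blocked_by() and is_deadlocked() are
hypothetical names made up for illustration, not kernel code, and the model
assumes exactly two CPUs and a single page table lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of what one CPU holds and what it is waiting for. */
struct cpu_state {
	bool holds_ptl;      /* owns the page table lock */
	bool irqs_enabled;   /* able to service an incoming IPI */
	bool wants_ptl;      /* spinning in spin_lock(ptl) */
	bool waits_for_ipi;  /* spinning in csd_lock_wait() */
};

/* True if 'self' cannot make progress until 'other' does something. */
static bool blocked_by(const struct cpu_state *self,
		       const struct cpu_state *other)
{
	assert(self && other);
	if (self->wants_ptl && other->holds_ptl)
		return true;    /* lock is owned by the other CPU */
	if (self->waits_for_ipi && !other->irqs_enabled)
		return true;    /* the other CPU never runs the IPI handler */
	return false;
}

/* Deadlock in this model: each CPU is blocked by the other one. */
static bool is_deadlocked(const struct cpu_state *a,
			  const struct cpu_state *b)
{
	return blocked_by(a, b) && blocked_by(b, a);
}
```

With cpuA holding the ptl and waiting for the IPI, and cpuB wanting the ptl
with IRQs off, is_deadlocked() returns true. Setting irqs_enabled on cpuB is
enough to break the cycle in this model.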

possible solutions:

a) do not wait for that particular IPI since the mapping does not change
 (just the access bits)

b) open code the ptep_set_access_flags() and change the sequence so that the IPI
 is sent without holding the page_table_lock anymore

This shows up on CPUs where tlb_ops_need_broadcast() returns true.

Input on how to resolve this issue is welcome.

regards

        Peter


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-22 12:24 ARM11MPcore: tlb_ops_need_broadcast causes deadlock EXTERNAL Waechtler Peter (Fa. TCP, CM-AI/PJ-CF31)
@ 2012-03-23 17:30 ` Will Deacon
  2012-03-25 12:08   ` Peter Waechtler
       [not found]   ` <274124B9C6907D4B8CE985903EAA19E91B2D5798D9@SI-MBX06.de.bosch.com>
  0 siblings, 2 replies; 12+ messages in thread
From: Will Deacon @ 2012-03-23 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Peter,

Thanks for the detailed explanation. However, there's one bit I'm not sure
that I follow:

On Thu, Mar 22, 2012 at 12:24:47PM +0000, EXTERNAL Waechtler Peter (Fa. TCP, CM-AI/PJ-CF31) wrote:
> The cpuB executes some user code hitting that page and because the PTE
> is marked "inaccessible", so that the attempt can be stored, it results
> in a page fault.

Shouldn't this trigger a prefetch abort rather than a data abort?

> Unluckily the kswapd calls tlb_flush and that is configured to inform all
> cpus about that change via IPIs. cpuB is in a user abort handler (__dabt_usr)
> and the disaster takes its course:
> 
> To check whether a Thumb instruction caused the fault, the abort handler
> accesses the page, resulting in another fault - but now entering the svc abort handler
> (__dabt_svc), and that turns off interrupts!

This is where I'm confused - why are we in the data abort handler due to an
I-side fault?

Thanks,

Will


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-23 17:30 ` Will Deacon
@ 2012-03-25 12:08   ` Peter Waechtler
  2012-03-25 13:09     ` Russell King - ARM Linux
       [not found]   ` <274124B9C6907D4B8CE985903EAA19E91B2D5798D9@SI-MBX06.de.bosch.com>
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Waechtler @ 2012-03-25 12:08 UTC (permalink / raw)
  To: linux-arm-kernel

Will Deacon <will.deacon <at> arm.com> writes:

> 
> > The cpuB executes some user code hitting that page and because the PTE
> > is marked "inaccessible", so that the attempt can be stored, it results
> > in a page fault.
> 
> Shouldn't this trigger a prefetch abort rather than a data abort?
> 

Yes, I think the abort part is not completely understood yet.

So it's a data abort: some code referenced a page, resulting in a __dabt_usr.
Then the abort handler tries to read the instruction and gets another data abort.

That would mean that the code page was marked as inaccessible in the
meantime.
Now how could that happen? Probably on the other cpu?

Weird stuff.

But Will, is that tlb_flush necessary at all? The ARM has only 3 permission
bits in the page table (APX and AP0 and AP1). The young/accessed bit is done
via software.

    Peter


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-25 12:08   ` Peter Waechtler
@ 2012-03-25 13:09     ` Russell King - ARM Linux
  2012-03-25 18:22       ` Peter Waechtler
  0 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux @ 2012-03-25 13:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Mar 25, 2012 at 12:08:47PM +0000, Peter Waechtler wrote:
> But Will, is that tlb_flush necessary at all? The ARM has only 3 permission
> bits in the page table (APX and AP0 and AP1). The young/accessed bit is done
> via software.

Yes it most definitely is, because setting a page to be old means we
must receive a subsequent fault to make it 'young' again.  This means we
must set the page to be inaccessible to get that fault, and flush the
TLBs across all CPUs so that any CPU accessing that page receives a
fault.
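
A toy model in C shows why the flush matters for the software young bit.
This is a sketch under loud assumptions: one page, a one-entry TLB per CPU,
and made-up names (struct machine, access_page(), make_old()) that do not
correspond to any kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical model: one page, one cached translation per CPU. */
struct machine {
	bool young;            /* software "accessed" bit */
	bool hw_valid;         /* hardware entry permits access */
	bool tlb[NR_CPUS];     /* true: this CPU has the translation cached */
	unsigned faults;       /* number of page faults taken so far */
};

/* A CPU touches the page. Only a TLB miss looks at the page table. */
static void access_page(struct machine *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (m->tlb[cpu])
		return;                /* TLB hit: no fault, young bit untouched */
	if (!m->hw_valid) {            /* walk finds an inaccessible entry */
		m->faults++;
		m->young = true;       /* fault handler marks the page young */
		m->hw_valid = true;
	}
	m->tlb[cpu] = true;
}

/* Reclaim side makes the page old, optionally flushing every CPU's TLB. */
static void make_old(struct machine *m, bool flush_all_cpus)
{
	m->young = false;
	m->hw_valid = false;
	if (flush_all_cpus)
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			m->tlb[cpu] = false;
}
```

If make_old() is called without the flush, a CPU that still has the
translation cached keeps hitting in its TLB, takes no fault, and the page
stays old although it is in use. With the flush, the next access faults and
the page becomes young again.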


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-25 13:09     ` Russell King - ARM Linux
@ 2012-03-25 18:22       ` Peter Waechtler
  2012-03-25 19:15         ` Russell King - ARM Linux
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Waechtler @ 2012-03-25 18:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 25.03.2012 15:09, Russell King - ARM Linux wrote:
> On Sun, Mar 25, 2012 at 12:08:47PM +0000, Peter Waechtler wrote:
>> But Will, is that tlb_flush necessary at all? The ARM has only 3 permission
>> bits in the page table (APX and AP0 and AP1). The young/accessed bit is done
>> via software.
> Yes it most definitely is, because setting a page to be old means we
> must receive a subsequent fault to make it 'young' again.  This means we
> must set the page to be inaccessible to get that fault, and flush the
> TLBs across all CPUs so that any CPU accessing that page receives a
> fault.
Ok I see, it's also not the "right or perfect" fix.

But the worst thing that can happen is:

young page: causes no page fault anymore and stays longer young than
kswapd wants in case a TLB has a stale entry but that access would have
marked it young again - no big deal?

old page: causes a page fault so that it can be made young; a stale TLB
entry would still cause a page fault - but in that path the tlb_flush still
happens

From my point of view: I definitely prefer to avoid the deadlock ;)

I'm afraid that I missed something? I hope not :)

     Peter


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-25 18:22       ` Peter Waechtler
@ 2012-03-25 19:15         ` Russell King - ARM Linux
  2012-03-25 20:22           ` Peter Waechtler
  0 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux @ 2012-03-25 19:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Mar 25, 2012 at 08:22:05PM +0200, Peter Waechtler wrote:
> On 25.03.2012 15:09, Russell King - ARM Linux wrote:
>> On Sun, Mar 25, 2012 at 12:08:47PM +0000, Peter Waechtler wrote:
>>> But Will, is that tlb_flush necessary at all? The ARM has only 3 permission
>>> bits in the page table (APX and AP0 and AP1). The young/accessed bit is done
>>> via software.
>> Yes it most definitely is, because setting a page to be old means we
>> must receive a subsequent fault to make it 'young' again.  This means we
>> must set the page to be inaccessible to get that fault, and flush the
>> TLBs across all CPUs so that any CPU accessing that page receives a
>> fault.
> Ok I see, it's also not the "right or perfect" fix.

It's not a fix or anything, it's required behaviour - otherwise we could
end up throwing out pages from the system which are actually 'hot' because
they've stayed in the TLB and we haven't received a fault to make them
young again.

Moreover, what about the case where we actually remove the page?

Aren't we also holding the pte lock there?  So I don't think there's an
obvious solution to your deadlock.

I think the real question is - in your example - why are you touching
a userspace page with IRQs off _and_ expecting the fault to be fixed up?
You never really explained what CPU B was doing.


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-25 19:15         ` Russell King - ARM Linux
@ 2012-03-25 20:22           ` Peter Waechtler
  2012-03-25 21:55             ` Russell King - ARM Linux
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Waechtler @ 2012-03-25 20:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 25.03.2012 21:15, Russell King - ARM Linux wrote:
> On Sun, Mar 25, 2012 at 08:22:05PM +0200, Peter Waechtler wrote:
>> On 25.03.2012 15:09, Russell King - ARM Linux wrote:
>>> On Sun, Mar 25, 2012 at 12:08:47PM +0000, Peter Waechtler wrote:
>>>> But Will, is that tlb_flush necessary at all? The ARM has only 3 permission
>>>> bits in the page table (APX and AP0 and AP1). The young/accessed bit is done
>>>> via software.
>>> Yes it most definitely is, because setting a page to be old means we
>>> must receive a subsequent fault to make it 'young' again.  This means we
>>> must set the page to be inaccessible to get that fault, and flush the
>>> TLBs across all CPUs so that any CPU accessing that page receives a
>>> fault.
>> Ok I see, it's also not the "right or perfect" fix.
> It's not a fix or anything, it's required behaviour - otherwise we could
> end up throwing out pages from the system which are actually 'hot' because
> they've stayed in the TLB and we haven't received a fault to make them
> young again.

I'm arguing solely on kswapd making a young page old. So it can't be a
hot page.
But yes in theory it's possible that it just became hot on another cpu...

And again I don't understand the abort handler: why do we get a page
fault on a young page then? grrh

> Moreover, what about the case where we actually remove the page?
I don't claim that this is the only way to deadlock - but this is the 
case we encounter.

> Aren't we also holding the pte lock there?  So I don't think there's an
> obvious solution to your deadlock.
>
> I think the real question is - in your example - why are you touching
> a userspace page with IRQs off _and_ expecting the fault to be fixed up?
> You never really explained what CPU B was doing.
It was running some user space program. It was not in the kernel.
I will post the jtag probe screenshots tomorrow.

     Peter


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-25 20:22           ` Peter Waechtler
@ 2012-03-25 21:55             ` Russell King - ARM Linux
  2012-03-26 15:20               ` Will Deacon
  0 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux @ 2012-03-25 21:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Mar 25, 2012 at 10:22:49PM +0200, Peter Waechtler wrote:
> On 25.03.2012 21:15, Russell King - ARM Linux wrote:
>> On Sun, Mar 25, 2012 at 08:22:05PM +0200, Peter Waechtler wrote:
>>> On 25.03.2012 15:09, Russell King - ARM Linux wrote:
>>>> On Sun, Mar 25, 2012 at 12:08:47PM +0000, Peter Waechtler wrote:
>>>>> But Will, is that tlb_flush necessary at all? The ARM has only 3 permission
>>>>> bits in the page table (APX and AP0 and AP1). The young/accessed bit is done
>>>>> via software.
>>>> Yes it most definitely is, because setting a page to be old means we
>>>> must receive a subsequent fault to make it 'young' again.  This means we
>>>> must set the page to be inaccessible to get that fault, and flush the
>>>> TLBs across all CPUs so that any CPU accessing that page receives a
>>>> fault.
>>> Ok I see, it's also not the "right or perfect" fix.
>> It's not a fix or anything, it's required behaviour - otherwise we could
>> end up throwing out pages from the system which are actually 'hot' because
>> they've stayed in the TLB and we haven't received a fault to make them
>> young again.
>
> I'm arguing solely on kswapd making a young page old. So it can't be a  
> hot page.

No, it can be a hot page because the way it finds out that it's a hot
page is when it has to make the page repeatedly young, having first
made it old.

There's no other way the system can know what pages are being accessed
by userspace.

> But yes in theory it's possible that it just became hot on another cpu...
>
> And again I don't understand the abort handler: why do we get a page
> fault on a young page then? grrh

Permissions?  Userspace trying to write to the page when it isn't marked
writable and dirty?

>> Moreover, what about the case where we actually remove the page?
> I don't claim that this is the only way to deadlock - but this is the  
> case we encounter.

No, but you're arguing that we drop the TLB flush for your specific case.
I'm telling you that's pointless if there are other cases as well in which
we'll deadlock.

But that's neither here nor there because you haven't fully explained
what the problem is yet...

>> Aren't we also holding the pte lock there?  So I don't think there's an
>> obvious solution to your deadlock.
>>
>> I think the real question is - in your example - why are you touching
>> a userspace page with IRQs off _and_ expecting the fault to be fixed up?
>> You never really explained what CPU B was doing.
> It was running some user space program. It was not in the kernel.
> I will post the jtag probe screenshots tomorrow.

Why do we need silly screen shots, why can't you explain it, or paste
the output?  I'm not going to bother looking at GIFs, PNGs or jpegs
because my mail reader is text only.

Moreover, user space programs can't disable interrupts.  So you should
not be receiving a data abort from userspace with interrupts disabled.
Yes, when the CPU enters the data abort handler, it will disable
interrupts, but we re-enable them before processing the abort.


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-25 21:55             ` Russell King - ARM Linux
@ 2012-03-26 15:20               ` Will Deacon
  0 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2012-03-26 15:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hmm, I somehow got dropped from this, but I'll pick it back up here.

On Sun, Mar 25, 2012 at 10:55:35PM +0100, Russell King - ARM Linux wrote:
> On Sun, Mar 25, 2012 at 10:22:49PM +0200, Peter Waechtler wrote:
> > And again I don't understand the abort handler: why do we get a page
> > fault on a young page then? grrh
> 
> Permissions?  Userspace trying to write to the page when it isn't marked
> writable and dirty?
> 
> >> Moreover, what about the case where we actually remove the page?
> > I don't claim that this is the only way to deadlock - but this is the  
> > case we encounter.
> 
> > No, but you're arguing that we drop the TLB flush for your specific case.
> > I'm telling you that's pointless if there are other cases as well in which
> > we'll deadlock.
> 
> But that's neither here nor there because you haven't fully explained
> what the problem is yet..

Yes, I'm inclined to agree with Russell here. Things don't add up and
without further information there's not a lot we can do.

Peter - are you able to reproduce and investigate this problem or was it a
one-off observation? If you can figure out what really goes on inside CPU B
in your example, then we may be able to look into this further. A good first
step might be to work out what triggers the initial data abort and look at
the state of the world at that point.

Will


[parent not found: <274124B9C6907D4B8CE985903EAA19E91B2D5798D9@SI-MBX06.de.bosch.com>]

* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
       [not found]   ` <274124B9C6907D4B8CE985903EAA19E91B2D5798D9@SI-MBX06.de.bosch.com>
@ 2012-03-27 13:32     ` Will Deacon
  2012-03-27 17:41       ` George G. Davis
  0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2012-03-27 13:32 UTC (permalink / raw)
  To: linux-arm-kernel

Peter,

On Mon, Mar 26, 2012 at 05:10:45PM +0100, EXTERNAL Waechtler Peter (Fa. TCP, CM-AI/PJ-CF31) wrote:
> Probably just an "expected deadlock" as mentioned in the comment
> of the v6_early_abort macro:
> 
>  * Purpose : obtain information about current aborted instruction.
>  * Note: we read user space.  This means we might cause a data
>  * abort here if the I-TLB and D-TLB aren't seeing the same
>  * picture.  Unfortunately, this does happen.  We live with it.

I don't see this referring to an expected deadlock.

> For now the errata workarounds are removed for the 11MPcore
> like proposed in this thread to avoid faulting with IRQs turned off:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2011-February/041869.html
> 
> But there it looked like an optimization, but it wasn't.

I have a theory about what goes on:

Say we have a valid (i.e. non-faulting) page which contains a load instruction
that will fault. A CPU executes this load and takes a data abort but at the
same time another CPU marks the page being executed as old. So when the
original CPU tries to load the faulting instruction in do_thumb_abort, we take
a second data abort (assumedly because we don't have a D-side TLB entry for the
text page, so we immediately see that it is old) and, because interrupts were
not yet re-enabled in the first fault, they are not enabled in the nested fault
either.

At this point, the faulting CPU will be unable to get the lock on the page,
since the other guy has it and is waiting for the TLB broadcast to complete.
Given that interrupts are disabled on the faulting CPU, everything locks up.
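
The interrupt-mask state through the two aborts can be traced with a small C
model. This is a sketch under simplifying assumptions: only the I bit of the
PSR is modelled (bit 7 on ARM), and take_abort(), irqs_enabled_in(),
before_insn_load() and nested_fault_sees_irqs_on() are hypothetical helpers,
not the kernel's entry code. The unmask_early flag stands for unmasking IRQs
before the instruction load when the faulting context had them enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSR_I_BIT 0x00000080u   /* IRQs are masked while this bit is set */

/* True if the context described by 'psr' runs with IRQs unmasked. */
static bool irqs_enabled_in(uint32_t psr)
{
	return (psr & PSR_I_BIT) == 0;
}

/* Exception entry masks IRQs whatever the previous state was. */
static uint32_t take_abort(uint32_t cpsr)
{
	return cpsr | PSR_I_BIT;
}

/* Optionally unmask IRQs before reading the aborted instruction. */
static uint32_t before_insn_load(uint32_t cpsr, uint32_t saved_psr,
				 bool unmask_early)
{
	if (unmask_early && irqs_enabled_in(saved_psr))
		cpsr &= ~PSR_I_BIT;
	return cpsr;
}

/* Does the nested abort see a faulting context with IRQs enabled? */
static bool nested_fault_sees_irqs_on(bool unmask_early)
{
	uint32_t user_psr = 0;                  /* user code, IRQs unmasked */
	uint32_t cpsr = take_abort(user_psr);   /* first data abort */

	assert(!irqs_enabled_in(cpsr));         /* entry always masks IRQs */
	cpsr = before_insn_load(cpsr, user_psr, unmask_early);

	/* The instruction load faults: the current PSR becomes the saved one. */
	return irqs_enabled_in(cpsr);
}
```

Without the early unmask the nested handler finds a saved PSR with IRQs
masked, so it has no reason to enable them before taking the lock. With the
early unmask it sees IRQs enabled and the IPI can be serviced.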

Possible solutions:

(1) Enable interrupts if they are enabled in the faulting context before
    loading instructions on the dabt path.

(2) Use the FSR to determine whether a fault is due to a read or a write on
    ARMv6 - only load and disassemble the instruction on 1136 CPUs affected
    by erratum #326103 (which aren't SMP, so cannot hit the problem above).

The latter is probably best. Please can you try the patch below? I've
checked that it does the right thing on an r0p1 1136 core using a simple
fork/swp program to trigger a CoW.
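
What the patch below does at the start of v6_early_abort can be paraphrased
in C. This is a rough sketch, not a translation: fixup_fsr() and
is_r0_arm1136() are hypothetical names, the LDRD special case handled by
do_ldrd_abort is left out, and the BE8 byte swap is ignored. The constant
0x4107b36 and the bit positions (FSR bit 11, instruction bit 20) are taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FSR_WRITE        (1u << 11)   /* FSR bit 11: set for a write */
#define INSN_L_BIT       (1u << 20)   /* ARM load/store: 1 = load, 0 = store */
#define MIDR_R0_ARM1136  0x4107b36u   /* processor id >> 4, as in the patch */

/* True for an r0 ARM1136, the only core that gets the workaround. */
static bool is_r0_arm1136(uint32_t midr)
{
	return (midr >> 4) == MIDR_R0_ARM1136;
}

/*
 * Return the FSR to hand to the fault handler.
 * arm_state: neither the J nor the T bit is set in the faulting PSR.
 * pc:        address of the aborted instruction, read only when needed.
 */
static uint32_t fixup_fsr(uint32_t fsr, uint32_t midr, bool arm_state,
			  const uint32_t *pc)
{
	if (!is_r0_arm1136(midr) || !arm_state)
		return fsr;            /* trust the FSR, never touch user memory */

	assert(pc != NULL);
	uint32_t insn = *pc;           /* the load that can fault a second time */

	fsr &= ~FSR_WRITE;             /* recompute bit 11 from the instruction */
	if (!(insn & INSN_L_BIT))
		fsr |= FSR_WRITE;      /* L = 0 -> write */
	return fsr;
}
```

The point of the design is the early return: on any core other than an r0
1136, pc is never dereferenced, so the nested data abort with interrupts
masked cannot occur on this path.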

Will


diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index dfb0312..dedb885 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1163,6 +1163,15 @@ if !MMU
 source "arch/arm/Kconfig-nommu"
 endif
 
+config ARM_ERRATA_326103
+       bool "ARM errata: FSR write bit incorrect on a SWP to read-only memory"
+       depends on CPU_V6
+       help
+         Executing a SWP instruction to read-only memory does not set bit 11
+         of the FSR on the ARM 1136 prior to r1p0. This causes the kernel to
+         treat the access as a read, preventing a COW from occurring and
+         causing the faulting task to livelock.
+
 config ARM_ERRATA_411920
        bool "ARM errata: Invalidation of the Instruction Cache operation can fail"
        depends on CPU_V6 || CPU_V6K
diff --git a/arch/arm/mm/abort-ev6.S b/arch/arm/mm/abort-ev6.S
index ff1f7cc..8074199 100644
--- a/arch/arm/mm/abort-ev6.S
+++ b/arch/arm/mm/abort-ev6.S
@@ -26,18 +26,23 @@ ENTRY(v6_early_abort)
        mrc     p15, 0, r1, c5, c0, 0           @ get FSR
        mrc     p15, 0, r0, c6, c0, 0           @ get FAR
 /*
- * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR (erratum 326103).
- * The test below covers all the write situations, including Java bytecodes
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
  */
-       bic     r1, r1, #1 << 11                @ clear bit 11 of FSR
+#ifdef CONFIG_ARM_ERRATA_326103
+       ldr     ip, =0x4107b36
+       mrc     p15, 0, r3, c0, c0, 0           @ get processor id
+       teq     ip, r3, lsr #4                  @ r0 ARM1136?
+       bne     do_DataAbort
        tst     r5, #PSR_J_BIT                  @ Java?
+       tsteq   r5, #PSR_T_BIT                  @ Thumb?
        bne     do_DataAbort
-       do_thumb_abort fsr=r1, pc=r4, psr=r5, tmp=r3
-       ldreq   r3, [r4]                        @ read aborted ARM instruction
+       bic     r1, r1, #1 << 11                @ clear bit 11 of FSR
+       ldr     r3, [r4]                        @ read aborted ARM instruction
 #ifdef CONFIG_CPU_ENDIAN_BE8
-       reveq   r3, r3
+       rev     r3, r3
 #endif
        do_ldrd_abort tmp=ip, insn=r3
        tst     r3, #1 << 20                    @ L = 0 -> write
        orreq   r1, r1, #1 << 11                @ yes.
+#endif
        b       do_DataAbort


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-27 13:32     ` Will Deacon
@ 2012-03-27 17:41       ` George G. Davis
  2012-03-28  8:56         ` Will Deacon
  0 siblings, 1 reply; 12+ messages in thread
From: George G. Davis @ 2012-03-27 17:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Tue, Mar 27, 2012 at 02:32:26PM +0100, Will Deacon wrote:
> Peter,
> 
> On Mon, Mar 26, 2012 at 05:10:45PM +0100, EXTERNAL Waechtler Peter (Fa. TCP, CM-AI/PJ-CF31) wrote:
> > Probably just an "expected deadlock" as mentioned in the comment
> > of the v6_early_abort macro:
> > 
> >  * Purpose : obtain information about current aborted instruction.
> >  * Note: we read user space.  This means we might cause a data
> >  * abort here if the I-TLB and D-TLB aren't seeing the same
> >  * picture.  Unfortunately, this does happen.  We live with it.
> 
> I don't see this referring to an expected deadlock.
> 
> > For now the errata workarounds are removed for the 11MPcore
> > like proposed in this thread to avoid faulting with IRQs turned off:
> > 
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2011-February/041869.html
> > 
> > But there it looked like an optimization, but it wasn't.
> 
> I have a theory about what goes on:
> 
> Say we have a valid (i.e. non-faulting) page which contains a load instruction
> that will fault. A CPU executes this load and takes a data abort but at the
> same time another CPU marks the page being executed as old. So when the
> original CPU tries to load the faulting instruction in do_thumb_abort, we take
> a second data abort (assumedly because we don't have a D-side TLB entry for the
> text page, so we immediately see that it is old) and, because interrupts were
> not yet re-enabled in the first fault, they are not enabled in the nested fault
> either.

This is precisely what happened here.  The only difference is that the traces
I've reviewed faulted at "not_thumb:" while attempting to read the userspace
ARM instruction which led to the (second) data abort with interrupts disabled.


> At this point, the faulting CPU will be unable to get the lock on the page,
> since the other guy has it and is waiting for the TLB broadcast to complete.
> Given that interrupts are disabled on the faulting CPU, everything locks up.

Right again.


> Possible solutions:
> 
> (1) Enable interrupts if they are enabled in the faulting context before
>     loading instructions on the dabt path.
> 
> (2) Use the FSR to determine whether a fault is due to a read or a write on
>     ARMv6 - only load and disassemble the instruction on 1136 CPUs affected
>     by erratum #326103 (which aren't SMP, so cannot hit the problem above).

We submitted a change similar to (2) above to the ARM Linux kernel mailing
list for RFC [1] over a year ago.  That change [1] is similar to your change
below.


> The latter is probably best. Please can you try the patch below? I've
> checked that it does the right thing on an r0p1 1136 core using a simple
> fork/swp program to trigger a CoW.

I tested your patch but only on a CPU_V6K based SMP machine.  In this
case, ARM_ERRATA_326103 depends on CPU_V6, so is left disabled, rendering
this patch functionally equivalent to [1] below.


> Will
> 
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index dfb0312..dedb885 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1163,6 +1163,15 @@ if !MMU
>  source "arch/arm/Kconfig-nommu"
>  endif
>  
> +config ARM_ERRATA_326103
> +       bool "ARM errata: FSR write bit incorrect on a SWP to read-only memory"
> +       depends on CPU_V6
> +       help
> +         Executing a SWP instruction to read-only memory does not set bit 11
> +         of the FSR on the ARM 1136 prior to r1p0. This causes the kernel to
> +         treat the access as a read, preventing a COW from occurring and
> +         causing the faulting task to livelock.
> +
>  config ARM_ERRATA_411920
>         bool "ARM errata: Invalidation of the Instruction Cache operation can fail"
>         depends on CPU_V6 || CPU_V6K
> diff --git a/arch/arm/mm/abort-ev6.S b/arch/arm/mm/abort-ev6.S
> index ff1f7cc..8074199 100644
> --- a/arch/arm/mm/abort-ev6.S
> +++ b/arch/arm/mm/abort-ev6.S
> @@ -26,18 +26,23 @@ ENTRY(v6_early_abort)
>         mrc     p15, 0, r1, c5, c0, 0           @ get FSR
>         mrc     p15, 0, r0, c6, c0, 0           @ get FAR
>  /*
> - * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR (erratum 326103).
> - * The test below covers all the write situations, including Java bytecodes
> + * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
>   */
> -       bic     r1, r1, #1 << 11                @ clear bit 11 of FSR
> +#ifdef CONFIG_ARM_ERRATA_326103
> +       ldr     ip, =0x4107b36
> +       mrc     p15, 0, r3, c0, c0, 0           @ get processor id
> +       teq     ip, r3, lsr #4                  @ r0 ARM1136?
> +       bne     do_DataAbort
>         tst     r5, #PSR_J_BIT                  @ Java?
> +       tsteq   r5, #PSR_T_BIT                  @ Thumb?
>         bne     do_DataAbort
> -       do_thumb_abort fsr=r1, pc=r4, psr=r5, tmp=r3
> -       ldreq   r3, [r4]                        @ read aborted ARM instruction
> +       bic     r1, r1, #1 << 11                @ clear bit 11 of FSR
> +       ldr     r3, [r4]                        @ read aborted ARM instruction
>  #ifdef CONFIG_CPU_ENDIAN_BE8
> -       reveq   r3, r3
> +       rev     r3, r3
>  #endif
>         do_ldrd_abort tmp=ip, insn=r3
>         tst     r3, #1 << 20                    @ L = 0 -> write
>         orreq   r1, r1, #1 << 11                @ yes.
> +#endif
>         b       do_DataAbort

FYI/FWIW, your patch above suffered whitespace damage.

Thanks!

--
Regards,
George

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2011-February/041733.html


* ARM11MPcore: tlb_ops_need_broadcast causes deadlock
  2012-03-27 17:41       ` George G. Davis
@ 2012-03-28  8:56         ` Will Deacon
  0 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2012-03-28  8:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Mar 27, 2012 at 06:41:52PM +0100, George G. Davis wrote:
> On Tue, Mar 27, 2012 at 02:32:26PM +0100, Will Deacon wrote:
> > I have a theory about what goes on:
> > 
> > Say we have a valid (i.e. non-faulting) page which contains a load instruction
> > that will fault. A CPU executes this load and takes a data abort but at the
> > same time another CPU marks the page being executed as old. So when the
> > original CPU tries to load the faulting instruction in do_thumb_abort, we take
> > a second data abort (assumedly because we don't have a D-side TLB entry for the
> > text page, so we immediately see that it is old) and, because interrupts were
> > not yet re-enabled in the first fault, they are not enabled in the nested fault
> > either.
> 
> > This is precisely what happened here.  The only difference is that the traces
> > I've reviewed faulted at "not_thumb:" while attempting to read the userspace
> > ARM instruction which led to the (second) data abort with interrupts disabled.

Right, I think that's the same problem though.

> > Possible solutions:
> > 
> > (1) Enable interrupts if they are enabled in the faulting context before
> >     loading instructions on the dabt path.
> > 
> > (2) Use the FSR to determine whether a fault is due to a read or a write on
> >     ARMv6 - only load and disassemble the instruction on 1136 CPUs affected
> >     by erratum #326103 (which aren't SMP, so cannot hit the problem above).
> 
> We submitted a change similar to (2) above to the ARM Linux kernel mailing
> list for RFC [1] over a year ago.  That change [1] is similar to your change
> below.

Apologies, I missed that. Are you happy for me to continue with my change
below? I'd really like it if Peter could confirm it fixes his problem.

> > The latter is probably best. Please can you try the patch below? I've
> > checked that it does the right thing on an r0p1 1136 core using a simple
> > fork/swp program to trigger a CoW.
> 
> > I tested your patch but only on a CPU_V6K based SMP machine.  In this
> > case, ARM_ERRATA_326103 depends on CPU_V6, so is left disabled, rendering
> > this patch functionally equivalent to [1] below.

Thanks George. Do you have a testcase for reliably reproducing the deadlock
without this patch applied?

> FYI/FWIW, your patch above suffered whitespace damage.

That'll be the sorry excuse for an email system that I'm forced to use.
Perhaps the list archive has a better version:

http://lists.arm.linux.org.uk/lurker/attach/1@20120327.133226.639a8b79.attach

Will


Thread overview: 12+ messages
2012-03-22 12:24 ARM11MPcore: tlb_ops_need_broadcast causes deadlock EXTERNAL Waechtler Peter (Fa. TCP, CM-AI/PJ-CF31)
2012-03-23 17:30 ` Will Deacon
2012-03-25 12:08   ` Peter Waechtler
2012-03-25 13:09     ` Russell King - ARM Linux
2012-03-25 18:22       ` Peter Waechtler
2012-03-25 19:15         ` Russell King - ARM Linux
2012-03-25 20:22           ` Peter Waechtler
2012-03-25 21:55             ` Russell King - ARM Linux
2012-03-26 15:20               ` Will Deacon
     [not found]   ` <274124B9C6907D4B8CE985903EAA19E91B2D5798D9@SI-MBX06.de.bosch.com>
2012-03-27 13:32     ` Will Deacon
2012-03-27 17:41       ` George G. Davis
2012-03-28  8:56         ` Will Deacon
