* Fix small race in 44x tlbie function
From: David Gibson @ 2007-08-07 4:20 UTC
To: Paul Mackerras; +Cc: linuxppc-dev, Todd Inglett, Volkmar Uhlig
The 440 family of processors don't have a tlbie instruction. So, we
implement TLB invalidates by explicitly searching the TLB with tlbsx.,
then clobbering the relevant entry, if any. Unfortunately the PID for
the search needs to be stored in the MMUCR register, which is also
used by the TLB miss handler. Interrupts were enabled in _tlbie(), so
an interrupt between loading the MMUCR and the tlbsx could cause
incorrect search results, and thus a failure to invalidate TLB entries
which needed to be invalidated.
This patch fixes the problem in both arch/ppc and arch/powerpc by
inhibiting interrupts (even critical and debug interrupts) across the
relevant instructions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
Paul, this one's a bugfix, which I think should go into 2.6.23.
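For readers following along, the race described above can be sketched as a toy C model. This is only an illustration under stated assumptions: the struct layout, the `tlbie_model()` and `maybe_interrupt()` helpers and the single "search TID" field are hypothetical stand-ins, not the 440's real MMUCR encoding or the kernel's code. The interrupt handler is modelled as taking a TLB miss that overwrites the search TID.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 64                /* the 440 has 64 TLB entries */

struct tlb_entry { bool valid; uint8_t tid; uint32_t epn; };

struct cpu {
    uint8_t mmucr_stid;               /* hypothetical model of MMUCR's search TID */
    bool irqs_masked;                 /* models MSR[EE|CE|ME|DE] all cleared */
    bool irq_pending;                 /* an interrupt is waiting to be delivered */
    uint8_t irq_tid;                  /* TID the miss handler leaves in MMUCR */
    struct tlb_entry tlb[TLB_ENTRIES];
};

/* Deliver a pending interrupt unless masked. Its handler takes a TLB miss,
 * and the miss handler clobbers the search TID in MMUCR. */
static void maybe_interrupt(struct cpu *c)
{
    if (c->irq_pending && !c->irqs_masked) {
        c->mmucr_stid = c->irq_tid;
        c->irq_pending = false;
    }
}

/* Model of tlbsx.: index of the matching entry, or -1 if none. */
static int tlb_search(const struct cpu *c, uint32_t epn)
{
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (c->tlb[i].valid && c->tlb[i].tid == c->mmucr_stid &&
            c->tlb[i].epn == epn)
            return i;
    return -1;
}

/* Model of _tlbie(). Returns true if an entry was invalidated. */
static bool tlbie_model(struct cpu *c, uint8_t pid, uint32_t epn, bool mask_irqs)
{
    bool saved = c->irqs_masked;

    if (mask_irqs)
        c->irqs_masked = true;        /* mtmsr r6 */
    c->mmucr_stid = pid;              /* mtspr SPRN_MMUCR,r4 */
    maybe_interrupt(c);               /* the race window */
    int idx = tlb_search(c, epn);     /* tlbsx. r3,0,r3 */
    c->irqs_masked = saved;           /* mtmsr r5 */
    maybe_interrupt(c);               /* a deferred interrupt is taken here */

    if (idx < 0)
        return false;                 /* bne 10f: nothing found */
    c->tlb[idx].valid = false;
    return true;
}
```

With an interrupt pending and `mask_irqs` false, the search runs with the wrong TID and the stale entry survives. With `mask_irqs` true, the interrupt is only delivered after the search has completed.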
Index: working-2.6/arch/powerpc/kernel/misc_32.S
===================================================================
--- working-2.6.orig/arch/powerpc/kernel/misc_32.S 2007-07-27 14:19:46.000000000 +1000
+++ working-2.6/arch/powerpc/kernel/misc_32.S 2007-07-27 14:30:46.000000000 +1000
@@ -301,9 +301,19 @@ _GLOBAL(_tlbie)
mfspr r4,SPRN_MMUCR
mfspr r5,SPRN_PID /* Get PID */
rlwimi r4,r5,0,24,31 /* Set TID */
- mtspr SPRN_MMUCR,r4
+ /* We have to run the search with interrupts disabled, even critical
+ * and debug interrupts (in fact the only critical exceptions we have
+ * are debug and machine check). Otherwise an interrupt which causes
+ * a TLB miss can clobber the MMUCR between the mtspr and the tlbsx. */
+ mfmsr r5
+ lis r6,(MSR_EE|MSR_CE|MSR_ME|MSR_DE)@ha
+ addi r6,r6,(MSR_EE|MSR_CE|MSR_ME|MSR_DE)@l
+ andc r6,r5,r6
+ mtmsr r6
+ mtspr SPRN_MMUCR,r4
tlbsx. r3, 0, r3
+ mtmsr r5
bne 10f
sync
/* There are only 64 TLB entries, so r3 < 64,
Index: working-2.6/arch/ppc/kernel/misc.S
===================================================================
--- working-2.6.orig/arch/ppc/kernel/misc.S 2007-07-27 14:19:46.000000000 +1000
+++ working-2.6/arch/ppc/kernel/misc.S 2007-07-27 14:31:31.000000000 +1000
@@ -237,9 +237,19 @@ _GLOBAL(_tlbie)
mfspr r4,SPRN_MMUCR
mfspr r5,SPRN_PID /* Get PID */
rlwimi r4,r5,0,24,31 /* Set TID */
- mtspr SPRN_MMUCR,r4
+ /* We have to run the search with interrupts disabled, even critical
+ * and debug interrupts (in fact the only critical exceptions we have
+ * are debug and machine check). Otherwise an interrupt which causes
+ * a TLB miss can clobber the MMUCR between the mtspr and the tlbsx. */
+ mfmsr r5
+ lis r6,(MSR_EE|MSR_CE|MSR_ME|MSR_DE)@ha
+ addi r6,r6,(MSR_EE|MSR_CE|MSR_ME|MSR_DE)@l
+ andc r6,r5,r6
+ mtmsr r6
+ mtspr SPRN_MMUCR,r4
tlbsx. r3, 0, r3
+ mtmsr r5
bne 10f
sync
/* There are only 64 TLB entries, so r3 < 64,
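As a side note on the lis/addi/andc sequence in the patch, here is a small C sketch of the arithmetic. The MSR bit values are written from memory of the kernel's Book-E register headers and should be treated as assumptions. The helper names (`ha16`, `lo16_signed`, `load_imm32`, `msr_with_irqs_masked`) are made up for illustration.

```c
#include <stdint.h>

/* Assumed Book-E MSR bit values; check them against the kernel headers. */
#define MSR_CE 0x00020000u
#define MSR_EE 0x00008000u
#define MSR_ME 0x00001000u
#define MSR_DE 0x00000200u
#define MSR_IRQ_MASK (MSR_EE | MSR_CE | MSR_ME | MSR_DE)

/* x@l: the low 16 bits, which addi sign-extends before adding. */
static int32_t lo16_signed(uint32_t x)
{
    int32_t v = (int32_t)(x & 0xffffu);
    return v >= 0x8000 ? v - 0x10000 : v;
}

/* x@ha: the high 16 bits, adjusted by one when the low half is negative,
 * so that (ha << 16) + sign_extend(lo) == x. */
static uint32_t ha16(uint32_t x)
{
    return ((x + 0x8000u) >> 16) & 0xffffu;
}

/* lis rD,x@ha followed by addi rD,rD,x@l */
static uint32_t load_imm32(uint32_t x)
{
    uint32_t r = ha16(x) << 16;           /* lis  r6,x@ha */
    r += (uint32_t)lo16_signed(x);        /* addi r6,r6,x@l (wraps mod 2^32) */
    return r;
}

/* andc r6,r5,r6: clear every mask bit from the saved MSR value. */
static uint32_t msr_with_irqs_masked(uint32_t msr)
{
    return msr & ~load_imm32(MSR_IRQ_MASK);
}
```

The @ha form matters here because the low half of the mask has bit 15 (MSR_EE) set, so addi subtracts rather than adds. A plain high-half load would leave the result 0x10000 too small.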
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: Fix small race in 44x tlbie function
From: Josh Boyer @ 2007-08-08 14:49 UTC
To: David Gibson; +Cc: linuxppc-dev, Paul Mackerras, Todd Inglett, Volkmar Uhlig
On Tue, 7 Aug 2007 14:20:50 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> The 440 family of processors don't have a tlbie instruction. So, we
> implement TLB invalidates by explicitly searching the TLB with tlbsx.,
> then clobbering the relevant entry, if any. Unfortunately the PID for
> the search needs to be stored in the MMUCR register, which is also
> used by the TLB miss handler. Interrupts were enabled in _tlbie(), so
> an interrupt between loading the MMUCR and the tlbsx could cause
> incorrect search results, and thus a failure to invalidate TLB entries
> which needed to be invalidated.
>
> This patch fixes the problem in both arch/ppc and arch/powerpc by
> inhibiting interrupts (even critical and debug interrupts) across the
> relevant instructions.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
And I agree this should go into 2.6.23.
josh
* Re: Fix small race in 44x tlbie function
From: Kumar Gala @ 2007-08-08 15:20 UTC
To: David Gibson; +Cc: linuxppc-dev, Paul Mackerras, Todd Inglett, Volkmar Uhlig
On Aug 6, 2007, at 11:20 PM, David Gibson wrote:
> The 440 family of processors don't have a tlbie instruction. So, we
> implement TLB invalidates by explicitly searching the TLB with tlbsx.,
> then clobbering the relevant entry, if any. Unfortunately the PID for
> the search needs to be stored in the MMUCR register, which is also
> used by the TLB miss handler. Interrupts were enabled in _tlbie(), so
> an interrupt between loading the MMUCR and the tlbsx could cause
> incorrect search results, and thus a failure to invalidate TLB entries
> which needed to be invalidated.
>
> This patch fixes the problem in both arch/ppc and arch/powerpc by
> inhibiting interrupts (even critical and debug interrupts) across the
> relevant instructions.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> Paul, this one's a bugfix, which I think should go into 2.6.23.
Did you actually see this happen?
- k
* Re: Fix small race in 44x tlbie function
From: Josh Boyer @ 2007-08-08 16:00 UTC
To: Kumar Gala
Cc: linuxppc-dev, Volkmar Uhlig, Paul Mackerras, Todd Inglett,
David Gibson
On Wed, 8 Aug 2007 10:20:45 -0500
Kumar Gala <galak@kernel.crashing.org> wrote:
>
> On Aug 6, 2007, at 11:20 PM, David Gibson wrote:
>
> > The 440 family of processors don't have a tlbie instruction. So, we
> > implement TLB invalidates by explicitly searching the TLB with tlbsx.,
> > then clobbering the relevant entry, if any. Unfortunately the PID for
> > the search needs to be stored in the MMUCR register, which is also
> > used by the TLB miss handler. Interrupts were enabled in _tlbie(), so
> > an interrupt between loading the MMUCR and the tlbsx could cause
> > incorrect search results, and thus a failure to invalidate TLB entries
> > which needed to be invalidated.
> >
> > This patch fixes the problem in both arch/ppc and arch/powerpc by
> > inhibiting interrupts (even critical and debug interrupts) across the
> > relevant instructions.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > Paul, this one's a bugfix, which I think should go into 2.6.23.
>
> Did you actually see this happen?
Yes.
josh
* Re: Fix small race in 44x tlbie function
From: Kumar Gala @ 2007-08-09 5:28 UTC
To: Josh Boyer
Cc: linuxppc-dev, Volkmar Uhlig, Paul Mackerras, Todd Inglett,
David Gibson
On Aug 8, 2007, at 11:00 AM, Josh Boyer wrote:
> On Wed, 8 Aug 2007 10:20:45 -0500
> Kumar Gala <galak@kernel.crashing.org> wrote:
>
>>
>> On Aug 6, 2007, at 11:20 PM, David Gibson wrote:
>>
>>> The 440 family of processors don't have a tlbie instruction. So, we
>>> implement TLB invalidates by explicitly searching the TLB with
>>> tlbsx.,
>>> then clobbering the relevant entry, if any. Unfortunately the
>>> PID for
>>> the search needs to be stored in the MMUCR register, which is also
>>> used by the TLB miss handler. Interrupts were enabled in _tlbie
>>> (), so
>>> an interrupt between loading the MMUCR and the tlbsx could cause
>>> incorrect search results, and thus a failure to invalidate TLB entries
>>> which needed to be invalidated.
>>>
>>> This patch fixes the problem in both arch/ppc and arch/powerpc by
>>> inhibiting interrupts (even critical and debug interrupts) across
>>> the
>>> relevant instructions.
>>>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>> ---
>>> Paul, this one's a bugfix, which I think should go into 2.6.23.
>>
>> Did you actually see this happen?
>
> Yes.
When?
We don't have critical wired to anything, I don't expect watchdog to
cause another fault.. so just wondering.
- k
* Re: Fix small race in 44x tlbie function
From: David Gibson @ 2007-08-09 5:34 UTC
To: Kumar Gala; +Cc: linuxppc-dev, Volkmar Uhlig, Paul Mackerras, Todd Inglett
On Thu, Aug 09, 2007 at 12:28:20AM -0500, Kumar Gala wrote:
>
> On Aug 8, 2007, at 11:00 AM, Josh Boyer wrote:
>
> > On Wed, 8 Aug 2007 10:20:45 -0500
> > Kumar Gala <galak@kernel.crashing.org> wrote:
> >
> >>
> >> On Aug 6, 2007, at 11:20 PM, David Gibson wrote:
> >>
> >>> The 440 family of processors don't have a tlbie instruction. So, we
> >>> implement TLB invalidates by explicitly searching the TLB with
> >>> tlbsx.,
> >>> then clobbering the relevant entry, if any. Unfortunately the
> >>> PID for
> >>> the search needs to be stored in the MMUCR register, which is also
> >>> used by the TLB miss handler. Interrupts were enabled in _tlbie
> >>> (), so
> >>> an interrupt between loading the MMUCR and the tlbsx could cause
> >>> incorrect search results, and thus a failure to invalidate TLB entries
> >>> which needed to be invalidated.
> >>>
> >>> This patch fixes the problem in both arch/ppc and arch/powerpc by
> >>> inhibiting interrupts (even critical and debug interrupts) across
> >>> the
> >>> relevant instructions.
> >>>
> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >>> ---
> >>> Paul, this one's a bugfix, which I think should go into 2.6.23.
> >>
> >> Did you actually see this happen?
> >
> > Yes.
>
> When?
>
> We don't have critical wired to anything, I don't expect watchdog to
> cause another fault.. so just wondering.
On debug (trace) interrupts on blue gene.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: Fix small race in 44x tlbie function
From: Kumar Gala @ 2007-08-09 6:35 UTC
To: David Gibson; +Cc: linuxppc-dev, Volkmar Uhlig, Paul Mackerras, Todd Inglett
>>>> Did you actually see this happen?
>>>
>>> Yes.
>>
>> When?
>>
>> We don't have critical wired to anything, I don't expect watchdog to
>> cause another fault.. so just wondering.
>
> On debug (trace) interrupts on blue gene.
Do you know why the debug code caused a fault?
- k
* Re: Fix small race in 44x tlbie function
From: Benjamin Herrenschmidt @ 2007-08-09 7:01 UTC
To: Kumar Gala
Cc: Todd Inglett, linuxppc-dev, Paul Mackerras, Volkmar Uhlig,
David Gibson
On Thu, 2007-08-09 at 01:35 -0500, Kumar Gala wrote:
> >>>> Did you actually see this happen?
> >>>
> >>> Yes.
> >>
> >> When?
> >>
> >> We don't have critical wired to anything, I don't expect watchdog to
> >> cause another fault.. so just wondering.
> >
> > On debug (trace) interrupts on blue gene.
>
> Do you know why the debug code caused a fault?
Sure, it may access vmalloc space for example, which can cause a TLB
miss...
Ben.
* Re: Fix small race in 44x tlbie function
From: Josh Boyer @ 2007-08-09 12:04 UTC
To: Kumar Gala
Cc: linuxppc-dev, Volkmar Uhlig, Paul Mackerras, Todd Inglett,
David Gibson
On Thu, Aug 09, 2007 at 12:28:20AM -0500, Kumar Gala wrote:
> >>Did you actually see this happen?
> >
> >Yes.
>
> When?
During some bluegene debug.
> We don't have critical wired to anything, I don't expect watchdog to
> cause another fault.. so just wondering.
We being who? I'm slightly confused here.
josh
* Re: Fix small race in 44x tlbie function
From: Benjamin Herrenschmidt @ 2007-08-09 13:05 UTC
To: Josh Boyer
Cc: Volkmar Uhlig, linuxppc-dev, Paul Mackerras, Todd Inglett,
David Gibson
On Thu, 2007-08-09 at 07:04 -0500, Josh Boyer wrote:
>
> > We don't have critical wired to anything, I don't expect watchdog
> to
> > cause another fault.. so just wondering.
>
> We being who? I'm slightly confused here.
I think Kumar doesn't know that we are talking about the BG kernel which
has more things "wired" to CRIT than what is upstream at the moment :-)
Ben.
* Re: Fix small race in 44x tlbie function
From: Josh Boyer @ 2007-08-09 13:26 UTC
To: Benjamin Herrenschmidt
Cc: Volkmar Uhlig, linuxppc-dev, Paul Mackerras, Todd Inglett,
David Gibson
On Thu, 09 Aug 2007 23:05:36 +1000
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> On Thu, 2007-08-09 at 07:04 -0500, Josh Boyer wrote:
> >
> > > We don't have critical wired to anything, I don't expect watchdog
> > to
> > > cause another fault.. so just wondering.
> >
> > We being who? I'm slightly confused here.
>
> I think Kumar doesn't know that we are talking about the BG kernel
> which has more things "wired" to CRIT than what is upstream at the
> moment :-)
Ah, sure. But even though we don't have much upstream that uses CE,
that doesn't mean someone can't reprogram the UICs on their boards to
use CE for some things, for example. I know of at least one project
that has done that in the past.
josh
* Re: Fix small race in 44x tlbie function
From: Hollis Blanchard @ 2007-08-08 20:43 UTC
To: linuxppc-dev
On Tue, 07 Aug 2007 14:20:50 +1000, David Gibson wrote:
>
> This patch fixes the problem in both arch/ppc and arch/powerpc by
> inhibiting interrupts (even critical and debug interrupts) across the
> relevant instructions.
How could a critical or debug interrupt modify the contents of MMUCR?
--
Hollis Blanchard
IBM Linux Technology Center
* Re: Fix small race in 44x tlbie function
From: Josh Boyer @ 2007-08-08 21:29 UTC
To: Hollis Blanchard; +Cc: linuxppc-dev
On Wed, 8 Aug 2007 20:43:25 +0000 (UTC)
Hollis Blanchard <hollisb@us.ibm.com> wrote:
> On Tue, 07 Aug 2007 14:20:50 +1000, David Gibson wrote:
> >
> > This patch fixes the problem in both arch/ppc and arch/powerpc by
> > inhibiting interrupts (even critical and debug interrupts) across the
> > relevant instructions.
>
> How could a critical or debug interrupt modify the contents of MMUCR?
Interrupts from UICs can be configured as critical. If one of those
triggers, (or any other CE triggers) and causes a tlb miss, you have a
race. The watchdog timer interrupt also is a CE IIRC.
CE and DE are admittedly a much smaller race, but still possible.
Masking EE off is the largest one.
josh
* Re: Fix small race in 44x tlbie function
From: Hollis Blanchard @ 2007-08-08 22:11 UTC
To: Josh Boyer; +Cc: linuxppc-dev
On Wed, 2007-08-08 at 16:29 -0500, Josh Boyer wrote:
> On Wed, 8 Aug 2007 20:43:25 +0000 (UTC)
> Hollis Blanchard <hollisb@us.ibm.com> wrote:
>
> > On Tue, 07 Aug 2007 14:20:50 +1000, David Gibson wrote:
> > >
> > > This patch fixes the problem in both arch/ppc and arch/powerpc by
> > > inhibiting interrupts (even critical and debug interrupts) across the
> > > relevant instructions.
> >
> > How could a critical or debug interrupt modify the contents of MMUCR?
>
> Interrupts from UICs can be configured as critical. If one of those
> triggers, (or any other CE triggers) and causes a tlb miss, you have a
> race. The watchdog timer interrupt also is a CE IIRC.
By "causes a tlb miss", you mean the interrupt handler associated with
the critical-priority UIC interrupt performs MMIO which causes a TLB
miss? Regular code couldn't cause a TLB miss AFAICS, since the kernel is
always mapped, and an interrupt handler doesn't access userspace.
--
Hollis Blanchard
IBM Linux Technology Center
* Re: Fix small race in 44x tlbie function
From: Benjamin Herrenschmidt @ 2007-08-08 23:30 UTC
To: Hollis Blanchard; +Cc: linuxppc-dev
On Wed, 2007-08-08 at 17:11 -0500, Hollis Blanchard wrote:
> On Wed, 2007-08-08 at 16:29 -0500, Josh Boyer wrote:
> > On Wed, 8 Aug 2007 20:43:25 +0000 (UTC)
> > Hollis Blanchard <hollisb@us.ibm.com> wrote:
> >
> > > On Tue, 07 Aug 2007 14:20:50 +1000, David Gibson wrote:
> > > >
> > > > This patch fixes the problem in both arch/ppc and arch/powerpc by
> > > > inhibiting interrupts (even critical and debug interrupts) across the
> > > > relevant instructions.
> > >
> > > How could a critical or debug interrupt modify the contents of MMUCR?
> >
> > Interrupts from UICs can be configured as critical. If one of those
> > triggers, (or any other CE triggers) and causes a tlb miss, you have a
> > race. The watchdog timer interrupt also is a CE IIRC.
>
> By "causes a tlb miss", you mean the interrupt handler associated with
> the critical-priority UIC interrupt performs MMIO which causes a TLB
> miss? Regular code couldn't cause a TLB miss AFAICS, since the kernel is
> always mapped, and an interrupt handler doesn't access userspace.
ioremap is an example, vmalloc space is another...
Ben.
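To make the distinction in this exchange concrete, here is a minimal C sketch. It assumes the kernel's linear mapping is covered by pinned TLB entries, while anything outside that window (ioremap or vmalloc addresses, for instance) is only mapped on demand by the miss handler. The function name and its parameters are hypothetical, and the caller supplies the window bounds since they depend on the platform.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if an access to addr could take a TLB miss, under the
 * assumption that only [pinned_start, pinned_start + pinned_size) is
 * covered by pinned entries that are never evicted.
 * The unsigned subtraction wraps, so addresses below pinned_start also
 * compare as outside the window. */
static bool may_take_tlb_miss(uint32_t addr, uint32_t pinned_start,
                              uint32_t pinned_size)
{
    return (uint32_t)(addr - pinned_start) >= pinned_size;
}
```

For example, with a pinned window assumed to start at 0xc0000000 and span 256MB, an address such as 0xd1000000 falls outside it, so a handler touching it can end up in the TLB miss handler and overwrite MMUCR.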
* Re: Fix small race in 44x tlbie function
From: Josh Boyer @ 2007-08-08 23:41 UTC
To: Hollis Blanchard; +Cc: linuxppc-dev
On Wed, Aug 08, 2007 at 05:11:09PM -0500, Hollis Blanchard wrote:
> On Wed, 2007-08-08 at 16:29 -0500, Josh Boyer wrote:
> > On Wed, 8 Aug 2007 20:43:25 +0000 (UTC)
> > Hollis Blanchard <hollisb@us.ibm.com> wrote:
> >
> > > On Tue, 07 Aug 2007 14:20:50 +1000, David Gibson wrote:
> > > >
> > > > This patch fixes the problem in both arch/ppc and arch/powerpc by
> > > > inhibiting interrupts (even critical and debug interrupts) across the
> > > > relevant instructions.
> > >
> > > How could a critical or debug interrupt modify the contents of MMUCR?
> >
> > Interrupts from UICs can be configured as critical. If one of those
> > triggers, (or any other CE triggers) and causes a tlb miss, you have a
> > race. The watchdog timer interrupt also is a CE IIRC.
>
> By "causes a tlb miss", you mean the interrupt handler associated with
> the critical-priority UIC interrupt performs MMIO which causes a TLB
> miss? Regular code couldn't cause a TLB miss AFAICS, since the kernel is
> always mapped, and an interrupt handler doesn't access userspace.
Yes.
josh
* Re: Fix small race in 44x tlbie function
From: Benjamin Herrenschmidt @ 2007-08-08 23:01 UTC
To: Josh Boyer; +Cc: linuxppc-dev, Hollis Blanchard
On Wed, 2007-08-08 at 16:29 -0500, Josh Boyer wrote:
> On Wed, 8 Aug 2007 20:43:25 +0000 (UTC)
> Hollis Blanchard <hollisb@us.ibm.com> wrote:
>
> > On Tue, 07 Aug 2007 14:20:50 +1000, David Gibson wrote:
> > >
> > > This patch fixes the problem in both arch/ppc and arch/powerpc by
> > > inhibiting interrupts (even critical and debug interrupts) across the
> > > relevant instructions.
> >
> > How could a critical or debug interrupt modify the contents of MMUCR?
>
> Interrupts from UICs can be configured as critical. If one of those
> triggers, (or any other CE triggers) and causes a tlb miss, you have a
> race. The watchdog timer interrupt also is a CE IIRC.
>
> CE and DE are admittedly a much smaller race, but still possible.
> Masking EE off is the largest one.
There is a much bigger problem if CEs can do tlb misses though... they
can interrupt the tlb miss handler itself, either between the two halves
of a tlb write, or between the write to MMUCR and the write to the tlb,
and I suspect both cases will cause trouble.
We might want to check if we were in the TLB miss handler upon return
from the CE and MCE handlers, and in this case, restart them (just
return to the faulting instruction, that is use srr0 instead of
csrr0/mcsrr0).
Ben.
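The idea suggested above could look roughly like the following C sketch. It is not existing kernel code: `struct crit_frame`, `crit_return_address()` and the handler bounds are hypothetical. It also ignores the matching MSR value (srr1 versus csrr1), which a real exit path would have to handle as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot taken on entry to a critical interrupt handler. */
struct crit_frame {
    uint32_t csrr0;   /* where the critical interrupt hit */
    uint32_t srr0;    /* instruction that took the interrupted TLB miss */
};

static bool in_range(uint32_t pc, uint32_t start, uint32_t end)
{
    return pc >= start && pc < end;   /* half-open range [start, end) */
}

/* Pick the address to resume at when leaving the critical handler.
 * If the critical interrupt landed inside the TLB miss handler, abandon
 * the half-finished miss and restart the original faulting instruction,
 * so the miss is simply taken again from scratch. */
static uint32_t crit_return_address(const struct crit_frame *f,
                                    uint32_t tlb_handler_start,
                                    uint32_t tlb_handler_end)
{
    if (in_range(f->csrr0, tlb_handler_start, tlb_handler_end))
        return f->srr0;               /* restart via srr0 */
    return f->csrr0;                  /* normal critical return */
}
```

Restarting is presumably safe because the faulting instruction has not completed, so re-executing it repeats the miss with a fresh MMUCR and a complete TLB write.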
* Re: Fix small race in 44x tlbie function
From: Josh Boyer @ 2007-08-09 0:06 UTC
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Hollis Blanchard
On Thu, Aug 09, 2007 at 09:01:29AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2007-08-08 at 16:29 -0500, Josh Boyer wrote:
> > On Wed, 8 Aug 2007 20:43:25 +0000 (UTC)
> > Hollis Blanchard <hollisb@us.ibm.com> wrote:
> >
> > > On Tue, 07 Aug 2007 14:20:50 +1000, David Gibson wrote:
> > > >
> > > > This patch fixes the problem in both arch/ppc and arch/powerpc by
> > > > inhibiting interrupts (even critical and debug interrupts) across the
> > > > relevant instructions.
> > >
> > > How could a critical or debug interrupt modify the contents of MMUCR?
> >
> > Interrupts from UICs can be configured as critical. If one of those
> > triggers, (or any other CE triggers) and causes a tlb miss, you have a
> > race. The watchdog timer interrupt also is a CE IIRC.
> >
> > CE and DE are admittedly a much smaller race, but still possible.
> > Masking EE off is the largest one.
>
> There is a much bigger problem if CEs can do tlb misses though... they
> can interrupt the tlb miss handler itself, either between the two halves
> of a tlb write, or between the write to MMUCR and the write to the tlb,
> and I suspect both cases will cause trouble.
Yes.
> We might want to check if we were in the TLB miss handler upon return
> from the CE and MCE handlers, and in this case, restart them (just
> return to the faulting instruction, that is use srr0 instead of
> csrr0/mcsrr0).
Something should be looked at, yeah.
josh
* RE: Fix small race in 44x tlbie function
From: Volkmar Uhlig @ 2007-08-08 15:34 UTC
To: galak, david; +Cc: linuxppc-dev, paulus, Todd Inglett
> -----Original Message-----
> From: galak@kernel.crashing.org [mailto:galak@kernel.crashing.org]
> Sent: Wednesday, August 08, 2007 11:21 AM
> To: david@gibson.dropbear.id.au
> Cc: paulus@samba.org; linuxppc-dev@ozlabs.org; Todd Inglett;
> Volkmar Uhlig
> Subject: Re: Fix small race in 44x tlbie function
>
>
> On Aug 6, 2007, at 11:20 PM, David Gibson wrote:
>
> > The 440 family of processors don't have a tlbie instruction. So, we
> > implement TLB invalidates by explicitly searching the TLB
> with tlbsx.,
> > then clobbering the relevant entry, if any. Unfortunately
> the PID for
> > the search needs to be stored in the MMUCR register, which is also
> > used by the TLB miss handler. Interrupts were enabled in
> _tlbie(), so
> > an interrupt between loading the MMUCR and the tlbsx could cause
> > incorrect search results, and thus a failure to invalidate TLB entries
> > which needed to be invalidated.
> >
> > This patch fixes the problem in both arch/ppc and arch/powerpc by
> > inhibiting interrupts (even critical and debug interrupts)
> across the
> > relevant instructions.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > Paul, this one's a bugfix, which I think should go into 2.6.23.
>
> Did you actually see this happen?
Yes! (I guess you didn't get the initial mail...)
- Volkmar