* ERRATA work-arounds in the kernel
From: Mason @ 2015-03-20 16:30 UTC
To: linux-arm-kernel
Hello everyone,
I was recently looking (in arch/arm/Kconfig) at the list of work-arounds
for ARM CPU Errata. I saw 18, with 8 documented as applying to Cortex A9.
I also looked at ARM's "Errata Summary Table" for the Cortex A9. There are
roughly 90 errata documented there. (This document is 2 years old.)
I assume that some (most?) of these do not apply to Linux, but it seems
likely that some do?
I'm wondering why there are not more work-arounds available in Kconfig?
Could it be that some work-arounds have been applied unconditionally, thus
not showing as an entry in Kconfig? (I doubt that, since work-arounds are
very CPU-specific.)
I'm wondering if it is possible to trigger some of these with a "normal"
work-load on a "normal" kernel? Has anyone (perhaps ARM employees) looked
at that? (I suppose they have.)
For example, erratum #782772
"Speculative execution of a Load-Exclusive or Store-Exclusive instruction
after a write to Strongly Ordered memory might deadlock the processor."
(The recommended work-around is a strategically-placed DMB.)
Since ldrex is used in low-level code, it seems possible to hit that one?
Or perhaps Linux does not support "Strongly Ordered" memory regions?
Regards.
* ERRATA work-arounds in the kernel
From: Catalin Marinas @ 2015-03-20 17:20 UTC
To: linux-arm-kernel
On Fri, Mar 20, 2015 at 05:30:10PM +0100, Mason wrote:
> I also looked at ARM's "Errata Summary Table" for the Cortex A9. There are
> roughly 90 errata documented there. (This document is 2 years old.)
>
> I assume that some (most?) of these do not apply to Linux, but it seems
> likely that some do?
>
> I'm wondering why there are not more work-arounds available in Kconfig?
There are a few reasons:
- erratum cannot be triggered in Linux
- erratum cannot be worked around in Linux (e.g. it requires some
undocumented control bits to be set by firmware or even hw workaround
like the system errata)
- cat A erratum with no feasible workaround (and partners usually take
an ECO fix)
- erratum does not affect any CPU revision in production (not all rxpy
revisions are in the field; I would include here early CPU revisions
that were licensed as development chips but not widely used)
- we simply missed them. So if you think there is any that needs to be
upstreamed, let us know or submit a patch
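
As an illustration of the "rxpy" point in the list above, here is a minimal sketch of how a revision range could be checked from the ARMv7 Main ID Register (MIDR). The field layout and the Cortex-A9 part number (0xC09) are architectural; the helper names `decode_midr` and `a9_rev_in_range` are made up for this example and are not kernel APIs, and the MIDR value is passed in as a plain integer rather than read from CP15.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of the ARMv7 Main ID Register (MIDR). */
struct cpu_id {
    unsigned implementer; /* bits [31:24], 0x41 for ARM Ltd */
    unsigned variant;     /* bits [23:20], the "x" in rxpy */
    unsigned part;        /* bits [15:4], 0xC09 for Cortex-A9 */
    unsigned revision;    /* bits [3:0], the "y" in rxpy */
};

/* Hypothetical helper: split a raw MIDR value into its fields. */
static inline struct cpu_id decode_midr(uint32_t midr)
{
    struct cpu_id id = {
        .implementer = (midr >> 24) & 0xff,
        .variant     = (midr >> 20) & 0xf,
        .part        = (midr >> 4) & 0xfff,
        .revision    = midr & 0xf,
    };
    return id;
}

/*
 * Hypothetical helper: true if the CPU is an ARM Cortex-A9 whose revision
 * lies in [min_rxpy, max_rxpy] inclusive, each packed as (x << 4) | y.
 */
static inline bool a9_rev_in_range(uint32_t midr, unsigned min_rxpy,
                                   unsigned max_rxpy)
{
    struct cpu_id id = decode_midr(midr);
    unsigned rxpy = (id.variant << 4) | id.revision;

    if (id.implementer != 0x41 || id.part != 0xC09)
        return false;
    return rxpy >= min_rxpy && rxpy <= max_rxpy;
}
```

For example, a MIDR of 0x413FC090 decodes as Cortex-A9 r3p0, so an erratum listed as fixed in r3p0 would be checked against a range ending at r2p10 (0x2A) and would not match.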
> I'm wondering if it is possible to trigger some of these with a "normal"
> work-load on a "normal" kernel? Has anyone (perhaps ARM employees) looked
> at that? (I suppose they have.)
Define "normal". It's really hard to quantify as the workloads can vary
widely between different use cases (e.g. mobile vs server).
> For example, erratum #782772
> "Speculative execution of a Load-Exclusive or Store-Exclusive instruction
> after a write to Strongly Ordered memory might deadlock the processor."
> (The recommended work-around is a strategically-placed DMB.)
>
> Since ldrex is used in low-level code, it seems possible to hit that one?
> Or perhaps Linux does not support "Strongly Ordered" memory regions?
It supports SO memory, and it is used in some cases.
--
Catalin
* ERRATA work-arounds in the kernel
From: Mason @ 2015-03-20 21:40 UTC
To: linux-arm-kernel
On 20/03/2015 18:20, Catalin Marinas wrote:
> On Fri, Mar 20, 2015 at 05:30:10PM +0100, Mason wrote:
>> I also looked at ARM's "Errata Summary Table" for the Cortex A9. There are
>> roughly 90 errata documented there. (This document is 2 years old.)
>>
>> I assume that some (most?) of these do not apply to Linux, but it seems
>> likely that some do?
>>
>> I'm wondering why there are not more work-arounds available in Kconfig?
>
> There are a few reasons:
>
> - erratum cannot be triggered in Linux
> - erratum cannot be worked around in Linux (e.g. it requires some
> undocumented control bits to be set by firmware or even hw workaround
> like the system errata)
> - cat A erratum with no feasible workaround (and partners usually take
> an ECO fix)
What's an ECO fix?
> - erratum does not affect any CPU revision in production (not all rxpy
> revisions are in the field; I would include here early CPU revisions
> that were licensed as development chips but not widely used)
> - we simply missed them. So if you think there is any that needs to be
> upstreamed, let us know or submit a patch
>
>> I'm wondering if it is possible to trigger some of these with a "normal"
>> work-load on a "normal" kernel? Has anyone (perhaps ARM employees) looked
>> at that? (I suppose they have.)
>
> Define "normal". It's really hard to quantify as the workloads can vary
> widely between different use cases (e.g. mobile vs server).
Well, the quotes around "normal" were a tongue-in-cheek cop-out
recognizing that defining "norm" here is tricky business ;-)
That being said, there are errata (speaking generally, not just
about ARM) that only trigger in the lab (or in simulation) and
there are errata that fire more readily (more hand-waving, sorry).
And #782772 looked like the latter to me (but I would defer to
your experience).
>> For example, erratum #782772
>> "Speculative execution of a Load-Exclusive or Store-Exclusive instruction
>> after a write to Strongly Ordered memory might deadlock the processor."
>> (The recommended work-around is a strategically-placed DMB.)
>>
>> Since ldrex is used in low-level code, it seems possible to hit that one?
>> Or perhaps Linux does not support "Strongly Ordered" memory regions?
>
> It supports SO memory, and it is used in some cases.
Therefore, erratum 782772 could trigger on a "typical" system,
right?
Looking more closely at mmu.c
static struct mem_type mem_types[] = {
    [MT_DEVICE] = {  /* Strongly ordered / ARMv6 shared device */
        .prot_pte    = PROT_PTE_DEVICE | L_PTE_MT_DEV_SHARED |
                       L_PTE_SHARED,
        .prot_pte_s2 = s2_policy(PROT_PTE_S2_DEVICE) |
                       s2_policy(L_PTE_S2_MT_DEV_SHARED) |
                       L_PTE_SHARED,
        .prot_l1     = PMD_TYPE_TABLE,
        .prot_sect   = PROT_SECT_DEVICE | PMD_SECT_S,
        .domain      = DOMAIN_IO,
    },
Perhaps I'll come back to the list of errata once I've taken
care of more trivial matters (and once I have a better grasp
of Linux internals).
Regards.
* ERRATA work-arounds in the kernel
From: Catalin Marinas @ 2015-03-23 12:49 UTC
To: linux-arm-kernel
On Fri, Mar 20, 2015 at 10:40:09PM +0100, Mason wrote:
> On 20/03/2015 18:20, Catalin Marinas wrote:
> > On Fri, Mar 20, 2015 at 05:30:10PM +0100, Mason wrote:
> >> I also looked at ARM's "Errata Summary Table" for the Cortex A9. There are
> >> roughly 90 errata documented there. (This document is 2 years old.)
> >>
> >> I assume that some (most?) of these do not apply to Linux, but it seems
> >> likely that some do?
> >>
> >> I'm wondering why there are not more work-arounds available in Kconfig?
> >
> > There are a few reasons:
> >
> > - erratum cannot be triggered in Linux
> > - erratum cannot be worked around in Linux (e.g. it requires some
> > undocumented control bits to be set by firmware or even hw workaround
> > like the system errata)
> > - cat A erratum with no feasible workaround (and partners usually take
> > an ECO fix)
>
> What's an ECO fix?
A netlist fix for hardware:
http://en.wikipedia.org/wiki/Engineering_change_order
> > - erratum does not affect any CPU revision in production (not all rxpy
> > revisions are in the field; I would include here early CPU revisions
> > that were licensed as development chips but not widely used)
> > - we simply missed them. So if you think there is any that needs to be
> > upstreamed, let us know or submit a patch
> >
> >> I'm wondering if it is possible to trigger some of these with a "normal"
> >> work-load on a "normal" kernel? Has anyone (perhaps ARM employees) looked
> >> at that? (I suppose they have.)
> >
> > Define "normal". It's really hard to quantify as the workloads can vary
> > widely between different use cases (e.g. mobile vs server).
>
> Well, the quotes around "normal" were a tongue-in-cheek cop-out
> recognizing that defining "norm" here is tricky business ;-)
>
> That being said, there are errata (speaking generally, not just
> about ARM) that only trigger in the lab (or in simulation) and
> there are errata that fire more readily (more hand-waving, sorry).
>
> And #782772 looked like the latter to me (but I would defer to
> your experience).
Unless we are sure that the conditions cannot be met in Linux, there is
no way to guarantee that the erratum won't hit. Many of them are really
unlikely and may have never been reproduced at top level (bare metal
software) but since it cannot be guaranteed, we implement the
workarounds in the kernel. It is up to the device vendor to decide
whether to enable it in production or not (based on some intensive
testing). Also, if we can't reproduce it here, it doesn't mean that a
different device won't trigger it, especially when the erratum is highly
dependent on timings (the reverse is also true, we trigger it here but
some hw vendors can't).
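
To make the "implemented in the kernel, enabled by the vendor" idea concrete, here is a rough sketch of a compile-time-gated barrier placed between a write to SO memory and an exclusive access. This is not the kernel's code: `CONFIG_HYPOTHETICAL_ERRATUM_782772` is an invented option name (in a real build it would come from Kconfig), `so_write` only stands in for a store to a Strongly Ordered mapping, and the exact placement of the DMB is an assumption based on the erratum summary quoted in this thread. On non-ARM hosts a C11 fence is used in place of the DMB so the example still compiles.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical option name, for illustration only. */
#ifndef CONFIG_HYPOTHETICAL_ERRATUM_782772
#define CONFIG_HYPOTHETICAL_ERRATUM_782772 1
#endif

/* Emits a barrier only when the (hypothetical) workaround is enabled. */
static inline void erratum_barrier(void)
{
#if CONFIG_HYPOTHETICAL_ERRATUM_782772
#if defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7
    __asm__ volatile("dmb sy" ::: "memory");
#else
    /* Portable stand-in for DMB when not building for ARMv7. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
#endif
}

/* Stand-in for a store to a Strongly Ordered mapping (e.g. a register). */
static inline void so_write(volatile uint32_t *reg, uint32_t val)
{
    *reg = val;
}

/*
 * Write to "SO memory", then atomically increment a counter and return the
 * new value. On ARMv7 the increment typically compiles to an LDREX/STREX
 * loop, which is the access the barrier is meant to keep apart from the
 * preceding SO write.
 */
static inline uint32_t so_write_then_increment(volatile uint32_t *reg,
                                               uint32_t val,
                                               atomic_uint *counter)
{
    so_write(reg, val);
    erratum_barrier();
    return atomic_fetch_add_explicit(counter, 1, memory_order_relaxed) + 1;
}
```

Building with the option defined to 0 compiles the barrier out entirely, which mirrors the choice left to the device vendor.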
> >> For example, erratum #782772
> >> "Speculative execution of a Load-Exclusive or Store-Exclusive instruction
> >> after a write to Strongly Ordered memory might deadlock the processor."
> >> (The recommended work-around is a strategically-placed DMB.)
> >>
> >> Since ldrex is used in low-level code, it seems possible to hit that one?
> >> Or perhaps Linux does not support "Strongly Ordered" memory regions?
> >
> > It supports SO memory, and it is used in some cases.
>
> Therefore, erratum 782772 could trigger on a "typical" system,
> right?
If that "typical" system is using SO memory. There are some cases where
Linux ends up with SO memory (MT_UNCACHED or pgprot_noncached).
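
As background on what separates SO from Device and Normal memory at the page-table level, here is a small sketch that classifies the TEX/C/B attribute bits of an ARMv7 short-descriptor entry. It assumes TEX remap is disabled, which as far as I know is not how Linux normally configures ARMv7, so it shows the architectural default encoding rather than what MT_UNCACHED or pgprot_noncached literally write. The names `mem_class` and `classify_tex_cb` are invented for this example.

```c
enum mem_class {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE,
    MEM_NORMAL,
    MEM_RESERVED, /* reserved or implementation defined encodings */
};

/*
 * Classify ARMv7 short-descriptor memory attributes, assuming TEX remap is
 * disabled. tex is the 3-bit TEX field; c and b are single bits.
 */
static inline enum mem_class classify_tex_cb(unsigned tex, unsigned c,
                                             unsigned b)
{
    tex &= 0x7;
    c &= 1;
    b &= 1;

    if (tex & 0x4)
        return MEM_NORMAL; /* TEX=1BB: cacheable Normal memory */

    switch ((tex << 2) | (c << 1) | b) {
    case 0x0: /* TEX=000 C=0 B=0 */
        return MEM_STRONGLY_ORDERED;
    case 0x1: /* TEX=000 C=0 B=1: shareable Device */
    case 0x8: /* TEX=010 C=0 B=0: non-shareable Device */
        return MEM_DEVICE;
    case 0x2: /* TEX=000 C=1 B=0: write-through */
    case 0x3: /* TEX=000 C=1 B=1: write-back, no write-allocate */
    case 0x4: /* TEX=001 C=0 B=0: non-cacheable */
    case 0x7: /* TEX=001 C=1 B=1: write-back, write-allocate */
        return MEM_NORMAL;
    default:
        return MEM_RESERVED;
    }
}
```

Under that assumption only the all-zero encoding is Strongly Ordered, so a mapping is exposed to this erratum only if its attributes resolve to that type.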
--
Catalin