* ERRATA work-arounds in the kernel
From: Mason @ 2015-03-20 16:30 UTC
To: linux-arm-kernel

Hello everyone,

I was recently looking (in arch/arm/Kconfig) at the list of work-arounds
for ARM CPU Errata. I saw 18, with 8 documented as applying to Cortex A9.

I also looked at ARM's "Errata Summary Table" for the Cortex A9. There are
roughly 90 errata documented there. (This document is 2 years old.)

I assume that some (most?) of these do not apply to Linux, but it seems
likely that some do?

I'm wondering why there are not more work-arounds available in Kconfig?

Could it be that some work-arounds have been applied unconditionally,
thus not showing as an entry in Kconfig? (I doubt that, since work-arounds
are very CPU-specific.)

I'm wondering if it is possible to trigger some of these with a "normal"
work-load on a "normal" kernel? Has anyone (perhaps ARM employees) looked
at that? (I suppose they have.)

For example, erratum #782772
"Speculative execution of a Load-Exclusive or Store-Exclusive instruction
after a write to Strongly Ordered memory might deadlock the processor."
(The recommended work-around is a strategically-placed DMB.)

Since ldrex is used in low-level code, it seems possible to hit that one?
Or perhaps Linux does not support "Strongly Ordered" memory regions?

Regards.
* ERRATA work-arounds in the kernel
From: Catalin Marinas @ 2015-03-20 17:20 UTC
To: linux-arm-kernel

On Fri, Mar 20, 2015 at 05:30:10PM +0100, Mason wrote:
> I also looked at ARM's "Errata Summary Table" for the Cortex A9. There are
> roughly 90 errata documented there. (This document is 2 years old.)
>
> I assume that some (most?) of these do not apply to Linux, but it seems
> likely that some do?
>
> I'm wondering why there are not more work-arounds available in Kconfig?

There are a few reasons:

- erratum cannot be triggered in Linux
- erratum cannot be worked around in Linux (e.g. it requires some
  undocumented control bits to be set by firmware or even hw workaround
  like the system errata)
- cat A erratum with no feasible workaround (and partners usually take
  an ECO fix)
- erratum does not affect any CPU revision in production (not all rxpy
  revisions are in the field; I would include here early CPU revisions
  that were licensed as development chips but not widely used)
- we simply missed them. So if you think there is any that needs to be
  upstreamed, let us know or submit a patch

> I'm wondering if it is possible to trigger some of these with a "normal"
> work-load on a "normal" kernel? Has anyone (perhaps ARM employees) looked
> at that? (I suppose they have.)

Define "normal". It's really hard to quantify as the workloads can vary
widely between different use cases (e.g. mobile vs server).

> For example, erratum #782772
> "Speculative execution of a Load-Exclusive or Store-Exclusive instruction
> after a write to Strongly Ordered memory might deadlock the processor."
> (The recommended work-around is a strategically-placed DMB.)
>
> Since ldrex is used in low-level code, it seems possible to hit that one?
> Or perhaps Linux does not support "Strongly Ordered" memory regions?

It supports SO memory and it's used in some cases.

-- 
Catalin
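
[Editorial note] The point about rXpY revisions in the reply above can be made concrete with a small sketch. It assumes the standard ARMv7 Main ID Register (MIDR) layout: implementer in bits [31:24], variant (the "x" in rXpY) in bits [23:20], part number in bits [15:4] (0xC09 for Cortex-A9), and revision (the "y") in bits [3:0]. The names `cpu_id`, `decode_midr`, `rxpy` and `erratum_affects` are hypothetical helpers made up for this illustration, not kernel APIs, and the affected range used in the usage note is invented for the example; the real range for any given erratum has to come from ARM's errata document.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of the ARM Main ID Register (MIDR). */
struct cpu_id {
	uint8_t  implementer;	/* bits [31:24], 0x41 for ARM Ltd. */
	uint8_t  variant;	/* bits [23:20], the "x" in rXpY */
	uint16_t part;		/* bits [15:4], 0xC09 for Cortex-A9 */
	uint8_t  revision;	/* bits [3:0], the "y" in rXpY */
};

/* Hypothetical helper: split a raw MIDR value into its fields. */
static struct cpu_id decode_midr(uint32_t midr)
{
	struct cpu_id id;

	id.implementer = (uint8_t)((midr >> 24) & 0xff);
	id.variant     = (uint8_t)((midr >> 20) & 0xf);
	id.part        = (uint16_t)((midr >> 4) & 0xfff);
	id.revision    = (uint8_t)(midr & 0xf);
	return id;
}

/* Hypothetical helper: pack rXpY into one comparable number. */
static unsigned int rxpy(unsigned int variant, unsigned int revision)
{
	return (variant << 4) | revision;
}

/*
 * Hypothetical helper: true if the CPU is the given part and its
 * revision lies in the inclusive range [first, last], both packed
 * with rxpy(). The range must be taken from the errata document.
 */
static bool erratum_affects(struct cpu_id id, uint16_t part,
			    unsigned int first, unsigned int last)
{
	unsigned int rev = rxpy(id.variant, id.revision);

	return id.part == part && rev >= first && rev <= last;
}
```

For instance, `decode_midr(0x413FC090)` describes a Cortex-A9 r3p0, so `erratum_affects(id, 0xC09, rxpy(2, 0), rxpy(2, 2))` is false for it: a work-around limited to a made-up r2p0..r2p2 range would be skipped on that part.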
* ERRATA work-arounds in the kernel
From: Mason @ 2015-03-20 21:40 UTC
To: linux-arm-kernel

On 20/03/2015 18:20, Catalin Marinas wrote:
> On Fri, Mar 20, 2015 at 05:30:10PM +0100, Mason wrote:
>> I also looked at ARM's "Errata Summary Table" for the Cortex A9. There are
>> roughly 90 errata documented there. (This document is 2 years old.)
>>
>> I assume that some (most?) of these do not apply to Linux, but it seems
>> likely that some do?
>>
>> I'm wondering why there are not more work-arounds available in Kconfig?
>
> There are a few reasons:
>
> - erratum cannot be triggered in Linux
> - erratum cannot be worked around in Linux (e.g. it requires some
>   undocumented control bits to be set by firmware or even hw workaround
>   like the system errata)
> - cat A erratum with no feasible workaround (and partners usually take
>   an ECO fix)

What's an ECO fix?

> - erratum does not affect any CPU revision in production (not all rxpy
>   revisions are in the field; I would include here early CPU revisions
>   that were licensed as development chips but not widely used)
> - we simply missed them. So if you think there is any that needs to be
>   upstreamed, let us know or submit a patch
>
>> I'm wondering if it is possible to trigger some of these with a "normal"
>> work-load on a "normal" kernel? Has anyone (perhaps ARM employees) looked
>> at that? (I suppose they have.)
>
> Define "normal". It's really hard to quantify as the workloads can vary
> widely between different use cases (e.g. mobile vs server).

Well, the quotes around "normal" were a tongue-in-cheek cop-out
recognizing that defining "norm" here is tricky business ;-)

That being said, there are errata (speaking generally, not just
about ARM) that only trigger in the lab (or in simulation) and
there are errata that fire more readily (more hand-waving, sorry).

And #782772 looked like the latter to me (but I would defer to
your experience).

>> For example, erratum #782772
>> "Speculative execution of a Load-Exclusive or Store-Exclusive instruction
>> after a write to Strongly Ordered memory might deadlock the processor."
>> (The recommended work-around is a strategically-placed DMB.)
>>
>> Since ldrex is used in low-level code, it seems possible to hit that one?
>> Or perhaps Linux does not support "Strongly Ordered" memory regions?
>
> It supports SO memory and it's used in some cases.

Therefore, erratum 782772 could trigger on a "typical" system,
right?

Looking more closely at mmu.c

static struct mem_type mem_types[] = {
	[MT_DEVICE] = {		/* Strongly ordered / ARMv6 shared device */
		.prot_pte	= PROT_PTE_DEVICE | L_PTE_MT_DEV_SHARED |
				  L_PTE_SHARED,
		.prot_pte_s2	= s2_policy(PROT_PTE_S2_DEVICE) |
				  s2_policy(L_PTE_S2_MT_DEV_SHARED) |
				  L_PTE_SHARED,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PROT_SECT_DEVICE | PMD_SECT_S,
		.domain		= DOMAIN_IO,
	},

Perhaps I'll come back to the list of errata once I've taken
care of more trivial matters. (And once I have a better grasp
of Linux internals.)

Regards.
* ERRATA work-arounds in the kernel
From: Catalin Marinas @ 2015-03-23 12:49 UTC
To: linux-arm-kernel

On Fri, Mar 20, 2015 at 10:40:09PM +0100, Mason wrote:
> On 20/03/2015 18:20, Catalin Marinas wrote:
> > On Fri, Mar 20, 2015 at 05:30:10PM +0100, Mason wrote:
> >> I also looked at ARM's "Errata Summary Table" for the Cortex A9. There are
> >> roughly 90 errata documented there. (This document is 2 years old.)
> >>
> >> I assume that some (most?) of these do not apply to Linux, but it seems
> >> likely that some do?
> >>
> >> I'm wondering why there are not more work-arounds available in Kconfig?
> >
> > There are a few reasons:
> >
> > - erratum cannot be triggered in Linux
> > - erratum cannot be worked around in Linux (e.g. it requires some
> >   undocumented control bits to be set by firmware or even hw workaround
> >   like the system errata)
> > - cat A erratum with no feasible workaround (and partners usually take
> >   an ECO fix)
>
> What's an ECO fix?

A netlist fix for hardware:

http://en.wikipedia.org/wiki/Engineering_change_order

> > - erratum does not affect any CPU revision in production (not all rxpy
> >   revisions are in the field; I would include here early CPU revisions
> >   that were licensed as development chips but not widely used)
> > - we simply missed them. So if you think there is any that needs to be
> >   upstreamed, let us know or submit a patch
> >
> >> I'm wondering if it is possible to trigger some of these with a "normal"
> >> work-load on a "normal" kernel? Has anyone (perhaps ARM employees) looked
> >> at that? (I suppose they have.)
> >
> > Define "normal". It's really hard to quantify as the workloads can vary
> > widely between different use cases (e.g. mobile vs server).
>
> Well, the quotes around "normal" were a tongue-in-cheek cop-out
> recognizing that defining "norm" here is tricky business ;-)
>
> That being said, there are errata (speaking generally, not just
> about ARM) that only trigger in the lab (or in simulation) and
> there are errata that fire more readily (more hand-waving, sorry).
>
> And #782772 looked like the latter to me (but I would defer to
> your experience).

Unless we are sure that the conditions cannot be met in Linux, there is
no way to guarantee that the erratum won't hit. Many of them are really
unlikely and may have never been reproduced at top level (bare metal
software) but since it cannot be guaranteed, we implement the
workarounds in the kernel. It is up to the device vendor to decide
whether to enable it in production or not (based on some intensive
testing).

Also, if we can't reproduce it here, it doesn't mean that a different
device won't trigger it, especially when the erratum is highly dependent
on timings (the reverse is also true, we trigger it here but some hw
vendors can't).

> >> For example, erratum #782772
> >> "Speculative execution of a Load-Exclusive or Store-Exclusive instruction
> >> after a write to Strongly Ordered memory might deadlock the processor."
> >> (The recommended work-around is a strategically-placed DMB.)
> >>
> >> Since ldrex is used in low-level code, it seems possible to hit that one?
> >> Or perhaps Linux does not support "Strongly Ordered" memory regions?
> >
> > It supports SO memory and it's used in some cases.
>
> Therefore, erratum 782772 could trigger on a "typical" system,
> right?

If that "typical" system is using SO memory. There are some cases where
Linux ends up with SO memory (MT_UNCACHED or pgprot_noncached).

-- 
Catalin