* Cortex A9 MP: ARM errata 754323 implementation?
From: Dirk Behme @ 2015-09-03 7:40 UTC
To: linux-arm-kernel

Hi,

looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
a workaround for

(754323) Repeated Store in the same cache line might delay the visibility of
the Store

in the kernel? Or have I missed it?

We do have the workaround for the related erratum #754327 implemented [2],
but that is supposed only for cores prior to r2p0. While #754323 seems to
affect all newer cores, too?

Any idea?

Best regards

Dirk

[1] ARM Cortex-A9 processors r4 releases Software Developers Errata Notice
ARM UAN 0009D (ID032315)

[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/Kconfig#n1196

* Cortex A9 MP: ARM errata 754323 implementation?
From: Russell King - ARM Linux @ 2015-09-03 8:05 UTC
To: linux-arm-kernel

On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> a workaround for
>
> (754323) Repeated Store in the same cache line might delay the visibility of
> the Store
>
> in the kernel? Or have I missed it?

The policy for errata is not to implement them unless there's a requirement
to do so - and then the errata should be implemented in board firmware in
preference to the kernel where possible.

Are you seeing a problem directly attributable to this errata?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

* Cortex A9 MP: ARM errata 754323 implementation?
From: Dirk Behme @ 2015-09-03 8:26 UTC
To: linux-arm-kernel

On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
>> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
>> a workaround for
>>
>> (754323) Repeated Store in the same cache line might delay the visibility of
>> the Store
>>
>> in the kernel? Or have I missed it?
>
> The policy for errata is not to implement them unless there's a requirement
> to do so - and then the errata should be implemented in board firmware in
> preference to the kernel where possible.
>
> Are you seeing a problem directly attributable to this errata?

I got a report from some internal testing that an issue they see goes away
if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
can't be affected by this errata. Looking through the list of erratas I then
found the related 754323 which seems to apply to i.MX6, but is not
implemented.

The issue we are talking about is

Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
PC is at kfree+0x10c/0x238
LR is at release_firmware+0x5c/0x70

which is said to be triggered by this code

void kfree(const void *x)
...
    page = virt_to_head_page(x);
    if (unlikely(!PageSlab(page))) {
        BUG_ON(!PageCompound(page));
...

on a custom 3.14.x kernel. I haven't looked into this myself, but at least
two people think that the kmalloc/kfree is correct with the
request_firmware()/release_firmware() usage in the driver.

Best regards

Dirk

* Cortex A9 MP: ARM errata 754323 implementation?
From: Catalin Marinas @ 2015-09-03 17:29 UTC
To: linux-arm-kernel

On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
> On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> >On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> >>looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> >>a workaround for
> >>
> >>(754323) Repeated Store in the same cache line might delay the visibility of
> >>the Store
> >>
> >>in the kernel? Or have I missed it?
> >
> >The policy for errata is not to implement them unless there's a requirement
> >to do so - and then the errata should be implemented in board firmware in
> >preference to the kernel where possible.
> >
> >Are you seeing a problem directly attributable to this errata?
>
> I got a report from some internal testing that an issue they see goes away
> if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
> can't be affected by this errata. Looking through the list of erratas I then
> found the related 754323 which seems to apply to i.MX6, but is not
> implemented.

These errata are usually harmless, in most cases it prevents the system
from making progress (like flag update not visible while being polled by
another CPU), hence the workaround makes cpu_relax() a barrier since
most polling loops should use it.

> The issue we are talking about is
>
> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> PC is at kfree+0x10c/0x238
> LR is at release_firmware+0x5c/0x70
>
> which is said to be triggered by this code
>
> void kfree(const void *x)
> ...
>     page = virt_to_head_page(x);
>     if (unlikely(!PageSlab(page))) {
>         BUG_ON(!PageCompound(page));
> ...
>
> on a custom 3.14.x kernel. I haven't looked into this myself, but at least
> two people think that the kmalloc/kfree is correct with the
> request_firmware()/release_firmware() usage in the driver.

I don't see how the erratum above would trigger a BUG. It's possible
that there are some memory ordering issues (and A9 has some read after
read bugs) that are hidden when enabling the barrier in cpu_relax().

-- 
Catalin
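
[Editorial sketch, not part of the original message.] The remark that the workaround "makes cpu_relax() a barrier" can be illustrated with a small userspace model. This is only a sketch under assumptions: the names smp_mb_sketch(), barrier_sketch(), cpu_relax_sketch(), SKETCH_ERRATA_754327 and wait_for_flag() are hypothetical stand-ins, C11 fences replace the kernel primitives, and the real kernel definitions may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's smp_mb(): a full memory fence
 * (typically a DMB instruction on ARMv7). */
static inline void smp_mb_sketch(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Stand-in for the kernel's barrier(): constrains the compiler only. */
static inline void barrier_sketch(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

/* Hypothetical switch modelling the erratum workaround option:
 * with it defined, every relax step in a polling loop is a full barrier. */
#ifdef SKETCH_ERRATA_754327
#define cpu_relax_sketch() smp_mb_sketch()
#else
#define cpu_relax_sketch() barrier_sketch()
#endif

/* Poll a flag that another CPU is expected to set, giving up after
 * max_spins iterations. Returns true if the flag was observed set. */
static bool wait_for_flag(atomic_int *flag, unsigned long max_spins)
{
    assert(flag != NULL);
    while (max_spins--) {
        if (atomic_load_explicit(flag, memory_order_relaxed))
            return true;
        cpu_relax_sketch();
    }
    return false;
}
```

Building with -DSKETCH_ERRATA_754327 turns every iteration of the loop into a full barrier, which is the effect described above for polling loops that call cpu_relax().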

* Cortex A9 MP: ARM errata 754323 implementation?
From: Dirk Behme @ 2015-09-04 14:00 UTC
To: linux-arm-kernel

On 03.09.2015 19:29, Catalin Marinas wrote:
> On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
>> On 03.09.2015 10:05, Russell King - ARM Linux wrote:
>>> On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
>>>> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
>>>> a workaround for
>>>>
>>>> (754323) Repeated Store in the same cache line might delay the visibility of
>>>> the Store
>>>>
>>>> in the kernel? Or have I missed it?
>>>
>>> The policy for errata is not to implement them unless there's a requirement
>>> to do so - and then the errata should be implemented in board firmware in
>>> preference to the kernel where possible.
>>>
>>> Are you seeing a problem directly attributable to this errata?
>>
>> I got a report from some internal testing that an issue they see goes away
>> if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
>> can't be affected by this errata. Looking through the list of erratas I then
>> found the related 754323 which seems to apply to i.MX6, but is not
>> implemented.
>
> These errata are usually harmless, in most cases it prevents the system
> from making progress (like flag update not visible while being polled by
> another CPU), hence the workaround makes cpu_relax() a barrier since
> most polling loops should use it.
>
>> The issue we are talking about is
>>
>> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
>> PC is at kfree+0x10c/0x238
>> LR is at release_firmware+0x5c/0x70
>>
>> which is said to be triggered by this code
>>
>> void kfree(const void *x)
>> ...
>>     page = virt_to_head_page(x);
>>     if (unlikely(!PageSlab(page))) {
>>         BUG_ON(!PageCompound(page));
>> ...
>>
>> on a custom 3.14.x kernel. I haven't looked into this myself, but at least
>> two people think that the kmalloc/kfree is correct with the
>> request_firmware()/release_firmware() usage in the driver.
>
> I don't see how the erratum above would trigger a BUG. It's possible
> that there are some memory ordering issues (and A9 has some read after
> read bugs) that are hidden when enabling the barrier in cpu_relax().

Do you have anything specific in mind we could try? Besides enabling 754327?

Best regards

Dirk

* Cortex A9 MP: ARM errata 754323 implementation?
From: Catalin Marinas @ 2015-09-04 14:23 UTC
To: linux-arm-kernel

On Fri, Sep 04, 2015 at 04:00:50PM +0200, Dirk Behme wrote:
> On 03.09.2015 19:29, Catalin Marinas wrote:
> >On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
> >>On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> >>>On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> >>>>looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> >>>>a workaround for
> >>>>
> >>>>(754323) Repeated Store in the same cache line might delay the visibility of
> >>>>the Store
> >>>>
> >>>>in the kernel? Or have I missed it?
> >>>
> >>>The policy for errata is not to implement them unless there's a requirement
> >>>to do so - and then the errata should be implemented in board firmware in
> >>>preference to the kernel where possible.
> >>>
> >>>Are you seeing a problem directly attributable to this errata?
> >>
> >>I got a report from some internal testing that an issue they see goes away
> >>if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
> >>can't be affected by this errata. Looking through the list of erratas I then
> >>found the related 754323 which seems to apply to i.MX6, but is not
> >>implemented.
> >
> >These errata are usually harmless, in most cases it prevents the system
> >from making progress (like flag update not visible while being polled by
> >another CPU), hence the workaround makes cpu_relax() a barrier since
> >most polling loops should use it.
> >
> >>The issue we are talking about is
> >>
> >>Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> >>PC is at kfree+0x10c/0x238
> >>LR is at release_firmware+0x5c/0x70
> >>
> >>which is said to be triggered by this code
> >>
> >>void kfree(const void *x)
> >>...
> >>page = virt_to_head_page(x);
> >>if (unlikely(!PageSlab(page))) {
> >>BUG_ON(!PageCompound(page));
> >>...
> >>
> >>on a custom 3.14.x kernel. I haven't looked into this myself, but at least
> >>two people think that the kmalloc/kfree is correct with the
> >>request_firmware()/release_firmware() usage in the driver.
> >
> >I don't see how the erratum above would trigger a BUG. It's possible
> >that there are some memory ordering issues (and A9 has some read after
> >read bugs) that are hidden when enabling the barrier in cpu_relax().
>
> Do you have anything specific in mind we could try? Besides enabling 754327?

You may hit erratum 761319. There is a more detailed explanation here:

http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf

But there isn't much we can do in the kernel, other than recompiling it
with a gcc that can work around the erratum. Searching for this erratum
number and gcc seems to show some patches adding
-mfix-cortex-a9-volatile-hazards but I can't tell when/whether they've
been merged in gcc.

For this specific case, you could place a DMB (smp_mb) before BUG_ON to
see if this hunk is causing the problem.

-- 
Catalin
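
[Editorial sketch, not part of the original message.] The suggested experiment, a DMB between the PageSlab() check and the BUG_ON(), can be modelled in userspace roughly as follows. Everything here is an assumption made for illustration: struct page_sketch, the SKETCH_PG_* bits, smp_mb_sketch() and kfree_would_bug() are hypothetical names, and the sketch assumes that both checks read the same flags word, which the thread does not state. In a real kernel the change would be a single smp_mb() line added before the BUG_ON() in kfree().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real kernel page-flag layout differs. */
#define SKETCH_PG_SLAB     (1UL << 0)
#define SKETCH_PG_COMPOUND (1UL << 1)

/* Minimal stand-in for struct page: only the flags word is modelled. */
struct page_sketch {
    atomic_ulong flags;
};

/* Stand-in for smp_mb(): a full fence (typically DMB on ARMv7). */
static inline void smp_mb_sketch(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

static inline bool page_flag_set(struct page_sketch *page, unsigned long bit)
{
    return (atomic_load_explicit(&page->flags, memory_order_relaxed) & bit) != 0;
}

/* Models the quoted kfree() excerpt: returns true where BUG_ON() would fire.
 * The fence sits between the two reads of the flags word, which is the
 * placement suggested for the experiment. */
static bool kfree_would_bug(struct page_sketch *page)
{
    assert(page != NULL);
    if (!page_flag_set(page, SKETCH_PG_SLAB)) {
        smp_mb_sketch(); /* experimental barrier before the second check */
        return !page_flag_set(page, SKETCH_PG_COMPOUND);
    }
    return false;
}
```

If the oops disappears with the barrier in place, that would point towards a read-after-read ordering problem rather than a genuine kmalloc/kfree mismatch, although it would not prove it.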