linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 0/2] Misc ARM optimizations
@ 2016-01-28 11:23 Thomas Petazzoni
  2016-01-28 11:23 ` [PATCH v2 1/2] ARM: smp_scu: enable coherent speculative linefills Thomas Petazzoni
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Thomas Petazzoni @ 2016-01-28 11:23 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

Here is a set of two patches that I already sent in March 2015 (first
patch) and June 2015 (second patch).

The first patch hasn't gotten any feedback.

The second patch received some feedback: Russell said he would like
this to be merged at the beginning of a cycle, to give it enough time
for testing. And we are precisely at the beginning of a cycle :-) Rob
Herring had some comments, which I also addressed in a reply.

Would it be possible to either get these patches merged, or have a
decision taken that they are not acceptable, so that I can stop
worrying about them? :-)

Thanks,

Thomas

Thomas Petazzoni (2):
  ARM: smp_scu: enable coherent speculative linefills
  ARM: mm: enable L1 prefetch on Cortex-A9

 arch/arm/kernel/smp_scu.c | 8 ++++++--
 arch/arm/mm/proc-v7.S     | 5 ++++-
 2 files changed, 10 insertions(+), 3 deletions(-)

-- 
2.6.4


* [PATCH v2 1/2] ARM: smp_scu: enable coherent speculative linefills
  2016-01-28 11:23 [PATCH v2 0/2] Misc ARM optimizations Thomas Petazzoni
@ 2016-01-28 11:23 ` Thomas Petazzoni
  2016-01-28 11:23 ` [PATCH v2 2/2] ARM: mm: enable L1 prefetch on Cortex-A9 Thomas Petazzoni
  2016-02-11 14:40 ` [PATCH v2 0/2] Misc ARM optimizations Will Deacon
  2 siblings, 0 replies; 4+ messages in thread
From: Thomas Petazzoni @ 2016-01-28 11:23 UTC (permalink / raw)
  To: linux-arm-kernel

The ARM TRM describes bit 3 (Speculative linefills enable) of the SCU
Control Register as follows:

  When set, coherent linefill requests are sent speculatively to the
  PL310 in parallel with the tag lookup. If the tag lookup misses, the
  confirmed linefill is sent to the PL310 and gets Rdata earlier
  because the data request was already initiated by the speculative
  request. This feature works only if the PL310 is present in the
  design.

This feature may improve the overall system performance.

Since the public ARM web site only documents Cortex-A9 revisions r2p0
and later, we err on the safe side and only enable this bit on >= r2p0
platforms, like the standby bit.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 arch/arm/kernel/smp_scu.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/smp_scu.c b/arch/arm/kernel/smp_scu.c
index 72f9241..227bb86 100644
--- a/arch/arm/kernel/smp_scu.c
+++ b/arch/arm/kernel/smp_scu.c
@@ -18,6 +18,7 @@
 
 #define SCU_CTRL		0x00
 #define SCU_ENABLE		(1 << 0)
+#define SCU_SPEC_LINEFILL	(1 << 3)
 #define SCU_STANDBY_ENABLE	(1 << 5)
 #define SCU_CONFIG		0x04
 #define SCU_CPU_STATUS		0x08
@@ -57,10 +58,13 @@ void scu_enable(void __iomem *scu_base)
 
 	scu_ctrl |= SCU_ENABLE;
 
-	/* Cortex-A9 earlier than r2p0 has no standby bit in SCU */
+	/*
+	 * Cortex-A9 earlier than r2p0 has no standby / speculative
+	 * line fills bits in SCU
+	 */
 	if ((read_cpuid_id() & 0xff0ffff0) == 0x410fc090 &&
 	    (read_cpuid_id() & 0x00f0000f) >= 0x00200000)
-		scu_ctrl |= SCU_STANDBY_ENABLE;
+		scu_ctrl |= SCU_STANDBY_ENABLE | SCU_SPEC_LINEFILL;
 
 	writel_relaxed(scu_ctrl, scu_base + SCU_CTRL);
 
-- 
2.6.4
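
The revision check in the hunk above can be exercised outside the kernel
with a small sketch. It reuses the masks and bit definitions from the
patch; the helper name scu_ctrl_for_midr() and the MIDR_CORTEX_A9 macro
are hypothetical, and the main ID value is passed in as a parameter
instead of being read with read_cpuid_id(). It only mirrors the bit
selection, not the rest of scu_enable().

```c
#include <stdint.h>

#define SCU_ENABLE          (1u << 0)
#define SCU_SPEC_LINEFILL   (1u << 3)
#define SCU_STANDBY_ENABLE  (1u << 5)

/* Cortex-A9 main ID value with the variant and revision fields cleared. */
#define MIDR_CORTEX_A9      0x410fc090u

/*
 * Hypothetical helper: returns the SCU control value that would be
 * written for a given main ID register value, using the same masks as
 * the patch.
 */
uint32_t scu_ctrl_for_midr(uint32_t midr, uint32_t scu_ctrl)
{
	scu_ctrl |= SCU_ENABLE;

	/*
	 * First test: ignore variant [23:20] and revision [3:0] so that
	 * any Cortex-A9 matches. Second test: keep only variant and
	 * revision; r2p0 is variant 2, revision 0.
	 */
	if ((midr & 0xff0ffff0u) == MIDR_CORTEX_A9 &&
	    (midr & 0x00f0000fu) >= 0x00200000u)
		scu_ctrl |= SCU_STANDBY_ENABLE | SCU_SPEC_LINEFILL;

	return scu_ctrl;
}
```

With these masks, a Cortex-A9 r1p0 value such as 0x411fc090 only gets
SCU_ENABLE, while an r2p0 value such as 0x412fc090 also gets the standby
and speculative linefill bits.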


* [PATCH v2 2/2] ARM: mm: enable L1 prefetch on Cortex-A9
  2016-01-28 11:23 [PATCH v2 0/2] Misc ARM optimizations Thomas Petazzoni
  2016-01-28 11:23 ` [PATCH v2 1/2] ARM: smp_scu: enable coherent speculative linefills Thomas Petazzoni
@ 2016-01-28 11:23 ` Thomas Petazzoni
  2016-02-11 14:40 ` [PATCH v2 0/2] Misc ARM optimizations Will Deacon
  2 siblings, 0 replies; 4+ messages in thread
From: Thomas Petazzoni @ 2016-01-28 11:23 UTC (permalink / raw)
  To: linux-arm-kernel

The Cortex-A9 has an L1 prefetch capability documented at
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388e/Chdejhgd.html:

  The Cortex-A9 data cache implements an automatic prefetcher that
  monitors cache misses done by the processor. This unit can monitor
  and prefetch two independent data streams. It can be activated in
  software using a CP15 Auxiliary Control Register bit. See Auxiliary
  Control Register.

This commit enables the L1 prefetch feature unconditionally on all
Cortex-A9 cores by setting bit 2 in the CP15 Auxiliary Control
Register. Note that since this bit only exists on Cortex-A9 and not on
Cortex-A5 or Cortex-R7, we separate the handling of Cortex-A9 from
that of those two other cores.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 arch/arm/mm/proc-v7.S | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 0f92d57..415c0cb 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -263,8 +263,11 @@ ENDPROC(cpu_pj4b_do_resume)
  *	It is assumed that:
  *	- cache type register is implemented
  */
-__v7_ca5mp_setup:
 __v7_ca9mp_setup:
+	mov	r10, #(1 << 0)			@ Cache/TLB ops broadcasting
+	orr	r10, r10, #(1 << 2)		@ L1 prefetch
+	b	1f
+__v7_ca5mp_setup:
 __v7_cr7mp_setup:
 	mov	r10, #(1 << 0)			@ Cache/TLB ops broadcasting
 	b	1f
-- 
2.6.4
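
The split between the setup labels in the hunk above amounts to a
per-core choice of Auxiliary Control Register bits. A rough C rendering
of that choice, for illustration only: the function name
v7_mp_actlr_bits() and the macro names are hypothetical, and the part
numbers (MIDR bits [15:4]) are assumed values, not taken from the patch.

```c
#include <stdint.h>

#define ACTLR_FW           (1u << 0)  /* Cache/TLB ops broadcasting */
#define ACTLR_L1_PREFETCH  (1u << 2)  /* L1 prefetch, Cortex-A9 only */

/* Primary part numbers, assumed here for illustration. */
#define PART_CORTEX_A5     0xc05u
#define PART_CORTEX_A9     0xc09u
#define PART_CORTEX_R7     0xc17u

/*
 * Hypothetical helper: bits that the setup code would load into r10
 * for each core before branching to the common path.
 */
uint32_t v7_mp_actlr_bits(uint32_t part)
{
	switch (part) {
	case PART_CORTEX_A9:
		/* Cortex-A9 gets broadcasting plus L1 prefetch. */
		return ACTLR_FW | ACTLR_L1_PREFETCH;
	case PART_CORTEX_A5:
	case PART_CORTEX_R7:
		/* No L1 prefetch bit here, so broadcasting only. */
		return ACTLR_FW;
	default:
		/* Cores outside this sketch: nothing selected. */
		return 0;
	}
}
```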


* [PATCH v2 0/2] Misc ARM optimizations
  2016-01-28 11:23 [PATCH v2 0/2] Misc ARM optimizations Thomas Petazzoni
  2016-01-28 11:23 ` [PATCH v2 1/2] ARM: smp_scu: enable coherent speculative linefills Thomas Petazzoni
  2016-01-28 11:23 ` [PATCH v2 2/2] ARM: mm: enable L1 prefetch on Cortex-A9 Thomas Petazzoni
@ 2016-02-11 14:40 ` Will Deacon
  2 siblings, 0 replies; 4+ messages in thread
From: Will Deacon @ 2016-02-11 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 28, 2016 at 12:23:47PM +0100, Thomas Petazzoni wrote:
> Here is a set of two patches that I already sent in March 2015 (first
> patch) and June 2015 (second patch).
> 
> The first patch hasn't gotten any feedback.
> 
> The second patch received some feedback: Russell said he would like
> this to be merged at the beginning of a cycle, to give it enough time
> for testing. And we are precisely at the beginning of a cycle :-) Rob
> Herring had some comments, which I also addressed in a reply.
> 
> Would it be possible to either get these patches merged, or have a
> decision taken that they are not acceptable, so that I can stop
> worrying about them? :-)

There's a PL310 erratum [(729806) Speculative reads from the Cortex-A9
MPCore processor can cause deadlock] that requires both speculative
linefills and L1 prefetch to be disabled, otherwise there is scope for
deadlock.

So, these need to be predicated on a Kconfig option if nothing else,
otherwise we may run into nasty regressions on things like OMAP4 (and
anything else with PL310 r3p0 afaict).

Will
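
One way to read the suggestion above is as a build-time guard around the
speculative linefill bit. The sketch below is an assumption, not code
from the thread: ERRATUM_729806_POSSIBLE is a hypothetical macro standing
in for whatever Kconfig symbol would be chosen, and scu_optional_bits()
is a hypothetical helper. The standby bit is left unguarded because the
reply only concerns speculative linefills and L1 prefetch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCU_SPEC_LINEFILL   (1u << 3)
#define SCU_STANDBY_ENABLE  (1u << 5)

/*
 * Hypothetical stand-in for a Kconfig symbol: 1 means the build may run
 * on a platform affected by the PL310 erratum, so speculation stays off.
 */
#ifndef ERRATUM_729806_POSSIBLE
#define ERRATUM_729806_POSSIBLE 1
#endif

/*
 * Hypothetical helper: optional SCU control bits for a Cortex-A9,
 * given whether the core is r2p0 or later.
 */
uint32_t scu_optional_bits(bool a9_r2p0_or_later)
{
	uint32_t bits = 0;

	if (!a9_r2p0_or_later)
		return 0;

	/* The standby bit is not tied to the erratum in this sketch. */
	bits |= SCU_STANDBY_ENABLE;

	/* Only enable speculative linefills when the guard allows it. */
	if (!ERRATUM_729806_POSSIBLE)
		bits |= SCU_SPEC_LINEFILL;

	return bits;
}
```

The same guard would have to cover the L1 prefetch bit in the second
patch, since the erratum as described requires both to be disabled.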

