* [PATCH v3] MIPS: R12000: Enable branch prediction global history
From: Joshua Kinard @ 2015-06-02 22:21 UTC
To: Ralf Baechle; +Cc: Linux MIPS List
From: Joshua Kinard <kumba@gentoo.org>
The R12000 added a new feature to enhance branch prediction called
"global history". Per the Vr10000 Series User Manual (U10278EJ4V0UM),
Coprocessor 0, Diagnostic Register (22):
"""
If bit 26 is set, branch prediction uses all eight bits of the global
history register. If bit 26 is not set, then bits 25:23 specify a count
of the number of bits of global history to be used. Thus if bits 26:23
are all zero, global history is disabled.
The global history contains a record of the taken/not-taken status of
recently executed branches, and when used is XOR'ed with the PC of a
branch being predicted to produce a hashed value for indexing the BPT.
Some programs with small "working set of conditional branches" benefit
significantly from the use of such hashing, some see slight performance
degradation.
"""
This patch enables global history on R12000 CPUs and up by setting bit
26 in the branch prediction diagnostic register (CP0 $22) to '1'. Bits
25:23 are left alone so that all eight bits of the global history
register are available for branch prediction.
Signed-off-by: Joshua Kinard <kumba@gentoo.org>
---
arch/mips/include/asm/cpu-features.h | 3 +++
arch/mips/include/asm/cpu.h | 1 +
arch/mips/include/asm/mipsregs.h | 13 +++++++++++++
arch/mips/kernel/cpu-probe.c | 8 ++++++--
4 files changed, 23 insertions(+), 2 deletions(-)
This version builds on v2 and actually uses the cpu_has_bp_ghist #define
instead of checking the cpu_data.options field directly.
linux-mips-r12k-branch-ghistory.patch
diff --git a/arch/mips/include/asm/cpu-features.h b/arch/mips/include/asm/cpu-features.h
index 5aeaf19..f25de77 100644
--- a/arch/mips/include/asm/cpu-features.h
+++ b/arch/mips/include/asm/cpu-features.h
@@ -108,6 +108,9 @@
#ifndef cpu_has_llsc
#define cpu_has_llsc (cpu_data[0].options & MIPS_CPU_LLSC)
#endif
+#ifndef cpu_has_bp_ghist
+#define cpu_has_bp_ghist (cpu_data[0].options & MIPS_CPU_BP_GHIST)
+#endif
#ifndef kernel_uses_llsc
#define kernel_uses_llsc cpu_has_llsc
#endif
diff --git a/arch/mips/include/asm/cpu.h b/arch/mips/include/asm/cpu.h
index e3adca1..76154ba 100644
--- a/arch/mips/include/asm/cpu.h
+++ b/arch/mips/include/asm/cpu.h
@@ -379,6 +379,7 @@ enum cpu_type_enum {
#define MIPS_CPU_RW_LLB 0x1000000000ull /* LLADDR/LLB writes are allowed */
#define MIPS_CPU_XPA 0x2000000000ull /* CPU supports Extended Physical Addressing */
#define MIPS_CPU_CDMM 0x4000000000ull /* CPU has Common Device Memory Map */
+#define MIPS_CPU_BP_GHIST 0x8000000000ull /* R12K+ Branch Prediction Global History */
/*
* CPU ASE encodings
diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index 764e275..fc63ba7 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -685,6 +685,15 @@
#define TX39_CONF_DRSIZE_SHIFT 0
#define TX39_CONF_DRSIZE_MASK 0x00000003
+/*
+ * Interesting Bits in the R10K CP0 Branch Diagnostic Register
+ */
+/* Disable Branch Target Address Cache */
+#define R10K_DIAG_D_BTAC (_ULCAST_(1) << 27)
+/* Enable Branch Prediction Global History */
+#define R10K_DIAG_E_GHIST (_ULCAST_(1) << 26)
+/* Disable Branch Return Cache */
+#define R10K_DIAG_D_BRC (_ULCAST_(1) << 22)
/*
* Coprocessor 1 (FPU) register names
@@ -1247,6 +1256,10 @@ do { \
#define read_c0_diag() __read_32bit_c0_register($22, 0)
#define write_c0_diag(val) __write_32bit_c0_register($22, 0, val)
+/* R10K CP0 Branch Diagnostic register is 64bits wide */
+#define read_c0_r10k_diag() __read_64bit_c0_register($22, 0)
+#define write_c0_r10k_diag(val) __write_64bit_c0_register($22, 0, val)
+
#define read_c0_diag1() __read_32bit_c0_register($22, 1)
#define write_c0_diag1(val) __write_32bit_c0_register($22, 1, val)
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index e36515d..c98b6c5 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -946,7 +946,7 @@ static inline void cpu_probe_legacy(struct cpuinfo_mips *c, unsigned int cpu)
c->options = MIPS_CPU_TLB | MIPS_CPU_4K_CACHE | MIPS_CPU_4KEX |
MIPS_CPU_FPU | MIPS_CPU_32FPR |
MIPS_CPU_COUNTER | MIPS_CPU_WATCH |
- MIPS_CPU_LLSC;
+ MIPS_CPU_LLSC | MIPS_CPU_BP_GHIST;
c->tlbsize = 64;
break;
case PRID_IMP_R14000:
@@ -961,7 +961,7 @@ static inline void cpu_probe_legacy(struct cpuinfo_mips *c, unsigned int cpu)
c->options = MIPS_CPU_TLB | MIPS_CPU_4K_CACHE | MIPS_CPU_4KEX |
MIPS_CPU_FPU | MIPS_CPU_32FPR |
MIPS_CPU_COUNTER | MIPS_CPU_WATCH |
- MIPS_CPU_LLSC;
+ MIPS_CPU_LLSC | MIPS_CPU_BP_GHIST;
c->tlbsize = 64;
break;
case PRID_IMP_LOONGSON_64: /* Loongson-2/3 */
@@ -1479,6 +1479,10 @@ void cpu_probe(void)
else
cpu_set_nofpu_opts(c);
+ if (cpu_has_bp_ghist)
+ write_c0_r10k_diag(read_c0_r10k_diag() |
+ R10K_DIAG_E_GHIST);
+
if (cpu_has_mips_r2_r6) {
c->srsets = ((read_c0_srsctl() >> 26) & 0x0f) + 1;
/* R2 has Performance Counter Interrupt indicator */
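
As an illustration only, the read-modify-write in the cpu_probe() hunk above can be modeled in plain user-space C. The bit positions are copied from the mipsregs.h hunk (with a fixed-width constant in place of _ULCAST_). The helper names r10k_diag_enable_ghist() and r10k_diag_ghist_bits() are hypothetical and not part of the patch, and the register is passed in as a plain value because CP0 cannot be accessed outside the kernel.

```c
#include <stdint.h>

/* Bit positions copied from the mipsregs.h hunk of the patch. */
#define R10K_DIAG_D_BTAC	(UINT64_C(1) << 27)	/* Disable Branch Target Address Cache */
#define R10K_DIAG_E_GHIST	(UINT64_C(1) << 26)	/* Enable Branch Prediction Global History */
#define R10K_DIAG_D_BRC		(UINT64_C(1) << 22)	/* Disable Branch Return Cache */

/*
 * Hypothetical helper: models
 *   write_c0_r10k_diag(read_c0_r10k_diag() | R10K_DIAG_E_GHIST)
 * on a plain value. Every other bit, including 25:23, is left alone.
 */
static uint64_t r10k_diag_enable_ghist(uint64_t diag)
{
	return diag | R10K_DIAG_E_GHIST;
}

/*
 * Hypothetical helper: number of global history bits in use, following
 * the manual excerpt in the commit message. Bit 26 selects all eight
 * bits; otherwise bits 25:23 hold the count, and zero means disabled.
 */
static unsigned int r10k_diag_ghist_bits(uint64_t diag)
{
	if (diag & R10K_DIAG_E_GHIST)
		return 8;
	return (unsigned int)((diag >> 23) & 0x7);
}
```

Calling r10k_diag_ghist_bits() on the result of r10k_diag_enable_ghist() returns 8 regardless of what bits 25:23 held before, which is why the patch does not need to touch them.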
* Re: [PATCH v3] MIPS: R12000: Enable branch prediction global history
From: Ralf Baechle @ 2015-06-03 8:21 UTC
To: Joshua Kinard; +Cc: Linux MIPS List
On Tue, Jun 02, 2015 at 06:21:33PM -0400, Joshua Kinard wrote:
> From: Joshua Kinard <kumba@gentoo.org>
>
> The R12000 added a new feature to enhance branch prediction called
> "global history". Per the Vr10000 Series User Manual (U10278EJ4V0UM),
> Coprocessor 0, Diagnostic Register (22):
>
> """
> If bit 26 is set, branch prediction uses all eight bits of the global
> history register. If bit 26 is not set, then bits 25:23 specify a count
> of the number of bits of global history to be used. Thus if bits 26:23
> are all zero, global history is disabled.
>
> The global history contains a record of the taken/not-taken status of
> recently executed branches, and when used is XOR'ed with the PC of a
> branch being predicted to produce a hashed value for indexing the BPT.
> Some programs with small "working set of conditional branches" benefit
> significantly from the use of such hashing, some see slight performance
> degradation.
> """
>
> This patch enables global history on R12000 CPUs and up by setting bit
> 26 in the branch prediction diagnostic register (CP0 $22) to '1'. Bits
> 25:23 are left alone so that all eight bits of the global history
> register are available for branch prediction.
Will apply but could you also submit a patch to set cpu_has_bp_ghist to
0/1 as applicable in all cpu-feature-overrides.h?
Also the manual suggests this CPU feature may not always be beneficial
for performance so I'm wondering if we should add a way to modify it
at runtime.
I'm curious, have you checked the default setting of the global history
on kernel entry?
Ralf
* Re: [PATCH v3] MIPS: R12000: Enable branch prediction global history
From: Joshua Kinard @ 2015-06-04 3:27 UTC
To: Ralf Baechle; +Cc: Linux MIPS List
On 06/03/2015 04:21, Ralf Baechle wrote:
> On Tue, Jun 02, 2015 at 06:21:33PM -0400, Joshua Kinard wrote:
>
>> From: Joshua Kinard <kumba@gentoo.org>
>>
>> The R12000 added a new feature to enhance branch prediction called
>> "global history". Per the Vr10000 Series User Manual (U10278EJ4V0UM),
>> Coprocessor 0, Diagnostic Register (22):
>>
>> """
>> If bit 26 is set, branch prediction uses all eight bits of the global
>> history register. If bit 26 is not set, then bits 25:23 specify a count
>> of the number of bits of global history to be used. Thus if bits 26:23
>> are all zero, global history is disabled.
>>
>> The global history contains a record of the taken/not-taken status of
>> recently executed branches, and when used is XOR'ed with the PC of a
>> branch being predicted to produce a hashed value for indexing the BPT.
>> Some programs with small "working set of conditional branches" benefit
>> significantly from the use of such hashing, some see slight performance
>> degradation.
>> """
>>
>> This patch enables global history on R12000 CPUs and up by setting bit
>> 26 in the branch prediction diagnostic register (CP0 $22) to '1'. Bits
>> 25:23 are left alone so that all eight bits of the global history
>> register are available for branch prediction.
>
> Will apply but could you also submit a patch to set cpu_has_bp_ghist to
> 0/1 as applicable in all cpu-feature-overrides.h?
I can, though at that point, the R10000 Kconfig item needs to be split to
differentiate between R10000 and R12000/R14000/R16000. I sent a patch in to do
that a few weeks ago, but it was rejected. Can you outline your specific
issues with it and I'll re-submit it, then the 'cpu_has_bp_ghist' define can be
'0' for R10000's and '1' for R12K-R16K?
That'll also set things up for the potential discovery of bits specific to
R14K/R16K that may be useful, but aren't known/understood just yet.
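
A rough sketch of how such an override would interact with the #ifndef fallback from the patch, modeled outside the kernel. The first #define stands in for a platform's cpu-feature-overrides.h on a hypothetical R10000-only platform; struct cpuinfo_model and probe_diag() are made-up stand-ins for illustration, not kernel code.

```c
#include <stdint.h>

/* Stand-in for a platform's cpu-feature-overrides.h (hypothetical R10000-only platform). */
#define cpu_has_bp_ghist	0

/* Minimal stand-ins so the fallback below compiles outside the kernel. */
#define MIPS_CPU_BP_GHIST	0x8000000000ull
struct cpuinfo_model {
	unsigned long long options;
};
struct cpuinfo_model cpu_data[1];

/* Fallback as in the cpu-features.h hunk; skipped here because the override exists. */
#ifndef cpu_has_bp_ghist
#define cpu_has_bp_ghist	(cpu_data[0].options & MIPS_CPU_BP_GHIST)
#endif

#define R10K_DIAG_E_GHIST	(UINT64_C(1) << 26)

/* Hypothetical model of the cpu_probe() hunk, operating on a plain value. */
static uint64_t probe_diag(uint64_t diag)
{
	if (cpu_has_bp_ghist)	/* constant 0 here, so the branch can be dropped at compile time */
		diag |= R10K_DIAG_E_GHIST;
	return diag;
}
```

With the override set to 0, probe_diag() returns its input unchanged even if the runtime options field has MIPS_CPU_BP_GHIST set. With it set to 1 the bit is always set, and with no override the runtime check from the patch applies.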
> Also the manual suggests this CPU feature may not always be beneficial
> for performance so I'm wondering if we should add a way to modify it
> at runtime.
I thought about this, too. It'd also allow for R12K+ options to control the
Disable Branch Target Address Cache (BTAC, Bit 27) and the Disable Branch
Return Cache (Bit 22). For global history, I just set Bit 26 so all of the
ghistory bits are available, but even this could become a Kconfig item to
control Bits 25:23. Would probably require some benchmarking to see what the
effects of this are, but the entry in the manual suggests that the benefits
outweigh the penalties in the end.
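
A minimal sketch of what selecting the history length could look like, assuming the encoding from the manual excerpt: bit 26 selects all eight bits, otherwise bits 25:23 hold the count. R10K_DIAG_GHIST_SHIFT, R10K_DIAG_GHIST_MASK and r10k_diag_set_ghist_bits() are hypothetical names and not part of the patch. The function works on a plain value rather than on CP0.

```c
#include <stdint.h>

#define R10K_DIAG_E_GHIST	(UINT64_C(1) << 26)	/* from the patch */
#define R10K_DIAG_GHIST_SHIFT	23			/* hypothetical name: count field at bits 25:23 */
#define R10K_DIAG_GHIST_MASK	(UINT64_C(0xf) << R10K_DIAG_GHIST_SHIFT)	/* bits 26:23 */

/*
 * Hypothetical helper: return diag with global history configured to use
 * nbits bits (0 disables it). Values of 8 or more select all eight bits
 * through bit 26.
 */
static uint64_t r10k_diag_set_ghist_bits(uint64_t diag, unsigned int nbits)
{
	diag &= ~R10K_DIAG_GHIST_MASK;		/* clear bits 26:23 first */
	if (nbits >= 8)
		return diag | R10K_DIAG_E_GHIST;
	return diag | ((uint64_t)nbits << R10K_DIAG_GHIST_SHIFT);
}
```

Unlike the patch, this sketch clears bits 25:23 when enabling all eight bits. According to the manual excerpt those bits are ignored once bit 26 is set, so either choice should give the same behaviour.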
> I'm curious, have you checked the default setting of the global history
> on kernel entry?
Yup, it's disabled by default:
[ 0.000000] DEBUG: CPU0: c0_diag #1: 0x000400001030c000
[ 0.000000] DEBUG: CPU0: c0_diag #2: 0x0004000014148000
[ 7.798066] DEBUG: CPU1: c0_diag #1: 0x00000000103c8000
[ 7.798092] DEBUG: CPU1: c0_diag #2: 0x0000000014144000
             I                     B      G       -BRC-   -----------BP----------
             T                     S    B H H   D | | |   M  S
             L                     I    T I I   B | | |   o  t     I
             B                     d    A S S   R | | | M d  a     d           O
0            M    0                x    C T T   C V W H P e  t  ** x         0 p
xxxxxxxxxxxx xxxx xxxxxxxxxxxxxxxx xxxx x x xxx x x x x x xx xx xx xxxxxxxxx x xx
---------------------------------------------------------------------------------
000000000000 0100 0000000000000000 0001 0 0 000 0 1 1 0 0 00 11 00 000000000 0 00 CPU0 Before
000000000000 0100 0000000000000000 0001 0 1 000 0 0 1 0 1 00 10 00 000000000 0 00 CPU0 After
000000000000 0000 0000000000000000 0001 0 0 000 0 1 1 1 1 00 10 00 000000000 0 00 CPU1 Before
000000000000 0000 0000000000000000 0001 0 1 000 0 0 1 0 1 00 01 00 000000000 0 00 CPU1 After
---------------------------------------------------------------------------------
12           4    16               4    1 1 3   1 1 1 1 1 2  2  2  9         1 2
** R12000 and up: Upper-two bits of BP-Idx.
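
For anyone who wants to re-check the table, here is a small field extractor as a sketch. diag_field() is a hypothetical helper, and the bit ranges passed to it are read off the width row of the table above, not taken from a header.

```c
#include <stdint.h>

/*
 * Hypothetical helper: extract bits hi:lo (inclusive, hi >= lo, both
 * below 64) from a 64-bit diagnostic register value.
 */
static uint64_t diag_field(uint64_t value, unsigned int hi, unsigned int lo)
{
	unsigned int width = hi - lo + 1;
	uint64_t mask = (width >= 64) ? ~UINT64_C(0)
				      : ((UINT64_C(1) << width) - 1);

	return (value >> lo) & mask;
}
```

Applied to the logged values, diag_field(v, 26, 26) gives 0 for both "#1" values and 1 for both "#2" values, and diag_field(v, 25, 23) is 0 in all four, matching the GHIST and HIST columns.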
--J