LinuxPPC-Dev Archive on lore.kernel.org
* RE: [PATCH 06/12] openrisc: Use of_get_cpu_hwid()
From: David Laight @ 2021-10-07  7:53 UTC (permalink / raw)
  To: 'Segher Boessenkool', Stafford Horne
  Cc: Rich Felker, Rafael J. Wysocki, Catalin Marinas, x86@kernel.org,
	Guo Ren, H. Peter Anvin, linux-riscv@lists.infradead.org,
	Frank Rowand, Jonas Bonn, Rob Herring, Florian Fainelli,
	Will Deacon, linux-sh@vger.kernel.org, Russell King,
	linux-csky@vger.kernel.org, Ingo Molnar,
	bcm-kernel-feedback-list@broadcom.com, Palmer Dabbelt,
	devicetree@vger.kernel.org, Albert Ou, Ray Jui,
	Stefan Kristiansson, openrisc@lists.librecores.org,
	Borislav Petkov, Paul Walmsley, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org, Scott Branden,
	Yoshinori Sato, linux-kernel@vger.kernel.org, James Morse,
	Greg Kroah-Hartman, Paul Mackerras, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20211006212728.GM10333@gate.crashing.org>

From: Segher Boessenkool
> Sent: 06 October 2021 22:27
> 
> On Thu, Oct 07, 2021 at 05:44:00AM +0900, Stafford Horne wrote:
> > You have defined of_get_cpu_hwid to return u64, will this create compiler
> > warnings since we are storing a u64 into a u32?
> >
> > It seems only if we make with W=3.
> 
> Yes.  This is done by -Wconversion, "Warn for implicit conversions that
> may alter a value."
> 
> > I thought we usually warned on this.

The Microsoft compiler does - best to turn all those warnings off.

> This warning is not in -Wall or -Wextra either, it suffers too much from
> false positives.  It is very natural to just ignore the high bits of
> modulo types (which is what "unsigned" types *are*).  Or the bits that
> "fall off" on a conversion.  The C standard makes this required
> behaviour, it is useful, and it is the only convenient way of getting
> this!

I've also seen a compiler convert:
	struct->char_member = (char)(int_val & 0xff);
into:
	reg = int_val;
	reg &= 0xff; // for the & 0xff
	reg &= 0xff; // for the cast
	struct->char_member = low_8bits(reg);

You really don't want the extra noise.

I'll bet that (char)int_val is actually an arithmetic expression.
So its type will be 'int'.
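
One way to check that bet, as a minimal sketch in standard C (the helper
names below are made up for illustration). sizeof does not evaluate its
operand, so it reports the type of the cast expression itself, while unary
plus applies the integer promotions:

```c
#include <stddef.h>

/* Size of the cast expression itself: the cast yields a char. */
static size_t size_of_cast(int int_val)
{
	return sizeof((char)int_val);
}

/* Size after the integer promotions, which unary plus applies. */
static size_t size_of_promoted_cast(int int_val)
{
	return sizeof(+(char)int_val);
}
```

On a conforming compiler the first helper returns 1 and the second returns
sizeof(int): the cast has type char, and the value only becomes an int again
once it is used as an operand that gets promoted. Either way, one masking
step is enough for the store into a char member.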

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)



* [V2] powerpc/perf: Fix cycles/instructions as PM_CYC/PM_INST_CMPL in power10
From: Athira Rajeev @ 2021-10-07  7:51 UTC (permalink / raw)
  To: mpe; +Cc: kjain, maddy, linuxppc-dev, rnsastry

From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>

On power9 and earlier platforms, the default events used for cycles and
instructions are PM_CYC (0x0001e) and PM_INST_CMPL (0x00002) respectively.
These events use two programmable PMCs and by default count
irrespective of the run latch state. But because they use programmable
PMCs, these events cause multiplexing with the basic event set supported
by perf stat. Hence on power10, the performance monitoring unit (PMU) driver
uses performance monitor counter 5 (PMC5) and performance monitor counter 6
(PMC6) for counting instructions and cycles.

On power10, the event used for cycles is PM_RUN_CYC (0x600f4) and the one
for instructions is PM_RUN_INST_CMPL (0x500fa). But counting of these events
in idle state is controlled by the CC56RUN bit in Monitor Mode Control
Register 0 (MMCR0). If the CC56RUN bit is not set, PMC5/6 will not count when
CTRL[RUN] is zero. This can lead to missed counts if a thread is in idle
state during system-wide profiling.

This patch sets the CC56RUN bit in MMCR0 for power10, which makes PMC5 and
PMC6 count instructions and cycles regardless of the run bit. Since this
change makes PMC5/6 count as PM_INST_CMPL/PM_CYC, rename event code 0x600f4
to PM_CYC instead of PM_RUN_CYC and event code 0x500fa to PM_INST_CMPL
instead of PM_RUN_INST_CMPL. The changes apply only to the PMC5/6 event
codes and do not affect the behaviour of PM_RUN_CYC/PM_RUN_INST_CMPL if
programmed in other PMCs.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
Changelog:
 Updated the commit message to explain in detail why and
 how the change affects counting.

Notes on testing done for this change:
 Tested this patch with a kernel module that turns the
 runlatch off and on. The kernel module also reads the
 counter values for PMC5 and PMC6 during the period
 when the runlatch is off.
 - Started PMU counters via "perf stat" and loaded the
   test module.
 - Checked the counter values captured from module during
   the runlatch off period.
 - Verified that counters were frozen without the patch and
   with the patch, observed counters were incrementing.
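
The run latch behaviour described in the commit message can be modelled
roughly as follows. This is only a sketch under stated assumptions: the
DEMO_/demo_ names are hypothetical, and the bit position used for CC56RUN
is illustrative rather than taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position only; the real MMCR0_C56RUN is defined by the kernel. */
#define DEMO_MMCR0_C56RUN	(1ULL << 8)

/* PMC5/6 advance when the run latch is set, or always when CC56RUN is set. */
static bool pmc56_counts(uint64_t mmcr0, bool ctrl_run)
{
	return ctrl_run || (mmcr0 & DEMO_MMCR0_C56RUN);
}

/* Same shape as power10_compute_mmcr(): set the bit only if the base call succeeded. */
static int demo_compute_mmcr(int base_ret, uint64_t *mmcr0)
{
	if (!base_ret)
		*mmcr0 |= DEMO_MMCR0_C56RUN;
	return base_ret;
}
```

With mmcr0 == 0 and the run latch off, pmc56_counts() is false, which
matches the frozen counters seen without the patch. After
demo_compute_mmcr(0, &mmcr0) it is true regardless of the run latch.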

 arch/powerpc/perf/power10-events-list.h |  8 ++---
 arch/powerpc/perf/power10-pmu.c         | 44 +++++++++++++++++--------
 2 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/perf/power10-events-list.h b/arch/powerpc/perf/power10-events-list.h
index 93be7197d250..564f14097f07 100644
--- a/arch/powerpc/perf/power10-events-list.h
+++ b/arch/powerpc/perf/power10-events-list.h
@@ -9,10 +9,10 @@
 /*
  * Power10 event codes.
  */
-EVENT(PM_RUN_CYC,				0x600f4);
+EVENT(PM_CYC,				0x600f4);
 EVENT(PM_DISP_STALL_CYC,			0x100f8);
 EVENT(PM_EXEC_STALL,				0x30008);
-EVENT(PM_RUN_INST_CMPL,				0x500fa);
+EVENT(PM_INST_CMPL,				0x500fa);
 EVENT(PM_BR_CMPL,                               0x4d05e);
 EVENT(PM_BR_MPRED_CMPL,                         0x400f6);
 EVENT(PM_BR_FIN,				0x2f04a);
@@ -50,8 +50,8 @@ EVENT(PM_DTLB_MISS,				0x300fc);
 /* ITLB Reloaded */
 EVENT(PM_ITLB_MISS,				0x400fc);
 
-EVENT(PM_RUN_CYC_ALT,				0x0001e);
-EVENT(PM_RUN_INST_CMPL_ALT,			0x00002);
+EVENT(PM_CYC_ALT,				0x0001e);
+EVENT(PM_INST_CMPL_ALT,				0x00002);
 
 /*
  * Memory Access Events
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index f9d64c63bb4a..9dd75f385837 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -91,8 +91,8 @@ extern u64 PERF_REG_EXTENDED_MASK;
 
 /* Table of alternatives, sorted by column 0 */
 static const unsigned int power10_event_alternatives[][MAX_ALT] = {
-	{ PM_RUN_CYC_ALT,		PM_RUN_CYC },
-	{ PM_RUN_INST_CMPL_ALT,		PM_RUN_INST_CMPL },
+	{ PM_CYC_ALT,			PM_CYC },
+	{ PM_INST_CMPL_ALT,		PM_INST_CMPL },
 };
 
 static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
@@ -118,8 +118,8 @@ static int power10_check_attr_config(struct perf_event *ev)
 	return 0;
 }
 
-GENERIC_EVENT_ATTR(cpu-cycles,			PM_RUN_CYC);
-GENERIC_EVENT_ATTR(instructions,		PM_RUN_INST_CMPL);
+GENERIC_EVENT_ATTR(cpu-cycles,			PM_CYC);
+GENERIC_EVENT_ATTR(instructions,		PM_INST_CMPL);
 GENERIC_EVENT_ATTR(branch-instructions,		PM_BR_CMPL);
 GENERIC_EVENT_ATTR(branch-misses,		PM_BR_MPRED_CMPL);
 GENERIC_EVENT_ATTR(cache-references,		PM_LD_REF_L1);
@@ -148,8 +148,8 @@ CACHE_EVENT_ATTR(dTLB-load-misses,		PM_DTLB_MISS);
 CACHE_EVENT_ATTR(iTLB-load-misses,		PM_ITLB_MISS);
 
 static struct attribute *power10_events_attr_dd1[] = {
-	GENERIC_EVENT_PTR(PM_RUN_CYC),
-	GENERIC_EVENT_PTR(PM_RUN_INST_CMPL),
+	GENERIC_EVENT_PTR(PM_CYC),
+	GENERIC_EVENT_PTR(PM_INST_CMPL),
 	GENERIC_EVENT_PTR(PM_BR_CMPL),
 	GENERIC_EVENT_PTR(PM_BR_MPRED_CMPL),
 	GENERIC_EVENT_PTR(PM_LD_REF_L1),
@@ -173,8 +173,8 @@ static struct attribute *power10_events_attr_dd1[] = {
 };
 
 static struct attribute *power10_events_attr[] = {
-	GENERIC_EVENT_PTR(PM_RUN_CYC),
-	GENERIC_EVENT_PTR(PM_RUN_INST_CMPL),
+	GENERIC_EVENT_PTR(PM_CYC),
+	GENERIC_EVENT_PTR(PM_INST_CMPL),
 	GENERIC_EVENT_PTR(PM_BR_FIN),
 	GENERIC_EVENT_PTR(PM_MPRED_BR_FIN),
 	GENERIC_EVENT_PTR(PM_LD_REF_L1),
@@ -271,8 +271,8 @@ static const struct attribute_group *power10_pmu_attr_groups[] = {
 };
 
 static int power10_generic_events_dd1[] = {
-	[PERF_COUNT_HW_CPU_CYCLES] =			PM_RUN_CYC,
-	[PERF_COUNT_HW_INSTRUCTIONS] =			PM_RUN_INST_CMPL,
+	[PERF_COUNT_HW_CPU_CYCLES] =			PM_CYC,
+	[PERF_COUNT_HW_INSTRUCTIONS] =			PM_INST_CMPL,
 	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] =		PM_BR_CMPL,
 	[PERF_COUNT_HW_BRANCH_MISSES] =			PM_BR_MPRED_CMPL,
 	[PERF_COUNT_HW_CACHE_REFERENCES] =		PM_LD_REF_L1,
@@ -280,8 +280,8 @@ static int power10_generic_events_dd1[] = {
 };
 
 static int power10_generic_events[] = {
-	[PERF_COUNT_HW_CPU_CYCLES] =			PM_RUN_CYC,
-	[PERF_COUNT_HW_INSTRUCTIONS] =			PM_RUN_INST_CMPL,
+	[PERF_COUNT_HW_CPU_CYCLES] =			PM_CYC,
+	[PERF_COUNT_HW_INSTRUCTIONS] =			PM_INST_CMPL,
 	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] =		PM_BR_FIN,
 	[PERF_COUNT_HW_BRANCH_MISSES] =			PM_MPRED_BR_FIN,
 	[PERF_COUNT_HW_CACHE_REFERENCES] =		PM_LD_REF_L1,
@@ -548,6 +548,24 @@ static u64 power10_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 
 #undef C
 
+/*
+ * Set the MMCR0[CC56RUN] bit to enable counting for
+ * PMC5 and PMC6 regardless of the state of CTRL[RUN],
+ * so that we can use counters 5 and 6 as PM_INST_CMPL and
+ * PM_CYC.
+ */
+static int power10_compute_mmcr(u64 event[], int n_ev,
+				unsigned int hwc[], struct mmcr_regs *mmcr,
+				struct perf_event *pevents[], u32 flags)
+{
+	int ret;
+
+	ret = isa207_compute_mmcr(event, n_ev, hwc, mmcr, pevents, flags);
+	if (!ret)
+		mmcr->mmcr0 |= MMCR0_C56RUN;
+	return ret;
+}
+
 static struct power_pmu power10_pmu = {
 	.name			= "POWER10",
 	.n_counter		= MAX_PMU_COUNTERS,
@@ -555,7 +573,7 @@ static struct power_pmu power10_pmu = {
 	.test_adder		= ISA207_TEST_ADDER,
 	.group_constraint_mask	= CNST_CACHE_PMC4_MASK,
 	.group_constraint_val	= CNST_CACHE_PMC4_VAL,
-	.compute_mmcr		= isa207_compute_mmcr,
+	.compute_mmcr		= power10_compute_mmcr,
 	.config_bhrb		= power10_config_bhrb,
 	.bhrb_filter_map	= power10_bhrb_filter_map,
 	.get_constraint		= isa207_get_constraint,
-- 
2.30.1 (Apple Git-130)



* [V3 4/4] tools/perf: Add perf tools support to expose instruction and data address registers as part of extended regs
From: Athira Rajeev @ 2021-10-07  6:55 UTC (permalink / raw)
  To: mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev, rnsastry
In-Reply-To: <20211007065505.27809-1-atrajeev@linux.vnet.ibm.com>

This patch exposes the Sampled Instruction Address Register (SIAR)
and Sampled Data Address Register (SDAR) SPRs as part of the extended
registers in the perf tool. Add these SPRs to sample_reg_masks on the
tool side (for use with the -I? option).

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
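As a rough sketch of what the tool-side table does, the snippet below maps
a register name to its mask bit. All demo_/DEMO_ names and the register
indices are assumptions made for illustration, modelled loosely on the
SMPL_REG entries touched by this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register indices; the real ones come from perf_regs.h. */
enum demo_regs { DEMO_REG_PMC6 = 56, DEMO_REG_SDAR, DEMO_REG_SIAR };

struct demo_sample_reg {
	const char *name;
	uint64_t mask;
};

#define DEMO_SMPL_REG(n, b)	{ .name = #n, .mask = 1ULL << (b) }
#define DEMO_SMPL_REG_END	{ .name = NULL, .mask = 0 }

static const struct demo_sample_reg demo_sample_reg_masks[] = {
	DEMO_SMPL_REG(pmc6, DEMO_REG_PMC6),
	DEMO_SMPL_REG(sdar, DEMO_REG_SDAR),
	DEMO_SMPL_REG(siar, DEMO_REG_SIAR),
	DEMO_SMPL_REG_END
};

/* Return the mask bit for a register name, or 0 if the name is unknown. */
static uint64_t demo_reg_mask(const char *name)
{
	const struct demo_sample_reg *r;

	for (r = demo_sample_reg_masks; r->name; r++)
		if (!strcmp(r->name, name))
			return r->mask;
	return 0;
}
```

A name that is missing from the table yields no mask bit, which is why the
new registers need entries here before they can be requested by name.
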
 tools/arch/powerpc/include/uapi/asm/perf_regs.h | 11 +++++++----
 tools/perf/arch/powerpc/include/perf_regs.h     |  2 ++
 tools/perf/arch/powerpc/util/perf_regs.c        |  2 ++
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index 085094553f3b..749a2e3af89e 100644
--- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -61,17 +61,19 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_PMC4,
 	PERF_REG_POWERPC_PMC5,
 	PERF_REG_POWERPC_PMC6,
+	PERF_REG_POWERPC_SDAR,
+	PERF_REG_POWERPC_SIAR,
 	/* Max mask value for interrupt regs w/o extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 	/* Max mask value for interrupt regs including extended regs */
-	PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_PMC6 + 1,
+	PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_SIAR + 1,
 };
 
 #define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
 
 /*
  * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300
- * includes 9 SPRS from MMCR0 to PMC6 excluding the
+ * includes 11 SPRS from MMCR0 to SIAR excluding the
  * unsupported SPRS MMCR3, SIER2 and SIER3.
  */
 #define PERF_REG_PMU_MASK_300	\
@@ -79,11 +81,12 @@ enum perf_event_powerpc_regs {
 	(1ULL << PERF_REG_POWERPC_MMCR2) | (1ULL << PERF_REG_POWERPC_PMC1) | \
 	(1ULL << PERF_REG_POWERPC_PMC2) | (1ULL << PERF_REG_POWERPC_PMC3) | \
 	(1ULL << PERF_REG_POWERPC_PMC4) | (1ULL << PERF_REG_POWERPC_PMC5) | \
-	(1ULL << PERF_REG_POWERPC_PMC6))
+	(1ULL << PERF_REG_POWERPC_PMC6) | (1ULL << PERF_REG_POWERPC_SDAR) | \
+	(1ULL << PERF_REG_POWERPC_SIAR))
 
 /*
  * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31
- * includes 12 SPRs from MMCR0 to PMC6.
+ * includes 14 SPRs from MMCR0 to SIAR.
  */
 #define PERF_REG_PMU_MASK_31	\
 	(PERF_REG_PMU_MASK_300 | (1ULL << PERF_REG_POWERPC_MMCR3) | \
diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
index 04e5dc07e93f..93339d17acc4 100644
--- a/tools/perf/arch/powerpc/include/perf_regs.h
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -77,6 +77,8 @@ static const char *reg_names[] = {
 	[PERF_REG_POWERPC_PMC4] = "pmc4",
 	[PERF_REG_POWERPC_PMC5] = "pmc5",
 	[PERF_REG_POWERPC_PMC6] = "pmc6",
+	[PERF_REG_POWERPC_SDAR] = "sdar",
+	[PERF_REG_POWERPC_SIAR] = "siar",
 };
 
 static inline const char *__perf_reg_name(int id)
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index 8116a253f91f..8d07a78e742a 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -74,6 +74,8 @@ const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG(pmc4, PERF_REG_POWERPC_PMC4),
 	SMPL_REG(pmc5, PERF_REG_POWERPC_PMC5),
 	SMPL_REG(pmc6, PERF_REG_POWERPC_PMC6),
+	SMPL_REG(sdar, PERF_REG_POWERPC_SDAR),
+	SMPL_REG(siar, PERF_REG_POWERPC_SIAR),
 	SMPL_REG_END
 };
 
-- 
2.30.1 (Apple Git-130)



* [V3 3/4] powerpc/perf: Expose instruction and data address registers as part of extended regs
From: Athira Rajeev @ 2021-10-07  6:55 UTC (permalink / raw)
  To: mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev, rnsastry
In-Reply-To: <20211007065505.27809-1-atrajeev@linux.vnet.ibm.com>

This patch adds support for including the Sampled Instruction Address
Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of
the extended registers. Update the definitions of PERF_REG_PMU_MASK_300/31
and PERF_REG_EXTENDED_MAX to include these SPRs.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
---
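The SPR counts quoted in the updated comments (11 for CPU_FTR_ARCH_300,
14 for CPU_FTR_ARCH_31) can be cross-checked with a small sketch. The enum
order follows this patch, but the DEMO_/demo_ names and the starting index
of 45 are assumptions for illustration only.

```c
#include <stdint.h>

/* Enum order follows the patch; the starting index 45 is an assumption. */
enum demo_regs {
	DEMO_MMCR0 = 45, DEMO_MMCR1, DEMO_MMCR2, DEMO_MMCR3,
	DEMO_SIER2, DEMO_SIER3,
	DEMO_PMC1, DEMO_PMC2, DEMO_PMC3, DEMO_PMC4, DEMO_PMC5, DEMO_PMC6,
	DEMO_SDAR, DEMO_SIAR,
};

#define DEMO_BIT(r)	(1ULL << (r))

/* Registers available on ISA v3.0: no MMCR3, SIER2 or SIER3. */
static uint64_t demo_mask_300(void)
{
	return DEMO_BIT(DEMO_MMCR0) | DEMO_BIT(DEMO_MMCR1) | DEMO_BIT(DEMO_MMCR2) |
	       DEMO_BIT(DEMO_PMC1) | DEMO_BIT(DEMO_PMC2) | DEMO_BIT(DEMO_PMC3) |
	       DEMO_BIT(DEMO_PMC4) | DEMO_BIT(DEMO_PMC5) | DEMO_BIT(DEMO_PMC6) |
	       DEMO_BIT(DEMO_SDAR) | DEMO_BIT(DEMO_SIAR);
}

/* ISA v3.1 adds the three registers excluded above. */
static uint64_t demo_mask_31(void)
{
	return demo_mask_300() | DEMO_BIT(DEMO_MMCR3) |
	       DEMO_BIT(DEMO_SIER2) | DEMO_BIT(DEMO_SIER3);
}

/* Count set bits by clearing the lowest set bit until none remain. */
static unsigned int demo_count_bits(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;
		n++;
	}
	return n;
}
```

Counting the bits of the two masks gives 11 and 14, matching the comments:
9 and 12 registers before this patch, plus SDAR and SIAR.
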
 arch/powerpc/include/uapi/asm/perf_regs.h | 11 +++++++----
 arch/powerpc/perf/perf_regs.c             |  4 ++++
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
index 085094553f3b..749a2e3af89e 100644
--- a/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -61,17 +61,19 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_PMC4,
 	PERF_REG_POWERPC_PMC5,
 	PERF_REG_POWERPC_PMC6,
+	PERF_REG_POWERPC_SDAR,
+	PERF_REG_POWERPC_SIAR,
 	/* Max mask value for interrupt regs w/o extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 	/* Max mask value for interrupt regs including extended regs */
-	PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_PMC6 + 1,
+	PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_SIAR + 1,
 };
 
 #define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
 
 /*
  * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300
- * includes 9 SPRS from MMCR0 to PMC6 excluding the
+ * includes 11 SPRS from MMCR0 to SIAR excluding the
  * unsupported SPRS MMCR3, SIER2 and SIER3.
  */
 #define PERF_REG_PMU_MASK_300	\
@@ -79,11 +81,12 @@ enum perf_event_powerpc_regs {
 	(1ULL << PERF_REG_POWERPC_MMCR2) | (1ULL << PERF_REG_POWERPC_PMC1) | \
 	(1ULL << PERF_REG_POWERPC_PMC2) | (1ULL << PERF_REG_POWERPC_PMC3) | \
 	(1ULL << PERF_REG_POWERPC_PMC4) | (1ULL << PERF_REG_POWERPC_PMC5) | \
-	(1ULL << PERF_REG_POWERPC_PMC6))
+	(1ULL << PERF_REG_POWERPC_PMC6) | (1ULL << PERF_REG_POWERPC_SDAR) | \
+	(1ULL << PERF_REG_POWERPC_SIAR))
 
 /*
  * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31
- * includes 12 SPRs from MMCR0 to PMC6.
+ * includes 14 SPRs from MMCR0 to SIAR.
  */
 #define PERF_REG_PMU_MASK_31	\
 	(PERF_REG_PMU_MASK_300 | (1ULL << PERF_REG_POWERPC_MMCR3) | \
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index b931eed482c9..51d31b65e423 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -90,7 +90,11 @@ static u64 get_ext_regs_value(int idx)
 		return mfspr(SPRN_SIER2);
 	case PERF_REG_POWERPC_SIER3:
 		return mfspr(SPRN_SIER3);
+	case PERF_REG_POWERPC_SDAR:
+		return mfspr(SPRN_SDAR);
 #endif
+	case PERF_REG_POWERPC_SIAR:
+		return mfspr(SPRN_SIAR);
 	default: return 0;
 	}
 }
-- 
2.30.1 (Apple Git-130)



* [V3 2/4] tools/perf: Refactor the code definition of perf reg extended mask in tools side header file
From: Athira Rajeev @ 2021-10-07  6:55 UTC (permalink / raw)
  To: mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev, rnsastry
In-Reply-To: <20211007065505.27809-1-atrajeev@linux.vnet.ibm.com>

PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 define the mask
values for extended registers. The current definitions of these masks
use hex constants rather than register names, making them less
readable. This patch refactors the macros in the perf tools side header
file by or'ing together the actual register value constants.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 .../arch/powerpc/include/uapi/asm/perf_regs.h | 21 ++++++++++++-------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index 578b3ee86105..085094553f3b 100644
--- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -61,27 +61,32 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_PMC4,
 	PERF_REG_POWERPC_PMC5,
 	PERF_REG_POWERPC_PMC6,
-	/* Max regs without the extended regs */
+	/* Max mask value for interrupt regs w/o extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
+	/* Max mask value for interrupt regs including extended regs */
+	PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_PMC6 + 1,
 };
 
 #define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
 
-/* Exclude MMCR3, SIER2, SIER3 for CPU_FTR_ARCH_300 */
-#define	PERF_EXCLUDE_REG_EXT_300	(7ULL << PERF_REG_POWERPC_MMCR3)
-
 /*
  * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300
  * includes 9 SPRS from MMCR0 to PMC6 excluding the
- * unsupported SPRS in PERF_EXCLUDE_REG_EXT_300.
+ * unsupported SPRS MMCR3, SIER2 and SIER3.
  */
-#define PERF_REG_PMU_MASK_300   ((0xfffULL << PERF_REG_POWERPC_MMCR0) - PERF_EXCLUDE_REG_EXT_300)
+#define PERF_REG_PMU_MASK_300	\
+	((1ULL << PERF_REG_POWERPC_MMCR0) | (1ULL << PERF_REG_POWERPC_MMCR1) | \
+	(1ULL << PERF_REG_POWERPC_MMCR2) | (1ULL << PERF_REG_POWERPC_PMC1) | \
+	(1ULL << PERF_REG_POWERPC_PMC2) | (1ULL << PERF_REG_POWERPC_PMC3) | \
+	(1ULL << PERF_REG_POWERPC_PMC4) | (1ULL << PERF_REG_POWERPC_PMC5) | \
+	(1ULL << PERF_REG_POWERPC_PMC6))
 
 /*
  * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31
  * includes 12 SPRs from MMCR0 to PMC6.
  */
-#define PERF_REG_PMU_MASK_31   (0xfffULL << PERF_REG_POWERPC_MMCR0)
+#define PERF_REG_PMU_MASK_31	\
+	(PERF_REG_PMU_MASK_300 | (1ULL << PERF_REG_POWERPC_MMCR3) | \
+	(1ULL << PERF_REG_POWERPC_SIER2) | (1ULL << PERF_REG_POWERPC_SIER3))
 
-#define PERF_REG_EXTENDED_MAX  (PERF_REG_POWERPC_PMC6 + 1)
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
-- 
2.30.1 (Apple Git-130)



* [V3 1/4] powerpc/perf: Refactor the code definition of perf reg extended mask
From: Athira Rajeev @ 2021-10-07  6:55 UTC (permalink / raw)
  To: mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev, rnsastry
In-Reply-To: <20211007065505.27809-1-atrajeev@linux.vnet.ibm.com>

PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 define the mask
values for extended registers. The current definitions of these masks
use hex constants rather than register names, making them less
readable. This patch refactors the macros by or'ing together the actual
register value constants. Also include PERF_REG_EXTENDED_MAX as
part of the enum definition.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
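That the refactor does not change the mask values can be checked with a
small sketch. The register order is the enum order shown in the diff; the
OFF_/DEMO_ names and the idea of passing the MMCR0 index as a parameter
are assumptions for illustration.

```c
#include <stdint.h>

/* Offsets of the extended regs relative to MMCR0, in the enum order of the patch. */
enum demo_ext_off {
	OFF_MMCR0, OFF_MMCR1, OFF_MMCR2, OFF_MMCR3, OFF_SIER2, OFF_SIER3,
	OFF_PMC1, OFF_PMC2, OFF_PMC3, OFF_PMC4, OFF_PMC5, OFF_PMC6,
};

#define DEMO_BIT(base, off)	(1ULL << ((base) + (off)))

/* Old form: 12 contiguous bits minus the three unsupported ones. */
static uint64_t old_mask_300(unsigned int base)
{
	return (0xfffULL << base) - (7ULL << (base + OFF_MMCR3));
}

/* New form: every supported register named explicitly. */
static uint64_t new_mask_300(unsigned int base)
{
	return DEMO_BIT(base, OFF_MMCR0) | DEMO_BIT(base, OFF_MMCR1) |
	       DEMO_BIT(base, OFF_MMCR2) | DEMO_BIT(base, OFF_PMC1) |
	       DEMO_BIT(base, OFF_PMC2) | DEMO_BIT(base, OFF_PMC3) |
	       DEMO_BIT(base, OFF_PMC4) | DEMO_BIT(base, OFF_PMC5) |
	       DEMO_BIT(base, OFF_PMC6);
}

/* Old form for ISA v3.1: all 12 contiguous bits. */
static uint64_t old_mask_31(unsigned int base)
{
	return 0xfffULL << base;
}

/* New form for ISA v3.1: the v3.0 mask plus MMCR3, SIER2 and SIER3. */
static uint64_t new_mask_31(unsigned int base)
{
	return new_mask_300(base) | DEMO_BIT(base, OFF_MMCR3) |
	       DEMO_BIT(base, OFF_SIER2) | DEMO_BIT(base, OFF_SIER3);
}
```

The subtraction in the old form only works because the three excluded bits
are a subset of the 12 contiguous ones, so no borrow occurs. The or'ed form
does not depend on that property, and for any base of 52 or less the old
and new helpers return the same value.
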
 arch/powerpc/include/uapi/asm/perf_regs.h | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
index 578b3ee86105..085094553f3b 100644
--- a/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -61,27 +61,32 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_PMC4,
 	PERF_REG_POWERPC_PMC5,
 	PERF_REG_POWERPC_PMC6,
-	/* Max regs without the extended regs */
+	/* Max mask value for interrupt regs w/o extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
+	/* Max mask value for interrupt regs including extended regs */
+	PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_PMC6 + 1,
 };
 
 #define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
 
-/* Exclude MMCR3, SIER2, SIER3 for CPU_FTR_ARCH_300 */
-#define	PERF_EXCLUDE_REG_EXT_300	(7ULL << PERF_REG_POWERPC_MMCR3)
-
 /*
  * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300
  * includes 9 SPRS from MMCR0 to PMC6 excluding the
- * unsupported SPRS in PERF_EXCLUDE_REG_EXT_300.
+ * unsupported SPRS MMCR3, SIER2 and SIER3.
  */
-#define PERF_REG_PMU_MASK_300   ((0xfffULL << PERF_REG_POWERPC_MMCR0) - PERF_EXCLUDE_REG_EXT_300)
+#define PERF_REG_PMU_MASK_300	\
+	((1ULL << PERF_REG_POWERPC_MMCR0) | (1ULL << PERF_REG_POWERPC_MMCR1) | \
+	(1ULL << PERF_REG_POWERPC_MMCR2) | (1ULL << PERF_REG_POWERPC_PMC1) | \
+	(1ULL << PERF_REG_POWERPC_PMC2) | (1ULL << PERF_REG_POWERPC_PMC3) | \
+	(1ULL << PERF_REG_POWERPC_PMC4) | (1ULL << PERF_REG_POWERPC_PMC5) | \
+	(1ULL << PERF_REG_POWERPC_PMC6))
 
 /*
  * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31
  * includes 12 SPRs from MMCR0 to PMC6.
  */
-#define PERF_REG_PMU_MASK_31   (0xfffULL << PERF_REG_POWERPC_MMCR0)
+#define PERF_REG_PMU_MASK_31	\
+	(PERF_REG_PMU_MASK_300 | (1ULL << PERF_REG_POWERPC_MMCR3) | \
+	(1ULL << PERF_REG_POWERPC_SIER2) | (1ULL << PERF_REG_POWERPC_SIER3))
 
-#define PERF_REG_EXTENDED_MAX  (PERF_REG_POWERPC_PMC6 + 1)
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
-- 
2.30.1 (Apple Git-130)



* [V3 0/4] powerpc/perf: Add instruction and data address registers to extended regs
From: Athira Rajeev @ 2021-10-07  6:55 UTC (permalink / raw)
  To: mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev, rnsastry

This patch set adds two PMU registers, the Sampled Instruction Address
Register (SIAR) and the Sampled Data Address Register (SDAR), to the
extended regs on PowerPC. These registers provide the sampled
instruction/data address, and adding them to the extended regs helps
with debugging.

Patches 1/4 and 2/4 refactor the existing macro definitions of
PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 to make them more
readable.
Patch 3/4 adds SIAR and SDAR to the extended regs mask.
Patch 4/4 includes the perf tools side changes to add the SPRs to
sample_reg_masks for use with the -I? option.

Changelog:
Change from v2 -> v3:
Addressed review comments from Michael Ellerman
- Fixed the macro definitions to use "unsigned long long";
  without it, the perf build fails on 32-bit.
- Added Reviewed-by from Daniel Axtens for patch3.

Change from v1 -> v2:
Addressed review comments from Michael Ellerman
- Refactored the perf reg extended mask value macros for
  PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 to
  make it more readable. Also moved PERF_REG_EXTENDED_MAX
  along with enum definition similar to PERF_REG_POWERPC_MAX.
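
The v3 change to "unsigned long long" can be illustrated with a small
sketch. The demo_ names are hypothetical and register index 45 is only an
example value: a shift count is valid only when it is smaller than the
width of the shifted type, so a bit index in the 40s or 50s needs a 64-bit
type even where "unsigned long" has 32 bits.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Width in bits of an integer type. */
#define DEMO_WIDTH_BITS(type)	(sizeof(type) * CHAR_BIT)

/* A shift of 1 by 'bit' is only defined when 'bit' is below the type width. */
static bool demo_shift_fits(unsigned int bit, size_t width_bits)
{
	return bit < width_bits;
}

/* Mask bit for a register index, always computed in unsigned long long. */
static unsigned long long demo_reg_bit(unsigned int reg)
{
	return 1ULL << reg;
}
```

For example, demo_shift_fits(45, 32) is false, while the same index fits
in DEMO_WIDTH_BITS(unsigned long long) on every platform, since that type
is at least 64 bits wide.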

Athira Rajeev (4):
  powerpc/perf: Refactor the code definition of perf reg extended mask
  tools/perf: Refactor the code definition of perf reg extended mask in
    tools side header file
  powerpc/perf: Expose instruction and data address registers as part of
    extended regs
  tools/perf: Add perf tools support to expose instruction and data
    address registers as part of extended regs

 arch/powerpc/include/uapi/asm/perf_regs.h     | 28 ++++++++++++-------
 arch/powerpc/perf/perf_regs.c                 |  4 +++
 .../arch/powerpc/include/uapi/asm/perf_regs.h | 28 ++++++++++++-------
 tools/perf/arch/powerpc/include/perf_regs.h   |  2 ++
 tools/perf/arch/powerpc/util/perf_regs.c      |  2 ++
 5 files changed, 44 insertions(+), 20 deletions(-)

-- 
2.30.1 (Apple Git-130)



* Re: [PATCH v3 0/4] Add mem_hops field in perf_mem_data_src structure
From: Peter Zijlstra @ 2021-10-07  6:49 UTC (permalink / raw)
  To: Kajol Jain
  Cc: mark.rutland, atrajeev, ak, daniel, rnsastry, alexander.shishkin,
	linux-kernel, acme, ast, linux-perf-users, yao.jin, mingo, paulus,
	maddy, jolsa, namhyung, songliubraving, linuxppc-dev, kan.liang
In-Reply-To: <20211006140654.298352-1-kjain@linux.ibm.com>

On Wed, Oct 06, 2021 at 07:36:50PM +0530, Kajol Jain wrote:

> Kajol Jain (4):
>   perf: Add comment about current state of PERF_MEM_LVL_* namespace and
>     remove an extra line
>   perf: Add mem_hops field in perf_mem_data_src structure
>   tools/perf: Add mem_hops field in perf_mem_data_src structure
>   powerpc/perf: Fix data source encodings for L2.1 and L3.1 accesses
> 
>  arch/powerpc/perf/isa207-common.c     | 26 +++++++++++++++++++++-----
>  arch/powerpc/perf/isa207-common.h     |  2 ++
>  include/uapi/linux/perf_event.h       | 19 ++++++++++++++++---
>  tools/include/uapi/linux/perf_event.h | 19 ++++++++++++++++---
>  tools/perf/util/mem-events.c          | 20 ++++++++++++++++++--
>  5 files changed, 73 insertions(+), 13 deletions(-)

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

How do we want this routed? Shall I take it, or does Michael want it in
the Power tree?


* Re: [PATCH 03/12] ARM: broadcom: Use of_get_cpu_hwid()
From: Florian Fainelli @ 2021-10-07  2:24 UTC (permalink / raw)
  To: Rob Herring, Russell King, James Morse, Catalin Marinas,
	Will Deacon, Guo Ren, Jonas Bonn, Stefan Kristiansson,
	Stafford Horne, Michael Ellerman, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Yoshinori Sato, Rich Felker, x86, Greg Kroah-Hartman
  Cc: devicetree, linuxppc-dev, Scott Branden, Rafael J. Wysocki,
	linux-sh, Ray Jui, H. Peter Anvin, linux-kernel, linux-csky,
	openrisc, Ingo Molnar, Paul Mackerras, Borislav Petkov,
	bcm-kernel-feedback-list, Thomas Gleixner, Frank Rowand,
	linux-riscv, linux-arm-kernel
In-Reply-To: <20211006164332.1981454-4-robh@kernel.org>



On 10/6/2021 9:43 AM, Rob Herring wrote:
> Replace open coded parsing of CPU nodes 'reg' property with
> of_get_cpu_hwid().
> 
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: Ray Jui <rjui@broadcom.com>
> Cc: Scott Branden <sbranden@broadcom.com>
> Cc: bcm-kernel-feedback-list@broadcom.com
> Cc: Russell King <linux@armlinux.org.uk>
> Signed-off-by: Rob Herring <robh@kernel.org>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH 00/12] DT: CPU h/w id parsing clean-ups and cacheinfo id support
From: Florian Fainelli @ 2021-10-07  2:24 UTC (permalink / raw)
  To: Rob Herring, Russell King, James Morse, Catalin Marinas,
	Will Deacon, Guo Ren, Jonas Bonn, Stefan Kristiansson,
	Stafford Horne, Michael Ellerman, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Yoshinori Sato, Rich Felker, x86, Greg Kroah-Hartman
  Cc: devicetree, linuxppc-dev, Scott Branden, Rafael J. Wysocki,
	linux-sh, Ray Jui, H. Peter Anvin, linux-kernel, linux-csky,
	openrisc, Ingo Molnar, Paul Mackerras, Borislav Petkov,
	bcm-kernel-feedback-list, Thomas Gleixner, Frank Rowand,
	linux-riscv, linux-arm-kernel
In-Reply-To: <20211006164332.1981454-1-robh@kernel.org>



On 10/6/2021 9:43 AM, Rob Herring wrote:
> The first 10 patches add a new function, of_get_cpu_hwid(), which parses
> CPU DT node 'reg' property, and then use it to replace all the open
> coded versions of parsing CPU node 'reg' properties.
> 
> The last 2 patches add support for populating the cacheinfo 'id' on DT
> platforms. The minimum associated CPU hwid is used for the id. The id is
> optional, but necessary for resctrl which is being adapted for Arm MPAM.
> 
> Tested on arm64. Compile tested on arm, x86 and powerpc.

On ARM and ARM64:

Tested-by: Florian Fainelli <f.fainelli@gmail.com>

lscpu -C continues to work on ARM64 as before with cache properties 
provided in the FDT.
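
For reference, the id selection described in the quoted cover letter
("the minimum associated CPU hwid is used for the id") amounts to
something like the sketch below. The function name and the array-based
interface are assumptions for illustration, not the kernel's cacheinfo
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest hwid among the CPUs sharing one cache; UINT64_MAX if the list is empty. */
static uint64_t demo_cache_id(const uint64_t *hwids, size_t n)
{
	uint64_t min = UINT64_MAX;
	size_t i;

	for (i = 0; i < n; i++)
		if (hwids[i] < min)
			min = hwids[i];
	return min;
}
```

Picking the minimum gives every CPU that shares the cache the same id,
whatever order the CPUs are visited in.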
-- 
Florian


* Re: [PATCH] Documentation: Fix typo in testing/sysfs-class-cxl
From: Andrew Donnellan @ 2021-10-07  0:09 UTC (permalink / raw)
  To: Sohaib Mohamed; +Cc: Frederic Barrat, linuxppc-dev, linux-kernel
In-Reply-To: <20211006155017.135592-1-sohaib.amhmd@gmail.com>

On 7/10/21 2:50 am, Sohaib Mohamed wrote:
> Remove repeated words: "the the lowest" and "this this kernel"
> 
> Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>

Thanks for catching this.

Acked-by: Andrew Donnellan <ajd@linux.ibm.com>

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited


* Re: [PATCH 06/12] openrisc: Use of_get_cpu_hwid()
From: Stafford Horne @ 2021-10-06 22:37 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Rich Felker, Rafael J. Wysocki, Catalin Marinas, x86, Guo Ren,
	H. Peter Anvin, linux-riscv, Will Deacon, Jonas Bonn, Rob Herring,
	Florian Fainelli, Frank Rowand, linux-sh, Russell King,
	linux-csky, Ingo Molnar, bcm-kernel-feedback-list, James Morse,
	devicetree, Albert Ou, Ray Jui, Stefan Kristiansson, openrisc,
	Borislav Petkov, Paul Walmsley, Thomas Gleixner, linux-arm-kernel,
	Scott Branden, Yoshinori Sato, linux-kernel, Palmer Dabbelt,
	Greg Kroah-Hartman, Paul Mackerras, linuxppc-dev
In-Reply-To: <20211006212728.GM10333@gate.crashing.org>

Hi Segher,

On Wed, Oct 06, 2021 at 04:27:28PM -0500, Segher Boessenkool wrote:
> On Thu, Oct 07, 2021 at 05:44:00AM +0900, Stafford Horne wrote:
> > You have defined of_get_cpu_hwid to return u64, will this create compiler
> > warnings since we are storing a u64 into a u32?
> > 
> > It seems only if we make with W=3.
> 
> Yes.  This is done by -Wconversion, "Warn for implicit conversions that
> may alter a value."

Yeah, that is what I found out when I looked into it.

> > I thought we usually warned on this.
> 
> This warning is not in -Wall or -Wextra either, it suffers too much from
> false positives.  It is very natural to just ignore the high bits of
> modulo types (which is what "unsigned" types *are*).  Or the bits that
> "fall off" on a conversion.  The C standard makes this required
> behaviour, it is useful, and it is the only convenient way of getting
> this!

Thanks for the background, it does make sense. I guess I was confusing this
with Java, which requires a cast when you store to a smaller type. I.e.

    Test.java:5: error: incompatible types: possible lossy conversion from int to short
	s = i;

-Stafford


* Re: [PATCH 06/12] openrisc: Use of_get_cpu_hwid()
From: Segher Boessenkool @ 2021-10-06 21:27 UTC (permalink / raw)
  To: Stafford Horne
  Cc: Rich Felker, Rafael J. Wysocki, Catalin Marinas, x86, Guo Ren,
	H. Peter Anvin, linux-riscv, Will Deacon, Jonas Bonn, Rob Herring,
	Florian Fainelli, Frank Rowand, linux-sh, Russell King,
	linux-csky, Ingo Molnar, bcm-kernel-feedback-list, James Morse,
	devicetree, Albert Ou, Ray Jui, Stefan Kristiansson, openrisc,
	Borislav Petkov, Paul Walmsley, Thomas Gleixner, linux-arm-kernel,
	Scott Branden, Yoshinori Sato, linux-kernel, Palmer Dabbelt,
	Greg Kroah-Hartman, Paul Mackerras, linuxppc-dev
In-Reply-To: <YV4KkAC2p9D4yCnH@antec>

On Thu, Oct 07, 2021 at 05:44:00AM +0900, Stafford Horne wrote:
> You have defined of_get_cpu_hwid to return u64, will this create compiler
> warnings since we are storing a u64 into a u32?
> 
> It seems only if we make with W=3.

Yes.  This is done by -Wconversion, "Warn for implicit conversions that
may alter a value."

> I thought we usually warned on this.

This warning is not in -Wall or -Wextra either, it suffers too much from
false positives.  It is very natural to just ignore the high bits of
modulo types (which is what "unsigned" types *are*).  Or the bits that
"fall off" on a conversion.  The C standard makes this required
behaviour, it is useful, and it is the only convenient way of getting
this!
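
To illustrate the point about modulo types, here is a minimal userspace
sketch.  The u64/u32 typedefs are stand-ins for the kernel types, and the
function names are made up for this example only:

```c
#include <stdint.h>

/* Stand-ins for the kernel's u64/u32 (assumption: userspace build). */
typedef uint64_t u64;
typedef uint32_t u32;

/*
 * Implicit narrowing.  For an unsigned destination type the C standard
 * defines the result as the value modulo 2^32, i.e. the low 32 bits.
 * -Wconversion flags this assignment; -Wall and -Wextra do not.
 */
static u32 narrow_implicit(u64 v)
{
	u32 lo = v;	/* the high 32 bits "fall off" */

	return lo;
}

/* The same result spelled out with an explicit mask and cast. */
static u32 narrow_explicit(u64 v)
{
	return (u32)(v & 0xffffffffu);
}
```

Both functions return the same value for every input; the explicit form
only adds noise, which is the false-positive problem with the warning.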


Segher

^ permalink raw reply

* Re: [PATCH 06/12] openrisc: Use of_get_cpu_hwid()
From: Stafford Horne @ 2021-10-06 21:25 UTC (permalink / raw)
  To: Rob Herring
  Cc: Rich Felker, Rafael J. Wysocki, linux-kernel@vger.kernel.org,
	Guo Ren, H. Peter Anvin, linux-riscv, Will Deacon, Jonas Bonn,
	Florian Fainelli, Yoshinori Sato, SH-Linux, X86 ML, Russell King,
	linux-csky, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Catalin Marinas,
	Palmer Dabbelt, devicetree, Albert Ou, Ray Jui,
	Stefan Kristiansson, Openrisc, Borislav Petkov, Paul Walmsley,
	Thomas Gleixner, linux-arm-kernel, Scott Branden,
	Greg Kroah-Hartman, Frank Rowand, James Morse, Paul Mackerras,
	linuxppc-dev
In-Reply-To: <CAL_JsqLv+Ym=hxxz2vm0H3pbx1FRkBpHs3V=8DKjG43n+gS+RA@mail.gmail.com>

On Wed, Oct 06, 2021 at 04:08:38PM -0500, Rob Herring wrote:
> On Wed, Oct 6, 2021 at 3:44 PM Stafford Horne <shorne@gmail.com> wrote:
> >
> > On Wed, Oct 06, 2021 at 11:43:26AM -0500, Rob Herring wrote:
> > > Replace open coded parsing of CPU nodes' 'reg' property with
> > > of_get_cpu_hwid().
> > >
> > > Cc: Jonas Bonn <jonas@southpole.se>
> > > Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
> > > Cc: Stafford Horne <shorne@gmail.com>
> > > Cc: openrisc@lists.librecores.org
> > > Signed-off-by: Rob Herring <robh@kernel.org>
> > > ---
> > >  arch/openrisc/kernel/smp.c | 6 +-----
> > >  1 file changed, 1 insertion(+), 5 deletions(-)
> > >
> > > diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
> > > index 415e209732a3..7d5a4f303a5a 100644
> > > --- a/arch/openrisc/kernel/smp.c
> > > +++ b/arch/openrisc/kernel/smp.c
> > > @@ -65,11 +65,7 @@ void __init smp_init_cpus(void)
> > >       u32 cpu_id;
> > >
> > >       for_each_of_cpu_node(cpu) {
> > > -             if (of_property_read_u32(cpu, "reg", &cpu_id)) {
> > > -                     pr_warn("%s missing reg property", cpu->full_name);
> > > -                     continue;
> > > -             }
> > > -
> > > +             cpu_id = of_get_cpu_hwid(cpu);
> 
> Oops, that should be: of_get_cpu_hwid(cpu, 0);

OK. I checked all the other patches in the series; it seems OpenRISC was the
only one missing that.  Sorry I missed it initially.

> I thought I double checked all those...
> 
> > You have defined of_get_cpu_hwid to return u64; will this create compiler
> > warnings, since we are storing a u64 into a u32?
> 
> I'm counting on the caller to know the max size for their platform.

OK.

> >
> > It seems only if we make with W=3.
> >
> > I thought we usually warned on this.  Oh well, for the openrisc bits.
> 
> That's only on ptr truncation I think.

Right, that makes sense.

-Stafford

^ permalink raw reply

* [PATCH] Documentation: Fix typo in testing/sysfs-class-cxl
From: Sohaib Mohamed @ 2021-10-06 15:50 UTC (permalink / raw)
  To: sohaib.amhmd
  Cc: Frederic Barrat, linuxppc-dev, Andrew Donnellan, linux-kernel

Remove repeated words: "the the lowest" and "this this kernel"

Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
---
 Documentation/ABI/testing/sysfs-class-cxl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
index 3c77677e0ca7..594fda254130 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -103,8 +103,8 @@ What:           /sys/class/cxl/<afu>/api_version_compatible
 Date:           September 2014
 Contact:        linuxppc-dev@lists.ozlabs.org
 Description:    read only
-                Decimal value of the the lowest version of the userspace API
-                this this kernel supports.
+                Decimal value of the lowest version of the userspace API
+                this kernel supports.
 Users:		https://github.com/ibm-capi/libcxl


--
2.25.1


^ permalink raw reply related

* [PATCH] Documentation: Fix typo in testing/sysfs-class-cxl
From: Sohaib Mohamed @ 2021-10-06 14:39 UTC (permalink / raw)
  To: sohaib.amhmd
  Cc: Frederic Barrat, linuxppc-dev, Andrew Donnellan, linux-kernel

Remove repeated words: "the the lowest" and "this this kernel"

Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
---
 Documentation/ABI/testing/sysfs-class-cxl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
index 3c77677e0ca7..594fda254130 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -103,8 +103,8 @@ What:           /sys/class/cxl/<afu>/api_version_compatible
 Date:           September 2014
 Contact:        linuxppc-dev@lists.ozlabs.org
 Description:    read only
-                Decimal value of the the lowest version of the userspace API
-                this this kernel supports.
+                Decimal value of the lowest version of the userspace API
+                this kernel supports.
 Users:		https://github.com/ibm-capi/libcxl
 
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH] docs: typo fixes in Documentation/ABI/
From: Sohaib Mohamed @ 2021-10-06 13:20 UTC (permalink / raw)
  To: sohaib.amhmd
  Cc: Daejun Park, Gioh Kim, Can Guo, Bean Huo, Fabrice Gasnier,
	Jonathan Corbet, Mauro Carvalho Chehab, Jason Gunthorpe,
	Lukas Bulwahn, Zhang Rui, Jack Wang, Andrew Donnellan,
	Avri Altman, Jonathan Cameron, Adrian Hunter, Carlos Bilbao,
	Jens Axboe, Martin K. Petersen, Greg Kroah-Hartman, linux-kernel,
	Frederic Barrat, linuxppc-dev

All these changes remove repeated words from several places in the Documentation/ABI/ directory:

- In file stable/sysfs-module:41: "the the source"

- In file testing/sysfs-bus-rapidio:98: "that that owns"

- In file testing/sysfs-class-cxl:106: "the the lowest"

- In file testing/sysfs-class-cxl:107: "this this kernel"

- In file testing/sysfs-class-rnbd-client:131: "as as the"

- In file testing/sysfs-class-rtrs-client:81: "the the name"

- In file testing/sysfs-class-rtrs-server:27: "the the name"

- In file testing/sysfs-devices-platform-ACPI-TAD:77: "the the status"

- In file testing/sysfs-devices-power:306: "the the children"

- In file testing/sysfs-driver-ufs:986: "the The amount"

- In file testing/sysfs-firmware-acpi:115: "send send a Notify"

Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
---
 Documentation/ABI/stable/sysfs-module                     | 2 +-
 Documentation/ABI/testing/sysfs-bus-rapidio               | 2 +-
 Documentation/ABI/testing/sysfs-class-cxl                 | 4 ++--
 Documentation/ABI/testing/sysfs-class-rnbd-client         | 2 +-
 Documentation/ABI/testing/sysfs-class-rtrs-client         | 2 +-
 Documentation/ABI/testing/sysfs-class-rtrs-server         | 2 +-
 Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD | 2 +-
 Documentation/ABI/testing/sysfs-devices-power             | 2 +-
 Documentation/ABI/testing/sysfs-driver-ufs                | 2 +-
 Documentation/ABI/testing/sysfs-firmware-acpi             | 2 +-
 10 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-module b/Documentation/ABI/stable/sysfs-module
index 560b4a3278df..41b1f16e8795 100644
--- a/Documentation/ABI/stable/sysfs-module
+++ b/Documentation/ABI/stable/sysfs-module
@@ -38,7 +38,7 @@ What:		/sys/module/<MODULENAME>/srcversion
 Date:		Jun 2005
 Description:
 		If the module source has MODULE_VERSION, this file will contain
-		the checksum of the the source code.
+		the checksum of the source code.
 
 What:		/sys/module/<MODULENAME>/version
 Date:		Jun 2005
diff --git a/Documentation/ABI/testing/sysfs-bus-rapidio b/Documentation/ABI/testing/sysfs-bus-rapidio
index f8b6728dac10..9e8fbff99b75 100644
--- a/Documentation/ABI/testing/sysfs-bus-rapidio
+++ b/Documentation/ABI/testing/sysfs-bus-rapidio
@@ -95,7 +95,7 @@ Contact:	Matt Porter <mporter@kernel.crashing.org>,
 		Alexandre Bounine <alexandre.bounine@idt.com>
 Description:
 		(RO) returns name of previous device (switch) on the path to the
-		device that that owns this attribute
+		device that owns this attribute
 
 What:		/sys/bus/rapidio/devices/<nn>:<d>:<iiii>/modalias
 Date:		Jul, 2013
diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
index 3c77677e0ca7..594fda254130 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -103,8 +103,8 @@ What:           /sys/class/cxl/<afu>/api_version_compatible
 Date:           September 2014
 Contact:        linuxppc-dev@lists.ozlabs.org
 Description:    read only
-                Decimal value of the the lowest version of the userspace API
-                this this kernel supports.
+                Decimal value of the lowest version of the userspace API
+                this kernel supports.
 Users:		https://github.com/ibm-capi/libcxl
 
 
diff --git a/Documentation/ABI/testing/sysfs-class-rnbd-client b/Documentation/ABI/testing/sysfs-class-rnbd-client
index 0b5997ab3365..e6cdc851952c 100644
--- a/Documentation/ABI/testing/sysfs-class-rnbd-client
+++ b/Documentation/ABI/testing/sysfs-class-rnbd-client
@@ -128,6 +128,6 @@ Description:	For each device mapped on the client a new symbolic link is created
 		The <device_id> of each device is created as follows:
 
 		- If the 'device_path' provided during mapping contains slashes ("/"),
-		  they are replaced by exclamation mark ("!") and used as as the
+		  they are replaced by exclamation mark ("!") and used as the
 		  <device_id>. Otherwise, the <device_id> will be the same as the
 		  "device_path" provided.
diff --git a/Documentation/ABI/testing/sysfs-class-rtrs-client b/Documentation/ABI/testing/sysfs-class-rtrs-client
index 49a4157c7bf1..fecc59d1b96f 100644
--- a/Documentation/ABI/testing/sysfs-class-rtrs-client
+++ b/Documentation/ABI/testing/sysfs-class-rtrs-client
@@ -78,7 +78,7 @@ What:		/sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_name
 Date:		Feb 2020
 KernelVersion:	5.7
 Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
-Description:	RO, Contains the the name of HCA the connection established on.
+Description:	RO, Contains the name of HCA the connection established on.
 
 What:		/sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_port
 Date:		Feb 2020
diff --git a/Documentation/ABI/testing/sysfs-class-rtrs-server b/Documentation/ABI/testing/sysfs-class-rtrs-server
index 3b6d5b067df0..b08601d80409 100644
--- a/Documentation/ABI/testing/sysfs-class-rtrs-server
+++ b/Documentation/ABI/testing/sysfs-class-rtrs-server
@@ -24,7 +24,7 @@ What:		/sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_name
 Date:		Feb 2020
 KernelVersion:	5.7
 Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
-Description:	RO, Contains the the name of HCA the connection established on.
+Description:	RO, Contains the name of HCA the connection established on.
 
 What:		/sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_port
 Date:		Feb 2020
diff --git a/Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD b/Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD
index f7b360a61b21..bc44bc903bc8 100644
--- a/Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD
+++ b/Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD
@@ -74,7 +74,7 @@ Description:
 
 		Reads also cause the AC alarm timer status to be reset.
 
-		Another way to reset the the status of the AC alarm timer is to
+		Another way to reset the status of the AC alarm timer is to
 		write (the number) 0 to this file.
 
 		If the status return value indicates that the timer has expired,
diff --git a/Documentation/ABI/testing/sysfs-devices-power b/Documentation/ABI/testing/sysfs-devices-power
index 1b2a2d41ff80..54195530e97a 100644
--- a/Documentation/ABI/testing/sysfs-devices-power
+++ b/Documentation/ABI/testing/sysfs-devices-power
@@ -303,5 +303,5 @@ Date:		Apr 2010
 Contact:	Dominik Brodowski <linux@dominikbrodowski.net>
 Description:
 		Reports the runtime PM children usage count of a device, or
-		0 if the the children will be ignored.
+		0 if the children will be ignored.
 
diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index 863cc4897277..57aec11a573f 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -983,7 +983,7 @@ Description:	This file shows the amount of data that the host plans to
 What:		/sys/class/scsi_device/*/device/dyn_cap_needed
 Date:		February 2018
 Contact:	Stanislav Nijnikov <stanislav.nijnikov@wdc.com>
-Description:	This file shows the The amount of physical memory needed
+Description:	This file shows the amount of physical memory needed
 		to be removed from the physical memory resources pool of
 		the particular logical unit. The full information about
 		the attribute could be found at UFS specifications 2.1.
diff --git a/Documentation/ABI/testing/sysfs-firmware-acpi b/Documentation/ABI/testing/sysfs-firmware-acpi
index 819939d858c9..39173375c53a 100644
--- a/Documentation/ABI/testing/sysfs-firmware-acpi
+++ b/Documentation/ABI/testing/sysfs-firmware-acpi
@@ -112,7 +112,7 @@ Description:
 		OS context.  GPE 0x12, for example, would vector
 		to a level or edge handler called _L12 or _E12.
 		The handler may do its business and return.
-		Or the handler may send send a Notify event
+		Or the handler may send a Notify event
 		to a Linux device driver registered on an ACPI device,
 		such as a battery, or a processor.
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH] docs: typo fixes in Documentation/ABI/
From: Sohaib Mohamed @ 2021-10-06 12:13 UTC (permalink / raw)
  To: sohaib.amhmd
  Cc: Daejun Park, Adrian Hunter, Can Guo, Bean Huo, Jonathan Corbet,
	Mauro Carvalho Chehab, Jason Gunthorpe, Lukas Bulwahn,
	Ilya Dryomov, Jack Wang, Andrew Donnellan, Avri Altman,
	Jonathan Cameron, Fabrice Gasnier, Zhang Rui, Jens Axboe,
	Martin K. Petersen, Greg Kroah-Hartman, Gioh Kim, linux-kernel,
	Frederic Barrat, linuxppc-dev

Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
---
 Documentation/ABI/stable/sysfs-module                     | 2 +-
 Documentation/ABI/testing/sysfs-bus-rapidio               | 2 +-
 Documentation/ABI/testing/sysfs-class-cxl                 | 4 ++--
 Documentation/ABI/testing/sysfs-class-rnbd-client         | 2 +-
 Documentation/ABI/testing/sysfs-class-rtrs-client         | 2 +-
 Documentation/ABI/testing/sysfs-class-rtrs-server         | 2 +-
 Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD | 2 +-
 Documentation/ABI/testing/sysfs-devices-power             | 2 +-
 Documentation/ABI/testing/sysfs-driver-ufs                | 2 +-
 Documentation/ABI/testing/sysfs-firmware-acpi             | 2 +-
 10 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-module b/Documentation/ABI/stable/sysfs-module
index 560b4a3278df..41b1f16e8795 100644
--- a/Documentation/ABI/stable/sysfs-module
+++ b/Documentation/ABI/stable/sysfs-module
@@ -38,7 +38,7 @@ What:		/sys/module/<MODULENAME>/srcversion
 Date:		Jun 2005
 Description:
 		If the module source has MODULE_VERSION, this file will contain
-		the checksum of the the source code.
+		the checksum of the source code.
 
 What:		/sys/module/<MODULENAME>/version
 Date:		Jun 2005
diff --git a/Documentation/ABI/testing/sysfs-bus-rapidio b/Documentation/ABI/testing/sysfs-bus-rapidio
index f8b6728dac10..9e8fbff99b75 100644
--- a/Documentation/ABI/testing/sysfs-bus-rapidio
+++ b/Documentation/ABI/testing/sysfs-bus-rapidio
@@ -95,7 +95,7 @@ Contact:	Matt Porter <mporter@kernel.crashing.org>,
 		Alexandre Bounine <alexandre.bounine@idt.com>
 Description:
 		(RO) returns name of previous device (switch) on the path to the
-		device that that owns this attribute
+		device that owns this attribute
 
 What:		/sys/bus/rapidio/devices/<nn>:<d>:<iiii>/modalias
 Date:		Jul, 2013
diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
index 3c77677e0ca7..594fda254130 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -103,8 +103,8 @@ What:           /sys/class/cxl/<afu>/api_version_compatible
 Date:           September 2014
 Contact:        linuxppc-dev@lists.ozlabs.org
 Description:    read only
-                Decimal value of the the lowest version of the userspace API
-                this this kernel supports.
+                Decimal value of the lowest version of the userspace API
+                this kernel supports.
 Users:		https://github.com/ibm-capi/libcxl
 
 
diff --git a/Documentation/ABI/testing/sysfs-class-rnbd-client b/Documentation/ABI/testing/sysfs-class-rnbd-client
index 0b5997ab3365..e6cdc851952c 100644
--- a/Documentation/ABI/testing/sysfs-class-rnbd-client
+++ b/Documentation/ABI/testing/sysfs-class-rnbd-client
@@ -128,6 +128,6 @@ Description:	For each device mapped on the client a new symbolic link is created
 		The <device_id> of each device is created as follows:
 
 		- If the 'device_path' provided during mapping contains slashes ("/"),
-		  they are replaced by exclamation mark ("!") and used as as the
+		  they are replaced by exclamation mark ("!") and used as the
 		  <device_id>. Otherwise, the <device_id> will be the same as the
 		  "device_path" provided.
diff --git a/Documentation/ABI/testing/sysfs-class-rtrs-client b/Documentation/ABI/testing/sysfs-class-rtrs-client
index 49a4157c7bf1..fecc59d1b96f 100644
--- a/Documentation/ABI/testing/sysfs-class-rtrs-client
+++ b/Documentation/ABI/testing/sysfs-class-rtrs-client
@@ -78,7 +78,7 @@ What:		/sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_name
 Date:		Feb 2020
 KernelVersion:	5.7
 Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
-Description:	RO, Contains the the name of HCA the connection established on.
+Description:	RO, Contains the name of HCA the connection established on.
 
 What:		/sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_port
 Date:		Feb 2020
diff --git a/Documentation/ABI/testing/sysfs-class-rtrs-server b/Documentation/ABI/testing/sysfs-class-rtrs-server
index 3b6d5b067df0..b08601d80409 100644
--- a/Documentation/ABI/testing/sysfs-class-rtrs-server
+++ b/Documentation/ABI/testing/sysfs-class-rtrs-server
@@ -24,7 +24,7 @@ What:		/sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_name
 Date:		Feb 2020
 KernelVersion:	5.7
 Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
-Description:	RO, Contains the the name of HCA the connection established on.
+Description:	RO, Contains the name of HCA the connection established on.
 
 What:		/sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_port
 Date:		Feb 2020
diff --git a/Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD b/Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD
index f7b360a61b21..bc44bc903bc8 100644
--- a/Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD
+++ b/Documentation/ABI/testing/sysfs-devices-platform-ACPI-TAD
@@ -74,7 +74,7 @@ Description:
 
 		Reads also cause the AC alarm timer status to be reset.
 
-		Another way to reset the the status of the AC alarm timer is to
+		Another way to reset the status of the AC alarm timer is to
 		write (the number) 0 to this file.
 
 		If the status return value indicates that the timer has expired,
diff --git a/Documentation/ABI/testing/sysfs-devices-power b/Documentation/ABI/testing/sysfs-devices-power
index 1b2a2d41ff80..54195530e97a 100644
--- a/Documentation/ABI/testing/sysfs-devices-power
+++ b/Documentation/ABI/testing/sysfs-devices-power
@@ -303,5 +303,5 @@ Date:		Apr 2010
 Contact:	Dominik Brodowski <linux@dominikbrodowski.net>
 Description:
 		Reports the runtime PM children usage count of a device, or
-		0 if the the children will be ignored.
+		0 if the children will be ignored.
 
diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index 863cc4897277..57aec11a573f 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -983,7 +983,7 @@ Description:	This file shows the amount of data that the host plans to
 What:		/sys/class/scsi_device/*/device/dyn_cap_needed
 Date:		February 2018
 Contact:	Stanislav Nijnikov <stanislav.nijnikov@wdc.com>
-Description:	This file shows the The amount of physical memory needed
+Description:	This file shows the amount of physical memory needed
 		to be removed from the physical memory resources pool of
 		the particular logical unit. The full information about
 		the attribute could be found at UFS specifications 2.1.
diff --git a/Documentation/ABI/testing/sysfs-firmware-acpi b/Documentation/ABI/testing/sysfs-firmware-acpi
index 819939d858c9..39173375c53a 100644
--- a/Documentation/ABI/testing/sysfs-firmware-acpi
+++ b/Documentation/ABI/testing/sysfs-firmware-acpi
@@ -112,7 +112,7 @@ Description:
 		OS context.  GPE 0x12, for example, would vector
 		to a level or edge handler called _L12 or _E12.
 		The handler may do its business and return.
-		Or the handler may send send a Notify event
+		Or the handler may send a Notify event
 		to a Linux device driver registered on an ACPI device,
 		such as a battery, or a processor.
 
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH 06/12] openrisc: Use of_get_cpu_hwid()
From: Rob Herring @ 2021-10-06 21:08 UTC (permalink / raw)
  To: Stafford Horne
  Cc: Rich Felker, Rafael J. Wysocki, linux-kernel@vger.kernel.org,
	Guo Ren, H. Peter Anvin, linux-riscv, Will Deacon, Jonas Bonn,
	Florian Fainelli, Yoshinori Sato, SH-Linux, X86 ML, Russell King,
	linux-csky, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Catalin Marinas,
	Palmer Dabbelt, devicetree, Albert Ou, Ray Jui,
	Stefan Kristiansson, Openrisc, Borislav Petkov, Paul Walmsley,
	Thomas Gleixner, linux-arm-kernel, Scott Branden,
	Greg Kroah-Hartman, Frank Rowand, James Morse, Paul Mackerras,
	linuxppc-dev
In-Reply-To: <YV4KkAC2p9D4yCnH@antec>

On Wed, Oct 6, 2021 at 3:44 PM Stafford Horne <shorne@gmail.com> wrote:
>
> On Wed, Oct 06, 2021 at 11:43:26AM -0500, Rob Herring wrote:
> > Replace open coded parsing of CPU nodes' 'reg' property with
> > of_get_cpu_hwid().
> >
> > Cc: Jonas Bonn <jonas@southpole.se>
> > Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
> > Cc: Stafford Horne <shorne@gmail.com>
> > Cc: openrisc@lists.librecores.org
> > Signed-off-by: Rob Herring <robh@kernel.org>
> > ---
> >  arch/openrisc/kernel/smp.c | 6 +-----
> >  1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
> > index 415e209732a3..7d5a4f303a5a 100644
> > --- a/arch/openrisc/kernel/smp.c
> > +++ b/arch/openrisc/kernel/smp.c
> > @@ -65,11 +65,7 @@ void __init smp_init_cpus(void)
> >       u32 cpu_id;
> >
> >       for_each_of_cpu_node(cpu) {
> > -             if (of_property_read_u32(cpu, "reg", &cpu_id)) {
> > -                     pr_warn("%s missing reg property", cpu->full_name);
> > -                     continue;
> > -             }
> > -
> > +             cpu_id = of_get_cpu_hwid(cpu);

Oops, that should be: of_get_cpu_hwid(cpu, 0);

I thought I double checked all those...

> You have defined of_get_cpu_hwid to return u64; will this create compiler
> warnings, since we are storing a u64 into a u32?

I'm counting on the caller to know the max size for their platform.
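
As a rough sketch of how a caller could make that assumption explicit, the
64-bit value can be range-checked before it is narrowed.  Everything named
here (hwid_to_cpu_id(), NR_CPUS_EXAMPLE, the userspace typedefs) is
hypothetical and not part of this series:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel's u64/u32 (assumption: userspace build). */
typedef uint64_t u64;
typedef uint32_t u32;

#define NR_CPUS_EXAMPLE 4	/* hypothetical stand-in for NR_CPUS */

/*
 * Hypothetical helper: compare the full 64-bit ID against the platform
 * limit before narrowing, so an ID with high bits set cannot alias onto
 * a valid CPU number after truncation.
 */
static bool hwid_to_cpu_id(u64 hwid, u32 *cpu_id)
{
	if (hwid >= NR_CPUS_EXAMPLE)
		return false;

	*cpu_id = (u32)hwid;	/* lossless: hwid < NR_CPUS_EXAMPLE */
	return true;
}
```

With a plain u32 assignment, an ID such as 0x100000001 would truncate to 1
and pass a "< NR_CPUS" test; the check above rejects it instead.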

>
> It seems only if we make with W=3.
>
> I thought we usually warned on this.  Oh well, for the openrisc bits.

That's only on ptr truncation I think.

> Acked-by: Stafford Horne <shorne@gmail.com>
>
> >               if (cpu_id < NR_CPUS)
> >                       set_cpu_possible(cpu_id, true);
> >       }
> > --
> > 2.30.2
> >

^ permalink raw reply

* Re: [PATCH 06/12] openrisc: Use of_get_cpu_hwid()
From: Stafford Horne @ 2021-10-06 20:44 UTC (permalink / raw)
  To: Rob Herring
  Cc: Rich Felker, Rafael J. Wysocki, linux-kernel, Guo Ren,
	H. Peter Anvin, linux-riscv, Will Deacon, Jonas Bonn,
	Florian Fainelli, Yoshinori Sato, linux-sh, x86, Russell King,
	linux-csky, Ingo Molnar, bcm-kernel-feedback-list,
	Catalin Marinas, Palmer Dabbelt, devicetree, Albert Ou, Ray Jui,
	Stefan Kristiansson, openrisc, Borislav Petkov, Paul Walmsley,
	Thomas Gleixner, linux-arm-kernel, Scott Branden,
	Greg Kroah-Hartman, Frank Rowand, James Morse, Paul Mackerras,
	linuxppc-dev
In-Reply-To: <20211006164332.1981454-7-robh@kernel.org>

On Wed, Oct 06, 2021 at 11:43:26AM -0500, Rob Herring wrote:
> Replace open coded parsing of CPU nodes' 'reg' property with
> of_get_cpu_hwid().
> 
> Cc: Jonas Bonn <jonas@southpole.se>
> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
> Cc: Stafford Horne <shorne@gmail.com>
> Cc: openrisc@lists.librecores.org
> Signed-off-by: Rob Herring <robh@kernel.org>
> ---
>  arch/openrisc/kernel/smp.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
> index 415e209732a3..7d5a4f303a5a 100644
> --- a/arch/openrisc/kernel/smp.c
> +++ b/arch/openrisc/kernel/smp.c
> @@ -65,11 +65,7 @@ void __init smp_init_cpus(void)
>  	u32 cpu_id;
>  
>  	for_each_of_cpu_node(cpu) {
> -		if (of_property_read_u32(cpu, "reg", &cpu_id)) {
> -			pr_warn("%s missing reg property", cpu->full_name);
> -			continue;
> -		}
> -
> +		cpu_id = of_get_cpu_hwid(cpu);

You have defined of_get_cpu_hwid to return u64; will this create compiler
warnings, since we are storing a u64 into a u32?

It seems only if we make with W=3.

I thought we usually warned on this.  Oh well, for the openrisc bits.

Acked-by: Stafford Horne <shorne@gmail.com>

>  		if (cpu_id < NR_CPUS)
>  			set_cpu_possible(cpu_id, true);
>  	}
> -- 
> 2.30.2
> 

^ permalink raw reply

* Re: [PATCH] perf vendor events power10: Add metric events json file for power10 platform
From: Paul A. Clarke @ 2021-10-06 17:32 UTC (permalink / raw)
  To: Kajol Jain
  Cc: maddy, rnsastry, linuxppc-dev, linux-kernel, acme,
	linux-perf-users, atrajeev, jolsa
In-Reply-To: <20211006073119.276340-1-kjain@linux.ibm.com>

Kajol,

On Wed, Oct 06, 2021 at 01:01:19PM +0530, Kajol Jain wrote:
> Add pmu metric json file for power10 platform.

Thanks for producing this!  A few minor corrections, plus a number of
stylistic comments below...

> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> ---
>  .../arch/powerpc/power10/metrics.json         | 772 ++++++++++++++++++
>  1 file changed, 772 insertions(+)
>  create mode 100644 tools/perf/pmu-events/arch/powerpc/power10/metrics.json
> 
> diff --git a/tools/perf/pmu-events/arch/powerpc/power10/metrics.json b/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
> new file mode 100644
> index 000000000000..028c9777a516
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
> @@ -0,0 +1,772 @@
> +[
> +    {
> +        "BriefDescription": "Percentage of cycles that are run cycles",
> +        "MetricExpr": "PM_RUN_CYC / PM_CYC * 100",
> +        "MetricGroup": "General",
> +        "MetricName": "RUN_CYCLES_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per completed instruction",
> +        "MetricExpr": "PM_CYC / PM_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "CYCLES_PER_INSTRUCTION"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled for any reason",
> +        "MetricExpr": "PM_DISP_STALL_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled because there was a flush",
> +        "MetricExpr": "PM_DISP_STALL_FLUSH / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_FLUSH_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled because the MMU was handling a translation miss",
> +        "MetricExpr": "PM_DISP_STALL_TRANSLATION / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_TRANSLATION_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled waiting to resolve an instruction ERAT miss",
> +        "MetricExpr": "PM_DISP_STALL_IERAT_ONLY_MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_IERAT_ONLY_MISS_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled waiting to resolve an instruction TLB miss",
> +        "MetricExpr": "PM_DISP_STALL_ITLB_MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_ITLB_MISS_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled due to an icache miss",
> +        "MetricExpr": "PM_DISP_STALL_IC_MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_IC_MISS_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled while the instruction was fetched form the local L2",

s/form/from/

> +        "MetricExpr": "PM_DISP_STALL_IC_L2 / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_IC_L2_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled while the instruction was fetched form the local L3",

s/form/from/

> +        "MetricExpr": "PM_DISP_STALL_IC_L3 / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_IC_L3_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled while the instruction was fetched from any source beyond the local L3",
> +        "MetricExpr": "PM_DISP_STALL_IC_L3MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_IC_L3MISS_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled due to an icache miss after a branch mispredict",
> +        "MetricExpr": "PM_DISP_STALL_BR_MPRED_ICMISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_BR_MPRED_ICMISS_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled while instruction was fetched from the local L2 after suffering a branch mispredict",
> +        "MetricExpr": "PM_DISP_STALL_BR_MPRED_IC_L2 / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_BR_MPRED_IC_L2_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled while instruction was fetched from the local L3 after suffering a branch mispredict",
> +        "MetricExpr": "PM_DISP_STALL_BR_MPRED_IC_L3 / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_BR_MPRED_IC_L3_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled while instruction was fetched from any source beyond  the local L3 after suffering a branch mispredict",

extra space after "beyond"

> +        "MetricExpr": "PM_DISP_STALL_BR_MPRED_IC_L3MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_BR_MPRED_IC_L3MISS_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled due to a branch mispredict",
> +        "MetricExpr": "PM_DISP_STALL_BR_MPRED / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_BR_MPRED_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch for any reason",

s/ntc/NTC/ or "next-to-complete".
I do see uses of "NTC" below.
The same comment applies to the other instances of "ntc" below.
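
Style nits like this one are easy to catch mechanically. The following is
only a rough sketch in Python, not part of the patch: the rule list and the
helper name lint_descriptions are made up for illustration, and the sample
input is two entries copied from the quoted hunk.

```python
import json
import re

# Illustrative rules derived from the comments in this review; not exhaustive.
STYLE_RULES = [
    (re.compile(r"\bnt[cf]\b"), "use NTC/NTF (or spell it out)"),
    (re.compile(r"\b(lsu|vsu) unit\b"), "use LSU/VSU without 'unit'"),
    (re.compile(r"\binstr\b"), "spell out 'instruction'"),
    (re.compile(r"\bissue q\b"), "spell out 'queue'"),
    (re.compile(r"\S  +\S"), "extra space"),
]


def lint_descriptions(metrics):
    """Return (MetricName, hint) pairs for each rule a BriefDescription trips."""
    findings = []
    for metric in metrics:
        text = metric.get("BriefDescription", "")
        for pattern, hint in STYLE_RULES:
            if pattern.search(text):
                findings.append((metric.get("MetricName", "?"), hint))
    return findings


# Two entries quoted in this review serve as sample input.
SAMPLE = json.loads("""
[
  {"BriefDescription": "Average cycles per instruction when the ntc instruction is executing in the vsu unit",
   "MetricName": "VSU_STALL_CPI"},
  {"BriefDescription": "Average cycles per instruction spent executing an NTC instruction that gets flushed some time after dispatch",
   "MetricName": "NTC_FLUSH_STALL_CPI"}
]
""")

for name, hint in lint_descriptions(SAMPLE):
    print(f"{name}: {hint}")
```

The matching is case-sensitive on purpose, so the already-correct "NTC" in
the second entry is not flagged.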

> +        "MetricExpr": "PM_DISP_STALL_HELD_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_HELD_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch",
> +        "MetricExpr": "PM_DISP_STALL_HELD_SYNC_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISP_HELD_STALL_SYNC_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch while waiting on the scoreboard",
> +        "MetricExpr": "PM_DISP_STALL_HELD_SCOREBOARD_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISP_HELD_STALL_SCOREBOARD_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch due to issue q full",

s/q/queue/

> +        "MetricExpr": "PM_DISP_STALL_HELD_ISSQ_FULL_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISP_HELD_STALL_ISSQ_FULL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch because the mapper/SRB was full",
> +        "MetricExpr": "PM_DISP_STALL_HELD_RENAME_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_HELD_RENAME_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch because the STF mapper/SRB was full",
> +        "MetricExpr": "PM_DISP_STALL_HELD_STF_MAPPER_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_HELD_STF_MAPPER_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch because the XVFC mapper/SRB was full",
> +        "MetricExpr": "PM_DISP_STALL_HELD_XVFC_MAPPER_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_HELD_XVFC_MAPPER_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch for any other reason",
> +        "MetricExpr": "PM_DISP_STALL_HELD_OTHER_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_HELD_OTHER_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction has been dispatched but not issued for any reason",
> +        "MetricExpr": "PM_ISSUE_STALL / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "ISSUE_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting to be finished in one of the execution units",
> +        "MetricExpr": "PM_EXEC_STALL / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "EXECUTION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction spent executing an NTC instruction that gets flushed some time after dispatch",
> +        "MetricExpr": "PM_EXEC_STALL_NTC_FLUSH / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "NTC_FLUSH_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the instruction finishes at dispatch",

I'm not sure what that means.

> +        "MetricExpr": "PM_EXEC_STALL_FIN_AT_DISP / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "FIN_AT_DISP_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is executing in the branch unit",
> +        "MetricExpr": "PM_EXEC_STALL_BRU / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "BRU_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is a simple fixed point instr that is executing in the lsu unit",

s/instr/instruction/
s/lsu unit/LSU/

> +        "MetricExpr": "PM_EXEC_STALL_SIMPLE_FX / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "SIMPLE_FX_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is executing in the vsu unit",

s/vsu unit/VSU/

> +        "MetricExpr": "PM_EXEC_STALL_VSU / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "VSU_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting to be finished in one of the execution units",
> +        "MetricExpr": "PM_EXEC_STALL_TRANSLATION / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "TRANSLATION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is a load or store that suffered a translation miss",
> +        "MetricExpr": "PM_EXEC_STALL_DERAT_ONLY_MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DERAT_ONLY_MISS_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is recovering from a TLB miss",
> +        "MetricExpr": "PM_EXEC_STALL_DERAT_DTLB_MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DERAT_DTLB_MISS_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is executing in the lsu unit",

s/lsu unit/LSU/

> +        "MetricExpr": "PM_EXEC_STALL_LSU / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "LSU_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is a load that is executing in the lsu unit",

s/lsu unit/LSU/

> +        "MetricExpr": "PM_EXEC_STALL_LOAD / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "LOAD_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting for a load miss to resolve from either the local L2 or local L3",
> +        "MetricExpr": "PM_EXEC_STALL_DMISS_L2L3 / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DMISS_L2L3_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting for a load miss to resolve from either the local L2 or local L3, with an RC dispatch conflict",
> +        "MetricExpr": "PM_EXEC_STALL_DMISS_L2L3_CONFLICT / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DMISS_L2L3_CONFLICT_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting for a load miss to resolve from either the local L2 or local L3, without an RC dispatch conflict",
> +        "MetricExpr": "PM_EXEC_STALL_DMISS_L2L3_NOCONFLICT / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DMISS_L2L3_NOCONFLICT_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting for a load miss to resolve from a source beyond the local L2 and local L3",
> +        "MetricExpr": "PM_EXEC_STALL_DMISS_L3MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DMISS_L3MISS_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting for a load miss to resolve from a neighbor chiplet's L2 or L3 in the same chip",
> +        "MetricExpr": "PM_EXEC_STALL_DMISS_L21_L31 / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DMISS_L21_L31_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting for a load miss to resolve from local memory, L4 or OpenCapp chip",

Most descriptions put L4 before memory.
My preference is to use an "Oxford comma" (a comma after "L4", as in
"memory, L4, or ..."), but I acknowledge there are those who prefer otherwise.

> +        "MetricExpr": "PM_EXEC_STALL_DMISS_LMEM / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DMISS_LMEM_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting for a load miss to resolve from a remote chip (cache, L4, memory or CAPP) in the same group",

Is there a distinction between "OpenCapp" and "CAPP"?  If not, pick one throughout.
Is this supposed to be "OpenCAPI"?

> +        "MetricExpr": "PM_EXEC_STALL_DMISS_OFF_CHIP / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DMISS_OFF_CHIP_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is waiting for a load miss to resolve from a distant chip (cache, L4, memory or CAPP chip)",
> +        "MetricExpr": "PM_EXEC_STALL_DMISS_OFF_NODE / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DMISS_OFF_NODE_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is executing a TLBIEL instruction",
> +        "MetricExpr": "PM_EXEC_STALL_TLBIEL / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "TLBIEL_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is finishing a load after its data has been reloaded from a data source beyond the local L1, OR when the LSU is processing an L1-hit, OR when the NTF instruction merged with another load in the LMQ",
> +        "MetricExpr": "PM_EXEC_STALL_LOAD_FINISH / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "LOAD_FINISH_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is a store that is executing in the lsu unit",

s/lsu unit/LSU/

> +        "MetricExpr": "PM_EXEC_STALL_STORE / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "STORE_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is in the store unit outside of handling store misses or other special store operations",

s/store unit/LSU/ ?

> +        "MetricExpr": "PM_EXEC_STALL_STORE_PIPE / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "STORE_PIPE_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is a store whose cache line was not resident in the L1 and had to wait for allocation of the missing line into the L1",
> +        "MetricExpr": "PM_EXEC_STALL_STORE_MISS / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "STORE_MISS_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is a TLBIE instruction waiting for a response from the L2",
> +        "MetricExpr": "PM_EXEC_STALL_TLBIE / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "TLBIE_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is executing a PTESYNC instruction",
> +        "MetricExpr": "PM_EXEC_STALL_PTESYNC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "PTESYNC_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction cannot complete because the thread was blocked",
> +        "MetricExpr": "PM_CMPL_STALL / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "COMPLETION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction cannot complete because it was interrupted by ANY exception",
> +        "MetricExpr": "PM_CMPL_STALL_EXCEPTION / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "EXCEPTION_COMPLETION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is stuck at finish waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC",
> +        "MetricExpr": "PM_CMPL_STALL_MEM_ECC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "MEM_ECC_COMPLETION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction cannot complete the instruction is a stcx waiting for resolution from the nest",
> +        "MetricExpr": "PM_CMPL_STALL_STCX / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "STCX_COMPLETION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is a LWSYNC instruction waiting to complete",

Sometimes instruction mnemonics are ALL CAPS, like here, and sometimes not,
like "stcx", above. Pick one style. Also pick whether the mnemonic is
followed by "instruction" or not.  I prefer including "instruction" for
clarity.

> +        "MetricExpr": "PM_CMPL_STALL_LWSYNC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "LWSYNC_COMPLETION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction is a HWSYNC instruction stuck at finish waiting for a response from the L2",
> +        "MetricExpr": "PM_CMPL_STALL_HWSYNC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "HWSYNC_COMPLETION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction required special handling before completion",
> +        "MetricExpr": "PM_CMPL_STALL_SPECIAL / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "SPECIAL_COMPLETION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, disp_stall_translation or children are miscounting",

Are these "Should equal 0" metrics generally useful?
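
For reference, my reading of these metrics is "parent minus the sum of its
children", as in OTHER_LSU_STALL_CPI below. A small Python sketch of that
arithmetic, with made-up counter values (the numbers, the cpi() helper and
the tolerance are assumptions for illustration, not measured data):

```python
# Hypothetical raw event counts; real values would come from perf.
COUNTS = {
    "PM_RUN_INST_CMPL": 1000,
    "PM_EXEC_STALL_LSU": 400,
    "PM_EXEC_STALL_LOAD": 300,
    "PM_EXEC_STALL_STORE": 100,
}


def cpi(counts, event):
    """Stall cycles per completed instruction for one event."""
    return counts[event] / counts["PM_RUN_INST_CMPL"]


def other_lsu_stall_cpi(counts):
    """LSU_STALL_CPI - (LOAD_STALL_CPI + STORE_STALL_CPI)."""
    parent = cpi(counts, "PM_EXEC_STALL_LSU")
    children = cpi(counts, "PM_EXEC_STALL_LOAD") + cpi(counts, "PM_EXEC_STALL_STORE")
    return parent - children


def is_consistent(counts, tolerance=1e-9):
    """True when the children account for the whole parent, within tolerance."""
    return abs(other_lsu_stall_cpi(counts)) < tolerance


print(is_consistent(COUNTS))

# If a child undercounts, the residual becomes non-zero.
skewed = dict(COUNTS, PM_EXEC_STALL_STORE=50)
print(is_consistent(skewed))
```

So the value is a self-check on the counters rather than something a user
would normally look at, which is why I am asking.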

> +        "MetricExpr": "DISPATCHED_TRANSLATION_CPI - (DISPATCHED_IERAT_ONLY_MISS_CPI + DISPATCHED_ITLB_MISS_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_DISPATCHED_TRANSLATION_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, disp_stall_ic_miss or children are miscounting",
> +        "MetricExpr": "DISPATCHED_IC_MISS_CPI - (DISPATCHED_IC_L2_CPI + DISPATCHED_IC_L3_CPI + DISPATCHED_IC_L3MISS_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_DISPATCHED_IC_MISS_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, disp_stall_br_mpred_icmiss or children are miscounting",
> +        "MetricExpr": "DISPATCHED_BR_MPRED_ICMISS_CPI - (DISPATCHED_BR_MPRED_IC_L2_CPI + DISPATCHED_BR_MPRED_IC_L3_CPI + DISPATCHED_BR_MPRED_IC_L3MISS_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_DISPATCHED_BR_MPRED_ICMISS_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, disp_stall_held_rename or children are miscounting",
> +        "MetricExpr": "DISPATCHED_HELD_RENAME_CPI - (DISPATCHED_HELD_STF_MAPPER_CPI + DISPATCHED_HELD_XVFC_MAPPER_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_DISPATCHED_HELD_RENAME_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, disp_stall_held or children are miscounting",
> +        "MetricExpr": "DISPATCHED_HELD_CPI - (DISP_HELD_STALL_SYNC_CPI + DISP_HELD_STALL_SCOREBOARD_CPI + DISP_HELD_STALL_ISSQ_FULL_CPI + DISPATCHED_HELD_RENAME_CPI + DISPATCHED_HELD_OTHER_CPI + DISPATCHED_HELD_HALT_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_DISPATCHED_HELD_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, disp_stall or children are miscounting",
> +        "MetricExpr": "DISPATCHED_CPI - (DISPATCHED_FLUSH_CPI + DISPATCHED_TRANSLATION_CPI + DISPATCHED_IC_MISS_CPI + DISPATCHED_BR_MPRED_ICMISS_CPI + DISPATCHED_BR_MPRED_CPI + DISPATCHED_HELD_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_DISPATCHED_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, exec_stall_translation or children are miscounting",
> +        "MetricExpr": "TRANSLATION_STALL_CPI - (DERAT_ONLY_MISS_STALL_CPI + DERAT_DTLB_MISS_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_TRANSLATION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, exec_stall_dmiss_l2l3 or children are miscounting",
> +        "MetricExpr": "DMISS_L2L3_STALL_CPI - (DMISS_L2L3_CONFLICT_STALL_CPI + DMISS_L2L3_NOCONFLICT_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_DMISS_L2L3_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, exec_stall_dmiss_l3miss or children are miscounting",
> +        "MetricExpr": "DMISS_L3MISS_STALL_CPI - (DMISS_L21_L31_STALL_CPI + DMISS_LMEM_STALL_CPI + DMISS_OFF_CHIP_STALL_CPI + DMISS_OFF_NODE_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_DMISS_L3MISS_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, exec_stall_load or children are miscounting",
> +        "MetricExpr": "LOAD_STALL_CPI - (DMISS_L2L3_STALL_CPI + DMISS_L3MISS_STALL_CPI + TLBIEL_STALL_CPI + LOAD_FINISH_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_LOAD_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, exec_stall_store or children are miscounting",
> +        "MetricExpr": "STORE_STALL_CPI - (STORE_PIPE_STALL_CPI + STORE_MISS_STALL_CPI + TLBIE_STALL_CPI + PTESYNC_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_STORE_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, exec_stall_lsu or children are miscounting",
> +        "MetricExpr": "LSU_STALL_CPI - (LOAD_STALL_CPI + STORE_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_LSU_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, cmpl_stall or children are miscounting",
> +        "MetricExpr": "COMPLETION_STALL_CPI - (EXCEPTION_COMPLETION_STALL_CPI + MEM_ECC_COMPLETION_STALL_CPI + STCX_COMPLETION_STALL_CPI + LWSYNC_COMPLETION_STALL_CPI + HWSYNC_COMPLETION_STALL_CPI + SPECIAL_COMPLETION_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_COMPLETION_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, exec_stall or children are miscounting",
> +        "MetricExpr": "EXECUTION_STALL_CPI - (NTC_FLUSH_STALL_CPI + FIN_AT_DISP_STALL_CPI + BRU_STALL_CPI + SIMPLE_FX_STALL_CPI + VSU_STALL_CPI + TRANSLATION_STALL_CPI + LSU_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_STALL_CPI"
> +    },
> +    {
> +        "BriefDescription": "Should equal 0. If not, pm_cyc or children are miscounting",
> +        "MetricExpr": "CYCLES_PER_INSTRUCTION - (DISPATCHED_CPI + ISSUE_STALL_CPI + EXECUTION_STALL_CPI + COMPLETION_STALL_CPI)",
> +        "MetricGroup": "CPI",
> +        "MetricName": "OTHER_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when dispatch was stalled because Fetch was being held,  so there was nothing in the pipeline for this thread",

s/Fetch/fetch/
extra space after "held,"

> +        "MetricExpr": "PM_DISP_STALL_FETCH / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_FETCH_CPI"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntc instruction was held at dispatch because of power management",
> +        "MetricExpr": "PM_DISP_STALL_HELD_HALT_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "CPI",
> +        "MetricName": "DISPATCHED_HELD_HALT_CPI"
> +    },
> +    {
> +        "BriefDescription": "Percentage of flushes per completed instruction",
> +        "MetricExpr": "PM_FLUSH / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Others",
> +        "MetricName": "FLUSH_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of flushes due to a branch mispredict per instruction",
> +        "MetricExpr": "PM_FLUSH_MPRED / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Others",
> +        "MetricName": "BR_MPRED_FLUSH_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of branch mispredictions per completed instruction",
> +        "MetricExpr": "PM_BR_MPRED_CMPL / PM_RUN_INST_CMPL",
> +        "MetricGroup": "Others",
> +        "MetricName": "BRANCH_MISPREDICTION_RATE"
> +    },
> +    {
> +        "BriefDescription": "Percentage of finished loads that missed in the L1",
> +        "MetricExpr": "PM_LD_MISS_L1 / PM_LD_REF_L1 * 100",
> +        "MetricGroup": "Others",
> +        "MetricName": "L1_LD_MISS_RATIO",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of completed instructions that were loads that missed the L1",
> +        "MetricExpr": "PM_LD_MISS_L1 / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Others",
> +        "MetricName": "L1_LD_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of instructions when the DPTEG required for the load/store instruction in execution was missing from the TLB",
> +        "MetricExpr": "PM_DTLB_MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Others",
> +        "MetricName": "DTLB_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Average number of instruction dispatched per instruction completed",

s/instruction/instructions/

> +        "MetricExpr": "PM_INST_DISP / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "DISPATCH_PER_INST_CMPL"
> +    },
> +    {
> +        "BriefDescription": "Percentage of completed instructions that were a demand load that did not hit in the L1 or L2",
> +        "MetricExpr": "PM_DATA_FROM_L2MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "General",
> +        "MetricName": "L2_LD_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of completed instructions that were demand fetches that missed the L1 instruction cache",
> +        "MetricExpr": "PM_L1_ICACHE_MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Instruction_Misses",
> +        "MetricName": "L1_INST_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of completed instructions that were demand fetches that reloaded from beyond the L3 instruction cache",
> +        "MetricExpr": "PM_INST_FROM_L3MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "General",
> +        "MetricName": "L3_INST_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Average number of completed instructions per cycle",
> +        "MetricExpr": "PM_INST_CMPL / PM_CYC",
> +        "MetricGroup": "General",
> +        "MetricName": "IPC"
> +    },
> +    {
> +        "BriefDescription": "Average number of cycles per completed instruction group",
> +        "MetricExpr": "PM_CYC / PM_1PLUS_PPC_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "CYCLES_PER_COMPLETED_INSTRUCTIONS_SET"
> +    },
> +    {
> +        "BriefDescription": "Percentage of cycles when at least 1 instruction dispatched",
> +        "MetricExpr": "PM_1PLUS_PPC_DISP / PM_RUN_CYC * 100",
> +        "MetricGroup": "General",
> +        "MetricName": "CYCLES_ATLEAST_ONE_INST_DISPATCHED",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Rate of finished loads per completed instruction",

Most similar "rate" metrics are using the phrase "average number of".
Do we want to use that here as well?  (Applies to all "rate" metrics.)

> +        "MetricExpr": "PM_LD_REF_L1 / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "LOADS_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Rate of finished stores per completed instruction",
> +        "MetricExpr": "PM_ST_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "STORES_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Percentage of demand loads that reloaded from beyond the L2 per completed instruction",
> +        "MetricExpr": "PM_DATA_FROM_L2MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "dL1_Reloads",
> +        "MetricName": "DL1_RELOAD_FROM_L2_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of demand loads that reloaded from beyond the L3 per completed instruction",
> +        "MetricExpr": "PM_DATA_FROM_L3MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "dL1_Reloads",
> +        "MetricName": "DL1_RELOAD_FROM_L3_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of DERAT misses with 4k page size per completed run instruction",

When PM_RUN_INST_CMPL is used, sometimes we say "run instruction",
and sometimes we say "completed instruction".  Let's pick one.

> +        "MetricExpr": "PM_DERAT_MISS_4K / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_4K_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of DERAT misses with 64k page size per completed run instruction",
> +        "MetricExpr": "PM_DERAT_MISS_64K / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_64K_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Average number of run cycles per completed run instruction",

Here we cover our bases and say "completed run instruction". ;-)
Let's convert this one to whichever phrase is chosen for PM_RUN_INST_CMPL.
Seen below, too.
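
To see how the wording is currently split, something like the following
Python sketch could tally the phrases over all metrics that use
PM_RUN_INST_CMPL. The helper names are invented here and the sample is just
four entries copied from the quoted hunks, so treat it as an illustration.

```python
from collections import Counter

# Checked in this order so that "completed run instruction" is not also
# counted as "run instruction".
PHRASES = (
    "completed run instruction",
    "completed instruction",
    "run instruction",
)


def phrase_used(description):
    """Return the first phrase from PHRASES found in the description."""
    for phrase in PHRASES:
        if phrase in description:
            return phrase
    return "(none)"


def tally(metrics):
    """Count phrases, looking only at metrics that use PM_RUN_INST_CMPL."""
    return Counter(
        phrase_used(m["BriefDescription"])
        for m in metrics
        if "PM_RUN_INST_CMPL" in m["MetricExpr"]
    )


SAMPLE = [
    {"BriefDescription": "Percentage of flushes per completed instruction",
     "MetricExpr": "PM_FLUSH / PM_RUN_INST_CMPL * 100"},
    {"BriefDescription": "Percentage of DERAT misses with 4k page size per completed run instruction",
     "MetricExpr": "PM_DERAT_MISS_4K / PM_RUN_INST_CMPL * 100"},
    {"BriefDescription": "Average number of run cycles per completed run instruction",
     "MetricExpr": "PM_RUN_CYC / PM_RUN_INST_CMPL"},
    {"BriefDescription": "Average number of completed instructions per cycle",
     "MetricExpr": "PM_INST_CMPL / PM_CYC"},
]

for phrase, count in tally(SAMPLE).most_common():
    print(f"{count:3d}  {phrase}")
```

The last sample entry uses PM_INST_CMPL rather than PM_RUN_INST_CMPL, so it
is skipped.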

> +        "MetricExpr": "PM_RUN_CYC / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "RUN_CPI"
> +    },
> +    {
> +        "BriefDescription": "Total number of run cycles",
> +        "MetricExpr": "PM_RUN_CYC",

Isn't this more an event than a metric?
Does it need to be included here?

> +        "MetricGroup": "General",
> +        "MetricName": "TOTAL_RUN_CYCLES"
> +    },
> +    {
> +        "BriefDescription": "Percentage of DERAT misses per completed run instruction",
> +        "MetricExpr": "PM_DERAT_MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Average number of completed run instructions per run cycle",
> +        "MetricExpr": "PM_RUN_INST_CMPL / PM_RUN_CYC",
> +        "MetricGroup": "General",
> +        "MetricName": "RUN_IPC"
> +    },
> +    {
> +        "BriefDescription": "Average number of instruction completed per instruction group",

s/instruction/instructions/

> +        "MetricExpr": "PM_RUN_INST_CMPL / PM_1PLUS_PPC_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "AVERAGE_COMPLETED_INSTRUCTION_SET_SIZE"
> +    },
> +    {
> +        "BriefDescription": "Rate of finished instructions per completed instructions",


> +        "MetricExpr": "PM_INST_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "INST_FIN_PER_CMPL"
> +    },
> +    {
> +        "BriefDescription": "Average cycles per instruction when the ntf instruction is completing and the finish was overlooked",

s/ntf/NTF/
"overlooked" seems like an odd term.

> +        "MetricExpr": "PM_EXEC_STALL_UNKNOWN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "EXEC_STALL_UNKOWN_CPI"
> +    },
> +    {
> +        "BriefDescription": "Percentage of finished branches that were taken",
> +        "MetricExpr": "PM_BR_TAKEN_CMPL / PM_BR_FIN * 100",
> +        "MetricGroup": "General",
> +        "MetricName": "TAKEN_BRANCHES",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of completed instructions that were a demand load that did not hit in the L1, L2, or the L3",
> +        "MetricExpr": "PM_DATA_FROM_L3MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "General",
> +        "MetricName": "L3_LD_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Rate of finished branches per completed instruction",
> +        "MetricExpr": "PM_BR_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "BRANCHES_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Rate of instructions finished in the LSU per completed instruction",
> +        "MetricExpr": "PM_LSU_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "LSU_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Rate of instructions finished in the VSU per completed instruction",
> +        "MetricExpr": "PM_VSU_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "VSU_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Rate of TLBIE instructions finished in the LSU per completed instruction",
> +        "MetricExpr": "PM_TLBIE_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "TLBIE_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Rate of STCX instructions finshed per completed instruction",
> +        "MetricExpr": "PM_STCX_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "STXC_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Rate of LARX instructions finshed per completed instruction",
> +        "MetricExpr": "PM_LARX_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "LARX_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Rate of ptesync instructions finshed per completed instruction",
> +        "MetricExpr": "PM_PTESYNC_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "PTESYNC_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Rate of simple fixed-point instructions finshed in the store unit per completed instruction",

s/store unit/LSU/ ?

> +        "MetricExpr": "PM_FX_LSU_FIN / PM_RUN_INST_CMPL",
> +        "MetricGroup": "General",
> +        "MetricName": "FX_PER_INST"
> +    },
> +    {
> +        "BriefDescription": "Percentage of demand load misses that reloaded the L1 cache",
> +        "MetricExpr": "PM_LD_DEMAND_MISS_L1 / PM_LD_MISS_L1 * 100",
> +        "MetricGroup": "General",
> +        "MetricName": "DL1_MISS_RELOADS",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of demand load misses that reloaded from beyond the local L2",
> +        "MetricExpr": "PM_DATA_FROM_L2MISS / PM_LD_DEMAND_MISS_L1 * 100",
> +        "MetricGroup": "dL1_Reloads",
> +        "MetricName": "DL1_RELOAD_FROM_L2_MISS",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of demand load misses that reloaded from beyond the local L3",
> +        "MetricExpr": "PM_DATA_FROM_L3MISS / PM_LD_DEMAND_MISS_L1 * 100",
> +        "MetricGroup": "dL1_Reloads",
> +        "MetricName": "DL1_RELOAD_FROM_L3_MISS",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of cycles stalled due to the ntc instruction waiting for a load miss to resolve from a source beyond the local L2 and local L3",
> +        "MetricExpr": "DMISS_L3MISS_STALL_CPI / RUN_CPI * 100",
> +        "MetricGroup": "General",
> +        "MetricName": "DCACHE_MISS_CPI",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of DERAT misses with 2M page size per completed run instruction",
> +        "MetricExpr": "PM_DERAT_MISS_2M / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_2M_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of DERAT misses with 16M page size per completed run instruction",
> +        "MetricExpr": "PM_DERAT_MISS_16M / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_16M_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "DERAT miss ratio for 4K page size",
> +        "MetricExpr": "PM_DERAT_MISS_4K / PM_DERAT_MISS",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_4K_MISS_RATIO"
> +    },
> +    {
> +        "BriefDescription": "DERAT miss ratio for 2M page size",
> +        "MetricExpr": "PM_DERAT_MISS_2M / PM_DERAT_MISS",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_2M_MISS_RATIO"
> +    },
> +    {
> +        "BriefDescription": "DERAT miss ratio for 16M page size",
> +        "MetricExpr": "PM_DERAT_MISS_16M / PM_DERAT_MISS",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_16M_MISS_RATIO"
> +    },
> +    {
> +        "BriefDescription": "DERAT miss ratio for 64K page size",
> +        "MetricExpr": "PM_DERAT_MISS_64K / PM_DERAT_MISS",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_64K_MISS_RATIO"
> +    },
> +    {
> +        "BriefDescription": "Percentage of DERAT misses that resulted in TLB reloads",
> +        "MetricExpr": "PM_DTLB_MISS / PM_DERAT_MISS * 100",
> +        "MetricGroup": "Translation",
> +        "MetricName": "DERAT_MISS_RELOAD",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of ICache misses that were reloaded from beyond the local L3",

Sometimes we use "ICache" and sometimes "icache".  Pick one.

> +        "MetricExpr": "PM_INST_FROM_L3MISS / PM_L1_ICACHE_MISS * 100",
> +        "MetricGroup": "Instruction_Misses",
> +        "MetricName": "INST_FROM_L3_MISS",
> +        "ScaleUnit": "1%"
> +    },
> +    {
> +        "BriefDescription": "Percentage of ICache reloads from beyond the L3 per completed run instruction",
> +        "MetricExpr": "PM_INST_FROM_L3MISS / PM_RUN_INST_CMPL * 100",
> +        "MetricGroup": "Instruction_Misses",
> +        "MetricName": "INST_FROM_L3_MISS_RATE",
> +        "ScaleUnit": "1%"
> +    }
> +]

PC

* [PATCH 12/12] cacheinfo: Set cache 'id' based on DT data
From: Rob Herring @ 2021-10-06 16:43 UTC (permalink / raw)
  To: Russell King, James Morse, Catalin Marinas, Will Deacon, Guo Ren,
	Jonas Bonn, Stefan Kristiansson, Stafford Horne, Michael Ellerman,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Yoshinori Sato,
	Rich Felker, x86, Greg Kroah-Hartman
  Cc: devicetree, Florian Fainelli, Scott Branden, Rafael J. Wysocki,
	linux-sh, Ray Jui, H. Peter Anvin, linux-kernel, linux-csky,
	openrisc, linuxppc-dev, Ingo Molnar, Paul Mackerras,
	Borislav Petkov, bcm-kernel-feedback-list, Thomas Gleixner,
	Frank Rowand, linux-riscv, linux-arm-kernel
In-Reply-To: <20211006164332.1981454-1-robh@kernel.org>

Use the minimum CPU h/w id of the CPUs associated with the cache for the
cache 'id'. This provides a stable id value for a given system. As we
need to check all possible CPUs, we can't use the shared_cpu_map, which
only covers online CPUs. As there is no cache-to-CPU mapping in DT, we
have to walk all CPU nodes and then walk the cache levels of each one.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
---
 drivers/base/cacheinfo.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 66d10bdb863b..44547fd96f72 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -136,6 +136,31 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
 	return of_property_read_bool(np, "cache-unified");
 }
 
+static void cache_of_set_id(struct cacheinfo *this_leaf, struct device_node *np)
+{
+	struct device_node *cpu;
+	unsigned long min_id = ~0UL;
+
+	for_each_of_cpu_node(cpu) {
+		struct device_node *cache_node = cpu;
+		u64 id = of_get_cpu_hwid(cache_node, 0);
+
+		while ((cache_node = of_find_next_cache_node(cache_node))) {
+			if ((cache_node == np) && (id < min_id)) {
+				min_id = id;
+				of_node_put(cache_node);
+				break;
+			}
+			of_node_put(cache_node);
+		}
+	}
+
+	if (min_id != ~0UL) {
+		this_leaf->id = min_id;
+		this_leaf->attributes |= CACHE_ID;
+	}
+}
+
 static void cache_of_set_props(struct cacheinfo *this_leaf,
 			       struct device_node *np)
 {
@@ -151,6 +176,7 @@ static void cache_of_set_props(struct cacheinfo *this_leaf,
 	cache_get_line_size(this_leaf, np);
 	cache_nr_sets(this_leaf, np);
 	cache_associativity(this_leaf);
+	cache_of_set_id(this_leaf, np);
 }
 
 static int cache_setup_of_node(unsigned int cpu)
-- 
2.30.2
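
For readers following along, the walk described in the commit message can
be modelled in plain userspace C. This is only a sketch under stated
assumptions: `struct node`, `NO_ID` and `min_hwid_for_cache()` are
hypothetical stand-ins rather than kernel APIs, each node has a single
next-level pointer, and device_node reference counting is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a DT node; NOT the kernel's struct device_node. */
struct node {
	const struct node *next_level_cache; /* models the next cache level */
	uint64_t hwid;                       /* only meaningful for CPU nodes */
};

/* Sentinel meaning "no CPU reaches this cache" (assumption of this sketch). */
#define NO_ID UINT64_MAX

/*
 * Return the smallest hwid among the CPUs whose cache chain contains
 * 'cache', or NO_ID if no CPU reaches it.
 */
static uint64_t min_hwid_for_cache(const struct node *cpus, size_t n_cpus,
				   const struct node *cache)
{
	uint64_t min_id = NO_ID;

	for (size_t i = 0; i < n_cpus; i++) {
		const struct node *c;

		/* Walk this CPU's cache levels: L1/L2 -> L3 -> ... */
		for (c = cpus[i].next_level_cache; c; c = c->next_level_cache) {
			if (c == cache) {
				if (cpus[i].hwid < min_id)
					min_id = cpus[i].hwid;
				break; /* this CPU is accounted for */
			}
		}
	}

	return min_id;
}
```

Because every possible CPU is visited regardless of online state, the
result does not change when CPUs are hotplugged, which is the stability
property the commit message is after.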


* [PATCH 11/12] cacheinfo: Allow for >32-bit cache 'id'
From: Rob Herring @ 2021-10-06 16:43 UTC (permalink / raw)
  To: Russell King, James Morse, Catalin Marinas, Will Deacon, Guo Ren,
	Jonas Bonn, Stefan Kristiansson, Stafford Horne, Michael Ellerman,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Yoshinori Sato,
	Rich Felker, x86, Greg Kroah-Hartman
  Cc: devicetree, Florian Fainelli, Scott Branden, Rafael J. Wysocki,
	linux-sh, Ray Jui, H. Peter Anvin, linux-kernel, linux-csky,
	openrisc, linuxppc-dev, Ingo Molnar, Paul Mackerras,
	Borislav Petkov, bcm-kernel-feedback-list, Thomas Gleixner,
	Frank Rowand, linux-riscv, linux-arm-kernel
In-Reply-To: <20211006164332.1981454-1-robh@kernel.org>

In preparation for setting the cache 'id' based on the CPU h/w ids, allow
for a 64-bit 'id' value. The only case that needs this is arm64, so
unsigned long is sufficient.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
---
 drivers/base/cacheinfo.c  | 8 +++++++-
 include/linux/cacheinfo.h | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index dad296229161..66d10bdb863b 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -366,13 +366,19 @@ static ssize_t file_name##_show(struct device *dev,		\
 	return sysfs_emit(buf, "%u\n", this_leaf->object);	\
 }
 
-show_one(id, id);
 show_one(level, level);
 show_one(coherency_line_size, coherency_line_size);
 show_one(number_of_sets, number_of_sets);
 show_one(physical_line_partition, physical_line_partition);
 show_one(ways_of_associativity, ways_of_associativity);
 
+static ssize_t id_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%lu\n", this_leaf->id);
+}
+
 static ssize_t size_show(struct device *dev,
 			 struct device_attribute *attr, char *buf)
 {
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 2f909ed084c6..b2e7f3e40204 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -48,7 +48,7 @@ extern unsigned int coherency_max_size;
  * keeping, the remaining members form the core properties of the cache
  */
 struct cacheinfo {
-	unsigned int id;
+	unsigned long id;
 	enum cache_type type;
 	unsigned int level;
 	unsigned int coherency_line_size;
-- 
2.30.2


* [PATCH 10/12] x86: dt: Use of_get_cpu_hwid()
From: Rob Herring @ 2021-10-06 16:43 UTC (permalink / raw)
  To: Russell King, James Morse, Catalin Marinas, Will Deacon, Guo Ren,
	Jonas Bonn, Stefan Kristiansson, Stafford Horne, Michael Ellerman,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Yoshinori Sato,
	Rich Felker, x86, Greg Kroah-Hartman
  Cc: devicetree, Florian Fainelli, Scott Branden, Rafael J. Wysocki,
	linux-sh, Ray Jui, H. Peter Anvin, linux-kernel, linux-csky,
	openrisc, linuxppc-dev, Ingo Molnar, Paul Mackerras,
	Borislav Petkov, bcm-kernel-feedback-list, Thomas Gleixner,
	Frank Rowand, linux-riscv, linux-arm-kernel
In-Reply-To: <20211006164332.1981454-1-robh@kernel.org>

Replace open coded parsing of CPU nodes' 'reg' property with
of_get_cpu_hwid().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
 arch/x86/kernel/devicetree.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/devicetree.c b/arch/x86/kernel/devicetree.c
index 6a4cb71c2498..3aa1e99df2a9 100644
--- a/arch/x86/kernel/devicetree.c
+++ b/arch/x86/kernel/devicetree.c
@@ -139,12 +139,11 @@ static void __init dtb_cpu_setup(void)
 {
 	struct device_node *dn;
 	u32 apic_id, version;
-	int ret;
 
 	version = GET_APIC_VERSION(apic_read(APIC_LVR));
 	for_each_of_cpu_node(dn) {
-		ret = of_property_read_u32(dn, "reg", &apic_id);
-		if (ret < 0) {
+		apic_id = of_get_cpu_hwid(dn, 0);
+		if (apic_id == ~0U) {
 			pr_warn("%pOF: missing local APIC ID\n", dn);
 			continue;
 		}
-- 
2.30.2
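
The `apic_id == ~0U` check relies on a 64-bit "not found" value turning
into an all-ones 32-bit value once it is stored in a u32. A minimal
userspace sketch of that conversion follows, assuming (as the check in
the patch suggests) that the failure value is all ones in 64 bits;
`HWID_NOT_FOUND` and `hwid_missing_u32()` are hypothetical names used
only here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models an all-ones 64-bit "not found" value (assumption of this sketch). */
#define HWID_NOT_FOUND UINT64_MAX

/*
 * Store a 64-bit hwid into a 32-bit variable and report whether the
 * result looks like "missing". Unsigned narrowing keeps the low 32 bits.
 */
static bool hwid_missing_u32(uint64_t hwid, uint32_t *out)
{
	*out = (uint32_t)hwid;
	return *out == UINT32_MAX;
}
```

One consequence of narrowing first: any 64-bit id whose low 32 bits are
all ones is also reported as missing. That seems harmless for ids that
fit in 32 bits, but it is worth keeping in mind when reusing the pattern.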


* [PATCH 09/12] sh: Use of_get_cpu_hwid()
From: Rob Herring @ 2021-10-06 16:43 UTC (permalink / raw)
  To: Russell King, James Morse, Catalin Marinas, Will Deacon, Guo Ren,
	Jonas Bonn, Stefan Kristiansson, Stafford Horne, Michael Ellerman,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Yoshinori Sato,
	Rich Felker, x86, Greg Kroah-Hartman
  Cc: devicetree, Florian Fainelli, Scott Branden, Rafael J. Wysocki,
	linux-sh, Ray Jui, H. Peter Anvin, linux-kernel, linux-csky,
	openrisc, linuxppc-dev, Ingo Molnar, Paul Mackerras,
	Borislav Petkov, bcm-kernel-feedback-list, Thomas Gleixner,
	Frank Rowand, linux-riscv, linux-arm-kernel
In-Reply-To: <20211006164332.1981454-1-robh@kernel.org>

Replace open coded parsing of CPU nodes' 'reg' property with
of_get_cpu_hwid().

Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
---
 arch/sh/boards/of-generic.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/sh/boards/of-generic.c b/arch/sh/boards/of-generic.c
index 921d76fc3358..f7f3e618e85b 100644
--- a/arch/sh/boards/of-generic.c
+++ b/arch/sh/boards/of-generic.c
@@ -62,9 +62,8 @@ static void sh_of_smp_probe(void)
 	init_cpu_possible(cpumask_of(0));
 
 	for_each_of_cpu_node(np) {
-		const __be32 *cell = of_get_property(np, "reg", NULL);
-		u64 id = -1;
-		if (cell) id = of_read_number(cell, of_n_addr_cells(np));
+		u64 id = of_get_cpu_hwid(np, 0);
+
 		if (id < NR_CPUS) {
 			if (!method)
 				of_property_read_string(np, "enable-method", &method);
-- 
2.30.2

