public inbox for linux-arm-kernel@lists.infradead.org
* [PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE
@ 2023-11-30  7:46 Yicong Yang
  2023-11-30  7:46 ` [PATCH v2 1/3] arm64/sysreg: Add PMSCR_EL12 and factor out the common fields Yicong Yang
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Yicong Yang @ 2023-11-30  7:46 UTC (permalink / raw)
  To: will, mark.rutland, catalin.marinas, broonie, james.morse,
	anshuman.khandual, linux-arm-kernel
  Cc: jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
	fanghao11, yangyicong, linuxarm

From: Yicong Yang <yangyicong@hisilicon.com>

When using SPE on a VHE-enabled host, the operation of a VM is not profiled
at all. This is because the driver uses PMSCR_EL1.{E1SPE, E0SPE} to enable
profiling. On a VHE-enabled host, these accesses are redirected, so we are
actually setting PMSCR_EL2.{E2SPE, E0SPE}. This enables profiling of EL2
and EL0 in the EL2&0 translation regime. However, the VM uses the EL1&0
translation regime, so it is not profiled. We can enable profiling of the
EL1&0 translation regime by setting PMSCR_EL12.{E1SPE, E0SPE} from EL2.
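
To illustrate the redirection described above, here is a minimal user-space
sketch in C. It is not driver code: the struct, enum, and function names are
hypothetical, and the model only tracks which physical register a given
register name reaches when the kernel accesses it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Enable bits as described above: bit 1 is E1SPE/E2SPE, bit 0 is E0SPE. */
#define EN_EL0	(UINT64_C(1) << 0)
#define EN_ELX	(UINT64_C(1) << 1)

/* Hypothetical model of the two physical control registers. */
struct spe_regs {
	uint64_t pmscr_el1;	/* controls the EL1&0 regime (used by a VM) */
	uint64_t pmscr_el2;	/* controls the EL2&0 regime (a VHE host) */
};

/* The register name the kernel uses in its access. */
enum spe_name { NAME_PMSCR_EL1, NAME_PMSCR_EL12 };

/* Resolve which physical register a kernel access reaches. */
uint64_t *spe_resolve(struct spe_regs *regs, enum spe_name name, bool vhe)
{
	assert(regs);
	if (name == NAME_PMSCR_EL12) {
		assert(vhe);	/* the _EL12 name is only used from EL2 with VHE */
		return &regs->pmscr_el1;
	}
	/* With VHE, the _EL1 name is redirected to the EL2 register. */
	return vhe ? &regs->pmscr_el2 : &regs->pmscr_el1;
}

/* Write a value through the resolved name. */
void spe_write(struct spe_regs *regs, enum spe_name name, bool vhe, uint64_t val)
{
	*spe_resolve(regs, name, vhe) = val;
}
```

In this model, a VHE host writing the enable bits through the PMSCR_EL1 name
leaves the EL1&0 control register untouched, which matches the symptom of the
VM not being profiled.
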

This series adds support for this by:
- Adding the sysreg definition of PMSCR_EL12
- Refactoring the code to allow extension
- Enabling the profiling of EL1&0 and completing support for perf's exclude_* options

Tests were run on VHE and non-VHE hosts with
`perf record -e {arm_spe_0//}:u|k|h|G|h`
and the results are as expected.

A sample from EL0 in the VM looks like this (generated by `perf report -D`):
.  000003b8:  b0 48 eb 12 ab ff ff 00 80                      PC 0xffffab12eb48 el0 ns=1
.  000003c1:  99 07 00                                        LAT 7 ISSUE
.  000003c4:  98 08 00                                        LAT 8 TOT
.  000003c7:  62 42 00 00 00                                  EV RETIRED NOT-TAKEN
.  000003cc:  4a 01                                           B COND
.  000003ce:  00 00                                           PAD
.  000003d0:  b1 4c eb 12 ab ff ff 00 80                      TGT 0xffffab12eb4c el0 ns=1
.  000003d9:  00 00 00                                        PAD
.  000003dc:  64 50 01 00 00                                  CONTEXT 0x150 el1
.  000003e1:  65 3c 11 00 00                                  CONTEXT 0x113c el2
.  000003e6:  00 00 00 00 00                                  PAD
.  000003eb:  71 aa aa 00 fd 06 00 00 00                      TS 30014483114

Changes since v1:
- Add the tag from Mark and order PMSCR_EL12 by address as suggested. For now,
  PMSCR_EL12 is at least kept in order among its *_EL12 siblings in sysreg.
Link: https://lore.kernel.org/linux-arm-kernel/20231122084602.53914-1-yangyicong@huawei.com/

Yicong Yang (3):
  arm64/sysreg: Add PMSCR_EL12 and factor out the common fields
  perf: arm_spe: Factor out PMSCR set/clear operations
  perf: arm_spe: Enable the profiling of EL0&1 translation regime

 arch/arm64/tools/sysreg    | 10 ++++-
 drivers/perf/arm_spe_pmu.c | 76 ++++++++++++++++++++++++++++----------
 2 files changed, 65 insertions(+), 21 deletions(-)

-- 
2.24.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH v2 1/3] arm64/sysreg: Add PMSCR_EL12 and factor out the common fields
  2023-11-30  7:46 [PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE Yicong Yang
@ 2023-11-30  7:46 ` Yicong Yang
  2023-11-30  7:46 ` [PATCH v2 2/3] perf: arm_spe: Factor out PMSCR set/clear operations Yicong Yang
  2023-11-30  7:46 ` [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime Yicong Yang
  2 siblings, 0 replies; 8+ messages in thread
From: Yicong Yang @ 2023-11-30  7:46 UTC (permalink / raw)
  To: will, mark.rutland, catalin.marinas, broonie, james.morse,
	anshuman.khandual, linux-arm-kernel
  Cc: jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
	fanghao11, yangyicong, linuxarm

From: Yicong Yang <yangyicong@hisilicon.com>

Add PMSCR_EL12 for accessing PMSCR_EL1 from EL2. Since PMSCR_EL12
and PMSCR_EL1 share the same definition of the fields, define a
common PMSCR_EL1x for both. Update the field name used in the
driver accordingly.

Place PMSCR_EL12 in address order among its *_EL12 siblings in the
sysreg file, as far as possible.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
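For readers following the field layout that PMSCR_EL1 and PMSCR_EL12 share,
the sketch below spells it out as plain C masks. It is a user-space
illustration, not the generated sysreg header: the macro and function names
are hypothetical, and the PA bit position (bit 4) is taken from the
architecture rather than from the diff context shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Shared field layout of PMSCR_EL1 and PMSCR_EL12 (hypothetical names). */
#define MOCK_PMSCR_PCT_SHIFT	6
#define MOCK_PMSCR_PCT_MASK	(UINT64_C(3) << MOCK_PMSCR_PCT_SHIFT)	/* bits 7:6 */
#define MOCK_PMSCR_TS		(UINT64_C(1) << 5)
#define MOCK_PMSCR_PA		(UINT64_C(1) << 4)	/* assumed from the architecture */
#define MOCK_PMSCR_CX		(UINT64_C(1) << 3)
#define MOCK_PMSCR_E1SPE	(UINT64_C(1) << 1)
#define MOCK_PMSCR_E0SPE	(UINT64_C(1) << 0)
/* Res0: bits 63:8 and bit 2, i.e. everything outside the fields above. */
#define MOCK_PMSCR_RES0		(~UINT64_C(0xFB))

/* Build a register value from a PCT value and a set of single-bit flags. */
uint64_t mock_pmscr_make(unsigned int pct, uint64_t flags)
{
	assert(pct <= 3);	/* PCT is a two-bit field */
	assert((flags & (MOCK_PMSCR_RES0 | MOCK_PMSCR_PCT_MASK)) == 0);
	return ((uint64_t)pct << MOCK_PMSCR_PCT_SHIFT) | flags;
}

/* Extract the PCT field from a register value. */
unsigned int mock_pmscr_pct(uint64_t reg)
{
	return (unsigned int)((reg & MOCK_PMSCR_PCT_MASK) >> MOCK_PMSCR_PCT_SHIFT);
}
```

Because both registers use one layout, a value built this way can be written
to either PMSCR_EL1 or PMSCR_EL12 without conversion, which is the reason for
defining the common PMSCR_EL1x fields.
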
 arch/arm64/tools/sysreg    | 10 +++++++++-
 drivers/perf/arm_spe_pmu.c | 20 ++++++++++----------
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 96cbeeab4eec..b55544f721ec 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1800,7 +1800,7 @@ Sysreg	FAR_EL1	3	0	6	0	0
 Field	63:0	ADDR
 EndSysreg
 
-Sysreg	PMSCR_EL1	3	0	9	9	0
+SysregFields	PMSCR_EL1x
 Res0	63:8
 Field	7:6	PCT
 Field	5	TS
@@ -1809,6 +1809,10 @@ Field	3	CX
 Res0	2
 Field	1	E1SPE
 Field	0	E0SPE
+EndSysregFields
+
+Sysreg	PMSCR_EL1	3	0	9	9	0
+Fields	PMSCR_EL1x
 EndSysreg
 
 Sysreg	PMSNEVFR_EL1	3	0	9	9	1
@@ -2411,6 +2415,10 @@ Sysreg	FAR_EL12	3	5	6	0	0
 Field	63:0	ADDR
 EndSysreg
 
+Sysreg	PMSCR_EL12	3	5	9	9	0
+Fields	PMSCR_EL1x
+EndSysreg
+
 Sysreg	CONTEXTIDR_EL12	3	5	13	0	1
 Fields	CONTEXTIDR_ELx
 EndSysreg
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index d2b0cbf0e0c4..05647cfff61d 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -172,13 +172,13 @@ static const struct attribute_group arm_spe_pmu_cap_group = {
 };
 
 /* User ABI */
-#define ATTR_CFG_FLD_ts_enable_CFG		config	/* PMSCR_EL1.TS */
+#define ATTR_CFG_FLD_ts_enable_CFG		config	/* PMSCR_EL1x.TS */
 #define ATTR_CFG_FLD_ts_enable_LO		0
 #define ATTR_CFG_FLD_ts_enable_HI		0
-#define ATTR_CFG_FLD_pa_enable_CFG		config	/* PMSCR_EL1.PA */
+#define ATTR_CFG_FLD_pa_enable_CFG		config	/* PMSCR_EL1x.PA */
 #define ATTR_CFG_FLD_pa_enable_LO		1
 #define ATTR_CFG_FLD_pa_enable_HI		1
-#define ATTR_CFG_FLD_pct_enable_CFG		config	/* PMSCR_EL1.PCT */
+#define ATTR_CFG_FLD_pct_enable_CFG		config	/* PMSCR_EL1x.PCT */
 #define ATTR_CFG_FLD_pct_enable_LO		2
 #define ATTR_CFG_FLD_pct_enable_HI		2
 #define ATTR_CFG_FLD_jitter_CFG			config	/* PMSIRR_EL1.RND */
@@ -303,18 +303,18 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 	struct perf_event_attr *attr = &event->attr;
 	u64 reg = 0;
 
-	reg |= FIELD_PREP(PMSCR_EL1_TS, ATTR_CFG_GET_FLD(attr, ts_enable));
-	reg |= FIELD_PREP(PMSCR_EL1_PA, ATTR_CFG_GET_FLD(attr, pa_enable));
-	reg |= FIELD_PREP(PMSCR_EL1_PCT, ATTR_CFG_GET_FLD(attr, pct_enable));
+	reg |= FIELD_PREP(PMSCR_EL1x_TS, ATTR_CFG_GET_FLD(attr, ts_enable));
+	reg |= FIELD_PREP(PMSCR_EL1x_PA, ATTR_CFG_GET_FLD(attr, pa_enable));
+	reg |= FIELD_PREP(PMSCR_EL1x_PCT, ATTR_CFG_GET_FLD(attr, pct_enable));
 
 	if (!attr->exclude_user)
-		reg |= PMSCR_EL1_E0SPE;
+		reg |= PMSCR_EL1x_E0SPE;
 
 	if (!attr->exclude_kernel)
-		reg |= PMSCR_EL1_E1SPE;
+		reg |= PMSCR_EL1x_E1SPE;
 
 	if (get_spe_event_has_cx(event))
-		reg |= PMSCR_EL1_CX;
+		reg |= PMSCR_EL1x_CX;
 
 	return reg;
 }
@@ -768,7 +768,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
 	set_spe_event_has_cx(event);
 	reg = arm_spe_event_to_pmscr(event);
 	if (!perfmon_capable() &&
-	    (reg & (PMSCR_EL1_PA | PMSCR_EL1_PCT)))
+	    (reg & (PMSCR_EL1x_PA | PMSCR_EL1x_PCT)))
 		return -EACCES;
 
 	return 0;
-- 
2.24.0



* [PATCH v2 2/3] perf: arm_spe: Factor out PMSCR set/clear operations
  2023-11-30  7:46 [PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE Yicong Yang
  2023-11-30  7:46 ` [PATCH v2 1/3] arm64/sysreg: Add PMSCR_EL12 and factor out the common fields Yicong Yang
@ 2023-11-30  7:46 ` Yicong Yang
  2023-11-30  7:46 ` [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime Yicong Yang
  2 siblings, 0 replies; 8+ messages in thread
From: Yicong Yang @ 2023-11-30  7:46 UTC (permalink / raw)
  To: will, mark.rutland, catalin.marinas, broonie, james.morse,
	anshuman.khandual, linux-arm-kernel
  Cc: jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
	fanghao11, yangyicong, linuxarm

From: Yicong Yang <yangyicong@hisilicon.com>

Currently we convert the user settings to the PMSCR configuration in
arm_spe_event_to_pmscr() and set/clear the PMSCR register
separately. This blocks further extension of the exception level
filtering. So factor out the PMSCR set/clear operations
into separate functions and only configure the ELx filtering
when setting the register.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
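The split described above can be sketched in user space as follows. This is
an illustration under stated assumptions, not the driver: struct mock_attr
stands in for perf_event_attr, has_cx stands in for get_spe_event_has_cx(),
the bit positions follow the field layout from patch 1 (with PA assumed at
bit 4), and the function returns the value instead of writing the register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_PCT_BIT0	(UINT64_C(1) << 6)	/* PCT field value 1 */
#define MOCK_TS		(UINT64_C(1) << 5)
#define MOCK_PA		(UINT64_C(1) << 4)	/* assumed from the architecture */
#define MOCK_CX		(UINT64_C(1) << 3)
#define MOCK_E1SPE	(UINT64_C(1) << 1)
#define MOCK_E0SPE	(UINT64_C(1) << 0)

/* Hypothetical stand-in for the parts of perf_event_attr used here. */
struct mock_attr {
	uint64_t config;	/* bit 0 ts_enable, bit 1 pa_enable, bit 2 pct_enable */
	bool exclude_user;
	bool exclude_kernel;
	bool has_cx;		/* stand-in for get_spe_event_has_cx() */
};

/* Everything except the exception level filtering. */
uint64_t mock_event_to_pmscr(const struct mock_attr *attr)
{
	uint64_t reg = 0;

	assert(attr);
	if (attr->config & (UINT64_C(1) << 0))
		reg |= MOCK_TS;
	if (attr->config & (UINT64_C(1) << 1))
		reg |= MOCK_PA;
	if (attr->config & (UINT64_C(1) << 2))
		reg |= MOCK_PCT_BIT0;
	if (attr->has_cx)
		reg |= MOCK_CX;
	return reg;
}

/* The value the set helper would write: base config plus EL filters. */
uint64_t mock_set_pmscr_value(const struct mock_attr *attr)
{
	uint64_t reg = mock_event_to_pmscr(attr);

	if (!attr->exclude_user)
		reg |= MOCK_E0SPE;
	if (!attr->exclude_kernel)
		reg |= MOCK_E1SPE;
	return reg;
}
```

The permission check in arm_spe_pmu_event_init() only looks at the PA and PCT
bits, so it can keep using the conversion helper after the exception level
bits have moved out of it.
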
 drivers/perf/arm_spe_pmu.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 05647cfff61d..09570d4d63cd 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -297,7 +297,7 @@ static const struct attribute_group *arm_spe_pmu_attr_groups[] = {
 	NULL,
 };
 
-/* Convert between user ABI and register values */
+/* Convert between user ABI and register values, except the exception control */
 static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 {
 	struct perf_event_attr *attr = &event->attr;
@@ -307,16 +307,32 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 	reg |= FIELD_PREP(PMSCR_EL1x_PA, ATTR_CFG_GET_FLD(attr, pa_enable));
 	reg |= FIELD_PREP(PMSCR_EL1x_PCT, ATTR_CFG_GET_FLD(attr, pct_enable));
 
+	if (get_spe_event_has_cx(event))
+		reg |= PMSCR_EL1x_CX;
+
+	return reg;
+}
+
+static void arm_spe_pmu_set_pmscr(struct perf_event *event)
+{
+	struct perf_event_attr *attr = &event->attr;
+	u64 reg = 0;
+
+	reg = arm_spe_event_to_pmscr(event);
 	if (!attr->exclude_user)
 		reg |= PMSCR_EL1x_E0SPE;
 
 	if (!attr->exclude_kernel)
 		reg |= PMSCR_EL1x_E1SPE;
 
-	if (get_spe_event_has_cx(event))
-		reg |= PMSCR_EL1x_CX;
+	isb();
+	write_sysreg_s(reg, SYS_PMSCR_EL1);
+}
 
-	return reg;
+static void arm_spe_pmu_clr_pmscr(void)
+{
+	write_sysreg_s(0, SYS_PMSCR_EL1);
+	isb();
 }
 
 static void arm_spe_event_sanitise_period(struct perf_event *event)
@@ -566,8 +582,7 @@ static void arm_spe_perf_aux_output_end(struct perf_output_handle *handle)
 static void arm_spe_pmu_disable_and_drain_local(void)
 {
 	/* Disable profiling at EL0 and EL1 */
-	write_sysreg_s(0, SYS_PMSCR_EL1);
-	isb();
+	arm_spe_pmu_clr_pmscr();
 
 	/* Drain any buffered data */
 	psb_csync();
@@ -808,9 +823,7 @@ static void arm_spe_pmu_start(struct perf_event *event, int flags)
 		write_sysreg_s(reg, SYS_PMSICR_EL1);
 	}
 
-	reg = arm_spe_event_to_pmscr(event);
-	isb();
-	write_sysreg_s(reg, SYS_PMSCR_EL1);
+	arm_spe_pmu_set_pmscr(event);
 }
 
 static void arm_spe_pmu_stop(struct perf_event *event, int flags)
-- 
2.24.0



* [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
  2023-11-30  7:46 [PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE Yicong Yang
  2023-11-30  7:46 ` [PATCH v2 1/3] arm64/sysreg: Add PMSCR_EL12 and factor out the common fields Yicong Yang
  2023-11-30  7:46 ` [PATCH v2 2/3] perf: arm_spe: Factor out PMSCR set/clear operations Yicong Yang
@ 2023-11-30  7:46 ` Yicong Yang
  2024-04-11 14:28   ` Will Deacon
  2 siblings, 1 reply; 8+ messages in thread
From: Yicong Yang @ 2023-11-30  7:46 UTC (permalink / raw)
  To: will, mark.rutland, catalin.marinas, broonie, james.morse,
	anshuman.khandual, linux-arm-kernel
  Cc: jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
	fanghao11, yangyicong, linuxarm

From: Yicong Yang <yangyicong@hisilicon.com>

On a VHE-enabled host, accesses to PMSCR_EL1 are redirected to PMSCR_EL2,
so the driver is actually enabling E0SPE and E2SPE. This means
that data from the EL0&1 translation regime of a VM is not profiled.
This patch adds support for profiling EL0 and EL1 of
a VM. Users can filter data from different exception levels by using
perf's exclude_* attributes. The exclude_* mapping follows
Documentation/arch/arm64/perf.rst and the implementation of
arm_pmuv3.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 09570d4d63cd..a647d625f359 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 static void arm_spe_pmu_set_pmscr(struct perf_event *event)
 {
 	struct perf_event_attr *attr = &event->attr;
-	u64 reg = 0;
+	u64 pmscr_el1, pmscr_el12;
 
-	reg = arm_spe_event_to_pmscr(event);
-	if (!attr->exclude_user)
-		reg |= PMSCR_EL1x_E0SPE;
+	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
+
+	/*
+	 * Map the exclude_* decision to ELx according to
+	 * Documentation/arch/arm64/perf.rst.
+	 */
+	if (is_kernel_in_hyp_mode()) {
+		if (!attr->exclude_kernel && !attr->exclude_host)
+			pmscr_el1 |= PMSCR_EL1x_E1SPE;
 
-	if (!attr->exclude_kernel)
-		reg |= PMSCR_EL1x_E1SPE;
+		if (!attr->exclude_kernel && !attr->exclude_guest)
+			pmscr_el12 |= PMSCR_EL1x_E1SPE;
+
+		if (!attr->exclude_user && !attr->exclude_host) {
+			pmscr_el1 |= PMSCR_EL1x_E0SPE;
+			pmscr_el12 |= PMSCR_EL1x_E0SPE;
+		}
+	} else {
+		if (!attr->exclude_kernel)
+			pmscr_el1 |= PMSCR_EL1x_E1SPE;
+
+		if (!attr->exclude_user)
+			pmscr_el1 |= PMSCR_EL1x_E0SPE;
+	}
 
 	isb();
-	write_sysreg_s(reg, SYS_PMSCR_EL1);
+	write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
+	if (is_kernel_in_hyp_mode())
+		write_sysreg_s(pmscr_el12, SYS_PMSCR_EL12);
 }
 
 static void arm_spe_pmu_clr_pmscr(void)
 {
+	if (is_kernel_in_hyp_mode())
+		write_sysreg_s(0, SYS_PMSCR_EL12);
+
 	write_sysreg_s(0, SYS_PMSCR_EL1);
 	isb();
 }
-- 
2.24.0



* Re: [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
  2023-11-30  7:46 ` [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime Yicong Yang
@ 2024-04-11 14:28   ` Will Deacon
  2024-04-12  9:22     ` Yicong Yang
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2024-04-11 14:28 UTC (permalink / raw)
  To: Yicong Yang
  Cc: mark.rutland, catalin.marinas, broonie, james.morse,
	anshuman.khandual, linux-arm-kernel, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, fanghao11, yangyicong,
	linuxarm, maz

On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
> and we're actually enabling E0SPE and E2SPE in the driver. This means
> the data from EL0&1 translation regime of a VM will not be profiled.
> So this patch tries to add the support of profiling EL0 and EL1 of
> a VM. Users can filter data of different exception level by using
> the perf's exclude_* attributes. The exclude_* decision is referred
> to Documentation/arch/arm64/perf.rst and the implementation of
> arm_pmuv3.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
>  1 file changed, 30 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index 09570d4d63cd..a647d625f359 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>  static void arm_spe_pmu_set_pmscr(struct perf_event *event)
>  {
>  	struct perf_event_attr *attr = &event->attr;
> -	u64 reg = 0;
> +	u64 pmscr_el1, pmscr_el12;
>  
> -	reg = arm_spe_event_to_pmscr(event);
> -	if (!attr->exclude_user)
> -		reg |= PMSCR_EL1x_E0SPE;
> +	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
> +
> +	/*
> +	 * Map the exclude_* descision to ELx according to
> +	 * Documentation/arch/arm64/perf.rst.
> +	 */
> +	if (is_kernel_in_hyp_mode()) {
> +		if (!attr->exclude_kernel && !attr->exclude_host)
> +			pmscr_el1 |= PMSCR_EL1x_E1SPE;
>  
> -	if (!attr->exclude_kernel)
> -		reg |= PMSCR_EL1x_E1SPE;
> +		if (!attr->exclude_kernel && !attr->exclude_guest)
> +			pmscr_el12 |= PMSCR_EL1x_E1SPE;
> +
> +		if (!attr->exclude_user && !attr->exclude_host) {
> +			pmscr_el1 |= PMSCR_EL1x_E0SPE;
> +			pmscr_el12 |= PMSCR_EL1x_E0SPE;
> +		}

Hmm, I don't understand this part. Doesn't this mean that setting
'exclude_host' to true will also exclude userspace (EL0) profiling for
the guest?
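
The case in question can be checked with a small user-space model of the
mapping quoted above. This is only a sketch: the struct and function names
are hypothetical, the exclude_* flags are plain booleans rather than
perf_event_attr fields, and only the E1SPE/E0SPE bits are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_E0SPE	(UINT64_C(1) << 0)
#define MOCK_E1SPE	(UINT64_C(1) << 1)

/* Hypothetical stand-in for the exclude_* bits of perf_event_attr. */
struct mock_excl {
	bool user;
	bool kernel;
	bool host;
	bool guest;
};

/* Filter bits destined for the PMSCR_EL1 and PMSCR_EL12 writes. */
struct mock_pmscr_pair {
	uint64_t el1;
	uint64_t el12;
};

/* Mirror of the v2 mapping in arm_spe_pmu_set_pmscr(), filter bits only. */
struct mock_pmscr_pair mock_v2_filter_bits(const struct mock_excl *x, bool vhe)
{
	struct mock_pmscr_pair p = { 0, 0 };

	assert(x);
	if (vhe) {
		if (!x->kernel && !x->host)
			p.el1 |= MOCK_E1SPE;
		if (!x->kernel && !x->guest)
			p.el12 |= MOCK_E1SPE;
		if (!x->user && !x->host) {
			p.el1 |= MOCK_E0SPE;
			p.el12 |= MOCK_E0SPE;
		}
	} else {
		if (!x->kernel)
			p.el1 |= MOCK_E1SPE;
		if (!x->user)
			p.el1 |= MOCK_E0SPE;
	}
	return p;
}
```

With exclude_host set and everything else clear on a VHE host, the model
yields only E1SPE for the PMSCR_EL12 value, so guest EL0 is left out along
with the host.
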

Will


* Re: [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
  2024-04-11 14:28   ` Will Deacon
@ 2024-04-12  9:22     ` Yicong Yang
  2024-04-12  9:59       ` Marc Zyngier
  0 siblings, 1 reply; 8+ messages in thread
From: Yicong Yang @ 2024-04-12  9:22 UTC (permalink / raw)
  To: Will Deacon
  Cc: yangyicong, mark.rutland, catalin.marinas, broonie, james.morse,
	anshuman.khandual, linux-arm-kernel, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, fanghao11, linuxarm, maz

On 2024/4/11 22:28, Will Deacon wrote:
> On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
>> and we're actually enabling E0SPE and E2SPE in the driver. This means
>> the data from EL0&1 translation regime of a VM will not be profiled.
>> So this patch tries to add the support of profiling EL0 and EL1 of
>> a VM. Users can filter data of different exception level by using
>> the perf's exclude_* attributes. The exclude_* decision is referred
>> to Documentation/arch/arm64/perf.rst and the implementation of
>> arm_pmuv3.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
>>  1 file changed, 30 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>> index 09570d4d63cd..a647d625f359 100644
>> --- a/drivers/perf/arm_spe_pmu.c
>> +++ b/drivers/perf/arm_spe_pmu.c
>> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>>  static void arm_spe_pmu_set_pmscr(struct perf_event *event)
>>  {
>>  	struct perf_event_attr *attr = &event->attr;
>> -	u64 reg = 0;
>> +	u64 pmscr_el1, pmscr_el12;
>>  
>> -	reg = arm_spe_event_to_pmscr(event);
>> -	if (!attr->exclude_user)
>> -		reg |= PMSCR_EL1x_E0SPE;
>> +	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
>> +
>> +	/*
>> +	 * Map the exclude_* descision to ELx according to
>> +	 * Documentation/arch/arm64/perf.rst.
>> +	 */
>> +	if (is_kernel_in_hyp_mode()) {
>> +		if (!attr->exclude_kernel && !attr->exclude_host)
>> +			pmscr_el1 |= PMSCR_EL1x_E1SPE;
>>  
>> -	if (!attr->exclude_kernel)
>> -		reg |= PMSCR_EL1x_E1SPE;
>> +		if (!attr->exclude_kernel && !attr->exclude_guest)
>> +			pmscr_el12 |= PMSCR_EL1x_E1SPE;
>> +
>> +		if (!attr->exclude_user && !attr->exclude_host) {
>> +			pmscr_el1 |= PMSCR_EL1x_E0SPE;
>> +			pmscr_el12 |= PMSCR_EL1x_E0SPE;
>> +		}
> 
> Hmm, I don't understand this part. Doesn't this mean that setting
> 'exclude_host' to true will also exclude userspace (EL0) profiling for
> the guest?
> 

I may have misunderstood 'exclude_host' in the doc. Yes, we won't include EL0 here
in the driver, but we should handle it on guest enter/exit, which is missing
from this patch. I will see how to handle it properly on guest enter/exit and
respin a v3.

Thanks.


* Re: [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
  2024-04-12  9:22     ` Yicong Yang
@ 2024-04-12  9:59       ` Marc Zyngier
  2024-04-12 10:12         ` Yicong Yang
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2024-04-12  9:59 UTC (permalink / raw)
  To: Yicong Yang
  Cc: Will Deacon, yangyicong, mark.rutland, catalin.marinas, broonie,
	james.morse, anshuman.khandual, linux-arm-kernel,
	jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
	fanghao11, linuxarm

On Fri, 12 Apr 2024 10:22:28 +0100,
Yicong Yang <yangyicong@huawei.com> wrote:
> 
> On 2024/4/11 22:28, Will Deacon wrote:
> > On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
> >> From: Yicong Yang <yangyicong@hisilicon.com>
> >>
> >> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
> >> and we're actually enabling E0SPE and E2SPE in the driver. This means
> >> the data from EL0&1 translation regime of a VM will not be profiled.
> >> So this patch tries to add the support of profiling EL0 and EL1 of
> >> a VM. Users can filter data of different exception level by using
> >> the perf's exclude_* attributes. The exclude_* decision is referred
> >> to Documentation/arch/arm64/perf.rst and the implementation of
> >> arm_pmuv3.
> >>
> >> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> >> ---
> >>  drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
> >>  1 file changed, 30 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> >> index 09570d4d63cd..a647d625f359 100644
> >> --- a/drivers/perf/arm_spe_pmu.c
> >> +++ b/drivers/perf/arm_spe_pmu.c
> >> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
> >>  static void arm_spe_pmu_set_pmscr(struct perf_event *event)
> >>  {
> >>  	struct perf_event_attr *attr = &event->attr;
> >> -	u64 reg = 0;
> >> +	u64 pmscr_el1, pmscr_el12;
> >>  
> >> -	reg = arm_spe_event_to_pmscr(event);
> >> -	if (!attr->exclude_user)
> >> -		reg |= PMSCR_EL1x_E0SPE;
> >> +	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
> >> +
> >> +	/*
> >> +	 * Map the exclude_* descision to ELx according to
> >> +	 * Documentation/arch/arm64/perf.rst.
> >> +	 */
> >> +	if (is_kernel_in_hyp_mode()) {
> >> +		if (!attr->exclude_kernel && !attr->exclude_host)
> >> +			pmscr_el1 |= PMSCR_EL1x_E1SPE;
> >>  
> >> -	if (!attr->exclude_kernel)
> >> -		reg |= PMSCR_EL1x_E1SPE;
> >> +		if (!attr->exclude_kernel && !attr->exclude_guest)
> >> +			pmscr_el12 |= PMSCR_EL1x_E1SPE;
> >> +
> >> +		if (!attr->exclude_user && !attr->exclude_host) {
> >> +			pmscr_el1 |= PMSCR_EL1x_E0SPE;
> >> +			pmscr_el12 |= PMSCR_EL1x_E0SPE;
> >> +		}
> > 
> > Hmm, I don't understand this part. Doesn't this mean that setting
> > 'exclude_host' to true will also exclude userspace (EL0) profiling for
> > the guest?
> > 
> 
> I may misunderstand 'exclude_host' in the doc. Yes we won't include EL0 here
> in the driver but we should handle it on guest enter/exit, which is missed
> in this patch. Will see how to handle it properly on guest enter/exit and
> respin a v3.

Why should you handle this on guest entry/exit? It should be enough to
deal with PMSCR_EL12 at the point where the vcpu is scheduled, and
not on the fast path (i.e. it should get hooked into load/put). The
PMU does something vaguely similar.
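
As a rough illustration of the cost difference, here is a user-space
simulation that counts register writes under the two strategies. It is not
KVM code: every name is hypothetical, and it assumes the guest value is
written when the vcpu is loaded and cleared to 0 when it is put.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of PMSCR_EL12 plus a write counter. */
struct mock_sim {
	uint64_t pmscr_el12;
	unsigned long writes;
};

/* Record one simulated system register write. */
void mock_write_el12(struct mock_sim *s, uint64_t val)
{
	assert(s);
	s->pmscr_el12 = val;
	s->writes++;
}

/* Strategy A: switch the register on every guest entry and exit. */
unsigned long mock_switch_on_entry_exit(struct mock_sim *s, uint64_t guest_val,
					unsigned int entries)
{
	unsigned int i;

	s->writes = 0;
	for (i = 0; i < entries; i++) {
		mock_write_el12(s, guest_val);	/* guest entry */
		mock_write_el12(s, 0);		/* guest exit */
	}
	return s->writes;
}

/* Strategy B: switch the register only when the vcpu is loaded or put. */
unsigned long mock_switch_on_load_put(struct mock_sim *s, uint64_t guest_val,
				      unsigned int entries)
{
	(void)entries;			/* entries and exits need no writes */
	s->writes = 0;
	mock_write_el12(s, guest_val);	/* vcpu load */
	mock_write_el12(s, 0);		/* vcpu put */
	return s->writes;
}
```

For one scheduling period with 1000 guest entries, strategy A performs 2000
writes in this model while strategy B performs 2, regardless of how often the
guest exits.
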

Another thing is that it shouldn't get in the way of a future SPE
support for the guest itself.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
  2024-04-12  9:59       ` Marc Zyngier
@ 2024-04-12 10:12         ` Yicong Yang
  0 siblings, 0 replies; 8+ messages in thread
From: Yicong Yang @ 2024-04-12 10:12 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: yangyicong, Will Deacon, mark.rutland, catalin.marinas, broonie,
	james.morse, anshuman.khandual, linux-arm-kernel,
	jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
	fanghao11, linuxarm

On 2024/4/12 17:59, Marc Zyngier wrote:
> On Fri, 12 Apr 2024 10:22:28 +0100,
> Yicong Yang <yangyicong@huawei.com> wrote:
>>
>> On 2024/4/11 22:28, Will Deacon wrote:
>>> On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
>>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>>
>>>> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
>>>> and we're actually enabling E0SPE and E2SPE in the driver. This means
>>>> the data from EL0&1 translation regime of a VM will not be profiled.
>>>> So this patch tries to add the support of profiling EL0 and EL1 of
>>>> a VM. Users can filter data of different exception level by using
>>>> the perf's exclude_* attributes. The exclude_* decision is referred
>>>> to Documentation/arch/arm64/perf.rst and the implementation of
>>>> arm_pmuv3.
>>>>
>>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>>> ---
>>>>  drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
>>>>  1 file changed, 30 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>>>> index 09570d4d63cd..a647d625f359 100644
>>>> --- a/drivers/perf/arm_spe_pmu.c
>>>> +++ b/drivers/perf/arm_spe_pmu.c
>>>> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>>>>  static void arm_spe_pmu_set_pmscr(struct perf_event *event)
>>>>  {
>>>>  	struct perf_event_attr *attr = &event->attr;
>>>> -	u64 reg = 0;
>>>> +	u64 pmscr_el1, pmscr_el12;
>>>>  
>>>> -	reg = arm_spe_event_to_pmscr(event);
>>>> -	if (!attr->exclude_user)
>>>> -		reg |= PMSCR_EL1x_E0SPE;
>>>> +	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
>>>> +
>>>> +	/*
>>>> +	 * Map the exclude_* descision to ELx according to
>>>> +	 * Documentation/arch/arm64/perf.rst.
>>>> +	 */
>>>> +	if (is_kernel_in_hyp_mode()) {
>>>> +		if (!attr->exclude_kernel && !attr->exclude_host)
>>>> +			pmscr_el1 |= PMSCR_EL1x_E1SPE;
>>>>  
>>>> -	if (!attr->exclude_kernel)
>>>> -		reg |= PMSCR_EL1x_E1SPE;
>>>> +		if (!attr->exclude_kernel && !attr->exclude_guest)
>>>> +			pmscr_el12 |= PMSCR_EL1x_E1SPE;
>>>> +
>>>> +		if (!attr->exclude_user && !attr->exclude_host) {
>>>> +			pmscr_el1 |= PMSCR_EL1x_E0SPE;
>>>> +			pmscr_el12 |= PMSCR_EL1x_E0SPE;
>>>> +		}
>>>
>>> Hmm, I don't understand this part. Doesn't this mean that setting
>>> 'exclude_host' to true will also exclude userspace (EL0) profiling for
>>> the guest?
>>>
>>
>> I may misunderstand 'exclude_host' in the doc. Yes we won't include EL0 here
>> in the driver but we should handle it on guest enter/exit, which is missed
>> in this patch. Will see how to handle it properly on guest enter/exit and
>> respin a v3.
> 
> Why should you handle this on guest entry/exit? It should be enough to
> deal with PMCSCR_EL12 at the point where the vcpu is scheduled, and
> not on the fast path (i.e. it should get hooked into load/put). The
> PMU does something vaguely similar.
> 

Yes. I meant in kvm_vcpu_pmu_restore_{guest,host}(), which also handle the
PMU counters. I think we mean the same place.

> Another thing is that it shouldn't get in the way of a future SPE
> support for the guest itself.
> 
> Thanks,
> 
> 	M.
> 
