* [PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE
@ 2023-11-30 7:46 Yicong Yang
2023-11-30 7:46 ` [PATCH v2 1/3] arm64/sysreg: Add PMSCR_EL12 and factor out the common fields Yicong Yang
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Yicong Yang @ 2023-11-30 7:46 UTC (permalink / raw)
To: will, mark.rutland, catalin.marinas, broonie, james.morse,
anshuman.khandual, linux-arm-kernel
Cc: jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
fanghao11, yangyicong, linuxarm
From: Yicong Yang <yangyicong@hisilicon.com>
When using SPE on a VHE-enabled host, the operation of a VM is not
profiled at all. This is because the driver uses
PMSCR_EL1.{E1SPE, E0SPE} to enable profiling. On a VHE-enabled host,
these accesses are redirected, so we are actually setting
PMSCR_EL2.{E2SPE, E0SPE}. This enables profiling of EL2 and EL0 in the
EL2&0 translation regime. However, the VM uses the EL1&0 translation
regime, so it is not profiled. We can enable profiling of the EL1&0
translation regime by setting PMSCR_EL12.{E1SPE, E0SPE} from EL2.
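The register redirection described above can be modelled in plain C. This is a minimal user-space sketch, not driver code: `struct spe_model`, `resolve()` and `model_write()` are hypothetical names invented for illustration, and the model assumes only what the text states (with VHE, the PMSCR_EL1 name reaches PMSCR_EL2 and the PMSCR_EL12 name reaches the real PMSCR_EL1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_E0SPE (1u << 0) /* bit 0: E0SPE in both PMSCR_EL1 and PMSCR_EL2 */
#define SKETCH_ExSPE (1u << 1) /* bit 1: E1SPE in PMSCR_EL1, E2SPE in PMSCR_EL2 */

/* Register names as software at EL2 would spell them (hypothetical enum). */
enum spe_reg_name { NAME_PMSCR_EL1, NAME_PMSCR_EL12 };

struct spe_model {
	bool vhe;           /* kernel runs at EL2 with VHE enabled */
	uint64_t pmscr_el1; /* controls the EL1&0 regime (the guest) */
	uint64_t pmscr_el2; /* controls the EL2&0 regime (the VHE host) */
};

/* Return the storage a given register name actually reaches. */
static uint64_t *resolve(struct spe_model *m, enum spe_reg_name name)
{
	if (name == NAME_PMSCR_EL12) {
		assert(m->vhe); /* the _EL12 alias is only usable with VHE */
		return &m->pmscr_el1;
	}
	/* With VHE, the _EL1 name is redirected to the _EL2 register. */
	return m->vhe ? &m->pmscr_el2 : &m->pmscr_el1;
}

static void model_write(struct spe_model *m, enum spe_reg_name name,
			uint64_t val)
{
	*resolve(m, name) = val;
}
```

In this model, a VHE host writing both enable bits through the PMSCR_EL1 name leaves `pmscr_el1` at zero, which is why the guest's EL1&0 regime stays unprofiled until PMSCR_EL12 is written as well.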
This series adds support for this by:
- Adding the sysreg definition of PMSCR_EL12
- Refactoring the code to allow extension
- Enabling the profiling of EL1&0 and completing perf's exclude_* options
Tests have been done on VHE and non-VHE hosts with
`perf record -e {arm_spe_0//}:u|k|h|G|h`
and the results are as expected.
A sample from EL0 in the VM looks like this (generated by `perf report -D`):
. 000003b8: b0 48 eb 12 ab ff ff 00 80 PC 0xffffab12eb48 el0 ns=1
. 000003c1: 99 07 00 LAT 7 ISSUE
. 000003c4: 98 08 00 LAT 8 TOT
. 000003c7: 62 42 00 00 00 EV RETIRED NOT-TAKEN
. 000003cc: 4a 01 B COND
. 000003ce: 00 00 PAD
. 000003d0: b1 4c eb 12 ab ff ff 00 80 TGT 0xffffab12eb4c el0 ns=1
. 000003d9: 00 00 00 PAD
. 000003dc: 64 50 01 00 00 CONTEXT 0x150 el1
. 000003e1: 65 3c 11 00 00 CONTEXT 0x113c el2
. 000003e6: 00 00 00 00 00 PAD
. 000003eb: 71 aa aa 00 fd 06 00 00 00 TS 30014483114
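The raw bytes in the dump above can be decoded by hand. The sketch below assumes the usual SPE packet layout as I understand it from the perf decoder: the first byte is the packet header, the payload is little-endian, and for an instruction address the top payload byte carries NS in bit 63 and the exception level in bits 62:61, with the address in bits 55:0. The helper names `spe_le64()` and `spe_decode_insn_addr()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct spe_addr {
	uint64_t addr; /* bits 55:0 of the payload */
	unsigned el;   /* bits 62:61: exception level */
	unsigned ns;   /* bit 63: non-secure state */
};

/* Assemble an 8-byte little-endian payload into a 64-bit value. */
static uint64_t spe_le64(const uint8_t *p)
{
	uint64_t v = 0;

	assert(p);
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Split an instruction address payload (PC or TGT) into its fields. */
static struct spe_addr spe_decode_insn_addr(const uint8_t *payload)
{
	uint64_t v = spe_le64(payload);
	struct spe_addr a = {
		.addr = v & ((1ULL << 56) - 1),
		.el = (unsigned)((v >> 61) & 0x3),
		.ns = (unsigned)(v >> 63),
	};

	return a;
}
```

Feeding it the payload of the PC packet (`48 eb 12 ab ff ff 00 80`, i.e. the bytes after the `b0` header) gives the address, EL and NS values shown in the dump, and `spe_le64()` on the TS payload gives the printed timestamp.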
Changes since v1:
- Add Mark's tag and order PMSCR_EL12 by address as suggested. For now, at
least PMSCR_EL12 keeps its order among its *_EL12 siblings in sysreg.
Link: https://lore.kernel.org/linux-arm-kernel/20231122084602.53914-1-yangyicong@huawei.com/
Yicong Yang (3):
arm64/sysreg: Add PMSCR_EL12 and factor out the common fields
perf: arm_spe: Factor out PMSCR set/clear operations
perf: arm_spe: Enable the profiling of EL0&1 translation regime
arch/arm64/tools/sysreg | 10 ++++-
drivers/perf/arm_spe_pmu.c | 76 ++++++++++++++++++++++++++++----------
2 files changed, 65 insertions(+), 21 deletions(-)
--
2.24.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/3] arm64/sysreg: Add PMSCR_EL12 and factor out the common fields
2023-11-30 7:46 [PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE Yicong Yang
@ 2023-11-30 7:46 ` Yicong Yang
2023-11-30 7:46 ` [PATCH v2 2/3] perf: arm_spe: Factor out PMSCR set/clear operations Yicong Yang
2023-11-30 7:46 ` [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime Yicong Yang
2 siblings, 0 replies; 8+ messages in thread
From: Yicong Yang @ 2023-11-30 7:46 UTC (permalink / raw)
To: will, mark.rutland, catalin.marinas, broonie, james.morse,
anshuman.khandual, linux-arm-kernel
Cc: jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
fanghao11, yangyicong, linuxarm
From: Yicong Yang <yangyicong@hisilicon.com>
Add PMSCR_EL12 for accessing PMSCR_EL1 from EL2. Since PMSCR_EL12
and PMSCR_EL1 share the same field definitions, define a common
PMSCR_EL1x for both. Update the field names used in the driver
accordingly.
Try to keep PMSCR_EL12 ordered by address among its *_EL12
siblings in the sysreg file.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
arch/arm64/tools/sysreg | 10 +++++++++-
drivers/perf/arm_spe_pmu.c | 20 ++++++++++----------
2 files changed, 19 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 96cbeeab4eec..b55544f721ec 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1800,7 +1800,7 @@ Sysreg FAR_EL1 3 0 6 0 0
Field 63:0 ADDR
EndSysreg
-Sysreg PMSCR_EL1 3 0 9 9 0
+SysregFields PMSCR_EL1x
Res0 63:8
Field 7:6 PCT
Field 5 TS
@@ -1809,6 +1809,10 @@ Field 3 CX
Res0 2
Field 1 E1SPE
Field 0 E0SPE
+EndSysregFields
+
+Sysreg PMSCR_EL1 3 0 9 9 0
+Fields PMSCR_EL1x
EndSysreg
Sysreg PMSNEVFR_EL1 3 0 9 9 1
@@ -2411,6 +2415,10 @@ Sysreg FAR_EL12 3 5 6 0 0
Field 63:0 ADDR
EndSysreg
+Sysreg PMSCR_EL12 3 5 9 9 0
+Fields PMSCR_EL1x
+EndSysreg
+
Sysreg CONTEXTIDR_EL12 3 5 13 0 1
Fields CONTEXTIDR_ELx
EndSysreg
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index d2b0cbf0e0c4..05647cfff61d 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -172,13 +172,13 @@ static const struct attribute_group arm_spe_pmu_cap_group = {
};
/* User ABI */
-#define ATTR_CFG_FLD_ts_enable_CFG config /* PMSCR_EL1.TS */
+#define ATTR_CFG_FLD_ts_enable_CFG config /* PMSCR_EL1x.TS */
#define ATTR_CFG_FLD_ts_enable_LO 0
#define ATTR_CFG_FLD_ts_enable_HI 0
-#define ATTR_CFG_FLD_pa_enable_CFG config /* PMSCR_EL1.PA */
+#define ATTR_CFG_FLD_pa_enable_CFG config /* PMSCR_EL1x.PA */
#define ATTR_CFG_FLD_pa_enable_LO 1
#define ATTR_CFG_FLD_pa_enable_HI 1
-#define ATTR_CFG_FLD_pct_enable_CFG config /* PMSCR_EL1.PCT */
+#define ATTR_CFG_FLD_pct_enable_CFG config /* PMSCR_EL1x.PCT */
#define ATTR_CFG_FLD_pct_enable_LO 2
#define ATTR_CFG_FLD_pct_enable_HI 2
#define ATTR_CFG_FLD_jitter_CFG config /* PMSIRR_EL1.RND */
@@ -303,18 +303,18 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
struct perf_event_attr *attr = &event->attr;
u64 reg = 0;
- reg |= FIELD_PREP(PMSCR_EL1_TS, ATTR_CFG_GET_FLD(attr, ts_enable));
- reg |= FIELD_PREP(PMSCR_EL1_PA, ATTR_CFG_GET_FLD(attr, pa_enable));
- reg |= FIELD_PREP(PMSCR_EL1_PCT, ATTR_CFG_GET_FLD(attr, pct_enable));
+ reg |= FIELD_PREP(PMSCR_EL1x_TS, ATTR_CFG_GET_FLD(attr, ts_enable));
+ reg |= FIELD_PREP(PMSCR_EL1x_PA, ATTR_CFG_GET_FLD(attr, pa_enable));
+ reg |= FIELD_PREP(PMSCR_EL1x_PCT, ATTR_CFG_GET_FLD(attr, pct_enable));
if (!attr->exclude_user)
- reg |= PMSCR_EL1_E0SPE;
+ reg |= PMSCR_EL1x_E0SPE;
if (!attr->exclude_kernel)
- reg |= PMSCR_EL1_E1SPE;
+ reg |= PMSCR_EL1x_E1SPE;
if (get_spe_event_has_cx(event))
- reg |= PMSCR_EL1_CX;
+ reg |= PMSCR_EL1x_CX;
return reg;
}
@@ -768,7 +768,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
set_spe_event_has_cx(event);
reg = arm_spe_event_to_pmscr(event);
if (!perfmon_capable() &&
- (reg & (PMSCR_EL1_PA | PMSCR_EL1_PCT)))
+ (reg & (PMSCR_EL1x_PA | PMSCR_EL1x_PCT)))
return -EACCES;
return 0;
--
2.24.0
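As a rough illustration of what the two `Sysreg` lines in this patch encode, the sketch below packs the five numbers (op0, op1, CRn, CRm, op2) the way the arm64 `sys_reg()` macro does, using shifts of 19, 16, 12, 8 and 5. The function name `sketch_sys_reg()` and the `SKETCH_*` masks are hypothetical; the masks mirror only the fields visible in the hunks above (PA is not shown in the diff context, so it is left out here).

```c
#include <assert.h>
#include <stdint.h>

/* Pack a system register encoding from its five components. */
static uint32_t sketch_sys_reg(uint32_t op0, uint32_t op1, uint32_t crn,
			       uint32_t crm, uint32_t op2)
{
	assert(op0 < 4 && op1 < 8 && crn < 16 && crm < 16 && op2 < 8);
	return (op0 << 19) | (op1 << 16) | (crn << 12) | (crm << 8) |
	       (op2 << 5);
}

/* Field masks shared by PMSCR_EL1 and PMSCR_EL12 (the PMSCR_EL1x layout). */
#define SKETCH_PMSCR_EL1x_E0SPE (1ULL << 0) /* Field 0 E0SPE */
#define SKETCH_PMSCR_EL1x_E1SPE (1ULL << 1) /* Field 1 E1SPE */
#define SKETCH_PMSCR_EL1x_CX    (1ULL << 3) /* Field 3 CX */
#define SKETCH_PMSCR_EL1x_TS    (1ULL << 5) /* Field 5 TS */
#define SKETCH_PMSCR_EL1x_PCT   (3ULL << 6) /* Field 7:6 PCT */
```

With `3 0 9 9 0` for PMSCR_EL1 and `3 5 9 9 0` for PMSCR_EL12, the two encodings differ only in op1, which is why one `SysregFields` block can describe both registers.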
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/3] perf: arm_spe: Factor out PMSCR set/clear operations
2023-11-30 7:46 [PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE Yicong Yang
2023-11-30 7:46 ` [PATCH v2 1/3] arm64/sysreg: Add PMSCR_EL12 and factor out the common fields Yicong Yang
@ 2023-11-30 7:46 ` Yicong Yang
2023-11-30 7:46 ` [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime Yicong Yang
2 siblings, 0 replies; 8+ messages in thread
From: Yicong Yang @ 2023-11-30 7:46 UTC (permalink / raw)
To: will, mark.rutland, catalin.marinas, broonie, james.morse,
anshuman.khandual, linux-arm-kernel
Cc: jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
fanghao11, yangyicong, linuxarm
From: Yicong Yang <yangyicong@hisilicon.com>
Currently we convert the user settings to a PMSCR config in
arm_spe_event_to_pmscr() and set/clear the PMSCR register
separately. This blocks further extension for filtering by
exception level. So factor out the PMSCR set/clear operations
into separate functions and only configure the ELx filtering
when setting the register.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
drivers/perf/arm_spe_pmu.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 05647cfff61d..09570d4d63cd 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -297,7 +297,7 @@ static const struct attribute_group *arm_spe_pmu_attr_groups[] = {
NULL,
};
-/* Convert between user ABI and register values */
+/* Convert between user ABI and register values, except the exception control */
static u64 arm_spe_event_to_pmscr(struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
@@ -307,16 +307,32 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
reg |= FIELD_PREP(PMSCR_EL1x_PA, ATTR_CFG_GET_FLD(attr, pa_enable));
reg |= FIELD_PREP(PMSCR_EL1x_PCT, ATTR_CFG_GET_FLD(attr, pct_enable));
+ if (get_spe_event_has_cx(event))
+ reg |= PMSCR_EL1x_CX;
+
+ return reg;
+}
+
+static void arm_spe_pmu_set_pmscr(struct perf_event *event)
+{
+ struct perf_event_attr *attr = &event->attr;
+ u64 reg = 0;
+
+ reg = arm_spe_event_to_pmscr(event);
if (!attr->exclude_user)
reg |= PMSCR_EL1x_E0SPE;
if (!attr->exclude_kernel)
reg |= PMSCR_EL1x_E1SPE;
- if (get_spe_event_has_cx(event))
- reg |= PMSCR_EL1x_CX;
+ isb();
+ write_sysreg_s(reg, SYS_PMSCR_EL1);
+}
- return reg;
+static void arm_spe_pmu_clr_pmscr(void)
+{
+ write_sysreg_s(0, SYS_PMSCR_EL1);
+ isb();
}
static void arm_spe_event_sanitise_period(struct perf_event *event)
@@ -566,8 +582,7 @@ static void arm_spe_perf_aux_output_end(struct perf_output_handle *handle)
static void arm_spe_pmu_disable_and_drain_local(void)
{
/* Disable profiling at EL0 and EL1 */
- write_sysreg_s(0, SYS_PMSCR_EL1);
- isb();
+ arm_spe_pmu_clr_pmscr();
/* Drain any buffered data */
psb_csync();
@@ -808,9 +823,7 @@ static void arm_spe_pmu_start(struct perf_event *event, int flags)
write_sysreg_s(reg, SYS_PMSICR_EL1);
}
- reg = arm_spe_event_to_pmscr(event);
- isb();
- write_sysreg_s(reg, SYS_PMSCR_EL1);
+ arm_spe_pmu_set_pmscr(event);
}
static void arm_spe_pmu_stop(struct perf_event *event, int flags)
--
2.24.0
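The barrier ordering that the two new helpers preserve can be checked with a small mock. This is a user-space sketch under stated assumptions: `struct op_log`, `log_op()`, `sketch_set_pmscr()` and `sketch_clr_pmscr()` are hypothetical stand-ins that only record the sequence of `isb()` and `write_sysreg_s()` calls; they do not touch hardware. The comments on why each barrier sits where it does are my reading, not something the patch states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum op { OP_ISB, OP_WRITE_PMSCR };

/* Records the order of operations instead of executing them. */
struct op_log {
	enum op ops[8];
	uint64_t vals[8];
	size_t n;
};

static void log_op(struct op_log *l, enum op op, uint64_t val)
{
	assert(l->n < 8);
	l->ops[l->n] = op;
	l->vals[l->n] = val;
	l->n++;
}

/* Mirrors arm_spe_pmu_set_pmscr(): barrier first, then the enable write. */
static void sketch_set_pmscr(struct op_log *l, uint64_t reg)
{
	log_op(l, OP_ISB, 0); /* presumably orders earlier SPE setup writes */
	log_op(l, OP_WRITE_PMSCR, reg);
}

/* Mirrors arm_spe_pmu_clr_pmscr(): disable write first, then the barrier. */
static void sketch_clr_pmscr(struct op_log *l)
{
	log_op(l, OP_WRITE_PMSCR, 0);
	log_op(l, OP_ISB, 0); /* presumably ensures profiling is off before draining */
}
```

The point of the refactor is that both sequences now live in one place each, so a later patch can add a second register write inside them without touching the callers.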
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
2023-11-30 7:46 [PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE Yicong Yang
2023-11-30 7:46 ` [PATCH v2 1/3] arm64/sysreg: Add PMSCR_EL12 and factor out the common fields Yicong Yang
2023-11-30 7:46 ` [PATCH v2 2/3] perf: arm_spe: Factor out PMSCR set/clear operations Yicong Yang
@ 2023-11-30 7:46 ` Yicong Yang
2024-04-11 14:28 ` Will Deacon
2 siblings, 1 reply; 8+ messages in thread
From: Yicong Yang @ 2023-11-30 7:46 UTC (permalink / raw)
To: will, mark.rutland, catalin.marinas, broonie, james.morse,
anshuman.khandual, linux-arm-kernel
Cc: jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
fanghao11, yangyicong, linuxarm
From: Yicong Yang <yangyicong@hisilicon.com>
On a VHE-enabled host, accesses to PMSCR_EL1 are redirected to PMSCR_EL2,
so the driver is actually enabling E0SPE and E2SPE. This means
data from the EL0&1 translation regime of a VM is not profiled.
This patch adds support for profiling EL0 and EL1 of
a VM. Users can filter data from different exception levels by using
perf's exclude_* attributes. The mapping of the exclude_* attributes
follows Documentation/arch/arm64/perf.rst and the implementation of
arm_pmuv3.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 09570d4d63cd..a647d625f359 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
static void arm_spe_pmu_set_pmscr(struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
- u64 reg = 0;
+ u64 pmscr_el1, pmscr_el12;
- reg = arm_spe_event_to_pmscr(event);
- if (!attr->exclude_user)
- reg |= PMSCR_EL1x_E0SPE;
+ pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
+
+ /*
+ * Map the exclude_* decision to ELx according to
+ * Documentation/arch/arm64/perf.rst.
+ */
+ if (is_kernel_in_hyp_mode()) {
+ if (!attr->exclude_kernel && !attr->exclude_host)
+ pmscr_el1 |= PMSCR_EL1x_E1SPE;
- if (!attr->exclude_kernel)
- reg |= PMSCR_EL1x_E1SPE;
+ if (!attr->exclude_kernel && !attr->exclude_guest)
+ pmscr_el12 |= PMSCR_EL1x_E1SPE;
+
+ if (!attr->exclude_user && !attr->exclude_host) {
+ pmscr_el1 |= PMSCR_EL1x_E0SPE;
+ pmscr_el12 |= PMSCR_EL1x_E0SPE;
+ }
+ } else {
+ if (!attr->exclude_kernel)
+ pmscr_el1 |= PMSCR_EL1x_E1SPE;
+
+ if (!attr->exclude_user)
+ pmscr_el1 |= PMSCR_EL1x_E0SPE;
+ }
isb();
- write_sysreg_s(reg, SYS_PMSCR_EL1);
+ write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
+ if (is_kernel_in_hyp_mode())
+ write_sysreg_s(pmscr_el12, SYS_PMSCR_EL12);
}
static void arm_spe_pmu_clr_pmscr(void)
{
+ if (is_kernel_in_hyp_mode())
+ write_sysreg_s(0, SYS_PMSCR_EL12);
+
write_sysreg_s(0, SYS_PMSCR_EL1);
isb();
}
--
2.24.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
2023-11-30 7:46 ` [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime Yicong Yang
@ 2024-04-11 14:28 ` Will Deacon
2024-04-12 9:22 ` Yicong Yang
0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2024-04-11 14:28 UTC (permalink / raw)
To: Yicong Yang
Cc: mark.rutland, catalin.marinas, broonie, james.morse,
anshuman.khandual, linux-arm-kernel, jonathan.cameron,
shameerali.kolothum.thodi, prime.zeng, fanghao11, yangyicong,
linuxarm, maz
On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
> and we're actually enabling E0SPE and E2SPE in the driver. This means
> the data from EL0&1 translation regime of a VM will not be profiled.
> So this patch tries to add the support of profiling EL0 and EL1 of
> a VM. Users can filter data of different exception level by using
> the perf's exclude_* attributes. The exclude_* decision is referred
> to Documentation/arch/arm64/perf.rst and the implementation of
> arm_pmuv3.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
> drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
> 1 file changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index 09570d4d63cd..a647d625f359 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
> static void arm_spe_pmu_set_pmscr(struct perf_event *event)
> {
> struct perf_event_attr *attr = &event->attr;
> - u64 reg = 0;
> + u64 pmscr_el1, pmscr_el12;
>
> - reg = arm_spe_event_to_pmscr(event);
> - if (!attr->exclude_user)
> - reg |= PMSCR_EL1x_E0SPE;
> + pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
> +
> + /*
> + * Map the exclude_* descision to ELx according to
> + * Documentation/arch/arm64/perf.rst.
> + */
> + if (is_kernel_in_hyp_mode()) {
> + if (!attr->exclude_kernel && !attr->exclude_host)
> + pmscr_el1 |= PMSCR_EL1x_E1SPE;
>
> - if (!attr->exclude_kernel)
> - reg |= PMSCR_EL1x_E1SPE;
> + if (!attr->exclude_kernel && !attr->exclude_guest)
> + pmscr_el12 |= PMSCR_EL1x_E1SPE;
> +
> + if (!attr->exclude_user && !attr->exclude_host) {
> + pmscr_el1 |= PMSCR_EL1x_E0SPE;
> + pmscr_el12 |= PMSCR_EL1x_E0SPE;
> + }
Hmm, I don't understand this part. Doesn't this mean that setting
'exclude_host' to true will also exclude userspace (EL0) profiling for
the guest?
Will
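The concern raised above can be reproduced with a pure-function copy of the mapping from the quoted hunk. This is a sketch only: `struct sk_attr`, `struct sk_pmscr` and `sk_map_v2()` are hypothetical names, and the function mirrors the v2 logic as posted rather than any corrected version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_E0SPE (1ULL << 0)
#define SK_E1SPE (1ULL << 1)

/* The four perf_event_attr bits the v2 patch looks at. */
struct sk_attr {
	bool exclude_user;
	bool exclude_kernel;
	bool exclude_host;
	bool exclude_guest;
};

/* Enable bits destined for PMSCR_EL1 and PMSCR_EL12 respectively. */
struct sk_pmscr {
	uint64_t el1;
	uint64_t el12;
};

/* Same decisions as arm_spe_pmu_set_pmscr() in v2 of this patch. */
static struct sk_pmscr sk_map_v2(const struct sk_attr *a, bool vhe)
{
	struct sk_pmscr r = { 0, 0 };

	assert(a);
	if (vhe) {
		if (!a->exclude_kernel && !a->exclude_host)
			r.el1 |= SK_E1SPE;
		if (!a->exclude_kernel && !a->exclude_guest)
			r.el12 |= SK_E1SPE;
		if (!a->exclude_user && !a->exclude_host) {
			r.el1 |= SK_E0SPE;
			r.el12 |= SK_E0SPE; /* guest EL0 gated on exclude_host */
		}
	} else {
		if (!a->exclude_kernel)
			r.el1 |= SK_E1SPE;
		if (!a->exclude_user)
			r.el1 |= SK_E0SPE;
	}
	return r;
}
```

With only `exclude_host` set on a VHE host, the result has E1SPE in the EL12 value but not E0SPE, so guest userspace is dropped along with the host, which is the behaviour being questioned.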
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
2024-04-11 14:28 ` Will Deacon
@ 2024-04-12 9:22 ` Yicong Yang
2024-04-12 9:59 ` Marc Zyngier
0 siblings, 1 reply; 8+ messages in thread
From: Yicong Yang @ 2024-04-12 9:22 UTC (permalink / raw)
To: Will Deacon
Cc: yangyicong, mark.rutland, catalin.marinas, broonie, james.morse,
anshuman.khandual, linux-arm-kernel, jonathan.cameron,
shameerali.kolothum.thodi, prime.zeng, fanghao11, linuxarm, maz
On 2024/4/11 22:28, Will Deacon wrote:
> On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
>> and we're actually enabling E0SPE and E2SPE in the driver. This means
>> the data from EL0&1 translation regime of a VM will not be profiled.
>> So this patch tries to add the support of profiling EL0 and EL1 of
>> a VM. Users can filter data of different exception level by using
>> the perf's exclude_* attributes. The exclude_* decision is referred
>> to Documentation/arch/arm64/perf.rst and the implementation of
>> arm_pmuv3.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>> drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
>> 1 file changed, 30 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>> index 09570d4d63cd..a647d625f359 100644
>> --- a/drivers/perf/arm_spe_pmu.c
>> +++ b/drivers/perf/arm_spe_pmu.c
>> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>> static void arm_spe_pmu_set_pmscr(struct perf_event *event)
>> {
>> struct perf_event_attr *attr = &event->attr;
>> - u64 reg = 0;
>> + u64 pmscr_el1, pmscr_el12;
>>
>> - reg = arm_spe_event_to_pmscr(event);
>> - if (!attr->exclude_user)
>> - reg |= PMSCR_EL1x_E0SPE;
>> + pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
>> +
>> + /*
>> + * Map the exclude_* descision to ELx according to
>> + * Documentation/arch/arm64/perf.rst.
>> + */
>> + if (is_kernel_in_hyp_mode()) {
>> + if (!attr->exclude_kernel && !attr->exclude_host)
>> + pmscr_el1 |= PMSCR_EL1x_E1SPE;
>>
>> - if (!attr->exclude_kernel)
>> - reg |= PMSCR_EL1x_E1SPE;
>> + if (!attr->exclude_kernel && !attr->exclude_guest)
>> + pmscr_el12 |= PMSCR_EL1x_E1SPE;
>> +
>> + if (!attr->exclude_user && !attr->exclude_host) {
>> + pmscr_el1 |= PMSCR_EL1x_E0SPE;
>> + pmscr_el12 |= PMSCR_EL1x_E0SPE;
>> + }
>
> Hmm, I don't understand this part. Doesn't this mean that setting
> 'exclude_host' to true will also exclude userspace (EL0) profiling for
> the guest?
>
I may have misunderstood 'exclude_host' in the doc. Yes, we won't include EL0 here
in the driver, but we should handle it on guest entry/exit, which is missing
from this patch. I will look at how to handle it properly on guest entry/exit and
respin a v3.
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
2024-04-12 9:22 ` Yicong Yang
@ 2024-04-12 9:59 ` Marc Zyngier
2024-04-12 10:12 ` Yicong Yang
0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2024-04-12 9:59 UTC (permalink / raw)
To: Yicong Yang
Cc: Will Deacon, yangyicong, mark.rutland, catalin.marinas, broonie,
james.morse, anshuman.khandual, linux-arm-kernel,
jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
fanghao11, linuxarm
On Fri, 12 Apr 2024 10:22:28 +0100,
Yicong Yang <yangyicong@huawei.com> wrote:
>
> On 2024/4/11 22:28, Will Deacon wrote:
> > On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
> >> From: Yicong Yang <yangyicong@hisilicon.com>
> >>
> >> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
> >> and we're actually enabling E0SPE and E2SPE in the driver. This means
> >> the data from EL0&1 translation regime of a VM will not be profiled.
> >> So this patch tries to add the support of profiling EL0 and EL1 of
> >> a VM. Users can filter data of different exception level by using
> >> the perf's exclude_* attributes. The exclude_* decision is referred
> >> to Documentation/arch/arm64/perf.rst and the implementation of
> >> arm_pmuv3.
> >>
> >> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> >> ---
> >> drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
> >> 1 file changed, 30 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> >> index 09570d4d63cd..a647d625f359 100644
> >> --- a/drivers/perf/arm_spe_pmu.c
> >> +++ b/drivers/perf/arm_spe_pmu.c
> >> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
> >> static void arm_spe_pmu_set_pmscr(struct perf_event *event)
> >> {
> >> struct perf_event_attr *attr = &event->attr;
> >> - u64 reg = 0;
> >> + u64 pmscr_el1, pmscr_el12;
> >>
> >> - reg = arm_spe_event_to_pmscr(event);
> >> - if (!attr->exclude_user)
> >> - reg |= PMSCR_EL1x_E0SPE;
> >> + pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
> >> +
> >> + /*
> >> + * Map the exclude_* descision to ELx according to
> >> + * Documentation/arch/arm64/perf.rst.
> >> + */
> >> + if (is_kernel_in_hyp_mode()) {
> >> + if (!attr->exclude_kernel && !attr->exclude_host)
> >> + pmscr_el1 |= PMSCR_EL1x_E1SPE;
> >>
> >> - if (!attr->exclude_kernel)
> >> - reg |= PMSCR_EL1x_E1SPE;
> >> + if (!attr->exclude_kernel && !attr->exclude_guest)
> >> + pmscr_el12 |= PMSCR_EL1x_E1SPE;
> >> +
> >> + if (!attr->exclude_user && !attr->exclude_host) {
> >> + pmscr_el1 |= PMSCR_EL1x_E0SPE;
> >> + pmscr_el12 |= PMSCR_EL1x_E0SPE;
> >> + }
> >
> > Hmm, I don't understand this part. Doesn't this mean that setting
> > 'exclude_host' to true will also exclude userspace (EL0) profiling for
> > the guest?
> >
>
> I may misunderstand 'exclude_host' in the doc. Yes we won't include EL0 here
> in the driver but we should handle it on guest enter/exit, which is missed
> in this patch. Will see how to handle it properly on guest enter/exit and
> respin a v3.
Why should you handle this on guest entry/exit? It should be enough to
deal with PMSCR_EL12 at the point where the vcpu is scheduled, and
not on the fast path (i.e. it should get hooked into load/put). The
PMU does something vaguely similar.
Another thing is that it shouldn't get in the way of a future SPE
support for the guest itself.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
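The cost difference between the two placements discussed above can be shown with a toy counter. This is a loose sketch under heavy assumptions: `struct sk_vcpu`, `sk_vcpu_load()`, `sk_vcpu_put()` and `sk_run()` are hypothetical names and not KVM's actual hooks; the model only counts how often PMSCR_EL12 would be written when the switch is done per guest entry/exit versus once per load/put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sk_vcpu {
	uint64_t guest_pmscr;      /* value wanted while the vcpu is loaded */
	uint64_t pmscr_el12;       /* modelled register contents */
	unsigned long el12_writes; /* number of modelled register writes */
};

static void sk_write_el12(struct sk_vcpu *v, uint64_t val)
{
	assert(v);
	v->pmscr_el12 = val;
	v->el12_writes++;
}

/* Slow path: runs when the vcpu is scheduled in or out. */
static void sk_vcpu_load(struct sk_vcpu *v) { sk_write_el12(v, v->guest_pmscr); }
static void sk_vcpu_put(struct sk_vcpu *v) { sk_write_el12(v, 0); }

/* Run 'entries' guest entries with either placement of the switch. */
static void sk_run(struct sk_vcpu *v, unsigned entries, bool per_entry)
{
	if (!per_entry)
		sk_vcpu_load(v);
	for (unsigned i = 0; i < entries; i++) {
		if (per_entry)
			sk_write_el12(v, v->guest_pmscr); /* fast path: entry */
		/* the guest would execute here */
		if (per_entry)
			sk_write_el12(v, 0);              /* fast path: exit */
	}
	if (!per_entry)
		sk_vcpu_put(v);
}
```

For a vcpu that enters the guest many times between being scheduled in and out, the load/put placement performs two writes in total, while the per-entry placement performs two writes on every entry.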
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 3/3] perf: arm_spe: Enable the profiling of EL0&1 translation regime
2024-04-12 9:59 ` Marc Zyngier
@ 2024-04-12 10:12 ` Yicong Yang
0 siblings, 0 replies; 8+ messages in thread
From: Yicong Yang @ 2024-04-12 10:12 UTC (permalink / raw)
To: Marc Zyngier
Cc: yangyicong, Will Deacon, mark.rutland, catalin.marinas, broonie,
james.morse, anshuman.khandual, linux-arm-kernel,
jonathan.cameron, shameerali.kolothum.thodi, prime.zeng,
fanghao11, linuxarm
On 2024/4/12 17:59, Marc Zyngier wrote:
> On Fri, 12 Apr 2024 10:22:28 +0100,
> Yicong Yang <yangyicong@huawei.com> wrote:
>>
>> On 2024/4/11 22:28, Will Deacon wrote:
>>> On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
>>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>>
>>>> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
>>>> and we're actually enabling E0SPE and E2SPE in the driver. This means
>>>> the data from EL0&1 translation regime of a VM will not be profiled.
>>>> So this patch tries to add the support of profiling EL0 and EL1 of
>>>> a VM. Users can filter data of different exception level by using
>>>> the perf's exclude_* attributes. The exclude_* decision is referred
>>>> to Documentation/arch/arm64/perf.rst and the implementation of
>>>> arm_pmuv3.
>>>>
>>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>>> ---
>>>> drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
>>>> 1 file changed, 30 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>>>> index 09570d4d63cd..a647d625f359 100644
>>>> --- a/drivers/perf/arm_spe_pmu.c
>>>> +++ b/drivers/perf/arm_spe_pmu.c
>>>> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>>>> static void arm_spe_pmu_set_pmscr(struct perf_event *event)
>>>> {
>>>> struct perf_event_attr *attr = &event->attr;
>>>> - u64 reg = 0;
>>>> + u64 pmscr_el1, pmscr_el12;
>>>>
>>>> - reg = arm_spe_event_to_pmscr(event);
>>>> - if (!attr->exclude_user)
>>>> - reg |= PMSCR_EL1x_E0SPE;
>>>> + pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
>>>> +
>>>> + /*
>>>> + * Map the exclude_* descision to ELx according to
>>>> + * Documentation/arch/arm64/perf.rst.
>>>> + */
>>>> + if (is_kernel_in_hyp_mode()) {
>>>> + if (!attr->exclude_kernel && !attr->exclude_host)
>>>> + pmscr_el1 |= PMSCR_EL1x_E1SPE;
>>>>
>>>> - if (!attr->exclude_kernel)
>>>> - reg |= PMSCR_EL1x_E1SPE;
>>>> + if (!attr->exclude_kernel && !attr->exclude_guest)
>>>> + pmscr_el12 |= PMSCR_EL1x_E1SPE;
>>>> +
>>>> + if (!attr->exclude_user && !attr->exclude_host) {
>>>> + pmscr_el1 |= PMSCR_EL1x_E0SPE;
>>>> + pmscr_el12 |= PMSCR_EL1x_E0SPE;
>>>> + }
>>>
>>> Hmm, I don't understand this part. Doesn't this mean that setting
>>> 'exclude_host' to true will also exclude userspace (EL0) profiling for
>>> the guest?
>>>
>>
>> I may misunderstand 'exclude_host' in the doc. Yes we won't include EL0 here
>> in the driver but we should handle it on guest enter/exit, which is missed
>> in this patch. Will see how to handle it properly on guest enter/exit and
>> respin a v3.
>
> Why should you handle this on guest entry/exit? It should be enough to
> deal with PMCSCR_EL12 at the point where the vcpu is scheduled, and
> not on the fast path (i.e. it should get hooked into load/put). The
> PMU does something vaguely similar.
>
Yes. I meant kvm_vcpu_pmu_restore_{guest,host}(), which also handles the
PMU counters. I think we mean the same place.
> Another thing is that it shouldn't get in the way of a future SPE
> support for the guest itself.
>
> Thanks,
>
> M.
>
^ permalink raw reply [flat|nested] 8+ messages in thread