* [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-07-24 9:16 ` Wei Li
0 siblings, 0 replies; 44+ messages in thread
From: Wei Li @ 2020-07-24 9:16 UTC (permalink / raw)
To: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Catalin Marinas, James Clark, Jiri Olsa, Leo Yan, Mark Rutland,
Namhyung Kim, Suzuki K Poulose, Will Deacon, zhangshaokun
Cc: linux-arm-kernel, linux-kernel, Ingo Molnar, Peter Zijlstra,
guohanjun
Armv8.3 extends the SPE by adding:
- Alignment field in the Events packet, and filtering on this event
using PMSEVFR_EL1.
- Support for the Scalable Vector Extension (SVE).
The main additions for SVE are:
- Recording the vector length for SVE operations in the Operation Type
packet. It is not possible to filter on vector length.
- Incomplete predicate and empty predicate fields in the Events packet,
and filtering on these events using PMSEVFR_EL1.
Update the check of pmsevfr for empty/partial predicated SVE and
alignment event in kernel driver.
Signed-off-by: Wei Li <liwei391@huawei.com>
---
arch/arm64/include/asm/sysreg.h | 4 +++-
drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
2 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 463175f80341..be4c44ccdb56 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -281,7 +281,6 @@
#define SYS_PMSFCR_EL1_ST_SHIFT 18
#define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
-#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
#define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
#define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
@@ -769,6 +768,9 @@
#define ID_AA64DFR0_PMUVER_8_5 0x6
#define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
+#define ID_AA64DFR0_PMSVER_8_2 0x1
+#define ID_AA64DFR0_PMSVER_8_3 0x2
+
#define ID_DFR0_PERFMON_SHIFT 24
#define ID_DFR0_PERFMON_8_1 0x4
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index e51ddb6d63ed..5ec7ee0c8fa1 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -54,7 +54,7 @@ struct arm_spe_pmu {
struct hlist_node hotplug_node;
int irq; /* PPI */
-
+ int pmuver;
u16 min_period;
u16 counter_sz;
@@ -80,6 +80,15 @@ struct arm_spe_pmu {
/* Keep track of our dynamic hotplug state */
static enum cpuhp_state arm_spe_pmu_online;
+static u64 sys_pmsevfr_el1_mask[] = {
+ [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
+ GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
+ BIT_ULL(1),
+ [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
+ GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
+ BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
+};
+
enum arm_spe_pmu_buf_fault_action {
SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
SPE_PMU_BUF_FAULT_ACT_FATAL,
@@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
!cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
return -ENOENT;
- if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
+ if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
return -EOPNOTSUPP;
if (attr->exclude_idle)
@@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
fld, smp_processor_id());
return;
}
+ spe_pmu->pmuver = fld;
/* Read PMBIDR first to determine whether or not we have access */
reg = read_sysreg_s(SYS_PMBIDR_EL1);
@@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
}
dev_info(dev,
- "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
- cpumask_pr_args(&spe_pmu->supported_cpus),
+ "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
+ spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
spe_pmu->max_record_sz, spe_pmu->align, spe_pmu->features);
spe_pmu->features |= SPE_PMU_FEAT_DEV_PROBED;
--
2.17.1
^ permalink raw reply related [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-07-24 9:16 ` Wei Li
@ 2020-07-28 12:27 ` Leo Yan
-1 siblings, 0 replies; 44+ messages in thread
From: Leo Yan @ 2020-07-28 12:27 UTC (permalink / raw)
To: Wei Li
Cc: Mark Rutland, Will Deacon, Suzuki K Poulose, Alexander Shishkin,
Catalin Marinas, Adrian Hunter, Arnaldo Carvalho de Melo,
linux-kernel, zhangshaokun, Peter Zijlstra, Ingo Molnar,
James Clark, guohanjun, Namhyung Kim, Jiri Olsa, linux-arm-kernel
Hi Wei,
On Fri, Jul 24, 2020 at 05:16:04PM +0800, Wei Li wrote:
> Armv8.3 extends the SPE by adding:
> - Alignment field in the Events packet, and filtering on this event
> using PMSEVFR_EL1.
> - Support for the Scalable Vector Extension (SVE).
>
> The main additions for SVE are:
> - Recording the vector length for SVE operations in the Operation Type
> packet. It is not possible to filter on vector length.
> - Incomplete predicate and empty predicate fields in the Events packet,
> and filtering on these events using PMSEVFR_EL1.
>
> Update the check of pmsevfr for empty/partial predicated SVE and
> alignment event in kernel driver.
>
> Signed-off-by: Wei Li <liwei391@huawei.com>
> ---
> arch/arm64/include/asm/sysreg.h | 4 +++-
> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
> 2 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 463175f80341..be4c44ccdb56 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -281,7 +281,6 @@
> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>
> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>
> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
> @@ -769,6 +768,9 @@
> #define ID_AA64DFR0_PMUVER_8_5 0x6
> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>
> +#define ID_AA64DFR0_PMSVER_8_2 0x1
> +#define ID_AA64DFR0_PMSVER_8_3 0x2
> +
> #define ID_DFR0_PERFMON_SHIFT 24
>
> #define ID_DFR0_PERFMON_8_1 0x4
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index e51ddb6d63ed..5ec7ee0c8fa1 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> struct hlist_node hotplug_node;
>
> int irq; /* PPI */
> -
> + int pmuver;
Since the version number is only 4 bits width, 'u16' would be enough
to record SPE version number.
> u16 min_period;
> u16 counter_sz;
>
> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> /* Keep track of our dynamic hotplug state */
> static enum cpuhp_state arm_spe_pmu_online;
>
> +static u64 sys_pmsevfr_el1_mask[] = {
> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> + BIT_ULL(1),
> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> +};
Seems to me, the definitions for Aarch64 system registers should be
placed into the file 'arch/arm64/include/asm/sysreg.h'. Like below
two macros:
#define SYS_PMSEVFR_EL1_RES0_8_2 0x0000ffff00ff0f55UL
#define SYS_PMSEVFR_EL1_RES0_8_3 ...
Let's wait for Will or Mark Rutland's comments for this, in case I
mislead for this.
> +
> enum arm_spe_pmu_buf_fault_action {
> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> SPE_PMU_BUF_FAULT_ACT_FATAL,
> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> return -ENOENT;
>
> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> return -EOPNOTSUPP;
>
> if (attr->exclude_idle)
> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> fld, smp_processor_id());
> return;
> }
> + spe_pmu->pmuver = fld;
>
> /* Read PMBIDR first to determine whether or not we have access */
> reg = read_sysreg_s(SYS_PMBIDR_EL1);
> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
> }
>
> dev_info(dev,
> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> - cpumask_pr_args(&spe_pmu->supported_cpus),
> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
Let's output explict info, like:
"probed for CPUs %*pbl [pmuver %d, max_record_sz %u, align %u, features 0x%llx]\n",
Thanks,
Leo
> + spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
> spe_pmu->max_record_sz, spe_pmu->align, spe_pmu->features);
>
> spe_pmu->features |= SPE_PMU_FEAT_DEV_PROBED;
> --
> 2.17.1
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-07-28 12:27 ` Leo Yan
0 siblings, 0 replies; 44+ messages in thread
From: Leo Yan @ 2020-07-28 12:27 UTC (permalink / raw)
To: Wei Li
Cc: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Catalin Marinas, James Clark, Jiri Olsa, Mark Rutland,
Namhyung Kim, Suzuki K Poulose, Will Deacon, zhangshaokun,
linux-arm-kernel, linux-kernel, Ingo Molnar, Peter Zijlstra,
guohanjun
Hi Wei,
On Fri, Jul 24, 2020 at 05:16:04PM +0800, Wei Li wrote:
> Armv8.3 extends the SPE by adding:
> - Alignment field in the Events packet, and filtering on this event
> using PMSEVFR_EL1.
> - Support for the Scalable Vector Extension (SVE).
>
> The main additions for SVE are:
> - Recording the vector length for SVE operations in the Operation Type
> packet. It is not possible to filter on vector length.
> - Incomplete predicate and empty predicate fields in the Events packet,
> and filtering on these events using PMSEVFR_EL1.
>
> Update the check of pmsevfr for empty/partial predicated SVE and
> alignment event in kernel driver.
>
> Signed-off-by: Wei Li <liwei391@huawei.com>
> ---
> arch/arm64/include/asm/sysreg.h | 4 +++-
> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
> 2 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 463175f80341..be4c44ccdb56 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -281,7 +281,6 @@
> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>
> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>
> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
> @@ -769,6 +768,9 @@
> #define ID_AA64DFR0_PMUVER_8_5 0x6
> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>
> +#define ID_AA64DFR0_PMSVER_8_2 0x1
> +#define ID_AA64DFR0_PMSVER_8_3 0x2
> +
> #define ID_DFR0_PERFMON_SHIFT 24
>
> #define ID_DFR0_PERFMON_8_1 0x4
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index e51ddb6d63ed..5ec7ee0c8fa1 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> struct hlist_node hotplug_node;
>
> int irq; /* PPI */
> -
> + int pmuver;
Since the version number is only 4 bits width, 'u16' would be enough
to record SPE version number.
> u16 min_period;
> u16 counter_sz;
>
> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> /* Keep track of our dynamic hotplug state */
> static enum cpuhp_state arm_spe_pmu_online;
>
> +static u64 sys_pmsevfr_el1_mask[] = {
> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> + BIT_ULL(1),
> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> +};
Seems to me, the definitions for Aarch64 system registers should be
placed into the file 'arch/arm64/include/asm/sysreg.h'. Like below
two macros:
#define SYS_PMSEVFR_EL1_RES0_8_2 0x0000ffff00ff0f55UL
#define SYS_PMSEVFR_EL1_RES0_8_3 ...
Let's wait for Will or Mark Rutland's comments for this, in case I
mislead for this.
> +
> enum arm_spe_pmu_buf_fault_action {
> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> SPE_PMU_BUF_FAULT_ACT_FATAL,
> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> return -ENOENT;
>
> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> return -EOPNOTSUPP;
>
> if (attr->exclude_idle)
> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> fld, smp_processor_id());
> return;
> }
> + spe_pmu->pmuver = fld;
>
> /* Read PMBIDR first to determine whether or not we have access */
> reg = read_sysreg_s(SYS_PMBIDR_EL1);
> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
> }
>
> dev_info(dev,
> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> - cpumask_pr_args(&spe_pmu->supported_cpus),
> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
Let's output explict info, like:
"probed for CPUs %*pbl [pmuver %d, max_record_sz %u, align %u, features 0x%llx]\n",
Thanks,
Leo
> + spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
> spe_pmu->max_record_sz, spe_pmu->align, spe_pmu->features);
>
> spe_pmu->features |= SPE_PMU_FEAT_DEV_PROBED;
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-07-28 12:27 ` Leo Yan
@ 2020-07-28 13:24 ` liwei (GF)
-1 siblings, 0 replies; 44+ messages in thread
From: liwei (GF) @ 2020-07-28 13:24 UTC (permalink / raw)
To: Leo Yan, Will Deacon
Cc: Mark Rutland, Suzuki K Poulose, Alexander Shishkin,
Catalin Marinas, Adrian Hunter, Arnaldo Carvalho de Melo,
linux-kernel, zhangshaokun, Peter Zijlstra, Ingo Molnar,
James Clark, guohanjun, Namhyung Kim, Jiri Olsa, linux-arm-kernel
Hi Leo,
On 2020/7/28 20:27, Leo Yan wrote:
> Hi Wei,
>
> On Fri, Jul 24, 2020 at 05:16:04PM +0800, Wei Li wrote:
>> Armv8.3 extends the SPE by adding:
>> - Alignment field in the Events packet, and filtering on this event
>> using PMSEVFR_EL1.
>> - Support for the Scalable Vector Extension (SVE).
>>
>> The main additions for SVE are:
>> - Recording the vector length for SVE operations in the Operation Type
>> packet. It is not possible to filter on vector length.
>> - Incomplete predicate and empty predicate fields in the Events packet,
>> and filtering on these events using PMSEVFR_EL1.
>>
>> Update the check of pmsevfr for empty/partial predicated SVE and
>> alignment event in kernel driver.
>>
>> Signed-off-by: Wei Li <liwei391@huawei.com>
>> ---
>> arch/arm64/include/asm/sysreg.h | 4 +++-
>> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
>> 2 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>> index 463175f80341..be4c44ccdb56 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -281,7 +281,6 @@
>> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>>
>> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
>> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>>
>> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
>> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
>> @@ -769,6 +768,9 @@
>> #define ID_AA64DFR0_PMUVER_8_5 0x6
>> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>>
>> +#define ID_AA64DFR0_PMSVER_8_2 0x1
>> +#define ID_AA64DFR0_PMSVER_8_3 0x2
>> +
>> #define ID_DFR0_PERFMON_SHIFT 24
>>
>> #define ID_DFR0_PERFMON_8_1 0x4
>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>> index e51ddb6d63ed..5ec7ee0c8fa1 100644
>> --- a/drivers/perf/arm_spe_pmu.c
>> +++ b/drivers/perf/arm_spe_pmu.c
>> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
>> struct hlist_node hotplug_node;
>>
>> int irq; /* PPI */
>> -
>> + int pmuver;
>
> Since the version number is only 4 bits width, 'u16' would be enough
> to record SPE version number.
Sounds reasonable, i can change it to 'u16' if you insist.
>> u16 min_period;
>> u16 counter_sz;
>>
>> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
>> /* Keep track of our dynamic hotplug state */
>> static enum cpuhp_state arm_spe_pmu_online;
>>
>> +static u64 sys_pmsevfr_el1_mask[] = {
>> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
>> + BIT_ULL(1),
>> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
>> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
>> +};
>
> Seems to me, the definitions for Aarch64 system registers should be
> placed into the file 'arch/arm64/include/asm/sysreg.h'. Like below
> two macros:
>
> #define SYS_PMSEVFR_EL1_RES0_8_2 0x0000ffff00ff0f55UL
> #define SYS_PMSEVFR_EL1_RES0_8_3 ...
I really think using GENMASK_ULL() to generate the mask is better than a definition
with magic number. It is beneficial to be reviewed and extended later.
> Let's wait for Will or Mark Rutland's comments for this, in case I
> mislead for this.
>> +
>> enum arm_spe_pmu_buf_fault_action {
>> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
>> SPE_PMU_BUF_FAULT_ACT_FATAL,
>> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
>> return -ENOENT;
>>
>> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
>> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
>> return -EOPNOTSUPP;
>>
>> if (attr->exclude_idle)
>> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
>> fld, smp_processor_id());
>> return;
>> }
>> + spe_pmu->pmuver = fld;
>>
>> /* Read PMBIDR first to determine whether or not we have access */
>> reg = read_sysreg_s(SYS_PMBIDR_EL1);
>> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
>> }
>>
>> dev_info(dev,
>> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
>> - cpumask_pr_args(&spe_pmu->supported_cpus),
>> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
>
> Let's output explict info, like:
>
> "probed for CPUs %*pbl [pmuver %d, max_record_sz %u, align %u, features 0x%llx]\n",
>
Agree, and i have a little question here:
Currently, the of_compatible of SPE PMU is "arm,statistical-profiling-extension-v1", and
the platform_device name is "arm,spe-v1". So this message looks weird when supporting
ARMv8.3-SPE because the pmuver is 2.
As the version of SPE can be probed by reading 'ID_AA64DFR0_EL1.PMSVer', can we remove
the version hint in of_compatible and platform_device name?
>
>> + spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
>> spe_pmu->max_record_sz, spe_pmu->align, spe_pmu->features);
>>
>> spe_pmu->features |= SPE_PMU_FEAT_DEV_PROBED;
>> --
>> 2.17.1
>>
Thanks,
Wei
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-07-28 13:24 ` liwei (GF)
0 siblings, 0 replies; 44+ messages in thread
From: liwei (GF) @ 2020-07-28 13:24 UTC (permalink / raw)
To: Leo Yan, Will Deacon
Cc: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Catalin Marinas, James Clark, Jiri Olsa, Mark Rutland,
Namhyung Kim, Suzuki K Poulose, zhangshaokun, linux-arm-kernel,
linux-kernel, Ingo Molnar, Peter Zijlstra, guohanjun
Hi Leo,
On 2020/7/28 20:27, Leo Yan wrote:
> Hi Wei,
>
> On Fri, Jul 24, 2020 at 05:16:04PM +0800, Wei Li wrote:
>> Armv8.3 extends the SPE by adding:
>> - Alignment field in the Events packet, and filtering on this event
>> using PMSEVFR_EL1.
>> - Support for the Scalable Vector Extension (SVE).
>>
>> The main additions for SVE are:
>> - Recording the vector length for SVE operations in the Operation Type
>> packet. It is not possible to filter on vector length.
>> - Incomplete predicate and empty predicate fields in the Events packet,
>> and filtering on these events using PMSEVFR_EL1.
>>
>> Update the check of pmsevfr for empty/partial predicated SVE and
>> alignment event in kernel driver.
>>
>> Signed-off-by: Wei Li <liwei391@huawei.com>
>> ---
>> arch/arm64/include/asm/sysreg.h | 4 +++-
>> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
>> 2 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>> index 463175f80341..be4c44ccdb56 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -281,7 +281,6 @@
>> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>>
>> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
>> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>>
>> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
>> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
>> @@ -769,6 +768,9 @@
>> #define ID_AA64DFR0_PMUVER_8_5 0x6
>> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>>
>> +#define ID_AA64DFR0_PMSVER_8_2 0x1
>> +#define ID_AA64DFR0_PMSVER_8_3 0x2
>> +
>> #define ID_DFR0_PERFMON_SHIFT 24
>>
>> #define ID_DFR0_PERFMON_8_1 0x4
>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>> index e51ddb6d63ed..5ec7ee0c8fa1 100644
>> --- a/drivers/perf/arm_spe_pmu.c
>> +++ b/drivers/perf/arm_spe_pmu.c
>> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
>> struct hlist_node hotplug_node;
>>
>> int irq; /* PPI */
>> -
>> + int pmuver;
>
> Since the version number is only 4 bits width, 'u16' would be enough
> to record SPE version number.
Sounds reasonable, i can change it to 'u16' if you insist.
>> u16 min_period;
>> u16 counter_sz;
>>
>> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
>> /* Keep track of our dynamic hotplug state */
>> static enum cpuhp_state arm_spe_pmu_online;
>>
>> +static u64 sys_pmsevfr_el1_mask[] = {
>> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
>> + BIT_ULL(1),
>> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
>> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
>> +};
>
> Seems to me, the definitions for Aarch64 system registers should be
> placed into the file 'arch/arm64/include/asm/sysreg.h'. Like below
> two macros:
>
> #define SYS_PMSEVFR_EL1_RES0_8_2 0x0000ffff00ff0f55UL
> #define SYS_PMSEVFR_EL1_RES0_8_3 ...
I really think using GENMASK_ULL() to generate the mask is better than a definition
with magic number. It is beneficial to be reviewed and extended later.
> Let's wait for Will or Mark Rutland's comments for this, in case I
> mislead for this.
>> +
>> enum arm_spe_pmu_buf_fault_action {
>> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
>> SPE_PMU_BUF_FAULT_ACT_FATAL,
>> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
>> return -ENOENT;
>>
>> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
>> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
>> return -EOPNOTSUPP;
>>
>> if (attr->exclude_idle)
>> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
>> fld, smp_processor_id());
>> return;
>> }
>> + spe_pmu->pmuver = fld;
>>
>> /* Read PMBIDR first to determine whether or not we have access */
>> reg = read_sysreg_s(SYS_PMBIDR_EL1);
>> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
>> }
>>
>> dev_info(dev,
>> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
>> - cpumask_pr_args(&spe_pmu->supported_cpus),
>> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
>
> Let's output explict info, like:
>
> "probed for CPUs %*pbl [pmuver %d, max_record_sz %u, align %u, features 0x%llx]\n",
>
Agree, and i have a little question here:
Currently, the of_compatible of SPE PMU is "arm,statistical-profiling-extension-v1", and
the platform_device name is "arm,spe-v1". So this message looks weird when supporting
ARMv8.3-SPE because the pmuver is 2.
As the version of SPE can be probed by reading 'ID_AA64DFR0_EL1.PMSVer', can we remove
the version hint in of_compatible and platform_device name?
>
>> + spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
>> spe_pmu->max_record_sz, spe_pmu->align, spe_pmu->features);
>>
>> spe_pmu->features |= SPE_PMU_FEAT_DEV_PROBED;
>> --
>> 2.17.1
>>
Thanks,
Wei
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-07-28 13:24 ` liwei (GF)
@ 2020-07-29 7:08 ` Leo Yan
-1 siblings, 0 replies; 44+ messages in thread
From: Leo Yan @ 2020-07-29 7:08 UTC (permalink / raw)
To: liwei (GF), Al Grant
Cc: Mark Rutland, Suzuki K Poulose, Alexander Shishkin,
Catalin Marinas, Jiri Olsa, Adrian Hunter,
Arnaldo Carvalho de Melo, linux-kernel, zhangshaokun,
Peter Zijlstra, Ingo Molnar, James Clark, guohanjun, Namhyung Kim,
Will Deacon, linux-arm-kernel
On Tue, Jul 28, 2020 at 09:24:42PM +0800, liwei (GF) wrote:
[...]
> >> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> >> index e51ddb6d63ed..5ec7ee0c8fa1 100644
> >> --- a/drivers/perf/arm_spe_pmu.c
> >> +++ b/drivers/perf/arm_spe_pmu.c
> >> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> >> struct hlist_node hotplug_node;
> >>
> >> int irq; /* PPI */
> >> -
> >> + int pmuver;
> >
> > Since the version number is only 4 bits width, 'u16' would be enough
> > to record SPE version number.
>
> Sounds reasonable, i can change it to 'u16' if you insist.
>
> >> u16 min_period;
> >> u16 counter_sz;
> >>
> >> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> >> /* Keep track of our dynamic hotplug state */
> >> static enum cpuhp_state arm_spe_pmu_online;
> >>
> >> +static u64 sys_pmsevfr_el1_mask[] = {
> >> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> >> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> >> + BIT_ULL(1),
> >> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> >> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> >> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> >> +};
> >
> > Seems to me, the definitions for Aarch64 system registers should be
> > placed into the file 'arch/arm64/include/asm/sysreg.h'. Like below
> > two macros:
> >
> > #define SYS_PMSEVFR_EL1_RES0_8_2 0x0000ffff00ff0f55UL
> > #define SYS_PMSEVFR_EL1_RES0_8_3 ...
>
> I really think using GENMASK_ULL() to generate the mask is better than a definition
> with magic number. It is beneficial to be reviewed and extended later.
Understand. Here I just want to remind, you could see the ARMv8's
system registers definition usually are placed into the global header
sysreg.h rather than define them in separate source files.
You could define the bit mask with GENMASK_ULL() for the two macros
in sysreg.h.
> > Let's wait for Will or Mark Rutland's comments for this, in case I
> > mislead for this.
> >> +
> >> enum arm_spe_pmu_buf_fault_action {
> >> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> >> SPE_PMU_BUF_FAULT_ACT_FATAL,
> >> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> >> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> >> return -ENOENT;
> >>
> >> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> >> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> >> return -EOPNOTSUPP;
> >>
> >> if (attr->exclude_idle)
> >> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> >> fld, smp_processor_id());
> >> return;
> >> }
> >> + spe_pmu->pmuver = fld;
> >>
> >> /* Read PMBIDR first to determine whether or not we have access */
> >> reg = read_sysreg_s(SYS_PMBIDR_EL1);
> >> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
> >> }
> >>
> >> dev_info(dev,
> >> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> >> - cpumask_pr_args(&spe_pmu->supported_cpus),
> >> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> >
> > Let's output explict info, like:
> >
> > "probed for CPUs %*pbl [pmuver %d, max_record_sz %u, align %u, features 0x%llx]\n",
> >
>
> Agree, and i have a little question here:
> Currently, the of_compatible of SPE PMU is "arm,statistical-profiling-extension-v1", and
> the platform_device name is "arm,spe-v1". So this message looks weird when supporting
> ARMv8.3-SPE because the pmuver is 2.
I think here we need to distinguish two things: SPE (as an IP) and
ARMv8.2/ARMv8.3 (as CPU architectures). From my understanding, now we
are working on SPE-v1, but it needs to support ARMv8 variants, e.g.
ARMv8.2 and ARMv8.3 with SVE extension.
I am not the best person to clarify the version number for SPE, if Arm
colleagues disagree with this, very welcome to correct me.
Also loop in Al for this.
> As the version of SPE can be probed by reading 'ID_AA64DFR0_EL1.PMSVer', can we remove
> the version hint in of_compatible and platform_device name?
No, for device tree, usually we need to keep back compability for the
DT binding, so we cannot remove compatible string.
Thanks,
Leo
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-07-29 7:08 ` Leo Yan
0 siblings, 0 replies; 44+ messages in thread
From: Leo Yan @ 2020-07-29 7:08 UTC (permalink / raw)
To: liwei (GF), Al Grant
Cc: Will Deacon, Adrian Hunter, Alexander Shishkin,
Arnaldo Carvalho de Melo, Catalin Marinas, James Clark, Jiri Olsa,
Mark Rutland, Namhyung Kim, Suzuki K Poulose, zhangshaokun,
linux-arm-kernel, linux-kernel, Ingo Molnar, Peter Zijlstra,
guohanjun
On Tue, Jul 28, 2020 at 09:24:42PM +0800, liwei (GF) wrote:
[...]
> >> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> >> index e51ddb6d63ed..5ec7ee0c8fa1 100644
> >> --- a/drivers/perf/arm_spe_pmu.c
> >> +++ b/drivers/perf/arm_spe_pmu.c
> >> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> >> struct hlist_node hotplug_node;
> >>
> >> int irq; /* PPI */
> >> -
> >> + int pmuver;
> >
> > Since the version number is only 4 bits width, 'u16' would be enough
> > to record SPE version number.
>
> Sounds reasonable, i can change it to 'u16' if you insist.
>
> >> u16 min_period;
> >> u16 counter_sz;
> >>
> >> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> >> /* Keep track of our dynamic hotplug state */
> >> static enum cpuhp_state arm_spe_pmu_online;
> >>
> >> +static u64 sys_pmsevfr_el1_mask[] = {
> >> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> >> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> >> + BIT_ULL(1),
> >> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> >> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> >> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> >> +};
> >
> > Seems to me, the definitions for Aarch64 system registers should be
> > placed into the file 'arch/arm64/include/asm/sysreg.h'. Like below
> > two macros:
> >
> > #define SYS_PMSEVFR_EL1_RES0_8_2 0x0000ffff00ff0f55UL
> > #define SYS_PMSEVFR_EL1_RES0_8_3 ...
>
> I really think using GENMASK_ULL() to generate the mask is better than a definition
> with magic number. It is beneficial to be reviewed and extended later.
Understand. Here I just want to remind, you could see the ARMv8's
system registers definition usually are placed into the global header
sysreg.h rather than define them in separate source files.
You could define the bit mask with GENMASK_ULL() for the two macros
in sysreg.h.
> > Let's wait for Will or Mark Rutland's comments for this, in case I
> > mislead for this.
> >> +
> >> enum arm_spe_pmu_buf_fault_action {
> >> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> >> SPE_PMU_BUF_FAULT_ACT_FATAL,
> >> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> >> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> >> return -ENOENT;
> >>
> >> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> >> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> >> return -EOPNOTSUPP;
> >>
> >> if (attr->exclude_idle)
> >> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> >> fld, smp_processor_id());
> >> return;
> >> }
> >> + spe_pmu->pmuver = fld;
> >>
> >> /* Read PMBIDR first to determine whether or not we have access */
> >> reg = read_sysreg_s(SYS_PMBIDR_EL1);
> >> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
> >> }
> >>
> >> dev_info(dev,
> >> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> >> - cpumask_pr_args(&spe_pmu->supported_cpus),
> >> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> >
> > Let's output explict info, like:
> >
> > "probed for CPUs %*pbl [pmuver %d, max_record_sz %u, align %u, features 0x%llx]\n",
> >
>
> Agree, and i have a little question here:
> Currently, the of_compatible of SPE PMU is "arm,statistical-profiling-extension-v1", and
> the platform_device name is "arm,spe-v1". So this message looks weird when supporting
> ARMv8.3-SPE because the pmuver is 2.
I think here we need to distinguish two things: SPE (as an IP) and
ARMv8.2/ARMv8.3 (as CPU architectures). From my understanding, now we
are working on SPE-v1, but it needs to support ARMv8 variants, e.g.
ARMv8.2 and ARMv8.3 with SVE extension.
I am not the best person to clarify the version number for SPE, if Arm
colleagues disagree with this, very welcome to correct me.
Also loop in Al for this.
> As the version of SPE can be probed by reading 'ID_AA64DFR0_EL1.PMSVer', can we remove
> the version hint in of_compatible and platform_device name?
No, for device tree, usually we need to keep back compability for the
DT binding, so we cannot remove compatible string.
Thanks,
Leo
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-07-24 9:16 ` Wei Li
@ 2020-07-29 9:12 ` Suzuki K Poulose
-1 siblings, 0 replies; 44+ messages in thread
From: Suzuki K Poulose @ 2020-07-29 9:12 UTC (permalink / raw)
To: liwei391, adrian.hunter, alexander.shishkin, acme,
catalin.marinas, James.Clark, jolsa, leo.yan, mark.rutland,
namhyung, will, zhangshaokun
Cc: peterz, mingo, linux-kernel, linux-arm-kernel, guohanjun
On 07/24/2020 10:16 AM, Wei Li wrote:
> Armv8.3 extends the SPE by adding:
> - Alignment field in the Events packet, and filtering on this event
> using PMSEVFR_EL1.
> - Support for the Scalable Vector Extension (SVE).
>
> The main additions for SVE are:
> - Recording the vector length for SVE operations in the Operation Type
> packet. It is not possible to filter on vector length.
> - Incomplete predicate and empty predicate fields in the Events packet,
> and filtering on these events using PMSEVFR_EL1.
>
> Update the check of pmsevfr for empty/partial predicated SVE and
> alignment event in kernel driver.
>
> Signed-off-by: Wei Li <liwei391@huawei.com>
> ---
> arch/arm64/include/asm/sysreg.h | 4 +++-
> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
> 2 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 463175f80341..be4c44ccdb56 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -281,7 +281,6 @@
> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>
> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>
> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
> @@ -769,6 +768,9 @@
> #define ID_AA64DFR0_PMUVER_8_5 0x6
> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>
> +#define ID_AA64DFR0_PMSVER_8_2 0x1
> +#define ID_AA64DFR0_PMSVER_8_3 0x2
> +
> #define ID_DFR0_PERFMON_SHIFT 24
>
> #define ID_DFR0_PERFMON_8_1 0x4
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index e51ddb6d63ed..5ec7ee0c8fa1 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> struct hlist_node hotplug_node;
>
> int irq; /* PPI */
> -
> + int pmuver;
> u16 min_period;
> u16 counter_sz;
>
> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> /* Keep track of our dynamic hotplug state */
> static enum cpuhp_state arm_spe_pmu_online;
>
> +static u64 sys_pmsevfr_el1_mask[] = {
> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> + BIT_ULL(1),
> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> +};
> +
> enum arm_spe_pmu_buf_fault_action {
> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> SPE_PMU_BUF_FAULT_ACT_FATAL,
> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> return -ENOENT;
>
> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> return -EOPNOTSUPP;
>
> if (attr->exclude_idle)
> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> fld, smp_processor_id());
> return;
> }
> + spe_pmu->pmuver = fld;
How do we deal with cases where we have big.LITTLE system with differing
SPE versions ?
Cheers
Suzuki
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-07-29 9:12 ` Suzuki K Poulose
0 siblings, 0 replies; 44+ messages in thread
From: Suzuki K Poulose @ 2020-07-29 9:12 UTC (permalink / raw)
To: liwei391, adrian.hunter, alexander.shishkin, acme,
catalin.marinas, James.Clark, jolsa, leo.yan, mark.rutland,
namhyung, will, zhangshaokun
Cc: linux-arm-kernel, linux-kernel, mingo, peterz, guohanjun
On 07/24/2020 10:16 AM, Wei Li wrote:
> Armv8.3 extends the SPE by adding:
> - Alignment field in the Events packet, and filtering on this event
> using PMSEVFR_EL1.
> - Support for the Scalable Vector Extension (SVE).
>
> The main additions for SVE are:
> - Recording the vector length for SVE operations in the Operation Type
> packet. It is not possible to filter on vector length.
> - Incomplete predicate and empty predicate fields in the Events packet,
> and filtering on these events using PMSEVFR_EL1.
>
> Update the check of pmsevfr for empty/partial predicated SVE and
> alignment event in kernel driver.
>
> Signed-off-by: Wei Li <liwei391@huawei.com>
> ---
> arch/arm64/include/asm/sysreg.h | 4 +++-
> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
> 2 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 463175f80341..be4c44ccdb56 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -281,7 +281,6 @@
> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>
> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>
> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
> @@ -769,6 +768,9 @@
> #define ID_AA64DFR0_PMUVER_8_5 0x6
> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>
> +#define ID_AA64DFR0_PMSVER_8_2 0x1
> +#define ID_AA64DFR0_PMSVER_8_3 0x2
> +
> #define ID_DFR0_PERFMON_SHIFT 24
>
> #define ID_DFR0_PERFMON_8_1 0x4
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index e51ddb6d63ed..5ec7ee0c8fa1 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> struct hlist_node hotplug_node;
>
> int irq; /* PPI */
> -
> + int pmuver;
> u16 min_period;
> u16 counter_sz;
>
> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> /* Keep track of our dynamic hotplug state */
> static enum cpuhp_state arm_spe_pmu_online;
>
> +static u64 sys_pmsevfr_el1_mask[] = {
> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> + BIT_ULL(1),
> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> +};
> +
> enum arm_spe_pmu_buf_fault_action {
> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> SPE_PMU_BUF_FAULT_ACT_FATAL,
> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> return -ENOENT;
>
> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> return -EOPNOTSUPP;
>
> if (attr->exclude_idle)
> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> fld, smp_processor_id());
> return;
> }
> + spe_pmu->pmuver = fld;
How do we deal with cases where we have big.LITTLE system with differing
SPE versions ?
Cheers
Suzuki
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-07-29 9:12 ` Suzuki K Poulose
@ 2020-07-30 8:14 ` Leo Yan
-1 siblings, 0 replies; 44+ messages in thread
From: Leo Yan @ 2020-07-30 8:14 UTC (permalink / raw)
To: Suzuki K Poulose
Cc: mark.rutland, will, alexander.shishkin, catalin.marinas, jolsa,
adrian.hunter, acme, linux-kernel, zhangshaokun, peterz, mingo,
James.Clark, guohanjun, namhyung, liwei391, linux-arm-kernel
Hi Suzuki,
On Wed, Jul 29, 2020 at 10:12:50AM +0100, Suzuki Kuruppassery Poulose wrote:
> On 07/24/2020 10:16 AM, Wei Li wrote:
> > Armv8.3 extends the SPE by adding:
> > - Alignment field in the Events packet, and filtering on this event
> > using PMSEVFR_EL1.
> > - Support for the Scalable Vector Extension (SVE).
> >
> > The main additions for SVE are:
> > - Recording the vector length for SVE operations in the Operation Type
> > packet. It is not possible to filter on vector length.
> > - Incomplete predicate and empty predicate fields in the Events packet,
> > and filtering on these events using PMSEVFR_EL1.
> >
> > Update the check of pmsevfr for empty/partial predicated SVE and
> > alignment event in kernel driver.
> >
> > Signed-off-by: Wei Li <liwei391@huawei.com>
> > ---
> > arch/arm64/include/asm/sysreg.h | 4 +++-
> > drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
> > 2 files changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > index 463175f80341..be4c44ccdb56 100644
> > --- a/arch/arm64/include/asm/sysreg.h
> > +++ b/arch/arm64/include/asm/sysreg.h
> > @@ -281,7 +281,6 @@
> > #define SYS_PMSFCR_EL1_ST_SHIFT 18
> > #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
> > -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
> > #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
> > #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
> > @@ -769,6 +768,9 @@
> > #define ID_AA64DFR0_PMUVER_8_5 0x6
> > #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
> > +#define ID_AA64DFR0_PMSVER_8_2 0x1
> > +#define ID_AA64DFR0_PMSVER_8_3 0x2
> > +
> > #define ID_DFR0_PERFMON_SHIFT 24
> > #define ID_DFR0_PERFMON_8_1 0x4
> > diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> > index e51ddb6d63ed..5ec7ee0c8fa1 100644
> > --- a/drivers/perf/arm_spe_pmu.c
> > +++ b/drivers/perf/arm_spe_pmu.c
> > @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> > struct hlist_node hotplug_node;
> > int irq; /* PPI */
> > -
> > + int pmuver;
> > u16 min_period;
> > u16 counter_sz;
> > @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> > /* Keep track of our dynamic hotplug state */
> > static enum cpuhp_state arm_spe_pmu_online;
> > +static u64 sys_pmsevfr_el1_mask[] = {
> > + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> > + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> > + BIT_ULL(1),
> > + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> > + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> > + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> > +};
> > +
> > enum arm_spe_pmu_buf_fault_action {
> > SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> > SPE_PMU_BUF_FAULT_ACT_FATAL,
> > @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> > !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> > return -ENOENT;
> > - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> > + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> > return -EOPNOTSUPP;
> > if (attr->exclude_idle)
> > @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> > fld, smp_processor_id());
> > return;
> > }
> > + spe_pmu->pmuver = fld;
>
> How do we deal with cases where we have big.LITTLE system with differing
> SPE versions ?
Good point.
The first question we need to answer is: how to define SPE version?
From my understanding, if SPE uses the same sample specification and
the same packet format, then we should consider the SPE is the same
version cross CPUs. So even some CPUs are ARMv8.2 and other CPUs are
ARMv8.3 variants, we still should take the SPE as the same version.
And when read the SPE driver in the file drivers/perf/arm_spe_pmu.c and
I concluded that so far the SPE perf driver is to only support SPE-v1
with single instance, it cannot support a complex usage case like
below:
CPU0-3: ARMv8.2 architecture with SPE
CPU4-7: ARMv8.3 architecture with SPE
For this case, if we take SPE as two different versions, let's say
SPE-8.2 and SPE-8.3, then should the SPE driver need to create multi
perf PMU events? For example, we should create a perf PMU event
'arm_spe_8.2' and another PMU event 'arm_spe_8.3'.
Another option is we always take this as SPE-v1 and only create single
PMU event, just keep what's we are doing with the perf event
'arm_spe_0', but the driver needs to dynamically detect SPE PMU version
number in the function arm_spe_pmu_event_init(), and then based on
version number to select corresponding mask for PMSEVFR.
Thanks,
Leo
[1] https://lore.kernel.org/linux-arm-kernel/20200724071111.35593-1-liwei391@huawei.com/
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-07-30 8:14 ` Leo Yan
0 siblings, 0 replies; 44+ messages in thread
From: Leo Yan @ 2020-07-30 8:14 UTC (permalink / raw)
To: Suzuki K Poulose
Cc: liwei391, adrian.hunter, alexander.shishkin, acme,
catalin.marinas, James.Clark, jolsa, mark.rutland, namhyung, will,
zhangshaokun, linux-arm-kernel, linux-kernel, mingo, peterz,
guohanjun
Hi Suzuki,
On Wed, Jul 29, 2020 at 10:12:50AM +0100, Suzuki Kuruppassery Poulose wrote:
> On 07/24/2020 10:16 AM, Wei Li wrote:
> > Armv8.3 extends the SPE by adding:
> > - Alignment field in the Events packet, and filtering on this event
> > using PMSEVFR_EL1.
> > - Support for the Scalable Vector Extension (SVE).
> >
> > The main additions for SVE are:
> > - Recording the vector length for SVE operations in the Operation Type
> > packet. It is not possible to filter on vector length.
> > - Incomplete predicate and empty predicate fields in the Events packet,
> > and filtering on these events using PMSEVFR_EL1.
> >
> > Update the check of pmsevfr for empty/partial predicated SVE and
> > alignment event in kernel driver.
> >
> > Signed-off-by: Wei Li <liwei391@huawei.com>
> > ---
> > arch/arm64/include/asm/sysreg.h | 4 +++-
> > drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
> > 2 files changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > index 463175f80341..be4c44ccdb56 100644
> > --- a/arch/arm64/include/asm/sysreg.h
> > +++ b/arch/arm64/include/asm/sysreg.h
> > @@ -281,7 +281,6 @@
> > #define SYS_PMSFCR_EL1_ST_SHIFT 18
> > #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
> > -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
> > #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
> > #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
> > @@ -769,6 +768,9 @@
> > #define ID_AA64DFR0_PMUVER_8_5 0x6
> > #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
> > +#define ID_AA64DFR0_PMSVER_8_2 0x1
> > +#define ID_AA64DFR0_PMSVER_8_3 0x2
> > +
> > #define ID_DFR0_PERFMON_SHIFT 24
> > #define ID_DFR0_PERFMON_8_1 0x4
> > diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> > index e51ddb6d63ed..5ec7ee0c8fa1 100644
> > --- a/drivers/perf/arm_spe_pmu.c
> > +++ b/drivers/perf/arm_spe_pmu.c
> > @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> > struct hlist_node hotplug_node;
> > int irq; /* PPI */
> > -
> > + int pmuver;
> > u16 min_period;
> > u16 counter_sz;
> > @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> > /* Keep track of our dynamic hotplug state */
> > static enum cpuhp_state arm_spe_pmu_online;
> > +static u64 sys_pmsevfr_el1_mask[] = {
> > + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> > + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> > + BIT_ULL(1),
> > + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> > + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> > + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> > +};
> > +
> > enum arm_spe_pmu_buf_fault_action {
> > SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> > SPE_PMU_BUF_FAULT_ACT_FATAL,
> > @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> > !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> > return -ENOENT;
> > - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> > + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> > return -EOPNOTSUPP;
> > if (attr->exclude_idle)
> > @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> > fld, smp_processor_id());
> > return;
> > }
> > + spe_pmu->pmuver = fld;
>
> How do we deal with cases where we have big.LITTLE system with differing
> SPE versions ?
Good point.
The first question we need to answer is: how to define SPE version?
From my understanding, if SPE uses the same sample specification and
the same packet format, then we should consider the SPE is the same
version cross CPUs. So even some CPUs are ARMv8.2 and other CPUs are
ARMv8.3 variants, we still should take the SPE as the same version.
And when read the SPE driver in the file drivers/perf/arm_spe_pmu.c and
I concluded that so far the SPE perf driver is to only support SPE-v1
with single instance, it cannot support a complex usage case like
below:
CPU0-3: ARMv8.2 architecture with SPE
CPU4-7: ARMv8.3 architecture with SPE
For this case, if we take SPE as two different versions, let's say
SPE-8.2 and SPE-8.3, then should the SPE driver need to create multi
perf PMU events? For example, we should create a perf PMU event
'arm_spe_8.2' and another PMU event 'arm_spe_8.3'.
Another option is we always take this as SPE-v1 and only create single
PMU event, just keep what's we are doing with the perf event
'arm_spe_0', but the driver needs to dynamically detect SPE PMU version
number in the function arm_spe_pmu_event_init(), and then based on
version number to select corresponding mask for PMSEVFR.
Thanks,
Leo
[1] https://lore.kernel.org/linux-arm-kernel/20200724071111.35593-1-liwei391@huawei.com/
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-07-30 8:14 ` Leo Yan
@ 2020-07-31 12:18 ` liwei (GF)
-1 siblings, 0 replies; 44+ messages in thread
From: liwei (GF) @ 2020-07-31 12:18 UTC (permalink / raw)
To: Leo Yan, Suzuki K Poulose
Cc: mark.rutland, will, alexander.shishkin, catalin.marinas,
adrian.hunter, acme, linux-kernel, zhangshaokun, peterz, mingo,
James.Clark, guohanjun, namhyung, jolsa, linux-arm-kernel
On 2020/7/30 16:14, Leo Yan wrote:
> Hi Suzuki,
>
> On Wed, Jul 29, 2020 at 10:12:50AM +0100, Suzuki Kuruppassery Poulose wrote:
>> On 07/24/2020 10:16 AM, Wei Li wrote:
>>> Armv8.3 extends the SPE by adding:
>>> - Alignment field in the Events packet, and filtering on this event
>>> using PMSEVFR_EL1.
>>> - Support for the Scalable Vector Extension (SVE).
>>>
>>> The main additions for SVE are:
>>> - Recording the vector length for SVE operations in the Operation Type
>>> packet. It is not possible to filter on vector length.
>>> - Incomplete predicate and empty predicate fields in the Events packet,
>>> and filtering on these events using PMSEVFR_EL1.
>>>
>>> Update the check of pmsevfr for empty/partial predicated SVE and
>>> alignment event in kernel driver.
>>>
>>> Signed-off-by: Wei Li <liwei391@huawei.com>
>>> ---
>>> arch/arm64/include/asm/sysreg.h | 4 +++-
>>> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
>>> 2 files changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>> index 463175f80341..be4c44ccdb56 100644
>>> --- a/arch/arm64/include/asm/sysreg.h
>>> +++ b/arch/arm64/include/asm/sysreg.h
>>> @@ -281,7 +281,6 @@
>>> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>>> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
>>> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>>> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
>>> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
>>> @@ -769,6 +768,9 @@
>>> #define ID_AA64DFR0_PMUVER_8_5 0x6
>>> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>>> +#define ID_AA64DFR0_PMSVER_8_2 0x1
>>> +#define ID_AA64DFR0_PMSVER_8_3 0x2
>>> +
>>> #define ID_DFR0_PERFMON_SHIFT 24
>>> #define ID_DFR0_PERFMON_8_1 0x4
>>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>>> index e51ddb6d63ed..5ec7ee0c8fa1 100644
>>> --- a/drivers/perf/arm_spe_pmu.c
>>> +++ b/drivers/perf/arm_spe_pmu.c
>>> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
>>> struct hlist_node hotplug_node;
>>> int irq; /* PPI */
>>> -
>>> + int pmuver;
>>> u16 min_period;
>>> u16 counter_sz;
>>> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
>>> /* Keep track of our dynamic hotplug state */
>>> static enum cpuhp_state arm_spe_pmu_online;
>>> +static u64 sys_pmsevfr_el1_mask[] = {
>>> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>>> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
>>> + BIT_ULL(1),
>>> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>>> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
>>> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
>>> +};
>>> +
>>> enum arm_spe_pmu_buf_fault_action {
>>> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
>>> SPE_PMU_BUF_FAULT_ACT_FATAL,
>>> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>>> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
>>> return -ENOENT;
>>> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
>>> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
>>> return -EOPNOTSUPP;
>>> if (attr->exclude_idle)
>>> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
>>> fld, smp_processor_id());
>>> return;
>>> }
>>> + spe_pmu->pmuver = fld;
>>
>> How do we deal with cases where we have big.LITTLE system with differing
>> SPE versions ?
>
> Good point.
>
> The first question we need to answer is: how to define SPE version?
> From my understanding, if SPE uses the same sample specification and
> the same packet format, then we should consider the SPE is the same
> version cross CPUs. So even some CPUs are ARMv8.2 and other CPUs are
> ARMv8.3 variants, we still should take the SPE as the same version.
>
> And when read the SPE driver in the file drivers/perf/arm_spe_pmu.c and
> I concluded that so far the SPE perf driver is to only support SPE-v1
> with single instance, it cannot support a complex usage case like
> below:
>
> CPU0-3: ARMv8.2 architecture with SPE
> CPU4-7: ARMv8.3 architecture with SPE
>
> For this case, if we take SPE as two different versions, let's say
> SPE-8.2 and SPE-8.3, then should the SPE driver need to create multi
> perf PMU events? For example, we should create a perf PMU event
> 'arm_spe_8.2' and another PMU event 'arm_spe_8.3'.
As we have supported SPE v2 (ARMv8.3-SPE) now, if we add the new
of_device_id: "arm,statistical-profiling-extension-v2" and the new
platform_device_id: "arm,spe-v2", we may really support two instance now.
Even two different versions of SPE pmus which work on different range of
cores with the same PPI. Their functional scopes are the same as the PPI partitions.
> Another option is we always take this as SPE-v1 and only create single
> PMU event, just keep what's we are doing with the perf event
> 'arm_spe_0', but the driver needs to dynamically detect SPE PMU version
> number in the function arm_spe_pmu_event_init(), and then based on
> version number to select corresponding mask for PMSEVFR.
Thus, the driver will service two devices, and will also register two PMUs 'arm_spe_0' and
'arm_spe_1', so i think there is no conflict here.
Back to Suzuki's question, refer to the ACPI parsing code for SPE device, function
arm_spe_acpi_register_device(), there is a check of hetero_id for all cores.
It seems that we only support homogeneous ACPI/SPE machines, but i can't find the similar
check in OF/SPE parsing code.
> Thanks,
> Leo
>
> [1] https://lore.kernel.org/linux-arm-kernel/20200724071111.35593-1-liwei391@huawei.com/
>
Thanks,
Wei
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-07-31 12:18 ` liwei (GF)
0 siblings, 0 replies; 44+ messages in thread
From: liwei (GF) @ 2020-07-31 12:18 UTC (permalink / raw)
To: Leo Yan, Suzuki K Poulose
Cc: adrian.hunter, alexander.shishkin, acme, catalin.marinas,
James.Clark, jolsa, mark.rutland, namhyung, will, zhangshaokun,
linux-arm-kernel, linux-kernel, mingo, peterz, guohanjun
On 2020/7/30 16:14, Leo Yan wrote:
> Hi Suzuki,
>
> On Wed, Jul 29, 2020 at 10:12:50AM +0100, Suzuki Kuruppassery Poulose wrote:
>> On 07/24/2020 10:16 AM, Wei Li wrote:
>>> Armv8.3 extends the SPE by adding:
>>> - Alignment field in the Events packet, and filtering on this event
>>> using PMSEVFR_EL1.
>>> - Support for the Scalable Vector Extension (SVE).
>>>
>>> The main additions for SVE are:
>>> - Recording the vector length for SVE operations in the Operation Type
>>> packet. It is not possible to filter on vector length.
>>> - Incomplete predicate and empty predicate fields in the Events packet,
>>> and filtering on these events using PMSEVFR_EL1.
>>>
>>> Update the check of pmsevfr for empty/partial predicated SVE and
>>> alignment event in kernel driver.
>>>
>>> Signed-off-by: Wei Li <liwei391@huawei.com>
>>> ---
>>> arch/arm64/include/asm/sysreg.h | 4 +++-
>>> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
>>> 2 files changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>> index 463175f80341..be4c44ccdb56 100644
>>> --- a/arch/arm64/include/asm/sysreg.h
>>> +++ b/arch/arm64/include/asm/sysreg.h
>>> @@ -281,7 +281,6 @@
>>> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>>> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
>>> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>>> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
>>> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
>>> @@ -769,6 +768,9 @@
>>> #define ID_AA64DFR0_PMUVER_8_5 0x6
>>> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>>> +#define ID_AA64DFR0_PMSVER_8_2 0x1
>>> +#define ID_AA64DFR0_PMSVER_8_3 0x2
>>> +
>>> #define ID_DFR0_PERFMON_SHIFT 24
>>> #define ID_DFR0_PERFMON_8_1 0x4
>>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>>> index e51ddb6d63ed..5ec7ee0c8fa1 100644
>>> --- a/drivers/perf/arm_spe_pmu.c
>>> +++ b/drivers/perf/arm_spe_pmu.c
>>> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
>>> struct hlist_node hotplug_node;
>>> int irq; /* PPI */
>>> -
>>> + int pmuver;
>>> u16 min_period;
>>> u16 counter_sz;
>>> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
>>> /* Keep track of our dynamic hotplug state */
>>> static enum cpuhp_state arm_spe_pmu_online;
>>> +static u64 sys_pmsevfr_el1_mask[] = {
>>> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>>> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
>>> + BIT_ULL(1),
>>> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>>> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
>>> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
>>> +};
>>> +
>>> enum arm_spe_pmu_buf_fault_action {
>>> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
>>> SPE_PMU_BUF_FAULT_ACT_FATAL,
>>> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>>> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
>>> return -ENOENT;
>>> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
>>> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
>>> return -EOPNOTSUPP;
>>> if (attr->exclude_idle)
>>> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
>>> fld, smp_processor_id());
>>> return;
>>> }
>>> + spe_pmu->pmuver = fld;
>>
>> How do we deal with cases where we have big.LITTLE system with differing
>> SPE versions ?
>
> Good point.
>
> The first question we need to answer is: how to define SPE version?
> From my understanding, if SPE uses the same sample specification and
> the same packet format, then we should consider the SPE is the same
> version cross CPUs. So even some CPUs are ARMv8.2 and other CPUs are
> ARMv8.3 variants, we still should take the SPE as the same version.
>
> And when read the SPE driver in the file drivers/perf/arm_spe_pmu.c and
> I concluded that so far the SPE perf driver is to only support SPE-v1
> with single instance, it cannot support a complex usage case like
> below:
>
> CPU0-3: ARMv8.2 architecture with SPE
> CPU4-7: ARMv8.3 architecture with SPE
>
> For this case, if we take SPE as two different versions, let's say
> SPE-8.2 and SPE-8.3, then should the SPE driver need to create multi
> perf PMU events? For example, we should create a perf PMU event
> 'arm_spe_8.2' and another PMU event 'arm_spe_8.3'.
As we have supported SPE v2 (ARMv8.3-SPE) now, if we add the new
of_device_id: "arm,statistical-profiling-extension-v2" and the new
platform_device_id: "arm,spe-v2", we may really support two instance now.
Even two different versions of SPE pmus which work on different range of
cores with the same PPI. Their functional scopes are the same as the PPI partitions.
> Another option is we always take this as SPE-v1 and only create single
> PMU event, just keep what's we are doing with the perf event
> 'arm_spe_0', but the driver needs to dynamically detect SPE PMU version
> number in the function arm_spe_pmu_event_init(), and then based on
> version number to select corresponding mask for PMSEVFR.
Thus, the driver will service two devices, and will also register two PMUs 'arm_spe_0' and
'arm_spe_1', so i think there is no conflict here.
Back to Suzuki's question, refer to the ACPI parsing code for SPE device, function
arm_spe_acpi_register_device(), there is a check of hetero_id for all cores.
It seems that we only support homogeneous ACPI/SPE machines, but i can't find the similar
check in OF/SPE parsing code.
> Thanks,
> Leo
>
> [1] https://lore.kernel.org/linux-arm-kernel/20200724071111.35593-1-liwei391@huawei.com/
>
Thanks,
Wei
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-07-31 12:18 ` liwei (GF)
@ 2020-07-31 14:01 ` Suzuki K Poulose
-1 siblings, 0 replies; 44+ messages in thread
From: Suzuki K Poulose @ 2020-07-31 14:01 UTC (permalink / raw)
To: liwei391, leo.yan, will
Cc: mark.rutland, alexander.shishkin, catalin.marinas, adrian.hunter,
acme, linux-kernel, zhangshaokun, peterz, mingo, James.Clark,
guohanjun, namhyung, jolsa, linux-arm-kernel
On 07/31/2020 01:18 PM, liwei (GF) wrote:
>
>
> On 2020/7/30 16:14, Leo Yan wrote:
>> Hi Suzuki,
>>
>> On Wed, Jul 29, 2020 at 10:12:50AM +0100, Suzuki Kuruppassery Poulose wrote:
>>> On 07/24/2020 10:16 AM, Wei Li wrote:
>>>> Armv8.3 extends the SPE by adding:
>>>> - Alignment field in the Events packet, and filtering on this event
>>>> using PMSEVFR_EL1.
>>>> - Support for the Scalable Vector Extension (SVE).
>>>>
>>>> The main additions for SVE are:
>>>> - Recording the vector length for SVE operations in the Operation Type
>>>> packet. It is not possible to filter on vector length.
>>>> - Incomplete predicate and empty predicate fields in the Events packet,
>>>> and filtering on these events using PMSEVFR_EL1.
>>>>
>>>> Update the check of pmsevfr for empty/partial predicated SVE and
>>>> alignment event in kernel driver.
>>>>
>>>> Signed-off-by: Wei Li <liwei391@huawei.com>
>>>> ---
>>>> arch/arm64/include/asm/sysreg.h | 4 +++-
>>>> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
>>>> 2 files changed, 17 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>>> index 463175f80341..be4c44ccdb56 100644
>>>> --- a/arch/arm64/include/asm/sysreg.h
>>>> +++ b/arch/arm64/include/asm/sysreg.h
>>>> @@ -281,7 +281,6 @@
>>>> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>>>> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
>>>> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>>>> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
>>>> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
>>>> @@ -769,6 +768,9 @@
>>>> #define ID_AA64DFR0_PMUVER_8_5 0x6
>>>> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>>>> +#define ID_AA64DFR0_PMSVER_8_2 0x1
>>>> +#define ID_AA64DFR0_PMSVER_8_3 0x2
>>>> +
>>>> #define ID_DFR0_PERFMON_SHIFT 24
>>>> #define ID_DFR0_PERFMON_8_1 0x4
>>>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>>>> index e51ddb6d63ed..5ec7ee0c8fa1 100644
>>>> --- a/drivers/perf/arm_spe_pmu.c
>>>> +++ b/drivers/perf/arm_spe_pmu.c
>>>> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
>>>> struct hlist_node hotplug_node;
>>>> int irq; /* PPI */
>>>> -
>>>> + int pmuver;
>>>> u16 min_period;
>>>> u16 counter_sz;
>>>> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
>>>> /* Keep track of our dynamic hotplug state */
>>>> static enum cpuhp_state arm_spe_pmu_online;
>>>> +static u64 sys_pmsevfr_el1_mask[] = {
>>>> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>>>> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
>>>> + BIT_ULL(1),
>>>> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>>>> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
>>>> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
>>>> +};
>>>> +
>>>> enum arm_spe_pmu_buf_fault_action {
>>>> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
>>>> SPE_PMU_BUF_FAULT_ACT_FATAL,
>>>> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>>>> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
>>>> return -ENOENT;
>>>> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
>>>> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
>>>> return -EOPNOTSUPP;
>>>> if (attr->exclude_idle)
>>>> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
>>>> fld, smp_processor_id());
>>>> return;
>>>> }
>>>> + spe_pmu->pmuver = fld;
>>>
>>> How do we deal with cases where we have big.LITTLE system with differing
>>> SPE versions ?
>>
>> Good point.
>>
>> The first question we need to answer is: how to define SPE version?
>> From my understanding, if SPE uses the same sample specification and
>> the same packet format, then we should consider the SPE is the same
>> version cross CPUs. So even some CPUs are ARMv8.2 and other CPUs are
>> ARMv8.3 variants, we still should take the SPE as the same version.
>>
>> And when read the SPE driver in the file drivers/perf/arm_spe_pmu.c and
>> I concluded that so far the SPE perf driver is to only support SPE-v1
>> with single instance, it cannot support a complex usage case like
>> below:
>>
>> CPU0-3: ARMv8.2 architecture with SPE
>> CPU4-7: ARMv8.3 architecture with SPE
>>
>> For this case, if we take SPE as two different versions, let's say
>> SPE-8.2 and SPE-8.3, then should the SPE driver need to create multi
>> perf PMU events? For example, we should create a perf PMU event
>> 'arm_spe_8.2' and another PMU event 'arm_spe_8.3'.
>
> As we have supported SPE v2 (ARMv8.3-SPE) now, if we add the new
> of_device_id: "arm,statistical-profiling-extension-v2" and the new
> platform_device_id: "arm,spe-v2", we may really support two instance now.
> Even two different versions of SPE pmus which work on different range of
> cores with the same PPI. Their functional scopes are the same as the PPI partitions.
>
>> Another option is we always take this as SPE-v1 and only create single
>> PMU event, just keep what's we are doing with the perf event
>> 'arm_spe_0', but the driver needs to dynamically detect SPE PMU version
>> number in the function arm_spe_pmu_event_init(), and then based on
>> version number to select corresponding mask for PMSEVFR.
>
> Thus, the driver will service two devices, and will also register two PMUs 'arm_spe_0' and
> 'arm_spe_1', so i think there is no conflict here.
Or the other option is to strictly support only one version - the lowest
version supported by the system, which is compatible with all the
others. This could be achieved by using the sanitised feature value of
the SPE version.
Will,
What do you prefer ?
>
> Back to Suzuki's question, refer to the ACPI parsing code for SPE device, function
> arm_spe_acpi_register_device(), there is a check of hetero_id for all cores.
> It seems that we only support homogeneous ACPI/SPE machines, but i can't find the similar
> check in OF/SPE parsing code.
I think the driver assumes that the system supports uniform version
of the SPE (which was valid when the driver was written).
Cheers
Suzuki
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-07-31 14:01 ` Suzuki K Poulose
0 siblings, 0 replies; 44+ messages in thread
From: Suzuki K Poulose @ 2020-07-31 14:01 UTC (permalink / raw)
To: liwei391, leo.yan, will
Cc: adrian.hunter, alexander.shishkin, acme, catalin.marinas,
James.Clark, jolsa, mark.rutland, namhyung, zhangshaokun,
linux-arm-kernel, linux-kernel, mingo, peterz, guohanjun
On 07/31/2020 01:18 PM, liwei (GF) wrote:
>
>
> On 2020/7/30 16:14, Leo Yan wrote:
>> Hi Suzuki,
>>
>> On Wed, Jul 29, 2020 at 10:12:50AM +0100, Suzuki Kuruppassery Poulose wrote:
>>> On 07/24/2020 10:16 AM, Wei Li wrote:
>>>> Armv8.3 extends the SPE by adding:
>>>> - Alignment field in the Events packet, and filtering on this event
>>>> using PMSEVFR_EL1.
>>>> - Support for the Scalable Vector Extension (SVE).
>>>>
>>>> The main additions for SVE are:
>>>> - Recording the vector length for SVE operations in the Operation Type
>>>> packet. It is not possible to filter on vector length.
>>>> - Incomplete predicate and empty predicate fields in the Events packet,
>>>> and filtering on these events using PMSEVFR_EL1.
>>>>
>>>> Update the check of pmsevfr for empty/partial predicated SVE and
>>>> alignment event in kernel driver.
>>>>
>>>> Signed-off-by: Wei Li <liwei391@huawei.com>
>>>> ---
>>>> arch/arm64/include/asm/sysreg.h | 4 +++-
>>>> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
>>>> 2 files changed, 17 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>>> index 463175f80341..be4c44ccdb56 100644
>>>> --- a/arch/arm64/include/asm/sysreg.h
>>>> +++ b/arch/arm64/include/asm/sysreg.h
>>>> @@ -281,7 +281,6 @@
>>>> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>>>> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
>>>> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>>>> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
>>>> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
>>>> @@ -769,6 +768,9 @@
>>>> #define ID_AA64DFR0_PMUVER_8_5 0x6
>>>> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>>>> +#define ID_AA64DFR0_PMSVER_8_2 0x1
>>>> +#define ID_AA64DFR0_PMSVER_8_3 0x2
>>>> +
>>>> #define ID_DFR0_PERFMON_SHIFT 24
>>>> #define ID_DFR0_PERFMON_8_1 0x4
>>>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>>>> index e51ddb6d63ed..5ec7ee0c8fa1 100644
>>>> --- a/drivers/perf/arm_spe_pmu.c
>>>> +++ b/drivers/perf/arm_spe_pmu.c
>>>> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
>>>> struct hlist_node hotplug_node;
>>>> int irq; /* PPI */
>>>> -
>>>> + int pmuver;
>>>> u16 min_period;
>>>> u16 counter_sz;
>>>> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
>>>> /* Keep track of our dynamic hotplug state */
>>>> static enum cpuhp_state arm_spe_pmu_online;
>>>> +static u64 sys_pmsevfr_el1_mask[] = {
>>>> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>>>> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
>>>> + BIT_ULL(1),
>>>> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>>>> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
>>>> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
>>>> +};
>>>> +
>>>> enum arm_spe_pmu_buf_fault_action {
>>>> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
>>>> SPE_PMU_BUF_FAULT_ACT_FATAL,
>>>> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>>>> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
>>>> return -ENOENT;
>>>> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
>>>> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
>>>> return -EOPNOTSUPP;
>>>> if (attr->exclude_idle)
>>>> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
>>>> fld, smp_processor_id());
>>>> return;
>>>> }
>>>> + spe_pmu->pmuver = fld;
>>>
>>> How do we deal with cases where we have big.LITTLE system with differing
>>> SPE versions ?
>>
>> Good point.
>>
>> The first question we need to answer is: how to define SPE version?
>> From my understanding, if SPE uses the same sample specification and
>> the same packet format, then we should consider the SPE is the same
>> version cross CPUs. So even some CPUs are ARMv8.2 and other CPUs are
>> ARMv8.3 variants, we still should take the SPE as the same version.
>>
>> And when read the SPE driver in the file drivers/perf/arm_spe_pmu.c and
>> I concluded that so far the SPE perf driver is to only support SPE-v1
>> with single instance, it cannot support a complex usage case like
>> below:
>>
>> CPU0-3: ARMv8.2 architecture with SPE
>> CPU4-7: ARMv8.3 architecture with SPE
>>
>> For this case, if we take SPE as two different versions, let's say
>> SPE-8.2 and SPE-8.3, then should the SPE driver need to create multi
>> perf PMU events? For example, we should create a perf PMU event
>> 'arm_spe_8.2' and another PMU event 'arm_spe_8.3'.
>
> As we have supported SPE v2 (ARMv8.3-SPE) now, if we add the new
> of_device_id: "arm,statistical-profiling-extension-v2" and the new
> platform_device_id: "arm,spe-v2", we may really support two instance now.
> Even two different versions of SPE pmus which work on different range of
> cores with the same PPI. Their functional scopes are the same as the PPI partitions.
>
>> Another option is we always take this as SPE-v1 and only create single
>> PMU event, just keep what's we are doing with the perf event
>> 'arm_spe_0', but the driver needs to dynamically detect SPE PMU version
>> number in the function arm_spe_pmu_event_init(), and then based on
>> version number to select corresponding mask for PMSEVFR.
>
> Thus, the driver will service two devices, and will also register two PMUs 'arm_spe_0' and
> 'arm_spe_1', so i think there is no conflict here.
Or the other option is to strictly support only one version - the lowest
version supported by the system, which is compatible with all the
others. This could be achieved by using the sanitised feature value of
the SPE version.
Will,
What do you prefer ?
>
> Back to Suzuki's question, refer to the ACPI parsing code for SPE device, function
> arm_spe_acpi_register_device(), there is a check of hetero_id for all cores.
> It seems that we only support homogeneous ACPI/SPE machines, but i can't find the similar
> check in OF/SPE parsing code.
I think the driver assumes that the system supports uniform version
of the SPE (which was valid when the driver was written).
Cheers
Suzuki
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-07-24 9:16 ` Wei Li
@ 2020-09-07 12:51 ` Will Deacon
-1 siblings, 0 replies; 44+ messages in thread
From: Will Deacon @ 2020-09-07 12:51 UTC (permalink / raw)
To: Wei Li
Cc: Mark Rutland, Suzuki K Poulose, Alexander Shishkin,
Catalin Marinas, guohanjun, Adrian Hunter,
Arnaldo Carvalho de Melo, linux-kernel, zhangshaokun,
Peter Zijlstra, Ingo Molnar, James Clark, Leo Yan, Namhyung Kim,
Jiri Olsa, linux-arm-kernel
On Fri, Jul 24, 2020 at 05:16:04PM +0800, Wei Li wrote:
> Armv8.3 extends the SPE by adding:
> - Alignment field in the Events packet, and filtering on this event
> using PMSEVFR_EL1.
> - Support for the Scalable Vector Extension (SVE).
>
> The main additions for SVE are:
> - Recording the vector length for SVE operations in the Operation Type
> packet. It is not possible to filter on vector length.
> - Incomplete predicate and empty predicate fields in the Events packet,
> and filtering on these events using PMSEVFR_EL1.
>
> Update the check of pmsevfr for empty/partial predicated SVE and
> alignment event in kernel driver.
>
> Signed-off-by: Wei Li <liwei391@huawei.com>
> ---
> arch/arm64/include/asm/sysreg.h | 4 +++-
> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
> 2 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 463175f80341..be4c44ccdb56 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -281,7 +281,6 @@
> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>
> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
I think we can just update this mask unconditionally to allow the new bits.
> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
> @@ -769,6 +768,9 @@
> #define ID_AA64DFR0_PMUVER_8_5 0x6
> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>
> +#define ID_AA64DFR0_PMSVER_8_2 0x1
> +#define ID_AA64DFR0_PMSVER_8_3 0x2
> +
> #define ID_DFR0_PERFMON_SHIFT 24
>
> #define ID_DFR0_PERFMON_8_1 0x4
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index e51ddb6d63ed..5ec7ee0c8fa1 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> struct hlist_node hotplug_node;
>
> int irq; /* PPI */
> -
> + int pmuver;
nit: please call this "pmsver" to align with the architecture (where
"pmuver" means something else).
> u16 min_period;
> u16 counter_sz;
>
> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> /* Keep track of our dynamic hotplug state */
> static enum cpuhp_state arm_spe_pmu_online;
>
> +static u64 sys_pmsevfr_el1_mask[] = {
> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> + BIT_ULL(1),
> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> +};
As I said above, you can drop this and just update the #define.
> +
> enum arm_spe_pmu_buf_fault_action {
> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> SPE_PMU_BUF_FAULT_ACT_FATAL,
> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> return -ENOENT;
>
> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> return -EOPNOTSUPP;
Same here.
>
> if (attr->exclude_idle)
> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> fld, smp_processor_id());
> return;
> }
> + spe_pmu->pmuver = fld;
>
> /* Read PMBIDR first to determine whether or not we have access */
> reg = read_sysreg_s(SYS_PMBIDR_EL1);
> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
> }
>
> dev_info(dev,
> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> - cpumask_pr_args(&spe_pmu->supported_cpus),
> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> + spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
There's no need for this. If userspace finds this information useful, then
we should expose it in sysfs, like we do for other PMU paramaters. If
userspace doesn't find it useful, then there's no need to expose it at all.
So I would suggest adding something like SPE_PMU_CAP_PMSVER and exposing the
field on a per-SPE-PMU basis in sysfs.
big.LITTLE should work as before, where we expose a completely separate PMU
instance for each CPU type.
Will
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-09-07 12:51 ` Will Deacon
0 siblings, 0 replies; 44+ messages in thread
From: Will Deacon @ 2020-09-07 12:51 UTC (permalink / raw)
To: Wei Li
Cc: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Catalin Marinas, James Clark, Jiri Olsa, Leo Yan, Mark Rutland,
Namhyung Kim, Suzuki K Poulose, zhangshaokun, linux-arm-kernel,
linux-kernel, Ingo Molnar, Peter Zijlstra, guohanjun
On Fri, Jul 24, 2020 at 05:16:04PM +0800, Wei Li wrote:
> Armv8.3 extends the SPE by adding:
> - Alignment field in the Events packet, and filtering on this event
> using PMSEVFR_EL1.
> - Support for the Scalable Vector Extension (SVE).
>
> The main additions for SVE are:
> - Recording the vector length for SVE operations in the Operation Type
> packet. It is not possible to filter on vector length.
> - Incomplete predicate and empty predicate fields in the Events packet,
> and filtering on these events using PMSEVFR_EL1.
>
> Update the check of pmsevfr for empty/partial predicated SVE and
> alignment event in kernel driver.
>
> Signed-off-by: Wei Li <liwei391@huawei.com>
> ---
> arch/arm64/include/asm/sysreg.h | 4 +++-
> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
> 2 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 463175f80341..be4c44ccdb56 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -281,7 +281,6 @@
> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>
> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
I think we can just update this mask unconditionally to allow the new bits.
> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
> @@ -769,6 +768,9 @@
> #define ID_AA64DFR0_PMUVER_8_5 0x6
> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>
> +#define ID_AA64DFR0_PMSVER_8_2 0x1
> +#define ID_AA64DFR0_PMSVER_8_3 0x2
> +
> #define ID_DFR0_PERFMON_SHIFT 24
>
> #define ID_DFR0_PERFMON_8_1 0x4
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index e51ddb6d63ed..5ec7ee0c8fa1 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
> struct hlist_node hotplug_node;
>
> int irq; /* PPI */
> -
> + int pmuver;
nit: please call this "pmsver" to align with the architecture (where
"pmuver" means something else).
> u16 min_period;
> u16 counter_sz;
>
> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
> /* Keep track of our dynamic hotplug state */
> static enum cpuhp_state arm_spe_pmu_online;
>
> +static u64 sys_pmsevfr_el1_mask[] = {
> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
> + BIT_ULL(1),
> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
> +};
As I said above, you can drop this and just update the #define.
> +
> enum arm_spe_pmu_buf_fault_action {
> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
> SPE_PMU_BUF_FAULT_ACT_FATAL,
> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
> return -ENOENT;
>
> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
> return -EOPNOTSUPP;
Same here.
>
> if (attr->exclude_idle)
> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
> fld, smp_processor_id());
> return;
> }
> + spe_pmu->pmuver = fld;
>
> /* Read PMBIDR first to determine whether or not we have access */
> reg = read_sysreg_s(SYS_PMBIDR_EL1);
> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
> }
>
> dev_info(dev,
> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> - cpumask_pr_args(&spe_pmu->supported_cpus),
> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
> + spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
There's no need for this. If userspace finds this information useful, then
we should expose it in sysfs, like we do for other PMU paramaters. If
userspace doesn't find it useful, then there's no need to expose it at all.
So I would suggest adding something like SPE_PMU_CAP_PMSVER and exposing the
field on a per-SPE-PMU basis in sysfs.
big.LITTLE should work as before, where we expose a completely separate PMU
instance for each CPU type.
Will
>
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
2020-09-07 12:51 ` Will Deacon
@ 2020-09-29 8:17 ` liwei (GF)
-1 siblings, 0 replies; 44+ messages in thread
From: liwei (GF) @ 2020-09-29 8:17 UTC (permalink / raw)
To: Will Deacon
Cc: Mark Rutland, Suzuki K Poulose, Alexander Shishkin,
Catalin Marinas, guohanjun, Adrian Hunter,
Arnaldo Carvalho de Melo, linux-kernel, zhangshaokun,
Peter Zijlstra, Ingo Molnar, James Clark, Leo Yan, Namhyung Kim,
Jiri Olsa, linux-arm-kernel
Hi Will,
On 2020/9/7 20:51, Will Deacon wrote:
> On Fri, Jul 24, 2020 at 05:16:04PM +0800, Wei Li wrote:
>> Armv8.3 extends the SPE by adding:
>> - Alignment field in the Events packet, and filtering on this event
>> using PMSEVFR_EL1.
>> - Support for the Scalable Vector Extension (SVE).
>>
>> The main additions for SVE are:
>> - Recording the vector length for SVE operations in the Operation Type
>> packet. It is not possible to filter on vector length.
>> - Incomplete predicate and empty predicate fields in the Events packet,
>> and filtering on these events using PMSEVFR_EL1.
>>
>> Update the check of pmsevfr for empty/partial predicated SVE and
>> alignment event in kernel driver.
>>
>> Signed-off-by: Wei Li <liwei391@huawei.com>
>> ---
>> arch/arm64/include/asm/sysreg.h | 4 +++-
>> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
>> 2 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>> index 463175f80341..be4c44ccdb56 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -281,7 +281,6 @@
>> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>>
>> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
>> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>
> I think we can just update this mask unconditionally to allow the new bits.
>
>> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
>> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
>> @@ -769,6 +768,9 @@
>> #define ID_AA64DFR0_PMUVER_8_5 0x6
>> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>>
>> +#define ID_AA64DFR0_PMSVER_8_2 0x1
>> +#define ID_AA64DFR0_PMSVER_8_3 0x2
>> +
>> #define ID_DFR0_PERFMON_SHIFT 24
>>
>> #define ID_DFR0_PERFMON_8_1 0x4
>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>> index e51ddb6d63ed..5ec7ee0c8fa1 100644
>> --- a/drivers/perf/arm_spe_pmu.c
>> +++ b/drivers/perf/arm_spe_pmu.c
>> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
>> struct hlist_node hotplug_node;
>>
>> int irq; /* PPI */
>> -
>> + int pmuver;
>
> nit: please call this "pmsver" to align with the architecture (where
> "pmuver" means something else).
OK, i will rename it in v2.
>> u16 min_period;
>> u16 counter_sz;
>>
>> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
>> /* Keep track of our dynamic hotplug state */
>> static enum cpuhp_state arm_spe_pmu_online;
>>
>> +static u64 sys_pmsevfr_el1_mask[] = {
>> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
>> + BIT_ULL(1),
>> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
>> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
>> +};
>
> As I said above, you can drop this and just update the #define.
>
>> +
>> enum arm_spe_pmu_buf_fault_action {
>> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
>> SPE_PMU_BUF_FAULT_ACT_FATAL,
>> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
>> return -ENOENT;
>>
>> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
>> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
>> return -EOPNOTSUPP;
>
> Same here.
What if we use these new bits on the system which just support ARMv8.2-SPE?
It will return success and never work, i don't think that is suitable.
>>
>> if (attr->exclude_idle)
>> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
>> fld, smp_processor_id());
>> return;
>> }
>> + spe_pmu->pmuver = fld;
>>
>> /* Read PMBIDR first to determine whether or not we have access */
>> reg = read_sysreg_s(SYS_PMBIDR_EL1);
>> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
>> }
>>
>> dev_info(dev,
>> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
>> - cpumask_pr_args(&spe_pmu->supported_cpus),
>> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
>> + spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
>
> There's no need for this. If userspace finds this information useful, then
> we should expose it in sysfs, like we do for other PMU paramaters. If
> userspace doesn't find it useful, then there's no need to expose it at all.
>
> So I would suggest adding something like SPE_PMU_CAP_PMSVER and exposing the
> field on a per-SPE-PMU basis in sysfs.
We may need this as then we can know which events are supported. It is meaningful to testcases.
So i will expose it as cap attribute in v2.
> big.LITTLE should work as before, where we expose a completely separate PMU
> instance for each CPU type.
>
The of_compatible of SPE PMU is "arm,statistical-profiling-extension-v1", and the platform_device
name is "arm,spe-v1". Should we add a "v2" entry for ARMv8.3-SPE or not?
Thanks for your time.
Best regards,
Wei
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 44+ messages in thread* Re: [PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE
@ 2020-09-29 8:17 ` liwei (GF)
0 siblings, 0 replies; 44+ messages in thread
From: liwei (GF) @ 2020-09-29 8:17 UTC (permalink / raw)
To: Will Deacon
Cc: Adrian Hunter, Alexander Shishkin, Arnaldo Carvalho de Melo,
Catalin Marinas, James Clark, Jiri Olsa, Leo Yan, Mark Rutland,
Namhyung Kim, Suzuki K Poulose, zhangshaokun, linux-arm-kernel,
linux-kernel, Ingo Molnar, Peter Zijlstra, guohanjun
Hi Will,
On 2020/9/7 20:51, Will Deacon wrote:
> On Fri, Jul 24, 2020 at 05:16:04PM +0800, Wei Li wrote:
>> Armv8.3 extends the SPE by adding:
>> - Alignment field in the Events packet, and filtering on this event
>> using PMSEVFR_EL1.
>> - Support for the Scalable Vector Extension (SVE).
>>
>> The main additions for SVE are:
>> - Recording the vector length for SVE operations in the Operation Type
>> packet. It is not possible to filter on vector length.
>> - Incomplete predicate and empty predicate fields in the Events packet,
>> and filtering on these events using PMSEVFR_EL1.
>>
>> Update the check of pmsevfr for empty/partial predicated SVE and
>> alignment event in kernel driver.
>>
>> Signed-off-by: Wei Li <liwei391@huawei.com>
>> ---
>> arch/arm64/include/asm/sysreg.h | 4 +++-
>> drivers/perf/arm_spe_pmu.c | 18 ++++++++++++++----
>> 2 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>> index 463175f80341..be4c44ccdb56 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -281,7 +281,6 @@
>> #define SYS_PMSFCR_EL1_ST_SHIFT 18
>>
>> #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
>> -#define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL
>
> I think we can just update this mask unconditionally to allow the new bits.
>
>> #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
>> #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
>> @@ -769,6 +768,9 @@
>> #define ID_AA64DFR0_PMUVER_8_5 0x6
>> #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
>>
>> +#define ID_AA64DFR0_PMSVER_8_2 0x1
>> +#define ID_AA64DFR0_PMSVER_8_3 0x2
>> +
>> #define ID_DFR0_PERFMON_SHIFT 24
>>
>> #define ID_DFR0_PERFMON_8_1 0x4
>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>> index e51ddb6d63ed..5ec7ee0c8fa1 100644
>> --- a/drivers/perf/arm_spe_pmu.c
>> +++ b/drivers/perf/arm_spe_pmu.c
>> @@ -54,7 +54,7 @@ struct arm_spe_pmu {
>> struct hlist_node hotplug_node;
>>
>> int irq; /* PPI */
>> -
>> + int pmuver;
>
> nit: please call this "pmsver" to align with the architecture (where
> "pmuver" means something else).
OK, i will rename it in v2.
>> u16 min_period;
>> u16 counter_sz;
>>
>> @@ -80,6 +80,15 @@ struct arm_spe_pmu {
>> /* Keep track of our dynamic hotplug state */
>> static enum cpuhp_state arm_spe_pmu_online;
>>
>> +static u64 sys_pmsevfr_el1_mask[] = {
>> + [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>> + GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
>> + BIT_ULL(1),
>> + [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
>> + GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
>> + BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
>> +};
>
> As I said above, you can drop this and just update the #define.
>
>> +
>> enum arm_spe_pmu_buf_fault_action {
>> SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
>> SPE_PMU_BUF_FAULT_ACT_FATAL,
>> @@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
>> !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
>> return -ENOENT;
>>
>> - if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
>> + if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
>> return -EOPNOTSUPP;
>
> Same here.
What if we use these new bits on the system which just support ARMv8.2-SPE?
It will return success and never work, i don't think that is suitable.
>>
>> if (attr->exclude_idle)
>> @@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
>> fld, smp_processor_id());
>> return;
>> }
>> + spe_pmu->pmuver = fld;
>>
>> /* Read PMBIDR first to determine whether or not we have access */
>> reg = read_sysreg_s(SYS_PMBIDR_EL1);
>> @@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
>> }
>>
>> dev_info(dev,
>> - "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
>> - cpumask_pr_args(&spe_pmu->supported_cpus),
>> + "v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
>> + spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
>
> There's no need for this. If userspace finds this information useful, then
> we should expose it in sysfs, like we do for other PMU paramaters. If
> userspace doesn't find it useful, then there's no need to expose it at all.
>
> So I would suggest adding something like SPE_PMU_CAP_PMSVER and exposing the
> field on a per-SPE-PMU basis in sysfs.
We may need this as then we can know which events are supported. It is meaningful to testcases.
So i will expose it as cap attribute in v2.
> big.LITTLE should work as before, where we expose a completely separate PMU
> instance for each CPU type.
>
The of_compatible of SPE PMU is "arm,statistical-profiling-extension-v1", and the platform_device
name is "arm,spe-v1". Should we add a "v2" entry for ARMv8.3-SPE or not?
Thanks for your time.
Best regards,
Wei
^ permalink raw reply [flat|nested] 44+ messages in thread