* [PATCH 5/7] powerpc/perf: Update Power PMU cache_events to u64 type
From: Athira Rajeev @ 2020-06-05 7:57 UTC (permalink / raw)
To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
In-Reply-To: <1591343830-8286-1-git-send-email-atrajeev@linux.vnet.ibm.com>
Events of type PERF_TYPE_HW_CACHE was described for Power PMU
as: int (*cache_events)[type][op][result];
where type, op, result values unpacked from the event attribute config
value is used to generate the raw event code at runtime.
So far the event code values which used to create these cache-related
events were within 32 bit and `int` type worked. In power10,
some of the event codes are of 64-bit value and hence update the
Power PMU cache_events to `u64` type in `power_pmu` struct.
Also propagate this change to existing all PMU driver code paths
which are using ppmu->cache_events.
Signed-off-by: Athira Rajeev<atrajeev@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/perf_event_server.h | 2 +-
arch/powerpc/perf/core-book3s.c | 2 +-
arch/powerpc/perf/generic-compat-pmu.c | 2 +-
arch/powerpc/perf/mpc7450-pmu.c | 2 +-
arch/powerpc/perf/power5+-pmu.c | 2 +-
arch/powerpc/perf/power5-pmu.c | 2 +-
arch/powerpc/perf/power6-pmu.c | 2 +-
arch/powerpc/perf/power7-pmu.c | 2 +-
arch/powerpc/perf/power8-pmu.c | 2 +-
arch/powerpc/perf/power9-pmu.c | 2 +-
arch/powerpc/perf/ppc970-pmu.c | 2 +-
11 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 895aeaa..cb207f8 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -47,7 +47,7 @@ struct power_pmu {
const struct attribute_group **attr_groups;
int n_generic;
int *generic_events;
- int (*cache_events)[PERF_COUNT_HW_CACHE_MAX]
+ u64 (*cache_events)[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX];
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 9db72cd..6de81d1 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1818,7 +1818,7 @@ static void hw_perf_event_destroy(struct perf_event *event)
static int hw_perf_cache_event(u64 config, u64 *eventp)
{
unsigned long type, op, result;
- int ev;
+ u64 ev;
if (!ppmu->cache_events)
return -EINVAL;
diff --git a/arch/powerpc/perf/generic-compat-pmu.c b/arch/powerpc/perf/generic-compat-pmu.c
index 5e5a54d..eb8a6aaf 100644
--- a/arch/powerpc/perf/generic-compat-pmu.c
+++ b/arch/powerpc/perf/generic-compat-pmu.c
@@ -101,7 +101,7 @@ enum {
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
-static int generic_compat_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 generic_compat_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[ C(L1D) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0,
diff --git a/arch/powerpc/perf/mpc7450-pmu.c b/arch/powerpc/perf/mpc7450-pmu.c
index 4d5ef92..cf1eb89 100644
--- a/arch/powerpc/perf/mpc7450-pmu.c
+++ b/arch/powerpc/perf/mpc7450-pmu.c
@@ -354,7 +354,7 @@ static void mpc7450_disable_pmc(unsigned int pmc, unsigned long mmcr[])
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
-static int mpc7450_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 mpc7450_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[C(L1D)] = { /* RESULT_ACCESS RESULT_MISS */
[C(OP_READ)] = { 0, 0x225 },
[C(OP_WRITE)] = { 0, 0x227 },
diff --git a/arch/powerpc/perf/power5+-pmu.c b/arch/powerpc/perf/power5+-pmu.c
index f857454..9252281 100644
--- a/arch/powerpc/perf/power5+-pmu.c
+++ b/arch/powerpc/perf/power5+-pmu.c
@@ -618,7 +618,7 @@ static void power5p_disable_pmc(unsigned int pmc, unsigned long mmcr[])
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
-static int power5p_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power5p_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[C(L1D)] = { /* RESULT_ACCESS RESULT_MISS */
[C(OP_READ)] = { 0x1c10a8, 0x3c1088 },
[C(OP_WRITE)] = { 0x2c10a8, 0xc10c3 },
diff --git a/arch/powerpc/perf/power5-pmu.c b/arch/powerpc/perf/power5-pmu.c
index da52eca..3b36630 100644
--- a/arch/powerpc/perf/power5-pmu.c
+++ b/arch/powerpc/perf/power5-pmu.c
@@ -560,7 +560,7 @@ static void power5_disable_pmc(unsigned int pmc, unsigned long mmcr[])
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
-static int power5_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power5_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[C(L1D)] = { /* RESULT_ACCESS RESULT_MISS */
[C(OP_READ)] = { 0x4c1090, 0x3c1088 },
[C(OP_WRITE)] = { 0x3c1090, 0xc10c3 },
diff --git a/arch/powerpc/perf/power6-pmu.c b/arch/powerpc/perf/power6-pmu.c
index 3929cac..540b78d 100644
--- a/arch/powerpc/perf/power6-pmu.c
+++ b/arch/powerpc/perf/power6-pmu.c
@@ -481,7 +481,7 @@ static void p6_disable_pmc(unsigned int pmc, unsigned long mmcr[])
* are event codes.
* The "DTLB" and "ITLB" events relate to the DERAT and IERAT.
*/
-static int power6_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power6_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[C(L1D)] = { /* RESULT_ACCESS RESULT_MISS */
[C(OP_READ)] = { 0x280030, 0x80080 },
[C(OP_WRITE)] = { 0x180032, 0x80088 },
diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index a137813..2b7f375 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -332,7 +332,7 @@ static void power7_disable_pmc(unsigned int pmc, unsigned long mmcr[])
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
-static int power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[C(L1D)] = { /* RESULT_ACCESS RESULT_MISS */
[C(OP_READ)] = { 0xc880, 0x400f0 },
[C(OP_WRITE)] = { 0, 0x300f0 },
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 3a5fcc2..5282e84 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -253,7 +253,7 @@ static void power8_config_bhrb(u64 pmu_bhrb_filter)
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
-static int power8_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power8_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[ C(L1D) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = PM_LD_REF_L1,
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 08c3ef7..05dae38 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -310,7 +310,7 @@ static void power9_config_bhrb(u64 pmu_bhrb_filter)
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
-static int power9_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power9_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[ C(L1D) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = PM_LD_REF_L1,
diff --git a/arch/powerpc/perf/ppc970-pmu.c b/arch/powerpc/perf/ppc970-pmu.c
index 4035d93..2970d1e 100644
--- a/arch/powerpc/perf/ppc970-pmu.c
+++ b/arch/powerpc/perf/ppc970-pmu.c
@@ -432,7 +432,7 @@ static void p970_disable_pmc(unsigned int pmc, unsigned long mmcr[])
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
-static int ppc970_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 ppc970_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[C(L1D)] = { /* RESULT_ACCESS RESULT_MISS */
[C(OP_READ)] = { 0x8810, 0x3810 },
[C(OP_WRITE)] = { 0x7810, 0x813 },
--
1.8.3.1
^ permalink raw reply related
* [PATCH 4/7] powerpc/perf: Add power10_feat to dt_cpu_ftrs
From: Athira Rajeev @ 2020-06-05 7:57 UTC (permalink / raw)
To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
In-Reply-To: <1591343830-8286-1-git-send-email-atrajeev@linux.vnet.ibm.com>
From: Madhavan Srinivasan <maddy@linux.ibm.com>
Add power10 feature function to dt_cpu_ftrs.c along
with a power10 specific init() to initialize pmu sprs.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
arch/powerpc/include/asm/reg.h | 3 +++
arch/powerpc/kernel/cpu_setup_power.S | 7 +++++++
arch/powerpc/kernel/dt_cpu_ftrs.c | 26 ++++++++++++++++++++++++++
3 files changed, 36 insertions(+)
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 21a1b2d..900ada1 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1068,6 +1068,9 @@
#define MMCR0_PMC2_LOADMISSTIME 0x5
#endif
+/* BHRB disable bit for PowerISA v3.10 */
+#define MMCRA_BHRB_DISABLE 0x0000002000000000
+
/*
* SPRG usage:
*
diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index efdcfa7..e8b3370c 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -233,3 +233,10 @@ __init_PMU_ISA207:
li r5,0
mtspr SPRN_MMCRS,r5
blr
+
+__init_PMU_ISA31:
+ li r5,0
+ mtspr SPRN_MMCR3,r5
+ LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
+ mtspr SPRN_MMCRA,r5
+ blr
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 3a40951..f482286 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -450,6 +450,31 @@ static int __init feat_enable_pmu_power9(struct dt_cpu_feature *f)
return 1;
}
+static void init_pmu_power10(void)
+{
+ init_pmu_power9();
+
+ mtspr(SPRN_MMCR3, 0);
+ mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
+}
+
+static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
+{
+ hfscr_pmu_enable();
+
+ init_pmu_power10();
+ init_pmu_registers = init_pmu_power10;
+
+ cur_cpu_spec->cpu_features |= CPU_FTR_MMCRA;
+ cur_cpu_spec->cpu_user_features |= PPC_FEATURE_PSERIES_PERFMON_COMPAT;
+
+ cur_cpu_spec->num_pmcs = 6;
+ cur_cpu_spec->pmc_type = PPC_PMC_IBM;
+ cur_cpu_spec->oprofile_cpu_type = "ppc64/power10";
+
+ return 1;
+}
+
static int __init feat_enable_tm(struct dt_cpu_feature *f)
{
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -639,6 +664,7 @@ struct dt_cpu_feature_match {
{"pc-relative-addressing", feat_enable, 0},
{"machine-check-power9", feat_enable_mce_power9, 0},
{"performance-monitor-power9", feat_enable_pmu_power9, 0},
+ {"performance-monitor-power10", feat_enable_pmu_power10, 0},
{"event-based-branch-v3", feat_enable, 0},
{"random-number-generator", feat_enable, 0},
{"system-call-vectored", feat_disable, 0},
--
1.8.3.1
^ permalink raw reply related
* [PATCH 2/7] KVM: PPC: Book3S HV: Save/restore new PMU registers
From: Athira Rajeev @ 2020-06-05 7:57 UTC (permalink / raw)
To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
In-Reply-To: <1591343830-8286-1-git-send-email-atrajeev@linux.vnet.ibm.com>
PowerISA v3.1 has added new performance monitoring unit (PMU)
special purpose registers (SPRs). They are
Monitor Mode Control Register 3 (MMCR3)
Sampled Instruction Event Register A (SIER2)
Sampled Instruction Event Register B (SIER3)
Patch addes support to save/restore these new
SPRs while entering/exiting guest.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/kvm_book3s_asm.h | 2 +-
arch/powerpc/include/asm/kvm_host.h | 4 ++--
arch/powerpc/kernel/asm-offsets.c | 3 +++
arch/powerpc/kvm/book3s_hv.c | 6 ++++--
arch/powerpc/kvm/book3s_hv_interrupts.S | 8 ++++++++
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 24 ++++++++++++++++++++++++
6 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 45704f2..078f464 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -119,7 +119,7 @@ struct kvmppc_host_state {
void __iomem *xive_tima_virt;
u32 saved_xirr;
u64 dabr;
- u64 host_mmcr[7]; /* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER */
+ u64 host_mmcr[10]; /* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER, MMCR3, SIER2/3 */
u32 host_pmc[8];
u64 host_purr;
u64 host_spurr;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 1dc6310..8bc2122 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -637,12 +637,12 @@ struct kvm_vcpu_arch {
u32 ccr1;
u32 dbsr;
- u64 mmcr[5];
+ u64 mmcr[6];
u32 pmc[8];
u32 spmc[2];
u64 siar;
u64 sdar;
- u64 sier;
+ u64 sier[3];
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
u64 tfhar;
u64 texasr;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9b9cde0..3ec3f37 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -697,6 +697,9 @@ int main(void)
HSTATE_FIELD(HSTATE_SDAR, host_mmcr[4]);
HSTATE_FIELD(HSTATE_MMCR2, host_mmcr[5]);
HSTATE_FIELD(HSTATE_SIER, host_mmcr[6]);
+ HSTATE_FIELD(HSTATE_MMCR3, host_mmcr[7]);
+ HSTATE_FIELD(HSTATE_SIER2, host_mmcr[8]);
+ HSTATE_FIELD(HSTATE_SIER3, host_mmcr[9]);
HSTATE_FIELD(HSTATE_PMC1, host_pmc[0]);
HSTATE_FIELD(HSTATE_PMC2, host_pmc[1]);
HSTATE_FIELD(HSTATE_PMC3, host_pmc[2]);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a0cf175..9e3840b 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1695,7 +1695,8 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
*val = get_reg_val(id, vcpu->arch.sdar);
break;
case KVM_REG_PPC_SIER:
- *val = get_reg_val(id, vcpu->arch.sier);
+ i = id - KVM_REG_PPC_SIER;
+ *val = get_reg_val(id, vcpu->arch.sier[i]);
break;
case KVM_REG_PPC_IAMR:
*val = get_reg_val(id, vcpu->arch.iamr);
@@ -1916,7 +1917,8 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
vcpu->arch.sdar = set_reg_val(id, *val);
break;
case KVM_REG_PPC_SIER:
- vcpu->arch.sier = set_reg_val(id, *val);
+ i = id - KVM_REG_PPC_SIER;
+ vcpu->arch.sier[i] = set_reg_val(id, *val);
break;
case KVM_REG_PPC_IAMR:
vcpu->arch.iamr = set_reg_val(id, *val);
diff --git a/arch/powerpc/kvm/book3s_hv_interrupts.S b/arch/powerpc/kvm/book3s_hv_interrupts.S
index 63fd81f..59822cb 100644
--- a/arch/powerpc/kvm/book3s_hv_interrupts.S
+++ b/arch/powerpc/kvm/book3s_hv_interrupts.S
@@ -140,6 +140,14 @@ BEGIN_FTR_SECTION
std r8, HSTATE_MMCR2(r13)
std r9, HSTATE_SIER(r13)
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
+BEGIN_FTR_SECTION
+ mfspr r5, SPRN_MMCR3
+ mfspr r6, SPRN_SIER2
+ mfspr r7, SPRN_SIER3
+ std r5, HSTATE_MMCR3(r13)
+ std r6, HSTATE_SIER2(r13)
+ std r7, HSTATE_SIER3(r13)
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
mfspr r3, SPRN_PMC1
mfspr r5, SPRN_PMC2
mfspr r6, SPRN_PMC3
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 7194389..57b6c14 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -3436,6 +3436,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_PMAO_BUG)
mtspr SPRN_SIAR, r7
mtspr SPRN_SDAR, r8
BEGIN_FTR_SECTION
+ ld r5, VCPU_MMCR + 40(r4)
+ ld r6, VCPU_SIER + 8(r4)
+ ld r7, VCPU_SIER + 16(r4)
+ mtspr SPRN_MMCR3, r5
+ mtspr SPRN_SIER2, r6
+ mtspr SPRN_SIER3, r7
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
+BEGIN_FTR_SECTION
ld r5, VCPU_MMCR + 24(r4)
ld r6, VCPU_SIER(r4)
mtspr SPRN_MMCR2, r5
@@ -3496,6 +3504,14 @@ BEGIN_FTR_SECTION
mtspr SPRN_MMCR2, r8
mtspr SPRN_SIER, r9
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
+BEGIN_FTR_SECTION
+ ld r5, HSTATE_MMCR3(r13)
+ ld r6, HSTATE_SIER2(r13)
+ ld r7, HSTATE_SIER3(r13)
+ mtspr SPRN_MMCR3, r5
+ mtspr SPRN_SIER2, r6
+ mtspr SPRN_SIER3, r7
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
mtspr SPRN_MMCR0, r3
isync
mtlr r0
@@ -3555,6 +3571,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
BEGIN_FTR_SECTION
std r10, VCPU_MMCR + 24(r9)
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
+BEGIN_FTR_SECTION
+ mfspr r5, SPRN_MMCR3
+ mfspr r6, SPRN_SIER2
+ mfspr r7, SPRN_SIER3
+ std r5, VCPU_MMCR + 40(r9)
+ std r6, VCPU_SIER + 8(r9)
+ std r7, VCPU_SIER + 16(r9)
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
std r7, VCPU_SIAR(r9)
std r8, VCPU_SDAR(r9)
mfspr r3, SPRN_PMC1
--
1.8.3.1
^ permalink raw reply related
* [PATCH 3/7] powerpc/xmon: Add PowerISA v3.1 PMU SPRs
From: Athira Rajeev @ 2020-06-05 7:57 UTC (permalink / raw)
To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
In-Reply-To: <1591343830-8286-1-git-send-email-atrajeev@linux.vnet.ibm.com>
From: Madhavan Srinivasan <maddy@linux.ibm.com>
PowerISA v3.1 added three new perfromance
monitoring unit (PMU) speical purpose register (SPR).
They are Monitor Mode Control Register 3 (MMCR3),
Sampled Instruction Event Register 2 (SIER2),
Sampled Instruction Event Register 3 (SIER3).
Patch here adds a new dump function dump_310_sprs
to print these SPR values.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
arch/powerpc/xmon/xmon.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index b34d703..6fd5f4e 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2023,6 +2023,20 @@ static void dump_300_sprs(void)
#endif
}
+static void dump_310_sprs(void)
+{
+#ifdef CONFIG_PPC64
+ if (!cpu_has_feature(CPU_FTR_ARCH_31))
+ return;
+
+ printf("mmcr3 = %.16lx\n",
+ mfspr(SPRN_MMCR3));
+
+ printf("sier2 = %.16lx sier3 = %.16lx\n",
+ mfspr(SPRN_SIER2), mfspr(SPRN_SIER3));
+#endif
+}
+
static void dump_one_spr(int spr, bool show_unimplemented)
{
unsigned long val;
@@ -2077,6 +2091,7 @@ static void super_regs(void)
dump_206_sprs();
dump_207_sprs();
dump_300_sprs();
+ dump_310_sprs();
return;
}
--
1.8.3.1
^ permalink raw reply related
* [PATCH 1/7] powerpc/perf: Add support for ISA3.1 PMU SPRs
From: Athira Rajeev @ 2020-06-05 7:57 UTC (permalink / raw)
To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
In-Reply-To: <1591343830-8286-1-git-send-email-atrajeev@linux.vnet.ibm.com>
From: Madhavan Srinivasan <maddy@linux.ibm.com>
PowerISA v3.1 includes new performance monitoring unit(PMU)
special purpose registers (SPRs). They are
Monitor Mode Control Register 3 (MMCR3)
Sampled Instruction Event Register 2 (SIER2)
Sampled Instruction Event Register 3 (SIER3)
MMCR3 is added for further sampling related configuration
control. SIER2/SIER3 are added to provide additional
information about the sampled instruction.
Patch adds new PPMU flag called "PPMU_ARCH_310S" to support
handling of these new SPRs, updates the struct thread_struct
to include these new SPRs, increase the size of mmcr[] array
by one to include MMCR3 in struct cpu_hw_event. This is needed
to support programming of MMCR3 SPR during event_[enable/disable].
Patch also adds the sysfs support for the MMCR3 SPR along with
SPRN_ macros for these new pmu sprs.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
arch/powerpc/include/asm/perf_event_server.h | 1 +
arch/powerpc/include/asm/processor.h | 4 ++++
arch/powerpc/include/asm/reg.h | 6 ++++++
arch/powerpc/kernel/sysfs.c | 8 ++++++++
arch/powerpc/perf/core-book3s.c | 29 ++++++++++++++++++++++++++--
5 files changed, 46 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 3e9703f..895aeaa 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -69,6 +69,7 @@ struct power_pmu {
#define PPMU_HAS_SIER 0x00000040 /* Has SIER */
#define PPMU_ARCH_207S 0x00000080 /* PMC is architecture v2.07S */
#define PPMU_NO_SIAR 0x00000100 /* Do not use SIAR */
+#define PPMU_ARCH_310S 0x00000200 /* Has MMCR3, SIER2 and SIER3 */
/*
* Values for flags to get_alternatives()
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 52a6783..a466e94 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -272,6 +272,10 @@ struct thread_struct {
unsigned mmcr0;
unsigned used_ebb;
+ unsigned long mmcr3;
+ unsigned long sier2;
+ unsigned long sier3;
+
#endif
};
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 88e6c78..21a1b2d 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -876,7 +876,9 @@
#define MMCR0_FCHV 0x00000001UL /* freeze conditions in hypervisor mode */
#define SPRN_MMCR1 798
#define SPRN_MMCR2 785
+#define SPRN_MMCR3 754
#define SPRN_UMMCR2 769
+#define SPRN_UMMCR3 738
#define SPRN_MMCRA 0x312
#define MMCRA_SDSYNC 0x80000000UL /* SDAR synced with SIAR */
#define MMCRA_SDAR_DCACHE_MISS 0x40000000UL
@@ -918,6 +920,10 @@
#define SIER_SIHV 0x1000000 /* Sampled MSR_HV */
#define SIER_SIAR_VALID 0x0400000 /* SIAR contents valid */
#define SIER_SDAR_VALID 0x0200000 /* SDAR contents valid */
+#define SPRN_SIER2 752
+#define SPRN_SIER3 753
+#define SPRN_USIER2 736
+#define SPRN_USIER3 737
#define SPRN_SIAR 796
#define SPRN_SDAR 797
#define SPRN_TACR 888
diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 571b325..46b4ebc 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -622,8 +622,10 @@ void ppc_enable_pmcs(void)
SYSFS_PMCSETUP(pmc8, SPRN_PMC8);
SYSFS_PMCSETUP(mmcra, SPRN_MMCRA);
+SYSFS_PMCSETUP(mmcr3, SPRN_MMCR3);
static DEVICE_ATTR(mmcra, 0600, show_mmcra, store_mmcra);
+static DEVICE_ATTR(mmcr3, 0600, show_mmcr3, store_mmcr3);
#endif /* HAS_PPC_PMC56 */
@@ -886,6 +888,9 @@ static int register_cpu_online(unsigned int cpu)
#ifdef CONFIG_PMU_SYSFS
if (cpu_has_feature(CPU_FTR_MMCRA))
device_create_file(s, &dev_attr_mmcra);
+
+ if (cpu_has_feature(CPU_FTR_ARCH_31))
+ device_create_file(s, &dev_attr_mmcr3);
#endif /* CONFIG_PMU_SYSFS */
if (cpu_has_feature(CPU_FTR_PURR)) {
@@ -980,6 +985,9 @@ static int unregister_cpu_online(unsigned int cpu)
#ifdef CONFIG_PMU_SYSFS
if (cpu_has_feature(CPU_FTR_MMCRA))
device_remove_file(s, &dev_attr_mmcra);
+
+ if (cpu_has_feature(CPU_FTR_ARCH_31))
+ device_remove_file(s, &dev_attr_mmcr3);
#endif /* CONFIG_PMU_SYSFS */
if (cpu_has_feature(CPU_FTR_PURR)) {
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 13b9dd5..9db72cd 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -39,10 +39,10 @@ struct cpu_hw_events {
unsigned int flags[MAX_HWEVENTS];
/*
* The order of the MMCR array is:
- * - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
+ * - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
* - 32-bit, MMCR0, MMCR1, MMCR2
*/
- unsigned long mmcr[4];
+ unsigned long mmcr[5];
struct perf_event *limited_counter[MAX_LIMITED_HWCOUNTERS];
u8 limited_hwidx[MAX_LIMITED_HWCOUNTERS];
u64 alternatives[MAX_HWEVENTS][MAX_EVENT_ALTERNATIVES];
@@ -584,6 +584,11 @@ static void ebb_switch_out(unsigned long mmcr0)
current->thread.sdar = mfspr(SPRN_SDAR);
current->thread.mmcr0 = mmcr0 & MMCR0_USER_MASK;
current->thread.mmcr2 = mfspr(SPRN_MMCR2) & MMCR2_USER_MASK;
+ if (ppmu->flags & PPMU_ARCH_310S) {
+ current->thread.mmcr3 = mfspr(SPRN_MMCR3);
+ current->thread.sier2 = mfspr(SPRN_SIER2);
+ current->thread.sier3 = mfspr(SPRN_SIER3);
+ }
}
static unsigned long ebb_switch_in(bool ebb, struct cpu_hw_events *cpuhw)
@@ -623,6 +628,12 @@ static unsigned long ebb_switch_in(bool ebb, struct cpu_hw_events *cpuhw)
* instead manage the MMCR2 entirely by itself.
*/
mtspr(SPRN_MMCR2, cpuhw->mmcr[3] | current->thread.mmcr2);
+
+ if (ppmu->flags & PPMU_ARCH_310S) {
+ mtspr(SPRN_MMCR3, current->thread.mmcr3);
+ mtspr(SPRN_SIER2, current->thread.sier2);
+ mtspr(SPRN_SIER3, current->thread.sier3);
+ }
out:
return mmcr0;
}
@@ -843,6 +854,11 @@ void perf_event_print_debug(void)
pr_info("EBBRR: %016lx BESCR: %016lx\n",
mfspr(SPRN_EBBRR), mfspr(SPRN_BESCR));
}
+
+ if (ppmu->flags & PPMU_ARCH_310S) {
+ pr_info("MMCR3: %016lx SIER2: %016lx SIER3: %016lx\n",
+ mfspr(SPRN_MMCR3), mfspr(SPRN_SIER2), mfspr(SPRN_SIER3));
+ }
#endif
pr_info("SIAR: %016lx SDAR: %016lx SIER: %016lx\n",
mfspr(SPRN_SIAR), sdar, sier);
@@ -1308,6 +1324,10 @@ static void power_pmu_enable(struct pmu *pmu)
if (!cpuhw->n_added) {
mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
+#ifdef CONFIG_PPC64
+ if (ppmu->flags & PPMU_ARCH_310S)
+ mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
+#endif /* CONFIG_PPC64 */
goto out_enable;
}
@@ -1351,6 +1371,11 @@ static void power_pmu_enable(struct pmu *pmu)
if (ppmu->flags & PPMU_ARCH_207S)
mtspr(SPRN_MMCR2, cpuhw->mmcr[3]);
+#ifdef CONFIG_PPC64
+ if (ppmu->flags & PPMU_ARCH_310S)
+ mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
+#endif /* CONFIG_PPC64 */
+
/*
* Read off any pre-existing events that need to move
* to another PMC.
--
1.8.3.1
^ permalink raw reply related
* [PATCH 0/7] powerpc/perf: Add support for power10 PMU Hardware
From: Athira Rajeev @ 2020-06-05 7:57 UTC (permalink / raw)
To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
The patch series adds support for power10 PMU hardware.
And code changes are based on powerpc/next.
Athira Rajeev (4):
KVM: PPC: Book3S HV: Save/restore new PMU registers
powerpc/perf: Update Power PMU cache_events to u64 type
powerpc/perf: power10 Performance Monitoring support
powerpc/perf: support BHRB disable bit and new filtering modes
Madhavan Srinivasan (3):
powerpc/perf: Add support for ISA3.1 PMU SPRs
powerpc/xmon: Add PowerISA v3.1 PMU SPRs
powerpc/perf: Add power10_feat to dt_cpu_ftrs
arch/powerpc/include/asm/kvm_book3s_asm.h | 2 +-
arch/powerpc/include/asm/kvm_host.h | 4 +-
arch/powerpc/include/asm/perf_event_server.h | 3 +-
arch/powerpc/include/asm/processor.h | 4 +
arch/powerpc/include/asm/reg.h | 9 +
arch/powerpc/kernel/asm-offsets.c | 3 +
arch/powerpc/kernel/cpu_setup_power.S | 7 +
arch/powerpc/kernel/dt_cpu_ftrs.c | 26 ++
arch/powerpc/kernel/sysfs.c | 8 +
arch/powerpc/kvm/book3s_hv.c | 6 +-
arch/powerpc/kvm/book3s_hv_interrupts.S | 8 +
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 24 ++
arch/powerpc/perf/Makefile | 2 +-
arch/powerpc/perf/core-book3s.c | 60 +++-
arch/powerpc/perf/generic-compat-pmu.c | 2 +-
arch/powerpc/perf/internal.h | 1 +
arch/powerpc/perf/isa207-common.c | 72 ++++-
arch/powerpc/perf/isa207-common.h | 33 +-
arch/powerpc/perf/mpc7450-pmu.c | 2 +-
arch/powerpc/perf/power10-events-list.h | 81 +++++
arch/powerpc/perf/power10-pmu.c | 431 +++++++++++++++++++++++++++
arch/powerpc/perf/power5+-pmu.c | 2 +-
arch/powerpc/perf/power5-pmu.c | 2 +-
arch/powerpc/perf/power6-pmu.c | 2 +-
arch/powerpc/perf/power7-pmu.c | 2 +-
arch/powerpc/perf/power8-pmu.c | 2 +-
arch/powerpc/perf/power9-pmu.c | 2 +-
arch/powerpc/perf/ppc970-pmu.c | 2 +-
arch/powerpc/platforms/powernv/idle.c | 14 +
arch/powerpc/xmon/xmon.c | 15 +
30 files changed, 796 insertions(+), 35 deletions(-)
create mode 100644 arch/powerpc/perf/power10-events-list.h
create mode 100644 arch/powerpc/perf/power10-pmu.c
--
1.8.3.1
^ permalink raw reply
* [PATCH] tpm: ibmvtpm: Wait for ready buffer before probing for TPM2 attributes
From: David Gibson @ 2020-06-05 6:37 UTC (permalink / raw)
To: Michael Ellerman, Peter Huewe, Jarkko Sakkinen, Jason Gunthorpe,
Stefan Berger, Nayna Jain
Cc: linuxppc-dev, linux-integrity, Paul Mackerras, linux-kernel,
David Gibson
The tpm2_get_cc_attrs_tbl() call will result in TPM commands being issued,
which will need the use of the internal command/response buffer. But,
we're issuing this *before* we've waited to make sure that buffer is
allocated.
This can result in intermittent failures to probe if the hypervisor / TPM
implementation doesn't respond quickly enough. I find it fails almost
every time with an 8 vcpu guest under KVM with software emulated TPM.
Fixes: 18b3670d79ae9 "tpm: ibmvtpm: Add support for TPM2"
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
drivers/char/tpm/tpm_ibmvtpm.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
index 09fe45246b8c..994385bf37c0 100644
--- a/drivers/char/tpm/tpm_ibmvtpm.c
+++ b/drivers/char/tpm/tpm_ibmvtpm.c
@@ -683,13 +683,6 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
if (rc)
goto init_irq_cleanup;
- if (!strcmp(id->compat, "IBM,vtpm20")) {
- chip->flags |= TPM_CHIP_FLAG_TPM2;
- rc = tpm2_get_cc_attrs_tbl(chip);
- if (rc)
- goto init_irq_cleanup;
- }
-
if (!wait_event_timeout(ibmvtpm->crq_queue.wq,
ibmvtpm->rtce_buf != NULL,
HZ)) {
@@ -697,6 +690,13 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
goto init_irq_cleanup;
}
+ if (!strcmp(id->compat, "IBM,vtpm20")) {
+ chip->flags |= TPM_CHIP_FLAG_TPM2;
+ rc = tpm2_get_cc_attrs_tbl(chip);
+ if (rc)
+ goto init_irq_cleanup;
+ }
+
return tpm_chip_register(chip);
init_irq_cleanup:
do {
--
2.26.2
^ permalink raw reply related
* linux-next 04 June: warning: "ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE" is not defined
From: Christophe Leroy @ 2020-06-05 5:33 UTC (permalink / raw)
To: Chuck Lever, Stephen Rothwell
Cc: Linux Next Mailing List, PowerPC, Linux Kernel Mailing List
Hi all,
Getting the following warning on linux-next from yesterday,
CC net/sunrpc/svcsock.o
net/sunrpc/svcsock.c:227:5: warning: "ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE"
is not defined [-Wundef]
#if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
^
Bisected to ca07eda33e01 (refs/bisect/bad) SUNRPC: Refactor svc_recvfrom()
Missing #include <asm/cacheflush.h>
Christophe
^ permalink raw reply
* Re: [PATCH] pwm: Add missing "CONFIG_" prefix
From: Joe Perches @ 2020-06-05 4:07 UTC (permalink / raw)
To: Kees Cook
Cc: linux-pwm, Uwe Kleine-König, linux-kernel, Thierry Reding,
Paul Mackerras, linuxppc-dev
In-Reply-To: <202006041451.19491ECA@keescook>
On Thu, 2020-06-04 at 14:52 -0700, Kees Cook wrote:
> On Wed, Jun 03, 2020 at 04:04:31PM -0700, Joe Perches wrote:
> > On Wed, 2020-06-03 at 15:40 -0700, Kees Cook wrote:
> > > The IS_ENABLED() use was missing the CONFIG_ prefix which would have
> > > lead to skipping this code.
> > >
> > > Fixes: 3ad1f3a33286 ("pwm: Implement some checks for lowlevel drivers")
> > > Signed-off-by: Kees Cook <keescook@chromium.org>
> > > ---
> > > drivers/pwm/core.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c
> > > index 9973c442b455..6b3cbc0490c6 100644
> > > --- a/drivers/pwm/core.c
> > > +++ b/drivers/pwm/core.c
> > > @@ -121,7 +121,7 @@ static int pwm_device_request(struct pwm_device *pwm, const char *label)
> > > pwm->chip->ops->get_state(pwm->chip, pwm, &pwm->state);
> > > trace_pwm_get(pwm, &pwm->state);
> > >
> > > - if (IS_ENABLED(PWM_DEBUG))
> > > + if (IS_ENABLED(CONFIG_PWM_DEBUG))
> > > pwm->last = pwm->state;
> > > }
> > >
> > > --
> > > 2.25.1
> > >
> >
> > more odd uses (mostly in comments)
> >
> > $ git grep -P -oh '\bIS_ENABLED\s*\(\s*\w+\s*\)'| \
> > sed -r 's/\s+//g'| \
> > grep -v '(CONFIG_' | \
> > sort | uniq -c | sort -rn
> > 7 IS_ENABLED(DEBUG)
> > 4 IS_ENABLED(DRM_I915_SELFTEST)
> > 4 IS_ENABLED(cfg)
> > 2 IS_ENABLED(opt_name)
> > 2 IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)
> > 2 IS_ENABLED(config)
> > 2 IS_ENABLED(cond)
> > 2 IS_ENABLED(__BIG_ENDIAN)
> > 1 IS_ENABLED(x)
> > 1 IS_ENABLED(STRICT_KERNEL_RWX)
> > 1 IS_ENABLED(PWM_DEBUG)
> > 1 IS_ENABLED(option)
> > 1 IS_ENABLED(ETHTOOL_NETLINK)
> > 1 IS_ENABLED(DEBUG_RANDOM_TRIE)
> > 1 IS_ENABLED(DEBUG_CHACHA20POLY1305_SLOW_CHUNK_TEST)
> >
> > STRICT_KERNEL_RWX is misused here in ppc
> >
> > ---
> >
> > Fix pr_warn without newline too.
> >
> > arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> > index 51e3c15f7aff..dd60c5f2b991 100644
> > --- a/arch/powerpc/mm/book3s64/hash_utils.c
> > +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> > @@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
> > * Pick a size for the linear mapping. Currently, we only
> > * support 16M, 1M and 4K which is the default
> > */
> > - if (IS_ENABLED(STRICT_KERNEL_RWX) &&
> > + if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
> > (unsigned long)_stext % 0x1000000) {
> > if (mmu_psize_defs[MMU_PAGE_16M].shift)
> > - pr_warn("Kernel not 16M aligned, "
> > - "disabling 16M linear map alignment");
> > + pr_warn("Kernel not 16M aligned, disabling 16M linear map alignment\n");
> > aligned = false;
> > }
>
> Joe, I was going to send all of the fixes for these issues, but your
> patch doesn't have a SoB. Shall I add one for the above patch?
<shrug> sure if you want, or submit it yourself.
My feeling about these types of changes is the maintainers
of the subsystems, in this case ppc, should manage this
themselves and shouldn't require anyone else to actually
bother to send real patches.
^ permalink raw reply
* [powerpc:next] BUILD SUCCESS 1395375c592770fe5158a592944aaeed67fa94ff
From: kernel test robot @ 2020-06-05 2:47 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: 1395375c592770fe5158a592944aaeed67fa94ff Merge branch 'topic/ppc-kvm' into next
elapsed time: 860m
configs tested: 99
configs skipped: 9
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
mips ar7_defconfig
csky alldefconfig
sh shx3_defconfig
mips cavium_octeon_defconfig
um x86_64_defconfig
ia64 tiger_defconfig
arc nps_defconfig
arc tb10x_defconfig
arm spear13xx_defconfig
nios2 defconfig
mips jmr3927_defconfig
alpha defconfig
xtensa defconfig
powerpc mpc866_ads_defconfig
mips workpad_defconfig
s390 alldefconfig
sh sh7757lcr_defconfig
nds32 alldefconfig
m68k m5407c3_defconfig
mips malta_defconfig
x86_64 defconfig
arm assabet_defconfig
arm multi_v4t_defconfig
i386 allnoconfig
i386 allyesconfig
i386 defconfig
i386 debian-10.3
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
m68k allyesconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
openrisc allyesconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
arc defconfig
arc allyesconfig
sh allmodconfig
sh allnoconfig
microblaze allnoconfig
mips allyesconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc defconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 defconfig
sparc allyesconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um allmodconfig
um allyesconfig
um allnoconfig
um defconfig
x86_64 rhel
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [powerpc:merge] BUILD SUCCESS 787e1419e6c96ab8641adb4391b8955bc2e76817
From: kernel test robot @ 2020-06-05 2:47 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 787e1419e6c96ab8641adb4391b8955bc2e76817 Automatic merge of branch 'next' into merge
elapsed time: 861m
configs tested: 108
configs skipped: 8
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
mips ar7_defconfig
csky alldefconfig
sh shx3_defconfig
mips jmr3927_defconfig
xtensa defconfig
nios2 defconfig
alpha defconfig
powerpc mpc866_ads_defconfig
mips workpad_defconfig
s390 alldefconfig
sh sh7757lcr_defconfig
nds32 alldefconfig
m68k m5407c3_defconfig
mips malta_defconfig
i386 allnoconfig
i386 allyesconfig
i386 defconfig
i386 debian-10.3
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
m68k allyesconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
openrisc allyesconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
arc defconfig
arc allyesconfig
sh allmodconfig
sh allnoconfig
microblaze allnoconfig
mips allyesconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
i386 randconfig-a001-20200604
i386 randconfig-a006-20200604
i386 randconfig-a002-20200604
i386 randconfig-a005-20200604
i386 randconfig-a004-20200604
i386 randconfig-a003-20200604
x86_64 randconfig-a011-20200604
x86_64 randconfig-a016-20200604
x86_64 randconfig-a013-20200604
x86_64 randconfig-a014-20200604
x86_64 randconfig-a012-20200604
x86_64 randconfig-a015-20200604
i386 randconfig-a014-20200604
i386 randconfig-a015-20200604
i386 randconfig-a011-20200604
i386 randconfig-a016-20200604
i386 randconfig-a012-20200604
i386 randconfig-a013-20200604
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 defconfig
sparc allyesconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um allmodconfig
um allnoconfig
um defconfig
um allyesconfig
x86_64 rhel
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* Re: [musl] Re: ppc64le and 32-bit LE userland compatibility
From: Daniel Kolesa @ 2020-06-05 2:18 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Rich Felker, libc-alpha, eery, musl, Will Springer,
Palmer Dabbelt via binutils, via libc-dev, Michal Suchánek,
linuxppc-dev, Joseph Myers
In-Reply-To: <20200604233516.GM31009@gate.crashing.org>
On Fri, Jun 5, 2020, at 01:35, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jun 04, 2020 at 11:43:53PM +0200, Daniel Kolesa wrote:
> > The thing is, I've yet to see in which way the ELFv2 ABI *actually* requires VSX - I don't think compiling for 970 introduces any actual differences. There will be omissions, yes - but then the more accurate thing would be to say that a subset of ELFv2 is used, rather than it being a different ABI per se.
>
> Two big things are that binaries that someone else made are supposed to
> work for you as well -- including binaries using VSX registers, or any
> instructions that require ISA 2.07 (or some older ISA after 970). This
> includes DSOs (shared libraries). So for a distribution this means that
> they will not use VSX *anywhere*, or only in very specialised things.
> That is a many-years setback, for people/situations where it could be
> used.
Third party precompiled stuff doesn't really need to concern us, since none really exists. It's also still an upgrade over ELFv1 regardless (I mean, the same things apply there). I'm also not really all that convinced that vectors make a huge difference in non-specialized code (autovectorization still has a way to go) and code written to use vector instructions should probably check auxval and take those paths at runtime. As for other instructions, fair enough, but from my rough testing, it doesn't make such a massive difference for average case (and where it does, one can always rebuild their thing with CFLAGS=-mcpu=power9)
>
> > The ELFv2 document specifies things like passing of quadruple precision floats. Indeed, VSX is needed there, but that's not a concern if you *don't* use quadruple precision floats.
>
> As Joseph says, QP float is passed in the VRs, so that works just fine
> *if* you have AltiVec. If not, you probably should pass such values in
> the GPRs, or however a struct of 16 bytes is passed in whatever ABI you
> use (this is a much more general problem than just BE ELFv2 ;-) ) And
> then you have some kind of "partial soft float". Fun! (Or not...)
As I mentioned previously, I kinda want to be able to cover the same targets as ELFv1 can, which also means non-altivec hardware, so it doesn't matter that much either way... being able to cover existing stable compilers with a manageable patchset would be nice, too.
>
> > > If you always use -mcpu=970 (or similar), then not very much is
> > > different for you most likely -- except of course there is no promise
> > > to the user that they can use VSX and all instructions in ISA 2.07,
> > > which is a very useful promise to have normally.
> >
> > Yes, -mcpu=970 is used for default packages. *However*, it is possible that the user compiles their own packages with -mcpu=power9 or something similar, and then it'll be possible to utilize VSX and all, and it should still work with the existing userland. When speaking of ABI, what matters is... well, the binary interface, which is the same - so I believe this is still ELFv2. A subset is always compliant with the whole.
>
> The same calling convention will probably work, yes. An ABI is more
> than that though.
>
>
> > > Btw, if you use GCC, *please* send in testresults? :-)
> >
> > Yes, it's all gcc (we do have clang, but compiling repo packages with clang is generally frowned upon in the project, as we have vast majority of packages cross-compilable, and our cross-compiling infrastructure is gcc-centric, plus we enable certain things by default such as hardening flags that clang does not support). I'll try to remember next time I'm running tests.
>
> Thanks in advance!
>
> > I have a feeling that glibc would object to such port, since it means it would have to exist in parallel with a potential different ELFv2 port that does have a POWER8 minimal requirement; gcc would need a way to tell them apart, too (based on what would they be told apart? the simplest way would probably be something like, if --with-abi=elfv2 { if --with-cpu < power8 -> use glibc-novsx else use glibc-vsx } ...)
>
> The target name allows to make such distinctions: this could for example
> be powerpc64-*-linux-void (maybe I put the distinction in the wrong
> part of the name here? The glibc people will know better, and "void" is
> probably not a great name anyway).
Hm, I'm not a huge fan of putting ABI specifics in the triplet, it feels wrong - there is no precedent for it with POWER (ARM did it with EABI though), the last part should remain 'gnu' as it's still glibc; besides, gcc is compiled for exactly one target triplet, and traditionally with ppc compilers it's always been possible to target everything with just one compiler (endian, 32bit, 64bit, abi...). Detection based on CPU is probably quirky too, actually, since it'd mean different CPU types selecting different dynamic linkers.
The best way would probably be adding a new -mabi, e.g. -mabi=elfv2-novsx (just an example), which would behave exactly like -mabi=elfv2, except it'd emit some extra detection macro - that would be documented in the ABI spec (and glibc could use it to choose the correct port), allow gcc (or clang) to select a correct dynamic linker name, and probably default to a 64-bit long double format. This would also get rid of the patches used for musl which pick the default -mlong-double based on target triples :) Alternatively, since -mabi is additive, it could be a modifier for -mabi=elfv2, which would perhaps make even more sense. Clang would need similar changes. I think this way would respect existing conventions the most and be overall least invasive.
Thoughts?
>
>
> Segher
>
Daniel
^ permalink raw reply
* RE: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
From: Williams, Dan J @ 2020-06-05 0:54 UTC (permalink / raw)
To: Vaibhav Jain, linuxppc-dev@lists.ozlabs.org,
linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Cc: Aneesh Kumar K . V, Santosh Sivaraj, Oliver O'Halloran,
Weiny, Ira, Steven Rostedt
In-Reply-To: <87h7vrgpzx.fsf@linux.ibm.com>
> -----Original Message-----
> From: Vaibhav Jain <vaibhav@linux.ibm.com>
> Sent: Thursday, June 4, 2020 2:06 AM
> To: Williams, Dan J <dan.j.williams@intel.com>; linuxppc-
> dev@lists.ozlabs.org; linux-nvdimm@lists.01.org; linux-
> kernel@vger.kernel.org
> Cc: Santosh Sivaraj <santosh@fossix.org>; Aneesh Kumar K . V
> <aneesh.kumar@linux.ibm.com>; Steven Rostedt <rostedt@goodmis.org>;
> Oliver O'Halloran <oohall@gmail.com>; Weiny, Ira <ira.weiny@intel.com>
> Subject: RE: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for
> PAPR nvdimm specific methods
>
> Hi Dan,
>
> Thanks for review and insights on this. My responses below:
>
> "Williams, Dan J" <dan.j.williams@intel.com> writes:
>
> > [ forgive formatting I'm temporarily stuck using Outlook this week...
> > ]
> >
> >> From: Vaibhav Jain <vaibhav@linux.ibm.com>
> > [..]
> >>
> >> Introduce support for PAPR NVDIMM Specific Methods (PDSM) in
> papr_scm
> >> module and add the command family NVDIMM_FAMILY_PAPR to the
> white
> >> list of NVDIMM command sets. Also advertise support for ND_CMD_CALL
> >> for the nvdimm command mask and implement necessary scaffolding in
> >> the module to handle ND_CMD_CALL ioctl and PDSM requests that we
> receive.
> >>
> >> The layout of the PDSM request as we expect from libnvdimm/libndctl
> >> is described in newly introduced uapi header 'papr_pdsm.h' which
> >> defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used to
> >> communicate the PDSM request via member
> 'nd_cmd_pkg.nd_command' and
> >> size of payload that need to be sent/received for servicing the PDSM.
> >>
> >> A new function is_cmd_valid() is implemented that reads the args to
> >> papr_scm_ndctl() and performs sanity tests on them. A new function
> >> papr_scm_service_pdsm() is introduced and is called from
> >> papr_scm_ndctl() in case of a PDSM request is received via
> >> ND_CMD_CALL command from libnvdimm.
> >>
> >> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: Michael Ellerman <mpe@ellerman.id.au>
> >> Cc: Ira Weiny <ira.weiny@intel.com>
> >> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> >> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> >> ---
> >> Changelog:
> >>
> >> Resend:
> >> * Added ack from Aneesh.
> >>
> >> v8..v9:
> >> * Reduced the usage of term SCM replacing it with appropriate
> >> replacement [ Dan Williams, Aneesh ]
> >> * Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
> >> * s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
> >> * s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
> >> * Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
> >> * Minor update to patch description.
> >>
> >> v7..v8:
> >> * Removed the 'payload_offset' field from 'struct
> >> nd_pdsm_cmd_pkg'. Instead command payload is always assumed to
> start
> >> at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
> >> * To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
> >> 'reserved' field of 10-bytes is introduced. [ Aneesh ]
> >> * Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
> >> [ Ira ]
> >>
> >> Resend:
> >> * None
> >>
> >> v6..v7 :
> >> * Removed the re-definitions of __packed macro from papr_scm_pdsm.h
> >> [Mpe].
> >> * Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h
> [Mpe].
> >> * Removed macros that were unused in papr_scm.c from
> papr_scm_pdsm.h
> >> [Mpe].
> >> * Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]
> >>
> >> v5..v6 :
> >> * Changed the usage of the term DSM to PDSM to distinguish it from the
> >> ACPI term [ Dan Williams ]
> >> * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various
> >> struct
> >> to reflect the new terminology.
> >> * Updated the patch description and title to reflect the new terminology.
> >> * Squashed patch to introduce new command family in 'ndctl.h' with
> >> this patch [ Dan Williams ]
> >> * Updated the papr_scm_pdsm method starting index from 0x10000 to
> 0x0
> >> [ Dan Williams ]
> >> * Removed redundant license text from the papr_scm_psdm.h file.
> >> [ Dan Williams ]
> >> * s/envelop/envelope/ at various places [ Dan Williams ]
> >> * Added '__packed' attribute to command package header to gaurd
> >> against different compiler adding paddings between the fields.
> >> [ Dan Williams]
> >> * Converted various pr_debug to dev_debug [ Dan Williams ]
> >>
> >> v4..v5 :
> >> * None
> >>
> >> v3..v4 :
> >> * None
> >>
> >> v2..v3 :
> >> * Updated the patch prefix to 'ndctl/uapi' [Aneesh]
> >>
> >> v1..v2 :
> >> * None
> >> ---
> >> arch/powerpc/include/uapi/asm/papr_pdsm.h | 136
> >> ++++++++++++++++++++++
> arch/powerpc/platforms/pseries/papr_scm.c |
> >> 101 +++++++++++++++-
> >> include/uapi/linux/ndctl.h | 1 +
> >> 3 files changed, 232 insertions(+), 6 deletions(-) create mode
> >> 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h
> >>
> >> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> new file mode 100644
> >> index 000000000000..6407fefcc007
> >> --- /dev/null
> >> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> @@ -0,0 +1,136 @@
> >> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> >> +/*
> >> + * PAPR nvDimm Specific Methods (PDSM) and structs for libndctl
> >> + *
> >> + * (C) Copyright IBM 2020
> >> + *
> >> + * Author: Vaibhav Jain <vaibhav at linux.ibm.com> */
> >> +
> >> +#ifndef _UAPI_ASM_POWERPC_PAPR_PDSM_H_ #define
> >> +_UAPI_ASM_POWERPC_PAPR_PDSM_H_
> >> +
> >> +#include <linux/types.h>
> >> +
> >> +/*
> >> + * PDSM Envelope:
> >> + *
> >> + * The ioctl ND_CMD_CALL transfers data between user-space and
> >> +kernel via
> >> + * envelope which consists of a header and user-defined payload
> sections.
> >> + * The header is described by 'struct nd_pdsm_cmd_pkg' which expects
> >> +a
> >> + * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload'
> field.
> >> + * There is reserved field that can used to introduce new fields to
> >> +the
> >> + * structure in future. It also tries to ensure that
> >> 'nd_pdsm_cmd_pkg.payload'
> >> + * lies at a 8-byte boundary.
> >> + *
> >> + * +-------------+---------------------+---------------------------+
> >> + * | 64-Bytes | 16-Bytes | Max 176-Bytes |
> >> + * +-------------+---------------------+---------------------------+
> >> + * | nd_pdsm_cmd_pkg | |
> >> + * |-------------+ | |
> >> + * | nd_cmd_pkg | | |
> >> + * +-------------+---------------------+---------------------------+
> >> + * | nd_family | | |
> >> + * | nd_size_out | cmd_status | |
> >> + * | nd_size_in | payload_version | payload |
> >> + * | nd_command | reserved | |
> >> + * | nd_fw_size | | |
> >> + *
> >> + +-------------+---------------------+---------------------------+
> >> + *
> >> + * PDSM Header:
> >> + *
> >> + * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
> >> + * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to
> >> member
> >> + * 'nd_cmd_pkg.nd_command'. Apart from size information of the
> >> envelope
> >> +which is
> >> + * contained in 'struct nd_cmd_pkg', the header also has members
> >> +following
> >> + * members:
> >> + *
> >> + * 'cmd_status' : (Out) Errors if any encountered while
> >> servicing PDSM.
> >> + * 'payload_version' : (In/Out) Version number associated with
> the
> >> payload.
> >> + * 'reserved' : Not used and reserved for future.
> >> + *
> >> + * PDSM Payload:
> >> + *
> >> + * The layout of the PDSM Payload is defined by various structs
> >> +shared between
> >> + * papr_scm and libndctl so that contents of payload can be
> >> +interpreted. During
> >> + * servicing of a PDSM the papr_scm module will read input args from
> >> +the payload
> >> + * field by casting its contents to an appropriate struct pointer
> >> +based on the
> >> + * PDSM command. Similarly the output of servicing the PDSM command
> >> +will be
> >> + * copied to the payload field using the same struct.
> >> + *
> >> + * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope
> >> +size, which
> >> + * leaves around 176 bytes for the envelope payload (ignoring any
> >> +padding that
> >> + * the compiler may silently introduce).
> >> + *
> >> + * Payload Version:
> >> + *
> >> + * A 'payload_version' field is present in PDSM header that
> >> +indicates a specific
> >> + * version of the structure present in PDSM Payload for a given PDSM
> >> command.
> >> + * This provides backward compatibility in case the PDSM Payload
> >> +structure
> >> + * evolves and different structures are supported by 'papr_scm' and
> >> 'libndctl'.
> >> + *
> >> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send
> >> +the version
> >> + * of the payload struct it supports via 'payload_version' field.
> >> +The
> >> 'papr_scm'
> >> + * module when servicing the PDSM envelope checks the
> 'payload_version'
> >> +and then
> >> + * uses 'payload struct version' == MIN('payload_version field',
> >> + * 'max payload-struct-version supported by papr_scm') to service
> >> +the
> >> PDSM.
> >> + * After servicing the PDSM, 'papr_scm' put the negotiated version
> >> +of payload
> >> + * struct in returned 'payload_version' field.
> >> + *
> >> + * Libndctl on receiving the envelope back from papr_scm again
> >> +checks the
> >> + * 'payload_version' field and based on it use the appropriate
> >> +version dsm
> >> + * struct to parse the results.
> >> + *
> >> + * Backward Compatibility:
> >> + *
> >> + * Above scheme of exchanging different versioned PDSM struct
> >> +between libndctl
> >> + * and papr_scm should provide backward compatibility until
> >> +following two
> >> + * assumptions/conditions when defining new PDSM structs hold:
> >> + *
> >> + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
> >> + *
> >> + * 1. T(X) is a proper subset of T(Y) if Y > X.
> >> + * i.e Each new version of PDSM struct should retain existing struct
> >> + * attributes from previous version
> >> + *
> >> + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
> >> + * it should also support T(1), T(2)...T(X - 1).
> >> + * i.e When adding support for new version of a PDSM struct, libndctl
> >> + * and papr_scm should retain support of the existing PDSM struct
> >> + * version they support.
> >> + */
> >> +
> >> +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from
> >> libnvdimm
> >> +*/ struct nd_pdsm_cmd_pkg {
> >> + struct nd_cmd_pkg hdr; /* Package header containing sub-
> >> cmd */
> >> + __s32 cmd_status; /* Out: Sub-cmd status returned back */
> >> + __u16 reserved[5]; /* Ignored and to be used in future */
> >> + __u16 payload_version; /* In/Out: version of the payload */
> >> + __u8 payload[]; /* In/Out: Sub-cmd data buffer */
> >> +} __packed;
> >> +
> >> +/*
> >> + * Methods to be embedded in ND_CMD_CALL request. These are sent
> to
> >> the
> >> +kernel
> >> + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
> >> +*/ enum papr_pdsm {
> >> + PAPR_PDSM_MIN = 0x0,
> >> + PAPR_PDSM_MAX,
> >> +};
> >> +
> >> +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */ static
> >> +inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct
> nd_cmd_pkg
> >> *cmd) {
> >> + return (struct nd_pdsm_cmd_pkg *) cmd; }
> >> +
> >> +/* Return the payload pointer for a given pcmd */ static inline void
> >> +*pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd) {
> >> + if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
> >> + return NULL;
> >> + else
> >> + return (void *)(pcmd->payload);
> >> +}
> >> +
> >> +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c
> >> b/arch/powerpc/platforms/pseries/papr_scm.c
> >> index 149431594839..5e2237e7ec08 100644
> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> @@ -15,13 +15,15 @@
> >> #include <linux/seq_buf.h>
> >>
> >> #include <asm/plpar_wrappers.h>
> >> +#include <asm/papr_pdsm.h>
> >>
> >> #define BIND_ANY_ADDR (~0ul)
> >>
> >> #define PAPR_SCM_DIMM_CMD_MASK \
> >> ((1ul << ND_CMD_GET_CONFIG_SIZE) | \
> >> (1ul << ND_CMD_GET_CONFIG_DATA) | \
> >> - (1ul << ND_CMD_SET_CONFIG_DATA))
> >> + (1ul << ND_CMD_SET_CONFIG_DATA) | \
> >> + (1ul << ND_CMD_CALL))
> >>
> >> /* DIMM health bitmap bitmap indicators */
> >> /* SCM device is unable to persist memory contents */ @@ -350,16
> >> +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
> >> return 0;
> >> }
> >>
> >> +/*
> >> + * Validate the inputs args to dimm-control function and return '0' if valid.
> >> + * This also does initial sanity validation to ND_CMD_CALL
> >> +sub-command
> >> packages.
> >> + */
> >> +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd,
> >> +void
> >> *buf,
> >> + unsigned int buf_len)
> >> +{
> >> + unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
> >> + struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
> >> + struct papr_scm_priv *p;
> >> +
> >> + /* Only dimm-specific calls are supported atm */
> >> + if (!nvdimm)
> >> + return -EINVAL;
> >> +
> >> + /* get the provider date from struct nvdimm */
> >> + p = nvdimm_provider_data(nvdimm);
> >> +
> >> + if (!test_bit(cmd, &cmd_mask)) {
> >> + dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
> >> + return -EINVAL;
> >> + } else if (cmd == ND_CMD_CALL) {
> >> +
> >> + /* Verify the envelope package */
> >> + if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
> >> + dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
> >> + buf_len);
> >> + return -EINVAL;
> >> + }
> >> +
> >> + /* Verify that the PDSM family is valid */
> >> + if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
> >> + dev_dbg(&p->pdev->dev, "Invalid pkg
> >> family=0x%llx\n",
> >> + pkg->hdr.nd_family);
> >> + return -EINVAL;
> >> +
> >> + }
> >> +
> >> + /* We except a payload with all PDSM commands */
> >> + if (pdsm_cmd_to_payload(pkg) == NULL) {
> >> + dev_dbg(&p->pdev->dev,
> >> + "Empty payload for sub-command=0x%llx\n",
> >> + pkg->hdr.nd_command);
> >> + return -EINVAL;
> >> + }
> >> + }
> >> +
> >> + /* Command looks valid */
> >
> <snip>
> > So this is where I would expect the kernel to validate the command vs
> > a known list of supported commands / payloads. One of the goals of
> > requiring public documentation of any commands that libnvdimm might
> > support for the ioctl path is to give the kernel the ability to gate
> > future enabling on consideration of a common kernel front-end
> > interface. I believe this would also address questions about the
> > versioning scheme because userspace would be actively prevented from
> > sending command payloads that were not first explicitly enabled in the
> > kernel. This interface as it stands in this patch set seems to be a
> > very thin / "anything goes" passthrough with no consideration for that
> > policy.
> >
> > As an example of the utility of this policy, consider the recent
> > support for nvdimm security commands that allow a passphrase to be set
> > and issue commands like "unlock" and "secure erase". The kernel
> > actively prevents those commands from being sent from userspace. See
> > acpi_nfit_clear_to_send() and nd_cmd_clear_to_send(). The reasoning is
> > that it enforces the kernel's nvdimm security model that uses
> > encrypted/trusted keys to protect key material (clear text keys
> > only-ever exist in kernel-space). Yes, that restriction is painful for
> > people that don't want the kernel's security model and just want the
> > simplicity of passing clear-text keys around, but it's necessary for
> > the kernel to have any chance to provide a common abstraction across
> > vendors. The pain of negotiating every single command with what the
> > kernel will support is useful for the long term health of the kernel.
> > It forces ongoing conversations across vendors to consolidate
> > interfaces and reuse kernel best practices like encrypted/trusted
> > keys. Code acceptance is the only real gate the kernel has to enforce
> > cooperation across vendors.
> >
> > The expectation is that the kernel does not allow any command to pass
> > that is not explicitly listed in a bitmap of known commands. I would
> > expect that if you changed the payload of an existing command that
> > would likely require a new entry in this bitmap. The goal is to give
> > the kernel a chance to constrain the passthrough interface to afford a
> > chance to have a discussion of what might done in a common
> > implementation. Another example is the label-area read-write commands.
> > The kernel needs explicit control to ensure that it owns the label
> > area and that userspace is not able to corrupt it (write it behind the
> > kernel's back).
> >
> > Now that said, I have battle scars with some OEMs that just want a
> > generic passthrough interface so they never need to work with the
> > kernel community again and can just write their custom validation
> > tooling and be done. I've mostly been successful in that fight outside
> > of the gaping hole of ND_CMD_VENDOR. That's the path that ipmctl has
> > used to issue commands that have not made it into the public
> > specification on docs.pmem.io. My warning shot for that is the
> > "disable_vendor_specific" module option that administrators can set to
> > only allow commands that the kernel explicitly knows the effects of to
> > be issued. The result is only tooling / enabling that submits to this
> > auditing regime is guaranteed to work everywhere.
>
> Agree with points made above. With this patchset we arent really trying to
> push an ioctl passthrough to exchange arbitary data with papr-scm module.
> Nor do we want to bypass the kernel community for any future
> enhancements on this interface. We made some design choices based on
> our understanding of certain restriction we saw in ndctl/libndctl. Specifically
> wanted to avoid issuing two CMD_CALL ioctl roundtrips.
>
> That being said I had an extended discussion with Aneesh rethinking the
> 'version' field and we both agreed *to remove this field* from the proposed
> 'struct nd_pdsm_cmd_pkg'. This should resolve the contentions around this
> Patch-4 in this patchset. Since the 'version' field isnt extensively used right
> now the impact on the patchset would be small.
>
> >
> > So, that long explanation out of the way, what does that mean for this
> > patch set? I'd like to understand if you still see a need for a
> > versioning scheme if the implementation is required to explicitly list
> > all the commands it supports? I.e. that the kernel need not worry
> > about userspace sending future unknown payloads because unknown
> > payloads are blocked. Also if your interface has anything similar to a
> > "vendor specific" passthrough I would like to require that go through
> > the ND_CMD_VENDOR ioctl, so that the kernel still has a common check
> > point to prevent vendor specific "I don't want to talk to the kernel
> > community" shenanigans, but even better if ND_CMD_VENDOR is
> something
> > the kernel can eventually jettison because nobody is using it.
>
> As I mentioned above this isn't a 'vendor specific passthrough'
> machenism. The 'version' field was proposed to avoid two CMD_CALL ioctl
> roundtrip to fetch and report extended nvdimm health data like 'life-
> remaining' which isnt always available for papr-scm.
Oh, why not define a maximal health payload with all the attributes you know about today, leave some room for future expansion, and then report a validity flag for each attribute? This is how the "intel" smart-health payload works. If they ever needed to extend the payload they would increase the size and add more validity flags. Old userspace never groks the new fields, new userspace knows to ask for and parse the larger payload.
See the flags field in 'struct nd_intel_smart' (in ndctl) and the translation of those flags to ndctl generic attribute flags intel_cmd_smart_get_flags().
In general I'd like ndctl to understand the superset of all health attributes across all vendors. For the truly vendor specific ones it would mean that the health flags with a specific "papr_scm" back-end just would never be set on an "intel" device. I.e. look at the "hpe" and "msft" health backends. They only set a subset of the valid flags that could be reported.
> However we just realized instead of relying on 'version' field we can
> advertise support for these extended attributes via nvdimm-flags from sysfs.
> Looking at the nvdimm-flags libndctl can use an appropriate pdsm command
> and struct to fetch the dimm health information from papr_scm via
> CMD_CALL.
>
> But thats something we plan to do in future and not with the current
> patchset which only reports fixed set of nvdimm health attributes.
>
> >
> > I feel like this is a conversation that will take a few days to
> > resolve, which does not leave time to push this for v5.8. That said, I
> > do think the health flags patches at the beginning of this series are
> > low risk and uncontentious. How about I merge those for v5.8 and
> > circle back to get this ioctl path queued early in v5.8-rc? Apologies
> > for the late feedback on this relative to v5.8.
> >
> Thanks for this consideration. Agree to the proposal. However changes to
> patchset with removal of 'version' field is fairly small hence can quickly push
> an updated patch series cumulating rest of the review comments from Ira.
>
> Does that sounds reasonable ?
^ permalink raw reply
* Re: [musl] Re: ppc64le and 32-bit LE userland compatibility
From: Segher Boessenkool @ 2020-06-05 0:02 UTC (permalink / raw)
To: Daniel Kolesa
Cc: Rich Felker, libc-alpha, eery, musl, Will Springer,
Palmer Dabbelt via binutils, via libc-dev, Michal Suchánek,
linuxppc-dev, Joseph Myers
In-Reply-To: <4ca9d8f4-4d61-4e99-969c-03a99e4fd3cc@www.fastmail.com>
Hi!
On Fri, Jun 05, 2020 at 12:26:22AM +0200, Daniel Kolesa wrote:
> Either way I'll think about it some more and possibly prepare an RFC port. I'm definitely willing to put in the work and later maintenance effort if that's what it takes to make it happen.
Yeah, you'll need to convince all parties involved that it will not be
more work to them then they are willing to put in. Initial development
and ongoing work, as you say.
For GCC it is probably an easy decision (but we'll see your proposed ABI
amendments): it shouldn't be much work at all, and it even benefits us
directly (it'll fill some holes in our testing matrix).
Segher
^ permalink raw reply
* Re: [musl] Re: ppc64le and 32-bit LE userland compatibility
From: Phil Blundell @ 2020-06-04 23:43 UTC (permalink / raw)
To: Segher Boessenkool
Cc: libc-alpha, eery, Daniel Kolesa, Will Springer,
Michal Suchánek, linuxppc-dev
In-Reply-To: <20200604230639.GL31009@gate.crashing.org>
On Thu, Jun 04, 2020 at 06:06:39PM -0500, Segher Boessenkool wrote:
> On Thu, Jun 04, 2020 at 11:55:11PM +0200, Phil Blundell wrote:
> > 1a. Define your own subset of ELFv2 which is interworkable with the full
> > ABI at the function call interface but doesn't make all the same
> > guarantees about binary compatibility. That would mean that a binary
> > built with your toolchain and conforming to the subset ABI would run on
> > any system that implements the full ELFv2 ABI, but the opposite is not
> > necessarily true.
>
> And you can only use shared objects also built for that subset ABI. If
> you use some binary distribution, then it will also have to be built for
> that subset, practically anyway.
Right, absolutely. Any place that I wrote "binary", I meant to include
both DSOs and executables.
> This is very similar to soft-float targets.
Yes, agreed.
> There is more process involved than most open source people are
> comfortable with :-/
Yes, that's unfortunate but it goes with the territory. I think we have
to accept that any attempt to define a single ABI where there are
multiple interests involved will be a significant effort involving
thousands of person-hours of work, much discussion, and a certain amount
of politics and compromise. Inevitably, some people/organisations at
the margins will decide that the game isn't worth the candle. If they
don't participate in the general ABI effort then they can hardly
complain about the results, but equally there is nothing to stop these
folks from defining their own ABIs. If they can attain a critical mass
to support such a variant ABI then, as far as I'm concerned, that's a
fine thing and all power to them.
p.
^ permalink raw reply
* [PATCH v10 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
From: Vaibhav Jain @ 2020-06-04 23:41 UTC (permalink / raw)
To: linuxppc-dev, linux-nvdimm, linux-kernel
Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200604234136.253703-1-vaibhav@linux.ibm.com>
This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_pdsm_health() that queries the nvdimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:
v9..v10:
* Removed code in papr_pdsm_health that performed validation on pdsm
payload version and corrosponding struct and defines used for
validation of payload version.
* Dropped usage of struct papr_pdsm_health in 'struct
papr_scm_priv'. Instead papr_psdm_health() now uses
'papr_scm_priv.health_bitmap' to populate the pdsm payload.
* Above change also fixes the problem where this patch was removing
the code that was previously introduced in this patch-series.
[ Ira ]
* Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
nd_cmd_pkg'. This def is useful in validating payload sizes.
* Reworked papr_pdsm_health() to enforce a specific payload size for
'PAPR_PDSM_HEALTH' pdsm request.
Resend:
* Added ack from Aneesh.
v8..v9:
* s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g [ Dan , Aneesh ]
* s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
* Renamed papr_scm_get_health() to papr_psdm_health()
* Updated patch description to replace papr-scm dimm with nvdimm.
v7..v8:
* None
Resend:
* None
v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
__drc_pmem_query_health() bypassing the cache [Mpe].
v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
gaurd against possibility of different compilers adding different
paddings to the struct [ Dan Williams ]
* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
'bool' and also updated drc_pmem_query_health() to take this into
account. [ Dan Williams ]
v4..v5:
* None
v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]
v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
as its exported to the userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
from enum to #defines [Aneesh]
v1..v2:
* New patch in the series
---
arch/powerpc/include/uapi/asm/papr_pdsm.h | 33 +++++++++++
arch/powerpc/platforms/pseries/papr_scm.c | 70 +++++++++++++++++++++++
2 files changed, 103 insertions(+)
diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
index 8b1a4f8fa316..c4c990ede5d4 100644
--- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -71,12 +71,17 @@ struct nd_pdsm_cmd_pkg {
__u8 payload[]; /* In/Out: Sub-cmd data buffer */
} __packed;
+/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' */
+#define ND_PDSM_ENVELOPE_HDR_SIZE \
+ (sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg))
+
/*
* Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
* via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
*/
enum papr_pdsm {
PAPR_PDSM_MIN = 0x0,
+ PAPR_PDSM_HEALTH,
PAPR_PDSM_MAX,
};
@@ -95,4 +100,32 @@ static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
return (void *)(pcmd->payload);
}
+/* Various nvdimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY 0
+#define PAPR_PDSM_DIMM_UNHEALTHY 1
+#define PAPR_PDSM_DIMM_CRITICAL 2
+#define PAPR_PDSM_DIMM_FATAL 3
+
+/*
+ * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
+ * Various flags indicate the health status of the dimm.
+ *
+ * dimm_unarmed : Dimm not armed. So contents wont persist.
+ * dimm_bad_shutdown : Previous shutdown did not persist contents.
+ * dimm_bad_restore : Contents from previous shutdown werent restored.
+ * dimm_scrubbed : Contents of the dimm have been scrubbed.
+ * dimm_locked : Contents of the dimm cant be modified until CEC reboot
+ * dimm_encrypted : Contents of dimm are encrypted.
+ * dimm_health : Dimm health indicator. One of PAPR_PDSM_DIMM_XXXX
+ */
+struct nd_papr_pdsm_health {
+ __u8 dimm_unarmed;
+ __u8 dimm_bad_shutdown;
+ __u8 dimm_bad_restore;
+ __u8 dimm_scrubbed;
+ __u8 dimm_locked;
+ __u8 dimm_encrypted;
+ __u16 dimm_health;
+} __packed;
+
#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 05eb56ecab5e..984942be24c1 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -421,6 +421,72 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
return 0;
}
+/* Fetch the DIMM health info and populate it in provided package. */
+static int papr_pdsm_health(struct papr_scm_priv *p,
+ struct nd_pdsm_cmd_pkg *pkg)
+{
+ int rc;
+ struct nd_papr_pdsm_health health = { 0 };
+ u16 copysize = sizeof(struct nd_papr_pdsm_health);
+ u16 payload_size = pkg->hdr.nd_size_out - ND_PDSM_ENVELOPE_HDR_SIZE;
+
+ /* Ensure correct payload size that can hold struct nd_papr_pdsm_health */
+ if (payload_size != copysize) {
+ dev_dbg(&p->pdev->dev,
+ "Unexpected payload-size (%u). Expected (%u)",
+ pkg->hdr.nd_size_out, copysize);
+ rc = -ENOSPC;
+ goto out;
+ }
+
+ /* Ensure dimm health mutex is taken preventing concurrent access */
+ rc = mutex_lock_interruptible(&p->health_mutex);
+ if (rc)
+ goto out;
+
+ /* Always fetch upto date dimm health data ignoring cached values */
+ rc = __drc_pmem_query_health(p);
+ if (rc) {
+ mutex_unlock(&p->health_mutex);
+ goto out;
+ }
+
+ /* update health struct with various flags derived from health bitmap */
+ health = (struct nd_papr_pdsm_health) {
+ .dimm_unarmed = p->health_bitmap & PAPR_PMEM_UNARMED_MASK,
+ .dimm_bad_shutdown = p->health_bitmap & PAPR_PMEM_BAD_SHUTDOWN_MASK,
+ .dimm_bad_restore = p->health_bitmap & PAPR_PMEM_BAD_RESTORE_MASK,
+ .dimm_encrypted = p->health_bitmap & PAPR_PMEM_ENCRYPTED,
+ .dimm_locked = p->health_bitmap & PAPR_PMEM_SCRUBBED_AND_LOCKED,
+ .dimm_scrubbed = p->health_bitmap & PAPR_PMEM_SCRUBBED_AND_LOCKED,
+ .dimm_health = PAPR_PDSM_DIMM_HEALTHY,
+ };
+
+ /* Update field dimm_health based on health_bitmap flags */
+ if (p->health_bitmap & PAPR_PMEM_HEALTH_FATAL)
+ health.dimm_health = PAPR_PDSM_DIMM_FATAL;
+ else if (p->health_bitmap & PAPR_PMEM_HEALTH_CRITICAL)
+ health.dimm_health = PAPR_PDSM_DIMM_CRITICAL;
+ else if (p->health_bitmap & PAPR_PMEM_HEALTH_UNHEALTHY)
+ health.dimm_health = PAPR_PDSM_DIMM_UNHEALTHY;
+
+ /* struct populated hence can release the mutex now */
+ mutex_unlock(&p->health_mutex);
+
+ dev_dbg(&p->pdev->dev, "Copying payload size=%u\n", copysize);
+
+ /* Copy the health struct to the payload */
+ memcpy(pdsm_cmd_to_payload(pkg), &health, copysize);
+
+ /* Update fw size including size of struct nd_pdsm_cmd_pkg fields */
+ pkg->hdr.nd_fw_size = copysize + ND_PDSM_ENVELOPE_HDR_SIZE;
+
+out:
+ dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
+
+ return rc;
+}
+
/*
* For a given pdsm request call an appropriate service function.
* Note: Use 'nd_pdsm_cmd_pkg.cmd_status to report psdm servicing errors. Hence
@@ -435,6 +501,10 @@ static void papr_scm_service_pdsm(struct papr_scm_priv *p,
/* Call pdsm service function */
switch (pdsm) {
+ case PAPR_PDSM_HEALTH:
+ pkg->cmd_status = papr_pdsm_health(p, pkg);
+ break;
+
default:
dev_dbg(&p->pdev->dev, "PDSM[0x%x]: Unsupported PDSM request\n",
pdsm);
--
2.26.2
^ permalink raw reply related
* Re: [musl] Re: ppc64le and 32-bit LE userland compatibility
From: Segher Boessenkool @ 2020-06-04 23:42 UTC (permalink / raw)
To: Joseph Myers
Cc: Rich Felker, libc-alpha, eery, Daniel Kolesa, musl, Will Springer,
Palmer Dabbelt via binutils, via libc-dev, Michal Suchánek,
linuxppc-dev
In-Reply-To: <alpine.DEB.2.21.2006042154060.8237@digraph.polyomino.org.uk>
Hi!
On Thu, Jun 04, 2020 at 10:08:02PM +0000, Joseph Myers wrote:
> > The ELFv2 document specifies things like passing of quadruple precision
> > floats. Indeed, VSX is needed there, but that's not a concern if you
> > *don't* use quadruple precision floats.
>
> My understanding is that the registers used for argument passing are all
> ones that exactly correspond to the Vector registers in earlier
> instruction set versions. In other words, you could *in principle*
> produce an object, or a whole libm shared library, [...]
And then there is the VRSAVE register, if your OS still uses that.
Let's hope not :-)
This is similar to what -mno-float128-hardware does (which actually
requires VSX hardware for the emulation library currently).
Segher
^ permalink raw reply
* [PATCH v10 2/6] seq_buf: Export seq_buf_printf
From: Vaibhav Jain @ 2020-06-04 23:41 UTC (permalink / raw)
To: linuxppc-dev, linux-nvdimm, linux-kernel
Cc: Santosh Sivaraj, Cezary Rojewski, Piotr Maziarz, Steven Rostedt,
Christoph Hellwig, Oliver O'Halloran, Aneesh Kumar K . V,
Borislav Petkov, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200604234136.253703-1-vaibhav@linux.ibm.com>
'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it over-flowing. However even
though the API has been stable for couple of years now its still not
exported to kernel loadable modules limiting its usage.
Hence this patch proposes update to 'seq_buf.c' to mark
seq_buf_printf() which is part of the seq_buf API to be exported to
kernel loadable GPL modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.
Cc: Piotr Maziarz <piotrx.maziarz@linux.intel.com>
Cc: Cezary Rojewski <cezary.rojewski@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:
v9..v10:
* None
Resend:
* Added ack from Steven Rostedt
v8..v9:
* None
v7..v8:
* Updated the patch title [ Christoph Hellwig ]
* Updated patch description to replace confusing term 'external kernel
modules' to 'kernel lodable modules'.
Resend:
* Added ack from Steven Rostedt
v6..v7:
* New patch in the series
---
lib/seq_buf.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)
return ret;
}
+EXPORT_SYMBOL_GPL(seq_buf_printf);
#ifdef CONFIG_BINARY_PRINTF
/**
--
2.26.2
^ permalink raw reply related
* [PATCH v10 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
From: Vaibhav Jain @ 2020-06-04 23:41 UTC (permalink / raw)
To: linuxppc-dev, linux-nvdimm, linux-kernel
Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200604234136.253703-1-vaibhav@linux.ibm.com>
Since papr_scm_ndctl() can be called from outside papr_scm, its
exposed to the possibility of receiving NULL as value of 'cmd_rc'
argument. This patch updates papr_scm_ndctl() to protect against such
possibility by assigning it pointer to a local variable in case cmd_rc
== NULL.
Finally the patch also updates the 'default' clause of the switch-case
block removing a 'return' statement thereby ensuring that value of
'cmd_rc' is always logged when papr_scm_ndctl() returns.
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:
v9..v10
* New patch in the series
---
arch/powerpc/platforms/pseries/papr_scm.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 0c091622b15e..6512fe6a2874 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
{
struct nd_cmd_get_config_size *get_size_hdr;
struct papr_scm_priv *p;
+ int rc;
/* Only dimm-specific calls are supported atm */
if (!nvdimm)
return -EINVAL;
+ /* Use a local variable in case cmd_rc pointer is NULL */
+ if (!cmd_rc)
+ cmd_rc = &rc;
+
p = nvdimm_provider_data(nvdimm);
switch (cmd) {
@@ -381,12 +386,13 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
break;
default:
- return -EINVAL;
+ dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
+ *cmd_rc = -EINVAL;
}
dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
- return 0;
+ return *cmd_rc;
}
static ssize_t flags_show(struct device *dev,
--
2.26.2
^ permalink raw reply related
* [PATCH v10 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP
From: Vaibhav Jain @ 2020-06-04 23:41 UTC (permalink / raw)
To: linuxppc-dev, linux-nvdimm, linux-kernel
Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200604234136.253703-1-vaibhav@linux.ibm.com>
Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmap, bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried, 60s after the previous successful hcall.
The patch also adds a documentation text describing flags reported by
the the new sysfs attribute 'papr/flags' is also introduced at
Documentation/ABI/testing/sysfs-bus-papr-pmem.
[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:
v9..v10:
* Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ].
Resend:
* Added ack from Aneesh.
v8..v9:
* Rename some variables and defines to reduce usage of term SCM
replacing it with PMEM [Dan Williams, Aneesh]
* s/PAPR_SCM_DIMM/PAPR_PMEM/g
* s/papr_scm_nd_attributes/papr_nd_attributes/g
* s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
* s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
* Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem
v7..v8:
* Update type of variable 'rc' in __drc_pmem_query_health() and
drc_pmem_query_health() to long and int respectively. [ Ira ]
* Updated the patch description to s/64 bit Big Endian Number/64-bit
bitmap/ [ Ira, Aneesh ].
Resend:
* None
v6..v7 :
* Used the exported buf_seq_printf() function to generate content for
'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
and removed the papr_scm.h file [Mpe]
* Some minor consistency issued in sysfs-bus-papr-scm
documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two function one of which takes
care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
READ_ONCE(). [Mpe]
v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
[Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
integer [Mpe]
* Updated patch description to reflect the changes made in this
version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
flags_show() [Dan Williams]
v4..v5 :
* None
v3..v4 :
* None
v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
NVDIMM unarmed [Aneesh]
v1..v2 :
* New patch in the series.
---
Documentation/ABI/testing/sysfs-bus-papr-pmem | 27 +++
arch/powerpc/platforms/pseries/papr_scm.c | 168 +++++++++++++++++-
2 files changed, 193 insertions(+), 2 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem b/Documentation/ABI/testing/sysfs-bus-papr-pmem
new file mode 100644
index 000000000000..5b10d036a8d4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -0,0 +1,27 @@
+What: /sys/bus/nd/devices/nmemX/papr/flags
+Date: Apr, 2020
+KernelVersion: v5.8
+Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
+Description:
+ (RO) Report flags indicating various states of a
+ papr-pmem NVDIMM device. Each flag maps to a one or
+ more bits set in the dimm-health-bitmap retrieved in
+ response to H_SCM_HEALTH hcall. The details of the bit
+ flags returned in response to this hcall is available
+ at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+ the flags reported in this sysfs file:
+
+ * "not_armed" : Indicates that NVDIMM contents will not
+ survive a power cycle.
+ * "flush_fail" : Indicates that NVDIMM contents
+ couldn't be flushed during last
+ shut-down event.
+ * "restore_fail": Indicates that NVDIMM contents
+ couldn't be restored during NVDIMM
+ initialization.
+ * "encrypted" : NVDIMM contents are encrypted.
+ * "smart_notify": There is health event for the NVDIMM.
+ * "scrubbed" : Indicating that contents of the
+ NVDIMM have been scrubbed.
+ * "locked" : Indicating that NVDIMM contents cant
+ be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f35592423380..0c091622b15e 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -12,6 +12,7 @@
#include <linux/libnvdimm.h>
#include <linux/platform_device.h>
#include <linux/delay.h>
+#include <linux/seq_buf.h>
#include <asm/plpar_wrappers.h>
@@ -22,6 +23,44 @@
(1ul << ND_CMD_GET_CONFIG_DATA) | \
(1ul << ND_CMD_SET_CONFIG_DATA))
+/* DIMM health bitmap bitmap indicators */
+/* SCM device is unable to persist memory contents */
+#define PAPR_PMEM_UNARMED (1ULL << (63 - 0))
+/* SCM device failed to persist memory contents */
+#define PAPR_PMEM_SHUTDOWN_DIRTY (1ULL << (63 - 1))
+/* SCM device contents are persisted from previous IPL */
+#define PAPR_PMEM_SHUTDOWN_CLEAN (1ULL << (63 - 2))
+/* SCM device contents are not persisted from previous IPL */
+#define PAPR_PMEM_EMPTY (1ULL << (63 - 3))
+/* SCM device memory life remaining is critically low */
+#define PAPR_PMEM_HEALTH_CRITICAL (1ULL << (63 - 4))
+/* SCM device will be garded off next IPL due to failure */
+#define PAPR_PMEM_HEALTH_FATAL (1ULL << (63 - 5))
+/* SCM contents cannot persist due to current platform health status */
+#define PAPR_PMEM_HEALTH_UNHEALTHY (1ULL << (63 - 6))
+/* SCM device is unable to persist memory contents in certain conditions */
+#define PAPR_PMEM_HEALTH_NON_CRITICAL (1ULL << (63 - 7))
+/* SCM device is encrypted */
+#define PAPR_PMEM_ENCRYPTED (1ULL << (63 - 8))
+/* SCM device has been scrubbed and locked */
+#define PAPR_PMEM_SCRUBBED_AND_LOCKED (1ULL << (63 - 9))
+
+/* Bits status indicators for health bitmap indicating unarmed dimm */
+#define PAPR_PMEM_UNARMED_MASK (PAPR_PMEM_UNARMED | \
+ PAPR_PMEM_HEALTH_UNHEALTHY)
+
+/* Bits status indicators for health bitmap indicating unflushed dimm */
+#define PAPR_PMEM_BAD_SHUTDOWN_MASK (PAPR_PMEM_SHUTDOWN_DIRTY)
+
+/* Bits status indicators for health bitmap indicating unrestored dimm */
+#define PAPR_PMEM_BAD_RESTORE_MASK (PAPR_PMEM_EMPTY)
+
+/* Bit status indicators for smart event notification */
+#define PAPR_PMEM_SMART_EVENT_MASK (PAPR_PMEM_HEALTH_CRITICAL | \
+ PAPR_PMEM_HEALTH_FATAL | \
+ PAPR_PMEM_HEALTH_UNHEALTHY)
+
+/* private struct associated with each region */
struct papr_scm_priv {
struct platform_device *pdev;
struct device_node *dn;
@@ -39,6 +78,15 @@ struct papr_scm_priv {
struct resource res;
struct nd_region *region;
struct nd_interleave_set nd_set;
+
+ /* Protect dimm health data from concurrent read/writes */
+ struct mutex health_mutex;
+
+ /* Last time the health information of the dimm was updated */
+ unsigned long lasthealth_jiffies;
+
+ /* Health information for the dimm */
+ u64 health_bitmap;
};
static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -144,6 +192,61 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
return drc_pmem_bind(p);
}
+/*
+ * Issue hcall to retrieve dimm health info and populate papr_scm_priv with the
+ * health information.
+ */
+static int __drc_pmem_query_health(struct papr_scm_priv *p)
+{
+ unsigned long ret[PLPAR_HCALL_BUFSIZE];
+ long rc;
+
+ /* issue the hcall */
+ rc = plpar_hcall(H_SCM_HEALTH, ret, p->drc_index);
+ if (rc != H_SUCCESS) {
+ dev_err(&p->pdev->dev,
+ "Failed to query health information, Err:%ld\n", rc);
+ return -ENXIO;
+ }
+
+ p->lasthealth_jiffies = jiffies;
+ p->health_bitmap = ret[0] & ret[1];
+
+ dev_dbg(&p->pdev->dev,
+ "Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
+ ret[0], ret[1]);
+
+ return 0;
+}
+
+/* Min interval in seconds for assuming stable dimm health */
+#define MIN_HEALTH_QUERY_INTERVAL 60
+
+/* Query cached health info and if needed call drc_pmem_query_health */
+static int drc_pmem_query_health(struct papr_scm_priv *p)
+{
+ unsigned long cache_timeout;
+ int rc;
+
+ /* Protect concurrent modifications to papr_scm_priv */
+ rc = mutex_lock_interruptible(&p->health_mutex);
+ if (rc)
+ return rc;
+
+ /* Jiffies offset for which the health data is assumed to be same */
+ cache_timeout = p->lasthealth_jiffies +
+ msecs_to_jiffies(MIN_HEALTH_QUERY_INTERVAL * 1000);
+
+ /* Fetch new health info is its older than MIN_HEALTH_QUERY_INTERVAL */
+ if (time_after(jiffies, cache_timeout))
+ rc = __drc_pmem_query_health(p);
+ else
+ /* Assume cached health data is valid */
+ rc = 0;
+
+ mutex_unlock(&p->health_mutex);
+ return rc;
+}
static int papr_scm_meta_get(struct papr_scm_priv *p,
struct nd_cmd_get_config_data_hdr *hdr)
@@ -286,6 +389,64 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
return 0;
}
+static ssize_t flags_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct nvdimm *dimm = to_nvdimm(dev);
+ struct papr_scm_priv *p = nvdimm_provider_data(dimm);
+ struct seq_buf s;
+ u64 health;
+ int rc;
+
+ rc = drc_pmem_query_health(p);
+ if (rc)
+ return rc;
+
+ /* Copy health_bitmap locally, check masks & update out buffer */
+ health = READ_ONCE(p->health_bitmap);
+
+ seq_buf_init(&s, buf, PAGE_SIZE);
+ if (health & PAPR_PMEM_UNARMED_MASK)
+ seq_buf_printf(&s, "not_armed ");
+
+ if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
+ seq_buf_printf(&s, "flush_fail ");
+
+ if (health & PAPR_PMEM_BAD_RESTORE_MASK)
+ seq_buf_printf(&s, "restore_fail ");
+
+ if (health & PAPR_PMEM_ENCRYPTED)
+ seq_buf_printf(&s, "encrypted ");
+
+ if (health & PAPR_PMEM_SMART_EVENT_MASK)
+ seq_buf_printf(&s, "smart_notify ");
+
+ if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED)
+ seq_buf_printf(&s, "scrubbed locked ");
+
+ if (seq_buf_used(&s))
+ seq_buf_printf(&s, "\n");
+
+ return seq_buf_used(&s);
+}
+DEVICE_ATTR_RO(flags);
+
+/* papr_scm specific dimm attributes */
+static struct attribute *papr_nd_attributes[] = {
+ &dev_attr_flags.attr,
+ NULL,
+};
+
+static struct attribute_group papr_nd_attribute_group = {
+ .name = "papr",
+ .attrs = papr_nd_attributes,
+};
+
+static const struct attribute_group *papr_nd_attr_groups[] = {
+ &papr_nd_attribute_group,
+ NULL,
+};
+
static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
{
struct device *dev = &p->pdev->dev;
@@ -312,8 +473,8 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
dimm_flags = 0;
set_bit(NDD_LABELING, &dimm_flags);
- p->nvdimm = nvdimm_create(p->bus, p, NULL, dimm_flags,
- PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
+ p->nvdimm = nvdimm_create(p->bus, p, papr_nd_attr_groups,
+ dimm_flags, PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
if (!p->nvdimm) {
dev_err(dev, "Error creating DIMM object for %pOF\n", p->dn);
goto err;
@@ -399,6 +560,9 @@ static int papr_scm_probe(struct platform_device *pdev)
if (!p)
return -ENOMEM;
+ /* Initialize the dimm mutex */
+ mutex_init(&p->health_mutex);
+
/* optional DT properties */
of_property_read_u32(dn, "ibm,metadata-size", &metadata_size);
--
2.26.2
^ permalink raw reply related
* [PATCH v10 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
From: Vaibhav Jain @ 2020-06-04 23:41 UTC (permalink / raw)
To: linuxppc-dev, linux-nvdimm, linux-kernel
Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200604234136.253703-1-vaibhav@linux.ibm.com>
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.
The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
to communicate the PDSM request via member
'nd_cmd_pkg.nd_command' and size of payload that need to be
sent/received for servicing the PDSM.
A new function is_cmd_valid() is implemented that reads the args to
papr_scm_ndctl() and performs sanity tests on them. A new function
papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm.
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:
v9..v10:
* Simplified 'struct nd_pdsm_cmd_pkg' by removing the
'payload_version' field.
* Removed the corrosponding documentation on versioning and backward
compatibility from 'papr_pdsm.h'
* Reduced the size of reserved fields to 4-bytes making 'struct
nd_pdsm_cmd_pkg' 64 + 8 bytes long.
* Updated is_cmd_valid() to enforce validation checks on pdsm
commands. [ Dan Williams ]
* Added check for reserved fields being set to '0' in is_cmd_valid()
[ Ira ]
* Moved changes for checking cmd_rc == NULL and logging improvements
to a separate prelim patch [ Ira ].
* Moved pdsm package validation checks from papr_scm_service_pdsm()
to is_cmd_valid().
* Marked papr_scm_service_pdsm() return type as 'void' since errors
are reported in nd_pdsm_cmd_pkg.cmd_status field.
Resend:
* Added ack from Aneesh.
v8..v9:
* Reduced the usage of term SCM replacing it with appropriate
replacement [ Dan Williams, Aneesh ]
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
* s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
* s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
* Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
* Minor update to patch description.
v7..v8:
* Removed the 'payload_offset' field from 'struct
nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
* To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
'reserved' field of 10-bytes is introduced. [ Aneesh ]
* Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
[ Ira ]
Resend:
* None
v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
[Mpe].
* Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
* Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
[Mpe].
* Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]
v5..v6 :
* Changed the usage of the term DSM to PDSM to distinguish it from the
ACPI term [ Dan Williams ]
* Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
to reflect the new terminology.
* Updated the patch description and title to reflect the new terminology.
* Squashed patch to introduce new command family in 'ndctl.h' with
this patch [ Dan Williams ]
* Updated the papr_scm_pdsm method starting index from 0x10000 to 0x0
[ Dan Williams ]
* Removed redundant license text from the papr_scm_psdm.h file.
[ Dan Williams ]
* s/envelop/envelope/ at various places [ Dan Williams ]
* Added '__packed' attribute to command package header to gaurd
against different compiler adding paddings between the fields.
[ Dan Williams]
* Converted various pr_debug to dev_debug [ Dan Williams ]
v4..v5 :
* None
v3..v4 :
* None
v2..v3 :
* Updated the patch prefix to 'ndctl/uapi' [Aneesh]
v1..v2 :
* None
---
arch/powerpc/include/uapi/asm/papr_pdsm.h | 98 +++++++++++++++++++
arch/powerpc/platforms/pseries/papr_scm.c | 113 +++++++++++++++++++++-
include/uapi/linux/ndctl.h | 1 +
3 files changed, 207 insertions(+), 5 deletions(-)
create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h
diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
new file mode 100644
index 000000000000..8b1a4f8fa316
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * PAPR nvDimm Specific Methods (PDSM) and structs for libndctl
+ *
+ * (C) Copyright IBM 2020
+ *
+ * Author: Vaibhav Jain <vaibhav at linux.ibm.com>
+ */
+
+#ifndef _UAPI_ASM_POWERPC_PAPR_PDSM_H_
+#define _UAPI_ASM_POWERPC_PAPR_PDSM_H_
+
+#include <linux/types.h>
+
+/*
+ * PDSM Envelope:
+ *
+ * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
+ * envelope which consists of a header and user-defined payload sections.
+ * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
+ * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
+ * There is reserved field that can used to introduce new fields to the
+ * structure in future. It also tries to ensure that 'nd_pdsm_cmd_pkg.payload'
+ * lies at a 8-byte boundary.
+ *
+ * +-------------+---------------------+---------------------------+
+ * | 64-Bytes | 8-Bytes | Max 184-Bytes |
+ * +-------------+---------------------+---------------------------+
+ * | nd_pdsm_cmd_pkg | |
+ * |-------------+ | |
+ * | nd_cmd_pkg | | |
+ * +-------------+---------------------+---------------------------+
+ * | nd_family | | |
+ * | nd_size_out | cmd_status | |
+ * | nd_size_in | reserved | payload |
+ * | nd_command | | |
+ * | nd_fw_size | | |
+ * +-------------+---------------------+---------------------------+
+ *
+ * PDSM Header:
+ *
+ * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
+ * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to member
+ * 'nd_cmd_pkg.nd_command'. Apart from size information of the envelope which is
+ * contained in 'struct nd_cmd_pkg', the header also has members following
+ * members:
+ *
+ * 'cmd_status' : (Out) Errors if any encountered while servicing PDSM.
+ * 'reserved' : Not used, reserved for future and should be set to 0.
+ *
+ * PDSM Payload:
+ *
+ * The layout of the PDSM Payload is defined by various structs shared between
+ * papr_scm and libndctl so that contents of payload can be interpreted. During
+ * servicing of a PDSM the papr_scm module will read input args from the payload
+ * field by casting its contents to an appropriate struct pointer based on the
+ * PDSM command. Similarly the output of servicing the PDSM command will be
+ * copied to the payload field using the same struct.
+ *
+ * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size, which
+ * leaves around 184 bytes for the envelope payload (ignoring any padding that
+ * the compiler may silently introduce).
+ *
+ */
+
+/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
+struct nd_pdsm_cmd_pkg {
+ struct nd_cmd_pkg hdr; /* Package header containing sub-cmd */
+ __s32 cmd_status; /* Out: Sub-cmd status returned back */
+ __u16 reserved[2]; /* Ignored and to be used in future */
+ __u8 payload[]; /* In/Out: Sub-cmd data buffer */
+} __packed;
+
+/*
+ * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
+ * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
+ */
+enum papr_pdsm {
+ PAPR_PDSM_MIN = 0x0,
+ PAPR_PDSM_MAX,
+};
+
+/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
+static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
+{
+ return (struct nd_pdsm_cmd_pkg *)cmd;
+}
+
+/* Return the payload pointer for a given pcmd */
+static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
+{
+ if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
+ return NULL;
+ else
+ return (void *)(pcmd->payload);
+}
+
+#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 6512fe6a2874..05eb56ecab5e 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -15,13 +15,15 @@
#include <linux/seq_buf.h>
#include <asm/plpar_wrappers.h>
+#include <asm/papr_pdsm.h>
#define BIND_ANY_ADDR (~0ul)
#define PAPR_SCM_DIMM_CMD_MASK \
((1ul << ND_CMD_GET_CONFIG_SIZE) | \
(1ul << ND_CMD_GET_CONFIG_DATA) | \
- (1ul << ND_CMD_SET_CONFIG_DATA))
+ (1ul << ND_CMD_SET_CONFIG_DATA) | \
+ (1ul << ND_CMD_CALL))
/* DIMM health bitmap bitmap indicators */
/* SCM device is unable to persist memory contents */
@@ -349,22 +351,117 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
return 0;
}
+/*
+ * Do a sanity checks on the inputs args to dimm-control function and return
+ * '0' if valid. This also does validation on ND_CMD_CALL sub-command packages.
+ */
+static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
+ unsigned int buf_len)
+{
+ unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
+ struct nd_pdsm_cmd_pkg *pkg;
+ struct nd_cmd_pkg *nd_cmd;
+ struct papr_scm_priv *p;
+ enum papr_pdsm pdsm;
+
+ /* Only dimm-specific calls are supported atm */
+ if (!nvdimm)
+ return -EINVAL;
+
+ /* get the provider data from struct nvdimm */
+ p = nvdimm_provider_data(nvdimm);
+
+ if (!test_bit(cmd, &cmd_mask)) {
+ dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
+ return -EINVAL;
+ }
+
+ /* For CMD_CALL verify pdsm request */
+ if (cmd == ND_CMD_CALL) {
+ /* Verify the envelope package */
+ if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
+ dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
+ buf_len);
+ return -EINVAL;
+ }
+
+ /* Verify that the nd_cmd_pkg.nd_family is correct */
+ nd_cmd = (struct nd_cmd_pkg *)buf;
+ if (nd_cmd->nd_family != NVDIMM_FAMILY_PAPR) {
+ dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
+ nd_cmd->nd_family);
+ return -EINVAL;
+ }
+
+ /* Get the pdsm request package and the command */
+ pkg = nd_to_pdsm_cmd_pkg(nd_cmd);
+ pdsm = pkg->hdr.nd_command;
+
+ /* Verify if the psdm command is valid */
+ if (pdsm <= PAPR_PDSM_MIN || pdsm >= PAPR_PDSM_MAX) {
+ dev_dbg(&p->pdev->dev, "PDSM[0x%x]: Invalid PDSM\n", pdsm);
+ return -EINVAL;
+ }
+
+ /* We except a payload with all PDSM commands */
+ if (!pdsm_cmd_to_payload(pkg)) {
+ dev_dbg(&p->pdev->dev, "PDSM[0x%x]: Empty payload\n", pdsm);
+ return -EINVAL;
+ }
+
+ /* Ensure reserved fields of the pdsm header are set to 0 */
+ if (pkg->reserved[0] || pkg->reserved[1]) {
+ dev_dbg(&p->pdev->dev,
+ "PDSM[0x%x]: Invalid reserved field usage\n", pdsm);
+ return -EINVAL;
+ }
+ }
+
+ /* Let the command be further processed */
+ return 0;
+}
+
+/*
+ * For a given pdsm request call an appropriate service function.
+ * Note: Use 'nd_pdsm_cmd_pkg.cmd_status to report psdm servicing errors. Hence
+ * function marked as void.
+ */
+static void papr_scm_service_pdsm(struct papr_scm_priv *p,
+ struct nd_pdsm_cmd_pkg *pkg)
+{
+ const enum papr_pdsm pdsm = pkg->hdr.nd_command;
+
+ dev_dbg(&p->pdev->dev, "PDSM[0x%x]: Servicing..\n", pdsm);
+
+ /* Call pdsm service function */
+ switch (pdsm) {
+ default:
+ dev_dbg(&p->pdev->dev, "PDSM[0x%x]: Unsupported PDSM request\n",
+ pdsm);
+ pkg->cmd_status = -ENOENT;
+ break;
+ }
+}
+
static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
struct nvdimm *nvdimm, unsigned int cmd, void *buf,
unsigned int buf_len, int *cmd_rc)
{
struct nd_cmd_get_config_size *get_size_hdr;
+ struct nd_pdsm_cmd_pkg *call_pkg = NULL;
struct papr_scm_priv *p;
int rc;
- /* Only dimm-specific calls are supported atm */
- if (!nvdimm)
- return -EINVAL;
-
/* Use a local variable in case cmd_rc pointer is NULL */
if (!cmd_rc)
cmd_rc = &rc;
+ *cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len);
+ if (*cmd_rc) {
+ pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc);
+ return *cmd_rc;
+ }
+
p = nvdimm_provider_data(nvdimm);
switch (cmd) {
@@ -385,6 +482,12 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
*cmd_rc = papr_scm_meta_set(p, buf);
break;
+ case ND_CMD_CALL:
+ call_pkg = nd_to_pdsm_cmd_pkg(buf);
+ papr_scm_service_pdsm(p, call_pkg);
+ *cmd_rc = 0;
+ break;
+
default:
dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
*cmd_rc = -EINVAL;
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index de5d90212409..0e09dc5cec19 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -244,6 +244,7 @@ struct nd_cmd_pkg {
#define NVDIMM_FAMILY_HPE2 2
#define NVDIMM_FAMILY_MSFT 3
#define NVDIMM_FAMILY_HYPERV 4
+#define NVDIMM_FAMILY_PAPR 5
#define ND_IOCTL_CALL _IOWR(ND_IOCTL, ND_CMD_CALL,\
struct nd_cmd_pkg)
--
2.26.2
^ permalink raw reply related
* [PATCH v10 1/6] powerpc: Document details on H_SCM_HEALTH hcall
From: Vaibhav Jain @ 2020-06-04 23:41 UTC (permalink / raw)
To: linuxppc-dev, linux-nvdimm, linux-kernel
Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200604234136.253703-1-vaibhav@linux.ibm.com>
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Acked-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:
v9..v10:
* Added ack from Ira.
Resend:
* None
v8..v9:
* s/SCM/PMEM device. [ Dan Williams, Aneesh ]
v7..v8:
* Added a clarification on bit-ordering of Health Bitmap
Resend:
* None
v6..v7:
* None
v5..v6:
* New patch in the series
---
Documentation/powerpc/papr_hcalls.rst | 46 ++++++++++++++++++++++++---
1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..48fcf1255a33 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,51 @@ from the LPAR memory.
**H_SCM_HEALTH**
| Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
| Return Value: *H_Success, H_Parameter, H_Hardware*
Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the PMEM device. The asserted bits in the health-bitmap indicate one or more states
+(described in table below) of the PMEM device and health-bit-valid-bitmap indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400000000000000
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++------+-----------------------------------------------------------------------+
+| Bit | Definition |
++======+=======================================================================+
+| 00 | PMEM device is unable to persist memory contents. |
+| | If the system is powered down, nothing will be saved. |
++------+-----------------------------------------------------------------------+
+| 01 | PMEM device failed to persist memory contents. Either contents were |
+| | not saved successfully on power down or were not restored properly on |
+| | power up. |
++------+-----------------------------------------------------------------------+
+| 02 | PMEM device contents are persisted from previous IPL. The data from |
+| | the last boot were successfully restored. |
++------+-----------------------------------------------------------------------+
+| 03 | PMEM device contents are not persisted from previous IPL. There was no|
+| | data to restore from the last boot. |
++------+-----------------------------------------------------------------------+
+| 04 | PMEM device memory life remaining is critically low |
++------+-----------------------------------------------------------------------+
+| 05 | PMEM device will be garded off next IPL due to failure |
++------+-----------------------------------------------------------------------+
+| 06 | PMEM device contents cannot persist due to current platform health |
+| | status. A hardware failure may prevent data from being saved or |
+| | restored. |
++------+-----------------------------------------------------------------------+
+| 07 | PMEM device is unable to persist memory contents in certain conditions|
++------+-----------------------------------------------------------------------+
+| 08 | PMEM device is encrypted |
++------+-----------------------------------------------------------------------+
+| 09 | PMEM device has successfully completed a requested erase or secure |
+| | erase procedure. |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused |
++------+-----------------------------------------------------------------------+
**H_SCM_PERFORMANCE_STATS**
--
2.26.2
^ permalink raw reply related
* [PATCH v10 0/6] powerpc/papr_scm: Add support for reporting nvdimm health
From: Vaibhav Jain @ 2020-06-04 23:41 UTC (permalink / raw)
To: linuxppc-dev, linux-nvdimm, linux-kernel
Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
Changes since v9 [1]:
* Addressed review comments from Ira and Dan Williams.
* Removed the contentious 'payload_version' field from struct
nd_pdsm_cmd_pkg.
* Also removed code/defines related to handling of different version
of a pdsm payload struct.
* Consolidated validation checks for nd_pdsm_cmd_pkg in
is_cmd_valid().
* Added a check in is_cmd_valid() to ensure reserved fields in struct
nd_pdsm_cmd_pkg are set to '0'.
* Reworked papr_pdsm_health() to avoid removing code that was added in
initial part of this patch-series.
* Added a new patch to the series to move out some proposed changes to
papr_scm_ndctl() in an independent patch.
* Reworked papr_pdsm_health() to ensure correct payload_size in the
pdsm command package.
[1] https://lore.kernel.org/linux-nvdimm/20200602101438.73929-1-vaibhav@linux.ibm.com
---
The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3]. Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.
To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3]. This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.
These changes coupled with proposed ndtcl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.
Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:
# ndctl list -DH
[
{
"dev":"nmem0",
"health":{
"health_state":"fatal",
"shutdown_state":"dirty"
}
}
]
Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:
# ndctl list -d nmem0 -H
[
{
"dev":"nmem0",
"health":{
"health_state":"ok",
"shutdown_state":"clean"
}
}
]
PAPR NVDIMM-Specific-Methods(PDSM)
==================================
PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_pdsm.h'. Support for
more PDSMs will be added in future.
Structure of the patch-set
==========================
The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
thats used in subsequent patches to generate sysfs attribute content.
Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.
Fourth patch updates papr_scm_ndctl() to handle a possible error case
and also improve debug logging.
Fifth patch deals with implementing support for servicing PDSM
commands in papr_scm module.
Finally the last patch implements support for servicing PDSM
'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.
References:
[2] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v10
---
Vaibhav Jain (6):
powerpc: Document details on H_SCM_HEALTH hcall
seq_buf: Export seq_buf_printf
powerpc/papr_scm: Fetch nvdimm health information from PHYP
powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
Documentation/ABI/testing/sysfs-bus-papr-pmem | 27 ++
Documentation/powerpc/papr_hcalls.rst | 46 ++-
arch/powerpc/include/uapi/asm/papr_pdsm.h | 131 +++++++
arch/powerpc/platforms/pseries/papr_scm.c | 361 +++++++++++++++++-
include/uapi/linux/ndctl.h | 1 +
lib/seq_buf.c | 1 +
6 files changed, 554 insertions(+), 13 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h
--
2.26.2
^ permalink raw reply
* Re: [musl] Re: ppc64le and 32-bit LE userland compatibility
From: Segher Boessenkool @ 2020-06-04 23:35 UTC (permalink / raw)
To: Daniel Kolesa
Cc: Rich Felker, libc-alpha, eery, musl, Will Springer,
Palmer Dabbelt via binutils, via libc-dev, Michal Suchánek,
linuxppc-dev, Joseph Myers
In-Reply-To: <60fa8bd7-2439-4403-a0eb-166a2fb49a4b@www.fastmail.com>
Hi!
On Thu, Jun 04, 2020 at 11:43:53PM +0200, Daniel Kolesa wrote:
> The thing is, I've yet to see in which way the ELFv2 ABI *actually* requires VSX - I don't think compiling for 970 introduces any actual differences. There will be omissions, yes - but then the more accurate thing would be to say that a subset of ELFv2 is used, rather than it being a different ABI per se.
Two big things are that binaries that someone else made are supposed to
work for you as well -- including binaries using VSX registers, or any
instructions that require ISA 2.07 (or some older ISA after 970). This
includes DSOs (shared libraries). So for a distribution this means that
they will not use VSX *anywhere*, or only in very specialised things.
That is a many-years setback, for people/situations where it could be
used.
> The ELFv2 document specifies things like passing of quadruple precision floats. Indeed, VSX is needed there, but that's not a concern if you *don't* use quadruple precision floats.
As Joseph says, QP float is passed in the VRs, so that works just fine
*if* you have AltiVec. If not, you probably should pass such values in
the GPRs, or however a struct of 16 bytes is passed in whatever ABI you
use (this is a much more general problem than just BE ELFv2 ;-) ) And
then you have some kind of "partial soft float". Fun! (Or not...)
> > If you always use -mcpu=970 (or similar), then not very much is
> > different for you most likely -- except of course there is no promise
> > to the user that they can use VSX and all instructions in ISA 2.07,
> > which is a very useful promise to have normally.
>
> Yes, -mcpu=970 is used for default packages. *However*, it is possible that the user compiles their own packages with -mcpu=power9 or something similar, and then it'll be possible to utilize VSX and all, and it should still work with the existing userland. When speaking of ABI, what matters is... well, the binary interface, which is the same - so I believe this is still ELFv2. A subset is always compliant with the whole.
The same calling convention will probably work, yes. An ABI is more
than that though.
> > Btw, if you use GCC, *please* send in testresults? :-)
>
> Yes, it's all gcc (we do have clang, but compiling repo packages with clang is generally frowned upon in the project, as we have vast majority of packages cross-compilable, and our cross-compiling infrastructure is gcc-centric, plus we enable certain things by default such as hardening flags that clang does not support). I'll try to remember next time I'm running tests.
Thanks in advance!
> I have a feeling that glibc would object to such port, since it means it would have to exist in parallel with a potential different ELFv2 port that does have a POWER8 minimal requirement; gcc would need a way to tell them apart, too (based on what would they be told apart? the simplest way would probably be something like, if --with-abi=elfv2 { if --with-cpu < power8 -> use glibc-novsx else use glibc-vsx } ...)
The target name allows to make such distinctions: this could for example
be powerpc64-*-linux-void (maybe I put the distinction in the wrong
part of the name here? The glibc people will know better, and "void" is
probably not a great name anyway).
Segher
^ permalink raw reply
* Re: [musl] Re: ppc64le and 32-bit LE userland compatibility
From: Segher Boessenkool @ 2020-06-04 23:06 UTC (permalink / raw)
To: Phil Blundell
Cc: libc-alpha, eery, Daniel Kolesa, Will Springer,
Michal Suchánek, linuxppc-dev
In-Reply-To: <20200604215511.GB28641@pbcl.net>
Hi!
On Thu, Jun 04, 2020 at 11:55:11PM +0200, Phil Blundell wrote:
> 1a. Define your own subset of ELFv2 which is interworkable with the full
> ABI at the function call interface but doesn't make all the same
> guarantees about binary compatibility. That would mean that a binary
> built with your toolchain and conforming to the subset ABI would run on
> any system that implements the full ELFv2 ABI, but the opposite is not
> necessarily true. There should be no impediment to getting support for
> such an ABI upstream in any part of the GNU toolchain where it's
> required if you can demonstrate that there's a non-trivial userbase for
> it. The hardest part may be thinking of a name.
And you can only use shared objects also built for that subset ABI. If
you use some binary distribution, then it will also have to be built for
that subset, practically anyway.
This is very similar to soft-float targets. There are "standard" ways
to deal with this. Distros usually balk at having to maintain multiple
variants of a target, and users do not usually want to be restricted to
the lowest common denominator. There always is that tension.
> 1b. Or, if the missing instructions are severe enough that it simply
> isn't possible to have an interworkable implementation, you just need to
> define your own ABI that fits your needs. You can still borrow as much
> as necessary from ELFv2 but you definitely need to call it something
> else at that point. All the other comments from 1a above still apply.
A different name is handy in casual conversation then, yes; but also in
case 1a, it should be clear what is what somehow.
> 2. Implement kernel emulation for the missing instructions. If they
> are seldom used in practice then this might be adequate. Of course,
> binaries that use them intensively will be slow; you'd have to judge
> whether this is better or worse than having them not run at all. If
> you do this then you can implement the full ELFv2 ABI; your own
> toolchain might still choose not to use the instructions that it knows
> are going to be emulated, but at least other binaries will still run
> and you can call yourself compatible.
But not just instructions, there are actual new registers! This might
be way too much work in the case of VSX.
But it is possible that implementing QP float (binary128) this way is
a feasible way forward, _if_ you have AltiVec enabled.
> 3. Persuade whoever controls the ELFv2 ABI to relax their requirements.
> But I assume they didn't make the original decision capriciously so
> this might be hard/impossible. ABI definitions from hardware vendors
> are always slightly political and we just have to accept this.
There is more process involved than most open source people are
comfortable with :-/
> FWIW, we faced a similar situation about 20 years ago when the then-new
> ARM EABI was defined. This essentially required implementations to
> support the ARMv5T instruction set; the committee that defined the ABI
> took the view that requiring implementations to cater for older
> architectures would be too onerous. It was entirely possible to
> implement 99% of the EABI on older processors; such implementations
> weren't strictly conforming but they were interworkable enough to be
> useful in practice, and the "almost-EABI" was still significantly
> better than what had gone before.
Yeah, this situation is quite similar in some ways :-)
The compilers should be able to adjust to what you need pretty easily.
Since you seem to have a distribution on-board already, the biggest
hurdle left is getting glibc to accept the new port, I think. I don't
know if it will be easy to them, or a lot of work instead.
Thanks,
Segher
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox