* [PATCH V2 0/7] powerpc/perf: Fixes for power10 PMU
From: Athira Rajeev @ 2020-11-26 16:54 UTC (permalink / raw)
To: mpe; +Cc: maddy, linuxppc-dev
This patchset contains seven PMU fixes for power10.
Patch 1 includes a fix to update the event code with the
radix_scope_qual bit in power10.
Patches 2 and 3 update the event group constraints for L2/L3
and threshold events in power10.
Patches 4, 5 and 6 include the event code changes for
l2/l3 events and some of the generic events.
Patch 7 adds fixes for the PMCCEXT bit in power10.
Changelog:
Changes from v1 -> v2
- Addressed Michael Ellerman's comments in the patchset.
Split patch 2 to address l2l3 and threshold events
group constraints fixes separately.
Split Patch 3 also to address event code updates
separately for generic and cache events.
Fixed commit messages and also PMCCEXT bit setting
during event enable.
Athira Rajeev (7):
powerpc/perf: Fix to update radix_scope_qual in power10
powerpc/perf: Update the PMU group constraints for l2l3 events in
power10
powerpc/perf: Fix the PMU group constraints for threshold events in
power10
powerpc/perf: Add generic and cache event list for power10 DD1
powerpc/perf: Fix to update generic event codes for power10
powerpc/perf: Fix to update cache events with l2l3 events in power10
powerpc/perf: MMCR0 control for PMU registers under PMCC=00
arch/powerpc/include/asm/reg.h | 1 +
arch/powerpc/kernel/cpu_setup_power.c | 1 +
arch/powerpc/kernel/dt_cpu_ftrs.c | 1 +
arch/powerpc/perf/core-book3s.c | 4 +
arch/powerpc/perf/isa207-common.c | 35 ++++++-
arch/powerpc/perf/isa207-common.h | 16 ++-
arch/powerpc/perf/power10-events-list.h | 9 ++
arch/powerpc/perf/power10-pmu.c | 178 ++++++++++++++++++++++++++++++--
8 files changed, 231 insertions(+), 14 deletions(-)
--
1.8.3.1
^ permalink raw reply
* [PATCH V2 3/7] powerpc/perf: Fix the PMU group constraints for threshold events in power10
From: Athira Rajeev @ 2020-11-26 16:54 UTC (permalink / raw)
To: mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1606409684-1589-1-git-send-email-atrajeev@linux.vnet.ibm.com>
The PMU group constraints mask for threshold events covers
all thresholding bits, which include the threshold control value
(start/stop), the select value and the thresh_cmp value (MMCRA[9:18]).
In power9, thresh_cmp bits were part of the event code. But in case
of power10, thresh_cmp bits are not part of event code due to
inclusion of MMCR3 bits. Hence thresh_cmp is not valid for
group constraints for power10.
Fix the PMU group constraints checking for threshold events in
power10 by using constraint mask and value for only threshold control
and select bits.
Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
arch/powerpc/perf/isa207-common.c | 7 ++++++-
arch/powerpc/perf/isa207-common.h | 3 +++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 38ed450c..0f4983e 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -351,7 +351,12 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
value |= CNST_SAMPLE_VAL(event >> EVENT_SAMPLE_SHIFT);
}
- if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+ if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+ if (event_is_threshold(event)) {
+ mask |= CNST_THRESH_CTL_SEL_MASK;
+ value |= CNST_THRESH_CTL_SEL_VAL(event >> EVENT_THRESH_SHIFT);
+ }
+ } else if (cpu_has_feature(CPU_FTR_ARCH_300)) {
if (event_is_threshold(event) && is_thresh_cmp_valid(event)) {
mask |= CNST_THRESH_MASK;
value |= CNST_THRESH_VAL(event >> EVENT_THRESH_SHIFT);
diff --git a/arch/powerpc/perf/isa207-common.h b/arch/powerpc/perf/isa207-common.h
index dc9c3d2..4208764 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -149,6 +149,9 @@
#define CNST_THRESH_VAL(v) (((v) & EVENT_THRESH_MASK) << 32)
#define CNST_THRESH_MASK CNST_THRESH_VAL(EVENT_THRESH_MASK)
+#define CNST_THRESH_CTL_SEL_VAL(v) (((v) & 0x7ffull) << 32)
+#define CNST_THRESH_CTL_SEL_MASK CNST_THRESH_CTL_SEL_VAL(0x7ff)
+
#define CNST_EBB_VAL(v) (((v) & EVENT_EBB_MASK) << 24)
#define CNST_EBB_MASK CNST_EBB_VAL(EVENT_EBB_MASK)
--
1.8.3.1
^ permalink raw reply related
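The effect of the narrower constraint mask in patch 3/7 can be sketched in
isolation. The two CNST_THRESH_* value macros below are copied from the diff;
EVENT_THRESH_SHIFT (29) and EVENT_THRESH_MASK (0x1fffff) are assumed values
that the diff does not show, and the helper names are hypothetical. This is an
illustration under those assumptions, not the kernel implementation.

```c
#include <stdint.h>

/* Assumed values; the diff above does not show these definitions. */
#define EVENT_THRESH_SHIFT	29
#define EVENT_THRESH_MASK	0x1fffffull

/* Copied from the isa207-common.h hunk above. */
#define CNST_THRESH_VAL(v)		(((v) & EVENT_THRESH_MASK) << 32)
#define CNST_THRESH_CTL_SEL_VAL(v)	(((v) & 0x7ffull) << 32)

/* Hypothetical helper: constraint value built from all threshold bits. */
uint64_t thresh_cnst_full(uint64_t event)
{
	return CNST_THRESH_VAL(event >> EVENT_THRESH_SHIFT);
}

/* Hypothetical helper: constraint value built from the low 11 bits only. */
uint64_t thresh_cnst_ctl_sel(uint64_t event)
{
	return CNST_THRESH_CTL_SEL_VAL(event >> EVENT_THRESH_SHIFT);
}
```

Two event codes that differ only in bits above the low 11 threshold bits
produce different values from thresh_cnst_full() but the same value from
thresh_cnst_ctl_sel(), so they would no longer conflict on that field.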
* [PATCH V2 5/7] powerpc/perf: Fix to update generic event codes for power10
From: Athira Rajeev @ 2020-11-26 16:54 UTC (permalink / raw)
To: mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1606409684-1589-1-git-send-email-atrajeev@linux.vnet.ibm.com>
Fix the event code for events: branch-instructions (to PM_BR_FIN),
branch-misses (to PM_MPRED_BR_FIN) and cache-misses (to
PM_LD_DEMAND_MISS_L1_FIN) for power10 PMU. Update the
list of generic events with this modified event code.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
arch/powerpc/perf/power10-events-list.h | 3 +++
arch/powerpc/perf/power10-pmu.c | 15 +++++++++------
2 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/perf/power10-events-list.h b/arch/powerpc/perf/power10-events-list.h
index 60c1b81..abd778f 100644
--- a/arch/powerpc/perf/power10-events-list.h
+++ b/arch/powerpc/perf/power10-events-list.h
@@ -15,6 +15,9 @@
EVENT(PM_RUN_INST_CMPL, 0x500fa);
EVENT(PM_BR_CMPL, 0x4d05e);
EVENT(PM_BR_MPRED_CMPL, 0x400f6);
+EVENT(PM_BR_FIN, 0x2f04a);
+EVENT(PM_MPRED_BR_FIN, 0x3e098);
+EVENT(PM_LD_DEMAND_MISS_L1_FIN, 0x400f0);
/* All L1 D cache load references counted at finish, gated by reject */
EVENT(PM_LD_REF_L1, 0x100fc);
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index bc3d4dd..a02da69 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -114,6 +114,9 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
GENERIC_EVENT_ATTR(cache-misses, PM_LD_MISS_L1);
GENERIC_EVENT_ATTR(mem-loads, MEM_LOADS);
GENERIC_EVENT_ATTR(mem-stores, MEM_STORES);
+GENERIC_EVENT_ATTR(branch-instructions, PM_BR_FIN);
+GENERIC_EVENT_ATTR(branch-misses, PM_MPRED_BR_FIN);
+GENERIC_EVENT_ATTR(cache-misses, PM_LD_DEMAND_MISS_L1_FIN);
CACHE_EVENT_ATTR(L1-dcache-load-misses, PM_LD_MISS_L1);
CACHE_EVENT_ATTR(L1-dcache-loads, PM_LD_REF_L1);
@@ -157,10 +160,10 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
static struct attribute *power10_events_attr[] = {
GENERIC_EVENT_PTR(PM_RUN_CYC),
GENERIC_EVENT_PTR(PM_RUN_INST_CMPL),
- GENERIC_EVENT_PTR(PM_BR_CMPL),
- GENERIC_EVENT_PTR(PM_BR_MPRED_CMPL),
+ GENERIC_EVENT_PTR(PM_BR_FIN),
+ GENERIC_EVENT_PTR(PM_MPRED_BR_FIN),
GENERIC_EVENT_PTR(PM_LD_REF_L1),
- GENERIC_EVENT_PTR(PM_LD_MISS_L1),
+ GENERIC_EVENT_PTR(PM_LD_DEMAND_MISS_L1_FIN),
GENERIC_EVENT_PTR(MEM_LOADS),
GENERIC_EVENT_PTR(MEM_STORES),
CACHE_EVENT_PTR(PM_LD_MISS_L1),
@@ -259,10 +262,10 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
static int power10_generic_events[] = {
[PERF_COUNT_HW_CPU_CYCLES] = PM_RUN_CYC,
[PERF_COUNT_HW_INSTRUCTIONS] = PM_RUN_INST_CMPL,
- [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = PM_BR_CMPL,
- [PERF_COUNT_HW_BRANCH_MISSES] = PM_BR_MPRED_CMPL,
+ [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = PM_BR_FIN,
+ [PERF_COUNT_HW_BRANCH_MISSES] = PM_MPRED_BR_FIN,
[PERF_COUNT_HW_CACHE_REFERENCES] = PM_LD_REF_L1,
- [PERF_COUNT_HW_CACHE_MISSES] = PM_LD_MISS_L1,
+ [PERF_COUNT_HW_CACHE_MISSES] = PM_LD_DEMAND_MISS_L1_FIN,
};
static u64 power10_bhrb_filter_map(u64 branch_sample_type)
--
1.8.3.1
^ permalink raw reply related
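The generic event table updated by patch 5/7 is a designated-initializer
array indexed by the generic hardware event id. A minimal standalone sketch of
that shape follows. The event codes are taken from the power10-events-list.h
hunks above; the enum and the lookup function are hypothetical stand-ins for
the PERF_COUNT_HW_* indices and the kernel's lookup path, and PM_RUN_CYC is
left out because its code does not appear in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the PERF_COUNT_HW_* indices. */
enum hw_id {
	HW_INSTRUCTIONS,
	HW_CACHE_REFERENCES,
	HW_CACHE_MISSES,
	HW_BRANCH_INSTRUCTIONS,
	HW_BRANCH_MISSES,
	HW_MAX,
};

/* Raw event codes as listed in the diffs above. */
#define PM_RUN_INST_CMPL		0x500fa
#define PM_LD_REF_L1			0x100fc
#define PM_LD_DEMAND_MISS_L1_FIN	0x400f0
#define PM_BR_FIN			0x2f04a
#define PM_MPRED_BR_FIN			0x3e098

static const int generic_events[HW_MAX] = {
	[HW_INSTRUCTIONS]		= PM_RUN_INST_CMPL,
	[HW_CACHE_REFERENCES]		= PM_LD_REF_L1,
	[HW_CACHE_MISSES]		= PM_LD_DEMAND_MISS_L1_FIN,
	[HW_BRANCH_INSTRUCTIONS]	= PM_BR_FIN,
	[HW_BRANCH_MISSES]		= PM_MPRED_BR_FIN,
};

/* Return the raw event code for a generic id, or 0 if the id is out of range. */
int generic_event_code(unsigned int id)
{
	if (id >= HW_MAX)
		return 0;
	return generic_events[id];
}
```

With this layout, changing which raw event backs a generic name is a one-line
edit to the table, which is what the diff above does for branch-instructions,
branch-misses and cache-misses.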
* [PATCH V2 6/7] powerpc/perf: Fix to update cache events with l2l3 events in power10
From: Athira Rajeev @ 2020-11-26 16:54 UTC (permalink / raw)
To: mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1606409684-1589-1-git-send-email-atrajeev@linux.vnet.ibm.com>
Export l2l3 events (PM_L2_ST_MISS and PM_L2_ST) and LLC-prefetches
(PM_L3_PF_MISS_L3) via sysfs, and also add these to the list of
cache_events.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
arch/powerpc/perf/power10-events-list.h | 6 ++++++
arch/powerpc/perf/power10-pmu.c | 12 +++++++++---
2 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/perf/power10-events-list.h b/arch/powerpc/perf/power10-events-list.h
index abd778f..e45dafe 100644
--- a/arch/powerpc/perf/power10-events-list.h
+++ b/arch/powerpc/perf/power10-events-list.h
@@ -39,6 +39,12 @@
EVENT(PM_DATA_FROM_L3, 0x01340000001c040);
/* Demand LD - L3 Miss (not L2 hit and not L3 hit) */
EVENT(PM_DATA_FROM_L3MISS, 0x300fe);
+/* All successful D-side store dispatches for this thread */
+EVENT(PM_L2_ST, 0x010000046080);
+/* All successful D-side store dispatches for this thread that were L2 Miss */
+EVENT(PM_L2_ST_MISS, 0x26880);
+/* Total HW L3 prefetches(Load+store) */
+EVENT(PM_L3_PF_MISS_L3, 0x100000016080);
/* Data PTEG reload */
EVENT(PM_DTLB_MISS, 0x300fc);
/* ITLB Reloaded */
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index a02da69..79e0206 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -127,6 +127,9 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
CACHE_EVENT_ATTR(L1-icache-prefetches, PM_IC_PREF_REQ);
CACHE_EVENT_ATTR(LLC-load-misses, PM_DATA_FROM_L3MISS);
CACHE_EVENT_ATTR(LLC-loads, PM_DATA_FROM_L3);
+CACHE_EVENT_ATTR(LLC-prefetches, PM_L3_PF_MISS_L3);
+CACHE_EVENT_ATTR(LLC-store-misses, PM_L2_ST_MISS);
+CACHE_EVENT_ATTR(LLC-stores, PM_L2_ST);
CACHE_EVENT_ATTR(branch-load-misses, PM_BR_MPRED_CMPL);
CACHE_EVENT_ATTR(branch-loads, PM_BR_CMPL);
CACHE_EVENT_ATTR(dTLB-load-misses, PM_DTLB_MISS);
@@ -175,6 +178,9 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
CACHE_EVENT_PTR(PM_IC_PREF_REQ),
CACHE_EVENT_PTR(PM_DATA_FROM_L3MISS),
CACHE_EVENT_PTR(PM_DATA_FROM_L3),
+ CACHE_EVENT_PTR(PM_L3_PF_MISS_L3),
+ CACHE_EVENT_PTR(PM_L2_ST_MISS),
+ CACHE_EVENT_PTR(PM_L2_ST),
CACHE_EVENT_PTR(PM_BR_MPRED_CMPL),
CACHE_EVENT_PTR(PM_BR_CMPL),
CACHE_EVENT_PTR(PM_DTLB_MISS),
@@ -460,11 +466,11 @@ static void power10_config_bhrb(u64 pmu_bhrb_filter)
[C(RESULT_MISS)] = PM_DATA_FROM_L3MISS,
},
[C(OP_WRITE)] = {
- [C(RESULT_ACCESS)] = -1,
- [C(RESULT_MISS)] = -1,
+ [C(RESULT_ACCESS)] = PM_L2_ST,
+ [C(RESULT_MISS)] = PM_L2_ST_MISS,
},
[C(OP_PREFETCH)] = {
- [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_ACCESS)] = PM_L3_PF_MISS_L3,
[C(RESULT_MISS)] = 0,
},
},
--
1.8.3.1
^ permalink raw reply related
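The cache event table in patch 6/7 uses 0 for "not supported" and -1 for
"nonsensical", as the comment quoted in patch 4/7 states. The sketch below
models only the last-level cache row after this patch and shows how such a
table might be consulted. The event codes come from the diffs above; the enum
names, the NONSENSE macro, the function name and the choice of error codes are
assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the C(OP_*) and C(RESULT_*) indices. */
enum { OP_READ, OP_WRITE, OP_PREFETCH, OP_MAX };
enum { RESULT_ACCESS, RESULT_MISS, RESULT_MAX };

/* -1 stored in an unsigned 64-bit slot marks a nonsensical combination. */
#define NONSENSE ((uint64_t)-1)

/* Last-level cache row, using the event codes from the diffs above. */
static const uint64_t ll_events[OP_MAX][RESULT_MAX] = {
	[OP_READ] = {
		[RESULT_ACCESS]	= 0x01340000001c040ull,	/* PM_DATA_FROM_L3 */
		[RESULT_MISS]	= 0x300fe,		/* PM_DATA_FROM_L3MISS */
	},
	[OP_WRITE] = {
		[RESULT_ACCESS]	= 0x010000046080ull,	/* PM_L2_ST */
		[RESULT_MISS]	= 0x26880,		/* PM_L2_ST_MISS */
	},
	[OP_PREFETCH] = {
		[RESULT_ACCESS]	= 0x100000016080ull,	/* PM_L3_PF_MISS_L3 */
		[RESULT_MISS]	= 0,			/* not supported */
	},
};

/* Store the event code in *code and return 0, or return a negative errno. */
int ll_event_lookup(unsigned int op, unsigned int result, uint64_t *code)
{
	uint64_t ev;

	if (op >= OP_MAX || result >= RESULT_MAX)
		return -EINVAL;

	ev = ll_events[op][result];
	if (ev == 0)
		return -EOPNOTSUPP;	/* slot exists but has no event */
	if (ev == NONSENSE)
		return -EINVAL;		/* combination makes no sense */

	*code = ev;
	return 0;
}
```

Before this patch the LLC write slots held -1, so a lookup like the one above
would have rejected LLC-stores and LLC-store-misses as nonsensical.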
* [PATCH V2 7/7] powerpc/perf: MMCR0 control for PMU registers under PMCC=00
From: Athira Rajeev @ 2020-11-26 16:54 UTC (permalink / raw)
To: mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1606409684-1589-1-git-send-email-atrajeev@linux.vnet.ibm.com>
PowerISA v3.1 introduces a new control bit (PMCCEXT) for restricting
access to group B PMU registers in problem state when
MMCR0 PMCC=0b00. In problem state and when MMCR0 PMCC=0b00,
setting the Monitor Mode Control Register bit 54 (MMCR0 PMCCEXT)
restricts read permission on Group B Performance Monitor
Registers (SIER, SIAR, SDAR and MMCR1). When this bit is set to zero,
group B registers are readable. On other platforms (like power9),
the older behaviour is retained, where group B PMU SPRs are readable.
This patch adds support for the MMCR0 PMCCEXT bit in power10 by setting
this bit during boot and in the PMU event enable/disable callback
functions.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/reg.h | 1 +
arch/powerpc/kernel/cpu_setup_power.c | 1 +
arch/powerpc/kernel/dt_cpu_ftrs.c | 1 +
arch/powerpc/perf/core-book3s.c | 4 ++++
arch/powerpc/perf/isa207-common.c | 8 ++++++++
5 files changed, 15 insertions(+)
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index f877a57..cba9965 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -864,6 +864,7 @@
#define MMCR0_BHRBA 0x00200000UL /* BHRB Access allowed in userspace */
#define MMCR0_EBE 0x00100000UL /* Event based branch enable */
#define MMCR0_PMCC 0x000c0000UL /* PMC control */
+#define MMCR0_PMCCEXT ASM_CONST(0x00000200) /* PMCCEXT control */
#define MMCR0_PMCC_U6 0x00080000UL /* PMC1-6 are R/W by user (PR) */
#define MMCR0_PMC1CE 0x00008000UL /* PMC1 count enable*/
#define MMCR0_PMCjCE ASM_CONST(0x00004000) /* PMCj count enable*/
diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
index 0c2191e..3cca88e 100644
--- a/arch/powerpc/kernel/cpu_setup_power.c
+++ b/arch/powerpc/kernel/cpu_setup_power.c
@@ -123,6 +123,7 @@ static void init_PMU_ISA31(void)
{
mtspr(SPRN_MMCR3, 0);
mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
+ mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
}
/*
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 1098863..9d07965 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -454,6 +454,7 @@ static void init_pmu_power10(void)
mtspr(SPRN_MMCR3, 0);
mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
+ mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
}
static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 3c8c6ce..35cf93c 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -95,6 +95,7 @@ struct cpu_hw_events {
#define SPRN_SIER3 0
#define MMCRA_SAMPLE_ENABLE 0
#define MMCRA_BHRB_DISABLE 0
+#define MMCR0_PMCCEXT 0
static inline unsigned long perf_ip_adjust(struct pt_regs *regs)
{
@@ -1270,6 +1271,9 @@ static void power_pmu_disable(struct pmu *pmu)
val |= MMCR0_FC;
val &= ~(MMCR0_EBE | MMCR0_BHRBA | MMCR0_PMCC | MMCR0_PMAO |
MMCR0_FC56);
+ /* Set mmcr0 PMCCEXT for p10 */
+ if (ppmu->flags & PPMU_ARCH_31)
+ val |= MMCR0_PMCCEXT;
/*
* The barrier is to make sure the mtspr has been
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 0f4983e..24f0a90 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -558,6 +558,14 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
if (!(pmc_inuse & 0x60))
mmcr->mmcr0 |= MMCR0_FC56;
+ /*
+ * Set mmcr0 (PMCCEXT) for p10 which
+ * will restrict access to group B registers
+ * when MMCR0 PMCC=0b00.
+ */
+ if (cpu_has_feature(CPU_FTR_ARCH_31))
+ mmcr->mmcr0 |= MMCR0_PMCCEXT;
+
mmcr->mmcr1 = mmcr1;
mmcr->mmcra = mmcra;
mmcr->mmcr2 = mmcr2;
--
1.8.3.1
^ permalink raw reply related
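The bit arithmetic behind patch 7/7 can be checked with a small standalone
sketch. The MMCR0_* values are copied from the reg.h hunk above. The helper
names are hypothetical, the disable helper handles only the bits whose values
appear in this thread (the real power_pmu_disable() touches more), and the
predicate encodes only the rule stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the reg.h hunk above. */
#define MMCR0_BHRBA	0x00200000UL
#define MMCR0_EBE	0x00100000UL
#define MMCR0_PMCC	0x000c0000UL
#define MMCR0_PMCCEXT	0x00000200UL

/* IBM bit numbering counts from the most significant bit of a 64-bit register. */
uint64_t ibm_bit64(unsigned int n)
{
	return 1ull << (63 - n);
}

/* Hypothetical helper modelled on the power_pmu_disable() hunk. */
uint64_t mmcr0_for_disable(uint64_t val, bool isa31)
{
	val &= ~(uint64_t)(MMCR0_EBE | MMCR0_BHRBA | MMCR0_PMCC);
	if (isa31)
		val |= MMCR0_PMCCEXT;
	return val;
}

/* True when PMCC is 0b00 and PMCCEXT is set, per the commit message. */
bool group_b_restricted(uint64_t mmcr0)
{
	return (mmcr0 & MMCR0_PMCC) == 0 && (mmcr0 & MMCR0_PMCCEXT) != 0;
}
```

ibm_bit64(54) evaluates to 1 << 9, which matches the 0x00000200 value given
for MMCR0_PMCCEXT in the diff.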
* [PATCH V2 4/7] powerpc/perf: Add generic and cache event list for power10 DD1
From: Athira Rajeev @ 2020-11-26 16:54 UTC (permalink / raw)
To: mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1606409684-1589-1-git-send-email-atrajeev@linux.vnet.ibm.com>
There are event code updates for some of the generic events
and cache events for power10. In order to keep the current
event codes working with DD1, create new arrays of generic_events,
cache_events and pmu_attr_groups with the suffix _dd1, for example
power10_events_attr_dd1, so that further updates to event codes
can be made in the original lists, i.e. power10_events_attr. Update the
power10 PMU init code to pick the dd1 lists while registering
the power PMU, based on the PVR (Processor Version Register) value.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
arch/powerpc/perf/power10-pmu.c | 152 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 152 insertions(+)
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index 88c5430..bc3d4dd 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -129,6 +129,31 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
CACHE_EVENT_ATTR(dTLB-load-misses, PM_DTLB_MISS);
CACHE_EVENT_ATTR(iTLB-load-misses, PM_ITLB_MISS);
+static struct attribute *power10_events_attr_dd1[] = {
+ GENERIC_EVENT_PTR(PM_RUN_CYC),
+ GENERIC_EVENT_PTR(PM_RUN_INST_CMPL),
+ GENERIC_EVENT_PTR(PM_BR_CMPL),
+ GENERIC_EVENT_PTR(PM_BR_MPRED_CMPL),
+ GENERIC_EVENT_PTR(PM_LD_REF_L1),
+ GENERIC_EVENT_PTR(PM_LD_MISS_L1),
+ GENERIC_EVENT_PTR(MEM_LOADS),
+ GENERIC_EVENT_PTR(MEM_STORES),
+ CACHE_EVENT_PTR(PM_LD_MISS_L1),
+ CACHE_EVENT_PTR(PM_LD_REF_L1),
+ CACHE_EVENT_PTR(PM_LD_PREFETCH_CACHE_LINE_MISS),
+ CACHE_EVENT_PTR(PM_ST_MISS_L1),
+ CACHE_EVENT_PTR(PM_L1_ICACHE_MISS),
+ CACHE_EVENT_PTR(PM_INST_FROM_L1),
+ CACHE_EVENT_PTR(PM_IC_PREF_REQ),
+ CACHE_EVENT_PTR(PM_DATA_FROM_L3MISS),
+ CACHE_EVENT_PTR(PM_DATA_FROM_L3),
+ CACHE_EVENT_PTR(PM_BR_MPRED_CMPL),
+ CACHE_EVENT_PTR(PM_BR_CMPL),
+ CACHE_EVENT_PTR(PM_DTLB_MISS),
+ CACHE_EVENT_PTR(PM_ITLB_MISS),
+ NULL
+};
+
static struct attribute *power10_events_attr[] = {
GENERIC_EVENT_PTR(PM_RUN_CYC),
GENERIC_EVENT_PTR(PM_RUN_INST_CMPL),
@@ -154,6 +179,11 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
NULL
};
+static struct attribute_group power10_pmu_events_group_dd1 = {
+ .name = "events",
+ .attrs = power10_events_attr_dd1,
+};
+
static struct attribute_group power10_pmu_events_group = {
.name = "events",
.attrs = power10_events_attr,
@@ -205,12 +235,27 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
.attrs = power10_pmu_format_attr,
};
+static const struct attribute_group *power10_pmu_attr_groups_dd1[] = {
+ &power10_pmu_format_group,
+ &power10_pmu_events_group_dd1,
+ NULL,
+};
+
static const struct attribute_group *power10_pmu_attr_groups[] = {
&power10_pmu_format_group,
&power10_pmu_events_group,
NULL,
};
+static int power10_generic_events_dd1[] = {
+ [PERF_COUNT_HW_CPU_CYCLES] = PM_RUN_CYC,
+ [PERF_COUNT_HW_INSTRUCTIONS] = PM_RUN_INST_CMPL,
+ [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = PM_BR_CMPL,
+ [PERF_COUNT_HW_BRANCH_MISSES] = PM_BR_MPRED_CMPL,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = PM_LD_REF_L1,
+ [PERF_COUNT_HW_CACHE_MISSES] = PM_LD_MISS_L1,
+};
+
static int power10_generic_events[] = {
[PERF_COUNT_HW_CPU_CYCLES] = PM_RUN_CYC,
[PERF_COUNT_HW_INSTRUCTIONS] = PM_RUN_INST_CMPL,
@@ -276,6 +321,107 @@ static void power10_config_bhrb(u64 pmu_bhrb_filter)
* 0 means not supported, -1 means nonsensical, other values
* are event codes.
*/
+static u64 power10_cache_events_dd1[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+ [C(L1D)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = PM_LD_REF_L1,
+ [C(RESULT_MISS)] = PM_LD_MISS_L1,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = 0,
+ [C(RESULT_MISS)] = PM_ST_MISS_L1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = PM_LD_PREFETCH_CACHE_LINE_MISS,
+ [C(RESULT_MISS)] = 0,
+ },
+ },
+ [C(L1I)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = PM_INST_FROM_L1,
+ [C(RESULT_MISS)] = PM_L1_ICACHE_MISS,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = PM_INST_FROM_L1MISS,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = PM_IC_PREF_REQ,
+ [C(RESULT_MISS)] = 0,
+ },
+ },
+ [C(LL)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = PM_DATA_FROM_L3,
+ [C(RESULT_MISS)] = PM_DATA_FROM_L3MISS,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = 0,
+ },
+ },
+ [C(DTLB)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0,
+ [C(RESULT_MISS)] = PM_DTLB_MISS,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ },
+ [C(ITLB)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0,
+ [C(RESULT_MISS)] = PM_ITLB_MISS,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ },
+ [C(BPU)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = PM_BR_CMPL,
+ [C(RESULT_MISS)] = PM_BR_MPRED_CMPL,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ },
+ [C(NODE)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ },
+};
+
static u64 power10_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[C(L1D)] = {
[C(OP_READ)] = {
@@ -422,6 +568,12 @@ int init_power10_pmu(void)
/* Set the PERF_REG_EXTENDED_MASK here */
PERF_REG_EXTENDED_MASK = PERF_REG_PMU_MASK_31;
+ if ((PVR_CFG(pvr) == 1)) {
+ power10_pmu.generic_events = power10_generic_events_dd1;
+ power10_pmu.attr_groups = power10_pmu_attr_groups_dd1;
+ power10_pmu.cache_events = &power10_cache_events_dd1;
+ }
+
rc = register_power_pmu(&power10_pmu);
if (rc)
return rc;
--
1.8.3.1
^ permalink raw reply related
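The DD1 selection in init_power10_pmu() from patch 4/7 amounts to choosing
between two sets of tables from one PVR field. A minimal sketch follows. The
PVR_CFG() definition is an assumption (reg.h is not quoted in this patch), and
the struct, table objects and function name are hypothetical placeholders for
the generic_events, attr_groups and cache_events pointers.

```c
#include <stdint.h>

/* Assumed definition; the patch uses PVR_CFG() without quoting it. */
#define PVR_CFG(pvr)	(((pvr) >> 8) & 0xF)

/* Hypothetical placeholder for the three table pointers the patch switches. */
struct pmu_tables {
	const char *name;
};

static const struct pmu_tables tables_dd1 = { "power10 DD1" };
static const struct pmu_tables tables_default = { "power10" };

/* Pick the DD1 tables when the PVR configuration field is 1. */
const struct pmu_tables *select_tables(uint32_t pvr)
{
	if (PVR_CFG(pvr) == 1)
		return &tables_dd1;
	return &tables_default;
}
```

Under the assumed field layout, a PVR such as 0x00800100 selects the DD1
tables and 0x00800200 selects the default ones; both values are made up for
the example.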
* [PATCH 2/2] powerpc/ps3: make system bus's remove and shutdown callbacks return void
From: Uwe Kleine-König @ 2020-11-26 16:59 UTC (permalink / raw)
To: Geoff Levand, Jaroslav Kysela, Takashi Iwai, Michael Ellerman,
Jens Axboe, Jim Paris, Arnd Bergmann, Greg Kroah-Hartman,
David S. Miller, Jakub Kicinski, James E.J. Bottomley,
Martin K. Petersen, Alan Stern, Bartlomiej Zolnierkiewicz
Cc: alsa-devel, linux-scsi, linux-usb, linux-fbdev, dri-devel,
linux-block, Paul Mackerras, netdev, linuxppc-dev
In-Reply-To: <20201126165950.2554997-1-u.kleine-koenig@pengutronix.de>
The driver core ignores the return value of struct device_driver::remove
because there is little that can be done. For the shutdown callback
it's ps3_system_bus_shutdown() which ignores the return value.
To simplify the quest to make struct device_driver::remove return void,
let struct ps3_system_bus_driver::remove return void, too. All users
already unconditionally return 0; this commit makes it obvious that
returning an error code is a bad idea and ensures future users behave
accordingly.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
arch/powerpc/include/asm/ps3.h | 4 ++--
arch/powerpc/platforms/ps3/system-bus.c | 5 ++---
drivers/block/ps3disk.c | 3 +--
drivers/block/ps3vram.c | 3 +--
drivers/char/ps3flash.c | 3 +--
drivers/net/ethernet/toshiba/ps3_gelic_net.c | 3 +--
drivers/ps3/ps3-lpm.c | 3 +--
drivers/ps3/ps3-vuart.c | 10 ++++------
drivers/scsi/ps3rom.c | 3 +--
drivers/usb/host/ehci-ps3.c | 4 +---
drivers/usb/host/ohci-ps3.c | 4 +---
drivers/video/fbdev/ps3fb.c | 4 +---
sound/ppc/snd_ps3.c | 3 +--
13 files changed, 18 insertions(+), 34 deletions(-)
diff --git a/arch/powerpc/include/asm/ps3.h b/arch/powerpc/include/asm/ps3.h
index cb89e4bf55ce..e646c7f218bc 100644
--- a/arch/powerpc/include/asm/ps3.h
+++ b/arch/powerpc/include/asm/ps3.h
@@ -378,8 +378,8 @@ struct ps3_system_bus_driver {
enum ps3_match_sub_id match_sub_id;
struct device_driver core;
int (*probe)(struct ps3_system_bus_device *);
- int (*remove)(struct ps3_system_bus_device *);
- int (*shutdown)(struct ps3_system_bus_device *);
+ void (*remove)(struct ps3_system_bus_device *);
+ void (*shutdown)(struct ps3_system_bus_device *);
/* int (*suspend)(struct ps3_system_bus_device *, pm_message_t); */
/* int (*resume)(struct ps3_system_bus_device *); */
};
diff --git a/arch/powerpc/platforms/ps3/system-bus.c b/arch/powerpc/platforms/ps3/system-bus.c
index c62aaa29a9d5..b431f41c6cb5 100644
--- a/arch/powerpc/platforms/ps3/system-bus.c
+++ b/arch/powerpc/platforms/ps3/system-bus.c
@@ -382,7 +382,6 @@ static int ps3_system_bus_probe(struct device *_dev)
static int ps3_system_bus_remove(struct device *_dev)
{
- int result = 0;
struct ps3_system_bus_device *dev = ps3_dev_to_system_bus_dev(_dev);
struct ps3_system_bus_driver *drv;
@@ -393,13 +392,13 @@ static int ps3_system_bus_remove(struct device *_dev)
BUG_ON(!drv);
if (drv->remove)
- result = drv->remove(dev);
+ drv->remove(dev);
else
dev_dbg(&dev->core, "%s:%d %s: no remove method\n",
__func__, __LINE__, drv->core.name);
pr_debug(" <- %s:%d: %s\n", __func__, __LINE__, dev_name(&dev->core));
- return result;
+ return 0;
}
static void ps3_system_bus_shutdown(struct device *_dev)
diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index 7b55811c2a81..ba3ece56cbb3 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -507,7 +507,7 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev)
return error;
}
-static int ps3disk_remove(struct ps3_system_bus_device *_dev)
+static void ps3disk_remove(struct ps3_system_bus_device *_dev)
{
struct ps3_storage_device *dev = to_ps3_storage_device(&_dev->core);
struct ps3disk_private *priv = ps3_system_bus_get_drvdata(&dev->sbd);
@@ -526,7 +526,6 @@ static int ps3disk_remove(struct ps3_system_bus_device *_dev)
kfree(dev->bounce_buf);
kfree(priv);
ps3_system_bus_set_drvdata(_dev, NULL);
- return 0;
}
static struct ps3_system_bus_driver ps3disk = {
diff --git a/drivers/block/ps3vram.c b/drivers/block/ps3vram.c
index 1088798c8dd0..b71d28372ef3 100644
--- a/drivers/block/ps3vram.c
+++ b/drivers/block/ps3vram.c
@@ -797,7 +797,7 @@ static int ps3vram_probe(struct ps3_system_bus_device *dev)
return error;
}
-static int ps3vram_remove(struct ps3_system_bus_device *dev)
+static void ps3vram_remove(struct ps3_system_bus_device *dev)
{
struct ps3vram_priv *priv = ps3_system_bus_get_drvdata(dev);
@@ -817,7 +817,6 @@ static int ps3vram_remove(struct ps3_system_bus_device *dev)
free_pages((unsigned long) priv->xdr_buf, get_order(XDR_BUF_SIZE));
kfree(priv);
ps3_system_bus_set_drvdata(dev, NULL);
- return 0;
}
static struct ps3_system_bus_driver ps3vram = {
diff --git a/drivers/char/ps3flash.c b/drivers/char/ps3flash.c
index 1a07fee33f66..23871cde41fb 100644
--- a/drivers/char/ps3flash.c
+++ b/drivers/char/ps3flash.c
@@ -403,7 +403,7 @@ static int ps3flash_probe(struct ps3_system_bus_device *_dev)
return error;
}
-static int ps3flash_remove(struct ps3_system_bus_device *_dev)
+static void ps3flash_remove(struct ps3_system_bus_device *_dev)
{
struct ps3_storage_device *dev = to_ps3_storage_device(&_dev->core);
@@ -413,7 +413,6 @@ static int ps3flash_remove(struct ps3_system_bus_device *_dev)
kfree(ps3_system_bus_get_drvdata(&dev->sbd));
ps3_system_bus_set_drvdata(&dev->sbd, NULL);
ps3flash_dev = NULL;
- return 0;
}
diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index d9a5722f561b..3d1fc8d2ca66 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -1791,7 +1791,7 @@ static int ps3_gelic_driver_probe(struct ps3_system_bus_device *dev)
* ps3_gelic_driver_remove - remove a device from the control of this driver
*/
-static int ps3_gelic_driver_remove(struct ps3_system_bus_device *dev)
+static void ps3_gelic_driver_remove(struct ps3_system_bus_device *dev)
{
struct gelic_card *card = ps3_system_bus_get_drvdata(dev);
struct net_device *netdev0;
@@ -1840,7 +1840,6 @@ static int ps3_gelic_driver_remove(struct ps3_system_bus_device *dev)
ps3_close_hv_device(dev);
pr_debug("%s: done\n", __func__);
- return 0;
}
static struct ps3_system_bus_driver ps3_gelic_driver = {
diff --git a/drivers/ps3/ps3-lpm.c b/drivers/ps3/ps3-lpm.c
index e54aa2d82f50..65512b6cc6fd 100644
--- a/drivers/ps3/ps3-lpm.c
+++ b/drivers/ps3/ps3-lpm.c
@@ -1196,7 +1196,7 @@ static int ps3_lpm_probe(struct ps3_system_bus_device *dev)
return 0;
}
-static int ps3_lpm_remove(struct ps3_system_bus_device *dev)
+static void ps3_lpm_remove(struct ps3_system_bus_device *dev)
{
dev_dbg(&dev->core, " -> %s:%u:\n", __func__, __LINE__);
@@ -1206,7 +1206,6 @@ static int ps3_lpm_remove(struct ps3_system_bus_device *dev)
lpm_priv = NULL;
dev_info(&dev->core, " <- %s:%u:\n", __func__, __LINE__);
- return 0;
}
static struct ps3_system_bus_driver ps3_lpm_driver = {
diff --git a/drivers/ps3/ps3-vuart.c b/drivers/ps3/ps3-vuart.c
index 4ed131eaff51..e34ae6a442c7 100644
--- a/drivers/ps3/ps3-vuart.c
+++ b/drivers/ps3/ps3-vuart.c
@@ -1102,7 +1102,7 @@ static int ps3_vuart_cleanup(struct ps3_system_bus_device *dev)
* device can no longer be used.
*/
-static int ps3_vuart_remove(struct ps3_system_bus_device *dev)
+static void ps3_vuart_remove(struct ps3_system_bus_device *dev)
{
struct ps3_vuart_port_priv *priv = to_port_priv(dev);
struct ps3_vuart_port_driver *drv;
@@ -1118,7 +1118,7 @@ static int ps3_vuart_remove(struct ps3_system_bus_device *dev)
dev_dbg(&dev->core, "%s:%d: no driver bound\n", __func__,
__LINE__);
mutex_unlock(&vuart_bus_priv.probe_mutex);
- return 0;
+ return;
}
drv = ps3_system_bus_dev_to_vuart_drv(dev);
@@ -1141,7 +1141,6 @@ static int ps3_vuart_remove(struct ps3_system_bus_device *dev)
dev_dbg(&dev->core, " <- %s:%d\n", __func__, __LINE__);
mutex_unlock(&vuart_bus_priv.probe_mutex);
- return 0;
}
/**
@@ -1154,7 +1153,7 @@ static int ps3_vuart_remove(struct ps3_system_bus_device *dev)
* sequence.
*/
-static int ps3_vuart_shutdown(struct ps3_system_bus_device *dev)
+static void ps3_vuart_shutdown(struct ps3_system_bus_device *dev)
{
struct ps3_vuart_port_driver *drv;
@@ -1169,7 +1168,7 @@ static int ps3_vuart_shutdown(struct ps3_system_bus_device *dev)
dev_dbg(&dev->core, "%s:%d: no driver bound\n", __func__,
__LINE__);
mutex_unlock(&vuart_bus_priv.probe_mutex);
- return 0;
+ return;
}
drv = ps3_system_bus_dev_to_vuart_drv(dev);
@@ -1193,7 +1192,6 @@ static int ps3_vuart_shutdown(struct ps3_system_bus_device *dev)
dev_dbg(&dev->core, " <- %s:%d\n", __func__, __LINE__);
mutex_unlock(&vuart_bus_priv.probe_mutex);
- return 0;
}
static int __init ps3_vuart_bus_init(void)
diff --git a/drivers/scsi/ps3rom.c b/drivers/scsi/ps3rom.c
index f75c0b5cd587..ccb5771f1cb7 100644
--- a/drivers/scsi/ps3rom.c
+++ b/drivers/scsi/ps3rom.c
@@ -402,7 +402,7 @@ static int ps3rom_probe(struct ps3_system_bus_device *_dev)
return error;
}
-static int ps3rom_remove(struct ps3_system_bus_device *_dev)
+static void ps3rom_remove(struct ps3_system_bus_device *_dev)
{
struct ps3_storage_device *dev = to_ps3_storage_device(&_dev->core);
struct Scsi_Host *host = ps3_system_bus_get_drvdata(&dev->sbd);
@@ -412,7 +412,6 @@ static int ps3rom_remove(struct ps3_system_bus_device *_dev)
scsi_host_put(host);
ps3_system_bus_set_drvdata(&dev->sbd, NULL);
kfree(dev->bounce_buf);
- return 0;
}
static struct ps3_system_bus_driver ps3rom = {
diff --git a/drivers/usb/host/ehci-ps3.c b/drivers/usb/host/ehci-ps3.c
index fb52133c3557..98568b046a1a 100644
--- a/drivers/usb/host/ehci-ps3.c
+++ b/drivers/usb/host/ehci-ps3.c
@@ -200,7 +200,7 @@ static int ps3_ehci_probe(struct ps3_system_bus_device *dev)
return result;
}
-static int ps3_ehci_remove(struct ps3_system_bus_device *dev)
+static void ps3_ehci_remove(struct ps3_system_bus_device *dev)
{
unsigned int tmp;
struct usb_hcd *hcd = ps3_system_bus_get_drvdata(dev);
@@ -227,8 +227,6 @@ static int ps3_ehci_remove(struct ps3_system_bus_device *dev)
ps3_dma_region_free(dev->d_region);
ps3_close_hv_device(dev);
-
- return 0;
}
static int __init ps3_ehci_driver_register(struct ps3_system_bus_driver *drv)
diff --git a/drivers/usb/host/ohci-ps3.c b/drivers/usb/host/ohci-ps3.c
index f77cd6af0ccf..4f5af929c3e4 100644
--- a/drivers/usb/host/ohci-ps3.c
+++ b/drivers/usb/host/ohci-ps3.c
@@ -184,7 +184,7 @@ static int ps3_ohci_probe(struct ps3_system_bus_device *dev)
return result;
}
-static int ps3_ohci_remove(struct ps3_system_bus_device *dev)
+static void ps3_ohci_remove(struct ps3_system_bus_device *dev)
{
unsigned int tmp;
struct usb_hcd *hcd = ps3_system_bus_get_drvdata(dev);
@@ -212,8 +212,6 @@ static int ps3_ohci_remove(struct ps3_system_bus_device *dev)
ps3_dma_region_free(dev->d_region);
ps3_close_hv_device(dev);
-
- return 0;
}
static int __init ps3_ohci_driver_register(struct ps3_system_bus_driver *drv)
diff --git a/drivers/video/fbdev/ps3fb.c b/drivers/video/fbdev/ps3fb.c
index 203c254f8f6c..2fe08b67eda7 100644
--- a/drivers/video/fbdev/ps3fb.c
+++ b/drivers/video/fbdev/ps3fb.c
@@ -1208,7 +1208,7 @@ static int ps3fb_probe(struct ps3_system_bus_device *dev)
return retval;
}
-static int ps3fb_shutdown(struct ps3_system_bus_device *dev)
+static void ps3fb_shutdown(struct ps3_system_bus_device *dev)
{
struct fb_info *info = ps3_system_bus_get_drvdata(dev);
u64 xdr_lpar = ps3_mm_phys_to_lpar(__pa(ps3fb_videomemory.address));
@@ -1241,8 +1241,6 @@ static int ps3fb_shutdown(struct ps3_system_bus_device *dev)
lv1_gpu_memory_free(ps3fb.memory_handle);
ps3_close_hv_device(dev);
dev_dbg(&dev->core, " <- %s:%d\n", __func__, __LINE__);
-
- return 0;
}
static struct ps3_system_bus_driver ps3fb_driver = {
diff --git a/sound/ppc/snd_ps3.c b/sound/ppc/snd_ps3.c
index 6ab796a5d936..8e44fa5d4dc7 100644
--- a/sound/ppc/snd_ps3.c
+++ b/sound/ppc/snd_ps3.c
@@ -1049,7 +1049,7 @@ static int snd_ps3_driver_probe(struct ps3_system_bus_device *dev)
}; /* snd_ps3_probe */
/* called when module removal */
-static int snd_ps3_driver_remove(struct ps3_system_bus_device *dev)
+static void snd_ps3_driver_remove(struct ps3_system_bus_device *dev)
{
int ret;
pr_info("%s:start id=%d\n", __func__, dev->match_id);
@@ -1075,7 +1075,6 @@ static int snd_ps3_driver_remove(struct ps3_system_bus_device *dev)
lv1_gpu_device_unmap(2);
ps3_close_hv_device(dev);
pr_info("%s:end id=%d\n", __func__, dev->match_id);
- return 0;
} /* snd_ps3_remove */
static struct ps3_system_bus_driver snd_ps3_bus_driver_info = {
--
2.29.2
^ permalink raw reply related
* [PATCH 1/2] ALSA: ppc: drop if block with always false condition
From: Uwe Kleine-König @ 2020-11-26 16:59 UTC (permalink / raw)
To: Geoff Levand, Jaroslav Kysela, Takashi Iwai, Michael Ellerman,
Jens Axboe, Jim Paris, Arnd Bergmann, Greg Kroah-Hartman,
David S. Miller, Jakub Kicinski, James E.J. Bottomley,
Martin K. Petersen, Alan Stern, Bartlomiej Zolnierkiewicz
Cc: alsa-devel, linux-scsi, linux-usb, linux-fbdev, dri-devel,
linux-block, Paul Mackerras, netdev, linuxppc-dev
The remove callback is only called for devices that were probed
successfully before. As the matching probe function cannot complete
without error if dev->match_id != PS3_MATCH_ID_SOUND, we don't have to
check this here.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
sound/ppc/snd_ps3.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/sound/ppc/snd_ps3.c b/sound/ppc/snd_ps3.c
index 58bb49fff184..6ab796a5d936 100644
--- a/sound/ppc/snd_ps3.c
+++ b/sound/ppc/snd_ps3.c
@@ -1053,8 +1053,6 @@ static int snd_ps3_driver_remove(struct ps3_system_bus_device *dev)
{
int ret;
pr_info("%s:start id=%d\n", __func__, dev->match_id);
- if (dev->match_id != PS3_MATCH_ID_SOUND)
- return -ENXIO;
/*
* ctl and preallocate buffer will be freed in
--
2.29.2
^ permalink raw reply related
* [Bug 209733] Starting new KVM virtual machines on PPC64 starts to hang after box is up for a while
From: bugzilla-daemon @ 2020-11-26 17:26 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <bug-209733-206035@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=209733
--- Comment #4 from Cameron (cam@neo-zeon.de) ---
After enough testing, I feel confident that this issue was fixed in 5.9.9.
However, I encountered issues with XFS with 5.9.9 and 5.9.10 (mainly on POWER,
but to a lesser extent they seemed to happen for me on amd64 at least). 5.9.11
has the weird hang fixed and no other issues (XFS or otherwise) in over 2 days!
I feel confident in closing this issue.
--
You are receiving this mail because:
You are watching the assignee of the bug.
^ permalink raw reply
* [Bug 209733] Starting new KVM virtual machines on PPC64 starts to hang after box is up for a while
From: bugzilla-daemon @ 2020-11-26 17:26 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <bug-209733-206035@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=209733
Cameron (cam@neo-zeon.de) changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |CODE_FIX
--
You are receiving this mail because:
You are watching the assignee of the bug.
^ permalink raw reply
* Re: [PATCH 2/2] powerpc/ps3: make system bus's remove and shutdown callbacks return void
From: Uwe Kleine-König @ 2020-11-26 17:44 UTC (permalink / raw)
To: Geoff Levand, Jaroslav Kysela, Takashi Iwai, Michael Ellerman,
Jens Axboe, Jim Paris, Arnd Bergmann, Greg Kroah-Hartman,
David S. Miller, Jakub Kicinski, James E.J. Bottomley,
Martin K. Petersen, Alan Stern, Bartlomiej Zolnierkiewicz
Cc: Paul Mackerras, linuxppc-dev
In-Reply-To: <20201126165950.2554997-2-u.kleine-koenig@pengutronix.de>
[-- Attachment #1: Type: text/plain, Size: 801 bytes --]
[dropped a few lists from Cc: that are off-topic for this mail]
Hello,
while creating this patch series I looked at ps3_system_bus_shutdown().
I think the BUG_ON(!drv) in (now) line 422 can be easily triggered when
there is a device without driver. (Try unbinding via sysfs before
shutdown.)
Also the BUG in (now) line 437 seems possible to trigger. Consider a
driver that doesn't have the two callbacks, e.g. because there is
nothing special to do on shutdown and probe only used devm_* resources.
While at it, I find it surprising that the remove callback is called if
there is no shutdown callback.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | https://www.pengutronix.de/ |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* [RFC PATCH 00/14] powerpc64: Add support for ftrace direct calls
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
This series adds support for DYNAMIC_FTRACE_WITH_DIRECT_CALLS for
powerpc64le.
This is mostly working fine for me, except for a soft lockup I see with
the ftrace direct selftest. It happens when irqsoff tracer is being
tested with the ftrace direct modules. This appears to be an existing
upstream issue since I am able to reproduce the lockup without these
patches. I will be looking into that to see if I can figure out the
cause of those lockups.
In the meantime, I would appreciate a review of these patches.
- Naveen
Naveen N. Rao (14):
ftrace: Fix updating FTRACE_FL_TRAMP
ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
ftrace: Fix cleanup in error path of register_ftrace_direct()
ftrace: Remove ftrace_find_direct_func()
ftrace: Add architectural helpers for [un]register_ftrace_direct()
powerpc: Add support for CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
powerpc/ftrace: Remove dead code
powerpc/ftrace: Use FTRACE_REGS_ADDR to identify the correct ftrace
trampoline
powerpc/ftrace: Use a hash table for tracking ftrace stubs
powerpc/ftrace: Drop assumptions about ftrace trampoline target
powerpc/ftrace: Use GPR save/restore macros in ftrace_graph_caller()
powerpc/ftrace: Drop saving LR to stack save area for -mprofile-kernel
powerpc/ftrace: Add support for register_ftrace_direct() for
MPROFILE_KERNEL
samples/ftrace: Add powerpc support for ftrace direct samples
arch/powerpc/Kconfig | 2 +
arch/powerpc/include/asm/ftrace.h | 14 +
arch/powerpc/include/asm/ptrace.h | 31 ++
arch/powerpc/kernel/trace/ftrace.c | 314 +++++++++++++-----
.../powerpc/kernel/trace/ftrace_64_mprofile.S | 70 ++--
include/linux/ftrace.h | 7 +-
kernel/trace/Kconfig | 2 +-
kernel/trace/ftrace.c | 130 +++-----
samples/Kconfig | 2 +-
samples/ftrace/ftrace-direct-modify.c | 58 ++++
samples/ftrace/ftrace-direct-too.c | 48 ++-
samples/ftrace/ftrace-direct.c | 45 ++-
12 files changed, 519 insertions(+), 204 deletions(-)
base-commit: 4c202167192a77481310a3cacae9f12618b92216
--
2.25.4
^ permalink raw reply
* [RFC PATCH 01/14] ftrace: Fix updating FTRACE_FL_TRAMP
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
On powerpc, kprobe-direct.tc triggered FTRACE_WARN_ON() in
ftrace_get_addr_new() followed by the below message:
Bad trampoline accounting at: 000000004222522f (wake_up_process+0xc/0x20) (f0000001)
The set of steps leading to this involved:
- modprobe ftrace-direct-too
- enable_probe
- modprobe ftrace-direct
- rmmod ftrace-direct <-- trigger
The problem turned out to be that we were not updating flags in the
ftrace record properly. From the above message about the trampoline
accounting being bad, it can be seen that the ftrace record still has
FTRACE_FL_TRAMP set though ftrace-direct module is going away. This
happens because we are checking if any ftrace_ops has the
FTRACE_FL_TRAMP flag set _before_ updating the filter hash.
The fix for this is to look for any _other_ ftrace_ops that also needs
FTRACE_FL_TRAMP.
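For illustration only, the "any other ops" lookup can be modelled outside the kernel. This is a minimal sketch, not the kernel code: struct toy_ops and toy_find_tramp_ops_any_other() are hypothetical names, and the two boolean fields stand in for op->trampoline and for hash_contains_ip() on the op's filter hash.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct ftrace_ops: only what this sketch needs. */
struct toy_ops {
	bool has_trampoline;	/* models op->trampoline != 0 */
	bool traces_ip;		/* models hash_contains_ip(ip, op->func_hash) */
};

/*
 * Return the first op, other than 'exclude', that has a trampoline and
 * still traces the ip. Returns NULL if there is none.
 */
static const struct toy_ops *
toy_find_tramp_ops_any_other(const struct toy_ops *ops, size_t n,
			     const struct toy_ops *exclude)
{
	for (size_t i = 0; i < n; i++) {
		if (&ops[i] == exclude || !ops[i].has_trampoline)
			continue;
		if (ops[i].traces_ip)
			return &ops[i];
	}
	return NULL;
}
```

In this model, an op that is being removed but whose hash has not been updated yet still has traces_ip set, so a lookup that does not exclude it would find it and keep the flag set.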
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
kernel/trace/ftrace.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 8185f7240095f4..9c1bba8cc51b03 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1629,6 +1629,8 @@ static bool test_rec_ops_needs_regs(struct dyn_ftrace *rec)
static struct ftrace_ops *
ftrace_find_tramp_ops_any(struct dyn_ftrace *rec);
static struct ftrace_ops *
+ftrace_find_tramp_ops_any_other(struct dyn_ftrace *rec, struct ftrace_ops *op_exclude);
+static struct ftrace_ops *
ftrace_find_tramp_ops_next(struct dyn_ftrace *rec, struct ftrace_ops *ops);
static bool __ftrace_hash_rec_update(struct ftrace_ops *ops,
@@ -1778,7 +1780,7 @@ static bool __ftrace_hash_rec_update(struct ftrace_ops *ops,
* to it.
*/
if (ftrace_rec_count(rec) == 1 &&
- ftrace_find_tramp_ops_any(rec))
+ ftrace_find_tramp_ops_any_other(rec, ops))
rec->flags |= FTRACE_FL_TRAMP;
else
rec->flags &= ~FTRACE_FL_TRAMP;
@@ -2244,6 +2246,24 @@ ftrace_find_tramp_ops_any(struct dyn_ftrace *rec)
return NULL;
}
+static struct ftrace_ops *
+ftrace_find_tramp_ops_any_other(struct dyn_ftrace *rec, struct ftrace_ops *op_exclude)
+{
+ struct ftrace_ops *op;
+ unsigned long ip = rec->ip;
+
+ do_for_each_ftrace_op(op, ftrace_ops_list) {
+
+ if (op == op_exclude || !op->trampoline)
+ continue;
+
+ if (hash_contains_ip(ip, op->func_hash))
+ return op;
+ } while_for_each_ftrace_op(op);
+
+ return NULL;
+}
+
static struct ftrace_ops *
ftrace_find_tramp_ops_next(struct dyn_ftrace *rec,
struct ftrace_ops *op)
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 02/14] ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
DYNAMIC_FTRACE_WITH_DIRECT_CALLS should depend on
DYNAMIC_FTRACE_WITH_REGS since we need ftrace_regs_caller().
Fixes: 763e34e74bb7d5c ("ftrace: Add register_ftrace_direct()")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
kernel/trace/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index a4020c0b4508c9..e1bf5228fb692a 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -202,7 +202,7 @@ config DYNAMIC_FTRACE_WITH_REGS
config DYNAMIC_FTRACE_WITH_DIRECT_CALLS
def_bool y
- depends on DYNAMIC_FTRACE
+ depends on DYNAMIC_FTRACE_WITH_REGS
depends on HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
config FUNCTION_PROFILER
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 03/14] ftrace: Fix cleanup in error path of register_ftrace_direct()
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
We need to remove the hash entry if register_ftrace_function() fails.
Consolidate the cleanup so that it is done once, at the end, after
register_ftrace_function().
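As a rough sketch of the consolidated error path (a simplified model, not the kernel function): toy_register_direct_entry() is a hypothetical name, the in_hash flag stands in for the entry's membership in direct_functions, and the two boolean parameters inject failures of the filter and registration steps. The real code only calls register_ftrace_function() when direct_ops is not yet enabled; the sketch ignores that condition.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Toy model: add the entry, run two steps that may fail, and undo the
 * hash insertion in one place if either step failed.
 */
static int toy_register_direct_entry(bool filter_fails, bool register_fails,
				     bool *in_hash)
{
	int ret;

	*in_hash = true;			/* models __add_hash_entry() */

	ret = filter_fails ? -EINVAL : 0;	/* models ftrace_set_filter_ip() */
	if (!ret)
		ret = register_fails ? -EBUSY : 0; /* models register_ftrace_function() */

	if (ret)
		*in_hash = false;		/* single cleanup point for both failures */

	return ret;
}
```

Either failure leaves the entry out of the hash, which is the property the commit message asks for.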
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
kernel/trace/ftrace.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 9c1bba8cc51b03..3844a4a1346a9c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5136,8 +5136,6 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
__add_hash_entry(direct_functions, entry);
ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
- if (ret)
- remove_hash_entry(direct_functions, entry);
if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
ret = register_ftrace_function(&direct_ops);
@@ -5146,6 +5144,7 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
}
if (ret) {
+ remove_hash_entry(direct_functions, entry);
kfree(entry);
if (!direct->count) {
list_del_rcu(&direct->next);
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 04/14] ftrace: Remove ftrace_find_direct_func()
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
This is a revert of commit 013bf0da047481 ("ftrace: Add
ftrace_find_direct_func()").
ftrace_find_direct_func() was meant for use in the function graph tracer
by architecture specific code. However, commit ff205766dbbee0 ("ftrace:
Fix function_graph tracer interaction with BPF trampoline") disabled
function graph tracer for direct calls leaving this without any users.
In addition, modify_ftrace_direct() allowed redirecting the direct call
to a different trampoline that was never registered through
register_ftrace_direct(). This meant that ftrace_direct_funcs didn't
capture all trampolines.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
include/linux/ftrace.h | 5 ---
kernel/trace/ftrace.c | 84 ++----------------------------------------
2 files changed, 4 insertions(+), 85 deletions(-)
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 1bd3a0356ae478..46b4b7ee28c41f 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -285,7 +285,6 @@ extern int ftrace_direct_func_count;
int register_ftrace_direct(unsigned long ip, unsigned long addr);
int unregister_ftrace_direct(unsigned long ip, unsigned long addr);
int modify_ftrace_direct(unsigned long ip, unsigned long old_addr, unsigned long new_addr);
-struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr);
int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
struct dyn_ftrace *rec,
unsigned long old_addr,
@@ -306,10 +305,6 @@ static inline int modify_ftrace_direct(unsigned long ip,
{
return -ENOTSUPP;
}
-static inline struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
-{
- return NULL;
-}
static inline int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
struct dyn_ftrace *rec,
unsigned long old_addr,
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 3844a4a1346a9c..7476f2458b6d95 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5005,46 +5005,6 @@ ftrace_set_addr(struct ftrace_ops *ops, unsigned long ip, int remove,
}
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
-
-struct ftrace_direct_func {
- struct list_head next;
- unsigned long addr;
- int count;
-};
-
-static LIST_HEAD(ftrace_direct_funcs);
-
-/**
- * ftrace_find_direct_func - test an address if it is a registered direct caller
- * @addr: The address of a registered direct caller
- *
- * This searches to see if a ftrace direct caller has been registered
- * at a specific address, and if so, it returns a descriptor for it.
- *
- * This can be used by architecture code to see if an address is
- * a direct caller (trampoline) attached to a fentry/mcount location.
- * This is useful for the function_graph tracer, as it may need to
- * do adjustments if it traced a location that also has a direct
- * trampoline attached to it.
- */
-struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
-{
- struct ftrace_direct_func *entry;
- bool found = false;
-
- /* May be called by fgraph trampoline (protected by rcu tasks) */
- list_for_each_entry_rcu(entry, &ftrace_direct_funcs, next) {
- if (entry->addr == addr) {
- found = true;
- break;
- }
- }
- if (found)
- return entry;
-
- return NULL;
-}
-
/**
* register_ftrace_direct - Call a custom trampoline directly
* @ip: The address of the nop at the beginning of a function
@@ -5064,7 +5024,6 @@ struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
*/
int register_ftrace_direct(unsigned long ip, unsigned long addr)
{
- struct ftrace_direct_func *direct;
struct ftrace_func_entry *entry;
struct ftrace_hash *free_hash = NULL;
struct dyn_ftrace *rec;
@@ -5118,19 +5077,7 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
if (!entry)
goto out_unlock;
- direct = ftrace_find_direct_func(addr);
- if (!direct) {
- direct = kmalloc(sizeof(*direct), GFP_KERNEL);
- if (!direct) {
- kfree(entry);
- goto out_unlock;
- }
- direct->addr = addr;
- direct->count = 0;
- list_add_rcu(&direct->next, &ftrace_direct_funcs);
- ftrace_direct_func_count++;
- }
-
+ ftrace_direct_func_count++;
entry->ip = ip;
entry->direct = addr;
__add_hash_entry(direct_functions, entry);
@@ -5145,18 +5092,8 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
if (ret) {
remove_hash_entry(direct_functions, entry);
+ ftrace_direct_func_count--;
kfree(entry);
- if (!direct->count) {
- list_del_rcu(&direct->next);
- synchronize_rcu_tasks();
- kfree(direct);
- if (free_hash)
- free_ftrace_hash(free_hash);
- free_hash = NULL;
- ftrace_direct_func_count--;
- }
- } else {
- direct->count++;
}
out_unlock:
mutex_unlock(&direct_mutex);
@@ -5199,7 +5136,6 @@ static struct ftrace_func_entry *find_direct_entry(unsigned long *ip,
int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
{
- struct ftrace_direct_func *direct;
struct ftrace_func_entry *entry;
int ret = -ENODEV;
@@ -5217,20 +5153,8 @@ int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
WARN_ON(ret);
remove_hash_entry(direct_functions, entry);
-
- direct = ftrace_find_direct_func(addr);
- if (!WARN_ON(!direct)) {
- /* This is the good path (see the ! before WARN) */
- direct->count--;
- WARN_ON(direct->count < 0);
- if (!direct->count) {
- list_del_rcu(&direct->next);
- synchronize_rcu_tasks();
- kfree(direct);
- kfree(entry);
- ftrace_direct_func_count--;
- }
- }
+ ftrace_direct_func_count--;
+ kfree(entry);
out_unlock:
mutex_unlock(&direct_mutex);
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 05/14] ftrace: Add architectural helpers for [un]register_ftrace_direct()
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
Architectures may want to do some validation (such as to ensure that the
trampoline code is reachable from the provided ftrace location) before
accepting ftrace direct registration. Add helpers for the same.
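The intended control flow can be sketched in plain C. This is a model under stated assumptions, not the kernel implementation: all toy_* names are hypothetical, the 32 MB reach is an assumed branch range chosen for illustration, and returning the hook's error code is a choice made by the sketch.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_BRANCH_RANGE (32L * 1024 * 1024)	/* assumed reach, illustration only */

static int toy_arch_undo_count;	/* counts calls to the arch unregister hook */

/* Hypothetical arch hook: refuse trampolines outside the assumed range. */
static int toy_arch_register(unsigned long ip, unsigned long addr)
{
	long off = (long)addr - (long)ip;

	return (off >= -TOY_BRANCH_RANGE && off < TOY_BRANCH_RANGE) ? 0 : -EINVAL;
}

static void toy_arch_unregister(unsigned long ip, unsigned long addr)
{
	(void)ip;
	(void)addr;
	toy_arch_undo_count++;
}

/*
 * Generic setup runs only if the arch hook accepted the request, and the
 * arch hook is undone only if it had succeeded and the generic part failed.
 */
static int toy_register_direct(unsigned long ip, unsigned long addr,
			       bool generic_fails)
{
	int ret = -EBUSY;
	int arch_ret = toy_arch_register(ip, addr);

	if (!arch_ret)
		ret = generic_fails ? -ENOMEM : 0;

	if (arch_ret || ret) {
		if (!arch_ret)
			toy_arch_unregister(ip, addr);
		return arch_ret ? arch_ret : ret;
	}
	return 0;
}
```

An architecture that needs no validation would keep a default hook that always returns 0, which is what the weak definitions in this patch provide.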
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
include/linux/ftrace.h | 2 ++
kernel/trace/ftrace.c | 27 +++++++++++++++++++++------
2 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 46b4b7ee28c41f..3fdcb4c513bc2d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -290,6 +290,8 @@ int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
unsigned long old_addr,
unsigned long new_addr);
unsigned long ftrace_find_rec_direct(unsigned long ip);
+int arch_register_ftrace_direct(unsigned long ip, unsigned long addr);
+void arch_unregister_ftrace_direct(unsigned long ip, unsigned long addr);
#else
# define ftrace_direct_func_count 0
static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 7476f2458b6d95..0e259b90527722 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5005,6 +5005,13 @@ ftrace_set_addr(struct ftrace_ops *ops, unsigned long ip, int remove,
}
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+int __weak arch_register_ftrace_direct(unsigned long ip, unsigned long addr)
+{
+ return 0;
+}
+
+void __weak arch_unregister_ftrace_direct(unsigned long ip, unsigned long addr) { }
+
/**
* register_ftrace_direct - Call a custom trampoline directly
* @ip: The address of the nop at the beginning of a function
@@ -5028,6 +5035,7 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
struct ftrace_hash *free_hash = NULL;
struct dyn_ftrace *rec;
int ret = -EBUSY;
+ int arch_ret;
mutex_lock(&direct_mutex);
@@ -5082,18 +5090,24 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
entry->direct = addr;
__add_hash_entry(direct_functions, entry);
- ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
+ arch_ret = arch_register_ftrace_direct(ip, addr);
- if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
- ret = register_ftrace_function(&direct_ops);
- if (ret)
- ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
+ if (!arch_ret) {
+ ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
+
+ if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
+ ret = register_ftrace_function(&direct_ops);
+ if (ret)
+ ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
+ }
}
- if (ret) {
+ if (arch_ret || ret) {
remove_hash_entry(direct_functions, entry);
ftrace_direct_func_count--;
kfree(entry);
+ if (!arch_ret)
+ arch_unregister_ftrace_direct(ip, addr);
}
out_unlock:
mutex_unlock(&direct_mutex);
@@ -5155,6 +5169,7 @@ int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
remove_hash_entry(direct_functions, entry);
ftrace_direct_func_count--;
kfree(entry);
+ arch_unregister_ftrace_direct(ip, addr);
out_unlock:
mutex_unlock(&direct_mutex);
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 06/14] powerpc: Add support for CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
Add regs_get_kernel_argument() for a rudimentary way to access
kernel function arguments.
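The index arithmetic can be tried in isolation. The following is a user-space sketch, not the kernel helper: struct toy_regs and both toy_* functions are hypothetical, and the stack offsets are passed in as parameters rather than taken from the kernel headers.

```c
#define TOY_NR_REG_ARGUMENTS 8	/* same value as NR_REG_ARGUMENTS in the patch */

/* Hypothetical, cut-down register file: only the GPRs. */
struct toy_regs {
	unsigned long gpr[32];
};

/* Arguments 0..7 are read from r3..r10; returns 0 and stores the value. */
static int toy_get_reg_argument(const struct toy_regs *regs, unsigned int n,
				unsigned long *val)
{
	if (n >= TOY_NR_REG_ARGUMENTS)
		return -1;
	*val = regs->gpr[n + 3];
	return 0;
}

/*
 * Stack word index for argument n >= 8, following the arithmetic in the
 * patch. parm_save is the byte offset of the parameter save area and
 * word_size is the size of unsigned long on the target.
 */
static unsigned int toy_stack_arg_index(unsigned int n, unsigned int parm_save,
					unsigned int word_size, int is_64bit)
{
	if (!is_64bit)
		n -= TOY_NR_REG_ARGUMENTS;
	return n + parm_save / word_size;
}
```

With the constants from this patch, the ninth argument (n = 8) maps to stack word 12 with STACK_FRAME_PARM_SAVE 32, to word 14 with STACK_FRAME_PARM_SAVE 48, and to word 2 on 32-bit, where the index is first rebased by NR_REG_ARGUMENTS.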
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/ptrace.h | 31 +++++++++++++++++++++++++++++++
2 files changed, 32 insertions(+)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e9f13fe084929b..cfc6dd787f532c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -202,6 +202,7 @@ config PPC
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
+ select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index e2c778c176a3a6..956828c07abd70 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -62,6 +62,8 @@ struct pt_regs
};
#endif
+#define NR_REG_ARGUMENTS 8
+
#ifdef __powerpc64__
/*
@@ -85,8 +87,10 @@ struct pt_regs
#ifdef PPC64_ELF_ABI_v2
#define STACK_FRAME_MIN_SIZE 32
+#define STACK_FRAME_PARM_SAVE 32
#else
#define STACK_FRAME_MIN_SIZE STACK_FRAME_OVERHEAD
+#define STACK_FRAME_PARM_SAVE 48
#endif
/* Size of dummy stack frame allocated when calling signal handler. */
@@ -103,6 +107,7 @@ struct pt_regs
#define STACK_INT_FRAME_SIZE (sizeof(struct pt_regs) + STACK_FRAME_OVERHEAD)
#define STACK_FRAME_MARKER 2
#define STACK_FRAME_MIN_SIZE STACK_FRAME_OVERHEAD
+#define STACK_FRAME_PARM_SAVE 8
/* Size of stack frame allocated when calling signal handler. */
#define __SIGNAL_FRAMESIZE 64
@@ -309,6 +314,32 @@ static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
return 0;
}
+/**
+ * regs_get_kernel_argument() - get Nth function argument in kernel
+ * @regs: pt_regs of that context
+ * @n: function argument number (start from 0)
+ *
+ * regs_get_kernel_argument() returns the @n th argument of the function call.
+ * Note that this chooses the most probable assignment, and is incorrect
+ * in scenarios where double or fp/vector parameters are involved.
+ * This also doesn't take into account stack alignment requirements.
+ *
+ * This is expected to be called from kprobes or ftrace with regs
+ * at function entry, so the current function has not set up its stack.
+ */
+static inline unsigned long regs_get_kernel_argument(struct pt_regs *regs,
+ unsigned int n)
+{
+ if (n >= NR_REG_ARGUMENTS) {
+#ifndef __powerpc64__
+ n -= NR_REG_ARGUMENTS;
+#endif
+ n += STACK_FRAME_PARM_SAVE / sizeof(unsigned long);
+ return regs_get_kernel_stack_nth(regs, n);
+ } else {
+ return regs_get_register(regs, offsetof(struct pt_regs, gpr[n + 3]));
+ }
+}
#endif /* __ASSEMBLY__ */
#ifndef __powerpc64__
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 07/14] powerpc/ftrace: Remove dead code
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
ftrace_plt_tramps[] was intended to speed up skipping plt branches, but
the code wasn't completed. It is also not significantly better than
reading and decoding the instruction. Remove the same.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/kernel/trace/ftrace.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 42761ebec9f755..4fe5f373172fd2 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -332,7 +332,6 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
struct ppc_inst op;
unsigned long ptr;
struct ppc_inst instr;
- static unsigned long ftrace_plt_tramps[NUM_FTRACE_TRAMPS];
/* Is this a known long jump tramp? */
for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
@@ -341,13 +340,6 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
else if (ftrace_tramps[i] == tramp)
return 0;
- /* Is this a known plt tramp? */
- for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
- if (!ftrace_plt_tramps[i])
- break;
- else if (ftrace_plt_tramps[i] == tramp)
- return -1;
-
/* New trampoline -- read where this goes */
if (probe_kernel_read_inst(&op, (void *)tramp)) {
pr_debug("Fetching opcode failed.\n");
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 08/14] powerpc/ftrace: Use FTRACE_REGS_ADDR to identify the correct ftrace trampoline
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
Use FTRACE_REGS_ADDR instead of keying off
CONFIG_DYNAMIC_FTRACE_WITH_REGS to identify the proper ftrace trampoline
address to use.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/kernel/trace/ftrace.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 4fe5f373172fd2..14b39f7797d455 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -361,11 +361,7 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
}
/* Let's re-write the tramp to go to ftrace_[regs_]caller */
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
- ptr = ppc_global_function_entry((void *)ftrace_regs_caller);
-#else
- ptr = ppc_global_function_entry((void *)ftrace_caller);
-#endif
+ ptr = ppc_global_function_entry((void *)FTRACE_REGS_ADDR);
if (create_branch(&instr, (void *)tramp, ptr, 0)) {
pr_debug("%ps is not reachable from existing mcount tramp\n",
(void *)ptr);
@@ -885,11 +881,7 @@ int __init ftrace_dyn_arch_init(void)
0x7d8903a6, /* mtctr r12 */
0x4e800420, /* bctr */
};
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
- unsigned long addr = ppc_global_function_entry((void *)ftrace_regs_caller);
-#else
- unsigned long addr = ppc_global_function_entry((void *)ftrace_caller);
-#endif
+ unsigned long addr = ppc_global_function_entry((void *)FTRACE_REGS_ADDR);
long reladdr = addr - kernel_toc_addr();
if (reladdr > 0x7FFFFFFF || reladdr < -(0x80000000L)) {
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 12/14] powerpc/ftrace: Drop saving LR to stack save area for -mprofile-kernel
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
In -mprofile-kernel variant of ftrace_graph_caller(), we save the
optionally-updated LR address into the stack save area at the end. This
is likely an offshoot of the initial -mprofile-kernel implementation in
gcc emitting the same as part of the -mprofile-kernel instruction
sequence. However, this is not required. Drop it.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/kernel/trace/ftrace_64_mprofile.S | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
index bbe871b47ade58..c5602e9b07faa3 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
+++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
@@ -310,7 +310,5 @@ _GLOBAL(ftrace_graph_caller)
ld r2, 24(r1)
addi r1, r1, SWITCH_FRAME_SIZE
- mflr r0
- std r0, LRSAVE(r1)
bctr
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
--
2.25.4
^ permalink raw reply related
* [RFC PATCH 10/14] powerpc/ftrace: Drop assumptions about ftrace trampoline target
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
We currently assume that ftrace locations are patched to go to either
ftrace_caller or ftrace_regs_caller. Drop this assumption in preparation
for supporting ftrace direct calls.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/kernel/trace/ftrace.c | 107 +++++++++++++++++++++++------
1 file changed, 86 insertions(+), 21 deletions(-)
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 7ddb6e4b527c39..fcb21a9756e456 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -322,14 +322,15 @@ static int add_ftrace_tramp(unsigned long tramp, unsigned long target)
*/
static int setup_mcount_compiler_tramp(unsigned long tramp)
{
+ int i;
struct ppc_inst op;
struct ppc_inst instr;
struct ppc_ftrace_stub_data *stub;
unsigned long ptr, ftrace_target = ppc_global_function_entry((void *)FTRACE_REGS_ADDR);
- /* Is this a known long jump tramp? */
- hash_for_each_possible(ppc_ftrace_stubs, stub, hentry, ftrace_target)
- if (stub->target == ftrace_target && stub->addr == tramp)
+ /* Is this a known tramp? */
+ hash_for_each(ppc_ftrace_stubs, i, stub, hentry)
+ if (stub->addr == tramp)
return 0;
/* New trampoline -- read where this goes */
@@ -608,23 +609,16 @@ static int __ftrace_make_call_kernel(struct dyn_ftrace *rec, unsigned long addr)
{
struct ppc_inst op;
void *ip = (void *)rec->ip;
- unsigned long tramp, entry, ptr;
+ unsigned long tramp, ptr;
- /* Make sure we're being asked to patch branch to a known ftrace addr */
- entry = ppc_global_function_entry((void *)ftrace_caller);
ptr = ppc_global_function_entry((void *)addr);
- if (ptr != entry) {
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
- entry = ppc_global_function_entry((void *)ftrace_regs_caller);
- if (ptr != entry) {
+ /* Make sure we branch to ftrace_regs_caller since we only setup stubs for that */
+ tramp = ppc_global_function_entry((void *)ftrace_caller);
+ if (ptr == tramp)
+ ptr = ppc_global_function_entry((void *)FTRACE_REGS_ADDR);
#endif
- pr_err("Unknown ftrace addr to patch: %ps\n", (void *)ptr);
- return -EINVAL;
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
- }
-#endif
- }
/* Make sure we have a nop */
if (probe_kernel_read_inst(&op, ip)) {
@@ -637,7 +631,7 @@ static int __ftrace_make_call_kernel(struct dyn_ftrace *rec, unsigned long addr)
return -EINVAL;
}
- tramp = find_ftrace_tramp((unsigned long)ip, FTRACE_REGS_ADDR);
+ tramp = find_ftrace_tramp((unsigned long)ip, ptr);
if (!tramp) {
pr_err("No ftrace trampolines reachable from %ps\n", ip);
return -EINVAL;
@@ -783,6 +777,81 @@ __ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
}
#endif
+static int
+__ftrace_modify_call_kernel(struct dyn_ftrace *rec, unsigned long old_addr, unsigned long addr)
+{
+ struct ppc_inst op;
+ unsigned long ip = rec->ip;
+ unsigned long entry, ptr, tramp;
+
+ /* read where this goes */
+ if (probe_kernel_read_inst(&op, (void *)ip)) {
+ pr_err("Fetching opcode failed.\n");
+ return -EFAULT;
+ }
+
+ /* Make sure that this is still a 24bit jump */
+ if (!is_bl_op(op)) {
+ pr_err("Not expected bl: opcode is %s\n", ppc_inst_as_str(op));
+ return -EINVAL;
+ }
+
+ /* lets find where the pointer goes */
+ tramp = find_bl_target(ip, op);
+ entry = ppc_global_function_entry((void *)old_addr);
+
+ pr_devel("ip:%lx jumps to %lx", ip, tramp);
+
+ if (tramp != entry) {
+ /* old_addr is not within range, so we must have used a trampoline */
+ struct ppc_ftrace_stub_data *stub;
+
+ hash_for_each_possible(ppc_ftrace_stubs, stub, hentry, entry)
+ if (stub->target == entry && stub->addr == tramp)
+ break;
+
+ if (!stub) {
+ pr_err("we don't know about the tramp at %lx!\n", tramp);
+ return -EFAULT;
+ }
+ }
+
+ /* The new target may be within range */
+ if (test_24bit_addr(ip, addr)) {
+ /* within range */
+ if (patch_branch((struct ppc_inst *)ip, addr, BRANCH_SET_LINK)) {
+ pr_err("REL24 out of range!\n");
+ return -EINVAL;
+ }
+
+ return 0;
+ }
+
+ ptr = ppc_global_function_entry((void *)addr);
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+ /* Make sure we branch to ftrace_regs_caller since we only setup stubs for that */
+ entry = ppc_global_function_entry((void *)ftrace_caller);
+ if (ptr == entry)
+ ptr = ppc_global_function_entry((void *)FTRACE_REGS_ADDR);
+#endif
+
+ tramp = find_ftrace_tramp(ip, ptr);
+
+ if (!tramp) {
+ pr_err("Couldn't find a trampoline\n");
+ return -EFAULT;
+ }
+
+ pr_devel("trampoline %lx target %lx", tramp, ptr);
+
+ if (patch_branch((struct ppc_inst *)ip, tramp, BRANCH_SET_LINK)) {
+ pr_err("REL24 out of range!\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
unsigned long addr)
{
@@ -800,11 +869,7 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
new = ftrace_call_replace(ip, addr, 1);
return ftrace_modify_code(ip, old, new);
} else if (core_kernel_text(ip)) {
- /*
- * We always patch out of range locations to go to the regs
- * variant, so there is nothing to do here
- */
- return 0;
+ return __ftrace_modify_call_kernel(rec, old_addr, addr);
}
#ifdef CONFIG_MODULES
--
2.25.4
* [RFC PATCH 13/14] powerpc/ftrace: Add support for register_ftrace_direct() for MPROFILE_KERNEL
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
Add support for register_ftrace_direct() for MPROFILE_KERNEL, as it
depends on DYNAMIC_FTRACE_WITH_REGS.
Since powerpc relative branches only reach +/-32MB, we set aside a 64k
area within kernel text for creating stubs that can be used to branch to
the provided trampoline, which can be located in the module area. This
is limited to kernel text, and as such, ftrace direct calls are not
supported for functions in kernel modules at this time.
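As a rough illustration of the numbers above, here is a minimal userspace
sketch, not the kernel code itself. It assumes the 7-instruction stub layout
and the 64k area from this patch, and that a relative branch carries a signed
26-bit, word-aligned offset. The names branch_in_range(), split_target() and
join_target() are hypothetical helpers introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed from the patch: 64k stub area, 7 instructions of 4 bytes per stub. */
#define STUBS_AREA_SIZE 65536u
#define STUB_SIZE       (7u * 4u)
#define NUM_STUBS       (STUBS_AREA_SIZE / STUB_SIZE)

/*
 * Hypothetical helper: can a relative branch at 'ip' reach 'target'?
 * The offset must be word aligned and fit in a signed 26-bit field,
 * i.e. roughly +/-32MB.
 */
static bool branch_in_range(uint64_t ip, uint64_t target)
{
	int64_t offset = (int64_t)(target - ip);

	return offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3);
}

/*
 * Hypothetical helper: split a 64-bit target into the four 16-bit
 * immediates that a lis/ori/sldi/oris/ori sequence would load.
 */
static void split_target(uint64_t target, uint16_t imm[4])
{
	assert(imm != NULL);
	imm[0] = (uint16_t)(target >> 48);	/* lis  */
	imm[1] = (uint16_t)(target >> 32);	/* ori  */
	imm[2] = (uint16_t)(target >> 16);	/* oris */
	imm[3] = (uint16_t)target;		/* ori  */
}

/* Rebuild the address the same way the stub would compute it. */
static uint64_t join_target(const uint16_t imm[4])
{
	uint64_t hi = ((uint64_t)imm[0] << 16) | imm[1];

	return (hi << 32) | ((uint64_t)imm[2] << 16) | imm[3];
}
```

With these assumptions the area holds 65536 / 28 = 2340 stubs, and a stub is
only usable from a given ftrace location if branch_in_range() holds for it.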
We use orig_gpr3 to stash the address of the direct call trampoline in
arch_ftrace_set_direct_caller(). ftrace_regs_caller() is updated to
check for this to determine if we need to redirect to a direct call
trampoline. As the direct call trampoline has to work as an alternative
for the ftrace trampoline, we set up LR and r0 appropriately, and update
ctr to the trampoline address. Finally, ftrace_graph_caller() is
updated to save/restore r0.
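The exit decision described above can be modelled in plain C. This is a
simplified sketch based on my reading of the assembly in this patch, not a
replacement for it. struct regs_model, struct exit_model and both functions
are hypothetical names used only here, and livepatch handling is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the pt_regs fields that matter here. */
struct regs_model {
	uint64_t nip;		/* where the traced function resumes */
	uint64_t link;		/* LR saved on entry (caller of traced function) */
	uint64_t orig_gpr3;	/* 0, or the direct call trampoline address */
};

/* Hypothetical model of the register state on leaving ftrace_regs_caller(). */
struct exit_model {
	uint64_t ctr;	/* branch destination for the final bctr */
	uint64_t lr;
	uint64_t r0;
};

/* Mirrors arch_ftrace_set_direct_caller(): stash the trampoline address. */
static void model_set_direct_caller(struct regs_model *regs, uint64_t addr)
{
	assert((addr & 0x3) == 0);	/* must sit on an instruction boundary */
	regs->orig_gpr3 = addr;
}

/* orig_gpr3 is cleared on entry, so non-zero means "go to a direct call". */
static struct exit_model model_regs_caller_exit(const struct regs_model *regs)
{
	struct exit_model out;

	if (regs->orig_gpr3) {
		/* Look like a fresh ftrace entry to the direct trampoline. */
		out.ctr = regs->orig_gpr3;
		out.lr = regs->nip;
		out.r0 = regs->link;
	} else {
		/* Usual path: return to the (possibly modified) NIP. */
		out.ctr = regs->nip;
		out.lr = regs->link;
		out.r0 = regs->link;
	}

	return out;
}
```

In this model the direct call trampoline sees the same LR and r0 that
ftrace_regs_caller() itself saw on entry, which is what lets it act as an
alternative to the ftrace trampoline.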
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/ftrace.h | 14 ++
arch/powerpc/kernel/trace/ftrace.c | 140 +++++++++++++++++-
.../powerpc/kernel/trace/ftrace_64_mprofile.S | 40 ++++-
4 files changed, 182 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index cfc6dd787f532c..a87ac2e403196e 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -197,6 +197,7 @@ config PPC
select HAVE_DEBUG_KMEMLEAK
select HAVE_DEBUG_STACKOVERFLOW
select HAVE_DYNAMIC_FTRACE
+ select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS if MPROFILE_KERNEL
select HAVE_DYNAMIC_FTRACE_WITH_REGS if MPROFILE_KERNEL
select HAVE_EBPF_JIT if PPC64
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index bc76970b6ee532..2f1c46e9f5d416 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -10,6 +10,8 @@
#define HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
+#define FTRACE_STUBS_SIZE 65536
+
#ifdef __ASSEMBLY__
/* Based off of objdump optput from glibc */
@@ -59,6 +61,18 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
struct dyn_arch_ftrace {
struct module *mod;
};
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+/*
+ * When there is a direct caller registered, we use regs->orig_gpr3 (similar to
+ * how x86 uses orig_ax) to let ftrace_[regs_]caller know that we should go
+ * there instead of returning to the function
+ */
+static inline void arch_ftrace_set_direct_caller(struct pt_regs *regs, unsigned long addr)
+{
+ regs->orig_gpr3 = addr;
+}
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
#endif /* __ASSEMBLY__ */
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index fcb21a9756e456..815b14ae45a71f 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -37,6 +37,7 @@ static DEFINE_HASHTABLE(ppc_ftrace_stubs, 8);
struct ppc_ftrace_stub_data {
unsigned long addr;
unsigned long target;
+ refcount_t refs;
struct hlist_node hentry;
};
@@ -299,7 +300,7 @@ static unsigned long find_ftrace_tramp(unsigned long ip, unsigned long target)
return 0;
}
-static int add_ftrace_tramp(unsigned long tramp, unsigned long target)
+static int add_ftrace_tramp(unsigned long tramp, unsigned long target, int lock)
{
struct ppc_ftrace_stub_data *stub;
@@ -309,11 +310,123 @@ static int add_ftrace_tramp(unsigned long tramp, unsigned long target)
stub->addr = tramp;
stub->target = target;
+ refcount_set(&stub->refs, 1);
+ if (lock)
+ refcount_inc(&stub->refs);
hash_add(ppc_ftrace_stubs, &stub->hentry, target);
return 0;
}
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+static u32 ftrace_direct_stub_insns[] = {
+ PPC_RAW_LIS(12, 0),
+ PPC_RAW_ORI(12, 12, 0),
+ PPC_RAW_SLDI(12, 12, 32),
+ PPC_RAW_ORIS(12, 12, 0),
+ PPC_RAW_ORI(12, 12, 0),
+ PPC_RAW_MTCTR(12),
+ PPC_RAW_BCTR(),
+};
+#define FTRACE_NUM_STUBS (FTRACE_STUBS_SIZE / sizeof(ftrace_direct_stub_insns))
+static DECLARE_BITMAP(stubs_bitmap, FTRACE_NUM_STUBS);
+extern unsigned int ftrace_stubs[];
+
+static unsigned long get_ftrace_tramp(unsigned long ip, unsigned long target)
+{
+ struct ppc_ftrace_stub_data *stub_data;
+ struct ppc_inst instr;
+ unsigned int *stub;
+ int index;
+
+ hash_for_each_possible(ppc_ftrace_stubs, stub_data, hentry, target) {
+ if (stub_data->target == target &&
+ !create_branch(&instr, (void *)ip, stub_data->addr, 0)) {
+ refcount_inc(&stub_data->refs);
+ return stub_data->addr;
+ }
+ }
+
+ /* Allocate a stub */
+ do {
+ index = find_first_zero_bit(stubs_bitmap, FTRACE_NUM_STUBS);
+ if (index >= FTRACE_NUM_STUBS) {
+ pr_err("No stubs available\n");
+ return 0;
+ }
+ } while (test_and_set_bit(index, stubs_bitmap));
+ stub = &ftrace_stubs[index * sizeof(ftrace_direct_stub_insns) / 4];
+
+ if (create_branch(&instr, (void *)ip, (unsigned long)stub, 0)) {
+ /* Stub is not reachable from the ftrace location */
+ clear_bit(index, stubs_bitmap);
+ return 0;
+ }
+
+ memcpy(stub, ftrace_direct_stub_insns, sizeof(ftrace_direct_stub_insns));
+ stub[0] |= IMM_L(target >> 48);
+ stub[1] |= IMM_L(target >> 32);
+ stub[3] |= IMM_L(target >> 16);
+ stub[4] |= IMM_L(target);
+ if (add_ftrace_tramp((unsigned long)stub, target, 0)) {
+ pr_err("Error allocating ftrace stub");
+ clear_bit(index, stubs_bitmap);
+ return 0;
+ }
+
+ return (unsigned long)stub;
+}
+
+static void remove_ftrace_tramp(unsigned long ip, unsigned long target, unsigned long stub_addr)
+{
+ struct ppc_ftrace_stub_data *stub;
+ unsigned long tramp = 0;
+ struct ppc_inst instr;
+ int index;
+
+ hash_for_each_possible(ppc_ftrace_stubs, stub, hentry, target) {
+ if (stub->target == target && stub->addr == stub_addr &&
+ !create_branch(&instr, (void *)ip, stub->addr, 0)) {
+ if (refcount_dec_and_test(&stub->refs)) {
+ tramp = stub->addr;
+ hash_del(&stub->hentry);
+ kfree(stub);
+ break;
+ }
+ return;
+ }
+ }
+
+ if (tramp) {
+ synchronize_rcu_tasks();
+ index = (tramp - (unsigned long)ftrace_stubs) / sizeof(ftrace_direct_stub_insns);
+ clear_bit(index, stubs_bitmap);
+ }
+}
+
+int arch_register_ftrace_direct(unsigned long ip, unsigned long addr)
+{
+ if (addr & 0x03) {
+ pr_err("Target address is not at instruction boundary: 0x%lx\n", addr);
+ return -EINVAL;
+ }
+
+ if (is_module_text_address(ip)) {
+ pr_err("Kernel modules are not supported for direct calls\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+#else
+static unsigned long get_ftrace_tramp(unsigned long ip, unsigned long target)
+{
+ return find_ftrace_tramp(ip, target);
+}
+
+static void remove_ftrace_tramp(unsigned long ip, unsigned long target, unsigned long stub_addr) { }
+#endif
+
/*
* If this is a compiler generated long_branch trampoline (essentially, a
* trampoline that has a branch to _mcount()), we re-write the branch to
@@ -365,7 +478,7 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
return -1;
}
- if (add_ftrace_tramp(tramp, ftrace_target)) {
+ if (add_ftrace_tramp(tramp, ftrace_target, 1)) {
pr_debug("No tramp locations left\n");
return -1;
}
@@ -409,6 +522,8 @@ static int __ftrace_make_nop_kernel(struct dyn_ftrace *rec, unsigned long addr)
return -EPERM;
}
+ remove_ftrace_tramp(ip, addr, tramp);
+
return 0;
}
@@ -631,7 +746,7 @@ static int __ftrace_make_call_kernel(struct dyn_ftrace *rec, unsigned long addr)
return -EINVAL;
}
- tramp = find_ftrace_tramp((unsigned long)ip, ptr);
+ tramp = get_ftrace_tramp((unsigned long)ip, ptr);
if (!tramp) {
pr_err("No ftrace trampolines reachable from %ps\n", ip);
return -EINVAL;
@@ -782,7 +897,7 @@ __ftrace_modify_call_kernel(struct dyn_ftrace *rec, unsigned long old_addr, unsi
{
struct ppc_inst op;
unsigned long ip = rec->ip;
- unsigned long entry, ptr, tramp;
+ unsigned long entry, ptr, tramp, tramp_old = 0;
/* read where this goes */
if (probe_kernel_read_inst(&op, (void *)ip)) {
@@ -814,6 +929,8 @@ __ftrace_modify_call_kernel(struct dyn_ftrace *rec, unsigned long old_addr, unsi
pr_err("we don't know about the tramp at %lx!\n", tramp);
return -EFAULT;
}
+
+ tramp_old = tramp;
}
/* The new target may be within range */
@@ -824,7 +941,7 @@ __ftrace_modify_call_kernel(struct dyn_ftrace *rec, unsigned long old_addr, unsi
return -EINVAL;
}
- return 0;
+ goto out;
}
ptr = ppc_global_function_entry((void *)addr);
@@ -836,7 +953,7 @@ __ftrace_modify_call_kernel(struct dyn_ftrace *rec, unsigned long old_addr, unsi
ptr = ppc_global_function_entry((void *)FTRACE_REGS_ADDR);
#endif
- tramp = find_ftrace_tramp(ip, ptr);
+ tramp = get_ftrace_tramp(ip, ptr);
if (!tramp) {
pr_err("Couldn't find a trampoline\n");
@@ -850,8 +967,13 @@ __ftrace_modify_call_kernel(struct dyn_ftrace *rec, unsigned long old_addr, unsi
return -EINVAL;
}
+out:
+ if (tramp_old)
+ remove_ftrace_tramp(ip, old_addr, tramp_old);
+
return 0;
}
+
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
unsigned long addr)
{
@@ -950,9 +1072,13 @@ int __init ftrace_dyn_arch_init(void)
memcpy(tramp[i], stub_insns, sizeof(stub_insns));
tramp[i][1] |= PPC_HA(reladdr);
tramp[i][2] |= PPC_LO(reladdr);
- add_ftrace_tramp((unsigned long)tramp[i], addr);
+ add_ftrace_tramp((unsigned long)tramp[i], addr, 1);
}
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ bitmap_zero(stubs_bitmap, FTRACE_NUM_STUBS);
+#endif
+
return 0;
}
#else
diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
index c5602e9b07faa3..ffd2e33ff979bc 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
+++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
@@ -13,6 +13,13 @@
#include <asm/bug.h>
#include <asm/ptrace.h>
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ .balign 4
+.global ftrace_stubs
+ftrace_stubs:
+ .space FTRACE_STUBS_SIZE
+#endif
+
/*
*
* ftrace_caller()/ftrace_regs_caller() is the function that replaces _mcount()
@@ -91,6 +98,10 @@ _GLOBAL(ftrace_regs_caller)
std r10, _XER(r1)
std r11, _CCR(r1)
+ /* Clear out orig_gpr3 */
+ li r6, 0
+ std r6, ORIG_GPR3(r1)
+
/* Load &pt_regs in r6 for call below */
addi r6, r1 ,STACK_FRAME_OVERHEAD
@@ -103,20 +114,34 @@ ftrace_regs_call:
/* Load ctr with the possibly modified NIP */
ld r3, _NIP(r1)
mtctr r3
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ /* Check if we should go to a direct call next */
+ ld r4, ORIG_GPR3(r1)
+ cmpdi r4, 0
+ beq+ 1f
+ /* r4 has the direct call target, setup LR and r0 as on our entry, reset cr0 */
+ mtctr r4
+ mtlr r3
+ ld r0, _LINK(r1)
+ cmpd r3, r3
+ b 2f
+#endif
+
+1:
#ifdef CONFIG_LIVEPATCH
cmpd r14, r3 /* has NIP been altered? */
#endif
- /* Restore gprs */
- REST_GPR(0,r1)
- REST_10GPRS(2,r1)
- REST_10GPRS(12,r1)
- REST_10GPRS(22,r1)
-
/* Restore possibly modified LR */
ld r0, _LINK(r1)
mtlr r0
+ /* Restore gprs */
+2: REST_10GPRS(2,r1)
+ REST_10GPRS(12,r1)
+ REST_10GPRS(22,r1)
+
/* Restore callee's TOC */
ld r2, 24(r1)
@@ -282,6 +307,7 @@ _GLOBAL(ftrace_graph_caller)
stdu r1,-SWITCH_FRAME_SIZE(r1)
/* with -mprofile-kernel, parameter regs are still alive at _mcount */
SAVE_8GPRS(3, r1)
+ SAVE_GPR(0, r1)
/* Save callee's TOC in the ABI compliant location */
std r2, 24(r1)
@@ -304,6 +330,8 @@ _GLOBAL(ftrace_graph_caller)
ld r0, _NIP(r1)
mtctr r0
+
+ REST_GPR(0, r1)
REST_8GPRS(3, r1)
/* Restore callee's TOC */
--
2.25.4
* [RFC PATCH 09/14] powerpc/ftrace: Use a hash table for tracking ftrace stubs
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
In preparation for having to deal with a large number of ftrace stubs in
support of ftrace direct calls, convert existing stubs to use a hash
table. The hash table is keyed off the target address for the stubs
since there could be multiple stubs for the same target to cover the
full kernel text.
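To make the keying choice concrete, here is a small userspace sketch of a
chained hash table keyed by target address, where several stubs may share a
target and the lookup returns the first one reachable from the call site. It
is only an illustration under stated assumptions: the kernel uses
DEFINE_HASHTABLE() and create_branch(), while hash_target(), in_branch_range(),
add_stub() and find_stub() below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define STUB_HASH_BITS 8
#define STUB_BUCKETS   (1u << STUB_HASH_BITS)

struct stub {
	uint64_t addr;		/* where the stub lives */
	uint64_t target;	/* where the stub branches to (hash key) */
	struct stub *next;
};

static struct stub *buckets[STUB_BUCKETS];

/* Multiplicative hash of the target; keep the top STUB_HASH_BITS bits. */
static unsigned int hash_target(uint64_t target)
{
	return (unsigned int)((target * 0x9E3779B97F4A7C15ull) >> (64 - STUB_HASH_BITS));
}

/* Stand-in for create_branch() succeeding: signed 26-bit, word aligned. */
static bool in_branch_range(uint64_t ip, uint64_t dest)
{
	int64_t offset = (int64_t)(dest - ip);

	return offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3);
}

/* Record a stub under its target; returns 0 on success, -1 on failure. */
static int add_stub(uint64_t addr, uint64_t target)
{
	struct stub *s = malloc(sizeof(*s));
	unsigned int b = hash_target(target);

	assert((addr & 0x3) == 0);
	if (!s)
		return -1;

	s->addr = addr;
	s->target = target;
	s->next = buckets[b];
	buckets[b] = s;

	return 0;
}

/* Find a stub for 'target' that is reachable from 'ip'; 0 if none. */
static uint64_t find_stub(uint64_t ip, uint64_t target)
{
	struct stub *s;

	for (s = buckets[hash_target(target)]; s; s = s->next)
		if (s->target == target && in_branch_range(ip, s->addr))
			return s->addr;

	return 0;
}
```

Because every stub for a given target lands in the same bucket, a lookup only
has to walk that bucket and filter on reachability.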
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/kernel/trace/ftrace.c | 75 +++++++++++++-----------------
1 file changed, 33 insertions(+), 42 deletions(-)
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 14b39f7797d455..7ddb6e4b527c39 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -13,6 +13,7 @@
#define pr_fmt(fmt) "ftrace-powerpc: " fmt
+#include <linux/hashtable.h>
#include <linux/spinlock.h>
#include <linux/hardirq.h>
#include <linux/uaccess.h>
@@ -32,14 +33,12 @@
#ifdef CONFIG_DYNAMIC_FTRACE
-/*
- * We generally only have a single long_branch tramp and at most 2 or 3 plt
- * tramps generated. But, we don't use the plt tramps currently. We also allot
- * 2 tramps after .text and .init.text. So, we only end up with around 3 usable
- * tramps in total. Set aside 8 just to be sure.
- */
-#define NUM_FTRACE_TRAMPS 8
-static unsigned long ftrace_tramps[NUM_FTRACE_TRAMPS];
+static DEFINE_HASHTABLE(ppc_ftrace_stubs, 8);
+struct ppc_ftrace_stub_data {
+ unsigned long addr;
+ unsigned long target;
+ struct hlist_node hentry;
+};
static struct ppc_inst
ftrace_call_replace(unsigned long ip, unsigned long addr, int link)
@@ -288,36 +287,31 @@ __ftrace_make_nop(struct module *mod,
#endif /* PPC64 */
#endif /* CONFIG_MODULES */
-static unsigned long find_ftrace_tramp(unsigned long ip)
+static unsigned long find_ftrace_tramp(unsigned long ip, unsigned long target)
{
- int i;
+ struct ppc_ftrace_stub_data *stub;
struct ppc_inst instr;
- /*
- * We have the compiler generated long_branch tramps at the end
- * and we prefer those
- */
- for (i = NUM_FTRACE_TRAMPS - 1; i >= 0; i--)
- if (!ftrace_tramps[i])
- continue;
- else if (create_branch(&instr, (void *)ip,
- ftrace_tramps[i], 0) == 0)
- return ftrace_tramps[i];
+ hash_for_each_possible(ppc_ftrace_stubs, stub, hentry, target)
+ if (stub->target == target && !create_branch(&instr, (void *)ip, stub->addr, 0))
+ return stub->addr;
return 0;
}
-static int add_ftrace_tramp(unsigned long tramp)
+static int add_ftrace_tramp(unsigned long tramp, unsigned long target)
{
- int i;
+ struct ppc_ftrace_stub_data *stub;
- for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
- if (!ftrace_tramps[i]) {
- ftrace_tramps[i] = tramp;
- return 0;
- }
+ stub = kmalloc(sizeof(*stub), GFP_KERNEL);
+ if (!stub)
+ return -1;
- return -1;
+ stub->addr = tramp;
+ stub->target = target;
+ hash_add(ppc_ftrace_stubs, &stub->hentry, target);
+
+ return 0;
}
/*
@@ -328,16 +322,14 @@ static int add_ftrace_tramp(unsigned long tramp)
*/
static int setup_mcount_compiler_tramp(unsigned long tramp)
{
- int i;
struct ppc_inst op;
- unsigned long ptr;
struct ppc_inst instr;
+ struct ppc_ftrace_stub_data *stub;
+ unsigned long ptr, ftrace_target = ppc_global_function_entry((void *)FTRACE_REGS_ADDR);
/* Is this a known long jump tramp? */
- for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
- if (!ftrace_tramps[i])
- break;
- else if (ftrace_tramps[i] == tramp)
+ hash_for_each_possible(ppc_ftrace_stubs, stub, hentry, ftrace_target)
+ if (stub->target == ftrace_target && stub->addr == tramp)
return 0;
/* New trampoline -- read where this goes */
@@ -361,19 +353,18 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
}
/* Let's re-write the tramp to go to ftrace_[regs_]caller */
- ptr = ppc_global_function_entry((void *)FTRACE_REGS_ADDR);
- if (create_branch(&instr, (void *)tramp, ptr, 0)) {
+ if (create_branch(&instr, (void *)tramp, ftrace_target, 0)) {
pr_debug("%ps is not reachable from existing mcount tramp\n",
- (void *)ptr);
+ (void *)ftrace_target);
return -1;
}
- if (patch_branch((struct ppc_inst *)tramp, ptr, 0)) {
+ if (patch_branch((struct ppc_inst *)tramp, ftrace_target, 0)) {
pr_debug("REL24 out of range!\n");
return -1;
}
- if (add_ftrace_tramp(tramp)) {
+ if (add_ftrace_tramp(tramp, ftrace_target)) {
pr_debug("No tramp locations left\n");
return -1;
}
@@ -405,7 +396,7 @@ static int __ftrace_make_nop_kernel(struct dyn_ftrace *rec, unsigned long addr)
if (setup_mcount_compiler_tramp(tramp)) {
/* Are other trampolines reachable? */
- if (!find_ftrace_tramp(ip)) {
+ if (!find_ftrace_tramp(ip, FTRACE_REGS_ADDR)) {
pr_err("No ftrace trampolines reachable from %ps\n",
(void *)ip);
return -EINVAL;
@@ -646,7 +637,7 @@ static int __ftrace_make_call_kernel(struct dyn_ftrace *rec, unsigned long addr)
return -EINVAL;
}
- tramp = find_ftrace_tramp((unsigned long)ip);
+ tramp = find_ftrace_tramp((unsigned long)ip, FTRACE_REGS_ADDR);
if (!tramp) {
pr_err("No ftrace trampolines reachable from %ps\n", ip);
return -EINVAL;
@@ -894,7 +885,7 @@ int __init ftrace_dyn_arch_init(void)
memcpy(tramp[i], stub_insns, sizeof(stub_insns));
tramp[i][1] |= PPC_HA(reladdr);
tramp[i][2] |= PPC_LO(reladdr);
- add_ftrace_tramp((unsigned long)tramp[i]);
+ add_ftrace_tramp((unsigned long)tramp[i], addr);
}
return 0;
--
2.25.4
* [RFC PATCH 11/14] powerpc/ftrace: Use GPR save/restore macros in ftrace_graph_caller()
From: Naveen N. Rao @ 2020-11-26 18:08 UTC (permalink / raw)
To: Steven Rostedt, Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1606412433.git.naveen.n.rao@linux.vnet.ibm.com>
Use SAVE_8GPRS(), REST_8GPRS() and _NIP(), along with using the standard
SWITCH_FRAME_SIZE for the stack frame in ftrace_graph_caller() to
simplify code. This increases the stack frame size, but it is unlikely
to be an issue since ftrace_[regs_]caller() already use a similar
stack frame size, and it isn't evident that the graph caller has too
deep a call stack to cause issues.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
.../powerpc/kernel/trace/ftrace_64_mprofile.S | 28 +++++--------------
1 file changed, 7 insertions(+), 21 deletions(-)
diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
index f9fd5f743eba34..bbe871b47ade58 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
+++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
@@ -279,24 +279,17 @@ livepatch_handler:
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
_GLOBAL(ftrace_graph_caller)
- stdu r1, -112(r1)
+ stdu r1,-SWITCH_FRAME_SIZE(r1)
/* with -mprofile-kernel, parameter regs are still alive at _mcount */
- std r10, 104(r1)
- std r9, 96(r1)
- std r8, 88(r1)
- std r7, 80(r1)
- std r6, 72(r1)
- std r5, 64(r1)
- std r4, 56(r1)
- std r3, 48(r1)
+ SAVE_8GPRS(3, r1)
/* Save callee's TOC in the ABI compliant location */
std r2, 24(r1)
ld r2, PACATOC(r13) /* get kernel TOC in r2 */
- addi r5, r1, 112
+ addi r5, r1, SWITCH_FRAME_SIZE
mfctr r4 /* ftrace_caller has moved local addr here */
- std r4, 40(r1)
+ std r4, _NIP(r1)
mflr r3 /* ftrace_caller has restored LR from stack */
subi r4, r4, MCOUNT_INSN_SIZE
@@ -309,21 +302,14 @@ _GLOBAL(ftrace_graph_caller)
*/
mtlr r3
- ld r0, 40(r1)
+ ld r0, _NIP(r1)
mtctr r0
- ld r10, 104(r1)
- ld r9, 96(r1)
- ld r8, 88(r1)
- ld r7, 80(r1)
- ld r6, 72(r1)
- ld r5, 64(r1)
- ld r4, 56(r1)
- ld r3, 48(r1)
+ REST_8GPRS(3, r1)
/* Restore callee's TOC */
ld r2, 24(r1)
- addi r1, r1, 112
+ addi r1, r1, SWITCH_FRAME_SIZE
mflr r0
std r0, LRSAVE(r1)
bctr
--
2.25.4