* [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems
@ 2017-11-01 14:12 Mark Rutland
2017-11-01 14:12 ` [PATCH 1/5] arm_pmu: fold platform helpers into platform code Mark Rutland
` (6 more replies)
0 siblings, 7 replies; 19+ messages in thread
From: Mark Rutland @ 2017-11-01 14:12 UTC (permalink / raw)
To: linux-arm-kernel
Currently the arm_pmu ACPI code is somewhat dubious. It attempts ot
allocate memory any manpiulate IRQs in a hotplug callback, which is an
atomic context.
These patches (based on the arm64 for-next/core branch [1]) attempt to
address this by moving work out of hotplug callback, requiring a
reorganisation of the common arm_pmu code.
I've given these a boot-test on a Juno R1 system, both with DT and ACPI.
In either case the PMU works as expected, and lockdep seems happy.
I've pushed the series out to my arm64/acpi-pmu-lockdep branch [2].
Thanks,
Mark.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/core
[2] https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/acpi-pmu-lockdep
Mark Rutland (5):
arm_pmu: fold platform helpers into platform code
arm_pmu: have armpmu_alloc() take GFP flags
arm_pmu: acpi: check for mismatched PPIs
arm_pmu: note IRQs/PMUs per-cpu
arm_pmu: acpi: request IRQs up-front
drivers/perf/arm_pmu.c | 107 ++++++++++++++--------------------------
drivers/perf/arm_pmu_acpi.c | 62 ++++++++++++++++-------
drivers/perf/arm_pmu_platform.c | 42 +++++++++++++++-
include/linux/perf/arm_pmu.h | 20 +++++---
4 files changed, 137 insertions(+), 94 deletions(-)
--
2.11.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/5] arm_pmu: fold platform helpers into platform code
2017-11-01 14:12 [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Mark Rutland
@ 2017-11-01 14:12 ` Mark Rutland
2017-11-01 14:12 ` [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags Mark Rutland
` (5 subsequent siblings)
6 siblings, 0 replies; 19+ messages in thread
From: Mark Rutland @ 2017-11-01 14:12 UTC (permalink / raw)
To: linux-arm-kernel
The armpmu_{request,free}_irqs() helpers are only used by
arm_pmu_platform.c, so let's fold them in and make them static.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
drivers/perf/arm_pmu.c | 21 ---------------------
drivers/perf/arm_pmu_platform.c | 21 +++++++++++++++++++++
include/linux/perf/arm_pmu.h | 2 --
3 files changed, 21 insertions(+), 23 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 7bc5eee96b31..57df8dce8e19 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -548,14 +548,6 @@ void armpmu_free_irq(struct arm_pmu *armpmu, int cpu)
free_irq(irq, per_cpu_ptr(&hw_events->percpu_pmu, cpu));
}
-void armpmu_free_irqs(struct arm_pmu *armpmu)
-{
- int cpu;
-
- for_each_cpu(cpu, &armpmu->supported_cpus)
- armpmu_free_irq(armpmu, cpu);
-}
-
int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
{
int err = 0;
@@ -612,19 +604,6 @@ int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
return err;
}
-int armpmu_request_irqs(struct arm_pmu *armpmu)
-{
- int cpu, err;
-
- for_each_cpu(cpu, &armpmu->supported_cpus) {
- err = armpmu_request_irq(armpmu, cpu);
- if (err)
- break;
- }
-
- return err;
-}
-
static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
{
struct pmu_hw_events __percpu *hw_events = pmu->hw_events;
diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c
index bbc64ee6dd96..db863f14c406 100644
--- a/drivers/perf/arm_pmu_platform.c
+++ b/drivers/perf/arm_pmu_platform.c
@@ -172,6 +172,27 @@ static int pmu_parse_irqs(struct arm_pmu *pmu)
return 0;
}
+static int armpmu_request_irqs(struct arm_pmu *armpmu)
+{
+ int cpu, err;
+
+ for_each_cpu(cpu, &armpmu->supported_cpus) {
+ err = armpmu_request_irq(armpmu, cpu);
+ if (err)
+ break;
+ }
+
+ return err;
+}
+
+static void armpmu_free_irqs(struct arm_pmu *armpmu)
+{
+ int cpu;
+
+ for_each_cpu(cpu, &armpmu->supported_cpus)
+ armpmu_free_irq(armpmu, cpu);
+}
+
int arm_pmu_device_probe(struct platform_device *pdev,
const struct of_device_id *of_table,
const struct pmu_probe_info *probe_table)
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index af0f44effd44..97426407904d 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -176,8 +176,6 @@ static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
struct arm_pmu *armpmu_alloc(void);
void armpmu_free(struct arm_pmu *pmu);
int armpmu_register(struct arm_pmu *pmu);
-int armpmu_request_irqs(struct arm_pmu *armpmu);
-void armpmu_free_irqs(struct arm_pmu *armpmu);
int armpmu_request_irq(struct arm_pmu *armpmu, int cpu);
void armpmu_free_irq(struct arm_pmu *armpmu, int cpu);
--
2.11.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags
2017-11-01 14:12 [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Mark Rutland
2017-11-01 14:12 ` [PATCH 1/5] arm_pmu: fold platform helpers into platform code Mark Rutland
@ 2017-11-01 14:12 ` Mark Rutland
2017-12-06 6:54 ` Zhangshaokun
2017-12-11 17:37 ` Will Deacon
2017-11-01 14:12 ` [PATCH 3/5] arm_pmu: acpi: check for mismatched PPIs Mark Rutland
` (4 subsequent siblings)
6 siblings, 2 replies; 19+ messages in thread
From: Mark Rutland @ 2017-11-01 14:12 UTC (permalink / raw)
To: linux-arm-kernel
In ACPI systems, we don't know the makeup of CPUs until we hotplug them
on, and thus have to allocate the PMU datastrcutures at hotplug time.
Thus, we must use GFP_ATOMIC allocations.
Reorganise the PMU allocators to take a GFP argument so that we can
permit this.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
drivers/perf/arm_pmu.c | 6 +++---
drivers/perf/arm_pmu_acpi.c | 2 +-
drivers/perf/arm_pmu_platform.c | 2 +-
include/linux/perf/arm_pmu.h | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 57df8dce8e19..3d6d4c5f2356 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -779,18 +779,18 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
&cpu_pmu->node);
}
-struct arm_pmu *armpmu_alloc(void)
+struct arm_pmu *armpmu_alloc(gfp_t flags)
{
struct arm_pmu *pmu;
int cpu;
- pmu = kzalloc(sizeof(*pmu), GFP_KERNEL);
+ pmu = kzalloc(sizeof(*pmu), flags);
if (!pmu) {
pr_info("failed to allocate PMU device!\n");
goto out;
}
- pmu->hw_events = alloc_percpu(struct pmu_hw_events);
+ pmu->hw_events = alloc_percpu_gfp(struct pmu_hw_events, flags);
if (!pmu->hw_events) {
pr_info("failed to allocate per-cpu PMU data.\n");
goto out_free_pmu;
diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 705f1a390e31..a52f5b673a15 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -127,7 +127,7 @@ static struct arm_pmu *arm_pmu_acpi_find_alloc_pmu(void)
return pmu;
}
- pmu = armpmu_alloc();
+ pmu = armpmu_alloc(GFP_ATOMIC);
if (!pmu) {
pr_warn("Unable to allocate PMU for CPU%d\n",
smp_processor_id());
diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c
index db863f14c406..686b3f28c5d1 100644
--- a/drivers/perf/arm_pmu_platform.c
+++ b/drivers/perf/arm_pmu_platform.c
@@ -203,7 +203,7 @@ int arm_pmu_device_probe(struct platform_device *pdev,
struct arm_pmu *pmu;
int ret = -ENODEV;
- pmu = armpmu_alloc();
+ pmu = armpmu_alloc(GFP_KERNEL);
if (!pmu)
return -ENOMEM;
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 97426407904d..52e6d56a6a4e 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -173,7 +173,7 @@ static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
#endif
/* Internal functions only for core arm_pmu code */
-struct arm_pmu *armpmu_alloc(void);
+struct arm_pmu *armpmu_alloc(gfp_t flags);
void armpmu_free(struct arm_pmu *pmu);
int armpmu_register(struct arm_pmu *pmu);
int armpmu_request_irq(struct arm_pmu *armpmu, int cpu);
--
2.11.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/5] arm_pmu: acpi: check for mismatched PPIs
2017-11-01 14:12 [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Mark Rutland
2017-11-01 14:12 ` [PATCH 1/5] arm_pmu: fold platform helpers into platform code Mark Rutland
2017-11-01 14:12 ` [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags Mark Rutland
@ 2017-11-01 14:12 ` Mark Rutland
2017-12-11 17:37 ` Will Deacon
2017-11-01 14:12 ` [PATCH 4/5] arm_pmu: note IRQs/PMUs per-cpu Mark Rutland
` (3 subsequent siblings)
6 siblings, 1 reply; 19+ messages in thread
From: Mark Rutland @ 2017-11-01 14:12 UTC (permalink / raw)
To: linux-arm-kernel
The arm_pmu platform code explicitly checks for mismatched PPIs at probe
time, while the ACPI code leaves this to the core code. Future
refactoring will make this difficult for the core code to check, so
let's have the ACPI code check this explicitly.
As before, upon a failure we'll continue on without an interrupt. Ho
hum.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
drivers/perf/arm_pmu.c | 16 ++++------------
drivers/perf/arm_pmu_acpi.c | 42 ++++++++++++++++++++++++++++++++++++++----
2 files changed, 42 insertions(+), 16 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 3d6d4c5f2356..e0242103d904 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -557,18 +557,10 @@ int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
if (!irq)
return 0;
- if (irq_is_percpu_devid(irq) && cpumask_empty(&armpmu->active_irqs)) {
- err = request_percpu_irq(irq, handler, "arm-pmu",
- &hw_events->percpu_pmu);
- } else if (irq_is_percpu_devid(irq)) {
- int other_cpu = cpumask_first(&armpmu->active_irqs);
- int other_irq = per_cpu(hw_events->irq, other_cpu);
-
- if (irq != other_irq) {
- pr_warn("mismatched PPIs detected.\n");
- err = -EINVAL;
- goto err_out;
- }
+ if (irq_is_percpu_devid(irq)) {
+ if (cpumask_empty(&armpmu->active_irqs))
+ err = request_percpu_irq(irq, handler, "arm-pmu",
+ &hw_events->percpu_pmu);
} else {
struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
unsigned long irq_flags;
diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index a52f5b673a15..07a113b8a3b2 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -11,6 +11,8 @@
#include <linux/acpi.h>
#include <linux/cpumask.h>
#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/irqdesc.h>
#include <linux/percpu.h>
#include <linux/perf/arm_pmu.h>
@@ -140,6 +142,35 @@ static struct arm_pmu *arm_pmu_acpi_find_alloc_pmu(void)
}
/*
+ * Check whether the new IRQ is compatible with those already associated with
+ * the PMU (e.g. we don't have mismatched PPIs).
+ */
+static bool pmu_irq_matches(struct arm_pmu *pmu, int irq)
+{
+ struct pmu_hw_events __percpu *hw_events = pmu->hw_events;
+ int cpu;
+
+ if (!irq)
+ return true;
+
+ for_each_cpu(cpu, &pmu->supported_cpus) {
+ int other_irq = per_cpu(hw_events->irq, cpu);
+ if (!other_irq)
+ continue;
+
+ if (irq == other_irq)
+ continue;
+ if (!irq_is_percpu_devid(irq) && !irq_is_percpu_devid(other_irq))
+ continue;
+
+ pr_warn("mismatched PPIs detected\n");
+ return false;
+ }
+
+ return true;
+}
+
+/*
* This must run before the common arm_pmu hotplug logic, so that we can
* associate a CPU and its interrupt before the common code tries to manage the
* affinity and so on.
@@ -164,18 +195,21 @@ static int arm_pmu_acpi_cpu_starting(unsigned int cpu)
if (!pmu)
return -ENOMEM;
- cpumask_set_cpu(cpu, &pmu->supported_cpus);
-
per_cpu(probed_pmus, cpu) = pmu;
+ if (pmu_irq_matches(pmu, irq)) {
+ hw_events = pmu->hw_events;
+ per_cpu(hw_events->irq, cpu) = irq;
+ }
+
+ cpumask_set_cpu(cpu, &pmu->supported_cpus);
+
/*
* Log and request the IRQ so the core arm_pmu code can manage it. In
* some situations (e.g. mismatched PPIs), we may fail to request the
* IRQ. However, it may be too late for us to do anything about it.
* The common ARM PMU code will log a warning in this case.
*/
- hw_events = pmu->hw_events;
- per_cpu(hw_events->irq, cpu) = irq;
armpmu_request_irq(pmu, cpu);
/*
--
2.11.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 4/5] arm_pmu: note IRQs/PMUs per-cpu
2017-11-01 14:12 [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Mark Rutland
` (2 preceding siblings ...)
2017-11-01 14:12 ` [PATCH 3/5] arm_pmu: acpi: check for mismatched PPIs Mark Rutland
@ 2017-11-01 14:12 ` Mark Rutland
2017-12-11 17:36 ` Will Deacon
2017-11-01 14:12 ` [PATCH 5/5] arm_pmu: acpi: request IRQs up-front Mark Rutland
` (2 subsequent siblings)
6 siblings, 1 reply; 19+ messages in thread
From: Mark Rutland @ 2017-11-01 14:12 UTC (permalink / raw)
To: linux-arm-kernel
The current way we manage IRQs forces the ACPI PMU driver to request
IRQs in the cpu bringup path, which isn't safe due to implicit memory
allocations in the request_irq() path.
To solve that, we need to decouple requesting IRQs from PMU management,
requesting IRQs up-front, before we know the associated PMU. We will
separately (and perhaps later) associate each IRQ with its PMU.
This patch allows the IRQ handlers to be registered without a PMU dev
argument, using a percpu pointer instead.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
drivers/perf/arm_pmu.c | 93 ++++++++++++++++++++++++++++----------------
include/linux/perf/arm_pmu.h | 3 +-
2 files changed, 62 insertions(+), 34 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index e0242103d904..287b3edfb4cc 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -26,6 +26,9 @@
#include <asm/irq_regs.h>
+static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
+static DEFINE_PER_CPU(int, cpu_irq);
+
static int
armpmu_map_cache_event(const unsigned (*cache_map)
[PERF_COUNT_HW_CACHE_MAX]
@@ -334,13 +337,9 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
int ret;
u64 start_clock, finish_clock;
- /*
- * we request the IRQ with a (possibly percpu) struct arm_pmu**, but
- * the handlers expect a struct arm_pmu*. The percpu_irq framework will
- * do any necessary shifting, we just need to perform the first
- * dereference.
- */
- armpmu = *(void **)dev;
+ armpmu = this_cpu_read(cpu_armpmu);
+ if (WARN_ON_ONCE(!armpmu))
+ return IRQ_NONE;
plat = armpmu_get_platdata(armpmu);
@@ -531,40 +530,56 @@ int perf_num_counters(void)
}
EXPORT_SYMBOL_GPL(perf_num_counters);
-void armpmu_free_irq(struct arm_pmu *armpmu, int cpu)
+int armpmu_count_irq_users(const int irq)
{
- struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
- int irq = per_cpu(hw_events->irq, cpu);
+ int cpu, count = 0;
+
+ for_each_possible_cpu(cpu) {
+ if (per_cpu(cpu_irq, cpu) == irq)
+ count++;
+ }
+
+ return count;
+}
- if (!cpumask_test_and_clear_cpu(cpu, &armpmu->active_irqs))
+void __armpmu_free_irq(int irq, int cpu)
+{
+ if (per_cpu(cpu_irq, cpu) == 0)
+ return;
+ if (WARN_ON(irq != per_cpu(cpu_irq, cpu)))
return;
if (irq_is_percpu_devid(irq)) {
- free_percpu_irq(irq, &hw_events->percpu_pmu);
- cpumask_clear(&armpmu->active_irqs);
- return;
+ if (armpmu_count_irq_users(irq) == 1)
+ free_percpu_irq(irq, &cpu_armpmu);
+ } else {
+ free_irq(irq, NULL);
}
- free_irq(irq, per_cpu_ptr(&hw_events->percpu_pmu, cpu));
+ per_cpu(cpu_irq, cpu) = 0;
}
-int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
+void armpmu_free_irq(struct arm_pmu *armpmu, int cpu)
{
- int err = 0;
struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
- const irq_handler_t handler = armpmu_dispatch_irq;
int irq = per_cpu(hw_events->irq, cpu);
+
+ __armpmu_free_irq(irq, cpu);
+ per_cpu(cpu_armpmu, cpu) = NULL;
+}
+
+int armpmu_request_irq_flags(int irq, unsigned long irq_flags, int cpu)
+{
+ int err = 0;
+ const irq_handler_t handler = armpmu_dispatch_irq;
if (!irq)
return 0;
if (irq_is_percpu_devid(irq)) {
- if (cpumask_empty(&armpmu->active_irqs))
+ if (armpmu_count_irq_users(irq) == 0)
err = request_percpu_irq(irq, handler, "arm-pmu",
- &hw_events->percpu_pmu);
+ &cpu_armpmu);
} else {
- struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
- unsigned long irq_flags;
-
err = irq_force_affinity(irq, cpumask_of(cpu));
if (err && num_possible_cpus() > 1) {
@@ -573,22 +588,14 @@ int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
goto err_out;
}
- if (platdata && platdata->irq_flags) {
- irq_flags = platdata->irq_flags;
- } else {
- irq_flags = IRQF_PERCPU |
- IRQF_NOBALANCING |
- IRQF_NO_THREAD;
- }
-
err = request_irq(irq, handler, irq_flags, "arm-pmu",
- per_cpu_ptr(&hw_events->percpu_pmu, cpu));
+ NULL);
}
if (err)
goto err_out;
- cpumask_set_cpu(cpu, &armpmu->active_irqs);
+ per_cpu(cpu_irq, cpu) = irq;
return 0;
err_out:
@@ -596,6 +603,26 @@ int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
return err;
}
+int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
+{
+ struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
+ unsigned long irq_flags;
+ struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
+ int irq = per_cpu(hw_events->irq, cpu);
+ if (!irq)
+ return 0;
+
+ if (platdata && platdata->irq_flags) {
+ irq_flags = platdata->irq_flags;
+ } else {
+ irq_flags = ARM_PMU_IRQ_FLAGS;
+ }
+
+ per_cpu(cpu_armpmu, cpu) = armpmu;
+
+ return armpmu_request_irq_flags(irq, irq_flags, cpu);
+}
+
static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
{
struct pmu_hw_events __percpu *hw_events = pmu->hw_events;
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 52e6d56a6a4e..2ce6db4e00b2 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -92,7 +92,6 @@ enum armpmu_attr_groups {
struct arm_pmu {
struct pmu pmu;
- cpumask_t active_irqs;
cpumask_t supported_cpus;
char *name;
irqreturn_t (*handle_irq)(int irq_num, void *dev);
@@ -179,6 +178,8 @@ int armpmu_register(struct arm_pmu *pmu);
int armpmu_request_irq(struct arm_pmu *armpmu, int cpu);
void armpmu_free_irq(struct arm_pmu *armpmu, int cpu);
+#define ARM_PMU_IRQ_FLAGS (IRQF_PERCPU | IRQF_NOBALANCING | IRQF_NO_THREAD)
+
#define ARMV8_PMU_PDEV_NAME "armv8-pmu"
#endif /* CONFIG_ARM_PMU */
--
2.11.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 5/5] arm_pmu: acpi: request IRQs up-front
2017-11-01 14:12 [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Mark Rutland
` (3 preceding siblings ...)
2017-11-01 14:12 ` [PATCH 4/5] arm_pmu: note IRQs/PMUs per-cpu Mark Rutland
@ 2017-11-01 14:12 ` Mark Rutland
2017-12-11 17:36 ` Will Deacon
2017-11-01 17:02 ` [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Tyler Baicar
2017-12-11 17:38 ` Will Deacon
6 siblings, 1 reply; 19+ messages in thread
From: Mark Rutland @ 2017-11-01 14:12 UTC (permalink / raw)
To: linux-arm-kernel
We can't request IRQs in atomic context, so for ACPI systems we'll have
to request them up-front, and later associate them with CPUs.
This patch reorganises the arm_pmu code to do so. As we no longer have
the arm_pmu strucutre at probe time, a number of prototypes need to be
adjusted, requiring changes to the common arm_pmu code and arm_pmu
platform code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
drivers/perf/arm_pmu.c | 37 +++----------------------------------
drivers/perf/arm_pmu_acpi.c | 20 +++++++-------------
drivers/perf/arm_pmu_platform.c | 25 ++++++++++++++++++++++---
include/linux/perf/arm_pmu.h | 13 +++++++++++--
4 files changed, 43 insertions(+), 52 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 287b3edfb4cc..534b4b3fb440 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -323,13 +323,6 @@ validate_group(struct perf_event *event)
return 0;
}
-static struct arm_pmu_platdata *armpmu_get_platdata(struct arm_pmu *armpmu)
-{
- struct platform_device *pdev = armpmu->plat_device;
-
- return pdev ? dev_get_platdata(&pdev->dev) : NULL;
-}
-
static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
{
struct arm_pmu *armpmu;
@@ -542,7 +535,7 @@ int armpmu_count_irq_users(const int irq)
return count;
}
-void __armpmu_free_irq(int irq, int cpu)
+void armpmu_free_irq(int irq, int cpu)
{
if (per_cpu(cpu_irq, cpu) == 0)
return;
@@ -559,16 +552,7 @@ void __armpmu_free_irq(int irq, int cpu)
per_cpu(cpu_irq, cpu) = 0;
}
-void armpmu_free_irq(struct arm_pmu *armpmu, int cpu)
-{
- struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
- int irq = per_cpu(hw_events->irq, cpu);
-
- __armpmu_free_irq(irq, cpu);
- per_cpu(cpu_armpmu, cpu) = NULL;
-}
-
-int armpmu_request_irq_flags(int irq, unsigned long irq_flags, int cpu)
+int armpmu_request_irq(int irq, unsigned long irq_flags, int cpu)
{
int err = 0;
const irq_handler_t handler = armpmu_dispatch_irq;
@@ -603,24 +587,9 @@ int armpmu_request_irq_flags(int irq, unsigned long irq_flags, int cpu)
return err;
}
-int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
+void armpmu_bind_cpu(struct arm_pmu *armpmu, int cpu)
{
- struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
- unsigned long irq_flags;
- struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
- int irq = per_cpu(hw_events->irq, cpu);
- if (!irq)
- return 0;
-
- if (platdata && platdata->irq_flags) {
- irq_flags = platdata->irq_flags;
- } else {
- irq_flags = ARM_PMU_IRQ_FLAGS;
- }
-
per_cpu(cpu_armpmu, cpu) = armpmu;
-
- return armpmu_request_irq_flags(irq, irq_flags, cpu);
}
static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 07a113b8a3b2..8740314584aa 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -89,7 +89,13 @@ static int arm_pmu_acpi_parse_irqs(void)
pr_warn("No ACPI PMU IRQ for CPU%d\n", cpu);
}
+ /*
+ * Log and request the IRQ so the core arm_pmu code can manage
+ * it. We'll have to sanity-check IRQs later when we associate
+ * them with their PMUs.
+ */
per_cpu(pmu_irqs, cpu) = irq;
+ armpmu_request_irq(irq, ARM_PMU_IRQ_FLAGS, cpu);
}
return 0;
@@ -203,14 +209,7 @@ static int arm_pmu_acpi_cpu_starting(unsigned int cpu)
}
cpumask_set_cpu(cpu, &pmu->supported_cpus);
-
- /*
- * Log and request the IRQ so the core arm_pmu code can manage it. In
- * some situations (e.g. mismatched PPIs), we may fail to request the
- * IRQ. However, it may be too late for us to do anything about it.
- * The common ARM PMU code will log a warning in this case.
- */
- armpmu_request_irq(pmu, cpu);
+ armpmu_bind_cpu(pmu, cpu);
/*
* Ideally, we'd probe the PMU here when we find the first matching
@@ -281,11 +280,6 @@ static int arm_pmu_acpi_init(void)
if (acpi_disabled)
return 0;
- /*
- * We can't request IRQs yet, since we don't know the cookie value
- * until we know which CPUs share the same logical PMU. We'll handle
- * that in arm_pmu_acpi_cpu_starting().
- */
ret = arm_pmu_acpi_parse_irqs();
if (ret)
return ret;
diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c
index 686b3f28c5d1..f5aba6bafe1d 100644
--- a/drivers/perf/arm_pmu_platform.c
+++ b/drivers/perf/arm_pmu_platform.c
@@ -174,10 +174,24 @@ static int pmu_parse_irqs(struct arm_pmu *pmu)
static int armpmu_request_irqs(struct arm_pmu *armpmu)
{
+ struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
+ struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
+ unsigned long irq_flags;
int cpu, err;
+ if (platdata && platdata->irq_flags)
+ irq_flags = platdata->irq_flags;
+ else
+ irq_flags = ARM_PMU_IRQ_FLAGS;
+
for_each_cpu(cpu, &armpmu->supported_cpus) {
- err = armpmu_request_irq(armpmu, cpu);
+ int irq = per_cpu(hw_events->irq, cpu);
+ if (!irq)
+ continue;
+
+ armpmu_bind_cpu(armpmu, cpu);
+
+ err = armpmu_request_irq(irq, irq_flags, cpu);
if (err)
break;
}
@@ -188,9 +202,14 @@ static int armpmu_request_irqs(struct arm_pmu *armpmu)
static void armpmu_free_irqs(struct arm_pmu *armpmu)
{
int cpu;
+ struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
- for_each_cpu(cpu, &armpmu->supported_cpus)
- armpmu_free_irq(armpmu, cpu);
+ for_each_cpu(cpu, &armpmu->supported_cpus) {
+ int irq = per_cpu(hw_events->irq, cpu);
+
+ armpmu_free_irq(irq, cpu);
+ armpmu_bind_cpu(NULL, cpu);
+ }
}
int arm_pmu_device_probe(struct platform_device *pdev,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 2ce6db4e00b2..00f11de8775e 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -14,6 +14,7 @@
#include <linux/interrupt.h>
#include <linux/perf_event.h>
+#include <linux/platform_device.h>
#include <linux/sysfs.h>
#include <asm/cputype.h>
@@ -175,8 +176,16 @@ static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
struct arm_pmu *armpmu_alloc(gfp_t flags);
void armpmu_free(struct arm_pmu *pmu);
int armpmu_register(struct arm_pmu *pmu);
-int armpmu_request_irq(struct arm_pmu *armpmu, int cpu);
-void armpmu_free_irq(struct arm_pmu *armpmu, int cpu);
+void armpmu_bind_cpu(struct arm_pmu *armpmu, int cpu);
+int armpmu_request_irq(int irq, unsigned long flags, int cpu);
+void armpmu_free_irq(int irq, int cpu);
+
+static inline struct arm_pmu_platdata *armpmu_get_platdata(struct arm_pmu *armpmu)
+{
+ struct platform_device *pdev = armpmu->plat_device;
+
+ return pdev ? dev_get_platdata(&pdev->dev) : NULL;
+}
#define ARM_PMU_IRQ_FLAGS (IRQF_PERCPU | IRQF_NOBALANCING | IRQF_NO_THREAD)
--
2.11.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems
2017-11-01 14:12 [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Mark Rutland
` (4 preceding siblings ...)
2017-11-01 14:12 ` [PATCH 5/5] arm_pmu: acpi: request IRQs up-front Mark Rutland
@ 2017-11-01 17:02 ` Tyler Baicar
2017-12-11 17:38 ` Will Deacon
6 siblings, 0 replies; 19+ messages in thread
From: Tyler Baicar @ 2017-11-01 17:02 UTC (permalink / raw)
To: linux-arm-kernel
On 11/1/2017 10:12 AM, Mark Rutland wrote:
> Currently the arm_pmu ACPI code is somewhat dubious. It attempts ot
> allocate memory any manpiulate IRQs in a hotplug callback, which is an
> atomic context.
>
> These patches (based on the arm64 for-next/core branch [1]) attempt to
> address this by moving work out of hotplug callback, requiring a
> reorganisation of the common arm_pmu code.
>
> I've given these a boot-test on a Juno R1 system, both with DT and ACPI.
> In either case the PMU works as expected, and lockdep seems happy.
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Boot tested on Qualcomm Centriq 2400 and verified I no longer see the BUG print.
Thanks,
Tyler
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags
2017-11-01 14:12 ` [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags Mark Rutland
@ 2017-12-06 6:54 ` Zhangshaokun
2017-12-11 17:37 ` Will Deacon
1 sibling, 0 replies; 19+ messages in thread
From: Zhangshaokun @ 2017-12-06 6:54 UTC (permalink / raw)
To: linux-arm-kernel
Hi Mark,
We (HiSilicon) also met this issue and it works happily with this patch,
but it is still not in 4.15 version, any special reasons?
Thanks,
Shaokun
On 2017/11/1 22:12, Mark Rutland wrote:
> In ACPI systems, we don't know the makeup of CPUs until we hotplug them
> on, and thus have to allocate the PMU datastrcutures at hotplug time.
> Thus, we must use GFP_ATOMIC allocations.
>
> Reorganise the PMU allocators to take a GFP argument so that we can
> permit this.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
> drivers/perf/arm_pmu.c | 6 +++---
> drivers/perf/arm_pmu_acpi.c | 2 +-
> drivers/perf/arm_pmu_platform.c | 2 +-
> include/linux/perf/arm_pmu.h | 2 +-
> 4 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 57df8dce8e19..3d6d4c5f2356 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -779,18 +779,18 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
> &cpu_pmu->node);
> }
>
> -struct arm_pmu *armpmu_alloc(void)
> +struct arm_pmu *armpmu_alloc(gfp_t flags)
> {
> struct arm_pmu *pmu;
> int cpu;
>
> - pmu = kzalloc(sizeof(*pmu), GFP_KERNEL);
> + pmu = kzalloc(sizeof(*pmu), flags);
> if (!pmu) {
> pr_info("failed to allocate PMU device!\n");
> goto out;
> }
>
> - pmu->hw_events = alloc_percpu(struct pmu_hw_events);
> + pmu->hw_events = alloc_percpu_gfp(struct pmu_hw_events, flags);
> if (!pmu->hw_events) {
> pr_info("failed to allocate per-cpu PMU data.\n");
> goto out_free_pmu;
> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
> index 705f1a390e31..a52f5b673a15 100644
> --- a/drivers/perf/arm_pmu_acpi.c
> +++ b/drivers/perf/arm_pmu_acpi.c
> @@ -127,7 +127,7 @@ static struct arm_pmu *arm_pmu_acpi_find_alloc_pmu(void)
> return pmu;
> }
>
> - pmu = armpmu_alloc();
> + pmu = armpmu_alloc(GFP_ATOMIC);
> if (!pmu) {
> pr_warn("Unable to allocate PMU for CPU%d\n",
> smp_processor_id());
> diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c
> index db863f14c406..686b3f28c5d1 100644
> --- a/drivers/perf/arm_pmu_platform.c
> +++ b/drivers/perf/arm_pmu_platform.c
> @@ -203,7 +203,7 @@ int arm_pmu_device_probe(struct platform_device *pdev,
> struct arm_pmu *pmu;
> int ret = -ENODEV;
>
> - pmu = armpmu_alloc();
> + pmu = armpmu_alloc(GFP_KERNEL);
> if (!pmu)
> return -ENOMEM;
>
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 97426407904d..52e6d56a6a4e 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -173,7 +173,7 @@ static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
> #endif
>
> /* Internal functions only for core arm_pmu code */
> -struct arm_pmu *armpmu_alloc(void);
> +struct arm_pmu *armpmu_alloc(gfp_t flags);
> void armpmu_free(struct arm_pmu *pmu);
> int armpmu_register(struct arm_pmu *pmu);
> int armpmu_request_irq(struct arm_pmu *armpmu, int cpu);
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 5/5] arm_pmu: acpi: request IRQs up-front
2017-11-01 14:12 ` [PATCH 5/5] arm_pmu: acpi: request IRQs up-front Mark Rutland
@ 2017-12-11 17:36 ` Will Deacon
2017-12-11 17:55 ` Mark Rutland
0 siblings, 1 reply; 19+ messages in thread
From: Will Deacon @ 2017-12-11 17:36 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 01, 2017 at 02:12:39PM +0000, Mark Rutland wrote:
> We can't request IRQs in atomic context, so for ACPI systems we'll have
> to request them up-front, and later associate them with CPUs.
>
> This patch reorganises the arm_pmu code to do so. As we no longer have
> the arm_pmu strucutre at probe time, a number of prototypes need to be
> adjusted, requiring changes to the common arm_pmu code and arm_pmu
> platform code.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
> drivers/perf/arm_pmu.c | 37 +++----------------------------------
> drivers/perf/arm_pmu_acpi.c | 20 +++++++-------------
> drivers/perf/arm_pmu_platform.c | 25 ++++++++++++++++++++++---
> include/linux/perf/arm_pmu.h | 13 +++++++++++--
> 4 files changed, 43 insertions(+), 52 deletions(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 287b3edfb4cc..534b4b3fb440 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -323,13 +323,6 @@ validate_group(struct perf_event *event)
> return 0;
> }
>
> -static struct arm_pmu_platdata *armpmu_get_platdata(struct arm_pmu *armpmu)
> -{
> - struct platform_device *pdev = armpmu->plat_device;
> -
> - return pdev ? dev_get_platdata(&pdev->dev) : NULL;
> -}
> -
> static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
> {
> struct arm_pmu *armpmu;
> @@ -542,7 +535,7 @@ int armpmu_count_irq_users(const int irq)
> return count;
> }
>
> -void __armpmu_free_irq(int irq, int cpu)
> +void armpmu_free_irq(int irq, int cpu)
> {
> if (per_cpu(cpu_irq, cpu) == 0)
> return;
> @@ -559,16 +552,7 @@ void __armpmu_free_irq(int irq, int cpu)
> per_cpu(cpu_irq, cpu) = 0;
> }
>
> -void armpmu_free_irq(struct arm_pmu *armpmu, int cpu)
> -{
> - struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
> - int irq = per_cpu(hw_events->irq, cpu);
> -
> - __armpmu_free_irq(irq, cpu);
> - per_cpu(cpu_armpmu, cpu) = NULL;
> -}
> -
> -int armpmu_request_irq_flags(int irq, unsigned long irq_flags, int cpu)
> +int armpmu_request_irq(int irq, unsigned long irq_flags, int cpu)
> {
> int err = 0;
> const irq_handler_t handler = armpmu_dispatch_irq;
> @@ -603,24 +587,9 @@ int armpmu_request_irq_flags(int irq, unsigned long irq_flags, int cpu)
> return err;
> }
>
> -int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
> +void armpmu_bind_cpu(struct arm_pmu *armpmu, int cpu)
> {
> - struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
> - unsigned long irq_flags;
> - struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
> - int irq = per_cpu(hw_events->irq, cpu);
> - if (!irq)
> - return 0;
> -
> - if (platdata && platdata->irq_flags) {
> - irq_flags = platdata->irq_flags;
> - } else {
> - irq_flags = ARM_PMU_IRQ_FLAGS;
> - }
> -
> per_cpu(cpu_armpmu, cpu) = armpmu;
Can we not make the binding implicit in armpmu_{request,free}_irq?
Will
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 4/5] arm_pmu: note IRQs/PMUs per-cpu
2017-11-01 14:12 ` [PATCH 4/5] arm_pmu: note IRQs/PMUs per-cpu Mark Rutland
@ 2017-12-11 17:36 ` Will Deacon
2017-12-11 18:15 ` Mark Rutland
0 siblings, 1 reply; 19+ messages in thread
From: Will Deacon @ 2017-12-11 17:36 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 01, 2017 at 02:12:38PM +0000, Mark Rutland wrote:
> The current way we manage IRQs forces the ACPI PMU driver to request
> IRQs in the cpu bringup path, which isn't safe due to implicit memory
> allocations in the request_irq() path.
>
> To solve that, we need to decouple requesting IRQs from PMU management,
> requesting IRQs up-front, before we know the associated PMU. We will
> separately (and perhaps later) associate each IRQ with its PMU.
>
> This patch allows the IRQ handlers to be registered without a PMU dev
> argument, using a percpu pointer instead.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
> drivers/perf/arm_pmu.c | 93 ++++++++++++++++++++++++++++----------------
> include/linux/perf/arm_pmu.h | 3 +-
> 2 files changed, 62 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index e0242103d904..287b3edfb4cc 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -26,6 +26,9 @@
>
> #include <asm/irq_regs.h>
>
> +static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
> +static DEFINE_PER_CPU(int, cpu_irq);
> +
> static int
> armpmu_map_cache_event(const unsigned (*cache_map)
> [PERF_COUNT_HW_CACHE_MAX]
> @@ -334,13 +337,9 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
> int ret;
> u64 start_clock, finish_clock;
>
> - /*
> - * we request the IRQ with a (possibly percpu) struct arm_pmu**, but
> - * the handlers expect a struct arm_pmu*. The percpu_irq framework will
> - * do any necessary shifting, we just need to perform the first
> - * dereference.
> - */
> - armpmu = *(void **)dev;
> + armpmu = this_cpu_read(cpu_armpmu);
> + if (WARN_ON_ONCE(!armpmu))
> + return IRQ_NONE;
>
> plat = armpmu_get_platdata(armpmu);
>
> @@ -531,40 +530,56 @@ int perf_num_counters(void)
> }
> EXPORT_SYMBOL_GPL(perf_num_counters);
>
> -void armpmu_free_irq(struct arm_pmu *armpmu, int cpu)
> +int armpmu_count_irq_users(const int irq)
> {
> - struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
> - int irq = per_cpu(hw_events->irq, cpu);
> + int cpu, count = 0;
> +
> + for_each_possible_cpu(cpu) {
> + if (per_cpu(cpu_irq, cpu) == irq)
> + count++;
> + }
> +
> + return count;
> +}
>
> - if (!cpumask_test_and_clear_cpu(cpu, &armpmu->active_irqs))
> +void __armpmu_free_irq(int irq, int cpu)
> +{
> + if (per_cpu(cpu_irq, cpu) == 0)
> + return;
> + if (WARN_ON(irq != per_cpu(cpu_irq, cpu)))
> return;
>
> if (irq_is_percpu_devid(irq)) {
> - free_percpu_irq(irq, &hw_events->percpu_pmu);
> - cpumask_clear(&armpmu->active_irqs);
> - return;
> + if (armpmu_count_irq_users(irq) == 1)
> + free_percpu_irq(irq, &cpu_armpmu);
Do you actually need the count, or could you just free the irq the first
time this is called and set all of the cpu_irqs to 0?
Will
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags
2017-11-01 14:12 ` [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags Mark Rutland
2017-12-06 6:54 ` Zhangshaokun
@ 2017-12-11 17:37 ` Will Deacon
2017-12-11 18:02 ` Mark Rutland
1 sibling, 1 reply; 19+ messages in thread
From: Will Deacon @ 2017-12-11 17:37 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 01, 2017 at 02:12:36PM +0000, Mark Rutland wrote:
> In ACPI systems, we don't know the makeup of CPUs until we hotplug them
> on, and thus have to allocate the PMU datastrcutures at hotplug time.
> Thus, we must use GFP_ATOMIC allocations.
>
> Reorganise the PMU allocators to take a GFP argument so that we can
> permit this.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
> drivers/perf/arm_pmu.c | 6 +++---
> drivers/perf/arm_pmu_acpi.c | 2 +-
> drivers/perf/arm_pmu_platform.c | 2 +-
> include/linux/perf/arm_pmu.h | 2 +-
> 4 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 57df8dce8e19..3d6d4c5f2356 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -779,18 +779,18 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
> &cpu_pmu->node);
> }
>
> -struct arm_pmu *armpmu_alloc(void)
> +struct arm_pmu *armpmu_alloc(gfp_t flags)
> {
> struct arm_pmu *pmu;
> int cpu;
>
> - pmu = kzalloc(sizeof(*pmu), GFP_KERNEL);
> + pmu = kzalloc(sizeof(*pmu), flags);
> if (!pmu) {
> pr_info("failed to allocate PMU device!\n");
> goto out;
> }
>
> - pmu->hw_events = alloc_percpu(struct pmu_hw_events);
> + pmu->hw_events = alloc_percpu_gfp(struct pmu_hw_events, flags);
> if (!pmu->hw_events) {
> pr_info("failed to allocate per-cpu PMU data.\n");
> goto out_free_pmu;
> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
> index 705f1a390e31..a52f5b673a15 100644
> --- a/drivers/perf/arm_pmu_acpi.c
> +++ b/drivers/perf/arm_pmu_acpi.c
> @@ -127,7 +127,7 @@ static struct arm_pmu *arm_pmu_acpi_find_alloc_pmu(void)
> return pmu;
> }
>
> - pmu = armpmu_alloc();
> + pmu = armpmu_alloc(GFP_ATOMIC);
I think I'd rather have armpmu_alloc_atomic as a wrapper around
__armpmu_alloc(GFP_ATOMIC) and then leave the armpmu_alloc to pass
GFP_KERNEL.
Will
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 3/5] arm_pmu: acpi: check for mismatched PPIs
2017-11-01 14:12 ` [PATCH 3/5] arm_pmu: acpi: check for mismatched PPIs Mark Rutland
@ 2017-12-11 17:37 ` Will Deacon
2017-12-11 18:08 ` Mark Rutland
0 siblings, 1 reply; 19+ messages in thread
From: Will Deacon @ 2017-12-11 17:37 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 01, 2017 at 02:12:37PM +0000, Mark Rutland wrote:
> The arm_pmu platform code explicitly checks for mismatched PPIs at probe
> time, while the ACPI code leaves this to the core code. Future
> refactoring will make this difficult for the core code to check, so
> let's have the ACPI code check this explicitly.
>
> As before, upon a failure we'll continue on without an interrupt. Ho
> hum.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
> drivers/perf/arm_pmu.c | 16 ++++------------
> drivers/perf/arm_pmu_acpi.c | 42 ++++++++++++++++++++++++++++++++++++++----
> 2 files changed, 42 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 3d6d4c5f2356..e0242103d904 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -557,18 +557,10 @@ int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
> if (!irq)
> return 0;
>
> - if (irq_is_percpu_devid(irq) && cpumask_empty(&armpmu->active_irqs)) {
> - err = request_percpu_irq(irq, handler, "arm-pmu",
> - &hw_events->percpu_pmu);
> - } else if (irq_is_percpu_devid(irq)) {
> - int other_cpu = cpumask_first(&armpmu->active_irqs);
> - int other_irq = per_cpu(hw_events->irq, other_cpu);
> -
> - if (irq != other_irq) {
> - pr_warn("mismatched PPIs detected.\n");
> - err = -EINVAL;
> - goto err_out;
> - }
> + if (irq_is_percpu_devid(irq)) {
> + if (cpumask_empty(&armpmu->active_irqs))
Why not leave this as before, with a '&&' operator?
Will
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems
2017-11-01 14:12 [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Mark Rutland
` (5 preceding siblings ...)
2017-11-01 17:02 ` [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Tyler Baicar
@ 2017-12-11 17:38 ` Will Deacon
6 siblings, 0 replies; 19+ messages in thread
From: Will Deacon @ 2017-12-11 17:38 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 01, 2017 at 02:12:34PM +0000, Mark Rutland wrote:
> Currently the arm_pmu ACPI code is somewhat dubious. It attempts ot
> allocate memory any manpiulate IRQs in a hotplug callback, which is an
> atomic context.
>
> These patches (based on the arm64 for-next/core branch [1]) attempt to
> address this by moving work out of hotplug callback, requiring a
> reorganisation of the common arm_pmu code.
>
> I've given these a boot-test on a Juno R1 system, both with DT and ACPI.
> In either case the PMU works as expected, and lockdep seems happy.
>
> I've pushed the series out to my arm64/acpi-pmu-lockdep branch [2].
This mostly looks ok to me; I've left some minor comments on individual
patches. My main concern is taking this as a fix, because there's an awful
lot here.
Is there anything simpler we can do for 4.15/stable?
Will
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 5/5] arm_pmu: acpi: request IRQs up-front
2017-12-11 17:36 ` Will Deacon
@ 2017-12-11 17:55 ` Mark Rutland
2017-12-11 18:45 ` Will Deacon
0 siblings, 1 reply; 19+ messages in thread
From: Mark Rutland @ 2017-12-11 17:55 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 11, 2017 at 05:36:47PM +0000, Will Deacon wrote:
> On Wed, Nov 01, 2017 at 02:12:39PM +0000, Mark Rutland wrote:
> > We can't request IRQs in atomic context, so for ACPI systems we'll have
> > to request them up-front, and later associate them with CPUs.
> >
> > This patch reorganises the arm_pmu code to do so. As we no longer have
> > the arm_pmu strucutre at probe time, a number of prototypes need to be
> > adjusted, requiring changes to the common arm_pmu code and arm_pmu
> > platform code.
> > +void armpmu_bind_cpu(struct arm_pmu *armpmu, int cpu)
> > {
> > - struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
> > - unsigned long irq_flags;
> > - struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
> > - int irq = per_cpu(hw_events->irq, cpu);
> > - if (!irq)
> > - return 0;
> > -
> > - if (platdata && platdata->irq_flags) {
> > - irq_flags = platdata->irq_flags;
> > - } else {
> > - irq_flags = ARM_PMU_IRQ_FLAGS;
> > - }
> > -
> > per_cpu(cpu_armpmu, cpu) = armpmu;
>
> Can we not make the binding implicit in armpmu_{request,free}_irq?
Unfortunately not.
As mentioned in the commit message (typo and all), in the ACPI case, we
need to request/free IRQs before we know the PMU.
Mark.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags
2017-12-11 17:37 ` Will Deacon
@ 2017-12-11 18:02 ` Mark Rutland
0 siblings, 0 replies; 19+ messages in thread
From: Mark Rutland @ 2017-12-11 18:02 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 11, 2017 at 05:37:02PM +0000, Will Deacon wrote:
> On Wed, Nov 01, 2017 at 02:12:36PM +0000, Mark Rutland wrote:
> > diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
> > index 705f1a390e31..a52f5b673a15 100644
> > --- a/drivers/perf/arm_pmu_acpi.c
> > +++ b/drivers/perf/arm_pmu_acpi.c
> > @@ -127,7 +127,7 @@ static struct arm_pmu *arm_pmu_acpi_find_alloc_pmu(void)
> > return pmu;
> > }
> >
> > - pmu = armpmu_alloc();
> > + pmu = armpmu_alloc(GFP_ATOMIC);
>
> I think I'd rather have armpmu_alloc_atomic as a wrapper around
> __armpmu_alloc(GFP_ATOMIC) and then leave the armpmu_alloc to pass
> GFP_KERNEL.
Sure; done.
Mark.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 3/5] arm_pmu: acpi: check for mismatched PPIs
2017-12-11 17:37 ` Will Deacon
@ 2017-12-11 18:08 ` Mark Rutland
2017-12-11 18:43 ` Will Deacon
0 siblings, 1 reply; 19+ messages in thread
From: Mark Rutland @ 2017-12-11 18:08 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 11, 2017 at 05:37:07PM +0000, Will Deacon wrote:
> On Wed, Nov 01, 2017 at 02:12:37PM +0000, Mark Rutland wrote:
> > The arm_pmu platform code explicitly checks for mismatched PPIs at probe
> > time, while the ACPI code leaves this to the core code. Future
> > refactoring will make this difficult for the core code to check, so
> > let's have the ACPI code check this explicitly.
> >
> > As before, upon a failure we'll continue on without an interrupt. Ho
> > hum.
> >
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > ---
> > drivers/perf/arm_pmu.c | 16 ++++------------
> > drivers/perf/arm_pmu_acpi.c | 42 ++++++++++++++++++++++++++++++++++++++----
> > 2 files changed, 42 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 3d6d4c5f2356..e0242103d904 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -557,18 +557,10 @@ int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
> > if (!irq)
> > return 0;
> >
> > - if (irq_is_percpu_devid(irq) && cpumask_empty(&armpmu->active_irqs)) {
> > - err = request_percpu_irq(irq, handler, "arm-pmu",
> > - &hw_events->percpu_pmu);
> > - } else if (irq_is_percpu_devid(irq)) {
> > - int other_cpu = cpumask_first(&armpmu->active_irqs);
> > - int other_irq = per_cpu(hw_events->irq, other_cpu);
> > -
> > - if (irq != other_irq) {
> > - pr_warn("mismatched PPIs detected.\n");
> > - err = -EINVAL;
> > - goto err_out;
> > - }
> > + if (irq_is_percpu_devid(irq)) {
> > + if (cpumask_empty(&armpmu->active_irqs))
>
> Why not leave this as before, with a '&&' operator?
Because then we'd fall into the else case (for SPIs), were the
active_irqs mask empty.
Previously, that would have been caught by the irq_is_percpu_devid(irq)
case that got removed.
I can instead make this:
if (irq_is_percpu_devid(irq) && cpumask_empty(&armpmu->active_irqs)) {
err = request_percpu_irq(irq, handler, "arm-pmu",
&hw_events->percpu_pmu);
} else if (irq_is_percpu_devid(irq)) {
/* nothing to do */
} else {
< SPI case >
}
... but that seemed more painful to read.
Mark.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 4/5] arm_pmu: note IRQs/PMUs per-cpu
2017-12-11 17:36 ` Will Deacon
@ 2017-12-11 18:15 ` Mark Rutland
0 siblings, 0 replies; 19+ messages in thread
From: Mark Rutland @ 2017-12-11 18:15 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 11, 2017 at 05:36:58PM +0000, Will Deacon wrote:
> On Wed, Nov 01, 2017 at 02:12:38PM +0000, Mark Rutland wrote:
> > @@ -531,40 +530,56 @@ int perf_num_counters(void)
> > }
> > EXPORT_SYMBOL_GPL(perf_num_counters);
> >
> > -void armpmu_free_irq(struct arm_pmu *armpmu, int cpu)
> > +int armpmu_count_irq_users(const int irq)
> > {
> > - struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
> > - int irq = per_cpu(hw_events->irq, cpu);
> > + int cpu, count = 0;
> > +
> > + for_each_possible_cpu(cpu) {
> > + if (per_cpu(cpu_irq, cpu) == irq)
> > + count++;
> > + }
> > +
> > + return count;
> > +}
> >
> > - if (!cpumask_test_and_clear_cpu(cpu, &armpmu->active_irqs))
> > +void __armpmu_free_irq(int irq, int cpu)
> > +{
> > + if (per_cpu(cpu_irq, cpu) == 0)
> > + return;
> > + if (WARN_ON(irq != per_cpu(cpu_irq, cpu)))
> > return;
> >
> > if (irq_is_percpu_devid(irq)) {
> > - free_percpu_irq(irq, &hw_events->percpu_pmu);
> > - cpumask_clear(&armpmu->active_irqs);
> > - return;
> > + if (armpmu_count_irq_users(irq) == 1)
> > + free_percpu_irq(irq, &cpu_armpmu);
>
> Do you actually need the count, or could you just free the irq the first
> time this is called and set all of the cpu_irqs to 0?
It might be safe to blat all the matching cpu_irq entries. I'll take a
look.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 3/5] arm_pmu: acpi: check for mismatched PPIs
2017-12-11 18:08 ` Mark Rutland
@ 2017-12-11 18:43 ` Will Deacon
0 siblings, 0 replies; 19+ messages in thread
From: Will Deacon @ 2017-12-11 18:43 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 11, 2017 at 06:08:31PM +0000, Mark Rutland wrote:
> On Mon, Dec 11, 2017 at 05:37:07PM +0000, Will Deacon wrote:
> > On Wed, Nov 01, 2017 at 02:12:37PM +0000, Mark Rutland wrote:
> > > The arm_pmu platform code explicitly checks for mismatched PPIs at probe
> > > time, while the ACPI code leaves this to the core code. Future
> > > refactoring will make this difficult for the core code to check, so
> > > let's have the ACPI code check this explicitly.
> > >
> > > As before, upon a failure we'll continue on without an interrupt. Ho
> > > hum.
> > >
> > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Will Deacon <will.deacon@arm.com>
> > > ---
> > > drivers/perf/arm_pmu.c | 16 ++++------------
> > > drivers/perf/arm_pmu_acpi.c | 42 ++++++++++++++++++++++++++++++++++++++----
> > > 2 files changed, 42 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > > index 3d6d4c5f2356..e0242103d904 100644
> > > --- a/drivers/perf/arm_pmu.c
> > > +++ b/drivers/perf/arm_pmu.c
> > > @@ -557,18 +557,10 @@ int armpmu_request_irq(struct arm_pmu *armpmu, int cpu)
> > > if (!irq)
> > > return 0;
> > >
> > > - if (irq_is_percpu_devid(irq) && cpumask_empty(&armpmu->active_irqs)) {
> > > - err = request_percpu_irq(irq, handler, "arm-pmu",
> > > - &hw_events->percpu_pmu);
> > > - } else if (irq_is_percpu_devid(irq)) {
> > > - int other_cpu = cpumask_first(&armpmu->active_irqs);
> > > - int other_irq = per_cpu(hw_events->irq, other_cpu);
> > > -
> > > - if (irq != other_irq) {
> > > - pr_warn("mismatched PPIs detected.\n");
> > > - err = -EINVAL;
> > > - goto err_out;
> > > - }
> > > + if (irq_is_percpu_devid(irq)) {
> > > + if (cpumask_empty(&armpmu->active_irqs))
> >
> > Why not leave this as before, with a '&&' operator?
>
> Because then we'd fall into the else case (for SPIs), were the
> active_irqs mask empty.
>
> Previously, that would have been caught by the irq_is_percpu_devid(irq)
> case that got removed.
>
> I can instead make this:
>
> if (irq_is_percpu_devid(irq) && cpumask_empty(&armpmu->active_irqs)) {
> err = request_percpu_irq(irq, handler, "arm-pmu",
> &hw_events->percpu_pmu);
> } else if (irq_is_percpu_devid(irq)) {
> /* nothing to do */
> } else {
> < SPI case >
> }
>
> ... but that seemed more painful to read.
Yeah, that's crazy :)
How about:
if (!irq_is_percpu_devid(irq)) {
/* SPI case */
} else if (cpumask_empty(&armpmu->active_irqs)) {
/* PPI case */
}
?
Will
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 5/5] arm_pmu: acpi: request IRQs up-front
2017-12-11 17:55 ` Mark Rutland
@ 2017-12-11 18:45 ` Will Deacon
0 siblings, 0 replies; 19+ messages in thread
From: Will Deacon @ 2017-12-11 18:45 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 11, 2017 at 05:55:40PM +0000, Mark Rutland wrote:
> On Mon, Dec 11, 2017 at 05:36:47PM +0000, Will Deacon wrote:
> > On Wed, Nov 01, 2017 at 02:12:39PM +0000, Mark Rutland wrote:
> > > We can't request IRQs in atomic context, so for ACPI systems we'll have
> > > to request them up-front, and later associate them with CPUs.
> > >
> > > This patch reorganises the arm_pmu code to do so. As we no longer have
> > > the arm_pmu strucutre at probe time, a number of prototypes need to be
> > > adjusted, requiring changes to the common arm_pmu code and arm_pmu
> > > platform code.
>
> > > +void armpmu_bind_cpu(struct arm_pmu *armpmu, int cpu)
> > > {
> > > - struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
> > > - unsigned long irq_flags;
> > > - struct pmu_hw_events __percpu *hw_events = armpmu->hw_events;
> > > - int irq = per_cpu(hw_events->irq, cpu);
> > > - if (!irq)
> > > - return 0;
> > > -
> > > - if (platdata && platdata->irq_flags) {
> > > - irq_flags = platdata->irq_flags;
> > > - } else {
> > > - irq_flags = ARM_PMU_IRQ_FLAGS;
> > > - }
> > > -
> > > per_cpu(cpu_armpmu, cpu) = armpmu;
> >
> > Can we not make the binding implicit in armpmu_{request,free}_irq?
>
> Unfortunately not.
>
> As mentioned in the commit message (typo and all), in the ACPI case, we
> need to request/free IRQs before we know the PMU.
Urgh. This is hideous! Just try reading the imnplementation of
armpmu_bind_cpu out loud.
Could we use {enable,disable}_irq in the hotplug notifier and request the
interrupts with NOAUTOEN instead? That would mean we have a similar flow
for SPI and PPIs and could potentially hide some of the book-keeping behind
armpmu_{enable,disable}_irq functions.
Will
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2017-12-11 18:45 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-01 14:12 [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Mark Rutland
2017-11-01 14:12 ` [PATCH 1/5] arm_pmu: fold platform helpers into platform code Mark Rutland
2017-11-01 14:12 ` [PATCH 2/5] arm_pmu: have armpmu_alloc() take GFP flags Mark Rutland
2017-12-06 6:54 ` Zhangshaokun
2017-12-11 17:37 ` Will Deacon
2017-12-11 18:02 ` Mark Rutland
2017-11-01 14:12 ` [PATCH 3/5] arm_pmu: acpi: check for mismatched PPIs Mark Rutland
2017-12-11 17:37 ` Will Deacon
2017-12-11 18:08 ` Mark Rutland
2017-12-11 18:43 ` Will Deacon
2017-11-01 14:12 ` [PATCH 4/5] arm_pmu: note IRQs/PMUs per-cpu Mark Rutland
2017-12-11 17:36 ` Will Deacon
2017-12-11 18:15 ` Mark Rutland
2017-11-01 14:12 ` [PATCH 5/5] arm_pmu: acpi: request IRQs up-front Mark Rutland
2017-12-11 17:36 ` Will Deacon
2017-12-11 17:55 ` Mark Rutland
2017-12-11 18:45 ` Will Deacon
2017-11-01 17:02 ` [PATCH 0/5] arm_pmu: fix lockdep issues with ACPI systems Tyler Baicar
2017-12-11 17:38 ` Will Deacon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).