linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/10] perf: Avoid placing cpumask var on stack
@ 2024-04-03 12:50 Dawei Li
  2024-04-03 12:51 ` [PATCH v2 01/10] cpumask: add cpumask_any_and_but() Dawei Li
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:50 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

Hi all,

This is v2 of [1] and [2] which basically eliminate cpumask var allocation
on stack for perf subsystem.

Change since v1:
- Change from dynamic allocation to a temporary var free helper:
  cpumask_any_and_but().	[Mark]

- Some minor coding style improvements, reverse chrismas tree e.g.

- For cpumask_any_and_but() itself:
  - Moved to cpumask.h, just like other helpers.
  - Return value converted to unsigned int.
  - Remove EXPORT_SYMBOL, for obvious reason.

[1]:
https://lore.kernel.org/lkml/20240402105610.1695644-1-dawei.li@shingroup.cn/

[2]:
https://lore.kernel.org/lkml/1486381132-5610-1-git-send-email-mark.rutland@arm.com/

Dawei Li (9):
  perf/alibaba_uncore_drw: Avoid placing cpumask var on stack
  perf/arm-cmn: Avoid placing cpumask var on stack
  perf/arm_cspmu: Avoid placing cpumask var on stack
  perf/arm_dsu: Avoid placing cpumask var on stack
  perf/dwc_pcie: Avoid placing cpumask var on stack
  perf/hisi_pcie: Avoid placing cpumask var on stack
  perf/hisi_uncore: Avoid placing cpumask var on stack
  perf/qcom_l2: Avoid placing cpumask var on stack
  perf/thunderx2: Avoid placing cpumask var on stack

Mark Rutland (1):
  cpumask: add cpumask_any_and_but()

 drivers/perf/alibaba_uncore_drw_pmu.c    | 10 +++-------
 drivers/perf/arm-cmn.c                   | 10 +++++-----
 drivers/perf/arm_cspmu/arm_cspmu.c       |  8 +++-----
 drivers/perf/arm_dsu_pmu.c               | 19 ++++++-------------
 drivers/perf/dwc_pcie_pmu.c              | 10 ++++------
 drivers/perf/hisilicon/hisi_pcie_pmu.c   |  9 ++++-----
 drivers/perf/hisilicon/hisi_uncore_pmu.c |  6 ++----
 drivers/perf/qcom_l2_pmu.c               |  8 +++-----
 drivers/perf/thunderx2_pmu.c             | 10 +++-------
 include/linux/cpumask.h                  | 23 +++++++++++++++++++++++
 10 files changed, 56 insertions(+), 57 deletions(-)

Thanks,

    Dawei

-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2 01/10] cpumask: add cpumask_any_and_but()
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 12:51 ` [PATCH v2 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask var on stack Dawei Li
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Thomas Gleixner, Andrew Morton, Peter Zijlstra, Rusty Russell,
	Dawei Li

From: Mark Rutland <mark.rutland@arm.com>

In some cases, it's useful to be able to select a random cpu from the
intersection of two masks, excluding a particular CPU.

For example, in some systems an uncore PMU is shared by a subset of
CPUs, and management of this PMU is assigned to some arbitrary CPU in
this set. Whenever the management CPU is hotplugged out, we wish to
migrate responsibility to another arbitrary CPU which is both in this
set and online.

Today we can use cpumask_any_and() to select an arbitrary CPU in the
intersection of two masks. We can also use cpumask_any_but() to select
any arbitrary cpu in a mask excluding, a particular CPU.

To do both, we either need to use a temporary cpumask, which is
wasteful, or use some lower-level cpumask helpers, which can be unclear.

This patch adds a new cpumask_any_and_but() to cater for these cases.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 include/linux/cpumask.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 1c29947db848..121f3ac757ff 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -388,6 +388,29 @@ unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
 	return i;
 }
 
+/**
+ * cpumask_any_and_but - pick a "random" cpu from *mask1 & *mask2, but not this one.
+ * @mask1: the first input cpumask
+ * @mask2: the second input cpumask
+ * @cpu: the cpu to ignore
+ *
+ * Returns >= nr_cpu_ids if no cpus set.
+ */
+static inline
+unsigned int cpumask_any_and_but(const struct cpumask *mask1,
+				 const struct cpumask *mask2,
+				 unsigned int cpu)
+{
+	unsigned int i;
+
+	cpumask_check(cpu);
+	i = cpumask_first_and(mask1, mask2);
+	if (i != cpu)
+		return i;
+
+	return cpumask_next_and(cpu, mask1, mask2);
+}
+
 /**
  * cpumask_nth - get the Nth cpu in a cpumask
  * @srcp: the cpumask pointer
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
  2024-04-03 12:51 ` [PATCH v2 01/10] cpumask: add cpumask_any_and_but() Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:16   ` Mark Rutland
  2024-04-03 12:51 ` [PATCH v2 03/10] perf/arm-cmn: " Dawei Li
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/alibaba_uncore_drw_pmu.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
index a9277dcf90ce..d4d14b65c4a5 100644
--- a/drivers/perf/alibaba_uncore_drw_pmu.c
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -746,18 +746,14 @@ static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 	struct ali_drw_pmu_irq *irq;
 	struct ali_drw_pmu *drw_pmu;
 	unsigned int target;
-	int ret;
-	cpumask_t node_online_cpus;
 
 	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
 	if (cpu != irq->cpu)
 		return 0;
 
-	ret = cpumask_and(&node_online_cpus,
-			  cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
-	if (ret)
-		target = cpumask_any_but(&node_online_cpus, cpu);
-	else
+	target = cpumask_any_and_but(cpumask_of_node(cpu_to_node(cpu)),
+				     cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
 		target = cpumask_any_but(cpu_online_mask, cpu);
 
 	if (target >= nr_cpu_ids)
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 03/10] perf/arm-cmn: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
  2024-04-03 12:51 ` [PATCH v2 01/10] cpumask: add cpumask_any_and_but() Dawei Li
  2024-04-03 12:51 ` [PATCH v2 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask var on stack Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:23   ` Mark Rutland
  2024-04-03 12:51 ` [PATCH v2 04/10] perf/arm_cspmu: " Dawei Li
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/arm-cmn.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index 7ef9c7e4836b..6bfb0c4a1287 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -1950,20 +1950,20 @@ static int arm_cmn_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_no
 	struct arm_cmn *cmn;
 	unsigned int target;
 	int node;
-	cpumask_t mask;
 
 	cmn = hlist_entry_safe(cpuhp_node, struct arm_cmn, cpuhp_node);
 	if (cpu != cmn->cpu)
 		return 0;
 
 	node = dev_to_node(cmn->dev);
-	if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) &&
-	    cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
-		target = cpumask_any(&mask);
-	else
+
+	target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
 		target = cpumask_any_but(cpu_online_mask, cpu);
+
 	if (target < nr_cpu_ids)
 		arm_cmn_migrate(cmn, target);
+
 	return 0;
 }
 
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 04/10] perf/arm_cspmu: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
                   ` (2 preceding siblings ...)
  2024-04-03 12:51 ` [PATCH v2 03/10] perf/arm-cmn: " Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:21   ` Mark Rutland
  2024-04-03 12:51 ` [PATCH v2 05/10] perf/arm_dsu: " Dawei Li
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index b9a252272f1e..fd1004251665 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -1322,8 +1322,7 @@ static int arm_cspmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 
 static int arm_cspmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
 {
-	int dst;
-	struct cpumask online_supported;
+	unsigned int dst;
 
 	struct arm_cspmu *cspmu =
 		hlist_entry_safe(node, struct arm_cspmu, cpuhp_node);
@@ -1333,9 +1332,8 @@ static int arm_cspmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
 		return 0;
 
 	/* Choose a new CPU to migrate ownership of the PMU to */
-	cpumask_and(&online_supported, &cspmu->associated_cpus,
-		    cpu_online_mask);
-	dst = cpumask_any_but(&online_supported, cpu);
+	dst = cpumask_any_and_but(&cspmu->associated_cpus,
+				  cpu_online_mask, cpu);
 	if (dst >= nr_cpu_ids)
 		return 0;
 
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 05/10] perf/arm_dsu: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
                   ` (3 preceding siblings ...)
  2024-04-03 12:51 ` [PATCH v2 04/10] perf/arm_cspmu: " Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:32   ` Mark Rutland
  2024-04-03 12:51 ` [PATCH v2 06/10] perf/dwc_pcie: " Dawei Li
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/arm_dsu_pmu.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index bae3ca37f846..adc0bbb5fafe 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -230,15 +230,6 @@ static const struct attribute_group *dsu_pmu_attr_groups[] = {
 	NULL,
 };
 
-static int dsu_pmu_get_online_cpu_any_but(struct dsu_pmu *dsu_pmu, int cpu)
-{
-	struct cpumask online_supported;
-
-	cpumask_and(&online_supported,
-			 &dsu_pmu->associated_cpus, cpu_online_mask);
-	return cpumask_any_but(&online_supported, cpu);
-}
-
 static inline bool dsu_pmu_counter_valid(struct dsu_pmu *dsu_pmu, u32 idx)
 {
 	return (idx < dsu_pmu->num_counters) ||
@@ -827,14 +818,16 @@ static int dsu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 
 static int dsu_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
 {
-	int dst;
-	struct dsu_pmu *dsu_pmu = hlist_entry_safe(node, struct dsu_pmu,
-						   cpuhp_node);
+	struct dsu_pmu *dsu_pmu;
+	unsigned int dst;
+
+	dsu_pmu = hlist_entry_safe(node, struct dsu_pmu, cpuhp_node);
 
 	if (!cpumask_test_and_clear_cpu(cpu, &dsu_pmu->active_cpu))
 		return 0;
 
-	dst = dsu_pmu_get_online_cpu_any_but(dsu_pmu, cpu);
+	dst = cpumask_any_and_but(&dsu_pmu->associated_cpus,
+				  cpu_online_mask, cpu);
 	/* If there are no active CPUs in the DSU, leave IRQ disabled */
 	if (dst >= nr_cpu_ids)
 		return 0;
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 06/10] perf/dwc_pcie: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
                   ` (4 preceding siblings ...)
  2024-04-03 12:51 ` [PATCH v2 05/10] perf/arm_dsu: " Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:33   ` Mark Rutland
  2024-04-03 12:51 ` [PATCH v2 07/10] perf/hisi_pcie: " Dawei Li
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/dwc_pcie_pmu.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/perf/dwc_pcie_pmu.c b/drivers/perf/dwc_pcie_pmu.c
index 957058ad0099..c5e328f23841 100644
--- a/drivers/perf/dwc_pcie_pmu.c
+++ b/drivers/perf/dwc_pcie_pmu.c
@@ -690,9 +690,8 @@ static int dwc_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_n
 {
 	struct dwc_pcie_pmu *pcie_pmu;
 	struct pci_dev *pdev;
-	int node;
-	cpumask_t mask;
 	unsigned int target;
+	int node;
 
 	pcie_pmu = hlist_entry_safe(cpuhp_node, struct dwc_pcie_pmu, cpuhp_node);
 	/* Nothing to do if this CPU doesn't own the PMU */
@@ -702,10 +701,9 @@ static int dwc_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_n
 	pcie_pmu->on_cpu = -1;
 	pdev = pcie_pmu->pdev;
 	node = dev_to_node(&pdev->dev);
-	if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) &&
-	    cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
-		target = cpumask_any(&mask);
-	else
+
+	target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
 		target = cpumask_any_but(cpu_online_mask, cpu);
 
 	if (target >= nr_cpu_ids) {
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 07/10] perf/hisi_pcie: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
                   ` (5 preceding siblings ...)
  2024-04-03 12:51 ` [PATCH v2 06/10] perf/dwc_pcie: " Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:35   ` Mark Rutland
  2024-04-03 12:51 ` [PATCH v2 08/10] perf/hisi_uncore: " Dawei Li
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/hisilicon/hisi_pcie_pmu.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index 5d1f0e9fdb08..06b192cc31d5 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -673,7 +673,6 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 {
 	struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
 	unsigned int target;
-	cpumask_t mask;
 	int numa_node;
 
 	/* Nothing to do if this CPU doesn't own the PMU */
@@ -684,10 +683,10 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 
 	/* Choose a local CPU from all online cpus. */
 	numa_node = dev_to_node(&pcie_pmu->pdev->dev);
-	if (cpumask_and(&mask, cpumask_of_node(numa_node), cpu_online_mask) &&
-	    cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
-		target = cpumask_any(&mask);
-	else
+
+	target = cpumask_any_and_but(cpumask_of_node(numa_node),
+				     cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
 		target = cpumask_any_but(cpu_online_mask, cpu);
 
 	if (target >= nr_cpu_ids) {
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 08/10] perf/hisi_uncore: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
                   ` (6 preceding siblings ...)
  2024-04-03 12:51 ` [PATCH v2 07/10] perf/hisi_pcie: " Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:35   ` Mark Rutland
  2024-04-03 12:51 ` [PATCH v2 09/10] perf/qcom_l2: " Dawei Li
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index 04031450d5fe..ccc9191ad1b6 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -504,7 +504,6 @@ int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 {
 	struct hisi_pmu *hisi_pmu = hlist_entry_safe(node, struct hisi_pmu,
 						     node);
-	cpumask_t pmu_online_cpus;
 	unsigned int target;
 
 	if (!cpumask_test_and_clear_cpu(cpu, &hisi_pmu->associated_cpus))
@@ -518,9 +517,8 @@ int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 	hisi_pmu->on_cpu = -1;
 
 	/* Choose a new CPU to migrate ownership of the PMU to */
-	cpumask_and(&pmu_online_cpus, &hisi_pmu->associated_cpus,
-		    cpu_online_mask);
-	target = cpumask_any_but(&pmu_online_cpus, cpu);
+	target = cpumask_any_and_but(&hisi_pmu->associated_cpus,
+				     cpu_online_mask, cpu);
 	if (target >= nr_cpu_ids)
 		return 0;
 
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 09/10] perf/qcom_l2: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
                   ` (7 preceding siblings ...)
  2024-04-03 12:51 ` [PATCH v2 08/10] perf/hisi_uncore: " Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:36   ` Mark Rutland
  2024-04-03 12:51 ` [PATCH v2 10/10] perf/thunderx2: " Dawei Li
  2024-04-03 14:41 ` [PATCH v2 00/10] perf: " Mark Rutland
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/qcom_l2_pmu.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/qcom_l2_pmu.c b/drivers/perf/qcom_l2_pmu.c
index 148df5ae8ef8..b5a44dc1dc3a 100644
--- a/drivers/perf/qcom_l2_pmu.c
+++ b/drivers/perf/qcom_l2_pmu.c
@@ -801,9 +801,8 @@ static int l2cache_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
 
 static int l2cache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 {
-	struct cluster_pmu *cluster;
 	struct l2cache_pmu *l2cache_pmu;
-	cpumask_t cluster_online_cpus;
+	struct cluster_pmu *cluster;
 	unsigned int target;
 
 	l2cache_pmu = hlist_entry_safe(node, struct l2cache_pmu, node);
@@ -820,9 +819,8 @@ static int l2cache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 	cluster->on_cpu = -1;
 
 	/* Any other CPU for this cluster which is still online */
-	cpumask_and(&cluster_online_cpus, &cluster->cluster_cpus,
-		    cpu_online_mask);
-	target = cpumask_any_but(&cluster_online_cpus, cpu);
+	target = cpumask_any_and_but(&cluster->cluster_cpus,
+				     cpu_online_mask, cpu);
 	if (target >= nr_cpu_ids) {
 		disable_irq(cluster->irq);
 		return 0;
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 10/10] perf/thunderx2: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
                   ` (8 preceding siblings ...)
  2024-04-03 12:51 ` [PATCH v2 09/10] perf/qcom_l2: " Dawei Li
@ 2024-04-03 12:51 ` Dawei Li
  2024-04-03 14:38   ` Mark Rutland
  2024-04-03 14:41 ` [PATCH v2 00/10] perf: " Mark Rutland
  10 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
  To: will, mark.rutland, yury.norov, linux
  Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
	konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Dawei Li

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

But dynamic allocation in cpuhp's teardown callback is somewhat problematic
for if allocation fails(which is unlikely but still possible):
- If -ENOMEM is returned to caller, kernel crashes for non-bringup
  teardown;
- If callback pretends nothing happened and returns 0 to caller, it may
  trap system into an in-consisitent/compromised state;

Use newly-introduced cpumask_any_and_but() to address all issues above.
It eliminates usage of temporary cpumask var in generic way, no matter how
the cpumask var is allocated.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/perf/thunderx2_pmu.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c
index e16d10c763de..b3377b662ffc 100644
--- a/drivers/perf/thunderx2_pmu.c
+++ b/drivers/perf/thunderx2_pmu.c
@@ -932,9 +932,8 @@ static int tx2_uncore_pmu_online_cpu(unsigned int cpu,
 static int tx2_uncore_pmu_offline_cpu(unsigned int cpu,
 		struct hlist_node *hpnode)
 {
-	int new_cpu;
 	struct tx2_uncore_pmu *tx2_pmu;
-	struct cpumask cpu_online_mask_temp;
+	unsigned int new_cpu;
 
 	tx2_pmu = hlist_entry_safe(hpnode,
 			struct tx2_uncore_pmu, hpnode);
@@ -945,11 +944,8 @@ static int tx2_uncore_pmu_offline_cpu(unsigned int cpu,
 	if (tx2_pmu->hrtimer_callback)
 		hrtimer_cancel(&tx2_pmu->hrtimer);
 
-	cpumask_copy(&cpu_online_mask_temp, cpu_online_mask);
-	cpumask_clear_cpu(cpu, &cpu_online_mask_temp);
-	new_cpu = cpumask_any_and(
-			cpumask_of_node(tx2_pmu->node),
-			&cpu_online_mask_temp);
+	new_cpu = cpumask_any_and_but(cpumask_of_node(tx2_pmu->node),
+				      cpu_online_mask, cpu);
 
 	tx2_pmu->cpu = new_cpu;
 	if (new_cpu >= nr_cpu_ids)
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask var on stack Dawei Li
@ 2024-04-03 14:16   ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:16 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:01PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
>
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

I don't think we need to explain all the pitfalls of the approach we haven't
taken. Could we please simplify this down to:

Could we please get rid of the bit that says we should "always use the
*cpumask_var API(s)", and simplify the commit message down to:

| perf/alibaba_uncore_drw: Avoid placing cpumask on the stack
| 
| In general it's preferable to avoid placing cpumasks on the stack, as
| for large values of NR_CPUS these can consume significant amounts of
| stack space and make stack overflows more likely.
| 
| Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
| the stack.

The logic looks good to me, so with that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/alibaba_uncore_drw_pmu.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
> index a9277dcf90ce..d4d14b65c4a5 100644
> --- a/drivers/perf/alibaba_uncore_drw_pmu.c
> +++ b/drivers/perf/alibaba_uncore_drw_pmu.c
> @@ -746,18 +746,14 @@ static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  	struct ali_drw_pmu_irq *irq;
>  	struct ali_drw_pmu *drw_pmu;
>  	unsigned int target;
> -	int ret;
> -	cpumask_t node_online_cpus;
>  
>  	irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
>  	if (cpu != irq->cpu)
>  		return 0;
>  
> -	ret = cpumask_and(&node_online_cpus,
> -			  cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
> -	if (ret)
> -		target = cpumask_any_but(&node_online_cpus, cpu);
> -	else
> +	target = cpumask_any_and_but(cpumask_of_node(cpu_to_node(cpu)),
> +				     cpu_online_mask, cpu);
> +	if (target >= nr_cpu_ids)
>  		target = cpumask_any_but(cpu_online_mask, cpu);
>  
>  	if (target >= nr_cpu_ids)
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 04/10] perf/arm_cspmu: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 04/10] perf/arm_cspmu: " Dawei Li
@ 2024-04-03 14:21   ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:21 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:03PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
> 
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> 
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

The logic looks good to me, but I'd like the commit message updated the same as
per my comment on patch 2.

With that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/arm_cspmu/arm_cspmu.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
> index b9a252272f1e..fd1004251665 100644
> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> @@ -1322,8 +1322,7 @@ static int arm_cspmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>  
>  static int arm_cspmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
>  {
> -	int dst;
> -	struct cpumask online_supported;
> +	unsigned int dst;
>  
>  	struct arm_cspmu *cspmu =
>  		hlist_entry_safe(node, struct arm_cspmu, cpuhp_node);
> @@ -1333,9 +1332,8 @@ static int arm_cspmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
>  		return 0;
>  
>  	/* Choose a new CPU to migrate ownership of the PMU to */
> -	cpumask_and(&online_supported, &cspmu->associated_cpus,
> -		    cpu_online_mask);
> -	dst = cpumask_any_but(&online_supported, cpu);
> +	dst = cpumask_any_and_but(&cspmu->associated_cpus,
> +				  cpu_online_mask, cpu);
>  	if (dst >= nr_cpu_ids)
>  		return 0;
>  
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 03/10] perf/arm-cmn: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 03/10] perf/arm-cmn: " Dawei Li
@ 2024-04-03 14:23   ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:23 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:02PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
> 
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

The logic looks good to me, but I'd like the commit message updated the same as
per my comment on patch 2.

With that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/arm-cmn.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> index 7ef9c7e4836b..6bfb0c4a1287 100644
> --- a/drivers/perf/arm-cmn.c
> +++ b/drivers/perf/arm-cmn.c
> @@ -1950,20 +1950,20 @@ static int arm_cmn_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_no
>  	struct arm_cmn *cmn;
>  	unsigned int target;
>  	int node;
> -	cpumask_t mask;
>  
>  	cmn = hlist_entry_safe(cpuhp_node, struct arm_cmn, cpuhp_node);
>  	if (cpu != cmn->cpu)
>  		return 0;
>  
>  	node = dev_to_node(cmn->dev);
> -	if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) &&
> -	    cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
> -		target = cpumask_any(&mask);
> -	else
> +
> +	target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
> +	if (target >= nr_cpu_ids)
>  		target = cpumask_any_but(cpu_online_mask, cpu);
> +
>  	if (target < nr_cpu_ids)
>  		arm_cmn_migrate(cmn, target);
> +
>  	return 0;
>  }
>  
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 05/10] perf/arm_dsu: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 05/10] perf/arm_dsu: " Dawei Li
@ 2024-04-03 14:32   ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:32 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:04PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
> 
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

The logic looks good to me, but I'd like the commit message updated the same as
per my comment on patch 2.

With that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/arm_dsu_pmu.c | 19 ++++++-------------
>  1 file changed, 6 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
> index bae3ca37f846..adc0bbb5fafe 100644
> --- a/drivers/perf/arm_dsu_pmu.c
> +++ b/drivers/perf/arm_dsu_pmu.c
> @@ -230,15 +230,6 @@ static const struct attribute_group *dsu_pmu_attr_groups[] = {
>  	NULL,
>  };
>  
> -static int dsu_pmu_get_online_cpu_any_but(struct dsu_pmu *dsu_pmu, int cpu)
> -{
> -	struct cpumask online_supported;
> -
> -	cpumask_and(&online_supported,
> -			 &dsu_pmu->associated_cpus, cpu_online_mask);
> -	return cpumask_any_but(&online_supported, cpu);
> -}
> -
>  static inline bool dsu_pmu_counter_valid(struct dsu_pmu *dsu_pmu, u32 idx)
>  {
>  	return (idx < dsu_pmu->num_counters) ||
> @@ -827,14 +818,16 @@ static int dsu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>  
>  static int dsu_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
>  {
> -	int dst;
> -	struct dsu_pmu *dsu_pmu = hlist_entry_safe(node, struct dsu_pmu,
> -						   cpuhp_node);
> +	struct dsu_pmu *dsu_pmu;
> +	unsigned int dst;
> +
> +	dsu_pmu = hlist_entry_safe(node, struct dsu_pmu, cpuhp_node);
>  
>  	if (!cpumask_test_and_clear_cpu(cpu, &dsu_pmu->active_cpu))
>  		return 0;
>  
> -	dst = dsu_pmu_get_online_cpu_any_but(dsu_pmu, cpu);
> +	dst = cpumask_any_and_but(&dsu_pmu->associated_cpus,
> +				  cpu_online_mask, cpu);
>  	/* If there are no active CPUs in the DSU, leave IRQ disabled */
>  	if (dst >= nr_cpu_ids)
>  		return 0;
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 06/10] perf/dwc_pcie: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 06/10] perf/dwc_pcie: " Dawei Li
@ 2024-04-03 14:33   ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:33 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:05PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
> 
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

The logic looks good to me, but I'd like the commit message updated the same as
per my comment on patch 2.

With that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/dwc_pcie_pmu.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/perf/dwc_pcie_pmu.c b/drivers/perf/dwc_pcie_pmu.c
> index 957058ad0099..c5e328f23841 100644
> --- a/drivers/perf/dwc_pcie_pmu.c
> +++ b/drivers/perf/dwc_pcie_pmu.c
> @@ -690,9 +690,8 @@ static int dwc_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_n
>  {
>  	struct dwc_pcie_pmu *pcie_pmu;
>  	struct pci_dev *pdev;
> -	int node;
> -	cpumask_t mask;
>  	unsigned int target;
> +	int node;
>  
>  	pcie_pmu = hlist_entry_safe(cpuhp_node, struct dwc_pcie_pmu, cpuhp_node);
>  	/* Nothing to do if this CPU doesn't own the PMU */
> @@ -702,10 +701,9 @@ static int dwc_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_n
>  	pcie_pmu->on_cpu = -1;
>  	pdev = pcie_pmu->pdev;
>  	node = dev_to_node(&pdev->dev);
> -	if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) &&
> -	    cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
> -		target = cpumask_any(&mask);
> -	else
> +
> +	target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
> +	if (target >= nr_cpu_ids)
>  		target = cpumask_any_but(cpu_online_mask, cpu);
>  
>  	if (target >= nr_cpu_ids) {
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 07/10] perf/hisi_pcie: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 07/10] perf/hisi_pcie: " Dawei Li
@ 2024-04-03 14:35   ` Mark Rutland
  2024-04-05  9:52     ` Jonathan Cameron
  0 siblings, 1 reply; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:35 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:06PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
> 
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

The logic looks good to me, but I'd like the commit message updated the same as
per my comment on patch 2.

With that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/hisilicon/hisi_pcie_pmu.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> index 5d1f0e9fdb08..06b192cc31d5 100644
> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> @@ -673,7 +673,6 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  {
>  	struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
>  	unsigned int target;
> -	cpumask_t mask;
>  	int numa_node;
>  
>  	/* Nothing to do if this CPU doesn't own the PMU */
> @@ -684,10 +683,10 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  
>  	/* Choose a local CPU from all online cpus. */
>  	numa_node = dev_to_node(&pcie_pmu->pdev->dev);
> -	if (cpumask_and(&mask, cpumask_of_node(numa_node), cpu_online_mask) &&
> -	    cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
> -		target = cpumask_any(&mask);
> -	else
> +
> +	target = cpumask_any_and_but(cpumask_of_node(numa_node),
> +				     cpu_online_mask, cpu);
> +	if (target >= nr_cpu_ids)
>  		target = cpumask_any_but(cpu_online_mask, cpu);
>  
>  	if (target >= nr_cpu_ids) {
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 08/10] perf/hisi_uncore: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 08/10] perf/hisi_uncore: " Dawei Li
@ 2024-04-03 14:35   ` Mark Rutland
  2024-04-05  9:53     ` Jonathan Cameron
  0 siblings, 1 reply; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:35 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:07PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
> 
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

The logic looks good to me, but I'd like the commit message updated the same as
per my comment on patch 2.

With that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/hisilicon/hisi_uncore_pmu.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> index 04031450d5fe..ccc9191ad1b6 100644
> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> @@ -504,7 +504,6 @@ int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  {
>  	struct hisi_pmu *hisi_pmu = hlist_entry_safe(node, struct hisi_pmu,
>  						     node);
> -	cpumask_t pmu_online_cpus;
>  	unsigned int target;
>  
>  	if (!cpumask_test_and_clear_cpu(cpu, &hisi_pmu->associated_cpus))
> @@ -518,9 +517,8 @@ int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  	hisi_pmu->on_cpu = -1;
>  
>  	/* Choose a new CPU to migrate ownership of the PMU to */
> -	cpumask_and(&pmu_online_cpus, &hisi_pmu->associated_cpus,
> -		    cpu_online_mask);
> -	target = cpumask_any_but(&pmu_online_cpus, cpu);
> +	target = cpumask_any_and_but(&hisi_pmu->associated_cpus,
> +				     cpu_online_mask, cpu);
>  	if (target >= nr_cpu_ids)
>  		return 0;
>  
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 09/10] perf/qcom_l2: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 09/10] perf/qcom_l2: " Dawei Li
@ 2024-04-03 14:36   ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:36 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:08PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
> 
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

The logic looks good to me, but I'd like the commit message updated the same as
per my comment on patch 2.

With that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/qcom_l2_pmu.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/perf/qcom_l2_pmu.c b/drivers/perf/qcom_l2_pmu.c
> index 148df5ae8ef8..b5a44dc1dc3a 100644
> --- a/drivers/perf/qcom_l2_pmu.c
> +++ b/drivers/perf/qcom_l2_pmu.c
> @@ -801,9 +801,8 @@ static int l2cache_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
>  
>  static int l2cache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  {
> -	struct cluster_pmu *cluster;
>  	struct l2cache_pmu *l2cache_pmu;
> -	cpumask_t cluster_online_cpus;
> +	struct cluster_pmu *cluster;
>  	unsigned int target;
>  
>  	l2cache_pmu = hlist_entry_safe(node, struct l2cache_pmu, node);
> @@ -820,9 +819,8 @@ static int l2cache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  	cluster->on_cpu = -1;
>  
>  	/* Any other CPU for this cluster which is still online */
> -	cpumask_and(&cluster_online_cpus, &cluster->cluster_cpus,
> -		    cpu_online_mask);
> -	target = cpumask_any_but(&cluster_online_cpus, cpu);
> +	target = cpumask_any_and_but(&cluster->cluster_cpus,
> +				     cpu_online_mask, cpu);
>  	if (target >= nr_cpu_ids) {
>  		disable_irq(cluster->irq);
>  		return 0;
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 10/10] perf/thunderx2: Avoid placing cpumask var on stack
  2024-04-03 12:51 ` [PATCH v2 10/10] perf/thunderx2: " Dawei Li
@ 2024-04-03 14:38   ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:38 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:51:09PM +0800, Dawei Li wrote:
> For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> variable on stack is not recommended since it can cause potential stack
> overflow.
> 
> Instead, kernel code should always use *cpumask_var API(s) to allocate
> cpumask var in config-neutral way, leaving allocation strategy to
> CONFIG_CPUMASK_OFFSTACK.
> 
> But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> for if allocation fails(which is unlikely but still possible):
> - If -ENOMEM is returned to caller, kernel crashes for non-bringup
>   teardown;
> - If callback pretends nothing happened and returns 0 to caller, it may
>   trap system into an in-consisitent/compromised state;
> 
> Use newly-introduced cpumask_any_and_but() to address all issues above.
> It eliminates usage of temporary cpumask var in generic way, no matter how
> the cpumask var is allocated.
> 
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

The logic looks good to me, but I'd like the commit message updated the same as
per my comment on patch 2.

With that commit message:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/perf/thunderx2_pmu.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c
> index e16d10c763de..b3377b662ffc 100644
> --- a/drivers/perf/thunderx2_pmu.c
> +++ b/drivers/perf/thunderx2_pmu.c
> @@ -932,9 +932,8 @@ static int tx2_uncore_pmu_online_cpu(unsigned int cpu,
>  static int tx2_uncore_pmu_offline_cpu(unsigned int cpu,
>  		struct hlist_node *hpnode)
>  {
> -	int new_cpu;
>  	struct tx2_uncore_pmu *tx2_pmu;
> -	struct cpumask cpu_online_mask_temp;
> +	unsigned int new_cpu;
>  
>  	tx2_pmu = hlist_entry_safe(hpnode,
>  			struct tx2_uncore_pmu, hpnode);
> @@ -945,11 +944,8 @@ static int tx2_uncore_pmu_offline_cpu(unsigned int cpu,
>  	if (tx2_pmu->hrtimer_callback)
>  		hrtimer_cancel(&tx2_pmu->hrtimer);
>  
> -	cpumask_copy(&cpu_online_mask_temp, cpu_online_mask);
> -	cpumask_clear_cpu(cpu, &cpu_online_mask_temp);
> -	new_cpu = cpumask_any_and(
> -			cpumask_of_node(tx2_pmu->node),
> -			&cpu_online_mask_temp);
> +	new_cpu = cpumask_any_and_but(cpumask_of_node(tx2_pmu->node),
> +				      cpu_online_mask, cpu);
>  
>  	tx2_pmu->cpu = new_cpu;
>  	if (new_cpu >= nr_cpu_ids)
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 00/10] perf: Avoid placing cpumask var on stack
  2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
                   ` (9 preceding siblings ...)
  2024-04-03 12:51 ` [PATCH v2 10/10] perf/thunderx2: " Dawei Li
@ 2024-04-03 14:41 ` Mark Rutland
  2024-04-03 16:11   ` Dawei Li
  10 siblings, 1 reply; 25+ messages in thread
From: Mark Rutland @ 2024-04-03 14:41 UTC (permalink / raw)
  To: Dawei Li
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Wed, Apr 03, 2024 at 08:50:59PM +0800, Dawei Li wrote:
> Hi all,

Hi,

> This is v2 of [1] and [2] which basically eliminate cpumask var allocation
> on stack for perf subsystem.
> 
> Change since v1:
> - Change from dynamic allocation to a temporary var free helper:
>   cpumask_any_and_but().	[Mark]
> 
> - Some minor coding style improvements, reverse chrismas tree e.g.
> 
> - For cpumask_any_and_but() itself:
>   - Moved to cpumask.h, just like other helpers.
>   - Return value converted to unsigned int.
>   - Remove EXPORT_SYMBOL, for obvious reason.

Thanks for this!

The logic all looks good; if you can spin a v3 with the updated commit messages
I reckon we can queue this up shortly.

Mark.

> 
> [1]:
> https://lore.kernel.org/lkml/20240402105610.1695644-1-dawei.li@shingroup.cn/
> 
> [2]:
> https://lore.kernel.org/lkml/1486381132-5610-1-git-send-email-mark.rutland@arm.com/
> 
> Dawei Li (9):
>   perf/alibaba_uncore_drw: Avoid placing cpumask var on stack
>   perf/arm-cmn: Avoid placing cpumask var on stack
>   perf/arm_cspmu: Avoid placing cpumask var on stack
>   perf/arm_dsu: Avoid placing cpumask var on stack
>   perf/dwc_pcie: Avoid placing cpumask var on stack
>   perf/hisi_pcie: Avoid placing cpumask var on stack
>   perf/hisi_uncore: Avoid placing cpumask var on stack
>   perf/qcom_l2: Avoid placing cpumask var on stack
>   perf/thunderx2: Avoid placing cpumask var on stack
> 
> Mark Rutland (1):
>   cpumask: add cpumask_any_and_but()
> 
>  drivers/perf/alibaba_uncore_drw_pmu.c    | 10 +++-------
>  drivers/perf/arm-cmn.c                   | 10 +++++-----
>  drivers/perf/arm_cspmu/arm_cspmu.c       |  8 +++-----
>  drivers/perf/arm_dsu_pmu.c               | 19 ++++++-------------
>  drivers/perf/dwc_pcie_pmu.c              | 10 ++++------
>  drivers/perf/hisilicon/hisi_pcie_pmu.c   |  9 ++++-----
>  drivers/perf/hisilicon/hisi_uncore_pmu.c |  6 ++----
>  drivers/perf/qcom_l2_pmu.c               |  8 +++-----
>  drivers/perf/thunderx2_pmu.c             | 10 +++-------
>  include/linux/cpumask.h                  | 23 +++++++++++++++++++++++
>  10 files changed, 56 insertions(+), 57 deletions(-)
> 
> Thanks,
> 
>     Dawei
> 
> -- 
> 2.27.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 00/10] perf: Avoid placing cpumask var on stack
  2024-04-03 14:41 ` [PATCH v2 00/10] perf: " Mark Rutland
@ 2024-04-03 16:11   ` Dawei Li
  2024-04-03 16:39     ` Yury Norov
  0 siblings, 1 reply; 25+ messages in thread
From: Dawei Li @ 2024-04-03 16:11 UTC (permalink / raw)
  To: Mark Rutland
  Cc: will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

Hi Mark,

On Wed, Apr 03, 2024 at 03:41:07PM +0100, Mark Rutland wrote:
> On Wed, Apr 03, 2024 at 08:50:59PM +0800, Dawei Li wrote:
> > Hi all,
> 
> Hi,
> 
> > This is v2 of [1] and [2] which basically eliminate cpumask var allocation
> > on stack for perf subsystem.
> > 
> > Change since v1:
> > - Change from dynamic allocation to a temporary var free helper:
> >   cpumask_any_and_but().	[Mark]
> > 
> > - Some minor coding style improvements, reverse chrismas tree e.g.
> > 
> > - For cpumask_any_and_but() itself:
> >   - Moved to cpumask.h, just like other helpers.
> >   - Return value converted to unsigned int.
> >   - Remove EXPORT_SYMBOL, for obvious reason.
> 
> Thanks for this!
> 
> The logic all looks good; if you can spin a v3 with the updated commit messages
> I reckon we can queue this up shortly.

Thanks for review.

v3 respinned:
https://lore.kernel.org/lkml/20240403155950.2068109-1-dawei.li@shingroup.cn/

If it's going through perf tree, do we need Acked-by from bitmap
maintainers for patch[1]?

Thanks,

    Dawei

> 
> Mark.
> 
> > 
> > [1]:
> > https://lore.kernel.org/lkml/20240402105610.1695644-1-dawei.li@shingroup.cn/
> > 
> > [2]:
> > https://lore.kernel.org/lkml/1486381132-5610-1-git-send-email-mark.rutland@arm.com/
> > 
> > Dawei Li (9):
> >   perf/alibaba_uncore_drw: Avoid placing cpumask var on stack
> >   perf/arm-cmn: Avoid placing cpumask var on stack
> >   perf/arm_cspmu: Avoid placing cpumask var on stack
> >   perf/arm_dsu: Avoid placing cpumask var on stack
> >   perf/dwc_pcie: Avoid placing cpumask var on stack
> >   perf/hisi_pcie: Avoid placing cpumask var on stack
> >   perf/hisi_uncore: Avoid placing cpumask var on stack
> >   perf/qcom_l2: Avoid placing cpumask var on stack
> >   perf/thunderx2: Avoid placing cpumask var on stack
> > 
> > Mark Rutland (1):
> >   cpumask: add cpumask_any_and_but()
> > 
> >  drivers/perf/alibaba_uncore_drw_pmu.c    | 10 +++-------
> >  drivers/perf/arm-cmn.c                   | 10 +++++-----
> >  drivers/perf/arm_cspmu/arm_cspmu.c       |  8 +++-----
> >  drivers/perf/arm_dsu_pmu.c               | 19 ++++++-------------
> >  drivers/perf/dwc_pcie_pmu.c              | 10 ++++------
> >  drivers/perf/hisilicon/hisi_pcie_pmu.c   |  9 ++++-----
> >  drivers/perf/hisilicon/hisi_uncore_pmu.c |  6 ++----
> >  drivers/perf/qcom_l2_pmu.c               |  8 +++-----
> >  drivers/perf/thunderx2_pmu.c             | 10 +++-------
> >  include/linux/cpumask.h                  | 23 +++++++++++++++++++++++
> >  10 files changed, 56 insertions(+), 57 deletions(-)
> > 
> > Thanks,
> > 
> >     Dawei
> > 
> > -- 
> > 2.27.0
> > 
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 00/10] perf: Avoid placing cpumask var on stack
  2024-04-03 16:11   ` Dawei Li
@ 2024-04-03 16:39     ` Yury Norov
  0 siblings, 0 replies; 25+ messages in thread
From: Yury Norov @ 2024-04-03 16:39 UTC (permalink / raw)
  To: Dawei Li
  Cc: Mark Rutland, will, linux, xueshuai, renyu.zj, yangyicong,
	jonathan.cameron, andersson, konrad.dybcio, linux-arm-kernel,
	linux-kernel, linux-arm-msm

On Thu, Apr 04, 2024 at 12:11:51AM +0800, Dawei Li wrote:
> Hi Mark,
> 
> On Wed, Apr 03, 2024 at 03:41:07PM +0100, Mark Rutland wrote:
> > On Wed, Apr 03, 2024 at 08:50:59PM +0800, Dawei Li wrote:
> > > Hi all,
> > 
> > Hi,
> > 
> > > This is v2 of [1] and [2] which basically eliminate cpumask var allocation
> > > on stack for perf subsystem.
> > > 
> > > Change since v1:
> > > - Change from dynamic allocation to a temporary var free helper:
> > >   cpumask_any_and_but().	[Mark]
> > > 
> > > - Some minor coding style improvements, reverse chrismas tree e.g.
> > > 
> > > - For cpumask_any_and_but() itself:
> > >   - Moved to cpumask.h, just like other helpers.
> > >   - Return value converted to unsigned int.
> > >   - Remove EXPORT_SYMBOL, for obvious reason.
> > 
> > Thanks for this!
> > 
> > The logic all looks good; if you can spin a v3 with the updated commit messages
> > I reckon we can queue this up shortly.
> 
> Thanks for review.
> 
> v3 respinned:
> https://lore.kernel.org/lkml/20240403155950.2068109-1-dawei.li@shingroup.cn/
> 
> If it's going through perf tree, do we need Acked-by from bitmap
> maintainers for patch[1]?

There's only one bitmap-related patch, so I agree - the series should
go through Mark's tree. I acked 1st patch in v3. Please go ahead.

Thanks,
Yury

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 07/10] perf/hisi_pcie: Avoid placing cpumask var on stack
  2024-04-03 14:35   ` Mark Rutland
@ 2024-04-05  9:52     ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2024-04-05  9:52 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Dawei Li, will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	andersson, konrad.dybcio, linux-arm-kernel, linux-kernel,
	linux-arm-msm

On Wed, 3 Apr 2024 15:35:10 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> On Wed, Apr 03, 2024 at 08:51:06PM +0800, Dawei Li wrote:
> > For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> > variable on stack is not recommended since it can cause potential stack
> > overflow.
> > 
> > Instead, kernel code should always use *cpumask_var API(s) to allocate
> > cpumask var in config-neutral way, leaving allocation strategy to
> > CONFIG_CPUMASK_OFFSTACK.
> > 
> > But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> > for if allocation fails(which is unlikely but still possible):
> > - If -ENOMEM is returned to caller, kernel crashes for non-bringup
> >   teardown;
> > - If callback pretends nothing happened and returns 0 to caller, it may
> >   trap system into an in-consisitent/compromised state;
> > 
> > Use newly-introduced cpumask_any_and_but() to address all issues above.
> > It eliminates usage of temporary cpumask var in generic way, no matter how
> > the cpumask var is allocated.
> > 
> > Suggested-by: Mark Rutland <mark.rutland@arm.com>
> > Signed-off-by: Dawei Li <dawei.li@shingroup.cn>  
> 
> The logic looks good to me, but I'd like the commit message updated the same as
> per my comment on patch 2.
> 
> With that commit message:
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> Mark.
> 
> > ---
> >  drivers/perf/hisilicon/hisi_pcie_pmu.c | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > index 5d1f0e9fdb08..06b192cc31d5 100644
> > --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > @@ -673,7 +673,6 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
> >  {
> >  	struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
> >  	unsigned int target;
> > -	cpumask_t mask;
> >  	int numa_node;
> >  
> >  	/* Nothing to do if this CPU doesn't own the PMU */
> > @@ -684,10 +683,10 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
> >  
> >  	/* Choose a local CPU from all online cpus. */
> >  	numa_node = dev_to_node(&pcie_pmu->pdev->dev);
> > -	if (cpumask_and(&mask, cpumask_of_node(numa_node), cpu_online_mask) &&
> > -	    cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
> > -		target = cpumask_any(&mask);
> > -	else
> > +
> > +	target = cpumask_any_and_but(cpumask_of_node(numa_node),
> > +				     cpu_online_mask, cpu);
> > +	if (target >= nr_cpu_ids)
> >  		target = cpumask_any_but(cpu_online_mask, cpu);
> >  
> >  	if (target >= nr_cpu_ids) {
> > -- 
> > 2.27.0
> >   


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 08/10] perf/hisi_uncore: Avoid placing cpumask var on stack
  2024-04-03 14:35   ` Mark Rutland
@ 2024-04-05  9:53     ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2024-04-05  9:53 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Dawei Li, will, yury.norov, linux, xueshuai, renyu.zj, yangyicong,
	andersson, konrad.dybcio, linux-arm-kernel, linux-kernel,
	linux-arm-msm

On Wed, 3 Apr 2024 15:35:47 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> On Wed, Apr 03, 2024 at 08:51:07PM +0800, Dawei Li wrote:
> > For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
> > variable on stack is not recommended since it can cause potential stack
> > overflow.
> > 
> > Instead, kernel code should always use *cpumask_var API(s) to allocate
> > cpumask var in config-neutral way, leaving allocation strategy to
> > CONFIG_CPUMASK_OFFSTACK.
> > 
> > But dynamic allocation in cpuhp's teardown callback is somewhat problematic
> > for if allocation fails(which is unlikely but still possible):
> > - If -ENOMEM is returned to caller, kernel crashes for non-bringup
> >   teardown;
> > - If callback pretends nothing happened and returns 0 to caller, it may
> >   trap system into an in-consisitent/compromised state;
> > 
> > Use newly-introduced cpumask_any_and_but() to address all issues above.
> > It eliminates usage of temporary cpumask var in generic way, no matter how
> > the cpumask var is allocated.
> > 
> > Suggested-by: Mark Rutland <mark.rutland@arm.com>
> > Signed-off-by: Dawei Li <dawei.li@shingroup.cn>  
> 
> The logic looks good to me, but I'd like the commit message updated the same as
> per my comment on patch 2.
> 
> With that commit message:
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2024-04-05  9:53 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-03 12:50 [PATCH v2 00/10] perf: Avoid placing cpumask var on stack Dawei Li
2024-04-03 12:51 ` [PATCH v2 01/10] cpumask: add cpumask_any_and_but() Dawei Li
2024-04-03 12:51 ` [PATCH v2 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask var on stack Dawei Li
2024-04-03 14:16   ` Mark Rutland
2024-04-03 12:51 ` [PATCH v2 03/10] perf/arm-cmn: " Dawei Li
2024-04-03 14:23   ` Mark Rutland
2024-04-03 12:51 ` [PATCH v2 04/10] perf/arm_cspmu: " Dawei Li
2024-04-03 14:21   ` Mark Rutland
2024-04-03 12:51 ` [PATCH v2 05/10] perf/arm_dsu: " Dawei Li
2024-04-03 14:32   ` Mark Rutland
2024-04-03 12:51 ` [PATCH v2 06/10] perf/dwc_pcie: " Dawei Li
2024-04-03 14:33   ` Mark Rutland
2024-04-03 12:51 ` [PATCH v2 07/10] perf/hisi_pcie: " Dawei Li
2024-04-03 14:35   ` Mark Rutland
2024-04-05  9:52     ` Jonathan Cameron
2024-04-03 12:51 ` [PATCH v2 08/10] perf/hisi_uncore: " Dawei Li
2024-04-03 14:35   ` Mark Rutland
2024-04-05  9:53     ` Jonathan Cameron
2024-04-03 12:51 ` [PATCH v2 09/10] perf/qcom_l2: " Dawei Li
2024-04-03 14:36   ` Mark Rutland
2024-04-03 12:51 ` [PATCH v2 10/10] perf/thunderx2: " Dawei Li
2024-04-03 14:38   ` Mark Rutland
2024-04-03 14:41 ` [PATCH v2 00/10] perf: " Mark Rutland
2024-04-03 16:11   ` Dawei Li
2024-04-03 16:39     ` Yury Norov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).