linux-doc.vger.kernel.org archive mirror
* [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
@ 2023-06-08 11:43 Junhao He
  2023-06-08 12:30 ` Yicong Yang
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Junhao He @ 2023-06-08 11:43 UTC (permalink / raw)
  To: will, jonathan.cameron, linux-kernel, mark.rutland
  Cc: linux-arm-kernel, linux-doc, linuxarm, yangyicong, shenyang39,
	prime.zeng, hejunhao3

The driver needs to migrate the perf context if the CPU currently in use is
about to be torn down. By the time the cpuhp::teardown() callback is called,
cpu_online_mask has not been updated yet and still includes the CPU being
torn down. In the current implementation the driver may migrate the context
to the CPU being torn down, which leads to the call trace below:

...
[  368.104662][  T932] task:cpuhp/0         state:D stack:    0 pid:   15 ppid:     2 flags:0x00000008
[  368.113699][  T932] Call trace:
[  368.116834][  T932]  __switch_to+0x7c/0xbc
[  368.120924][  T932]  __schedule+0x338/0x6f0
[  368.125098][  T932]  schedule+0x50/0xe0
[  368.128926][  T932]  schedule_preempt_disabled+0x18/0x24
[  368.134229][  T932]  __mutex_lock.constprop.0+0x1d4/0x5dc
[  368.139617][  T932]  __mutex_lock_slowpath+0x1c/0x30
[  368.144573][  T932]  mutex_lock+0x50/0x60
[  368.148579][  T932]  perf_pmu_migrate_context+0x84/0x2b0
[  368.153884][  T932]  hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu]
[  368.160579][  T932]  cpuhp_invoke_callback+0x2a0/0x650
[  368.165707][  T932]  cpuhp_thread_fun+0xe4/0x190
[  368.170316][  T932]  smpboot_thread_fn+0x15c/0x1a0
[  368.175099][  T932]  kthread+0x108/0x13c
[  368.179012][  T932]  ret_from_fork+0x10/0x18
...

Use cpumask_any_but() to pick an online CPU other than the one being torn
down, which fixes this issue.

Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
---
 drivers/perf/hisilicon/hisi_pcie_pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index 0bc8dc36aff5..14f8b4b03337 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -683,7 +683,7 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 
 	pcie_pmu->on_cpu = -1;
 	/* Choose a new CPU from all online cpus. */
-	target = cpumask_first(cpu_online_mask);
+	target = cpumask_any_but(cpu_online_mask, cpu);
 	if (target >= nr_cpu_ids) {
 		pci_err(pcie_pmu->pdev, "There is no CPU to set\n");
 		return 0;
-- 
2.30.0

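For illustration only (not part of the patch): a minimal userspace sketch of
why the one-line change matters. It models a CPU mask as a 64-bit word; the
names model_cpumask_first(), model_cpumask_any_but() and MODEL_NR_CPU_IDS are
hypothetical stand-ins and not the kernel implementations, which operate on
struct cpumask.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: bit n set means CPU n is still in the online mask. */
#define MODEL_NR_CPU_IDS 64u

/* Mimics cpumask_first(): lowest set bit, or MODEL_NR_CPU_IDS if empty. */
static unsigned int model_cpumask_first(uint64_t mask)
{
	for (unsigned int cpu = 0; cpu < MODEL_NR_CPU_IDS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return MODEL_NR_CPU_IDS;
}

/* Mimics cpumask_any_but(): a set bit other than `but`, or MODEL_NR_CPU_IDS. */
static unsigned int model_cpumask_any_but(uint64_t mask, unsigned int but)
{
	assert(but < MODEL_NR_CPU_IDS);
	for (unsigned int cpu = 0; cpu < MODEL_NR_CPU_IDS; cpu++)
		if (cpu != but && (mask & (UINT64_C(1) << cpu)))
			return cpu;
	return MODEL_NR_CPU_IDS;
}
```

With CPUs 0-3 in the mask (0xF) and CPU 0 being torn down, the first helper
returns 0, the outgoing CPU itself, while the second returns 1. If the
outgoing CPU is the only one left in the mask, the second helper returns
MODEL_NR_CPU_IDS, which corresponds to the "target >= nr_cpu_ids" check in
the hunk above.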


* Re: [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
  2023-06-08 11:43 [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown Junhao He
@ 2023-06-08 12:30 ` Yicong Yang
  2023-06-08 12:40 ` Mark Rutland
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Yicong Yang @ 2023-06-08 12:30 UTC (permalink / raw)
  To: Junhao He, will, jonathan.cameron, linux-kernel, mark.rutland
  Cc: linux-arm-kernel, linux-doc, yangyicong, shenyang39, prime.zeng,
	yangyicong

On 2023/6/8 19:43, Junhao He wrote:
> The driver needs to migrate the perf context if the CPU currently in use is
> about to be torn down. By the time the cpuhp::teardown() callback is called,
> cpu_online_mask has not been updated yet and still includes the CPU being
> torn down. In the current implementation the driver may migrate the context
> to the CPU being torn down, which leads to the call trace below:
> 
> ...
> [  368.104662][  T932] task:cpuhp/0         state:D stack:    0 pid:   15 ppid:     2 flags:0x00000008
> [  368.113699][  T932] Call trace:
> [  368.116834][  T932]  __switch_to+0x7c/0xbc
> [  368.120924][  T932]  __schedule+0x338/0x6f0
> [  368.125098][  T932]  schedule+0x50/0xe0
> [  368.128926][  T932]  schedule_preempt_disabled+0x18/0x24
> [  368.134229][  T932]  __mutex_lock.constprop.0+0x1d4/0x5dc
> [  368.139617][  T932]  __mutex_lock_slowpath+0x1c/0x30
> [  368.144573][  T932]  mutex_lock+0x50/0x60
> [  368.148579][  T932]  perf_pmu_migrate_context+0x84/0x2b0
> [  368.153884][  T932]  hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu]
> [  368.160579][  T932]  cpuhp_invoke_callback+0x2a0/0x650
> [  368.165707][  T932]  cpuhp_thread_fun+0xe4/0x190
> [  368.170316][  T932]  smpboot_thread_fn+0x15c/0x1a0
> [  368.175099][  T932]  kthread+0x108/0x13c
> [  368.179012][  T932]  ret_from_fork+0x10/0x18
> ...
> 
> Use cpumask_any_but() to pick an online CPU other than the one being torn
> down, which fixes this issue.
> 
> Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
> Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>



* Re: [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
  2023-06-08 11:43 [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown Junhao He
  2023-06-08 12:30 ` Yicong Yang
@ 2023-06-08 12:40 ` Mark Rutland
  2023-06-08 13:14 ` Jonathan Cameron
  2023-06-09 11:16 ` Will Deacon
  3 siblings, 0 replies; 5+ messages in thread
From: Mark Rutland @ 2023-06-08 12:40 UTC (permalink / raw)
  To: Junhao He, will
  Cc: jonathan.cameron, linux-kernel, linux-arm-kernel, linux-doc,
	linuxarm, yangyicong, shenyang39, prime.zeng

On Thu, Jun 08, 2023 at 07:43:26PM +0800, Junhao He wrote:
> The driver needs to migrate the perf context if the CPU currently in use is
> about to be torn down. By the time the cpuhp::teardown() callback is called,
> cpu_online_mask has not been updated yet and still includes the CPU being
> torn down. In the current implementation the driver may migrate the context
> to the CPU being torn down, which leads to the call trace below:
> 
> ...
> [  368.104662][  T932] task:cpuhp/0         state:D stack:    0 pid:   15 ppid:     2 flags:0x00000008
> [  368.113699][  T932] Call trace:
> [  368.116834][  T932]  __switch_to+0x7c/0xbc
> [  368.120924][  T932]  __schedule+0x338/0x6f0
> [  368.125098][  T932]  schedule+0x50/0xe0
> [  368.128926][  T932]  schedule_preempt_disabled+0x18/0x24
> [  368.134229][  T932]  __mutex_lock.constprop.0+0x1d4/0x5dc
> [  368.139617][  T932]  __mutex_lock_slowpath+0x1c/0x30
> [  368.144573][  T932]  mutex_lock+0x50/0x60
> [  368.148579][  T932]  perf_pmu_migrate_context+0x84/0x2b0
> [  368.153884][  T932]  hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu]
> [  368.160579][  T932]  cpuhp_invoke_callback+0x2a0/0x650
> [  368.165707][  T932]  cpuhp_thread_fun+0xe4/0x190
> [  368.170316][  T932]  smpboot_thread_fn+0x15c/0x1a0
> [  368.175099][  T932]  kthread+0x108/0x13c
> [  368.179012][  T932]  ret_from_fork+0x10/0x18
> ...
> 
> Use cpumask_any_but() to pick an online CPU other than the one being torn
> down, which fixes this issue.
> 
> Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
> Signed-off-by: Junhao He <hejunhao3@huawei.com>

Acked-by: Mark Rutland <mark.rutland@arm.com>

I assume that Will can pick this up.

I did a quick check, and all other perf drivers seem to do the right thing
here, either using cpumask_any_but(), or generating a temporary mask with the
cpu being offlined removed.

Mark.
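
For illustration only (not part of Mark's reply): a minimal userspace sketch
of the second pattern mentioned above, building a temporary mask with the
outgoing CPU removed and then picking from what is left. The mask is modelled
as a 64-bit word; tmpmask_first(), tmpmask_pick_target() and TMPMASK_NR_CPU_IDS
are hypothetical names, not kernel APIs.

```c
#include <stdint.h>

/* Userspace model only: bit n set means CPU n is still in the online mask. */
#define TMPMASK_NR_CPU_IDS 64u

/* Lowest set bit in the mask, or TMPMASK_NR_CPU_IDS if the mask is empty. */
static unsigned int tmpmask_first(uint64_t mask)
{
	for (unsigned int cpu = 0; cpu < TMPMASK_NR_CPU_IDS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return TMPMASK_NR_CPU_IDS;
}

/* Copy the online mask, clear the outgoing CPU, then pick from the copy. */
static unsigned int tmpmask_pick_target(uint64_t online, unsigned int dying)
{
	uint64_t candidates = online;	/* temporary copy, original untouched */

	if (dying < TMPMASK_NR_CPU_IDS)
		candidates &= ~(UINT64_C(1) << dying);
	return tmpmask_first(candidates);
}
```

Either pattern guarantees that the chosen target is never the CPU being
offlined; the temporary-mask form is useful when the candidate set must be
narrowed further before picking.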



* Re: [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
  2023-06-08 11:43 [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown Junhao He
  2023-06-08 12:30 ` Yicong Yang
  2023-06-08 12:40 ` Mark Rutland
@ 2023-06-08 13:14 ` Jonathan Cameron
  2023-06-09 11:16 ` Will Deacon
  3 siblings, 0 replies; 5+ messages in thread
From: Jonathan Cameron @ 2023-06-08 13:14 UTC (permalink / raw)
  To: Junhao He
  Cc: will, linux-kernel, mark.rutland, linux-arm-kernel, linux-doc,
	linuxarm, yangyicong, shenyang39, prime.zeng

On Thu, 8 Jun 2023 19:43:26 +0800
Junhao He <hejunhao3@huawei.com> wrote:

> The driver needs to migrate the perf context if the CPU currently in use is
> about to be torn down. By the time the cpuhp::teardown() callback is called,
> cpu_online_mask has not been updated yet and still includes the CPU being
> torn down. In the current implementation the driver may migrate the context
> to the CPU being torn down, which leads to the call trace below:
> 
> ...
> [  368.104662][  T932] task:cpuhp/0         state:D stack:    0 pid:   15 ppid:     2 flags:0x00000008
> [  368.113699][  T932] Call trace:
> [  368.116834][  T932]  __switch_to+0x7c/0xbc
> [  368.120924][  T932]  __schedule+0x338/0x6f0
> [  368.125098][  T932]  schedule+0x50/0xe0
> [  368.128926][  T932]  schedule_preempt_disabled+0x18/0x24
> [  368.134229][  T932]  __mutex_lock.constprop.0+0x1d4/0x5dc
> [  368.139617][  T932]  __mutex_lock_slowpath+0x1c/0x30
> [  368.144573][  T932]  mutex_lock+0x50/0x60
> [  368.148579][  T932]  perf_pmu_migrate_context+0x84/0x2b0
> [  368.153884][  T932]  hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu]
> [  368.160579][  T932]  cpuhp_invoke_callback+0x2a0/0x650
> [  368.165707][  T932]  cpuhp_thread_fun+0xe4/0x190
> [  368.170316][  T932]  smpboot_thread_fn+0x15c/0x1a0
> [  368.175099][  T932]  kthread+0x108/0x13c
> [  368.179012][  T932]  ret_from_fork+0x10/0x18
> ...
> 
> Use cpumask_any_but() to pick an online CPU other than the one being torn
> down, which fixes this issue.
> 
> Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
> Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>




* Re: [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
  2023-06-08 11:43 [PATCH] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown Junhao He
                   ` (2 preceding siblings ...)
  2023-06-08 13:14 ` Jonathan Cameron
@ 2023-06-09 11:16 ` Will Deacon
  3 siblings, 0 replies; 5+ messages in thread
From: Will Deacon @ 2023-06-09 11:16 UTC (permalink / raw)
  To: Junhao He, linux-kernel, mark.rutland, jonathan.cameron
  Cc: catalin.marinas, kernel-team, Will Deacon, shenyang39, yangyicong,
	linux-doc, linux-arm-kernel, prime.zeng, linuxarm

On Thu, 8 Jun 2023 19:43:26 +0800, Junhao He wrote:
> The driver needs to migrate the perf context if the CPU currently in use is
> about to be torn down. By the time the cpuhp::teardown() callback is called,
> cpu_online_mask has not been updated yet and still includes the CPU being
> torn down. In the current implementation the driver may migrate the context
> to the CPU being torn down, which leads to the call trace below:
> 
> ...
> [  368.104662][  T932] task:cpuhp/0         state:D stack:    0 pid:   15 ppid:     2 flags:0x00000008
> [  368.113699][  T932] Call trace:
> [  368.116834][  T932]  __switch_to+0x7c/0xbc
> [  368.120924][  T932]  __schedule+0x338/0x6f0
> [  368.125098][  T932]  schedule+0x50/0xe0
> [  368.128926][  T932]  schedule_preempt_disabled+0x18/0x24
> [  368.134229][  T932]  __mutex_lock.constprop.0+0x1d4/0x5dc
> [  368.139617][  T932]  __mutex_lock_slowpath+0x1c/0x30
> [  368.144573][  T932]  mutex_lock+0x50/0x60
> [  368.148579][  T932]  perf_pmu_migrate_context+0x84/0x2b0
> [  368.153884][  T932]  hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu]
> [  368.160579][  T932]  cpuhp_invoke_callback+0x2a0/0x650
> [  368.165707][  T932]  cpuhp_thread_fun+0xe4/0x190
> [  368.170316][  T932]  smpboot_thread_fn+0x15c/0x1a0
> [  368.175099][  T932]  kthread+0x108/0x13c
> [  368.179012][  T932]  ret_from_fork+0x10/0x18
> ...
> 
> [...]

Applied to will (for-next/perf), thanks!

[1/1] drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
      https://git.kernel.org/will/c/7a6a9f1c5a0a

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
