* [PATCH] coresight: cti: disclaim device only when it's claimed
@ 2020-08-18 11:10 Tingwei Zhang
2020-08-18 11:10 ` [PATCH] coresight: cti: remove pm_runtime_get_sync() from CPU hotplug Tingwei Zhang
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Tingwei Zhang @ 2020-08-18 11:10 UTC (permalink / raw)
To: Mathieu Poirier, Suzuki K Poulose, Alexander Shishkin,
Greg Kroah-Hartman, Mike Leach
Cc: tsoni, Sai Prakash Ranjan, coresight, Mao Jinlong, linux-kernel,
Tingwei Zhang, linux-arm-kernel
coresight_claim_device() is called in cti_starting_cpu() only
when CTI is enabled, while coresight_disclaim_device() is called
unconditionally in cti_dying_cpu(). This triggered the WARNING below.
Only disclaim the device when the CTI device is enabled to fix it.
[ 75.989643] WARNING: CPU: 1 PID: 14 at
kernel/drivers/hwtracing/coresight/coresight.c:209
coresight_disclaim_device_unlocked+0x10/0x24
[ 75.989697] CPU: 1 PID: 14 Comm: migration/1 Not tainted
5.9.0-rc1-gff1304be0a05-dirty #21
[ 75.989709] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 75.989737] pstate: 80c00085 (Nzcv daIf +PAN +UAO BTYPE=--)
[ 75.989758] pc : coresight_disclaim_device_unlocked+0x10/0x24
[ 75.989775] lr : coresight_disclaim_device+0x24/0x38
[ 75.989783] sp : ffff800011cd3c90
.
[ 75.990018] Call trace:
[ 75.990041] coresight_disclaim_device_unlocked+0x10/0x24
[ 75.990066] cti_dying_cpu+0x34/0x4c
[ 75.990101] cpuhp_invoke_callback+0x84/0x1e0
[ 75.990121] take_cpu_down+0x90/0xe0
[ 75.990154] multi_cpu_stop+0x134/0x160
[ 75.990171] cpu_stopper_thread+0xb0/0x13c
[ 75.990196] smpboot_thread_fn+0x1c4/0x270
[ 75.990222] kthread+0x128/0x154
[ 75.990251] ret_from_fork+0x10/0x18
Fixes: e9b880581d55 ("coresight: cti: Add CPU Hotplug handling to CTI driver")
Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
---
drivers/hwtracing/coresight/coresight-cti.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
index 3ccc703dc940..d6fea6efec71 100644
--- a/drivers/hwtracing/coresight/coresight-cti.c
+++ b/drivers/hwtracing/coresight/coresight-cti.c
@@ -742,7 +742,8 @@ static int cti_dying_cpu(unsigned int cpu)
 	spin_lock(&drvdata->spinlock);
 	drvdata->config.hw_powered = false;
-	coresight_disclaim_device(drvdata->base);
+	if (drvdata->config.hw_enabled)
+		coresight_disclaim_device(drvdata->base);
 	spin_unlock(&drvdata->spinlock);
 	return 0;
 }
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
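
The claim/disclaim imbalance fixed by the patch above can be sketched as a small userspace model in C. This is not the driver code: struct cti_model and every model_*() function are hypothetical names that only mirror the hw_enabled and hw_powered flags and the claim tag mentioned in the commit message. The guarded argument selects the behaviour with the patch applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct cti_model {
	bool hw_enabled;	/* mirrors config.hw_enabled */
	bool hw_powered;	/* mirrors config.hw_powered */
	bool claimed;		/* stands in for the claim tag */
	int warnings;		/* disclaim calls made without a claim */
};

static void model_claim(struct cti_model *m)
{
	m->claimed = true;
}

static void model_disclaim(struct cti_model *m)
{
	if (!m->claimed)
		m->warnings++;	/* the real helper would WARN here */
	m->claimed = false;
}

/* CPU comes online: claim only if the CTI is enabled. */
static void model_starting_cpu(struct cti_model *m)
{
	m->hw_powered = true;
	if (m->hw_enabled)
		model_claim(m);
}

/* CPU goes offline; guarded == true models the patched callback. */
static void model_dying_cpu(struct cti_model *m, bool guarded)
{
	m->hw_powered = false;
	if (!guarded || m->hw_enabled)
		model_disclaim(m);
}

/* Small usage example: a disabled CTI going through one hotplug cycle. */
int cti_model_demo(bool guarded)
{
	struct cti_model m = { 0 };

	model_starting_cpu(&m);
	model_dying_cpu(&m, guarded);
	assert(!m.hw_powered);
	return m.warnings;
}
```

In this model a disabled CTI records one unmatched disclaim on the unguarded path, the analogue of the WARNING in the log, and none on the guarded path.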
* [PATCH] coresight: cti: remove pm_runtime_get_sync() from CPU hotplug
2020-08-18 11:10 [PATCH] coresight: cti: disclaim device only when it's claimed Tingwei Zhang
@ 2020-08-18 11:10 ` Tingwei Zhang
2020-08-18 12:47 ` Mike Leach
2020-08-18 11:10 ` [PATCH] coresight: cti: write regsiters directly in cti_enable_hw() Tingwei Zhang
2020-08-20 8:12 ` [PATCH] coresight: cti: disclaim device only when it's claimed Mike Leach
2 siblings, 1 reply; 11+ messages in thread
From: Tingwei Zhang @ 2020-08-18 11:10 UTC (permalink / raw)
To: Mathieu Poirier, Suzuki K Poulose, Alexander Shishkin,
Greg Kroah-Hartman, Mike Leach
Cc: tsoni, Sai Prakash Ranjan, coresight, Mao Jinlong, linux-kernel,
Tingwei Zhang, linux-arm-kernel

The BUG below is triggered by calling pm_runtime_get_sync() in
cti_cpuhp_enable_hw(). It runs in a CPU hotplug callback with interrupts
disabled. pm_runtime_get_sync() calls the clock driver to enable the clock,
which could sleep. Remove pm_runtime_get_sync() from cti_cpuhp_enable_hw()
since pm_runtime_get_sync() is called in cti_enable() and pm_runtime_put()
is called in cti_disable(). There is no need to increase the PM count when
the CPU comes online since it is not decreased when the CPU goes offline.

[ 105.800279] BUG: scheduling while atomic: swapper/1/0/0x00000002
[ 105.800290] Modules linked in:
[ 105.800327] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
5.9.0-rc1-gff1304be0a05-dirty #21
[ 105.800337] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 105.800353] Call trace:
[ 105.800414] dump_backtrace+0x0/0x1d4
[ 105.800439] show_stack+0x14/0x1c
[ 105.800462] dump_stack+0xc0/0x100
[ 105.800490] __schedule_bug+0x58/0x74
[ 105.800523] __schedule+0x590/0x65c
[ 105.800538] schedule+0x78/0x10c
[ 105.800553] schedule_timeout+0x188/0x250
[ 105.800585] qmp_send.constprop.10+0x12c/0x1b0
[ 105.800599] qmp_qdss_clk_prepare+0x18/0x20
[ 105.800622] clk_core_prepare+0x48/0xd4
[ 105.800639] clk_prepare+0x20/0x34
[ 105.800663] amba_pm_runtime_resume+0x54/0x90
[ 105.800695] __rpm_callback+0xdc/0x138
[ 105.800709] rpm_callback+0x24/0x78
[ 105.800724] rpm_resume+0x328/0x47c
[ 105.800739] __pm_runtime_resume+0x50/0x74
[ 105.800768] cti_starting_cpu+0x40/0xa4
[ 105.800795] cpuhp_invoke_callback+0x84/0x1e0
[ 105.800814] notify_cpu_starting+0x9c/0xb8
[ 105.800834] secondary_start_kernel+0xd8/0x164
[ 105.800933] CPU1: Booted secondary processor 0x0000000100 [0x517f803c]

Fixes: e9b880581d55 ("coresight: cti: Add CPU Hotplug handling to CTI driver")
Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
---
drivers/hwtracing/coresight/coresight-cti.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
index 869569eb8c7f..baba6af83440 100644
--- a/drivers/hwtracing/coresight/coresight-cti.c
+++ b/drivers/hwtracing/coresight/coresight-cti.c
@@ -126,9 +126,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
 static void cti_cpuhp_enable_hw(struct cti_drvdata *drvdata)
 {
 	struct cti_config *config = &drvdata->config;
-	struct device *dev = &drvdata->csdev->dev;
 
-	pm_runtime_get_sync(dev->parent);
 	spin_lock(&drvdata->spinlock);
 	config->hw_powered = true;
 
@@ -148,7 +146,6 @@ static void cti_cpuhp_enable_hw(struct cti_drvdata *drvdata)
 	/* did not re-enable due to no claim / no request */
 cti_hp_not_enabled:
 	spin_unlock(&drvdata->spinlock);
-	pm_runtime_put(dev->parent);
 }
 
 /* disable hardware */
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
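
The reasoning in the patch above (a call that may sleep must not run from a hotplug callback with interrupts disabled, and the PM reference is already balanced by the enable/disable paths) can be sketched with a userspace model in C. All pm_model_*() names and the three counters are hypothetical and chosen for illustration; they are not runtime PM or scheduler APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; not kernel variables. */
static int pm_model_atomic_depth;	/* > 0 means "must not sleep" */
static int pm_model_sleep_bugs;		/* sleeps attempted while atomic */
static int pm_model_usage;		/* stand-in for the PM usage count */

/* Taking a PM reference may enable a clock, which may sleep. */
static void pm_model_get_sync(void)
{
	if (pm_model_atomic_depth > 0)
		pm_model_sleep_bugs++;	/* analogue of the BUG in the log */
	pm_model_usage++;
}

static void pm_model_put(void)
{
	pm_model_usage--;
}

/* Enable and disable run in process context and own the reference. */
void pm_model_enable(void)
{
	pm_model_get_sync();
}

void pm_model_disable(void)
{
	pm_model_put();
}

/*
 * Hotplug "starting" callback, modelled as atomic context.
 * take_pm_ref == true models the code before the patch.
 */
void pm_model_starting_cpu(bool take_pm_ref)
{
	pm_model_atomic_depth++;
	if (take_pm_ref) {
		pm_model_get_sync();
		pm_model_put();
	}
	pm_model_atomic_depth--;
	assert(pm_model_atomic_depth >= 0);
}
```

In this model the usage count is the same with or without the get/put pair in the hotplug callback, so dropping the pair only removes the sleep attempt.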

* Re: [PATCH] coresight: cti: remove pm_runtime_get_sync() from CPU hotplug
2020-08-18 11:10 ` [PATCH] coresight: cti: remove pm_runtime_get_sync() from CPU hotplug Tingwei Zhang
@ 2020-08-18 12:47 ` Mike Leach
0 siblings, 0 replies; 11+ messages in thread
From: Mike Leach @ 2020-08-18 12:47 UTC (permalink / raw)
To: Tingwei Zhang
Cc: tsoni, Sai Prakash Ranjan, Mathieu Poirier, Suzuki K Poulose,
Alexander Shishkin, Greg Kroah-Hartman, Coresight ML, Mao Jinlong,
Linux Kernel Mailing List, linux-arm-kernel

On Tue, 18 Aug 2020 at 12:11, Tingwei Zhang <tingwei@codeaurora.org> wrote:
>
> The BUG below is triggered by calling pm_runtime_get_sync() in
> cti_cpuhp_enable_hw(). It runs in a CPU hotplug callback with interrupts
> disabled. pm_runtime_get_sync() calls the clock driver to enable the clock,
> which could sleep. Remove pm_runtime_get_sync() from cti_cpuhp_enable_hw()
> since pm_runtime_get_sync() is called in cti_enable() and pm_runtime_put()
> is called in cti_disable(). There is no need to increase the PM count when
> the CPU comes online since it is not decreased when the CPU goes offline.
>
> [ 105.800279] BUG: scheduling while atomic: swapper/1/0/0x00000002
> [ 105.800290] Modules linked in:
> [ 105.800327] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
> 5.9.0-rc1-gff1304be0a05-dirty #21
> [ 105.800337] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 105.800353] Call trace:
> [ 105.800414] dump_backtrace+0x0/0x1d4
> [ 105.800439] show_stack+0x14/0x1c
> [ 105.800462] dump_stack+0xc0/0x100
> [ 105.800490] __schedule_bug+0x58/0x74
> [ 105.800523] __schedule+0x590/0x65c
> [ 105.800538] schedule+0x78/0x10c
> [ 105.800553] schedule_timeout+0x188/0x250
> [ 105.800585] qmp_send.constprop.10+0x12c/0x1b0
> [ 105.800599] qmp_qdss_clk_prepare+0x18/0x20
> [ 105.800622] clk_core_prepare+0x48/0xd4
> [ 105.800639] clk_prepare+0x20/0x34
> [ 105.800663] amba_pm_runtime_resume+0x54/0x90
> [ 105.800695] __rpm_callback+0xdc/0x138
> [ 105.800709] rpm_callback+0x24/0x78
> [ 105.800724] rpm_resume+0x328/0x47c
> [ 105.800739] __pm_runtime_resume+0x50/0x74
> [ 105.800768] cti_starting_cpu+0x40/0xa4
> [ 105.800795] cpuhp_invoke_callback+0x84/0x1e0
> [ 105.800814] notify_cpu_starting+0x9c/0xb8
> [ 105.800834] secondary_start_kernel+0xd8/0x164
> [ 105.800933] CPU1: Booted secondary processor 0x0000000100 [0x517f803c]
>
> Fixes: e9b880581d55 ("coresight: cti: Add CPU Hotplug handling to CTI driver")
> Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 869569eb8c7f..baba6af83440 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -126,9 +126,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
>  static void cti_cpuhp_enable_hw(struct cti_drvdata *drvdata)
>  {
>  	struct cti_config *config = &drvdata->config;
> -	struct device *dev = &drvdata->csdev->dev;
>
> -	pm_runtime_get_sync(dev->parent);
>  	spin_lock(&drvdata->spinlock);
>  	config->hw_powered = true;
>
> @@ -148,7 +146,6 @@ static void cti_cpuhp_enable_hw(struct cti_drvdata *drvdata)
>  	/* did not re-enable due to no claim / no request */
>  cti_hp_not_enabled:
>  	spin_unlock(&drvdata->spinlock);
> -	pm_runtime_put(dev->parent);
>  }
>
>  /* disable hardware */
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

Reviewed-by: Mike Leach <mike.leach@linaro.org>

--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

* [PATCH] coresight: cti: write regsiters directly in cti_enable_hw()
2020-08-18 11:10 [PATCH] coresight: cti: disclaim device only when it's claimed Tingwei Zhang
2020-08-18 11:10 ` [PATCH] coresight: cti: remove pm_runtime_get_sync() from CPU hotplug Tingwei Zhang
@ 2020-08-18 11:10 ` Tingwei Zhang
2020-08-20 8:06 ` Mike Leach
` (2 more replies)
2020-08-20 8:12 ` [PATCH] coresight: cti: disclaim device only when it's claimed Mike Leach
2 siblings, 3 replies; 11+ messages in thread
From: Tingwei Zhang @ 2020-08-18 11:10 UTC (permalink / raw)
To: Mathieu Poirier, Suzuki K Poulose, Alexander Shishkin,
Greg Kroah-Hartman, Mike Leach
Cc: tsoni, Sai Prakash Ranjan, coresight, Mao Jinlong, linux-kernel,
Tingwei Zhang, linux-arm-kernel

The deadlock below is triggered when one CPU holds drvdata->spinlock
and calls cti_enable_hw(). smp_call_function_single() is called
in cti_enable_hw() and tries to let another CPU write the CTI registers.
That CPU is trying to take drvdata->spinlock in cti_cpu_pm_notify()
and doesn't respond to the IPI from smp_call_function_single().

[ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
[ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
[ 988.357689] pc : smp_call_function_single+0x158/0x1b8
[ 988.362782] lr : smp_call_function_single+0x124/0x1b8
...
[ 988.451638] Call trace:
[ 988.454119] smp_call_function_single+0x158/0x1b8
[ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
[ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
[ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
[ 988.474957] enable_source_store+0x5c/0x9c [coresight]
[ 988.480140] dev_attr_store+0x14/0x28
[ 988.483839] sysfs_kf_write+0x38/0x4c
[ 988.487532] kernfs_fop_write+0x1c0/0x2b0
[ 988.491585] vfs_write+0xfc/0x300
[ 988.494931] ksys_write+0x78/0xe0
[ 988.498283] __arm64_sys_write+0x18/0x20
[ 988.502240] el0_svc_common+0x98/0x160
[ 988.506024] do_el0_svc+0x78/0x80
[ 988.509377] el0_sync_handler+0xd4/0x270
[ 988.513337] el0_sync+0x164/0x180

This change writes the CTI registers directly in cti_enable_hw().
config->hw_powered has been checked to be true with the spinlock held.
The CTI is powered and can be programmed until the spinlock is released.

Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
---
drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
1 file changed, 1 insertion(+), 16 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
index 3ccc703dc940..869569eb8c7f 100644
--- a/drivers/hwtracing/coresight/coresight-cti.c
+++ b/drivers/hwtracing/coresight/coresight-cti.c
@@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
 	CS_LOCK(drvdata->base);
 }
 
-static void cti_enable_hw_smp_call(void *info)
-{
-	struct cti_drvdata *drvdata = info;
-
-	cti_write_all_hw_regs(drvdata);
-}
-
 /* write regs to hardware and enable */
 static int cti_enable_hw(struct cti_drvdata *drvdata)
 {
@@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
 	if (rc)
 		goto cti_err_not_enabled;
 
-	if (drvdata->ctidev.cpu >= 0) {
-		rc = smp_call_function_single(drvdata->ctidev.cpu,
-					      cti_enable_hw_smp_call,
-					      drvdata, 1);
-		if (rc)
-			goto cti_err_not_enabled;
-	} else {
-		cti_write_all_hw_regs(drvdata);
-	}
+	cti_write_all_hw_regs(drvdata);
 
 	config->hw_enabled = true;
 	atomic_inc(&drvdata->config.enable_req_count);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
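
The deadlock described in the patch above is a wait-for cycle: the caller keeps the spinlock while waiting for a synchronous cross-call, and the target cannot run the callback until it gets that same lock. A minimal sketch in C, assuming (as the commit message states) that the target CPU does not respond to the IPI while it waits for the lock; struct cpu_model and both functions are hypothetical names, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU model; not a kernel structure. */
struct cpu_model {
	bool holds_lock;	/* owns drvdata->spinlock */
	bool spinning_on_lock;	/* waits for the lock, cannot take an IPI */
};

/*
 * True when a synchronous cross-call from caller to target can never
 * complete: the caller waits for the target while holding the lock,
 * and the target waits for the lock before it can serve the call.
 */
bool cross_call_deadlocks(const struct cpu_model *caller,
			  const struct cpu_model *target)
{
	return caller->holds_lock && target->spinning_on_lock;
}

/*
 * Models the register write in the enable path.
 * use_ipi == true: pre-patch, registers written via a cross-call.
 * use_ipi == false: patched, registers written directly by the caller.
 * Returns true when the enable path can finish and release the lock.
 */
bool enable_hw_completes(const struct cpu_model *caller,
			 const struct cpu_model *target, bool use_ipi)
{
	assert(caller->holds_lock);	/* enable path runs under the lock */
	if (use_ipi && cross_call_deadlocks(caller, target))
		return false;
	return true;
}
```

With the direct write the caller never depends on the target, so it always finishes and releases the lock, after which the spinning CPU can proceed.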

* Re: [PATCH] coresight: cti: write regsiters directly in cti_enable_hw()
2020-08-18 11:10 ` [PATCH] coresight: cti: write regsiters directly in cti_enable_hw() Tingwei Zhang
@ 2020-08-20 8:06 ` Mike Leach
2020-08-27 18:12 ` Mathieu Poirier
2020-08-27 19:17 ` Mathieu Poirier
2 siblings, 0 replies; 11+ messages in thread
From: Mike Leach @ 2020-08-20 8:06 UTC (permalink / raw)
To: Tingwei Zhang
Cc: tsoni, Sai Prakash Ranjan, Mathieu Poirier, Suzuki K Poulose,
Alexander Shishkin, Greg Kroah-Hartman, Coresight ML, Mao Jinlong,
Linux Kernel Mailing List, linux-arm-kernel

Hi Tingwei

On Tue, 18 Aug 2020 at 12:11, Tingwei Zhang <tingwei@codeaurora.org> wrote:
>
> The deadlock below is triggered when one CPU holds drvdata->spinlock
> and calls cti_enable_hw(). smp_call_function_single() is called
> in cti_enable_hw() and tries to let another CPU write the CTI registers.
> That CPU is trying to take drvdata->spinlock in cti_cpu_pm_notify()
> and doesn't respond to the IPI from smp_call_function_single().
>
> [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> ...
> [ 988.451638] Call trace:
> [ 988.454119] smp_call_function_single+0x158/0x1b8
> [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> [ 988.480140] dev_attr_store+0x14/0x28
> [ 988.483839] sysfs_kf_write+0x38/0x4c
> [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> [ 988.491585] vfs_write+0xfc/0x300
> [ 988.494931] ksys_write+0x78/0xe0
> [ 988.498283] __arm64_sys_write+0x18/0x20
> [ 988.502240] el0_svc_common+0x98/0x160
> [ 988.506024] do_el0_svc+0x78/0x80
> [ 988.509377] el0_sync_handler+0xd4/0x270
> [ 988.513337] el0_sync+0x164/0x180
>
> This change writes the CTI registers directly in cti_enable_hw().
> config->hw_powered has been checked to be true with the spinlock held.
> The CTI is powered and can be programmed until the spinlock is released.
>

Agreed.

Reviewed-by: Mike Leach <mike.leach@linaro.org>

> Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> 1 file changed, 1 insertion(+), 16 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 3ccc703dc940..869569eb8c7f 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
>  	CS_LOCK(drvdata->base);
>  }
>
> -static void cti_enable_hw_smp_call(void *info)
> -{
> -	struct cti_drvdata *drvdata = info;
> -
> -	cti_write_all_hw_regs(drvdata);
> -}
> -
>  /* write regs to hardware and enable */
>  static int cti_enable_hw(struct cti_drvdata *drvdata)
>  {
> @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
>  	if (rc)
>  		goto cti_err_not_enabled;
>
> -	if (drvdata->ctidev.cpu >= 0) {
> -		rc = smp_call_function_single(drvdata->ctidev.cpu,
> -					      cti_enable_hw_smp_call,
> -					      drvdata, 1);
> -		if (rc)
> -			goto cti_err_not_enabled;
> -	} else {
> -		cti_write_all_hw_regs(drvdata);
> -	}
> +	cti_write_all_hw_regs(drvdata);
>
>  	config->hw_enabled = true;
>  	atomic_inc(&drvdata->config.enable_req_count);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
> _______________________________________________
> CoreSight mailing list
> CoreSight@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/coresight

--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

* Re: [PATCH] coresight: cti: write regsiters directly in cti_enable_hw()
2020-08-18 11:10 ` [PATCH] coresight: cti: write regsiters directly in cti_enable_hw() Tingwei Zhang
2020-08-20 8:06 ` Mike Leach
@ 2020-08-27 18:12 ` Mathieu Poirier
2020-08-28 3:18 ` Tingwei Zhang
2020-08-27 19:17 ` Mathieu Poirier
2 siblings, 1 reply; 11+ messages in thread
From: Mathieu Poirier @ 2020-08-27 18:12 UTC (permalink / raw)
To: Tingwei Zhang
Cc: tsoni, Sai Prakash Ranjan, Suzuki K Poulose, Alexander Shishkin,
Greg Kroah-Hartman, coresight, Mao Jinlong, linux-kernel,
linux-arm-kernel, Mike Leach

Hi Tingwei,

On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> The deadlock below is triggered when one CPU holds drvdata->spinlock
> and calls cti_enable_hw(). smp_call_function_single() is called
> in cti_enable_hw() and tries to let another CPU write the CTI registers.
> That CPU is trying to take drvdata->spinlock in cti_cpu_pm_notify()
> and doesn't respond to the IPI from smp_call_function_single().
>
> [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> ...
> [ 988.451638] Call trace:
> [ 988.454119] smp_call_function_single+0x158/0x1b8
> [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> [ 988.480140] dev_attr_store+0x14/0x28
> [ 988.483839] sysfs_kf_write+0x38/0x4c
> [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> [ 988.491585] vfs_write+0xfc/0x300
> [ 988.494931] ksys_write+0x78/0xe0
> [ 988.498283] __arm64_sys_write+0x18/0x20
> [ 988.502240] el0_svc_common+0x98/0x160
> [ 988.506024] do_el0_svc+0x78/0x80
> [ 988.509377] el0_sync_handler+0xd4/0x270
> [ 988.513337] el0_sync+0x164/0x180
>

Was this the full log, or did you cut some of it?

> This change writes the CTI registers directly in cti_enable_hw().
> config->hw_powered has been checked to be true with the spinlock held.
> The CTI is powered and can be programmed until the spinlock is released.
>

From your explanation above it seems that cti_enable_hw() was called from,
say, CPUy to enable the CTI associated with CPUx. CTIx's drvdata->spinlock
was taken and smp_call_function_single() called right after. That woke up
CPUx and cti_cpu_pm_notify() was executed on CPUx in interrupt context,
trying to take CTIx's drvdata->spinlock. That hung CPUx and the kernel got
angry. Is my assessment correct?

If so, I don't think the fix suggested in this patch will work. The same
condition will happen whenever cti_enable_hw() is called on a CPU to enable
a CTI that belongs to another CPU and cti_cpu_pm_notify() is called on the
latter CPU at the same time.

I think a better solution is to grab the lock in cti_enable_hw() and check
the value of ->ctidev.cpu. If not a global CPU, i.e. >= 0, then release the
lock and call smp_call_function_single(). In cti_enable_hw_smp_call() take
the lock again and move forward from there.

I have applied the other two patches in this set so no need to send them
again.

Thanks,
Mathieu

> Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> 1 file changed, 1 insertion(+), 16 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 3ccc703dc940..869569eb8c7f 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
>  	CS_LOCK(drvdata->base);
>  }
>
> -static void cti_enable_hw_smp_call(void *info)
> -{
> -	struct cti_drvdata *drvdata = info;
> -
> -	cti_write_all_hw_regs(drvdata);
> -}
> -
>  /* write regs to hardware and enable */
>  static int cti_enable_hw(struct cti_drvdata *drvdata)
>  {
> @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
>  	if (rc)
>  		goto cti_err_not_enabled;
>
> -	if (drvdata->ctidev.cpu >= 0) {
> -		rc = smp_call_function_single(drvdata->ctidev.cpu,
> -					      cti_enable_hw_smp_call,
> -					      drvdata, 1);
> -		if (rc)
> -			goto cti_err_not_enabled;
> -	} else {
> -		cti_write_all_hw_regs(drvdata);
> -	}
> +	cti_write_all_hw_regs(drvdata);
>
>  	config->hw_enabled = true;
>  	atomic_inc(&drvdata->config.enable_req_count);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

* Re: [PATCH] coresight: cti: write regsiters directly in cti_enable_hw()
2020-08-27 18:12 ` Mathieu Poirier
@ 2020-08-28 3:18 ` Tingwei Zhang
2020-08-31 20:28 ` Mathieu Poirier
0 siblings, 1 reply; 11+ messages in thread
From: Tingwei Zhang @ 2020-08-28 3:18 UTC (permalink / raw)
To: Mathieu Poirier
Cc: tsoni, Sai Prakash Ranjan, Suzuki K Poulose, Alexander Shishkin,
Greg Kroah-Hartman, coresight, Mao Jinlong, linux-kernel,
Tingwei Zhang, linux-arm-kernel, Mike Leach

On Fri, Aug 28, 2020 at 02:12:53AM +0800, Mathieu Poirier wrote:
> Hi Tingwei,
>
> On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> > The deadlock below is triggered when one CPU holds drvdata->spinlock
> > and calls cti_enable_hw(). smp_call_function_single() is called
> > in cti_enable_hw() and tries to let another CPU write the CTI registers.
> > That CPU is trying to take drvdata->spinlock in cti_cpu_pm_notify()
> > and doesn't respond to the IPI from smp_call_function_single().
> >
> > [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> > 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> > [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> > [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> > [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> > [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> > ...
> > [ 988.451638] Call trace:
> > [ 988.454119] smp_call_function_single+0x158/0x1b8
> > [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> > [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> > [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> > [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> > [ 988.480140] dev_attr_store+0x14/0x28
> > [ 988.483839] sysfs_kf_write+0x38/0x4c
> > [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> > [ 988.491585] vfs_write+0xfc/0x300
> > [ 988.494931] ksys_write+0x78/0xe0
> > [ 988.498283] __arm64_sys_write+0x18/0x20
> > [ 988.502240] el0_svc_common+0x98/0x160
> > [ 988.506024] do_el0_svc+0x78/0x80
> > [ 988.509377] el0_sync_handler+0xd4/0x270
> > [ 988.513337] el0_sync+0x164/0x180
> >
>
> Was this the full log, or did you cut some of it?
>

I cut some CPU register values since they're too long and not relevant.
The call trace is complete.

> > This change writes the CTI registers directly in cti_enable_hw().
> > config->hw_powered has been checked to be true with the spinlock held.
> > The CTI is powered and can be programmed until the spinlock is released.
> >
>
> From your explanation above it seems that cti_enable_hw() was called from,
> say, CPUy to enable the CTI associated with CPUx. CTIx's drvdata->spinlock
> was taken and smp_call_function_single() called right after. That woke up
> CPUx and cti_cpu_pm_notify() was executed on CPUx in interrupt context,
> trying to take CTIx's drvdata->spinlock. That hung CPUx and the kernel got
> angry. Is my assessment correct?
>

Most of it is correct. The only difference is that CPUx is powered on when
cti_enable_hw() is called. Otherwise it will goto cti_state_unchanged:
and won't call cti_enable_hw_smp_call(). cti_cpu_pm_notify() is called
when CPUx tries to suspend rather than resume.

> If so, I don't think the fix suggested in this patch will work. The same
> condition will happen whenever cti_enable_hw() is called on a CPU to enable
> a CTI that belongs to another CPU and cti_cpu_pm_notify() is called on the
> latter CPU at the same time.
>

I'm not sure I understand this correctly. Let me clarify it a little bit.
It's a deadlock since cti_enable_hw() holds the spinlock and calls
cti_enable_hw_smp_call() from CPUx to enable the CTI associated with CPUy.
It waits for cti_enable_hw_smp_call() to return. The IPI is sent to CPUy
while CPUy is in cti_cpu_pm_notify() and waiting for the spinlock. In this
patch, I remove cti_enable_hw_smp_call() and write the CTI registers
directly on CPUx. It won't wait for CPUy, and it releases the spinlock after
programming the CTI registers. After cti_enable_hw() releases the spinlock,
cti_cpu_pm_notify() will continue to run. Since the spinlock is held and
config->hw_powered is true, we don't need to worry about CPUy powering down
while we program the CTI on CPUx.

> I think a better solution is to grab the lock in cti_enable_hw() and check
> the value of ->ctidev.cpu. If not a global CPU, i.e. >= 0, then release the
> lock and call smp_call_function_single(). In cti_enable_hw_smp_call() take
> the lock again and move forward from there.
>

After cti_enable_hw() releases the lock, it's possible that the CPU is taken
offline by the user; cti_enable_hw_smp_call() will fail in this case.

> I have applied the other two patches in this set so no need to send them
> again.
>

Thanks,
Tingwei

> Thanks,
> Mathieu
>
> > Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> > Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
> > ---
> > drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> > 1 file changed, 1 insertion(+), 16 deletions(-)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> > index 3ccc703dc940..869569eb8c7f 100644
> > --- a/drivers/hwtracing/coresight/coresight-cti.c
> > +++ b/drivers/hwtracing/coresight/coresight-cti.c
> > @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
> >  	CS_LOCK(drvdata->base);
> >  }
> >
> > -static void cti_enable_hw_smp_call(void *info)
> > -{
> > -	struct cti_drvdata *drvdata = info;
> > -
> > -	cti_write_all_hw_regs(drvdata);
> > -}
> > -
> >  /* write regs to hardware and enable */
> >  static int cti_enable_hw(struct cti_drvdata *drvdata)
> >  {
> > @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> >  	if (rc)
> >  		goto cti_err_not_enabled;
> >
> > -	if (drvdata->ctidev.cpu >= 0) {
> > -		rc = smp_call_function_single(drvdata->ctidev.cpu,
> > -					      cti_enable_hw_smp_call,
> > -					      drvdata, 1);
> > -		if (rc)
> > -			goto cti_err_not_enabled;
> > -	} else {
> > -		cti_write_all_hw_regs(drvdata);
> > -	}
> > +	cti_write_all_hw_regs(drvdata);
> >
> >  	config->hw_enabled = true;
> >  	atomic_inc(&drvdata->config.enable_req_count);
> > --
> > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> > a Linux Foundation Collaborative Project
> >

* Re: [PATCH] coresight: cti: write regsiters directly in cti_enable_hw()
2020-08-28 3:18 ` Tingwei Zhang
@ 2020-08-31 20:28 ` Mathieu Poirier
2020-09-01 6:35 ` Tingwei Zhang
0 siblings, 1 reply; 11+ messages in thread
From: Mathieu Poirier @ 2020-08-31 20:28 UTC (permalink / raw)
To: Tingwei Zhang
Cc: tsoni, Sai Prakash Ranjan, Suzuki K Poulose, Alexander Shishkin,
Greg Kroah-Hartman, coresight, Mao Jinlong, linux-kernel,
Tingwei Zhang, linux-arm-kernel, Mike Leach

On Fri, Aug 28, 2020 at 11:18:18AM +0800, Tingwei Zhang wrote:
> On Fri, Aug 28, 2020 at 02:12:53AM +0800, Mathieu Poirier wrote:
> > Hi Tingwei,
> >
> > On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> > > The deadlock below is triggered when one CPU holds drvdata->spinlock
> > > and calls cti_enable_hw(). smp_call_function_single() is called
> > > in cti_enable_hw() and tries to let another CPU write the CTI registers.
> > > That CPU is trying to take drvdata->spinlock in cti_cpu_pm_notify()
> > > and doesn't respond to the IPI from smp_call_function_single().
> > >
> > > [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> > > 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> > > [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> > > [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> > > [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> > > [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> > > ...
> > > [ 988.451638] Call trace:
> > > [ 988.454119] smp_call_function_single+0x158/0x1b8
> > > [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> > > [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> > > [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> > > [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> > > [ 988.480140] dev_attr_store+0x14/0x28
> > > [ 988.483839] sysfs_kf_write+0x38/0x4c
> > > [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> > > [ 988.491585] vfs_write+0xfc/0x300
> > > [ 988.494931] ksys_write+0x78/0xe0
> > > [ 988.498283] __arm64_sys_write+0x18/0x20
> > > [ 988.502240] el0_svc_common+0x98/0x160
> > > [ 988.506024] do_el0_svc+0x78/0x80
> > > [ 988.509377] el0_sync_handler+0xd4/0x270
> > > [ 988.513337] el0_sync+0x164/0x180
> > >
> >
> > Was this the full log, or did you cut some of it?
> >
>
> I cut some CPU register values since they're too long and not relevant.
> The call trace is complete.
>
> > > This change writes the CTI registers directly in cti_enable_hw().
> > > config->hw_powered has been checked to be true with the spinlock held.
> > > The CTI is powered and can be programmed until the spinlock is released.
> > >
> >
> > From your explanation above it seems that cti_enable_hw() was called from,
> > say, CPUy to enable the CTI associated with CPUx. CTIx's drvdata->spinlock
> > was taken and smp_call_function_single() called right after. That woke up
> > CPUx and cti_cpu_pm_notify() was executed on CPUx in interrupt context,
> > trying to take CTIx's drvdata->spinlock. That hung CPUx and the kernel got
> > angry. Is my assessment correct?
> >
>
> Most of it is correct. The only difference is that CPUx is powered on when
> cti_enable_hw() is called. Otherwise it will goto cti_state_unchanged:
> and won't call cti_enable_hw_smp_call(). cti_cpu_pm_notify() is called
> when CPUx tries to suspend rather than resume.
>
> > If so, I don't think the fix suggested in this patch will work. The same
> > condition will happen whenever cti_enable_hw() is called on a CPU to enable
> > a CTI that belongs to another CPU and cti_cpu_pm_notify() is called on the
> > latter CPU at the same time.
> >
>
> I'm not sure I understand this correctly. Let me clarify it a little bit.
> It's a deadlock since cti_enable_hw() holds the spinlock and calls
> cti_enable_hw_smp_call() from CPUx to enable the CTI associated with CPUy.
> It waits for cti_enable_hw_smp_call() to return. The IPI is sent to CPUy
> while CPUy is in cti_cpu_pm_notify() and waiting for the spinlock. In this
> patch, I remove cti_enable_hw_smp_call() and write the CTI registers
> directly on CPUx. It won't wait for CPUy, and it releases the spinlock after
> programming the CTI registers. After cti_enable_hw() releases the spinlock,
> cti_cpu_pm_notify() will continue to run. Since the spinlock is held and
> config->hw_powered is true, we don't need to worry about CPUy powering down
> while we program the CTI on CPUx.
>
> > I think a better solution is to grab the lock in cti_enable_hw() and check
> > the value of ->ctidev.cpu. If not a global CPU, i.e. >= 0, then release the
> > lock and call smp_call_function_single(). In cti_enable_hw_smp_call() take
> > the lock again and move forward from there.
> >
>
> After cti_enable_hw() releases the lock, it's possible that the CPU is taken
> offline by the user; cti_enable_hw_smp_call() will fail in this case.
>

I took another look at your solution and you are correct, CPUy won't lock
but rather spin waiting for cti_enable_hw() to finish on CPUx. To make sure
CPUx is not interrupted while executing cti_enable_hw() I suggest using
spin_lock_irq().

Thanks,
Mathieu

>
> > I have applied the other two patches in this set so no need to send them
> > again.
> >
> Thanks,
> Tingwei
> > Thanks,
> > Mathieu
> >
> > > Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> > > Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
> > > ---
> > > drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> > > 1 file changed, 1 insertion(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> > > index 3ccc703dc940..869569eb8c7f 100644
> > > --- a/drivers/hwtracing/coresight/coresight-cti.c
> > > +++ b/drivers/hwtracing/coresight/coresight-cti.c
> > > @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
> > >  	CS_LOCK(drvdata->base);
> > >  }
> > >
> > > -static void cti_enable_hw_smp_call(void *info)
> > > -{
> > > -	struct cti_drvdata *drvdata = info;
> > > -
> > > -	cti_write_all_hw_regs(drvdata);
> > > -}
> > > -
> > >  /* write regs to hardware and enable */
> > >  static int cti_enable_hw(struct cti_drvdata *drvdata)
> > >  {
> > > @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> > >  	if (rc)
> > >  		goto cti_err_not_enabled;
> > >
> > > -	if (drvdata->ctidev.cpu >= 0) {
> > > -		rc = smp_call_function_single(drvdata->ctidev.cpu,
> > > -					      cti_enable_hw_smp_call,
> > > -					      drvdata, 1);
> > > -		if (rc)
> > > -			goto cti_err_not_enabled;
> > > -	} else {
> > > -		cti_write_all_hw_regs(drvdata);
> > > -	}
> > > +	cti_write_all_hw_regs(drvdata);
> > >
> > >  	config->hw_enabled = true;
> > >  	atomic_inc(&drvdata->config.enable_req_count);
> > > --
> > > The Qualcomm Innovation Center, Inc.
is a member of the Code Aurora > > Forum, > > > a Linux Foundation Collaborative Project > > > > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 11+ messages in thread
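The circular wait discussed in the message above can be sketched as a small userspace model. This is not driver code: the enum, the struct and model_run() are hypothetical names invented for illustration. The model assumes, as the thread describes, that a CPU spinning on drvdata->spinlock in cti_cpu_pm_notify() cannot service the IPI sent by smp_call_function_single().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout: a toy model, not coresight-cti.c. */
enum write_strategy {
	WRITE_VIA_IPI,	/* lock holder asks the target CPU to program the CTI and waits */
	WRITE_DIRECT,	/* lock holder programs the CTI itself */
};

struct two_cpu_model {
	int lock_owner;		/* -1 when free, otherwise 0 (CPUx) or 1 (CPUy) */
	bool y_spinning;	/* CPUy spins on the lock and cannot take an IPI */
	bool regs_written;
	bool y_notifier_done;
};

/*
 * Returns true when both the enable path on CPUx and the PM notifier on
 * CPUy run to completion, false when the two CPUs wait on each other.
 */
bool model_run(enum write_strategy strategy, bool y_in_pm_notifier)
{
	struct two_cpu_model m = {
		.lock_owner = -1,
		.y_spinning = y_in_pm_notifier,
	};

	/* CPUx enters the enable path and takes the lock first. */
	m.lock_owner = 0;

	if (strategy == WRITE_VIA_IPI) {
		/*
		 * CPUx waits for CPUy to run the callback while still holding
		 * the lock. A spinning CPUy never runs it: circular wait.
		 */
		if (m.y_spinning)
			return false;
		m.regs_written = true;	/* callback ran on CPUy */
	} else {
		m.regs_written = true;	/* CPUx wrote the registers itself */
	}

	/* CPUx drops the lock; a spinning CPUy can now take it and finish. */
	m.lock_owner = -1;
	if (m.y_spinning) {
		m.lock_owner = 1;
		m.y_spinning = false;
		m.lock_owner = -1;
	}
	m.y_notifier_done = true;	/* nothing left pending on CPUy */

	assert(m.lock_owner == -1);
	return m.regs_written && m.y_notifier_done;
}
```

Calling model_run(WRITE_VIA_IPI, true) reports the stuck case, while every WRITE_DIRECT call completes. The model does not cover interrupts arriving on CPUx while it holds the lock, which is the case the spin_lock_irq() suggestion is aimed at.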
* Re: [PATCH] coresight: cti: write regsiters directly in cti_enable_hw()
2020-08-31 20:28 ` Mathieu Poirier
@ 2020-09-01 6:35 ` Tingwei Zhang
0 siblings, 0 replies; 11+ messages in thread
From: Tingwei Zhang @ 2020-09-01 6:35 UTC (permalink / raw)
To: Mathieu Poirier
Cc: tsoni, Sai Prakash Ranjan, Suzuki K Poulose, Alexander Shishkin,
Greg Kroah-Hartman, coresight, Mao Jinlong, linux-kernel,
Tingwei Zhang, linux-arm-kernel, Mike Leach
On Tue, Sep 01, 2020 at 04:28:19AM +0800, Mathieu Poirier wrote:
> On Fri, Aug 28, 2020 at 11:18:18AM +0800, Tingwei Zhang wrote:
> > On Fri, Aug 28, 2020 at 02:12:53AM +0800, Mathieu Poirier wrote:
> > > Hi Tingwei,
> > >
> > > On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> > > > Deadlock as below is triggered by one CPU holds drvdata->spinlock
> > > > and calls cti_enable_hw(). Smp_call_function_single() is called
> > > > in cti_enable_hw() and tries to let another CPU write CTI registers.
> > > > That CPU is trying to get drvdata->spinlock in cti_cpu_pm_notify()
> > > > and doesn't response to IPI from smp_call_function_single().
> > > >
> > > > [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> > > > 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> > > > [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> > > > [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> > > > [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> > > > [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> > > > ...
> > > > [ 988.451638] Call trace:
> > > > [ 988.454119] smp_call_function_single+0x158/0x1b8
> > > > [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> > > > [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128
> [coresight]
> > > > [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> > > > [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> > > > [ 988.480140] dev_attr_store+0x14/0x28
> > > > [ 988.483839] sysfs_kf_write+0x38/0x4c
> > > > [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> > > > [ 988.491585] vfs_write+0xfc/0x300
> > > > [ 988.494931] ksys_write+0x78/0xe0
> > > > [ 988.498283] __arm64_sys_write+0x18/0x20
> > > > [ 988.502240] el0_svc_common+0x98/0x160
> > > > [ 988.506024] do_el0_svc+0x78/0x80
> > > > [ 988.509377] el0_sync_handler+0xd4/0x270
> > > > [ 988.513337] el0_sync+0x164/0x180
> > > >
> > >
> > > Was this the full log or you did cut some of it?
> > >
> >
> > I cut some CPU registers' value since it's too long and not relevant.
> > The Call trace is full.
> >
> > > > This change write CTI registers directly in cti_enable_hw().
> > > > Config->hw_powered has been checked to be true with spinlock holded.
> > > > CTI is powered and can be programmed until spinlock is released.
> > > >
> > >
> > > From your explanation above it seems that cti_enable_hw() was called
> from,
> > > say
> > > CPUy, to enable the CTI associated to CPUx. CTIx's drvdata->spinlock
> was
> > > taken
> > > and smp_call_function_single() called right after. That woke up CPUx
> and
> > > cti_cpu_pm_notify() was executed on CPUx in interrupt context, trying
> to
> > > take
> > > CTIx's drvdata->spinlock. That hung CPUx and the kernel got angry.
> Is my
> > > assessment correct?
> > >
> >
> > Most of them is correct. The only difference is CPUx is power on when
> > cti_enable_hw() is called. Otherwise it will goto cti_state_unchanged:
> > and won't call cti_enable_hw_smp_call(). cti_cpu_pm_notify() is called
> > when CPUx tries to suspend instead of resume.
> >
> > > If so I don't think the fix suggested in this patch will work. The
> same
> > > condition will happen whenever cti_enable_hw() is called on a CPU to
> > > enable a
> > > CTI that belongs to another CPU and that cti_cpu_pm_notify() is called
> on
> > > latter
> > > CPU at the same time.
> > >
> >
> > I'm not sure I understand this correctly. Let me clarify it a little
> bit.
> > It's a deadlock since cti_enable_hw() holds the spinlock and calls
> > cti_enable_hw_smp_call() from CPUx to enable CTI associated to CPUy. It
> > waits for cti_enable_hw_smp_call() to return. IPI is sent to CPUy while
> > CPUy is in cti_cpu_pm_notify() and waits for spinlock. In this patch,
> > I remove cti_enable_hw_smp_call() and write CTI CPU directly on CPUx.
> > It won't wait for CPUy and release spinlock after program registers of
> > CTI. After cti_enable_hw() releases spinlock, cti_cpu_pm_notify() will
> > continue to run. Since spinlock is held and config->hw_powered is true,
> > we don't need to worry about CPUy power down when we program CTI on
> CPUx.
> >
> > > I think a better solution is to grab the lock in cti_enable_hw() and
> check
> > > the
> > > value of ->ctidev.cpu. If not a global CPU, i.e >= 0, then release
> the
> > > lock and
> > > call smp_call_function_single(). In cti_enable_hw_smp_call() take the
> > > lock
> > > again and move forward from there.
> > >
> >
> > After cti_enable_hw() releases the lock, it's possible that CPU is
> offline
> > by user, cti_enable_hw_smp_call() will fail in this case.
> >
> >
>
> I took another look at your solution and you are correct, CPUy won't lock
> but
> rather spin waiting for cti_enable_hw() to finish on CPUx. To make sure
> CPUx is
> not interrupted while executing cti_enable_hw() I suggest to use
> spin_lock_irq().
>
> Thanks,
> Mathieu
>

Sure. I've sent v2 according to your comments.

> >
> > > I have applied the other two patches in this set so no need to send
> them
> > > again.
> > >
> > Thanks,
> > Tingwei
> > > Thanks,
> > > Mathieu
> > >
> > > > Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI
> > > devices")
> > > > Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
> > > > ---
> > > > drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> > > > 1 file changed, 1 insertion(+), 16 deletions(-)
> > > >
> > > > diff --git a/drivers/hwtracing/coresight/coresight-cti.c
> > > b/drivers/hwtracing/coresight/coresight-cti.c
> > > > index 3ccc703dc940..869569eb8c7f 100644
> > > > --- a/drivers/hwtracing/coresight/coresight-cti.c
> > > > +++ b/drivers/hwtracing/coresight/coresight-cti.c
> > > > @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata
> > > *drvdata)
> > > > 	CS_LOCK(drvdata->base);
> > > > }
> > > >
> > > > -static void cti_enable_hw_smp_call(void *info)
> > > > -{
> > > > -	struct cti_drvdata *drvdata = info;
> > > > -
> > > > -	cti_write_all_hw_regs(drvdata);
> > > > -}
> > > > -
> > > > /* write regs to hardware and enable */
> > > > static int cti_enable_hw(struct cti_drvdata *drvdata)
> > > > {
> > > > @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata
> > > *drvdata)
> > > > 	if (rc)
> > > > 		goto cti_err_not_enabled;
> > > >
> > > > -	if (drvdata->ctidev.cpu >= 0) {
> > > > -		rc = smp_call_function_single(drvdata->ctidev.cpu,
> > > > -				cti_enable_hw_smp_call,
> > > > -				drvdata, 1);
> > > > -		if (rc)
> > > > -			goto cti_err_not_enabled;
> > > > -	} else {
> > > > -		cti_write_all_hw_regs(drvdata);
> > > > -	}
> > > > +	cti_write_all_hw_regs(drvdata);
> > > >
> > > > 	config->hw_enabled = true;
> > > > 	atomic_inc(&drvdata->config.enable_req_count);
> > > > --
> > > > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
> > > Forum,
> > > > a Linux Foundation Collaborative Project
> > > >
> > >
> > > _______________________________________________
> > > linux-arm-kernel mailing list
> > > linux-arm-kernel@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH] coresight: cti: write regsiters directly in cti_enable_hw()
2020-08-18 11:10 ` [PATCH] coresight: cti: write regsiters directly in cti_enable_hw() Tingwei Zhang
2020-08-20 8:06 ` Mike Leach
2020-08-27 18:12 ` Mathieu Poirier
@ 2020-08-27 19:17 ` Mathieu Poirier
2 siblings, 0 replies; 11+ messages in thread
From: Mathieu Poirier @ 2020-08-27 19:17 UTC (permalink / raw)
To: Tingwei Zhang
Cc: tsoni, Sai Prakash Ranjan, Suzuki K Poulose, Alexander Shishkin,
Greg Kroah-Hartman, coresight, Mao Jinlong, linux-kernel,
linux-arm-kernel, Mike Leach
On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> Deadlock as below is triggered by one CPU holds drvdata->spinlock
> and calls cti_enable_hw(). Smp_call_function_single() is called
> in cti_enable_hw() and tries to let another CPU write CTI registers.
> That CPU is trying to get drvdata->spinlock in cti_cpu_pm_notify()
> and doesn't response to IPI from smp_call_function_single().
>
> [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> ...
> [ 988.451638] Call trace:
> [ 988.454119] smp_call_function_single+0x158/0x1b8
> [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> [ 988.480140] dev_attr_store+0x14/0x28
> [ 988.483839] sysfs_kf_write+0x38/0x4c
> [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> [ 988.491585] vfs_write+0xfc/0x300
> [ 988.494931] ksys_write+0x78/0xe0
> [ 988.498283] __arm64_sys_write+0x18/0x20
> [ 988.502240] el0_svc_common+0x98/0x160
> [ 988.506024] do_el0_svc+0x78/0x80
> [ 988.509377] el0_sync_handler+0xd4/0x270
> [ 988.513337] el0_sync+0x164/0x180
>
> This change write CTI registers directly in cti_enable_hw().
> Config->hw_powered has been checked to be true with spinlock holded.
> CTI is powered and can be programmed until spinlock is released.
>
> Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> 1 file changed, 1 insertion(+), 16 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 3ccc703dc940..869569eb8c7f 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
> 	CS_LOCK(drvdata->base);
> }
>
> -static void cti_enable_hw_smp_call(void *info)
> -{
> -	struct cti_drvdata *drvdata = info;
> -
> -	cti_write_all_hw_regs(drvdata);
> -}
> -
> /* write regs to hardware and enable */
> static int cti_enable_hw(struct cti_drvdata *drvdata)
> {
> @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> 	if (rc)
> 		goto cti_err_not_enabled;
>
> -	if (drvdata->ctidev.cpu >= 0) {
> -		rc = smp_call_function_single(drvdata->ctidev.cpu,
> -				cti_enable_hw_smp_call,
> -				drvdata, 1);
> -		if (rc)
> -			goto cti_err_not_enabled;
> -	} else {
> -		cti_write_all_hw_regs(drvdata);
> -	}
> +	cti_write_all_hw_regs(drvdata);

I forgot... Looking at the code in cti_enable_hw() the claim tags are not
released in the event that smp_call_function_single() fails. Please address
that as part of refactoring the function.

Thanks,
Mathieu

>
> 	config->hw_enabled = true;
> 	atomic_inc(&drvdata->config.enable_req_count);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
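The error-path concern raised in the message above (a claim taken before a step that can fail must be released when that step fails) can be sketched in userspace as follows. The struct and every model_* function are hypothetical stand-ins, not the coresight API, and program_rc is an assumed parameter that merely plays the role of the return value of the register programming step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a claimable device; not the coresight API. */
struct claim_model {
	bool claimed;
	bool hw_enabled;
	int enable_req_count;
};

int model_claim(struct claim_model *m)
{
	if (m->claimed)
		return -1;	/* already claimed by someone */
	m->claimed = true;
	return 0;
}

void model_disclaim(struct claim_model *m)
{
	assert(m->claimed);	/* disclaiming an unclaimed device is a bug */
	m->claimed = false;
}

/* program_rc stands in for the result of the register programming step. */
int model_enable_hw(struct claim_model *m, int program_rc)
{
	int rc;

	rc = model_claim(m);
	if (rc)
		return rc;

	rc = program_rc;
	if (rc)
		goto err_disclaim;

	m->hw_enabled = true;
	m->enable_req_count++;
	return 0;

err_disclaim:
	/* Undo the claim so that a later enable attempt can succeed. */
	model_disclaim(m);
	return rc;
}

/* True when a failed enable followed by a retry ends up enabled exactly once. */
bool model_retry_after_failure_works(void)
{
	struct claim_model m = { 0 };

	if (model_enable_hw(&m, -1) == 0 || m.claimed)
		return false;
	return model_enable_hw(&m, 0) == 0 && m.hw_enabled &&
	       m.enable_req_count == 1;
}
```

Without the err_disclaim step the first failure would leave the claim set, and the retry in model_retry_after_failure_works() would be refused by model_claim().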
* Re: [PATCH] coresight: cti: disclaim device only when it's claimed
2020-08-18 11:10 [PATCH] coresight: cti: disclaim device only when it's claimed Tingwei Zhang
2020-08-18 11:10 ` [PATCH] coresight: cti: remove pm_runtime_get_sync() from CPU hotplug Tingwei Zhang
2020-08-18 11:10 ` [PATCH] coresight: cti: write regsiters directly in cti_enable_hw() Tingwei Zhang
@ 2020-08-20 8:12 ` Mike Leach
2 siblings, 0 replies; 11+ messages in thread
From: Mike Leach @ 2020-08-20 8:12 UTC (permalink / raw)
To: Tingwei Zhang
Cc: tsoni, Sai Prakash Ranjan, Mathieu Poirier, Suzuki K Poulose,
Alexander Shishkin, Greg Kroah-Hartman, Coresight ML, Mao Jinlong,
Linux Kernel Mailing List, linux-arm-kernel
On Tue, 18 Aug 2020 at 12:11, Tingwei Zhang <tingwei@codeaurora.org> wrote:
>
> Coresight_claim_device() is called in cti_starting_cpu() only
> when CTI is enabled while coresight_disclaim_device() is called
> uncontionally in cti_dying_cpu(). This triggered below WARNING.
> Only call disclaim device when CTI device is enabled to fix it.
>
> [ 75.989643] WARNING: CPU: 1 PID: 14 at
> kernel/drivers/hwtracing/coresight/coresight.c:209
> coresight_disclaim_device_unlocked+0x10/0x24
> [ 75.989697] CPU: 1 PID: 14 Comm: migration/1 Not tainted
> 5.9.0-rc1-gff1304be0a05-dirty #21
> [ 75.989709] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 75.989737] pstate: 80c00085 (Nzcv daIf +PAN +UAO BTYPE=--)
> [ 75.989758] pc : coresight_disclaim_device_unlocked+0x10/0x24
> [ 75.989775] lr : coresight_disclaim_device+0x24/0x38
> [ 75.989783] sp : ffff800011cd3c90
> .
> [ 75.990018] Call trace:
> [ 75.990041] coresight_disclaim_device_unlocked+0x10/0x24
> [ 75.990066] cti_dying_cpu+0x34/0x4c
> [ 75.990101] cpuhp_invoke_callback+0x84/0x1e0
> [ 75.990121] take_cpu_down+0x90/0xe0
> [ 75.990154] multi_cpu_stop+0x134/0x160
> [ 75.990171] cpu_stopper_thread+0xb0/0x13c
> [ 75.990196] smpboot_thread_fn+0x1c4/0x270
> [ 75.990222] kthread+0x128/0x154
> [ 75.990251] ret_from_fork+0x10/0x18
>
> Fixes: e9b880581d55 ("coresight: cti: Add CPU Hotplug handling to CTI driver")
> Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 3ccc703dc940..d6fea6efec71 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -742,7 +742,8 @@ static int cti_dying_cpu(unsigned int cpu)
>
> 	spin_lock(&drvdata->spinlock);
> 	drvdata->config.hw_powered = false;
> -	coresight_disclaim_device(drvdata->base);
> +	if (drvdata->config.hw_enabled)
> +		coresight_disclaim_device(drvdata->base);
> 	spin_unlock(&drvdata->spinlock);
> 	return 0;
> }
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

Reviewed-by: Mike Leach <mike.leach@linaro.org>

--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
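The claim/disclaim imbalance that the reviewed patch addresses can be sketched with a small userspace model. All names below are hypothetical and only mirror the shape of the hotplug callbacks described in the commit message; the warnings counter is an assumed stand-in for the WARNING shown in the log, and the guarded flag selects between the behaviour before and after the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the hotplug callbacks; not kernel code. */
struct hotplug_model {
	bool hw_powered;
	bool hw_enabled;
	bool claimed;
	int warnings;	/* counts disclaims of a device that was never claimed */
};

void hp_disclaim(struct hotplug_model *m)
{
	if (!m->claimed) {
		m->warnings++;	/* stands in for the WARNING in the log */
		return;
	}
	m->claimed = false;
}

/* CPU comes up: the device is claimed only when the CTI is enabled. */
void hp_starting_cpu(struct hotplug_model *m)
{
	m->hw_powered = true;
	if (m->hw_enabled)
		m->claimed = true;
}

/* CPU goes down: guarded selects the patched (conditional) behaviour. */
void hp_dying_cpu(struct hotplug_model *m, bool guarded)
{
	m->hw_powered = false;
	if (!guarded || m->hw_enabled)
		hp_disclaim(m);
}

/* Runs one up/down cycle and returns how many warnings it produced. */
int hp_cycle_warnings(bool hw_enabled, bool guarded)
{
	struct hotplug_model m = { .hw_enabled = hw_enabled };

	hp_starting_cpu(&m);
	hp_dying_cpu(&m, guarded);
	return m.warnings;
}
```

In this model only the unguarded path with a disabled CTI, hp_cycle_warnings(false, false), produces a warning; the guarded path stays silent whether or not the CTI is enabled.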