From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Niklas Söderlund" <niklas.soderlund+renesas@ragnatech.se>,
"Geert Uytterhoeven" <geert+renesas@glider.be>,
"Daniel Lezcano" <daniel.lezcano@linaro.org>,
"Sasha Levin" <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.14 52/66] clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM
Date: Tue, 22 Dec 2020 21:22:38 -0500
Message-ID: <20201223022253.2793452-52-sashal@kernel.org>
In-Reply-To: <20201223022253.2793452-1-sashal@kernel.org>
From: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
[ Upstream commit 8ae954caf49ac403c177d117fb8e05cbc866aa3c ]

ch->lock protects the whole of enable() and read() in sh_cmt's
implementation of struct clocksource. The enable() implementation
calls pm_runtime_get_sync(), which may result in the clock source
being read(), triggering a cyclic lockdep warning for ch->lock.

The sh_cmt driver implements its own balancing of calls to
sh_cmt_{enable,disable}() with flags in sh_cmt_{start,stop}(). It does
this because start and stop are shared between the clock source and
clock event providers. While this could be improved, verifying the
corner cases of any substantial rework on all devices this driver
supports might prove hard.

As a first step, separate the PM handling for clock event and clock
source. Always get/put the device when enabling/disabling the clock
source, but keep the clock event logic unchanged. This allows the sh_cmt
implementation of struct clocksource to call PM without holding
ch->lock, avoiding the deadlock.
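
The ordering described above can be sketched as a small userspace model.
This is an illustration only, not driver code: struct model_channel, the
model_*() helpers and the lock_held/pm_calls_locked bookkeeping are
hypothetical names invented for the sketch, and the flag values are the
model's own. It only shows that the clock source path now takes and drops
its PM reference outside the region where ch->lock would be held.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values chosen for this model only. */
#define FLAG_CLOCKEVENT		(1UL << 0)
#define FLAG_CLOCKSOURCE	(1UL << 1)

/* Hypothetical stand-in for struct sh_cmt_channel. */
struct model_channel {
	unsigned long flags;	/* current users of the channel */
	bool lock_held;		/* models ch->lock being held */
	int pm_usage;		/* models the runtime PM usage count */
	int pm_calls_locked;	/* PM calls made while lock_held was true */
};

static void model_pm_get(struct model_channel *ch)
{
	if (ch->lock_held)
		ch->pm_calls_locked++;
	ch->pm_usage++;
}

static void model_pm_put(struct model_channel *ch)
{
	if (ch->lock_held)
		ch->pm_calls_locked++;
	ch->pm_usage--;
}

static void model_start(struct model_channel *ch, unsigned long flag)
{
	/* Clock source: take the PM reference before the lock. */
	if (flag & FLAG_CLOCKSOURCE)
		model_pm_get(ch);

	ch->lock_held = true;
	if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
		/* Clock event: unchanged, still done under the lock. */
		if (flag & FLAG_CLOCKEVENT)
			model_pm_get(ch);
	}
	ch->flags |= flag;
	ch->lock_held = false;
}

static void model_stop(struct model_channel *ch, unsigned long flag)
{
	unsigned long f;

	ch->lock_held = true;
	f = ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE);
	ch->flags &= ~flag;
	if (f && !(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
		if (flag & FLAG_CLOCKEVENT)
			model_pm_put(ch);
	}
	ch->lock_held = false;

	/* Clock source: drop the PM reference after the lock is released. */
	if (flag & FLAG_CLOCKSOURCE)
		model_pm_put(ch);
}
```

Starting and stopping the model with FLAG_CLOCKSOURCE alone leaves
pm_usage balanced and pm_calls_locked at zero, which is the property the
clock source path relies on to avoid the lockdep report.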

How to trigger the deadlock warning, and the resulting log:

# echo e60f0000.timer > /sys/devices/system/clocksource/clocksource0/current_clocksource
[ 46.948370] ======================================================
[ 46.954730] WARNING: possible circular locking dependency detected
[ 46.961094] 5.10.0-rc6-arm64-renesas-00001-g0e5fd7414e8b #36 Not tainted
[ 46.967985] ------------------------------------------------------
[ 46.974342] migration/0/11 is trying to acquire lock:
[ 46.979543] ffff0000403ed220 (&dev->power.lock){-...}-{2:2}, at: __pm_runtime_resume+0x40/0x74
[ 46.988445]
[ 46.988445] but task is already holding lock:
[ 46.994441] ffff000040ad0298 (&ch->lock){....}-{2:2}, at: sh_cmt_start+0x28/0x210
[ 47.002173]
[ 47.002173] which lock already depends on the new lock.
[ 47.002173]
[ 47.010573]
[ 47.010573] the existing dependency chain (in reverse order) is:
[ 47.018262]
[ 47.018262] -> #3 (&ch->lock){....}-{2:2}:
[ 47.024033] lock_acquire.part.0+0x120/0x330
[ 47.028970] lock_acquire+0x64/0x80
[ 47.033105] _raw_spin_lock_irqsave+0x7c/0xc4
[ 47.038130] sh_cmt_start+0x28/0x210
[ 47.042352] sh_cmt_clocksource_enable+0x28/0x50
[ 47.047644] change_clocksource+0x9c/0x160
[ 47.052402] multi_cpu_stop+0xa4/0x190
[ 47.056799] cpu_stopper_thread+0x90/0x154
[ 47.061557] smpboot_thread_fn+0x244/0x270
[ 47.066310] kthread+0x154/0x160
[ 47.070175] ret_from_fork+0x10/0x20
[ 47.074390]
[ 47.074390] -> #2 (tk_core.seq.seqcount){----}-{0:0}:
[ 47.081136] lock_acquire.part.0+0x120/0x330
[ 47.086070] lock_acquire+0x64/0x80
[ 47.090203] seqcount_lockdep_reader_access.constprop.0+0x74/0x100
[ 47.097096] ktime_get+0x28/0xa0
[ 47.100960] hrtimer_start_range_ns+0x210/0x2dc
[ 47.106164] generic_sched_clock_init+0x70/0x88
[ 47.111364] sched_clock_init+0x40/0x64
[ 47.115853] start_kernel+0x494/0x524
[ 47.120156]
[ 47.120156] -> #1 (hrtimer_bases.lock){-.-.}-{2:2}:
[ 47.126721] lock_acquire.part.0+0x120/0x330
[ 47.136042] lock_acquire+0x64/0x80
[ 47.144461] _raw_spin_lock_irqsave+0x7c/0xc4
[ 47.153721] hrtimer_start_range_ns+0x68/0x2dc
[ 47.163054] rpm_suspend+0x308/0x5dc
[ 47.171473] rpm_idle+0xc4/0x2a4
[ 47.179550] pm_runtime_work+0x98/0xc0
[ 47.188209] process_one_work+0x294/0x6f0
[ 47.197142] worker_thread+0x70/0x45c
[ 47.205661] kthread+0x154/0x160
[ 47.213673] ret_from_fork+0x10/0x20
[ 47.221957]
[ 47.221957] -> #0 (&dev->power.lock){-...}-{2:2}:
[ 47.236292] check_noncircular+0x128/0x140
[ 47.244907] __lock_acquire+0x13b0/0x204c
[ 47.253332] lock_acquire.part.0+0x120/0x330
[ 47.262033] lock_acquire+0x64/0x80
[ 47.269826] _raw_spin_lock_irqsave+0x7c/0xc4
[ 47.278430] __pm_runtime_resume+0x40/0x74
[ 47.286758] sh_cmt_start+0x84/0x210
[ 47.294537] sh_cmt_clocksource_enable+0x28/0x50
[ 47.303449] change_clocksource+0x9c/0x160
[ 47.311783] multi_cpu_stop+0xa4/0x190
[ 47.319720] cpu_stopper_thread+0x90/0x154
[ 47.328022] smpboot_thread_fn+0x244/0x270
[ 47.336298] kthread+0x154/0x160
[ 47.343708] ret_from_fork+0x10/0x20
[ 47.351445]
[ 47.351445] other info that might help us debug this:
[ 47.351445]
[ 47.370225] Chain exists of:
[ 47.370225] &dev->power.lock --> tk_core.seq.seqcount --> &ch->lock
[ 47.370225]
[ 47.392003] Possible unsafe locking scenario:
[ 47.392003]
[ 47.405314]        CPU0                    CPU1
[ 47.413569]        ----                    ----
[ 47.421768]   lock(&ch->lock);
[ 47.428425]                                lock(tk_core.seq.seqcount);
[ 47.438701]                                lock(&ch->lock);
[ 47.447930]   lock(&dev->power.lock);
[ 47.455172]
[ 47.455172] *** DEADLOCK ***
[ 47.455172]
[ 47.471433] 3 locks held by migration/0/11:
[ 47.479099] #0: ffff8000113c9278 (timekeeper_lock){-.-.}-{2:2}, at: change_clocksource+0x2c/0x160
[ 47.491834] #1: ffff8000113c8f88 (tk_core.seq.seqcount){----}-{0:0}, at: multi_cpu_stop+0xa4/0x190
[ 47.504727] #2: ffff000040ad0298 (&ch->lock){....}-{2:2}, at: sh_cmt_start+0x28/0x210
[ 47.516541]
[ 47.516541] stack backtrace:
[ 47.528480] CPU: 0 PID: 11 Comm: migration/0 Not tainted 5.10.0-rc6-arm64-renesas-00001-g0e5fd7414e8b #36
[ 47.542147] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[ 47.554241] Call trace:
[ 47.560832] dump_backtrace+0x0/0x190
[ 47.568670] show_stack+0x14/0x30
[ 47.576144] dump_stack+0xe8/0x130
[ 47.583670] print_circular_bug+0x1f0/0x200
[ 47.592015] check_noncircular+0x128/0x140
[ 47.600289] __lock_acquire+0x13b0/0x204c
[ 47.608486] lock_acquire.part.0+0x120/0x330
[ 47.616953] lock_acquire+0x64/0x80
[ 47.624582] _raw_spin_lock_irqsave+0x7c/0xc4
[ 47.633114] __pm_runtime_resume+0x40/0x74
[ 47.641371] sh_cmt_start+0x84/0x210
[ 47.649115] sh_cmt_clocksource_enable+0x28/0x50
[ 47.657916] change_clocksource+0x9c/0x160
[ 47.666165] multi_cpu_stop+0xa4/0x190
[ 47.674056] cpu_stopper_thread+0x90/0x154
[ 47.682308] smpboot_thread_fn+0x244/0x270
[ 47.690560] kthread+0x154/0x160
[ 47.697927] ret_from_fork+0x10/0x20
[ 47.708447] clocksource: Switched to clocksource e60f0000.timer
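
The dependency chain in the report can be read as a directed graph in
which the new acquisition closes a cycle. The sketch below is a toy
reachability check over the four lock classes named in the log. It is
not how lockdep is implemented; the enum names, dep[] matrix and helper
functions are all hypothetical and exist only to make the cycle explicit.

```c
#include <assert.h>
#include <stdbool.h>

/* The four lock classes that appear in the report (model names). */
enum { POWER_LOCK, HRTIMER_BASE, TK_SEQ, CH_LOCK, NR_LOCKS };

/* dep[a][b] is true when lock b was acquired while lock a was held. */
static bool dep[NR_LOCKS][NR_LOCKS];

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/* Would acquiring 'wanted' while holding 'held' close a cycle? */
static bool would_deadlock(int held, int wanted)
{
	bool seen[NR_LOCKS] = { false };

	return reachable(wanted, held, seen);
}

/* Record the existing dependencies listed as #1..#3 in the report. */
static void record_report_chain(void)
{
	dep[POWER_LOCK][HRTIMER_BASE] = true;	/* #1: rpm_suspend path */
	dep[HRTIMER_BASE][TK_SEQ] = true;	/* #2: ktime_get from hrtimer start */
	dep[TK_SEQ][CH_LOCK] = true;		/* #3: change_clocksource path */
}
```

With those three edges recorded, would_deadlock(CH_LOCK, POWER_LOCK)
returns true, matching entry #0 of the report, where
__pm_runtime_resume() is reached from sh_cmt_start() with ch->lock held.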
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201205021921.1456190-2-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/clocksource/sh_cmt.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/clocksource/sh_cmt.c b/drivers/clocksource/sh_cmt.c
index 3cd62f7c33e30..73f2bae07f1bc 100644
--- a/drivers/clocksource/sh_cmt.c
+++ b/drivers/clocksource/sh_cmt.c
@@ -317,7 +317,6 @@ static int sh_cmt_enable(struct sh_cmt_channel *ch)
 {
 	int k, ret;
 
-	pm_runtime_get_sync(&ch->cmt->pdev->dev);
 	dev_pm_syscore_device(&ch->cmt->pdev->dev, true);
 
 	/* enable clock */
@@ -392,7 +391,6 @@ static void sh_cmt_disable(struct sh_cmt_channel *ch)
 	clk_disable(ch->cmt->clk);
 
 	dev_pm_syscore_device(&ch->cmt->pdev->dev, false);
-	pm_runtime_put(&ch->cmt->pdev->dev);
 }
 
 /* private flags */
@@ -560,10 +558,16 @@ static int sh_cmt_start(struct sh_cmt_channel *ch, unsigned long flag)
 	int ret = 0;
 	unsigned long flags;
 
+	if (flag & FLAG_CLOCKSOURCE)
+		pm_runtime_get_sync(&ch->cmt->pdev->dev);
+
 	raw_spin_lock_irqsave(&ch->lock, flags);
 
-	if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE)))
+	if (!(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
+		if (flag & FLAG_CLOCKEVENT)
+			pm_runtime_get_sync(&ch->cmt->pdev->dev);
 		ret = sh_cmt_enable(ch);
+	}
 
 	if (ret)
 		goto out;
@@ -588,14 +592,20 @@ static void sh_cmt_stop(struct sh_cmt_channel *ch, unsigned long flag)
 	f = ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE);
 	ch->flags &= ~flag;
 
-	if (f && !(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE)))
+	if (f && !(ch->flags & (FLAG_CLOCKEVENT | FLAG_CLOCKSOURCE))) {
 		sh_cmt_disable(ch);
+		if (flag & FLAG_CLOCKEVENT)
+			pm_runtime_put(&ch->cmt->pdev->dev);
+	}
 
 	/* adjust the timeout to maximum if only clocksource left */
 	if ((flag == FLAG_CLOCKEVENT) && (ch->flags & FLAG_CLOCKSOURCE))
 		__sh_cmt_set_next(ch, ch->max_match_value);
 
 	raw_spin_unlock_irqrestore(&ch->lock, flags);
+
+	if (flag & FLAG_CLOCKSOURCE)
+		pm_runtime_put(&ch->cmt->pdev->dev);
 }
 
 static struct sh_cmt_channel *cs_to_sh_cmt(struct clocksource *cs)
--
2.27.0