From: Luo Gengkun <luogengkun@huaweicloud.com>
To: linux-kernel@vger.kernel.org
Cc: pmladek@suse.com, mhocko@suse.com, lecopzer.chen@mediatek.com,
yaoma@linux.alibaba.com, linuxppc-dev@lists.ozlabs.org,
dianders@chromium.org, song@kernel.org, bpf@vger.kernel.org,
npiggin@gmail.com, trix@redhat.com, naveen.n.rao@linux.ibm.com,
kernelfans@gmail.com, akpm@linux-foundation.org,
luogengkun@huaweicloud.com, tglx@linutronix.de
Subject: [PATCH] watchdog/core: Fix AA deadlock due to watchdog holding cpu_hotplug_lock and wait for wq
Date: Thu, 6 Jun 2024 15:38:28 +0000
Message-ID: <20240606153828.3261006-1-luogengkun@huaweicloud.com>
We found an AA deadlock, as shown below:
TaskA                     TaskB                      WatchDog                        system_wq
...
css_killed_work_fn:
P(cgroup_mutex)
...
                                                     ...
                                                     __lockup_detector_reconfigure:
                                                     P(cpu_hotplug_lock.read)
                                                     ...
                          ...
                          cpu_up:
                          percpu_down_write:
                          P(cpu_hotplug_lock.write)
                                                                                     ...
                                                                                     cgroup_bpf_release:
                                                                                     P(cgroup_mutex)
                                                     smp_call_on_cpu:
                                                     Wait system_wq
cpuset_css_offline:
P(cpu_hotplug_lock.read)
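WatchDog is waiting for system_wq to finish its jobs, system_wq is waiting
for cgroup_mutex, and the owner of cgroup_mutex (TaskA) is waiting for
cpu_hotplug_lock. TaskA cannot take the read side because TaskB's pending
percpu_down_write() blocks new readers, and TaskB in turn is waiting for
WatchDog to drop its read lock.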
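
The cycle can be made explicit with a small user-space model. This is only a
sketch, not kernel code: the actors mirror the columns of the trace, while the
enum, the two arrays and has_cycle() are names made up for illustration. The
after_patch state additionally assumes that cpu_up() is refused while hotplug
is disabled, so TaskB never queues a writer on cpu_hotplug_lock.

```c
#include <assert.h>
#include <stdbool.h>

enum actor { TASK_A, TASK_B, WATCHDOG, SYSTEM_WQ, NR_ACTORS };

/*
 * waits_for[x] is the actor that x is blocked on, or -1 if x can run.
 * Each actor waits on at most one other, so a cycle exists exactly when
 * following the edges from some actor leads back to that actor.
 */
static bool has_cycle(const int waits_for[NR_ACTORS])
{
	for (int start = 0; start < NR_ACTORS; start++) {
		int cur = start;

		/* A simple cycle has at most NR_ACTORS edges. */
		for (int hops = 0; hops < NR_ACTORS && cur != -1; hops++) {
			assert(cur >= 0 && cur < NR_ACTORS);
			cur = waits_for[cur];
			if (cur == start)
				return true;
		}
	}
	return false;
}

/* Wait-for edges read off the trace in the commit message. */
static const int before_patch[NR_ACTORS] = {
	[TASK_A]    = TASK_B,    /* reader queued behind the pending writer */
	[TASK_B]    = WATCHDOG,  /* writer waits for the read side to drain */
	[WATCHDOG]  = SYSTEM_WQ, /* smp_call_on_cpu() waits for its work item */
	[SYSTEM_WQ] = TASK_A,    /* cgroup_bpf_release() needs cgroup_mutex */
};

/*
 * Assumed state with the patch applied: WatchDog holds no lock while it
 * waits, and TaskB is not blocked because no writer is queued.
 */
static const int after_patch[NR_ACTORS] = {
	[TASK_A]    = -1,
	[TASK_B]    = -1,
	[WATCHDOG]  = SYSTEM_WQ,
	[SYSTEM_WQ] = TASK_A,
};
```

has_cycle(before_patch) finds TaskA -> TaskB -> WatchDog -> system_wq -> TaskA,
while has_cycle(after_patch) finds no cycle: WatchDog still waits for
system_wq, but every actor it waits on can make progress.
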
The key point is cpu_hotplug_lock, because the work items on system_wq may be
waiting for other locks. It seems unhealthy to hold a lock while waiting for
system_wq, because we never know which jobs system_wq is running. So fix this
by replacing cpus_read_lock()/cpus_read_unlock() with
cpu_hotplug_disable()/cpu_hotplug_enable() to prevent CPU offline/online.
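
For readers unfamiliar with the disable/enable pair, here is a rough
user-space model of why it avoids the problem. It is a sketch under
assumptions, not the kernel implementation: every model_* name is
hypothetical, and it reflects my understanding that cpu_hotplug_disable()
only bumps a nesting counter (under cpu_add_remove_lock, which is left out
here) and that cpu_up() returns -EBUSY while that counter is non-zero.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical user-space stand-ins; none of these are kernel symbols. */
static int model_hotplug_disabled;	/* nesting count of disable requests */
static int model_online_cpus = 1;

static void model_cpu_hotplug_disable(void)
{
	/* Nothing stays held after this returns; only the count changes. */
	model_hotplug_disabled++;
}

static void model_cpu_hotplug_enable(void)
{
	assert(model_hotplug_disabled > 0);	/* unbalanced enable is a bug */
	model_hotplug_disabled--;
}

/* Returns 0 on success, -EBUSY while hotplug is disabled. */
static int model_cpu_up(void)
{
	if (model_hotplug_disabled)
		return -EBUSY;	/* refused up front, no write lock attempted */
	model_online_cpus++;
	return 0;
}
```

In this model a hotplug request that arrives during reconfiguration fails
immediately instead of queueing as a writer, so there is no pending writer for
other readers of cpu_hotplug_lock to block behind. The visible difference for
callers is an error return rather than a wait.
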
Fixes: e31d6883f21c ("watchdog/core, powerpc: Lock cpus across reconfiguration")
Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
---
kernel/watchdog.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 51915b44ac73..6ac6fb8d3be0 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -867,7 +867,7 @@ int lockup_detector_offline_cpu(unsigned int cpu)
static void __lockup_detector_reconfigure(void)
{
- cpus_read_lock();
+ cpu_hotplug_disable();
watchdog_hardlockup_stop();
softlockup_stop_all();
@@ -877,7 +877,7 @@ static void __lockup_detector_reconfigure(void)
softlockup_start_all();
watchdog_hardlockup_start();
- cpus_read_unlock();
+ cpu_hotplug_enable();
/*
* Must be called outside the cpus locked section to prevent
* recursive locking in the perf code.
@@ -916,11 +916,11 @@ static __init void lockup_detector_setup(void)
#else /* CONFIG_SOFTLOCKUP_DETECTOR */
static void __lockup_detector_reconfigure(void)
{
- cpus_read_lock();
+ cpu_hotplug_disable();
watchdog_hardlockup_stop();
lockup_detector_update_enable();
watchdog_hardlockup_start();
- cpus_read_unlock();
+ cpu_hotplug_enable();
}
void lockup_detector_reconfigure(void)
{
--
2.34.1