public inbox for linux-kernel@vger.kernel.org
* [PATCH -next] cgroup: Fix AA deadlock caused by cgroup_bpf_release
@ 2024-06-07 11:03 Chen Ridong
From: Chen Ridong @ 2024-06-07 11:03 UTC
  To: martin.lau, ast, daniel, andrii, eddyz87, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tj, lizefan.x,
	hannes, roman.gushchin
  Cc: bpf, cgroups, linux-kernel

We found an AA deadlock problem, as shown below:

cgroup_destroy_wq		TaskB				WatchDog			system_wq

...
css_killed_work_fn:
P(cgroup_mutex)
...
								...
								__lockup_detector_reconfigure:
								P(cpu_hotplug_lock.read)
								...
				...
				percpu_down_write:
				P(cpu_hotplug_lock.write)
												...
												cgroup_bpf_release:
												P(cgroup_mutex)
								smp_call_on_cpu:
								Wait system_wq

cpuset_css_offline:
P(cpu_hotplug_lock.read)

WatchDog is waiting for system_wq to finish its jobs, and system_wq is
waiting for cgroup_mutex. The owner of cgroup_mutex is in turn waiting for
cpu_hotplug_lock.read, which it cannot take because TaskB's pending write
request is queued behind the read lock that WatchDog already holds. The
problem is caused by commit 4bfc0bb2c60e ("bpf: decouple the lifetime of
cgroup_bpf from cgroup itself"), which puts the cgroup_bpf release work
into system_wq. As cgroup_bpf is a member of cgroup, it is reasonable to
put the cgroup_bpf release work into cgroup_destroy_wq, which is only used
for cgroup's release work. Doing so solves the problem.

Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
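Note: the wait chain in the diagram above can be checked with a small
userspace model. This is only an illustrative sketch, not kernel code.
The enum names, the two tables, and chain_deadlocks() are hypothetical
helpers made up for this note. It assumes each context waits on at most
one other context, and that after the change the system_wq worker no
longer blocks on cgroup_mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* The four contexts from the diagram (names are illustrative only). */
enum ctx { NOBODY = -1, DESTROY_WQ, TASK_B, WATCHDOG, SYSTEM_WQ, NR_CTX };

/* waits_on[i] is the context that i is blocked behind, or NOBODY. */
static const int waits_before[NR_CTX] = {
	[DESTROY_WQ] = TASK_B,     /* reader queued behind the pending writer */
	[TASK_B]     = WATCHDOG,   /* writer waits for the active reader */
	[WATCHDOG]   = SYSTEM_WQ,  /* smp_call_on_cpu waits for system_wq */
	[SYSTEM_WQ]  = DESTROY_WQ, /* cgroup_bpf_release wants cgroup_mutex */
};

/* Assumed state once the release work is queued on cgroup_destroy_wq. */
static const int waits_after[NR_CTX] = {
	[DESTROY_WQ] = TASK_B,
	[TASK_B]     = WATCHDOG,
	[WATCHDOG]   = SYSTEM_WQ,
	[SYSTEM_WQ]  = NOBODY,     /* no longer blocked on cgroup_mutex */
};

/* Follow the wait chain from start; true if it leads back to start. */
static bool chain_deadlocks(const int *waits_on, int start)
{
	assert(start >= 0 && start < NR_CTX);

	int cur = waits_on[start];

	for (int steps = 0; steps < NR_CTX && cur != NOBODY; steps++) {
		if (cur == start)
			return true;
		cur = waits_on[cur];
	}
	return false;
}
```

Calling chain_deadlocks(waits_before, DESTROY_WQ) follows the chain back
to its starting point, while the same call on waits_after stops at
system_wq, which matches the reasoning in the changelog.
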
 kernel/bpf/cgroup.c             | 2 +-
 kernel/cgroup/cgroup-internal.h | 1 +
 kernel/cgroup/cgroup.c          | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 8ba73042a239..a611a1274788 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -334,7 +334,7 @@ static void cgroup_bpf_release_fn(struct percpu_ref *ref)
 	struct cgroup *cgrp = container_of(ref, struct cgroup, bpf.refcnt);
 
 	INIT_WORK(&cgrp->bpf.release_work, cgroup_bpf_release);
-	queue_work(system_wq, &cgrp->bpf.release_work);
+	queue_work(cgroup_destroy_wq, &cgrp->bpf.release_work);
 }
 
 /* Get underlying bpf_prog of bpf_prog_list entry, regardless if it's through
diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index 520b90dd97ec..9e57f3e9316e 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -13,6 +13,7 @@
 extern spinlock_t trace_cgroup_path_lock;
 extern char trace_cgroup_path[TRACE_CGROUP_PATH_LEN];
 extern void __init enable_debug_cgroup(void);
+extern struct workqueue_struct *cgroup_destroy_wq;
 
 /*
  * cgroup_path() takes a spin lock. It is good practice not to take
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index e32b6972c478..3317e03fe2fb 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -124,7 +124,7 @@ DEFINE_PERCPU_RWSEM(cgroup_threadgroup_rwsem);
  * destruction work items don't end up filling up max_active of system_wq
  * which may lead to deadlock.
  */
-static struct workqueue_struct *cgroup_destroy_wq;
+struct workqueue_struct *cgroup_destroy_wq;
 
 /* generate an array of cgroup subsystem pointers */
 #define SUBSYS(_x) [_x ## _cgrp_id] = &_x ## _cgrp_subsys,
-- 
2.34.1


