From: Roman Gushchin <roman.gushchin@linux.dev>
To: chenridong <chenridong@huawei.com>
Cc: Tejun Heo <tj@kernel.org>,
martin.lau@linux.dev, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, eddyz87@gmail.com, song@kernel.org,
yonghong.song@linux.dev, john.fastabend@gmail.com,
kpsingh@kernel.org, sdf@google.com, haoluo@google.com,
jolsa@kernel.org, lizefan.x@bytedance.com, hannes@cmpxchg.org,
bpf@vger.kernel.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH -next] cgroup: Fix AA deadlock caused by cgroup_bpf_release
Date: Tue, 16 Jul 2024 14:53:04 +0000 [thread overview]
Message-ID: <ZpaJUIyiDguRQWSn@google.com> (raw)
In-Reply-To: <c6d10b39-4583-4162-b481-375f41aaeba1@huawei.com>
On Tue, Jul 16, 2024 at 08:14:31PM +0800, chenridong wrote:
>
>
> On 2024/7/12 9:15, chenridong wrote:
> >
> >
> > On 2024/7/12 1:36, Tejun Heo wrote:
> > > Hello,
> > >
> > > On Thu, Jul 11, 2024 at 03:52:34AM +0000, Roman Gushchin wrote:
> > > > > The max_active of system_wq is WQ_DFL_ACTIVE(256). If all
> > > > > active works are
> > > > > cgroup bpf release works, it will block smp_call_on_cpu work
> > > > > which enque
> > > > > after cgroup bpf releases. So smp_call_on_cpu holding
> > > > > cpu_hotplug_lock will
> > > > > wait for completion, but it can never get a completion
> > > > > because cgroup bpf
> > > > > release works can not get cgroup_mutex and will never finish.
> > > > > However, Placing the cgroup bpf release works on cgroup
> > > > > destroy will never
> > > > > block smp_call_on_cpu work, which means loop is broken.
> > > > > Thus, it can solve
> > > > > the problem.
> > > >
> > > > Tejun,
> > > >
> > > > do you have an opinion on this?
> > > >
> > > > If there are certain limitations from the cgroup side on what
> > > > can be done
> > > > in a generic work context, it would be nice to document (e.g. don't grab
> > > > cgroup mutex), but I still struggle to understand what exactly is wrong
> > > > with the blamed commit.
> > >
> > > I think the general rule here is more "don't saturate system wqs" rather
> > > than "don't grab cgroup_mutex from system_wq". system wqs are for misc
> > > things which shouldn't create a large number of concurrent work items. If
> > > something is going to generate 256+ concurrent work items, it should
> > > use its
> > > own workqueue. We don't know what's in system wqs and can't expect
> > > its users
> > > to police specific lock usages.
> > >
> > Thank you, Tj. That's exactly what I'm trying to convey. Just like
> > cgroup, which has its own workqueue and may create a large number of
> > release works, it is better to place all its related works on its
> > workqueue rather than on system wqs.
> >
> > Regards,
> > Ridong
> >
> > > Another aspect is that the current WQ_DFL_ACTIVE is an arbitrary number I
> > > came up with close to 15 years ago. Machine size has increased by
> > > multiple
> > > times, if not an order of magnitude since then. So, "there can't be a
> > > reasonable situation where 256 concurrency limit isn't enough" is most
> > > likely not true anymore and the limits need to be pushed upward.
> > >
> > > Thanks.
> > >
> >
> Hello, Tejun, and Roman, is the patch acceptable? Do I need to take any
> further actions?
>
I'm not against merging it. I still find the explanation/commit message
a bit vague and believe that maybe some changes need to be done on the watchdog
side to make such lockups impossible. As I understand the two most important
pieces are the watchdog which tries to run a system work on every cpu while
holding cpu_hotplug_lock on read and the cpuset controller which tries
to grab cpu_hotplug_lock on writing.
It's indeed a tricky problem, so maybe there is no simple and clear explanation.
Anyway thank you for finding the problem and providing a reproducer!
Thanks!
next prev parent reply other threads:[~2024-07-16 14:53 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-07 11:03 [PATCH -next] cgroup: Fix AA deadlock caused by cgroup_bpf_release Chen Ridong
2024-06-10 2:47 ` Roman Gushchin
2024-07-09 13:42 ` chenridong
2024-07-10 3:02 ` chenridong
2024-07-11 3:52 ` Roman Gushchin
2024-07-11 17:36 ` Tejun Heo
2024-07-12 1:15 ` chenridong
2024-07-16 12:14 ` chenridong
2024-07-16 14:53 ` Roman Gushchin [this message]
2024-07-17 2:08 ` chenridong
2024-06-10 12:28 ` Markus Elfring
2024-07-09 13:45 ` chenridong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZpaJUIyiDguRQWSn@google.com \
--to=roman.gushchin@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=cgroups@vger.kernel.org \
--cc=chenridong@huawei.com \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lizefan.x@bytedance.com \
--cc=martin.lau@linux.dev \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=tj@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox