From: "Michal Koutný" <mkoutny-IBi9RG/b67k@public.gmane.org>
To: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Zefan Li <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
Bui Quang Minh
<minhquangbui99-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Tadeusz Struk
<tadeusz.struk-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Subject: Re: [PATCH 2/2] cgroup: Use separate work structs on css release path
Date: Thu, 26 May 2022 11:56:34 +0200
Message-ID: <Yo7KfEOz92kS2z5Y@blackbook>
In-Reply-To: <20220525161455.GA16134-9OudH3eul5jcvrawFnH+a6VXKuFTiq87@public.gmane.org>
On Wed, May 25, 2022 at 06:14:55PM +0200, Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org> wrote:
> But the above is not correct. I've looked at the stack trace [1] and the
> offending percpu_ref_put_many is called from an RCU callback
> percpu_ref_switch_to_atomic_rcu(), so I can't actually see why it drops
> to zero there...
The link [1] was missing from that message; see [1] below.
After some more thought, the following is a possible sequencing of
the functions involved.
	// ref=A: initial state
	kill_css()
	  css_get // ref+=F == A+F: fuse
	  percpu_ref_kill_and_confirm
	    __percpu_ref_switch_to_atomic
	      percpu_ref_get
	        // ref += 1 == A+F+1: atomic mode, self-protection
	    percpu_ref_put
	      // ref -= 1 == A+F: kill the base reference
	  [via rcu]
	  percpu_ref_switch_to_atomic_rcu
	    percpu_ref_call_confirm_rcu
	      css_killed_ref_fn == refcnt.confirm_switch
	        queue_work(css->destroy_work)        (1)
	                                                     [via css->destroy_work]
	                                                     css_killed_work_fn == wq.func
	                                                       offline_css() // needs fuse
	                                                       css_put // ref -= F == A: de-fuse
	      percpu_ref_put
	        // ref -= 1 == A-1: remove self-protection
	        css_release                          // A <= 1 -> 2nd queue_work explodes!
	          queue_work(css->destroy_work)      (2)
	                                                     [via css->destroy_work]
	                                                     css_release_work_fn == wq.func
For this to happen, another CPU would have to dispatch and run the
css_killed_work_fn callback in parallel with
percpu_ref_switch_to_atomic_rcu. This is a more correct explanation;
however, its likelihood seems very low. Perhaps some debug prints of
percpu_ref_data.data in percpu_ref_call_confirm_rcu could shed more
light on this [2].
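
To make the accounting in the sequence above easier to follow, here is a
toy userspace model in Python. It is only a sketch and not kernel code:
the class CssModel, its helpers and the work_runs_in_parallel switch are
made up for illustration, A=1 and F=1 are assumed as defaults, and the
only thing it tracks is the counter value and the context from which
destroy_work gets queued.

```python
from dataclasses import dataclass, field


@dataclass
class CssModel:
    """Hypothetical stand-in for a css: one counter, one shared work slot."""
    ref: int                                     # models the percpu_ref count
    log: list = field(default_factory=list)      # (step name, ref after step)
    queued: list = field(default_factory=list)   # (work function, queuing context)

    def step(self, name, delta):
        self.ref += delta
        self.log.append((name, self.ref))
        return self.ref

    def queue_destroy_work(self, func, context):
        # Both users share the single destroy_work slot in this model.
        self.queued.append((func, context))

    def put(self, name, count, context):
        # Dropping the last reference stands for css_release(), which
        # queues destroy_work a second time.
        if self.step(name, -count) == 0:
            self.queue_destroy_work("css_release_work_fn", context)


def simulate(a=1, fuse=1, work_runs_in_parallel=True):
    css = CssModel(ref=a)

    # kill_css()
    css.step("css_get (fuse)", +fuse)
    css.step("percpu_ref_get (self-protection)", +1)
    css.put("percpu_ref_put (kill base ref)", 1, "kill_css")

    # RCU callback: confirm_switch queues the first work item, (1).
    css.queue_destroy_work("css_killed_work_fn", "rcu")

    if work_runs_in_parallel:
        # Assumed interleaving: the work function drops the fuse before
        # the RCU callback drops the self-protection reference.
        css.put("css_put (de-fuse)", fuse, "work")
        css.put("percpu_ref_put (drop self-protection)", 1, "rcu")
    else:
        css.put("percpu_ref_put (drop self-protection)", 1, "rcu")
        css.put("css_put (de-fuse)", fuse, "work")
    return css


if __name__ == "__main__":
    for parallel in (True, False):
        css = simulate(work_runs_in_parallel=parallel)
        print("parallel work:", parallel)
        for name, ref in css.log:
            print(f"  {name:40s} ref={ref}")
        print("  queued:", css.queued)
```

In this model the second queue_work comes from the RCU callback only
when the work function has already dropped the fuse; otherwise the final
put happens in the work function itself. With a=2 the counter never
reaches zero and destroy_work is queued just once.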
HTH,
Michal
[1] https://syzkaller.appspot.com/text?tag=CrashReport&x=162b5781f00000
[2] I tried notifying syzbot about [3] moments ago.
[3] https://github.com/Werkov/linux/tree/cgroup-ml/css-lifecycle-syzbot