linux-mm.kvack.org archive mirror
* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
       [not found]       ` <317701e1-20a7-206f-92cd-cd36d436eee2@linaro.org>
@ 2022-05-19 11:23         ` Hillf Danton
  2022-05-19 23:26           ` Tadeusz Struk
  0 siblings, 1 reply; 11+ messages in thread
From: Hillf Danton @ 2022-05-19 11:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Tadeusz Struk, Michal Koutny, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On Wed, 18 May 2022 09:48:21 -0700 Tadeusz Struk  wrote:
> On 4/22/22 04:05, Michal Koutny wrote:
> > On Thu, Apr 21, 2022 at 02:00:56PM -1000, Tejun Heo <tj@kernel.org> wrote:
> >> If this is the case, we need to hold an extra reference to be put by the
> >> css_killed_work_fn(), right?

That put could trigger INIT_WORK in css_release() and, on the other hand,
the "init active (active state 0) object" warning [1], since the same
css->destroy_work is used in both the kill and release paths.
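
For reference, a rough sketch of the two enqueue sites as they look in
kernels of this era (simplified and paraphrased from kernel/cgroup/cgroup.c,
not a verbatim copy):

/* release path: final reference dropped */
static void css_release(struct percpu_ref *ref)
{
	struct cgroup_subsys_state *css =
		container_of(ref, struct cgroup_subsys_state, refcnt);

	INIT_WORK(&css->destroy_work, css_release_work_fn);
	queue_work(cgroup_destroy_wq, &css->destroy_work);
}

/* kill path: confirm_switch callback installed by kill_css() */
static void css_killed_ref_fn(struct percpu_ref *ref)
{
	struct cgroup_subsys_state *css =
		container_of(ref, struct cgroup_subsys_state, refcnt);

	if (atomic_dec_and_test(&css->online_cnt)) {
		INIT_WORK(&css->destroy_work, css_killed_work_fn);
		queue_work(cgroup_destroy_wq, &css->destroy_work);
	}
}

Both paths reinitialize and queue the single css->destroy_work, so a release
racing with a still-pending kill work item trips the object debug check.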

Hillf

[1] https://lore.kernel.org/lkml/000000000000ff747805debce6c6@google.com/
> > 
> > I looked into it a bit more lately and found that there already is such
> > a fuse in kill_css() [1].
> > 
> > At the same time, syzbot's stack trace demonstrates that the fuse is
> > ineffective:
> > 
> >> css_release+0xae/0xc0 kernel/cgroup/cgroup.c:5146                    (**)
> >> percpu_ref_put_many include/linux/percpu-refcount.h:322 [inline]
> >> percpu_ref_put include/linux/percpu-refcount.h:338 [inline]
> >> percpu_ref_call_confirm_rcu lib/percpu-refcount.c:162 [inline]        (*)
> >> percpu_ref_switch_to_atomic_rcu+0x5a2/0x5b0 lib/percpu-refcount.c:199
> >> rcu_do_batch+0x4f8/0xbc0 kernel/rcu/tree.c:2485
> >> rcu_core+0x59b/0xe30 kernel/rcu/tree.c:2722
> >> rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2735
> >> __do_softirq+0x27e/0x596 kernel/softirq.c:305
> > 
> > (*) this calls css_killed_ref_fn as the confirm_switch callback
> > (**) zero references after confirmed kill?
> > 
> > So, I was also looking at the possible race with css_free_rwork_fn()
> > (from failed css_create()) but that would likely emit a warning from
> > __percpu_ref_exit().
> > 
> > So, I still think there's something fishy (so far possible only via
> > artificial ENOMEM injection) that needs an explanation...
> 
> I can't reliably reproduce this issue on either mainline or v5.10, where
> syzbot originally found it. It still triggers for syzbot, though.
> 
> -- 
> Thanks,
> Tadeusz


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-19 11:23         ` [PATCH] cgroup: don't queue css_release_work if one already pending Hillf Danton
@ 2022-05-19 23:26           ` Tadeusz Struk
  2022-05-20  8:13             ` Tejun Heo
  0 siblings, 1 reply; 11+ messages in thread
From: Tadeusz Struk @ 2022-05-19 23:26 UTC (permalink / raw)
  To: Hillf Danton, Tejun Heo
  Cc: Michal Koutny, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On 5/19/22 04:23, Hillf Danton wrote:
> On Wed, 18 May 2022 09:48:21 -0700 Tadeusz Struk  wrote:
>> On 4/22/22 04:05, Michal Koutny wrote:
>>> On Thu, Apr 21, 2022 at 02:00:56PM -1000, Tejun Heo<tj@kernel.org>  wrote:
>>>> If this is the case, we need to hold an extra reference to be put by the
>>>> css_killed_work_fn(), right?
> That put could trigger INIT_WORK in css_release() and, on the other hand,
> the "init active (active state 0) object" warning [1], since the same
> css->destroy_work is used in both the kill and release paths.

Would it help if there were two WQs, one for the css_release path
and one for the rcu_work?

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index adb820e98f24..a4873b33e488 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -124,6 +124,7 @@ DEFINE_PERCPU_RWSEM(cgroup_threadgroup_rwsem);
   * which may lead to deadlock.
   */
  static struct workqueue_struct *cgroup_destroy_wq;
+static struct workqueue_struct *cgroup_destroy_rcu_wq;
  
  /* generate an array of cgroup subsystem pointers */
  #define SUBSYS(_x) [_x ## _cgrp_id] = &_x ## _cgrp_subsys,
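
For completeness, allocating the second queue would presumably mirror the
existing cgroup_wq_init() (an untested, illustrative sketch; which work item
then moves onto the new queue is the open question):

static int __init cgroup_wq_init(void)
{
	cgroup_destroy_wq = alloc_workqueue("cgroup_destroy", 0, 1);
	BUG_ON(!cgroup_destroy_wq);
	/* hypothetical second queue; name follows the declaration above */
	cgroup_destroy_rcu_wq = alloc_workqueue("cgroup_destroy_rcu", 0, 1);
	BUG_ON(!cgroup_destroy_rcu_wq);
	return 0;
}
core_initcall(cgroup_wq_init);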

-- 
Thanks,
Tadeusz


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-19 23:26           ` Tadeusz Struk
@ 2022-05-20  8:13             ` Tejun Heo
  2022-05-20 16:38               ` Tadeusz Struk
  2022-05-20 23:48               ` Hillf Danton
  0 siblings, 2 replies; 11+ messages in thread
From: Tejun Heo @ 2022-05-20  8:13 UTC (permalink / raw)
  To: Tadeusz Struk
  Cc: Hillf Danton, Michal Koutny, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On Thu, May 19, 2022 at 04:26:51PM -0700, Tadeusz Struk wrote:
> On 5/19/22 04:23, Hillf Danton wrote:
> > On Wed, 18 May 2022 09:48:21 -0700 Tadeusz Struk  wrote:
> > > On 4/22/22 04:05, Michal Koutny wrote:
> > > > On Thu, Apr 21, 2022 at 02:00:56PM -1000, Tejun Heo<tj@kernel.org>  wrote:
> > > > > If this is the case, we need to hold an extra reference to be put by the
> > > > > css_killed_work_fn(), right?
> > That put could trigger INIT_WORK in css_release() and, on the other hand,
> > the "init active (active state 0) object" warning [1], since the same
> > css->destroy_work is used in both the kill and release paths.

Hmm... wouldn't the extra reference keep release from happening?

> Would it help if there were two WQs, one for the css_release path
> and one for the rcu_work?
> 
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index adb820e98f24..a4873b33e488 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -124,6 +124,7 @@ DEFINE_PERCPU_RWSEM(cgroup_threadgroup_rwsem);
>   * which may lead to deadlock.
>   */
>  static struct workqueue_struct *cgroup_destroy_wq;
> +static struct workqueue_struct *cgroup_destroy_rcu_wq;

I don't understand why this would help. Care to elaborate?

Thanks.

-- 
tejun


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-20  8:13             ` Tejun Heo
@ 2022-05-20 16:38               ` Tadeusz Struk
  2022-05-20 16:42                 ` Michal Koutný
  2022-05-20 23:48               ` Hillf Danton
  1 sibling, 1 reply; 11+ messages in thread
From: Tadeusz Struk @ 2022-05-20 16:38 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Hillf Danton, Michal Koutny, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On 5/20/22 01:13, Tejun Heo wrote:
> On Thu, May 19, 2022 at 04:26:51PM -0700, Tadeusz Struk wrote:
>> On 5/19/22 04:23, Hillf Danton wrote:
>>> On Wed, 18 May 2022 09:48:21 -0700 Tadeusz Struk  wrote:
>>>> On 4/22/22 04:05, Michal Koutny wrote:
>>>>> On Thu, Apr 21, 2022 at 02:00:56PM -1000, Tejun Heo<tj@kernel.org>  wrote:
>>>>>> If this is the case, we need to hold an extra reference to be put by the
>>>>>> css_killed_work_fn(), right?
>>> That put could trigger INIT_WORK in css_release() and, on the other hand,
>>> the "init active (active state 0) object" warning [1], since the same
>>> css->destroy_work is used in both the kill and release paths.
> 
> Hmm... wouldn't the extra reference keep release from happening?
> 
>> Would it help if there were two WQs, one for the css_release path
>> and one for the rcu_work?
>>
>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>> index adb820e98f24..a4873b33e488 100644
>> --- a/kernel/cgroup/cgroup.c
>> +++ b/kernel/cgroup/cgroup.c
>> @@ -124,6 +124,7 @@ DEFINE_PERCPU_RWSEM(cgroup_threadgroup_rwsem);
>>    * which may lead to deadlock.
>>    */
>>   static struct workqueue_struct *cgroup_destroy_wq;
>> +static struct workqueue_struct *cgroup_destroy_rcu_wq;
> 
> I don't understand why this would help. Care to elaborate?

I think it will help to solve the list corruption issue:

list_add corruption. prev->next should be next (ffff8881f705c060), but was ffff888113123870. (prev=ffff888113123870).
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:28!

as this is a result of enqueuing the same css->destroy_work onto the same WQ
twice, once from the rcu path and once from the css_release path.
I will prototype it today and test with syzbot.
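
My working theory of the sequence, reconstructed from the syzbot report
rather than from a confirmed trace, is roughly:

  kill path                              release path
  ---------                              ------------
  css_killed_ref_fn()
    INIT_WORK(&css->destroy_work,
              css_killed_work_fn)
    queue_work(cgroup_destroy_wq,
               &css->destroy_work)
                                         css_release()
                                           INIT_WORK(&css->destroy_work,
                                                     css_release_work_fn)
                                           queue_work(cgroup_destroy_wq,
                                                      &css->destroy_work)

The second INIT_WORK reinitializes a work item that is still on the
workqueue's pending list, so the second queue_work() does a list_add() of an
entry that is already linked, which would explain the "prev->next should be
next" splat above.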

-- 
Thanks,
Tadeusz


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-20 16:38               ` Tadeusz Struk
@ 2022-05-20 16:42                 ` Michal Koutný
  2022-05-20 16:56                   ` Tadeusz Struk
  2022-05-23 19:00                   ` Tadeusz Struk
  0 siblings, 2 replies; 11+ messages in thread
From: Michal Koutný @ 2022-05-20 16:42 UTC (permalink / raw)
  To: Tadeusz Struk
  Cc: Tejun Heo, Hillf Danton, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On Fri, May 20, 2022 at 09:38:12AM -0700, Tadeusz Struk <tadeusz.struk@linaro.org> wrote:
> as this is a result of enqueuing the same css->destroy_work onto the same WQ
> twice, once from the rcu path and once from the css_release path.
> I will prototype it today and test with syzbot.

In my understanding, you'd need two independent work_structs in a css,
not two separate workqueues to put the single entry on.
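
I.e. something along these lines, assuming a second work_struct (the name
killed_work is made up here) is added to struct cgroup_subsys_state next to
destroy_work; a minimal sketch, not a patch:

static void css_killed_ref_fn(struct percpu_ref *ref)
{
	struct cgroup_subsys_state *css =
		container_of(ref, struct cgroup_subsys_state, refcnt);

	if (atomic_dec_and_test(&css->online_cnt)) {
		/* no longer shares css->destroy_work with css_release() */
		INIT_WORK(&css->killed_work, css_killed_work_fn);
		queue_work(cgroup_destroy_wq, &css->killed_work);
	}
}

css_killed_work_fn() would then resolve the css with
container_of(work, struct cgroup_subsys_state, killed_work), while
css_release() keeps using destroy_work as it does today.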

Michal


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-20 16:42                 ` Michal Koutný
@ 2022-05-20 16:56                   ` Tadeusz Struk
  2022-05-23 19:00                   ` Tadeusz Struk
  1 sibling, 0 replies; 11+ messages in thread
From: Tadeusz Struk @ 2022-05-20 16:56 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Tejun Heo, Hillf Danton, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On 5/20/22 09:42, Michal Koutný wrote:
> On Fri, May 20, 2022 at 09:38:12AM -0700, Tadeusz Struk<tadeusz.struk@linaro.org>  wrote:
>> as this is a result of enqueuing the same css->destroy_work onto the same WQ
>> twice, once from the rcu path and once from the css_release path.
>> I will prototype it today and test with syzbot.
> In my understanding, you'd need two independent work_structs in a css,
> not two separate workqueues to put the single entry on.

I think either way would work, but two workqueues would be less churn.
I can try both.

-- 
Thanks,
Tadeusz


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-20  8:13             ` Tejun Heo
  2022-05-20 16:38               ` Tadeusz Struk
@ 2022-05-20 23:48               ` Hillf Danton
  1 sibling, 0 replies; 11+ messages in thread
From: Hillf Danton @ 2022-05-20 23:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Tadeusz Struk, Michal Koutny, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On Thu, 19 May 2022 22:13:47 -1000 Tejun Heo  wrote:
> On Thu, May 19, 2022 at 04:26:51PM -0700, Tadeusz Struk wrote:
> > On 5/19/22 04:23, Hillf Danton wrote:
> > > On Wed, 18 May 2022 09:48:21 -0700 Tadeusz Struk  wrote:
> > > > On 4/22/22 04:05, Michal Koutny wrote:
> > > > > On Thu, Apr 21, 2022 at 02:00:56PM -1000, Tejun Heo<tj@kernel.org>  wrote:
> > > > > > If this is the case, we need to hold an extra reference to be put by the
> > > > > > css_killed_work_fn(), right?
> > > 
> > > That put could trigger INIT_WORK in css_release() and, on the other hand,
> > > the "init active (active state 0) object" warning [1], since the same
> > > css->destroy_work is used in both the kill and release paths.
> 
> Hmm... wouldn't the extra reference keep release from happening?

Hm... I have difficulty following that, given the risk of a memleak if the
release never happens.

> 
> > Would it help if there were two WQs, one for the css_release path
> > and one for the rcu_work?

Unlikely, as adding one more queue does not help with the INIT_WORK problem:
the same work item would still be reinitialized while pending, regardless of
which workqueue it sits on.

Hillf
> > 
> > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> > index adb820e98f24..a4873b33e488 100644
> > --- a/kernel/cgroup/cgroup.c
> > +++ b/kernel/cgroup/cgroup.c
> > @@ -124,6 +124,7 @@ DEFINE_PERCPU_RWSEM(cgroup_threadgroup_rwsem);
> >   * which may lead to deadlock.
> >   */
> >  static struct workqueue_struct *cgroup_destroy_wq;
> > +static struct workqueue_struct *cgroup_destroy_rcu_wq;
> 
> I don't understand why this would help. Care to elaborate?
> 
> Thanks.
> 
> -- 
> tejun


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-20 16:42                 ` Michal Koutný
  2022-05-20 16:56                   ` Tadeusz Struk
@ 2022-05-23 19:00                   ` Tadeusz Struk
  2022-05-23 19:02                     ` Tejun Heo
  1 sibling, 1 reply; 11+ messages in thread
From: Tadeusz Struk @ 2022-05-23 19:00 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Tejun Heo, Hillf Danton, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On 5/20/22 09:42, Michal Koutný wrote:
> On Fri, May 20, 2022 at 09:38:12AM -0700, Tadeusz Struk <tadeusz.struk@linaro.org> wrote:
>> as this is a result of enqueuing the same css->destroy_work onto the same WQ
>> twice, once from the rcu path and once from the css_release path.
>> I will prototype it today and test with syzbot.
> 
> In my understanding, you'd need two independent work_structs in a css,
> not two separate workqueues to put the single entry on.

Yes, separating the css_killed_ref and css_release paths with two separate
work_structs fixes the two syzbot list corruption issues [1] & [2].
I tested it on mainline v5.18.0 and on v5.10.117.
In the case of [2], mainline still triggers an issue, but it is unrelated to
the list corruption.

[1] https://syzkaller.appspot.com/bug?id=e26e54d6eac9d9fb50b221ec3e4627b327465dbd
[2] https://syzkaller.appspot.com/bug?id=3c7ff113ccb695e839b859da3fc481c36eb1cfd5

I will send a patch soon.

-- 
Thanks,
Tadeusz


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-23 19:00                   ` Tadeusz Struk
@ 2022-05-23 19:02                     ` Tejun Heo
  2022-05-23 19:08                       ` Tadeusz Struk
  0 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2022-05-23 19:02 UTC (permalink / raw)
  To: Tadeusz Struk
  Cc: Michal Koutný, Hillf Danton, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On Mon, May 23, 2022 at 12:00:06PM -0700, Tadeusz Struk wrote:
> On 5/20/22 09:42, Michal Koutný wrote:
> > On Fri, May 20, 2022 at 09:38:12AM -0700, Tadeusz Struk <tadeusz.struk@linaro.org> wrote:
> > > as this is a result of enqueuing the same css->destroy_work onto the same WQ
> > > twice, once from the rcu path and once from the css_release path.
> > > I will prototype it today and test with syzbot.
> > 
> > In my understanding, you'd need two independent work_structs in a css,
> > not two separate workqueues to put the single entry on.
> 
> Yes, separating the css_killed_ref and css_release paths with two separate
> work_structs fixes the two syzbot list corruption issues [1] & [2].
> I tested it on mainline v5.18.0 and on v5.10.117.
> In the case of [2], mainline still triggers an issue, but it is unrelated to
> the list corruption.

Can you try holding an extra ref in the killed path?

Thanks.

-- 
tejun


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-23 19:02                     ` Tejun Heo
@ 2022-05-23 19:08                       ` Tadeusz Struk
  2022-05-23 20:05                         ` Tadeusz Struk
  0 siblings, 1 reply; 11+ messages in thread
From: Tadeusz Struk @ 2022-05-23 19:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Michal Koutný, Hillf Danton, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On 5/23/22 12:02, Tejun Heo wrote:
> Can you try holding an extra ref in the killed path?

Yes, I can try that. Should that be with or without the extra work_struct?

-- 
Thanks,
Tadeusz


* Re: [PATCH] cgroup: don't queue css_release_work if one already pending
  2022-05-23 19:08                       ` Tadeusz Struk
@ 2022-05-23 20:05                         ` Tadeusz Struk
  0 siblings, 0 replies; 11+ messages in thread
From: Tadeusz Struk @ 2022-05-23 20:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Michal Koutný, Hillf Danton, linux-kernel, linux-mm,
	syzbot+e42ae441c3b10acf9e9d

On 5/23/22 12:08, Tadeusz Struk wrote:
> On 5/23/22 12:02, Tejun Heo wrote:
>> Can you try holding an extra ref in the killed path?
> 
> Yes, I can try that. Should that be with or without the extra work_struct?

With this patch:
https://syzkaller.appspot.com/text?tag=Patch&x=1264b23df00000

The issue still triggers:
https://syzkaller.appspot.com/text?tag=CrashReport&x=162b5781f00000

-- 
Thanks,
Tadeusz


Thread overview: 11+ messages
     [not found] <20220412192459.227740-1-tadeusz.struk@linaro.org>
     [not found] ` <20220414164409.GA5404@blackbody.suse.cz>
     [not found]   ` <YmHwOAdGY2Lwl+M3@slm.duckdns.org>
     [not found]     ` <20220422100400.GA29552@blackbody.suse.cz>
     [not found]       ` <317701e1-20a7-206f-92cd-cd36d436eee2@linaro.org>
2022-05-19 11:23         ` [PATCH] cgroup: don't queue css_release_work if one already pending Hillf Danton
2022-05-19 23:26           ` Tadeusz Struk
2022-05-20  8:13             ` Tejun Heo
2022-05-20 16:38               ` Tadeusz Struk
2022-05-20 16:42                 ` Michal Koutný
2022-05-20 16:56                   ` Tadeusz Struk
2022-05-23 19:00                   ` Tadeusz Struk
2022-05-23 19:02                     ` Tejun Heo
2022-05-23 19:08                       ` Tadeusz Struk
2022-05-23 20:05                         ` Tadeusz Struk
2022-05-20 23:48               ` Hillf Danton
