From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Tejun Heo <tj@kernel.org>, Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org,
linux-s390 <linux-s390@vger.kernel.org>,
KVM list <kvm@vger.kernel.org>, Oleg Nesterov <oleg@redhat.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Li Zefan <lizefan@huawei.com>,
Johannes Weiner <hannes@cmpxchg.org>,
cgroups@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 1/2] cgroup: make sure a parent css isn't offlined before its children
Date: Fri, 22 Jan 2016 09:18:35 +0100
Message-ID: <56A1E5DB.1010201@de.ibm.com>
In-Reply-To: <20160121212812.GJ5157@mtj.duckdns.org>
On 01/21/2016 10:28 PM, Tejun Heo wrote:
> On Thu, Jan 21, 2016 at 10:24:16PM +0100, Peter Zijlstra wrote:
>> On Thu, Jan 21, 2016 at 03:31:11PM -0500, Tejun Heo wrote:
>>> There are three subsystem callbacks in css shutdown path -
>>> css_offline(), css_released() and css_free(). Except for
>>> css_released(), cgroup core didn't use to guarantee the order of
>>> invocation. css_offline() or css_free() could be called on a parent
>>> css before its children. This behavior is unexpected and led to
>>> use-after-free in cpu controller.
>>>
>>> This patch updates offline path so that a parent css is never offlined
>>> before its children. Each css keeps online_cnt which reaches zero iff
>>> itself and all its children are offline and offline_css() is invoked
>>> only after online_cnt reaches zero.
>>>
>>> This fixes the reported cpu controller malfunction. The next patch
>>> will update css_free() handling.
>>
>> No, I need to fix the cpu controller too, because the offending code
>> sits off of css_free() (the next patch), but also does a call_rcu() in
>> between, which also doesn't guarantee order.
>
> Ah, I see. Christian, can you please apply all three patches and see
> whether the problem gets fixed? Once verified, I'll update the patch
> description and repost.
With these 3 patches I always run into the dio/scsi problem, but never into
the css issue. So I cannot test for a full day or so, but it looks like
the problem is gone. At least it worked multiple times for 30 minutes or
so until my system was killed by the I/O issue.
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
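
For readers following the thread, the online_cnt rule from the quoted patch description (a css is offlined only once it and all of its children are offline) can be sketched in userspace C roughly as below. This is only an illustration under stated assumptions, not the patch itself: struct css here is a simplified stand-in, and css_online(), css_put_online(), css_kill(), offlined and offline_seq are hypothetical names chosen for the sketch. The real code uses atomic counters and defers the work to a workqueue, both of which are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a css; NOT the kernel's struct. */
struct css {
	struct css *parent;
	int online_cnt;   /* 1 for the css itself while online, +1 per online child */
	bool offlined;    /* set once the (simulated) offline callback has run */
	int offline_seq;  /* order in which the offline callback ran, 0 = not yet */
};

/* Global sequence counter, used only to record the offline order. */
static int offline_counter;

/* Bring a css online: count itself, and pin the parent's online_cnt. */
static void css_online(struct css *css, struct css *parent)
{
	css->parent = parent;
	css->online_cnt = 1;
	css->offlined = false;
	css->offline_seq = 0;
	if (parent)
		parent->online_cnt++;
}

/*
 * Drop one online count. A css is offlined only when its count reaches
 * zero, i.e. after it has been killed and all of its children are
 * offline. Offlining a css then drops the count it held on its parent,
 * so the walk continues upwards as far as the counts allow.
 */
static void css_put_online(struct css *css)
{
	while (css) {
		assert(css->online_cnt > 0);
		if (--css->online_cnt > 0)
			break;  /* still online itself, or has online children */
		css->offlined = true;
		css->offline_seq = ++offline_counter;
		css = css->parent;
	}
}

/* Killing a css drops the count it holds on itself. */
static void css_kill(struct css *css)
{
	css_put_online(css);
}
```

In this sketch, killing a parent while a child is still online only decrements the parent's count, so the parent stays online. The parent's offline step runs later, from the child's css_kill(), after the child's own offline step.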
Thread overview: 46+ messages
2016-01-14 11:19 regression 4.4: deadlock in with cgroup percpu_rwsem Christian Borntraeger
2016-01-14 13:38 ` Christian Borntraeger
2016-01-14 14:04 ` Nikolay Borisov
2016-01-14 14:08 ` Christian Borntraeger
2016-01-14 14:27 ` Nikolay Borisov
2016-01-14 17:15 ` Christian Borntraeger
2016-01-14 19:56 ` Tejun Heo
2016-01-15 7:30 ` Christian Borntraeger
2016-01-15 15:13 ` Christian Borntraeger
2016-01-18 18:32 ` Peter Zijlstra
2016-01-18 18:48 ` Christian Borntraeger
2016-01-19 9:55 ` Heiko Carstens
2016-01-19 19:36 ` Christian Borntraeger
2016-01-19 19:38 ` Tejun Heo
2016-01-20 7:07 ` Heiko Carstens
2016-01-20 10:15 ` Christian Borntraeger
2016-01-20 10:30 ` Peter Zijlstra
2016-01-20 10:47 ` Peter Zijlstra
2016-01-20 15:30 ` Tejun Heo
2016-01-20 16:04 ` Tejun Heo
2016-01-20 16:49 ` Peter Zijlstra
2016-01-20 16:56 ` Tejun Heo
2016-01-23 2:03 ` Paul E. McKenney
2016-01-25 8:49 ` Christoph Hellwig
2016-01-25 19:38 ` Tejun Heo
2016-01-26 14:51 ` Christoph Hellwig
2016-01-26 15:28 ` Tejun Heo
2016-01-26 16:41 ` Christoph Hellwig
2016-01-20 10:53 ` Peter Zijlstra
2016-01-21 8:23 ` Christian Borntraeger
2016-01-21 9:27 ` Peter Zijlstra
2016-01-15 16:40 ` Tejun Heo
2016-01-19 17:18 ` [PATCH cgroup/for-4.5-fixes] cpuset: make mm migration asynchronous Tejun Heo
2016-01-22 14:24 ` Christian Borntraeger
2016-01-22 15:22 ` Tejun Heo
2016-01-22 15:45 ` Christian Borntraeger
2016-01-22 15:47 ` Tejun Heo
2016-01-22 15:23 ` Tejun Heo
2016-01-21 20:31 ` [PATCH 1/2] cgroup: make sure a parent css isn't offlined before its children Tejun Heo
2016-01-21 20:32 ` [PATCH 2/2] cgroup: make sure a parent css isn't freed " Tejun Heo
2016-01-22 15:45 ` [PATCH v2 " Tejun Heo
2016-01-21 21:24 ` [PATCH 1/2] cgroup: make sure a parent css isn't offlined " Peter Zijlstra
2016-01-21 21:28 ` Tejun Heo
2016-01-22 8:18 ` Christian Borntraeger [this message]
2016-02-29 11:13 ` [tip:sched/core] sched/cgroup: Fix cgroup entity load tracking tear-down tip-bot for Peter Zijlstra
2016-01-22 15:45 ` [PATCH v2 1/2] cgroup: make sure a parent css isn't offlined before its children Tejun Heo