All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Christian Borntraeger
	<borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-s390 <linux-s390-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	KVM list <kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	"Paul E. McKenney"
	<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	kernel-team-b10kYP2dOMg@public.gmane.org
Subject: [PATCH 1/2] cgroup: make sure a parent css isn't offlined before its children
Date: Thu, 21 Jan 2016 15:31:11 -0500	[thread overview]
Message-ID: <20160121203111.GF5157@mtj.duckdns.org> (raw)
In-Reply-To: <5698A023.9070703-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

There are three subsystem callbacks in css shutdown path -
css_offline(), css_released() and css_free().  Except for
css_released(), cgroup core didn't use to guarantee the order of
invocation.  css_offline() or css_free() could be called on a parent
css before its children.  This behavior is unexpected and led to
use-after-free in cpu controller.

This patch updates offline path so that a parent css is never offlined
before its children.  Each css keeps online_cnt which reaches zero iff
itself and all its children are offline and offline_css() is invoked
only after online_cnt reaches zero.

This fixes the reported cpu controller malfunction.  The next patch
will update css_free() handling.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Reported-by: Christian Borntraeger <borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Link: http://lkml.kernel.org/g/5698A023.9070703-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org
Cc: Heiko Carstens <heiko.carstens-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
Hello, Christian.

Can you please verify whether this patch fixes the issue?

Thanks.

 include/linux/cgroup-defs.h |    6 ++++++
 kernel/cgroup.c             |   22 +++++++++++++++++-----
 2 files changed, 23 insertions(+), 5 deletions(-)

--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -127,6 +127,12 @@ struct cgroup_subsys_state {
 	 */
 	u64 serial_nr;
 
+	/*
+	 * Incremented by online self and children.  Used to guarantee that
+	 * parents are not offlined before their children.
+	 */
+	atomic_t online_cnt;
+
 	/* percpu_ref killing and RCU release */
 	struct rcu_head rcu_head;
 	struct work_struct destroy_work;
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4761,6 +4761,7 @@ static void init_and_link_css(struct cgr
 	INIT_LIST_HEAD(&css->sibling);
 	INIT_LIST_HEAD(&css->children);
 	css->serial_nr = css_serial_nr_next++;
+	atomic_set(&css->online_cnt, 0);
 
 	if (cgroup_parent(cgrp)) {
 		css->parent = cgroup_css(cgroup_parent(cgrp), ss);
@@ -4783,6 +4784,10 @@ static int online_css(struct cgroup_subs
 	if (!ret) {
 		css->flags |= CSS_ONLINE;
 		rcu_assign_pointer(css->cgroup->subsys[ss->id], css);
+
+		atomic_inc(&css->online_cnt);
+		if (css->parent)
+			atomic_inc(&css->parent->online_cnt);
 	}
 	return ret;
 }
@@ -5020,10 +5025,15 @@ static void css_killed_work_fn(struct wo
 		container_of(work, struct cgroup_subsys_state, destroy_work);
 
 	mutex_lock(&cgroup_mutex);
-	offline_css(css);
-	mutex_unlock(&cgroup_mutex);
 
-	css_put(css);
+	do {
+		offline_css(css);
+		css_put(css);
+		/* @css can't go away while we're holding cgroup_mutex */
+		css = css->parent;
+	} while (css && atomic_dec_and_test(&css->online_cnt));
+
+	mutex_unlock(&cgroup_mutex);
 }
 
 /* css kill confirmation processing requires process context, bounce */
@@ -5032,8 +5042,10 @@ static void css_killed_ref_fn(struct per
 	struct cgroup_subsys_state *css =
 		container_of(ref, struct cgroup_subsys_state, refcnt);
 
-	INIT_WORK(&css->destroy_work, css_killed_work_fn);
-	queue_work(cgroup_destroy_wq, &css->destroy_work);
+	if (atomic_dec_and_test(&css->online_cnt)) {
+		INIT_WORK(&css->destroy_work, css_killed_work_fn);
+		queue_work(cgroup_destroy_wq, &css->destroy_work);
+	}
 }
 
 /**

WARNING: multiple messages have this Message-ID (diff)
From: Tejun Heo <tj@kernel.org>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: linux-kernel@vger.kernel.org,
	linux-s390 <linux-s390@vger.kernel.org>,
	KVM list <kvm@vger.kernel.org>, Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Li Zefan <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	cgroups@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 1/2] cgroup: make sure a parent css isn't offlined before its children
Date: Thu, 21 Jan 2016 15:31:11 -0500	[thread overview]
Message-ID: <20160121203111.GF5157@mtj.duckdns.org> (raw)
In-Reply-To: <5698A023.9070703@de.ibm.com>

There are three subsystem callbacks in css shutdown path -
css_offline(), css_released() and css_free().  Except for
css_released(), cgroup core didn't use to guarantee the order of
invocation.  css_offline() or css_free() could be called on a parent
css before its children.  This behavior is unexpected and led to
use-after-free in cpu controller.

This patch updates offline path so that a parent css is never offlined
before its children.  Each css keeps online_cnt which reaches zero iff
itself and all its children are offline and offline_css() is invoked
only after online_cnt reaches zero.

This fixes the reported cpu controller malfunction.  The next patch
will update css_free() handling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: http://lkml.kernel.org/g/5698A023.9070703@de.ibm.com
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
---
Hello, Christian.

Can you please verify whether this patch fixes the issue?

Thanks.

 include/linux/cgroup-defs.h |    6 ++++++
 kernel/cgroup.c             |   22 +++++++++++++++++-----
 2 files changed, 23 insertions(+), 5 deletions(-)

--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -127,6 +127,12 @@ struct cgroup_subsys_state {
 	 */
 	u64 serial_nr;
 
+	/*
+	 * Incremented by online self and children.  Used to guarantee that
+	 * parents are not offlined before their children.
+	 */
+	atomic_t online_cnt;
+
 	/* percpu_ref killing and RCU release */
 	struct rcu_head rcu_head;
 	struct work_struct destroy_work;
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4761,6 +4761,7 @@ static void init_and_link_css(struct cgr
 	INIT_LIST_HEAD(&css->sibling);
 	INIT_LIST_HEAD(&css->children);
 	css->serial_nr = css_serial_nr_next++;
+	atomic_set(&css->online_cnt, 0);
 
 	if (cgroup_parent(cgrp)) {
 		css->parent = cgroup_css(cgroup_parent(cgrp), ss);
@@ -4783,6 +4784,10 @@ static int online_css(struct cgroup_subs
 	if (!ret) {
 		css->flags |= CSS_ONLINE;
 		rcu_assign_pointer(css->cgroup->subsys[ss->id], css);
+
+		atomic_inc(&css->online_cnt);
+		if (css->parent)
+			atomic_inc(&css->parent->online_cnt);
 	}
 	return ret;
 }
@@ -5020,10 +5025,15 @@ static void css_killed_work_fn(struct wo
 		container_of(work, struct cgroup_subsys_state, destroy_work);
 
 	mutex_lock(&cgroup_mutex);
-	offline_css(css);
-	mutex_unlock(&cgroup_mutex);
 
-	css_put(css);
+	do {
+		offline_css(css);
+		css_put(css);
+		/* @css can't go away while we're holding cgroup_mutex */
+		css = css->parent;
+	} while (css && atomic_dec_and_test(&css->online_cnt));
+
+	mutex_unlock(&cgroup_mutex);
 }
 
 /* css kill confirmation processing requires process context, bounce */
@@ -5032,8 +5042,10 @@ static void css_killed_ref_fn(struct per
 	struct cgroup_subsys_state *css =
 		container_of(ref, struct cgroup_subsys_state, refcnt);
 
-	INIT_WORK(&css->destroy_work, css_killed_work_fn);
-	queue_work(cgroup_destroy_wq, &css->destroy_work);
+	if (atomic_dec_and_test(&css->online_cnt)) {
+		INIT_WORK(&css->destroy_work, css_killed_work_fn);
+		queue_work(cgroup_destroy_wq, &css->destroy_work);
+	}
 }
 
 /**

  parent reply	other threads:[~2016-01-21 20:31 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-14 11:19 regression 4.4: deadlock in with cgroup percpu_rwsem Christian Borntraeger
2016-01-14 13:38 ` Christian Borntraeger
2016-01-14 14:04 ` Nikolay Borisov
2016-01-14 14:08   ` Christian Borntraeger
2016-01-14 14:27     ` Nikolay Borisov
2016-01-14 17:15       ` Christian Borntraeger
2016-01-14 19:56 ` Tejun Heo
2016-01-15  7:30   ` Christian Borntraeger
2016-01-15 15:13     ` Christian Borntraeger
2016-01-18 18:32       ` Peter Zijlstra
2016-01-18 18:48         ` Christian Borntraeger
2016-01-19  9:55           ` Heiko Carstens
2016-01-19 19:36             ` Christian Borntraeger
2016-01-19 19:38               ` Tejun Heo
2016-01-20  7:07                 ` Heiko Carstens
2016-01-20 10:15                   ` Christian Borntraeger
2016-01-20 10:30                     ` Peter Zijlstra
2016-01-20 10:47                       ` Peter Zijlstra
2016-01-20 15:30                         ` Tejun Heo
2016-01-20 16:04                           ` Tejun Heo
2016-01-20 16:49                             ` Peter Zijlstra
2016-01-20 16:56                               ` Tejun Heo
2016-01-23  2:03                           ` Paul E. McKenney
2016-01-25  8:49                             ` Christoph Hellwig
2016-01-25 19:38                               ` Tejun Heo
2016-01-26 14:51                                 ` Christoph Hellwig
2016-01-26 15:28                                   ` Tejun Heo
2016-01-26 16:41                                     ` Christoph Hellwig
2016-01-20 10:53                       ` Peter Zijlstra
2016-01-21  8:23                         ` Christian Borntraeger
2016-01-21  9:27                           ` Peter Zijlstra
2016-01-15 16:40     ` Tejun Heo
     [not found]       ` <20160115164023.GH3520-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2016-01-19 17:18         ` [PATCH cgroup/for-4.5-fixes] cpuset: make mm migration asynchronous Tejun Heo
2016-01-19 17:18           ` Tejun Heo
2016-01-22 14:24           ` Christian Borntraeger
2016-01-22 15:22             ` Tejun Heo
     [not found]               ` <20160122152232.GB32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2016-01-22 15:45                 ` Christian Borntraeger
2016-01-22 15:45                   ` Christian Borntraeger
2016-01-22 15:47                   ` Tejun Heo
     [not found]           ` <20160119171841.GP3520-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2016-01-22 15:23             ` Tejun Heo
2016-01-22 15:23               ` Tejun Heo
     [not found]     ` <5698A023.9070703-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
2016-01-21 20:31       ` Tejun Heo [this message]
2016-01-21 20:31         ` [PATCH 1/2] cgroup: make sure a parent css isn't offlined before its children Tejun Heo
2016-01-21 20:32         ` [PATCH 2/2] cgroup: make sure a parent css isn't freed " Tejun Heo
2016-01-22 15:45           ` [PATCH v2 " Tejun Heo
2016-01-22 15:45             ` Tejun Heo
     [not found]         ` <20160121203111.GF5157-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2016-01-21 21:24           ` [PATCH 1/2] cgroup: make sure a parent css isn't offlined " Peter Zijlstra
2016-01-21 21:24             ` Peter Zijlstra
     [not found]             ` <20160121212416.GL6357-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
2016-01-21 21:28               ` Tejun Heo
2016-01-21 21:28                 ` Tejun Heo
2016-01-22  8:18                 ` Christian Borntraeger
2016-02-29 11:13             ` [tip:sched/core] sched/cgroup: Fix cgroup entity load tracking tear-down tip-bot for Peter Zijlstra
2016-01-22 15:45           ` [PATCH v2 1/2] cgroup: make sure a parent css isn't offlined before its children Tejun Heo
2016-01-22 15:45             ` Tejun Heo
2016-01-22 15:45             ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160121203111.GF5157@mtj.duckdns.org \
    --to=tj-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
    --cc=borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
    --cc=kernel-team-b10kYP2dOMg@public.gmane.org \
    --cc=kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-s390-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org \
    --cc=oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    --cc=peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.