From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org,
vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: ctalbott-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
rni-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: [PATCH 04/12] blkcg: make blkcg_gq's hierarchical
Date: Fri, 14 Dec 2012 14:41:17 -0800 [thread overview]
Message-ID: <1355524885-22719-5-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1355524885-22719-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Currently a child blkg (blkcg_gq) can be created even if its parent
doesn't exist. ie. Given a blkg, it's not guaranteed that its
ancestors will exist. This makes it difficult to implement proper
hierarchy support for blkcg policies.
Always create blkgs recursively and make a child blkg hold a reference
to its parent. blkg->parent is added so that finding the parent is
easy. blkcg_parent() is also added in the process.
This change can be visible to userland. e.g. while issuing IO in a
nested cgroup didn't affect the ancestors at all, now it will
initialize all ancestor blkgs and zero stats for the request_queue
will always appear on them. While this is userland visible, this
shouldn't cause any functional difference.
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
block/blk-cgroup.c | 42 +++++++++++++++++++++++++++++++++++++-----
block/blk-cgroup.h | 18 ++++++++++++++++++
2 files changed, 55 insertions(+), 5 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index fde2286..c858628 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -201,7 +201,16 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
}
blkg = new_blkg;
- /* insert */
+ /* link parent and insert */
+ if (blkcg_parent(blkcg)) {
+ blkg->parent = __blkg_lookup(blkcg_parent(blkcg), q, false);
+ if (WARN_ON_ONCE(!blkg->parent)) {
+ blkg = ERR_PTR(-EINVAL);
+ goto err_put_css;
+ }
+ blkg_get(blkg->parent);
+ }
+
spin_lock(&blkcg->lock);
ret = radix_tree_insert(&blkcg->blkg_tree, q->id, blkg);
if (likely(!ret)) {
@@ -213,6 +222,10 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
if (!ret)
return blkg;
+ /* @blkg failed fully initialized, use the usual release path */
+ blkg_put(blkg);
+ return ERR_PTR(ret);
+
err_put_css:
css_put(&blkcg->css);
err_free_blkg:
@@ -226,8 +239,9 @@ err_free_blkg:
* @q: request_queue of interest
*
* Lookup blkg for the @blkcg - @q pair. If it doesn't exist, try to
- * create one. This function should be called under RCU read lock and
- * @q->queue_lock.
+ * create one. blkg creation is performed recursively from blkcg_root such
+ * that all non-root blkg's have access to the parent blkg. This function
+ * should be called under RCU read lock and @q->queue_lock.
*
* Returns pointer to the looked up or created blkg on success, ERR_PTR()
* value on error. If @q is dead, returns ERR_PTR(-EINVAL). If @q is not
@@ -252,7 +266,23 @@ struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
if (blkg)
return blkg;
- return blkg_create(blkcg, q, NULL);
+ /*
+ * Create blkgs walking down from blkcg_root to @blkcg, so that all
+ * non-root blkgs have access to their parents.
+ */
+ while (true) {
+ struct blkcg *pos = blkcg;
+ struct blkcg *parent = blkcg_parent(blkcg);
+
+ while (parent && !__blkg_lookup(parent, q, false)) {
+ pos = parent;
+ parent = blkcg_parent(parent);
+ }
+
+ blkg = blkg_create(pos, q, NULL);
+ if (pos == blkcg || IS_ERR(blkg))
+ return blkg;
+ }
}
EXPORT_SYMBOL_GPL(blkg_lookup_create);
@@ -321,8 +351,10 @@ static void blkg_rcu_free(struct rcu_head *rcu_head)
void __blkg_release(struct blkcg_gq *blkg)
{
- /* release the extra blkcg reference this blkg has been holding */
+ /* release the blkcg and parent blkg refs this blkg has been holding */
css_put(&blkg->blkcg->css);
+ if (blkg->parent)
+ blkg_put(blkg->parent);
/*
* A group is freed in rcu manner. But having an rcu lock does not
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 2459730..b26ed58 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -94,8 +94,13 @@ struct blkcg_gq {
struct list_head q_node;
struct hlist_node blkcg_node;
struct blkcg *blkcg;
+
+ /* all non-root blkcg_gq's are guaranteed to have access to parent */
+ struct blkcg_gq *parent;
+
/* request allocation list for this blkcg-q pair */
struct request_list rl;
+
/* reference count */
int refcnt;
@@ -181,6 +186,19 @@ static inline struct blkcg *bio_blkcg(struct bio *bio)
}
/**
+ * blkcg_parent - get the parent of a blkcg
+ * @blkcg: blkcg of interest
+ *
+ * Return the parent blkcg of @blkcg. Can be called anytime.
+ */
+static inline struct blkcg *blkcg_parent(struct blkcg *blkcg)
+{
+ struct cgroup *pcg = blkcg->css.cgroup->parent;
+
+ return pcg ? cgroup_to_blkcg(pcg) : NULL;
+}
+
+/**
* blkg_to_pdata - get policy private data
* @blkg: blkg of interest
* @pol: policy of interest
--
1.7.11.7
WARNING: multiple messages have this Message-ID (diff)
From: Tejun Heo <tj@kernel.org>
To: lizefan@huawei.com, axboe@kernel.dk, vgoyal@redhat.com
Cc: containers@lists.linux-foundation.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org, ctalbott@google.com,
rni@google.com, Tejun Heo <tj@kernel.org>
Subject: [PATCH 04/12] blkcg: make blkcg_gq's hierarchical
Date: Fri, 14 Dec 2012 14:41:17 -0800 [thread overview]
Message-ID: <1355524885-22719-5-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1355524885-22719-1-git-send-email-tj@kernel.org>
Currently a child blkg (blkcg_gq) can be created even if its parent
doesn't exist. ie. Given a blkg, it's not guaranteed that its
ancestors will exist. This makes it difficult to implement proper
hierarchy support for blkcg policies.
Always create blkgs recursively and make a child blkg hold a reference
to its parent. blkg->parent is added so that finding the parent is
easy. blkcg_parent() is also added in the process.
This change can be visible to userland. e.g. while issuing IO in a
nested cgroup didn't affect the ancestors at all, now it will
initialize all ancestor blkgs and zero stats for the request_queue
will always appear on them. While this is userland visible, this
shouldn't cause any functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
block/blk-cgroup.c | 42 +++++++++++++++++++++++++++++++++++++-----
block/blk-cgroup.h | 18 ++++++++++++++++++
2 files changed, 55 insertions(+), 5 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index fde2286..c858628 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -201,7 +201,16 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
}
blkg = new_blkg;
- /* insert */
+ /* link parent and insert */
+ if (blkcg_parent(blkcg)) {
+ blkg->parent = __blkg_lookup(blkcg_parent(blkcg), q, false);
+ if (WARN_ON_ONCE(!blkg->parent)) {
+ blkg = ERR_PTR(-EINVAL);
+ goto err_put_css;
+ }
+ blkg_get(blkg->parent);
+ }
+
spin_lock(&blkcg->lock);
ret = radix_tree_insert(&blkcg->blkg_tree, q->id, blkg);
if (likely(!ret)) {
@@ -213,6 +222,10 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
if (!ret)
return blkg;
+ /* @blkg failed fully initialized, use the usual release path */
+ blkg_put(blkg);
+ return ERR_PTR(ret);
+
err_put_css:
css_put(&blkcg->css);
err_free_blkg:
@@ -226,8 +239,9 @@ err_free_blkg:
* @q: request_queue of interest
*
* Lookup blkg for the @blkcg - @q pair. If it doesn't exist, try to
- * create one. This function should be called under RCU read lock and
- * @q->queue_lock.
+ * create one. blkg creation is performed recursively from blkcg_root such
+ * that all non-root blkg's have access to the parent blkg. This function
+ * should be called under RCU read lock and @q->queue_lock.
*
* Returns pointer to the looked up or created blkg on success, ERR_PTR()
* value on error. If @q is dead, returns ERR_PTR(-EINVAL). If @q is not
@@ -252,7 +266,23 @@ struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
if (blkg)
return blkg;
- return blkg_create(blkcg, q, NULL);
+ /*
+ * Create blkgs walking down from blkcg_root to @blkcg, so that all
+ * non-root blkgs have access to their parents.
+ */
+ while (true) {
+ struct blkcg *pos = blkcg;
+ struct blkcg *parent = blkcg_parent(blkcg);
+
+ while (parent && !__blkg_lookup(parent, q, false)) {
+ pos = parent;
+ parent = blkcg_parent(parent);
+ }
+
+ blkg = blkg_create(pos, q, NULL);
+ if (pos == blkcg || IS_ERR(blkg))
+ return blkg;
+ }
}
EXPORT_SYMBOL_GPL(blkg_lookup_create);
@@ -321,8 +351,10 @@ static void blkg_rcu_free(struct rcu_head *rcu_head)
void __blkg_release(struct blkcg_gq *blkg)
{
- /* release the extra blkcg reference this blkg has been holding */
+ /* release the blkcg and parent blkg refs this blkg has been holding */
css_put(&blkg->blkcg->css);
+ if (blkg->parent)
+ blkg_put(blkg->parent);
/*
* A group is freed in rcu manner. But having an rcu lock does not
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 2459730..b26ed58 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -94,8 +94,13 @@ struct blkcg_gq {
struct list_head q_node;
struct hlist_node blkcg_node;
struct blkcg *blkcg;
+
+ /* all non-root blkcg_gq's are guaranteed to have access to parent */
+ struct blkcg_gq *parent;
+
/* request allocation list for this blkcg-q pair */
struct request_list rl;
+
/* reference count */
int refcnt;
@@ -181,6 +186,19 @@ static inline struct blkcg *bio_blkcg(struct bio *bio)
}
/**
+ * blkcg_parent - get the parent of a blkcg
+ * @blkcg: blkcg of interest
+ *
+ * Return the parent blkcg of @blkcg. Can be called anytime.
+ */
+static inline struct blkcg *blkcg_parent(struct blkcg *blkcg)
+{
+ struct cgroup *pcg = blkcg->css.cgroup->parent;
+
+ return pcg ? cgroup_to_blkcg(pcg) : NULL;
+}
+
+/**
* blkg_to_pdata - get policy private data
* @blkg: blkg of interest
* @pol: policy of interest
--
1.7.11.7
next prev parent reply other threads:[~2012-12-14 22:41 UTC|newest]
Thread overview: 99+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-14 22:41 [PATCHSET] block: implement blkcg hierarchy support in cfq Tejun Heo
2012-12-14 22:41 ` Tejun Heo
[not found] ` <1355524885-22719-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-14 22:41 ` [PATCH 01/12] blkcg: fix minor bug in blkg_alloc() Tejun Heo
2012-12-14 22:41 ` Tejun Heo
[not found] ` <1355524885-22719-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-17 19:10 ` Vivek Goyal
2012-12-17 19:10 ` Vivek Goyal
2012-12-17 19:10 ` Vivek Goyal
2012-12-14 22:41 ` [PATCH 02/12] blkcg: reorganize blkg_lookup_create() and friends Tejun Heo
2012-12-14 22:41 ` Tejun Heo
2012-12-14 22:41 ` Tejun Heo
[not found] ` <1355524885-22719-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-17 19:28 ` Vivek Goyal
2012-12-17 19:28 ` Vivek Goyal
2012-12-14 22:41 ` [PATCH 03/12] blkcg: cosmetic updates to blkg_create() Tejun Heo
2012-12-14 22:41 ` Tejun Heo
[not found] ` <1355524885-22719-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-17 19:37 ` Vivek Goyal
2012-12-17 19:37 ` Vivek Goyal
2012-12-17 19:37 ` Vivek Goyal
2012-12-14 22:41 ` Tejun Heo [this message]
2012-12-14 22:41 ` [PATCH 04/12] blkcg: make blkcg_gq's hierarchical Tejun Heo
[not found] ` <1355524885-22719-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-17 20:04 ` Vivek Goyal
2012-12-17 20:04 ` Vivek Goyal
2012-12-14 22:41 ` [PATCH 05/12] cfq-iosched: add leaf_weight Tejun Heo
2012-12-14 22:41 ` Tejun Heo
2012-12-14 22:41 ` [PATCH 06/12] cfq-iosched: implement cfq_group->nr_active and ->level_weight Tejun Heo
2012-12-14 22:41 ` Tejun Heo
[not found] ` <1355524885-22719-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-17 20:46 ` Vivek Goyal
2012-12-17 20:46 ` Vivek Goyal
[not found] ` <20121217204609.GH7235-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-17 21:15 ` Tejun Heo
2012-12-17 21:15 ` Tejun Heo
2012-12-17 21:15 ` Tejun Heo
[not found] ` <20121217211517.GC1844-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2012-12-17 21:18 ` Vivek Goyal
2012-12-17 21:18 ` Vivek Goyal
2012-12-17 21:18 ` Vivek Goyal
[not found] ` <20121217211843.GA13691-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-17 21:20 ` Tejun Heo
2012-12-17 21:20 ` Tejun Heo
2012-12-17 21:20 ` Tejun Heo
2012-12-17 20:46 ` Vivek Goyal
2012-12-14 22:41 ` [PATCH 07/12] cfq-iosched: implement hierarchy-ready cfq_group charge scaling Tejun Heo
2012-12-14 22:41 ` Tejun Heo
[not found] ` <1355524885-22719-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-17 20:53 ` Vivek Goyal
2012-12-17 20:53 ` Vivek Goyal
[not found] ` <20121217205317.GI7235-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-17 21:17 ` Tejun Heo
2012-12-17 21:17 ` Tejun Heo
[not found] ` <20121217211738.GD1844-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2012-12-17 21:27 ` Vivek Goyal
2012-12-17 21:27 ` Vivek Goyal
2012-12-17 21:27 ` Vivek Goyal
[not found] ` <20121217212736.GB13691-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-17 21:33 ` Tejun Heo
2012-12-17 21:33 ` Tejun Heo
[not found] ` <20121217213314.GF1844-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2012-12-17 21:49 ` Vivek Goyal
2012-12-17 21:49 ` Vivek Goyal
[not found] ` <20121217214911.GC13691-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-17 22:12 ` Tejun Heo
2012-12-17 22:12 ` Tejun Heo
2012-12-17 22:12 ` Tejun Heo
2012-12-17 21:49 ` Vivek Goyal
2012-12-17 21:33 ` Tejun Heo
2012-12-17 21:17 ` Tejun Heo
2012-12-14 22:41 ` [PATCH 08/12] cfq-iosched: convert cfq_group_slice() to use cfqg->vfraction Tejun Heo
2012-12-14 22:41 ` Tejun Heo
2012-12-14 22:41 ` [PATCH 09/12] cfq-iosched: enable full blkcg hierarchy support Tejun Heo
2012-12-14 22:41 ` Tejun Heo
[not found] ` <1355524885-22719-10-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-18 18:40 ` Vivek Goyal
2012-12-18 18:40 ` Vivek Goyal
2012-12-18 18:40 ` Vivek Goyal
[not found] ` <20121218184022.GC24050-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-18 19:10 ` Tejun Heo
2012-12-18 19:10 ` Tejun Heo
[not found] ` <20121218191055.GN1844-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2012-12-18 19:16 ` Vivek Goyal
2012-12-18 19:16 ` Vivek Goyal
[not found] ` <20121218191645.GA25908-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-18 19:17 ` Tejun Heo
2012-12-18 19:17 ` Tejun Heo
2012-12-18 19:17 ` Tejun Heo
2012-12-18 19:16 ` Vivek Goyal
2012-12-14 22:41 ` [PATCH 10/12] blkcg: add blkg_policy_data->plid Tejun Heo
2012-12-14 22:41 ` Tejun Heo
2012-12-14 22:41 ` [PATCH 11/12] blkcg: implement blkg_prfill_[rw]stat_recursive() Tejun Heo
2012-12-14 22:41 ` [PATCH 12/12] cfq-iosched: add hierarchical cfq_group statistics Tejun Heo
2012-12-17 16:52 ` [PATCHSET] block: implement blkcg hierarchy support in cfq Vivek Goyal
2012-12-17 16:52 ` Vivek Goyal
2012-12-17 16:52 ` Vivek Goyal
[not found] ` <20121217165228.GB7235-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-17 17:38 ` Tejun Heo
2012-12-17 17:38 ` Tejun Heo
[not found] ` <20121217173800.GB2592-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2012-12-17 18:50 ` Vivek Goyal
2012-12-17 18:50 ` Vivek Goyal
[not found] ` <20121217185014.GC7235-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-17 18:59 ` Tejun Heo
2012-12-17 18:59 ` Tejun Heo
2012-12-17 18:59 ` Tejun Heo
2012-12-14 22:41 ` [PATCH 11/12] blkcg: implement blkg_prfill_[rw]stat_recursive() Tejun Heo
2012-12-14 22:41 ` [PATCH 12/12] cfq-iosched: add hierarchical cfq_group statistics Tejun Heo
[not found] ` <1355524885-22719-13-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2012-12-18 19:11 ` Vivek Goyal
2012-12-18 19:11 ` Vivek Goyal
2012-12-18 19:11 ` Vivek Goyal
[not found] ` <20121218191117.GD24050-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-18 19:14 ` Tejun Heo
2012-12-18 19:14 ` Tejun Heo
[not found] ` <20121218191425.GO1844-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2012-12-18 19:18 ` Vivek Goyal
2012-12-18 19:18 ` Vivek Goyal
[not found] ` <20121218191854.GB25908-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-18 19:21 ` Tejun Heo
2012-12-18 19:21 ` Tejun Heo
[not found] ` <20121218192155.GQ1844-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2012-12-18 19:26 ` Vivek Goyal
2012-12-18 19:26 ` Vivek Goyal
2012-12-18 19:18 ` Vivek Goyal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1355524885-22719-5-git-send-email-tj@kernel.org \
--to=tj-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
--cc=axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org \
--cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
--cc=ctalbott-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org \
--cc=rni-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.