From: Tejun Heo <tj@kernel.org>
To: lizefan@huawei.com, axboe@kernel.dk, vgoyal@redhat.com
Cc: containers@lists.linux-foundation.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org, ctalbott@google.com,
rni@google.com, Tejun Heo <tj@kernel.org>
Subject: [PATCH 04/12] blkcg: make blkcg_gq's hierarchical
Date: Fri, 14 Dec 2012 14:41:17 -0800 [thread overview]
Message-ID: <1355524885-22719-5-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1355524885-22719-1-git-send-email-tj@kernel.org>
Currently a child blkg (blkcg_gq) can be created even if its parent
doesn't exist. ie. Given a blkg, it's not guaranteed that its
ancestors will exist. This makes it difficult to implement proper
hierarchy support for blkcg policies.
Always create blkgs recursively and make a child blkg hold a reference
to its parent. blkg->parent is added so that finding the parent is
easy. blkcg_parent() is also added in the process.
This change can be visible to userland. e.g. while issuing IO in a
nested cgroup didn't affect the ancestors at all, now it will
initialize all ancestor blkgs and zero stats for the request_queue
will always appear on them. While this is userland visible, this
shouldn't cause any functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
block/blk-cgroup.c | 42 +++++++++++++++++++++++++++++++++++++-----
block/blk-cgroup.h | 18 ++++++++++++++++++
2 files changed, 55 insertions(+), 5 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index fde2286..c858628 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -201,7 +201,16 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
}
blkg = new_blkg;
- /* insert */
+ /* link parent and insert */
+ if (blkcg_parent(blkcg)) {
+ blkg->parent = __blkg_lookup(blkcg_parent(blkcg), q, false);
+ if (WARN_ON_ONCE(!blkg->parent)) {
+ blkg = ERR_PTR(-EINVAL);
+ goto err_put_css;
+ }
+ blkg_get(blkg->parent);
+ }
+
spin_lock(&blkcg->lock);
ret = radix_tree_insert(&blkcg->blkg_tree, q->id, blkg);
if (likely(!ret)) {
@@ -213,6 +222,10 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
if (!ret)
return blkg;
+ /* @blkg failed fully initialized, use the usual release path */
+ blkg_put(blkg);
+ return ERR_PTR(ret);
+
err_put_css:
css_put(&blkcg->css);
err_free_blkg:
@@ -226,8 +239,9 @@ err_free_blkg:
* @q: request_queue of interest
*
* Lookup blkg for the @blkcg - @q pair. If it doesn't exist, try to
- * create one. This function should be called under RCU read lock and
- * @q->queue_lock.
+ * create one. blkg creation is performed recursively from blkcg_root such
+ * that all non-root blkg's have access to the parent blkg. This function
+ * should be called under RCU read lock and @q->queue_lock.
*
* Returns pointer to the looked up or created blkg on success, ERR_PTR()
* value on error. If @q is dead, returns ERR_PTR(-EINVAL). If @q is not
@@ -252,7 +266,23 @@ struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
if (blkg)
return blkg;
- return blkg_create(blkcg, q, NULL);
+ /*
+ * Create blkgs walking down from blkcg_root to @blkcg, so that all
+ * non-root blkgs have access to their parents.
+ */
+ while (true) {
+ struct blkcg *pos = blkcg;
+ struct blkcg *parent = blkcg_parent(blkcg);
+
+ while (parent && !__blkg_lookup(parent, q, false)) {
+ pos = parent;
+ parent = blkcg_parent(parent);
+ }
+
+ blkg = blkg_create(pos, q, NULL);
+ if (pos == blkcg || IS_ERR(blkg))
+ return blkg;
+ }
}
EXPORT_SYMBOL_GPL(blkg_lookup_create);
@@ -321,8 +351,10 @@ static void blkg_rcu_free(struct rcu_head *rcu_head)
void __blkg_release(struct blkcg_gq *blkg)
{
- /* release the extra blkcg reference this blkg has been holding */
+ /* release the blkcg and parent blkg refs this blkg has been holding */
css_put(&blkg->blkcg->css);
+ if (blkg->parent)
+ blkg_put(blkg->parent);
/*
* A group is freed in rcu manner. But having an rcu lock does not
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 2459730..b26ed58 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -94,8 +94,13 @@ struct blkcg_gq {
struct list_head q_node;
struct hlist_node blkcg_node;
struct blkcg *blkcg;
+
+ /* all non-root blkcg_gq's are guaranteed to have access to parent */
+ struct blkcg_gq *parent;
+
/* request allocation list for this blkcg-q pair */
struct request_list rl;
+
/* reference count */
int refcnt;
@@ -181,6 +186,19 @@ static inline struct blkcg *bio_blkcg(struct bio *bio)
}
/**
+ * blkcg_parent - get the parent of a blkcg
+ * @blkcg: blkcg of interest
+ *
+ * Return the parent blkcg of @blkcg. Can be called anytime.
+ */
+static inline struct blkcg *blkcg_parent(struct blkcg *blkcg)
+{
+ struct cgroup *pcg = blkcg->css.cgroup->parent;
+
+ return pcg ? cgroup_to_blkcg(pcg) : NULL;
+}
+
+/**
* blkg_to_pdata - get policy private data
* @blkg: blkg of interest
* @pol: policy of interest
--
1.7.11.7
next prev parent reply other threads:[~2012-12-14 22:44 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-14 22:41 [PATCHSET] block: implement blkcg hierarchy support in cfq Tejun Heo
2012-12-14 22:41 ` [PATCH 01/12] blkcg: fix minor bug in blkg_alloc() Tejun Heo
2012-12-17 19:10 ` Vivek Goyal
2012-12-14 22:41 ` [PATCH 02/12] blkcg: reorganize blkg_lookup_create() and friends Tejun Heo
2012-12-17 19:28 ` Vivek Goyal
2012-12-14 22:41 ` [PATCH 03/12] blkcg: cosmetic updates to blkg_create() Tejun Heo
2012-12-17 19:37 ` Vivek Goyal
2012-12-14 22:41 ` Tejun Heo [this message]
2012-12-17 20:04 ` [PATCH 04/12] blkcg: make blkcg_gq's hierarchical Vivek Goyal
2012-12-14 22:41 ` [PATCH 05/12] cfq-iosched: add leaf_weight Tejun Heo
2012-12-14 22:41 ` [PATCH 06/12] cfq-iosched: implement cfq_group->nr_active and ->level_weight Tejun Heo
2012-12-17 20:46 ` Vivek Goyal
2012-12-17 21:15 ` Tejun Heo
2012-12-17 21:18 ` Vivek Goyal
2012-12-17 21:20 ` Tejun Heo
2012-12-14 22:41 ` [PATCH 07/12] cfq-iosched: implement hierarchy-ready cfq_group charge scaling Tejun Heo
2012-12-17 20:53 ` Vivek Goyal
2012-12-17 21:17 ` Tejun Heo
2012-12-17 21:27 ` Vivek Goyal
2012-12-17 21:33 ` Tejun Heo
2012-12-17 21:49 ` Vivek Goyal
2012-12-17 22:12 ` Tejun Heo
2012-12-14 22:41 ` [PATCH 08/12] cfq-iosched: convert cfq_group_slice() to use cfqg->vfraction Tejun Heo
2012-12-14 22:41 ` [PATCH 09/12] cfq-iosched: enable full blkcg hierarchy support Tejun Heo
2012-12-18 18:40 ` Vivek Goyal
2012-12-18 19:10 ` Tejun Heo
2012-12-18 19:16 ` Vivek Goyal
2012-12-18 19:17 ` Tejun Heo
2012-12-14 22:41 ` [PATCH 10/12] blkcg: add blkg_policy_data->plid Tejun Heo
2012-12-14 22:41 ` [PATCH 11/12] blkcg: implement blkg_prfill_[rw]stat_recursive() Tejun Heo
2012-12-14 22:41 ` [PATCH 12/12] cfq-iosched: add hierarchical cfq_group statistics Tejun Heo
2012-12-18 19:11 ` Vivek Goyal
2012-12-18 19:14 ` Tejun Heo
2012-12-18 19:18 ` Vivek Goyal
2012-12-18 19:21 ` Tejun Heo
2012-12-18 19:26 ` Vivek Goyal
2012-12-17 16:52 ` [PATCHSET] block: implement blkcg hierarchy support in cfq Vivek Goyal
2012-12-17 17:38 ` Tejun Heo
2012-12-17 18:50 ` Vivek Goyal
2012-12-17 18:59 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1355524885-22719-5-git-send-email-tj@kernel.org \
--to=tj@kernel.org \
--cc=axboe@kernel.dk \
--cc=cgroups@vger.kernel.org \
--cc=containers@lists.linux-foundation.org \
--cc=ctalbott@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lizefan@huawei.com \
--cc=rni@google.com \
--cc=vgoyal@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).