From: Tejun Heo <tj@kernel.org>
To: axboe@kernel.dk, vgoyal@redhat.com
Cc: ctalbott@google.com, rni@google.com,
linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>
Subject: [PATCH 14/36] blkcg: factor out blkio_group creation
Date: Tue, 21 Feb 2012 17:46:41 -0800 [thread overview]
Message-ID: <1329875223-5102-15-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1329875223-5102-1-git-send-email-tj@kernel.org>
Currently both blk-throttle and cfq-iosched implement their own
blkio_group creation code in throtl_get_tg() and cfq_get_cfqg(). This
patch factors out the common code into blkg_lookup_create(), which
returns ERR_PTR value so that transitional failures due to queue
bypass can be distinguished from other failures.
* New plkio_policy_ops methods blkio_alloc_group_fn() and
blkio_link_group_fn added. Both are transitional and will be
removed once the blkg management code is fully moved into
blk-cgroup.c.
* blkio_alloc_group_fn() allocates policy-specific blkg which is
usually a larger data structure with blkg as the first entry and
intiailizes it. Note that initialization of blkg proper, including
percpu stats, is responsibility of blk-cgroup proper.
Note that default config (weight, bps...) initialization is done
from this method; otherwise, we end up violating locking order
between blkcg and q locks via blkcg_get_CONF() functions.
* blkio_link_group_fn() is called under queue_lock and responsible for
linking the blkg to the queue. blkcg side is handled by blk-cgroup
proper.
* The common blkg creation function is named blkg_lookup_create() and
blkiocg_lookup_group() is renamed to blkg_lookup() for consistency.
Also, throtl / cfq related functions are similarly [re]named for
consistency.
This simplifies blkcg policy implementations and enables further
cleanup.
-v2: Vivek noticed that blkg_lookup_create() incorrectly tested
blk_queue_dead() instead of blk_queue_bypass() leading a user of
the function ending up creating a new blkg on bypassing queue.
This is a bug introduced while relocating bypass patches before
this one. Fixed.
-v3: ERR_PTR patch folded into this one. @for_root added to
blkg_lookup_create() to allow creating root group on a bypassed
queue during elevator switch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
block/blk-cgroup.c | 117 ++++++++++++++++++++++++++++----------
block/blk-cgroup.h | 30 +++++-----
block/blk-throttle.c | 155 +++++++++++++++++---------------------------------
block/cfq-iosched.c | 131 +++++++++++++-----------------------------
block/cfq.h | 8 ---
5 files changed, 193 insertions(+), 248 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index f1b08d3c..bc98914 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -465,38 +465,93 @@ void blkiocg_update_io_merged_stats(struct blkio_group *blkg, bool direction,
}
EXPORT_SYMBOL_GPL(blkiocg_update_io_merged_stats);
-/*
- * This function allocates the per cpu stats for blkio_group. Should be called
- * from sleepable context as alloc_per_cpu() requires that.
- */
-int blkio_alloc_blkg_stats(struct blkio_group *blkg)
+struct blkio_group *blkg_lookup_create(struct blkio_cgroup *blkcg,
+ struct request_queue *q,
+ enum blkio_policy_id plid,
+ bool for_root)
+ __releases(q->queue_lock) __acquires(q->queue_lock)
{
- /* Allocate memory for per cpu stats */
- blkg->stats_cpu = alloc_percpu(struct blkio_group_stats_cpu);
- if (!blkg->stats_cpu)
- return -ENOMEM;
- return 0;
-}
-EXPORT_SYMBOL_GPL(blkio_alloc_blkg_stats);
+ struct blkio_policy_type *pol = blkio_policy[plid];
+ struct blkio_group *blkg, *new_blkg;
-void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
- struct blkio_group *blkg, struct request_queue *q, dev_t dev,
- enum blkio_policy_id plid)
-{
- unsigned long flags;
+ WARN_ON_ONCE(!rcu_read_lock_held());
+ lockdep_assert_held(q->queue_lock);
- spin_lock_irqsave(&blkcg->lock, flags);
- spin_lock_init(&blkg->stats_lock);
- rcu_assign_pointer(blkg->q, q);
- blkg->blkcg_id = css_id(&blkcg->css);
+ /*
+ * This could be the first entry point of blkcg implementation and
+ * we shouldn't allow anything to go through for a bypassing queue.
+ * The following can be removed if blkg lookup is guaranteed to
+ * fail on a bypassing queue.
+ */
+ if (unlikely(blk_queue_bypass(q)) && !for_root)
+ return ERR_PTR(blk_queue_dead(q) ? -EINVAL : -EBUSY);
+
+ blkg = blkg_lookup(blkcg, q, plid);
+ if (blkg)
+ return blkg;
+
+ if (!css_tryget(&blkcg->css))
+ return ERR_PTR(-EINVAL);
+
+ /*
+ * Allocate and initialize.
+ *
+ * FIXME: The following is broken. Percpu memory allocation
+ * requires %GFP_KERNEL context and can't be performed from IO
+ * path. Allocation here should inherently be atomic and the
+ * following lock dancing can be removed once the broken percpu
+ * allocation is fixed.
+ */
+ spin_unlock_irq(q->queue_lock);
+ rcu_read_unlock();
+
+ new_blkg = pol->ops.blkio_alloc_group_fn(q, blkcg);
+ if (new_blkg) {
+ new_blkg->stats_cpu = alloc_percpu(struct blkio_group_stats_cpu);
+
+ spin_lock_init(&new_blkg->stats_lock);
+ rcu_assign_pointer(new_blkg->q, q);
+ new_blkg->blkcg_id = css_id(&blkcg->css);
+ new_blkg->plid = plid;
+ cgroup_path(blkcg->css.cgroup, new_blkg->path,
+ sizeof(new_blkg->path));
+ }
+
+ rcu_read_lock();
+ spin_lock_irq(q->queue_lock);
+ css_put(&blkcg->css);
+
+ /* did bypass get turned on inbetween? */
+ if (unlikely(blk_queue_bypass(q)) && !for_root) {
+ blkg = ERR_PTR(blk_queue_dead(q) ? -EINVAL : -EBUSY);
+ goto out;
+ }
+
+ /* did someone beat us to it? */
+ blkg = blkg_lookup(blkcg, q, plid);
+ if (unlikely(blkg))
+ goto out;
+
+ /* did alloc fail? */
+ if (unlikely(!new_blkg || !new_blkg->stats_cpu)) {
+ blkg = ERR_PTR(-ENOMEM);
+ goto out;
+ }
+
+ /* insert */
+ spin_lock(&blkcg->lock);
+ swap(blkg, new_blkg);
hlist_add_head_rcu(&blkg->blkcg_node, &blkcg->blkg_list);
- blkg->plid = plid;
- spin_unlock_irqrestore(&blkcg->lock, flags);
- /* Need to take css reference ? */
- cgroup_path(blkcg->css.cgroup, blkg->path, sizeof(blkg->path));
- blkg->dev = dev;
+ pol->ops.blkio_link_group_fn(q, blkg);
+ spin_unlock(&blkcg->lock);
+out:
+ if (new_blkg) {
+ free_percpu(new_blkg->stats_cpu);
+ kfree(new_blkg);
+ }
+ return blkg;
}
-EXPORT_SYMBOL_GPL(blkiocg_add_blkio_group);
+EXPORT_SYMBOL_GPL(blkg_lookup_create);
static void __blkiocg_del_blkio_group(struct blkio_group *blkg)
{
@@ -533,9 +588,9 @@ int blkiocg_del_blkio_group(struct blkio_group *blkg)
EXPORT_SYMBOL_GPL(blkiocg_del_blkio_group);
/* called under rcu_read_lock(). */
-struct blkio_group *blkiocg_lookup_group(struct blkio_cgroup *blkcg,
- struct request_queue *q,
- enum blkio_policy_id plid)
+struct blkio_group *blkg_lookup(struct blkio_cgroup *blkcg,
+ struct request_queue *q,
+ enum blkio_policy_id plid)
{
struct blkio_group *blkg;
struct hlist_node *n;
@@ -545,7 +600,7 @@ struct blkio_group *blkiocg_lookup_group(struct blkio_cgroup *blkcg,
return blkg;
return NULL;
}
-EXPORT_SYMBOL_GPL(blkiocg_lookup_group);
+EXPORT_SYMBOL_GPL(blkg_lookup);
void blkg_destroy_all(struct request_queue *q)
{
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 562fa55..2600ae7 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -204,6 +204,10 @@ extern unsigned int blkcg_get_read_iops(struct blkio_cgroup *blkcg,
extern unsigned int blkcg_get_write_iops(struct blkio_cgroup *blkcg,
dev_t dev);
+typedef struct blkio_group *(blkio_alloc_group_fn)(struct request_queue *q,
+ struct blkio_cgroup *blkcg);
+typedef void (blkio_link_group_fn)(struct request_queue *q,
+ struct blkio_group *blkg);
typedef void (blkio_unlink_group_fn)(struct request_queue *q,
struct blkio_group *blkg);
typedef bool (blkio_clear_queue_fn)(struct request_queue *q);
@@ -219,6 +223,8 @@ typedef void (blkio_update_group_write_iops_fn)(struct request_queue *q,
struct blkio_group *blkg, unsigned int write_iops);
struct blkio_policy_ops {
+ blkio_alloc_group_fn *blkio_alloc_group_fn;
+ blkio_link_group_fn *blkio_link_group_fn;
blkio_unlink_group_fn *blkio_unlink_group_fn;
blkio_clear_queue_fn *blkio_clear_queue_fn;
blkio_update_group_weight_fn *blkio_update_group_weight_fn;
@@ -307,14 +313,14 @@ static inline void blkiocg_set_start_empty_time(struct blkio_group *blkg) {}
extern struct blkio_cgroup blkio_root_cgroup;
extern struct blkio_cgroup *cgroup_to_blkio_cgroup(struct cgroup *cgroup);
extern struct blkio_cgroup *task_blkio_cgroup(struct task_struct *tsk);
-extern void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
- struct blkio_group *blkg, struct request_queue *q, dev_t dev,
- enum blkio_policy_id plid);
-extern int blkio_alloc_blkg_stats(struct blkio_group *blkg);
extern int blkiocg_del_blkio_group(struct blkio_group *blkg);
-extern struct blkio_group *blkiocg_lookup_group(struct blkio_cgroup *blkcg,
- struct request_queue *q,
- enum blkio_policy_id plid);
+extern struct blkio_group *blkg_lookup(struct blkio_cgroup *blkcg,
+ struct request_queue *q,
+ enum blkio_policy_id plid);
+struct blkio_group *blkg_lookup_create(struct blkio_cgroup *blkcg,
+ struct request_queue *q,
+ enum blkio_policy_id plid,
+ bool for_root);
void blkiocg_update_timeslice_used(struct blkio_group *blkg,
unsigned long time,
unsigned long unaccounted_time);
@@ -335,17 +341,11 @@ cgroup_to_blkio_cgroup(struct cgroup *cgroup) { return NULL; }
static inline struct blkio_cgroup *
task_blkio_cgroup(struct task_struct *tsk) { return NULL; }
-static inline void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
- struct blkio_group *blkg, void *key, dev_t dev,
- enum blkio_policy_id plid) {}
-
-static inline int blkio_alloc_blkg_stats(struct blkio_group *blkg) { return 0; }
-
static inline int
blkiocg_del_blkio_group(struct blkio_group *blkg) { return 0; }
-static inline struct blkio_group *
-blkiocg_lookup_group(struct blkio_cgroup *blkcg, void *key) { return NULL; }
+static inline struct blkio_group *blkg_lookup(struct blkio_cgroup *blkcg,
+ void *key) { return NULL; }
static inline void blkiocg_update_timeslice_used(struct blkio_group *blkg,
unsigned long time,
unsigned long unaccounted_time)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index aeeb798..2ae637b 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -181,17 +181,25 @@ static void throtl_put_tg(struct throtl_grp *tg)
call_rcu(&tg->rcu_head, throtl_free_tg);
}
-static void throtl_init_group(struct throtl_grp *tg)
+static struct blkio_group *throtl_alloc_blkio_group(struct request_queue *q,
+ struct blkio_cgroup *blkcg)
{
+ struct throtl_grp *tg;
+
+ tg = kzalloc_node(sizeof(*tg), GFP_ATOMIC, q->node);
+ if (!tg)
+ return NULL;
+
INIT_HLIST_NODE(&tg->tg_node);
RB_CLEAR_NODE(&tg->rb_node);
bio_list_init(&tg->bio_lists[0]);
bio_list_init(&tg->bio_lists[1]);
tg->limits_changed = false;
- /* Practically unlimited BW */
- tg->bps[0] = tg->bps[1] = -1;
- tg->iops[0] = tg->iops[1] = -1;
+ tg->bps[READ] = blkcg_get_read_bps(blkcg, tg->blkg.dev);
+ tg->bps[WRITE] = blkcg_get_write_bps(blkcg, tg->blkg.dev);
+ tg->iops[READ] = blkcg_get_read_iops(blkcg, tg->blkg.dev);
+ tg->iops[WRITE] = blkcg_get_write_iops(blkcg, tg->blkg.dev);
/*
* Take the initial reference that will be released on destroy
@@ -200,14 +208,8 @@ static void throtl_init_group(struct throtl_grp *tg)
* exit or cgroup deletion path depending on who is exiting first.
*/
atomic_set(&tg->ref, 1);
-}
-/* Should be called with rcu read lock held (needed for blkcg) */
-static void
-throtl_add_group_to_td_list(struct throtl_data *td, struct throtl_grp *tg)
-{
- hlist_add_head(&tg->tg_node, &td->tg_list);
- td->nr_undestroyed_grps++;
+ return &tg->blkg;
}
static void
@@ -246,119 +248,62 @@ throtl_tg_fill_dev_details(struct throtl_data *td, struct throtl_grp *tg)
spin_unlock_irq(td->queue->queue_lock);
}
-static void throtl_init_add_tg_lists(struct throtl_data *td,
- struct throtl_grp *tg, struct blkio_cgroup *blkcg)
+static void throtl_link_blkio_group(struct request_queue *q,
+ struct blkio_group *blkg)
{
- __throtl_tg_fill_dev_details(td, tg);
-
- /* Add group onto cgroup list */
- blkiocg_add_blkio_group(blkcg, &tg->blkg, td->queue,
- tg->blkg.dev, BLKIO_POLICY_THROTL);
-
- tg->bps[READ] = blkcg_get_read_bps(blkcg, tg->blkg.dev);
- tg->bps[WRITE] = blkcg_get_write_bps(blkcg, tg->blkg.dev);
- tg->iops[READ] = blkcg_get_read_iops(blkcg, tg->blkg.dev);
- tg->iops[WRITE] = blkcg_get_write_iops(blkcg, tg->blkg.dev);
-
- throtl_add_group_to_td_list(td, tg);
-}
-
-/* Should be called without queue lock and outside of rcu period */
-static struct throtl_grp *throtl_alloc_tg(struct throtl_data *td)
-{
- struct throtl_grp *tg = NULL;
- int ret;
-
- tg = kzalloc_node(sizeof(*tg), GFP_ATOMIC, td->queue->node);
- if (!tg)
- return NULL;
-
- ret = blkio_alloc_blkg_stats(&tg->blkg);
+ struct throtl_data *td = q->td;
+ struct throtl_grp *tg = tg_of_blkg(blkg);
- if (ret) {
- kfree(tg);
- return NULL;
- }
+ __throtl_tg_fill_dev_details(td, tg);
- throtl_init_group(tg);
- return tg;
+ hlist_add_head(&tg->tg_node, &td->tg_list);
+ td->nr_undestroyed_grps++;
}
static struct
-throtl_grp *throtl_find_tg(struct throtl_data *td, struct blkio_cgroup *blkcg)
+throtl_grp *throtl_lookup_tg(struct throtl_data *td, struct blkio_cgroup *blkcg)
{
struct throtl_grp *tg = NULL;
/*
* This is the common case when there are no blkio cgroups.
- * Avoid lookup in this case
- */
+ * Avoid lookup in this case
+ */
if (blkcg == &blkio_root_cgroup)
tg = td->root_tg;
else
- tg = tg_of_blkg(blkiocg_lookup_group(blkcg, td->queue,
- BLKIO_POLICY_THROTL));
+ tg = tg_of_blkg(blkg_lookup(blkcg, td->queue,
+ BLKIO_POLICY_THROTL));
__throtl_tg_fill_dev_details(td, tg);
return tg;
}
-static struct throtl_grp *throtl_get_tg(struct throtl_data *td,
- struct blkio_cgroup *blkcg)
+static struct throtl_grp *throtl_lookup_create_tg(struct throtl_data *td,
+ struct blkio_cgroup *blkcg)
{
- struct throtl_grp *tg = NULL, *__tg = NULL;
struct request_queue *q = td->queue;
-
- /* no throttling for dead queue */
- if (unlikely(blk_queue_bypass(q)))
- return NULL;
-
- tg = throtl_find_tg(td, blkcg);
- if (tg)
- return tg;
-
- if (!css_tryget(&blkcg->css))
- return NULL;
-
- /*
- * Need to allocate a group. Allocation of group also needs allocation
- * of per cpu stats which in-turn takes a mutex() and can block. Hence
- * we need to drop rcu lock and queue_lock before we call alloc.
- */
- spin_unlock_irq(q->queue_lock);
- rcu_read_unlock();
-
- tg = throtl_alloc_tg(td);
-
- /* Group allocated and queue is still alive. take the lock */
- rcu_read_lock();
- spin_lock_irq(q->queue_lock);
- css_put(&blkcg->css);
-
- /* Make sure @q is still alive */
- if (unlikely(blk_queue_bypass(q))) {
- kfree(tg);
- return NULL;
- }
+ struct throtl_grp *tg = NULL;
/*
- * If some other thread already allocated the group while we were
- * not holding queue lock, free up the group
+ * This is the common case when there are no blkio cgroups.
+ * Avoid lookup in this case
*/
- __tg = throtl_find_tg(td, blkcg);
+ if (blkcg == &blkio_root_cgroup) {
+ tg = td->root_tg;
+ } else {
+ struct blkio_group *blkg;
- if (__tg) {
- kfree(tg);
- return __tg;
- }
+ blkg = blkg_lookup_create(blkcg, q, BLKIO_POLICY_THROTL, false);
- /* Group allocation failed. Account the IO to root group */
- if (!tg) {
- tg = td->root_tg;
- return tg;
+ /* if %NULL and @q is alive, fall back to root_tg */
+ if (!IS_ERR(blkg))
+ tg = tg_of_blkg(blkg);
+ else if (!blk_queue_dead(q))
+ tg = td->root_tg;
}
- throtl_init_add_tg_lists(td, tg, blkcg);
+ __throtl_tg_fill_dev_details(td, tg);
return tg;
}
@@ -1107,6 +1052,8 @@ static void throtl_shutdown_wq(struct request_queue *q)
static struct blkio_policy_type blkio_policy_throtl = {
.ops = {
+ .blkio_alloc_group_fn = throtl_alloc_blkio_group,
+ .blkio_link_group_fn = throtl_link_blkio_group,
.blkio_unlink_group_fn = throtl_unlink_blkio_group,
.blkio_clear_queue_fn = throtl_clear_queue,
.blkio_update_group_read_bps_fn =
@@ -1141,7 +1088,7 @@ bool blk_throtl_bio(struct request_queue *q, struct bio *bio)
*/
rcu_read_lock();
blkcg = task_blkio_cgroup(current);
- tg = throtl_find_tg(td, blkcg);
+ tg = throtl_lookup_tg(td, blkcg);
if (tg) {
throtl_tg_fill_dev_details(td, tg);
@@ -1157,7 +1104,7 @@ bool blk_throtl_bio(struct request_queue *q, struct bio *bio)
* IO group
*/
spin_lock_irq(q->queue_lock);
- tg = throtl_get_tg(td, blkcg);
+ tg = throtl_lookup_create_tg(td, blkcg);
if (unlikely(!tg))
goto out_unlock;
@@ -1252,6 +1199,7 @@ void blk_throtl_drain(struct request_queue *q)
int blk_throtl_init(struct request_queue *q)
{
struct throtl_data *td;
+ struct blkio_group *blkg;
td = kzalloc_node(sizeof(*td), GFP_KERNEL, q->node);
if (!td)
@@ -1262,13 +1210,17 @@ int blk_throtl_init(struct request_queue *q)
td->limits_changed = false;
INIT_DELAYED_WORK(&td->throtl_work, blk_throtl_work);
- /* alloc and Init root group. */
+ q->td = td;
td->queue = q;
+ /* alloc and init root group. */
rcu_read_lock();
spin_lock_irq(q->queue_lock);
- td->root_tg = throtl_get_tg(td, &blkio_root_cgroup);
+ blkg = blkg_lookup_create(&blkio_root_cgroup, q, BLKIO_POLICY_THROTL,
+ true);
+ if (!IS_ERR(blkg))
+ td->root_tg = tg_of_blkg(blkg);
spin_unlock_irq(q->queue_lock);
rcu_read_unlock();
@@ -1277,9 +1229,6 @@ int blk_throtl_init(struct request_queue *q)
kfree(td);
return -ENOMEM;
}
-
- /* Attach throtl data to request queue */
- q->td = td;
return 0;
}
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 1c3f41b..acef564 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1048,10 +1048,12 @@ static void cfq_update_blkio_group_weight(struct request_queue *q,
cfqg->needs_update = true;
}
-static void cfq_init_add_cfqg_lists(struct cfq_data *cfqd,
- struct cfq_group *cfqg, struct blkio_cgroup *blkcg)
+static void cfq_link_blkio_group(struct request_queue *q,
+ struct blkio_group *blkg)
{
- struct backing_dev_info *bdi = &cfqd->queue->backing_dev_info;
+ struct cfq_data *cfqd = q->elevator->elevator_data;
+ struct backing_dev_info *bdi = &q->backing_dev_info;
+ struct cfq_group *cfqg = cfqg_of_blkg(blkg);
unsigned int major, minor;
/*
@@ -1062,34 +1064,26 @@ static void cfq_init_add_cfqg_lists(struct cfq_data *cfqd,
*/
if (bdi->dev) {
sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
- cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg,
- cfqd->queue, MKDEV(major, minor));
- } else
- cfq_blkiocg_add_blkio_group(blkcg, &cfqg->blkg,
- cfqd->queue, 0);
+ blkg->dev = MKDEV(major, minor);
+ }
cfqd->nr_blkcg_linked_grps++;
- cfqg->weight = blkcg_get_weight(blkcg, cfqg->blkg.dev);
/* Add group on cfqd list */
hlist_add_head(&cfqg->cfqd_node, &cfqd->cfqg_list);
}
-/*
- * Should be called from sleepable context. No request queue lock as per
- * cpu stats are allocated dynamically and alloc_percpu needs to be called
- * from sleepable context.
- */
-static struct cfq_group * cfq_alloc_cfqg(struct cfq_data *cfqd)
+static struct blkio_group *cfq_alloc_blkio_group(struct request_queue *q,
+ struct blkio_cgroup *blkcg)
{
struct cfq_group *cfqg;
- int ret;
- cfqg = kzalloc_node(sizeof(*cfqg), GFP_ATOMIC, cfqd->queue->node);
+ cfqg = kzalloc_node(sizeof(*cfqg), GFP_ATOMIC, q->node);
if (!cfqg)
return NULL;
cfq_init_cfqg_base(cfqg);
+ cfqg->weight = blkcg_get_weight(blkcg, cfqg->blkg.dev);
/*
* Take the initial reference that will be released on destroy
@@ -1099,90 +1093,38 @@ static struct cfq_group * cfq_alloc_cfqg(struct cfq_data *cfqd)
*/
cfqg->ref = 1;
- ret = blkio_alloc_blkg_stats(&cfqg->blkg);
- if (ret) {
- kfree(cfqg);
- return NULL;
- }
-
- return cfqg;
-}
-
-static struct cfq_group *
-cfq_find_cfqg(struct cfq_data *cfqd, struct blkio_cgroup *blkcg)
-{
- struct cfq_group *cfqg = NULL;
- struct backing_dev_info *bdi = &cfqd->queue->backing_dev_info;
- unsigned int major, minor;
-
- /*
- * This is the common case when there are no blkio cgroups.
- * Avoid lookup in this case
- */
- if (blkcg == &blkio_root_cgroup)
- cfqg = cfqd->root_group;
- else
- cfqg = cfqg_of_blkg(blkiocg_lookup_group(blkcg, cfqd->queue,
- BLKIO_POLICY_PROP));
-
- if (cfqg && !cfqg->blkg.dev && bdi->dev && dev_name(bdi->dev)) {
- sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
- cfqg->blkg.dev = MKDEV(major, minor);
- }
-
- return cfqg;
+ return &cfqg->blkg;
}
/*
* Search for the cfq group current task belongs to. request_queue lock must
* be held.
*/
-static struct cfq_group *cfq_get_cfqg(struct cfq_data *cfqd,
- struct blkio_cgroup *blkcg)
+static struct cfq_group *cfq_lookup_create_cfqg(struct cfq_data *cfqd,
+ struct blkio_cgroup *blkcg)
{
- struct cfq_group *cfqg = NULL, *__cfqg = NULL;
struct request_queue *q = cfqd->queue;
+ struct backing_dev_info *bdi = &q->backing_dev_info;
+ struct cfq_group *cfqg = NULL;
- cfqg = cfq_find_cfqg(cfqd, blkcg);
- if (cfqg)
- return cfqg;
-
- if (!css_tryget(&blkcg->css))
- return NULL;
-
- /*
- * Need to allocate a group. Allocation of group also needs allocation
- * of per cpu stats which in-turn takes a mutex() and can block. Hence
- * we need to drop rcu lock and queue_lock before we call alloc.
- *
- * Not taking any queue reference here and assuming that queue is
- * around by the time we return. CFQ queue allocation code does
- * the same. It might be racy though.
- */
- rcu_read_unlock();
- spin_unlock_irq(q->queue_lock);
-
- cfqg = cfq_alloc_cfqg(cfqd);
+ /* avoid lookup for the common case where there's no blkio cgroup */
+ if (blkcg == &blkio_root_cgroup) {
+ cfqg = cfqd->root_group;
+ } else {
+ struct blkio_group *blkg;
- spin_lock_irq(q->queue_lock);
- rcu_read_lock();
- css_put(&blkcg->css);
+ blkg = blkg_lookup_create(blkcg, q, BLKIO_POLICY_PROP, false);
+ if (!IS_ERR(blkg))
+ cfqg = cfqg_of_blkg(blkg);
+ }
- /*
- * If some other thread already allocated the group while we were
- * not holding queue lock, free up the group
- */
- __cfqg = cfq_find_cfqg(cfqd, blkcg);
+ if (cfqg && !cfqg->blkg.dev && bdi->dev && dev_name(bdi->dev)) {
+ unsigned int major, minor;
- if (__cfqg) {
- kfree(cfqg);
- return __cfqg;
+ sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
+ cfqg->blkg.dev = MKDEV(major, minor);
}
- if (!cfqg)
- cfqg = cfqd->root_group;
-
- cfq_init_add_cfqg_lists(cfqd, cfqg, blkcg);
return cfqg;
}
@@ -1294,8 +1236,8 @@ static bool cfq_clear_queue(struct request_queue *q)
}
#else /* GROUP_IOSCHED */
-static struct cfq_group *cfq_get_cfqg(struct cfq_data *cfqd,
- struct blkio_cgroup *blkcg)
+static struct cfq_group *cfq_lookup_create_cfqg(struct cfq_data *cfqd,
+ struct blkio_cgroup *blkcg)
{
return cfqd->root_group;
}
@@ -2887,7 +2829,8 @@ retry:
blkcg = task_blkio_cgroup(current);
- cfqg = cfq_get_cfqg(cfqd, blkcg);
+ cfqg = cfq_lookup_create_cfqg(cfqd, blkcg);
+
cic = cfq_cic_lookup(cfqd, ioc);
/* cic always exists here */
cfqq = cic_to_cfqq(cic, is_sync);
@@ -3694,6 +3637,7 @@ static void cfq_exit_queue(struct elevator_queue *e)
static int cfq_init_queue(struct request_queue *q)
{
struct cfq_data *cfqd;
+ struct blkio_group *blkg __maybe_unused;
int i;
cfqd = kmalloc_node(sizeof(*cfqd), GFP_KERNEL | __GFP_ZERO, q->node);
@@ -3711,7 +3655,10 @@ static int cfq_init_queue(struct request_queue *q)
rcu_read_lock();
spin_lock_irq(q->queue_lock);
- cfqd->root_group = cfq_get_cfqg(cfqd, &blkio_root_cgroup);
+ blkg = blkg_lookup_create(&blkio_root_cgroup, q, BLKIO_POLICY_PROP,
+ true);
+ if (!IS_ERR(blkg))
+ cfqd->root_group = cfqg_of_blkg(blkg);
spin_unlock_irq(q->queue_lock);
rcu_read_unlock();
@@ -3897,6 +3844,8 @@ static struct elevator_type iosched_cfq = {
#ifdef CONFIG_CFQ_GROUP_IOSCHED
static struct blkio_policy_type blkio_policy_cfq = {
.ops = {
+ .blkio_alloc_group_fn = cfq_alloc_blkio_group,
+ .blkio_link_group_fn = cfq_link_blkio_group,
.blkio_unlink_group_fn = cfq_unlink_blkio_group,
.blkio_clear_queue_fn = cfq_clear_queue,
.blkio_update_group_weight_fn = cfq_update_blkio_group_weight,
diff --git a/block/cfq.h b/block/cfq.h
index 343b78a..3987601 100644
--- a/block/cfq.h
+++ b/block/cfq.h
@@ -67,12 +67,6 @@ static inline void cfq_blkiocg_update_completion_stats(struct blkio_group *blkg,
direction, sync);
}
-static inline void cfq_blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
- struct blkio_group *blkg, struct request_queue *q, dev_t dev)
-{
- blkiocg_add_blkio_group(blkcg, blkg, q, dev, BLKIO_POLICY_PROP);
-}
-
static inline int cfq_blkiocg_del_blkio_group(struct blkio_group *blkg)
{
return blkiocg_del_blkio_group(blkg);
@@ -105,8 +99,6 @@ static inline void cfq_blkiocg_update_dispatch_stats(struct blkio_group *blkg,
uint64_t bytes, bool direction, bool sync) {}
static inline void cfq_blkiocg_update_completion_stats(struct blkio_group *blkg, uint64_t start_time, uint64_t io_start_time, bool direction, bool sync) {}
-static inline void cfq_blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
- struct blkio_group *blkg, struct request_queue *q, dev_t dev) {}
static inline int cfq_blkiocg_del_blkio_group(struct blkio_group *blkg)
{
return 0;
--
1.7.7.3
next prev parent reply other threads:[~2012-02-22 1:53 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-22 1:46 [PATCHSET] blkcg: accumulated blkcg updates Tejun Heo
2012-02-22 1:46 ` [PATCH 01/36] block: blk-throttle should be drained regardless of q->elevator Tejun Heo
2012-02-22 1:46 ` [PATCH 02/36] blkcg: make CONFIG_BLK_CGROUP bool Tejun Heo
2012-02-22 1:46 ` [PATCH 03/36] cfq: don't register propio policy if !CONFIG_CFQ_GROUP_IOSCHED Tejun Heo
2012-02-22 1:46 ` [PATCH 04/36] elevator: clear auxiliary data earlier during elevator switch Tejun Heo
2012-02-22 1:46 ` [PATCH 05/36] elevator: make elevator_init_fn() return 0/-errno Tejun Heo
2012-02-22 1:46 ` [PATCH 06/36] block: implement blk_queue_bypass_start/end() Tejun Heo
2012-02-22 1:46 ` [PATCH 07/36] block: extend queue bypassing to cover blkcg policies Tejun Heo
2012-02-22 1:46 ` [PATCH 08/36] blkcg: shoot down blkio_groups on elevator switch Tejun Heo
2012-02-22 1:46 ` [PATCH 09/36] blkcg: move rcu_read_lock() outside of blkio_group get functions Tejun Heo
2012-02-22 1:46 ` [PATCH 10/36] blkcg: update blkg get functions take blkio_cgroup as parameter Tejun Heo
2012-02-22 1:46 ` [PATCH 11/36] blkcg: use q and plid instead of opaque void * for blkio_group association Tejun Heo
2012-02-22 1:46 ` [PATCH 12/36] blkcg: add blkio_policy[] array and allow one policy per policy ID Tejun Heo
2012-02-22 1:46 ` [PATCH 13/36] blkcg: use the usual get blkg path for root blkio_group Tejun Heo
2012-02-22 1:46 ` Tejun Heo [this message]
2012-02-22 1:46 ` [PATCH 15/36] blkcg: don't allow or retain configuration of missing devices Tejun Heo
2012-02-22 1:46 ` [PATCH 16/36] blkcg: kill blkio_policy_node Tejun Heo
2012-02-22 1:46 ` [PATCH 17/36] blkcg: kill the mind-bending blkg->dev Tejun Heo
2012-02-22 1:46 ` [PATCH 18/36] blkcg: let blkio_group point to blkio_cgroup directly Tejun Heo
2012-02-22 1:46 ` [PATCH 19/36] blkcg: add blkcg_{init|drain|exit}_queue() Tejun Heo
2012-02-22 1:46 ` [PATCH 20/36] blkcg: clear all request_queues on blkcg policy [un]registrations Tejun Heo
2012-02-22 1:46 ` [PATCH 21/36] blkcg: let blkcg core handle policy private data allocation Tejun Heo
2012-02-22 1:46 ` [PATCH 22/36] blkcg: move refcnt to blkcg core Tejun Heo
2012-02-22 1:46 ` [PATCH 23/36] blkcg: make blkg->pd an array and move configuration and stats into it Tejun Heo
2012-02-22 1:46 ` [PATCH 24/36] blkcg: don't use blkg->plid in stat related functions Tejun Heo
2012-02-22 1:46 ` [PATCH 25/36] blkcg: move per-queue blkg list heads and counters to queue and blkg Tejun Heo
2012-02-22 1:46 ` [PATCH 26/36] blkcg: let blkcg core manage per-queue blkg list and counter Tejun Heo
2012-02-22 1:46 ` [PATCH 27/36] blkcg: unify blkg's for blkcg policies Tejun Heo
2012-03-05 21:01 ` [PATCH UPDATED " Tejun Heo
2012-02-22 1:46 ` [PATCH 28/36] blkcg: use double locking instead of RCU for blkg synchronization Tejun Heo
2012-02-22 1:46 ` [PATCH 29/36] blkcg: drop unnecessary RCU locking Tejun Heo
2012-02-23 18:51 ` [PATCH UPDATED " Tejun Heo
2012-02-22 1:46 ` [PATCH 30/36] block: restructure get_request() Tejun Heo
2012-02-22 1:46 ` [PATCH 31/36] block: interface update for ioc/icq creation functions Tejun Heo
2012-02-22 1:46 ` [PATCH 32/36] block: ioc_task_link() can't fail Tejun Heo
2012-02-22 1:47 ` [PATCH 33/36] block: add io_context->active_ref Tejun Heo
2012-02-22 18:47 ` Vivek Goyal
2012-02-22 19:13 ` Tejun Heo
2012-02-23 18:20 ` Vivek Goyal
2012-02-22 1:47 ` [PATCH 34/36] block: implement bio_associate_current() Tejun Heo
2012-02-22 13:45 ` Jeff Moyer
2012-02-22 19:07 ` Tejun Heo
2012-02-22 19:33 ` Jeff Moyer
2012-02-22 19:37 ` Vivek Goyal
2012-02-22 19:41 ` Jeff Moyer
2012-02-22 1:47 ` [PATCH 35/36] block: make block cgroup policies follow bio task association Tejun Heo
2012-02-22 1:47 ` [PATCH 36/36] block: make blk-throttle preserve the issuing task on delayed bios Tejun Heo
2012-02-22 19:34 ` [PATCHSET] blkcg: accumulated blkcg updates Vivek Goyal
2012-02-22 22:04 ` Tejun Heo
2012-03-05 20:59 ` [PATCH 17.5] blkcg: skip blkg printing if q isn't associated with disk Tejun Heo
2012-03-05 21:07 ` [PATCHSET] blkcg: accumulated blkcg updates Tejun Heo
2012-03-05 21:08 ` Tejun Heo
2012-03-06 15:07 ` Vivek Goyal
2012-03-06 16:24 ` Vivek Goyal
2012-03-06 18:39 ` Vivek Goyal
2012-03-06 18:39 ` Vivek Goyal
2012-03-06 19:02 ` Vivek Goyal
2012-03-08 0:06 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1329875223-5102-15-git-send-email-tj@kernel.org \
--to=tj@kernel.org \
--cc=axboe@kernel.dk \
--cc=ctalbott@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=rni@google.com \
--cc=vgoyal@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.