* [PATCH 1/4] block: collapse blk_alloc_request() into get_request()
[not found] ` <1334878164-24788-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2012-04-19 23:29 ` Tejun Heo
2012-04-19 23:29 ` [PATCH 2/4] block: fix elvpriv allocation failure handling Tejun Heo
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2012-04-19 23:29 UTC
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: vgoyal-H+wXaHxf7aLQT0dZR+AlfA, ctalbott-hpIqsD4AKlfQT0dZR+AlfA,
rni-hpIqsD4AKlfQT0dZR+AlfA, cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo
Allocation failure handling in get_request() is about to be updated.
To ease the update, collapse blk_alloc_request() into get_request().
This patch doesn't introduce any functional change.
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
block/blk-core.c | 46 +++++++++++++++++-----------------------------
1 files changed, 17 insertions(+), 29 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 3b02ba3..f6f68b0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -719,33 +719,6 @@ static inline void blk_free_request(struct request_queue *q, struct request *rq)
mempool_free(rq, q->rq.rq_pool);
}
-static struct request *
-blk_alloc_request(struct request_queue *q, struct bio *bio, struct io_cq *icq,
- unsigned int flags, gfp_t gfp_mask)
-{
- struct request *rq = mempool_alloc(q->rq.rq_pool, gfp_mask);
-
- if (!rq)
- return NULL;
-
- blk_rq_init(q, rq);
-
- rq->cmd_flags = flags | REQ_ALLOCED;
-
- if (flags & REQ_ELVPRIV) {
- rq->elv.icq = icq;
- if (unlikely(elv_set_request(q, rq, bio, gfp_mask))) {
- mempool_free(rq, q->rq.rq_pool);
- return NULL;
- }
- /* @rq->elv.icq holds on to io_context until @rq is freed */
- if (icq)
- get_io_context(icq->ioc);
- }
-
- return rq;
-}
-
/*
* ioc_batching returns true if the ioc is a valid batching request and
* should be given priority access to a request.
@@ -968,10 +941,25 @@ retry:
goto fail_alloc;
}
- rq = blk_alloc_request(q, bio, icq, rw_flags, gfp_mask);
- if (unlikely(!rq))
+ /* allocate and init request */
+ rq = mempool_alloc(q->rq.rq_pool, gfp_mask);
+ if (!rq)
goto fail_alloc;
+ blk_rq_init(q, rq);
+ rq->cmd_flags = rw_flags | REQ_ALLOCED;
+
+ if (rw_flags & REQ_ELVPRIV) {
+ rq->elv.icq = icq;
+ if (unlikely(elv_set_request(q, rq, bio, gfp_mask))) {
+ mempool_free(rq, q->rq.rq_pool);
+ goto fail_alloc;
+ }
+ /* @rq->elv.icq holds on to io_context until @rq is freed */
+ if (icq)
+ get_io_context(icq->ioc);
+ }
+
/*
* ioc may be NULL here, and ioc_batching will be false. That's
* OK, if the queue is under the request limit then requests need
--
1.7.7.3
* [PATCH 2/4] block: fix elvpriv allocation failure handling
From: Tejun Heo @ 2012-04-19 23:29 UTC
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: vgoyal-H+wXaHxf7aLQT0dZR+AlfA, ctalbott-hpIqsD4AKlfQT0dZR+AlfA,
rni-hpIqsD4AKlfQT0dZR+AlfA, cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo
Request allocation is mempool backed to guarantee forward progress
under memory pressure; unfortunately, this property got broken while
adding elvpriv data. Failures during elvpriv allocation, including
ioc and icq creation failures, currently make get_request() fail as
a whole. There's no forward progress guarantee for these allocations:
they may fail indefinitely under memory pressure, stalling IO and
deadlocking the system.
This patch updates get_request() such that elvpriv allocation failure
doesn't make the whole function fail. If elvpriv allocation fails,
the allocation is degraded into !ELVPRIV. This will force the request
to ELEVATOR_INSERT_BACK, disturbing scheduling, but elvpriv alloc
failures should be rare (nothing is per-request) and anything is
better than deadlocking.
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
block/blk-core.c | 53 ++++++++++++++++++++++++++++++++++++-----------------
1 files changed, 36 insertions(+), 17 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index f6f68b0..6cf13df 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -29,6 +29,7 @@
#include <linux/fault-inject.h>
#include <linux/list_sort.h>
#include <linux/delay.h>
+#include <linux/ratelimit.h>
#define CREATE_TRACE_POINTS
#include <trace/events/block.h>
@@ -930,17 +931,6 @@ retry:
rw_flags |= REQ_IO_STAT;
spin_unlock_irq(q->queue_lock);
- /* create icq if missing */
- if ((rw_flags & REQ_ELVPRIV) && unlikely(et->icq_cache && !icq)) {
- create_io_context(gfp_mask, q->node);
- ioc = rq_ioc(bio);
- if (!ioc)
- goto fail_alloc;
- icq = ioc_create_icq(ioc, q, gfp_mask);
- if (!icq)
- goto fail_alloc;
- }
-
/* allocate and init request */
rq = mempool_alloc(q->rq.rq_pool, gfp_mask);
if (!rq)
@@ -949,17 +939,28 @@ retry:
blk_rq_init(q, rq);
rq->cmd_flags = rw_flags | REQ_ALLOCED;
+ /* init elvpriv */
if (rw_flags & REQ_ELVPRIV) {
- rq->elv.icq = icq;
- if (unlikely(elv_set_request(q, rq, bio, gfp_mask))) {
- mempool_free(rq, q->rq.rq_pool);
- goto fail_alloc;
+ if (unlikely(et->icq_cache && !icq)) {
+ create_io_context(gfp_mask, q->node);
+ ioc = rq_ioc(bio);
+ if (!ioc)
+ goto fail_elvpriv;
+
+ icq = ioc_create_icq(ioc, q, gfp_mask);
+ if (!icq)
+ goto fail_elvpriv;
}
- /* @rq->elv.icq holds on to io_context until @rq is freed */
+
+ rq->elv.icq = icq;
+ if (unlikely(elv_set_request(q, rq, bio, gfp_mask)))
+ goto fail_elvpriv;
+
+ /* @rq->elv.icq holds io_context until @rq is freed */
if (icq)
get_io_context(icq->ioc);
}
-
+out:
/*
* ioc may be NULL here, and ioc_batching will be false. That's
* OK, if the queue is under the request limit then requests need
@@ -972,6 +973,24 @@ retry:
trace_block_getrq(q, bio, rw_flags & 1);
return rq;
+fail_elvpriv:
+ /*
+ * elvpriv init failed. ioc, icq and elvpriv aren't mempool backed
+ * and may fail indefinitely under memory pressure and thus
+ * shouldn't stall IO. Treat this request as !elvpriv. This will
+ * disturb iosched and blkcg but weird is better than dead.
+ */
+ printk_ratelimited(KERN_WARNING "%s: request aux data allocation failed, iosched may be disturbed\n",
+ dev_name(q->backing_dev_info.dev));
+
+ rq->cmd_flags &= ~REQ_ELVPRIV;
+ rq->elv.icq = NULL;
+
+ spin_lock_irq(q->queue_lock);
+ rl->elvpriv--;
+ spin_unlock_irq(q->queue_lock);
+ goto out;
+
fail_alloc:
/*
* Allocation failed presumably due to memory. Undo anything we
--
1.7.7.3
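
The degrade-instead-of-fail behaviour described in the 2/4 commit
message can be modelled in userspace. The following is only a
minimal sketch, not the kernel code: toy_request, toy_queue,
toy_alloc_aux() and the aux_alloc_fails knob are hypothetical names
invented for illustration, malloc() stands in for mempool_alloc(),
and the locking around the elvpriv counter is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Flag values are illustrative, not the kernel's bit assignments. */
#define REQ_ALLOCED (1u << 0)
#define REQ_ELVPRIV (1u << 1)

/* Hypothetical stand-in for struct request. */
struct toy_request {
	unsigned int cmd_flags;
	void *elv_priv;		/* optional scheduler-private data */
};

/* Hypothetical stand-in for the queue's request accounting. */
struct toy_queue {
	int elvpriv;		/* requests currently carrying scheduler data */
	bool aux_alloc_fails;	/* test knob: simulate memory pressure */
};

/* Auxiliary allocation with no forward-progress guarantee. */
static void *toy_alloc_aux(struct toy_queue *q)
{
	return q->aux_alloc_fails ? NULL : malloc(16);
}

static struct toy_request *toy_get_request(struct toy_queue *q,
					   unsigned int flags)
{
	/* Mandatory allocation; stands in for the mempool-backed one. */
	struct toy_request *rq = malloc(sizeof(*rq));

	if (!rq)
		return NULL;

	rq->cmd_flags = flags | REQ_ALLOCED;
	rq->elv_priv = NULL;

	if (flags & REQ_ELVPRIV) {
		q->elvpriv++;
		rq->elv_priv = toy_alloc_aux(q);
		if (!rq->elv_priv) {
			/*
			 * Aux data is optional: keep the request, drop the
			 * flag and undo the accounting instead of failing.
			 */
			rq->cmd_flags &= ~REQ_ELVPRIV;
			q->elvpriv--;
		}
	}
	return rq;
}

static void toy_put_request(struct toy_queue *q, struct toy_request *rq)
{
	if (rq->cmd_flags & REQ_ELVPRIV) {
		free(rq->elv_priv);
		q->elvpriv--;
	}
	free(rq);
}
```

With aux_alloc_fails set, toy_get_request(&q, REQ_ELVPRIV) still
returns a request, only without REQ_ELVPRIV set and with q.elvpriv
left unchanged, which mirrors what the fail_elvpriv path does.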
* [PATCH 3/4] blkcg: fix blkcg->css ref leak in __blkg_lookup_create()
From: Tejun Heo @ 2012-04-19 23:29 UTC
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: ctalbott-hpIqsD4AKlfQT0dZR+AlfA, rni-hpIqsD4AKlfQT0dZR+AlfA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA
__blkg_lookup_create() leaked blkcg->css ref if blkg allocation
failed. Fix it.
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
block/blk-cgroup.c | 19 +++++++++----------
1 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 8228385..30a7a9c 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -174,6 +174,7 @@ static struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
__releases(q->queue_lock) __acquires(q->queue_lock)
{
struct blkcg_gq *blkg;
+ int ret;
WARN_ON_ONCE(!rcu_read_lock_held());
lockdep_assert_held(q->queue_lock);
@@ -186,24 +187,22 @@ static struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
if (!css_tryget(&blkcg->css))
return ERR_PTR(-EINVAL);
- /*
- * Allocate and initialize.
- */
+ /* allocate */
+ ret = -ENOMEM;
blkg = blkg_alloc(blkcg, q);
-
- /* did alloc fail? */
- if (unlikely(!blkg)) {
- blkg = ERR_PTR(-ENOMEM);
- goto out;
- }
+ if (unlikely(!blkg))
+ goto err_put;
/* insert */
spin_lock(&blkcg->lock);
hlist_add_head_rcu(&blkg->blkcg_node, &blkcg->blkg_list);
list_add(&blkg->q_node, &q->blkg_list);
spin_unlock(&blkcg->lock);
-out:
return blkg;
+
+err_put:
+ css_put(&blkcg->css);
+ return ERR_PTR(ret);
}
struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
--
1.7.7.3
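
The leak fixed by 3/4 follows a common shape: a reference is taken
up front, and an early error return forgets to drop it. Below is a
rough userspace sketch of the corrected control flow, assuming a
plain integer refcount. toy_css, toy_blkg, toy_css_tryget(),
toy_css_put() and the alloc_fails parameter are hypothetical names
for illustration; the real code returns ERR_PTR() rather than an
out-parameter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the css and blkg objects. */
struct toy_css {
	int refcnt;
};

struct toy_blkg {
	struct toy_css *css;
};

/* Take a reference unless the object is already dead. */
static bool toy_css_tryget(struct toy_css *css)
{
	if (css->refcnt <= 0)
		return false;
	css->refcnt++;
	return true;
}

static void toy_css_put(struct toy_css *css)
{
	css->refcnt--;
}

/*
 * Returns a new blkg holding a css reference, or NULL with *err set.
 * alloc_fails simulates blkg allocation failure.
 */
static struct toy_blkg *toy_blkg_create(struct toy_css *css,
					bool alloc_fails, int *err)
{
	struct toy_blkg *blkg;
	int ret;

	if (!toy_css_tryget(css)) {
		*err = -EINVAL;
		return NULL;
	}

	/* allocate */
	ret = -ENOMEM;
	blkg = alloc_fails ? NULL : malloc(sizeof(*blkg));
	if (!blkg)
		goto err_put;

	blkg->css = css;	/* the reference is now owned by blkg */
	*err = 0;
	return blkg;

err_put:
	/* Every failure after tryget must drop the reference. */
	toy_css_put(css);
	*err = ret;
	return NULL;
}
```

Routing all post-tryget failures through a single err_put label
keeps the get/put pairing visible in one place, so a later error
path added to the function is less likely to reintroduce the leak.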
* Re: [PATCHSET] block: fixes for long standing issues
From: Jens Axboe @ 2012-04-20 8:10 UTC
To: Tejun Heo
Cc: ctalbott-hpIqsD4AKlfQT0dZR+AlfA, rni-hpIqsD4AKlfQT0dZR+AlfA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA
On Thu, Apr 19 2012, Tejun Heo wrote:
> Hello,
>
> This patchset fixes two long standing issues and one relatively new
> css ref leak.
>
> a. elvpriv alloc failure, including ioc and icq failures, fails
> request allocation. As those aren't mempool backed and may fail
> indefinitely, this can lead to deadlock under memory pressure.
>
> b. blkgs don't have proper indexing. With enough request_queues and
> blk-throttle enabled, the block layer can spend a considerable
> amount of CPU cycles walking the same list over and over again.
>
> c. __blkg_lookup_create() was leaking a css ref on failure path.
>
> This patchset contains the following four patches.
>
> 0001-block-collapse-blk_alloc_request-into-get_request.patch
> 0002-block-fix-elvpriv-allocation-failure-handling.patch
> 0003-blkcg-fix-blkcg-css-ref-leak-in-__blkg_lookup_create.patch
> 0004-blkcg-use-radix-tree-to-index-blkgs-from-blkcg.patch
>
> 0001-0002 fix #a. 0003 fixes #c. 0004 fixes #b.
>
> This patchset is on top of
>
> block/for-3.5/core 5bc4afb1ec "blkcg: drop BLKCG_STAT_{PRIV|POL|OFF} macros"
> + [1] [PATCHSET] block: per-queue policy activation, take#2
> + [2] [PATCHSET] block: cosmetic updates to blkcg API
Applied, thanks Tejun.
--
Jens Axboe