From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, chuck.lever@oracle.com
Cc: bvanassche@acm.org, cel@kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH v2] blk-mq: release scheduler resource when request complete
Date: Sun, 13 Aug 2023 23:23:25 +0800
Message-ID: <20230813152325.3017343-1-chengming.zhou@linux.dev>
From: Chengming Zhou <zhouchengming@bytedance.com>
Chuck reported [1] an IO hang on NFS exports that reside on SATA devices
and bisected it to commit 615939a2ae73 ("blk-mq: defer to the normal
submission path for post-flush requests").
We analysed the IO hang and found two postflush requests waiting for
each other.
The first postflush request completed the REQ_FSEQ_DATA sequence, so it
moved on to the REQ_FSEQ_POSTFLUSH sequence and was added to the flush
pending list, but blk_kick_flush() failed because the second postflush
request is still in flight, waiting in the scheduler queue.
The second postflush request, waiting in the scheduler queue, can't be
dispatched because the first postflush request hasn't released its
scheduler resource even though it has already completed.
Fix it by releasing the scheduler resource when the first postflush
request completes, so the second postflush request can be dispatched and
completed, which then lets blk_kick_flush() succeed.
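The circular wait can be illustrated with a small userspace model. This is
only a sketch, not kernel code: every toy_* name is hypothetical, the
scheduler is assumed to have a single dispatch slot, and the flush kick is
reduced to "both data steps are done". It compares releasing the slot at
free time (old behaviour) with releasing it at completion time (this patch).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scheduler with a fixed number of dispatch slots. */
struct toy_sched {
	int free_slots;
};

struct toy_req {
	bool holds_slot;	/* scheduler resource still held */
	bool data_done;		/* stands in for REQ_FSEQ_DATA being finished */
};

/* Dispatch succeeds only if the scheduler still has a free slot. */
static bool toy_dispatch(struct toy_sched *s, struct toy_req *rq)
{
	assert(!rq->holds_slot);	/* a request is dispatched at most once */
	if (s->free_slots == 0)
		return false;
	s->free_slots--;
	rq->holds_slot = true;
	return true;
}

/* Give the slot back to the scheduler (the ->finish_request() analogue). */
static void toy_finish_request(struct toy_sched *s, struct toy_req *rq)
{
	if (rq->holds_slot) {
		rq->holds_slot = false;
		s->free_slots++;
	}
}

/* Complete the data step; optionally release the slot right away. */
static void toy_complete_data(struct toy_sched *s, struct toy_req *rq,
			      bool release_on_complete)
{
	rq->data_done = true;
	if (release_on_complete)
		toy_finish_request(s, rq);
}

/* Simplification: the flush can be kicked once no data step is pending. */
static bool toy_kick_flush(const struct toy_req *a, const struct toy_req *b)
{
	return a->data_done && b->data_done;
}

/* Returns true if the flush can be kicked, false if the model hangs. */
static bool toy_run(bool release_on_complete)
{
	struct toy_sched s = { .free_slots = 1 };
	struct toy_req first = { 0 }, second = { 0 };

	toy_dispatch(&s, &first);	/* first takes the only slot */
	toy_complete_data(&s, &first, release_on_complete);

	/* second sits in the scheduler queue until a slot is free */
	if (toy_dispatch(&s, &second))
		toy_complete_data(&s, &second, release_on_complete);

	return toy_kick_flush(&first, &second);
}
```

In this model toy_run(false) never gets the second request dispatched, so
the flush is never kicked, while toy_run(true) makes progress.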
[1] https://lore.kernel.org/all/7A57C7AE-A51A-4254-888B-FE15CA21F9E9@oracle.com/
Fixes: 615939a2ae73 ("blk-mq: defer to the normal submission path for post-flush requests")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
---
v2:
- All IO schedulers set ->finish_request(), so remove the NULL check at
  the call site and instead warn and fail in elv_register() if it is
  not set.
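
The idea behind the v2 change, validating a mandatory callback once at
registration so the hot path can call it unconditionally, can be sketched
in plain C. This is an illustration under assumptions, not the kernel
implementation: the toy_* types and functions are hypothetical and the
WARN_ON_ONCE() is left out.

```c
#include <errno.h>
#include <stddef.h>

struct toy_request;

/* Hypothetical, trimmed-down stand-in for the elevator ops table. */
struct toy_elevator_ops {
	void (*insert_requests)(struct toy_request *rq);
	struct toy_request *(*dispatch_request)(void);
	void (*finish_request)(struct toy_request *rq);
};

/* Reject an ops table that lacks any mandatory callback. */
static int toy_elv_register(const struct toy_elevator_ops *ops)
{
	if (!ops->finish_request)
		return -EINVAL;
	if (!ops->insert_requests || !ops->dispatch_request)
		return -EINVAL;
	return 0;
}

/* Stub callbacks, used only to build example ops tables. */
static void toy_insert(struct toy_request *rq) { (void)rq; }
static struct toy_request *toy_dispatch_one(void) { return NULL; }
static void toy_finish(struct toy_request *rq) { (void)rq; }

static const struct toy_elevator_ops toy_complete_ops = {
	.insert_requests  = toy_insert,
	.dispatch_request = toy_dispatch_one,
	.finish_request   = toy_finish,
};

static const struct toy_elevator_ops toy_missing_finish_ops = {
	.insert_requests  = toy_insert,
	.dispatch_request = toy_dispatch_one,
	/* .finish_request deliberately left NULL */
};
```

Registering toy_complete_ops succeeds, while toy_missing_finish_ops is
rejected with -EINVAL before any request could reach a NULL callback.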
---
 block/blk-mq.c   | 16 ++++++++++++----
 block/elevator.c |  3 +++
 2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f14b8669ac69..a8c63bef8ff1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -682,6 +682,14 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 }
 EXPORT_SYMBOL_GPL(blk_mq_alloc_request_hctx);
 
+static void blk_mq_finish_request(struct request *rq)
+{
+	struct request_queue *q = rq->q;
+
+	if (rq->rq_flags & RQF_USE_SCHED)
+		q->elevator->type->ops.finish_request(rq);
+}
+
 static void __blk_mq_free_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
@@ -708,10 +716,6 @@ void blk_mq_free_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
 
-	if ((rq->rq_flags & RQF_USE_SCHED) &&
-	    q->elevator->type->ops.finish_request)
-		q->elevator->type->ops.finish_request(rq);
-
 	if (unlikely(laptop_mode && !blk_rq_is_passthrough(rq)))
 		laptop_io_completion(q->disk->bdi);
 
@@ -1021,6 +1025,8 @@ inline void __blk_mq_end_request(struct request *rq, blk_status_t error)
 	if (blk_mq_need_time_stamp(rq))
 		__blk_mq_end_request_acct(rq, ktime_get_ns());
 
+	blk_mq_finish_request(rq);
+
 	if (rq->end_io) {
 		rq_qos_done(rq->q, rq);
 		if (rq->end_io(rq, error) == RQ_END_IO_FREE)
@@ -1075,6 +1081,8 @@ void blk_mq_end_request_batch(struct io_comp_batch *iob)
 		if (iob->need_ts)
 			__blk_mq_end_request_acct(rq, now);
 
+		blk_mq_finish_request(rq);
+
 		rq_qos_done(rq->q, rq);
 
 		/*
diff --git a/block/elevator.c b/block/elevator.c
index 8400e303fbcb..ac2cb3814eac 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -499,6 +499,9 @@ void elv_unregister_queue(struct request_queue *q)
 
 int elv_register(struct elevator_type *e)
 {
+	if (WARN_ON_ONCE(!e->ops.finish_request))
+		return -EINVAL;
+
 	/* insert_requests and dispatch_request are mandatory */
 	if (WARN_ON_ONCE(!e->ops.insert_requests || !e->ops.dispatch_request))
 		return -EINVAL;
--
2.41.0