From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bvanassche@acm.org>, Qian Cai <cai@redhat.com>,
John Garry <john.garry@huawei.com>,
Kashyap Desai <kashyap.desai@broadcom.com>,
Sumit Saxena <sumit.saxena@broadcom.com>,
Hannes Reinecke <hare@suse.de>, Ming Lei <ming.lei@redhat.com>
Subject: [PATCH 1/3] blk-mq: add new API of blk_mq_hctx_set_fq_lock_class
Date: Thu, 12 Nov 2020 15:55:24 +0800 [thread overview]
Message-ID: <20201112075526.947079-2-ming.lei@redhat.com> (raw)
In-Reply-To: <20201112075526.947079-1-ming.lei@redhat.com>
flush_end_io() may be called recursively from some driver, such as
nvme-loop, so lockdep may complain 'possible recursive locking'.
Commit b3c6a5997541("block: Fix a lockdep complaint triggered by
request queue flushing") tried to address this issue by assigning
dynamically allocated per-flush-queue lock class. This solution
adds synchronize_rcu() for each hctx's release handler, and causes
horrible SCSI MQ probe delay(more than half an hour on megaraid sas).
Add new API of blk_mq_hctx_set_fq_lock_class() for these drivers, so
we just need to use driver specific lock class for avoiding the
lockdep warning of 'possible recursive locking'.
Reported-by: Qian Cai <cai@redhat.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/blk-flush.c | 25 +++++++++++++++++++++++++
include/linux/blk-mq.h | 3 +++
2 files changed, 28 insertions(+)
diff --git a/block/blk-flush.c b/block/blk-flush.c
index e32958f0b687..657743524e15 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -490,3 +490,28 @@ void blk_free_flush_queue(struct blk_flush_queue *fq)
kfree(fq->flush_rq);
kfree(fq);
}
+
+/*
+ * Allow driver to set its own lock class to fq->mq_flush_lock for
+ * avoiding lockdep complaint.
+ *
+ * flush_end_io() may be called recursively from some driver, such as
+ * nvme-loop, so lockdep may complain 'possible recursive locking' because
+ * all 'struct blk_flush_queue' instance share same mq_flush_lock lock class
+ * key. We need to assign different lock class for these driver's
+ * fq->mq_flush_lock for avoiding the lockdep warning.
+ *
+ * Use dynamically allocated lock class key for each 'blk_flush_queue'
+ * instance is over-kill, and more worse it introduces horrible boot delay
+ * issue because synchronize_rcu() is implied in lockdep_unregister_key which
+ * is called for each hctx release. SCSI probing may synchronously create and
+ * destroy lots of MQ request_queues for non-existent devices, and some robot
+ * test kernel always enable lockdep option. It is observed that more than half
+ * an hour is taken during SCSI MQ probe with per-fq lock class.
+ */
+void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
+ struct lock_class_key *key)
+{
+ lockdep_set_class(&hctx->fq->mq_flush_lock, key);
+}
+EXPORT_SYMBOL_GPL(blk_mq_hctx_set_fq_lock_class);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 794b2a33a2c3..5f639240760e 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -5,6 +5,7 @@
#include <linux/blkdev.h>
#include <linux/sbitmap.h>
#include <linux/srcu.h>
+#include <linux/lockdep.h>
struct blk_mq_tags;
struct blk_flush_queue;
@@ -594,5 +595,7 @@ static inline void blk_mq_cleanup_rq(struct request *rq)
}
blk_qc_t blk_mq_submit_bio(struct bio *bio);
+void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
+ struct lock_class_key *key);
#endif
--
2.25.4
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
next prev parent reply other threads:[~2020-11-12 7:55 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-12 7:55 [PATCH 0/3] blk-mq/nvme-loop: use nvme-loop's lock class for addressing lockdep false positive warning Ming Lei
2020-11-12 7:55 ` Ming Lei [this message]
2020-11-16 17:26 ` [PATCH 1/3] blk-mq: add new API of blk_mq_hctx_set_fq_lock_class Christoph Hellwig
2020-11-17 1:04 ` Ming Lei
2020-11-17 1:16 ` Ming Lei
2020-11-17 6:46 ` Kashyap Desai
2020-11-12 7:55 ` [PATCH 2/3] nvme-loop: use blk_mq_hctx_set_fq_lock_class to set loop's lock class Ming Lei
2020-11-16 17:27 ` Christoph Hellwig
2020-11-12 7:55 ` [PATCH 3/3] Revert "block: Fix a lockdep complaint triggered by request queue flushing" Ming Lei
2020-11-12 15:12 ` Bart Van Assche
2020-11-12 15:29 ` Bart Van Assche
2020-11-13 1:11 ` Ming Lei
2020-11-13 1:27 ` [PATCH V2 " Ming Lei
2020-11-16 17:27 ` Christoph Hellwig
2020-11-25 1:31 ` [PATCH 0/3] blk-mq/nvme-loop: use nvme-loop's lock class for addressing lockdep false positive warning Ming Lei
2020-11-30 2:36 ` Ming Lei
2020-12-02 19:07 ` Qian Cai
2020-12-03 0:50 ` Ming Lei
2020-12-03 1:23 ` Qian Cai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201112075526.947079-2-ming.lei@redhat.com \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=cai@redhat.com \
--cc=hare@suse.de \
--cc=hch@lst.de \
--cc=john.garry@huawei.com \
--cc=kashyap.desai@broadcom.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sumit.saxena@broadcom.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox