linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [for-4.16 PATCH v4 0/4] block/dm: allow DM to defer blk_register_queue() until ready
@ 2018-01-11 20:14 Mike Snitzer
  2018-01-11 20:14 ` [for-4.16 PATCH v4 1/4] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Mike Snitzer @ 2018-01-11 20:14 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block

Hi Jens,

I think this set is now ready for you to pick up for 4.16.

Hannes, I ran with your review feedback and think the result is
cleaner (avoids frivolously using another QUEUE_FLAG).

Ming, I didn't add your Reviewed-by to "block: allow gendisk's
request_queue registration to be deferred" because it changed enough,
albeit for the better, that it is best for you to review again.

Thanks,
Mike

Mike Snitzer (4):
  block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN
  block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue
  block: allow gendisk's request_queue registration to be deferred
  dm: fix awkward and incomplete request_queue initialization

 block/blk-sysfs.c     | 12 ++++++++----
 block/genhd.c         | 23 +++++++++++++++++++----
 drivers/md/dm-core.h  |  2 --
 drivers/md/dm-rq.c    | 11 -----------
 drivers/md/dm.c       | 41 ++++++++++++++++++++++-------------------
 include/linux/genhd.h |  5 +++++
 6 files changed, 54 insertions(+), 40 deletions(-)

-- 
2.15.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [for-4.16 PATCH v4 1/4] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN
  2018-01-11 20:14 [for-4.16 PATCH v4 0/4] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
@ 2018-01-11 20:14 ` Mike Snitzer
  2018-01-11 20:14 ` [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue Mike Snitzer
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Mike Snitzer @ 2018-01-11 20:14 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block

device_add_disk() will only call bdi_register_owner() if
!GENHD_FL_HIDDEN, so it follows that del_gendisk() should only call
bdi_unregister() if !GENHD_FL_HIDDEN.

Found with code inspection.  bdi_unregister() won't do any harm if
bdi_register_owner() wasn't used but best to avoid the unnecessary
call to bdi_unregister().

Fixes: 8ddcd65325 ("block: introduce GENHD_FL_HIDDEN")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 block/genhd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/genhd.c b/block/genhd.c
index 96a66f671720..00620e01e043 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -725,7 +725,8 @@ void del_gendisk(struct gendisk *disk)
 		 * Unregister bdi before releasing device numbers (as they can
 		 * get reused and we'd get clashes in sysfs).
 		 */
-		bdi_unregister(disk->queue->backing_dev_info);
+		if (!(disk->flags & GENHD_FL_HIDDEN))
+			bdi_unregister(disk->queue->backing_dev_info);
 		blk_unregister_queue(disk);
 	} else {
 		WARN_ON(1);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue
  2018-01-11 20:14 [for-4.16 PATCH v4 0/4] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
  2018-01-11 20:14 ` [for-4.16 PATCH v4 1/4] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
@ 2018-01-11 20:14 ` Mike Snitzer
  2018-01-12  0:28   ` Bart Van Assche
  2018-01-12  7:09   ` Ming Lei
  2018-01-11 20:14 ` [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
  2018-01-11 20:14 ` [for-4.16 PATCH v4 4/4] dm: fix awkward and incomplete request_queue initialization Mike Snitzer
  3 siblings, 2 replies; 14+ messages in thread
From: Mike Snitzer @ 2018-01-11 20:14 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block

blk_unregister_queue() must protect against any modifications of
q->queue_flags (not just those performed in blk-sysfs.c).  Therefore
q->queue_lock needs to be used rather than q->sysfs_lock.

Fixes: e9a823fb34a8b ("block: fix warning when I/O elevator is changed as request_queue is being removed")
Cc: stable@vger.kernel.org # 4.14+
Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-sysfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 870484eaed1f..52f57539f1c7 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -929,9 +929,9 @@ void blk_unregister_queue(struct gendisk *disk)
 	if (WARN_ON(!q))
 		return;
 
-	mutex_lock(&q->sysfs_lock);
+	spin_lock_irq(q->queue_lock);
 	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
-	mutex_unlock(&q->sysfs_lock);
+	spin_unlock_irq(q->queue_lock);
 
 	wbt_exit(q);
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 20:14 [for-4.16 PATCH v4 0/4] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
  2018-01-11 20:14 ` [for-4.16 PATCH v4 1/4] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
  2018-01-11 20:14 ` [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue Mike Snitzer
@ 2018-01-11 20:14 ` Mike Snitzer
  2018-01-12  0:37   ` Bart Van Assche
  2018-01-12  7:33   ` Ming Lei
  2018-01-11 20:14 ` [for-4.16 PATCH v4 4/4] dm: fix awkward and incomplete request_queue initialization Mike Snitzer
  3 siblings, 2 replies; 14+ messages in thread
From: Mike Snitzer @ 2018-01-11 20:14 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block

Since I can remember DM has forced the block layer to allow the
allocation and initialization of the request_queue to be distinct
operations.  Reason for this is block/genhd.c:add_disk() has requires
that the request_queue (and associated bdi) be tied to the gendisk
before add_disk() is called -- because add_disk() also deals with
exposing the request_queue via blk_register_queue().

DM's dynamic creation of arbitrary device types (and associated
request_queue types) requires the DM device's gendisk be available so
that DM table loads can establish a master/slave relationship with
subordinate devices that are referenced by loaded DM tables -- using
bd_link_disk_holder().  But until these DM tables, and their associated
subordinate devices, are known DM cannot know what type of request_queue
it needs -- nor what its queue_limits should be.

This chicken and egg scenario has created all manner of problems for DM
and, at times, the block layer.

Summary of changes:

- Add blk_add_disk_no_queue_reg() and add_disk_no_queue_reg() variant
  that drivers may use to add a disk without also calling
  blk_register_queue().  Driver must call blk_register_queue() once its
  request_queue is fully initialized.

- Return early from blk_unregister_queue() if QUEUE_FLAG_REGISTERED
  is not set.  It won't be set if driver used add_disk_no_queue_reg()
  but driver encounters an error and must del_gendisk() before calling
  blk_register_queue().

- Export blk_register_queue().

These changes allow DM to use add_disk_no_queue_reg() to anchor its
gendisk as the "master" for master/slave relationships DM must establish
with subordinate devices referenced in DM tables that get loaded.  Once
all "slave" devices for a DM device are known its request_queue can be
properly initialized and then advertised via sysfs -- important
improvement being that no request_queue resource initialization
performed by blk_register_queue() is missed for DM devices anymore.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-sysfs.c     |  8 ++++++--
 block/genhd.c         | 20 +++++++++++++++++---
 include/linux/genhd.h |  5 +++++
 3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 52f57539f1c7..90de6337cc4d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -921,6 +921,7 @@ int blk_register_queue(struct gendisk *disk)
 	mutex_unlock(&q->sysfs_lock);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(blk_register_queue);
 
 void blk_unregister_queue(struct gendisk *disk)
 {
@@ -930,12 +931,15 @@ void blk_unregister_queue(struct gendisk *disk)
 		return;
 
 	spin_lock_irq(q->queue_lock);
-	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
+	if (!queue_flag_test_and_clear(QUEUE_FLAG_REGISTERED, q)) {
+		/* Return early if disk->queue was never registered. */
+		spin_unlock_irq(q->queue_lock);
+		return;
+	}
 	spin_unlock_irq(q->queue_lock);
 
 	wbt_exit(q);
 
-
 	if (q->mq_ops)
 		blk_mq_unregister_dev(disk_to_dev(disk), q);
 
diff --git a/block/genhd.c b/block/genhd.c
index 00620e01e043..d4aaf0cae9ad 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -629,7 +629,7 @@ static void register_disk(struct device *parent, struct gendisk *disk)
 }
 
 /**
- * device_add_disk - add partitioning information to kernel list
+ * device_add_disk_no_queue_reg - add partitioning information to kernel list
  * @parent: parent device for the disk
  * @disk: per-device partitioning information
  *
@@ -638,7 +638,7 @@ static void register_disk(struct device *parent, struct gendisk *disk)
  *
  * FIXME: error handling
  */
-void device_add_disk(struct device *parent, struct gendisk *disk)
+void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
 {
 	dev_t devt;
 	int retval;
@@ -682,7 +682,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 				    exact_match, exact_lock, disk);
 	}
 	register_disk(parent, disk);
-	blk_register_queue(disk);
 
 	/*
 	 * Take an extra ref on queue which will be put on disk_release()
@@ -693,6 +692,21 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 	disk_add_events(disk);
 	blk_integrity_add(disk);
 }
+EXPORT_SYMBOL(device_add_disk_no_queue_reg);
+
+/**
+ * device_add_disk - add disk information to kernel list
+ * @parent: parent device for the disk
+ * @disk: per-device disk information
+ *
+ * Adds partitioning information and also registers the
+ * associated request_queue to @disk.
+ */
+void device_add_disk(struct device *parent, struct gendisk *disk)
+{
+	device_add_disk_no_queue_reg(parent, disk);
+	blk_register_queue(disk);
+}
 EXPORT_SYMBOL(device_add_disk);
 
 void del_gendisk(struct gendisk *disk)
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 5144ebe046c9..5e3531027b51 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -395,6 +395,11 @@ static inline void add_disk(struct gendisk *disk)
 {
 	device_add_disk(NULL, disk);
 }
+extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
+static inline void add_disk_no_queue_reg(struct gendisk *disk)
+{
+	device_add_disk_no_queue_reg(NULL, disk);
+}
 
 extern void del_gendisk(struct gendisk *gp);
 extern struct gendisk *get_gendisk(dev_t dev, int *partno);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [for-4.16 PATCH v4 4/4] dm: fix awkward and incomplete request_queue initialization
  2018-01-11 20:14 [for-4.16 PATCH v4 0/4] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
                   ` (2 preceding siblings ...)
  2018-01-11 20:14 ` [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
@ 2018-01-11 20:14 ` Mike Snitzer
  3 siblings, 0 replies; 14+ messages in thread
From: Mike Snitzer @ 2018-01-11 20:14 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block

DM is now no longer prone to having its request_queue be improperly
initialized.

Summary of changes:

- defer DM's blk_register_queue() from add_disk()-time until
  dm_setup_md_queue() by using add_disk_no_queue_reg() in alloc_dev().

- dm_setup_md_queue() is updated to fully initialize DM's request_queue
  (_after_ all table loads have occurred and the request_queue's type,
  features and limits are known).

- various other small improvements that were noticed along the way.

A very welcome side-effect of these changes is DM no longer needs to:
1) backfill the "mq" sysfs entry (because historically DM didn't
initialize the request_queue to use blk-mq until _after_
blk_register_queue() was called via add_disk()).
2) call elv_register_queue() to get .request_fn request-based DM
device's "queue" exposed in syfs.

In addition, blk-mq debugfs support is now made available because
request-based DM's blk-mq request_queue is now properly initialized
before dm_setup_md_queue() calls blk_register_queue().

These changes also stave off the need to introduce new DM-specific
workarounds in block core, e.g. this proposal:
https://patchwork.kernel.org/patch/10067961/

In the end DM devices should be less unicorn in nature (relative to
initialization and availability of block core infrastructure).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/md/dm-core.h |  2 --
 drivers/md/dm-rq.c   | 11 -----------
 drivers/md/dm.c      | 41 ++++++++++++++++++++++-------------------
 3 files changed, 22 insertions(+), 32 deletions(-)

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index 6a14f945783c..f955123b4765 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -130,8 +130,6 @@ struct mapped_device {
 	struct srcu_struct io_barrier;
 };
 
-void dm_init_md_queue(struct mapped_device *md);
-void dm_init_normal_md_queue(struct mapped_device *md);
 int md_in_flight(struct mapped_device *md);
 void disable_write_same(struct mapped_device *md);
 void disable_write_zeroes(struct mapped_device *md);
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 9d32f25489c2..3b319776d80c 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -700,7 +700,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	/* disable dm_old_request_fn's merge heuristic by default */
 	md->seq_rq_merge_deadline_usecs = 0;
 
-	dm_init_normal_md_queue(md);
 	blk_queue_softirq_done(md->queue, dm_softirq_done);
 
 	/* Initialize the request-based DM worker thread */
@@ -713,8 +712,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
 		return error;
 	}
 
-	elv_register_queue(md->queue);
-
 	return 0;
 }
 
@@ -810,17 +807,9 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 		err = PTR_ERR(q);
 		goto out_tag_set;
 	}
-	dm_init_md_queue(md);
-
-	/* backfill 'mq' sysfs registration normally done in blk_register_queue */
-	err = blk_mq_register_dev(disk_to_dev(md->disk), q);
-	if (err)
-		goto out_cleanup_queue;
 
 	return 0;
 
-out_cleanup_queue:
-	blk_cleanup_queue(q);
 out_tag_set:
 	blk_mq_free_tag_set(md->tag_set);
 out_kfree_tag_set:
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7475739fee49..c84d4a0c6bf7 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1626,20 +1626,9 @@ static const struct dax_operations dm_dax_ops;
 
 static void dm_wq_work(struct work_struct *work);
 
-void dm_init_md_queue(struct mapped_device *md)
-{
-	/*
-	 * Initialize data that will only be used by a non-blk-mq DM queue
-	 * - must do so here (in alloc_dev callchain) before queue is used
-	 */
-	md->queue->queuedata = md;
-	md->queue->backing_dev_info->congested_data = md;
-}
-
-void dm_init_normal_md_queue(struct mapped_device *md)
+static void dm_init_normal_md_queue(struct mapped_device *md)
 {
 	md->use_blk_mq = false;
-	dm_init_md_queue(md);
 
 	/*
 	 * Initialize aspects of queue that aren't relevant for blk-mq
@@ -1734,10 +1723,10 @@ static struct mapped_device *alloc_dev(int minor)
 	md->queue = blk_alloc_queue_node(GFP_KERNEL, numa_node_id);
 	if (!md->queue)
 		goto bad;
+	md->queue->queuedata = md;
+	md->queue->backing_dev_info->congested_data = md;
 
-	dm_init_md_queue(md);
-
-	md->disk = alloc_disk_node(1, numa_node_id);
+	md->disk = alloc_disk_node(1, md->numa_node_id);
 	if (!md->disk)
 		goto bad;
 
@@ -1761,7 +1750,7 @@ static struct mapped_device *alloc_dev(int minor)
 		goto bad;
 	md->dax_dev = dax_dev;
 
-	add_disk(md->disk);
+	add_disk_no_queue_reg(md->disk);
 	format_dev_t(md->name, MKDEV(_major, minor));
 
 	md->wq = alloc_workqueue("kdmflush", WQ_MEM_RECLAIM, 0);
@@ -1962,13 +1951,18 @@ static struct dm_table *__unbind(struct mapped_device *md)
  */
 int dm_create(int minor, struct mapped_device **result)
 {
+	int r;
 	struct mapped_device *md;
 
 	md = alloc_dev(minor);
 	if (!md)
 		return -ENXIO;
 
-	dm_sysfs_init(md);
+	r = dm_sysfs_init(md);
+	if (r) {
+		free_dev(md);
+		return r;
+	}
 
 	*result = md;
 	return 0;
@@ -2021,10 +2015,12 @@ EXPORT_SYMBOL_GPL(dm_get_queue_limits);
 int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 {
 	int r;
+	struct queue_limits limits;
 	enum dm_queue_mode type = dm_get_md_type(md);
 
 	switch (type) {
 	case DM_TYPE_REQUEST_BASED:
+		dm_init_normal_md_queue(md);
 		r = dm_old_init_request_queue(md, t);
 		if (r) {
 			DMERR("Cannot initialize queue for request-based mapped device");
@@ -2057,6 +2053,14 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 		break;
 	}
 
+	r = dm_calculate_queue_limits(t, &limits);
+	if (r) {
+		DMERR("Cannot calculate initial queue limits");
+		return r;
+	}
+	dm_table_set_restrictions(t, md->queue, &limits);
+	blk_register_queue(md->disk);
+
 	return 0;
 }
 
@@ -2121,7 +2125,6 @@ EXPORT_SYMBOL_GPL(dm_device_name);
 
 static void __dm_destroy(struct mapped_device *md, bool wait)
 {
-	struct request_queue *q = dm_get_md_queue(md);
 	struct dm_table *map;
 	int srcu_idx;
 
@@ -2132,7 +2135,7 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
 	set_bit(DMF_FREEING, &md->flags);
 	spin_unlock(&_minor_lock);
 
-	blk_set_queue_dying(q);
+	blk_set_queue_dying(md->queue);
 
 	if (dm_request_based(md) && md->kworker_task)
 		kthread_flush_worker(&md->kworker);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue
  2018-01-11 20:14 ` [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue Mike Snitzer
@ 2018-01-12  0:28   ` Bart Van Assche
  2018-01-12  2:53     ` Mike Snitzer
  2018-01-12  7:09   ` Ming Lei
  1 sibling, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2018-01-12  0:28 UTC (permalink / raw)
  To: snitzer@redhat.com, axboe@kernel.dk
  Cc: dm-devel@redhat.com, hare@suse.de, tom.leiming@gmail.com,
	linux-block@vger.kernel.org

T24gVGh1LCAyMDE4LTAxLTExIGF0IDE1OjE0IC0wNTAwLCBNaWtlIFNuaXR6ZXIgd3JvdGU6DQo+
IGJsa191bnJlZ2lzdGVyX3F1ZXVlKCkgbXVzdCBwcm90ZWN0IGFnYWluc3QgYW55IG1vZGlmaWNh
dGlvbnMgb2YNCj4gcS0+cXVldWVfZmxhZ3MgKG5vdCBqdXN0IHRob3NlIHBlcmZvcm1lZCBpbiBi
bGstc3lzZnMuYykuICBUaGVyZWZvcmUNCj4gcS0+cXVldWVfbG9jayBuZWVkcyB0byBiZSB1c2Vk
IHJhdGhlciB0aGFuIHEtPnN5c2ZzX2xvY2suDQo+IA0KPiBGaXhlczogZTlhODIzZmIzNGE4YiAo
ImJsb2NrOiBmaXggd2FybmluZyB3aGVuIEkvTyBlbGV2YXRvciBpcyBjaGFuZ2VkIGFzIHJlcXVl
c3RfcXVldWUgaXMgYmVpbmcgcmVtb3ZlZCIpDQo+IENjOiBzdGFibGVAdmdlci5rZXJuZWwub3Jn
ICMgNC4xNCsNCj4gUmVwb3J0ZWQtYnk6IEJhcnQgVmFuIEFzc2NoZSA8QmFydC5WYW5Bc3NjaGVA
d2RjLmNvbT4NCj4gU2lnbmVkLW9mZi1ieTogTWlrZSBTbml0emVyIDxzbml0emVyQHJlZGhhdC5j
b20+DQo+IC0tLQ0KPiAgYmxvY2svYmxrLXN5c2ZzLmMgfCA0ICsrLS0NCj4gIDEgZmlsZSBjaGFu
Z2VkLCAyIGluc2VydGlvbnMoKyksIDIgZGVsZXRpb25zKC0pDQo+IA0KPiBkaWZmIC0tZ2l0IGEv
YmxvY2svYmxrLXN5c2ZzLmMgYi9ibG9jay9ibGstc3lzZnMuYw0KPiBpbmRleCA4NzA0ODRlYWVk
MWYuLjUyZjU3NTM5ZjFjNyAxMDA2NDQNCj4gLS0tIGEvYmxvY2svYmxrLXN5c2ZzLmMNCj4gKysr
IGIvYmxvY2svYmxrLXN5c2ZzLmMNCj4gQEAgLTkyOSw5ICs5MjksOSBAQCB2b2lkIGJsa191bnJl
Z2lzdGVyX3F1ZXVlKHN0cnVjdCBnZW5kaXNrICpkaXNrKQ0KPiAgCWlmIChXQVJOX09OKCFxKSkN
Cj4gIAkJcmV0dXJuOw0KPiAgDQo+IC0JbXV0ZXhfbG9jaygmcS0+c3lzZnNfbG9jayk7DQo+ICsJ
c3Bpbl9sb2NrX2lycShxLT5xdWV1ZV9sb2NrKTsNCj4gIAlxdWV1ZV9mbGFnX2NsZWFyX3VubG9j
a2VkKFFVRVVFX0ZMQUdfUkVHSVNURVJFRCwgcSk7DQo+IC0JbXV0ZXhfdW5sb2NrKCZxLT5zeXNm
c19sb2NrKTsNCj4gKwlzcGluX3VubG9ja19pcnEocS0+cXVldWVfbG9jayk7DQoNCkhlbGxvIE1p
a2UsDQoNClRoZSBmdW5jdGlvbiBuYW1lIHF1ZXVlX2ZsYWdfY2xlYXJfdW5sb2NrZWQoKSBtZWFu
cyAiY2xlYXIgYSBxdWV1ZSBmbGFnDQp3aXRob3V0IGhvbGRpbmcgdGhlIHF1ZXVlIGxvY2siLiBT
byBhdCBsZWFzdCB0byBtZSB0aGUgYWJvdmUgY29kZSBpcyBjb25mdXNpbmcuDQpQbGVhc2UgY29u
c2lkZXIgdG8gY2hhbmdlIHF1ZXVlX2ZsYWdfY2xlYXJfdW5sb2NrZWQoKSBpbnRvIHF1ZXVlX2Zs
YWdfY2xlYXIoKS4NCg0KVGhhbmtzLA0KDQpCYXJ0Lg==

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 20:14 ` [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
@ 2018-01-12  0:37   ` Bart Van Assche
  2018-01-12  2:03     ` Mike Snitzer
  2018-01-12  7:33   ` Ming Lei
  1 sibling, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2018-01-12  0:37 UTC (permalink / raw)
  To: snitzer@redhat.com, axboe@kernel.dk
  Cc: dm-devel@redhat.com, hare@suse.de, tom.leiming@gmail.com,
	linux-block@vger.kernel.org

T24gVGh1LCAyMDE4LTAxLTExIGF0IDE1OjE0IC0wNTAwLCBNaWtlIFNuaXR6ZXIgd3JvdGU6DQo+
IC12b2lkIGRldmljZV9hZGRfZGlzayhzdHJ1Y3QgZGV2aWNlICpwYXJlbnQsIHN0cnVjdCBnZW5k
aXNrICpkaXNrKQ0KPiArdm9pZCBkZXZpY2VfYWRkX2Rpc2tfbm9fcXVldWVfcmVnKHN0cnVjdCBk
ZXZpY2UgKnBhcmVudCwgc3RydWN0IGdlbmRpc2sgKmRpc2spDQo+ICB7DQo+ICAJZGV2X3QgZGV2
dDsNCj4gIAlpbnQgcmV0dmFsOw0KPiBAQCAtNjgyLDcgKzY4Miw2IEBAIHZvaWQgZGV2aWNlX2Fk
ZF9kaXNrKHN0cnVjdCBkZXZpY2UgKnBhcmVudCwgc3RydWN0IGdlbmRpc2sgKmRpc2spDQo+ICAJ
CQkJICAgIGV4YWN0X21hdGNoLCBleGFjdF9sb2NrLCBkaXNrKTsNCj4gIAl9DQo+ICAJcmVnaXN0
ZXJfZGlzayhwYXJlbnQsIGRpc2spOw0KPiAtCWJsa19yZWdpc3Rlcl9xdWV1ZShkaXNrKTsNCj4g
IA0KPiAgCS8qDQo+ICAJICogVGFrZSBhbiBleHRyYSByZWYgb24gcXVldWUgd2hpY2ggd2lsbCBi
ZSBwdXQgb24gZGlza19yZWxlYXNlKCkNCj4gQEAgLTY5Myw2ICs2OTIsMjEgQEAgdm9pZCBkZXZp
Y2VfYWRkX2Rpc2soc3RydWN0IGRldmljZSAqcGFyZW50LCBzdHJ1Y3QgZ2VuZGlzayAqZGlzaykN
Cj4gIAlkaXNrX2FkZF9ldmVudHMoZGlzayk7DQo+ICAJYmxrX2ludGVncml0eV9hZGQoZGlzayk7
DQo+ICB9DQo+ICtFWFBPUlRfU1lNQk9MKGRldmljZV9hZGRfZGlza19ub19xdWV1ZV9yZWcpOw0K
DQpIZWxsbyBNaWtlLA0KDQpUaGlzIGNoYW5nZSBjYW4gaW5jcmVhc2UgdGhlIHRpbWUgYmV0d2Vl
biB0aGUgZ2VuZXJhdGlvbiBvZiB0aGUgZGlzayB1ZXZlbnQNCmFuZCB0aGUgcmVnaXN0cmF0aW9u
IG9mIHRoZSByZXF1ZXN0IHF1ZXVlIHN5c2ZzIGF0dHJpYnV0ZXMuIENhbiB0aGlzIGNhdXNlDQph
bnkgdWRldiBydWxlcyB0byBmYWlsPw0KDQpUaGFua3MsDQoNCkJhcnQu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred
  2018-01-12  0:37   ` Bart Van Assche
@ 2018-01-12  2:03     ` Mike Snitzer
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Snitzer @ 2018-01-12  2:03 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: axboe@kernel.dk, dm-devel@redhat.com, hare@suse.de,
	tom.leiming@gmail.com, linux-block@vger.kernel.org

On Thu, Jan 11 2018 at  7:37pm -0500,
Bart Van Assche <Bart.VanAssche@wdc.com> wrote:

> On Thu, 2018-01-11 at 15:14 -0500, Mike Snitzer wrote:
> > -void device_add_disk(struct device *parent, struct gendisk *disk)
> > +void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
> >  {
> >  	dev_t devt;
> >  	int retval;
> > @@ -682,7 +682,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
> >  				    exact_match, exact_lock, disk);
> >  	}
> >  	register_disk(parent, disk);
> > -	blk_register_queue(disk);
> >  
> >  	/*
> >  	 * Take an extra ref on queue which will be put on disk_release()
> > @@ -693,6 +692,21 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
> >  	disk_add_events(disk);
> >  	blk_integrity_add(disk);
> >  }
> > +EXPORT_SYMBOL(device_add_disk_no_queue_reg);
> 
> Hello Mike,
> 
> This change can increase the time between the generation of the disk uevent
> and the registration of the request queue sysfs attributes. Can this cause
> any udev rules to fail?

Certainly not for DM (DM has very sophisticated udev rules that wait for
the final dm generated "CHANGE" event before considering a device to be
"ready").

But are you asking about non-DM devices?  I cannot see, what amounts to,
moving blk_register_queue() to the end of device_add_disk() as reason for
concern.  If it were there technically would already be that race.

Mike

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue
  2018-01-12  0:28   ` Bart Van Assche
@ 2018-01-12  2:53     ` Mike Snitzer
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Snitzer @ 2018-01-12  2:53 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: axboe@kernel.dk, dm-devel@redhat.com, hare@suse.de,
	tom.leiming@gmail.com, linux-block@vger.kernel.org

On Thu, Jan 11 2018 at  7:28pm -0500,
Bart Van Assche <Bart.VanAssche@wdc.com> wrote:

> On Thu, 2018-01-11 at 15:14 -0500, Mike Snitzer wrote:
> > blk_unregister_queue() must protect against any modifications of
> > q->queue_flags (not just those performed in blk-sysfs.c).  Therefore
> > q->queue_lock needs to be used rather than q->sysfs_lock.
> > 
> > Fixes: e9a823fb34a8b ("block: fix warning when I/O elevator is changed as request_queue is being removed")
> > Cc: stable@vger.kernel.org # 4.14+
> > Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  block/blk-sysfs.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> > index 870484eaed1f..52f57539f1c7 100644
> > --- a/block/blk-sysfs.c
> > +++ b/block/blk-sysfs.c
> > @@ -929,9 +929,9 @@ void blk_unregister_queue(struct gendisk *disk)
> >  	if (WARN_ON(!q))
> >  		return;
> >  
> > -	mutex_lock(&q->sysfs_lock);
> > +	spin_lock_irq(q->queue_lock);
> >  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
> > -	mutex_unlock(&q->sysfs_lock);
> > +	spin_unlock_irq(q->queue_lock);
> 
> Hello Mike,
> 
> The function name queue_flag_clear_unlocked() means "clear a queue flag
> without holding the queue lock". So at least to me the above code is confusing.
> Please consider to change queue_flag_clear_unlocked() into queue_flag_clear().

If Jens would like to change it when applying the patch to his tree
that is fine by me.

But as you know, it doesn't matter:
queue_flag_clear() just has extra queue_lockdep_assert_held(q);

So I see no reason to respin this patch for this.  Especially when you
consider patch 3 replaces it with queue_flag_test_and_clear() -- and no
it isn't a problem for stable@ to carry on using
queue_flag_clear_unlocked

Mike

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue
  2018-01-11 20:14 ` [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue Mike Snitzer
  2018-01-12  0:28   ` Bart Van Assche
@ 2018-01-12  7:09   ` Ming Lei
  2018-01-12 12:53     ` Mike Snitzer
  1 sibling, 1 reply; 14+ messages in thread
From: Ming Lei @ 2018-01-12  7:09 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: axboe, Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block

On Thu, Jan 11, 2018 at 03:14:15PM -0500, Mike Snitzer wrote:
> blk_unregister_queue() must protect against any modifications of
> q->queue_flags (not just those performed in blk-sysfs.c).  Therefore
> q->queue_lock needs to be used rather than q->sysfs_lock.
> 
> Fixes: e9a823fb34a8b ("block: fix warning when I/O elevator is changed as request_queue is being removed")
> Cc: stable@vger.kernel.org # 4.14+
> Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-sysfs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 870484eaed1f..52f57539f1c7 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -929,9 +929,9 @@ void blk_unregister_queue(struct gendisk *disk)
>  	if (WARN_ON(!q))
>  		return;
>  
> -	mutex_lock(&q->sysfs_lock);
> +	spin_lock_irq(q->queue_lock);
>  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
> -	mutex_unlock(&q->sysfs_lock);
> +	spin_unlock_irq(q->queue_lock);
>  
>  	wbt_exit(q);

Hi Mike,

This change may not be correct, since at least e9a823fb34a8b depends
on q->sysfs_lock to sync between testing the flag in __elevator_change()
and clearing it here.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 20:14 ` [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
  2018-01-12  0:37   ` Bart Van Assche
@ 2018-01-12  7:33   ` Ming Lei
  1 sibling, 0 replies; 14+ messages in thread
From: Ming Lei @ 2018-01-12  7:33 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: axboe, Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block

On Thu, Jan 11, 2018 at 03:14:16PM -0500, Mike Snitzer wrote:
> Since I can remember DM has forced the block layer to allow the
> allocation and initialization of the request_queue to be distinct
> operations.  Reason for this is block/genhd.c:add_disk() has requires
> that the request_queue (and associated bdi) be tied to the gendisk
> before add_disk() is called -- because add_disk() also deals with
> exposing the request_queue via blk_register_queue().
> 
> DM's dynamic creation of arbitrary device types (and associated
> request_queue types) requires the DM device's gendisk be available so
> that DM table loads can establish a master/slave relationship with
> subordinate devices that are referenced by loaded DM tables -- using
> bd_link_disk_holder().  But until these DM tables, and their associated
> subordinate devices, are known DM cannot know what type of request_queue
> it needs -- nor what its queue_limits should be.
> 
> This chicken and egg scenario has created all manner of problems for DM
> and, at times, the block layer.
> 
> Summary of changes:
> 
> - Add blk_add_disk_no_queue_reg() and add_disk_no_queue_reg() variant
>   that drivers may use to add a disk without also calling
>   blk_register_queue().  Driver must call blk_register_queue() once its
>   request_queue is fully initialized.
> 
> - Return early from blk_unregister_queue() if QUEUE_FLAG_REGISTERED
>   is not set.  It won't be set if driver used add_disk_no_queue_reg()
>   but driver encounters an error and must del_gendisk() before calling
>   blk_register_queue().
> 
> - Export blk_register_queue().
> 
> These changes allow DM to use add_disk_no_queue_reg() to anchor its
> gendisk as the "master" for master/slave relationships DM must establish
> with subordinate devices referenced in DM tables that get loaded.  Once
> all "slave" devices for a DM device are known its request_queue can be
> properly initialized and then advertised via sysfs -- important
> improvement being that no request_queue resource initialization
> performed by blk_register_queue() is missed for DM devices anymore.
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-sysfs.c     |  8 ++++++--
>  block/genhd.c         | 20 +++++++++++++++++---
>  include/linux/genhd.h |  5 +++++
>  3 files changed, 28 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 52f57539f1c7..90de6337cc4d 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -921,6 +921,7 @@ int blk_register_queue(struct gendisk *disk)
>  	mutex_unlock(&q->sysfs_lock);
>  	return ret;
>  }
> +EXPORT_SYMBOL_GPL(blk_register_queue);
>  
>  void blk_unregister_queue(struct gendisk *disk)
>  {
> @@ -930,12 +931,15 @@ void blk_unregister_queue(struct gendisk *disk)
>  		return;
>  
>  	spin_lock_irq(q->queue_lock);
> -	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
> +	if (!queue_flag_test_and_clear(QUEUE_FLAG_REGISTERED, q)) {
> +		/* Return early if disk->queue was never registered. */
> +		spin_unlock_irq(q->queue_lock);
> +		return;
> +	}
>  	spin_unlock_irq(q->queue_lock);
>  
>  	wbt_exit(q);
>  
> -
>  	if (q->mq_ops)
>  		blk_mq_unregister_dev(disk_to_dev(disk), q);
>  
> diff --git a/block/genhd.c b/block/genhd.c
> index 00620e01e043..d4aaf0cae9ad 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -629,7 +629,7 @@ static void register_disk(struct device *parent, struct gendisk *disk)
>  }
>  
>  /**
> - * device_add_disk - add partitioning information to kernel list
> + * device_add_disk_no_queue_reg - add partitioning information to kernel list
>   * @parent: parent device for the disk
>   * @disk: per-device partitioning information
>   *
> @@ -638,7 +638,7 @@ static void register_disk(struct device *parent, struct gendisk *disk)
>   *
>   * FIXME: error handling
>   */
> -void device_add_disk(struct device *parent, struct gendisk *disk)
> +void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
>  {
>  	dev_t devt;
>  	int retval;
> @@ -682,7 +682,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
>  				    exact_match, exact_lock, disk);
>  	}
>  	register_disk(parent, disk);
> -	blk_register_queue(disk);
>  
>  	/*
>  	 * Take an extra ref on queue which will be put on disk_release()
> @@ -693,6 +692,21 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
>  	disk_add_events(disk);
>  	blk_integrity_add(disk);
>  }
> +EXPORT_SYMBOL(device_add_disk_no_queue_reg);
> +
> +/**
> + * device_add_disk - add disk information to kernel list
> + * @parent: parent device for the disk
> + * @disk: per-device disk information
> + *
> + * Adds partitioning information and also registers the
> + * associated request_queue to @disk.
> + */
> +void device_add_disk(struct device *parent, struct gendisk *disk)
> +{
> +	device_add_disk_no_queue_reg(parent, disk);
> +	blk_register_queue(disk);
> +}
>  EXPORT_SYMBOL(device_add_disk);
>  
>  void del_gendisk(struct gendisk *disk)
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 5144ebe046c9..5e3531027b51 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -395,6 +395,11 @@ static inline void add_disk(struct gendisk *disk)
>  {
>  	device_add_disk(NULL, disk);
>  }
> +extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
> +static inline void add_disk_no_queue_reg(struct gendisk *disk)
> +{
> +	device_add_disk_no_queue_reg(NULL, disk);
> +}
>  
>  extern void del_gendisk(struct gendisk *gp);
>  extern struct gendisk *get_gendisk(dev_t dev, int *partno);
> -- 
> 2.15.0
> 

Looks fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue
  2018-01-12  7:09   ` Ming Lei
@ 2018-01-12 12:53     ` Mike Snitzer
  2018-01-12 14:14       ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Snitzer @ 2018-01-12 12:53 UTC (permalink / raw)
  To: Ming Lei
  Cc: axboe, Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block,
	David Jeffery

On Fri, Jan 12 2018 at  2:09am -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Thu, Jan 11, 2018 at 03:14:15PM -0500, Mike Snitzer wrote:
> > blk_unregister_queue() must protect against any modifications of
> > q->queue_flags (not just those performed in blk-sysfs.c).  Therefore
> > q->queue_lock needs to be used rather than q->sysfs_lock.
> > 
> > Fixes: e9a823fb34a8b ("block: fix warning when I/O elevator is changed as request_queue is being removed")
> > Cc: stable@vger.kernel.org # 4.14+
> > Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  block/blk-sysfs.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> > index 870484eaed1f..52f57539f1c7 100644
> > --- a/block/blk-sysfs.c
> > +++ b/block/blk-sysfs.c
> > @@ -929,9 +929,9 @@ void blk_unregister_queue(struct gendisk *disk)
> >  	if (WARN_ON(!q))
> >  		return;
> >  
> > -	mutex_lock(&q->sysfs_lock);
> > +	spin_lock_irq(q->queue_lock);
> >  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
> > -	mutex_unlock(&q->sysfs_lock);
> > +	spin_unlock_irq(q->queue_lock);
> >  
> >  	wbt_exit(q);
> 
> Hi Mike,
> 
> This change may not be correct, since at least e9a823fb34a8b depends
> on q->sysfs_lock to sync between testing the flag in __elevator_change()
> and clearing it here.

The header for commit e9a823fb34a8b says:
    To fix this warning, we can check the QUEUE_FLAG_REGISTERED flag when
    changing the elevator and use the request_queue's sysfs_lock to
    serialize between clearing the flag and the elevator testing the flag.

The reality is sysfs_lock isn't needed to serialize between
blk_unregister_queue() clearing the flag and __elevator_change() testing
the flag.

The original commit e9a823fb34a8b is pretty conflated.  "conflated" because
the resource being protected isn't the queue_flags (it is the 'queue'
kobj).

I'll respin v5 of this patchset to fix this up first, and then apply the
changes I _really_ need to land (DM queue initialization fix).

And then I'm going to slowly step away from block core and _not_ allow
myself to be tripped up, or baited, by historic block core issues for a
while... ;)

Thanks,
Mike

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue
  2018-01-12 12:53     ` Mike Snitzer
@ 2018-01-12 14:14       ` Ming Lei
  2018-01-12 15:05         ` Mike Snitzer
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2018-01-12 14:14 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: axboe, Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block,
	David Jeffery

On Fri, Jan 12, 2018 at 07:53:40AM -0500, Mike Snitzer wrote:
> On Fri, Jan 12 2018 at  2:09am -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Thu, Jan 11, 2018 at 03:14:15PM -0500, Mike Snitzer wrote:
> > > blk_unregister_queue() must protect against any modifications of
> > > q->queue_flags (not just those performed in blk-sysfs.c).  Therefore
> > > q->queue_lock needs to be used rather than q->sysfs_lock.
> > > 
> > > Fixes: e9a823fb34a8b ("block: fix warning when I/O elevator is changed as request_queue is being removed")
> > > Cc: stable@vger.kernel.org # 4.14+
> > > Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > ---
> > >  block/blk-sysfs.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> > > index 870484eaed1f..52f57539f1c7 100644
> > > --- a/block/blk-sysfs.c
> > > +++ b/block/blk-sysfs.c
> > > @@ -929,9 +929,9 @@ void blk_unregister_queue(struct gendisk *disk)
> > >  	if (WARN_ON(!q))
> > >  		return;
> > >  
> > > -	mutex_lock(&q->sysfs_lock);
> > > +	spin_lock_irq(q->queue_lock);
> > >  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
> > > -	mutex_unlock(&q->sysfs_lock);
> > > +	spin_unlock_irq(q->queue_lock);
> > >  
> > >  	wbt_exit(q);
> > 
> > Hi Mike,
> > 
> > This change may not be correct, since at least e9a823fb34a8b depends
> > on q->sysfs_lock to sync between testing the flag in __elevator_change()
> > and clearing it here.
> 
> The header for commit e9a823fb34a8b says:
>     To fix this warning, we can check the QUEUE_FLAG_REGISTERED flag when
>     changing the elevator and use the request_queue's sysfs_lock to
>     serialize between clearing the flag and the elevator testing the flag.
> 
> The reality is sysfs_lock isn't needed to serialize between
> blk_unregister_queue() clearing the flag and __elevator_change() testing
> the flag.

Without holding sysfs_lock, __elevator_change() may just be called after
q->kobj is deleted from blk_unregister_queue(), then the warning of
'WARNING: CPU: 3 PID: 14075 at lib/kobject.c:244 kobject_add_internal+0x103/0x2d0'
can be triggered again.

BTW, __elevator_change() is always run with holding sysfs_lock.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue
  2018-01-12 14:14       ` Ming Lei
@ 2018-01-12 15:05         ` Mike Snitzer
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Snitzer @ 2018-01-12 15:05 UTC (permalink / raw)
  To: Ming Lei
  Cc: axboe, Ming Lei, hare, Bart.VanAssche, dm-devel, linux-block,
	David Jeffery

On Fri, Jan 12 2018 at  9:14am -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Fri, Jan 12, 2018 at 07:53:40AM -0500, Mike Snitzer wrote:
> > On Fri, Jan 12 2018 at  2:09am -0500,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > On Thu, Jan 11, 2018 at 03:14:15PM -0500, Mike Snitzer wrote:
> > > > blk_unregister_queue() must protect against any modifications of
> > > > q->queue_flags (not just those performed in blk-sysfs.c).  Therefore
> > > > q->queue_lock needs to be used rather than q->sysfs_lock.
> > > > 
> > > > Fixes: e9a823fb34a8b ("block: fix warning when I/O elevator is changed as request_queue is being removed")
> > > > Cc: stable@vger.kernel.org # 4.14+
> > > > Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
> > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > ---
> > > >  block/blk-sysfs.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> > > > index 870484eaed1f..52f57539f1c7 100644
> > > > --- a/block/blk-sysfs.c
> > > > +++ b/block/blk-sysfs.c
> > > > @@ -929,9 +929,9 @@ void blk_unregister_queue(struct gendisk *disk)
> > > >  	if (WARN_ON(!q))
> > > >  		return;
> > > >  
> > > > -	mutex_lock(&q->sysfs_lock);
> > > > +	spin_lock_irq(q->queue_lock);
> > > >  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
> > > > -	mutex_unlock(&q->sysfs_lock);
> > > > +	spin_unlock_irq(q->queue_lock);
> > > >  
> > > >  	wbt_exit(q);
> > > 
> > > Hi Mike,
> > > 
> > > This change may not be correct, since at least e9a823fb34a8b depends
> > > on q->sysfs_lock to sync between testing the flag in __elevator_change()
> > > and clearing it here.
> > 
> > The header for commit e9a823fb34a8b says:
> >     To fix this warning, we can check the QUEUE_FLAG_REGISTERED flag when
> >     changing the elevator and use the request_queue's sysfs_lock to
> >     serialize between clearing the flag and the elevator testing the flag.
> > 
> > The reality is sysfs_lock isn't needed to serialize between
> > blk_unregister_queue() clearing the flag and __elevator_change() testing
> > the flag.
> 
> Without holding sysfs_lock, __elevator_change() may just be called after
> q->kobj is deleted from blk_unregister_queue(), then the warning of
> 'WARNING: CPU: 3 PID: 14075 at lib/kobject.c:244 kobject_add_internal+0x103/0x2d0'
> can be triggered again.
> 
> BTW, __elevator_change() is always run with holding sysfs_lock.

Yes, I'm well aware.  Please see v5's PATCH 02/04 which is inbound now.

Thanks,
Mike

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-01-12 15:05 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-11 20:14 [for-4.16 PATCH v4 0/4] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
2018-01-11 20:14 ` [for-4.16 PATCH v4 1/4] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
2018-01-11 20:14 ` [for-4.16 PATCH v4 2/4] block: use queue_lock when clearing QUEUE_FLAG_REGISTERED in blk_unregister_queue Mike Snitzer
2018-01-12  0:28   ` Bart Van Assche
2018-01-12  2:53     ` Mike Snitzer
2018-01-12  7:09   ` Ming Lei
2018-01-12 12:53     ` Mike Snitzer
2018-01-12 14:14       ` Ming Lei
2018-01-12 15:05         ` Mike Snitzer
2018-01-11 20:14 ` [for-4.16 PATCH v4 3/4] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
2018-01-12  0:37   ` Bart Van Assche
2018-01-12  2:03     ` Mike Snitzer
2018-01-12  7:33   ` Ming Lei
2018-01-11 20:14 ` [for-4.16 PATCH v4 4/4] dm: fix awkward and incomplete request_queue initialization Mike Snitzer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).