* [RFC PATCH 0/1] remove redundant q->sysfs_dir_lock
@ 2025-01-20 13:04 Nilay Shroff
2025-01-20 13:04 ` [RFC PATCH 1/1] block: get rid of request queue ->sysfs_dir_lock Nilay Shroff
0 siblings, 1 reply; 4+ messages in thread
From: Nilay Shroff @ 2025-01-20 13:04 UTC (permalink / raw)
To: linux-block; +Cc: hch, ming.lei, dlemoal, axboe, gjoyce
Hi,
In the current implementation we use q->sysfs_dir_lock for protecting
kobject addition/deletion while we register/unregister blk-mq with
sysfs. However the sysfs/kernfs internal implementation already protects
against the simultaneous addtion/deletion of kobjects. So in that sense
use of q->sysfs_dir_lock appears redundant.
Furthermore, there're few other callsites in the current code where we
use q->sysfs_dir_lock along with q->sysfs_lock while addition/deletion of
independent access ranges for a disk under sysfs. Please refer, disk_
register_independent_access_ranges() and disk_unregister_independent_
access_ranges(). Here as well we could easily remove use of q->sysfs_dir_
lock.
The only thing which q->syfs_dir_lock appears to protect is the use of
variable q->mq_sysfs_init_done. However this could be solved by converting
q->mq_sysfs_init_done to an atomic variable.
In past few days, we have seen many lockdep splat in block layer and
getting rid of this one might help reduce some contention as well we
need to worry less about lock ordering wrt to this lock.
Nilay Shroff (1):
block: get rid of request queue ->sysfs_dir_lock
block/blk-core.c | 1 -
block/blk-ia-ranges.c | 4 ----
block/blk-mq-sysfs.c | 25 +++++++------------------
block/blk-sysfs.c | 5 -----
include/linux/blkdev.h | 3 +--
5 files changed, 8 insertions(+), 30 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [RFC PATCH 1/1] block: get rid of request queue ->sysfs_dir_lock
2025-01-20 13:04 [RFC PATCH 0/1] remove redundant q->sysfs_dir_lock Nilay Shroff
@ 2025-01-20 13:04 ` Nilay Shroff
2025-01-22 6:28 ` Christoph Hellwig
0 siblings, 1 reply; 4+ messages in thread
From: Nilay Shroff @ 2025-01-20 13:04 UTC (permalink / raw)
To: linux-block; +Cc: hch, ming.lei, dlemoal, axboe, gjoyce
The request queue uses ->sysfs_dir_lock for protecting the addition/
deletion of kobject entries under sysfs while we register/unregister
blk-mq. However kobject addition/deletion is already protected with
kernfs/sysfs internal synchronization primitives. So use of q->sysfs_
dir_lock seems redundant.
Moreover, q->sysfs_dir_lock is also used at few other callsites along
with q->sysfs_lock for protecting the addition/deletion of kojects.
One such example is when we register with sysfs a set of independent
access ranges for a disk. Here as well we could get rid off q->sysfs_
dir_lock and only use q->sysfs_lock.
The only variable which q->sysfs_dir_lock appears to protect is q->
mq_sysfs_init_done which is set/unset while registering/unregistering
blk-mq with sysfs. But this variable could be easily converted to an
atomic type and used safely without using q->sysfs_dir_lock.
So with this patch we remove q->sysfs_dir_lock from each callsite
and also make q->mq_sysfs_init_done an atomic variable.
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
block/blk-core.c | 1 -
block/blk-ia-ranges.c | 4 ----
block/blk-mq-sysfs.c | 25 +++++++------------------
block/blk-sysfs.c | 5 -----
include/linux/blkdev.h | 3 +--
5 files changed, 8 insertions(+), 30 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 32fb28a6372c..d6c4fa3943b5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -430,7 +430,6 @@ struct request_queue *blk_alloc_queue(struct queue_limits *lim, int node_id)
refcount_set(&q->refs, 1);
mutex_init(&q->debugfs_mutex);
mutex_init(&q->sysfs_lock);
- mutex_init(&q->sysfs_dir_lock);
mutex_init(&q->limits_lock);
mutex_init(&q->rq_qos_mutex);
spin_lock_init(&q->queue_lock);
diff --git a/block/blk-ia-ranges.c b/block/blk-ia-ranges.c
index c9eb4241e048..d479f5481b66 100644
--- a/block/blk-ia-ranges.c
+++ b/block/blk-ia-ranges.c
@@ -111,7 +111,6 @@ int disk_register_independent_access_ranges(struct gendisk *disk)
struct request_queue *q = disk->queue;
int i, ret;
- lockdep_assert_held(&q->sysfs_dir_lock);
lockdep_assert_held(&q->sysfs_lock);
if (!iars)
@@ -155,7 +154,6 @@ void disk_unregister_independent_access_ranges(struct gendisk *disk)
struct blk_independent_access_ranges *iars = disk->ia_ranges;
int i;
- lockdep_assert_held(&q->sysfs_dir_lock);
lockdep_assert_held(&q->sysfs_lock);
if (!iars)
@@ -289,7 +287,6 @@ void disk_set_independent_access_ranges(struct gendisk *disk,
{
struct request_queue *q = disk->queue;
- mutex_lock(&q->sysfs_dir_lock);
mutex_lock(&q->sysfs_lock);
if (iars && !disk_check_ia_ranges(disk, iars)) {
kfree(iars);
@@ -313,6 +310,5 @@ void disk_set_independent_access_ranges(struct gendisk *disk,
disk_register_independent_access_ranges(disk);
unlock:
mutex_unlock(&q->sysfs_lock);
- mutex_unlock(&q->sysfs_dir_lock);
}
EXPORT_SYMBOL_GPL(disk_set_independent_access_ranges);
diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 156e9bb07abf..0d2226035ab6 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -223,8 +223,6 @@ int blk_mq_sysfs_register(struct gendisk *disk)
unsigned long i, j;
int ret;
- lockdep_assert_held(&q->sysfs_dir_lock);
-
ret = kobject_add(q->mq_kobj, &disk_to_dev(disk)->kobj, "mq");
if (ret < 0)
goto out;
@@ -237,7 +235,7 @@ int blk_mq_sysfs_register(struct gendisk *disk)
goto unreg;
}
- q->mq_sysfs_init_done = true;
+ atomic_set(&q->mq_sysfs_init_done, 1);
out:
return ret;
@@ -259,15 +257,13 @@ void blk_mq_sysfs_unregister(struct gendisk *disk)
struct blk_mq_hw_ctx *hctx;
unsigned long i;
- lockdep_assert_held(&q->sysfs_dir_lock);
-
queue_for_each_hw_ctx(q, hctx, i)
blk_mq_unregister_hctx(hctx);
kobject_uevent(q->mq_kobj, KOBJ_REMOVE);
kobject_del(q->mq_kobj);
- q->mq_sysfs_init_done = false;
+ atomic_set(&q->mq_sysfs_init_done, 0);
}
void blk_mq_sysfs_unregister_hctxs(struct request_queue *q)
@@ -275,15 +271,11 @@ void blk_mq_sysfs_unregister_hctxs(struct request_queue *q)
struct blk_mq_hw_ctx *hctx;
unsigned long i;
- mutex_lock(&q->sysfs_dir_lock);
- if (!q->mq_sysfs_init_done)
- goto unlock;
+ if (!atomic_read(&q->mq_sysfs_init_done))
+ return;
queue_for_each_hw_ctx(q, hctx, i)
blk_mq_unregister_hctx(hctx);
-
-unlock:
- mutex_unlock(&q->sysfs_dir_lock);
}
int blk_mq_sysfs_register_hctxs(struct request_queue *q)
@@ -292,9 +284,8 @@ int blk_mq_sysfs_register_hctxs(struct request_queue *q)
unsigned long i;
int ret = 0;
- mutex_lock(&q->sysfs_dir_lock);
- if (!q->mq_sysfs_init_done)
- goto unlock;
+ if (!atomic_read(&q->mq_sysfs_init_done))
+ goto out;
queue_for_each_hw_ctx(q, hctx, i) {
ret = blk_mq_register_hctx(hctx);
@@ -302,8 +293,6 @@ int blk_mq_sysfs_register_hctxs(struct request_queue *q)
break;
}
-unlock:
- mutex_unlock(&q->sysfs_dir_lock);
-
+out:
return ret;
}
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 368c2f6f5c85..6f548a4376aa 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -762,7 +762,6 @@ int blk_register_queue(struct gendisk *disk)
struct request_queue *q = disk->queue;
int ret;
- mutex_lock(&q->sysfs_dir_lock);
kobject_init(&disk->queue_kobj, &blk_queue_ktype);
ret = kobject_add(&disk->queue_kobj, &disk_to_dev(disk)->kobj, "queue");
if (ret < 0)
@@ -803,7 +802,6 @@ int blk_register_queue(struct gendisk *disk)
if (q->elevator)
kobject_uevent(&q->elevator->kobj, KOBJ_ADD);
mutex_unlock(&q->sysfs_lock);
- mutex_unlock(&q->sysfs_dir_lock);
/*
* SCSI probing may synchronously create and destroy a lot of
@@ -828,7 +826,6 @@ int blk_register_queue(struct gendisk *disk)
mutex_unlock(&q->sysfs_lock);
out_put_queue_kobj:
kobject_put(&disk->queue_kobj);
- mutex_unlock(&q->sysfs_dir_lock);
return ret;
}
@@ -859,7 +856,6 @@ void blk_unregister_queue(struct gendisk *disk)
blk_queue_flag_clear(QUEUE_FLAG_REGISTERED, q);
mutex_unlock(&q->sysfs_lock);
- mutex_lock(&q->sysfs_dir_lock);
/*
* Remove the sysfs attributes before unregistering the queue data
* structures that can be modified through sysfs.
@@ -876,7 +872,6 @@ void blk_unregister_queue(struct gendisk *disk)
/* Now that we've deleted all child objects, we can delete the queue. */
kobject_uevent(&disk->queue_kobj, KOBJ_REMOVE);
kobject_del(&disk->queue_kobj);
- mutex_unlock(&q->sysfs_dir_lock);
blk_debugfs_remove(disk);
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 76f0a4e7c2e5..1502970eefff 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -561,7 +561,6 @@ struct request_queue {
struct list_head flush_list;
struct mutex sysfs_lock;
- struct mutex sysfs_dir_lock;
struct mutex limits_lock;
/*
@@ -606,7 +605,7 @@ struct request_queue {
*/
struct mutex debugfs_mutex;
- bool mq_sysfs_init_done;
+ atomic_t mq_sysfs_init_done;
};
/* Keep blk_queue_flag_name[] in sync with the definitions below */
--
2.47.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [RFC PATCH 1/1] block: get rid of request queue ->sysfs_dir_lock
2025-01-20 13:04 ` [RFC PATCH 1/1] block: get rid of request queue ->sysfs_dir_lock Nilay Shroff
@ 2025-01-22 6:28 ` Christoph Hellwig
2025-01-22 18:06 ` Nilay Shroff
0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2025-01-22 6:28 UTC (permalink / raw)
To: Nilay Shroff; +Cc: linux-block, hch, ming.lei, dlemoal, axboe, gjoyce
On Mon, Jan 20, 2025 at 06:34:11PM +0530, Nilay Shroff wrote:
> The request queue uses ->sysfs_dir_lock for protecting the addition/
> deletion of kobject entries under sysfs while we register/unregister
> blk-mq. However kobject addition/deletion is already protected with
> kernfs/sysfs internal synchronization primitives. So use of q->sysfs_
> dir_lock seems redundant.
From the pure sysfs perspective, yes. The weird thing with block
layer sysfs is that unregistration/registration can happen at weird
times, though.
> Moreover, q->sysfs_dir_lock is also used at few other callsites along
> with q->sysfs_lock for protecting the addition/deletion of kojects.
> One such example is when we register with sysfs a set of independent
> access ranges for a disk. Here as well we could get rid off q->sysfs_
> dir_lock and only use q->sysfs_lock.
>
> The only variable which q->sysfs_dir_lock appears to protect is q->
> mq_sysfs_init_done which is set/unset while registering/unregistering
> blk-mq with sysfs. But this variable could be easily converted to an
> atomic type and used safely without using q->sysfs_dir_lock.
>
So sysfs_dir_lock absolutely should go. OTOH relying more on sysfs_lock
is a bad idea, as that is also serialied with the attributes, which
again on a pure sysfs basis isn't needed. Given that you don't add
new critical sections for it this should be fine, though.
> So with this patch we remove q->sysfs_dir_lock from each callsite
> and also make q->mq_sysfs_init_done an atomic variable.
Using an atomic_t for a single variable is usually not a good idea.
Let's take a step back and look at what mq_sysfs_init_done is trying
to do.
It is set by blk_mq_sysfs_register, which is called by
blk_register_queue. before marking the queue registered and setting the
disk live. It is cleared in blk_mq_sysfs_unregister called from
blk_unregister_queue just after clearing the queue registered bit.
So maybe we could do something with the queue registered bit, although
eventually I'd like to kill that as well, but either way we need
to explain how the flag prevents the nr_hw_queues update racing with
disk addition/removal. AFAICS we're not completely getting this
right even right now. We'd probably need to hold tag_list_lock
over registering and unregistering the hctx sysfs files.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC PATCH 1/1] block: get rid of request queue ->sysfs_dir_lock
2025-01-22 6:28 ` Christoph Hellwig
@ 2025-01-22 18:06 ` Nilay Shroff
0 siblings, 0 replies; 4+ messages in thread
From: Nilay Shroff @ 2025-01-22 18:06 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-block, ming.lei, dlemoal, axboe, gjoyce
On 1/22/25 11:58 AM, Christoph Hellwig wrote:
> On Mon, Jan 20, 2025 at 06:34:11PM +0530, Nilay Shroff wrote:
>> The request queue uses ->sysfs_dir_lock for protecting the addition/
>> deletion of kobject entries under sysfs while we register/unregister
>> blk-mq. However kobject addition/deletion is already protected with
>> kernfs/sysfs internal synchronization primitives. So use of q->sysfs_
>> dir_lock seems redundant.
>
> From the pure sysfs perspective, yes. The weird thing with block
> layer sysfs is that unregistration/registration can happen at weird
> times, though.
>
>> Moreover, q->sysfs_dir_lock is also used at few other callsites along
>> with q->sysfs_lock for protecting the addition/deletion of kojects.
>> One such example is when we register with sysfs a set of independent
>> access ranges for a disk. Here as well we could get rid off q->sysfs_
>> dir_lock and only use q->sysfs_lock.
>>
>> The only variable which q->sysfs_dir_lock appears to protect is q->
>> mq_sysfs_init_done which is set/unset while registering/unregistering
>> blk-mq with sysfs. But this variable could be easily converted to an
>> atomic type and used safely without using q->sysfs_dir_lock.
>>
>
> So sysfs_dir_lock absolutely should go. OTOH relying more on sysfs_lock
> is a bad idea, as that is also serialied with the attributes, which
> again on a pure sysfs basis isn't needed. Given that you don't add
> new critical sections for it this should be fine, though.
>
>> So with this patch we remove q->sysfs_dir_lock from each callsite
>> and also make q->mq_sysfs_init_done an atomic variable.
>
> Using an atomic_t for a single variable is usually not a good idea.
> Let's take a step back and look at what mq_sysfs_init_done is trying
> to do.
>
> It is set by blk_mq_sysfs_register, which is called by
> blk_register_queue. before marking the queue registered and setting the
> disk live. It is cleared in blk_mq_sysfs_unregister called from
> blk_unregister_queue just after clearing the queue registered bit.
>
Yeah that's good idea! Indeed, I think we can remove ->mq_sysfs_init_done and
replace it with the QUEUE_FLAG_REGISTERED.
> So maybe we could do something with the queue registered bit, although
> eventually I'd like to kill that as well, but either way we need
> to explain how the flag prevents the nr_hw_queues update racing with
> disk addition/removal. AFAICS we're not completely getting this
> right even right now. We'd probably need to hold tag_list_lock
> over registering and unregistering the hctx sysfs files.
>
Agreed, makes sense to hold ->tag_list_lock so that we can contain
the race between nr_hw_queue update with registering/unregistering the
hctx sysfs files.
I'd spin a new patch with above changes and submit.
Thank you Christoph for your review and comments!
--Nilay
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-01-22 18:06 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-20 13:04 [RFC PATCH 0/1] remove redundant q->sysfs_dir_lock Nilay Shroff
2025-01-20 13:04 ` [RFC PATCH 1/1] block: get rid of request queue ->sysfs_dir_lock Nilay Shroff
2025-01-22 6:28 ` Christoph Hellwig
2025-01-22 18:06 ` Nilay Shroff
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).