From: Christoph Hellwig <hch@lst.de>
To: Nilay Shroff <nilay@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>,
linux-block@vger.kernel.org, ming.lei@redhat.com,
dlemoal@kernel.org, axboe@kernel.dk, gjoyce@ibm.com
Subject: Re: [PATCH 2/2] block: avoid acquiring q->sysfs_lock while accessing sysfs attributes
Date: Thu, 6 Feb 2025 15:07:08 +0100 [thread overview]
Message-ID: <20250206140707.GA2141@lst.de> (raw)
In-Reply-To: <f933e87a-6014-434a-8258-d871c77ca14c@linux.ibm.com>
On Thu, Feb 06, 2025 at 07:24:02PM +0530, Nilay Shroff wrote:
> Yes that's possible and so I audited all sysfs attributes which are
> currently protected using q->sysfs_lock and I found some interesting
> facts. Please find below:
>
> 1. io_poll:
> Write to this attribute is ignored. So, we don't need q->sysfs_lock.
>
> 2. io_poll_delay:
> Write to this attribute is NOP, so we don't need q->sysfs_lock.
Yes, those are easy.
> 3. io_timeout:
> Write to this attribute updates q->rq_timeout and read of this attribute
> returns the value stored in q->rq_timeout Moreover, the q->rq_timeout is
> set only once when we init the queue (under blk_mq_init_allocated_queue())
> even before disk is added. So that means that we may not need to protect
> it with q->sysfs_lock.
Are we sure blk_queue_rq_timeout is never called from anything but
probe? Either way given that this is a limit that isn't directly
corelated with anything else simply using WRITE_ONCE/READ_ONCE might
be enough.
>
> 4. nomerges:
> Write to this attribute file updates two q->flags : QUEUE_FLAG_NOMERGES
> and QUEUE_FLAG_NOXMERGES. These flags are accessed during bio-merge which
> anyways doesn't run with q->sysfs_lock held. Moreover, the q->flags are
> updated/accessed with bitops which are atomic. So, I believe, protecting
> it with q->sysfs_lock is not necessary.
Yes.
> 5. nr_requests:
> Write to this attribute updates the tag sets and this could potentially
> race with __blk_mq_update_nr_hw_queues(). So I think we should really
> protect it with q->tag_set->tag_list_lock instead of q->sysfs_lock.
Yeah.
> 6. read_ahead_kb:
> Write to this attribute file updates disk->bdi->ra_pages. The disk->bdi->
> ra_pages is also updated under queue_limits_commit_update() which runs
> holding q->limits_lock; so I think this attribute file should be protected
> with q->limits_lock and protecting it with q->sysfs_lock is not necessary.
> Maybe we should move it under the same sets of attribute files which today
> runs with q->limits_lock held.
Yes, limits_lock sounds sensible here.
> 7. rq_affinity:
> Write to this attribute file makes atomic updates to q->flags: QUEUE_FLAG_SAME_COMP
> and QUEUE_FLAG_SAME_FORCE. These flags are also accessed from blk_mq_complete_need_ipi()
> using test_bit macro. As read/write to q->flags uses bitops which are atomic,
> protecting it with q->stsys_lock is not necessary.
Agreed. Although updating both flags isn't atomic that should be
harmless in this case (but could use a comment about that).
> 8. scheduler:
> Write to this attribute actually updates q->elevator and the elevator change/switch
> code expects that the q->sysfs_lock is held while we update the iosched to protect
> against the simultaneous __blk_mq_update_nr_hw_queues update. So yes, this field needs
> q->sysfs_lock protection.
>
> However if we're thinking of protecting sched change/update using q->tag_sets->
> tag_list_lock (as discussed in another thread), then we may use q->tag_set->
> tag_list_lock instead of q->sysfs_lock here while reading/writing to this attribute
> file.
Yes.
> 9. wbt_lat_usec:
> Write to this attribute file updates the various wbt limits and state. This may race
> with blk_mq_exit_sched() or blk_register_queue(). The wbt updates through the
> blk_mq_exit_sched() and blk_register_queue() is currently protected with q->sysfs_lock
> and so yes, we need to protect this attribute with q->sysfs_lock.
>
> However, as mentioned above, if we're thinking of protecting elevator change/update
> using q->sets->tag_list_lock then we may use q->tag_set->tag_list_lock intstead of
> q->sysfs_lock while reading/writing to this attribute file.
Yes.
> The rest of attributes seems don't require protection from q->sysfs_lock or any other lock and
> those could be well protected with the sysfs/kernfs internal lock.
So I think the first step here is to remove the locking from
queue_attr_store/queue_attr_show and move it into the attributes that
still need it in some form, followed by replacing it with the more
suitable locks as defined above. And maybe with a little bit more
work we can remove sysfs_lock entirely eventually.
next prev parent reply other threads:[~2025-02-06 14:07 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-05 14:44 [PATCH 0/2] block: fix lock order and remove redundant locking Nilay Shroff
2025-02-05 14:44 ` [PATCH 1/2] block: fix lock ordering between the queue ->sysfs_lock and freeze-lock Nilay Shroff
2025-02-05 15:59 ` Christoph Hellwig
2025-02-06 13:22 ` Nilay Shroff
2025-02-06 14:15 ` Christoph Hellwig
2025-02-07 11:59 ` Ming Lei
2025-02-07 18:02 ` Nilay Shroff
2025-02-08 8:30 ` Ming Lei
2025-02-08 13:18 ` Nilay Shroff
2025-02-05 14:44 ` [PATCH 2/2] block: avoid acquiring q->sysfs_lock while accessing sysfs attributes Nilay Shroff
2025-02-05 15:53 ` Christoph Hellwig
2025-02-06 13:54 ` Nilay Shroff
2025-02-06 14:07 ` Christoph Hellwig [this message]
2025-02-07 11:03 ` Nilay Shroff
2025-02-08 10:41 ` Ming Lei
2025-02-08 12:56 ` Nilay Shroff
2025-02-09 11:41 ` Ming Lei
2025-02-09 13:41 ` Nilay Shroff
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250206140707.GA2141@lst.de \
--to=hch@lst.de \
--cc=axboe@kernel.dk \
--cc=dlemoal@kernel.org \
--cc=gjoyce@ibm.com \
--cc=linux-block@vger.kernel.org \
--cc=ming.lei@redhat.com \
--cc=nilay@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).