* [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
@ 2024-09-17 5:32 Damien Le Moal
2024-09-17 5:53 ` Christoph Hellwig
2024-09-17 11:55 ` Shinichiro Kawasaki
0 siblings, 2 replies; 12+ messages in thread
From: Damien Le Moal @ 2024-09-17 5:32 UTC (permalink / raw)
To: Jens Axboe, linux-block
Cc: Richard W . M . Jones, Ming Lei, Jeff Moyer, Jiri Jaburek,
Christoph Hellwig, Bart Van Assche, Hannes Reinecke,
Chaitanya Kulkarni, Shin'ichiro Kawasaki
Commit 734e1a860312 ("block: Prevent deadlocks when switching
elevators") introduced the function elv_iosched_load_module() to allow
loading an elevator module outside of elv_iosched_store() with the
target device queue not frozen, to avoid deadlocks. However, the "none"
scheduler does not have a module and as a result,
elv_iosched_load_module() always returns an error when trying to switch
to this valid scheduler.
Fix this by checking that the requested scheduler is "none" and doing
nothing in that case.
Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 734e1a860312 ("block: Prevent deadlocks when switching elevators")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
block/elevator.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/block/elevator.c b/block/elevator.c
index c355b55d0107..d0ee9c0aaed2 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -714,6 +714,8 @@ int elv_iosched_load_module(struct gendisk *disk, const char *buf,
return -EOPNOTSUPP;
strscpy(elevator_name, buf, sizeof(elevator_name));
+ if (!strncmp(elevator_name, "none", 4))
+ return 0;
return request_module("%s-iosched", strstrip(elevator_name));
}
--
2.46.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 5:32 [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler Damien Le Moal
@ 2024-09-17 5:53 ` Christoph Hellwig
2024-09-17 12:33 ` Ming Lei
2024-09-17 11:55 ` Shinichiro Kawasaki
1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2024-09-17 5:53 UTC (permalink / raw)
To: Damien Le Moal
Cc: Jens Axboe, linux-block, Richard W . M . Jones, Ming Lei,
Jeff Moyer, Jiri Jaburek, Christoph Hellwig, Bart Van Assche,
Hannes Reinecke, Chaitanya Kulkarni, Shin'ichiro Kawasaki
On Tue, Sep 17, 2024 at 02:32:58PM +0900, Damien Le Moal wrote:
> Commit 734e1a860312 ("block: Prevent deadlocks when switching
> elevators") introduced the function elv_iosched_load_module() to allow
> loading an elevator module outside of elv_iosched_store() with the
> target device queue not frozen, to avoid deadlocks. However, the "none"
> scheduler does not have a module and as a result,
> elv_iosched_load_module() always returns an error when trying to switch
> to this valid scheduler.
>
> Fix this by checking that the requested scheduler is "none" and doing
> nothing in that case.
The old code before this commit simply ignored the request_module,
just as most callers of it do. I think that's the right approach
here as well.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 5:32 [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler Damien Le Moal
2024-09-17 5:53 ` Christoph Hellwig
@ 2024-09-17 11:55 ` Shinichiro Kawasaki
1 sibling, 0 replies; 12+ messages in thread
From: Shinichiro Kawasaki @ 2024-09-17 11:55 UTC (permalink / raw)
To: Damien Le Moal
Cc: Jens Axboe, linux-block@vger.kernel.org, Richard W . M . Jones,
Ming Lei, Jeff Moyer, Jiri Jaburek, hch, Bart Van Assche,
Hannes Reinecke, Chaitanya Kulkarni
On Sep 17, 2024 / 14:32, Damien Le Moal wrote:
> Commit 734e1a860312 ("block: Prevent deadlocks when switching
> elevators") introduced the function elv_iosched_load_module() to allow
> loading an elevator module outside of elv_iosched_store() with the
> target device queue not frozen, to avoid deadlocks. However, the "none"
> scheduler does not have a module and as a result,
> elv_iosched_load_module() always returns an error when trying to switch
> to this valid scheduler.
This unexpected behavior (setting none scheduler has no effect) was found
by the blktests test case scsi/008 failure. This test case invokes the fio
command with the --ioscheduler=none option.
>
> Fix this by checking that the requested scheduler is "none" and doing
> nothing in that case.
I confirmed this fix patch avoids the unexpected behavior, and makes scsi/008
pass.
>
> Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Fixes: 734e1a860312 ("block: Prevent deadlocks when switching elevators")
> Cc: stable@vger.kernel.org
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
I also run whole blktests applying this patch on top of the v6.11 kernel, and
observed no regression. Looks good from testing point of view.
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 5:53 ` Christoph Hellwig
@ 2024-09-17 12:33 ` Ming Lei
2024-09-17 12:48 ` Damien Le Moal
0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2024-09-17 12:33 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Damien Le Moal, Jens Axboe, linux-block, Richard W . M . Jones,
Jeff Moyer, Jiri Jaburek, Bart Van Assche, Hannes Reinecke,
Chaitanya Kulkarni, Shin'ichiro Kawasaki
On Tue, Sep 17, 2024 at 1:53 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Tue, Sep 17, 2024 at 02:32:58PM +0900, Damien Le Moal wrote:
> > Commit 734e1a860312 ("block: Prevent deadlocks when switching
> > elevators") introduced the function elv_iosched_load_module() to allow
> > loading an elevator module outside of elv_iosched_store() with the
> > target device queue not frozen, to avoid deadlocks. However, the "none"
> > scheduler does not have a module and as a result,
> > elv_iosched_load_module() always returns an error when trying to switch
> > to this valid scheduler.
> >
> > Fix this by checking that the requested scheduler is "none" and doing
> > nothing in that case.
>
> The old code before this commit simply ignored the request_module,
> just as most callers of it do. I think that's the right approach
> here as well.
freeze queue is actually easy to cause deadlock, and old code is to not
do it everywhere.
Probably it may be more reliable to replace 'load_module' with 'no_freeze',
and not to freeze queue in case that 'no_freeze' is set in queue_attr_store().
queue_wb_lat_store() need no_freeze too since there is GFP_KERNEL
allocation involved.
Thanks,
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 12:33 ` Ming Lei
@ 2024-09-17 12:48 ` Damien Le Moal
2024-09-17 13:02 ` Ming Lei
0 siblings, 1 reply; 12+ messages in thread
From: Damien Le Moal @ 2024-09-17 12:48 UTC (permalink / raw)
To: Ming Lei, Christoph Hellwig
Cc: Jens Axboe, linux-block, Richard W . M . Jones, Jeff Moyer,
Jiri Jaburek, Bart Van Assche, Hannes Reinecke,
Chaitanya Kulkarni, Shin'ichiro Kawasaki
On 2024/09/17 14:33, Ming Lei wrote:
> On Tue, Sep 17, 2024 at 1:53 PM Christoph Hellwig <hch@lst.de> wrote:
>>
>> On Tue, Sep 17, 2024 at 02:32:58PM +0900, Damien Le Moal wrote:
>>> Commit 734e1a860312 ("block: Prevent deadlocks when switching
>>> elevators") introduced the function elv_iosched_load_module() to allow
>>> loading an elevator module outside of elv_iosched_store() with the
>>> target device queue not frozen, to avoid deadlocks. However, the "none"
>>> scheduler does not have a module and as a result,
>>> elv_iosched_load_module() always returns an error when trying to switch
>>> to this valid scheduler.
>>>
>>> Fix this by checking that the requested scheduler is "none" and doing
>>> nothing in that case.
>>
>> The old code before this commit simply ignored the request_module,
>> just as most callers of it do. I think that's the right approach
>> here as well.
>
> freeze queue is actually easy to cause deadlock, and old code is to not
> do it everywhere.
>
> Probably it may be more reliable to replace 'load_module' with 'no_freeze',
> and not to freeze queue in case that 'no_freeze' is set in queue_attr_store().
load_module or whatever the name you prefer, should NOT imply that freezing the
queue is not necessary. Switching the IO scheduler is really one such case.
Switching the scheduler really needs to be done with the queue frozen, but the
scheduler module loading needs to be done with the queue live.
> queue_wb_lat_store() need no_freeze too since there is GFP_KERNEL
> allocation involved.
No, because again the attribute may need to have the queue frozen to correctly
be changed. To avoid hangs, what is needed is to force a GFP_NOIO context before
calling the attribute ->store() operation. Doing so, any memory allocation that
the attribute change may need will not cause re-entry into a frozen queue (which
would result in a hang).
This is easy to do with memalloc_noio_save()/memalloc_noio_restore().
>
> Thanks,
>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 12:48 ` Damien Le Moal
@ 2024-09-17 13:02 ` Ming Lei
2024-09-17 13:05 ` Christoph Hellwig
0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2024-09-17 13:02 UTC (permalink / raw)
To: Damien Le Moal
Cc: Christoph Hellwig, Jens Axboe, linux-block, Richard W . M . Jones,
Jeff Moyer, Jiri Jaburek, Bart Van Assche, Hannes Reinecke,
Chaitanya Kulkarni, Shin'ichiro Kawasaki
On Tue, Sep 17, 2024 at 02:48:06PM +0200, Damien Le Moal wrote:
> On 2024/09/17 14:33, Ming Lei wrote:
> > On Tue, Sep 17, 2024 at 1:53 PM Christoph Hellwig <hch@lst.de> wrote:
> >>
> >> On Tue, Sep 17, 2024 at 02:32:58PM +0900, Damien Le Moal wrote:
> >>> Commit 734e1a860312 ("block: Prevent deadlocks when switching
> >>> elevators") introduced the function elv_iosched_load_module() to allow
> >>> loading an elevator module outside of elv_iosched_store() with the
> >>> target device queue not frozen, to avoid deadlocks. However, the "none"
> >>> scheduler does not have a module and as a result,
> >>> elv_iosched_load_module() always returns an error when trying to switch
> >>> to this valid scheduler.
> >>>
> >>> Fix this by checking that the requested scheduler is "none" and doing
> >>> nothing in that case.
> >>
> >> The old code before this commit simply ignored the request_module,
> >> just as most callers of it do. I think that's the right approach
> >> here as well.
> >
> > freeze queue is actually easy to cause deadlock, and old code is to not
> > do it everywhere.
> >
> > Probably it may be more reliable to replace 'load_module' with 'no_freeze',
> > and not to freeze queue in case that 'no_freeze' is set in queue_attr_store().
>
> load_module or whatever the name you prefer, should NOT imply that freezing the
> queue is not necessary. Switching the IO scheduler is really one such case.
> Switching the scheduler really needs to be done with the queue frozen, but the
> scheduler module loading needs to be done with the queue live.
Here 'no_freeze' means that automatic 'freeze queue' isn't needed, or
it can be named as 'no_auto_freeze'.
Again, 'load_module' is one bad name from interface viewpoint, which is just
needed by 'scheduler' only.
>
> > queue_wb_lat_store() need no_freeze too since there is GFP_KERNEL
> > allocation involved.
>
> No, because again the attribute may need to have the queue frozen to correctly
> be changed. To avoid hangs, what is needed is to force a GFP_NOIO context before
> calling the attribute ->store() operation. Doing so, any memory allocation that
> the attribute change may need will not cause re-entry into a frozen queue (which
> would result in a hang).
>
> This is easy to do with memalloc_noio_save()/memalloc_noio_restore().
But why do we need that? Just for paper over the problem caused by the
unnecessary freeze queue?
Thanks,
Ming
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 13:02 ` Ming Lei
@ 2024-09-17 13:05 ` Christoph Hellwig
2024-09-17 13:11 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2024-09-17 13:05 UTC (permalink / raw)
To: Ming Lei
Cc: Damien Le Moal, Christoph Hellwig, Jens Axboe, linux-block,
Richard W . M . Jones, Jeff Moyer, Jiri Jaburek, Bart Van Assche,
Hannes Reinecke, Chaitanya Kulkarni, Shin'ichiro Kawasaki
On Tue, Sep 17, 2024 at 09:02:36PM +0800, Ming Lei wrote:
>
> Here 'no_freeze' means that automatic 'freeze queue' isn't needed, or
> it can be named as 'no_auto_freeze'.
>
> Again, 'load_module' is one bad name from interface viewpoint, which is just
> needed by 'scheduler' only.
If we want to reshuffle this we could have a ->store_unfrozen method
that does all the work. But as long as the elevator loading is the
only thing that needs do to unfrozen work I'd just keep things as it
and not rock the boat.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 13:05 ` Christoph Hellwig
@ 2024-09-17 13:11 ` Jens Axboe
2024-09-17 13:14 ` Christoph Hellwig
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2024-09-17 13:11 UTC (permalink / raw)
To: Christoph Hellwig, Ming Lei
Cc: Damien Le Moal, linux-block, Richard W . M . Jones, Jeff Moyer,
Jiri Jaburek, Bart Van Assche, Hannes Reinecke,
Chaitanya Kulkarni, Shin'ichiro Kawasaki
On 9/17/24 7:05 AM, Christoph Hellwig wrote:
> On Tue, Sep 17, 2024 at 09:02:36PM +0800, Ming Lei wrote:
>>
>> Here 'no_freeze' means that automatic 'freeze queue' isn't needed, or
>> it can be named as 'no_auto_freeze'.
>>
>> Again, 'load_module' is one bad name from interface viewpoint, which is just
>> needed by 'scheduler' only.
>
> If we want to reshuffle this we could have a ->store_unfrozen method
> that does all the work. But as long as the elevator loading is the
> only thing that needs do to unfrozen work I'd just keep things as it
> and not rock the boat.
Whatever reshuffling people have in mind, that needs to happen AFTER
this bug is sorted out.
--
Jens Axboe
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 13:11 ` Jens Axboe
@ 2024-09-17 13:14 ` Christoph Hellwig
2024-09-17 13:17 ` Jens Axboe
2024-09-17 13:18 ` Damien Le Moal
0 siblings, 2 replies; 12+ messages in thread
From: Christoph Hellwig @ 2024-09-17 13:14 UTC (permalink / raw)
To: Jens Axboe
Cc: Christoph Hellwig, Ming Lei, Damien Le Moal, linux-block,
Richard W . M . Jones, Jeff Moyer, Jiri Jaburek, Bart Van Assche,
Hannes Reinecke, Chaitanya Kulkarni, Shin'ichiro Kawasaki
On Tue, Sep 17, 2024 at 07:11:22AM -0600, Jens Axboe wrote:
> Whatever reshuffling people have in mind, that needs to happen AFTER
> this bug is sorted out.
Yes. The fix from Damien will work, but reverting to the old behavior
of ignoring the request_module return value feel much better. I can
prepare a patch, but I didn't want to steal the credits from Damien.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 13:14 ` Christoph Hellwig
@ 2024-09-17 13:17 ` Jens Axboe
2024-09-17 13:18 ` Damien Le Moal
1 sibling, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2024-09-17 13:17 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Ming Lei, Damien Le Moal, linux-block, Richard W . M . Jones,
Jeff Moyer, Jiri Jaburek, Bart Van Assche, Hannes Reinecke,
Chaitanya Kulkarni, Shin'ichiro Kawasaki
On 9/17/24 7:14 AM, Christoph Hellwig wrote:
> On Tue, Sep 17, 2024 at 07:11:22AM -0600, Jens Axboe wrote:
>> Whatever reshuffling people have in mind, that needs to happen AFTER
>> this bug is sorted out.
>
> Yes. The fix from Damien will work, but reverting to the old behavior
> of ignoring the request_module return value feel much better. I can
> prepare a patch, but I didn't want to steal the credits from Damien.
Oh I agree, I think we should just ignore that and that would be the
better patch. My point is just that larger reworking should be done
after the fix, not advocating for a particular solution.
That said, someone did just email me privately because they ran
into this with 6.11. So we should do a clean simple patch to help
facilitate getting it to -stable sooner rather than later.
--
Jens Axboe
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 13:14 ` Christoph Hellwig
2024-09-17 13:17 ` Jens Axboe
@ 2024-09-17 13:18 ` Damien Le Moal
2024-09-17 13:19 ` Jens Axboe
1 sibling, 1 reply; 12+ messages in thread
From: Damien Le Moal @ 2024-09-17 13:18 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Ming Lei, linux-block, Richard W . M . Jones, Jeff Moyer,
Jiri Jaburek, Bart Van Assche, Hannes Reinecke,
Chaitanya Kulkarni, Shin'ichiro Kawasaki
On 2024/09/17 15:14, Christoph Hellwig wrote:
> On Tue, Sep 17, 2024 at 07:11:22AM -0600, Jens Axboe wrote:
>> Whatever reshuffling people have in mind, that needs to happen AFTER
>> this bug is sorted out.
>
> Yes. The fix from Damien will work, but reverting to the old behavior
> of ignoring the request_module return value feel much better. I can
> prepare a patch, but I didn't want to steal the credits from Damien.
>
OK. I can send a v2 ignoring the request_module() result, as it was before.
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler
2024-09-17 13:18 ` Damien Le Moal
@ 2024-09-17 13:19 ` Jens Axboe
0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2024-09-17 13:19 UTC (permalink / raw)
To: Damien Le Moal, Christoph Hellwig
Cc: Ming Lei, linux-block, Richard W . M . Jones, Jeff Moyer,
Jiri Jaburek, Bart Van Assche, Hannes Reinecke,
Chaitanya Kulkarni, Shin'ichiro Kawasaki
On 9/17/24 7:18 AM, Damien Le Moal wrote:
> On 2024/09/17 15:14, Christoph Hellwig wrote:
>> On Tue, Sep 17, 2024 at 07:11:22AM -0600, Jens Axboe wrote:
>>> Whatever reshuffling people have in mind, that needs to happen AFTER
>>> this bug is sorted out.
>>
>> Yes. The fix from Damien will work, but reverting to the old behavior
>> of ignoring the request_module return value feel much better. I can
>> prepare a patch, but I didn't want to steal the credits from Damien.
>>
>
> OK. I can send a v2 ignoring the request_module() result, as it was before.
Sounds good, let's do that.
--
Jens Axboe
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-09-17 13:19 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-17 5:32 [PATCH] block: Fix elv_iosched_local_module handling of "none" scheduler Damien Le Moal
2024-09-17 5:53 ` Christoph Hellwig
2024-09-17 12:33 ` Ming Lei
2024-09-17 12:48 ` Damien Le Moal
2024-09-17 13:02 ` Ming Lei
2024-09-17 13:05 ` Christoph Hellwig
2024-09-17 13:11 ` Jens Axboe
2024-09-17 13:14 ` Christoph Hellwig
2024-09-17 13:17 ` Jens Axboe
2024-09-17 13:18 ` Damien Le Moal
2024-09-17 13:19 ` Jens Axboe
2024-09-17 11:55 ` Shinichiro Kawasaki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).