* [6.16.9 / 6.17.0 PANIC REGRESSION] block: fix lockdep warning caused by lock dependency in elv_iosched_store
[not found] ` <20250730074614.2537382-3-nilay@linux.ibm.com>
From: Kyle Sanderson @ 2025-10-01 5:20 UTC (permalink / raw)
To: Nilay Shroff, linux-block, Linus Torvalds, Greg Kroah-Hartman
Cc: axboe, hch, ming.lei, hare, sth, gjoyce, linux-fsdevel,
linux-kernel
On 7/30/2025 12:46 AM, Nilay Shroff wrote:
> To address this, move all sched_tags allocations and deallocations outside
> of both the ->elevator_lock and the ->freeze_lock. Since the lifetime of
> the elevator queue and its associated sched_tags is closely tied, the
> allocated sched_tags are now stored in the elevator queue structure. Then,
> during the actual elevator switch (which runs under ->freeze_lock and
> ->elevator_lock), the pre-allocated sched_tags are assigned to the
> appropriate q->hctx. Once the elevator switch is complete and the locks
> are released, the old elevator queue and its associated sched_tags are
> freed.
> ...
>
> [1] https://lore.kernel.org/all/0659ea8d-a463-47c8-9180-43c719e106eb@linux.ibm.com/
>
> Reported-by: Stefan Haberland <sth@linux.ibm.com>
> Closes: https://lore.kernel.org/all/0659ea8d-a463-47c8-9180-43c719e106eb@linux.ibm.com/
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
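As a rough illustration of the ordering the quoted commit message describes (allocate outside the locks, install under them, free the old objects after unlocking), here is a minimal userspace sketch in C. It is not the kernel code: `struct queue`, `struct elevator`, `struct sched_tags`, `elevator_alloc()`, `elevator_free()` and `elevator_switch()` are hypothetical stand-ins, and a single pthread mutex is assumed to play the role of both ->freeze_lock and ->elevator_lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the scheduler tags owned by an elevator. */
struct sched_tags {
	unsigned int depth;
	int *slots;
};

/* Hypothetical elevator queue: the tags live and die with it. */
struct elevator {
	struct sched_tags *tags;
};

struct queue {
	pthread_mutex_t lock;	/* stands in for ->freeze_lock and ->elevator_lock */
	struct elevator *elv;	/* currently installed elevator, may be NULL */
};

/* Allocate an elevator and its tags; no lock is held by the caller. */
struct elevator *elevator_alloc(unsigned int depth)
{
	struct elevator *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->tags = calloc(1, sizeof(*e->tags));
	if (!e->tags) {
		free(e);
		return NULL;
	}
	e->tags->slots = calloc(depth, sizeof(*e->tags->slots));
	if (!e->tags->slots) {
		free(e->tags);
		free(e);
		return NULL;
	}
	e->tags->depth = depth;
	return e;
}

/* Free an elevator together with its tags; accepts NULL. */
void elevator_free(struct elevator *e)
{
	if (!e)
		return;
	free(e->tags->slots);
	free(e->tags);
	free(e);
}

/* Returns 0 on success, -1 if the new elevator could not be allocated. */
int elevator_switch(struct queue *q, unsigned int depth)
{
	struct elevator *new_elv, *old_elv;

	/* Step 1: allocate before taking the lock. */
	new_elv = elevator_alloc(depth);
	if (!new_elv)
		return -1;

	/* Step 2: only a pointer swap happens under the lock. */
	pthread_mutex_lock(&q->lock);
	old_elv = q->elv;
	q->elv = new_elv;
	pthread_mutex_unlock(&q->lock);

	/* Step 3: free the old elevator and its tags after unlocking. */
	elevator_free(old_elv);
	return 0;
}
```

In this sketch each object has exactly one owner at a time, so the free in step 3 cannot run twice on the same pointer. The panic in the trace further down is in the kfree() path under blk_mq_free_sched_tags().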
Hi Nilay,
I am coming off of a 36 hour travel stint, and 6.16.7 (I do not have
that log, and it mightily messed up my xfs root requiring offline
repair), 6.16.9, and 6.17.0 simply do not boot on my system. After
unlocking with LUKS I get this panic consistently and immediately, and I
believe this is the problematic commit which was unfortunately carried
to the previous and current stable. I am using this udev rule:
`ACTION=="add|change", KERNEL=="sd*[!0-9]|sr*|nvme*", ATTR{queue/scheduler}="bfq"`
> Sep 30 21:19:39 moon kernel: io scheduler bfq registered
> Sep 30 21:19:39 moon kernel: ------------[ cut here ]------------
> Sep 30 21:19:39 moon kernel: kernel BUG at mm/slub.c:563!
> Sep 30 21:19:39 moon kernel: Oops: general protection fault, probably
for non-canonical address 0x2cdf52296eacb08: 0000 [#1] SMP NOPTI
> Sep 30 21:19:39 moon kernel: CPU: 2 UID: 0 PID: 791 Comm:
(udev-worker) Not tainted 6.17.0-061700-generic #202509282239
PREEMPT(voluntary)
> Sep 30 21:19:39 moon kernel: Hardware name: Supermicro Super
Server/A2SDi-8C-HLN4F, BIOS 2.0 03/08/2024
> Sep 30 21:19:39 moon kernel: RIP: 0010:kfree+0x6b/0x360
> Sep 30 21:19:39 moon kernel: Code: 80 48 01 d8 0f 82 f6 02 00 00 48
c7 c2 00 00 00 80 48 2b 15 af 3f 61 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06
48 03 05 8d 3f 61 01 <48> 8b 50 08 49 89 c4 f6 c2 01 0f 85 2f 02 00 00
0f 1f 44 00 00 41
> Sep 30 21:19:39 moon kernel: RSP: 0018:ffffc9e804257930 EFLAGS: 00010207
> Sep 30 21:19:39 moon kernel: RAX: 02cdf52296eacb00 RBX:
b37de27a3ab2cae5 RCX: 0000000000000000
> Sep 30 21:19:39 moon kernel: RDX: 000076bb00000000 RSI:
ffffffff983b7c31 RDI: b37de27a3ab2cae5
> Sep 30 21:19:39 moon kernel: RBP: ffffc9e804257978 R08:
0000000000000000 R09: 0000000000000000
> Sep 30 21:19:39 moon kernel: R10: 0000000000000000 R11:
0000000000000000 R12: ffff894589365840
> Sep 30 21:19:39 moon kernel: R13: ffff89458c7c20e0 R14:
0000000000000000 R15: ffff89458c7c20e0
> Sep 30 21:19:39 moon kernel: FS: 0000721ca92168c0(0000)
GS:ffff898464f80000(0000) knlGS:0000000000000000
> Sep 30 21:19:39 moon kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
> Sep 30 21:19:39 moon kernel: CR2: 00005afd46663fc8 CR3:
0000000111bf4000 CR4: 00000000003506f0
> Sep 30 21:19:39 moon kernel: Call Trace:
> Sep 30 21:19:39 moon kernel: <TASK>
> Sep 30 21:19:39 moon kernel: ? kfree+0x2dd/0x360
> Sep 30 21:19:39 moon kernel: kvfree+0x31/0x40
> Sep 30 21:19:39 moon kernel: blk_mq_free_tags+0x4b/0x70
> Sep 30 21:19:39 moon kernel: blk_mq_free_map_and_rqs+0x4d/0x70
> Sep 30 21:19:39 moon kernel: blk_mq_free_sched_tags+0x35/0x90
> Sep 30 21:19:39 moon kernel: elevator_change_done+0x53/0x200
> Sep 30 21:19:39 moon kernel: elevator_change+0xdf/0x190
> Sep 30 21:19:39 moon kernel: elv_iosched_store+0x151/0x190
> Sep 30 21:19:39 moon kernel: queue_attr_store+0xf1/0x120
> Sep 30 21:19:39 moon kernel: ? putname+0x65/0x90
> Sep 30 21:19:39 moon kernel: ? aa_file_perm+0x54/0x2e0
> Sep 30 21:19:39 moon kernel: ? _copy_from_iter+0x9d/0x690
> Sep 30 21:19:39 moon kernel: sysfs_kf_write+0x6f/0x90
> Sep 30 21:19:39 moon kernel: kernfs_fop_write_iter+0x15e/0x210
> Sep 30 21:19:39 moon kernel: vfs_write+0x271/0x490
> Sep 30 21:19:39 moon kernel: ksys_write+0x6f/0xf0
> Sep 30 21:19:39 moon kernel: __x64_sys_write+0x19/0x30
> Sep 30 21:19:39 moon kernel: x64_sys_call+0x79/0x2330
> Sep 30 21:19:39 moon kernel: do_syscall_64+0x80/0xac0
> Sep 30 21:19:39 moon kernel: ?
arch_exit_to_user_mode_prepare.isra.0+0xd/0xe0
> Sep 30 21:19:39 moon kernel: ? do_syscall_64+0xb6/0xac0
> Sep 30 21:19:39 moon kernel: ?
arch_exit_to_user_mode_prepare.isra.0+0xd/0xe0
> Sep 30 21:19:39 moon kernel: ? __seccomp_filter+0x47/0x5d0
> Sep 30 21:19:39 moon kernel: ? __x64_sys_fcntl+0x97/0x130
> Sep 30 21:19:39 moon kernel: ?
arch_exit_to_user_mode_prepare.isra.0+0xd/0xe0
> Sep 30 21:19:39 moon kernel: ? do_syscall_64+0xb6/0xac0
> Sep 30 21:19:39 moon kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
> Sep 30 21:19:39 moon kernel: RIP: 0033:0x721ca911c5a4
> Sep 30 21:19:39 moon kernel: Code: c7 00 16 00 00 00 b8 ff ff ff ff
c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d a5 ea 0e 00 00 74 13
b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5
48 83 ec 20 48 89
> Sep 30 21:19:39 moon kernel: RSP: 002b:00007ffdfffb8b58 EFLAGS:
00000202 ORIG_RAX: 0000000000000001
> Sep 30 21:19:39 moon kernel: RAX: ffffffffffffffda RBX:
0000000000000003 RCX: 0000721ca911c5a4
> Sep 30 21:19:39 moon kernel: RDX: 0000000000000003 RSI:
00007ffdfffb8df0 RDI: 000000000000002a
> Sep 30 21:19:39 moon kernel: RBP: 00007ffdfffb8b80 R08:
0000721ca9202228 R09: 00007ffdfffb8bd0
> Sep 30 21:19:39 moon kernel: R10: 0000000000000000 R11:
0000000000000202 R12: 0000000000000003
> Sep 30 21:19:39 moon kernel: R13: 00007ffdfffb8df0 R14:
00005afd465c5100 R15: 0000000000000003
> Sep 30 21:19:39 moon kernel: </TASK>
> Sep 30 21:19:39 moon kernel: Modules linked in: bfq nfsd tcp_bbr
sch_fq auth_rpcgss nfs_acl lockd grace nvme_fabrics efi_pstore sunrpc
nfnetlink dmi_sysfs ip_tables x_tables autofs4 xfs btrfs blake2b_generic
dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq
async_xor asy>
> Sep 30 21:19:39 moon kernel: Oops: general protection fault, probably
for non-canonical address 0x3ce12d676eacb08: 0000 [#2] SMP NOPTI
> Sep 30 21:19:39 moon kernel: ---[ end trace 0000000000000000 ]---
> Sep 30 21:19:39 moon kernel: CPU: 3 UID: 0 PID: 792 Comm:
(udev-worker) Tainted: G D 6.17.0-061700-generic
#202509282239 PREEMPT(voluntary)
> Sep 30 21:19:39 moon kernel: Tainted: [D]=DIE
> Sep 30 21:19:39 moon kernel: Hardware name: Supermicro Super
Server/A2SDi-8C-HLN4F, BIOS 2.0 03/08/2024
> Sep 30 21:19:39 moon kernel: RIP: 0010:kfree+0x6b/0x360
> Sep 30 21:19:40 moon kernel: Code: 80 48 01 d8 0f 82 f6 02 00 00 48
c7 c2 00 00 00 80 48 2b 15 af 3f 61 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06
48 03 05 8d 3f 61 01 <48> 8b 50 08 49 89 c4 f6 c2 01 0f 85 2f 02 00 00
0f 1f 44 00 00 41
> Sep 30 21:19:40 moon kernel: RSP: 0018:ffffc9e80425f990 EFLAGS: 00010207
> Sep 30 21:19:40 moon kernel: RAX: 03ce12d676eacb00 RBX:
f3854f723ab2cae5 RCX: 0000000000000000
> Sep 30 21:19:40 moon kernel: RDX: 000076bb00000000 RSI:
ffffffff983b7c31 RDI: f3854f723ab2cae5
> Sep 30 21:19:40 moon kernel: RBP: ffffc9e80425f9d8 R08:
0000000000000000 R09: 0000000000000000
> Sep 30 21:19:40 moon kernel: R10: 0000000000000000 R11:
0000000000000000 R12: ffff894580056160
> Sep 30 21:19:40 moon kernel: R13: ffff89458c7c20e0 R14:
0000000000000000 R15: ffff89458c7c20e0
> Sep 30 21:19:40 moon kernel: FS: 0000721ca92168c0(0000)
GS:ffff898465000000(0000) knlGS:0000000000000000
> Sep 30 21:19:40 moon kernel: RIP: 0010:kfree+0x6b/0x360
> Sep 30 21:19:40 moon kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
> Sep 30 21:19:40 moon kernel: Code: 80 48 01 d8 0f 82 f6 02 00 00 48
c7 c2 00 00 00 80 48 2b 15 af 3f 61 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06
48 03 05 8d 3f 61 01 <48> 8b 50 08 49 89 c4 f6 c2 01 0f 85 2f 02 00 00
0f 1f 44 00 00 41
> Sep 30 21:19:40 moon kernel: CR2: 00007ffdfffb5b70 CR3:
000000010c1aa000 CR4: 00000000003506f0
> Sep 30 21:19:40 moon kernel: RSP: 0018:ffffc9e804257930 EFLAGS: 00010207
> Sep 30 21:19:40 moon kernel: Call Trace:
> Sep 30 21:19:40 moon kernel:
> Sep 30 21:19:40 moon kernel: RAX: 02cdf52296eacb00 RBX:
b37de27a3ab2cae5 RCX: 0000000000000000
> Sep 30 21:19:40 moon kernel: <TASK>
> Sep 30 21:19:40 moon kernel: ? kfree+0x2dd/0x360
> Sep 30 21:19:40 moon kernel: RDX: 000076bb00000000 RSI:
ffffffff983b7c31 RDI: b37de27a3ab2cae5
> Sep 30 21:19:40 moon kernel: kvfree+0x31/0x40
> Sep 30 21:19:40 moon kernel: blk_mq_free_tags+0x4b/0x70
> Sep 30 21:19:40 moon kernel: blk_mq_free_map_and_rqs+0x4d/0x70
> Sep 30 21:19:40 moon kernel: RBP: ffffc9e804257978 R08:
0000000000000000 R09: 0000000000000000
> Sep 30 21:19:40 moon kernel: blk_mq_free_sched_tags+0x35/0x90
> Sep 30 21:19:40 moon kernel: R10: 0000000000000000 R11:
0000000000000000 R12: ffff894589365840
> Sep 30 21:19:40 moon kernel: elevator_change_done+0x53/0x200
* Re: [6.16.9 / 6.17.0 PANIC REGRESSION] block: fix lockdep warning caused by lock dependency in elv_iosched_store
From: Kyle Sanderson @ 2025-10-01 13:05 UTC (permalink / raw)
To: Nilay Shroff, linux-block, Linus Torvalds, Greg Kroah-Hartman,
axboe
Cc: hch, ming.lei, hare, sth, gjoyce, linux-fsdevel, linux-kernel
On 9/30/2025 10:20 PM, Kyle Sanderson wrote:
> On 7/30/2025 12:46 AM, Nilay Shroff wrote:
>> To address this, move all sched_tags allocations and deallocations
>> outside
>> of both the ->elevator_lock and the ->freeze_lock.
>
> Hi Nilay,
>
> I am coming off of a 36 hour travel stint, and 6.16.7 (I do not have
> that log, and it mightily messed up my xfs root requiring offline
> repair), 6.16.9, and 6.17.0 simply do not boot on my system. After
> unlocking with LUKS I get this panic consistently and immediately, and I
> believe this is the problematic commit which was unfortunately carried
> to the previous and current stable. I am using this udev rule:
> `ACTION=="add|change", KERNEL=="sd*[!0-9]|sr*|nvme*", ATTR{queue/scheduler}="bfq"`
Hi Greg,
Slept for a couple of hours. It appears to be well known in block that
this is causing panics on stable: the fix is in the 6.18 pull, but it
didn't make it back to 6.17 past the initial merge window (nor to 6.16).
Presumably adjusting the request depth isn't common (if this is indeed
the problem)?
I also have ACTION=="add|change", KERNEL=="sd*[!0-9]|sr*|nvme*",
ATTR{queue/nr_requests}="1024" as a udev rule.
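For anyone trying to reproduce this without udev: the two rules amount to plain writes to the queue's sysfs attributes, which is the write() -> queue_attr_store() path visible in the trace. Below is a minimal sketch in C under that assumption. The helper names `write_queue_attr()` and `apply_queue_settings()` are made up for illustration, the order of the two writes is a guess (udev may apply the rules in either order), and the disk name is whatever device is being tested.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write `value` to an attribute file; returns 0 on success, -1 on error. */
int write_queue_attr(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t n;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	n = write(fd, value, len);
	close(fd);
	return n == (ssize_t)len ? 0 : -1;
}

/*
 * Apply both settings to one disk. `sysfs_root` is normally "/sys";
 * it is a parameter only so the helper can be pointed elsewhere.
 */
int apply_queue_settings(const char *sysfs_root, const char *disk)
{
	char path[256];
	int n;

	n = snprintf(path, sizeof(path), "%s/block/%s/queue/scheduler",
		     sysfs_root, disk);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (write_queue_attr(path, "bfq") != 0)
		return -1;

	n = snprintf(path, sizeof(path), "%s/block/%s/queue/nr_requests",
		     sysfs_root, disk);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return write_queue_attr(path, "1024");
}
```

Calling `apply_queue_settings("/sys", "sda")` as root would perform the same two writes for a disk named sda. On an affected kernel that may well trigger the panic, so it is only worth trying on a test machine.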
Jens, is this the only patch from August that is needed to fix this panic?
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-6.18/block&id=ba28afbd9eff2a6370f23ef4e6a036ab0cfda409
Kyle.
https://lore.kernel.org/all/37087b24-24f7-46a9-95c4-2a2f3dced09b@niklasfi.de/
https://lore.kernel.org/all/175710207227.395498.3249940818566938241.b4-ty@kernel.dk/
* Re: [6.16.9 / 6.17.0 PANIC REGRESSION] block: fix lockdep warning caused by lock dependency in elv_iosched_store
From: Nilay Shroff @ 2025-10-02 15:30 UTC (permalink / raw)
To: Kyle Sanderson, linux-block, Linus Torvalds, Greg Kroah-Hartman,
axboe
Cc: hch, ming.lei, hare, sth, gjoyce, linux-fsdevel, linux-kernel
On 10/1/25 6:35 PM, Kyle Sanderson wrote:
> On 9/30/2025 10:20 PM, Kyle Sanderson wrote:
>> On 7/30/2025 12:46 AM, Nilay Shroff wrote:
>>> To address this, move all sched_tags allocations and deallocations outside
>>> of both the ->elevator_lock and the ->freeze_lock.
>>
>> Hi Nilay,
>>
>> I am coming off of a 36 hour travel stint, and 6.16.7 (I do not have that log, and it mightily messed up my xfs root requiring offline repair), 6.16.9, and 6.17.0 simply do not boot on my system. After unlocking with LUKS I get this panic consistently and immediately, and I believe this is the problematic commit which was unfortunately carried to the previous and current stable. I am using this udev rule: `ACTION=="add|change", KERNEL=="sd*[!0-9]|sr*|nvme*", ATTR{queue/scheduler}="bfq"`
>
> Hi Greg,
>
> Slept for a couple of hours. It appears to be well known in block that this is causing panics on stable: the fix is in the 6.18 pull, but it didn't make it back to 6.17 past the initial merge window (nor to 6.16).
>
> Presumably adjusting the request depth isn't common (if this is indeed the problem)?
>
> I also have ACTION=="add|change", KERNEL=="sd*[!0-9]|sr*|nvme*", ATTR{queue/nr_requests}="1024" as a udev rule.
>
So the above udev rule suggests that you're updating
nr_requests, which does update the queue depth.
> Jens, is this the only patch from August that is needed to fix this panic?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-6.18/block&id=ba28afbd9eff2a6370f23ef4e6a036ab0cfda409
>
Greg, I think we should have the above commit ba28afbd9eff ("blk-mq: fix
blk_mq_tags double free while nr_requests grown") backported to the 6.16.x
stable kernel, if it hasn't been queued up yet.
Thanks,
--Nilay
* Re: [6.16.9 / 6.17.0 PANIC REGRESSION] block: fix lockdep warning caused by lock dependency in elv_iosched_store
From: Jens Axboe @ 2025-10-02 15:58 UTC (permalink / raw)
To: Nilay Shroff, Kyle Sanderson, linux-block, Linus Torvalds,
Greg Kroah-Hartman
Cc: hch, ming.lei, hare, sth, gjoyce, linux-fsdevel, linux-kernel
On 10/2/25 9:30 AM, Nilay Shroff wrote:
>> Slept for a couple of hours. It appears to be well known in block that this is causing panics on stable: the fix is in the 6.18 pull, but it didn't make it back to 6.17 past the initial merge window (nor to 6.16).
>>
>> Presumably adjusting the request depth isn't common (if this is indeed the problem)?
>>
>> I also have ACTION=="add|change", KERNEL=="sd*[!0-9]|sr*|nvme*", ATTR{queue/nr_requests}="1024" as a udev rule.
>>
> So the above udev rule suggests that you're updating
> nr_requests, which does update the queue depth.
>
>> Jens, is this the only patch from August that is needed to fix this panic?
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-6.18/block&id=ba28afbd9eff2a6370f23ef4e6a036ab0cfda409
>>
> Greg, I think we should have the above commit ba28afbd9eff ("blk-mq: fix
> blk_mq_tags double free while nr_requests grown") backported to the 6.16.x
> stable kernel, if it hasn't been queued up yet.
Sorry, missed this - yes, that should be enough, and I agree we should get
it into stable. Still waiting on Linus to actually pull my trees though,
so we'll have to wait for that to happen first.
--
Jens Axboe
* Re: [6.16.9 / 6.17.0 PANIC REGRESSION] block: fix lockdep warning caused by lock dependency in elv_iosched_store
From: Linus Torvalds @ 2025-10-02 16:49 UTC (permalink / raw)
To: Jens Axboe
Cc: Nilay Shroff, Kyle Sanderson, linux-block, Greg Kroah-Hartman,
hch, ming.lei, hare, sth, gjoyce, linux-fsdevel, linux-kernel
On Thu, 2 Oct 2025 at 08:58, Jens Axboe <axboe@kernel.dk> wrote:
>
> Sorry, missed this - yes, that should be enough, and I agree we should get
> it into stable. Still waiting on Linus to actually pull my trees though,
> so we'll have to wait for that to happen first.
Literally next in my queue, so that will happen in minutes..
Linus
* Re: [6.16.9 / 6.17.0 PANIC REGRESSION] block: fix lockdep warning caused by lock dependency in elv_iosched_store
From: Jens Axboe @ 2025-10-02 16:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nilay Shroff, Kyle Sanderson, linux-block, Greg Kroah-Hartman,
hch, ming.lei, hare, sth, gjoyce, linux-fsdevel, linux-kernel
On 10/2/25 10:49 AM, Linus Torvalds wrote:
> On Thu, 2 Oct 2025 at 08:58, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Sorry, missed this - yes, that should be enough, and I agree we should get
>> it into stable. Still waiting on Linus to actually pull my trees though,
>> so we'll have to wait for that to happen first.
>
> Literally next in my queue, so that will happen in minutes..
Perfect, thanks! That's what I get for not being able to send things
out early :-)
--
Jens Axboe