Linux block layer
* Re: [blktest/nvme/058] Kernel OOPs while running nvme/058 tests
       [not found] <3a07b752-06a4-4eee-b302-f4669feb859d@linux.ibm.com>
@ 2025-08-26  9:08 ` Ming Lei
  2025-08-26  9:56   ` Nilay Shroff
  0 siblings, 1 reply; 3+ messages in thread
From: Ming Lei @ 2025-08-26  9:08 UTC (permalink / raw)
  To: Venkat Rao Bagalkote; +Cc: LKML, linux-nvme, Nilay Shroff, linux-block

On Tue, Aug 26, 2025 at 02:00:56PM +0530, Venkat Rao Bagalkote wrote:
> Greetings!!!
> 
> 
> IBM CI has reported a kernel Oops while running the blktests suite (nvme/058
> test).
> 
> 
> Kernel Repo:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> 
> 
> Traces:
> 
> 
> [37496.800225] BUG: Kernel NULL pointer dereference at 0x00000000
> [37496.800230] Faulting instruction address: 0xc0000000008a34b0
> [37496.800235] Oops: Kernel access of bad area, sig: 11 [#1]

...

> [37496.800365] GPR28: 0000000000000001 0000000000000001 c0000000b005c400 0000000000000000
> [37496.800424] NIP [c0000000008a34b0] __rq_qos_done_bio+0x3c/0x88

This looks like a regression from 370ac285f23a ("block: avoid cpu_hotplug_lock depedency on freeze_lock").
For nvme mpath, the same bio crosses two drivers, so the QUEUE_FLAG_QOS_ENABLED and q->rq_qos checks
can't be skipped.
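
To illustrate why the per-queue check matters, here is a minimal userspace
sketch. It is not the kernel code: the struct layouts, the flag values and
the model_* function names are simplified stand-ins assumed for
illustration, and only QUEUE_FLAG_QOS_ENABLED and q->rq_qos are names taken
from this thread. It models a bio that carries a QoS flag while the queue it
currently points at has no rq_qos attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: simplified stand-ins, not the kernel definitions. */
#define QUEUE_FLAG_QOS_ENABLED (1UL << 0)
#define BIO_QOS_THROTTLED      (1UL << 0)

struct rq_qos {
	int done_calls;          /* counts completion-hook invocations */
};

struct request_queue {
	unsigned long flags;
	struct rq_qos *rq_qos;   /* NULL when no QoS policy is attached */
};

struct bio {
	unsigned long flags;
	struct request_queue *q; /* queue the bio currently targets */
};

/*
 * Stand-in for the slow path. It dereferences rqos, so calling it with
 * NULL is the modelled equivalent of the reported NULL pointer oops.
 */
void model_qos_done_bio_slow(struct rq_qos *rqos)
{
	assert(rqos != NULL);
	rqos->done_calls++;
}

/* Returns true if the QoS completion hook ran for this bio. */
bool model_qos_done_bio(struct bio *bio)
{
	struct request_queue *q = bio->q;

	if (!q || !(bio->flags & BIO_QOS_THROTTLED))
		return false;

	/*
	 * The bio flag may have been set while the bio was on a different
	 * queue, so it does not prove that this queue has QoS. Check the
	 * queue itself before dereferencing q->rq_qos.
	 */
	if (!(q->flags & QUEUE_FLAG_QOS_ENABLED) || !q->rq_qos)
		return false;

	model_qos_done_bio_slow(q->rq_qos);
	return true;
}
```

In this model, a flagged bio completed against a queue with QoS runs the
hook, while the same bio pointed at a queue without QoS is skipped instead
of faulting.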


Thanks,
Ming



* Re: [blktest/nvme/058] Kernel OOPs while running nvme/058 tests
  2025-08-26  9:08 ` [blktest/nvme/058] Kernel OOPs while running nvme/058 tests Ming Lei
@ 2025-08-26  9:56   ` Nilay Shroff
  2025-08-26 14:49     ` Ming Lei
  0 siblings, 1 reply; 3+ messages in thread
From: Nilay Shroff @ 2025-08-26  9:56 UTC (permalink / raw)
  To: Ming Lei, Venkat Rao Bagalkote; +Cc: LKML, linux-nvme, linux-block



On 8/26/25 2:38 PM, Ming Lei wrote:
> This looks like a regression from 370ac285f23a ("block: avoid cpu_hotplug_lock depedency on freeze_lock").
> For nvme mpath, the same bio crosses two drivers, so the QUEUE_FLAG_QOS_ENABLED and q->rq_qos checks
> can't be skipped.
> 
Thanks, Ming, for looking at it. And yes, you are correct: we can't skip the
QUEUE_FLAG_QOS_ENABLED and q->rq_qos checks for NVMe. However, this issue only
manifests with NVMe multipath enabled, as that creates the stacked
NVMe devices. So shall I send the fix, or are you going to send the patch
with the fix?

Thanks,
--Nilay


* Re: [blktest/nvme/058] Kernel OOPs while running nvme/058 tests
  2025-08-26  9:56   ` Nilay Shroff
@ 2025-08-26 14:49     ` Ming Lei
  0 siblings, 0 replies; 3+ messages in thread
From: Ming Lei @ 2025-08-26 14:49 UTC (permalink / raw)
  To: Nilay Shroff; +Cc: Venkat Rao Bagalkote, LKML, linux-nvme, linux-block

On Tue, Aug 26, 2025 at 03:26:02PM +0530, Nilay Shroff wrote:
> 
> 
> Thanks, Ming, for looking at it. And yes, you are correct: we can't skip the
> QUEUE_FLAG_QOS_ENABLED and q->rq_qos checks for NVMe. However, this issue only
> manifests with NVMe multipath enabled, as that creates the stacked
> NVMe devices. So shall I send the fix, or are you going to send the patch
> with the fix?

Yeah, please go ahead and prepare the fix.


Thanks,
Ming

