* Kernel OOPS while creating a NVMe Namespace
From: Venkat Rao Bagalkote @ 2024-06-10 7:51 UTC (permalink / raw)
To: kbusch, sagi; +Cc: linux-block, linux-kernel, linux-nvme, sachinp
Greetings!!!
Observing a kernel oops while creating a namespace on an NVMe device.
[ 140.209777] BUG: Unable to handle kernel data access at
0x18d7003065646fee
[ 140.209792] Faulting instruction address: 0xc00000000023b45c
[ 140.209798] Oops: Kernel access of bad area, sig: 11 [#1]
[ 140.209802] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
[ 140.209809] Modules linked in: rpadlpar_io rpaphp xsk_diag
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
bonding nf_conntrack tls nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set
nf_tables nfnetlink vmx_crypto pseries_rng binfmt_misc fuse xfs
libcrc32c sd_mod sg ibmvscsi scsi_transport_srp ibmveth nvme nvme_core
t10_pi crc64_rocksoft_generic crc64_rocksoft crc64
[ 140.209864] CPU: 2 PID: 129 Comm: kworker/u65:3 Kdump: loaded Not
tainted 6.10.0-rc3 #2
[ 140.209870] Hardware name: IBM,9009-42A POWER9 (raw) 0x4e0202
0xf000005 of:IBM,FW950.A0 (VL950_141) hv:phyp pSeries
[ 140.209876] Workqueue: nvme-wq nvme_scan_work [nvme_core]
[ 140.209889] NIP: c00000000023b45c LR: c008000006a96b20 CTR:
c00000000023b42c
[ 140.209894] REGS: c0000000506078a0 TRAP: 0380 Not tainted (6.10.0-rc3)
[ 140.209899] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
CR: 24000244 XER: 00000000
[ 140.209915] CFAR: c008000006aa80ac IRQMASK: 0
[ 140.209915] GPR00: c008000006a96b20 c000000050607b40 c000000001573700
c000000004291ee0
[ 140.209915] GPR04: 0000000000000000 c000000006150080 00000000c0080005
fffffffffffe0000
[ 140.209915] GPR08: 0000000000000000 18d7003065646f6e 0000000000000000
c008000006aa8098
[ 140.209915] GPR12: c00000000023b42c c00000000f7cdf00 c0000000001a151c
c000000004f2be80
[ 140.209915] GPR16: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[ 140.209915] GPR20: c000000004dbcc00 0000000000000006 0000000000000002
c000000004911270
[ 140.209915] GPR24: 0000000000000000 0000000000000000 c0000000ee254ffc
c0000000049111f0
[ 140.209915] GPR28: 0000000000000000 c000000004911260 c000000004291ee0
c000000004911260
[ 140.209975] NIP [c00000000023b45c] synchronize_srcu+0x30/0x1c0
[ 140.209984] LR [c008000006a96b20] nvme_ns_remove+0x80/0x2d8 [nvme_core]
[ 140.209994] Call Trace:
[ 140.209997] [c000000050607b90] [c008000006a96b20]
nvme_ns_remove+0x80/0x2d8 [nvme_core]
[ 140.210008] [c000000050607bd0] [c008000006a972b4]
nvme_remove_invalid_namespaces+0x144/0x1ac [nvme_core]
[ 140.210020] [c000000050607c60] [c008000006a9dbd4]
nvme_scan_ns_list+0x19c/0x370 [nvme_core]
[ 140.210032] [c000000050607d70] [c008000006a9dfc8]
nvme_scan_work+0xc8/0x278 [nvme_core]
[ 140.210043] [c000000050607e40] [c00000000019414c]
process_one_work+0x20c/0x4f4
[ 140.210051] [c000000050607ef0] [c0000000001950cc]
worker_thread+0x378/0x544
[ 140.210058] [c000000050607f90] [c0000000001a164c] kthread+0x138/0x140
[ 140.210065] [c000000050607fe0] [c00000000000df98]
start_kernel_thread+0x14/0x18
[ 140.210072] Code: 3c4c0134 384282d4 7c0802a6 60000000 7c0802a6
fbc1fff0 fba1ffe8 fbe1fff8 7c7e1b78 f8010010 f821ffb1 e9230010
<e9290080> 7c2004ac 71290003 41820008
[ 140.210093] ---[ end trace 0000000000000000 ]---
The issue is introduced by commit be647e2c76b27f409cdd520f66c95be888b553a3.
With it reverted, the issue is not seen.
Regards,
Venkat.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Kernel OOPS while creating a NVMe Namespace
From: Hillf Danton @ 2024-06-10 9:43 UTC (permalink / raw)
To: Venkat Rao Bagalkote
Cc: kbusch, sagi, linux-block, linux-kernel, linux-nvme, sachinp
On Mon, 10 Jun 2024 13:21:00 +0530 Venkat Rao Bagalkote wrote:
> Greetings!!!
>
> Observing Kernel OOPS, while creating namespace on a NVMe device.
>
> [ 140.209777] BUG: Unable to handle kernel data access at
> 0x18d7003065646fee
> [ 140.209792] Faulting instruction address: 0xc00000000023b45c
> [ 140.209798] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 140.209802] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
> [ 140.209809] Modules linked in: rpadlpar_io rpaphp xsk_diag
> nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
> bonding nf_conntrack tls nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set
> nf_tables nfnetlink vmx_crypto pseries_rng binfmt_misc fuse xfs
> libcrc32c sd_mod sg ibmvscsi scsi_transport_srp ibmveth nvme nvme_core
> t10_pi crc64_rocksoft_generic crc64_rocksoft crc64
> [ 140.209864] CPU: 2 PID: 129 Comm: kworker/u65:3 Kdump: loaded Not
> tainted 6.10.0-rc3 #2
> [ 140.209870] Hardware name: IBM,9009-42A POWER9 (raw) 0x4e0202
> 0xf000005 of:IBM,FW950.A0 (VL950_141) hv:phyp pSeries
> [ 140.209876] Workqueue: nvme-wq nvme_scan_work [nvme_core]
> [ 140.209889] NIP: c00000000023b45c LR: c008000006a96b20 CTR:
> c00000000023b42c
> [ 140.209894] REGS: c0000000506078a0 TRAP: 0380 Not tainted (6.10.0-rc3)
> [ 140.209899] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
> CR: 24000244 XER: 00000000
> [ 140.209915] CFAR: c008000006aa80ac IRQMASK: 0
> [ 140.209915] GPR00: c008000006a96b20 c000000050607b40 c000000001573700
> c000000004291ee0
> [ 140.209915] GPR04: 0000000000000000 c000000006150080 00000000c0080005
> fffffffffffe0000
> [ 140.209915] GPR08: 0000000000000000 18d7003065646f6e 0000000000000000
> c008000006aa8098
> [ 140.209915] GPR12: c00000000023b42c c00000000f7cdf00 c0000000001a151c
> c000000004f2be80
> [ 140.209915] GPR16: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 140.209915] GPR20: c000000004dbcc00 0000000000000006 0000000000000002
> c000000004911270
> [ 140.209915] GPR24: 0000000000000000 0000000000000000 c0000000ee254ffc
> c0000000049111f0
> [ 140.209915] GPR28: 0000000000000000 c000000004911260 c000000004291ee0
> c000000004911260
> [ 140.209975] NIP [c00000000023b45c] synchronize_srcu+0x30/0x1c0
> [ 140.209984] LR [c008000006a96b20] nvme_ns_remove+0x80/0x2d8 [nvme_core]
> [ 140.209994] Call Trace:
> [ 140.209997] [c000000050607b90] [c008000006a96b20]
> nvme_ns_remove+0x80/0x2d8 [nvme_core]
> [ 140.210008] [c000000050607bd0] [c008000006a972b4]
> nvme_remove_invalid_namespaces+0x144/0x1ac [nvme_core]
> [ 140.210020] [c000000050607c60] [c008000006a9dbd4]
> nvme_scan_ns_list+0x19c/0x370 [nvme_core]
> [ 140.210032] [c000000050607d70] [c008000006a9dfc8]
> nvme_scan_work+0xc8/0x278 [nvme_core]
> [ 140.210043] [c000000050607e40] [c00000000019414c]
> process_one_work+0x20c/0x4f4
> [ 140.210051] [c000000050607ef0] [c0000000001950cc]
> worker_thread+0x378/0x544
> [ 140.210058] [c000000050607f90] [c0000000001a164c] kthread+0x138/0x140
> [ 140.210065] [c000000050607fe0] [c00000000000df98]
> start_kernel_thread+0x14/0x18
> [ 140.210072] Code: 3c4c0134 384282d4 7c0802a6 60000000 7c0802a6
> fbc1fff0 fba1ffe8 fbe1fff8 7c7e1b78 f8010010 f821ffb1 e9230010
> <e9290080> 7c2004ac 71290003 41820008
> [ 140.210093] ---[ end trace 0000000000000000 ]---
>
>
> Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
>
> Reverting it, issue is not seen.
See if a refcnt leak existed before be647e2c76b2:
--- x/drivers/nvme/host/core.c
+++ y/drivers/nvme/host/core.c
@@ -4078,6 +4078,7 @@ static void nvme_scan_work(struct work_s
return;
}
+ nvme_get_ctrl(ctrl);
if (test_and_clear_bit(NVME_AER_NOTICE_NS_CHANGED, &ctrl->events)) {
dev_info(ctrl->device, "rescanning namespaces.\n");
nvme_clear_changed_ns_log(ctrl);
@@ -4097,6 +4098,7 @@ static void nvme_scan_work(struct work_s
nvme_scan_ns_sequential(ctrl);
}
mutex_unlock(&ctrl->scan_lock);
+ nvme_put_ctrl(ctrl);
}
/*
--
* Re: Kernel OOPS while creating a NVMe Namespace
From: Sagi Grimberg @ 2024-06-10 9:57 UTC (permalink / raw)
To: Venkat Rao Bagalkote, kbusch
Cc: linux-block, linux-kernel, linux-nvme, sachinp
On 10/06/2024 10:51, Venkat Rao Bagalkote wrote:
> Greetings!!!
>
> Observing Kernel OOPS, while creating namespace on a NVMe device.
>
> [ 140.209777] BUG: Unable to handle kernel data access at
> 0x18d7003065646fee
> [ 140.209792] Faulting instruction address: 0xc00000000023b45c
> [ 140.209798] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 140.209802] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
> [ 140.209809] Modules linked in: rpadlpar_io rpaphp xsk_diag
> nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
> bonding nf_conntrack tls nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set
> nf_tables nfnetlink vmx_crypto pseries_rng binfmt_misc fuse xfs
> libcrc32c sd_mod sg ibmvscsi scsi_transport_srp ibmveth nvme nvme_core
> t10_pi crc64_rocksoft_generic crc64_rocksoft crc64
> [ 140.209864] CPU: 2 PID: 129 Comm: kworker/u65:3 Kdump: loaded Not
> tainted 6.10.0-rc3 #2
> [ 140.209870] Hardware name: IBM,9009-42A POWER9 (raw) 0x4e0202
> 0xf000005 of:IBM,FW950.A0 (VL950_141) hv:phyp pSeries
> [ 140.209876] Workqueue: nvme-wq nvme_scan_work [nvme_core]
> [ 140.209889] NIP: c00000000023b45c LR: c008000006a96b20 CTR:
> c00000000023b42c
> [ 140.209894] REGS: c0000000506078a0 TRAP: 0380 Not tainted
> (6.10.0-rc3)
> [ 140.209899] MSR: 800000000280b033
> <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24000244 XER: 00000000
> [ 140.209915] CFAR: c008000006aa80ac IRQMASK: 0
> [ 140.209915] GPR00: c008000006a96b20 c000000050607b40
> c000000001573700 c000000004291ee0
> [ 140.209915] GPR04: 0000000000000000 c000000006150080
> 00000000c0080005 fffffffffffe0000
> [ 140.209915] GPR08: 0000000000000000 18d7003065646f6e
> 0000000000000000 c008000006aa8098
> [ 140.209915] GPR12: c00000000023b42c c00000000f7cdf00
> c0000000001a151c c000000004f2be80
> [ 140.209915] GPR16: 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> [ 140.209915] GPR20: c000000004dbcc00 0000000000000006
> 0000000000000002 c000000004911270
> [ 140.209915] GPR24: 0000000000000000 0000000000000000
> c0000000ee254ffc c0000000049111f0
> [ 140.209915] GPR28: 0000000000000000 c000000004911260
> c000000004291ee0 c000000004911260
> [ 140.209975] NIP [c00000000023b45c] synchronize_srcu+0x30/0x1c0
> [ 140.209984] LR [c008000006a96b20] nvme_ns_remove+0x80/0x2d8
> [nvme_core]
> [ 140.209994] Call Trace:
> [ 140.209997] [c000000050607b90] [c008000006a96b20]
> nvme_ns_remove+0x80/0x2d8 [nvme_core]
> [ 140.210008] [c000000050607bd0] [c008000006a972b4]
> nvme_remove_invalid_namespaces+0x144/0x1ac [nvme_core]
> [ 140.210020] [c000000050607c60] [c008000006a9dbd4]
> nvme_scan_ns_list+0x19c/0x370 [nvme_core]
> [ 140.210032] [c000000050607d70] [c008000006a9dfc8]
> nvme_scan_work+0xc8/0x278 [nvme_core]
> [ 140.210043] [c000000050607e40] [c00000000019414c]
> process_one_work+0x20c/0x4f4
> [ 140.210051] [c000000050607ef0] [c0000000001950cc]
> worker_thread+0x378/0x544
> [ 140.210058] [c000000050607f90] [c0000000001a164c] kthread+0x138/0x140
> [ 140.210065] [c000000050607fe0] [c00000000000df98]
> start_kernel_thread+0x14/0x18
> [ 140.210072] Code: 3c4c0134 384282d4 7c0802a6 60000000 7c0802a6
> fbc1fff0 fba1ffe8 fbe1fff8 7c7e1b78 f8010010 f821ffb1 e9230010
> <e9290080> 7c2004ac 71290003 41820008
> [ 140.210093] ---[ end trace 0000000000000000 ]---
>
>
> Issue is introduced by the patch:
> be647e2c76b27f409cdd520f66c95be888b553a3.
Exactly this was the concern when introducing a behavior change in a
sensitive area of the code to silence lockdep...
I'm assuming that the bad dereference is:
synchronize_srcu(&ns->ctrl->srcu);
btw, looking at the code again, I'm assuming that synchronizing srcu in
every ns remove slows down batch removal of many namespaces greatly...
>
>
> Reverting it, issue is not seen.
>
>
> Regards,
>
> Venkat.
>
>
* Re: Kernel OOPS while creating a NVMe Namespace
From: Keith Busch @ 2024-06-10 15:24 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Venkat Rao Bagalkote, linux-block, linux-kernel, linux-nvme,
sachinp
On Mon, Jun 10, 2024 at 12:57:02PM +0300, Sagi Grimberg wrote:
> On 10/06/2024 10:51, Venkat Rao Bagalkote wrote:
> >
> > [ 140.209777] BUG: Unable to handle kernel data access at 0x18d7003065646fee
> > [ 140.209792] Faulting instruction address: 0xc00000000023b45c
> > [ 140.209798] Oops: Kernel access of bad area, sig: 11 [#1]
> > [ 140.209802] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
> > [ 140.209864] CPU: 2 PID: 129 Comm: kworker/u65:3 Kdump: loaded Not tainted 6.10.0-rc3 #2
> > [ 140.209870] Hardware name: IBM,9009-42A POWER9 (raw) 0x4e0202 0xf000005 of:IBM,FW950.A0 (VL950_141) hv:phyp pSeries
> > [ 140.209876] Workqueue: nvme-wq nvme_scan_work [nvme_core]
> > [ 140.209889] NIP: c00000000023b45c LR: c008000006a96b20 CTR: c00000000023b42c
> > [ 140.209894] REGS: c0000000506078a0 TRAP: 0380 Not tainted (6.10.0-rc3)
> > [ 140.209899] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24000244 XER: 00000000
> > [ 140.209975] NIP [c00000000023b45c] synchronize_srcu+0x30/0x1c0
> > [ 140.209984] LR [c008000006a96b20] nvme_ns_remove+0x80/0x2d8 [nvme_core]
> > [ 140.209994] Call Trace:
> > [ 140.209997] [c000000050607b90] [c008000006a96b20] nvme_ns_remove+0x80/0x2d8 [nvme_core]
> > [ 140.210008] [c000000050607bd0] [c008000006a972b4] nvme_remove_invalid_namespaces+0x144/0x1ac [nvme_core]
> > [ 140.210020] [c000000050607c60] [c008000006a9dbd4] nvme_scan_ns_list+0x19c/0x370 [nvme_core]
> > [ 140.210032] [c000000050607d70] [c008000006a9dfc8] nvme_scan_work+0xc8/0x278 [nvme_core]
> > [ 140.210043] [c000000050607e40] [c00000000019414c] process_one_work+0x20c/0x4f4
> > [ 140.210051] [c000000050607ef0] [c0000000001950cc] worker_thread+0x378/0x544
> > [ 140.210058] [c000000050607f90] [c0000000001a164c] kthread+0x138/0x140
> > [ 140.210065] [c000000050607fe0] [c00000000000df98] start_kernel_thread+0x14/0x18
> > [ 140.210072] Code: 3c4c0134 384282d4 7c0802a6 60000000 7c0802a6 fbc1fff0 fba1ffe8 fbe1fff8 7c7e1b78 f8010010 f821ffb1 e9230010 <e9290080> 7c2004ac 71290003 41820008
> > [ 140.210093] ---[ end trace 0000000000000000 ]---
> >
> > Issue is introduced by the patch:
> > be647e2c76b27f409cdd520f66c95be888b553a3.
>
> Exactly this was the concern when introducing a behavior change in a
> sensitive area of the code
> to silence lockdep...
No risk, no reward. :)
If we really can't figure this out, we can always revert and revisit for
the next merge.
> I'm assuming that the bad dereference is:
> synchronize_srcu(&ns->ctrl->srcu);
That would have to be it based on the report. Not sure what the problem
could be, though: the ns->ctrl must have been valid or we would have
failed earlier, and the srcu struct can't be released while the
controller is still in use by any namespace.
Anyway, I tested this path quite a bit, but I'll revisit with dynamic
attachments instead and see if that helps.
> btw, looking at the code again, I'm assuming that synchronizing srcu in
> every ns remove
> slows down batch removal of many namespaces greatly...
I may need to test this out, but I thought srcu sync was pretty quick if
there were no active readers, which there shouldn't be here outside
unusual cases.
* Re: Kernel OOPS while creating a NVMe Namespace
From: Chaitanya Kulkarni @ 2024-06-10 18:32 UTC (permalink / raw)
To: Venkat Rao Bagalkote
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvme@lists.infradead.org, sachinp@linux.vnet.com,
kbusch@kernel.org, sagi@grimberg.me
On 6/10/24 00:51, Venkat Rao Bagalkote wrote:
> Greetings!!!
>
> Observing Kernel OOPS, while creating namespace on a NVMe device.
>
> [ 140.209777] BUG: Unable to handle kernel data access at
> 0x18d7003065646fee
> [ 140.209792] Faulting instruction address: 0xc00000000023b45c
> [ 140.209798] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 140.209802] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
> [ 140.209809] Modules linked in: rpadlpar_io rpaphp xsk_diag
> nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
> bonding nf_conntrack tls nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set
> nf_tables nfnetlink vmx_crypto pseries_rng binfmt_misc fuse xfs
> libcrc32c sd_mod sg ibmvscsi scsi_transport_srp ibmveth nvme nvme_core
> t10_pi crc64_rocksoft_generic crc64_rocksoft crc64
> [ 140.209864] CPU: 2 PID: 129 Comm: kworker/u65:3 Kdump: loaded Not
> tainted 6.10.0-rc3 #2
> [ 140.209870] Hardware name: IBM,9009-42A POWER9 (raw) 0x4e0202
> 0xf000005 of:IBM,FW950.A0 (VL950_141) hv:phyp pSeries
> [ 140.209876] Workqueue: nvme-wq nvme_scan_work [nvme_core]
> [ 140.209889] NIP: c00000000023b45c LR: c008000006a96b20 CTR:
> c00000000023b42c
> [ 140.209894] REGS: c0000000506078a0 TRAP: 0380 Not tainted
> (6.10.0-rc3)
> [ 140.209899] MSR: 800000000280b033
> <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24000244 XER: 00000000
> [ 140.209915] CFAR: c008000006aa80ac IRQMASK: 0
> [ 140.209915] GPR00: c008000006a96b20 c000000050607b40
> c000000001573700 c000000004291ee0
> [ 140.209915] GPR04: 0000000000000000 c000000006150080
> 00000000c0080005 fffffffffffe0000
> [ 140.209915] GPR08: 0000000000000000 18d7003065646f6e
> 0000000000000000 c008000006aa8098
> [ 140.209915] GPR12: c00000000023b42c c00000000f7cdf00
> c0000000001a151c c000000004f2be80
> [ 140.209915] GPR16: 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> [ 140.209915] GPR20: c000000004dbcc00 0000000000000006
> 0000000000000002 c000000004911270
> [ 140.209915] GPR24: 0000000000000000 0000000000000000
> c0000000ee254ffc c0000000049111f0
> [ 140.209915] GPR28: 0000000000000000 c000000004911260
> c000000004291ee0 c000000004911260
> [ 140.209975] NIP [c00000000023b45c] synchronize_srcu+0x30/0x1c0
> [ 140.209984] LR [c008000006a96b20] nvme_ns_remove+0x80/0x2d8
> [nvme_core]
> [ 140.209994] Call Trace:
> [ 140.209997] [c000000050607b90] [c008000006a96b20]
> nvme_ns_remove+0x80/0x2d8 [nvme_core]
> [ 140.210008] [c000000050607bd0] [c008000006a972b4]
> nvme_remove_invalid_namespaces+0x144/0x1ac [nvme_core]
> [ 140.210020] [c000000050607c60] [c008000006a9dbd4]
> nvme_scan_ns_list+0x19c/0x370 [nvme_core]
> [ 140.210032] [c000000050607d70] [c008000006a9dfc8]
> nvme_scan_work+0xc8/0x278 [nvme_core]
> [ 140.210043] [c000000050607e40] [c00000000019414c]
> process_one_work+0x20c/0x4f4
> [ 140.210051] [c000000050607ef0] [c0000000001950cc]
> worker_thread+0x378/0x544
> [ 140.210058] [c000000050607f90] [c0000000001a164c] kthread+0x138/0x140
> [ 140.210065] [c000000050607fe0] [c00000000000df98]
> start_kernel_thread+0x14/0x18
> [ 140.210072] Code: 3c4c0134 384282d4 7c0802a6 60000000 7c0802a6
> fbc1fff0 fba1ffe8 fbe1fff8 7c7e1b78 f8010010 f821ffb1 e9230010
> <e9290080> 7c2004ac 71290003 41820008
> [ 140.210093] ---[ end trace 0000000000000000 ]---
>
>
> Issue is introduced by the patch:
> be647e2c76b27f409cdd520f66c95be888b553a3.
>
>
> Reverting it, issue is not seen.
>
>
> Regards,
>
> Venkat.
>
>
>
Do you have steps that you can share? Did you find this using blktests?
If not, can you please submit a blktests case for this issue? This
clearly needs to be tested regularly, since people are working on this
sensitive area...
-ck
* Re: Kernel OOPS while creating a NVMe Namespace
From: Keith Busch @ 2024-06-10 18:53 UTC (permalink / raw)
To: Venkat Rao Bagalkote; +Cc: sagi, linux-block, linux-kernel, linux-nvme, sachinp
On Mon, Jun 10, 2024 at 01:21:00PM +0530, Venkat Rao Bagalkote wrote:
>
> Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
My mistake. The namespace remove list appears to be getting corrupted
because I'm using the wrong APIs to replace a "list_move_tail". This is
fixing the issue on my end:
---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7c9f91314d366..c667290de5133 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3959,9 +3959,10 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
mutex_lock(&ctrl->namespaces_lock);
list_for_each_entry_safe(ns, next, &ctrl->namespaces, list) {
- if (ns->head->ns_id > nsid)
- list_splice_init_rcu(&ns->list, &rm_list,
- synchronize_rcu);
+ if (ns->head->ns_id > nsid) {
+ list_del_rcu(&ns->list);
+ list_add_tail_rcu(&ns->list, &rm_list);
+ }
}
mutex_unlock(&ctrl->namespaces_lock);
synchronize_srcu(&ctrl->srcu);
--
* Re: Kernel OOPS while creating a NVMe Namespace
From: Sagi Grimberg @ 2024-06-10 19:05 UTC (permalink / raw)
To: Keith Busch, Venkat Rao Bagalkote
Cc: linux-block, linux-kernel, linux-nvme, sachinp
On 10/06/2024 21:53, Keith Busch wrote:
> On Mon, Jun 10, 2024 at 01:21:00PM +0530, Venkat Rao Bagalkote wrote:
>> Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
> My mistake. The namespace remove list appears to be getting corrupted
> because I'm using the wrong APIs to replace a "list_move_tail". This is
> fixing the issue on my end:
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 7c9f91314d366..c667290de5133 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3959,9 +3959,10 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
>
> mutex_lock(&ctrl->namespaces_lock);
> list_for_each_entry_safe(ns, next, &ctrl->namespaces, list) {
> - if (ns->head->ns_id > nsid)
> - list_splice_init_rcu(&ns->list, &rm_list,
> - synchronize_rcu);
> + if (ns->head->ns_id > nsid) {
> + list_del_rcu(&ns->list);
> + list_add_tail_rcu(&ns->list, &rm_list);
> + }
> }
> mutex_unlock(&ctrl->namespaces_lock);
> synchronize_srcu(&ctrl->srcu);
> --
Can we add a reproducer for this in blktests? I'm assuming that we can
easily trigger this with adding/removing nvmet namespaces?
* Re: Kernel OOPS while creating a NVMe Namespace
From: Keith Busch @ 2024-06-10 19:15 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Venkat Rao Bagalkote, linux-block, linux-kernel, linux-nvme,
sachinp
On Mon, Jun 10, 2024 at 10:05:00PM +0300, Sagi Grimberg wrote:
>
>
> On 10/06/2024 21:53, Keith Busch wrote:
> > On Mon, Jun 10, 2024 at 01:21:00PM +0530, Venkat Rao Bagalkote wrote:
> > > Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
> > My mistake. The namespace remove list appears to be getting corrupted
> > because I'm using the wrong APIs to replace a "list_move_tail". This is
> > fixing the issue on my end:
> >
> > ---
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 7c9f91314d366..c667290de5133 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -3959,9 +3959,10 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
> > mutex_lock(&ctrl->namespaces_lock);
> > list_for_each_entry_safe(ns, next, &ctrl->namespaces, list) {
> > - if (ns->head->ns_id > nsid)
> > - list_splice_init_rcu(&ns->list, &rm_list,
> > - synchronize_rcu);
> > + if (ns->head->ns_id > nsid) {
> > + list_del_rcu(&ns->list);
> > + list_add_tail_rcu(&ns->list, &rm_list);
> > + }
> > }
> > mutex_unlock(&ctrl->namespaces_lock);
> > synchronize_srcu(&ctrl->srcu);
> > --
>
> Can we add a reproducer for this in blktests? I'm assuming that we can
> easily trigger this
> with adding/removing nvmet namespaces?
I'm testing this with Namespace Management commands, which nvmet doesn't
support. You can recreate the issue by detaching the last namespace.
* Re: Kernel OOPS while creating a NVMe Namespace
From: Sagi Grimberg @ 2024-06-10 19:17 UTC (permalink / raw)
To: Keith Busch
Cc: Venkat Rao Bagalkote, linux-block, linux-kernel, linux-nvme,
sachinp
On 10/06/2024 22:15, Keith Busch wrote:
> On Mon, Jun 10, 2024 at 10:05:00PM +0300, Sagi Grimberg wrote:
>>
>> On 10/06/2024 21:53, Keith Busch wrote:
>>> On Mon, Jun 10, 2024 at 01:21:00PM +0530, Venkat Rao Bagalkote wrote:
>>>> Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
>>> My mistake. The namespace remove list appears to be getting corrupted
>>> because I'm using the wrong APIs to replace a "list_move_tail". This is
>>> fixing the issue on my end:
>>>
>>> ---
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index 7c9f91314d366..c667290de5133 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -3959,9 +3959,10 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
>>> mutex_lock(&ctrl->namespaces_lock);
>>> list_for_each_entry_safe(ns, next, &ctrl->namespaces, list) {
>>> - if (ns->head->ns_id > nsid)
>>> - list_splice_init_rcu(&ns->list, &rm_list,
>>> - synchronize_rcu);
>>> + if (ns->head->ns_id > nsid) {
>>> + list_del_rcu(&ns->list);
>>> + list_add_tail_rcu(&ns->list, &rm_list);
>>> + }
>>> }
>>> mutex_unlock(&ctrl->namespaces_lock);
>>> synchronize_srcu(&ctrl->srcu);
>>> --
>> Can we add a reproducer for this in blktests? I'm assuming that we can
>> easily trigger this
>> with adding/removing nvmet namespaces?
> I'm testing this with Namespace Manamgent commands, which nvmet doesn't
> support. You can recreate the issue by detaching the last namespace.
>
I think the same will happen in a test that creates two namespaces and
then echo 0 > ns/enable.
* Re: Kernel OOPS while creating a NVMe Namespace
From: Keith Busch @ 2024-06-10 19:33 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Venkat Rao Bagalkote, linux-block, linux-kernel, linux-nvme,
sachinp
On Mon, Jun 10, 2024 at 10:17:42PM +0300, Sagi Grimberg wrote:
> On 10/06/2024 22:15, Keith Busch wrote:
> > On Mon, Jun 10, 2024 at 10:05:00PM +0300, Sagi Grimberg wrote:
> > >
> > > On 10/06/2024 21:53, Keith Busch wrote:
> > > > On Mon, Jun 10, 2024 at 01:21:00PM +0530, Venkat Rao Bagalkote wrote:
> > > > > Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
> > > > My mistake. The namespace remove list appears to be getting corrupted
> > > > because I'm using the wrong APIs to replace a "list_move_tail". This is
> > > > fixing the issue on my end:
> > > >
> > > > ---
> > > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > > > index 7c9f91314d366..c667290de5133 100644
> > > > --- a/drivers/nvme/host/core.c
> > > > +++ b/drivers/nvme/host/core.c
> > > > @@ -3959,9 +3959,10 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
> > > > mutex_lock(&ctrl->namespaces_lock);
> > > > list_for_each_entry_safe(ns, next, &ctrl->namespaces, list) {
> > > > - if (ns->head->ns_id > nsid)
> > > > - list_splice_init_rcu(&ns->list, &rm_list,
> > > > - synchronize_rcu);
> > > > + if (ns->head->ns_id > nsid) {
> > > > + list_del_rcu(&ns->list);
> > > > + list_add_tail_rcu(&ns->list, &rm_list);
> > > > + }
> > > > }
> > > > mutex_unlock(&ctrl->namespaces_lock);
> > > > synchronize_srcu(&ctrl->srcu);
> > > > --
> > > Can we add a reproducer for this in blktests? I'm assuming that we can
> > > easily trigger this
> > > with adding/removing nvmet namespaces?
> > I'm testing this with Namespace Manamgent commands, which nvmet doesn't
> > support. You can recreate the issue by detaching the last namespace.
> >
>
> I think the same will happen in a test that creates two namespaces and then
> echo 0 > ns/enable.
Looks like nvme/016 tests this. It's reporting as "passed" on my end, but
I don't think it's actually testing the driver as intended. Still
messing with it.
* Re: Kernel OOPS while creating a NVMe Namespace
From: Nilay Shroff @ 2024-06-17 9:10 UTC (permalink / raw)
To: Keith Busch, Sagi Grimberg
Cc: Venkat Rao Bagalkote, linux-block, linux-kernel, linux-nvme,
sachinp, Chaitanya Kulkarni, shinichiro.kawasaki
On 6/11/24 01:03, Keith Busch wrote:
> On Mon, Jun 10, 2024 at 10:17:42PM +0300, Sagi Grimberg wrote:
>> On 10/06/2024 22:15, Keith Busch wrote:
>>> On Mon, Jun 10, 2024 at 10:05:00PM +0300, Sagi Grimberg wrote:
>>>>
>>>> On 10/06/2024 21:53, Keith Busch wrote:
>>>>> On Mon, Jun 10, 2024 at 01:21:00PM +0530, Venkat Rao Bagalkote wrote:
>>>>>> Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
>>>>> My mistake. The namespace remove list appears to be getting corrupted
>>>>> because I'm using the wrong APIs to replace a "list_move_tail". This is
>>>>> fixing the issue on my end:
>>>>>
>>>>> ---
>>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>>> index 7c9f91314d366..c667290de5133 100644
>>>>> --- a/drivers/nvme/host/core.c
>>>>> +++ b/drivers/nvme/host/core.c
>>>>> @@ -3959,9 +3959,10 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
>>>>> mutex_lock(&ctrl->namespaces_lock);
>>>>> list_for_each_entry_safe(ns, next, &ctrl->namespaces, list) {
>>>>> - if (ns->head->ns_id > nsid)
>>>>> - list_splice_init_rcu(&ns->list, &rm_list,
>>>>> - synchronize_rcu);
>>>>> + if (ns->head->ns_id > nsid) {
>>>>> + list_del_rcu(&ns->list);
>>>>> + list_add_tail_rcu(&ns->list, &rm_list);
>>>>> + }
>>>>> }
>>>>> mutex_unlock(&ctrl->namespaces_lock);
>>>>> synchronize_srcu(&ctrl->srcu);
>>>>> --
>>>> Can we add a reproducer for this in blktests? I'm assuming that we can
>>>> easily trigger this
>>>> with adding/removing nvmet namespaces?
>>> I'm testing this with Namespace Manamgent commands, which nvmet doesn't
>>> support. You can recreate the issue by detaching the last namespace.
>>>
>>
>> I think the same will happen in a test that creates two namespaces and then
>> echo 0 > ns/enable.
>
> Looks like nvme/016 tess this. It's reporting as "passed" on my end, but
> I don't think it's actually testing the driver as intended. Still
> messing with it.
>
I believe nvme/016 creates and deletes the namespace; however, there is no
backstore associated with the loop device, and hence nvme/016 is unable to
recreate this issue. To recreate it, we need to associate a backstore
(either a block device or a regular file) with the loop device and then
use it for creating and then deleting the namespace.
I wrote a blktests case for this specific regression and was able to
trigger this crash. I will submit it in a separate email.
Thanks,
--Nilay