From: Sagi Grimberg <sagi@grimberg.me>
To: Yi Zhang <yi.zhang@redhat.com>, linux-nvme@lists.infradead.org
Cc: skt-results-master@redhat.com, Bruno Goncalves <bgoncalv@redhat.com>
Subject: Re: [bug report] blktests nvme/022 lead kernel WARNING and NULL pointer
Date: Fri, 30 Apr 2021 17:55:10 -0700
Message-ID: <3243ab72-ffe3-e655-ef0a-d695885ce2c5@grimberg.me>
In-Reply-To: <CAHj4cs-hAaHhWU4Kb4NrQ9Fvi7DiH95P=Q=7zK-inxpeZtztkw@mail.gmail.com>
> Hello
> Recently CKI reproduced this WARNING and NULL pointer with
> linux-block/for-next on aarch64, seems it's one regression, I will try
> if I can bisect the culprit.
>
> blktests: nvme/022 (test NVMe reset command on NVMeOF file-backed ns)
>
> [ 1879.759978] run blktests nvme/022 at 2021-04-30 12:30:36
> [ 1879.804283] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> [ 1879.819087] nvmet: creating controller 1 for subsystem
> blktests-subsystem-1 for NQN
> nqn.2014-08.org.nvmexpress:uuid:0da758a0-4d84-4133-82dd-9801235b55cd.
> [ 1879.833081] nvmet: unhandled identify cns 6 on qid 0
> [ 1879.838079] nvme nvme0: creating 128 I/O queues.
> [ 1879.852353] nvme nvme0: new ctrl: "blktests-subsystem-1"
> [ 1880.879731] nvme nvme0: resetting controller
> [ 1889.940458] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> [ 1889.946377] nvmet: ctrl 1 fatal error occurred!
> [ 1889.950928] nvme nvme0: Removing ctrl: NQN "blktests-subsystem-1"

It appears that we now somehow expire the kato (keep-alive timeout)
during or right after a reset sequence, and then the reset and the
controller removal seem to race...

A bisection would definitely help.
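
To make the ordering concrete, here is a minimal sketch that pulls the
timestamps out of the quoted log and prints the gaps between the reset,
the keep-alive expiry, the removal and the WARNING. The log lines are
copied from the report; the helper names (parse, first) are made up for
illustration and are not part of blktests or the kernel.

```python
import re

# Lines copied from the quoted dmesg output; only the timeline-relevant ones.
LOG = """\
[ 1880.879731] nvme nvme0: resetting controller
[ 1889.940458] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
[ 1889.950928] nvme nvme0: Removing ctrl: NQN "blktests-subsystem-1"
[ 1892.815427] WARNING: CPU: 30 PID: 5492 at
"""

# A dmesg line looks like "[ seconds.micros] message".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse(log):
    """Return a list of (timestamp_seconds, message) tuples."""
    events = []
    for line in log.splitlines():
        m = STAMP.match(line)
        if m:
            events.append((float(m.group(1)), m.group(2)))
    return events


def first(events, needle):
    """Timestamp of the first message containing needle."""
    for ts, msg in events:
        if needle in msg:
            return ts
    raise KeyError(needle)


events = parse(LOG)
reset = first(events, "resetting controller")
kato = first(events, "keep-alive timer")
remove = first(events, "Removing ctrl")
warn = first(events, "WARNING")

print(f"reset   -> kato expiry: {kato - reset:.3f}s")
print(f"kato    -> remove:      {remove - kato:.3f}s")
print(f"remove  -> WARNING:     {warn - remove:.3f}s")
```

In this log the keep-alive timer fires roughly nine seconds after the
reset starts, and the removal begins about ten milliseconds later. That
ordering is what suggests the reset and remove paths overlap, but it
does not by itself show which change introduced the behaviour.
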
> [ 1892.810813] -
> [ 1892.815427] WARNING: CPU: 30 PID: 5492 at
> drivers/nvme/target/loop.c:466 nvme_loop_reset_ctrl_work+0x48/0xf0
> [nvme_loop]
> [ 1892.826293] Modules linked in: nvme_loop nvme_fabrics nvme_core
> nvmet loop rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_isert
> iscsi_target_mod target_core_mod ib_iser vfat libiscsi fat
> scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib
> ib_uverbs i2c_smbus ib_core crct10dif_ce ghash_ce sha1_ce acpi_ipmi
> ipmi_ssif ipmi_devintf ipmi_msghandler thunderx2_pmu ip_tables xfs
> libcrc32c sr_mod cdrom mlx5_core ast i2c_algo_bit drm_vram_helper
> drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
> drm_ttm_helper ttm drm uas mlxfw sha2_ce tls sha256_arm64 usb_storage
> sg psample gpio_xlp i2c_xlp9xx dm_mirror dm_region_hash dm_log dm_mod
> [last unloaded: nvmet]
> [ 1892.885150] CPU: 30 PID: 5492 Comm: kworker/u513:5 Not tainted 5.12.0+ #1
> [ 1892.891926] Hardware name: HPE Apollo 70 /C01_APACHE_MB
> , BIOS L50_5.13_1.16 07/29/2020
> [ 1892.901654] Workqueue: nvme-reset-wq nvme_loop_reset_ctrl_work [nvme_loop]
> [ 1892.908519] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
> [ 1892.914513] pc : nvme_loop_reset_ctrl_work+0x48/0xf0 [nvme_loop]
> [ 1892.920508] lr : nvme_loop_reset_ctrl_work+0x40/0xf0 [nvme_loop]
> [ 1892.926502] sp : fffffe0031b6fd70
> [ 1892.929803] x29: fffffe0031b6fd70 x28: 0000000000000000
> [ 1892.935105] x27: fffffc081065c0c0 x26: fffffc0807c2c26c
> [ 1892.940405] x25: 0000000000000000 x24: fffffc084ae24898
> [ 1892.945705] x23: 0000000000000000 x22: fffffc09410c1d00
> [ 1892.951004] x21: fffffc084ae24890 x20: fffffc084ae244a0
> [ 1892.956305] x19: fffffc084ae24000 x18: 0000000000000012
> [ 1892.961604] x17: 0000000000000001 x16: 0000000000000019
> [ 1892.966904] x15: fffffe0011d7e7e0 x14: fffffc8ba09cedf8
> [ 1892.972204] x13: 0000000000000000 x12: 0000000000000003
> [ 1892.977504] x11: fffffc8ba09ced40 x10: fffffe0011d7e7e8
> [ 1892.982804] x9 : fffffe000ad60c58 x8 : fffffe8b650f0000
> [ 1892.988104] x7 : 0000000000000008 x6 : fffffc0000000000
> [ 1892.993403] x5 : 0000000000000000 x4 : ffffffff22f337e0
> [ 1892.998703] x3 : fffffc084ae244ac x2 : 0000000000000001
> [ 1893.004003] x1 : fffffc084ae244ac x0 : 0000000000000000
> [ 1893.009303] Call trace:
> [ 1893.011737] nvme_loop_reset_ctrl_work+0x48/0xf0 [nvme_loop]
> [ 1893.017384] process_one_work+0x1d0/0x438
> [ 1893.021385] worker_thread+0x1f8/0x4d8
> [ 1893.025123] kthread+0x114/0x118
> [ 1893.028341] ret_from_fork+0x10/0x18
> [ 1893.031907] ---[ end trace 883109425327ab60 ]---
> [ 1893.301843] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000008
> [ 1893.310620] Mem abort info:
> [ 1893.313401] ESR = 0x96000006
> [ 1893.316442] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 1893.321741] SET = 0, FnV = 0
> [ 1893.324783] EA = 0, S1PTW = 0
> [ 1893.327911] Data abort info:
> [ 1893.330778] ISV = 0, ISS = 0x00000006
> [ 1893.334600] CM = 0, WnR = 0
> [ 1893.337555] user pgtable: 64k pages, 42-bit VAs, pgdp=0000000a0b750000
> [ 1893.344069] [0000000000000008] pgd=0000000000000000,
> p4d=0000000000000000, pud=0000000000000000, pmd=0000000000000000
> [ 1893.354669] Internal error: Oops: 96000006 [#1] SMP
> [ 1893.359535] Modules linked in: nvme_loop nvme_fabrics nvme_core
> nvmet loop rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_ ib_ipoib iw_cm
> intf ipmi_msghandrm_kms_helper s64 usb_storage s.418386] CPU: 0 PID:
> 12871 Comm: kworker/u513:0 Tainted: G W 5.12.0+ #1
> [ 1893.426551] Hardware name: HPE Apollo 70 /C01_APACHE_MB
> , BIOS L50_5.13_1.16 07/29/2020
> [ 1893.436277] Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core]
> [ 1893.442892] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
> [ 1893.448886] pc : percpu_ref_kill_and_confirm+0x15c/0x178
> [ 1893.454189] lr : nvmet_sq_destroy+0xec/0x1f0 [nvmet]
> [ 1893.459149] sp : fffffe006466fc70
> [ 1893.462449] x29: fffffe006466fc70 x28: 0000000000000000
> [ 1893.467750] x27: fffffc8bcee4e5c0 x26: fffffc0807f57a6c
> [ 1893.473051] x25: 0000000000000000 x24: fffffc084ae248b8
> [ 1893.478351] x23: fffffc0854e600d0 x22: fffffe000ac40600
> [ 1893.483650] x21: 0000000000000000 x20: fffffc0854e60090
> [ 1893.488950] x19: fffffe0011d7e000 x18: 0000000000000016
> [ 1893.494250] x17: 0000000000000001 x16: fffffc084e210598
> [ 1893.499550] x15: fffffe0011d7e7e0 x14: fffffc8ba09cee18
> [ 1893.504850] x13: 0000000000000000 x12: fffffe006466fc68
> [ 1893.510149] x11: 0000000000000040 x10: fffffe001158b2f8
> [ 1893.515449] x9 : fffffe000ad608ec x8 : fffffc082000cfc0
> [ 1893.520749] x7 : 0000000000000001 x6 : fffffc0000000000
> [ 1893.526049] x5 : 0000000000000080 x4 : 0000000000000000
> [ 1893.531349] x3 : 0000000000000000 x2 : fffffe0011a4e000
> [ 1893.536649] x1 : fffffe0010bdec08 x0 : fffffe0010e95000
> [ 1893.541949] Call trace:
> [ 1893.544383] percpu_ref_kill_and_confirm+0x15c/0x178
> [ 1893.549335] nvmet_sq_destroy+0xec/0x1f0 [nvmet]
> [ 1893.553945] nvme_loop_destroy_io_queues+0x64/0x90 [nvme_loop]
> [ 1893.559767] nvme_loop_shutdown_ctrl+0x60/0xb8 [nvme_loop]
> [ 1893.565240] nvme_loop_delete_ctrl_host+0x18/0x20 [nvme_loop]
> [ 1893.570973] nvme_do_delete_ctrl+0x58/0x6c [nvme_core]
> [ 1893.576106] nvme_delete_ctrl_work+0x18/0x38 [nvme_core]
> [ 1893.581411] process_one_work+0x1d0/0x438
> [ 1893.585410] worker_thread+0x150/0x4d8
> [ 1893.589148] kthread+0x114/0x118
> [ 1893.592364] ret_from_fork+0x10/0x18
> [ 1893.595929] Code: 39244840 f0003781 d0004d40 91302021 (f9400462)
> [ 1893.602139] ---[ end trace 883109425327ab61 ]---
> [ 1893.606744] Kernel panic - not syncing: Oops: Fatal exception
> [ 1893.612512] SMP: stopping secondary CPUs
> [ 1893.616474] Kernel Offset: disabled
> [ 1893.619949] CPU features: 0x00046002,63000838
> [ 1893.624293] Memory Limit: none
> [ 1893.627362] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
>
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme
>
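
As a cross-check on the oops itself, the following sketch decodes the
ESR value and the faulting instruction word from the "Code:" line,
assuming the standard AArch64 encodings (ESR_ELx field layout and the
64-bit LDR unsigned-offset form). The function names are hypothetical
and the decoder only handles that one instruction form.

```python
ESR = 0x96000006   # from the "Mem abort info" section of the log
INSN = 0xF9400462  # the parenthesised word in the "Code:" line


def decode_esr(esr):
    """Split an ESR_ELx value into the fields the kernel prints."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class
        "il": (esr >> 25) & 1,      # 1 means a 32-bit instruction
        "iss": esr & 0x1FFFFFF,     # instruction specific syndrome
        "wnr": (esr >> 6) & 1,      # 0 = read access, 1 = write access
        "dfsc": esr & 0x3F,         # data fault status code
    }


def decode_ldr_x_uimm(insn):
    """Decode 'LDR Xt, [Xn, #imm]' (64-bit, unsigned offset) only."""
    if (insn >> 22) != 0b1111100101:
        raise ValueError("not a 64-bit LDR (unsigned offset) encoding")
    imm12 = (insn >> 10) & 0xFFF
    rn = (insn >> 5) & 0x1F
    rt = insn & 0x1F
    return rt, rn, imm12 * 8  # the immediate is scaled by 8 bytes


fields = decode_esr(ESR)
rt, rn, offset = decode_ldr_x_uimm(INSN)
print(f"EC=0x{fields['ec']:02x} DFSC=0x{fields['dfsc']:02x} WnR={fields['wnr']}")
print(f"ldr x{rt}, [x{rn}, #{offset}]")
```

The instruction decodes to a load from x3 plus 8, and x3 is zero in the
register dump of the second trace, which matches the reported fault
address 0000000000000008. The DFSC value 0x06 is a level 2 translation
fault on a read, consistent with the all-zero page table entries in the
log. Which structure member sits at that offset is not decoded here.
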