* [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key
@ 2024-03-04 16:13 Daniel Wagner
2024-03-04 16:13 ` [PATCH blktests v1 1/2] nvme/rc: add reconnect-delay argument only for fabrics transports Daniel Wagner
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Daniel Wagner @ 2024-03-04 16:13 UTC (permalink / raw)
To: Shinichiro Kawasaki
Cc: Chaitanya Kulkarni, Keith Busch, linux-block, linux-nvme,
Daniel Wagner
This is the test case for
https://lore.kernel.org/linux-nvme/20240304161006.19328-1-dwagner@suse.de/
Daniel Wagner (2):
nvme/rc: add reconnect-delay argument only for fabrics transports
nvme/048: add reconnect after ctrl key change
tests/nvme/045 | 72 +++++++++++++++++++++++++++++++++++++++++++++-
tests/nvme/045.out | 3 +-
tests/nvme/rc | 17 +++++++++--
3 files changed, 88 insertions(+), 4 deletions(-)
--
2.44.0
^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH blktests v1 1/2] nvme/rc: add reconnect-delay argument only for fabrics transports
  2024-03-04 16:13 [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key Daniel Wagner
@ 2024-03-04 16:13 ` Daniel Wagner
  2024-03-04 16:13 ` [PATCH blktests v1 2/2] nvme/048: add reconnect after ctrl key change Daniel Wagner
  2024-03-05  9:44 ` [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key Shinichiro Kawasaki
  2 siblings, 0 replies; 10+ messages in thread
From: Daniel Wagner @ 2024-03-04 16:13 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: Chaitanya Kulkarni, Keith Busch, linux-block, linux-nvme,
	Daniel Wagner

The loop transport is also considered a fabrics transport, but it does
not support all connect options, in particular the reconnect-delay
option. nvme-cli can't figure this out because the kernel reports the
supported options for all transports combined (as a union). Thus, we
have to filter out this option for the loop transport in blktests.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 tests/nvme/rc | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/tests/nvme/rc b/tests/nvme/rc
index 2d6ebeab2f6f..5facbe8f2d2f 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -122,13 +122,25 @@ _require_nvme_test_img_size() {
 	return 0
 }
 
-_require_nvme_trtype() {
+_is_nvme_trtype() {
 	local trtype
+
 	for trtype in "$@"; do
 		if [[ "${nvme_trtype}" == "$trtype" ]]; then
 			return 0
 		fi
 	done
+
+	return 1
+}
+
+_require_nvme_trtype() {
+	local trtype
+
+	if _is_nvme_trtype "$@"; then
+		return 0
+	fi
+
 	SKIP_REASONS+=("nvme_trtype=${nvme_trtype} is not supported in this test")
 	return 1
 }
@@ -513,7 +525,8 @@ _nvme_connect_subsys() {
 	if [[ -n "${keep_alive_tmo}" ]]; then
 		ARGS+=(--keep-alive-tmo="${keep_alive_tmo}")
 	fi
-	if [[ -n "${reconnect_delay}" ]]; then
+	if ! _is_nvme_trtype loop &&
+	   [[ -n "${reconnect_delay}" ]]; then
 		ARGS+=(--reconnect-delay="${reconnect_delay}")
 	fi
 	if [[ -n "${ctrl_loss_tmo}" ]]; then
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread
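The predicate/require split introduced by this patch can be exercised outside blktests as well. Below is a minimal, self-contained sketch: `_is_nvme_trtype` and `_require_nvme_trtype` mirror the patch, while `build_connect_args` is a hypothetical stand-in for the relevant part of `_nvme_connect_subsys`, not a real blktests helper.

```shell
# Sketch only: nvme_trtype and SKIP_REASONS are normally set up by the
# blktests harness; we fake them here so the helpers run standalone.
nvme_trtype=loop
SKIP_REASONS=()

# Predicate: does the current transport match any of the given ones?
_is_nvme_trtype() {
	local trtype

	for trtype in "$@"; do
		if [[ "${nvme_trtype}" == "$trtype" ]]; then
			return 0
		fi
	done

	return 1
}

# Requirement check built on top of the predicate; records a skip reason.
_require_nvme_trtype() {
	if _is_nvme_trtype "$@"; then
		return 0
	fi

	SKIP_REASONS+=("nvme_trtype=${nvme_trtype} is not supported in this test")
	return 1
}

# Hypothetical wrapper showing the call-site pattern from the patch:
# only emit --reconnect-delay for transports other than loop.
build_connect_args() {
	local reconnect_delay="$1"
	local args=()

	if ! _is_nvme_trtype loop && [[ -n "${reconnect_delay}" ]]; then
		args+=(--reconnect-delay="${reconnect_delay}")
	fi
	echo "${args[@]}"
}
```

With `nvme_trtype=loop`, `build_connect_args 10` emits nothing; with any other transport it emits `--reconnect-delay=10`, which is exactly the filtering the patch adds to `_nvme_connect_subsys`.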
* [PATCH blktests v1 2/2] nvme/048: add reconnect after ctrl key change
  2024-03-04 16:13 [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key Daniel Wagner
  2024-03-04 16:13 ` [PATCH blktests v1 1/2] nvme/rc: add reconnect-delay argument only for fabrics transports Daniel Wagner
@ 2024-03-04 16:13 ` Daniel Wagner
  2024-03-05  9:38 ` Shinichiro Kawasaki
  2024-03-05  9:44 ` [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key Shinichiro Kawasaki
  2 siblings, 1 reply; 10+ messages in thread
From: Daniel Wagner @ 2024-03-04 16:13 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: Chaitanya Kulkarni, Keith Busch, linux-block, linux-nvme,
	Daniel Wagner

Re-authentication is a soft state: unless the host has to reconnect, a
key change on the target side is not observed. Extend the current test
with a forced reconnect after a key change. This exercises the host's
DNR handling code.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 tests/nvme/045     | 72 +++++++++++++++++++++++++++++++++++++++++++++-
 tests/nvme/045.out |  3 +-
 2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/tests/nvme/045 b/tests/nvme/045
index be408b629771..b0527685f61f 100755
--- a/tests/nvme/045
+++ b/tests/nvme/045
@@ -21,6 +21,62 @@ requires() {
 	_have_driver dh_generic
 }
 
+nvmf_wait_for_state() {
+	local def_state_timeout=5
+	local subsys_name="$1"
+	local state="$2"
+	local timeout="${3:-$def_state_timeout}"
+	local nvmedev
+	local state_file
+	local start_time
+	local end_time
+
+	nvmedev=$(_find_nvme_dev "${subsys_name}")
+	state_file="/sys/class/nvme-fabrics/ctl/${nvmedev}/state"
+
+	start_time=$(date +%s)
+	while ! grep -q "${state}" "${state_file}"; do
+		sleep 1
+		end_time=$(date +%s)
+		if (( end_time - start_time > timeout )); then
+			echo "expected state \"${state}\" not " \
+				"reached within ${timeout} seconds"
+			return 1
+		fi
+	done
+
+	return 0
+}
+
+nvmf_wait_for_ctrl_delete() {
+	local def_state_timeout=5
+	local nvmedev="$1"
+	local timeout="${2:-$def_state_timeout}"
+	local ctrl="/sys/class/nvme-fabrics/ctl/${nvmedev}/state"
+	local start_time
+	local end_time
+
+	start_time=$(date +%s)
+	while [ -f "${ctrl}" ]; do
+		sleep 1
+		end_time=$(date +%s)
+		if (( end_time - start_time > timeout )); then
+			echo "controller \"${nvmedev}\" not deleted" \
+				"within ${timeout} seconds"
+			return 1
+		fi
+	done
+
+	return 0
+}
+
+set_nvmet_attr_qid_max() {
+	local nvmet_subsystem="$1"
+	local qid_max="$2"
+	local cfs_path="${NVMET_CFS}/subsystems/${nvmet_subsystem}"
+
+	echo "${qid_max}" > "${cfs_path}/attr_qid_max"
+}
+
 test() {
 	echo "Running ${TEST_NAME}"
 
@@ -55,7 +111,11 @@ test() {
 		--hostnqn "${def_hostnqn}" \
 		--hostid "${def_hostid}" \
 		--dhchap-secret "${hostkey}" \
-		--dhchap-ctrl-secret "${ctrlkey}"
+		--dhchap-ctrl-secret "${ctrlkey}" \
+		--reconnect-delay 1
+
+	nvmf_wait_for_state "${def_subsysnqn}" "live"
+
 	nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
 
 	echo "Re-authenticate with original host key"
@@ -108,6 +168,16 @@ test() {
 	rand_io_size="$(_nvme_calc_rand_io_size 4m)"
 	_run_fio_rand_io --size="${rand_io_size}" --filename="/dev/${nvmedev}n1"
 
+	echo "Renew host key on the controller and force reconnect"
+
+	new_hostkey="$(nvme gen-dhchap-key -n ${def_subsysnqn} 2> /dev/null)"
+
+	_set_nvmet_hostkey "${def_hostnqn}" "${new_hostkey}"
+
+	# Force a reconnect
+	set_nvmet_attr_qid_max "${def_subsysnqn}" 1
+	nvmf_wait_for_ctrl_delete "${nvmedev}"
+
 	_nvme_disconnect_subsys "${def_subsysnqn}"
 
 	_nvmet_target_cleanup
diff --git a/tests/nvme/045.out b/tests/nvme/045.out
index 84a69ef5c4c6..3331304ef71a 100644
--- a/tests/nvme/045.out
+++ b/tests/nvme/045.out
@@ -8,5 +8,6 @@ Change DH group to ffdhe8192
 Re-authenticate with changed DH group
 Change hash to hmac(sha512)
 Re-authenticate with changed hash
-disconnected 1 controller(s)
+Renew host key on the controller and force reconnect
+disconnected 0 controller(s)
 Test complete
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread
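The two wait helpers added by this patch share one pattern: poll a condition once per second and give up once a deadline passes. A generic, self-contained sketch of that pattern is below; `wait_for_condition` is a hypothetical name for illustration, not a blktests helper.

```shell
# Generic form of the polling loop used by nvmf_wait_for_state() and
# nvmf_wait_for_ctrl_delete(): run an arbitrary command once per second
# until it succeeds, and fail once the timeout (in seconds) is exceeded.
wait_for_condition() {
	local timeout="$1"
	shift
	local start_time end_time

	start_time=$(date +%s)
	while ! "$@"; do
		sleep 1
		end_time=$(date +%s)
		if (( end_time - start_time > timeout )); then
			echo "condition \"$*\" not met within ${timeout} seconds"
			return 1
		fi
	done

	return 0
}

# Usage examples (paths are illustrative):
#   wait_for_condition 5 grep -q live /sys/class/nvme-fabrics/ctl/nvme1/state
#   wait_for_condition 5 test ! -f /sys/class/nvme-fabrics/ctl/nvme1/state
```

One design note: because the loop sleeps before re-checking the deadline, the helper can overshoot the timeout by up to one polling interval, which is acceptable for test-harness waits.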
* Re: [PATCH blktests v1 2/2] nvme/048: add reconnect after ctrl key change
  2024-03-04 16:13 ` [PATCH blktests v1 2/2] nvme/048: add reconnect after ctrl key change Daniel Wagner
@ 2024-03-05  9:38 ` Shinichiro Kawasaki
  2024-03-05 11:08 ` Daniel Wagner
  0 siblings, 1 reply; 10+ messages in thread
From: Shinichiro Kawasaki @ 2024-03-05 9:38 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Chaitanya Kulkarni, Keith Busch, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org

On Mar 04, 2024 / 17:13, Daniel Wagner wrote:
> The re-authentication is a soft state, meaning unless the host has to
> reconnect a key change on the target side is not observed. Extend the
> current test with a forced reconnect after a key change. This
> exercises the DNR handling code of the host.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>

This looks like a good improvement of the test case. I have two comments:

- The commit title says nvme/048, but it should be nvme/045.
- The helper functions nvmf_wait_for_state() and set_nvmet_attr_qid_max()
  are exactly the same as those in nvme/048, aren't they?
  Probably, it's better to move them to nvme/rc as a preparation patch.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH blktests v1 2/2] nvme/048: add reconnect after ctrl key change
  2024-03-05  9:38 ` Shinichiro Kawasaki
@ 2024-03-05 11:08 ` Daniel Wagner
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Wagner @ 2024-03-05 11:08 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: Chaitanya Kulkarni, Keith Busch, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org

On Tue, Mar 05, 2024 at 09:38:41AM +0000, Shinichiro Kawasaki wrote:
> On Mar 04, 2024 / 17:13, Daniel Wagner wrote:
> > The re-authentication is a soft state, meaning unless the host has to
> > reconnect a key change on the target side is not observed. Extend the
> > current test with a forced reconnect after a key change. This
> > exercises the DNR handling code of the host.
> > 
> > Signed-off-by: Daniel Wagner <dwagner@suse.de>
> 
> This looks a good improvement of the test case. I have two comments:
> 
> - The commit title says nvme/048, but it should be nvme/045.

Indeed :)

> - The helper functions nvmf_wait_for_state() and set_nvmet_attr_qid_max()
>   are exactly same as those in nvme/048, aren't they?
>   Probably, it's the better to move them to nvme/rc, as a preparation patch.

I forgot that we already had those two helpers. Sure, will do.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key
  2024-03-04 16:13 [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key Daniel Wagner
  2024-03-04 16:13 ` [PATCH blktests v1 1/2] nvme/rc: add reconnect-delay argument only for fabrics transports Daniel Wagner
  2024-03-04 16:13 ` [PATCH blktests v1 2/2] nvme/048: add reconnect after ctrl key change Daniel Wagner
@ 2024-03-05  9:44 ` Shinichiro Kawasaki
  2024-03-05 11:18 ` Daniel Wagner
  2 siblings, 1 reply; 10+ messages in thread
From: Shinichiro Kawasaki @ 2024-03-05 9:44 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Chaitanya Kulkarni, Keith Busch, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org

On Mar 04, 2024 / 17:13, Daniel Wagner wrote:
> The is the test case for
> 
> https://lore.kernel.org/linux-nvme/20240304161006.19328-1-dwagner@suse.de/
> 
> 
> Daniel Wagner (2):
>   nvme/rc: add reconnect-delay argument only for fabrics transports
>   nvme/048: add reconnect after ctrl key change

I applied the kernel patches in the link above to v6.8-rc7 and then ran
nvme/045 with the blktests patches in this series, and I observed
failures of the test case with various transports [1]. Is this failure
expected?

Also, I observed a KASAN double-free [2]. Do you observe it in your
environment? I created a quick fix [3], and it looks like it resolves
the double-free.

[1]

sudo ./check nvme/045
nvme/045 (Test re-authentication)            [failed]
    runtime  8.069s  ...
7.639s --- tests/nvme/045.out 2024-03-05 18:09:07.267668493 +0900 +++ /home/shin/Blktests/blktests/results/nodev/nvme/045.out.bad 2024-03-05 18:10:07.735494384 +0900 @@ -9,5 +9,6 @@ Change hash to hmac(sha512) Re-authenticate with changed hash Renew host key on the controller and force reconnect -disconnected 0 controller(s) +controller "nvme1" not deleted within 5 seconds +disconnected 1 controller(s) Test complete [2] [ 938.253184] ================================================================== [ 938.254995] BUG: KASAN: double-free in nuse_show+0x307/0x3c0 [nvme_core] [ 938.256400] Free of addr ffff88812d318000 by task nvme/1564 [ 938.258777] CPU: 2 PID: 1564 Comm: nvme Not tainted 6.8.0-rc7+ #155 [ 938.260188] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 938.261695] Call Trace: [ 938.262780] <TASK> [ 938.263950] dump_stack_lvl+0x57/0x90 [ 938.265157] print_report+0xcf/0x670 [ 938.266372] ? __virt_addr_valid+0x211/0x400 [ 938.267554] ? nuse_show+0x307/0x3c0 [nvme_core] [ 938.268790] kasan_report_invalid_free+0x72/0xa0 [ 938.270025] ? nuse_show+0x307/0x3c0 [nvme_core] [ 938.271242] ? nuse_show+0x307/0x3c0 [nvme_core] [ 938.272447] poison_slab_object+0x141/0x170 [ 938.273574] ? nuse_show+0x307/0x3c0 [nvme_core] [ 938.274826] __kasan_slab_free+0x2e/0x50 [ 938.276029] kfree+0x116/0x350 [ 938.277133] nuse_show+0x307/0x3c0 [nvme_core] [ 938.278326] ? __pfx_lock_acquire+0x10/0x10 [ 938.279433] ? __pfx_nuse_show+0x10/0x10 [nvme_core] [ 938.280669] dev_attr_show+0x42/0xc0 [ 938.281668] ? sysfs_file_ops+0x11b/0x170 [ 938.282733] sysfs_kf_seq_show+0x1f0/0x3b0 [ 938.283818] seq_read_iter+0x40c/0x11c0 [ 938.284888] ? rw_verify_area+0x179/0x470 [ 938.286016] vfs_read+0x606/0xc70 [ 938.287106] ? __pfx_vfs_read+0x10/0x10 [ 938.288153] ? kasan_quarantine_put+0xd6/0x1e0 [ 938.289234] ? lockdep_hardirqs_on+0x7d/0x100 [ 938.290313] ? __fget_light+0x53/0x1e0 [ 938.291267] ksys_read+0xf7/0x1d0 [ 938.292233] ? 
__pfx_ksys_read+0x10/0x10 [ 938.293301] ? kasan_quarantine_put+0xd6/0x1e0 [ 938.294300] do_syscall_64+0x9a/0x190 [ 938.295253] ? __x64_sys_openat+0x11f/0x1d0 [ 938.296292] ? lockdep_hardirqs_on+0x7d/0x100 [ 938.297277] ? __pfx___x64_sys_openat+0x10/0x10 [ 938.298328] ? ksys_read+0xf7/0x1d0 [ 938.299245] ? lockdep_hardirqs_on_prepare+0x17b/0x410 [ 938.300301] ? do_syscall_64+0xa7/0x190 [ 938.301191] ? lockdep_hardirqs_on+0x7d/0x100 [ 938.302148] ? do_syscall_64+0xa7/0x190 [ 938.303107] ? do_syscall_64+0xa7/0x190 [ 938.304009] ? do_syscall_64+0xa7/0x190 [ 938.304936] ? lockdep_hardirqs_on_prepare+0x17b/0x410 [ 938.306017] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 938.307103] RIP: 0033:0x7f57658da121 [ 938.308065] Code: 00 48 8b 15 11 fd 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 40 ce 01 00 f3 0f 1e fa 80 3d 45 82 0d 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec [ 938.310749] RSP: 002b:00007ffe0fd8ef98 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 938.312023] RAX: ffffffffffffffda RBX: 00007ffe0fd908a8 RCX: 00007f57658da121 [ 938.313215] RDX: 0000000000000fff RSI: 00007ffe0fd8efb0 RDI: 0000000000000003 [ 938.314464] RBP: 00007ffe0fd90820 R08: 0000000000000073 R09: 0000000000000001 [ 938.315668] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004 [ 938.316871] R13: 0000000000000000 R14: 00007f5765a4b000 R15: 000000000053bdc0 [ 938.318077] </TASK> [ 938.319688] Allocated by task 1564: [ 938.320623] kasan_save_stack+0x2f/0x50 [ 938.321579] kasan_save_track+0x10/0x30 [ 938.322532] __kasan_kmalloc+0xa6/0xb0 [ 938.323477] nvme_identify_ns+0xae/0x230 [nvme_core] [ 938.324529] nuse_show+0x27a/0x3c0 [nvme_core] [ 938.325546] dev_attr_show+0x42/0xc0 [ 938.326485] sysfs_kf_seq_show+0x1f0/0x3b0 [ 938.327429] seq_read_iter+0x40c/0x11c0 [ 938.328483] vfs_read+0x606/0xc70 [ 938.329401] ksys_read+0xf7/0x1d0 [ 938.330441] do_syscall_64+0x9a/0x190 [ 938.331348] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 938.333140] Freed 
by task 1564: [ 938.334143] kasan_save_stack+0x2f/0x50 [ 938.335067] kasan_save_track+0x10/0x30 [ 938.336078] kasan_save_free_info+0x37/0x60 [ 938.337101] poison_slab_object+0x102/0x170 [ 938.338124] __kasan_slab_free+0x2e/0x50 [ 938.339082] kfree+0x116/0x350 [ 938.339965] nvme_identify_ns+0x1c5/0x230 [nvme_core] [ 938.341006] nuse_show+0x27a/0x3c0 [nvme_core] [ 938.342003] dev_attr_show+0x42/0xc0 [ 938.342931] sysfs_kf_seq_show+0x1f0/0x3b0 [ 938.343882] seq_read_iter+0x40c/0x11c0 [ 938.344804] vfs_read+0x606/0xc70 [ 938.345708] ksys_read+0xf7/0x1d0 [ 938.346611] do_syscall_64+0x9a/0x190 [ 938.347538] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 938.349308] The buggy address belongs to the object at ffff88812d318000 which belongs to the cache kmalloc-4k of size 4096 [ 938.350299] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP. [ 938.350311] The buggy address is located 0 bytes inside of 4096-byte region [ffff88812d318000, ffff88812d319000) [ 938.350314] The buggy address belongs to the physical page: [ 938.358511] page:00000000389f3330 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d318 [ 938.360009] head:00000000389f3330 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 938.361388] flags: 0x17ffffc0000840(slab|head|node=0|zone=2|lastcpupid=0x1fffff) [ 938.362644] page_type: 0xffffffff() [ 938.363627] raw: 0017ffffc0000840 ffff888100043040 dead000000000122 0000000000000000 [ 938.364958] raw: 0000000000000000 0000000080040004 00000001ffffffff 0000000000000000 [ 938.366278] page dumped because: kasan: bad access detected [ 938.368303] Memory state around the buggy address: [ 938.369384] ffff88812d317f00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc [ 938.370661] ffff88812d317f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 938.371983] >ffff88812d318000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 938.373295] ^ [ 
938.374311] ffff88812d318080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 938.375618] ffff88812d318100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 938.376954] ================================================================== [ 938.378356] Disabling lock debugging due to kernel taint [3] diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c index f2832f70e7e0..4e161d3cd840 100644 --- a/drivers/nvme/host/sysfs.c +++ b/drivers/nvme/host/sysfs.c @@ -221,14 +221,10 @@ static int ns_update_nuse(struct nvme_ns *ns) ret = nvme_identify_ns(ns->ctrl, ns->head->ns_id, &id); if (ret) - goto out_free_id; + return ret; ns->head->nuse = le64_to_cpu(id->nuse); - -out_free_id: - kfree(id); - - return ret; + return 0; } static ssize_t nuse_show(struct device *dev, struct device_attribute *attr, ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key
  2024-03-05  9:44 ` [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key Shinichiro Kawasaki
@ 2024-03-05 11:18 ` Daniel Wagner
  2024-03-06  8:44 ` Shinichiro Kawasaki
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Wagner @ 2024-03-05 11:18 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: Chaitanya Kulkarni, Keith Busch, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org

On Tue, Mar 05, 2024 at 09:44:45AM +0000, Shinichiro Kawasaki wrote:
> On Mar 04, 2024 / 17:13, Daniel Wagner wrote:
> > The is the test case for
> > 
> > https://lore.kernel.org/linux-nvme/20240304161006.19328-1-dwagner@suse.de/
> > 
> > 
> > Daniel Wagner (2):
> >   nvme/rc: add reconnect-delay argument only for fabrics transports
> >   nvme/048: add reconnect after ctrl key change
> 
> I apply the kernel patches in the link above to v6.8-rc7, then ran nvme/045
> with the blktests patches in the series. And I observed failure of the test
> case with various transports [1]. Is this failure expected?

If you have these patches applied, the test should pass. But we might
still have some more stuff to unify between the transports. The nvme/045
test passes in my setup, though I have seen runs that hung for some
reason; I haven't figured out yet what's happening there. But I haven't
seen failures, IIRC.

I am not really surprised we are seeing some fallout, though. We start
to test the error code paths with this test extension.

> Also, I observed KASAN double-free [2]. Do you observe it in your environment?
> I created a quick fix [3], and it looks resolving the double-free.

No, I haven't seen this.

> sudo ./check nvme/045
> nvme/045 (Test re-authentication)            [failed]
>     runtime  8.069s  ...  7.639s
>     --- tests/nvme/045.out	2024-03-05 18:09:07.267668493 +0900
>     +++ /home/shin/Blktests/blktests/results/nodev/nvme/045.out.bad	2024-03-05 18:10:07.735494384 +0900
>     @@ -9,5 +9,6 @@
>      Change hash to hmac(sha512)
>      Re-authenticate with changed hash
>      Renew host key on the controller and force reconnect
>     -disconnected 0 controller(s)
>     +controller "nvme1" not deleted within 5 seconds
>     +disconnected 1 controller(s)
>     Test complete

That means the host either successfully reconnected or never
disconnected. We have another test case just for the disconnect test
(number of queue changes), so if this test passes, it must be the
former... That shouldn't really happen; it would mean the auth code has
a bug.

> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
> index f2832f70e7e0..4e161d3cd840 100644
> --- a/drivers/nvme/host/sysfs.c
> +++ b/drivers/nvme/host/sysfs.c
> @@ -221,14 +221,10 @@ static int ns_update_nuse(struct nvme_ns *ns)
> 
>  	ret = nvme_identify_ns(ns->ctrl, ns->head->ns_id, &id);
>  	if (ret)
> -		goto out_free_id;
> +		return ret;

Yes, this is correct.

>  	ns->head->nuse = le64_to_cpu(id->nuse);
> -
> -out_free_id:
> -	kfree(id);
> -
> -	return ret;
> +	return 0;
>  }

I think you still need to free the 'id' on the normal exit path, though.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key 2024-03-05 11:18 ` Daniel Wagner @ 2024-03-06 8:44 ` Shinichiro Kawasaki 2024-03-06 9:36 ` Daniel Wagner 0 siblings, 1 reply; 10+ messages in thread From: Shinichiro Kawasaki @ 2024-03-06 8:44 UTC (permalink / raw) To: Daniel Wagner Cc: Chaitanya Kulkarni, Keith Busch, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org On Mar 05, 2024 / 12:18, Daniel Wagner wrote: > On Tue, Mar 05, 2024 at 09:44:45AM +0000, Shinichiro Kawasaki wrote: > > On Mar 04, 2024 / 17:13, Daniel Wagner wrote: > > > The is the test case for > > > > > > https://lore.kernel.org/linux-nvme/20240304161006.19328-1-dwagner@suse.de/ > > > > > > > > > Daniel Wagner (2): > > > nvme/rc: add reconnect-delay argument only for fabrics transports > > > nvme/048: add reconnect after ctrl key change > > > > I apply the kernel patches in the link above to v6.8-rc7, then ran nvme/045 > > with the blktests patches in the series. And I observed failure of the test > > case with various transports [1]. Is this failure expected? > > If you have these patches applied, the test should pass. But we might > have still some more stuff to unify between the transports. The nvme/045 > test passes in my setup. Though I have seen runs which were hang for > some reason. Haven't figured out yet what's happening there. But I > haven't seen failures, IIRC. > > I am not really surprised we seeing some fallouts though. We start to > test the error code paths with this test extension. > > > Also, I observed KASAN double-free [2]. Do you observe it in your environment? > > I created a quick fix [3], and it looks resolving the double-free. > > No, I haven't seen this. > > > sudo ./check nvme/045 > > nvme/045 (Test re-authentication) [failed] > > runtime 8.069s ... 
7.639s > > --- tests/nvme/045.out 2024-03-05 18:09:07.267668493 +0900 > > +++ /home/shin/Blktests/blktests/results/nodev/nvme/045.out.bad 2024-03-05 18:10:07.735494384 +0900 > > @@ -9,5 +9,6 @@ > > Change hash to hmac(sha512) > > Re-authenticate with changed hash > > Renew host key on the controller and force reconnect > > -disconnected 0 controller(s) > > +controller "nvme1" not deleted within 5 seconds > > +disconnected 1 controller(s) > > Test complete > > That means the host either successfully reconnected or never > disconnected. We have another test case just for the disconnect test > (number of queue changes), so if this test passes, it must be the > former... Shouldn't really happen, this would mean the auth code has bug. The test case nvme/048 passes, so this looks a bug. > > > diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c > > index f2832f70e7e0..4e161d3cd840 100644 > > --- a/drivers/nvme/host/sysfs.c > > +++ b/drivers/nvme/host/sysfs.c > > @@ -221,14 +221,10 @@ static int ns_update_nuse(struct nvme_ns *ns) > > > > ret = nvme_identify_ns(ns->ctrl, ns->head->ns_id, &id); > > if (ret) > > - goto out_free_id; > > + return ret; > > Yes, this is correct. > > > > ns->head->nuse = le64_to_cpu(id->nuse); > > - > > -out_free_id: > > - kfree(id); > > - > > - return ret; > > + return 0; > > } > > > > I think you still need to free the 'id' on the normal exit path though Thanks, I posted the patch with the fix. > > If you have these patches applied, the test should pass. But we might > have still some more stuff to unify between the transports. The nvme/045 > test passes in my setup. Though I have seen runs which were hang for > some reason. Haven't figured out yet what's happening there. But I > haven't seen failures. Still with the fix of the double-free, I observe the nvme/045 failure for rdma, tcp and fc transports. I wonder where the difference between your system and mine comes from. FYI, here I share the kernel messages for rdma transport. 
It shows that nvme_rdma_reconnect_or_remove() was called repeatedly and it tried to reconnect. The status argument is -111 or 880, so I think the recon flag is always true and no effect. I'm interested in the status values in your environment. [ 59.117607] run blktests nvme/045 at 2024-03-06 17:05:55 [ 59.198629] (null): rxe_set_mtu: Set mtu to 1024 [ 59.211185] PCLMULQDQ-NI instructions are not detected. [ 59.362952] infiniband ens3_rxe: set active [ 59.363765] infiniband ens3_rxe: added ens3 [ 59.540499] nvmet: adding nsid 1 to subsystem blktests-subsystem-1 [ 59.560541] nvmet_rdma: enabling port 0 (10.0.2.15:4420) [ 59.688866] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP. [ 59.701114] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048 [ 59.702195] nvme nvme1: qid 0: controller authenticated [ 59.703310] nvme nvme1: qid 0: authenticated [ 59.707478] nvme nvme1: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices. [ 59.709883] nvme nvme1: creating 4 I/O queues. [ 59.745087] nvme nvme1: mapped 4/0/0 default/read/poll queues. 
[ 59.786869] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 10.0.2.15:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 [ 59.999761] nvme nvme1: re-authenticating controller [ 60.010902] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048 [ 60.011640] nvme nvme1: qid 0: controller authenticated [ 60.025652] nvme nvme1: re-authenticating controller [ 60.035349] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048 [ 60.036375] nvme nvme1: qid 0: controller authenticated [ 60.050449] nvme nvme1: re-authenticating controller [ 60.060757] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048 [ 60.061460] nvme nvme1: qid 0: controller authenticated [ 62.662430] nvme nvme1: re-authenticating controller [ 62.859510] nvme nvme1: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192 [ 62.860502] nvme nvme1: qid 0: controller authenticated [ 63.029182] nvme nvme1: re-authenticating controller [ 63.192844] nvme nvme1: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192 [ 63.193900] nvme nvme1: qid 0: controller authenticated [ 63.608561] nvme nvme1: starting error recovery [ 63.653699] nvme nvme1: Reconnecting in 1 seconds... [ 64.712627] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP. [ 64.868896] nvmet: ctrl 1 qid 0 host response mismatch [ 64.870065] nvmet: ctrl 1 qid 0 failure1 (1) [ 64.871152] nvmet: ctrl 1 fatal error occurred! [ 64.871519] nvme nvme1: qid 0: authentication failed [ 64.874330] nvme nvme1: failed to connect queue: 0 ret=-111 [ 64.878612] nvme nvme1: Failed reconnect attempt 1 [ 64.880472] nvme nvme1: Reconnecting in 1 seconds... [ 66.040957] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP. 
[ 66.200862] nvmet: ctrl 1 qid 0 host response mismatch [ 66.203005] nvmet: ctrl 1 qid 0 failure1 (1) [ 66.204873] nvmet: ctrl 1 fatal error occurred! [ 66.205148] nvme nvme1: qid 0: authentication failed [ 66.208609] nvme nvme1: failed to connect queue: 0 ret=-111 [ 66.212033] nvme nvme1: Failed reconnect attempt 2 [ 66.213837] nvme nvme1: Reconnecting in 1 seconds... [ 67.327576] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP. [ 67.485392] nvmet: ctrl 1 qid 0 host response mismatch [ 67.487440] nvmet: ctrl 1 qid 0 failure1 (1) [ 67.489403] nvmet: ctrl 1 fatal error occurred! [ 67.489565] nvme nvme1: qid 0: authentication failed [ 67.493015] nvme nvme1: failed to connect queue: 0 ret=-111 [ 67.496909] nvme nvme1: Failed reconnect attempt 3 [ 67.498692] nvme nvme1: Reconnecting in 1 seconds... [ 68.610640] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP. [ 68.739298] nvme nvme1: Identify namespace failed (880) [ 68.742833] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1" [ 68.774125] nvmet: ctrl 1 qid 0 host response mismatch [ 68.776440] nvme nvme1: qid 0 auth_send failed with status 880 [ 68.778133] nvme nvme1: qid 0 failed to receive success1, nvme status 880 [ 68.780300] nvme nvme1: qid 0: authentication failed [ 68.782490] nvme nvme1: failed to connect queue: 0 ret=880 [ 68.785335] nvme nvme1: Failed reconnect attempt 4 [ 68.829188] nvme nvme1: Property Set error: 880, offset 0x14 [ 69.634482] rdma_rxe: unloaded ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key 2024-03-06 8:44 ` Shinichiro Kawasaki @ 2024-03-06 9:36 ` Daniel Wagner 2024-03-06 11:28 ` Shinichiro Kawasaki 0 siblings, 1 reply; 10+ messages in thread From: Daniel Wagner @ 2024-03-06 9:36 UTC (permalink / raw) To: Shinichiro Kawasaki Cc: Chaitanya Kulkarni, Keith Busch, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org On Wed, Mar 06, 2024 at 08:44:48AM +0000, Shinichiro Kawasaki wrote: > > sudo ./check nvme/045 > > > nvme/045 (Test re-authentication) [failed] > > > runtime 8.069s ... 7.639s > > > --- tests/nvme/045.out 2024-03-05 18:09:07.267668493 +0900 > > > +++ /home/shin/Blktests/blktests/results/nodev/nvme/045.out.bad 2024-03-05 18:10:07.735494384 +0900 > > > @@ -9,5 +9,6 @@ > > > Change hash to hmac(sha512) > > > Re-authenticate with changed hash > > > Renew host key on the controller and force reconnect > > > -disconnected 0 controller(s) > > > +controller "nvme1" not deleted within 5 seconds > > > +disconnected 1 controller(s) > > > Test complete > > > > That means the host either successfully reconnected or never > > disconnected. We have another test case just for the disconnect test > > (number of queue changes), so if this test passes, it must be the > > former... Shouldn't really happen, this would mean the auth code has bug. > > The test case nvme/048 passes, so this looks a bug. I'll try to recreate it. > > If you have these patches applied, the test should pass. But we might > > have still some more stuff to unify between the transports. The nvme/045 > > test passes in my setup. Though I have seen runs which were hang for > > some reason. Haven't figured out yet what's happening there. But I > > haven't seen failures. > > Still with the fix of the double-free, I observe the nvme/045 failure for rdma, > tcp and fc transports. I wonder where the difference between your system and > mine comes from. > > FYI, here I share the kernel messages for rdma transport. 
It shows that > nvme_rdma_reconnect_or_remove() was called repeatedly and it tried to reconnect. > The status argument is -111 or 880, so I think the recon flag is always true > and no effect. I'm interested in the status values in your environment. Do you have these patches applied: https://lore.kernel.org/linux-nvme/20240305080005.3638-1-dwagner@suse.de/ ? > [ 59.117607] run blktests nvme/045 at 2024-03-06 17:05:55 > [ 59.198629] (null): rxe_set_mtu: Set mtu to 1024 > [ 59.211185] PCLMULQDQ-NI instructions are not detected. > [ 59.362952] infiniband ens3_rxe: set active > [ 59.363765] infiniband ens3_rxe: added ens3 > [ 59.540499] nvmet: adding nsid 1 to subsystem blktests-subsystem-1 > [ 59.560541] nvmet_rdma: enabling port 0 (10.0.2.15:4420) > [ 59.688866] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP. > [ 59.701114] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048 > [ 59.702195] nvme nvme1: qid 0: controller authenticated > [ 59.703310] nvme nvme1: qid 0: authenticated > [ 59.707478] nvme nvme1: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices. > [ 59.709883] nvme nvme1: creating 4 I/O queues. > [ 59.745087] nvme nvme1: mapped 4/0/0 default/read/poll queues. 
> [ 59.786869] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 10.0.2.15:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
> [ 59.999761] nvme nvme1: re-authenticating controller
> [ 60.010902] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
> [ 60.011640] nvme nvme1: qid 0: controller authenticated
> [ 60.025652] nvme nvme1: re-authenticating controller
> [ 60.035349] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
> [ 60.036375] nvme nvme1: qid 0: controller authenticated
> [ 60.050449] nvme nvme1: re-authenticating controller
> [ 60.060757] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
> [ 60.061460] nvme nvme1: qid 0: controller authenticated
> [ 62.662430] nvme nvme1: re-authenticating controller
> [ 62.859510] nvme nvme1: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
> [ 62.860502] nvme nvme1: qid 0: controller authenticated
> [ 63.029182] nvme nvme1: re-authenticating controller
> [ 63.192844] nvme nvme1: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
> [ 63.193900] nvme nvme1: qid 0: controller authenticated
> [ 63.608561] nvme nvme1: starting error recovery
> [ 63.653699] nvme nvme1: Reconnecting in 1 seconds...
> [ 64.712627] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
> [ 64.868896] nvmet: ctrl 1 qid 0 host response mismatch
> [ 64.870065] nvmet: ctrl 1 qid 0 failure1 (1)
> [ 64.871152] nvmet: ctrl 1 fatal error occurred!
> [ 64.871519] nvme nvme1: qid 0: authentication failed
> [ 64.874330] nvme nvme1: failed to connect queue: 0 ret=-111
> [ 64.878612] nvme nvme1: Failed reconnect attempt 1
> [ 64.880472] nvme nvme1: Reconnecting in 1 seconds...

This looks like the DNR bit is not considered in the reconnect_or_delete
function.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key
  2024-03-06  9:36 ` Daniel Wagner
@ 2024-03-06 11:28 ` Shinichiro Kawasaki
  0 siblings, 0 replies; 10+ messages in thread
From: Shinichiro Kawasaki @ 2024-03-06 11:28 UTC (permalink / raw)
To: Daniel Wagner
Cc: Chaitanya Kulkarni, Keith Busch, linux-block@vger.kernel.org,
    linux-nvme@lists.infradead.org

On Mar 06, 2024 / 10:36, Daniel Wagner wrote:
> On Wed, Mar 06, 2024 at 08:44:48AM +0000, Shinichiro Kawasaki wrote:
> > > sudo ./check nvme/045
> > > > nvme/045 (Test re-authentication)            [failed]
> > > >     runtime  8.069s  ...  7.639s
> > > >     --- tests/nvme/045.out 2024-03-05 18:09:07.267668493 +0900
> > > >     +++ /home/shin/Blktests/blktests/results/nodev/nvme/045.out.bad 2024-03-05 18:10:07.735494384 +0900
> > > >     @@ -9,5 +9,6 @@
> > > >      Change hash to hmac(sha512)
> > > >      Re-authenticate with changed hash
> > > >      Renew host key on the controller and force reconnect
> > > >     -disconnected 0 controller(s)
> > > >     +controller "nvme1" not deleted within 5 seconds
> > > >     +disconnected 1 controller(s)
> > > >      Test complete
> > >
> > > That means the host either successfully reconnected or never
> > > disconnected. We have another test case just for the disconnect test
> > > (number of queue changes), so if this test passes, it must be the
> > > former... That shouldn't really happen; it would mean the auth code
> > > has a bug.
> >
> > The test case nvme/048 passes, so this looks like a bug.
>
> I'll try to recreate it.
>
> > > If you have these patches applied, the test should pass. But we
> > > might still have some more stuff to unify between the transports.
> > > The nvme/045 test passes in my setup, though I have seen runs which
> > > hung for some reason. I haven't figured out yet what's happening
> > > there, but I haven't seen failures.
> >
> > Still with the fix of the double-free, I observe the nvme/045 failure
> > for rdma, tcp and fc transports.
> > I wonder where the difference between your system and mine comes from.
> >
> > FYI, here I share the kernel messages for rdma transport. It shows
> > that nvme_rdma_reconnect_or_remove() was called repeatedly and it
> > tried to reconnect. The status argument is -111 or 880, so I think
> > the recon flag is always true and has no effect. I'm interested in
> > the status values in your environment.
>
> Do you have these patches applied:
>
> https://lore.kernel.org/linux-nvme/20240305080005.3638-1-dwagner@suse.de/
>
> ?

Yes. To be precise, I applied the v2 patches, but I believe they have the
same logic as v3. To be sure, I replaced the v2 patches with v3, then saw
the same failure.

> > [ 59.117607] run blktests nvme/045 at 2024-03-06 17:05:55
> > [ 59.198629] (null): rxe_set_mtu: Set mtu to 1024
> > [ 59.211185] PCLMULQDQ-NI instructions are not detected.
> > [ 59.362952] infiniband ens3_rxe: set active
> > [ 59.363765] infiniband ens3_rxe: added ens3
> > [ 59.540499] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> > [ 59.560541] nvmet_rdma: enabling port 0 (10.0.2.15:4420)
> > [ 59.688866] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
> > [ 59.701114] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
> > [ 59.702195] nvme nvme1: qid 0: controller authenticated
> > [ 59.703310] nvme nvme1: qid 0: authenticated
> > [ 59.707478] nvme nvme1: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.
> > [ 59.709883] nvme nvme1: creating 4 I/O queues.
> > [ 59.745087] nvme nvme1: mapped 4/0/0 default/read/poll queues.
> > [ 59.786869] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 10.0.2.15:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
> > [ 59.999761] nvme nvme1: re-authenticating controller
> > [ 60.010902] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
> > [ 60.011640] nvme nvme1: qid 0: controller authenticated
> > [ 60.025652] nvme nvme1: re-authenticating controller
> > [ 60.035349] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
> > [ 60.036375] nvme nvme1: qid 0: controller authenticated
> > [ 60.050449] nvme nvme1: re-authenticating controller
> > [ 60.060757] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
> > [ 60.061460] nvme nvme1: qid 0: controller authenticated
> > [ 62.662430] nvme nvme1: re-authenticating controller
> > [ 62.859510] nvme nvme1: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
> > [ 62.860502] nvme nvme1: qid 0: controller authenticated
> > [ 63.029182] nvme nvme1: re-authenticating controller
> > [ 63.192844] nvme nvme1: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
> > [ 63.193900] nvme nvme1: qid 0: controller authenticated
> > [ 63.608561] nvme nvme1: starting error recovery
> > [ 63.653699] nvme nvme1: Reconnecting in 1 seconds...
> > [ 64.712627] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
> > [ 64.868896] nvmet: ctrl 1 qid 0 host response mismatch
> > [ 64.870065] nvmet: ctrl 1 qid 0 failure1 (1)
> > [ 64.871152] nvmet: ctrl 1 fatal error occurred!
> > [ 64.871519] nvme nvme1: qid 0: authentication failed
> > [ 64.874330] nvme nvme1: failed to connect queue: 0 ret=-111
> > [ 64.878612] nvme nvme1: Failed reconnect attempt 1
> > [ 64.880472] nvme nvme1: Reconnecting in 1 seconds...
>
> This looks like the DNR bit is not considered in the reconnect_or_delete
> function.
Agreed. Let me share the kernel messages with the added printk() below.
We can see recon is true at every nvme_rdma_reconnect_or_remove() call.

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 507cbd6a3312..bfb23b9d5d82 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -988,6 +988,7 @@ static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl,
 	enum nvme_ctrl_state state = nvme_ctrl_state(&ctrl->ctrl);
 	bool recon = status > 0 && (status & NVME_SC_DNR) ? false : true;
 
+	printk("%s: status=%d recon=%d\n", __func__, status, recon);
 	/* If we are resetting/deleting then do nothing */
 	if (state != NVME_CTRL_CONNECTING) {
 		WARN_ON_ONCE(state == NVME_CTRL_NEW || state == NVME_CTRL_LIVE);

[ 139.201909] run blktests nvme/045 at 2024-03-06 20:22:44
[ 139.277895] (null): rxe_set_mtu: Set mtu to 1024
[ 139.291307] PCLMULQDQ-NI instructions are not detected.
[ 139.451087] infiniband ens3_rxe: set active
[ 139.451856] infiniband ens3_rxe: added ens3
[ 139.627815] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 139.647899] nvmet_rdma: enabling port 0 (10.0.2.15:4420)
[ 139.777796] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 139.789952] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 139.791053] nvme nvme1: qid 0: controller authenticated
[ 139.792039] nvme nvme1: qid 0: authenticated
[ 139.796131] nvme nvme1: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.
[ 139.798258] nvme nvme1: creating 4 I/O queues.
[ 139.832707] nvme nvme1: mapped 4/0/0 default/read/poll queues.
[ 139.874555] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 10.0.2.15:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 140.084928] nvme nvme1: re-authenticating controller
[ 140.095934] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 140.096705] nvme nvme1: qid 0: controller authenticated
[ 140.111696] nvme nvme1: re-authenticating controller
[ 140.121209] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 140.121885] nvme nvme1: qid 0: controller authenticated
[ 140.138435] nvme nvme1: re-authenticating controller
[ 140.148513] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 140.149344] nvme nvme1: qid 0: controller authenticated
[ 142.748523] nvme nvme1: re-authenticating controller
[ 142.944154] nvme nvme1: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
[ 142.945023] nvme nvme1: qid 0: controller authenticated
[ 143.127665] nvme nvme1: re-authenticating controller
[ 143.293304] nvme nvme1: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
[ 143.294204] nvme nvme1: qid 0: controller authenticated
[ 143.732511] nvme nvme1: starting error recovery
[ 143.785440] nvme_rdma_reconnect_or_remove: status=-107 recon=1
[ 143.786448] nvme nvme1: Reconnecting in 1 seconds...
[ 144.834299] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 144.990883] nvmet: ctrl 1 qid 0 host response mismatch
[ 144.991937] nvmet: ctrl 1 qid 0 failure1 (1)
[ 144.992655] nvmet: ctrl 1 fatal error occurred!
[ 144.992953] nvme nvme1: qid 0: authentication failed
[ 144.996695] nvme nvme1: failed to connect queue: 0 ret=-111
[ 145.000090] nvme nvme1: Failed reconnect attempt 1
[ 145.001717] nvme_rdma_reconnect_or_remove: status=-111 recon=1
[ 145.003218] nvme nvme1: Reconnecting in 1 seconds...
[ 146.116430] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 146.272948] nvmet: ctrl 1 qid 0 host response mismatch
[ 146.274869] nvmet: ctrl 1 qid 0 failure1 (1)
[ 146.276687] nvmet: ctrl 1 fatal error occurred!
[ 146.276891] nvme nvme1: qid 0: authentication failed
[ 146.280591] nvme nvme1: failed to connect queue: 0 ret=-111
[ 146.283783] nvme nvme1: Failed reconnect attempt 2
[ 146.285339] nvme_rdma_reconnect_or_remove: status=-111 recon=1
[ 146.287206] nvme nvme1: Reconnecting in 1 seconds...
[ 147.395546] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 147.563730] nvmet: ctrl 1 qid 0 host response mismatch
[ 147.565628] nvmet: ctrl 1 qid 0 failure1 (1)
[ 147.567256] nvmet: ctrl 1 fatal error occurred!
[ 147.567411] nvme nvme1: qid 0: authentication failed
[ 147.571022] nvme nvme1: failed to connect queue: 0 ret=-111
[ 147.574003] nvme nvme1: Failed reconnect attempt 3
[ 147.575463] nvme_rdma_reconnect_or_remove: status=-111 recon=1
[ 147.576976] nvme nvme1: Reconnecting in 1 seconds...
[ 148.676722] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 148.842117] nvmet: ctrl 1 qid 0 host response mismatch
[ 148.844543] nvmet: ctrl 1 qid 0 failure1 (1)
[ 148.846369] nvmet: ctrl 1 fatal error occurred!
[ 148.846478] nvme nvme1: qid 0: authentication failed
[ 148.850217] nvme nvme1: failed to connect queue: 0 ret=-111
[ 148.853850] nvme nvme1: Failed reconnect attempt 4
[ 148.855425] nvme_rdma_reconnect_or_remove: status=-111 recon=1
[ 148.857496] nvme nvme1: Reconnecting in 1 seconds...
[ 149.873720] nvme nvme1: Identify namespace failed (880)
[ 149.877168] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 149.927672] nvme nvme1: Property Set error: 880, offset 0x14
[ 150.754847] rdma_rxe: unloaded

^ permalink raw reply related	[flat|nested] 10+ messages in thread
end of thread, other threads: [~2024-03-06 11:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2024-03-04 16:13 [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key Daniel Wagner
2024-03-04 16:13 ` [PATCH blktests v1 1/2] nvme/rc: add reconnect-delay argument only for fabrics transports Daniel Wagner
2024-03-04 16:13 ` [PATCH blktests v1 2/2] nvme/048: add reconnect after ctrl key change Daniel Wagner
2024-03-05  9:38   ` Shinichiro Kawasaki
2024-03-05 11:08     ` Daniel Wagner
2024-03-05  9:44 ` [PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key Shinichiro Kawasaki
2024-03-05 11:18   ` Daniel Wagner
2024-03-06  8:44     ` Shinichiro Kawasaki
2024-03-06  9:36       ` Daniel Wagner
2024-03-06 11:28         ` Shinichiro Kawasaki