[PATCHv3] nvme: authentication error are always non-retryable

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCHv3] nvme: authentication error are always non-retryable
@ 2024-02-26 14:30 Hannes Reinecke
  2024-02-27  7:32 ` Daniel Wagner
  0 siblings, 1 reply; 6+ messages in thread
From: Hannes Reinecke @ 2024-02-26 14:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

From: Hannes Reinecke <hare@suse.de>

Any authentication errors which are generated internally are always
non-retryable, so set the DNR bit to ensure they are not retried.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/core.c    |  6 +++---
 drivers/nvme/host/fabrics.c | 29 +++++++++++++++++------------
 drivers/nvme/host/nvme.h    |  2 +-
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5e9a0cf43636..0a8a595070c0 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -374,14 +374,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	if (likely(nvme_req(req)->status == 0))
 		return COMPLETE;
 
-	if ((nvme_req(req)->status & 0x7ff) == NVME_SC_AUTH_REQUIRED)
-		return AUTHENTICATE;
-
 	if (blk_noretry_request(req) ||
 	    (nvme_req(req)->status & NVME_SC_DNR) ||
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;
 
+	if ((nvme_req(req)->status & 0x7ff) == NVME_SC_AUTH_REQUIRED)
+		return AUTHENTICATE;
+
 	if (req->cmd_flags & REQ_NVME_MPATH) {
 		if (nvme_is_path_error(nvme_req(req)->status) ||
 		    blk_queue_dying(req->q))
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index ab5ac219b70a..32e993781236 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -467,7 +467,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 		if (result & NVME_CONNECT_AUTHREQ_ASCR) {
 			dev_warn(ctrl->device,
 				 "qid 0: secure concatenation is not supported\n");
-			ret = NVME_SC_AUTH_REQUIRED;
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 			goto out_free_data;
 		}
 		/* Authentication required */
@@ -475,14 +475,16 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 		if (ret) {
 			dev_warn(ctrl->device,
 				 "qid 0: authentication setup failed\n");
-			ret = NVME_SC_AUTH_REQUIRED;
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 			goto out_free_data;
 		}
 		ret = nvme_auth_wait(ctrl, 0);
-		if (ret)
+		if (ret) {
 			dev_warn(ctrl->device,
-				 "qid 0: authentication failed\n");
-		else
+				 "qid 0: authentication failed, error %d\n",
+				 ret);
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
+		} else
 			dev_info(ctrl->device,
 				 "qid 0: authenticated\n");
 	}
@@ -541,7 +543,7 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
 		if (result & NVME_CONNECT_AUTHREQ_ASCR) {
 			dev_warn(ctrl->device,
 				 "qid 0: secure concatenation is not supported\n");
-			ret = NVME_SC_AUTH_REQUIRED;
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 			goto out_free_data;
 		}
 		/* Authentication required */
@@ -549,12 +551,15 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
 		if (ret) {
 			dev_warn(ctrl->device,
 				 "qid %d: authentication setup failed\n", qid);
-			ret = NVME_SC_AUTH_REQUIRED;
-		} else {
-			ret = nvme_auth_wait(ctrl, qid);
-			if (ret)
-				dev_warn(ctrl->device,
-					 "qid %u: authentication failed\n", qid);
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
+			goto out_free_data;
+		}
+		ret = nvme_auth_wait(ctrl, qid);
+		if (ret) {
+			dev_warn(ctrl->device,
+				 "qid %u: authentication failed, error %d\n",
+				 qid, ret);
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 		}
 	}
 out_free_data:
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 6ed2ca6b35e4..c08c220a68cf 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1123,7 +1123,7 @@ static inline int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 }
 static inline int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid)
 {
-	return NVME_SC_AUTH_REQUIRED;
+	return NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 }
 static inline void nvme_auth_free(struct nvme_ctrl *ctrl) {};
 #endif
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCHv3] nvme: authentication error are always non-retryable
  2024-02-26 14:30 [PATCHv3] nvme: authentication error are always non-retryable Hannes Reinecke
@ 2024-02-27  7:32 ` Daniel Wagner
  2024-02-27  7:54   ` Hannes Reinecke
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Wagner @ 2024-02-27  7:32 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
	Hannes Reinecke

On Mon, Feb 26, 2024 at 03:30:13PM +0100, Hannes Reinecke wrote:
> From: Hannes Reinecke <hare@suse.de>
> 
> Any authentication errors which are generated internally are always
> non-retryable, so set the DNR bit to ensure they are not retried.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>

This replaces my hacky version 'nvme-fc: do not retry when auth fails or
connection is refused'

Tested-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>

But with this patch at least two UAF are uncovered. I've already
identified the first one (see comments on the v2 of this patch). The
second one is gets triggered by loop on nvme/045

[47923.100856] [10844] nvmet: nvmet_execute_auth_send: ctrl 1 qid 0 type 0 id 0 step 4                                08:04:39 [63/9375]
[47923.102798] [10844] nvmet: nvmet_execute_auth_send: ctrl 1 qid 0 reset negotiation
[47923.104447] [10844] nvmet: check nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[47923.106278] [10844] nvmet: nvmet_setup_dhgroup: ctrl 1 selecting dhgroup 1
[47923.107896] [10844] nvmet: nvmet_setup_dhgroup: ctrl 1 reuse existing DH group 1
[47923.109624] [10844] nvmet: Re-use existing hash ID 1
[47923.115167] ==================================================================
[47923.117175] BUG: KASAN: slab-use-after-free in base64_decode+0x10e/0x170
[47923.117280] Read of size 1 at addr ffff88810a1d360a by task kworker/2:14/10844

[47923.119954] CPU: 2 PID: 10844 Comm: kworker/2:14 Tainted: G        W    L     6.8.0-rc3+ #39 3d0b6128d1ea3c6026a2c1de70ba6c7dc10623c3
[47923.119954] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
[47923.123853] Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop]
[47923.123853] Call Trace:
[47923.123853]  <TASK>
[47923.123853]  dump_stack_lvl+0x5b/0x80
[47923.127843]  print_report+0x163/0x800
[47923.127843]  ? __virt_addr_valid+0x2f3/0x340
[47923.127843]  ? base64_decode+0x10e/0x170
[47923.131370]  kasan_report+0xd0/0x110
[47923.131370]  ? base64_decode+0x10e/0x170
[47923.131370]  base64_decode+0x10e/0x170
[47923.131370]  nvme_auth_extract_key+0xbd/0x290 [nvme_auth c9862b6e632ff3757fc5af136ad323c2fd7ec3cc]
[47923.131370]  nvmet_setup_auth+0x3e6/0x980 [nvmet 5699c49016b7caa62e59f3ad4cb1d5fc35e0accf]
[47923.135839]  nvmet_execute_auth_send+0x5f6/0x1890 [nvmet 5699c49016b7caa62e59f3ad4cb1d5fc35e0accf]
[47923.135839]  ? process_scheduled_works+0x6d4/0xf80
[47923.138829]  process_scheduled_works+0x774/0xf80
[47923.140513]  worker_thread+0x8c4/0xfc0
[47923.140513]  ? __kthread_parkme+0x84/0x120
[47923.145904]  kthread+0x25d/0x2e0
[47923.145904]  ? __cfi_worker_thread+0x10/0x10
[47923.145904]  ? __cfi_kthread+0x10/0x10
[47923.145904]  ret_from_fork+0x41/0x70
[47923.145904]  ? __cfi_kthread+0x10/0x10
[47923.145904]  ret_from_fork_asm+0x1b/0x30
[47923.151823]  </TASK>

[47923.153891] Allocated by task 14645 on cpu 2 at 47922.532579s:
[47923.153891]  kasan_save_track+0x2c/0x90
[47923.153891]  __kasan_kmalloc+0x89/0xa0
[47923.153891]  __kmalloc_node_track_caller+0x23d/0x4e0
[47923.153891]  kstrdup+0x34/0x60
[47923.153891]  nvmet_auth_set_key+0xa7/0x2a0 [nvmet]
[47923.159060]  nvmet_host_dhchap_key_store+0x10/0x20 [nvmet]
[47923.159848]  configfs_write_iter+0x2ea/0x3c0
[47923.159848]  vfs_write+0x80c/0xb60
[47923.159848]  ksys_write+0xd7/0x1a0
[47923.162639]  do_syscall_64+0xb1/0x180
[47923.162639]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

[47923.164392] Freed by task 14645 on cpu 1 at 47923.114374s:
[47923.164392]  kasan_save_track+0x2c/0x90
[47923.164392]  kasan_save_free_info+0x4a/0x60
[47923.164392]  poison_slab_object+0x108/0x180
[47923.164392]  __kasan_slab_free+0x33/0x80
[47923.164392]  kfree+0x119/0x310
[47923.164392]  nvmet_auth_set_key+0x175/0x2a0 [nvmet]
[47923.164392]  nvmet_host_dhchap_key_store+0x10/0x20 [nvmet]
[47923.164392]  configfs_write_iter+0x2ea/0x3c0
[47923.164392]  vfs_write+0x80c/0xb60
[47923.164392]  ksys_write+0xd7/0x1a0
[47923.164392]  do_syscall_64+0xb1/0x180
[47923.164392]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

[47923.176385] The buggy address belongs to the object at ffff88810a1d3600
                which belongs to the cache kmalloc-64 of size 64
[47923.178122] The buggy address is located 10 bytes inside of
                freed 64-byte region [ffff88810a1d3600, ffff88810a1d3640)

[47923.178122] The buggy address belongs to the physical page:
[47923.178122] page:00000000651bcfd3 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10a1d3
[47923.178122] ksm flags: 0x17ffffc0000800(slab|node=0|zone=2|lastcpupid=0x1fffff)
[47923.184115] page_type: 0xffffffff()
[47923.184115] raw: 0017ffffc0000800 ffff888100042780 ffffea00044e7dc0 dead000000000003
[47923.184115] raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
[47923.187818] page dumped because: kasan: bad access detected

[47923.187818] Memory state around the buggy address:
[47923.187818]  ffff88810a1d3500: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
[47923.187818]  ffff88810a1d3580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[47923.192381] >ffff88810a1d3600: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[47923.192381]                       ^
[47923.192381]  ffff88810a1d3680: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
[47923.192381]  ffff88810a1d3700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[47923.192381] ==================================================================


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCHv3] nvme: authentication error are always non-retryable
  2024-02-27  7:32 ` Daniel Wagner
@ 2024-02-27  7:54   ` Hannes Reinecke
  2024-02-27  8:49     ` Daniel Wagner
  0 siblings, 1 reply; 6+ messages in thread
From: Hannes Reinecke @ 2024-02-27  7:54 UTC (permalink / raw)
  To: Daniel Wagner, Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme

On 2/27/24 08:32, Daniel Wagner wrote:
> On Mon, Feb 26, 2024 at 03:30:13PM +0100, Hannes Reinecke wrote:
>> From: Hannes Reinecke <hare@suse.de>
>>
>> Any authentication errors which are generated internally are always
>> non-retryable, so set the DNR bit to ensure they are not retried.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
> 
> This replaces my hacky version 'nvme-fc: do not retry when auth fails or
> connection is refused'
> 
> Tested-by: Daniel Wagner <dwagner@suse.de>
> Reviewed-by: Daniel Wagner <dwagner@suse.de>
> 
> But with this patch at least two UAF are uncovered. I've already
> identified the first one (see comments on the v2 of this patch). The
> second one is gets triggered by loop on nvme/045
> 
> [47923.100856] [10844] nvmet: nvmet_execute_auth_send: ctrl 1 qid 0 type 0 id 0 step 4                                08:04:39 [63/9375]
> [47923.102798] [10844] nvmet: nvmet_execute_auth_send: ctrl 1 qid 0 reset negotiation
> [47923.104447] [10844] nvmet: check nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
> [47923.106278] [10844] nvmet: nvmet_setup_dhgroup: ctrl 1 selecting dhgroup 1
> [47923.107896] [10844] nvmet: nvmet_setup_dhgroup: ctrl 1 reuse existing DH group 1
> [47923.109624] [10844] nvmet: Re-use existing hash ID 1
> [47923.115167] ==================================================================
> [47923.117175] BUG: KASAN: slab-use-after-free in base64_decode+0x10e/0x170
> [47923.117280] Read of size 1 at addr ffff88810a1d360a by task kworker/2:14/10844
> 
> [47923.119954] CPU: 2 PID: 10844 Comm: kworker/2:14 Tainted: G        W    L     6.8.0-rc3+ #39 3d0b6128d1ea3c6026a2c1de70ba6c7dc10623c3
> [47923.119954] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
> [47923.123853] Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop]
> [47923.123853] Call Trace:
> [47923.123853]  <TASK>
> [47923.123853]  dump_stack_lvl+0x5b/0x80
> [47923.127843]  print_report+0x163/0x800
> [47923.127843]  ? __virt_addr_valid+0x2f3/0x340
> [47923.127843]  ? base64_decode+0x10e/0x170
> [47923.131370]  kasan_report+0xd0/0x110
> [47923.131370]  ? base64_decode+0x10e/0x170
> [47923.131370]  base64_decode+0x10e/0x170
> [47923.131370]  nvme_auth_extract_key+0xbd/0x290 [nvme_auth c9862b6e632ff3757fc5af136ad323c2fd7ec3cc]
> [47923.131370]  nvmet_setup_auth+0x3e6/0x980 [nvmet 5699c49016b7caa62e59f3ad4cb1d5fc35e0accf]
> [47923.135839]  nvmet_execute_auth_send+0x5f6/0x1890 [nvmet 5699c49016b7caa62e59f3ad4cb1d5fc35e0accf]
> [47923.135839]  ? process_scheduled_works+0x6d4/0xf80
> [47923.138829]  process_scheduled_works+0x774/0xf80
> [47923.140513]  worker_thread+0x8c4/0xfc0
> [47923.140513]  ? __kthread_parkme+0x84/0x120
> [47923.145904]  kthread+0x25d/0x2e0
> [47923.145904]  ? __cfi_worker_thread+0x10/0x10
> [47923.145904]  ? __cfi_kthread+0x10/0x10
> [47923.145904]  ret_from_fork+0x41/0x70
> [47923.145904]  ? __cfi_kthread+0x10/0x10
> [47923.145904]  ret_from_fork_asm+0x1b/0x30
> [47923.151823]  </TASK>
> 
> [47923.153891] Allocated by task 14645 on cpu 2 at 47922.532579s:
> [47923.153891]  kasan_save_track+0x2c/0x90
> [47923.153891]  __kasan_kmalloc+0x89/0xa0
> [47923.153891]  __kmalloc_node_track_caller+0x23d/0x4e0
> [47923.153891]  kstrdup+0x34/0x60
> [47923.153891]  nvmet_auth_set_key+0xa7/0x2a0 [nvmet]
> [47923.159060]  nvmet_host_dhchap_key_store+0x10/0x20 [nvmet]
> [47923.159848]  configfs_write_iter+0x2ea/0x3c0
> [47923.159848]  vfs_write+0x80c/0xb60
> [47923.159848]  ksys_write+0xd7/0x1a0
> [47923.162639]  do_syscall_64+0xb1/0x180
> [47923.162639]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> 
> [47923.164392] Freed by task 14645 on cpu 1 at 47923.114374s:
> [47923.164392]  kasan_save_track+0x2c/0x90
> [47923.164392]  kasan_save_free_info+0x4a/0x60
> [47923.164392]  poison_slab_object+0x108/0x180
> [47923.164392]  __kasan_slab_free+0x33/0x80
> [47923.164392]  kfree+0x119/0x310
> [47923.164392]  nvmet_auth_set_key+0x175/0x2a0 [nvmet]
> [47923.164392]  nvmet_host_dhchap_key_store+0x10/0x20 [nvmet]
> [47923.164392]  configfs_write_iter+0x2ea/0x3c0
> [47923.164392]  vfs_write+0x80c/0xb60
> [47923.164392]  ksys_write+0xd7/0x1a0
> [47923.164392]  do_syscall_64+0xb1/0x180
> [47923.164392]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> 
> [47923.176385] The buggy address belongs to the object at ffff88810a1d3600
>                  which belongs to the cache kmalloc-64 of size 64
> [47923.178122] The buggy address is located 10 bytes inside of
>                  freed 64-byte region [ffff88810a1d3600, ffff88810a1d3640)
> 
> [47923.178122] The buggy address belongs to the physical page:
> [47923.178122] page:00000000651bcfd3 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10a1d3
> [47923.178122] ksm flags: 0x17ffffc0000800(slab|node=0|zone=2|lastcpupid=0x1fffff)
> [47923.184115] page_type: 0xffffffff()
> [47923.184115] raw: 0017ffffc0000800 ffff888100042780 ffffea00044e7dc0 dead000000000003
> [47923.184115] raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
> [47923.187818] page dumped because: kasan: bad access detected
> 
> [47923.187818] Memory state around the buggy address:
> [47923.187818]  ffff88810a1d3500: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
> [47923.187818]  ffff88810a1d3580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [47923.192381] >ffff88810a1d3600: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> [47923.192381]                       ^
> [47923.192381]  ffff88810a1d3680: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
> [47923.192381]  ffff88810a1d3700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [47923.192381] ==================================================================

Ouch.

Does this help?

diff --git a/drivers/nvme/target/auth.c b/drivers/nvme/target/auth.c
index 3ddbc3880cac..9afc28f1ffac 100644
--- a/drivers/nvme/target/auth.c
+++ b/drivers/nvme/target/auth.c
@@ -44,6 +44,7 @@ int nvmet_auth_set_key(struct nvmet_host *host, const 
char *secret,
         dhchap_secret = kstrdup(secret, GFP_KERNEL);
         if (!dhchap_secret)
                 return -ENOMEM;
+       down_write(&nvmet_config_sem);
         if (set_ctrl) {
                 kfree(host->dhchap_ctrl_secret);
                 host->dhchap_ctrl_secret = strim(dhchap_secret);
@@ -53,6 +54,7 @@ int nvmet_auth_set_key(struct nvmet_host *host, const 
char *secret,
                 host->dhchap_secret = strim(dhchap_secret);
                 host->dhchap_key_hash = key_hash;
         }
+       up_write(&nvmet_config_sem);
         return 0;
  }

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCHv3] nvme: authentication error are always non-retryable
  2024-02-27  7:54   ` Hannes Reinecke
@ 2024-02-27  8:49     ` Daniel Wagner
  2024-02-27 10:09       ` Daniel Wagner
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Wagner @ 2024-02-27  8:49 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	linux-nvme

On Tue, Feb 27, 2024 at 08:54:32AM +0100, Hannes Reinecke wrote:
> Ouch.
> 
> Does this help?

Yes, this helps. I am running the whole test suite for all transport a
few times just a few times now. Just to make sure.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCHv3] nvme: authentication error are always non-retryable
  2024-02-27  8:49     ` Daniel Wagner
@ 2024-02-27 10:09       ` Daniel Wagner
  2024-02-27 11:04         ` Hannes Reinecke
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Wagner @ 2024-02-27 10:09 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	linux-nvme

On Tue, Feb 27, 2024 at 09:51:17AM +0100, Daniel Wagner wrote:
> On Tue, Feb 27, 2024 at 08:54:32AM +0100, Hannes Reinecke wrote:
> > Ouch.
> > 
> > Does this help?
> 
> Yes, this helps. I am running the whole test suite for all transport a
> few times just a few times now. Just to make sure.

- rmda nvme/031

[ 2454.989813] kworker/0:35/22379 is trying to acquire lock:
[ 2454.989813] ffff88810a68dd48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: __flush_work+0x72/0x7d0
[ 2454.989813]
               but task is already holding lock:
[ 2454.989813] ffff88810a68dd48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_scheduled_works+0x6d4/0xf80
[ 2455.001905]
               other info that might help us debug this:
[ 2455.001905]  Possible unsafe locking scenario:

[ 2455.005799]        CPU0
[ 2455.005799]        ----
[ 2455.005799]   lock((wq_completion)nvmet-wq);
[ 2455.005799]   lock((wq_completion)nvmet-wq);
[ 2455.005799]
                *** DEADLOCK ***

[ 2455.005799]  May be due to missing lock nesting notation

[ 2455.005799] 3 locks held by kworker/0:35/22379:
[ 2455.005799]  #0: ffff88810a68dd48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_scheduled_works+0x6d4/0xf80
[ 2455.005799]  #1: ffff88812e997d68 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_scheduled_works+0x6d4/0xf80
[ 2455.021848]  #2: ffffffff920c60c0 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x72/0x7d0
[ 2455.021848]
               stack backtrace:
[ 2455.021848] CPU: 0 PID: 22379 Comm: kworker/0:35 Tainted: G        W          6.8.0-rc3+ #39 3d0b6128d1ea3c6026a2c1de70ba6c7dc10623c3
[ 2455.021848] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
[ 2455.021848] Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma]
[ 2455.021848] Call Trace:
[ 2455.021848]  <TASK>
[ 2455.021848]  dump_stack_lvl+0x5b/0x80
[ 2455.021848]  __lock_acquire+0x65f2/0x7b00
[ 2455.021848]  ? lock_release+0x25c/0xcd0
[ 2455.021848]  lock_acquire+0x11c/0x3d0
[ 2455.021848]  ? __flush_work+0x72/0x7d0
[ 2455.021848]  ? lockdep_hardirqs_on_prepare+0x2b0/0x5f0
[ 2455.021848]  ? __flush_work+0x72/0x7d0
[ 2455.021848]  __flush_work+0x648/0x7d0
[ 2455.021848]  ? __flush_work+0x72/0x7d0
[ 2455.021848]  ? __flush_work+0x72/0x7d0
[ 2455.021848]  ? __cfi_wq_barrier_func+0x10/0x10
[ 2455.021848]  nvmet_ctrl_put+0x3b2/0x640 [nvmet 06f982ecc8920c7359ba51cc41092b5d91a725d5]
[ 2455.021848]  nvmet_sq_destroy+0x2e7/0x350 [nvmet 06f982ecc8920c7359ba51cc41092b5d91a725d5]
[ 2455.021848]  nvmet_rdma_free_queue+0x35/0x590 [nvmet_rdma d9fba27fd955e2c2575b3e184853a6daafaffc5c]
[ 2455.021848]  ? do_raw_spin_unlock+0x116/0x890
[ 2455.021848]  ? process_scheduled_works+0x6d4/0xf80
[ 2455.021848]  nvmet_rdma_release_queue_work+0x43/0xa0 [nvmet_rdma d9fba27fd955e2c2575b3e184853a6daafaffc5c]
[ 2455.021848]  process_scheduled_works+0x774/0xf80
[ 2455.060121]  worker_thread+0x8c4/0xfc0
[ 2455.061903]  ? __kthread_parkme+0x84/0x120
[ 2455.063471]  kthread+0x25d/0x2e0
[ 2455.065681]  ? __cfi_worker_thread+0x10/0x10
[ 2455.065681]  ? __cfi_kthread+0x10/0x10
[ 2455.065681]  ret_from_fork+0x41/0x70
[ 2455.065681]  ? __cfi_kthread+0x10/0x10
[ 2455.065681]  ret_from_fork_asm+0x1b/0x30
[ 2455.065681]  </TASK>
[ 2455.096534] nvmet: adding nsid 1 to subsystem blktests-subsystem-5

This looks familiar. I thought we have addressed it. Maybe I missing the
fix since this is on 6.8-rc3


- loop nvme/045

nvme/045 (Test re-authentication)                            [failed]
    runtime  3.187s  ...  1.344s
    --- tests/nvme/045.out      2023-11-28 12:59:52.711505550 +0100
    +++ /home/wagi/work/blktests/results/nodev/nvme/045.out.bad 2024-02-27 10:20:02.792666798 +0100
    @@ -8,5 +8,5 @@
     Re-authenticate with changed DH group
     Change hash to hmac(sha512)
     Re-authenticate with changed hash
    -disconnected 1 controller(s)
    +disconnected 0 controller(s)

This is not easy to trigger by itself. That means just executing
nvme/045 in a loop does not trigger it, instead it only appears when
running the whole test suite. Anyway, this could just be another
instance of userspace racing with the cleanup code in the kernel.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCHv3] nvme: authentication error are always non-retryable
  2024-02-27 10:09       ` Daniel Wagner
@ 2024-02-27 11:04         ` Hannes Reinecke
  0 siblings, 0 replies; 6+ messages in thread
From: Hannes Reinecke @ 2024-02-27 11:04 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	linux-nvme

On 2/27/24 11:09, Daniel Wagner wrote:
> On Tue, Feb 27, 2024 at 09:51:17AM +0100, Daniel Wagner wrote:
>> On Tue, Feb 27, 2024 at 08:54:32AM +0100, Hannes Reinecke wrote:
>>> Ouch.
>>>
>>> Does this help?
>>
>> Yes, this helps. I am running the whole test suite for all transport a
>> few times just a few times now. Just to make sure.
> 
> - rmda nvme/031
> 
> [ 2454.989813] kworker/0:35/22379 is trying to acquire lock:
> [ 2454.989813] ffff88810a68dd48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: __flush_work+0x72/0x7d0
> [ 2454.989813]
>                 but task is already holding lock:
> [ 2454.989813] ffff88810a68dd48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_scheduled_works+0x6d4/0xf80
> [ 2455.001905]
>                 other info that might help us debug this:
> [ 2455.001905]  Possible unsafe locking scenario:
> 
> [ 2455.005799]        CPU0
> [ 2455.005799]        ----
> [ 2455.005799]   lock((wq_completion)nvmet-wq);
> [ 2455.005799]   lock((wq_completion)nvmet-wq);
> [ 2455.005799]
>                  *** DEADLOCK ***
> 
> [ 2455.005799]  May be due to missing lock nesting notation
> 
> [ 2455.005799] 3 locks held by kworker/0:35/22379:
> [ 2455.005799]  #0: ffff88810a68dd48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_scheduled_works+0x6d4/0xf80
> [ 2455.005799]  #1: ffff88812e997d68 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_scheduled_works+0x6d4/0xf80
> [ 2455.021848]  #2: ffffffff920c60c0 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x72/0x7d0
> [ 2455.021848]
>                 stack backtrace:
> [ 2455.021848] CPU: 0 PID: 22379 Comm: kworker/0:35 Tainted: G        W          6.8.0-rc3+ #39 3d0b6128d1ea3c6026a2c1de70ba6c7dc10623c3
> [ 2455.021848] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
> [ 2455.021848] Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma]
> [ 2455.021848] Call Trace:
> [ 2455.021848]  <TASK>
> [ 2455.021848]  dump_stack_lvl+0x5b/0x80
> [ 2455.021848]  __lock_acquire+0x65f2/0x7b00
> [ 2455.021848]  ? lock_release+0x25c/0xcd0
> [ 2455.021848]  lock_acquire+0x11c/0x3d0
> [ 2455.021848]  ? __flush_work+0x72/0x7d0
> [ 2455.021848]  ? lockdep_hardirqs_on_prepare+0x2b0/0x5f0
> [ 2455.021848]  ? __flush_work+0x72/0x7d0
> [ 2455.021848]  __flush_work+0x648/0x7d0
> [ 2455.021848]  ? __flush_work+0x72/0x7d0
> [ 2455.021848]  ? __flush_work+0x72/0x7d0
> [ 2455.021848]  ? __cfi_wq_barrier_func+0x10/0x10
> [ 2455.021848]  nvmet_ctrl_put+0x3b2/0x640 [nvmet 06f982ecc8920c7359ba51cc41092b5d91a725d5]
> [ 2455.021848]  nvmet_sq_destroy+0x2e7/0x350 [nvmet 06f982ecc8920c7359ba51cc41092b5d91a725d5]
> [ 2455.021848]  nvmet_rdma_free_queue+0x35/0x590 [nvmet_rdma d9fba27fd955e2c2575b3e184853a6daafaffc5c]
> [ 2455.021848]  ? do_raw_spin_unlock+0x116/0x890
> [ 2455.021848]  ? process_scheduled_works+0x6d4/0xf80
> [ 2455.021848]  nvmet_rdma_release_queue_work+0x43/0xa0 [nvmet_rdma d9fba27fd955e2c2575b3e184853a6daafaffc5c]
> [ 2455.021848]  process_scheduled_works+0x774/0xf80
> [ 2455.060121]  worker_thread+0x8c4/0xfc0
> [ 2455.061903]  ? __kthread_parkme+0x84/0x120
> [ 2455.063471]  kthread+0x25d/0x2e0
> [ 2455.065681]  ? __cfi_worker_thread+0x10/0x10
> [ 2455.065681]  ? __cfi_kthread+0x10/0x10
> [ 2455.065681]  ret_from_fork+0x41/0x70
> [ 2455.065681]  ? __cfi_kthread+0x10/0x10
> [ 2455.065681]  ret_from_fork_asm+0x1b/0x30
> [ 2455.065681]  </TASK>
> [ 2455.096534] nvmet: adding nsid 1 to subsystem blktests-subsystem-5
> 
> This looks familiar. I thought we have addressed it. Maybe I missing the
> fix since this is on 6.8-rc3
> 
No, looks like a different one.
Problem seems to be the 'flush_workqueue(nvmet_wq)' in
nvmet_rdma_remove_one().
When calling that we end up with this call chain:

flush_workqueue(nvmet_wq)
   rdma_release_queue_work()
     rdma_free_queue()
       sq_destroy()
         ctrl_put()
           ctrl_free()
             flush_work(async_event_work)

which deadlocks as we're already flushing the workqueue.
What would happen if we do _not_ call flush_workqueue() in
nvmet_rdma_remove_one(), but rather move it into nvmet_rdma_exit()?

Cheers,

Hannes



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-02-27 11:04 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-26 14:30 [PATCHv3] nvme: authentication error are always non-retryable Hannes Reinecke
2024-02-27  7:32 ` Daniel Wagner
2024-02-27  7:54   ` Hannes Reinecke
2024-02-27  8:49     ` Daniel Wagner
2024-02-27 10:09       ` Daniel Wagner
2024-02-27 11:04         ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.