[PATCH V4 0/2] Fix race conditions in the nvme-tcp driver

linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

* [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver
@ 2025-04-04  8:27 Maurizio Lombardi
  2025-04-04  8:28 ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state Maurizio Lombardi
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Maurizio Lombardi @ 2025-04-04  8:27 UTC (permalink / raw)
  To: kbusch; +Cc: hare, sagi, linux-nvme, mlombard, zhang.guanghui, loberman

Patch 1 fixes a race condition when the sockets get closed
while the controller is transitioning from CONNECTING to LIVE state.

Patch 2 fixes a command double completion when a send operation
fails despite the fact that the command has been received by the target.


v4: added patch 2/2

v3: Do not check for -EPIPE, just call nvme_tcp_error_recovery()

v2: commit message: clarify where the error recovery is started


Maurizio Lombardi (2):
  nvme-tcp: Prevent infinite loop if socket closes during CONNECTING
    state
  nvme-tcp: do not fail the request if send fails

 drivers/nvme/host/tcp.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

-- 
2.43.5



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-04  8:27 [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver Maurizio Lombardi
@ 2025-04-04  8:28 ` Maurizio Lombardi
  2025-04-05 23:22   ` Keith Busch
                     ` (2 more replies)
  2025-04-04  8:28 ` [PATCH V4 2/2] nvme-tcp: do not complete the request if send operation fails Maurizio Lombardi
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 18+ messages in thread
From: Maurizio Lombardi @ 2025-04-04  8:28 UTC (permalink / raw)
  To: kbusch; +Cc: hare, sagi, linux-nvme, mlombard, zhang.guanghui, loberman

There is a potential race condition that can occur if
the target closes the socket while the host is in the CONNECTING state.

If the socket's state changes to TCP_CLOSE, the nvme_tcp_state_change()
function is invoked. However, nvme_tcp_error_recovery() is unable
to transition the controller state to NVME_CTRL_RESETTING because
the controller is still in the CONNECTING state. As a result, error
recovery is bypassed, and the controller incorrectly transitions
to the LIVE state with closed sockets.

Subsequent attempts by the host to communicate with the target
will result in an infinite loop.

Fix the bug by initiating the error recovery process to correctly
handle the disconnection in case we missed this event
while transitioning from CONNECTING to LIVE.

Tested-by: Laurence Oberman <loberman@redhat.com>
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/nvme/host/tcp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 26c459f0198d..f1e2b0417b39 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1329,6 +1329,7 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
 			"failed to send request %d\n", ret);
 		nvme_tcp_fail_request(queue->request);
 		nvme_tcp_done_send_req(queue);
+		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
 	}
 out:
 	memalloc_noreclaim_restore(noreclaim_flag);
-- 
2.43.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-04  8:28 ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state Maurizio Lombardi
@ 2025-04-05 23:22   ` Keith Busch
  2025-04-07  6:11   ` Hannes Reinecke
  2025-04-13 22:44   ` Sagi Grimberg
  2 siblings, 0 replies; 18+ messages in thread
From: Keith Busch @ 2025-04-05 23:22 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, hare, sagi, linux-nvme, mlombard, zhang.guanghui,
	loberman

On Fri, Apr 04, 2025 at 10:28:00AM +0200, Maurizio Lombardi wrote:
> There is a potential race condition that can occur if
> the target closes the socket while the host is in the CONNECTING state.
> 
> If the socket's state changes to TCP_CLOSE, the nvme_tcp_state_change()
> function is invoked. However, nvme_tcp_error_recovery() is unable
> to transition the controller state to NVME_CTRL_RESETTING because
> the controller is still in the CONNECTING state. As a result, error
> recovery is bypassed, and the controller incorrectly transitions
> to the LIVE state with closed sockets.
> 
> Subsequent attempts by the host to communicate with the target
> will result in an infinite loop.
> 
> Fix the bug by initiating the error recovery process to correctly
> handle the disconnection in case we missed this event
> while transitioning from CONNECTING to LIVE.

Looks good.

Reviewed-by: Keith Busch <kbusch@kernel.org>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-04  8:28 ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state Maurizio Lombardi
  2025-04-05 23:22   ` Keith Busch
@ 2025-04-07  6:11   ` Hannes Reinecke
  2025-04-13 22:44   ` Sagi Grimberg
  2 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2025-04-07  6:11 UTC (permalink / raw)
  To: Maurizio Lombardi, kbusch
  Cc: hare, sagi, linux-nvme, mlombard, zhang.guanghui, loberman

On 4/4/25 10:28, Maurizio Lombardi wrote:
> There is a potential race condition that can occur if
> the target closes the socket while the host is in the CONNECTING state.
> 
> If the socket's state changes to TCP_CLOSE, the nvme_tcp_state_change()
> function is invoked. However, nvme_tcp_error_recovery() is unable
> to transition the controller state to NVME_CTRL_RESETTING because
> the controller is still in the CONNECTING state. As a result, error
> recovery is bypassed, and the controller incorrectly transitions
> to the LIVE state with closed sockets.
> 
> Subsequent attempts by the host to communicate with the target
> will result in an infinite loop.
> 
> Fix the bug by initiating the error recovery process to correctly
> handle the disconnection in case we missed this event
> while transitioning from CONNECTING to LIVE.
> 
> Tested-by: Laurence Oberman <loberman@redhat.com>
> Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
> ---
>   drivers/nvme/host/tcp.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 26c459f0198d..f1e2b0417b39 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1329,6 +1329,7 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
>   			"failed to send request %d\n", ret);
>   		nvme_tcp_fail_request(queue->request);
>   		nvme_tcp_done_send_req(queue);
> +		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
>   	}
>   out:
>   	memalloc_noreclaim_restore(noreclaim_flag);

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-04  8:28 ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state Maurizio Lombardi
  2025-04-05 23:22   ` Keith Busch
  2025-04-07  6:11   ` Hannes Reinecke
@ 2025-04-13 22:44   ` Sagi Grimberg
  2025-04-14  7:25     ` Maurizio Lombardi
  2 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2025-04-13 22:44 UTC (permalink / raw)
  To: Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, mlombard, zhang.guanghui, loberman



On 04/04/2025 11:28, Maurizio Lombardi wrote:
> There is a potential race condition that can occur if
> the target closes the socket while the host is in the CONNECTING state.
>
> If the socket's state changes to TCP_CLOSE, the nvme_tcp_state_change()
> function is invoked. However, nvme_tcp_error_recovery() is unable
> to transition the controller state to NVME_CTRL_RESETTING because
> the controller is still in the CONNECTING state. As a result, error
> recovery is bypassed, and the controller incorrectly transitions
> to the LIVE state with closed sockets.

I think that the issue is that the controller moves to LIVE state - it 
shouldn't.
However its not clear where this happens.

>
> Subsequent attempts by the host to communicate with the target
> will result in an infinite loop.
>
> Fix the bug by initiating the error recovery process to correctly
> handle the disconnection in case we missed this event
> while transitioning from CONNECTING to LIVE.

The problem is in the initial connect - here there is no error recovery
and we want to propagate the error to the user.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-13 22:44   ` Sagi Grimberg
@ 2025-04-14  7:25     ` Maurizio Lombardi
  2025-04-14 21:35       ` Sagi Grimberg
  0 siblings, 1 reply; 18+ messages in thread
From: Maurizio Lombardi @ 2025-04-14  7:25 UTC (permalink / raw)
  To: Sagi Grimberg, Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, zhang.guanghui, loberman

On Mon Apr 14, 2025 at 12:44 AM CEST, Sagi Grimberg wrote:
>
>
> On 04/04/2025 11:28, Maurizio Lombardi wrote:
>> There is a potential race condition that can occur if
>> the target closes the socket while the host is in the CONNECTING state.
>>
>> If the socket's state changes to TCP_CLOSE, the nvme_tcp_state_change()
>> function is invoked. However, nvme_tcp_error_recovery() is unable
>> to transition the controller state to NVME_CTRL_RESETTING because
>> the controller is still in the CONNECTING state. As a result, error
>> recovery is bypassed, and the controller incorrectly transitions
>> to the LIVE state with closed sockets.
>
> I think that the issue is that the controller moves to LIVE state - it 
> shouldn't.
> However its not clear where this happens.
>
>>
>> Subsequent attempts by the host to communicate with the target
>> will result in an infinite loop.
>>
>> Fix the bug by initiating the error recovery process to correctly
>> handle the disconnection in case we missed this event
>> while transitioning from CONNECTING to LIVE.
>
> The problem is in the initial connect - here there is no error recovery
> and we want to propagate the error to the user.

Maybe there are other problems that I've not found, but this race
condition definitely exists.
This can be reproduced by deleting the port, target-side, with nvmetcli
just after the connection has been estabilished, before the controller
goes LIVE.
Yes, it's quite hard to hit it in practice, but if you want to see it
yourself, a small sleep in the right place in the host driver will
help you.

Maurizio



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-14  7:25     ` Maurizio Lombardi
@ 2025-04-14 21:35       ` Sagi Grimberg
  2025-04-17 13:04         ` Maurizio Lombardi
  0 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2025-04-14 21:35 UTC (permalink / raw)
  To: Maurizio Lombardi, Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, zhang.guanghui, loberman



On 14/04/2025 10:25, Maurizio Lombardi wrote:
> On Mon Apr 14, 2025 at 12:44 AM CEST, Sagi Grimberg wrote:
>>
>> On 04/04/2025 11:28, Maurizio Lombardi wrote:
>>> There is a potential race condition that can occur if
>>> the target closes the socket while the host is in the CONNECTING state.
>>>
>>> If the socket's state changes to TCP_CLOSE, the nvme_tcp_state_change()
>>> function is invoked. However, nvme_tcp_error_recovery() is unable
>>> to transition the controller state to NVME_CTRL_RESETTING because
>>> the controller is still in the CONNECTING state. As a result, error
>>> recovery is bypassed, and the controller incorrectly transitions
>>> to the LIVE state with closed sockets.
>> I think that the issue is that the controller moves to LIVE state - it
>> shouldn't.
>> However its not clear where this happens.
>>
>>> Subsequent attempts by the host to communicate with the target
>>> will result in an infinite loop.
>>>
>>> Fix the bug by initiating the error recovery process to correctly
>>> handle the disconnection in case we missed this event
>>> while transitioning from CONNECTING to LIVE.
>> The problem is in the initial connect - here there is no error recovery
>> and we want to propagate the error to the user.
> Maybe there are other problems that I've not found, but this race
> condition definitely exists.
> This can be reproduced by deleting the port, target-side, with nvmetcli
> just after the connection has been estabilished, before the controller
> goes LIVE.
> Yes, it's quite hard to hit it in practice, but if you want to see it
> yourself, a small sleep in the right place in the host driver will
> help you.

I see the issue, but we need to make sure that if the connection closes 
before
the controller finished establishing, then it cleans up correctly. 
Because at some
point in the past - it wasn't the case. Things have changed in that path 
so it might
be ok now... Just need to check. I'd trigger the race while the admin 
queue is establishing, as well
as in the middle of the sequence of IO queues are establishing.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-14 21:35       ` Sagi Grimberg
@ 2025-04-17 13:04         ` Maurizio Lombardi
  2025-04-18 11:14           ` Sagi Grimberg
  0 siblings, 1 reply; 18+ messages in thread
From: Maurizio Lombardi @ 2025-04-17 13:04 UTC (permalink / raw)
  To: Sagi Grimberg, Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, zhang.guanghui, loberman

On Mon Apr 14, 2025 at 11:35 PM CEST, Sagi Grimberg wrote:
>
> I see the issue, but we need to make sure that if the connection closes 
> before
> the controller finished establishing, then it cleans up correctly. 
> Because at some
> point in the past - it wasn't the case. Things have changed in that path 
> so it might
> be ok now... Just need to check. I'd trigger the race while the admin 
> queue is establishing, as well
> as in the middle of the sequence of IO queues are establishing.

I believe my earlier testing for this patch already covered this scenario,
but I can rerun the tests to confirm and report back.

Either way, any fixes needed should be unrelated to this patch in my opinion,
as this one covers the case where the controller
has already finished establishing the admin and I/O queues.

Maurizio


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-17 13:04         ` Maurizio Lombardi
@ 2025-04-18 11:14           ` Sagi Grimberg
  2025-06-10 12:50             ` Maurizio Lombardi
  0 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2025-04-18 11:14 UTC (permalink / raw)
  To: Maurizio Lombardi, Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, zhang.guanghui, loberman



On 4/17/25 16:04, Maurizio Lombardi wrote:
> On Mon Apr 14, 2025 at 11:35 PM CEST, Sagi Grimberg wrote:
>> I see the issue, but we need to make sure that if the connection closes
>> before
>> the controller finished establishing, then it cleans up correctly.
>> Because at some
>> point in the past - it wasn't the case. Things have changed in that path
>> so it might
>> be ok now... Just need to check. I'd trigger the race while the admin
>> queue is establishing, as well
>> as in the middle of the sequence of IO queues are establishing.
> I believe my earlier testing for this patch already covered this scenario,
> but I can rerun the tests to confirm and report back.

So you indeed made sure that the failure starts sporadically in the 
controller establishment sequence
and there is no use-after-free issue?

>
> Either way, any fixes needed should be unrelated to this patch in my opinion,
> as this one covers the case where the controller
> has already finished establishing the admin and I/O queues.

Well, you are changing code that was added to prevent double free issues 
when the error recovery
and the initial connect sequence ran together. Hence, I want to make 
sure that these issues don't come
back now that you are changing it.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-04-18 11:14           ` Sagi Grimberg
@ 2025-06-10 12:50             ` Maurizio Lombardi
  2025-06-10 14:29               ` Sagi Grimberg
  0 siblings, 1 reply; 18+ messages in thread
From: Maurizio Lombardi @ 2025-06-10 12:50 UTC (permalink / raw)
  To: Sagi Grimberg, Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, zhang.guanghui, loberman

On Fri Apr 18, 2025 at 1:14 PM CEST, Sagi Grimberg wrote:
>
>
> On 4/17/25 16:04, Maurizio Lombardi wrote:
>> On Mon Apr 14, 2025 at 11:35 PM CEST, Sagi Grimberg wrote:
>>> I see the issue, but we need to make sure that if the connection closes
>>> before
>>> the controller finished establishing, then it cleans up correctly.
>>> Because at some
>>> point in the past - it wasn't the case. Things have changed in that path
>>> so it might
>>> be ok now... Just need to check. I'd trigger the race while the admin
>>> queue is establishing, as well
>>> as in the middle of the sequence of IO queues are establishing.
>> I believe my earlier testing for this patch already covered this scenario,
>> but I can rerun the tests to confirm and report back.
>
> So you indeed made sure that the failure starts sporadically in the 
> controller establishment sequence
> and there is no use-after-free issue?

Sorry for the long wait; I am now back at it.
I repeated the tests using a debug kernel, and nothing has been detected.

>
>>
>> Either way, any fixes needed should be unrelated to this patch in my opinion,
>> as this one covers the case where the controller
>> has already finished establishing the admin and I/O queues.
>
> Well, you are changing code that was added to prevent double free issues 
> when the error recovery
> and the initial connect sequence ran together.

Note that even with this patch, the error recovery and the initial
connect sequence cannot run together. The reset is initiated when
a send operation fails, after the controller has switched to the LIVE state.

Maurizio


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
  2025-06-10 12:50             ` Maurizio Lombardi
@ 2025-06-10 14:29               ` Sagi Grimberg
       [not found]                 ` <202510211137214079503@cestc.cn>
  0 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2025-06-10 14:29 UTC (permalink / raw)
  To: Maurizio Lombardi, Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, zhang.guanghui, loberman



On 10/06/2025 15:50, Maurizio Lombardi wrote:
> On Fri Apr 18, 2025 at 1:14 PM CEST, Sagi Grimberg wrote:
>>
>> On 4/17/25 16:04, Maurizio Lombardi wrote:
>>> On Mon Apr 14, 2025 at 11:35 PM CEST, Sagi Grimberg wrote:
>>>> I see the issue, but we need to make sure that if the connection closes
>>>> before
>>>> the controller finished establishing, then it cleans up correctly.
>>>> Because at some
>>>> point in the past - it wasn't the case. Things have changed in that path
>>>> so it might
>>>> be ok now... Just need to check. I'd trigger the race while the admin
>>>> queue is establishing, as well
>>>> as in the middle of the sequence of IO queues are establishing.
>>> I believe my earlier testing for this patch already covered this scenario,
>>> but I can rerun the tests to confirm and report back.
>> So you indeed made sure that the failure starts sporadically in the
>> controller establishment sequence
>> and there is no use-after-free issue?
> Sorry for the long wait; I am now back at it.
> I repeated the tests using a debug kernel, and nothing has been detected.

OK, then lets get it merged.

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


^ permalink raw reply	[flat|nested] 18+ messages in thread

[parent not found: <202510211137214079503@cestc.cn>]

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state【请注意，邮件由sagigrim@gmail.com代发】
       [not found]                 ` <202510211137214079503@cestc.cn>
@ 2025-10-21  6:05                   ` Maurizio Lombardi
  2025-10-21  6:14                     ` Maurizio Lombardi
  0 siblings, 1 reply; 18+ messages in thread
From: Maurizio Lombardi @ 2025-10-21  6:05 UTC (permalink / raw)
  To: zhang.guanghui@cestc.cn, sagi, Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, loberman

On Tue Oct 21, 2025 at 5:37 AM CEST, zhang.guanghui@cestc.cn wrote:
> This patch has not been merged yet,
> So far, no other issues were found after merging this patch.

Wait, do not merge it.

nvme_tcp_error_recovery() should only be called when an -ENOPIPE error
is detected. Other errors may be temporary, like -ENOPERM, and we
shouldn't reset the connection in that case.

Merging this patch as-is will make the blktests md/001 fail.

Let me re-post a new version.

Maurizio

>  
>
> zhang.guanghui@cestc.cn
>  
> 发件人： Sagi Grimberg
> 发送时间： 2025-06-10 22:29
> 收件人： Maurizio Lombardi; Maurizio Lombardi; kbusch
> 抄送： hare; linux-nvme; zhang.guanghui; loberman
> 主题： Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state【请注意，邮件由sagigrim@gmail.com代发】
>  
>  
> On 10/06/2025 15:50, Maurizio Lombardi wrote:
>> On Fri Apr 18, 2025 at 1:14 PM CEST, Sagi Grimberg wrote:
>>>
>>> On 4/17/25 16:04, Maurizio Lombardi wrote:
>>>> On Mon Apr 14, 2025 at 11:35 PM CEST, Sagi Grimberg wrote:
>>>>> I see the issue, but we need to make sure that if the connection closes
>>>>> before
>>>>> the controller finished establishing, then it cleans up correctly.
>>>>> Because at some
>>>>> point in the past - it wasn't the case. Things have changed in that path
>>>>> so it might
>>>>> be ok now... Just need to check. I'd trigger the race while the admin
>>>>> queue is establishing, as well
>>>>> as in the middle of the sequence of IO queues are establishing.
>>>> I believe my earlier testing for this patch already covered this scenario,
>>>> but I can rerun the tests to confirm and report back.
>>> So you indeed made sure that the failure starts sporadically in the
>>> controller establishment sequence
>>> and there is no use-after-free issue?
>> Sorry for the long wait; I am now back at it.
>> I repeated the tests using a debug kernel, and nothing has been detected.
>  
> OK, then lets get it merged.
>  
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>  
>  



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state【请注意，邮件由sagigrim@gmail.com代发】
  2025-10-21  6:05                   ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state【请注意，邮件由sagigrim@gmail.com代发】 Maurizio Lombardi
@ 2025-10-21  6:14                     ` Maurizio Lombardi
  0 siblings, 0 replies; 18+ messages in thread
From: Maurizio Lombardi @ 2025-10-21  6:14 UTC (permalink / raw)
  To: Maurizio Lombardi, zhang.guanghui@cestc.cn, sagi,
	Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, loberman

On Tue Oct 21, 2025 at 8:05 AM CEST, Maurizio Lombardi wrote:
> On Tue Oct 21, 2025 at 5:37 AM CEST, zhang.guanghui@cestc.cn wrote:
>> This patch has not been merged yet,
>> So far, no other issues were found after merging this patch.
>
> Wait, do not merge it.
>
> nvme_tcp_error_recovery() should only be called when an -ENOPIPE error
> is detected. Other errors may be temporary, like -ENOPERM, and we
> shouldn't reset the connection in that case.
>
> Merging this patch as-is will make the blktests md/001 fail.
>
> Let me re-post a new version.

Also, as Sagi pointed out, during the initial connect there is no
expectation of error recovery, so there is more work to do.

Going to see if I can come up with something more robust.

Maurizio


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH V4 2/2] nvme-tcp: do not complete the request if send operation fails
  2025-04-04  8:27 [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver Maurizio Lombardi
  2025-04-04  8:28 ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state Maurizio Lombardi
@ 2025-04-04  8:28 ` Maurizio Lombardi
  2025-04-05 23:23   ` Keith Busch
  2025-04-07  6:13   ` Hannes Reinecke
  2025-04-11 12:26 ` [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver Maurizio Lombardi
  2025-04-13 22:42 ` Sagi Grimberg
  3 siblings, 2 replies; 18+ messages in thread
From: Maurizio Lombardi @ 2025-04-04  8:28 UTC (permalink / raw)
  To: kbusch; +Cc: hare, sagi, linux-nvme, mlombard, zhang.guanghui, loberman

In some cases, the send operation may fail even though
the command is successfully received by the target.
The target then sends a command completion capsule back to the host,
leading to double completion and a kernel panic.

To prevent this, do not fail the request; instead, let the
error recovery mechanism clean everything up.

Tested-by: Zhang Guanghui <zhang.guanghui@cestc.cn>
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/nvme/host/tcp.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index f1e2b0417b39..726f24a352fe 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1101,19 +1101,6 @@ static inline void nvme_tcp_done_send_req(struct nvme_tcp_queue *queue)
 	queue->request = NULL;
 }
 
-static void nvme_tcp_fail_request(struct nvme_tcp_request *req)
-{
-	if (nvme_tcp_async_req(req)) {
-		union nvme_result res = {};
-
-		nvme_complete_async_event(&req->queue->ctrl->ctrl,
-				cpu_to_le16(NVME_SC_HOST_PATH_ERROR), &res);
-	} else {
-		nvme_tcp_end_request(blk_mq_rq_from_pdu(req),
-				NVME_SC_HOST_PATH_ERROR);
-	}
-}
-
 static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
 {
 	struct nvme_tcp_queue *queue = req->queue;
@@ -1327,7 +1314,6 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
 	} else if (ret < 0) {
 		dev_err(queue->ctrl->ctrl.device,
 			"failed to send request %d\n", ret);
-		nvme_tcp_fail_request(queue->request);
 		nvme_tcp_done_send_req(queue);
 		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
 	}
-- 
2.43.5



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 2/2] nvme-tcp: do not complete the request if send operation fails
  2025-04-04  8:28 ` [PATCH V4 2/2] nvme-tcp: do not complete the request if send operation fails Maurizio Lombardi
@ 2025-04-05 23:23   ` Keith Busch
  2025-04-07  6:13   ` Hannes Reinecke
  1 sibling, 0 replies; 18+ messages in thread
From: Keith Busch @ 2025-04-05 23:23 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, hare, sagi, linux-nvme, mlombard, zhang.guanghui,
	loberman

On Fri, Apr 04, 2025 at 10:28:01AM +0200, Maurizio Lombardi wrote:
> In some cases, the send operation may fail even though
> the command is successfully received by the target.
> The target then sends a command completion capsule back to the host,
> leading to double completion and a kernel panic.
> 
> To prevent this, do not fail the request; instead, let the
> error recovery mechanism clean everything up.

Looks good.

Reviewed-by: Keith Busch <kbusch@kernel.org>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 2/2] nvme-tcp: do not complete the request if send operation fails
  2025-04-04  8:28 ` [PATCH V4 2/2] nvme-tcp: do not complete the request if send operation fails Maurizio Lombardi
  2025-04-05 23:23   ` Keith Busch
@ 2025-04-07  6:13   ` Hannes Reinecke
  1 sibling, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2025-04-07  6:13 UTC (permalink / raw)
  To: Maurizio Lombardi, kbusch
  Cc: hare, sagi, linux-nvme, mlombard, zhang.guanghui, loberman

On 4/4/25 10:28, Maurizio Lombardi wrote:
> In some cases, the send operation may fail even though
> the command is successfully received by the target.
> The target then sends a command completion capsule back to the host,
> leading to double completion and a kernel panic.
> 
> To prevent this, do not fail the request; instead, let the
> error recovery mechanism clean everything up.
> 
> Tested-by: Zhang Guanghui <zhang.guanghui@cestc.cn>
> Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
> ---
>   drivers/nvme/host/tcp.c | 14 --------------
>   1 file changed, 14 deletions(-)
> 
You could have merged it with the previous patch, but that's more or
less cosmetical.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver
  2025-04-04  8:27 [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver Maurizio Lombardi
  2025-04-04  8:28 ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state Maurizio Lombardi
  2025-04-04  8:28 ` [PATCH V4 2/2] nvme-tcp: do not complete the request if send operation fails Maurizio Lombardi
@ 2025-04-11 12:26 ` Maurizio Lombardi
  2025-04-13 22:42 ` Sagi Grimberg
  3 siblings, 0 replies; 18+ messages in thread
From: Maurizio Lombardi @ 2025-04-11 12:26 UTC (permalink / raw)
  To: hch; +Cc: linux-nvme

Hello Christoph,

On Fri Apr 4, 2025 at 10:27 AM CEST, Maurizio Lombardi wrote:
> Patch 1 fixes a race condition when the sockets get closed
> while the controller is transitioning from CONNECTING to LIVE state.
>
> Patch 2 fixes a command double completion when a send operation
> fails despite the fact that the command has been received by the target.
>
>
> v4: added patch 2/2
>
> v3: Do not check for -EPIPE, just call nvme_tcp_error_recovery()
>
> v2: commit message: clarify where the error recovery is started
>
>
> Maurizio Lombardi (2):
>   nvme-tcp: Prevent infinite loop if socket closes during CONNECTING
>     state
>   nvme-tcp: do not fail the request if send fails
>
>  drivers/nvme/host/tcp.c | 15 +--------------
>  1 file changed, 1 insertion(+), 14 deletions(-)

Can you pick up those 2 patches please?

Thanks,
Maurizio


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver
  2025-04-04  8:27 [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver Maurizio Lombardi
                   ` (2 preceding siblings ...)
  2025-04-11 12:26 ` [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver Maurizio Lombardi
@ 2025-04-13 22:42 ` Sagi Grimberg
  3 siblings, 0 replies; 18+ messages in thread
From: Sagi Grimberg @ 2025-04-13 22:42 UTC (permalink / raw)
  To: Maurizio Lombardi, kbusch
  Cc: hare, linux-nvme, mlombard, zhang.guanghui, loberman



On 04/04/2025 11:27, Maurizio Lombardi wrote:
> Patch 1 fixes a race condition when the sockets get closed
> while the controller is transitioning from CONNECTING to LIVE state.
>
> Patch 2 fixes a command double completion when a send operation
> fails despite the fact that the command has been received by the target.

Maurizio, I think there is a problem in the initial connect. Where there 
is no
expectation of error recovery - but instead propagate the error to the user.


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-10-21  6:15 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-04  8:27 [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver Maurizio Lombardi
2025-04-04  8:28 ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state Maurizio Lombardi
2025-04-05 23:22   ` Keith Busch
2025-04-07  6:11   ` Hannes Reinecke
2025-04-13 22:44   ` Sagi Grimberg
2025-04-14  7:25     ` Maurizio Lombardi
2025-04-14 21:35       ` Sagi Grimberg
2025-04-17 13:04         ` Maurizio Lombardi
2025-04-18 11:14           ` Sagi Grimberg
2025-06-10 12:50             ` Maurizio Lombardi
2025-06-10 14:29               ` Sagi Grimberg
     [not found]                 ` <202510211137214079503@cestc.cn>
2025-10-21  6:05                   ` [PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state【请注意，邮件由sagigrim@gmail.com代发】 Maurizio Lombardi
2025-10-21  6:14                     ` Maurizio Lombardi
2025-04-04  8:28 ` [PATCH V4 2/2] nvme-tcp: do not complete the request if send operation fails Maurizio Lombardi
2025-04-05 23:23   ` Keith Busch
2025-04-07  6:13   ` Hannes Reinecke
2025-04-11 12:26 ` [PATCH V4 0/2] Fix race conditions in the nvme-tcp driver Maurizio Lombardi
2025-04-13 22:42 ` Sagi Grimberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).