public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH] nvme-fabrics: handle zero MAXCMD without closing the connection
@ 2024-11-29 14:17 Maurizio Lombardi
  2024-11-29 18:39 ` Laurence Oberman
  2024-12-03  0:36 ` Christoph Hellwig
  0 siblings, 2 replies; 3+ messages in thread
From: Maurizio Lombardi @ 2024-11-29 14:17 UTC (permalink / raw)
  To: kbusch; +Cc: axboe, hch, sagi, linux-nvme, kanie, kch, loberman, emilne,
	jmeneghi

The NVMe specification states that MAXCMD is mandatory
for NVMe-over-Fabrics implementations. However, some NVMe/TCP
and NVMe/FC arrays from major vendors have buggy firmware
that reports MAXCMD as zero in the Identify Controller data structure.

Currently, the implementation closes the connection in such cases,
completely preventing the host from connecting to the target.

Fix the issue by printing a clear error message about the firmware bug
and allowing the connection to proceed. It assumes that the
target supports a MAXCMD value of SQSIZE + 1. If any issues arise,
the user can manually adjust SQSIZE to mitigate them.

Fixes: 4999568184e5 ("nvme-fabrics: check max outstanding commands")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/nvme/host/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1a8d32a4a5c3..91ee2c3aa95e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3257,8 +3257,9 @@ static int nvme_check_ctrl_fabric_info(struct nvme_ctrl *ctrl, struct nvme_id_ct
 	}
 
 	if (!ctrl->maxcmd) {
-		dev_err(ctrl->device, "Maximum outstanding commands is 0\n");
-		return -EINVAL;
+		dev_err(ctrl->device,
+			"Firmware bug: maximum outstanding commands is 0\n");
+		ctrl->maxcmd = ctrl->sqsize + 1;
 	}
 
 	return 0;
-- 
2.43.5



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] nvme-fabrics: handle zero MAXCMD without closing the connection
  2024-11-29 14:17 [PATCH] nvme-fabrics: handle zero MAXCMD without closing the connection Maurizio Lombardi
@ 2024-11-29 18:39 ` Laurence Oberman
  2024-12-03  0:36 ` Christoph Hellwig
  1 sibling, 0 replies; 3+ messages in thread
From: Laurence Oberman @ 2024-11-29 18:39 UTC (permalink / raw)
  To: Maurizio Lombardi, kbusch
  Cc: axboe, hch, sagi, linux-nvme, kanie, kch, emilne, jmeneghi

On Fri, 2024-11-29 at 15:17 +0100, Maurizio Lombardi wrote:
> The NVMe specification states that MAXCMD is mandatory
> for NVMe-over-Fabrics implementations. However, some NVMe/TCP
> and NVMe/FC arrays from major vendors have buggy firmware
> that reports MAXCMD as zero in the Identify Controller data
> structure.
> 
> Currently, the implementation closes the connection in such cases,
> completely preventing the host from connecting to the target.
> 
> Fix the issue by printing a clear error message about the firmware
> bug
> and allowing the connection to proceed. It assumes that the
> target supports a MAXCMD value of SQSIZE + 1. If any issues arise,
> the user can manually adjust SQSIZE to mitigate them.
> 
> Fixes: 4999568184e5 ("nvme-fabrics: check max outstanding commands")
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
> ---
>  drivers/nvme/host/core.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 1a8d32a4a5c3..91ee2c3aa95e 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3257,8 +3257,9 @@ static int nvme_check_ctrl_fabric_info(struct
> nvme_ctrl *ctrl, struct nvme_id_ct
>         }
>  
>         if (!ctrl->maxcmd) {
> -               dev_err(ctrl->device, "Maximum outstanding commands
> is 0\n");
> -               return -EINVAL;
> +               dev_err(ctrl->device,
> +                       "Firmware bug: maximum outstanding commands
> is 0\n");
> +               ctrl->maxcmd = ctrl->sqsize + 1;
>         }
>  
>         return 0;

A similar fix was tested by a customer seeing this issue and they can
now connect to their array again.
Looks good.

Reviewed-by: Laurence Oberman <loberman@redhat.com>



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] nvme-fabrics: handle zero MAXCMD without closing the connection
  2024-11-29 14:17 [PATCH] nvme-fabrics: handle zero MAXCMD without closing the connection Maurizio Lombardi
  2024-11-29 18:39 ` Laurence Oberman
@ 2024-12-03  0:36 ` Christoph Hellwig
  1 sibling, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2024-12-03  0:36 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, axboe, hch, sagi, linux-nvme, kanie, kch, loberman,
	emilne, jmeneghi

On Fri, Nov 29, 2024 at 03:17:06PM +0100, Maurizio Lombardi wrote:
> The NVMe specification states that MAXCMD is mandatory
> for NVMe-over-Fabrics implementations. However, some NVMe/TCP
> and NVMe/FC arrays from major vendors have buggy firmware
> that reports MAXCMD as zero in the Identify Controller data structure.

Sigh.

>  	if (!ctrl->maxcmd) {
> -		dev_err(ctrl->device, "Maximum outstanding commands is 0\n");
> -		return -EINVAL;
> +		dev_err(ctrl->device,
> +			"Firmware bug: maximum outstanding commands is 0\n");
> +		ctrl->maxcmd = ctrl->sqsize + 1;

Maybe warn instead of err as made it not an error now?  Otherwise
looks fine.



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-12-03  0:36 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-29 14:17 [PATCH] nvme-fabrics: handle zero MAXCMD without closing the connection Maurizio Lombardi
2024-11-29 18:39 ` Laurence Oberman
2024-12-03  0:36 ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox