public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH] tcp: add additional info for nvme_tcp_timeout log
@ 2022-11-16 13:57 amit.engel
  2022-11-16 18:37 ` Chaitanya Kulkarni
  2022-11-17 13:13 ` [PATCH] nvme-tcp: " amit.engel
  0 siblings, 2 replies; 15+ messages in thread
From: amit.engel @ 2022-11-16 13:57 UTC (permalink / raw)
  To: sagi, linux-nvme; +Cc: amit.engel, Amit Engel

From: Amit Engel <Amit.Engel@dell.com>

Signed-off-by: Amit Engel <Amit.Engel@dell.com>
---
 drivers/nvme/host/tcp.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 1eed0fc26b3a..5abaff5d6f6e 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2277,8 +2277,11 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
 	struct nvme_tcp_cmd_pdu *pdu = req->pdu;
 
 	dev_warn(ctrl->device,
-		"queue %d: timeout request %#x type %d\n",
-		nvme_tcp_queue_id(req->queue), rq->tag, pdu->hdr.type);
+		"queue %d: cntlid %u timeout request %#x type %d cid %u opcode %#x "
+		"start time %lluns io start time %lluns request timeout %ums\n",
+		nvme_tcp_queue_id(req->queue), ctrl->cntlid, rq->tag, pdu->hdr.type,
+		pdu->cmd.common.command_id, pdu->cmd.common.opcode,
+		ktime_sub_ns(ktime_get(), rq->start_time_ns), ktime_sub_ns(ktime_get(), rq->io_start_time_ns), jiffies_to_msecs(rq->timeout));
 
 	if (ctrl->state != NVME_CTRL_LIVE) {
 		/*
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] tcp: add additional info for nvme_tcp_timeout log
  2022-11-16 13:57 [PATCH] tcp: add additional info for nvme_tcp_timeout log amit.engel
@ 2022-11-16 18:37 ` Chaitanya Kulkarni
  2022-11-17 13:14   ` Engel, Amit
  2022-11-17 13:13 ` [PATCH] nvme-tcp: " amit.engel
  1 sibling, 1 reply; 15+ messages in thread
From: Chaitanya Kulkarni @ 2022-11-16 18:37 UTC (permalink / raw)
  To: amit.engel@dell.com, sagi@grimberg.me,
	linux-nvme@lists.infradead.org

On 11/16/22 05:57, amit.engel@dell.com wrote:
> From: Amit Engel <Amit.Engel@dell.com>
> 

without the commit log it is hard to understand whay additional
information is needed also you need to fix the subject line..

> Signed-off-by: Amit Engel <Amit.Engel@dell.com>
> ---
>   drivers/nvme/host/tcp.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 1eed0fc26b3a..5abaff5d6f6e 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2277,8 +2277,11 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
>   	struct nvme_tcp_cmd_pdu *pdu = req->pdu;
>   
>   	dev_warn(ctrl->device,
> -		"queue %d: timeout request %#x type %d\n",
> -		nvme_tcp_queue_id(req->queue), rq->tag, pdu->hdr.type);
> +		"queue %d: cntlid %u timeout request %#x type %d cid %u opcode %#x "
> +		"start time %lluns io start time %lluns request timeout %ums\n",
> +		nvme_tcp_queue_id(req->queue), ctrl->cntlid, rq->tag, pdu->hdr.type,
> +		pdu->cmd.common.command_id, pdu->cmd.common.opcode,
> +		ktime_sub_ns(ktime_get(), rq->start_time_ns), ktime_sub_ns(ktime_get(), rq->io_start_time_ns), jiffies_to_msecs(rq->timeout));
>   
>   	if (ctrl->state != NVME_CTRL_LIVE) {
>   		/*


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-11-16 13:57 [PATCH] tcp: add additional info for nvme_tcp_timeout log amit.engel
  2022-11-16 18:37 ` Chaitanya Kulkarni
@ 2022-11-17 13:13 ` amit.engel
  2022-11-20 11:59   ` Sagi Grimberg
  1 sibling, 1 reply; 15+ messages in thread
From: amit.engel @ 2022-11-17 13:13 UTC (permalink / raw)
  To: sagi, linux-nvme; +Cc: amit.engel, Amit Engel

From: Amit Engel <Amit.Engel@dell.com>

This provides additional details about the rq/cmd that is timed out

Signed-off-by: Amit Engel <Amit.Engel@dell.com>
---
 drivers/nvme/host/tcp.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 1eed0fc26b3a..5abaff5d6f6e 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2277,8 +2277,11 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
 	struct nvme_tcp_cmd_pdu *pdu = req->pdu;
 
 	dev_warn(ctrl->device,
-		"queue %d: timeout request %#x type %d\n",
-		nvme_tcp_queue_id(req->queue), rq->tag, pdu->hdr.type);
+		"queue %d: cntlid %u timeout request %#x type %d cid %u opcode %#x "
+		"start time %lluns io start time %lluns request timeout %ums\n",
+		nvme_tcp_queue_id(req->queue), ctrl->cntlid, rq->tag, pdu->hdr.type,
+		pdu->cmd.common.command_id, pdu->cmd.common.opcode,
+		ktime_sub_ns(ktime_get(), rq->start_time_ns), ktime_sub_ns(ktime_get(), rq->io_start_time_ns), jiffies_to_msecs(rq->timeout));
 
 	if (ctrl->state != NVME_CTRL_LIVE) {
 		/*
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* RE: [PATCH] tcp: add additional info for nvme_tcp_timeout log
  2022-11-16 18:37 ` Chaitanya Kulkarni
@ 2022-11-17 13:14   ` Engel, Amit
  0 siblings, 0 replies; 15+ messages in thread
From: Engel, Amit @ 2022-11-17 13:14 UTC (permalink / raw)
  To: Chaitanya Kulkarni, sagi@grimberg.me,
	linux-nvme@lists.infradead.org

Ok - I modified the commit message

-----Original Message-----
From: Chaitanya Kulkarni <chaitanyak@nvidia.com> 
Sent: Wednesday, 16 November 2022 20:37
To: Engel, Amit <Amit.Engel@Dell.com>; sagi@grimberg.me; linux-nvme@lists.infradead.org
Subject: Re: [PATCH] tcp: add additional info for nvme_tcp_timeout log


[EXTERNAL EMAIL] 

On 11/16/22 05:57, amit.engel@dell.com wrote:
> From: Amit Engel <Amit.Engel@dell.com>
> 

without the commit log it is hard to understand whay additional information is needed also you need to fix the subject line..

> Signed-off-by: Amit Engel <Amit.Engel@dell.com>
> ---
>   drivers/nvme/host/tcp.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 
> 1eed0fc26b3a..5abaff5d6f6e 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2277,8 +2277,11 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
>   	struct nvme_tcp_cmd_pdu *pdu = req->pdu;
>   
>   	dev_warn(ctrl->device,
> -		"queue %d: timeout request %#x type %d\n",
> -		nvme_tcp_queue_id(req->queue), rq->tag, pdu->hdr.type);
> +		"queue %d: cntlid %u timeout request %#x type %d cid %u opcode %#x "
> +		"start time %lluns io start time %lluns request timeout %ums\n",
> +		nvme_tcp_queue_id(req->queue), ctrl->cntlid, rq->tag, pdu->hdr.type,
> +		pdu->cmd.common.command_id, pdu->cmd.common.opcode,
> +		ktime_sub_ns(ktime_get(), rq->start_time_ns), 
> +ktime_sub_ns(ktime_get(), rq->io_start_time_ns), 
> +jiffies_to_msecs(rq->timeout));
>   
>   	if (ctrl->state != NVME_CTRL_LIVE) {
>   		/*


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-11-17 13:13 ` [PATCH] nvme-tcp: " amit.engel
@ 2022-11-20 11:59   ` Sagi Grimberg
  2022-11-21 11:32     ` Engel, Amit
  0 siblings, 1 reply; 15+ messages in thread
From: Sagi Grimberg @ 2022-11-20 11:59 UTC (permalink / raw)
  To: amit.engel, linux-nvme


> From: Amit Engel <Amit.Engel@dell.com>
> 
> This provides additional details about the rq/cmd that is timed out

Please add an example print of the timeout log msg.

> 
> Signed-off-by: Amit Engel <Amit.Engel@dell.com>
> ---
>   drivers/nvme/host/tcp.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 1eed0fc26b3a..5abaff5d6f6e 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2277,8 +2277,11 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
>   	struct nvme_tcp_cmd_pdu *pdu = req->pdu;
>   
>   	dev_warn(ctrl->device,
> -		"queue %d: timeout request %#x type %d\n",
> -		nvme_tcp_queue_id(req->queue), rq->tag, pdu->hdr.type);
> +		"queue %d: cntlid %u timeout request %#x type %d cid %u opcode %#x "
> +		"start time %lluns io start time %lluns request timeout %ums\n",
> +		nvme_tcp_queue_id(req->queue), ctrl->cntlid, rq->tag, pdu->hdr.type,
> +		pdu->cmd.common.command_id, pdu->cmd.common.opcode,
> +		ktime_sub_ns(ktime_get(), rq->start_time_ns), ktime_sub_ns(ktime_get(), rq->io_start_time_ns), jiffies_to_msecs(rq->timeout));

You can get the cntlid from the controller device name which appears in
dev_warn, no need for it here.

Why are you printing the timeout value (in ktime)? What info does it
add? You can see the relative time in the log and you know what the
io timeout is (30 seconds before by default)...

You can also easily retrieve the cid from the tag (lower 12 bits).

The nvme opcode is fine to add...

>   
>   	if (ctrl->state != NVME_CTRL_LIVE) {
>   		/*


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-11-20 11:59   ` Sagi Grimberg
@ 2022-11-21 11:32     ` Engel, Amit
  2022-11-21 11:36       ` Sagi Grimberg
  0 siblings, 1 reply; 15+ messages in thread
From: Engel, Amit @ 2022-11-21 11:32 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme@lists.infradead.org; +Cc: Engel, Amit

Hi Sagi, 

Below is an example print of a read cmd that is timed out
Should I add this also to the commit msg?
nvme nvme0: queue 1: cntlid 1 timeout request 0x7 type 4 cid 28679 opcode 0x1 start time 5022008820ns io start time 5022008132ns request timeout 5000ms

Thanks
Amit


Internal Use - Confidential

-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me> 
Sent: Sunday, 20 November 2022 14:00
To: Engel, Amit <Amit.Engel@Dell.com>; linux-nvme@lists.infradead.org
Subject: Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log


[EXTERNAL EMAIL] 


> From: Amit Engel <Amit.Engel@dell.com>
> 
> This provides additional details about the rq/cmd that is timed out

Please add an example print of the timeout log msg.

> 
> Signed-off-by: Amit Engel <Amit.Engel@dell.com>
> ---
>   drivers/nvme/host/tcp.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 
> 1eed0fc26b3a..5abaff5d6f6e 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2277,8 +2277,11 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
>   	struct nvme_tcp_cmd_pdu *pdu = req->pdu;
>   
>   	dev_warn(ctrl->device,
> -		"queue %d: timeout request %#x type %d\n",
> -		nvme_tcp_queue_id(req->queue), rq->tag, pdu->hdr.type);
> +		"queue %d: cntlid %u timeout request %#x type %d cid %u opcode %#x "
> +		"start time %lluns io start time %lluns request timeout %ums\n",
> +		nvme_tcp_queue_id(req->queue), ctrl->cntlid, rq->tag, pdu->hdr.type,
> +		pdu->cmd.common.command_id, pdu->cmd.common.opcode,
> +		ktime_sub_ns(ktime_get(), rq->start_time_ns), 
> +ktime_sub_ns(ktime_get(), rq->io_start_time_ns), 
> +jiffies_to_msecs(rq->timeout));

You can get the cntlid from the controller device name which appears in dev_warn, no need for it here.

Why are you printing the timeout value (in ktime)? What info does it add? You can see the relative time in the log and you know what the io timeout is (30 seconds before by default)...

You can also easily retrieve the cid from the tag (lower 12 bits).

The nvme opcode is fine to add...

>   
>   	if (ctrl->state != NVME_CTRL_LIVE) {
>   		/*

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-11-21 11:32     ` Engel, Amit
@ 2022-11-21 11:36       ` Sagi Grimberg
  2022-11-21 12:11         ` Engel, Amit
  0 siblings, 1 reply; 15+ messages in thread
From: Sagi Grimberg @ 2022-11-21 11:36 UTC (permalink / raw)
  To: Engel, Amit, linux-nvme@lists.infradead.org


> Hi Sagi,
> 
> Below is an example print of a read cmd that is timed out
> Should I add this also to the commit msg?
> nvme nvme0: queue 1: cntlid 1 timeout request 0x7 type 4 cid 28679 opcode 0x1 start time 5022008820ns io start time 5022008132ns request timeout 5000ms

This is not an improvement from the existing msg.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-11-21 11:36       ` Sagi Grimberg
@ 2022-11-21 12:11         ` Engel, Amit
  2022-11-21 14:34           ` Sagi Grimberg
  0 siblings, 1 reply; 15+ messages in thread
From: Engel, Amit @ 2022-11-21 12:11 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme@lists.infradead.org

Hi Sagi.

The previous msg:
nvme nvme0: queue 0: timeout request 0x0 type 4

We found it more useful to have also:
cmd opc that was timed out
command id that was timed out
The timeout value for this specific io - in case of a timeout - it make sense to know the timeout value itself
more info about the blk rq - such as:
	a. time of a request that was allocated for the IO that was timed out
	b. time of the timed out IO that was submitted to the device

The log is already exists, why not provide additional info that can help with io timeouts debug?

Thanks,
Amit


Internal Use - Confidential

-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me> 
Sent: Monday, 21 November 2022 13:36
To: Engel, Amit <Amit.Engel@Dell.com>; linux-nvme@lists.infradead.org
Subject: Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log


[EXTERNAL EMAIL] 


> Hi Sagi,
> 
> Below is an example print of a read cmd that is timed out Should I add 
> this also to the commit msg?
> nvme nvme0: queue 1: cntlid 1 timeout request 0x7 type 4 cid 28679 
> opcode 0x1 start time 5022008820ns io start time 5022008132ns request 
> timeout 5000ms

This is not an improvement from the existing msg.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-11-21 12:11         ` Engel, Amit
@ 2022-11-21 14:34           ` Sagi Grimberg
  2022-12-04 17:28             ` Engel, Amit
  0 siblings, 1 reply; 15+ messages in thread
From: Sagi Grimberg @ 2022-11-21 14:34 UTC (permalink / raw)
  To: Engel, Amit, linux-nvme@lists.infradead.org


> Hi Sagi.
> 
> The previous msg:
> nvme nvme0: queue 0: timeout request 0x0 type 4
> 
> We found it more useful to have also:
> cmd opc that was timed out

That is fine

> command id that was timed out

That is the lower 12 bits of the tag.
You can introduce a generic helper to extract genctr+cid from a tag.

> The timeout value for this specific io - in case of a timeout - it make sense to know the timeout value itself
> more info about the blk rq - such as:
> 	a. time of a request that was allocated for the IO that was timed out
> 	b. time of the timed out IO that was submitted to the device

I don't understand what added value that is providing you.
An IO timeout is usually (if not universally) >> the delta from tag
allocation to submission.

This is debug information, and in my mind the proper place for this type
of debug is blktrace.

> The log is already exists, why not provide additional info that can help with io timeouts debug?

We can say that about a lot of things.

There is negative value from spam information in the log which is debug
in nature and is separate from the normal error/warn log information.

If this information existed in the form of trace points would that solve
your problem? Keith had a patch for nvme-tcp trace points, which can
provide this type of information.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-11-21 14:34           ` Sagi Grimberg
@ 2022-12-04 17:28             ` Engel, Amit
  2022-12-05  5:38               ` Chaitanya Kulkarni
  2022-12-05  9:59               ` Sagi Grimberg
  0 siblings, 2 replies; 15+ messages in thread
From: Engel, Amit @ 2022-12-04 17:28 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme@lists.infradead.org; +Cc: Engel, Amit, Grupi, Elad

Hi Sagi, thank you for your inputs

It seems to me that logging nvme command id is sufficient and there is no need to log also tag and genctr separately
This is due to the fact that command id is constructed as such:

| xxxx | xxxxxxxxxxxx |
  gen    request tag

What do you think?

Regarding the io timeout value, I got your point..
logging the io timeout threshold (which is the delta from allocation to submission) is sufficient

regarding nvme-tcp trace points -  I don’t believe that would help.
This timeout log is on the host side, and we can't ensure that a customer that is running nvme host will enable/use trace points
In addition, the whole point of this specific log is for debug, I guess - right?

How does this log look?
kernel: nvme nvme0: queue 4: cntlid 1 type 4 cid 4163 opcode 0x2 io timeout threshold 30000ms
(Including the cntlid, cid (which includes genctr+tag), opc, and timeout threshold value in ms)

comparing to the existing log:
kernel: nvme nvme0: queue 4: timeout request 0x43 type 4

let me know if that make sense and I will submit an updated patch.

Thanks
Amit

Internal Use - Confidential

-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me> 
Sent: Monday, 21 November 2022 16:34
To: Engel, Amit <Amit.Engel@Dell.com>; linux-nvme@lists.infradead.org
Subject: Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log


[EXTERNAL EMAIL] 


> Hi Sagi.
> 
> The previous msg:
> nvme nvme0: queue 0: timeout request 0x0 type 4
> 
> We found it more useful to have also:
> cmd opc that was timed out

That is fine

> command id that was timed out

That is the lower 12 bits of the tag.
You can introduce a generic helper to extract genctr+cid from a tag.

> The timeout value for this specific io - in case of a timeout - it 
> make sense to know the timeout value itself more info about the blk rq - such as:
> 	a. time of a request that was allocated for the IO that was timed out
> 	b. time of the timed out IO that was submitted to the device

I don't understand what added value that is providing you.
An IO timeout is usually (if not universally) >> the delta from tag allocation to submission.

This is debug information, and in my mind the proper place for this type of debug is blktrace.

> The log is already exists, why not provide additional info that can help with io timeouts debug?

We can say that about a lot of things.

There is negative value from spam information in the log which is debug in nature and is separate from the normal error/warn log information.

If this information existed in the form of trace points would that solve your problem? Keith had a patch for nvme-tcp trace points, which can provide this type of information.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-12-04 17:28             ` Engel, Amit
@ 2022-12-05  5:38               ` Chaitanya Kulkarni
  2022-12-05  9:59               ` Sagi Grimberg
  1 sibling, 0 replies; 15+ messages in thread
From: Chaitanya Kulkarni @ 2022-12-05  5:38 UTC (permalink / raw)
  To: Engel, Amit; +Cc: Grupi, Elad, linux-nvme@lists.infradead.org

Amit,

> let me know if that make sense and I will submit an updated patch.
> 
> Thanks
> Amit
> 
> Internal Use - Confidential
> 

Please remove the confidentiality message when posting on mailing list 
in order to avoid objections on your emails.

-ck


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-12-04 17:28             ` Engel, Amit
  2022-12-05  5:38               ` Chaitanya Kulkarni
@ 2022-12-05  9:59               ` Sagi Grimberg
  2022-12-05 18:00                 ` Engel, Amit
  1 sibling, 1 reply; 15+ messages in thread
From: Sagi Grimberg @ 2022-12-05  9:59 UTC (permalink / raw)
  To: Engel, Amit, linux-nvme@lists.infradead.org; +Cc: Grupi, Elad


> Hi Sagi, thank you for your inputs
> 
> It seems to me that logging nvme command id is sufficient and there is no need to log also tag and genctr separately
> This is due to the fact that command id is constructed as such:
> 
> | xxxx | xxxxxxxxxxxx |
>    gen    request tag
> 
> What do you think?

Yes, that is fine I guess.

> Regarding the io timeout value, I got your point..
> logging the io timeout threshold (which is the delta from allocation to submission) is sufficient
> 
> regarding nvme-tcp trace points -  I don’t believe that would help.
> This timeout log is on the host side, and we can't ensure that a customer that is running nvme host will enable/use trace points
> In addition, the whole point of this specific log is for debug, I guess - right?

To some extent.

> 
> How does this log look?
> kernel: nvme nvme0: queue 4: cntlid 1 type 4 cid 4163 opcode 0x2 io timeout threshold 30000ms
> (Including the cntlid, cid (which includes genctr+tag), opc, and timeout threshold value in ms)

Both the cntlid and the io timeout are obtainable via sysfs, why add
these to the log? The cntlid is often just an ephemeral value assigned
by the controller.

opcode is fine to add.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-12-05  9:59               ` Sagi Grimberg
@ 2022-12-05 18:00                 ` Engel, Amit
  2022-12-05 20:11                   ` Sagi Grimberg
  0 siblings, 1 reply; 15+ messages in thread
From: Engel, Amit @ 2022-12-05 18:00 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme@lists.infradead.org; +Cc: Grupi, Elad

Yes, io timeout is obtainable via sysfs.
But this io_timeout param is writable and the user can modify this value at any time.
So I still find it usufl to have this value as part of this log.

Regarding the cntlid, I agree - we can remove it.

Thanks
Amit

-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me> 
Sent: Monday, 5 December 2022 11:59
To: Engel, Amit <Amit.Engel@Dell.com>; linux-nvme@lists.infradead.org
Cc: Grupi, Elad <Elad.Grupi@dell.com>
Subject: Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log


[EXTERNAL EMAIL] 


> Hi Sagi, thank you for your inputs
> 
> It seems to me that logging nvme command id is sufficient and there is 
> no need to log also tag and genctr separately This is due to the fact that command id is constructed as such:
> 
> | xxxx | xxxxxxxxxxxx |
>    gen    request tag
> 
> What do you think?

Yes, that is fine I guess.

> Regarding the io timeout value, I got your point..
> logging the io timeout threshold (which is the delta from allocation 
> to submission) is sufficient
> 
> regarding nvme-tcp trace points -  I don’t believe that would help.
> This timeout log is on the host side, and we can't ensure that a 
> customer that is running nvme host will enable/use trace points In addition, the whole point of this specific log is for debug, I guess - right?

To some extent.

> 
> How does this log look?
> kernel: nvme nvme0: queue 4: cntlid 1 type 4 cid 4163 opcode 0x2 io 
> timeout threshold 30000ms (Including the cntlid, cid (which includes 
> genctr+tag), opc, and timeout threshold value in ms)

Both the cntlid and the io timeout are obtainable via sysfs, why add these to the log? The cntlid is often just an ephemeral value assigned by the controller.

opcode is fine to add.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-12-05 18:00                 ` Engel, Amit
@ 2022-12-05 20:11                   ` Sagi Grimberg
  2022-12-06 12:57                     ` Engel, Amit
  0 siblings, 1 reply; 15+ messages in thread
From: Sagi Grimberg @ 2022-12-05 20:11 UTC (permalink / raw)
  To: Engel, Amit, linux-nvme@lists.infradead.org; +Cc: Grupi, Elad


> Yes, io timeout is obtainable via sysfs.
> But this io_timeout param is writable and the user can modify this value at any time.
> So I still find it usufl to have this value as part of this log.

I am sorry but I see no point in logging a static value that is
obtainable via sysfs... I Don't see this practice in any other
block device so it is unclear why this would be needed here
when no one else is doing this?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log
  2022-12-05 20:11                   ` Sagi Grimberg
@ 2022-12-06 12:57                     ` Engel, Amit
  0 siblings, 0 replies; 15+ messages in thread
From: Engel, Amit @ 2022-12-06 12:57 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme@lists.infradead.org; +Cc: Grupi, Elad

ok, so let's go with cid (which includes gen+tag) and opc?

new log:

Dec 06 14:54:38 nc9127122.drm.lab.emc.com kernel: nvme nvme0: queue 9: timeout cid 0x6051 opcode 0x2 type 4

Existing log:

Dec 06 14:54:38 nc9127122.drm.lab.emc.com kernel: nvme nvme0: queue 9: timeout request 0x51 type 4


-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me> 
Sent: Monday, 5 December 2022 22:12
To: Engel, Amit <Amit.Engel@Dell.com>; linux-nvme@lists.infradead.org
Cc: Grupi, Elad <Elad.Grupi@dell.com>
Subject: Re: [PATCH] nvme-tcp: add additional info for nvme_tcp_timeout log


[EXTERNAL EMAIL] 


> Yes, io timeout is obtainable via sysfs.
> But this io_timeout param is writable and the user can modify this value at any time.
> So I still find it usufl to have this value as part of this log.

I am sorry but I see no point in logging a static value that is obtainable via sysfs... I Don't see this practice in any other block device so it is unclear why this would be needed here when no one else is doing this?

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-12-06 12:58 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-16 13:57 [PATCH] tcp: add additional info for nvme_tcp_timeout log amit.engel
2022-11-16 18:37 ` Chaitanya Kulkarni
2022-11-17 13:14   ` Engel, Amit
2022-11-17 13:13 ` [PATCH] nvme-tcp: " amit.engel
2022-11-20 11:59   ` Sagi Grimberg
2022-11-21 11:32     ` Engel, Amit
2022-11-21 11:36       ` Sagi Grimberg
2022-11-21 12:11         ` Engel, Amit
2022-11-21 14:34           ` Sagi Grimberg
2022-12-04 17:28             ` Engel, Amit
2022-12-05  5:38               ` Chaitanya Kulkarni
2022-12-05  9:59               ` Sagi Grimberg
2022-12-05 18:00                 ` Engel, Amit
2022-12-05 20:11                   ` Sagi Grimberg
2022-12-06 12:57                     ` Engel, Amit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox