* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-06-22 12:05 UTC

This parameter was added in order to support a proper timeout for
error recovery before the spec defined a periodic keep-alive.

Now that we have periodic keep-alive, we don't need a user-configurable
transport layer retry count; the keep-alive timeout is sufficient, and
transports can retry for as long as they see fit.

Sagi Grimberg (2):
  nvme-rdma: Don't use tl_retry_count
  nvme-fabrics: Remove tl_retry_count

 drivers/nvme/host/fabrics.c | 14 --------------
 drivers/nvme/host/fabrics.h |  3 ---
 drivers/nvme/host/rdma.c    |  9 +++------
 3 files changed, 3 insertions(+), 23 deletions(-)

--
1.9.1
* [PATCH 1/2] nvme-rdma: Don't use tl_retry_count
  From: Sagi Grimberg @ 2016-06-22 12:06 UTC

Always use the maximum QP retry count, as the error recovery timeout is
dictated by the nvme keep-alive.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/rdma.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index e1205c0d36e4..b939f89ad936 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -117,7 +117,6 @@ struct nvme_rdma_ctrl {
 	u32			queue_count;

 	/* other member variables */
-	unsigned short		tl_retry_count;
 	struct blk_mq_tag_set	tag_set;
 	struct work_struct	delete_work;
 	struct work_struct	reset_work;
@@ -1268,8 +1267,8 @@ static int nvme_rdma_route_resolved(struct nvme_rdma_queue *queue)
 	param.flow_control = 1;
 	param.responder_resources = queue->device->dev->attrs.max_qp_rd_atom;
-	/* rdma_cm will clamp down to max QP retry count (7) */
-	param.retry_count = ctrl->tl_retry_count;
+	/* maximum retry count */
+	param.retry_count = 7;
 	param.rnr_retry_count = 7;
 	param.private_data = &priv;
 	param.private_data_len = sizeof(priv);
@@ -1898,7 +1897,6 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	ctrl->queue_count = opts->nr_io_queues + 1; /* +1 for admin queue */
 	ctrl->ctrl.sqsize = opts->queue_size;
-	ctrl->tl_retry_count = opts->tl_retry_count;
 	ctrl->ctrl.kato = opts->kato;

 	ret = -ENOMEM;
@@ -1975,8 +1973,7 @@ out_free_ctrl:
 static struct nvmf_transport_ops nvme_rdma_transport = {
 	.name		= "rdma",
 	.required_opts	= NVMF_OPT_TRADDR,
-	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_TL_RETRY_COUNT |
-			  NVMF_OPT_RECONNECT_DELAY,
+	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_RECONNECT_DELAY,
 	.create_ctrl	= nvme_rdma_create_ctrl,
 };
--
1.9.1
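For context on the knob being hard-coded above: the retry count is carried to
the peer in the CM REQ through struct rdma_conn_param. Below is a minimal
sketch of that call path; the function name and the priv/priv_len parameters
are made up for illustration, while struct rdma_conn_param and rdma_connect()
are the actual RDMA CM interfaces the driver uses:

	/* Sketch only: how a hard-coded retry_count flows into rdma_connect(). */
	#include <rdma/rdma_cm.h>

	static int example_rdma_connect(struct rdma_cm_id *cm_id,
					void *priv, u8 priv_len)
	{
		struct rdma_conn_param param = { };

		param.qp_num = cm_id->qp->qp_num;
		param.flow_control = 1;
		/*
		 * retry_count is a 3-bit field in the CM REQ, so 7 is the
		 * largest value that can go on the wire; rnr_retry_count of 7
		 * means "retry RNR NAKs indefinitely".
		 */
		param.retry_count = 7;
		param.rnr_retry_count = 7;
		param.private_data = priv;
		param.private_data_len = priv_len;

		return rdma_connect(cm_id, &param);
	}

Hard-coding 7 simply asks the HCA for as many transport retries as the wire
format allows, leaving the actual fast-fail decision to the nvme keep-alive.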
* [PATCH 2/2] nvme-fabrics: Remove tl_retry_count
  From: Sagi Grimberg @ 2016-06-22 12:06 UTC

The timeout before error recovery logic kicks in is dictated by the nvme
keep-alive, so we don't really need a transport layer retry count;
transports can retry for as long as they like.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/fabrics.c | 14 --------------
 drivers/nvme/host/fabrics.h |  3 ---
 2 files changed, 17 deletions(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index b86b6379ef0c..0a5a4c4684bc 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -503,7 +503,6 @@ static const match_table_t opt_tokens = {
 	{ NVMF_OPT_NQN,			"nqn=%s"		},
 	{ NVMF_OPT_QUEUE_SIZE,		"queue_size=%d"		},
 	{ NVMF_OPT_NR_IO_QUEUES,	"nr_io_queues=%d"	},
-	{ NVMF_OPT_TL_RETRY_COUNT,	"tl_retry_count=%d"	},
 	{ NVMF_OPT_RECONNECT_DELAY,	"reconnect_delay=%d"	},
 	{ NVMF_OPT_KATO,		"keep_alive_tmo=%d"	},
 	{ NVMF_OPT_HOSTNQN,		"hostnqn=%s"		},
@@ -521,7 +520,6 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 	/* Set defaults */
 	opts->queue_size = NVMF_DEF_QUEUE_SIZE;
 	opts->nr_io_queues = num_online_cpus();
-	opts->tl_retry_count = 2;
 	opts->reconnect_delay = NVMF_DEF_RECONNECT_DELAY;
 	options = o = kstrdup(buf, GFP_KERNEL);
@@ -605,18 +603,6 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 			opts->nr_io_queues = min_t(unsigned int,
 					num_online_cpus(), token);
 			break;
-		case NVMF_OPT_TL_RETRY_COUNT:
-			if (match_int(args, &token)) {
-				ret = -EINVAL;
-				goto out;
-			}
-			if (token < 0) {
-				pr_err("Invalid tl_retry_count %d\n", token);
-				ret = -EINVAL;
-				goto out;
-			}
-			opts->tl_retry_count = token;
-			break;
 		case NVMF_OPT_KATO:
 			if (match_int(args, &token)) {
 				ret = -EINVAL;
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index b54067404963..89df52c8be97 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -69,8 +69,6 @@ enum {
  * @trsvcid:	network port used for host-controller communication.
  * @queue_size:	Number of IO queue elements.
  * @nr_io_queues: Number of controller IO queues that will be established.
- * @tl_retry_count: Number of transport layer retries for a fabric queue before
- *		    kicking upper layer(s) error recovery.
  * @reconnect_delay: Time between two consecutive reconnect attempts.
  * @discovery_nqn: indicates if the subsysnqn is the well-known discovery NQN.
  * @kato:	Keep-alive timeout.
@@ -84,7 +82,6 @@ struct nvmf_ctrl_options {
 	char			*trsvcid;
 	size_t			queue_size;
 	unsigned int		nr_io_queues;
-	unsigned short		tl_retry_count;
 	unsigned int		reconnect_delay;
 	bool			discovery_nqn;
 	unsigned int		kato;
--
1.9.1
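As background on the code paths this patch trims, the fabrics connect string
is tokenized with the kernel's lib/parser.c helpers (match_token() and
match_int()). A stripped-down, hypothetical sketch of how one integer option
flows from the option string into a field, not the real nvmf_parse_options()
(the example_* names are invented):

	#include <linux/parser.h>
	#include <linux/string.h>
	#include <linux/slab.h>
	#include <linux/errno.h>

	enum { EXAMPLE_OPT_KATO, EXAMPLE_OPT_ERR };

	static const match_table_t example_tokens = {
		{ EXAMPLE_OPT_KATO,	"keep_alive_tmo=%d"	},
		{ EXAMPLE_OPT_ERR,	NULL			},
	};

	/* Sketch: pull "keep_alive_tmo=<n>" out of a comma-separated string. */
	static int example_parse(const char *buf, unsigned int *kato)
	{
		substring_t args[MAX_OPT_ARGS];
		char *options, *o, *p;
		int token, ret = 0;

		options = o = kstrdup(buf, GFP_KERNEL);
		if (!options)
			return -ENOMEM;

		while ((p = strsep(&o, ",\n")) != NULL) {
			if (!*p)
				continue;
			switch (match_token(p, example_tokens, args)) {
			case EXAMPLE_OPT_KATO:
				if (match_int(args, &token)) {
					ret = -EINVAL;
					goto out;
				}
				*kato = token;
				break;
			default:
				break;
			}
		}
	out:
		kfree(options);
		return ret;
	}

Dropping tl_retry_count is therefore just removing one token and one switch
case from this kind of loop; no other option is affected.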
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Christoph Hellwig @ 2016-06-22 16:15 UTC

On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
> This parameter was added in order to support a proper timeout for
> error recovery before the spec defined a periodic keep-alive.
>
> Now that we have periodic keep-alive, we don't need a user-configurable
> transport layer retry count; the keep-alive timeout is sufficient, and
> transports can retry for as long as they see fit.

Isn't there some IB protocol level rationale for a low retry count
in various fabric setups?
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-06-22 16:31 UTC

>> This parameter was added in order to support a proper timeout for
>> error recovery before the spec defined a periodic keep-alive.
>>
>> Now that we have periodic keep-alive, we don't need a user-configurable
>> transport layer retry count; the keep-alive timeout is sufficient, and
>> transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

None that I know of... The QP retry count determines the time it takes
to fail a send/read/write. The retry_count value is multiplied by the
packet timeout (which is the result of an IB-specific computation,
managed by the CM).

It's useful when one needs to limit the time until a send fails in order
to kick error recovery (useful for srp, which doesn't implement a periodic
keep-alive). But since nvme does, I don't see why RDMA or any other
transport should expose this configuration; the keep-alive timeout exists
for that purpose.
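To put rough numbers on that multiplication, here is a small userspace
back-of-the-envelope sketch. It assumes the IB local ACK timeout formula
(4.096us * 2^timeout) and that every attempt waits out a full timeout, which
is a simplification of what the CM and HCA actually do; the numbers are
illustrative only:

	#include <stdio.h>

	/*
	 * Rough sketch: worst-case time before a send/RDMA operation fails.
	 * local_ack_timeout is the 5-bit value carried in the CM REQ; the IB
	 * spec defines the transport timer as 4.096us * 2^local_ack_timeout.
	 */
	static double approx_fail_time_sec(unsigned int retry_count,
					   unsigned int local_ack_timeout)
	{
		double timeout_sec = 4.096e-6 * (double)(1ULL << local_ack_timeout);

		/* initial attempt plus retry_count retries */
		return (retry_count + 1) * timeout_sec;
	}

	int main(void)
	{
		/* a local_ack_timeout of 20 gives roughly 4.3s per attempt */
		printf("retry_count=7: ~%.1fs\n", approx_fail_time_sec(7, 20));
		printf("retry_count=2: ~%.1fs\n", approx_fail_time_sec(2, 20));
		return 0;
	}

With those assumed inputs, the old default of 2 fails in roughly 13 seconds
while the maximum of 7 takes around 34 seconds, which is why the knob only
mattered before keep-alive provided a transport-independent fast-fail bound.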
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Jason Gunthorpe @ 2016-06-22 20:31 UTC

On Wed, Jun 22, 2016 at 09:15:59AM -0700, Christoph Hellwig wrote:
> On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
> > This parameter was added in order to support a proper timeout for
> > error recovery before the spec defined a periodic keep-alive.
> >
> > Now that we have periodic keep-alive, we don't need a user-configurable
> > transport layer retry count; the keep-alive timeout is sufficient, and
> > transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

IIRC the retry count is part of what drives the APM switch over, so
APM configuration should use a lower value.

Jason
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-06-23 7:09 UTC

>>> This parameter was added in order to support a proper timeout for
>>> error recovery before the spec defined a periodic keep-alive.
>>>
>>> Now that we have periodic keep-alive, we don't need a user-configurable
>>> transport layer retry count; the keep-alive timeout is sufficient, and
>>> transports can retry for as long as they see fit.
>>
>> Isn't there some IB protocol level rationale for a low retry count
>> in various fabric setups?
>
> IIRC the retry count is part of what drives the APM switch over, so
> APM configuration should use a lower value.

Completely agree, Jason. Lowering the retry_count is very useful
for APM (Automatic Path Migration).
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Christoph Hellwig @ 2016-06-24 7:13 UTC

On Thu, Jun 23, 2016 at 10:09:36AM +0300, Sagi Grimberg wrote:
> Completely agree, Jason. Lowering the retry_count is very useful
> for APM (Automatic Path Migration).

Does this mean you're retracting the patches?
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-06-26 15:48 UTC

>> Completely agree, Jason. Lowering the retry_count is very useful
>> for APM (Automatic Path Migration).
>
> Does this mean you're retracting the patches?

I'm not; we're not using APM anywhere in nvme-rdma. Multipathing is done
at a higher level than the transport.

Do you see a reason to keep this? I'm not too enthusiastic about leaving
configs around that aren't absolutely needed. As mentioned, this config
was added to provide fast-fail functionality before we defined the
periodic keep-alive...
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-07-17 11:52 UTC

>> Completely agree, Jason. Lowering the retry_count is very useful
>> for APM (Automatic Path Migration).
>
> Does this mean you're retracting the patches?

No, we never use APM in nvme-rdma, so I don't see a good reason why we
should keep it around...

Can I get a review on this btw?
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Christoph Hellwig @ 2016-07-18 4:09 UTC

On Sun, Jul 17, 2016 at 02:52:45PM +0300, Sagi Grimberg wrote:
> Can I get a review on this btw?

Jens already merged that patch after I pinged him last week.
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Christoph Hellwig @ 2016-07-18 4:09 UTC

On Sun, Jul 17, 2016 at 09:09:01PM -0700, Christoph Hellwig wrote:
> On Sun, Jul 17, 2016 at 02:52:45PM +0300, Sagi Grimberg wrote:
> > Can I get a review on this btw?
>
> Jens already merged that patch after I pinged him last week.

s/patch/series/. Time to have a coffee...
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-07-18 8:01 UTC

>> Jens already merged that patch after I pinged him last week.
>
> s/patch/series/. Time to have a coffee...

Thanks Christoph and Jens :)
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Bart Van Assche @ 2016-07-18 15:20 UTC

On 06/22/2016 09:15 AM, Christoph Hellwig wrote:
> On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
>> This parameter was added in order to support a proper timeout for
>> error recovery before the spec defined a periodic keep-alive.
>>
>> Now that we have periodic keep-alive, we don't need a user-configurable
>> transport layer retry count; the keep-alive timeout is sufficient, and
>> transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

The IB spec defines an end-to-end credit mechanism for RC connections.
So if the transport layer is reliable (InfiniBand, RoCE with DCB
enabled), setting the retry count high enough is only needed to avoid
connection shutdown due to brief cable disconnect/reconnect events.

Bart.
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-07-20 8:42 UTC

> The IB spec defines an end-to-end credit mechanism for RC connections.
> So if the transport layer is reliable (InfiniBand, RoCE with DCB
> enabled), setting the retry count high enough is only needed to avoid
> connection shutdown due to brief cable disconnect/reconnect events.

Right, this is why I think the driver can use whatever it sees fit (we
have a keep-alive mechanism for fast-fail functionality).