* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-06-22 12:05 UTC

This parameter was added in order to support a proper timeout for
error recovery before the spec defined a periodic keep-alive.

Now that we have periodic keep-alive, we don't need a user-configurable
transport layer retry count; the keep-alive timeout is sufficient, and
transports can retry for as long as they see fit.

Sagi Grimberg (2):
  nvme-rdma: Don't use tl_retry_count
  nvme-fabrics: Remove tl_retry_count

 drivers/nvme/host/fabrics.c | 14 --------------
 drivers/nvme/host/fabrics.h |  3 ---
 drivers/nvme/host/rdma.c    |  9 +++------
 3 files changed, 3 insertions(+), 23 deletions(-)

--
1.9.1
* [PATCH 1/2] nvme-rdma: Don't use tl_retry_count
  From: Sagi Grimberg @ 2016-06-22 12:06 UTC

Always use the maximum QP retry count, as the error recovery timeout is
dictated by the nvme keep-alive.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/rdma.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index e1205c0d36e4..b939f89ad936 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -117,7 +117,6 @@ struct nvme_rdma_ctrl {
 	u32			queue_count;

 	/* other member variables */
-	unsigned short		tl_retry_count;
 	struct blk_mq_tag_set	tag_set;
 	struct work_struct	delete_work;
 	struct work_struct	reset_work;
@@ -1268,8 +1267,8 @@ static int nvme_rdma_route_resolved(struct nvme_rdma_queue *queue)
 	param.flow_control = 1;
 	param.responder_resources = queue->device->dev->attrs.max_qp_rd_atom;
-	/* rdma_cm will clamp down to max QP retry count (7) */
-	param.retry_count = ctrl->tl_retry_count;
+	/* maximum retry count */
+	param.retry_count = 7;
 	param.rnr_retry_count = 7;
 	param.private_data = &priv;
 	param.private_data_len = sizeof(priv);
@@ -1898,7 +1897,6 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	ctrl->queue_count = opts->nr_io_queues + 1; /* +1 for admin queue */
 	ctrl->ctrl.sqsize = opts->queue_size;
-	ctrl->tl_retry_count = opts->tl_retry_count;
 	ctrl->ctrl.kato = opts->kato;

 	ret = -ENOMEM;
@@ -1975,8 +1973,7 @@ out_free_ctrl:
 static struct nvmf_transport_ops nvme_rdma_transport = {
 	.name		= "rdma",
 	.required_opts	= NVMF_OPT_TRADDR,
-	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_TL_RETRY_COUNT |
-			  NVMF_OPT_RECONNECT_DELAY,
+	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_RECONNECT_DELAY,
 	.create_ctrl	= nvme_rdma_create_ctrl,
 };
--
1.9.1
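For context on the knob being hard-coded above: the retry count is carried to
the peer in the CM REQ through struct rdma_conn_param. Below is a minimal
sketch of that call path; the function name and the priv/priv_len parameters
are made up for illustration, while struct rdma_conn_param and rdma_connect()
are the actual RDMA CM interfaces the driver uses:

	/* Sketch only: how a hard-coded retry_count flows into rdma_connect(). */
	#include <rdma/rdma_cm.h>

	static int example_rdma_connect(struct rdma_cm_id *cm_id,
					void *priv, u8 priv_len)
	{
		struct rdma_conn_param param = { };

		param.qp_num = cm_id->qp->qp_num;
		param.flow_control = 1;
		/*
		 * retry_count is a 3-bit field in the CM REQ, so 7 is the
		 * largest value that can go on the wire; rnr_retry_count of 7
		 * means "retry RNR NAKs indefinitely".
		 */
		param.retry_count = 7;
		param.rnr_retry_count = 7;
		param.private_data = priv;
		param.private_data_len = priv_len;

		return rdma_connect(cm_id, &param);
	}

Hard-coding 7 simply asks the HCA for as many transport retries as the wire
format allows, leaving the actual fast-fail decision to the nvme keep-alive.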
* [PATCH 2/2] nvme-fabrics: Remove tl_retry_count
  From: Sagi Grimberg @ 2016-06-22 12:06 UTC

The timeout before error recovery logic kicks in is dictated by the nvme
keep-alive, so we don't really need a transport layer retry count;
transports can retry for as long as they like.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/fabrics.c | 14 --------------
 drivers/nvme/host/fabrics.h |  3 ---
 2 files changed, 17 deletions(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index b86b6379ef0c..0a5a4c4684bc 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -503,7 +503,6 @@ static const match_table_t opt_tokens = {
 	{ NVMF_OPT_NQN,			"nqn=%s"		},
 	{ NVMF_OPT_QUEUE_SIZE,		"queue_size=%d"		},
 	{ NVMF_OPT_NR_IO_QUEUES,	"nr_io_queues=%d"	},
-	{ NVMF_OPT_TL_RETRY_COUNT,	"tl_retry_count=%d"	},
 	{ NVMF_OPT_RECONNECT_DELAY,	"reconnect_delay=%d"	},
 	{ NVMF_OPT_KATO,		"keep_alive_tmo=%d"	},
 	{ NVMF_OPT_HOSTNQN,		"hostnqn=%s"		},
@@ -521,7 +520,6 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 	/* Set defaults */
 	opts->queue_size = NVMF_DEF_QUEUE_SIZE;
 	opts->nr_io_queues = num_online_cpus();
-	opts->tl_retry_count = 2;
 	opts->reconnect_delay = NVMF_DEF_RECONNECT_DELAY;
 	options = o = kstrdup(buf, GFP_KERNEL);
@@ -605,18 +603,6 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 			opts->nr_io_queues = min_t(unsigned int,
 					num_online_cpus(), token);
 			break;
-		case NVMF_OPT_TL_RETRY_COUNT:
-			if (match_int(args, &token)) {
-				ret = -EINVAL;
-				goto out;
-			}
-			if (token < 0) {
-				pr_err("Invalid tl_retry_count %d\n", token);
-				ret = -EINVAL;
-				goto out;
-			}
-			opts->tl_retry_count = token;
-			break;
 		case NVMF_OPT_KATO:
 			if (match_int(args, &token)) {
 				ret = -EINVAL;
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index b54067404963..89df52c8be97 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -69,8 +69,6 @@ enum {
  * @trsvcid:	network port used for host-controller communication.
  * @queue_size:	Number of IO queue elements.
  * @nr_io_queues: Number of controller IO queues that will be established.
- * @tl_retry_count: Number of transport layer retries for a fabric queue before
- *		    kicking upper layer(s) error recovery.
  * @reconnect_delay: Time between two consecutive reconnect attempts.
  * @discovery_nqn: indicates if the subsysnqn is the well-known discovery NQN.
  * @kato:	Keep-alive timeout.
@@ -84,7 +82,6 @@ struct nvmf_ctrl_options {
 	char			*trsvcid;
 	size_t			queue_size;
 	unsigned int		nr_io_queues;
-	unsigned short		tl_retry_count;
 	unsigned int		reconnect_delay;
 	bool			discovery_nqn;
 	unsigned int		kato;
--
1.9.1
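As background on the code paths this patch trims, the fabrics connect string
is tokenized with the kernel's lib/parser.c helpers (match_token() and
match_int()). A stripped-down, hypothetical sketch of how one integer option
flows from the option string into a field, not the real nvmf_parse_options()
(the example_* names are invented):

	#include <linux/parser.h>
	#include <linux/string.h>
	#include <linux/slab.h>
	#include <linux/errno.h>

	enum { EXAMPLE_OPT_KATO, EXAMPLE_OPT_ERR };

	static const match_table_t example_tokens = {
		{ EXAMPLE_OPT_KATO,	"keep_alive_tmo=%d"	},
		{ EXAMPLE_OPT_ERR,	NULL			},
	};

	/* Sketch: pull "keep_alive_tmo=<n>" out of a comma-separated string. */
	static int example_parse(const char *buf, unsigned int *kato)
	{
		substring_t args[MAX_OPT_ARGS];
		char *options, *o, *p;
		int token, ret = 0;

		options = o = kstrdup(buf, GFP_KERNEL);
		if (!options)
			return -ENOMEM;

		while ((p = strsep(&o, ",\n")) != NULL) {
			if (!*p)
				continue;
			switch (match_token(p, example_tokens, args)) {
			case EXAMPLE_OPT_KATO:
				if (match_int(args, &token)) {
					ret = -EINVAL;
					goto out;
				}
				*kato = token;
				break;
			default:
				break;
			}
		}
	out:
		kfree(options);
		return ret;
	}

Dropping tl_retry_count is therefore just removing one token and one switch
case from this kind of loop; no other option is affected.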
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Christoph Hellwig @ 2016-06-22 16:15 UTC

On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
> This parameter was added in order to support a proper timeout for
> error recovery before the spec defined a periodic keep-alive.
>
> Now that we have periodic keep-alive, we don't need a user-configurable
> transport layer retry count; the keep-alive timeout is sufficient, and
> transports can retry for as long as they see fit.

Isn't there some IB protocol level rationale for a low retry count
in various fabric setups?
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-06-22 16:31 UTC

>> This parameter was added in order to support a proper timeout for
>> error recovery before the spec defined a periodic keep-alive.
>>
>> Now that we have periodic keep-alive, we don't need a user-configurable
>> transport layer retry count; the keep-alive timeout is sufficient, and
>> transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

None that I know of... The QP retry count determines the time it takes
to fail a send/read/write. The retry_count value is multiplied by the
packet timeout (which is the result of an IB-specific computation,
managed by the CM).

It's useful when one needs to limit the time until a send fails in order
to kick error recovery (useful for srp, which doesn't implement a periodic
keep-alive). But since nvme does, I don't see why RDMA or any other
transport should expose this configuration; the keep-alive timeout exists
for that purpose.
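To put rough numbers on that multiplication, here is a small userspace
back-of-the-envelope sketch. It assumes the IB local ACK timeout formula
(4.096us * 2^timeout) and that every attempt waits out a full timeout, which
is a simplification of what the CM and HCA actually do; the numbers are
illustrative only:

	#include <stdio.h>

	/*
	 * Rough sketch: worst-case time before a send/RDMA operation fails.
	 * local_ack_timeout is the 5-bit value carried in the CM REQ; the IB
	 * spec defines the transport timer as 4.096us * 2^local_ack_timeout.
	 */
	static double approx_fail_time_sec(unsigned int retry_count,
					   unsigned int local_ack_timeout)
	{
		double timeout_sec = 4.096e-6 * (double)(1ULL << local_ack_timeout);

		/* initial attempt plus retry_count retries */
		return (retry_count + 1) * timeout_sec;
	}

	int main(void)
	{
		/* a local_ack_timeout of 20 gives roughly 4.3s per attempt */
		printf("retry_count=7: ~%.1fs\n", approx_fail_time_sec(7, 20));
		printf("retry_count=2: ~%.1fs\n", approx_fail_time_sec(2, 20));
		return 0;
	}

With those assumed inputs, the old default of 2 fails in roughly 13 seconds
while the maximum of 7 takes around 34 seconds, which is why the knob only
mattered before keep-alive provided a transport-independent fast-fail bound.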
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Jason Gunthorpe @ 2016-06-22 20:31 UTC

On Wed, Jun 22, 2016 at 09:15:59AM -0700, Christoph Hellwig wrote:
> On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
> > This parameter was added in order to support a proper timeout for
> > error recovery before the spec defined a periodic keep-alive.
> >
> > Now that we have periodic keep-alive, we don't need a user-configurable
> > transport layer retry count; the keep-alive timeout is sufficient, and
> > transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

IIRC the retry count is part of what drives the APM switch over, so
APM configuration should use a lower value.

Jason
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-06-23 7:09 UTC

>>> This parameter was added in order to support a proper timeout for
>>> error recovery before the spec defined a periodic keep-alive.
>>>
>>> Now that we have periodic keep-alive, we don't need a user-configurable
>>> transport layer retry count; the keep-alive timeout is sufficient, and
>>> transports can retry for as long as they see fit.
>>
>> Isn't there some IB protocol level rationale for a low retry count
>> in various fabric setups?
>
> IIRC the retry count is part of what drives the APM switch over, so
> APM configuration should use a lower value.

Completely agree, Jason. Lowering the retry_count is very useful
for APM (Automatic Path Migration).
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Christoph Hellwig @ 2016-06-24 7:13 UTC

On Thu, Jun 23, 2016 at 10:09:36AM +0300, Sagi Grimberg wrote:
> Completely agree, Jason. Lowering the retry_count is very useful
> for APM (Automatic Path Migration).

Does this mean you're retracting the patches?
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-06-26 15:48 UTC

>> Completely agree, Jason. Lowering the retry_count is very useful
>> for APM (Automatic Path Migration).
>
> Does this mean you're retracting the patches?

I'm not; we're not using APM anywhere in nvme-rdma. Multipathing is done
at a higher level than the transport.

Do you see a reason to keep this? I'm not too enthusiastic about leaving
configs around that aren't absolutely needed. As mentioned, this config
was added to provide fast-fail functionality before we defined the
periodic keep-alive...
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-07-17 11:52 UTC

>> Completely agree, Jason. Lowering the retry_count is very useful
>> for APM (Automatic Path Migration).
>
> Does this mean you're retracting the patches?

No, we never use APM in nvme-rdma, so I don't see a good reason why we
should keep it around...

Can I get a review on this btw?
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Christoph Hellwig @ 2016-07-18 4:09 UTC

On Sun, Jul 17, 2016 at 02:52:45PM +0300, Sagi Grimberg wrote:
> Can I get a review on this btw?

Jens already merged that patch after I pinged him last week.
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Christoph Hellwig @ 2016-07-18 4:09 UTC

On Sun, Jul 17, 2016 at 09:09:01PM -0700, Christoph Hellwig wrote:
> On Sun, Jul 17, 2016 at 02:52:45PM +0300, Sagi Grimberg wrote:
> > Can I get a review on this btw?
>
> Jens already merged that patch after I pinged him last week.

s/patch/series/. Time to have a coffee...
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-07-18 8:01 UTC

>> Jens already merged that patch after I pinged him last week.
>
> s/patch/series/. Time to have a coffee...

Thanks Christoph and Jens :)
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Bart Van Assche @ 2016-07-18 15:20 UTC

On 06/22/2016 09:15 AM, Christoph Hellwig wrote:
> On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
>> This parameter was added in order to support a proper timeout for
>> error recovery before the spec defined a periodic keep-alive.
>>
>> Now that we have periodic keep-alive, we don't need a user-configurable
>> transport layer retry count; the keep-alive timeout is sufficient, and
>> transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

The IB spec defines an end-to-end credit mechanism for RC connections.
So if the transport layer is reliable (InfiniBand, RoCE with DCB
enabled), setting the retry count high enough is only needed to avoid
connection shutdown due to brief cable disconnect/reconnect events.

Bart.
* [PATCH 0/2] Get rid of transport layer retry count config parameter
  From: Sagi Grimberg @ 2016-07-20 8:42 UTC

> The IB spec defines an end-to-end credit mechanism for RC connections.
> So if the transport layer is reliable (InfiniBand, RoCE with DCB
> enabled), setting the retry count high enough is only needed to avoid
> connection shutdown due to brief cable disconnect/reconnect events.

Right, this is why I think the driver can use whatever it sees fit (we
have a keep-alive mechanism for fast-fail functionality).