* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Christoph Hellwig @ 2016-06-22 16:15 UTC
To: Sagi Grimberg
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
> This parameter was added in order to support a proper timeout for
> error recovery before the spec defined a periodic keep-alive.
>
> Now that we have periodic keep-alive, we don't need a user configurable
> transport layer retry count, the keep-alive timeout is sufficient,
> transports can retry for as long as they see fit.

Isn't there some IB protocol level rationale for a low retry count
in various fabric setups?

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Sagi Grimberg @ 2016-06-22 16:31 UTC
To: Christoph Hellwig
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-rdma-u79uwXL29TY76Z2rM5mHXA

>> This parameter was added in order to support a proper timeout for
>> error recovery before the spec defined a periodic keep-alive.
>>
>> Now that we have periodic keep-alive, we don't need a user configurable
>> transport layer retry count, the keep-alive timeout is sufficient,
>> transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

None that I know of... The QP retry count determines the time it would
take to fail a send/read/write. The retry_count value is multiplied by
the packet timeout (which is the result of an IB-specific computation
managed by the CM). It's useful when one needs to limit the time until
a send fails in order to kick off error recovery (useful for srp, which
doesn't implement periodic keep-alive), but since nvme does, I don't see
why RDMA or any other transport should expose this configuration, as the
keep-alive timeout exists for that.
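As a rough illustration of the arithmetic described above (retry count
multiplied by the packet timeout), here is a minimal Python sketch. It
assumes the InfiniBand local ACK timeout encoding of 4.096 usec * 2^timeout,
with 0 meaning no timeout. The function names are hypothetical and not part
of any driver, and real HCAs may wait longer than the nominal value per
attempt, so treat the result as an estimate rather than a guarantee.

```python
import math


def local_ack_timeout_usec(timeout_exp: int) -> float:
    """Nominal per-attempt timeout for a 5-bit IB local ACK timeout value.

    Assumed encoding: 4.096 usec * 2**timeout_exp, where 0 disables the timer.
    """
    if not 0 <= timeout_exp <= 31:
        raise ValueError("timeout_exp must fit in 5 bits (0..31)")
    if timeout_exp == 0:
        return math.inf  # timer disabled: the send never times out
    return 4.096 * (2 ** timeout_exp)


def approx_time_to_fail_sec(retry_count: int, timeout_exp: int) -> float:
    """Estimate how long a send takes to fail: retry_count * packet timeout."""
    if not 0 <= retry_count <= 7:
        raise ValueError("retry_count must fit in 3 bits (0..7)")
    return retry_count * local_ack_timeout_usec(timeout_exp) / 1e6


if __name__ == "__main__":
    # Illustrative values only: compare a high and a low retry count.
    for retries in (7, 2):
        print(retries, approx_time_to_fail_sec(retries, timeout_exp=14))
```

With a fixed packet timeout, lowering the retry count shortens the time
until the send fails, which is the knob the patches propose to stop
exposing now that the keep-alive timeout covers the fast-fail case.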
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Jason Gunthorpe @ 2016-06-22 20:31 UTC
To: Christoph Hellwig
Cc: Sagi Grimberg, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Jun 22, 2016 at 09:15:59AM -0700, Christoph Hellwig wrote:
> On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
> > This parameter was added in order to support a proper timeout for
> > error recovery before the spec defined a periodic keep-alive.
> >
> > Now that we have periodic keep-alive, we don't need a user configurable
> > transport layer retry count, the keep-alive timeout is sufficient,
> > transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

IIRC the retry count is part of what drives the APM switch over, so
APM configuration should use a lower value.

Jason
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Sagi Grimberg @ 2016-06-23 7:09 UTC
To: Jason Gunthorpe, Christoph Hellwig
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-rdma-u79uwXL29TY76Z2rM5mHXA

>>> This parameter was added in order to support a proper timeout for
>>> error recovery before the spec defined a periodic keep-alive.
>>>
>>> Now that we have periodic keep-alive, we don't need a user configurable
>>> transport layer retry count, the keep-alive timeout is sufficient,
>>> transports can retry for as long as they see fit.
>>
>> Isn't there some IB protocol level rationale for a low retry count
>> in various fabric setups?
>
> IIRC the retry count is part of what drives the APM switch over, so
> APM configuration should use a lower value.

Completely agree Jason. Lowering the retry_count is very useful
for APM (Automatic Path Migration).
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Christoph Hellwig @ 2016-06-24 7:13 UTC
To: Sagi Grimberg
Cc: Jason Gunthorpe, Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Thu, Jun 23, 2016 at 10:09:36AM +0300, Sagi Grimberg wrote:
> Completely agree Jason. Lowering the retry_count is very useful
> for APM (Automatic Path Migration).

Does this mean you're retracting the patches?
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Sagi Grimberg @ 2016-06-26 15:48 UTC
To: Christoph Hellwig
Cc: Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

>> Completely agree Jason. Lowering the retry_count is very useful
>> for APM (Automatic Path Migration).
>
> Does this mean you're retracting the patches?

I'm not, we're not using APM anywhere in nvme-rdma. Multipathing is done
at a higher level than the transport.

Do you see a reason to keep this? I'm not too enthusiastic about leaving
configs that aren't absolutely needed. As mentioned, this config was added
to provide fast-fail functionality before we defined the periodic
keep-alive...
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Sagi Grimberg @ 2016-07-17 11:52 UTC
To: Christoph Hellwig
Cc: Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

>> Completely agree Jason. Lowering the retry_count is very useful
>> for APM (Automatic Path Migration).
>
> Does this mean you're retracting the patches?

No, we never use APM in nvme-rdma, so I don't see a good reason
why we should keep it around...

Can I get a review on this btw?
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Christoph Hellwig @ 2016-07-18 4:09 UTC
To: Sagi Grimberg
Cc: Christoph Hellwig, Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Sun, Jul 17, 2016 at 02:52:45PM +0300, Sagi Grimberg wrote:
> Can I get a review on this btw?

Jens already merged that patch after I pinged him last week.
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Christoph Hellwig @ 2016-07-18 4:09 UTC
To: Sagi Grimberg
Cc: Christoph Hellwig, Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Sun, Jul 17, 2016 at 09:09:01PM -0700, Christoph Hellwig wrote:
> On Sun, Jul 17, 2016 at 02:52:45PM +0300, Sagi Grimberg wrote:
> > Can I get a review on this btw?
>
> Jens already merged that patch after I pinged him last week.

s/patch/series/. Time to have a coffee..
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Sagi Grimberg @ 2016-07-18 8:01 UTC
To: Christoph Hellwig
Cc: Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

>> Jens already merged that patch after I pinged him last week.
>
> s/patch/series/. Time to have a coffee..

Thanks Christoph and Jens :)
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Bart Van Assche @ 2016-07-18 15:20 UTC
To: Christoph Hellwig, Sagi Grimberg
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 06/22/2016 09:15 AM, Christoph Hellwig wrote:
> On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
>> This parameter was added in order to support a proper timeout for
>> error recovery before the spec defined a periodic keep-alive.
>>
>> Now that we have periodic keep-alive, we don't need a user configurable
>> transport layer retry count, the keep-alive timeout is sufficient,
>> transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?

The IB spec defines an end-to-end credit mechanism for RC connections.
So if the transport layer is reliable (InfiniBand, RoCE with DCB
enabled) setting the retry count high enough is only needed to avoid
connection shutdown due to brief cable disconnect/reconnect events.

Bart.
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
From: Sagi Grimberg @ 2016-07-20 8:42 UTC
To: Bart Van Assche, Christoph Hellwig
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> The IB spec defines an end-to-end credit mechanism for RC connections.
> So if the transport layer is reliable (InfiniBand, RoCE with DCB
> enabled) setting the retry count high enough is only needed to avoid
> connection shutdown due to brief cable disconnect/reconnect events.

Right, this is why I think the driver can use whatever it
sees fit (we have a keep-alive mechanism for fast-fail functionality).
Thread overview: 12+ messages
[not found] <1466597161-5242-1-git-send-email-sagi@grimberg.me>
[not found] ` <1466597161-5242-1-git-send-email-sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-06-22 16:15 ` [PATCH 0/2] Get rid of transport layer retry count config parameter Christoph Hellwig
[not found] ` <20160622161559.GA18361-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-06-22 16:31 ` Sagi Grimberg
2016-06-22 20:31 ` Jason Gunthorpe
[not found] ` <20160622203110.GA20838-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-23 7:09 ` Sagi Grimberg
[not found] ` <576B8B30.8080402-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-06-24 7:13 ` Christoph Hellwig
[not found] ` <20160624071336.GE4252-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-06-26 15:48 ` Sagi Grimberg
2016-07-17 11:52 ` Sagi Grimberg
[not found] ` <578B718D.2090909-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-07-18 4:09 ` Christoph Hellwig
[not found] ` <20160718040901.GA2521-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-07-18 4:09 ` Christoph Hellwig
[not found] ` <20160718040946.GB2521-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-07-18 8:01 ` Sagi Grimberg
2016-07-18 15:20 ` Bart Van Assche
[not found] ` <12b64608-1d42-ffe6-c11a-58139cbabd3a-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2016-07-20 8:42 ` Sagi Grimberg