* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <1466597161-5242-1-git-send-email-sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2016-06-22 16:15 ` Christoph Hellwig
[not found] ` <20160622161559.GA18361-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2016-06-22 16:15 UTC (permalink / raw)
To: Sagi Grimberg
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
> This parameter was added in order to support a proper timeout for
> error recovery before the spec defined a periodic keep-alive.
>
> Now that we have periodic keep-alive, we don't need a user configurable
> transport layer retry count, the keep-alive timeout is sufficient,
> transports can retry for as long as they see fit.
Isn't there some IB protocol level rationale for a low retry count
in various fabric setups?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <20160622161559.GA18361-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-06-22 16:31 ` Sagi Grimberg
2016-06-22 20:31 ` Jason Gunthorpe
2016-07-18 15:20 ` Bart Van Assche
2 siblings, 0 replies; 12+ messages in thread
From: Sagi Grimberg @ 2016-06-22 16:31 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
>> This parameter was added in order to support a proper timeout for
>> error recovery before the spec defined a periodic keep-alive.
>>
>> Now that we have periodic keep-alive, we don't need a user configurable
>> transport layer retry count, the keep-alive timeout is sufficient,
>> transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?
None that I know of... The QP retry count determines the time it would
take to fail a send/read/write.. The retry_count value is multiplied
with the packet timeout (which is a result of an IB specific
computation - managed by the CM).
It's useful when one needs to limit the time until a send fails in order
to kick error recovery (useful for srp which doesn't implement periodic
keep-alive), but since nvme does, I don't see the reason why RDMA or any
other transport should expose this configuration as the keep-alive
timeout exists for that.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <20160622161559.GA18361-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-06-22 16:31 ` Sagi Grimberg
@ 2016-06-22 20:31 ` Jason Gunthorpe
[not found] ` <20160622203110.GA20838-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-07-18 15:20 ` Bart Van Assche
2 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2016-06-22 20:31 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Wed, Jun 22, 2016 at 09:15:59AM -0700, Christoph Hellwig wrote:
> On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
> > This parameter was added in order to support a proper timeout for
> > error recovery before the spec defined a periodic keep-alive.
> >
> > Now that we have periodic keep-alive, we don't need a user configurable
> > transport layer retry count, the keep-alive timeout is sufficient,
> > transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?
IIRC the retry count is part of what drives the APM switch over, so
APM configuration should use a lower value.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <20160622203110.GA20838-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-06-23 7:09 ` Sagi Grimberg
[not found] ` <576B8B30.8080402-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2016-06-23 7:09 UTC (permalink / raw)
To: Jason Gunthorpe, Christoph Hellwig
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
>>> This parameter was added in order to support a proper timeout for
>>> error recovery before the spec defined a periodic keep-alive.
>>>
>>> Now that we have periodic keep-alive, we don't need a user configurable
>>> transport layer retry count, the keep-alive timeout is sufficient,
>>> transports can retry for as long as they see fit.
>>
>> Isn't there some IB protocol level rationale for a low retry count
>> in various fabric setups?
>
> IIRC the retry count is part of what drives the APM switch over, so
> APM configuration should use a lower value.
Completely agree Jason. Lowering the retry_count is very useful
for APM (Automatic Path Migration).
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <576B8B30.8080402-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2016-06-24 7:13 ` Christoph Hellwig
[not found] ` <20160624071336.GE4252-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2016-06-24 7:13 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Jason Gunthorpe, Christoph Hellwig,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On Thu, Jun 23, 2016 at 10:09:36AM +0300, Sagi Grimberg wrote:
> Completely agree Jason. Lowering the retry_count is very useful
> for APM (Automatic Path Migration).
Does this mean you're retracting the patches?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <20160624071336.GE4252-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-06-26 15:48 ` Sagi Grimberg
2016-07-17 11:52 ` Sagi Grimberg
1 sibling, 0 replies; 12+ messages in thread
From: Sagi Grimberg @ 2016-06-26 15:48 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
>> Completely agree Jason. Lowering the retry_count is very useful
>> for APM (Automatic Path Migration).
>
> Does this mean you're retracting the patches?
I'm not, were not using APM anywhere in nvme-rdma. multipathing
is done at a higher level than the transport.
Do you see a reason to keep this? I'm not too enthusiast with leaving
configs that aren't absolutely needed. As mentioned this config was
added to add a fast-fail functionality before we defined the periodic
keep-alive...
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <20160624071336.GE4252-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-06-26 15:48 ` Sagi Grimberg
@ 2016-07-17 11:52 ` Sagi Grimberg
[not found] ` <578B718D.2090909-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
1 sibling, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2016-07-17 11:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
>> Completely agree Jason. Lowering the retry_count is very useful
>> for APM (Automatic Path Migration).
>
> Does this mean you're retracting the patches?
No, we never use APM in nvme-rdma, so I don't see a good reason
why we should keep it around....
Can I get a review on this btw?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <578B718D.2090909-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2016-07-18 4:09 ` Christoph Hellwig
[not found] ` <20160718040901.GA2521-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2016-07-18 4:09 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Christoph Hellwig, Jason Gunthorpe,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On Sun, Jul 17, 2016 at 02:52:45PM +0300, Sagi Grimberg wrote:
> Can I get a review on this btw?
Jens already merged that patch after I pinged him last week.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <20160718040901.GA2521-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-07-18 4:09 ` Christoph Hellwig
[not found] ` <20160718040946.GB2521-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2016-07-18 4:09 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Christoph Hellwig, Jason Gunthorpe,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On Sun, Jul 17, 2016 at 09:09:01PM -0700, Christoph Hellwig wrote:
> On Sun, Jul 17, 2016 at 02:52:45PM +0300, Sagi Grimberg wrote:
> > Can I get a review on this btw?
>
> Jens already merged that patch after I pinged him last week.
s/patch/series/. Time to havea coffee..
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <20160718040946.GB2521-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-07-18 8:01 ` Sagi Grimberg
0 siblings, 0 replies; 12+ messages in thread
From: Sagi Grimberg @ 2016-07-18 8:01 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
>> Jens already merged that patch after I pinged him last week.
>
> s/patch/series/. Time to havea coffee..
Thanks Christoph and Jens :)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <20160622161559.GA18361-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-06-22 16:31 ` Sagi Grimberg
2016-06-22 20:31 ` Jason Gunthorpe
@ 2016-07-18 15:20 ` Bart Van Assche
[not found] ` <12b64608-1d42-ffe6-c11a-58139cbabd3a-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2016-07-18 15:20 UTC (permalink / raw)
To: Christoph Hellwig, Sagi Grimberg
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 06/22/2016 09:15 AM, Christoph Hellwig wrote:
> On Wed, Jun 22, 2016 at 03:05:59PM +0300, Sagi Grimberg wrote:
>> This parameter was added in order to support a proper timeout for
>> error recovery before the spec defined a periodic keep-alive.
>>
>> Now that we have periodic keep-alive, we don't need a user configurable
>> transport layer retry count, the keep-alive timeout is sufficient,
>> transports can retry for as long as they see fit.
>
> Isn't there some IB protocol level rationale for a low retry count
> in various fabric setups?
The IB spec defines an end-to-end credit mechanism for RC connections.
So if the transport layer is reliable (InfiniBand, RoCE with DCB
enabled) setting the retry count high enough is only needed to avoid
connection shutdown due to brief cable disconnect/reconnect events.
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] Get rid of transport layer retry count config parameter
[not found] ` <12b64608-1d42-ffe6-c11a-58139cbabd3a-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2016-07-20 8:42 ` Sagi Grimberg
0 siblings, 0 replies; 12+ messages in thread
From: Sagi Grimberg @ 2016-07-20 8:42 UTC (permalink / raw)
To: Bart Van Assche, Christoph Hellwig
Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
> The IB spec defines an end-to-end credit mechanism for RC connections.
> So if the transport layer is reliable (InfiniBand, RoCE with DCB
> enabled) setting the retry count high enough is only needed to avoid
> connection shutdown due to brief cable disconnect/reconnect events.
Right, this is why I think the driver can use whatever it sees fit (we
have a keep-alive mechanism for fast-fail functionality).
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-07-20 8:42 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1466597161-5242-1-git-send-email-sagi@grimberg.me>
[not found] ` <1466597161-5242-1-git-send-email-sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-06-22 16:15 ` [PATCH 0/2] Get rid of transport layer retry count config parameter Christoph Hellwig
[not found] ` <20160622161559.GA18361-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-06-22 16:31 ` Sagi Grimberg
2016-06-22 20:31 ` Jason Gunthorpe
[not found] ` <20160622203110.GA20838-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-23 7:09 ` Sagi Grimberg
[not found] ` <576B8B30.8080402-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-06-24 7:13 ` Christoph Hellwig
[not found] ` <20160624071336.GE4252-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-06-26 15:48 ` Sagi Grimberg
2016-07-17 11:52 ` Sagi Grimberg
[not found] ` <578B718D.2090909-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-07-18 4:09 ` Christoph Hellwig
[not found] ` <20160718040901.GA2521-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-07-18 4:09 ` Christoph Hellwig
[not found] ` <20160718040946.GB2521-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-07-18 8:01 ` Sagi Grimberg
2016-07-18 15:20 ` Bart Van Assche
[not found] ` <12b64608-1d42-ffe6-c11a-58139cbabd3a-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2016-07-20 8:42 ` Sagi Grimberg
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox