[PATCH] Handling busy responses from the SA

public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] Handling busy responses from the SA
@ 2010-06-04 18:30 Mike Heinz
       [not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB49A488DAF3-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Heinz @ 2010-06-04 18:30 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

The purpose of this patch is to cause the ib_mad driver to discard busy responses from the SA, effectively causing busy responses to become time outs.

This ensures that naïve IB applications cannot overwhelm the SA with queries, which could happen when a cluster is being rebooted, or when a large HPC application is started.

Note that this patch directly changes the same code affected by the mad user rmpp patch - it cannot be successfully applied without that patch.

Signed-Off-By: Michael Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>

----

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index efca783..05f2930 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1815,9 +1815,20 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
 	 */
 	/* Complete corresponding request */
 	if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
+		u16 busy = __be16_to_cpu(mad_recv_wc->recv_buf.mad->mad_hdr.status) &
+					IB_MGMT_MAD_STATUS_BUSY;
+
 		spin_lock_irqsave(&mad_agent_priv->lock, flags);
 		mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
 		if (mad_send_wr) {
+			if (busy && mad_send_wr->retries_left) {
+				/* Just let the query timeout and have it requeued later */
+				spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+				ib_free_recv_mad(mad_recv_wc);
+				deref_mad_agent(mad_agent_priv);
+				printk(KERN_NOTICE PFX "Response returned with MAD_STATUS_BUSY\n");
+				return;
+			}
 			ib_mark_mad_done(mad_send_wr);
 			spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
 
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h index 2651e93..e9dc4cc 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -77,6 +77,15 @@
 
 #define IB_MGMT_MAX_METHODS			128
 
+/* MAD Status field bit masks */
+#define IB_MGMT_MAD_STATUS_SUCCESS						0x0000
+#define IB_MGMT_MAD_STATUS_BUSY							0x0001
+#define IB_MGMT_MAD_STATUS_REDIRECT_REQD				0x0002
+#define IB_MGMT_MAD_STATUS_BAD_VERERSION				0x0004	
+#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD			0x0008	
+#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD_ATTRIB	0x000c
+#define IB_MGMT_MAD_STATUS_INVALID_ATTRIB_VALUE			0x001c
+
 /* RMPP information */
 #define IB_MGMT_RMPP_VERSION			1
 #define IB_MGMT_RMPP_PASSTHRU			255
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Handling busy responses from the SA
       [not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB49A488DAF3-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
@ 2010-06-04 19:59   ` Roland Dreier
       [not found]     ` <adaocfqo9gu.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
  2010-06-04 21:05   ` Hefty, Sean
  1 sibling, 1 reply; 11+ messages in thread
From: Roland Dreier @ 2010-06-04 19:59 UTC (permalink / raw)
  To: Mike Heinz; +Cc: linux-rdma@vger.kernel.org, ewg@openfabrics.org

 > The purpose of this patch is to cause the ib_mad driver to discard
 > busy responses from the SA, effectively causing busy responses to
 > become time outs.

I don't have a strong opinion on this but it seems a bit odd.  If we're
just going to drop the response anyway, why did the SA send it in the
first place?  On the other hand, if the SA told us it's busy, it does
seem we could do something more sensible than retrying immediately.

Any opinions from anyone who worked on fabric scalability?

 > +				printk(KERN_NOTICE PFX "Response returned with MAD_STATUS_BUSY\n");

Do we want to spam kernel logs with this?  Seems it could generate a lot
of messages.

 > +#define IB_MGMT_MAD_STATUS_SUCCESS						0x0000
 > +#define IB_MGMT_MAD_STATUS_BUSY							0x0001
 > +#define IB_MGMT_MAD_STATUS_REDIRECT_REQD				0x0002
 > +#define IB_MGMT_MAD_STATUS_BAD_VERERSION				0x0004	
 > +#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD			0x0008	
 > +#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD_ATTRIB	0x000c
 > +#define IB_MGMT_MAD_STATUS_INVALID_ATTRIB_VALUE			0x001c

The indentation of values seems pretty crazy here.  Also I'm not sure
what most of these defines are for?  They seem unused in this patch.
-- 
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Handling busy responses from the SA
       [not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB49A488DAF3-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
  2010-06-04 19:59   ` Roland Dreier
@ 2010-06-04 21:05   ` Hefty, Sean
       [not found]     ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F4DC96C-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 11+ messages in thread
From: Hefty, Sean @ 2010-06-04 21:05 UTC (permalink / raw)
  To: Mike Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

> This ensures that naïve IB applications cannot overwhelm the SA with
> queries, which could happen when a cluster is being rebooted, or when a
> large HPC application is started.

I don't object to the concept of treating a busy response as a timeout, but how does this help prevent overwhelming the SA?  It continues to retry the queries, even if the SA says that it's too busy to respond without adjusting the timeout specified by the user.  I would think that you'd at least want to adjust the timeout (double it or use some random backoff).

The general guideline that we've been using for adjusting timeouts has been to report the failures and let the caller make the a necessary adjustments.  As far as I know, the only way for user space applications to query the SA are through the librdmacm, which sets retries to 0, or through the libibumad interface directly.  I would expect any application using the latter to be intelligent enough to handle a busy response.

Maybe we should re-think that guideline and allow users to simply indicate that the MAD layer should use reasonable defaults.  This would enable the ib_mad module to adjust the timeout values for all consumers based on actual destination response times.  It could also back off retrying multiple requests that were initiated around the same time, instead only retrying the first request, while simply increasing the timeout values for the others.  This is more complex, but we should be able to start with something fairly simple.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Handling busy responses from the SA
       [not found]     ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F4DC96C-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2010-06-04 22:57       ` Jason Gunthorpe
       [not found]         ` <20100604225739.GK15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2010-06-07 15:42       ` Mike Heinz
  1 sibling, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2010-06-04 22:57 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Mike Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

On Fri, Jun 04, 2010 at 02:05:10PM -0700, Hefty, Sean wrote:

> Maybe we should re-think that guideline and allow users to simply
> indicate that the MAD layer should use reasonable defaults.  This
> would enable the ib_mad module to adjust the timeout values for all
> consumers based on actual destination response times.  It could also
> back off retrying multiple requests that were initiated around the
> same time, instead only retrying the first request, while simply
> increasing the timeout values for the others.  This is more complex,
> but we should be able to start with something fairly simple.

A common method for handling this sort of thing is to randomize
the retry timeout. It would be a good idea to randomize all timeouts,
but the BUSY replies should probably randomize over a longer time
period.

Randomization prevents nodes in the cluster from self-synchronizing
and making the load on the SA worse.

But, I also agree with Roland.. having the SA return busy when it is
under load seems insane :) But if you really want to do this then I
think a different, larger, timeout should be used than the standard
mad timeout.

Also, I guess, it would be a good API choice if the caller could say
'get me a reply for this mad or error within 60s' rather than specify
details like retry counts, etc. The timeout values should be globally
set and derived from the usual SA provided data for network transits...

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Handling busy responses from the SA
       [not found]         ` <20100604225739.GK15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2010-06-04 23:19           ` Hefty, Sean
  2010-06-07 16:02           ` Mike Heinz
  2010-06-08 16:44           ` Hefty, Sean
  2 siblings, 0 replies; 11+ messages in thread
From: Hefty, Sean @ 2010-06-04 23:19 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Mike Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

> A common method for handling this sort of thing is to randomize
> the retry timeout. It would be a good idea to randomize all timeouts,
> but the BUSY replies should probably randomize over a longer time
> period.
> 
> Randomization prevents nodes in the cluster from self-synchronizing
> and making the load on the SA worse.

I agree that randomization would be nice, but I think we want even more than that.  Part of the issues that we've seen with the current implementation is that when a large HPC job starts, everyone and their dog sends the SA a query.  These time out around the same time and get resent, and the SA ends up processing a huge number of duplicates.  The mad layer could be a lot more intelligent and avoid sending more than a handful (1?) of retries (or even initial requests) at a time until some complete.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Handling busy responses from the SA
       [not found]     ` <adaocfqo9gu.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
@ 2010-06-07 15:14       ` Mike Heinz
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Heinz @ 2010-06-07 15:14 UTC (permalink / raw)
  To: Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

Roland Dreier said:

> I don't have a strong opinion on this but it seems a bit odd.  If we're just going to drop the response anyway, why did the SA send it in the first place?  On the other hand, if the SA told us it's busy, it does seem we could do something more sensible than retrying immediately.

The spec provides for the SA to return a BUSY response. When that happens, this patch causes us to wait for the original request to time out before retrying, not trying again immediately. In effect, we are pretending we never got the BUSY response and allowing the request to time out, instead.

Roland Dreier said:

> The indentation of values seems pretty crazy here.  Also I'm not sure what most of these defines are for?  They seem unused in this patch.

The indentation is probably from the conversion of tabs to spaces when the patch was pasted into the email - correcting it is no problem.  The value IB_MGMT_MAD_STATUS_BUSY is used in the patch, the others are defined because they are the other possible values for the same status field. We might as well define them all, for completeness.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Handling busy responses from the SA
       [not found]     ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F4DC96C-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2010-06-04 22:57       ` Jason Gunthorpe
@ 2010-06-07 15:42       ` Mike Heinz
  1 sibling, 0 replies; 11+ messages in thread
From: Mike Heinz @ 2010-06-07 15:42 UTC (permalink / raw)
  To: Hefty, Sean, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

Sean said:
> I don't object to the concept of treating a busy response as a timeout, but how does this help prevent overwhelming the SA?  It continues to retry the queries, even if the SA says that it's too busy to respond without adjusting the timeout specified by the user.  I would think that you'd at least want to adjust the timeout (double it or use some random backoff).

Well, the current behavior is to simply return the BUSY to the client or ULP, which  is either treated as a permanent error or causes an immediate retry. This can be a big problem with, for example, ipoib which sets retries to 15 and (as I understand it) immediately retries to connect when getting an error response from the SA. Other ulps have similar settings. Without some kind of delay, starting up ipoib on a large fabric (at boot time, for example) can cause a real packet storm. 

By treating BUSY replies identically to timeouts, this patch at least introduces a delay between attempts. In the case of the ULPs, the delay is typically 4 seconds.

Sean said:
> The general guideline that we've been using for adjusting timeouts has been to report the failures and let the caller make the a necessary adjustments.  As far as I know, the only way for user space applications to query the SA are through the librdmacm, which sets retries to 0, or through the libibumad interface directly.  I would expect any application using the latter to be intelligent enough to handle a busy response.

And this approach encourages applications to adjust their timeouts appropriately by treating BUSY responses as non-events and forcing the applications to wait for their request to time out.

Depending on the application developers to take BUSY responses into account seems to be asking for trouble - it allows one rogue app to bring the SA to its knees, for example. By enforcing this timeout model in the kernel, we guarantee that there will be at least some delay between each message when the SA is reporting a busy status. And as I previously mentioned this patch also affects kernel code, much of which does use retries.

Sean said:
> Maybe we should re-think that guideline and allow users to simply indicate that the MAD layer should use reasonable defaults.  This would enable the ib_mad module to adjust the timeout values for all consumers based on actual destination response times.  It could also back off retrying multiple requests that were initiated around the same time, instead only retrying the first request, while simply increasing the timeout values for the others.  This is more complex, but we should be able to start with something fairly simple.

It's an interesting idea, but in the meantime this is a problem that affects large clusters today.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Handling busy responses from the SA
       [not found]         ` <20100604225739.GK15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2010-06-04 23:19           ` Hefty, Sean
@ 2010-06-07 16:02           ` Mike Heinz
       [not found]             ` <4C2744E8AD2982428C5BFE523DF8CDCB49A488DC0D-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
  2010-06-08 16:44           ` Hefty, Sean
  2 siblings, 1 reply; 11+ messages in thread
From: Mike Heinz @ 2010-06-07 16:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Hefty, Sean
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

> But, I also agree with Roland.. having the SA return busy when it is
under load seems insane :) 

In that case, what is the purpose of the BUSY response? 

-----Original Message-----
From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org] 
Sent: Friday, June 04, 2010 6:58 PM
To: Hefty, Sean
Cc: Mike Heinz; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org
Subject: Re: [PATCH] Handling busy responses from the SA

On Fri, Jun 04, 2010 at 02:05:10PM -0700, Hefty, Sean wrote:

> Maybe we should re-think that guideline and allow users to simply
> indicate that the MAD layer should use reasonable defaults.  This
> would enable the ib_mad module to adjust the timeout values for all
> consumers based on actual destination response times.  It could also
> back off retrying multiple requests that were initiated around the
> same time, instead only retrying the first request, while simply
> increasing the timeout values for the others.  This is more complex,
> but we should be able to start with something fairly simple.

A common method for handling this sort of thing is to randomize
the retry timeout. It would be a good idea to randomize all timeouts,
but the BUSY replies should probably randomize over a longer time
period.

Randomization prevents nodes in the cluster from self-synchronizing
and making the load on the SA worse.

But, I also agree with Roland.. having the SA return busy when it is
under load seems insane :) But if you really want to do this then I
think a different, larger, timeout should be used than the standard
mad timeout.

Also, I guess, it would be a good API choice if the caller could say
'get me a reply for this mad or error within 60s' rather than specify
details like retry counts, etc. The timeout values should be globally
set and derived from the usual SA provided data for network transits...

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Handling busy responses from the SA
       [not found]             ` <4C2744E8AD2982428C5BFE523DF8CDCB49A488DC0D-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
@ 2010-06-07 16:41               ` Jason Gunthorpe
  0 siblings, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2010-06-07 16:41 UTC (permalink / raw)
  To: Mike Heinz
  Cc: Hefty, Sean, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

On Mon, Jun 07, 2010 at 11:02:10AM -0500, Mike Heinz wrote:
> > But, I also agree with Roland.. having the SA return busy when it is
> > under load seems insane :) 
> 
> In that case, what is the purpose of the BUSY response? 

I would say to indicate that the SA is internally busy (ie computing
routing tables, etc) and that the client should sit back and wait.

Too busy to handle the request load should just rely on timeouts.

-- 
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>        (780)4406067x832
Chief Technology Officer, Obsidian Research Corp         Edmonton, Canada
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Handling busy responses from the SA
       [not found]         ` <20100604225739.GK15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2010-06-04 23:19           ` Hefty, Sean
  2010-06-07 16:02           ` Mike Heinz
@ 2010-06-08 16:44           ` Hefty, Sean
       [not found]             ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F578BBE-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2 siblings, 1 reply; 11+ messages in thread
From: Hefty, Sean @ 2010-06-08 16:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Mike Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

> Also, I guess, it would be a good API choice if the caller could say
> 'get me a reply for this mad or error within 60s' rather than specify
> details like retry counts, etc. The timeout values should be globally
> set and derived from the usual SA provided data for network transits...

I agree with this.  Within the framework of the existing umad ABI, this could be specified by setting the high bit in the ib_user_mad_hdr:timeout_ms field, assuming that no one is using that bit in practice.  The kernel could then freely select the retry/timeout policy for these clients, which for starters could include dropping BUSY responses and adjusting the timeout using an approach similar to what Mike mentioned in a separate email.  Kernel clients could be updated to use this new mode.

Any disagreements to this approach?  
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Handling busy responses from the SA
       [not found]             ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F578BBE-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2010-06-08 18:11               ` Mike Heinz
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Heinz @ 2010-06-08 18:11 UTC (permalink / raw)
  To: Hefty, Sean, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org

It's workable, although I really wish there was a way to handle "stupid" apps that aren't written to handle a busy response.

-----Original Message-----
From: Hefty, Sean [mailto:sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org] 
Sent: Tuesday, June 08, 2010 12:44 PM
To: Jason Gunthorpe
Cc: Mike Heinz; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org
Subject: RE: [PATCH] Handling busy responses from the SA

> Also, I guess, it would be a good API choice if the caller could say
> 'get me a reply for this mad or error within 60s' rather than specify
> details like retry counts, etc. The timeout values should be globally
> set and derived from the usual SA provided data for network transits...

I agree with this.  Within the framework of the existing umad ABI, this could be specified by setting the high bit in the ib_user_mad_hdr:timeout_ms field, assuming that no one is using that bit in practice.  The kernel could then freely select the retry/timeout policy for these clients, which for starters could include dropping BUSY responses and adjusting the timeout using an approach similar to what Mike mentioned in a separate email.  Kernel clients could be updated to use this new mode.

Any disagreements to this approach?  
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2010-06-08 18:11 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-04 18:30 [PATCH] Handling busy responses from the SA Mike Heinz
     [not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB49A488DAF3-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
2010-06-04 19:59   ` Roland Dreier
     [not found]     ` <adaocfqo9gu.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-06-07 15:14       ` Mike Heinz
2010-06-04 21:05   ` Hefty, Sean
     [not found]     ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F4DC96C-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2010-06-04 22:57       ` Jason Gunthorpe
     [not found]         ` <20100604225739.GK15969-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-06-04 23:19           ` Hefty, Sean
2010-06-07 16:02           ` Mike Heinz
     [not found]             ` <4C2744E8AD2982428C5BFE523DF8CDCB49A488DC0D-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
2010-06-07 16:41               ` Jason Gunthorpe
2010-06-08 16:44           ` Hefty, Sean
     [not found]             ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F578BBE-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2010-06-08 18:11               ` Mike Heinz
2010-06-07 15:42       ` Mike Heinz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox