public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Khapyorsky <sashak-smomgflXvOZWk0Htik3J/w@public.gmane.org>
To: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
Cc: Hal Rosenstock
	<hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: SRP issues with OpenSM 3.3.3
Date: Tue, 15 Dec 2009 19:03:17 +0200	[thread overview]
Message-ID: <20091215170317.GV5262@me> (raw)
In-Reply-To: <20091214164334.064102f0.weiny2-i2BcT+NCU+M@public.gmane.org>

Hi Ira,

On 16:43 Mon 14 Dec     , Ira Weiny wrote:
> 
> I have found that the following patch caused our SRP connected storage to
> break.
> 
> patch: 3d20f82edd3246879063b77721d0bcef927bdc48
> 
>     opensm/osm_sa_path_record.c: separate router guid resolution code
>     
>     Move off subnet destination (router address) resolution code to separate
>     function to improve readability.
>     
>     Signed-off-by: Sasha Khapyorsky <sashak-smomgflXvOZWk0Htik3J/w@public.gmane.org>
> 
> I further traced the problem to pr_rcv_build_pr and the patch below fixes the
> bug.

Nice catch and great investigation!

> 
> 16:03:34 > git diff 
> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
> index be0cd71..32c889f 100644
> --- a/opensm/opensm/osm_sa_path_record.c
> +++ b/opensm/opensm/osm_sa_path_record.c
> @@ -800,7 +800,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * p_src_port,
>  
>         /* Only set HopLimit if going through a router */
>         if (is_nonzero_gid)
> -               p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX);
> +               p_pr->hop_flow_raw |= IB_HOPLIMIT_MAX;
>  
>         p_pr->pkey = p_parms->pkey;
>         ib_path_rec_set_sl(p_pr, p_parms->sl);
> 
> 
> "hop_flow_raw" is not really a 32bit value but rather 4 fields:
> Component     Length(bits)   Offset(bits)
> RawTraffic    1              352
> Reserved      3              353
> FlowLabel     20             356
> HopLimit      8              384

This patch doesn't look correct for me. Instead of placing
IB_HOPLIMIT_MAX in proper place this will put it in RawTraffic and
Reserved fields on little-endian machines (such as x86). And this
changes nothing on a big-endian machine. Right?

Following your investigation it seems that SRP target breaks when
IB_HOPLIMIT_MAX is set in SA PR query responses for intra-subnet DGIDs.

> However, I don't understand the comment "Only set HopLimit if going through a
> router"?

This is from '#ifdef ROUTER_EXP' days - as far as I could understand
HopLimit should be set to IB_HOPLIMIT_MAX for inter-subnet MADs.

> 
> Was the intent to only set p_dgid in pr_rcv_get_end_points if we are heading
> through a router?

Seems that this was an original logic.

> I don't think it matters if we are going through a router
> or not does it?

I thought so too, but as we can see it triggers the bug you discovered.

So immediate fix for this is to move p_dgid setup back to router
section, like this:


diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
index 7855be5..fbeef03 100644
--- a/opensm/opensm/osm_sa_path_record.c
+++ b/opensm/opensm/osm_sa_path_record.c
@@ -1236,6 +1236,8 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t * sa,
 				sa_status = IB_SA_MAD_STATUS_INVALID_GID;
 				goto Exit;
 			}
+			if (p_dgid)
+				*p_dgid = p_pr->dgid;
 		} else
 			dest_guid = p_pr->dgid.unicast.interface_id;
 
@@ -1253,9 +1255,6 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t * sa,
 			sa_status = IB_SA_MAD_STATUS_INVALID_GID;
 			goto Exit;
 		}
-
-		if (p_dgid)
-			*p_dgid = p_pr->dgid;
 	} else {
 		*pp_dest_port = 0;
 		if (comp_mask & IB_PR_COMPMASK_DLID) {


BTW, could you test this? Unfortunately I don't have SRP target :(

And seems that as more solid solution we need to cleanup some logic in
osm_sa_path_record.c code.

Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2009-12-15 17:03 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-15  0:43 SRP issues with OpenSM 3.3.3 Ira Weiny
     [not found] ` <20091214164334.064102f0.weiny2-i2BcT+NCU+M@public.gmane.org>
2009-12-15 15:16   ` Hal Rosenstock
     [not found]     ` <f0e08f230912150716y392cf1f1t4cd640b6663f7fea-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-12-15 17:09       ` Ira Weiny
     [not found]         ` <20091215090942.b33ddc1e.weiny2-i2BcT+NCU+M@public.gmane.org>
2009-12-15 17:53           ` Hal Rosenstock
2009-12-15 17:12       ` Sasha Khapyorsky
2009-12-15 17:03   ` Sasha Khapyorsky [this message]
2009-12-15 17:14     ` Ira Weiny
2009-12-15 17:15     ` Jason Gunthorpe
     [not found]       ` <20091215171532.GA8288-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2009-12-15 17:18         ` Ira Weiny
     [not found]           ` <20091215091819.c217cf36.weiny2-i2BcT+NCU+M@public.gmane.org>
2009-12-15 17:31             ` Jason Gunthorpe
     [not found]               ` <20091215173140.GB8288-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2009-12-15 17:48                 ` Ira Weiny
2009-12-15 17:59             ` Hal Rosenstock
     [not found]               ` <f0e08f230912150959j536667bbg51b8381724681880-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-12-16  2:55                 ` [RFC PATCH] Set HopLimit based on off subnet " Ira Weiny
     [not found]                   ` <20091215185511.3ae458cc.weiny2-i2BcT+NCU+M@public.gmane.org>
2009-12-16 13:29                     ` Hal Rosenstock
     [not found]                       ` <f0e08f230912160529h64424ba7id5c57dffb770380c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-12-18  2:18                         ` Ira Weiny
     [not found]                           ` <20091217181800.a1ee6b9b.weiny2-i2BcT+NCU+M@public.gmane.org>
2009-12-18 15:20                             ` Hal Rosenstock
2009-12-20 19:57                             ` Sasha Khapyorsky
2009-12-22 11:37                               ` [PATCH] opensm/osm_sa_path_record.c: MGID must be specified explicitly Sasha Khapyorsky
2009-12-22 11:38                                 ` [PATCH] osm_sa_path_record.c: separate mutlicast processing code Sasha Khapyorsky
2009-12-22 11:57                                   ` [PATCH] osm_sa_path_record.c: use PR DGID by reference Sasha Khapyorsky
2010-01-04 19:11                                 ` [PATCH] opensm/osm_sa_path_record.c: MGID must be specified explicitly Hal Rosenstock
2009-12-15 17:56         ` SRP issues with OpenSM 3.3.3 Hal Rosenstock

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091215170317.GV5262@me \
    --to=sashak-smomgflxvozwk0htik3j/w@public.gmane.org \
    --cc=hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=weiny2-i2BcT+NCU+M@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox