From: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
To: Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Sasha Khapyorsky <sashak-smomgflXvOZWk0Htik3J/w@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: SRP issues with OpenSM 3.3.3
Date: Tue, 15 Dec 2009 09:09:42 -0800
Message-ID: <20091215090942.b33ddc1e.weiny2@llnl.gov>
In-Reply-To: <f0e08f230912150716y392cf1f1t4cd640b6663f7fea-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Tue, 15 Dec 2009 10:16:42 -0500
Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Ira,
>
> On Mon, Dec 14, 2009 at 7:43 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> > Sasha, Hal,
> >
> > I have found that the following patch caused our SRP connected storage to
> > break.
>
> What is causing the SRP target to fail ? Is it a non zero hop limit
> returned in the SA PathRecord ?

I am not sure exactly, because I don't know what the SRP target is seeing.
(Pesky firmware...) :-(

>
> > patch: 3d20f82edd3246879063b77721d0bcef927bdc48
> >
> > opensm/osm_sa_path_record.c: separate router guid resolution code
> >
> > Move off subnet destination (router address) resolution code to separate
> > function to improve readability.
> >
> > Signed-off-by: Sasha Khapyorsky <sashak-smomgflXvOZWk0Htik3J/w@public.gmane.org>
> >
> > I further traced the problem to pr_rcv_build_pr and the patch below fixes the
> > bug.
>
> On quick glance, I'm missing the connection between Sasha's patch and
> this routine setting something bad in the hop limit of the returned
> path record.

It sets p_dgid even when we are not using a router, which results in
is_nonzero_gid being true.

>
> > 16:03:34 > git diff
> > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
> > index be0cd71..32c889f 100644
> > --- a/opensm/opensm/osm_sa_path_record.c
> > +++ b/opensm/opensm/osm_sa_path_record.c
> > @@ -800,7 +800,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * p_src_port,
> >
> >  	/* Only set HopLimit if going through a router */
> >  	if (is_nonzero_gid)
> > -		p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX);
> > +		p_pr->hop_flow_raw |= IB_HOPLIMIT_MAX;
>
> This may fix this issue but it doesn't look right to me and I think
> would break router scenarios and corrupts other field(s). Path record
> fields are in network (not host) order.

But HopLimit is an 8-bit field? I might be wrong on this, but I was thinking
this set RawTraffic and parts of FlowLabel to 1's. I should double-check
that. Also, I wonder if the FW on the other side is flipping this wrong? :-/
I have not been able to determine this, and I still have to test with our other
vendor's SRP target...
>
> >  	p_pr->pkey = p_parms->pkey;
> >  	ib_path_rec_set_sl(p_pr, p_parms->sl);
> >
> >
> > "hop_flow_raw" is not really a 32bit value but rather 4 fields:
> > Component    Length(bits)  Offset(bits)
> > RawTraffic   1             352
> > Reserved     3             353
> > FlowLabel    20            356
> > HopLimit     8             376
> >
> >
> > However, I don't understand the comment "Only set HopLimit if going through a
> > router"?
>
> Well, in a sense the HopLimit is always set. It's already set to 0
> right above that code snippet in the patch where:
> p_pr->hop_flow_raw &= cl_hton32(1 << 31);

True; however, I believe the behavior before was that the entire field was 0,
so the above statement really does nothing.
> This code is only setting the HopLimit to 255 (as this is very
> primitive so far) when going through a router. Maybe it could be
> stated better.

That is exactly what Sasha's patch changes. This is now executed not only when
going through a router but whenever DGID is specified in the CompMask. I
am willing to accept that perhaps p_dgid should not be set unless we
find a router; however, I __don't__ think that is correct. I searched the spec
for where HopLimit is supposed to be 0xFF only when going through a router
and could not find it; hence my email. I'm confused again... :-(
>
> > Was the intent to only set p_dgid in pr_rcv_get_end_points if we are heading
> > through a router? I don't think it matters if we are going through a router
> > or not does it?
>
> It's used when DGID is selected in the SA PR query via the comp mask
> and in that instance is used for both the router and non router DGID
> cases.

So we agree that Sasha's patch is correct?
>
> > If not, I think the comment needs to be removed in the patch above.
>
> I'm not following why you say this.

Well, if HopLimit must only be set to 0xff when going through a router and
Sasha's patch is correct, then I think we need another way to determine whether
this path is going through a router. In that case the comment is correct but
the logic is wrong.
Ira
>
> -- Hal
>
> > Thanks,
> > Ira
> >
> > --
> > Ira Weiny
> > Math Programmer/Computer Scientist
> > Lawrence Livermore National Lab
> > 925-423-8008
> > weiny2-i2BcT+NCU+M@public.gmane.org
> >
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org