* [PATCH] opensm/osm_sa_path_record.c: Lower max number of hops allowed
From: Line Holen @ 2010-04-21 11:22 UTC (permalink / raw)
To: sashak-smomgflXvOZWk0Htik3J/w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Lower max number of hops allowed in a path from 128 to 64.
Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>
---
diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
index 62102f4..9f508db 100644
--- a/opensm/opensm/osm_sa_path_record.c
+++ b/opensm/opensm/osm_sa_path_record.c
@@ -70,7 +70,7 @@
#include <opensm/osm_prefix_route.h>
#include <opensm/osm_ucast_lash.h>
-#define MAX_HOPS 128
+#define MAX_HOPS 64
typedef struct osm_pr_item {
cl_list_item_t list_item;
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
From: Line Holen @ 2010-04-21 10:40 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20100421102112.GE23994@me>
On 04/21/10 12:21 PM, Sasha Khapyorsky wrote:
> On 20:32 Mon 19 Apr , Line Holen wrote:
>> The value of 128 was chosen as 2x max DR path allowing the SM to be in
>> the middle of a fabric. But I have no problem lowering to 64.
>
> Would you care about patch?
Sure, I can send a patch.
Line
>
> Sasha
^ permalink raw reply
* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
From: Sasha Khapyorsky @ 2010-04-21 10:21 UTC (permalink / raw)
To: Line Holen; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4BCCA1C5.5000904-UdXhSnd/wVw@public.gmane.org>
On 20:32 Mon 19 Apr , Line Holen wrote:
>
> The value of 128 was chosen as 2x max DR path allowing the SM to be in
> the middle of a fabric. But I have no problem lowering to 64.
Would you care about patch?
Sasha
^ permalink raw reply
* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
From: Sasha Khapyorsky @ 2010-04-21 10:16 UTC (permalink / raw)
To: Line Holen; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4BCCA1C5.5000904-UdXhSnd/wVw@public.gmane.org>
On 20:32 Mon 19 Apr , Line Holen wrote:
> >> @@ -69,6 +70,9 @@
> >> #include <opensm/osm_prefix_route.h>
> >> #include <opensm/osm_ucast_lash.h>
> >>
> >> +
> >> +#define MAX_HOPS 128
> >
> > IB spec defines maximal number of hops for a fabric which is 64. Would
> > it be better to use this value here?
> >
> > Sasha
>
> The value of 128 was chosen as 2x max DR path allowing the SM to be in
> the middle of a fabric. But I have no problem lowering to 64.
The path in this calculation is between ports and SM is not part of the
game.
For me it seems that 64 would be better number. Hypothetically it could
be even unrelated to LFTs transition issue - when path exceeds 64 hops
SA can return NOT FOUND just well.
Sasha
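
For illustration, the hop limit discussed above can be sketched as a bounded
walk over a forwarding table. This is only a toy model, not the OpenSM code:
walk_path(), the next_hop[] array and the walk_status enum are hypothetical
names, and a single destination is assumed. The point is that a forwarding
loop turns into a NOT FOUND result after MAX_HOPS steps instead of spinning
forever.

```c
#include <stddef.h>

#define MAX_HOPS 64    /* limit discussed in this thread */

/* Hypothetical result codes for this sketch (not OpenSM status values). */
enum walk_status { WALK_FOUND, WALK_NOT_FOUND };

/*
 * Toy fabric: next_hop[i] is the switch that switch i forwards to for one
 * fixed destination. The walk stops as soon as the hop count exceeds
 * MAX_HOPS, so a loop in the table cannot hang the caller.
 */
static enum walk_status walk_path(const int *next_hop, size_t n_switches,
                                  int src, int dest, int *hops_out)
{
    int cur = src;
    int hops = 0;

    while (cur != dest) {
        if (cur < 0 || (size_t)cur >= n_switches)
            return WALK_NOT_FOUND;    /* fell off the table */
        cur = next_hop[cur];
        hops++;
        if (hops > MAX_HOPS)
            return WALK_NOT_FOUND;    /* loop or over-long path */
    }
    if (hops_out)
        *hops_out = hops;
    return WALK_FOUND;
}
```

With a table such as {1, 2, 0, 3} (switches 0, 1 and 2 forwarding in a
circle) a walk from 0 to 3 gives up once the count passes MAX_HOPS, while a
loop-free table such as {1, 2, 3, 3} reaches 3 in three hops.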
^ permalink raw reply
* Re: opensm with multiple IB subnets
From: Ken Teague @ 2010-04-21 0:07 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <k2s2d0a59b21004201413ia115ae29u661f8df428d5ad08-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Tue, Apr 20, 2010 at 2:13 PM, Ken Teague <kteague-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org> wrote:
> I have a 17-node cluster and each node has a single IB card that has
> 2x IB ports (ib0 and ib1).....
After doing a little more research, I confirmed that my understanding
of the manual page is correct. To run opensm for each GUID, I
modified my init script to run a for loop based on the information
returned from "ibstat -p".
I added this near the beginning of the script where the other
environment variables are located:
<snip>
OFA_HOME="/usr/local/sbin"
IBSTAT_BIN="${OFA_HOME}/ibstat"
IBSTAT_ARG="-p"
OPENSM_BIN="${OFA_HOME}/opensm"
OPENSM_ARG="-B -g"
<snip>
I replaced the single line which started opensm with this for loop:
for i in `${IBSTAT_BIN} ${IBSTAT_ARG}`
do
${OPENSM_BIN} ${OPENSM_ARG} ${i}
done
<snip>
If anyone has a more elegant way to handle this, I'm open to
suggestions. Many thanks.
Ken
^ permalink raw reply
* [infiniband-diags] support diffing nodedesc on remoteports in ibnetdiscover
From: Al Chu @ 2010-04-20 22:30 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 294 bytes --]
Hey Sasha,
This patch supports diffing node descriptions on remote ports
(previously diffing of just the "local" node description was supported).
Al
--
Albert Chu
chu11-i2BcT+NCU+M@public.gmane.org
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
[-- Attachment #2: 0001-support-diffing-nodedesc-on-remoteports-in-ibnetdisc.patch --]
[-- Type: message/rfc822, Size: 1312 bytes --]
From: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
Subject: [PATCH] support diffing nodedesc on remoteports in ibnetdiscover
Date: Tue, 20 Apr 2010 15:09:59 -0700
Message-ID: <1271802596.17987.229.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
Signed-off-by: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
---
infiniband-diags/src/ibnetdiscover.c | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/infiniband-diags/src/ibnetdiscover.c b/infiniband-diags/src/ibnetdiscover.c
index 57f9625..eeb1b9f 100644
--- a/infiniband-diags/src/ibnetdiscover.c
+++ b/infiniband-diags/src/ibnetdiscover.c
@@ -720,6 +720,17 @@ static void diff_ports(ibnd_node_t * fabric1_node, ibnd_node_t * fabric2_node,
fabric2_out++;
}
+ if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION
+ && data->diff_flags & DIFF_FLAG_NODE_DESCRIPTION
+ && fabric1_port && fabric2_port
+ && fabric1_port->remoteport && fabric2_port->remoteport
+ && memcmp(fabric1_port->remoteport->node->nodedesc,
+ fabric2_port->remoteport->node->nodedesc,
+ IB_SMP_DATA_SIZE)) {
+ fabric1_out++;
+ fabric2_out++;
+ }
+
if (fabric1_out) {
diff_iter_out_header(fabric1_node, data,
out_header_flag);
--
1.5.4.5
^ permalink raw reply related
* opensm with multiple IB subnets
From: Ken Teague @ 2010-04-20 21:13 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
I have a 17-node cluster and each node has a single IB card that has
2x IB ports (ib0 and ib1). I have each node plugging in to an IB
switch; one cable between ib0 and the respective A-channel and another
cable between ib1 and the respective B-channel (e.g. c2n2, ib0 leads
to switch port 2A / c2n2, ib1 leads to switch port 2B / c2n3, ib0
leads to switch port 3A / c2n3, ib1 leads to switch port 3B ... etc.).
Each channel and IB port/cable run are on separate subnets. opensm
is running as a daemon on the master node.
The IB switch in question has a total of 48 ports split between 24
A-channels and 24 B-channels. Although this is 1 physical switch, the
two channels are separated internally in the circuitry.
After deploying the cluster, my A-channels were lighting up with both
the green and amber lights, but the B-channels were only lighting up
with the green link light. I read opensm(8) and, if I'm understanding
this correctly, I need to run two instances of opensm; one for each
port. Is this correct?
If I do need to run an instance of opensm per subnet, what is the
best way to do this automatically during boot? /etc/ofa/opensm.conf
does allow me to specify a GUID, but will it allow me to specify
multiple GUIDs? Should I (or is there a benefit to) run opensm on the
same host? Please let me know if more information is needed. Thanks
in advance.
Distribution:
OpenSM 3.2.2
openSUSE 10.3 (X86-64)
VERSION = 10.3
LSB_VERSION="core-2.0-noarch:core-3.0-noarch:core-2.0-x86_64:core-3.0-x86_64"
Subnets:
10.1.1.x - eth1
10.0.1.x - ib
10.0.2.x - ib2
HCA:
Microway DDR using Mellanox chipset. Each card has 2x IB ports and 2x
EIA-422-B ports.
Switch:
Microway FasTree 48-port split between A-channels and B-channels, 24
ports per channel. Although this is 1 physical switch, the A-channels
and B-channels are separate.
Cluster:
17 nodes including the master
/etc/hosts:
127.0.0.1 localhost.cl.mydomain.local localhost
10.1.2.1 master.cl.mydomain.local master
10.1.2.2 c2n2.cl.mydomain.local c2n2
10.1.2.3 c2n3.cl.mydomain.local c2n3
10.1.2.4 c2n4.cl.mydomain.local c2n4
10.1.2.5 c2n5.cl.mydomain.local c2n5
10.1.2.6 c2n6.cl.mydomain.local c2n6
10.1.2.7 c2n7.cl.mydomain.local c2n7
10.1.2.8 c2n8.cl.mydomain.local c2n8
10.1.2.9 c2n9.cl.mydomain.local c2n9
10.1.2.10 c2n10.cl.mydomain.local c2n10
10.1.2.11 c2n11.cl.mydomain.local c2n11
10.1.2.12 c2n12.cl.mydomain.local c2n12
10.1.2.13 c2n13.cl.mydomain.local c2n13
10.1.2.14 c2n14.cl.mydomain.local c2n14
10.1.2.15 c2n15.cl.mydomain.local c2n15
10.1.2.16 c2n16.cl.mydomain.local c2n16
10.1.2.17 c2n17.cl.mydomain.local c2n17
10.0.1.1 master-ib.cl.mydomain.local master-ib
10.0.1.2 c2n2-ib.cl.mydomain.local c2n2-ib
10.0.1.3 c2n3-ib.cl.mydomain.local c2n3-ib c2n3ib
10.0.1.4 c2n4-ib.cl.mydomain.local c2n4-ib
10.0.1.5 c2n5-ib.cl.mydomain.local c2n5-ib
10.0.1.6 c2n6-ib.cl.mydomain.local c2n6-ib
10.0.1.7 c2n7-ib.cl.mydomain.local c2n7-ib
10.0.1.8 c2n8-ib.cl.mydomain.local c2n8-ib
10.0.1.9 c2n9-ib.cl.mydomain.local c2n9-ib
10.0.1.10 c2n10-ib.cl.mydomain.local c2n10-ib
10.0.1.11 c2n11-ib.cl.mydomain.local c2n11-ib
10.0.1.12 c2n12-ib.cl.mydomain.local c2n12-ib
10.0.1.13 c2n13-ib.cl.mydomain.local c2n13-ib
10.0.1.14 c2n14-ib.cl.mydomain.local c2n14-ib
10.0.1.15 c2n15-ib.cl.mydomain.local c2n15-ib
10.0.1.16 c2n16-ib.cl.mydomain.local c2n16-ib
10.0.1.17 c2n17-ib.cl.mydomain.local c2n17-ib
10.0.2.1 master-ib2.cl.mydomain.local master-ib2
10.0.2.2 c2n2-ib2.cl.mydomain.local c2n2-ib2
10.0.2.3 c2n3-ib2.cl.mydomain.local c2n3-ib2 c2n3ib2
10.0.2.4 c2n4-ib2.cl.mydomain.local c2n4-ib2
10.0.2.5 c2n5-ib2.cl.mydomain.local c2n5-ib2
10.0.2.6 c2n6-ib2.cl.mydomain.local c2n6-ib2
10.0.2.7 c2n7-ib2.cl.mydomain.local c2n7-ib2
10.0.2.8 c2n8-ib2.cl.mydomain.local c2n8-ib2
10.0.2.9 c2n9-ib2.cl.mydomain.local c2n9-ib2
10.0.2.10 c2n10-ib2.cl.mydomain.local c2n10-ib2
10.0.2.11 c2n11-ib2.cl.mydomain.local c2n11-ib2
10.0.2.12 c2n12-ib2.cl.mydomain.local c2n12-ib2
10.0.2.13 c2n13-ib2.cl.mydomain.local c2n13-ib2
10.0.2.14 c2n14-ib2.cl.mydomain.local c2n14-ib2
10.0.2.15 c2n15-ib2.cl.mydomain.local c2n15-ib2
10.0.2.16 c2n16-ib2.cl.mydomain.local c2n16-ib2
10.0.2.17 c2n17-ib2.cl.mydomain.local c2n17-ib2
Routes:
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
239.2.11.71 0.0.0.0 255.255.255.255 UH 0 0 0 eth0
10.0.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ib0
10.0.2.0 0.0.0.0 255.255.255.0 U 0 0 0 ib1
10.1.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
10.1.2.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
127.0.0.0 0.0.0.0 255.0.0.0 U 0 0 0 lo
0.0.0.0 10.1.1.1 0.0.0.0 UG 0 0 0 eth1
^ permalink raw reply
* Re: Socket Direct Protocol: help (2)
From: Andrea Gozzelino @ 2010-04-20 13:53 UTC (permalink / raw)
To: Amir Vadai
Cc: Tung, Chien Tin, Steve Wise,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org,
peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
pavel-+ZI9xUNit7I@public.gmane.org,
mingo-X9Un+BFzKDI@public.gmane.org, Eric B Munson
In-Reply-To: <4BC6D075.1070405-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
Hi Amir,
have you any news about bugs 2027 "SDP not respecting # SGEs as reported
from HW" and 2028 "SDP should support fastreg mrs"?
Once those bugs are fixed, I will test the NE020 cards performance
with SDP protocol and I will compare SDP and TCP.
Keep in touch,
Andrea Gozzelino
INFN - Laboratori Nazionali di Legnaro (LNL)
Viale dell'Universita' 2
I-35020 - Legnaro (PD)- ITALIA
Tel: +39 049 8068346
Fax: +39 049 641925
Mail: andrea.gozzelino-PK20h7lG/Rc1GQ1Ptb7lUw@public.gmane.org
On Apr 15, 2010 10:38 AM, Amir Vadai <amirv-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org> wrote:
> It should be a simple fix and I plan to do soon - just add yourself as
> CC in bugzilla - that way I won't forget to notify you.
>
> - amir
>
> On 04/15/2010 10:07 AM, Andrea Gozzelino wrote:
> > On Apr 15, 2010 08:24 AM, Amir Vadai <amirv-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org> wrote:
> >
> >
> >> I hope to have a fix next week for the first one.
> >>
> >> Thanks,
> >> Amir
> >>
> >> On 04/14/2010 09:48 PM, Tung, Chien Tin wrote:
> >>
> >>>> Tung, Chien Tin wrote:
> >>>>
> >>>>
> >>>>>> One more thing - Please open a bug regarding the num_sge
> >>>>>> limitation at:
> >>>>>> https://bugs.openfabrics.org/
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>> Done, Bug 2027.
> >>>>>
> >>>>> Chien
> >>>>>
> >>>>>
> >>>>>
> >>>> And 2028 opened to request fastreg support.
> >>>>
> >>>>
> >>>>
> >>> I am open to test fixes for these two bugs.
> >>>
> >>> Chien
> >>>
> >>>
> >>>
> >>
> > Hi Amir,
> > Hi Chien,
> >
> > I understand that the bug 2027 could be solved next week, so I will
> > test
> > SDP protocol performance on NE020 cards.
> > Is it correct?
> > If yes, could you point out the code changes?
> >
> > Keep in touch and take care.
> > Regards,
> > Andrea
> >
> >
> > Andrea Gozzelino
> >
> > INFN - Laboratori Nazionali di Legnaro (LNL)
> > Viale dell'Universita' 2
> > I-35020 - Legnaro (PD)- ITALIA
> > Tel: +39 049 8068346
> > Fax: +39 049 641925
> > Mail: andrea.gozzelino-PK20h7lG/Rc1GQ1Ptb7lUw@public.gmane.org
> >
> >
> >
>
^ permalink raw reply
* Re: [PATCH V4 1/2] IB/core: Add support for enhanced atomic operations
From: Vladimir Sokolovsky @ 2010-04-20 13:36 UTC (permalink / raw)
To: Håkon Bugge
Cc: rdreier-FYB4Gu1CFyUAvxtiuMwx3w, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <BE945405-03B6-4D5F-8DC8-A7424250BDEC-U0mLk4xYmo8@public.gmane.org>
Håkon Bugge wrote:
> On Apr 14, 2010, at 16:23 , Vladimir Sokolovsky wrote:
>
>> The additional operands are carried in the Extended Transport Header
>
> Is this a newly defined ETH which follows the AETH on the wire?
>
>
>
> Thanks, Håkon
>
Yes,
Atomic masked Fetch and Add uses the first 64 bits to provide the data to add,
and the second 64 bits provide the field boundary:
Swap (or Add) data high [63:32]
Swap (or Add) data low [31:0]
Compare data (or Field boundary) high [63:32]
Compare data (or Field boundary) low [31:0]
Atomic masked Compare and Swap uses:
Swap (or Add) data high [63:32]
Swap (or Add) data low [31:0]
Compare data high [63:32]
Compare data low [31:0]
Swap mask high [63:32]
Swap mask low [31:0]
Compare mask high [63:32]
Compare mask low [31:0]
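
As a rough illustration of what these operands mean, here is a plain C model
of the two masked operations. It is a sketch under stated assumptions, not
driver or HCA code: the function names are hypothetical, nothing here is
atomic, and the carry rule for the field boundary (a set bit stops the carry
out of that bit position) is an assumption about the semantics rather than
something spelled out in this mail.

```c
#include <stdint.h>

/*
 * Masked compare-and-swap model: only the bits selected by compare_mask
 * take part in the comparison, and only the bits selected by swap_mask are
 * replaced. Returns the original value, as an atomic would.
 */
static uint64_t masked_cmp_swap(uint64_t *target,
                                uint64_t compare, uint64_t compare_mask,
                                uint64_t swap, uint64_t swap_mask)
{
    uint64_t orig = *target;

    if (!((compare ^ orig) & compare_mask))
        *target = (orig & ~swap_mask) | (swap & swap_mask);
    return orig;
}

/*
 * Masked fetch-and-add model: a bit-serial adder in which a set bit in
 * "boundary" marks the top bit of a field, so the carry out of that bit
 * is dropped (assumed semantics, see the note above).
 */
static uint64_t masked_fetch_add(uint64_t *target, uint64_t add,
                                 uint64_t boundary)
{
    uint64_t orig = *target;
    uint64_t sum = 0;
    unsigned carry = 0;
    int bit;

    for (bit = 0; bit < 64; bit++) {
        uint64_t pos = (uint64_t)1 << bit;
        unsigned s = carry + !!(orig & pos) + !!(add & pos);

        if (s & 1)
            sum |= pos;
        carry = (s >> 1) && !(boundary & pos);
    }
    *target = sum;
    return orig;
}

/* Split a 64-bit operand into the high [63:32] and low [31:0] words. */
static void split64(uint64_t v, uint32_t *high, uint32_t *low)
{
    *high = (uint32_t)(v >> 32);
    *low = (uint32_t)(v & 0xffffffffu);
}
```

For example, adding 1 to 0xFF with a boundary bit at 0x80 wraps the low
8-bit field to 0 without touching bit 8, whereas with a boundary of 0 the
same add gives 0x100.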
Regards,
Vladimir
^ permalink raw reply
* Re: [PATCH] rdma/cm: Randomize local port allocation.
From: Cong Wang @ 2010-04-20 4:34 UTC (permalink / raw)
To: David Miller
Cc: penguin-kernel, sean.hefty, opurdila, eric.dumazet, netdev,
nhorman, ebiederm, linux-kernel, rolandd, linux-rdma
In-Reply-To: <20100416.133001.262206466.davem@davemloft.net>
David Miller wrote:
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Fri, 16 Apr 2010 22:54:22 +0900
>
>> Cong Wang wrote:
>>> Sean Hefty wrote:
>>>> I like this version, thanks! I'm not sure which tree to merge it through.
>>>> Are you needing this for 2.6.34, or is 2.6.35 okay?
>>>>
>>> As soon as possible, so 2.6.34. :)
>>>
>> Cong, merge window for 2.6.34 was already closed.
>> You need to make your patchset towards 2.6.35 (using net-next-2.6 tree)
>> rather than 2.6.34 (using linux-2.6 tree). Therefore, this patch being
>> queued for 2.6.35 (through net-next-2.6 tree) should be okay for you.
>
> I don't take RDMA patches into net-next-2.6, the less I touch this
> stack avoiding stuff the better and Roland has been taking this stuff
> into his own tree for some time now.
I left for a few days.
Ok, so I will wait for this to be merged.
Thanks, David and Tetsuo!
^ permalink raw reply
* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
From: Line Holen @ 2010-04-19 18:48 UTC (permalink / raw)
To: Hal Rosenstock
Cc: sashak-smomgflXvOZWk0Htik3J/w, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <j2uf0e08f231004191120oc1e78130l683b9ae0ca51003a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 04/19/10 08:20 PM, Hal Rosenstock wrote:
> On Mon, Apr 19, 2010 at 5:15 AM, Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org> wrote:
>> SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
>> This can happen if a path request is handled while LFT updates to the fabric
>> are in progress.
>> The LFT of the switch data structure is updated as part of the LFT response
>> processing. So while the SM is busy pushing the LFT updates, some switches have
>> up to date LFT info while others are not yet updated and contains the LFT of
>> the previous routing. For a (short) time interval there is a potential for
>> loops in the fabric. The livelock occurs if a path request is received during
>> this time interval.
>> Both LFT response handling and path request processing needs the SM lock.
>> When the livelock occurs the LFT response handling blocks forever waiting for
>> the lock to be released.
>>
>> The suggested fix is simply to introduce a max number of hops that should
>> be traversed while handling the path request. If this max is reached then
>> the request will return with NO_RECORD response
>
> To me, this begs the question of whether this should return a BUSY
> status rather than no record (and whether SA clients should handle
> those two differently) but that is a bigger change (and may require
> some end node change as well).
I think the fundamental issue here is that the path request handling is operating
on inconsistent data - a mixture of old and new lft setup. A proper fix would
be to use a consistent lft setup (either old or new) or deny service (return BUSY)
while LFT updates are in progress. A check on the number of hops still makes sense
though, because the routing could generate loops too.
>
> Also, should a similar change be made in SA MPR mpr_rcv_get_path_parms ?
Could be. I haven't checked that code.
Line
>
> -- Hal
>
>> and release the SM lock.
>> This way the LFT processing will be able to complete.
>>
>> Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>
>>
>> ---
>>
>> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
>> index c4c3f86..b399b70 100644
>> --- a/opensm/opensm/osm_sa_path_record.c
>> +++ b/opensm/opensm/osm_sa_path_record.c
>> @@ -4,6 +4,7 @@
>> * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
>> * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
>> * Copyright (c) 2009 HNR Consulting. All rights reserved.
>> + * Copyright (c) 2010 Sun Microsystems, Inc. All rights reserved.
>> *
>> * This software is available to you under a choice of one of two
>> * licenses. You may choose to be licensed under the terms of the GNU
>> @@ -69,6 +70,9 @@
>> #include <opensm/osm_prefix_route.h>
>> #include <opensm/osm_ucast_lash.h>
>>
>> +
>> +#define MAX_HOPS 128
>> +
>> typedef struct osm_pr_item {
>> cl_list_item_t list_item;
>> ib_path_rec_t path_rec;
>> @@ -178,6 +182,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
>> osm_qos_level_t *p_qos_level = NULL;
>> uint16_t valid_sl_mask = 0xffff;
>> int is_lash;
>> + int hops = 0;
>>
>> OSM_LOG_ENTER(sa->p_log);
>>
>> @@ -369,6 +374,25 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
>> goto Exit;
>> }
>> }
>> +
>> + /* update number of hops traversed */
>> + hops++;
>> + if (hops > MAX_HOPS) {
>> +
>> + OSM_LOG(sa->p_log, OSM_LOG_ERROR,
>> + "Path from GUID 0x%016" PRIx64 " (%s) to lid %u GUID 0x%016"
>> + PRIx64 " (%s) needs more than %d hops, "
>> + "max %d hops allowed\n",
>> + cl_ntoh64(osm_physp_get_port_guid(p_src_physp)),
>> + p_src_physp->p_node->print_desc,
>> + dest_lid_ho,
>> + cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
>> + p_dest_physp->p_node->print_desc,
>> + hops, MAX_HOPS);
>> +
>> + status = IB_NOT_FOUND;
>> + goto Exit;
>> + }
>> }
>>
>> /*
^ permalink raw reply
* [infiniband-diags] [3/3] add ibcacheedit tool
From: Al Chu @ 2010-04-19 18:43 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 505 bytes --]
Hey Sasha,
This patch adds a new ibcacheedit tool to infiniband-diags. As the name
suggests, it offers users options to edit a stored cache. This tool is
primarily necessary to allow system administrators to update caches as
they make minor modifications to a cluster (e.g. node dies, thus HCA is
replaced) rather than regularly regenerating a new cache.
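
To make the option format concrete, here is a small sketch of how a
BEFOREGUID:AFTERGUID argument could be split and validated. It is an
illustration only and not the parser in the attached patch:
parse_guid_pair() is a hypothetical name, and the 64-byte scratch buffer
and the choice to reject trailing garbage are assumptions of this sketch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "BEFORE:AFTER" into two 64-bit GUIDs. Works on a private copy so
 * the caller's string is left untouched. Returns 0 on success and -1 on
 * malformed input; the outputs are written only on success.
 */
static int parse_guid_pair(const char *arg, uint64_t *before, uint64_t *after)
{
    char buf[64];
    char *sep, *end;
    unsigned long long b, a;

    if (!arg || strlen(arg) >= sizeof(buf))
        return -1;
    strcpy(buf, arg);

    sep = strchr(buf, ':');
    if (!sep || sep == buf || !sep[1])
        return -1;    /* no colon, or an empty half */
    *sep = '\0';

    errno = 0;
    b = strtoull(buf, &end, 0);    /* base 0 accepts 0x... and decimal */
    if (errno || *end)
        return -1;

    errno = 0;
    a = strtoull(sep + 1, &end, 0);
    if (errno || *end)
        return -1;

    *before = (uint64_t)b;
    *after = (uint64_t)a;
    return 0;
}
```

Checking the end pointer and errno catches inputs such as "0x12zz:0x1"
that a bare strtoull() call would silently accept as 0x12.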
Al
--
Albert Chu
chu11-i2BcT+NCU+M@public.gmane.org
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
[-- Attachment #2: 0003-add-ibcacheedit-tool.patch --]
[-- Type: message/rfc822, Size: 12943 bytes --]
From: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
Subject: [PATCH] add ibcacheedit tool
Date: Mon, 19 Apr 2010 11:16:51 -0700
Message-ID: <1271702059.17987.209.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
Signed-off-by: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
---
infiniband-diags/Makefile.am | 6 +-
infiniband-diags/man/ibcacheedit.8 | 57 ++++++
infiniband-diags/src/ibcacheedit.c | 356 ++++++++++++++++++++++++++++++++++++
3 files changed, 417 insertions(+), 2 deletions(-)
create mode 100644 infiniband-diags/man/ibcacheedit.8
create mode 100644 infiniband-diags/src/ibcacheedit.c
diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am
index 1cdb60e..af90b05 100644
--- a/infiniband-diags/Makefile.am
+++ b/infiniband-diags/Makefile.am
@@ -13,7 +13,7 @@ sbin_PROGRAMS = src/ibaddr src/ibnetdiscover src/ibping src/ibportstate \
src/ibroute src/ibstat src/ibsysstat src/ibtracert \
src/perfquery src/sminfo src/smpdump src/smpquery \
src/saquery src/vendstat src/iblinkinfo \
- src/ibqueryerrors
+ src/ibqueryerrors src/ibcacheedit
if ENABLE_TEST_UTILS
sbin_PROGRAMS += src/ibsendtrap src/mcm_rereg_test
@@ -62,6 +62,8 @@ src_iblinkinfo_SOURCES = src/iblinkinfo.c
src_iblinkinfo_LDFLAGS = -L$(top_builddir)/libibnetdisc -libnetdisc
src_ibqueryerrors_SOURCES = src/ibqueryerrors.c
src_ibqueryerrors_LDFLAGS = -L$(top_builddir)/libibnetdisc -libnetdisc
+src_ibcacheedit_SOURCES = src/ibcacheedit.c
+src_ibcacheedit_LDFLAGS = -L$(top_builddir)/libibnetdisc -libnetdisc
man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \
man/ibchecknet.8 man/ibchecknode.8 man/ibcheckport.8 \
@@ -76,7 +78,7 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \
man/ibprintswitch.8 man/ibprintca.8 man/ibfindnodesusing.8 \
man/ibdatacounts.8 man/ibdatacounters.8 \
man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8 \
- man/check_lft_balance.8
+ man/check_lft_balance.8 man/ibcacheedit.8
BUILT_SOURCES = ibdiag_version
ibdiag_version:
diff --git a/infiniband-diags/man/ibcacheedit.8 b/infiniband-diags/man/ibcacheedit.8
new file mode 100644
index 0000000..b977827
--- /dev/null
+++ b/infiniband-diags/man/ibcacheedit.8
@@ -0,0 +1,57 @@
+.TH IBCACHEEDIT 8 "Apr 15, 2010" "OpenIB" "OpenIB Diagnostics"
+
+.SH NAME
+ibcacheedit \- edit an ibnetdiscover cache
+
+.SH SYNOPSIS
+.B ibcacheedit
+[\-\-switchguid BEFOREGUID:AFTERGUID] [\-\-caguid BEFORE:AFTER]
+[\-\-sysimgguid BEFOREGUID:AFTERGUID] [\-\-portguid NODEGUID:BEFOREGUID:AFTERGUID]
+[\-h(elp)] <orig.cache> <new.cache>
+
+.SH DESCRIPTION
+.PP
+ibcacheedit allows users to edit an ibnetdiscover cache created through the
+\fB\-\-cache\fR option in
+.B ibnetdiscover(8).
+
+.SH OPTIONS
+
+.PP
+.TP
+\fB\-\-switchguid\fR BEFOREGUID:AFTERGUID
+Specify a switchguid that should be changed. The before and after guid
+should be separated by a colon. On switches, port guids are identical
+to the switch guid, so port guids will be adjusted as well on switches.
+.TP
+\fB\-\-caguid\fR BEFOREGUID:AFTERGUID
+Specify a caguid that should be changed. The before and after guid
+should be separated by a colon.
+.TP
+\fB\-\-sysimgguid\fR BEFOREGUID:AFTERGUID
+Specify a sysimgguid that should be changed. The before and after guid
+should be separated by a colon.
+.TP
+\fB\-\-portguid\fR NODEGUID:BEFOREGUID:AFTERGUID
+Specify a portguid that should be changed. The nodeguid of the port
+(e.g. switchguid or caguid) should be specified first, followed by a
+colon, the before port guid, another colon, then the after port guid.
+On switches, port guids are identical to the switch guid, so the
+switch guid will be adjusted as well on switches.
+
+.SH COMMON OPTIONS
+
+Most OpenIB diagnostics take the following common flags. The exact list of
+supported flags per utility can be found in the usage message and can be shown
+using the util_name -h syntax.
+
+# Debugging flags
+.PP
+\-h show the usage message
+.PP
+\-V show the version info.
+
+.SH AUTHORS
+.TP
+Albert Chu
+.RI < chu11-i2BcT+NCU+M@public.gmane.org >
diff --git a/infiniband-diags/src/ibcacheedit.c b/infiniband-diags/src/ibcacheedit.c
new file mode 100644
index 0000000..28b8b21
--- /dev/null
+++ b/infiniband-diags/src/ibcacheedit.c
@@ -0,0 +1,356 @@
+/*
+ * Copyright (c) 2010 Lawrence Livermore National Lab. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#if HAVE_CONFIG_H
+# include <config.h>
+#endif /* HAVE_CONFIG_H */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <getopt.h>
+#include <inttypes.h>
+
+#include <infiniband/mad.h>
+#include <infiniband/ibnetdisc.h>
+
+#include "ibdiag_common.h"
+
+uint64_t switchguid_before = 0;
+uint64_t switchguid_after = 0;
+int switchguid_flag = 0;
+
+uint64_t caguid_before = 0;
+uint64_t caguid_after = 0;
+int caguid_flag = 0;
+
+uint64_t sysimgguid_before = 0;
+uint64_t sysimgguid_after = 0;
+int sysimgguid_flag = 0;
+
+uint64_t portguid_nodeguid = 0;
+uint64_t portguid_before = 0;
+uint64_t portguid_after = 0;
+int portguid_flag = 0;
+
+struct guids {
+ uint64_t searchguid;
+ int searchguid_found;
+ uint64_t before;
+ uint64_t after;
+ int found;
+};
+
+static int parse_beforeafter(char *arg, uint64_t *before, uint64_t *after)
+{
+ char *ptr;
+ char *before_str;
+ char *after_str;
+
+ ptr = strchr(arg, ':');
+ if (!ptr || !(*(ptr + 1))) {
+ fprintf(stderr, "invalid input '%s'\n", arg);
+ return -1;
+ }
+ (*ptr) = '\0';
+ before_str = arg;
+ after_str = ptr + 1;
+
+ (*before) = strtoull(before_str, 0, 0);
+ (*after) = strtoull(after_str, 0, 0);
+ return 0;
+}
+
+static int parse_guidbeforeafter(char *arg,
+ uint64_t *guid,
+ uint64_t *before,
+ uint64_t *after)
+{
+ char *ptr1;
+ char *ptr2;
+ char *guid_str;
+ char *before_str;
+ char *after_str;
+
+ ptr1 = strchr(arg, ':');
+ if (!ptr1 || !(*(ptr1 + 1))) {
+ fprintf(stderr, "invalid input '%s'\n", arg);
+ return -1;
+ }
+ guid_str = arg;
+ before_str = ptr1 + 1;
+
+ ptr2 = strchr(before_str, ':');
+ if (!ptr2 || !(*(ptr2 + 1))) {
+ fprintf(stderr, "invalid input '%s'\n", arg);
+ return -1;
+ }
+ (*ptr1) = '\0';
+ (*ptr2) = '\0';
+ after_str = ptr2 + 1;
+
+ (*guid) = strtoull(guid_str, 0, 0);
+ (*before) = strtoull(before_str, 0, 0);
+ (*after) = strtoull(after_str, 0, 0);
+ return 0;
+}
+
+static int process_opt(void *context, int ch, char *optarg)
+{
+ switch (ch) {
+ case 1:
+ if (parse_beforeafter(optarg,
+ &switchguid_before,
+ &switchguid_after) < 0)
+ return -1;
+ switchguid_flag++;
+ break;
+ case 2:
+ if (parse_beforeafter(optarg,
+ &caguid_before,
+ &caguid_after) < 0)
+ return -1;
+ caguid_flag++;
+ break;
+ case 3:
+ if (parse_beforeafter(optarg,
+ &sysimgguid_before,
+ &sysimgguid_after) < 0)
+ return -1;
+ sysimgguid_flag++;
+ break;
+ case 4:
+ if (parse_guidbeforeafter(optarg,
+ &portguid_nodeguid,
+ &portguid_before,
+ &portguid_after) < 0)
+ return -1;
+ portguid_flag++;
+ break;
+ default:
+ return -1;
+ }
+
+ return 0;
+}
+
+static void update_switchportguids(ibnd_node_t *node, uint64_t guid)
+{
+ ibnd_port_t *port;
+ int p;
+
+ for (p = 0; p <= node->numports; p++) {
+ port = node->ports[p];
+ if (port)
+ port->guid = guid;
+ }
+}
+
+static void replace_node_guid(ibnd_node_t *node, void *user_data)
+{
+ struct guids *guids;
+
+ guids = (struct guids *)user_data;
+
+ if (node->guid == guids->before) {
+
+ node->guid = guids->after;
+
+ /* port guids are identical to switch guids on
+ * switches, so update port guids too
+ */
+ if (node->type == IB_NODE_SWITCH)
+ update_switchportguids(node, guids->after);
+
+ guids->found++;
+ }
+}
+
+static void replace_sysimgguid(ibnd_node_t *node, void *user_data)
+{
+ struct guids *guids;
+ uint64_t sysimgguid;
+
+ guids = (struct guids *)user_data;
+
+ sysimgguid = mad_get_field64(node->info, 0, IB_NODE_SYSTEM_GUID_F);
+ if (sysimgguid == guids->before) {
+ mad_set_field64(node->info, 0, IB_NODE_SYSTEM_GUID_F,
+ guids->after);
+ guids->found++;
+ }
+}
+
+static void replace_portguid(ibnd_node_t *node, void *user_data)
+{
+ struct guids *guids;
+
+ guids = (struct guids *)user_data;
+
+ if (node->guid != guids->searchguid)
+ return;
+
+ guids->searchguid_found++;
+
+ if (node->type == IB_NODE_SWITCH) {
+ /* port guids are identical to switch guids on
+ * switches, so update switch guid too
+ */
+ if (node->guid == guids->before) {
+ node->guid = guids->after;
+ update_switchportguids(node, guids->after);
+ guids->found++;
+ }
+ }
+ else {
+ ibnd_port_t *port;
+ int p;
+
+ for (p = 1; p <= node->numports; p++) {
+ port = node->ports[p];
+ if (port
+ && port->guid == guids->before) {
+ port->guid = guids->after;
+ guids->found++;
+ break;
+ }
+ }
+ }
+}
+
+int main(int argc, char **argv)
+{
+ ibnd_fabric_t *fabric = NULL;
+ char *orig_cache_file = NULL;
+ char *new_cache_file = NULL;
+ struct guids guids;
+
+ const struct ibdiag_opt opts[] = {
+ {"switchguid", 1, 1, "BEFOREGUID:AFTERGUID",
+ "Specify before and after switchguid to edit"},
+ {"caguid", 2, 1, "BEFOREGUID:AFTERGUID",
+ "Specify before and after caguid to edit"},
+ {"sysimgguid", 3, 1, "BEFOREGUID:AFTERGUID",
+ "Specify before and after sysimgguid to edit"},
+ {"portguid", 4, 1, "NODEGUID:BEFOREGUID:AFTERGUID",
+ "Specify before and after port guid to edit"},
+ {0}
+ };
+ char *usage_args = "<orig.cache> <new.cache>";
+
+ ibdiag_process_opts(argc, argv, NULL, "edCPDLGtsv",
+ opts, process_opt, usage_args,
+ NULL);
+
+ argc -= optind;
+ argv += optind;
+
+ orig_cache_file = argv[0];
+ new_cache_file = argv[1];
+
+ if (!orig_cache_file)
+ IBERROR("original cache file not specified");
+
+ if (!new_cache_file)
+ IBERROR("new cache file not specified");
+
+ if ((fabric = ibnd_load_fabric(orig_cache_file, 0)) == NULL)
+ IBERROR("loading original cached fabric failed");
+
+ if (switchguid_flag) {
+ guids.before = switchguid_before;
+ guids.after = switchguid_after;
+ guids.found = 0;
+ ibnd_iter_nodes_type(fabric,
+ replace_node_guid,
+ IB_NODE_SWITCH,
+ &guids);
+
+ if (!guids.found)
+ IBERROR("switchguid = %" PRIx64 " not found",
+ switchguid_before);
+ }
+
+ if (caguid_flag) {
+ guids.before = caguid_before;
+ guids.after = caguid_after;
+ guids.found = 0;
+ ibnd_iter_nodes_type(fabric,
+ replace_node_guid,
+ IB_NODE_CA,
+ &guids);
+
+ if (!guids.found)
+ IBERROR("caguid = %" PRIx64 " not found",
+ caguid_before);
+ }
+
+ if (sysimgguid_flag) {
+ guids.before = sysimgguid_before;
+ guids.after = sysimgguid_after;
+ guids.found = 0;
+ ibnd_iter_nodes(fabric,
+ replace_sysimgguid,
+ &guids);
+
+ if (!guids.found)
+ IBERROR("sysimgguid = %" PRIx64 " not found",
+ sysimgguid_before);
+ }
+
+ if (portguid_flag) {
+ guids.searchguid = portguid_nodeguid;
+ guids.searchguid_found = 0;
+ guids.before = portguid_before;
+ guids.after = portguid_after;
+ guids.found = 0;
+ ibnd_iter_nodes(fabric,
+ replace_portguid,
+ &guids);
+
+ if (!guids.searchguid_found)
+ IBERROR("nodeguid = %" PRIx64 " not found",
+ portguid_nodeguid);
+
+ if (!guids.found)
+ IBERROR("portguid = %" PRIx64 " not found",
+ portguid_before);
+ }
+
+ if (ibnd_cache_fabric(fabric, new_cache_file, 0) < 0)
+ IBERROR("caching new cache data failed");
+
+ ibnd_destroy_fabric(fabric);
+ exit(0);
+}
--
1.5.4.5
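For reference, the parse_beforeafter() and parse_guidbeforeafter() helpers in the patch above split the option argument on ':' in place and call strtoull() without checking whether the conversion succeeded. A minimal sketch of a stricter variant follows, assuming it is acceptable to reject trailing garbage and empty fields. The names parse_guid() and parse_guid_pair() are hypothetical and not part of the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: parse one GUID and require
 * that the number is followed by exactly the expected terminator. */
static int parse_guid(const char *str, char terminator,
		      uint64_t *guid, const char **next)
{
	char *end;

	if (*str == '\0' || *str == ':')
		return -1;	/* empty field */

	errno = 0;
	*guid = strtoull(str, &end, 0);
	if (errno || end == str || *end != terminator)
		return -1;	/* overflow, no digits, or trailing garbage */

	/* step past the ':' separator, or stay on the final NUL */
	*next = (terminator == '\0') ? end : end + 1;
	return 0;
}

/* Parse "BEFOREGUID:AFTERGUID" without modifying the input string. */
static int parse_guid_pair(const char *arg, uint64_t *before, uint64_t *after)
{
	const char *rest;

	if (parse_guid(arg, ':', before, &rest) < 0)
		return -1;
	return parse_guid(rest, '\0', after, &rest);
}
```

Because the input is never written to, the same argument string can still be printed in full in an error message after a failed parse.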
* [infiniband-diags] [2/3] support libibnetdisc caching no-overwrite flag
From: Al Chu @ 2010-04-19 18:43 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 239 bytes --]
Hey Sasha,
This patch adds a no-overwrite caching flag to libibnetdisc.
Al
--
Albert Chu
chu11-i2BcT+NCU+M@public.gmane.org
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
[-- Attachment #2: 0002-support-libibnetdisc-caching-no-overwrite-flag.patch --]
[-- Type: message/rfc822, Size: 2094 bytes --]
From: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
Subject: [PATCH] support libibnetdisc caching no-overwrite flag
Date: Mon, 19 Apr 2010 11:09:37 -0700
Message-ID: <1271702059.17987.207.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
Signed-off-by: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
---
.../libibnetdisc/include/infiniband/ibnetdisc.h | 3 +++
.../libibnetdisc/src/ibnetdisc_cache.c | 16 ++++++++++++----
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/infiniband-diags/libibnetdisc/include/infiniband/ibnetdisc.h b/infiniband-diags/libibnetdisc/include/infiniband/ibnetdisc.h
index 136282c..865c1e0 100644
--- a/infiniband-diags/libibnetdisc/include/infiniband/ibnetdisc.h
+++ b/infiniband-diags/libibnetdisc/include/infiniband/ibnetdisc.h
@@ -181,6 +181,9 @@ MAD_EXPORT void ibnd_destroy_fabric(ibnd_fabric_t * fabric);
MAD_EXPORT ibnd_fabric_t *ibnd_load_fabric(const char *file,
unsigned int flags);
+#define IBND_CACHE_FABRIC_FLAG_DEFAULT 0x0000
+#define IBND_CACHE_FABRIC_FLAG_NO_OVERWRITE 0x0001
+
MAD_EXPORT int ibnd_cache_fabric(ibnd_fabric_t * fabric, const char *file,
unsigned int flags);
diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c b/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
index e05ce99..56b59fb 100644
--- a/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
+++ b/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
@@ -888,10 +888,18 @@ int ibnd_cache_fabric(ibnd_fabric_t * fabric, const char *file,
return -1;
}
- if (!stat(file, &statbuf)) {
- if (unlink(file) < 0) {
- IBND_DEBUG("error removing '%s': %s\n",
- file, strerror(errno));
+ if (!(flags & IBND_CACHE_FABRIC_FLAG_NO_OVERWRITE)) {
+ if (!stat(file, &statbuf)) {
+ if (unlink(file) < 0) {
+ IBND_DEBUG("error removing '%s': %s\n",
+ file, strerror(errno));
+ return -1;
+ }
+ }
+ }
+ else {
+ if (!stat(file, &statbuf)) {
+ IBND_DEBUG("file '%s' already exists\n", file);
return -1;
}
}
--
1.5.4.5
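The patch above keeps the existing stat()/unlink() sequence and gates it on IBND_CACHE_FABRIC_FLAG_NO_OVERWRITE. As an illustration only (this is not what the patch does), the same two behaviors can also be expressed through open() flags, which avoids the window between the stat() and the later open(). The names open_cache_file() and SKETCH_FLAG_NO_OVERWRITE are hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical flag for this sketch; it mirrors the role of
 * IBND_CACHE_FABRIC_FLAG_NO_OVERWRITE in the patch. */
#define SKETCH_FLAG_NO_OVERWRITE 0x0001

/* Open a cache file for writing.
 * Default: replace the contents of an existing file.
 * No-overwrite: fail (errno == EEXIST) if the file already exists. */
static int open_cache_file(const char *file, unsigned int flags)
{
	int oflags = O_CREAT | O_WRONLY;

	if (flags & SKETCH_FLAG_NO_OVERWRITE)
		oflags |= O_EXCL;
	else
		oflags |= O_TRUNC;

	return open(file, oflags, 0644);
}
```

One difference from the patch: O_TRUNC reuses the existing inode, whereas unlink() followed by a fresh create gives a new file, so a reader holding the old cache open would see different results.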
* [infiniband-diags] [1/3] make libibnetdisc default behavior to overwrite a cache
From: Al Chu @ 2010-04-19 18:43 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 287 bytes --]
Hey Sasha,
As discussed in a previous thread, this patch changes the default caching
behavior in libibnetdisc to overwrite an existing cache.
Al
--
Albert Chu
chu11-i2BcT+NCU+M@public.gmane.org
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
[-- Attachment #2: 0001-make-libibnetdisc-default-behavior-to-overwrite-a-ca.patch --]
[-- Type: message/rfc822, Size: 1143 bytes --]
From: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
Subject: [PATCH] make libibnetdisc default behavior to overwrite a cache
Date: Mon, 19 Apr 2010 10:52:14 -0700
Message-ID: <1271702059.17987.211.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
Signed-off-by: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
---
.../libibnetdisc/src/ibnetdisc_cache.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c b/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
index 5acb8f6..e05ce99 100644
--- a/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
+++ b/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
@@ -889,8 +889,11 @@ int ibnd_cache_fabric(ibnd_fabric_t * fabric, const char *file,
}
if (!stat(file, &statbuf)) {
- IBND_DEBUG("file '%s' already exists\n", file);
- return -1;
+ if (unlink(file) < 0) {
+ IBND_DEBUG("error removing '%s': %s\n",
+ file, strerror(errno));
+ return -1;
+ }
}
if ((fd = open(file, O_CREAT | O_EXCL | O_WRONLY, 0644)) < 0) {
--
1.5.4.5
* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
From: Line Holen @ 2010-04-19 18:32 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20100419153421.GB23994@me>
On 04/19/10 05:34 PM, Sasha Khapyorsky wrote:
> On 11:15 Mon 19 Apr , Line Holen wrote:
>> SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
>> This can happen if a path request is handled while LFT updates to the fabric
>> are in progress.
>> The LFT of the switch data structure is updated as part of the LFT response
>> processing. So while the SM is busy pushing the LFT updates, some switches have
>> up to date LFT info while others are not yet updated and contain the LFT of
>> the previous routing. For a (short) time interval there is a potential for
>> loops in the fabric. The livelock occurs if a path request is received during
>> this time interval.
>> Both LFT response handling and path request processing need the SM lock.
>> When the livelock occurs the LFT response handling blocks forever waiting for
>> the lock to be released.
>>
>> The suggested fix is simply to introduce a max number of hops that should
>> be traversed while handling the path request. If this max is reached then
>> the request will return with NO_RECORD response and release the SM lock.
>> This way the LFT processing will be able to complete.
>>
>> Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>
>
> Applied. Thanks. See minor question/note below.
>
>> ---
>>
>> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
>> index c4c3f86..b399b70 100644
>> --- a/opensm/opensm/osm_sa_path_record.c
>> +++ b/opensm/opensm/osm_sa_path_record.c
>> @@ -4,6 +4,7 @@
>> * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
>> * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
>> * Copyright (c) 2009 HNR Consulting. All rights reserved.
>> + * Copyright (c) 2010 Sun Microsystems, Inc. All rights reserved.
>> *
>> * This software is available to you under a choice of one of two
>> * licenses. You may choose to be licensed under the terms of the GNU
>> @@ -69,6 +70,9 @@
>> #include <opensm/osm_prefix_route.h>
>> #include <opensm/osm_ucast_lash.h>
>>
>> +
>> +#define MAX_HOPS 128
>
> The IB spec defines a maximum number of hops for a fabric, which is 64. Would
> it be better to use this value here?
>
> Sasha
The value of 128 was chosen as 2x max DR path allowing the SM to be in
the middle of a fabric. But I have no problem lowering to 64.
Line
>
>> +
>> typedef struct osm_pr_item {
>> cl_list_item_t list_item;
>> ib_path_rec_t path_rec;
>> @@ -178,6 +182,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
>> osm_qos_level_t *p_qos_level = NULL;
>> uint16_t valid_sl_mask = 0xffff;
>> int is_lash;
>> + int hops = 0;
>>
>> OSM_LOG_ENTER(sa->p_log);
>>
>> @@ -369,6 +374,25 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
>> goto Exit;
>> }
>> }
>> +
>> + /* update number of hops traversed */
>> + hops++;
>> + if (hops > MAX_HOPS) {
>> +
>> + OSM_LOG(sa->p_log, OSM_LOG_ERROR,
>> + "Path from GUID 0x%016" PRIx64 " (%s) to lid %u GUID 0x%016"
>> + PRIx64 " (%s) needs more than %d hops, "
>> + "max %d hops allowed\n",
>> + cl_ntoh64(osm_physp_get_port_guid(p_src_physp)),
>> + p_src_physp->p_node->print_desc,
>> + dest_lid_ho,
>> + cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
>> + p_dest_physp->p_node->print_desc,
>> + hops, MAX_HOPS);
>> +
>> + status = IB_NOT_FOUND;
>> + goto Exit;
>> + }
>> }
>>
>> /*
>>
* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
From: Hal Rosenstock @ 2010-04-19 18:20 UTC (permalink / raw)
To: Line Holen
Cc: sashak-smomgflXvOZWk0Htik3J/w, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4BCC1F3F.5080000-UdXhSnd/wVw@public.gmane.org>
On Mon, Apr 19, 2010 at 5:15 AM, Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org> wrote:
> SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
> This can happen if a path request is handled while LFT updates to the fabric
> are in progress.
> The LFT of the switch data structure is updated as part of the LFT response
> processing. So while the SM is busy pushing the LFT updates, some switches have
> up to date LFT info while others are not yet updated and contain the LFT of
> the previous routing. For a (short) time interval there is a potential for
> loops in the fabric. The livelock occurs if a path request is received during
> this time interval.
> Both LFT response handling and path request processing need the SM lock.
> When the livelock occurs the LFT response handling blocks forever waiting for
> the lock to be released.
>
> The suggested fix is simply to introduce a max number of hops that should
> be traversed while handling the path request. If this max is reached then
> the request will return with NO_RECORD response
To me, this raises the question of whether this should return a BUSY
status rather than NO_RECORD (and whether SA clients should handle
those two differently), but that is a bigger change (and may require
some end node changes as well).
Also, should a similar change be made in SA MPR mpr_rcv_get_path_parms ?
-- Hal
> and release the SM lock.
> This way the LFT processing will be able to complete.
>
> Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>
>
> ---
>
> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
> index c4c3f86..b399b70 100644
> --- a/opensm/opensm/osm_sa_path_record.c
> +++ b/opensm/opensm/osm_sa_path_record.c
> @@ -4,6 +4,7 @@
> * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
> * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
> * Copyright (c) 2009 HNR Consulting. All rights reserved.
> + * Copyright (c) 2010 Sun Microsystems, Inc. All rights reserved.
> *
> * This software is available to you under a choice of one of two
> * licenses. You may choose to be licensed under the terms of the GNU
> @@ -69,6 +70,9 @@
> #include <opensm/osm_prefix_route.h>
> #include <opensm/osm_ucast_lash.h>
>
> +
> +#define MAX_HOPS 128
> +
> typedef struct osm_pr_item {
> cl_list_item_t list_item;
> ib_path_rec_t path_rec;
> @@ -178,6 +182,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
> osm_qos_level_t *p_qos_level = NULL;
> uint16_t valid_sl_mask = 0xffff;
> int is_lash;
> + int hops = 0;
>
> OSM_LOG_ENTER(sa->p_log);
>
> @@ -369,6 +374,25 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
> goto Exit;
> }
> }
> +
> + /* update number of hops traversed */
> + hops++;
> + if (hops > MAX_HOPS) {
> +
> + OSM_LOG(sa->p_log, OSM_LOG_ERROR,
> + "Path from GUID 0x%016" PRIx64 " (%s) to lid %u GUID 0x%016"
> + PRIx64 " (%s) needs more than %d hops, "
> + "max %d hops allowed\n",
> + cl_ntoh64(osm_physp_get_port_guid(p_src_physp)),
> + p_src_physp->p_node->print_desc,
> + dest_lid_ho,
> + cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
> + p_dest_physp->p_node->print_desc,
> + hops, MAX_HOPS);
> +
> + status = IB_NOT_FOUND;
> + goto Exit;
> + }
> }
>
> /*
* Re: [infiniband-diags] [1/2] support libibnetdisc caching overwrite flag
From: Al Chu @ 2010-04-19 16:40 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <20100419150410.GA23994@me>
Hey Sasha,
> Wouldn't it be better to do it the other way around - overwrite the
> file by default and return an error when an "exclusive" flag is specified?
>
> To me that looks like the more "intuitive" behavior (similar to other
> editors).
Now that you mention it, it would make more sense. It would make more
sense for ibnetdiscover too when it tries to cache to the same filename.
I'll tweak and resubmit the patch series.
Al
On Mon, 2010-04-19 at 08:04 -0700, Sasha Khapyorsky wrote:
> Hi Al,
>
> On 16:52 Thu 15 Apr , Al Chu wrote:
> > diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c b/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
> > index 480a0a2..6cf7d4d 100644
> > --- a/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
> > +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
> > @@ -876,9 +876,20 @@ int ibnd_cache_fabric(ibnd_fabric_t * fabric, const char *file,
> > return -1;
> > }
> >
> > - if (!stat(file, &statbuf)) {
> > - IBND_DEBUG("file '%s' already exists\n", file);
> > - return -1;
> > + if (flags & IBND_CACHE_FABRIC_FLAG_OVERWRITE) {
> > + if (!stat(file, &statbuf)) {
> > + if (unlink(file) < 0) {
> > + IBND_DEBUG("error removing '%s': %s\n",
> > + file, strerror(errno));
> > + return -1;
> > + }
> > + }
> > + }
> > + else {
> > + if (!stat(file, &statbuf)) {
> > + IBND_DEBUG("file '%s' already exists\n", file);
> > + return -1;
> > + }
> > }
>
> Wouldn't it be better to do it the other way around - overwrite the file
> by default and return an error when an "exclusive" flag is specified?
>
> To me that looks like the more "intuitive" behavior (similar to other
> editors).
>
> Sasha
--
Albert Chu
chu11-i2BcT+NCU+M@public.gmane.org
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
From: Sasha Khapyorsky @ 2010-04-19 15:34 UTC (permalink / raw)
To: Line Holen; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4BCC1F3F.5080000-UdXhSnd/wVw@public.gmane.org>
On 11:15 Mon 19 Apr , Line Holen wrote:
> SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
> This can happen if a path request is handled while LFT updates to the fabric
> are in progress.
> The LFT of the switch data structure is updated as part of the LFT response
> processing. So while the SM is busy pushing the LFT updates, some switches have
> up to date LFT info while others are not yet updated and contain the LFT of
> the previous routing. For a (short) time interval there is a potential for
> loops in the fabric. The livelock occurs if a path request is received during
> this time interval.
> Both LFT response handling and path request processing need the SM lock.
> When the livelock occurs the LFT response handling blocks forever waiting for
> the lock to be released.
>
> The suggested fix is simply to introduce a max number of hops that should
> be traversed while handling the path request. If this max is reached then
> the request will return with NO_RECORD response and release the SM lock.
> This way the LFT processing will be able to complete.
>
> Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>
Applied. Thanks. See minor question/note below.
>
> ---
>
> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
> index c4c3f86..b399b70 100644
> --- a/opensm/opensm/osm_sa_path_record.c
> +++ b/opensm/opensm/osm_sa_path_record.c
> @@ -4,6 +4,7 @@
> * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
> * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
> * Copyright (c) 2009 HNR Consulting. All rights reserved.
> + * Copyright (c) 2010 Sun Microsystems, Inc. All rights reserved.
> *
> * This software is available to you under a choice of one of two
> * licenses. You may choose to be licensed under the terms of the GNU
> @@ -69,6 +70,9 @@
> #include <opensm/osm_prefix_route.h>
> #include <opensm/osm_ucast_lash.h>
>
> +
> +#define MAX_HOPS 128
The IB spec defines a maximum number of hops for a fabric, which is 64. Would
it be better to use this value here?
Sasha
> +
> typedef struct osm_pr_item {
> cl_list_item_t list_item;
> ib_path_rec_t path_rec;
> @@ -178,6 +182,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
> osm_qos_level_t *p_qos_level = NULL;
> uint16_t valid_sl_mask = 0xffff;
> int is_lash;
> + int hops = 0;
>
> OSM_LOG_ENTER(sa->p_log);
>
> @@ -369,6 +374,25 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
> goto Exit;
> }
> }
> +
> + /* update number of hops traversed */
> + hops++;
> + if (hops > MAX_HOPS) {
> +
> + OSM_LOG(sa->p_log, OSM_LOG_ERROR,
> + "Path from GUID 0x%016" PRIx64 " (%s) to lid %u GUID 0x%016"
> + PRIx64 " (%s) needs more than %d hops, "
> + "max %d hops allowed\n",
> + cl_ntoh64(osm_physp_get_port_guid(p_src_physp)),
> + p_src_physp->p_node->print_desc,
> + dest_lid_ho,
> + cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
> + p_dest_physp->p_node->print_desc,
> + hops, MAX_HOPS);
> +
> + status = IB_NOT_FOUND;
> + goto Exit;
> + }
> }
>
> /*
>
* Re: [infiniband-diags] [1/2] support libibnetdisc caching overwrite flag
From: Sasha Khapyorsky @ 2010-04-19 15:04 UTC (permalink / raw)
To: Al Chu; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1271375547.17987.178.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
Hi Al,
On 16:52 Thu 15 Apr , Al Chu wrote:
> diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c b/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
> index 480a0a2..6cf7d4d 100644
> --- a/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
> +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc_cache.c
> @@ -876,9 +876,20 @@ int ibnd_cache_fabric(ibnd_fabric_t * fabric, const char *file,
> return -1;
> }
>
> - if (!stat(file, &statbuf)) {
> - IBND_DEBUG("file '%s' already exists\n", file);
> - return -1;
> + if (flags & IBND_CACHE_FABRIC_FLAG_OVERWRITE) {
> + if (!stat(file, &statbuf)) {
> + if (unlink(file) < 0) {
> + IBND_DEBUG("error removing '%s': %s\n",
> + file, strerror(errno));
> + return -1;
> + }
> + }
> + }
> + else {
> + if (!stat(file, &statbuf)) {
> + IBND_DEBUG("file '%s' already exists\n", file);
> + return -1;
> + }
> }
Wouldn't it be better to do it the other way around - overwrite the file
by default and return an error when an "exclusive" flag is specified?
To me that looks like the more "intuitive" behavior (similar to other
editors).
Sasha
* Re: [infiniband-diags] [2/2] check for duplicate port guids in libibnetdisc cache
From: Sasha Khapyorsky @ 2010-04-19 14:57 UTC (permalink / raw)
To: Al Chu; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1271285281.17987.140.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
On 15:48 Wed 14 Apr , Al Chu wrote:
> Hey Sasha,
>
> This patch checks for duplicate port guids in a libibnetdisc cache when
> it is loaded and reports an error back to the user appropriately.
>
> Al
>
> --
> Albert Chu
> chu11-i2BcT+NCU+M@public.gmane.org
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory
> Date: Wed, 14 Apr 2010 15:25:11 -0700
> From: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
> Subject: [PATCH] check for duplicate port guids in libibnetdisc cache
> Message-Id: <1271285070.17987.138.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
> Mime-Version: 1.0
> Content-Transfer-Encoding: 7bit
>
>
> Signed-off-by: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
Applied. Thanks.
Sasha
* Re: [infiniband-diags] [1/2] fix libibnetdisc cache error path memleak
From: Sasha Khapyorsky @ 2010-04-19 14:56 UTC (permalink / raw)
To: Al Chu; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1271285269.17987.139.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
On 15:47 Wed 14 Apr , Al Chu wrote:
> Hey Sasha,
>
> This patch fixes a mem-leak through error paths in the libibnetdisc
> cache loading. If some data had not yet been "copied over" to the
> fabric struct and an error occurred, that memory would be leaked.
>
> Al
>
> --
> Albert Chu
> chu11-i2BcT+NCU+M@public.gmane.org
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory
> Date: Wed, 14 Apr 2010 14:11:27 -0700
> From: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
> Subject: [PATCH] fix libibnetdisc cache error path memleak
> Message-Id: <1271284770.17987.131.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
> Mime-Version: 1.0
> Content-Transfer-Encoding: 7bit
>
>
> Signed-off-by: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
Applied. Thanks.
Sasha
* [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
From: Line Holen @ 2010-04-19 9:15 UTC (permalink / raw)
To: sashak-smomgflXvOZWk0Htik3J/w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
This can happen if a path request is handled while LFT updates to the fabric
are in progress.
The LFT of the switch data structure is updated as part of the LFT response
processing. So while the SM is busy pushing the LFT updates, some switches have
up to date LFT info while others are not yet updated and contain the LFT of
the previous routing. For a (short) time interval there is a potential for
loops in the fabric. The livelock occurs if a path request is received during
this time interval.
Both LFT response handling and path request processing need the SM lock.
When the livelock occurs the LFT response handling blocks forever waiting for
the lock to be released.
The suggested fix is simply to introduce a max number of hops that should
be traversed while handling the path request. If this max is reached then
the request will return with NO_RECORD response and release the SM lock.
This way the LFT processing will be able to complete.
Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>
---
diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
index c4c3f86..b399b70 100644
--- a/opensm/opensm/osm_sa_path_record.c
+++ b/opensm/opensm/osm_sa_path_record.c
@@ -4,6 +4,7 @@
* Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
* Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
* Copyright (c) 2009 HNR Consulting. All rights reserved.
+ * Copyright (c) 2010 Sun Microsystems, Inc. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@@ -69,6 +70,9 @@
#include <opensm/osm_prefix_route.h>
#include <opensm/osm_ucast_lash.h>
+
+#define MAX_HOPS 128
+
typedef struct osm_pr_item {
cl_list_item_t list_item;
ib_path_rec_t path_rec;
@@ -178,6 +182,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
osm_qos_level_t *p_qos_level = NULL;
uint16_t valid_sl_mask = 0xffff;
int is_lash;
+ int hops = 0;
OSM_LOG_ENTER(sa->p_log);
@@ -369,6 +374,25 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
goto Exit;
}
}
+
+ /* update number of hops traversed */
+ hops++;
+ if (hops > MAX_HOPS) {
+
+ OSM_LOG(sa->p_log, OSM_LOG_ERROR,
+ "Path from GUID 0x%016" PRIx64 " (%s) to lid %u GUID 0x%016"
+ PRIx64 " (%s) needs more than %d hops, "
+ "max %d hops allowed\n",
+ cl_ntoh64(osm_physp_get_port_guid(p_src_physp)),
+ p_src_physp->p_node->print_desc,
+ dest_lid_ho,
+ cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
+ p_dest_physp->p_node->print_desc,
+ hops, MAX_HOPS);
+
+ status = IB_NOT_FOUND;
+ goto Exit;
+ }
}
/*
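The effect of the hop limit described above can be illustrated with a toy model, assuming each switch's forwarding decision for one destination is a single table entry. This is only a sketch and not OpenSM code: walk_path(), next_hop[] and SKETCH_MAX_HOPS are hypothetical names, and the limit of 64 is the IB fabric maximum mentioned in the review comments.

```c
#include <stddef.h>

#define SKETCH_MAX_HOPS 64	/* hypothetical; the patch itself uses 128 */

/* Toy model: next_hop[i] is the switch that switch i forwards to for one
 * fixed destination. It stands in for the per-switch LFT lookups.
 * Returns the number of hops taken, or -1 if no path is found. */
static int walk_path(const int *next_hop, size_t num_switches,
		     int src, int dest)
{
	int cur = src;
	int hops = 0;

	while (cur != dest) {
		if (cur < 0 || (size_t)cur >= num_switches)
			return -1;	/* walked off the table: no route */
		cur = next_hop[cur];
		hops++;
		if (hops > SKETCH_MAX_HOPS)
			return -1;	/* probable forwarding loop: give up */
	}
	return hops;
}
```

With a consistent table such as {1, 2, 3, 3} the walk from switch 0 to switch 3 ends after three hops. With a half-updated table such as {1, 0, 3, 3}, switches 0 and 1 point at each other, and without the hop check the loop would never exit while the caller holds the lock.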
* Re: [PATCH V4 1/2] IB/core: Add support for enhanced atomic operations
From: Håkon Bugge @ 2010-04-19 8:18 UTC (permalink / raw)
To: vlad-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: rdreier-FYB4Gu1CFyUAvxtiuMwx3w, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20100414142300.GB16346@vlad-laptop>
On Apr 14, 2010, at 16:23 , Vladimir Sokolovsky wrote:
> The additional operands are carried in the Extended Transport Header
Is this a newly defined ETH which follows the AETH on the wire?
Thanks, Håkon
* Re: [infiniband-diags] fix libibnetdisc portguid hashing corner case
From: Sasha Khapyorsky @ 2010-04-18 16:47 UTC (permalink / raw)
To: Al Chu; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1271267283.17987.120.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
On 10:48 Wed 14 Apr , Al Chu wrote:
> Hey Sasha,
>
> This patch fixes a corner case in libibnetdisc that was storing
> portguids w/ a guid of 0.
>
> This bug was relatively innocuous for ibnetdiscover b/c ibnetdiscover
> does not output these ports. However, it became a problem for me in the
> caching library as I attempted to reconstruct a fabric, and multiple
> ports were identifying themselves with identical guids [1].
>
> Al
>
> [1] - The fact the caching code assumes duplicate guids can't exist is
> also a bug. But that's for another patch. This is a bug by itself.
>
> --
> Albert Chu
> chu11-i2BcT+NCU+M@public.gmane.org
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory
> Date: Tue, 13 Apr 2010 14:01:43 -0700
> From: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
> Subject: [PATCH] fix libibnetdisc portguid hashing corner case
> Message-Id: <1271264633.17987.112.camel-X2zTWyBD0EhliZ7u+bvwcg@public.gmane.org>
> Mime-Version: 1.0
> Content-Transfer-Encoding: 7bit
>
>
> Signed-off-by: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
Applied. Thanks.
Sasha
* [PATCH] libibnetdisc: remove not needed num_smps_outstanding counter
From: Sasha Khapyorsky @ 2010-04-18 16:10 UTC (permalink / raw)
To: Ira Weiny
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Hal Rosenstock
In-Reply-To: <20100418160357.GK11943@me>
'num_smps_outstanding' now represents the number of MADs on the wire, which
duplicates information already available from the qmap's cl_qmap_count() function.
Signed-off-by: Sasha Khapyorsky <sashak-smomgflXvOZWk0Htik3J/w@public.gmane.org>
---
infiniband-diags/libibnetdisc/src/internal.h | 1 -
infiniband-diags/libibnetdisc/src/query_smp.c | 14 ++++----------
2 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/infiniband-diags/libibnetdisc/src/internal.h b/infiniband-diags/libibnetdisc/src/internal.h
index 57034f9..571b2f4 100644
--- a/infiniband-diags/libibnetdisc/src/internal.h
+++ b/infiniband-diags/libibnetdisc/src/internal.h
@@ -80,7 +80,6 @@ struct smp_engine {
ibnd_smp_t *smp_queue_tail;
void *user_data;
cl_qmap_t smps_on_wire;
- int num_smps_outstanding;
int max_smps_on_wire;
unsigned total_smps;
};
diff --git a/infiniband-diags/libibnetdisc/src/query_smp.c b/infiniband-diags/libibnetdisc/src/query_smp.c
index d38c2ef..7234844 100644
--- a/infiniband-diags/libibnetdisc/src/query_smp.c
+++ b/infiniband-diags/libibnetdisc/src/query_smp.c
@@ -100,7 +100,6 @@ static int process_smp_queue(smp_engine_t * engine)
free(smp);
return rc;
}
- engine->num_smps_outstanding++;
cl_qmap_insert(&engine->smps_on_wire, (uint32_t) smp->rpc.trid,
(cl_map_item_t *) smp);
engine->total_smps++;
@@ -171,7 +170,6 @@ static int process_one_recv(smp_engine_t * engine)
return -1;
}
- engine->num_smps_outstanding--;
rc = process_smp_queue(engine);
if (rc)
goto error;
@@ -199,7 +197,6 @@ void smp_engine_init(smp_engine_t * engine, struct ibmad_port *ibmad_port,
engine->ibmad_port = ibmad_port;
engine->user_data = user_data;
cl_qmap_init(&engine->smps_on_wire);
- engine->num_smps_outstanding = 0;
engine->max_smps_on_wire = max_smps_on_wire;
}
@@ -224,16 +221,13 @@ void smp_engine_destroy(smp_engine_t * engine)
cl_qmap_remove_item(&engine->smps_on_wire, item);
free(item);
}
-
- engine->num_smps_outstanding = 0;
}
int process_mads(smp_engine_t * engine)
{
- int rc = 0;
- while (engine->num_smps_outstanding > 0)
- while (!cl_is_qmap_empty(&engine->smps_on_wire))
- if ((rc = process_one_recv(engine)) != 0)
- return rc;
+ int rc;
+ while (!cl_is_qmap_empty(&engine->smps_on_wire))
+ if ((rc = process_one_recv(engine)) != 0)
+ return rc;
return 0;
}
--
1.7.0.4
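
For readers who don't have complib at hand, the idea behind the patch can be sketched in isolation: when the container that holds the in-flight requests already knows how many items it has, a separate counter is redundant state that can drift out of sync. The sketch below is only an illustration under stated assumptions. `toy_map`, `toy_item`, and the `toy_map_*` and `drain` functions are hypothetical stand-ins invented here; they are not complib's cl_qmap API and not libibnetdisc code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a keyed container such as cl_qmap_t:
 * a singly linked list keyed by transaction ID that tracks its own size. */
struct toy_item {
	uint32_t trid;
	struct toy_item *next;
};

struct toy_map {
	struct toy_item *head;
	size_t count;		/* maintained by insert/remove only */
};

static void toy_map_init(struct toy_map *m)
{
	m->head = NULL;
	m->count = 0;
}

/* Record a request as "on the wire". Returns 0 on success, -1 on OOM. */
static int toy_map_insert(struct toy_map *m, uint32_t trid)
{
	struct toy_item *it = malloc(sizeof(*it));

	if (!it)
		return -1;
	it->trid = trid;
	it->next = m->head;
	m->head = it;
	m->count++;
	return 0;
}

/* Drop a request once its reply arrived. Returns -1 if trid is unknown. */
static int toy_map_remove(struct toy_map *m, uint32_t trid)
{
	struct toy_item **pp;

	for (pp = &m->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->trid == trid) {
			struct toy_item *dead = *pp;

			*pp = dead->next;
			free(dead);
			m->count--;
			return 0;
		}
	}
	return -1;
}

static int toy_map_is_empty(const struct toy_map *m)
{
	return m->count == 0;
}

/* Same shape as the patched process_mads() loop: keep going until the
 * container itself says nothing is outstanding. No extra counter needed.
 * Here "receiving a reply" is simulated by removing the head item. */
static size_t drain(struct toy_map *m)
{
	size_t handled = 0;

	while (!toy_map_is_empty(m)) {
		assert(m->head != NULL);
		toy_map_remove(m, m->head->trid);
		handled++;
	}
	return handled;
}
```

The design point is that the emptiness test and the stored items share one source of truth, so there is no second variable to update on every send, receive, and teardown path.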