* [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
@ 2010-02-03 0:45 Ira Weiny
[not found] ` <20100202164514.bf2b152a.weiny2-i2BcT+NCU+M@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Ira Weiny @ 2010-02-03 0:45 UTC (permalink / raw)
To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Sasha,
Following up on our thread about having multiple outstanding SMPs in libibnetdisc.
These 2 patches implement that, and also add a function to set the maximum number of outstanding SMPs the library will use.
I left the default at 4. On a large cluster there is some variance with 8 or 12: sometimes I get a speed-up over 4 and other times I don't see any. I suspect it depends on the traffic on the fabric at any particular time.
For example here are some runs I just did on Hyperion.
14:31:55 > /usr/sbin/ibqueryerrors -s RcvErrors,SymbolErrors,RcvSwRelayErrors,XmtWait -r --data
Suppressing: RcvErrors SymbolErrors RcvSwRelayErrors XmtWait
Errors for 0x66a00d90006fb "SW19"
GUID 0x66a00d90006fb port 9: [VL15Dropped == 3] [XmtData == 14562048] [RcvData == 14563872] [XmtPkts == 202255] [RcvPkts == 202276]
Link info: 139 9[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 0x0002c9030001d736 864 1[ ] "hyperion1" ( )
14:32:02 > time ./ibnetdiscover -o 8 --node-name-map /etc/opensm/ib-node-name-map -g > new
real 0m2.210s
user 0m1.251s
sys 0m0.869s
14:40:36 > time ./ibnetdiscover -o 4 --node-name-map /etc/opensm/ib-node-name-map -g > new
real 0m3.385s
user 0m1.888s
sys 0m1.448s
14:40:46 > time ./ibnetdiscover -o 4 --node-name-map /etc/opensm/ib-node-name-map -g > new
real 0m2.211s
user 0m1.165s
sys 0m0.951s
14:40:51 > time ./ibnetdiscover -o 8 --node-name-map /etc/opensm/ib-node-name-map -g > new
real 0m2.249s
user 0m1.244s
sys 0m0.936s
14:40:59 > time ./ibnetdiscover -o 4 --node-name-map /etc/opensm/ib-node-name-map -g > new
real 0m2.170s
user 0m1.160s
sys 0m0.933s
14:41:10 > /usr/sbin/ibqueryerrors -s RcvErrors,SymbolErrors,RcvSwRelayErrors,XmtWait -r --data
Suppressing: RcvErrors SymbolErrors RcvSwRelayErrors XmtWait
Errors for 0x66a00d90006fb "SW19"
GUID 0x66a00d90006fb port 9: [VL15Dropped == 3] [XmtData == 25187379] [RcvData == 25196688] [XmtPkts == 349861] [RcvPkts == 349954]
Link info: 139 9[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> 0x0002c9030001d736 864 1[ ] "hyperion1" ( )
Note that there were no additional VL15Dropped packets on the fabric. I think 4 is a good compromise. I have not tested with errors present on the fabric. (Right now things seem to be good!)
The first patch converts the algorithm; the second adds the ibnd_set_max_smps_on_wire call.
Let me know what you think. Because the algorithm changed so much, testing this is a bit difficult: the order of node discovery is different. However, I have done some extensive diffing of the output of ibnetdiscover and things look good.
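For readers following along, here is a rough sketch of what the new entry point might look like. Only the name ibnd_set_max_smps_on_wire comes from the patch description; the signature, the handling of invalid values, the default constant, and the return convention below are my assumptions for illustration, not the patch's actual code:

```c
#include <assert.h>

/* Hypothetical sketch of the new knob. Only the function name comes from
 * the patch description; everything else here is assumed. */
#define IBND_DEFAULT_MAX_SMPS_ON_WIRE 4

static int ibnd_max_smps_on_wire = IBND_DEFAULT_MAX_SMPS_ON_WIRE;

/* Set the maximum number of SMPs the library keeps outstanding during
 * discovery; returns the previous value so a caller can restore it. */
int ibnd_set_max_smps_on_wire(int max)
{
	int prev = ibnd_max_smps_on_wire;

	if (max < 1)	/* reject nonsense; fall back to the library default */
		max = IBND_DEFAULT_MAX_SMPS_ON_WIRE;
	ibnd_max_smps_on_wire = max;
	return prev;
}
```

A caller such as ibnetdiscover's -o option would presumably invoke this once before starting discovery.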
Ira
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
From: Hal Rosenstock @ 2010-02-04 14:19 UTC (permalink / raw)
To: Ira Weiny
Cc: Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> Sasha,
>
> Following up on our thread regarding having multiple outstanding SMP's in libibnetdisc.
>
> These 2 patches implement that as well as add a function to set the max outstanding the lib will use.
>
> I left the default here to be 4. On a large cluster there seems to be some variance with using 8 or 12. Sometimes I get a speed up over 4 and other times I don't see any. I think it has to do with the traffic on the fabric at any particular time.
>
> [snip]
> Note that there were no additional VL15Dropped packets on the fabric. I think 4 seems to be a good compromise. I have not tested when there are errors on the fabric. (Right now things seem to be good!)
Is this just with the SM doing light sweeping?
Is there a speedup with 4 rather than 2?
-- Hal
>
> The first patch converts the algorithm and the second adds the ibnd_set_max_smps_on_wire call.
>
> Let me know what you think. Because the algorithm changed so much testing this is a bit difficult because the order of the node discovery is different. However, I have done some extensive diffing of the output of ibnetdiscover and things look good.
>
> Ira
* Re: [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
From: Ira Weiny @ 2010-02-04 18:00 UTC (permalink / raw)
To: Hal Rosenstock
Cc: Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, 4 Feb 2010 09:19:39 -0500
Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> > [snip]
> > Note that there were no additional VL15Dropped packets on the fabric. I think 4 seems to be a good compromise. I have not tested when there are errors on the fabric. (Right now things seem to be good!)
>
> Is this just with the SM doing light sweeping ?
Yes.
>
> Is there a speedup with 4 rather than 2 ?
There is a bit of a speed-up (~0.5 to 1.0 sec). But my main reason for wanting
to go to 4 is that if there are issues on the fabric (unresponsive nodes, etc.),
4 will give us better parallelism to get around them. I have not had the chance
to test this condition with the new algorithm, but the original ibnetdiscover
would slow way down when there are nodes with unresponsive SMAs. With only 2
outstanding we would not get much speed-up. This was my main motivation for
improving the library in this way.
Also, I think you are correct that we should increase OpenSM's default from 4
to 8, for the same reason as above. Some of our clusters have worked better
with 8 when we are having issues. But right now we are still running with 4.
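For reference, the OpenSM default being discussed here is, as far as I know, controlled by the max_wire_smps option in opensm.conf (the exact option name is from memory and worth double-checking against your OpenSM version):

```
# opensm.conf fragment (assumed option name: max_wire_smps)
# Maximum number of SMPs OpenSM keeps outstanding on the wire during a sweep
max_wire_smps 8
```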
Ira
>
> -- Hal
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
* Re: [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
From: Hal Rosenstock @ 2010-02-04 20:01 UTC (permalink / raw)
To: Ira Weiny
Cc: Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, Feb 4, 2010 at 1:00 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> On Thu, 4 Feb 2010 09:19:39 -0500
> Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
>> > [snip]
>> Is this just with the SM doing light sweeping ?
>
> Yes.
That's not a lot of SMP stress from the SM side. The SMP consumers are the SM,
the diags, and unsolicited traps.
>
>>
>> Is there a speedup with 4 rather than 2 ?
>
> There is a bit of a speed up (~0.5 to 1.0 sec). But my main reason to want to
> go to 4 is that if there are issues on the fabric, unresponsive nodes etc.; 4
> will give us better parallelism to get around these issues. I have not had
> the chance to test this condition with the new algorithm but the original
> ibnetdiscover would slow way down when there are nodes which have unresponsive
> SMA's. If there are only 2 outstanding this will not give us much speed up.
> This was the main motivation I had for improving the library in this way.
>
> Also, I think you are correct that we should increase OpenSM's default from 4
> to 8. For the same reason as above. Some of our clusters have worked better
> with 8 when we are having issues. But right now we are still running with 4.
I'm concerned about increasing ibnetdiscover's default to 4 rather than 2.
I've seen a number of clusters drop SMPs even with the current, lower
defaults.
-- Hal
> Ira
* Re: [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
From: Ira Weiny @ 2010-02-05 0:13 UTC (permalink / raw)
To: Hal Rosenstock
Cc: Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, 4 Feb 2010 15:01:32 -0500
Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Thu, Feb 4, 2010 at 1:00 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> > On Thu, 4 Feb 2010 09:19:39 -0500
> > Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> >> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> >> > Sasha,
> >> >
[snip]
> >> > Note that there were no additional VL15Dropped packets on the fabric. I think 4 seems to be a good compromise. I have not tested when there are errors on the fabric. (Right now things seem to be good!)
> >>
> >> Is this just with the SM doing light sweeping ?
> >
> > Yes.
>
> That's not a lot of SMP stress from the SM side. SMP consumers are SM,
> diags, and the unsolicited traps.
Agreed. I hope to test this more next week.
>
> >
> >>
> >> Is there a speedup with 4 rather than 2 ?
> >
> > There is a bit of a speed up (~0.5 to 1.0 sec). But my main reason to want to
> > go to 4 is that if there are issues on the fabric, unresponsive nodes etc.; 4
> > will give us better parallelism to get around these issues. I have not had
> > the chance to test this condition with the new algorithm but the original
> > ibnetdiscover would slow way down when there are nodes which have unresponsive
> > SMA's. If there are only 2 outstanding this will not give us much speed up.
> > This was the main motivation I had for improving the library in this way.
> >
> > Also, I think you are correct that we should increase OpenSM's default from 4
> > to 8. For the same reason as above. Some of our clusters have worked better
> > with 8 when we are having issues. But right now we are still running with 4.
>
> I'm concerned about just increasing ibnetdiscover to 4 rather than 2.
> I've seen a number of clusters with SMP dropping with the current
> lower defaults.
So OpenSM is seeing dropped packets? With 4 SMPs on the wire? I do see some
VL15Dropped errors (maybe 2-3 a day), but I did not think that would be an
issue. What kind of rate are you seeing?
The other question is: do people regularly run the tools which use
libibnetdisc (ibqueryerrors, iblinkinfo, ibnetdiscover)? We do. If others do
not, then this change would have less impact, since they would want the diags
to have some priority for debugging. The other option is to change the patch
to default to 2 and allow the user to change it depending on what they are
trying to do. If you think that is best, I will change the patch.
Ira
>
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
* Re: [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
From: Ira Weiny @ 2010-02-05 2:18 UTC (permalink / raw)
To: Ira Weiny
Cc: Hal Rosenstock, Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, 4 Feb 2010 16:13:25 -0800
Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> On Thu, 4 Feb 2010 15:01:32 -0500
> Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> > On Thu, Feb 4, 2010 at 1:00 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> > > On Thu, 4 Feb 2010 09:19:39 -0500
> > > Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > >
> > >> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> > >> > Sasha,
> > >> >
>
> [snip]
[snip]
> > >>
> > >> Is there a speedup with 4 rather than 2 ?
> > >
> > > There is a bit of a speed up (~0.5 to 1.0 sec). But my main reason to want to
> > > go to 4 is that if there are issues on the fabric, unresponsive nodes etc.; 4
> > > will give us better parallelism to get around these issues. I have not had
> > > the chance to test this condition with the new algorithm but the original
> > > ibnetdiscover would slow way down when there are nodes which have unresponsive
> > > SMA's. If there are only 2 outstanding this will not give us much speed up.
> > > This was the main motivation I had for improving the library in this way.
OK, I found a fabric with just 2 unresponsive nodes. A quick test shows the
following.
Original ibnetdiscover:
18:12:29 > time ./ibnetdiscover > foo
ibwarn: [26993] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,11,9)
src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,11,9) failed, skipping port
ibwarn: [26993] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,24,18,7,6)
src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,24,18,7,6) failed, skipping port
real 0m9.073s
user 0m0.137s
sys 0m0.172s
18:12:43 > time ./ibnetdiscover > foo
ibwarn: [31111] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,11,9)
src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,11,9) failed, skipping port
ibwarn: [31111] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,24,18,7,6)
src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,24,18,7,6) failed, skipping port
real 0m9.103s
user 0m0.046s
sys 0m0.046s
*New* ibnetdiscover with different numbers of outstanding SMPs:
18:12:14 > time ./ibnetdiscover -o 2 > foo
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
real 0m9.746s
user 0m6.559s
sys 0m3.156s
18:13:00 > time ./ibnetdiscover -o 4 > foo
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
real 0m4.668s
user 0m3.043s
sys 0m1.601s
18:13:10 > time ./ibnetdiscover -o 8 > foo
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
real 0m4.360s
user 0m2.891s
sys 0m1.451s
Note that 2 does not give much speed-up, whereas 4 does. Obviously this could
be because only 2 nodes were bad (so if you had 100's of unresponsive nodes, a
higher value might be worth using), but as a default compromise I think 4 is
good.
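These numbers are consistent with a simple back-of-envelope model: treat discovery as scheduling N fast queries (one RTT each) plus M queries that hang until the MAD timeout onto `window` in-flight slots. A hung slot only stalls the pipeline when there are not enough remaining slots to keep good traffic draining. The toy model below is my own construction to make that concrete, not libibnetdisc's code:

```c
#include <assert.h>

/* Toy model: discovery issues queries into `window` in-flight slots.
 * n_good queries answer in rtt_ms; n_bad queries hang for timeout_ms.
 * Greedy list scheduling: each query goes to the slot that frees first.
 * Returns the makespan (total wall-clock time) in milliseconds.
 * Illustration only -- real directed-route discovery also serializes on
 * path dependencies, so actual gaps are larger than this model predicts. */
double discover_time_ms(int n_good, int n_bad, int window,
			double rtt_ms, double timeout_ms)
{
	double slot[64] = { 0.0 };
	double makespan = 0.0;
	int i, j;

	if (window < 1)
		window = 1;
	if (window > 64)
		window = 64;

	/* Worst case for a small window: the hanging queries go out first. */
	for (i = 0; i < n_bad + n_good; i++) {
		double dur = (i < n_bad) ? timeout_ms : rtt_ms;
		int best = 0;
		for (j = 1; j < window; j++)
			if (slot[j] < slot[best])
				best = j;
		slot[best] += dur;
	}
	for (j = 0; j < window; j++)
		if (slot[j] > makespan)
			makespan = slot[j];
	return makespan;
}
```

With, say, 300 responsive nodes at ~5 ms each and 2 nodes that hang for a ~4.5 s timeout, a 2-slot pipeline finishes noticeably later than a 4-slot one in this model, in the same direction as the runs above.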
Ira
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
* Re: [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
From: Hal Rosenstock @ 2010-02-05 12:24 UTC (permalink / raw)
To: Ira Weiny
Cc: Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, Feb 4, 2010 at 7:13 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> On Thu, 4 Feb 2010 15:01:32 -0500
> Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> On Thu, Feb 4, 2010 at 1:00 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
>> > On Thu, 4 Feb 2010 09:19:39 -0500
>> > Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> >
>> >> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
>> >> > Sasha,
>> >> >
>
> [snip]
>
>> >> >
>> >> > Note that there were no additional VL15Dropped packets on the fabric. I think 4 seems to be a good compromise. I have not tested when there are errors on the fabric. (Right now things seem to be good!)
>> >>
>> >> Is this just with the SM doing light sweeping ?
>> >
>> > Yes.
>>
>> That's not a lot of SMP stress from the SM side. SMP consumers are SM,
>> diags, and the unsolicited traps.
>
> Agreed. I hope to test this more next week.
>>
>> >
>> >>
>> >> Is there a speedup with 4 rather than 2 ?
>> >
>> > There is a bit of a speedup (~0.5 to 1.0 sec). But my main reason to want to
>> > go to 4 is that if there are issues on the fabric (unresponsive nodes, etc.), 4
>> > will give us better parallelism to get around those issues. I have not had
>> > the chance to test this condition with the new algorithm, but the original
>> > ibnetdiscover would slow way down when there are nodes with unresponsive
>> > SMAs. With only 2 outstanding, this will not give us much speedup.
>> > This was the main motivation I had for improving the library in this way.
>> >
>> > Also, I think you are correct that we should increase OpenSM's default from 4
>> > to 8. For the same reason as above. Some of our clusters have worked better
>> > with 8 when we are having issues. But right now we are still running with 4.
>>
>> I'm concerned about just increasing ibnetdiscover to 4 rather than 2.
>> I've seen a number of clusters with SMP dropping with the current
>> lower defaults.
>
> So OpenSM is seeing dropped packets?
OpenSM is seeing timeouts and there are VL15 drops in the subnet.
> With 4 SMP's on the wire?
Yes.
> I do see some
> VL15Dropped errors (maybe 2-3 a day) but I did not think that would be an
> issue. What kind of rate are you seeing?
> The other question is: do people regularly run the tools which use
> libibnetdisc (ibqueryerrors, iblinkinfo, ibnetdiscover)?
These tools are being used (at least ibnetdiscover and ibqueryerrors).
> We do. If others
> are not, then I would say this change would have less impact, as they would want
> the diags to have some priority for debugging. The other option is to change
> the patch to default to 2 and allow the user to change it depending on what
> they are trying to do. If you think that is best I will change the patch.
FWIW I think 2 is better until we have more exhaustive experience with
4. The other alternative would be to make it 4 and then see if people
start noticing (more) VL15 drops and possibly other issues.
-- Hal
> Ira
>
>>
>> -- Hal
>>
>> > Ira
>> >
>> >>
>> >> -- Hal
>> >>
>> >> >
>> >> > The first patch converts the algorithm and the second adds the ibnd_set_max_smps_on_wire call.
>> >> >
>> >> > Let me know what you think. Because the algorithm changed so much, testing this is a bit difficult: the order of node discovery is different. However, I have done some extensive diffing of the ibnetdiscover output and things look good.
>> >> >
>> >> > Ira
>> >> >
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
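Hal's caution above about VL15 drops is checkable in practice: the ibqueryerrors output quoted earlier in the thread prints counters in a regular `[Name == value]` form, so drops between two diag runs can be diffed mechanically. A minimal sketch of that idea (the helper names are illustrative, not part of any shipped tool):

```python
import re

# Pull per-counter values out of ibqueryerrors-style lines such as:
#   GUID 0x66a00d90006fb port 9: [VL15Dropped == 3] [XmtPkts == 202255]
COUNTER_RE = re.compile(r"\[(\w+) == (\d+)\]")

def parse_counters(line):
    """Return a dict of counter name -> integer value for one output line."""
    return {name: int(val) for name, val in COUNTER_RE.findall(line)}

def vl15_dropped_delta(before_line, after_line):
    """Difference in VL15Dropped between two samples of the same port."""
    before = parse_counters(before_line).get("VL15Dropped", 0)
    after = parse_counters(after_line).get("VL15Dropped", 0)
    return after - before

before = "GUID 0x66a00d90006fb port 9: [VL15Dropped == 3] [XmtPkts == 202255]"
after = "GUID 0x66a00d90006fb port 9: [VL15Dropped == 3] [XmtPkts == 349861]"
print(vl15_dropped_delta(before, after))  # prints 0: no additional drops
```

Sampling before and after a discovery run (as Ira does in the thread) gives a per-run answer to whether a higher `-o` value is costing VL15 drops.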
* Re: [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
[not found] ` <20100204181852.f175d968.weiny2-i2BcT+NCU+M@public.gmane.org>
@ 2010-02-05 12:27 ` Hal Rosenstock
[not found] ` <f0e08f231002050427r37d326b4w4c188e30992579ac-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Hal Rosenstock @ 2010-02-05 12:27 UTC (permalink / raw)
To: Ira Weiny
Cc: Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, Feb 4, 2010 at 9:18 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
> On Thu, 4 Feb 2010 16:13:25 -0800
> Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
>
>> On Thu, 4 Feb 2010 15:01:32 -0500
>> Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>
>> > On Thu, Feb 4, 2010 at 1:00 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
>> > > On Thu, 4 Feb 2010 09:19:39 -0500
>> > > Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> > >
>> > >> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org> wrote:
>> > >> > Sasha,
>> > >> >
>>
>> [snip]
>
> [snip]
>
>> > >>
>> > >> Is there a speedup with 4 rather than 2 ?
>> > >
>> > > There is a bit of a speedup (~0.5 to 1.0 sec). But my main reason to want to
>> > > go to 4 is that if there are issues on the fabric (unresponsive nodes, etc.), 4
>> > > will give us better parallelism to get around those issues. I have not had
>> > > the chance to test this condition with the new algorithm, but the original
>> > > ibnetdiscover would slow way down when there are nodes with unresponsive
>> > > SMAs. With only 2 outstanding, this will not give us much speedup.
>> > > This was the main motivation I had for improving the library in this way.
>
> Ok, I found a fabric with just 2 nodes which were unresponsive... A quick
> test shows...
>
> Original ibnetdiscover:
>
> 18:12:29 > time ./ibnetdiscover > foo
> ibwarn: [26993] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,11,9)
> src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,11,9) failed, skipping port
> ibwarn: [26993] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,24,18,7,6)
> src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,24,18,7,6) failed, skipping port
>
> real 0m9.073s
> user 0m0.137s
> sys 0m0.172s
>
> 18:12:43 > time ./ibnetdiscover > foo
> ibwarn: [31111] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,11,9)
> src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,11,9) failed, skipping port
> ibwarn: [31111] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,24,18,7,6)
> src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,24,18,7,6) failed, skipping port
>
> real 0m9.103s
> user 0m0.046s
> sys 0m0.046s
>
>
> *New* ibnetdiscover with different outstanding SMP's.
>
> 18:12:14 > time ./ibnetdiscover -o 2 > foo
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
>
> real 0m9.746s
> user 0m6.559s
> sys 0m3.156s
>
> 18:13:00 > time ./ibnetdiscover -o 4 > foo
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
>
> real 0m4.668s
> user 0m3.043s
> sys 0m1.601s
>
> 18:13:10 > time ./ibnetdiscover -o 8 > foo
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
>
> real 0m4.360s
> user 0m2.891s
> sys 0m1.451s
>
>
> Note that 2 does not give much speedup, whereas 4 does. Obviously this could
> have to do with the fact that there were 2 bad nodes (so if you had
> hundreds of unresponsive nodes a higher value might be worth using)
It depends on whether the number of unresponsive nodes is the same as or higher
than the number of outstanding/parallel SMPs. In a sense, the number of
outstanding SMPs is a measure of how many unresponsive nodes one is
willing to tolerate before slowing down/waiting for timeouts. In some
environments, unresponsive nodes are a normal case.
-- Hal
> but as a
> default compromise I think 4 is good.
>
> Ira
>
>> > >
>> > > Also, I think you are correct that we should increase OpenSM's default from 4
>> > > to 8. For the same reason as above. Some of our clusters have worked better
>> > > with 8 when we are having issues. But right now we are still running with 4.
>> >
>> > I'm concerned about just increasing ibnetdiscover to 4 rather than 2.
>> > I've seen a number of clusters with SMP dropping with the current
>> > lower defaults.
>>
>> So OpenSM is seeing dropped packets? With 4 SMP's on the wire? I do see some
>> VL15Dropped errors (maybe 2-3 a day) but I did not think that would be an
>> issue. What kind of rate are you seeing?
>>
>> The other question is: do people regularly run the tools which use
>> libibnetdisc (ibqueryerrors, iblinkinfo, ibnetdiscover)? We do. If others
>> are not, then I would say this change would have less impact, as they would want
>> the diags to have some priority for debugging. The other option is to change
>> the patch to default to 2 and allow the user to change it depending on what
>> they are trying to do. If you think that is best I will change the patch.
>>
>> Ira
>>
>> >
>> > -- Hal
>> >
>> > > Ira
>> > >
>> > >>
>> > >> -- Hal
>> > >>
>> > >> >
>> > >> > The first patch converts the algorithm and the second adds the ibnd_set_max_smps_on_wire call.
>> > >> >
>> > >> > Let me know what you think. Because the algorithm changed so much, testing this is a bit difficult: the order of node discovery is different. However, I have done some extensive diffing of the ibnetdiscover output and things look good.
>> > >> >
>> > >> > Ira
>> > >> >
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
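Hal's explanation above can be turned into a rough back-of-the-envelope model: the sweep only stalls for a full timeout when every outstanding slot is simultaneously tied up on an unresponsive node. The constants below (per-SMP cost, timeout) are made-up illustration values, not measurements from Hyperion or any other fabric:

```python
def discovery_time(total_smps, unresponsive, outstanding,
                   per_smp=0.002, timeout=4.0):
    """Crude model of a parallel directed-route SMP sweep.

    Responsive SMPs are pipelined at per_smp seconds each; the sweep
    stalls for a full `timeout` only when all `outstanding` slots are
    stuck waiting on unresponsive SMAs at once.
    """
    good = (total_smps - unresponsive) * per_smp
    full_stalls = unresponsive // outstanding  # batches that block everything
    return good + full_stalls * timeout

# 2 bad nodes: -o 2 can be fully blocked once, while -o 4 and -o 8 never
# are, which matches the shape of the timings in the message above.
for o in (2, 4, 8):
    print(o, round(discovery_time(2000, 2, o), 2))
```

Under this sketch, raising `outstanding` past the number of simultaneously unresponsive nodes is what buys the speedup, exactly the tolerance argument Hal makes.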
* Re: [PATCH 0/2] Using multi-smps on the wire in libibnetdisc
[not found] ` <f0e08f231002050427r37d326b4w4c188e30992579ac-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-02-06 1:03 ` Ira Weiny
0 siblings, 0 replies; 9+ messages in thread
From: Ira Weiny @ 2010-02-06 1:03 UTC (permalink / raw)
To: Hal Rosenstock
Cc: Sasha Khapyorsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Fri, 5 Feb 2010 07:27:05 -0500
Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> >
> > Note that 2 does not give much speedup, whereas 4 does. Obviously this could
> > have to do with the fact that there were 2 bad nodes (so if you had
> > hundreds of unresponsive nodes a higher value might be worth using)
>
> It depends on whether the number of unresponsive nodes is the same as or higher
> than the number of outstanding/parallel SMPs. In a sense, the number of
> outstanding SMPs is a measure of how many unresponsive nodes one is
> willing to tolerate before slowing down/waiting for timeouts. In some
> environments, unresponsive nodes are a normal case.
Agreed, but where should we set the default? I don't think 4 is a bad default;
I don't think it makes the diags overly aggressive compared with OpenSM.
Sasha, I guess this is your call.
Just tell me where to set it and I will make the patch. Basically, with the
user option it can always be changed on a run-by-run basis.
Ira
>
> -- Hal
>
> > but as a
> > default compromise I think 4 is good.
> >
> > Ira
> >
> >> > >
> >> > > Also, I think you are correct that we should increase OpenSM's default from 4
> >> > > to 8. For the same reason as above. Some of our clusters have worked better
> >> > > with 8 when we are having issues. But right now we are still running with 4.
> >> >
> >> > I'm concerned about just increasing ibnetdiscover to 4 rather than 2.
> >> > I've seen a number of clusters with SMP dropping with the current
> >> > lower defaults.
> >>
> >> So OpenSM is seeing dropped packets? With 4 SMP's on the wire? I do see some
> >> VL15Dropped errors (maybe 2-3 a day) but I did not think that would be an
> >> issue. What kind of rate are you seeing?
> >>
> >> The other question is: do people regularly run the tools which use
> >> libibnetdisc (ibqueryerrors, iblinkinfo, ibnetdiscover)? We do. If others
> >> are not, then I would say this change would have less impact, as they would want
> >> the diags to have some priority for debugging. The other option is to change
> >> the patch to default to 2 and allow the user to change it depending on what
> >> they are trying to do. If you think that is best I will change the patch.
> >>
> >> Ira
> >>
> >> >
> >> > -- Hal
> >> >
> >> > > Ira
> >> > >
> >> > >>
> >> > >> -- Hal
> >> > >>
> >> > >> >
> >> > >> > The first patch converts the algorithm and the second adds the ibnd_set_max_smps_on_wire call.
> >> > >> >
> >> > >> > Let me know what you think. Because the algorithm changed so much, testing this is a bit difficult: the order of node discovery is different. However, I have done some extensive diffing of the ibnetdiscover output and things look good.
> >> > >> >
> >> > >> > Ira
> >> > >> >
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2010-02-06 1:03 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-03 0:45 [PATCH 0/2] Using multi-smps on the wire in libibnetdisc Ira Weiny
[not found] ` <20100202164514.bf2b152a.weiny2-i2BcT+NCU+M@public.gmane.org>
2010-02-04 14:19 ` Hal Rosenstock
[not found] ` <f0e08f231002040619y34784bauec6a25d31e7d229b-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-02-04 18:00 ` Ira Weiny
[not found] ` <20100204100045.4d2aa9aa.weiny2-i2BcT+NCU+M@public.gmane.org>
2010-02-04 20:01 ` Hal Rosenstock
[not found] ` <f0e08f231002041201y2b48aa30nc33d2816ed11b1bf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-02-05 0:13 ` Ira Weiny
[not found] ` <20100204161325.c4481bfe.weiny2-i2BcT+NCU+M@public.gmane.org>
2010-02-05 2:18 ` Ira Weiny
[not found] ` <20100204181852.f175d968.weiny2-i2BcT+NCU+M@public.gmane.org>
2010-02-05 12:27 ` Hal Rosenstock
[not found] ` <f0e08f231002050427r37d326b4w4c188e30992579ac-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-02-06 1:03 ` Ira Weiny
2010-02-05 12:24 ` Hal Rosenstock
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox