* Slow veth performance over ipoib interface on 4.7.0 (and earlier) (Was Re: [IPOIB] Excessive TX packet drops due to IPOIB_MAX_PATH_REC_QUEUE)
       [not found] ` <CAAk-MO9C7i0en5ZE=pufz6tMecUi23kL=5FR36JNfPzuO1G5-g@mail.gmail.com>
@ 2016-08-04 13:34 ` Nikolay Borisov
       [not found] ` <57A34448.1040600-6AxghH7DbtA@public.gmane.org>
  0 siblings, 1 reply; 2+ messages in thread
From: Nikolay Borisov @ 2016-08-04 13:34 UTC (permalink / raw)
  To: Erez Shitrit; +Cc: linux-rdma@vger.kernel.org, netdev

[-- Attachment #1: Type: text/plain, Size: 4084 bytes --]

On 08/01/2016 11:56 AM, Erez Shitrit wrote:
> The GID (9000:0:2800:0:bc00:7500:6e:d8a4) is not regular, not from
> local subnet prefix.
> why is that?

So I managed to debug this, and it turns out the problem lies in the
interaction between veth and ipoib:

I've discovered the following strange thing. If I have a veth pair
whose two devices are in different net namespaces, as set up by the
attached scripts, then sending a file that originates from the veth
interface inside the non-init net namespace and crosses the ipoib
interface is very slow (100kb). For simple reproduction I'm attaching
two scripts which have to be run on two machines, with the respective
IP addresses set on them; the sending node then initiates a simple file
copy over nc. I've observed this behavior on upstream 4.4, 4.5.4 and
4.7.0 kernels, with both IPv4 and IPv6 addresses. Here is what the
debug log of the ipoib module shows:

ib%d: max_srq_sge=128
ib%d: max_cm_mtu = 0xfff0, num_frags=16
ib0: enabling connected mode will cause multicast packet drops
ib0: mtu > 4092 will cause multicast packet drops.
ib0: bringing up interface
ib0: starting multicast thread
ib0: joining MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff
ib0: restarting multicast task
ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0000:0000:0001
ib0: restarting multicast task
ib0: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0000:0001
ib0: join completion for ff12:401b:ffff:0000:0000:0000:ffff:ffff (status 0)
ib0: Created ah ffff88081063ea80
ib0: MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff AV ffff88081063ea80, LID 0xc000, SL 0
ib0: joining MGID ff12:601b:ffff:0000:0000:0000:0000:0001
ib0: joining MGID ff12:401b:ffff:0000:0000:0000:0000:0001
ib0: successfully started all multicast joins
ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: Created ah ffff880839084680
ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff880839084680, LID 0xc002, SL 0
ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: Created ah ffff88081063e280
ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081063e280, LID 0xc004, SL 0

When the transfer is initiated I can see the following errors on the
sending node:

ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36

Here is the port GUID of the sending node: 0x0011750000772664, and of
the receiving one: 0x0011750000774d36.

Here is how the paths look on the sending node; clearly the incomplete
paths are the ones being requested on behalf of the veth interface:

cat /sys/kernel/debug/ipoib/ib0_path
GID: 401:0:1400:0:a0a8:ffff:1c01:4d36
  complete: no

GID: 401:0:1400:0:a410:ffff:1c01:4d36
  complete: no

GID: fe80:0:0:0:11:7500:77:2a1a
  complete: yes
  DLID:     0x0004
  SL:       0
  rate:     40.0 Gb/sec

GID: fe80:0:0:0:11:7500:77:4d36
  complete: yes
  DLID:     0x000a
  SL:       0
  rate:     40.0 Gb/sec

Testing the same scenario, but creating the device in the non-init net
namespace via the following commands instead of using veth devices, I
can achieve sensible speeds:

ip link add link ib0 name ip1 type ipoib
ip link set dev ip1 netns test-netnamespace

[Snipped a lot of useless stuff]

[-- Attachment #2: receive-node.sh --]
[-- Type: application/x-shellscript, Size: 181 bytes --]

[-- Attachment #3: sending-node.sh --]
[-- Type: application/x-shellscript, Size: 806 bytes --]

^ permalink raw reply [flat|nested] 2+ messages in thread
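The two attached reproduction scripts are not preserved in this archive.
A minimal sketch of the kind of setup the message describes follows; the
namespace name, veth addresses, port, file name and the return route on
the receiver are illustrative assumptions, not the actual attachment
contents.

Receiving node (listening on its ib0 address):

    nc -l -p 5001 > /dev/null        # exact flags vary between netcat variants
    # plus a route back to the veth subnet via the sender's ib0 address, e.g.
    # ip route add 192.168.100.0/24 via <sender-ib0-address>

Sending node (veth pair into a namespace, host routes towards ib0):

    ip netns add test-netnamespace
    ip link add veth0 type veth peer name veth1
    ip link set veth1 netns test-netnamespace
    ip addr add 192.168.100.1/24 dev veth0
    ip link set veth0 up
    ip netns exec test-netnamespace ip addr add 192.168.100.2/24 dev veth1
    ip netns exec test-netnamespace ip link set dev veth1 up
    ip netns exec test-netnamespace ip link set dev lo up
    ip netns exec test-netnamespace ip route add default via 192.168.100.1
    sysctl -w net.ipv4.ip_forward=1  # host forwards between veth0 and ib0
    # initiate the file copy from inside the namespace over nc
    ip netns exec test-netnamespace nc <receiver-ib0-address> 5001 < testfile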
[parent not found: <57A34448.1040600-6AxghH7DbtA@public.gmane.org>]
* Re: Slow veth performance over ipoib interface on 4.7.0 (and earlier) (Was Re: [IPOIB] Excessive TX packet drops due to IPOIB_MAX_PATH_REC_QUEUE)
       [not found] ` <57A34448.1040600-6AxghH7DbtA@public.gmane.org>
@ 2016-08-04 14:08 ` Doug Ledford
  0 siblings, 0 replies; 2+ messages in thread
From: Doug Ledford @ 2016-08-04 14:08 UTC (permalink / raw)
  To: Nikolay Borisov, Erez Shitrit
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, netdev-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 4810 bytes --]

On Thu, 2016-08-04 at 16:34 +0300, Nikolay Borisov wrote:
> 
> On 08/01/2016 11:56 AM, Erez Shitrit wrote:
> > 
> > The GID (9000:0:2800:0:bc00:7500:6e:d8a4) is not regular, not from
> > local subnet prefix.
> > why is that?
> > 
> 
> So I managed to debug this, and it turns out the problem lies in the
> interaction between veth and ipoib:
> 
> I've discovered the following strange thing. If I have a veth pair
> whose two devices are in different net namespaces, as set up by the
> attached scripts, then sending a file that originates from the veth
> interface inside the non-init net namespace and crosses the ipoib
> interface is very slow (100kb). For simple reproduction I'm attaching
> two scripts which have to be run on two machines, with the respective
> IP addresses set on them; the sending node then initiates a simple
> file copy over nc. I've observed this behavior on upstream 4.4, 4.5.4
> and 4.7.0 kernels, with both IPv4 and IPv6 addresses. Here is what
> the debug log of the ipoib module shows:
> 
> ib%d: max_srq_sge=128
> ib%d: max_cm_mtu = 0xfff0, num_frags=16
> ib0: enabling connected mode will cause multicast packet drops
> ib0: mtu > 4092 will cause multicast packet drops.
> ib0: bringing up interface
> ib0: starting multicast thread
> ib0: joining MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff
> ib0: restarting multicast task
> ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0000:0000:0001
> ib0: restarting multicast task
> ib0: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0000:0001
> ib0: join completion for ff12:401b:ffff:0000:0000:0000:ffff:ffff (status 0)
> ib0: Created ah ffff88081063ea80
> ib0: MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff AV ffff88081063ea80, LID 0xc000, SL 0
> ib0: joining MGID ff12:601b:ffff:0000:0000:0000:0000:0001
> ib0: joining MGID ff12:401b:ffff:0000:0000:0000:0000:0001
> ib0: successfully started all multicast joins
> ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
> ib0: Created ah ffff880839084680
> ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff880839084680, LID 0xc002, SL 0
> ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
> ib0: Created ah ffff88081063e280
> ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081063e280, LID 0xc004, SL 0
> 
> When the transfer is initiated I can see the following errors on the
> sending node:
> 
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> 
> Here is the port GUID of the sending node: 0x0011750000772664, and of
> the receiving one: 0x0011750000774d36.
> 
> Here is how the paths look on the sending node; clearly the
> incomplete paths are the ones being requested on behalf of the veth
> interface:
> 
> cat /sys/kernel/debug/ipoib/ib0_path
> GID: 401:0:1400:0:a0a8:ffff:1c01:4d36
>   complete: no
> 
> GID: 401:0:1400:0:a410:ffff:1c01:4d36
>   complete: no
> 
> GID: fe80:0:0:0:11:7500:77:2a1a
>   complete: yes
>   DLID:     0x0004
>   SL:       0
>   rate:     40.0 Gb/sec
> 
> GID: fe80:0:0:0:11:7500:77:4d36
>   complete: yes
>   DLID:     0x000a
>   SL:       0
>   rate:     40.0 Gb/sec
> 
> Testing the same scenario, but creating the device in the non-init
> net namespace via the following commands instead of using veth
> devices, I can achieve sensible speeds:
> 
> ip link add link ib0 name ip1 type ipoib
> ip link set dev ip1 netns test-netnamespace
> 
> [Snipped a lot of useless stuff]

The poor performance sounds like a duplicate of the issue reported by
Roland and tracked in the upstream kernel bugzilla as bug 111921. That
would be the IPoIB routed packet performance issue.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply [flat|nested] 2+ messages in thread
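For reference, the faster alternative Nikolay quotes above (moving a
native IPoIB child interface into the namespace, so traffic is not
routed between a veth device and ib0) would amount to roughly the
following; the two "ip link" commands are taken from the message, while
the namespace creation and the address are illustrative assumptions:

    ip netns add test-netnamespace
    ip link add link ib0 name ip1 type ipoib      # child device on top of ib0
    ip link set dev ip1 netns test-netnamespace
    ip netns exec test-netnamespace ip link set dev ip1 up
    ip netns exec test-netnamespace ip addr add 10.10.10.2/24 dev ip1   # illustrative address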
end of thread, other threads: [~2016-08-04 14:08 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <5799E5E6.3060104@kyup.com>
[not found] ` <CAAk-MO83mJTq=E_MC=izqq8fEmVujY=5egVmKfFjxAz4jO3hHg@mail.gmail.com>
[not found] ` <579F065C.602@kyup.com>
[not found] ` <CAAk-MO9C7i0en5ZE=pufz6tMecUi23kL=5FR36JNfPzuO1G5-g@mail.gmail.com>
2016-08-04 13:34 ` Slow veth performance over ipoib interface on 4.7.0 (and earlier) (Was Re: [IPOIB] Excessive TX packet drops due to IPOIB_MAX_PATH_REC_QUEUE) Nikolay Borisov
[not found] ` <57A34448.1040600-6AxghH7DbtA@public.gmane.org>
2016-08-04 14:08 ` Doug Ledford