From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Nikolay Borisov <kernel-6AxghH7DbtA@public.gmane.org>,
	Erez Shitrit
	<erezsh-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Slow veth performance over ipoib interface on 4.7.0 (and earlier) (Was Re: [IPOIB] Excessive TX packet drops due to IPOIB_MAX_PATH_REC_QUEUE)
Date: Thu, 04 Aug 2016 10:08:28 -0400
Message-ID: <1470319708.18081.104.camel@redhat.com>
In-Reply-To: <57A34448.1040600-6AxghH7DbtA@public.gmane.org>

On Thu, 2016-08-04 at 16:34 +0300, Nikolay Borisov wrote:
> 
> On 08/01/2016 11:56 AM, Erez Shitrit wrote:
> > 
> > The GID (9000:0:2800:0:bc00:7500:6e:d8a4) is not regular, not from
> > local subnet prefix.
> > why is that?
> > 
> 
> So I managed to debug this, and it turns out the problem lies in the
> interaction between veth and ipoib.
> 
> I've discovered the following strange thing. If I have a veth pair
> where the 2 devices are in different net namespaces, as shown in the
> scripts I have attached, then the performance of sending a file,
> originating from the veth interface inside the non-init net namespace
> and going across the ipoib interface, is very slow (100kb). For simple
> reproduction I'm attaching 2 scripts which have to be run on 2 machines
> with the respective IP addresses set on them. The sending node would
> then initiate a simple file copy over nc. I've observed this behavior
> on upstream 4.4, 4.5.4 and 4.7.0 kernels, both with IPv4 and IPv6
> addresses. Here is what the debug log of the ipoib module shows:
> 
> ib%d: max_srq_sge=128
> ib%d: max_cm_mtu = 0xfff0, num_frags=16
> ib0: enabling connected mode will cause multicast packet drops
> ib0: mtu > 4092 will cause multicast packet drops.
> ib0: bringing up interface
> ib0: starting multicast thread
> ib0: joining MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff
> ib0: restarting multicast task
> ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0000:0000:0001
> ib0: restarting multicast task
> ib0: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0000:0001
> ib0: join completion for ff12:401b:ffff:0000:0000:0000:ffff:ffff (status 0)
> ib0: Created ah ffff88081063ea80
> ib0: MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff AV ffff88081063ea80, LID 0xc000, SL 0
> ib0: joining MGID ff12:601b:ffff:0000:0000:0000:0000:0001
> ib0: joining MGID ff12:401b:ffff:0000:0000:0000:0000:0001
> ib0: successfully started all multicast joins
> ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
> ib0: Created ah ffff880839084680
> ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff880839084680, LID 0xc002, SL 0
> ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
> ib0: Created ah ffff88081063e280
> ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081063e280, LID 0xc004, SL 0
> 
> When the transfer is initiated I can see the following errors
> on the sending node:
> 
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
> 
> Here is the port guid of the sending node: 0x0011750000772664 and
> on the receiving one: 0x0011750000774d36
> 
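
For comparison with the GID in the PathRec errors, here is a minimal Python sketch of how a link-local port GID is normally formed, assuming the default fe80::/64 subnet prefix followed by the 64-bit port GUID. The function names are made up for illustration and are not part of any ipoib tooling.

```python
# Assumption: default IB subnet prefix fe80::/64; helper names are hypothetical.
LINK_LOCAL_PREFIX = 0xFE80000000000000


def guid_to_link_local_gid(port_guid: int) -> str:
    """Build the 128-bit GID (prefix + GUID) and print it the way the
    ipoib debugfs path file does: 8 hex groups, leading zeros dropped."""
    gid = (LINK_LOCAL_PREFIX << 64) | port_guid
    groups = [(gid >> shift) & 0xFFFF for shift in range(112, -1, -16)]
    return ":".join(f"{g:x}" for g in groups)


def has_link_local_prefix(gid: str) -> bool:
    """True if the upper 64 bits of a colon-separated GID equal fe80::."""
    groups = [int(g, 16) for g in gid.split(":")]
    return len(groups) == 8 and groups[:4] == [0xFE80, 0, 0, 0]


receiver_gid = guid_to_link_local_gid(0x0011750000774D36)
print(receiver_gid)
print(has_link_local_prefix(receiver_gid))
print(has_link_local_prefix("0401:0000:1400:0000:a0a8:ffff:1c01:4d36"))
```

The receiving port GUID maps to fe80:0:0:0:11:7500:77:4d36, while the GID in the failing lookups does not carry the fe80:: prefix at all and shares only its last 16 bits (4d36) with that GUID.
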
> Here is what the paths look like on the sending node; the incomplete
> entries are clearly the paths being requested from the veth interface:
> 
> cat /sys/kernel/debug/ipoib/ib0_path
> GID: 401:0:1400:0:a0a8:ffff:1c01:4d36
> complete: no
> 
> GID: 401:0:1400:0:a410:ffff:1c01:4d36
> complete: no
> 
> GID: fe80:0:0:0:11:7500:77:2a1a
> complete: yes
> DLID: 0x0004
> SL: 0
> rate: 40.0 Gb/sec
> 
> GID: fe80:0:0:0:11:7500:77:4d36
> complete: yes
> DLID: 0x000a
> SL: 0
> rate: 40.0 Gb/sec
> 
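
To pick out the unresolved entries from a listing like the one above, something along these lines could work. It is only a sketch: it assumes the debugfs file keeps the "key: value" layout shown in the quoted output, with each record starting at a "GID:" line, and the function name is hypothetical.

```python
# Sample copied from the quoted ib0_path output above.
SAMPLE = """\
GID: 401:0:1400:0:a0a8:ffff:1c01:4d36
complete: no

GID: 401:0:1400:0:a410:ffff:1c01:4d36
complete: no

GID: fe80:0:0:0:11:7500:77:2a1a
complete: yes
DLID: 0x0004
SL: 0
rate: 40.0 Gb/sec

GID: fe80:0:0:0:11:7500:77:4d36
complete: yes
DLID: 0x000a
SL: 0
rate: 40.0 Gb/sec
"""


def parse_ipoib_paths(text: str) -> list[dict[str, str]]:
    """Split the listing into one dict per path record."""
    entries: list[dict[str, str]] = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, since GID values contain colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "GID":
            current = {"GID": value}
            entries.append(current)
        elif current is not None:
            current[key] = value
    return entries


paths = parse_ipoib_paths(SAMPLE)
incomplete = [p["GID"] for p in paths if p.get("complete") != "yes"]
for gid in incomplete:
    print("unresolved path:", gid)
```

On a live system the text would come from reading /sys/kernel/debug/ipoib/ib0_path instead of the embedded sample.
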
> When I test the same scenario but, instead of using veth devices,
> create the device in the non-init net namespace via the following
> commands, I can achieve sensible speeds:
> ip link add link ib0 name ip1 type ipoib
> ip link set dev ip1 netns test-netnamespace
> 
> [Snipped a lot of useless stuff]

The poor performance sounds like a duplicate of the issue reported by
Roland and in the upstream kernel bugzilla 111921.  That would be the
IPoIB routed packet performance issue.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD

