netdev.vger.kernel.org archive mirror
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: "Yi Yang (杨燚)-云服务集团" <yangyi01@inspur.com>
Cc: "yang_y_yi@163.com" <yang_y_yi@163.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"u9012063@gmail.com" <u9012063@gmail.com>
Subject: Re: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [PATCH net-next] net/ packet: fix TPACKET_V3 performance issue in case of TSO
Date: Mon, 30 Mar 2020 10:16:26 -0400
Message-ID: <CA+FuTSf5sUxoNTSurptYAq9UGVoDAxPRLHrKHmT0r-QBm=wRmw@mail.gmail.com>
In-Reply-To: <934640b05d7f46848ba2636fcc0b1e34@inspur.com>

On Mon, Mar 30, 2020 at 2:35 AM Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com> wrote:
>
> -----Original Message-----
> From: Willem de Bruijn [mailto:willemdebruijn.kernel@gmail.com]
> Sent: March 30, 2020 9:52
> To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
> Cc: willemdebruijn.kernel@gmail.com; yang_y_yi@163.com; netdev@vger.kernel.org; u9012063@gmail.com
> Subject: Re: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [PATCH net-next] net/ packet: fix TPACKET_V3 performance issue in case of TSO
>
> > iperf3 test result
> > -----------------------
> > [yangyi@localhost ovs-master]$ sudo ../run-iperf3.sh
> > iperf3: no process found
> > Connecting to host 10.15.1.3, port 5201
> > [  4] local 10.15.1.2 port 44976 connected to 10.15.1.3 port 5201
> > [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> > [  4]   0.00-10.00  sec  19.6 GBytes  16.8 Gbits/sec  106586    307 KBytes
> > [  4]  10.00-20.00  sec  19.5 GBytes  16.7 Gbits/sec  104625    215 KBytes
> > [  4]  20.00-30.00  sec  20.0 GBytes  17.2 Gbits/sec  106962    301 KBytes
>
> Thanks for the detailed info.
>
> So there is more going on there than a simple network tap. veth, which calls netif_rx and thus schedules delivery with a napi after a softirq (twice), tpacket for recv + send + ovs processing. And this is a single flow, so more sensitive to batching, drops and interrupt moderation than a workload of many flows.
>
> If anything, I would expect the ACKs on the return path to be the more likely cause for concern, as they are even less likely to fill a block before the timer. The return path is a separate packet socket?
>
> With initial small window size, I guess it might be possible for the entire window to be in transit. And as no follow-up data will arrive, this waits for the timeout. But at 3Gbps that is no longer the case.
> Again, the timeout is intrinsic to TPACKET_V3. If that is unacceptable, then TPACKET_V2 is a more logical choice. Here also in relation to timely ACK responses.
>
> Other users of TPACKET_V3 may be using fewer blocks of larger size. A change to retire blocks after 1 gso packet will negatively affect their workloads. At the very least this should be an optional feature, similar to how I suggested converting to microseconds.
>
> [Yi Yang] My iperf3 test uses a TCP socket; the return path is the same socket as the forward path. BTW, this patch retires the current block only if a vnet header is present in the packets, and I don't know of any use case other than ours that uses the vnet header. I also have more conditions to limit this, but they hurt performance. I'll try whether V2 can fix our issue; if not, this patch will be the only way to fix it.
>

Thanks. Also interesting might be a short packet trace of packet
arrival on the bond device ports, taken at the steady state of 3 Gbps,
to observe when inter-arrival time exceeds the 167 usec mean. Also
informative would be to learn whether, when retiring a block using your
patch, that block also holds one or more ACK packets along with the
GSO packet, as their delay might be the true source of throttling the sender.
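
To make the trace analysis concrete, here is a minimal sketch of how
arrival timestamps from such a capture could be scanned for gaps above
the 167 usec mean. The function name, the sample timestamps and the
assumption that timestamps are already extracted as microseconds are
all hypothetical, not taken from the patch or from a real trace.

```python
from typing import List, Tuple

# Mean inter-arrival time quoted in this email, in microseconds.
MEAN_GAP_USEC = 167.0


def long_gaps(arrivals_usec: List[float],
              threshold_usec: float = MEAN_GAP_USEC) -> List[Tuple[int, float]]:
    """Return (packet index, gap) for every packet that arrived more than
    threshold_usec after the previous packet."""
    gaps = []
    for i in range(1, len(arrivals_usec)):
        gap = arrivals_usec[i] - arrivals_usec[i - 1]
        if gap > threshold_usec:
            gaps.append((i, gap))
    return gaps


# Hypothetical timestamps (usec), for illustration only; not measured data.
sample = [0.0, 150.0, 300.0, 700.0, 860.0, 1100.0]
for index, gap in long_gaps(sample):
    print(f"packet {index}: gap {gap:.0f} usec")
```

Correlating the reported indices with the packet types in the trace
(GSO data versus ACK) would show whether the long gaps line up with
delayed ACKs.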

I think we need to understand the underlying problem better to
implement a robust fix that works for a variety of configurations, and
does not cause accidental regressions. The current patch works for
your setup, but I'm afraid that it might paper over the real issue.

It is a peculiar aspect of TPACKET_V3 that blocks are retired not when
a packet is written that fills them, but when the next packet arrives
and cannot find room. Again, at sustained rate that delay should be
immaterial. But it might be okay to measure remaining space after
write and decide to retire if below some watermark. I would prefer
that watermark to be a ratio of block size rather than whether the
packet is gso or not.
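
As a rough illustration of the watermark idea, the following is a
user-space model only, not kernel code. The Block class, write_packet()
and the watermark_ratio parameter are hypothetical names, and the model
ignores per-packet headers and alignment that a real TPACKET_V3 block
would account for.

```python
from dataclasses import dataclass


@dataclass
class Block:
    size: int      # total block size in bytes
    used: int = 0  # bytes already written into the block

    def remaining(self) -> int:
        return self.size - self.used


def write_packet(block: Block, pkt_len: int, watermark_ratio: float) -> bool:
    """Write one packet into the block and report whether the block should
    be retired right after the write.

    Raises ValueError if the packet does not fit, which corresponds to the
    existing behavior: the block is retired when the next packet arrives
    and cannot find room.
    """
    if pkt_len > block.remaining():
        raise ValueError("packet does not fit; retire the block first")
    block.used += pkt_len
    # Retire early when the free space drops below a ratio of the block size.
    return block.remaining() < watermark_ratio * block.size


# Hypothetical numbers: a 128 KiB block and a 25% watermark.
blk = Block(size=131072)
first = write_packet(blk, 65536, 0.25)   # 65536 bytes left, above watermark
second = write_packet(blk, 40000, 0.25)  # 25536 bytes left, below watermark
print(first, second)
```

With a ratio rather than a gso test, users with few large blocks keep
batching many packets per block, while small blocks that are nearly
full after one large packet are handed to user space without waiting
for the next arrival or the timer.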

Thread overview: 16+ messages
2020-03-25 14:08 [PATCH net-next] net/packet: fix TPACKET_V3 performance issue in case of TSO yang_y_yi
2020-03-25 14:37 ` Willem de Bruijn
2020-03-26  0:43   ` 答复: [vger.kernel.org代发]Re: " Yi Yang (杨燚)-云服务集团
2020-03-26  1:20     ` Willem de Bruijn
2020-03-27  2:33       ` 答复: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [PATCH net-next] net/ packet: " Yi Yang (杨燚)-云服务集团
2020-03-27  3:16         ` Willem de Bruijn
2020-03-28  8:36           ` 答复: " Yi Yang (杨燚)-云服务集团
2020-03-28 18:36             ` Willem de Bruijn
2020-03-29  2:42               ` 答复: " Yi Yang (杨燚)-云服务集团
2020-03-30  1:51                 ` Willem de Bruijn
2020-03-30  6:34                   ` 答复: " Yi Yang (杨燚)-云服务集团
2020-03-30 14:16                     ` Willem de Bruijn [this message]
2020-04-14  3:41                       ` 答复: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [PATCH net-next] net/ packet: fix TPACKET_V3 performance " Yi Yang (杨燚)-云服务集团
2020-04-14 14:03                         ` Willem de Bruijn
2020-04-15  3:33                           ` 答复: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [vger.kernel.org代发]Re: [PATCH net-next] net/ packet: " Yi Yang (杨燚)-云服务集团
     [not found]   ` <8c7c4b8.a0a4.17112280afb.Coremail.yang_y_yi@163.com>
2020-03-26  1:16     ` Re: [PATCH net-next] net/packet: fix TPACKET_V3 performance " Willem de Bruijn
