public inbox for netdev@vger.kernel.org
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Jonathan Davies <Jonathan.Davies@eu.citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	netdev@vger.kernel.org, xen-devel@lists.xenproject.org,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: TSQ accounting skb->truesize degrades throughput for large packets
Date: Tue, 10 Sep 2013 05:35:36 -0700
Message-ID: <1378816536.26319.71.camel@edumazet-glaptop>
In-Reply-To: <522ECE2B.7020409@redhat.com>

On Tue, 2013-09-10 at 15:45 +0800, Jason Wang wrote:

> For example, virtio-net will stop the tx queue when it finds the tx
> queue may be full and enable the queue when some packets were sent. In
> this case, tsq works and throttles the total bytes queued in qdisc. This
> usually happens during heavy network load such as two sessions of netperf.

You told me the skbs were _orphaned_.

This automatically _disables_ TSQ once packets leave the Qdisc.

So you have a problem because your skb orphaning only takes effect when
packets leave the Qdisc.

If you can't afford sockets being throttled, make sure you have no
Qdisc!

> We notice a regression, and bisect shows it was introduced by TSQ.

You do realize TSQ is a balance between throughput and latency?

In the case of TSQ, it was very clear that limiting the amount of
outstanding bytes in queues could have an impact on bandwidth.

Pushing megabytes of TCP packets with identical TCP timestamps is
bad, because it prevents us from doing delay-based congestion control,
and a single flow could fill the Qdisc with a thousand packets.
(Self-induced delays, see the BufferBloat discussions.)
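
To make the throttling concrete, here is a minimal user-space sketch of
the idea, not the kernel code. All names (toy_sock, toy_xmit,
toy_release, toy_fill) are hypothetical, and the limit of 131072 bytes
used below is only an example value standing in for
tcp_limit_output_bytes. It assumes each packet is charged by its
truesize while it sits below TCP, and that the charge is dropped when
the packet is freed or orphaned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are illustrative, not kernel API. */
struct toy_sock {
    size_t wmem_alloc; /* bytes charged to the socket, queued below TCP */
    size_t limit;      /* stand-in for tcp_limit_output_bytes */
    bool throttled;    /* stand-in for the per-socket "throttled" flag */
};

/* Try to hand one packet of `truesize` bytes to the Qdisc/device.
 * Returns true if it was queued, false if the flow is throttled. */
static bool toy_xmit(struct toy_sock *sk, size_t truesize)
{
    if (sk->wmem_alloc >= sk->limit) {
        sk->throttled = true;
        return false;
    }
    sk->wmem_alloc += truesize;
    return true;
}

/* The packet was freed at TX completion, or orphaned: its charge goes
 * away and the flow may send again. */
static void toy_release(struct toy_sock *sk, size_t truesize)
{
    assert(sk->wmem_alloc >= truesize);
    sk->wmem_alloc -= truesize;
    sk->throttled = false;
}

/* Count how many packets of a given truesize fit before throttling. */
static unsigned toy_fill(struct toy_sock *sk, size_t truesize)
{
    unsigned n = 0;

    while (toy_xmit(sk, truesize))
        n++;
    return n;
}
```

In this model, a limit of 131072 admits 32 packets of 4096 bytes but
only 2 packets of 65536 bytes, so a flow sending large packets is
throttled after very few of them. If the charge is released as soon as
a packet leaves the Qdisc (the orphaning case), the limit no longer
bounds anything beyond that point.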

One known problem in the TCP stack is that sendmsg() locks the socket
for the duration of the call. sendpage() does not have this problem.

tcp_tsq_handler() is deferred if tcp_tasklet_func() finds a locked
socket. The owner of socket will call tcp_tsq_handler() when socket is
released.
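
The deferral described above can be sketched as a small state machine.
This is a user-space illustration under my own naming (model_sock,
model_tasklet, model_lock, model_release are hypothetical), not the
kernel implementation: the tasklet runs the handler immediately when
the socket is free, and otherwise leaves a flag that the lock owner
acts on at release time.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are illustrative, not kernel API. */
struct model_sock {
    bool owned_by_user; /* true while e.g. sendmsg() holds the socket */
    bool tsq_deferred;  /* handler postponed until the socket is released */
    int handler_runs;   /* number of times the handler actually ran */
};

/* Stand-in for tcp_tsq_handler(): here it only counts invocations. */
static void model_tsq_handler(struct model_sock *sk)
{
    sk->handler_runs++;
}

/* Stand-in for the tasklet: run the handler now if the socket is free,
 * otherwise remember that it still has to run. */
static void model_tasklet(struct model_sock *sk)
{
    if (!sk->owned_by_user)
        model_tsq_handler(sk);
    else
        sk->tsq_deferred = true;
}

static void model_lock(struct model_sock *sk)
{
    sk->owned_by_user = true;
}

/* The owner releases the socket and runs any deferred work. */
static void model_release(struct model_sock *sk)
{
    assert(sk->owned_by_user);
    sk->owned_by_user = false;
    if (sk->tsq_deferred) {
        sk->tsq_deferred = false;
        model_tsq_handler(sk);
    }
}
```

The point of the sketch is the delay: between model_lock() and
model_release() the handler cannot run, so the longer the socket stays
locked, the longer the flow waits before it is allowed to send again.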

So if you use sendmsg() with large buffers, or if copying data in from
userland involves page faults, that may explain why you need a larger
number of in-flight bytes to sustain a given throughput.

You could take a look at commit c9bee3b7fdecb0c1d070c
("tcp: TCP_NOTSENT_LOWAT socket option"), and play
with /proc/sys/net/ipv4/tcp_notsent_lowat, to force sendmsg() to release
the socket lock every few hundred kbytes.
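
Besides the sysctl, the same commit adds a per-socket option. A minimal
sketch of setting it from an application follows; the helper name
set_notsent_lowat is hypothetical, the fallback value 25 is the
constant from the Linux uapi tcp.h header for toolchains whose headers
predate it, and the call fails on kernels without this commit.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_NOTSENT_LOWAT
#define TCP_NOTSENT_LOWAT 25 /* from Linux include/uapi/linux/tcp.h */
#endif

/* Hypothetical helper: ask the kernel to limit the amount of unsent
 * data queued on this TCP socket to `bytes`.
 * Returns 0 on success, -1 with errno set on failure. */
static int set_notsent_lowat(int fd, unsigned int bytes)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                      &bytes, sizeof(bytes));
}
```

For example, set_notsent_lowat(fd, 128 * 1024) on a connected socket
would correspond to the "few hundred kbytes" range mentioned above; the
right value depends on the workload and is something to measure, not a
recommendation.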

Thread overview: 20+ messages
2013-09-06 10:16 TSQ accounting skb->truesize degrades throughput for large packets Wei Liu
2013-09-06 12:57 ` Eric Dumazet
2013-09-06 13:12   ` Wei Liu
2013-09-06 16:36   ` Zoltan Kiss
2013-09-06 16:56     ` Eric Dumazet
2013-09-09  9:27       ` Jason Wang
2013-09-09 13:47         ` Eric Dumazet
2013-09-10  7:45           ` Jason Wang
2013-09-10 12:35             ` Eric Dumazet [this message]
2013-09-06 17:00     ` Eric Dumazet
2013-09-07 17:21       ` Eric Dumazet
2013-09-09 21:41         ` Zoltan Kiss
2013-09-09 21:56           ` Eric Dumazet
     [not found]             ` <loom.20130921T045654-573@post.gmane.org>
     [not found]               ` <20130921150327.GA9078@zion.uk.xensource.com>
2013-09-22  2:36                 ` [Xen-devel] " Cong Wang
2013-09-22 14:58                   ` Eric Dumazet
2013-09-27 10:28                     ` [PATCH] tcp: TSQ can use a dynamic limit Eric Dumazet
2013-09-27 15:08                       ` Neal Cardwell
2013-09-29 15:41                       ` Cong Wang
2013-10-01  3:52                       ` David Miller
2013-09-09  5:28       ` TSQ accounting skb->truesize degrades throughput for large packets Cong Wang
