From: Rick Jones <rick.jones2@hp.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>,
Yuchung Cheng <ycheng@google.com>,
Neal Cardwell <ncardwell@google.com>,
Michael Kerrisk <mtk.manpages@gmail.com>
Subject: Re: [PATCH net-next] tcp: TCP_NOSENT_LOWAT socket option
Date: Mon, 22 Jul 2013 13:43:03 -0700
Message-ID: <51ED9957.9070107@hp.com>
In-Reply-To: <1374520422.4990.33.camel@edumazet-glaptop>
On 07/22/2013 12:13 PM, Eric Dumazet wrote:
>
> Tested:
>
> netperf sessions, and watching /proc/net/protocols "memory" column for TCP
>
> Even in the absence of shallow queues, we get a benefit.
>
> With 200 concurrent netperf -t TCP_STREAM sessions, amount of kernel memory
> used by TCP buffers shrinks by ~55 % (20567 pages instead of 45458)
>
> lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat
> lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols
> TCPv6 1880 2 45458 no 208 yes ipv6 y y y y y y y y y y y y y n y y y y y
> TCP 1696 508 45458 no 208 yes kernel y y y y y y y y y y y y y n y y y y y
>
> lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat
> lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols
> TCPv6 1880 2 20567 no 208 yes ipv6 y y y y y y y y y y y y y n y y y y y
> TCP 1696 508 20567 no 208 yes kernel y y y y y y y y y y y y y n y y y y y
>
> Using 128KB has no bad effect on the throughput of a single flow, although
> there is an increase in cpu time as sendmsg() calls trigger more
> context switches. A bonus is that we hold the socket lock for a shorter
> amount of time, which should improve latencies.
>
> lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat
> lpq83:~# perf stat -e context-switches ./netperf -H lpq84 -t omni -l 20 -Cc
> OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84 () port 0 AF_INET
> Local Remote Local Elapsed Throughput Throughput Local Local Remote Remote Local Remote Service
> Send Socket Recv Socket Send Time Units CPU CPU CPU CPU Service Service Demand
> Size Size Size (sec) Util Util Util Util Demand Demand Units
> Final Final % Method % Method
> 2097152 6000000 16384 20.00 16509.68 10^6bits/s 3.05 S 4.50 S 0.363 0.536 usec/KB
>
> Performance counter stats for './netperf -H lpq84 -t omni -l 20 -Cc':
>
> 30,141 context-switches
>
> 20.006308407 seconds time elapsed
>
> lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat
> lpq83:~# perf stat -e context-switches ./netperf -H lpq84 -t omni -l 20 -Cc
> OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84 () port 0 AF_INET
> Local Remote Local Elapsed Throughput Throughput Local Local Remote Remote Local Remote Service
> Send Socket Recv Socket Send Time Units CPU CPU CPU CPU Service Service Demand
> Size Size Size (sec) Util Util Util Util Demand Demand Units
> Final Final % Method % Method
> 1911888 6000000 16384 20.00 17412.51 10^6bits/s 3.94 S 4.39 S 0.444 0.496 usec/KB
>
> Performance counter stats for './netperf -H lpq84 -t omni -l 20 -Cc':
>
> 284,669 context-switches
>
> 20.005294656 seconds time elapsed
Netperf is perhaps a "best case" for this, as it has no think time and
will not itself build up a queue of data internally.
The increase in local service demand (0.363 to 0.444 usec/KB, roughly
22%) is troubling.
It would be good to hit that with the confidence intervals (e.g. -i 30,3
and perhaps -I 99,<something other than the default of 5>), or do many
separate runs to get an idea of the variation. Presumably remote
service demand is not of interest, so for the confidence-interval runs
you might drop the -C and keep only the -c, in which case netperf will
not be trying to hit the confidence interval on remote CPU utilization
along with local CPU utilization and throughput.
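For instance, something along these lines (an illustrative invocation
only, with the width given to -I picked arbitrarily):
  ./netperf -H lpq84 -t omni -l 20 -c -i 30,3 -I 99,2
where -i bounds the number of test iterations and -I sets the
confidence level and interval width.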
Why are there more context switches with the lowat set to 128KB? Is the
SO_SNDBUF growth in the first case the reason? Otherwise I would have
thought that netperf would have been context switching back and forth
at "socket full" just as often as at "128KB". You might then also
compare before and after with a fixed socket buffer size.
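For that comparison, something like the following might do (illustrative
only; the 1MB value is arbitrary, and explicitly setting the socket
buffers via the test-specific -s/-S options also disables autotuning,
which is the point):
  ./netperf -H lpq84 -t omni -l 20 -c -- -s 1048576 -S 1048576
run once with tcp_notsent_lowat at -1 and once at 131072.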
Anything interesting happen when the send size is larger than the lowat?
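For example (again just an illustration, with the 256KB send size
chosen arbitrarily), with tcp_notsent_lowat at 131072:
  ./netperf -H lpq84 -t omni -l 20 -c -- -m 262144
would exercise sends larger than the not-sent low-water mark.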
rick jones
Thread overview: 12+ messages
2013-07-22 19:13 [PATCH net-next] tcp: TCP_NOSENT_LOWAT socket option Eric Dumazet
2013-07-22 19:28 ` Eric Dumazet
2013-07-22 20:43 ` Rick Jones [this message]
2013-07-22 22:44 ` Eric Dumazet
2013-07-22 23:08 ` Rick Jones
2013-07-23 0:13 ` Eric Dumazet
2013-07-23 0:40 ` Eric Dumazet
2013-07-23 1:20 ` Hannes Frederic Sowa
2013-07-23 1:33 ` Eric Dumazet
2013-07-23 2:32 ` Eric Dumazet
2013-07-23 15:25 ` Rick Jones
2013-07-23 15:28 ` Eric Dumazet