From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Daniel Borkmann <dborkman@redhat.com>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
cwang@twopensource.com, Eric Dumazet <eric.dumazet@gmail.com>
Subject: Re: [RFC PATCH] pktgen: skb bursting via skb->xmit_more API
Date: Sat, 30 Aug 2014 15:37:42 +0200
Message-ID: <20140830153742.7f27d98c@redhat.com>
In-Reply-To: <20140827211300.26976.52104.stgit@dragon>
On Wed, 27 Aug 2014 23:13:00 +0200 Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> This patch just demonstrates the effect of delaying the HW tailptr.
> Let me demonstrate the performance effect of bulking packet with pktgen.
>
> These results is a **single** CPU pktgen TX via script:
> https://github.com/netoptimizer/network-testing/blob/master/pktgen/pktgen02_burst.sh
>
> Cmdline args:
> ./pktgen02_burst.sh -i eth5 -d 192.168.21.4 -m 00:12:c0:80:1d:54 -b $skb_burst
>
> Special case skb_burst=1 does not burst, but activates the
> skb_burst_count++ and writing to skb->xmit_more.
>
> Performance
> skb_burst=0 tx:5614370 pps
> skb_burst=1 tx:5571279 pps ( -1.38 ns (worse))
> skb_burst=2 tx:6942821 pps ( 35.46 ns)
> skb_burst=3 tx:7556214 pps ( 11.69 ns)
> skb_burst=4 tx:7740632 pps ( 3.15 ns)
> skb_burst=5 tx:7972489 pps ( 3.76 ns)
> skb_burst=6 tx:8129856 pps ( 2.43 ns)
> skb_burst=7 tx:8281671 pps ( 2.25 ns)
> skb_burst=8 tx:8383790 pps ( 1.47 ns)
> skb_burst=9 tx:8451248 pps ( 0.95 ns)
> skb_burst=10 tx:8503571 pps ( 0.73 ns)
> skb_burst=16 tx:8745878 pps ( 3.26 ns)
> skb_burst=24 tx:8871629 pps ( 1.62 ns)
> skb_burst=32 tx:8945166 pps ( 0.93 ns)
>
> skb_burst=(0 vs 32) improvement:
> (1/5614370*10^9)-(1/8945166*10^9) = 66.32 ns
> + 3330796 pps
A more interesting benchmark with pktgen is to see what happens if
pktgen has to free and allocate a new SKB every time in the transmit
loop, because this adds a relatively significant delay between packets.
The baseline before, with SKB_CLONE=100000 (and skb_burst=0), was
5614370 pps, corresponding to a 178 nanosec delay between packets
(1/5614370*10^9).
Pktgen performance drops to 2421076 pps with SKB_CLONE=0 (and
skb_burst=0), which causes a full free+alloc cycle (while also keeping
the do_gettimeofday() timestamp). This corresponds to (1/2421076*10^9)
413 nanosec between packets.
Interestingly, this also tells us that the stack overhead + pktgen
packet-init is (413-178=) 235 ns. (The do_gettimeofday() call
contributes 23 ns, leaving 212 ns.)
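The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the helper name ns_per_packet and the variable names are made up for illustration, while the pps figures and the 23 ns timestamp cost are the ones quoted in this mail.

```python
def ns_per_packet(pps):
    """Convert a packets-per-second rate into nanoseconds between packets."""
    return 1e9 / pps

# Measured rates from this mail (skb_burst=0 in both cases)
clone_ns = ns_per_packet(5614370)    # SKB_CLONE=100000
noclone_ns = ns_per_packet(2421076)  # SKB_CLONE=0

# Cost of the free+alloc+init path, using rounded figures as in the text
overhead_ns = round(noclone_ns) - round(clone_ns)
TIMESTAMP_NS = 23  # do_gettimeofday() cost quoted in the text

print(round(clone_ns), round(noclone_ns), overhead_ns, overhead_ns - TIMESTAMP_NS)
# prints: 178 413 235 212
```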
Results with SKB_CLONE=0 (ns is the per-packet saving relative to the previous row):
skb_burst=0 2421076 pps
skb_burst=1 2410301 pps ( -1.85 ns (worse))
skb_burst=2 2580824 pps ( 27.41 ns)
skb_burst=3 2678276 pps ( 14.10 ns)
skb_burst=4 2729021 pps ( 6.94 ns)
skb_burst=5 2742044 pps ( 1.74 ns)
skb_burst=6 2763974 pps ( 2.89 ns)
skb_burst=7 2772413 pps ( 1.10 ns)
skb_burst=8 2788705 pps ( 2.10 ns)
skb_burst=9 2791055 pps ( 0.30 ns)
skb_burst=10 2791726 pps ( 0.09 ns)
skb_burst=16 2819949 pps ( 3.58 ns)
skb_burst=24 2817786 pps ( -0.27 ns)
skb_burst=32 2813690 pps ( -0.51 ns)
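For anyone wanting to re-derive the ns column: it appears to be the per-packet time saved relative to the previous row, a reading that matches the first rows of the table to within rounding. A minimal Python sketch under that assumption (the function name step_savings is hypothetical, the pps values are copied from the table):

```python
# (skb_burst, pps) pairs copied from the SKB_CLONE=0 results table
results = [
    (0, 2421076), (1, 2410301), (2, 2580824), (3, 2678276),
    (4, 2729021), (5, 2742044), (6, 2763974), (7, 2772413),
    (8, 2788705), (9, 2791055), (10, 2791726), (16, 2819949),
    (24, 2817786), (32, 2813690),
]

def step_savings(rows):
    """Return (burst, ns saved per packet versus the previous row)."""
    out = []
    for (_, prev_pps), (burst, pps) in zip(rows, rows[1:]):
        out.append((burst, 1e9 / prev_pps - 1e9 / pps))
    return out

for burst, saved in step_savings(results):
    print(f"skb_burst={burst:<2} {saved:6.2f} ns")
```

A negative value means that step made each packet slower than the previous setting did.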
It is perhaps a little interesting that performance decreases slightly
after skb_burst=16, but this could simply be measurement noise
(those tests had a variation of min:-0.250 max:1.811 ns).
skb_burst=(0 vs 32) improvement:
(1/2421076*10^9)-(1/2813690*10^9) = 57.63 ns
2,813,690-2,421,076 = +392,614 pps
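The same end-to-end comparison can be scripted for both test series. This is just a sketch; the function name improvement is made up for illustration, and the inputs are the skb_burst=0 and skb_burst=32 rates from this mail and from the quoted one.

```python
def improvement(base_pps, new_pps):
    """Return (ns saved per packet, extra packets per second)."""
    saved_ns = 1e9 / base_pps - 1e9 / new_pps
    return saved_ns, new_pps - base_pps

# SKB_CLONE=0 series (this mail) and SKB_CLONE=100000 series (quoted mail)
for label, base, new in [
    ("SKB_CLONE=0", 2421076, 2813690),
    ("SKB_CLONE=100000", 5614370, 8945166),
]:
    saved_ns, extra_pps = improvement(base, new)
    print(f"{label}: {saved_ns:.2f} ns saved, +{extra_pps} pps")
```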
Bulking via the HW ring buffer tailptr "flush" still showed a significant
performance improvement, even with the spacing caused by pktgen's
free+alloc+init+timestamp. I tried to tcpdump packets on the sink
host, but I could not "see" the bulking (most likely a problem
with the sink's and tcpdump's time resolution).
Setup notes:
- pktgen TX single CPU test (E5-2695)
- ethtool -C eth5 rx-usecs 30
- tuned-adm profile latency-performance
- IRQ aligned to CPUs
- Ethernet Flow-Control disabled
- No Hyper-Threading
- netfilter_unload_modules.sh
Need something to relate these nanoseconds to?
Go read:
http://netoptimizer.blogspot.dk/2014/05/the-calculations-10gbits-wirespeed.html
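As a rough yardstick, here is the usual back-of-the-envelope per-packet time budget at 10 Gbit/s wirespeed. Note the assumptions: minimum-size 64-byte frames plus 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap). These are standard Ethernet constants and are not taken from the measurements in this mail.

```python
LINK_BPS = 10e9       # 10 Gbit/s link speed
FRAME_BYTES = 64      # minimum Ethernet frame (assumption)
OVERHEAD_BYTES = 20   # preamble + SFD (8) and inter-frame gap (12)

bits_on_wire = (FRAME_BYTES + OVERHEAD_BYTES) * 8
budget_ns = bits_on_wire / LINK_BPS * 1e9   # time available per packet
wirespeed_pps = LINK_BPS / bits_on_wire     # packets per second at line rate

print(f"{budget_ns:.1f} ns per packet, {wirespeed_pps:.0f} pps")
```

Against a budget of that size, savings in the tens of nanoseconds per packet are a large fraction of the time available.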
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer