netdev.vger.kernel.org archive mirror
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Sabrina Dubroca <sd@queasysnail.net>,
	brouer@redhat.com
Subject: Re: [PATCH net-next 0/5] net: add protocol level recvmmsg support
Date: Fri, 25 Nov 2016 18:37:11 +0100
Message-ID: <20161125183711.675fa4a7@redhat.com>
In-Reply-To: <cover.1480086321.git.pabeni@redhat.com>

On Fri, 25 Nov 2016 16:39:51 +0100
Paolo Abeni <pabeni@redhat.com> wrote:

> The goal of recvmmsg() is to amortize the syscall overhead on a possible
> long messages batch, but for most networking protocols, e.g. udp the
> syscall overhead is negligible compared to the protocol specific operations
> like dequeuing.

Sounds good. I'm excited to see work in this area! :-)
 
[...]
> The udp version of recvmmsg() tries to bulk-dequeue skbs from the receive queue,
> each burst acquires the lock once to extract as many skbs from the receive
> queue as possible, up to the number needed to reach the specified maximum.
> rmem_alloc and fwd memory are touched once per burst.

Sounds good.
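
For readers following along, here is a minimal userspace sketch of the call
being discussed. It is not taken from the patch series: recv_batch(), BATCH,
PKT_LEN and the one second timeout are illustrative names and values of my
own. It only shows how one recvmmsg() call hands back a burst of datagrams.

```c
#define _GNU_SOURCE		/* glibc needs this for recvmmsg()/struct mmsghdr */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>

#define BATCH   32		/* illustrative burst size (assumption) */
#define PKT_LEN 2048		/* illustrative per-datagram buffer (assumption) */

/*
 * Receive up to BATCH datagrams from fd with a single syscall.
 * Returns the number of datagrams received, or -1 with errno set.
 * On success lens[i] holds the length of datagram i.
 */
int recv_batch(int fd, char bufs[][PKT_LEN], unsigned int *lens)
{
	struct mmsghdr msgs[BATCH];
	struct iovec iov[BATCH];
	struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
	int i, n;

	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < BATCH; i++) {
		iov[i].iov_base = bufs[i];
		iov[i].iov_len = PKT_LEN;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/* MSG_WAITFORONE: block for the first datagram, then take what is queued */
	n = recvmmsg(fd, msgs, BATCH, MSG_WAITFORONE, &timeout);
	for (i = 0; i < n; i++)
		lens[i] = msgs[i].msg_len;
	return n;
}
```

A UDP sink would call recv_batch() in a loop with a caller-provided
char bufs[BATCH][PKT_LEN] and count the returned datagrams.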
 
> When the protocol-level recvmmsg() is not available or it does not support the
> specified flags, the code falls-back to the current generic implementation.
> 
> This series introduces some behavior changes for the recvmmsg() syscall (only
> for udp):
> - the timeout argument now works as expected
> - recvmmsg() does not stop anymore when getting the first error, instead
>   it keeps processing the current burst and then handles the error code as
>   in the generic implementation.
> 
> The measured performance delta is as follows:
> 
> 		before		after
> 		(Kpps)		(Kpps)
> 
> udp flood[1]	570		1800(+215%)
> max tput[2]	1850		3500(+89%)
> single queue[3]	1850		1630(-11%)
> 
> [1] line rate flood using multiple 64 bytes packets and multiple flows

Is [1] sending multiple flows into a single UDP-sink?


> [2] like [1], but using the minimum number of flows to saturate the user space
>  sink, that is 1 flow for the old kernel and 3 for the patched one.
>  the tput increases since the contention on the rx lock is low.
> [3] like [1] but using a single flow with both old and new kernel. All the
>  packets land on the same rx queue and there is a single ksoftirqd instance
>  running

It is important to know whether ksoftirqd and the UDP-sink run on the same CPU.


> The regression in the single queue scenario is actually due to the improved
> performance of the recvmmsg() syscall: the user space process is now
> significantly faster than the ksoftirqd process so that the latter needs often
> to wake up the user space process.

When measuring these things, make sure to measure both the packets
actually received by the userspace UDP-sink and the packets RX-processed
by ksoftirqd (I often also look at what the HW delivered).
Sometimes, when userspace is too slow, the kernel can/will drop packets.

This is quite easy to verify from the command line:

 nstat > /dev/null && sleep 1 && nstat

For HW measurements I use the tool ethtool_stats.pl:
 https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
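
If you would rather collect the same UDP counters from inside a test
program, a rough sketch follows. It assumes the usual /proc/net/snmp layout
(a "Udp:" header line followed by a "Udp:" value line in the same column
order); udp_field() and read_udp_counter() are hypothetical helper names,
not part of nstat or of any existing tool.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up one field by name, given the "Udp:" header line and the
 * matching "Udp:" value line. Returns 0 and stores the value in *out,
 * or -1 if the field is not present.
 */
int udp_field(const char *hdr, const char *val, const char *name,
	      unsigned long long *out)
{
	char hname[64], vstr[64];
	int hn, vn;

	/* walk both lines token by token; the first token is the "Udp:" prefix */
	while (sscanf(hdr, "%63s%n", hname, &hn) == 1 &&
	       sscanf(val, "%63s%n", vstr, &vn) == 1) {
		if (strcmp(hname, name) == 0) {
			*out = strtoull(vstr, NULL, 10);
			return 0;
		}
		hdr += hn;
		val += vn;
	}
	return -1;
}

/* Read one UDP counter, e.g. "InDatagrams" or "RcvbufErrors", from procfs. */
int read_udp_counter(const char *name, unsigned long long *out)
{
	char hdr[512], val[512];
	FILE *f = fopen("/proc/net/snmp", "r");
	int ret = -1;

	if (!f)
		return -1;
	while (fgets(hdr, sizeof(hdr), f)) {
		if (strncmp(hdr, "Udp:", 4) != 0)
			continue;
		if (fgets(val, sizeof(val), f))
			ret = udp_field(hdr, val, name, out);
		break;
	}
	fclose(f);
	return ret;
}
```

Sampling "InDatagrams" and "RcvbufErrors" twice, one second apart, and
subtracting gives roughly the per-second view that the nstat one-liner
prints as UdpInDatagrams and UdpRcvbufErrors.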


> Since ksoftirqd is the bottleneck in such a scenario, overall this causes a
> tput reduction. In a real use case, where the udp sink is performing some
> actual processing of the received data, such regression is unlikely to really
> have an effect.

My experience is that the performance of RX UDP is affected by:
 * whether the socket is connected or not (yes, on the RX side too)
 * the state of /proc/sys/net/ipv4/ip_early_demux

You don't need to run all the combinations, but it would be nice
if you specified what config you have based your measurements on (and
kept it stable across your runs).

I've actually implemented the "--connect" option to my udp_sink
program[1] today, but I've not pushed it yet, if you are interested.
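
To make the two knobs above concrete, here is a small sketch of what a
"connected" RX socket means and how a test program could record the
ip_early_demux setting next to its results. This is NOT the udp_sink
"--connect" code; open_udp_sink() and read_ip_early_demux() are
hypothetical helpers written only for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a UDP receive socket bound to local_port on any local address.
 * If peer is non-NULL the socket is also connect()ed to that sender, so
 * only datagrams from peer are delivered to it.
 * Returns the fd, or -1 on error.
 */
int open_udp_sink(unsigned short local_port, const struct sockaddr_in *peer)
{
	struct sockaddr_in local;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	memset(&local, 0, sizeof(local));
	local.sin_family = AF_INET;
	local.sin_addr.s_addr = htonl(INADDR_ANY);
	local.sin_port = htons(local_port);
	if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0 ||
	    (peer && connect(fd, (const struct sockaddr *)peer,
			     sizeof(*peer)) < 0)) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Return the current ip_early_demux setting, or -1 if it cannot be read. */
int read_ip_early_demux(void)
{
	FILE *f = fopen("/proc/sys/net/ipv4/ip_early_demux", "r");
	int val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}
```

Printing the read_ip_early_demux() result and whether peer was set at the
start of each run is one cheap way to keep the config visible in the
benchmark output.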

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

[1] 
