All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Sitnicki <jakub@cloudflare.com>
To: David Laight <David.Laight@ACULAB.COM>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	'Paolo Abeni' <pabeni@redhat.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	'Marek Majkowski' <marek@cloudflare.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	network dev <netdev@vger.kernel.org>,
	kernel-team <kernel-team@cloudflare.com>
Subject: Re: epoll_wait() performance
Date: Sat, 30 Nov 2019 14:29:41 +0100	[thread overview]
Message-ID: <878snxo5kq.fsf@cloudflare.com> (raw)
In-Reply-To: <c46e43d1-ba7d-39d9-688f-0141931df1b0@gmail.com>

On Sat, Nov 30, 2019 at 02:07 AM CET, Eric Dumazet wrote:
> On 11/28/19 2:17 AM, David Laight wrote:
>> From: Eric Dumazet
>>> Sent: 27 November 2019 17:47
>> ...
>>> A QUIC server handles hundred of thousands of ' UDP flows' all using only one UDP socket
>>> per cpu.
>>>
>>> This is really the only way to scale, and does not need kernel changes to efficiently
>>> organize millions of UDP sockets (huge memory footprint even if we get right how
>>> we manage them)
>>>
>>> Given that UDP has no state, there is really no point trying to have one UDP
>>> socket per flow, and having to deal with epoll()/poll() overhead.
>>
>> How can you do that when all the UDP flows have different destination port numbers?
>> These are message flows not idempotent requests.
>> I don't really want to collect the packets before they've been processed by IP.
>>
>> I could write a driver that uses kernel udp sockets to generate a single message queue
>> than can be efficiently processed from userspace - but it is a faff compiling it for
>> the systems kernel version.
>
> Well if destinations ports are not under your control,
> you also could use AF_PACKET sockets, no need for 'UDP sockets' to receive UDP traffic,
> especially it the rate is small.

Alternatively, you could steer UDP flows coming to a certain port range
to one UDP socket using TPROXY [0, 1].

TPROXY has the same downside as AF_PACKET, meaning that it requires at
least CAP_NET_RAW to create/set up the socket.

OTOH, with TPROXY you can gracefully co-reside with other services,
filering on just the destination addresses you want in iptables/nftables.

Fan-out / load-balancing with reuseport to have one socket per CPU is
not possible, though. You would need to do that with Netfilter.

-Jakub

[0] https://www.kernel.org/doc/Documentation/networking/tproxy.txt
[1] https://blog.cloudflare.com/how-we-built-spectrum/

  reply	other threads:[~2019-11-30 13:30 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-22 11:17 epoll_wait() performance David Laight
2019-11-27  9:50 ` Marek Majkowski
2019-11-27 10:39   ` David Laight
2019-11-27 15:48     ` Jesper Dangaard Brouer
2019-11-27 16:04       ` David Laight
2019-11-27 19:48         ` Willem de Bruijn
2019-11-28 16:25           ` David Laight
2019-11-28 11:12         ` Jesper Dangaard Brouer
2019-11-28 16:37           ` David Laight
2019-11-28 16:52             ` Willy Tarreau
2019-12-19  7:57             ` Jesper Dangaard Brouer
2019-11-27 16:26       ` Paolo Abeni
2019-11-27 17:30         ` David Laight
2019-11-27 17:46           ` Eric Dumazet
2019-11-28 10:17             ` David Laight
2019-11-30  1:07               ` Eric Dumazet
2019-11-30 13:29                 ` Jakub Sitnicki [this message]
2019-12-02 12:24                   ` David Laight
2019-12-02 16:47                     ` Willem de Bruijn
2019-11-27 17:50           ` Paolo Abeni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878snxo5kq.fsf@cloudflare.com \
    --to=jakub@cloudflare.com \
    --cc=David.Laight@ACULAB.COM \
    --cc=brouer@redhat.com \
    --cc=eric.dumazet@gmail.com \
    --cc=kernel-team@cloudflare.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marek@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.