netdev.vger.kernel.org archive mirror
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Leif Hedstrom <lhedstrom@apple.com>
Cc: Christoph Paasch <cpaasch@apple.com>,
	netdev@vger.kernel.org, Ian Swett <ianswett@google.com>,
	Jana Iyengar <jri.ietf@gmail.com>
Subject: Re: [RFC 0/2] Delayed binding of UDP sockets for Quic per-connection sockets
Date: Thu, 1 Nov 2018 11:21:56 -0700
Message-ID: <6ada7f5f-a790-b5e8-5d42-94f1acc94aa3@gmail.com>
In-Reply-To: <60615C27-057A-4215-9D1E-3A164B72757B@apple.com>



On 11/01/2018 10:58 AM, Leif Hedstrom wrote:
> 
> 
>> On Oct 31, 2018, at 6:53 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>
>>
>> On 10/31/2018 04:26 PM, Christoph Paasch wrote:
>>> Implementations of Quic might want to create a separate socket for each
>>> Quic-connection by creating a connected UDP-socket.
>>>
>>
>> Nice proposal, but I doubt a QUIC server can afford having one UDP socket per connection?
> 
> First thing: This is an idea we’ve been floating, and it’s not completed yet, so we don’t have any performance numbers etc. to share. The ideas for the implementation came up after a discussion with Ian and Jana re: their implementation of a QUIC server.
> 
> That much said, the general rationale for this is that having a socket for each QUIC connection could simplify integrating QUIC into existing software that already does epoll() over TCP sockets. This is how e.g. Apache Traffic Server works, which is our target implementation for QUIC.
> 
> 
> 
>>
>> It would add a huge overhead in terms of memory usage in the kernel,
>> and lots of epoll events to manage (say a QUIC server with one million flows, receiving
>> very few packets per second per flow)
> 
> Our use case is not millions of sockets but rather tens of thousands. There would be one socket for each QUIC connection, not one per stream (obviously). At ~80Gbps on a box, we definitely see far fewer than 100k TCP connections.
> 
> Question: is there additional memory overhead here for the UDP sockets vs a normal TCP socket for e.g. HTTP or HTTP/2 ?

TCP sockets have a lot of state. We can understand spending 2 or 3 KB per socket.

UDP sockets really have no state. The receive queue anchor is only 24 bytes.
Still, the memory costs for one UDP socket are:

1344 bytes for UDP socket,
320 bytes for the "struct file"
192 bytes for the struct dentry
704 bytes for inode
512 bytes for the two dst (connected socket)
200 bytes for eventpoll structures
104 bytes for the fq flow

That is about 3.3KB per socket (3376 bytes), but you can probably round this to 4KB due to kmalloc rounding.

One million sockets -> 4GB of memory.

This really does not scale.
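
For anyone who wants to redo the arithmetic for a different socket count, here is a minimal sketch in Python. The per-object sizes are the ones listed above (they sum to 3376 bytes); the 4096-byte per-socket figure is the rounded estimate, not a measured value, and the names COSTS, per_socket_bytes and total_bytes are purely illustrative.

```python
# Per-object sizes in bytes, copied from the list in this message.
COSTS = {
    "UDP socket": 1344,
    "struct file": 320,
    "struct dentry": 192,
    "inode": 704,
    "two dst entries (connected socket)": 512,
    "eventpoll structures": 200,
    "fq flow": 104,
}

# Assumption: round each socket up to 4KB to account for kmalloc rounding,
# as suggested in the message. This is an estimate, not a measurement.
ROUNDED_PER_SOCKET = 4096


def per_socket_bytes(costs=COSTS):
    """Sum of the listed per-socket allocations, before any rounding."""
    return sum(costs.values())


def total_bytes(n_sockets, per_socket=ROUNDED_PER_SOCKET):
    """Estimated kernel memory for n_sockets connected UDP sockets."""
    return n_sockets * per_socket


if __name__ == "__main__":
    raw = per_socket_bytes()
    print(f"sum of listed costs: {raw} bytes")
    for n in (10_000, 100_000, 1_000_000):
        gib = total_bytes(n) / 2**30
        print(f"{n:>9} sockets -> {gib:.2f} GiB at 4096 bytes each")
```

The loop covers both the tens-of-thousands case mentioned in the quoted text and the one-million-flow case, so the two scenarios can be compared directly.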
