From: Eric Dumazet
Subject: Re: [RFC 0/2] Delayed binding of UDP sockets for Quic per-connection sockets
Date: Wed, 31 Oct 2018 17:53:22 -0700
Message-ID: <0ce864f0-38b9-59cc-18ea-e071afca347d@gmail.com>
In-Reply-To: <20181031232635.33750-1-cpaasch@apple.com>
To: Christoph Paasch, netdev@vger.kernel.org
Cc: Ian Swett, Leif Hedstrom, Jana Iyengar

On 10/31/2018 04:26 PM, Christoph Paasch wrote:
> Implementations of Quic might want to create a separate socket for each
> Quic-connection by creating a connected UDP-socket.

Nice proposal, but I doubt a QUIC server can afford having one UDP socket
per connection.

It would add a huge overhead in terms of memory usage in the kernel, and
lots of epoll events to manage (say a QUIC server with one million flows,
receiving very few packets per second per flow).

Maybe you could elaborate on the need for one UDP socket per connection.

> To achieve that on the server side, a "master socket" needs to wait for
> incoming new connections and then create a new socket that will be a
> connected UDP-socket. To create that latter one, the server needs to
> first bind() and then connect(). However, after the bind() the server
> might already receive traffic on that new socket that is unrelated to
> the Quic-connection at hand. Only after the connect() does a full
> 4-tuple match happen.
> So, one can't really create this kind of a server that has
> a connected UDP-socket per Quic connection.
>
> So, what is needed is an "atomic bind & connect" that basically
> prevents any incoming traffic until the connect() call has been issued,
> at which point the full 4-tuple is known.
>
> This patchset implements this functionality and exposes a socket-option
> to do this.
>
> Usage would be:
>
> int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
>
> int val = 1;
> setsockopt(fd, SOL_SOCKET, SO_DELAYED_BIND, &val, sizeof(val));
>
> bind(fd, (struct sockaddr *)&src, sizeof(src));
>
> /* At this point, incoming traffic will never match on this socket */
>
> connect(fd, (struct sockaddr *)&dst, sizeof(dst));
>
> /* Only now will incoming traffic reach the socket */
>
> There are infinitely many ways to implement this, which is why I am
> first sending it out as an RFC. With the approach here I chose the
> least invasive one, just preventing the match on the incoming path.
>
> The reason for choosing a SOL_SOCKET socket-option, rather than one at
> the SOL_UDP level, is that this functionality could actually be useful
> for other protocols as well. E.g., TCP wants to make better use of the
> full 4-tuple space by binding to the source-IP and the destination-IP
> at the same time.

Passive TCP flows cannot benefit from this idea.

Active TCP flows can already do that; I do not really understand what
you are suggesting.