netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Craig Gallek <kraigatgoog@gmail.com>
To: Linux Kernel Network Developers <netdev@vger.kernel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Subject: Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode
Date: Fri, 25 Mar 2016 12:31:48 -0400	[thread overview]
Message-ID: <CAEfhGiy-V0da6LQ_SY+E4u7S49n5dEpVaKiOOo103ijQtPujAw@mail.gmail.com> (raw)
In-Reply-To: <20160325162114.GA72479@ast-mbp.thefacebook.com>

On Fri, Mar 25, 2016 at 12:21 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Fri, Mar 25, 2016 at 11:29:10AM -0400, Craig Gallek wrote:
>> On Thu, Mar 24, 2016 at 2:00 PM, Willy Tarreau <w@1wt.eu> wrote:
>> > The pattern is :
>> >
>> >   t0 : unprivileged processes 1 and 2 are listening to the same port
>> >        (sock1@pid1) (sock2@pid2)
>> >        <------ listening ------>
>> >
>> >   t1 : new processes are started to replace the old ones
>> >        (sock1@pid1) (sock2@pid2) (sock3@pid3) (sock4@pid4)
>> >        <------ listening ------> <------ listening ------>
>> >
>> >   t2 : new processes signal the old ones they must stop
>> >        (sock1@pid1) (sock2@pid2) (sock3@pid3) (sock4@pid4)
>> >        <------- draining ------> <------ listening ------>
>> >
>> >   t3 : pids 1 and 2 have finished, they go away
>> >                                  (sock3@pid3) (sock4@pid4)
>> >         <------ gone ----->      <------ listening ------>
> ...
>> t3: Close the first two sockets and only use the last two.  This is
>> the tricky step.  Before this point, the sockets are numbered 0
>> through 3 from the perspective of the BPF program (in the order
>> listen() was called).  As soon as socket 0 is closed, the last socket
>> in the list replaces it (what was 3 becomes 0).  When socket 1 is
>> closed, socket 2 moves into that position.  The assumptions about the
>> socket indexes in the BPF program need to change as the indexes change
>> as a result of closing them.
>
> yeah, the way reuseport_detach_sock() was done makes it hard to manage
> such transitions from bpf program, but I don't see yet what stops
> pid1 an pid2 at stage t2 to just close their sockets.
> If these 'draining' pids don't want to receive packets, they should
> close their sockets. Complicating bpf side to redistribute spraying
> to sock3 and sock4 only (while sock1 and sock2 are still open) is possible,
> but looks unnecessary complex to me.
> Just close sock1 and sock2 at t2 time and then exit pid1, pid2 later.
> If they are tcp sockets with rpc protocol on top and have a problem of
> partial messages, then kcm can solve that and it will simplify
> the user space side as well.
I believe the issue here is that closing the listen sockets will drop
any connections that are in the listen queue but have not been
accepted yet.  In the case of reuseport, you could in theory drain
those queues into the non-closed sockets, but that probably has some
interesting consequences...

  reply	other threads:[~2016-03-25 16:31 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-25 15:29 [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode Craig Gallek
2016-03-25 16:21 ` Alexei Starovoitov
2016-03-25 16:31   ` Craig Gallek [this message]
2016-03-25 17:00     ` Eric Dumazet
2016-03-25 18:31       ` Willem de Bruijn
  -- strict thread matches above, loose matches on Subject: below --
2015-09-27  0:30 Tolga Ceylan
2015-09-27  1:04 ` Eric Dumazet
2015-09-27  1:37   ` Tolga Ceylan
2015-09-27  1:44 ` Aaron Conole
2015-09-27  2:02   ` Tolga Ceylan
2015-09-27  2:24     ` Eric Dumazet
2015-11-11  5:41       ` Tom Herbert
2015-11-11  6:19         ` Eric Dumazet
2015-11-11 17:05           ` Tom Herbert
2015-11-11 17:23             ` Eric Dumazet
2015-11-11 18:23               ` Tom Herbert
2015-11-11 18:43                 ` Eric Dumazet
2015-11-12  1:09                   ` Eric Dumazet
2015-12-15 16:14                     ` Willy Tarreau
2015-12-15 17:10                       ` Eric Dumazet
2015-12-15 17:43                         ` Willy Tarreau
2015-12-15 18:21                           ` Eric Dumazet
2015-12-15 19:44                             ` Willy Tarreau
2015-12-15 21:21                               ` Eric Dumazet
2015-12-16  7:38                                 ` Willy Tarreau
2015-12-16 16:15                                   ` Willy Tarreau
2015-12-18 16:33                                     ` Josh Snyder
2015-12-18 18:58                                       ` Willy Tarreau
2015-12-19  2:38                                         ` Eric Dumazet
2015-12-19  7:00                                           ` Willy Tarreau
2015-12-21 20:38                                             ` Tom Herbert
2015-12-21 20:41                                               ` Willy Tarreau
2016-03-24  5:10                                                 ` Tolga Ceylan
2016-03-24  6:12                                                   ` Willy Tarreau
2016-03-24 14:13                                                     ` Eric Dumazet
2016-03-24 14:22                                                       ` Willy Tarreau
2016-03-24 14:45                                                         ` Eric Dumazet
2016-03-24 15:30                                                           ` Willy Tarreau
2016-03-24 16:33                                                             ` Eric Dumazet
2016-03-24 16:50                                                               ` Willy Tarreau
2016-03-24 17:01                                                                 ` Eric Dumazet
2016-03-24 17:26                                                                   ` Tom Herbert
2016-03-24 17:55                                                                     ` Daniel Borkmann
2016-03-24 18:20                                                                       ` Tolga Ceylan
2016-03-24 18:24                                                                         ` Willy Tarreau
2016-03-24 18:37                                                                         ` Eric Dumazet
2016-03-24 22:40                                                                       ` Yann Ylavic
2016-03-24 22:49                                                                         ` Eric Dumazet
2016-03-24 23:40                                                                           ` Yann Ylavic
2016-03-24 23:54                                                                             ` Tom Herbert
2016-03-25  0:01                                                                               ` Yann Ylavic
2016-03-25  5:28                                                                               ` Willy Tarreau
2016-03-25  6:49                                                                                 ` Eric Dumazet
2016-03-25  8:53                                                                                   ` Willy Tarreau
2016-03-25 11:21                                                                                     ` Yann Ylavic
2016-03-25 13:17                                                                                       ` Eric Dumazet
2016-03-25  0:25                                                                           ` David Miller
2016-03-25  0:24                                                                         ` David Miller
2016-03-24 18:00                                                                   ` Willy Tarreau
2016-03-24 18:21                                                                     ` Willy Tarreau
2016-03-24 18:32                                                                     ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEfhGiy-V0da6LQ_SY+E4u7S49n5dEpVaKiOOo103ijQtPujAw@mail.gmail.com \
    --to=kraigatgoog@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).