From: Eric Dumazet <eric.dumazet@gmail.com>
To: Kuniyuki Iwashima <kuniyu@amazon.co.jp>,
"David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Eric Dumazet <edumazet@google.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>
Cc: Benjamin Herrenschmidt <benh@amazon.com>,
Kuniyuki Iwashima <kuni1840@gmail.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH bpf-next 0/8] Socket migration for SO_REUSEPORT.
Date: Wed, 18 Nov 2020 17:25:44 +0100 [thread overview]
Message-ID: <5feaafd3-72ca-72da-0fe8-cc4206bc29e6@gmail.com> (raw)
In-Reply-To: <20201117094023.3685-1-kuniyu@amazon.co.jp>
On 11/17/20 10:40 AM, Kuniyuki Iwashima wrote:
> The SO_REUSEPORT option allows sockets to listen on the same port and to
> accept connections evenly. However, there is a defect in the current
> implementation. When a SYN packet is received, the connection is tied to a
> listening socket. Accordingly, when the listener is closed, in-flight
> requests during the three-way handshake and child sockets in the accept
> queue are dropped even if other listeners could accept such connections.
>
> This situation can happen when various server management tools restart
> server (such as nginx) processes. For instance, when we change nginx
> configurations and restart it, it spins up new workers that respect the new
> configuration and closes all listeners on the old workers, resulting in
> in-flight ACK of 3WHS is responded by RST.
>
I know some programs are simply removing a listener from the group,
so that they no longer handle new SYN packets,
and wait until all timers or 3WHS have completed before closing them.
They pass fd of newly accepted children to more recent programs using af_unix fd passing,
while in this draining mode.
Quite frankly, mixing eBPF in the picture is distracting.
It seems you want some way to transfer request sockets (and/or not yet accepted established ones)
from fd1 to fd2, isn't it something that should be discussed independently ?
next prev parent reply other threads:[~2020-11-18 16:25 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-17 9:40 [RFC PATCH bpf-next 0/8] Socket migration for SO_REUSEPORT Kuniyuki Iwashima
2020-11-17 9:40 ` [RFC PATCH bpf-next 1/8] net: Introduce net.ipv4.tcp_migrate_req Kuniyuki Iwashima
2020-11-17 9:40 ` [RFC PATCH bpf-next 2/8] tcp: Keep TCP_CLOSE sockets in the reuseport group Kuniyuki Iwashima
2020-11-17 9:40 ` [RFC PATCH bpf-next 3/8] tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues Kuniyuki Iwashima
2020-11-18 23:50 ` Martin KaFai Lau
2020-11-19 22:09 ` Kuniyuki Iwashima
2020-11-20 1:53 ` Martin KaFai Lau
2020-11-21 10:13 ` Kuniyuki Iwashima
2020-11-23 0:40 ` Martin KaFai Lau
2020-11-24 9:24 ` Kuniyuki Iwashima
2020-11-17 9:40 ` [RFC PATCH bpf-next 4/8] tcp: Migrate TFO requests causing RST during TCP_SYN_RECV Kuniyuki Iwashima
2020-11-17 9:40 ` [RFC PATCH bpf-next 5/8] tcp: Migrate TCP_NEW_SYN_RECV requests Kuniyuki Iwashima
2020-11-17 9:40 ` [RFC PATCH bpf-next 6/8] bpf: Add cookie in sk_reuseport_md Kuniyuki Iwashima
2020-11-19 0:11 ` Martin KaFai Lau
2020-11-19 22:10 ` Kuniyuki Iwashima
2020-11-17 9:40 ` [RFC PATCH bpf-next 7/8] bpf: Call bpf_run_sk_reuseport() for socket migration Kuniyuki Iwashima
2020-11-19 1:00 ` Martin KaFai Lau
2020-11-19 22:13 ` Kuniyuki Iwashima
2020-11-17 9:40 ` [RFC PATCH bpf-next 8/8] bpf: Test BPF_PROG_TYPE_SK_REUSEPORT " Kuniyuki Iwashima
2020-11-18 9:18 ` [RFC PATCH bpf-next 0/8] Socket migration for SO_REUSEPORT David Laight
2020-11-19 22:01 ` Kuniyuki Iwashima
2020-11-18 16:25 ` Eric Dumazet [this message]
2020-11-19 22:05 ` Kuniyuki Iwashima
2020-11-19 1:49 ` Martin KaFai Lau
2020-11-19 22:17 ` Kuniyuki Iwashima
2020-11-20 2:31 ` Martin KaFai Lau
2020-11-21 10:16 ` Kuniyuki Iwashima
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5feaafd3-72ca-72da-0fe8-cc4206bc29e6@gmail.com \
--to=eric.dumazet@gmail.com \
--cc=ast@kernel.org \
--cc=benh@amazon.com \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=kuni1840@gmail.com \
--cc=kuniyu@amazon.co.jp \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox