From: Jakub Sitnicki <jakub@cloudflare.com>
To: Martin Lau <kafai@fb.com>
Cc: "bpf\@vger.kernel.org" <bpf@vger.kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
"netdev\@vger.kernel.org" <netdev@vger.kernel.org>,
"kernel-team\@cloudflare.com" <kernel-team@cloudflare.com>
Subject: Re: [RFC bpf-next 0/5] Extend SOCKMAP to store listening sockets
Date: Mon, 28 Oct 2019 13:35:26 +0100 [thread overview]
Message-ID: <875zk9oxo1.fsf@cloudflare.com> (raw)
In-Reply-To: <20191028055247.bh5bctgxfvmr3zjh@kafai-mbp.dhcp.thefacebook.com>
On Mon, Oct 28, 2019 at 06:52 AM CET, Martin Lau wrote:
> On Tue, Oct 22, 2019 at 01:37:25PM +0200, Jakub Sitnicki wrote:
>> This patch set is a follow up on a suggestion from LPC '19 discussions to
>> make SOCKMAP (or a new map type derived from it) a generic type for storing
>> established as well as listening sockets.
>>
>> We found ourselves in need of a map type that keeps references to listening
>> sockets when working on making the socket lookup programmable, aka BPF
>> inet_lookup [1]. Initially we repurposed REUSEPORT_SOCKARRAY but found it
>> problematic to extend due to being tightly coupled with reuseport
>> logic (see slides [2]).
>> So we've turned our attention to SOCKMAP instead.
>>
>> As it turns out the changes needed to make SOCKMAP suitable for storing
>> listening sockets are self-contained and have use outside of programming
>> the socket lookup. Hence this patch set.
>>
>> With these patches SOCKMAP can be used in SK_REUSEPORT BPF programs as a
>> drop-in replacement for REUSEPORT_SOCKARRAY for TCP. This can hopefully
>> lead to code consolidation between the two map types in the future.
> What is the plan for UDP support in sockmap?
It's on our road-map because without SOCKMAP support for UDP we won't be
able to move away from TPROXY [1] and custom SO_BINDTOPREFIX extension
[2] for steering new UDP flows to receiving sockets. Also we would like
to look into using SOCKMAP for connected UDP socket splicing in the
future [3].
I was planning to split work as follows:
1. SOCKMAP support for listening sockets (this series)
2. programmable socket lookup for TCP (cut-down version of [4])
3. SOCKMAP support for UDP (work not started)
4. programmable socket lookup for UDP (rest of [4])
I'm open to suggestions on how to organize it.
>> Having said that, the main intention here is to lay groundwork for using
>> SOCKMAP in the next iteration of programmable socket lookup patches.
> What may be the minimal to get only lookup work for UDP sockmap?
> .close() and .unhash()?
John would know better. I haven't tried doing it yet.
From just reading the code - override the two proto ops you mentioned,
close and unhash, and adapt the socket checks in SOCKMAP.
-Jakub
[1] https://blog.cloudflare.com/how-we-built-spectrum/
[2] https://lore.kernel.org/netdev/1458699966-3752-1-git-send-email-gilberto.bertin@gmail.com/
[3] https://lore.kernel.org/bpf/20190828072250.29828-1-jakub@cloudflare.com/
[4] https://blog.cloudflare.com/sockmap-tcp-splicing-of-the-future/
next prev parent reply other threads:[~2019-10-28 12:35 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-22 11:37 [RFC bpf-next 0/5] Extend SOCKMAP to store listening sockets Jakub Sitnicki
2019-10-22 11:37 ` [RFC bpf-next 1/5] bpf, sockmap: Let BPF helpers use lookup operation on SOCKMAP Jakub Sitnicki
2019-10-24 16:59 ` John Fastabend
2019-10-22 11:37 ` [RFC bpf-next 2/5] bpf, sockmap: Allow inserting listening TCP sockets into SOCKMAP Jakub Sitnicki
2019-10-24 17:06 ` John Fastabend
2019-10-25 9:41 ` Jakub Sitnicki
2019-10-22 11:37 ` [RFC bpf-next 3/5] bpf, sockmap: Don't let child socket inherit psock or its ops on copy Jakub Sitnicki
2019-10-22 11:37 ` [RFC bpf-next 4/5] bpf: Allow selecting reuseport socket from a SOCKMAP Jakub Sitnicki
2019-10-22 11:37 ` [RFC bpf-next 5/5] selftests/bpf: Extend SK_REUSEPORT tests to cover SOCKMAP Jakub Sitnicki
2019-10-24 16:12 ` [RFC bpf-next 0/5] Extend SOCKMAP to store listening sockets Alexei Starovoitov
2019-10-24 16:56 ` John Fastabend
2019-10-25 9:26 ` Jakub Sitnicki
2019-10-25 14:18 ` John Fastabend
2019-10-28 5:52 ` Martin Lau
2019-10-28 12:35 ` Jakub Sitnicki [this message]
2019-10-28 19:04 ` John Fastabend
2019-10-29 8:56 ` Jakub Sitnicki
2019-10-28 20:42 ` Martin Lau
2019-10-28 21:05 ` John Fastabend
2019-10-28 21:38 ` Martin Lau
2019-10-29 8:52 ` Jakub Sitnicki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=875zk9oxo1.fsf@cloudflare.com \
--to=jakub@cloudflare.com \
--cc=bpf@vger.kernel.org \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kernel-team@cloudflare.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.