* [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
@ 2026-06-23 11:20 Jakub Sitnicki
  2026-06-23 16:08 ` Kuniyuki Iwashima
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Sitnicki @ 2026-06-23 11:20 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen,
	John Fastabend, Kuniyuki Iwashima, netdev, kernel-team

Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
completed all code paths related to sockmap-based redirects should be
guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
disabling NET_SOCK_MSG. The implementation of sockmap as a container for
socket references would remain under BPF_SYSCALL.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
Changes in v2:
- Handle prot->recvmsg being NULL (Sashiko)
- Elaborate on the end goal in description
- Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
---
 net/unix/af_unix.c  | 4 ++--
 net/unix/unix_bpf.c | 6 ++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f7a9d55eee8a..84c11c60c75f 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
 #ifdef CONFIG_BPF_SYSCALL
 	const struct proto *prot = READ_ONCE(sk->sk_prot);
 
-	if (prot != &unix_dgram_proto)
+	if (prot->recvmsg)
 		return prot->recvmsg(sk, msg, size, flags);
 #endif
 	return __unix_dgram_recvmsg(sk, msg, size, flags);
@@ -3152,7 +3152,7 @@ static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
 	struct sock *sk = sock->sk;
 	const struct proto *prot = READ_ONCE(sk->sk_prot);
 
-	if (prot != &unix_stream_proto)
+	if (prot->recvmsg)
 		return prot->recvmsg(sk, msg, size, flags);
 #endif
 	return unix_stream_read_generic(&state, true);
diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
index f86ff19e9764..5289a04b4993 100644
--- a/net/unix/unix_bpf.c
+++ b/net/unix/unix_bpf.c
@@ -7,6 +7,7 @@
 
 #include "af_unix.h"
 
+#ifdef CONFIG_NET_SOCK_MSG
 #define unix_sk_has_data(__sk, __psock)	\
 	({ !skb_queue_empty(&__sk->sk_receive_queue) ||	\
 	   !skb_queue_empty(&__psock->ingress_skb) ||	\
@@ -94,6 +95,7 @@ static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
 	sk_psock_put(sk, psock);
 	return copied;
 }
+#endif /* CONFIG_NET_SOCK_MSG */
 
 static struct proto *unix_dgram_prot_saved __read_mostly;
 static DEFINE_SPINLOCK(unix_dgram_prot_lock);
@@ -107,8 +109,10 @@ static void unix_dgram_bpf_rebuild_protos(struct proto *prot, const struct proto
 {
 	*prot = *base;
 	prot->close = sock_map_close;
+#ifdef CONFIG_NET_SOCK_MSG
 	prot->recvmsg = unix_bpf_recvmsg;
 	prot->sock_is_readable = sk_msg_is_readable;
+#endif
 }
 
 static void unix_stream_bpf_rebuild_protos(struct proto *prot,
@@ -116,8 +120,10 @@ static void unix_stream_bpf_rebuild_protos(struct proto *prot,
 {
 	*prot = *base;
 	prot->close = sock_map_close;
+#ifdef CONFIG_NET_SOCK_MSG
 	prot->recvmsg = unix_bpf_recvmsg;
 	prot->sock_is_readable = sk_msg_is_readable;
+#endif
 	prot->unhash = sock_map_unhash;
 }
 

^ permalink raw reply related [flat|nested] 14+ messages in thread
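Editorial sketch (not part of the posted patch): the switch from a pointer comparison to a NULL check can be modeled in userspace C. The model assumes, as the v2 changelog entry suggests, that the base AF_UNIX proto leaves recvmsg unset, so a rebuilt proto only gains a recvmsg callback when CONFIG_NET_SOCK_MSG is defined. The names proto_model, base_proto, rebuild_proto, bpf_recvmsg, generic_recvmsg and dispatch_recvmsg are hypothetical stand-ins, not kernel symbols.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct proto; only the one
 * callback relevant to the dispatch is modeled. */
struct proto_model {
	int (*recvmsg)(int fd);
};

/* Assumption: the base proto does not set recvmsg. */
static const struct proto_model base_proto = { .recvmsg = NULL };

#ifdef CONFIG_NET_SOCK_MSG
/* Models unix_bpf_recvmsg(); only built when sk_msg support exists. */
static int bpf_recvmsg(int fd)
{
	(void)fd;
	return 2;
}
#endif

/* Models unix_*_bpf_rebuild_protos(): copy the base proto, then
 * install the override only when it was compiled in. */
static void rebuild_proto(struct proto_model *prot,
			  const struct proto_model *base)
{
	*prot = *base;
#ifdef CONFIG_NET_SOCK_MSG
	prot->recvmsg = bpf_recvmsg;
#endif
}

/* Models the generic receive path. */
static int generic_recvmsg(int fd)
{
	(void)fd;
	return 1;
}

/* Models the patched check: call the override if one is set, else
 * fall through. Comparing prot against the base proto's address
 * instead would call a NULL recvmsg on a rebuilt proto whenever the
 * override is compiled out. */
static int dispatch_recvmsg(const struct proto_model *prot, int fd)
{
	if (prot->recvmsg)
		return prot->recvmsg(fd);
	return generic_recvmsg(fd);
}
```

Built without -DCONFIG_NET_SOCK_MSG, a rebuilt proto falls through to generic_recvmsg(); built with it, the override is called.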
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG 2026-06-23 11:20 [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG Jakub Sitnicki @ 2026-06-23 16:08 ` Kuniyuki Iwashima 2026-06-23 19:21 ` Jakub Sitnicki 0 siblings, 1 reply; 14+ messages in thread From: Kuniyuki Iwashima @ 2026-06-23 16:08 UTC (permalink / raw) To: Jakub Sitnicki Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend, netdev, kernel-team On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: > > Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When > completed all code paths related to sockmap-based redirects should be > guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by > disabling NET_SOCK_MSG. The implementation of sockmap as a container for > socket references would remain under BPF_SYSCALL. > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> > --- > Changes in v2: > - Handle prot->recvmsg being NULL (Sashiko) > - Elaborate on the end goal in description > - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com > --- > net/unix/af_unix.c | 4 ++-- > net/unix/unix_bpf.c | 6 ++++++ > 2 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > index f7a9d55eee8a..84c11c60c75f 100644 > --- a/net/unix/af_unix.c > +++ b/net/unix/af_unix.c > @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si > #ifdef CONFIG_BPF_SYSCALL > const struct proto *prot = READ_ONCE(sk->sk_prot); > > - if (prot != &unix_dgram_proto) > + if (prot->recvmsg) There is no reason to have this dead branch when CONFIG_BPF_SYSCALL && !NET_SOCK_MSG. Let's compile out all sockmap code when both configs are not enabled. Since AF_UNIX differs from TCP/UDP, it can take the simpler approach. 
> return prot->recvmsg(sk, msg, size, flags); > #endif > return __unix_dgram_recvmsg(sk, msg, size, flags); > @@ -3152,7 +3152,7 @@ static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg, > struct sock *sk = sock->sk; > const struct proto *prot = READ_ONCE(sk->sk_prot); > > - if (prot != &unix_stream_proto) > + if (prot->recvmsg) > return prot->recvmsg(sk, msg, size, flags); > #endif > return unix_stream_read_generic(&state, true); > diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c > index f86ff19e9764..5289a04b4993 100644 > --- a/net/unix/unix_bpf.c > +++ b/net/unix/unix_bpf.c > @@ -7,6 +7,7 @@ > > #include "af_unix.h" > > +#ifdef CONFIG_NET_SOCK_MSG > #define unix_sk_has_data(__sk, __psock) \ > ({ !skb_queue_empty(&__sk->sk_receive_queue) || \ > !skb_queue_empty(&__psock->ingress_skb) || \ > @@ -94,6 +95,7 @@ static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > sk_psock_put(sk, psock); > return copied; > } > +#endif /* CONFIG_NET_SOCK_MSG */ > > static struct proto *unix_dgram_prot_saved __read_mostly; > static DEFINE_SPINLOCK(unix_dgram_prot_lock); > @@ -107,8 +109,10 @@ static void unix_dgram_bpf_rebuild_protos(struct proto *prot, const struct proto > { > *prot = *base; > prot->close = sock_map_close; > +#ifdef CONFIG_NET_SOCK_MSG > prot->recvmsg = unix_bpf_recvmsg; > prot->sock_is_readable = sk_msg_is_readable; > +#endif > } > > static void unix_stream_bpf_rebuild_protos(struct proto *prot, > @@ -116,8 +120,10 @@ static void unix_stream_bpf_rebuild_protos(struct proto *prot, > { > *prot = *base; > prot->close = sock_map_close; > +#ifdef CONFIG_NET_SOCK_MSG > prot->recvmsg = unix_bpf_recvmsg; > prot->sock_is_readable = sk_msg_is_readable; > +#endif > prot->unhash = sock_map_unhash; > } > > > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG 2026-06-23 16:08 ` Kuniyuki Iwashima @ 2026-06-23 19:21 ` Jakub Sitnicki 2026-06-23 19:31 ` Kuniyuki Iwashima 0 siblings, 1 reply; 14+ messages in thread From: Jakub Sitnicki @ 2026-06-23 19:21 UTC (permalink / raw) To: Kuniyuki Iwashima Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend, netdev, kernel-team On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote: > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: >> >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When >> completed all code paths related to sockmap-based redirects should be >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for >> socket references would remain under BPF_SYSCALL. >> >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> >> --- >> Changes in v2: >> - Handle prot->recvmsg being NULL (Sashiko) >> - Elaborate on the end goal in description >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com >> --- >> net/unix/af_unix.c | 4 ++-- >> net/unix/unix_bpf.c | 6 ++++++ >> 2 files changed, 8 insertions(+), 2 deletions(-) >> >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c >> index f7a9d55eee8a..84c11c60c75f 100644 >> --- a/net/unix/af_unix.c >> +++ b/net/unix/af_unix.c >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si >> #ifdef CONFIG_BPF_SYSCALL >> const struct proto *prot = READ_ONCE(sk->sk_prot); >> >> - if (prot != &unix_dgram_proto) >> + if (prot->recvmsg) > > There is no reason to have this dead branch when > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG. > > Let's compile out all sockmap code when both configs > are not enabled. > > Since AF_UNIX differs from TCP/UDP, it can take the > simpler approach. 
Okay, will put the whole file behind hidden config option like so:

--- a/net/unix/Kconfig
+++ b/net/unix/Kconfig
@@ -30,3 +30,8 @@ config UNIX_DIAG
 	help
 	  Support for UNIX socket monitoring interface used by the ss tool.
 	  If unsure, say Y.
+
+config UNIX_BPF
+	bool
+	depends on UNIX
+	default y if BPF_SYSCALL && NET_SOCK_MSG
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG 2026-06-23 19:21 ` Jakub Sitnicki @ 2026-06-23 19:31 ` Kuniyuki Iwashima 2026-06-23 19:33 ` Alexei Starovoitov 2026-06-23 20:09 ` Jakub Sitnicki 0 siblings, 2 replies; 14+ messages in thread From: Kuniyuki Iwashima @ 2026-06-23 19:31 UTC (permalink / raw) To: Jakub Sitnicki Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend, netdev, kernel-team On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: > > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote: > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: > >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When > >> completed all code paths related to sockmap-based redirects should be > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for > >> socket references would remain under BPF_SYSCALL. 
> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> > >> --- > >> Changes in v2: > >> - Handle prot->recvmsg being NULL (Sashiko) > >> - Elaborate on the end goal in description > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com > >> --- > >> net/unix/af_unix.c | 4 ++-- > >> net/unix/unix_bpf.c | 6 ++++++ > >> 2 files changed, 8 insertions(+), 2 deletions(-) > >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > >> index f7a9d55eee8a..84c11c60c75f 100644 > >> --- a/net/unix/af_unix.c > >> +++ b/net/unix/af_unix.c > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si > >> #ifdef CONFIG_BPF_SYSCALL > >> const struct proto *prot = READ_ONCE(sk->sk_prot); > >> > >> - if (prot != &unix_dgram_proto) > >> + if (prot->recvmsg) > > > > There is no reason to have this dead branch when > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG. > > > > Let's compile out all sockmap code when both configs > > are not enabled. > > > > Since AF_UNIX differs from TCP/UDP, it can take the > > simpler approach. > > Okay, will put the whole file behind hidden config option like so: > > --- a/net/unix/Kconfig > +++ b/net/unix/Kconfig > @@ -30,3 +30,8 @@ config UNIX_DIAG > help > Support for UNIX socket monitoring interface used by the ss tool. > If unsure, say Y. > + > +config UNIX_BPF Maybe UNIX_BPF_SOCKMAP or something. bpf_iter is supported without this config. > + bool > + depends on UNIX > + default y if BPF_SYSCALL && NET_SOCK_MSG ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG 2026-06-23 19:31 ` Kuniyuki Iwashima @ 2026-06-23 19:33 ` Alexei Starovoitov 2026-06-23 20:03 ` Jakub Sitnicki 2026-06-23 20:09 ` Jakub Sitnicki 1 sibling, 1 reply; 14+ messages in thread From: Alexei Starovoitov @ 2026-06-23 19:33 UTC (permalink / raw) To: Kuniyuki Iwashima Cc: Jakub Sitnicki, bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend, Network Development, kernel-team On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote: > > On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: > > > > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote: > > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: > > >> > > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When > > >> completed all code paths related to sockmap-based redirects should be > > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by > > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for > > >> socket references would remain under BPF_SYSCALL. 
> > >> > > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> > > >> --- > > >> Changes in v2: > > >> - Handle prot->recvmsg being NULL (Sashiko) > > >> - Elaborate on the end goal in description > > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com > > >> --- > > >> net/unix/af_unix.c | 4 ++-- > > >> net/unix/unix_bpf.c | 6 ++++++ > > >> 2 files changed, 8 insertions(+), 2 deletions(-) > > >> > > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > > >> index f7a9d55eee8a..84c11c60c75f 100644 > > >> --- a/net/unix/af_unix.c > > >> +++ b/net/unix/af_unix.c > > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si > > >> #ifdef CONFIG_BPF_SYSCALL > > >> const struct proto *prot = READ_ONCE(sk->sk_prot); > > >> > > >> - if (prot != &unix_dgram_proto) > > >> + if (prot->recvmsg) > > > > > > There is no reason to have this dead branch when > > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG. > > > > > > Let's compile out all sockmap code when both configs > > > are not enabled. > > > > > > Since AF_UNIX differs from TCP/UDP, it can take the > > > simpler approach. > > > > Okay, will put the whole file behind hidden config option like so: > > > > --- a/net/unix/Kconfig > > +++ b/net/unix/Kconfig > > @@ -30,3 +30,8 @@ config UNIX_DIAG > > help > > Support for UNIX socket monitoring interface used by the ss tool. > > If unsure, say Y. > > + > > +config UNIX_BPF > > Maybe UNIX_BPF_SOCKMAP or something. > bpf_iter is supported without this config. I don't like where it's going. I strongly dislike new config knobs. I'd rather remove existing knobs. What is the motivation? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG 2026-06-23 19:33 ` Alexei Starovoitov @ 2026-06-23 20:03 ` Jakub Sitnicki 2026-06-23 20:13 ` Kuniyuki Iwashima 2026-06-23 20:22 ` Amery Hung 0 siblings, 2 replies; 14+ messages in thread From: Jakub Sitnicki @ 2026-06-23 20:03 UTC (permalink / raw) To: Alexei Starovoitov Cc: Kuniyuki Iwashima, bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend, Network Development, kernel-team On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote: > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote: >> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: >> > >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote: >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: >> > >> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When >> > >> completed all code paths related to sockmap-based redirects should be >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for >> > >> socket references would remain under BPF_SYSCALL. 
>> > >>
>> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> > >> ---
>> > >> Changes in v2:
>> > >> - Handle prot->recvmsg being NULL (Sashiko)
>> > >> - Elaborate on the end goal in description
>> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
>> > >> ---
>> > >>  net/unix/af_unix.c  | 4 ++--
>> > >>  net/unix/unix_bpf.c | 6 ++++++
>> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
>> > >>
>> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>> > >> index f7a9d55eee8a..84c11c60c75f 100644
>> > >> --- a/net/unix/af_unix.c
>> > >> +++ b/net/unix/af_unix.c
>> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
>> > >>  #ifdef CONFIG_BPF_SYSCALL
>> > >>  	const struct proto *prot = READ_ONCE(sk->sk_prot);
>> > >>
>> > >> -	if (prot != &unix_dgram_proto)
>> > >> +	if (prot->recvmsg)
>> > >
>> > > There is no reason to have this dead branch when
>> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
>> > >
>> > > Let's compile out all sockmap code when both configs
>> > > are not enabled.
>> > >
>> > > Since AF_UNIX differs from TCP/UDP, it can take the
>> > > simpler approach.
>> >
>> > Okay, will put the whole file behind hidden config option like so:
>> >
>> > --- a/net/unix/Kconfig
>> > +++ b/net/unix/Kconfig
>> > @@ -30,3 +30,8 @@ config UNIX_DIAG
>> >  	help
>> >  	  Support for UNIX socket monitoring interface used by the ss tool.
>> >  	  If unsure, say Y.
>> > +
>> > +config UNIX_BPF
>>
>> Maybe UNIX_BPF_SOCKMAP or something.
>> bpf_iter is supported without this config.
>
> I don't like where it's going.
> I strongly dislike new config knobs.
> I'd rather remove existing knobs.
> What is the motivation?

The goal is to compile out sockmap bits that use sk_msg.
NET_SOCK_MSG is a natural, existing candidate.
New knob wasn't my idea.

Alternatively, we can do this to avoid the extra knob:

ifdef CONFIG_BPF_SYSCALL
unix-$(CONFIG_NET_SOCK_MSG) += unix_bpf.o
endif
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG 2026-06-23 20:03 ` Jakub Sitnicki @ 2026-06-23 20:13 ` Kuniyuki Iwashima 2026-06-23 20:22 ` Amery Hung 1 sibling, 0 replies; 14+ messages in thread From: Kuniyuki Iwashima @ 2026-06-23 20:13 UTC (permalink / raw) To: Jakub Sitnicki Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend, Network Development, kernel-team On Tue, Jun 23, 2026 at 1:03 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: > > On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote: > > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote: > >> > >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: > >> > > >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote: > >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: > >> > >> > >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When > >> > >> completed all code paths related to sockmap-based redirects should be > >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by > >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for > >> > >> socket references would remain under BPF_SYSCALL. 
> >> > >>
> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> > >> ---
> >> > >> Changes in v2:
> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
> >> > >> - Elaborate on the end goal in description
> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> > >> ---
> >> > >>  net/unix/af_unix.c  | 4 ++--
> >> > >>  net/unix/unix_bpf.c | 6 ++++++
> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >> > >>
> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
> >> > >> --- a/net/unix/af_unix.c
> >> > >> +++ b/net/unix/af_unix.c
> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >> > >>  #ifdef CONFIG_BPF_SYSCALL
> >> > >>  	const struct proto *prot = READ_ONCE(sk->sk_prot);
> >> > >>
> >> > >> -	if (prot != &unix_dgram_proto)
> >> > >> +	if (prot->recvmsg)
> >> > >
> >> > > There is no reason to have this dead branch when
> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >> > >
> >> > > Let's compile out all sockmap code when both configs
> >> > > are not enabled.
> >> > >
> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
> >> > > simpler approach.
> >> >
> >> > Okay, will put the whole file behind hidden config option like so:
> >> >
> >> > --- a/net/unix/Kconfig
> >> > +++ b/net/unix/Kconfig
> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >> >  	help
> >> >  	  Support for UNIX socket monitoring interface used by the ss tool.
> >> >  	  If unsure, say Y.
> >> > +
> >> > +config UNIX_BPF
> >>
> >> Maybe UNIX_BPF_SOCKMAP or something.
> >> bpf_iter is supported without this config.
> >
> > I don't like where it's going.
> > I strongly dislike new config knobs.
> > I'd rather remove existing knobs.
> > What is the motivation?
>
> The goal is to compile out sockmap bits that use sk_msg.
> NET_SOCK_MSG is a natural, existing candidate.
> New knob wasn't my idea.

I think config w/o description is okay since it's not selectable.

>
> Alternatively, we can do this to avoid the extra knob:
>
> ifdef CONFIG_BPF_SYSCALL
> unix-$(CONFIG_NET_SOCK_MSG) += unix_bpf.o
> endif

This is far better, I forgot ifdef is available.
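
Editorial sketch (not from the thread): if unix_bpf.o is only built when both options are set, callers still need something to compile against when it is absent. One common shape for that is an inline stub selected by the same condition. The userspace model below assumes this stub-based approach; sock_model and unix_bpf_update_proto_model are hypothetical names, and the -EOPNOTSUPP return value is an illustrative choice rather than what the kernel does.

```c
#include <errno.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct sock_model {
	int has_bpf_recvmsg;
};

#if defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_NET_SOCK_MSG)
/* Both options set: models code that would be built from unix_bpf.c. */
static int unix_bpf_update_proto_model(struct sock_model *sk)
{
	sk->has_bpf_recvmsg = 1;
	return 0;
}
#else
/* Either option missing: the object file is not built, so an inline
 * stub reports that the feature is unavailable and leaves the socket
 * untouched. */
static inline int unix_bpf_update_proto_model(struct sock_model *sk)
{
	(void)sk;
	return -EOPNOTSUPP;
}
#endif
```

With this shape the call sites need no #ifdef of their own; the condition lives in one place next to the declaration.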
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG 2026-06-23 20:03 ` Jakub Sitnicki 2026-06-23 20:13 ` Kuniyuki Iwashima @ 2026-06-23 20:22 ` Amery Hung 2026-06-23 20:36 ` Jakub Sitnicki 1 sibling, 1 reply; 14+ messages in thread From: Amery Hung @ 2026-06-23 20:22 UTC (permalink / raw) To: Jakub Sitnicki Cc: Alexei Starovoitov, Kuniyuki Iwashima, bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend, Network Development, kernel-team On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: > > On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote: > > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote: > >> > >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: > >> > > >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote: > >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: > >> > >> > >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When > >> > >> completed all code paths related to sockmap-based redirects should be > >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by > >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for > >> > >> socket references would remain under BPF_SYSCALL. 
> >> > >>
> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> > >> ---
> >> > >> Changes in v2:
> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
> >> > >> - Elaborate on the end goal in description
> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> > >> ---
> >> > >>  net/unix/af_unix.c  | 4 ++--
> >> > >>  net/unix/unix_bpf.c | 6 ++++++
> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >> > >>
> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
> >> > >> --- a/net/unix/af_unix.c
> >> > >> +++ b/net/unix/af_unix.c
> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >> > >>  #ifdef CONFIG_BPF_SYSCALL
> >> > >>  	const struct proto *prot = READ_ONCE(sk->sk_prot);
> >> > >>
> >> > >> -	if (prot != &unix_dgram_proto)
> >> > >> +	if (prot->recvmsg)
> >> > >
> >> > > There is no reason to have this dead branch when
> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >> > >
> >> > > Let's compile out all sockmap code when both configs
> >> > > are not enabled.
> >> > >
> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
> >> > > simpler approach.
> >> >
> >> > Okay, will put the whole file behind hidden config option like so:
> >> >
> >> > --- a/net/unix/Kconfig
> >> > +++ b/net/unix/Kconfig
> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >> >  	help
> >> >  	  Support for UNIX socket monitoring interface used by the ss tool.
> >> >  	  If unsure, say Y.
> >> > +
> >> > +config UNIX_BPF
> >>
> >> Maybe UNIX_BPF_SOCKMAP or something.
> >> bpf_iter is supported without this config.
> >
> > I don't like where it's going.
> > I strongly dislike new config knobs.
> > I'd rather remove existing knobs.
> > What is the motivation?
>
> The goal is to compile out sockmap bits that use sk_msg.
> NET_SOCK_MSG is a natural, existing candidate.
> New knob wasn't my idea.

I'm also missing the big picture here.

sockmap already holds socket references today. You can store and look
up sockets without attaching any verdict/parser program, and no
redirect happens. So if the goal is to use sockmap purely as a socket
container without the sk_msg fast-path overhead, what does a
compile-time NET_SOCK_MSG knob add over the runtime checks?

I am also not sure if NET_SOCK_MSG is right. It is broader than
"sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
Because those select it, it can't be toggled independently.

Could you share the concrete use case you have in mind, and whether
this came out of an earlier discussion or thread upstream?

>
> Alternatively, we can do this to avoid the extra knob:
>
> ifdef CONFIG_BPF_SYSCALL
> unix-$(CONFIG_NET_SOCK_MSG) += unix_bpf.o
> endif
>
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG 2026-06-23 20:22 ` Amery Hung @ 2026-06-23 20:36 ` Jakub Sitnicki 2026-06-23 20:44 ` Amery Hung 2026-06-23 21:26 ` Alexei Starovoitov 0 siblings, 2 replies; 14+ messages in thread From: Jakub Sitnicki @ 2026-06-23 20:36 UTC (permalink / raw) To: Amery Hung Cc: Alexei Starovoitov, Kuniyuki Iwashima, bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend, Network Development, kernel-team On Tue, Jun 23, 2026 at 01:22 PM -07, Amery Hung wrote: > On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: >> >> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote: >> > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote: >> >> >> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote: >> >> > >> >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote: >> >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote: >> >> > >> >> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When >> >> > >> completed all code paths related to sockmap-based redirects should be >> >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by >> >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for >> >> > >> socket references would remain under BPF_SYSCALL. 
>> >> > >> >> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> >> >> > >> --- >> >> > >> Changes in v2: >> >> > >> - Handle prot->recvmsg being NULL (Sashiko) >> >> > >> - Elaborate on the end goal in description >> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com >> >> > >> --- >> >> > >> net/unix/af_unix.c | 4 ++-- >> >> > >> net/unix/unix_bpf.c | 6 ++++++ >> >> > >> 2 files changed, 8 insertions(+), 2 deletions(-) >> >> > >> >> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c >> >> > >> index f7a9d55eee8a..84c11c60c75f 100644 >> >> > >> --- a/net/unix/af_unix.c >> >> > >> +++ b/net/unix/af_unix.c >> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si >> >> > >> #ifdef CONFIG_BPF_SYSCALL >> >> > >> const struct proto *prot = READ_ONCE(sk->sk_prot); >> >> > >> >> >> > >> - if (prot != &unix_dgram_proto) >> >> > >> + if (prot->recvmsg) >> >> > > >> >> > > There is no reason to have this dead branch when >> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG. >> >> > > >> >> > > Let's compile out all sockmap code when both configs >> >> > > are not enabled. >> >> > > >> >> > > Since AF_UNIX differs from TCP/UDP, it can take the >> >> > > simpler approach. >> >> > >> >> > Okay, will put the whole file behind hidden config option like so: >> >> > >> >> > --- a/net/unix/Kconfig >> >> > +++ b/net/unix/Kconfig >> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG >> >> > help >> >> > Support for UNIX socket monitoring interface used by the ss tool. >> >> > If unsure, say Y. >> >> > + >> >> > +config UNIX_BPF >> >> >> >> Maybe UNIX_BPF_SOCKMAP or something. >> >> bpf_iter is supported without this config. >> > >> > I don't like where it's going. >> > I strongly dislike new config knobs. >> > I'd rather remove existing knobs. >> > What is the motivation? >> >> The goal is to compile out sockmap bits that use sk_msg. 
>> NET_SOCK_MSG is a natural, existing candidate.
>> New knob wasn't my idea.
>
> I'm also missing the big picture here.
>
> sockmap already holds socket references today. You can store and look
> up sockets without attaching any verdict/parser program, and no
> redirect happens. So if the goal is to use sockmap purely as a socket
> container without the sk_msg fast-path overhead, what does a
> compile-time NET_SOCK_MSG knob add over the runtime checks?

Sure, let me clarify. It's about the maintenance overhead.

sockmap-based redirects are a rather niche feature with few users, for
which we've been getting quite a few bug reports since AI came along.

We're not using it internally at Cloudflare, so I don't really have a
good reason to justify time spent on these bug reports.

Hence the move to put sockmap-based redirect behind a config option,
which you can enable at your own risk. Or which we can deprecate, but
that's not really my call.

> I am also not sure if NET_SOCK_MSG is right. It is broader than
> "sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
> Because those select it, it can't be toggled independently.

Once the sockmap redirect bits are behind _some_ config option, it will
be easy to replace it with a more granular one that depends on
NET_SOCK_MSG. But we're not there yet. One step at a time.

> Could you share the concrete use case you have in mind, and whether
> this came out of an earlier discussion or thread upstream?

This is a follow-up from discussions at BPF summit with Alexei & John.
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:36 ` Jakub Sitnicki
@ 2026-06-23 20:44 ` Amery Hung
  2026-06-23 21:26 ` Alexei Starovoitov
  1 sibling, 0 replies; 14+ messages in thread
From: Amery Hung @ 2026-06-23 20:44 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, Kuniyuki Iwashima, bpf, Alexei Starovoitov,
	Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend,
	Network Development, kernel-team

On Tue, Jun 23, 2026 at 1:36 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 01:22 PM -07, Amery Hung wrote:
> > On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >>
> >> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
> >> > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
> >> >>
> >> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >> >
> >> >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> >> >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >> > >>
> >> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> >> >> > >> completed all code paths related to sockmap-based redirects should be
> >> >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> >> >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> >> >> > >> socket references would remain under BPF_SYSCALL.
> >> >> > >>
> >> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> >> > >> ---
> >> >> > >> Changes in v2:
> >> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
> >> >> > >> - Elaborate on the end goal in description
> >> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> >> > >> ---
> >> >> > >>  net/unix/af_unix.c | 4 ++--
> >> >> > >>  net/unix/unix_bpf.c | 6 ++++++
> >> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >> >> > >>
> >> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
> >> >> > >> --- a/net/unix/af_unix.c
> >> >> > >> +++ b/net/unix/af_unix.c
> >> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >> >> > >>  #ifdef CONFIG_BPF_SYSCALL
> >> >> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
> >> >> > >>
> >> >> > >> -       if (prot != &unix_dgram_proto)
> >> >> > >> +       if (prot->recvmsg)
> >> >> > >
> >> >> > > There is no reason to have this dead branch when
> >> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >> >> > >
> >> >> > > Let's compile out all sockmap code when both configs
> >> >> > > are not enabled.
> >> >> > >
> >> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
> >> >> > > simpler approach.
> >> >> >
> >> >> > Okay, will put the whole file behind hidden config option like so:
> >> >> >
> >> >> > --- a/net/unix/Kconfig
> >> >> > +++ b/net/unix/Kconfig
> >> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >> >> >         help
> >> >> >           Support for UNIX socket monitoring interface used by the ss tool.
> >> >> >           If unsure, say Y.
> >> >> > +
> >> >> > +config UNIX_BPF
> >> >>
> >> >> Maybe UNIX_BPF_SOCKMAP or something.
> >> >> bpf_iter is supported without this config.
> >> >
> >> > I don't like where it's going.
> >> > I strongly dislike new config knobs.
> >> > I'd rather remove existing knobs.
> >> > What is the motivation?
> >>
> >> The goal is to compile out sockmap bits that use sk_msg.
> >> NET_SOCK_MSG is a natural, existing candidate.
> >> New knob wasn't my idea.
> >
> > I'm also missing the big picture here.
> >
> > sockmap already holds socket references today. You can store and look
> > up sockets without attaching any verdict/parser program, and no
> > redirect happens. So if the goal is to use sockmap purely as a socket
> > container without the sk_msg fast-path overhead, what does a
> > compile-time NET_SOCK_MSG knob add over the runtime checks?
>
> Sure, let me clarify. It's about the maintenance overhead.
>
> sockmap-based redirects are a rather niche feature with few users, for
> which we've been getting quite a few bug reports since AI came along.
>
> We're not using it internally at Cloudflare, so I don't really have a
> good reason to justify time spent on these bug reports.
>
> Hence the move to put sockmap-based redirect behind a config option,
> which you can enable at your own risk. Or which we can deprecate, but
> that's not really my call.
>
> > I am also not sure if NET_SOCK_MSG is right. It is broader than
> > "sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
> > Because those select it, it can't be toggled independently.
>
> Once the sockmap redirect bits are behind _some_ config option, it will
> be easy to replace it with a more granular one that depends on
> NET_SOCK_MSG. But we're not there yet. One step at a time.
>
> > Could you share the concrete use case you have in mind, and whether
> > this came out of an earlier discussion or thread upstream?
>
> This is a follow-up from discussions at BPF summit with Alexei & John.

I see. Thanks for explaining the motivation.
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:36 ` Jakub Sitnicki
  2026-06-23 20:44 ` Amery Hung
@ 2026-06-23 21:26 ` Alexei Starovoitov
  2026-06-24 1:32 ` Jiayuan Chen
  1 sibling, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2026-06-23 21:26 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Amery Hung, Kuniyuki Iwashima, bpf, Alexei Starovoitov,
	Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend,
	Network Development, kernel-team

On Tue, Jun 23, 2026 at 1:36 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 01:22 PM -07, Amery Hung wrote:
> > On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >>
> >> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
> >> > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
> >> >>
> >> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >> >
> >> >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> >> >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >> > >>
> >> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> >> >> > >> completed all code paths related to sockmap-based redirects should be
> >> >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> >> >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> >> >> > >> socket references would remain under BPF_SYSCALL.
> >> >> > >>
> >> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> >> > >> ---
> >> >> > >> Changes in v2:
> >> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
> >> >> > >> - Elaborate on the end goal in description
> >> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> >> > >> ---
> >> >> > >>  net/unix/af_unix.c | 4 ++--
> >> >> > >>  net/unix/unix_bpf.c | 6 ++++++
> >> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >> >> > >>
> >> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
> >> >> > >> --- a/net/unix/af_unix.c
> >> >> > >> +++ b/net/unix/af_unix.c
> >> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >> >> > >>  #ifdef CONFIG_BPF_SYSCALL
> >> >> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
> >> >> > >>
> >> >> > >> -       if (prot != &unix_dgram_proto)
> >> >> > >> +       if (prot->recvmsg)
> >> >> > >
> >> >> > > There is no reason to have this dead branch when
> >> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >> >> > >
> >> >> > > Let's compile out all sockmap code when both configs
> >> >> > > are not enabled.
> >> >> > >
> >> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
> >> >> > > simpler approach.
> >> >> >
> >> >> > Okay, will put the whole file behind hidden config option like so:
> >> >> >
> >> >> > --- a/net/unix/Kconfig
> >> >> > +++ b/net/unix/Kconfig
> >> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >> >> >         help
> >> >> >           Support for UNIX socket monitoring interface used by the ss tool.
> >> >> >           If unsure, say Y.
> >> >> > +
> >> >> > +config UNIX_BPF
> >> >>
> >> >> Maybe UNIX_BPF_SOCKMAP or something.
> >> >> bpf_iter is supported without this config.
> >> >
> >> > I don't like where it's going.
> >> > I strongly dislike new config knobs.
> >> > I'd rather remove existing knobs.
> >> > What is the motivation?
> >>
> >> The goal is to compile out sockmap bits that use sk_msg.
> >> NET_SOCK_MSG is a natural, existing candidate.
> >> New knob wasn't my idea.
> >
> > I'm also missing the big picture here.
> >
> > sockmap already holds socket references today. You can store and look
> > up sockets without attaching any verdict/parser program, and no
> > redirect happens. So if the goal is to use sockmap purely as a socket
> > container without the sk_msg fast-path overhead, what does a
> > compile-time NET_SOCK_MSG knob add over the runtime checks?
>
> Sure, let me clarify. It's about the maintenance overhead.
>
> sockmap-based redirects are a rather niche feature with few users, for
> which we've been getting quite a few bug reports since AI came along.
>
> We're not using it internally at Cloudflare, so I don't really have a
> good reason to justify time spent on these bug reports.
>
> Hence the move to put sockmap-based redirect behind a config option,
> which you can enable at your own risk. Or which we can deprecate, but
> that's not really my call.

This is wishful thinking that a config knob will stop
the bug reports.
Just disable it for real instead.

> > I am also not sure if NET_SOCK_MSG is right. It is broader than
> > "sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
> > Because those select it, it can't be toggled independently.
>
> Once the sockmap redirect bits are behind _some_ config option, it will
> be easy to replace it with a more granular one that depends on
> NET_SOCK_MSG. But we're not there yet. One step at a time.

No. That's not workable.

> > Could you share the concrete use case you have in mind, and whether
> > this came out of an earlier discussion or thread upstream?
>
> This is a follow-up from discussions at BPF summit with Alexei & John.

Not quite. The discussion was to disable pieces of sockmap
that are causing trouble.
Not to move them under config knobs, but disable them.
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 21:26 ` Alexei Starovoitov
@ 2026-06-24 1:32 ` Jiayuan Chen
  0 siblings, 0 replies; 14+ messages in thread
From: Jiayuan Chen @ 2026-06-24 1:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Jakub Sitnicki
  Cc: Amery Hung, Kuniyuki Iwashima, bpf, Alexei Starovoitov,
	Daniel Borkmann, Jakub Kicinski, John Fastabend,
	Network Development, kernel-team

On 6/24/26 5:26 AM, Alexei Starovoitov wrote:
> On Tue, Jun 23, 2026 at 1:36 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> On Tue, Jun 23, 2026 at 01:22 PM -07, Amery Hung wrote:
>>> On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
>>>>> On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>>>>>> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>>>>> On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
>>>>>>>> On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>>>>>>> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
>>>>>>>>> completed all code paths related to sockmap-based redirects should be
>>>>>>>>> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
>>>>>>>>> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
>>>>>>>>> socket references would remain under BPF_SYSCALL.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>>>>>>>>> ---
>>>>>>>>> Changes in v2:
>>>>>>>>> - Handle prot->recvmsg being NULL (Sashiko)
>>>>>>>>> - Elaborate on the end goal in description
>>>>>>>>> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
>>>>>>>>> ---
>>>>>>>>>  net/unix/af_unix.c | 4 ++--
>>>>>>>>>  net/unix/unix_bpf.c | 6 ++++++
>>>>>>>>>  2 files changed, 8 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>>>>>>>> index f7a9d55eee8a..84c11c60c75f 100644
>>>>>>>>> --- a/net/unix/af_unix.c
>>>>>>>>> +++ b/net/unix/af_unix.c
>>>>>>>>> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
>>>>>>>>>  #ifdef CONFIG_BPF_SYSCALL
>>>>>>>>>         const struct proto *prot = READ_ONCE(sk->sk_prot);
>>>>>>>>>
>>>>>>>>> -       if (prot != &unix_dgram_proto)
>>>>>>>>> +       if (prot->recvmsg)
>>>>>>>> There is no reason to have this dead branch when
>>>>>>>> CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
>>>>>>>>
>>>>>>>> Let's compile out all sockmap code when both configs
>>>>>>>> are not enabled.
>>>>>>>>
>>>>>>>> Since AF_UNIX differs from TCP/UDP, it can take the
>>>>>>>> simpler approach.
>>>>>>> Okay, will put the whole file behind hidden config option like so:
>>>>>>>
>>>>>>> --- a/net/unix/Kconfig
>>>>>>> +++ b/net/unix/Kconfig
>>>>>>> @@ -30,3 +30,8 @@ config UNIX_DIAG
>>>>>>>         help
>>>>>>>           Support for UNIX socket monitoring interface used by the ss tool.
>>>>>>>           If unsure, say Y.
>>>>>>> +
>>>>>>> +config UNIX_BPF
>>>>>> Maybe UNIX_BPF_SOCKMAP or something.
>>>>>> bpf_iter is supported without this config.
>>>>> I don't like where it's going.
>>>>> I strongly dislike new config knobs.
>>>>> I'd rather remove existing knobs.
>>>>> What is the motivation?
>>>> The goal is to compile out sockmap bits that use sk_msg.
>>>> NET_SOCK_MSG is a natural, existing candidate.
>>>> New knob wasn't my idea.
>>> I'm also missing the big picture here.
>>>
>>> sockmap already holds socket references today. You can store and look
>>> up sockets without attaching any verdict/parser program, and no
>>> redirect happens. So if the goal is to use sockmap purely as a socket
>>> container without the sk_msg fast-path overhead, what does a
>>> compile-time NET_SOCK_MSG knob add over the runtime checks?
>> Sure, let me clarify. It's about the maintenance overhead.
>>
>> sockmap-based redirects are a rather niche feature with few users, for
>> which we've been getting quite a few bug reports since AI came along.
>>
>> We're not using it internally at Cloudflare, so I don't really have a
>> good reason to justify time spent on these bug reports.
>>
>> Hence the move to put sockmap-based redirect behind a config option,
>> which you can enable at your own risk. Or which we can deprecate, but
>> that's not really my call.

Hi Alexei and Jakub,

skmsg is actually still pretty useful for gateways. I started with bpf
by integrating skmsg into nginx as a module, and envoy has something
similar. The usual setup is cgroup/sk for L4 bypass (reject SYN), and
skmsg for L7, redirecting between local apps by looking at the payload.
So there are real users.

> This is wishful thinking that a config knob will stop
> the bug reports.
> Just disable it for real instead.

About the AI bug reports: yeah, I've seen them too. I think it just
comes from the complexity of networking plus how programmable bpf is.
Reviewing AI-written patches is often painful: the commit message is
frequently wrong, and once it took me a whole day just to reproduce and
confirm the issue. But I do believe these reports will converge
eventually.

>>> I am also not sure if NET_SOCK_MSG is right. It is broader than
>>> "sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
>>> Because those select it, it can't be toggled independently.
>> Once the sockmap redirect bits are behind _some_ config option, it will
>> be easy to replace it with a more granular one that depends on
>> NET_SOCK_MSG. But we're not there yet. One step at a time.
> No. That's not workable.
>
>>> Could you share the concrete use case you have in mind, and whether
>>> this came out of an earlier discussion or thread upstream?
>> This is a follow-up from discussions at BPF summit with Alexei & John.
> Not quite. The discussion was to disable pieces of sockmap
> that are causing trouble.
> Not to move them under config knobs, but disable them.

Agreed, just like removing skmsg from KTLS, which is rarely used.

I think the motivation of this patch, making the boundary between skmsg
and sockmap clear, is worthwhile.

I hope skmsg will not be disabled by default. I don't work on that
upper-layer software anymore, but I really don't want my ex-colleagues
to upgrade their kernel some day, find the feature I wrote broken, and
come curse me :) (selfish)
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 19:31 ` Kuniyuki Iwashima
  2026-06-23 19:33 ` Alexei Starovoitov
@ 2026-06-23 20:09 ` Jakub Sitnicki
  2026-06-23 20:14 ` Kuniyuki Iwashima
  1 sibling, 1 reply; 14+ messages in thread
From: Jakub Sitnicki @ 2026-06-23 20:09 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski,
	Jiayuan Chen, John Fastabend, netdev, kernel-team

On Tue, Jun 23, 2026 at 12:31 PM -07, Kuniyuki Iwashima wrote:
> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> Okay, will put the whole file behind hidden config option like so:
>>
>> --- a/net/unix/Kconfig
>> +++ b/net/unix/Kconfig
>> @@ -30,3 +30,8 @@ config UNIX_DIAG
>>         help
>>           Support for UNIX socket monitoring interface used by the ss tool.
>>           If unsure, say Y.
>> +
>> +config UNIX_BPF
>
> Maybe UNIX_BPF_SOCKMAP or something.
> bpf_iter is supported without this config.

Not sure what you have in mind re bpf_iter. Can you share more?
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:09 ` Jakub Sitnicki
@ 2026-06-23 20:14 ` Kuniyuki Iwashima
  0 siblings, 0 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-23 20:14 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski,
	Jiayuan Chen, John Fastabend, netdev, kernel-team

On Tue, Jun 23, 2026 at 1:09 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 12:31 PM -07, Kuniyuki Iwashima wrote:
> > On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> Okay, will put the whole file behind hidden config option like so:
> >>
> >> --- a/net/unix/Kconfig
> >> +++ b/net/unix/Kconfig
> >> @@ -30,3 +30,8 @@ config UNIX_DIAG
> >>         help
> >>           Support for UNIX socket monitoring interface used by the ss tool.
> >>           If unsure, say Y.
> >> +
> >> +config UNIX_BPF
> >
> > Maybe UNIX_BPF_SOCKMAP or something.
> > bpf_iter is supported without this config.
>
> Not sure what you have in mind re bpf_iter. Can you share more?

I meant UNIX_BPF sounds like it covers bpf iterator for AF_UNIX too.
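The receive-path dispatch debated in this thread can be modeled in plain userspace C. What follows is a minimal sketch, not kernel code: `MODEL_NET_SOCK_MSG`, `enum path`, `model_recvmsg()` and `model_path()` are hypothetical names invented for illustration, and the `struct proto` / `struct sock` definitions are stripped-down stand-ins for the real kernel structures. It assumes, as the v2 changelog implies, that the base AF_UNIX proto leaves `recvmsg` NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_NET_SOCK_MSG; build with
 * -DMODEL_NET_SOCK_MSG=0 to model the option being disabled. */
#ifndef MODEL_NET_SOCK_MSG
#define MODEL_NET_SOCK_MSG 1
#endif

/* Which receive path handled the call (illustrative only). */
enum path { PATH_GENERIC, PATH_BPF };

struct sock;

/* Simplified model of struct proto: only the recvmsg hook. */
struct proto {
	enum path (*recvmsg)(struct sock *sk);
};

/* Simplified model of struct sock: only the proto pointer. */
struct sock {
	const struct proto *sk_prot;
};

/* Assumption: the base proto has no recvmsg hook. */
static const struct proto base_proto = { .recvmsg = NULL };

#if MODEL_NET_SOCK_MSG
/* Models unix_bpf_recvmsg(), present only when sk_msg is built in. */
static enum path bpf_recvmsg(struct sock *sk)
{
	(void)sk;
	return PATH_BPF;
}
#endif

/* Models unix_*_bpf_rebuild_protos(): copy the base proto, then
 * override recvmsg only when the sk_msg code is compiled in. */
static void rebuild_proto(struct proto *prot, const struct proto *base)
{
	assert(prot && base);
	*prot = *base;
#if MODEL_NET_SOCK_MSG
	prot->recvmsg = bpf_recvmsg;
#endif
}

/* Models the generic AF_UNIX receive path. */
static enum path generic_recvmsg(struct sock *sk)
{
	(void)sk;
	return PATH_GENERIC;
}

/* Models the v2 check: dispatch through the hook only if it is set. */
static enum path model_recvmsg(struct sock *sk)
{
	const struct proto *prot = sk->sk_prot;

	if (prot->recvmsg)
		return prot->recvmsg(sk);
	return generic_recvmsg(sk);
}

/* Returns the path taken by a socket that is, or is not, in a sockmap. */
enum path model_path(int in_sockmap)
{
	struct proto bpf_proto;
	struct sock sk = { .sk_prot = &base_proto };

	if (in_sockmap) {
		rebuild_proto(&bpf_proto, &base_proto);
		sk.sk_prot = &bpf_proto;
	}
	return model_recvmsg(&sk);
}
```

In this model, building with `-DMODEL_NET_SOCK_MSG=0` leaves the rebuilt proto's `recvmsg` NULL, so `model_path(1)` falls through to the generic path. That is the situation Kuniyuki describes as a dead branch: the `prot->recvmsg` test is still compiled in but can never be taken.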
end of thread, other threads:[~2026-06-24 1:32 UTC | newest]

Thread overview: 14+ messages
2026-06-23 11:20 [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG Jakub Sitnicki
2026-06-23 16:08 ` Kuniyuki Iwashima
2026-06-23 19:21 ` Jakub Sitnicki
2026-06-23 19:31 ` Kuniyuki Iwashima
2026-06-23 19:33 ` Alexei Starovoitov
2026-06-23 20:03 ` Jakub Sitnicki
2026-06-23 20:13 ` Kuniyuki Iwashima
2026-06-23 20:22 ` Amery Hung
2026-06-23 20:36 ` Jakub Sitnicki
2026-06-23 20:44 ` Amery Hung
2026-06-23 21:26 ` Alexei Starovoitov
2026-06-24 1:32 ` Jiayuan Chen
2026-06-23 20:09 ` Jakub Sitnicki
2026-06-23 20:14 ` Kuniyuki Iwashima