Netdev List
* [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
@ 2026-06-23 11:20 Jakub Sitnicki
  2026-06-23 16:08 ` Kuniyuki Iwashima
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Sitnicki @ 2026-06-23 11:20 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, Jiayuan Chen,
	John Fastabend, Kuniyuki Iwashima, netdev, kernel-team

Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
completed all code paths related to sockmap-based redirects should be
guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
disabling NET_SOCK_MSG. The implementation of sockmap as a container for
socket references would remain under BPF_SYSCALL.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
Changes in v2:
- Handle prot->recvmsg being NULL (Sashiko)
- Elaborate on the end goal in description
- Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
---
 net/unix/af_unix.c  | 4 ++--
 net/unix/unix_bpf.c | 6 ++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f7a9d55eee8a..84c11c60c75f 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
 #ifdef CONFIG_BPF_SYSCALL
 	const struct proto *prot = READ_ONCE(sk->sk_prot);
 
-	if (prot != &unix_dgram_proto)
+	if (prot->recvmsg)
 		return prot->recvmsg(sk, msg, size, flags);
 #endif
 	return __unix_dgram_recvmsg(sk, msg, size, flags);
@@ -3152,7 +3152,7 @@ static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
 	struct sock *sk = sock->sk;
 	const struct proto *prot = READ_ONCE(sk->sk_prot);
 
-	if (prot != &unix_stream_proto)
+	if (prot->recvmsg)
 		return prot->recvmsg(sk, msg, size, flags);
 #endif
 	return unix_stream_read_generic(&state, true);
diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
index f86ff19e9764..5289a04b4993 100644
--- a/net/unix/unix_bpf.c
+++ b/net/unix/unix_bpf.c
@@ -7,6 +7,7 @@
 
 #include "af_unix.h"
 
+#ifdef CONFIG_NET_SOCK_MSG
 #define unix_sk_has_data(__sk, __psock)					\
 		({	!skb_queue_empty(&__sk->sk_receive_queue) ||	\
 			!skb_queue_empty(&__psock->ingress_skb) ||	\
@@ -94,6 +95,7 @@ static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
 	sk_psock_put(sk, psock);
 	return copied;
 }
+#endif /* CONFIG_NET_SOCK_MSG */
 
 static struct proto *unix_dgram_prot_saved __read_mostly;
 static DEFINE_SPINLOCK(unix_dgram_prot_lock);
@@ -107,8 +109,10 @@ static void unix_dgram_bpf_rebuild_protos(struct proto *prot, const struct proto
 {
 	*prot        = *base;
 	prot->close  = sock_map_close;
+#ifdef CONFIG_NET_SOCK_MSG
 	prot->recvmsg = unix_bpf_recvmsg;
 	prot->sock_is_readable = sk_msg_is_readable;
+#endif
 }
 
 static void unix_stream_bpf_rebuild_protos(struct proto *prot,
@@ -116,8 +120,10 @@ static void unix_stream_bpf_rebuild_protos(struct proto *prot,
 {
 	*prot        = *base;
 	prot->close  = sock_map_close;
+#ifdef CONFIG_NET_SOCK_MSG
 	prot->recvmsg = unix_bpf_recvmsg;
 	prot->sock_is_readable = sk_msg_is_readable;
+#endif
 	prot->unhash  = sock_map_unhash;
 }
 




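For readers following the v2 change, here is a minimal userspace sketch of why the call sites test prot->recvmsg rather than comparing proto pointers. It is an illustration under stated assumptions, not the kernel code: every model_ name and the MODEL_NET_SOCK_MSG macro are hypothetical stand-ins, and the base proto is assumed to carry no recvmsg callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct proto. */
struct model_proto {
	int (*recvmsg)(int size);
	void (*close)(void);
};

/* Models CONFIG_NET_SOCK_MSG; deliberately left undefined in this sketch. */
/* #define MODEL_NET_SOCK_MSG 1 */

static void model_sock_map_close(void) { }

/* Stand-in for the generic receive path. */
static int model_generic_recvmsg(int size) { return size; }

#ifdef MODEL_NET_SOCK_MSG
/* Stand-in for the sk_msg-aware receive path. */
static int model_bpf_recvmsg(int size) { return -size; }
#endif

/* Assumption: the base proto has no recvmsg callback. */
static const struct model_proto model_base_proto = { .recvmsg = NULL, .close = NULL };

/* Mirrors the shape of the rebuild helpers in the patch: copy, then override. */
void model_rebuild_proto(struct model_proto *prot, const struct model_proto *base)
{
	assert(prot != NULL && base != NULL);
	*prot = *base;
	prot->close = model_sock_map_close;
#ifdef MODEL_NET_SOCK_MSG
	prot->recvmsg = model_bpf_recvmsg;
#endif
}

/*
 * Dispatch on the callback itself. In this model a pointer comparison
 * against the base proto would take the branch for the rebuilt copy and
 * call through a NULL pointer when MODEL_NET_SOCK_MSG is undefined.
 */
int model_recvmsg(const struct model_proto *prot, int size)
{
	if (prot->recvmsg)
		return prot->recvmsg(size);
	return model_generic_recvmsg(size);
}

/* Rebuild a proto from the base and receive through it. */
int model_demo(int size)
{
	struct model_proto rebuilt;

	model_rebuild_proto(&rebuilt, &model_base_proto);
	return model_recvmsg(&rebuilt, size);
}
```

With MODEL_NET_SOCK_MSG undefined, model_demo() falls through to the generic path because the rebuilt copy inherits a NULL recvmsg from the base.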

* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 11:20 [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG Jakub Sitnicki
@ 2026-06-23 16:08 ` Kuniyuki Iwashima
  2026-06-23 19:21   ` Jakub Sitnicki
  0 siblings, 1 reply; 13+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-23 16:08 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski,
	Jiayuan Chen, John Fastabend, netdev, kernel-team

On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> completed all code paths related to sockmap-based redirects should be
> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> socket references would remain under BPF_SYSCALL.
>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> Changes in v2:
> - Handle prot->recvmsg being NULL (Sashiko)
> - Elaborate on the end goal in description
> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> ---
>  net/unix/af_unix.c  | 4 ++--
>  net/unix/unix_bpf.c | 6 ++++++
>  2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index f7a9d55eee8a..84c11c60c75f 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
>  #ifdef CONFIG_BPF_SYSCALL
>         const struct proto *prot = READ_ONCE(sk->sk_prot);
>
> -       if (prot != &unix_dgram_proto)
> +       if (prot->recvmsg)

There is no reason to have this dead branch when
CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.

Let's compile out all sockmap code when both configs
are not enabled.

Since AF_UNIX differs from TCP/UDP, it can take the
simpler approach.


>                 return prot->recvmsg(sk, msg, size, flags);
>  #endif
>         return __unix_dgram_recvmsg(sk, msg, size, flags);
> @@ -3152,7 +3152,7 @@ static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
>         struct sock *sk = sock->sk;
>         const struct proto *prot = READ_ONCE(sk->sk_prot);
>
> -       if (prot != &unix_stream_proto)
> +       if (prot->recvmsg)
>                 return prot->recvmsg(sk, msg, size, flags);
>  #endif
>         return unix_stream_read_generic(&state, true);
> diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> index f86ff19e9764..5289a04b4993 100644
> --- a/net/unix/unix_bpf.c
> +++ b/net/unix/unix_bpf.c
> @@ -7,6 +7,7 @@
>
>  #include "af_unix.h"
>
> +#ifdef CONFIG_NET_SOCK_MSG
>  #define unix_sk_has_data(__sk, __psock)                                        \
>                 ({      !skb_queue_empty(&__sk->sk_receive_queue) ||    \
>                         !skb_queue_empty(&__psock->ingress_skb) ||      \
> @@ -94,6 +95,7 @@ static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
>         sk_psock_put(sk, psock);
>         return copied;
>  }
> +#endif /* CONFIG_NET_SOCK_MSG */
>
>  static struct proto *unix_dgram_prot_saved __read_mostly;
>  static DEFINE_SPINLOCK(unix_dgram_prot_lock);
> @@ -107,8 +109,10 @@ static void unix_dgram_bpf_rebuild_protos(struct proto *prot, const struct proto
>  {
>         *prot        = *base;
>         prot->close  = sock_map_close;
> +#ifdef CONFIG_NET_SOCK_MSG
>         prot->recvmsg = unix_bpf_recvmsg;
>         prot->sock_is_readable = sk_msg_is_readable;
> +#endif
>  }
>
>  static void unix_stream_bpf_rebuild_protos(struct proto *prot,
> @@ -116,8 +120,10 @@ static void unix_stream_bpf_rebuild_protos(struct proto *prot,
>  {
>         *prot        = *base;
>         prot->close  = sock_map_close;
> +#ifdef CONFIG_NET_SOCK_MSG
>         prot->recvmsg = unix_bpf_recvmsg;
>         prot->sock_is_readable = sk_msg_is_readable;
> +#endif
>         prot->unhash  = sock_map_unhash;
>  }
>
>
>
>


* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 16:08 ` Kuniyuki Iwashima
@ 2026-06-23 19:21   ` Jakub Sitnicki
  2026-06-23 19:31     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Sitnicki @ 2026-06-23 19:21 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski,
	Jiayuan Chen, John Fastabend, netdev, kernel-team

On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>
>> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
>> completed all code paths related to sockmap-based redirects should be
>> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
>> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
>> socket references would remain under BPF_SYSCALL.
>>
>> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> ---
>> Changes in v2:
>> - Handle prot->recvmsg being NULL (Sashiko)
>> - Elaborate on the end goal in description
>> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
>> ---
>>  net/unix/af_unix.c  | 4 ++--
>>  net/unix/unix_bpf.c | 6 ++++++
>>  2 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>> index f7a9d55eee8a..84c11c60c75f 100644
>> --- a/net/unix/af_unix.c
>> +++ b/net/unix/af_unix.c
>> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
>>  #ifdef CONFIG_BPF_SYSCALL
>>         const struct proto *prot = READ_ONCE(sk->sk_prot);
>>
>> -       if (prot != &unix_dgram_proto)
>> +       if (prot->recvmsg)
>
> There is no reason to have this dead branch when
> CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
>
> Let's compile out all sockmap code when both configs
> are not enabled.
>
> Since AF_UNIX differs from TCP/UDP, it can take the
> simpler approach.

Okay, will put the whole file behind hidden config option like so:

--- a/net/unix/Kconfig
+++ b/net/unix/Kconfig
@@ -30,3 +30,8 @@ config UNIX_DIAG
        help
          Support for UNIX socket monitoring interface used by the ss tool.
          If unsure, say Y.
+
+config UNIX_BPF
+       bool
+       depends on UNIX
+       default y if BPF_SYSCALL && NET_SOCK_MSG


* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 19:21   ` Jakub Sitnicki
@ 2026-06-23 19:31     ` Kuniyuki Iwashima
  2026-06-23 19:33       ` Alexei Starovoitov
  2026-06-23 20:09       ` Jakub Sitnicki
  0 siblings, 2 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-23 19:31 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski,
	Jiayuan Chen, John Fastabend, netdev, kernel-team

On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >>
> >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> >> completed all code paths related to sockmap-based redirects should be
> >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> >> socket references would remain under BPF_SYSCALL.
> >>
> >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> ---
> >> Changes in v2:
> >> - Handle prot->recvmsg being NULL (Sashiko)
> >> - Elaborate on the end goal in description
> >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> ---
> >>  net/unix/af_unix.c  | 4 ++--
> >>  net/unix/unix_bpf.c | 6 ++++++
> >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> index f7a9d55eee8a..84c11c60c75f 100644
> >> --- a/net/unix/af_unix.c
> >> +++ b/net/unix/af_unix.c
> >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >>  #ifdef CONFIG_BPF_SYSCALL
> >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
> >>
> >> -       if (prot != &unix_dgram_proto)
> >> +       if (prot->recvmsg)
> >
> > There is no reason to have this dead branch when
> > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >
> > Let's compile out all sockmap code when both configs
> > are not enabled.
> >
> > Since AF_UNIX differs from TCP/UDP, it can take the
> > simpler approach.
>
> Okay, will put the whole file behind hidden config option like so:
>
> --- a/net/unix/Kconfig
> +++ b/net/unix/Kconfig
> @@ -30,3 +30,8 @@ config UNIX_DIAG
>         help
>           Support for UNIX socket monitoring interface used by the ss tool.
>           If unsure, say Y.
> +
> +config UNIX_BPF

Maybe UNIX_BPF_SOCKMAP or something.
bpf_iter is supported without this config.

> +       bool
> +       depends on UNIX
> +       default y if BPF_SYSCALL && NET_SOCK_MSG


* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 19:31     ` Kuniyuki Iwashima
@ 2026-06-23 19:33       ` Alexei Starovoitov
  2026-06-23 20:03         ` Jakub Sitnicki
  2026-06-23 20:09       ` Jakub Sitnicki
  1 sibling, 1 reply; 13+ messages in thread
From: Alexei Starovoitov @ 2026-06-23 19:33 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: Jakub Sitnicki, bpf, Alexei Starovoitov, Daniel Borkmann,
	Jakub Kicinski, Jiayuan Chen, John Fastabend, Network Development,
	kernel-team

On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>
> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >
> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> > >>
> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> > >> completed all code paths related to sockmap-based redirects should be
> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> > >> socket references would remain under BPF_SYSCALL.
> > >>
> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> > >> ---
> > >> Changes in v2:
> > >> - Handle prot->recvmsg being NULL (Sashiko)
> > >> - Elaborate on the end goal in description
> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> > >> ---
> > >>  net/unix/af_unix.c  | 4 ++--
> > >>  net/unix/unix_bpf.c | 6 ++++++
> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> > >>
> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> > >> index f7a9d55eee8a..84c11c60c75f 100644
> > >> --- a/net/unix/af_unix.c
> > >> +++ b/net/unix/af_unix.c
> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> > >>  #ifdef CONFIG_BPF_SYSCALL
> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
> > >>
> > >> -       if (prot != &unix_dgram_proto)
> > >> +       if (prot->recvmsg)
> > >
> > > There is no reason to have this dead branch when
> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> > >
> > > Let's compile out all sockmap code when both configs
> > > are not enabled.
> > >
> > > Since AF_UNIX differs from TCP/UDP, it can take the
> > > simpler approach.
> >
> > Okay, will put the whole file behind hidden config option like so:
> >
> > --- a/net/unix/Kconfig
> > +++ b/net/unix/Kconfig
> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >         help
> >           Support for UNIX socket monitoring interface used by the ss tool.
> >           If unsure, say Y.
> > +
> > +config UNIX_BPF
>
> Maybe UNIX_BPF_SOCKMAP or something.
> bpf_iter is supported without this config.

I don't like where it's going.
I strongly dislike new config knobs.
I'd rather remove existing knobs.
What is the motivation?


* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 19:33       ` Alexei Starovoitov
@ 2026-06-23 20:03         ` Jakub Sitnicki
  2026-06-23 20:13           ` Kuniyuki Iwashima
  2026-06-23 20:22           ` Amery Hung
  0 siblings, 2 replies; 13+ messages in thread
From: Jakub Sitnicki @ 2026-06-23 20:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Kuniyuki Iwashima, bpf, Alexei Starovoitov, Daniel Borkmann,
	Jakub Kicinski, Jiayuan Chen, John Fastabend, Network Development,
	kernel-team

On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
> On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>>
>> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> >
>> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
>> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> > >>
>> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
>> > >> completed all code paths related to sockmap-based redirects should be
>> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
>> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
>> > >> socket references would remain under BPF_SYSCALL.
>> > >>
>> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> > >> ---
>> > >> Changes in v2:
>> > >> - Handle prot->recvmsg being NULL (Sashiko)
>> > >> - Elaborate on the end goal in description
>> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
>> > >> ---
>> > >>  net/unix/af_unix.c  | 4 ++--
>> > >>  net/unix/unix_bpf.c | 6 ++++++
>> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
>> > >>
>> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>> > >> index f7a9d55eee8a..84c11c60c75f 100644
>> > >> --- a/net/unix/af_unix.c
>> > >> +++ b/net/unix/af_unix.c
>> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
>> > >>  #ifdef CONFIG_BPF_SYSCALL
>> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
>> > >>
>> > >> -       if (prot != &unix_dgram_proto)
>> > >> +       if (prot->recvmsg)
>> > >
>> > > There is no reason to have this dead branch when
>> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
>> > >
>> > > Let's compile out all sockmap code when both configs
>> > > are not enabled.
>> > >
>> > > Since AF_UNIX differs from TCP/UDP, it can take the
>> > > simpler approach.
>> >
>> > Okay, will put the whole file behind hidden config option like so:
>> >
>> > --- a/net/unix/Kconfig
>> > +++ b/net/unix/Kconfig
>> > @@ -30,3 +30,8 @@ config UNIX_DIAG
>> >         help
>> >           Support for UNIX socket monitoring interface used by the ss tool.
>> >           If unsure, say Y.
>> > +
>> > +config UNIX_BPF
>>
>> Maybe UNIX_BPF_SOCKMAP or something.
>> bpf_iter is supported without this config.
>
> I don't like where it's going.
> I strongly dislike new config knobs.
> I'd rather remove existing knobs.
> What is the motivation?

The goal is to compile out sockmap bits that use sk_msg.
NET_SOCK_MSG is the natural, existing candidate.
New knob wasn't my idea.

Alternatively, we can do this to avoid the extra knob:

ifdef CONFIG_BPF_SYSCALL
unix-$(CONFIG_NET_SOCK_MSG) += unix_bpf.o
endif
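
On the C side, the Makefile approach implies that the call sites can only take the sockmap branch when both options are set. A rough userspace sketch of that combined guard follows. It is an assumption about how the call sites might look, not something posted in this thread; the SKETCH_ macros and sketch_ functions are hypothetical names that model the two config options.

```c
#include <assert.h>
#include <stddef.h>

/* Models BPF_SYSCALL=y with NET_SOCK_MSG=n; both names are hypothetical. */
#define SKETCH_BPF_SYSCALL 1
/* #define SKETCH_NET_SOCK_MSG 1 */

/* The sockmap receive hook exists only when both options are on. */
#if defined(SKETCH_BPF_SYSCALL) && defined(SKETCH_NET_SOCK_MSG)
#define SKETCH_UNIX_SOCKMAP 1
#else
#define SKETCH_UNIX_SOCKMAP 0
#endif

/* Hypothetical, simplified stand-in for the kernel's struct proto. */
struct sketch_proto {
	int (*recvmsg)(int size);
};

/* Stand-in for the generic receive path. */
int sketch_generic_recvmsg(int size)
{
	assert(size >= 0);
	return size;
}

/* The branch is compiled out entirely unless both options are enabled. */
int sketch_unix_recvmsg(const struct sketch_proto *prot, int size)
{
#if SKETCH_UNIX_SOCKMAP
	if (prot && prot->recvmsg)
		return prot->recvmsg(size);
#else
	(void)prot; /* unused when the sockmap hook is compiled out */
#endif
	return sketch_generic_recvmsg(size);
}

/* Reports whether the sockmap branch was compiled in. */
int sketch_sockmap_compiled_in(void)
{
	return SKETCH_UNIX_SOCKMAP;
}
```

In this configuration no branch on the proto remains in the receive path, which is the property asked for in review.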


* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 19:31     ` Kuniyuki Iwashima
  2026-06-23 19:33       ` Alexei Starovoitov
@ 2026-06-23 20:09       ` Jakub Sitnicki
  2026-06-23 20:14         ` Kuniyuki Iwashima
  1 sibling, 1 reply; 13+ messages in thread
From: Jakub Sitnicki @ 2026-06-23 20:09 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski,
	Jiayuan Chen, John Fastabend, netdev, kernel-team

On Tue, Jun 23, 2026 at 12:31 PM -07, Kuniyuki Iwashima wrote:
> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> Okay, will put the whole file behind hidden config option like so:
>>
>> --- a/net/unix/Kconfig
>> +++ b/net/unix/Kconfig
>> @@ -30,3 +30,8 @@ config UNIX_DIAG
>>         help
>>           Support for UNIX socket monitoring interface used by the ss tool.
>>           If unsure, say Y.
>> +
>> +config UNIX_BPF
>
> Maybe UNIX_BPF_SOCKMAP or something.
> bpf_iter is supported without this config.

Not sure what you have in mind re bpf_iter. Can you share more?



* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:03         ` Jakub Sitnicki
@ 2026-06-23 20:13           ` Kuniyuki Iwashima
  2026-06-23 20:22           ` Amery Hung
  1 sibling, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-23 20:13 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Daniel Borkmann,
	Jakub Kicinski, Jiayuan Chen, John Fastabend, Network Development,
	kernel-team

On Tue, Jun 23, 2026 at 1:03 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
> > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
> >>
> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >
> >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> > >>
> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> >> > >> completed all code paths related to sockmap-based redirects should be
> >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> >> > >> socket references would remain under BPF_SYSCALL.
> >> > >>
> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> > >> ---
> >> > >> Changes in v2:
> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
> >> > >> - Elaborate on the end goal in description
> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> > >> ---
> >> > >>  net/unix/af_unix.c  | 4 ++--
> >> > >>  net/unix/unix_bpf.c | 6 ++++++
> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >> > >>
> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
> >> > >> --- a/net/unix/af_unix.c
> >> > >> +++ b/net/unix/af_unix.c
> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >> > >>  #ifdef CONFIG_BPF_SYSCALL
> >> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
> >> > >>
> >> > >> -       if (prot != &unix_dgram_proto)
> >> > >> +       if (prot->recvmsg)
> >> > >
> >> > > There is no reason to have this dead branch when
> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >> > >
> >> > > Let's compile out all sockmap code when both configs
> >> > > are not enabled.
> >> > >
> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
> >> > > simpler approach.
> >> >
> >> > Okay, will put the whole file behind hidden config option like so:
> >> >
> >> > --- a/net/unix/Kconfig
> >> > +++ b/net/unix/Kconfig
> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >> >         help
> >> >           Support for UNIX socket monitoring interface used by the ss tool.
> >> >           If unsure, say Y.
> >> > +
> >> > +config UNIX_BPF
> >>
> >> Maybe UNIX_BPF_SOCKMAP or something.
> >> bpf_iter is supported without this config.
> >
> > I don't like where it's going.
> > I strongly dislike new config knobs.
> > I'd rather remove existing knobs.
> > What is the motivation?
>
> The goal is to compile out sockmap bits that use sk_msg.
> NET_SOCK_MSG is the natural, existing candidate.
> New knob wasn't my idea.

I think config w/o description is okay since it's not selectable.

>
> Alternatively, we can do this to avoid the extra knob:
>
> ifdef CONFIG_BPF_SYSCALL
> unix-$(CONFIG_NET_SOCK_MSG) += unix_bpf.o
> endif

This is far better, I forgot ifdef is available.


* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:09       ` Jakub Sitnicki
@ 2026-06-23 20:14         ` Kuniyuki Iwashima
  0 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-23 20:14 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski,
	Jiayuan Chen, John Fastabend, netdev, kernel-team

On Tue, Jun 23, 2026 at 1:09 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 12:31 PM -07, Kuniyuki Iwashima wrote:
> > On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> Okay, will put the whole file behind hidden config option like so:
> >>
> >> --- a/net/unix/Kconfig
> >> +++ b/net/unix/Kconfig
> >> @@ -30,3 +30,8 @@ config UNIX_DIAG
> >>         help
> >>           Support for UNIX socket monitoring interface used by the ss tool.
> >>           If unsure, say Y.
> >> +
> >> +config UNIX_BPF
> >
> > Maybe UNIX_BPF_SOCKMAP or something.
> > bpf_iter is supported without this config.
>
> Not sure what you have in mind re bpf_iter. Can you share more?

I meant UNIX_BPF sounds like it covers bpf iterator for AF_UNIX too.


* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:03         ` Jakub Sitnicki
  2026-06-23 20:13           ` Kuniyuki Iwashima
@ 2026-06-23 20:22           ` Amery Hung
  2026-06-23 20:36             ` Jakub Sitnicki
  1 sibling, 1 reply; 13+ messages in thread
From: Amery Hung @ 2026-06-23 20:22 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, Kuniyuki Iwashima, bpf, Alexei Starovoitov,
	Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend,
	Network Development, kernel-team

On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
> > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
> >>
> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >
> >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> > >>
> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> >> > >> completed all code paths related to sockmap-based redirects should be
> >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> >> > >> socket references would remain under BPF_SYSCALL.
> >> > >>
> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> > >> ---
> >> > >> Changes in v2:
> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
> >> > >> - Elaborate on the end goal in description
> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> > >> ---
> >> > >>  net/unix/af_unix.c  | 4 ++--
> >> > >>  net/unix/unix_bpf.c | 6 ++++++
> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >> > >>
> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
> >> > >> --- a/net/unix/af_unix.c
> >> > >> +++ b/net/unix/af_unix.c
> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >> > >>  #ifdef CONFIG_BPF_SYSCALL
> >> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
> >> > >>
> >> > >> -       if (prot != &unix_dgram_proto)
> >> > >> +       if (prot->recvmsg)
> >> > >
> >> > > There is no reason to have this dead branch when
> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >> > >
> >> > > Let's compile out all sockmap code when both configs
> >> > > are not enabled.
> >> > >
> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
> >> > > simpler approach.
> >> >
> >> > Okay, will put the whole file behind hidden config option like so:
> >> >
> >> > --- a/net/unix/Kconfig
> >> > +++ b/net/unix/Kconfig
> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >> >         help
> >> >           Support for UNIX socket monitoring interface used by the ss tool.
> >> >           If unsure, say Y.
> >> > +
> >> > +config UNIX_BPF
> >>
> >> Maybe UNIX_BPF_SOCKMAP or something.
> >> bpf_iter is supported without this config.
> >
> > I don't like where it's going.
> > I strongly dislike new config knobs.
> > I'd rather remove existing knobs.
> > What is the motivation?
>
> The goal is to compile out sockmap bits that use sk_msg.
> NET_SOCK_MSG is the natural, existing candidate.
> New knob wasn't my idea.

I'm also missing the big picture here.

sockmap already holds socket references today. You can store and look
up sockets without attaching any verdict/parser program, and no
redirect happens. So if the goal is to use sockmap purely as a socket
container without the sk_msg fast-path overhead, what does a
compile-time NET_SOCK_MSG knob add over the runtime checks?

I am also not sure if NET_SOCK_MSG is right. It is broader than
"sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
Because those select it, it can't be toggled independently.

Could you share the concrete use case you have in mind, and whether
this came out of an earlier discussion or thread upstream?

>
> Alternatively, we can do this to avoid the extra knob:
>
> ifdef CONFIG_BPF_SYSCALL
> unix-$(CONFIG_NET_SOCK_MSG) += unix_bpf.o
> endif
>


* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:22           ` Amery Hung
@ 2026-06-23 20:36             ` Jakub Sitnicki
  2026-06-23 20:44               ` Amery Hung
  2026-06-23 21:26               ` Alexei Starovoitov
  0 siblings, 2 replies; 13+ messages in thread
From: Jakub Sitnicki @ 2026-06-23 20:36 UTC (permalink / raw)
  To: Amery Hung
  Cc: Alexei Starovoitov, Kuniyuki Iwashima, bpf, Alexei Starovoitov,
	Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend,
	Network Development, kernel-team

On Tue, Jun 23, 2026 at 01:22 PM -07, Amery Hung wrote:
> On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>
>> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
>> > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>> >>
>> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> >> >
>> >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
>> >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> >> > >>
>> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
>> >> > >> completed all code paths related to sockmap-based redirects should be
>> >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
>> >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
>> >> > >> socket references would remain under BPF_SYSCALL.
>> >> > >>
>> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> >> > >> ---
>> >> > >> Changes in v2:
>> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
>> >> > >> - Elaborate on the end goal in description
>> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
>> >> > >> ---
>> >> > >>  net/unix/af_unix.c  | 4 ++--
>> >> > >>  net/unix/unix_bpf.c | 6 ++++++
>> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
>> >> > >>
>> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
>> >> > >> --- a/net/unix/af_unix.c
>> >> > >> +++ b/net/unix/af_unix.c
>> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
>> >> > >>  #ifdef CONFIG_BPF_SYSCALL
>> >> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
>> >> > >>
>> >> > >> -       if (prot != &unix_dgram_proto)
>> >> > >> +       if (prot->recvmsg)
>> >> > >
>> >> > > There is no reason to have this dead branch when
>> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
>> >> > >
>> >> > > Let's compile out all sockmap code when both configs
>> >> > > are not enabled.
>> >> > >
>> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
>> >> > > simpler approach.
>> >> >
>> >> > Okay, will put the whole file behind hidden config option like so:
>> >> >
>> >> > --- a/net/unix/Kconfig
>> >> > +++ b/net/unix/Kconfig
>> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
>> >> >         help
>> >> >           Support for UNIX socket monitoring interface used by the ss tool.
>> >> >           If unsure, say Y.
>> >> > +
>> >> > +config UNIX_BPF
>> >>
>> >> Maybe UNIX_BPF_SOCKMAP or something.
>> >> bpf_iter is supported without this config.
>> >
>> > I don't like where it's going.
>> > I strongly dislike new config knobs.
>> > I'd rather remove existing knobs.
>> > What is the motivation?
>>
>> The goal is to compile out sockmap bits that use sk_msg.
>> NET_SOCK_MSG is natural, existing candidate.
>> New knob wasn't my idea.
>
> I'm also missing the big picture here.
>
> sockmap already holds socket references today. You can store and look
> up sockets without attaching any verdict/parser program, and no
> redirect happens. So if the goal is to use sockmap purely as a socket
> container without the sk_msg fast-path overhead, what does a
> compile-time NET_SOCK_MSG knob add over the runtime checks?

Sure, let me clarify. It's about the maintenance overhead.

sockmap-based redirects are a rather niche feature with few users, for
which we've been getting quite a few bug reports since AI came along.

We're not using it internally at Cloudflare, so I don't really have a
good reason to justify time spent on these bug reports.

Hence the move to put sockmap-based redirect behind a config option,
which you can enable at your own risk. Or which we can deprecate, but
that's not really my call.

> I am also not sure if NET_SOCK_MSG is right. It is broader than
> "sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
> Because those select it, it can't be toggled independently.

Once the sockmap redirect bits are behind _some_ config option, it will
be easy to replace it with a more granular one that depends on
NET_SOCK_MSG. But we're not there yet. One step at a time.

> Could you share the concrete use case you have in mind, and whether
> this came out of an earlier discussion or thread upstream?

This is a follow-up from discussions at BPF summit with Alexei & John.

* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:36             ` Jakub Sitnicki
@ 2026-06-23 20:44               ` Amery Hung
  2026-06-23 21:26               ` Alexei Starovoitov
  1 sibling, 0 replies; 13+ messages in thread
From: Amery Hung @ 2026-06-23 20:44 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, Kuniyuki Iwashima, bpf, Alexei Starovoitov,
	Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend,
	Network Development, kernel-team

On Tue, Jun 23, 2026 at 1:36 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 01:22 PM -07, Amery Hung wrote:
> > On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >>
> >> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
> >> > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
> >> >>
> >> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >> >
> >> >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> >> >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >> > >>
> >> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> >> >> > >> completed all code paths related to sockmap-based redirects should be
> >> >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> >> >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> >> >> > >> socket references would remain under BPF_SYSCALL.
> >> >> > >>
> >> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> >> > >> ---
> >> >> > >> Changes in v2:
> >> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
> >> >> > >> - Elaborate on the end goal in description
> >> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> >> > >> ---
> >> >> > >>  net/unix/af_unix.c  | 4 ++--
> >> >> > >>  net/unix/unix_bpf.c | 6 ++++++
> >> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >> >> > >>
> >> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
> >> >> > >> --- a/net/unix/af_unix.c
> >> >> > >> +++ b/net/unix/af_unix.c
> >> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >> >> > >>  #ifdef CONFIG_BPF_SYSCALL
> >> >> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
> >> >> > >>
> >> >> > >> -       if (prot != &unix_dgram_proto)
> >> >> > >> +       if (prot->recvmsg)
> >> >> > >
> >> >> > > There is no reason to have this dead branch when
> >> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >> >> > >
> >> >> > > Let's compile out all sockmap code when both configs
> >> >> > > are not enabled.
> >> >> > >
> >> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
> >> >> > > simpler approach.
> >> >> >
> >> >> > Okay, will put the whole file behind hidden config option like so:
> >> >> >
> >> >> > --- a/net/unix/Kconfig
> >> >> > +++ b/net/unix/Kconfig
> >> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >> >> >         help
> >> >> >           Support for UNIX socket monitoring interface used by the ss tool.
> >> >> >           If unsure, say Y.
> >> >> > +
> >> >> > +config UNIX_BPF
> >> >>
> >> >> Maybe UNIX_BPF_SOCKMAP or something.
> >> >> bpf_iter is supported without this config.
> >> >
> >> > I don't like where it's going.
> >> > I strongly dislike new config knobs.
> >> > I'd rather remove existing knobs.
> >> > What is the motivation?
> >>
> >> The goal is to compile out sockmap bits that use sk_msg.
> >> NET_SOCK_MSG is natural, existing candidate.
> >> New knob wasn't my idea.
> >
> > I'm also missing the big picture here.
> >
> > sockmap already holds socket references today. You can store and look
> > up sockets without attaching any verdict/parser program, and no
> > redirect happens. So if the goal is to use sockmap purely as a socket
> > container without the sk_msg fast-path overhead, what does a
> > compile-time NET_SOCK_MSG knob add over the runtime checks?
>
> Sure, let me clarify. It's about the maintenance overhead.
>
> sockmap-based redirects are a rather niche feature with few users, for
> which we've been getting quite a few bug reports since AI came along.
>
> We're not using it internally at Cloudflare, so I don't really have a
> good reason to justify time spent on these bug reports.
>
> Hence the move to put sockmap-based redirect behind a config option,
> which you can enable at your own risk. Or which we can deprecate, but
> that's not really my call.
>
> > I am also not sure if NET_SOCK_MSG is right. It is broader than
> > "sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
> > Because those select it, it can't be toggled independently.
>
> Once the sockmap redirect bits are behind _some_ config option, it will
> be easy to replace it with a more granular one that depends on
> NET_SOCK_MSG. But we're not there yet. One step at a time.
>
> > Could you share the concrete use case you have in mind, and whether
> > this came out of an earlier discussion or thread upstream?
>
> This is a follow-up from discussions at BPF summit with Alexei & John.

I see. Thanks for explaining the motivation.

* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
  2026-06-23 20:36             ` Jakub Sitnicki
  2026-06-23 20:44               ` Amery Hung
@ 2026-06-23 21:26               ` Alexei Starovoitov
  1 sibling, 0 replies; 13+ messages in thread
From: Alexei Starovoitov @ 2026-06-23 21:26 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Amery Hung, Kuniyuki Iwashima, bpf, Alexei Starovoitov,
	Daniel Borkmann, Jakub Kicinski, Jiayuan Chen, John Fastabend,
	Network Development, kernel-team

On Tue, Jun 23, 2026 at 1:36 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jun 23, 2026 at 01:22 PM -07, Amery Hung wrote:
> > On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >>
> >> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
> >> > On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
> >> >>
> >> >> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >> >
> >> >> > On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
> >> >> > > On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> >> > >>
> >> >> > >> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
> >> >> > >> completed all code paths related to sockmap-based redirects should be
> >> >> > >> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
> >> >> > >> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
> >> >> > >> socket references would remain under BPF_SYSCALL.
> >> >> > >>
> >> >> > >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> >> > >> ---
> >> >> > >> Changes in v2:
> >> >> > >> - Handle prot->recvmsg being NULL (Sashiko)
> >> >> > >> - Elaborate on the end goal in description
> >> >> > >> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
> >> >> > >> ---
> >> >> > >>  net/unix/af_unix.c  | 4 ++--
> >> >> > >>  net/unix/unix_bpf.c | 6 ++++++
> >> >> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
> >> >> > >>
> >> >> > >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> >> > >> index f7a9d55eee8a..84c11c60c75f 100644
> >> >> > >> --- a/net/unix/af_unix.c
> >> >> > >> +++ b/net/unix/af_unix.c
> >> >> > >> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
> >> >> > >>  #ifdef CONFIG_BPF_SYSCALL
> >> >> > >>         const struct proto *prot = READ_ONCE(sk->sk_prot);
> >> >> > >>
> >> >> > >> -       if (prot != &unix_dgram_proto)
> >> >> > >> +       if (prot->recvmsg)
> >> >> > >
> >> >> > > There is no reason to have this dead branch when
> >> >> > > CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
> >> >> > >
> >> >> > > Let's compile out all sockmap code when both configs
> >> >> > > are not enabled.
> >> >> > >
> >> >> > > Since AF_UNIX differs from TCP/UDP, it can take the
> >> >> > > simpler approach.
> >> >> >
> >> >> > Okay, will put the whole file behind hidden config option like so:
> >> >> >
> >> >> > --- a/net/unix/Kconfig
> >> >> > +++ b/net/unix/Kconfig
> >> >> > @@ -30,3 +30,8 @@ config UNIX_DIAG
> >> >> >         help
> >> >> >           Support for UNIX socket monitoring interface used by the ss tool.
> >> >> >           If unsure, say Y.
> >> >> > +
> >> >> > +config UNIX_BPF
> >> >>
> >> >> Maybe UNIX_BPF_SOCKMAP or something.
> >> >> bpf_iter is supported without this config.
> >> >
> >> > I don't like where it's going.
> >> > I strongly dislike new config knobs.
> >> > I'd rather remove existing knobs.
> >> > What is the motivation?
> >>
> >> The goal is to compile out sockmap bits that use sk_msg.
> >> NET_SOCK_MSG is natural, existing candidate.
> >> New knob wasn't my idea.
> >
> > I'm also missing the big picture here.
> >
> > sockmap already holds socket references today. You can store and look
> > up sockets without attaching any verdict/parser program, and no
> > redirect happens. So if the goal is to use sockmap purely as a socket
> > container without the sk_msg fast-path overhead, what does a
> > compile-time NET_SOCK_MSG knob add over the runtime checks?
>
> Sure, let me clarify. It's about the maintenance overhead.
>
> sockmap-based redirects are a rather niche feature with few users, for
> which we've been getting quite a few bug reports since AI came along.
>
> We're not using it internally at Cloudflare, so I don't really have a
> good reason to justify time spent on these bug reports.
>
> Hence the move to put sockmap-based redirect behind a config option,
> which you can enable at your own risk. Or which we can deprecate, but
> that's not really my call.

This is wishful thinking that a config knob will stop
the bug reports.
Just disable it for real instead.

> > I am also not sure if NET_SOCK_MSG is right. It is broader than
> > "sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
> > Because those select it, it can't be toggled independently.
>
> Once the sockmap redirect bits are behind _some_ config option, it will
> be easy to replace it with a more granular one that depends on
> NET_SOCK_MSG. But we're not there yet. One step at a time.

No. That's not workable.

> > Could you share the concrete use case you have in mind, and whether
> > this came out of an earlier discussion or thread upstream?
>
> This is a follow-up from discussions at BPF summit with Alexei & John.

Not quite. The discussion was to disable pieces of sockmap
that are causing trouble.
Not to move them under config knobs, but disable them.

Thread overview: 13+ messages
2026-06-23 11:20 [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG Jakub Sitnicki
2026-06-23 16:08 ` Kuniyuki Iwashima
2026-06-23 19:21   ` Jakub Sitnicki
2026-06-23 19:31     ` Kuniyuki Iwashima
2026-06-23 19:33       ` Alexei Starovoitov
2026-06-23 20:03         ` Jakub Sitnicki
2026-06-23 20:13           ` Kuniyuki Iwashima
2026-06-23 20:22           ` Amery Hung
2026-06-23 20:36             ` Jakub Sitnicki
2026-06-23 20:44               ` Amery Hung
2026-06-23 21:26               ` Alexei Starovoitov
2026-06-23 20:09       ` Jakub Sitnicki
2026-06-23 20:14         ` Kuniyuki Iwashima
