From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Maoyi Xie <maoyixie.tju@gmail.com>, netdev@vger.kernel.org
Cc: willemb@google.com, edumazet@google.com, pabeni@redhat.com,
	kuba@kernel.org, davem@davemloft.net, dsahern@kernel.org,
	kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, security@kernel.org
Subject: Re: [PATCH net v3] ipv6: flowlabel: enforce per-netns limit for unprivileged callers
Date: Thu, 30 Apr 2026 09:42:32 -0400
Message-ID: <willemdebruijn.kernel.2674e48b8c343@gmail.com>
In-Reply-To: <20260430081608.3137365-1-maoyixie.tju@gmail.com>
Maoyi Xie wrote:
> From: Maoyi Xie <maoyi.xie@ntu.edu.sg>
>
> fl_size, fl_ht and ip6_fl_lock in net/ipv6/ip6_flowlabel.c are file
> scope and shared across netns. mem_check() reads fl_size to decide
> whether to deny non-CAP_NET_ADMIN callers; capable() runs against
> init_user_ns, so an unprivileged user in any non-init userns can
> push fl_size past FL_MAX_SIZE - FL_MAX_SIZE/4 and starve every
> other unprivileged userns on the host.
>
> Add struct netns_ipv6::flowlabel_count, bumped and decremented next
> to fl_size in fl_intern, ip6_fl_gc and ip6_fl_purge. Place it near
> ipmr_seq rather than next to flowlabel_has_excl: flowlabel_has_excl
> is read on every flowlabel lookup, and a counter written on every
> alloc would dirty its cacheline.
The cacheline concern really applies to fields on the true IPv6 hot
path. Explicit flowlabel management as a whole is not on that path.
Did the new location fill a 4-byte hole (on 64-bit builds)?
>
> mem_check() folds an extra FL_MAX_SIZE/8 ceiling into the existing
> non-CAP_NET_ADMIN conditional.
>
> Bump FL_MAX_SIZE from 4096 to 8192. It has been 4096 since the file
> was added; machines and connection counts have grown. The new
> per-netns ceiling is then 1024 flowlabels, half of FL_MAX_SIZE/4.
>
> CAP_NET_ADMIN against init_user_ns still bypasses both caps.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Suggested-by: Willem de Bruijn <willemb@google.com>
> Cc: stable@vger.kernel.org # v5.15+
> Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
> ---
> v3 (this submission, netdev): addressed Willem's review on the
> private security@ thread:
> - merged the FL_MAX_SIZE doubling into this patch
> - dropped the test data block from the commit body
> - moved flowlabel_count to a 4-byte hole next to ipmr_seq, off
> the flowlabel_has_excl cacheline
> - inlined fl->fl_net in ip6_fl_gc (no local var)
> v2: per-netns counter + cap, sent to security@ as a 2-patch series
> v1: fix-shape sketch in original disclosure
>
> include/net/netns/ipv6.h | 1 +
> net/ipv6/ip6_flowlabel.c | 10 ++++++++--
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
> index 34bdb1308..329482373 100644
> --- a/include/net/netns/ipv6.h
> +++ b/include/net/netns/ipv6.h
> @@ -119,6 +119,7 @@ struct netns_ipv6 {
> struct fib_notifier_ops *notifier_ops;
> struct fib_notifier_ops *ip6mr_notifier_ops;
> unsigned int ipmr_seq; /* protected by rtnl_mutex */
> + atomic_t flowlabel_count;
> struct {
> struct hlist_head head;
> spinlock_t lock;
> diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c
> index c92f98c6f..4a5219356 100644
> --- a/net/ipv6/ip6_flowlabel.c
> +++ b/net/ipv6/ip6_flowlabel.c
> @@ -36,7 +36,7 @@
> /* FL hash table */
>
> #define FL_MAX_PER_SOCK 32
> -#define FL_MAX_SIZE 4096
> +#define FL_MAX_SIZE 8192
> #define FL_HASH_MASK 255
> #define FL_HASH(l) (ntohl(l)&FL_HASH_MASK)
>
> @@ -162,6 +162,7 @@ static void ip6_fl_gc(struct timer_list *unused)
> ttd = fl->expires;
> if (time_after_eq(now, ttd)) {
> *flp = fl->next;
> + atomic_dec(&fl->fl_net->ipv6.flowlabel_count);
> fl_free(fl);
> atomic_dec(&fl_size);
nit: can you place these consistently, immediately after the fl_size
operations, to make clear that they are paired?
> continue;
> @@ -195,6 +196,7 @@ static void __net_exit ip6_fl_purge(struct net *net)
> if (net_eq(fl->fl_net, net) &&
> atomic_read(&fl->users) == 0) {
> *flp = fl->next;
> + atomic_dec(&net->ipv6.flowlabel_count);
> fl_free(fl);
> atomic_dec(&fl_size);
> continue;
> @@ -245,6 +247,7 @@ static struct ip6_flowlabel *fl_intern(struct net *net,
> fl->next = fl_ht[FL_HASH(fl->label)];
> rcu_assign_pointer(fl_ht[FL_HASH(fl->label)], fl);
> atomic_inc(&fl_size);
> + atomic_inc(&net->ipv6.flowlabel_count);
> spin_unlock_bh(&ip6_fl_lock);
> rcu_read_unlock();
> return NULL;
> @@ -464,6 +467,7 @@ fl_create(struct net *net, struct sock *sk, struct in6_flowlabel_req *freq,
>
> static int mem_check(struct sock *sk)
> {
> + struct net *net = sock_net(sk);
> int room = FL_MAX_SIZE - atomic_read(&fl_size);
> struct ipv6_fl_socklist *sfl;
> int count = 0;
> @@ -478,7 +482,9 @@ static int mem_check(struct sock *sk)
>
> if (room <= 0 ||
> ((count >= FL_MAX_PER_SOCK ||
> - (count > 0 && room < FL_MAX_SIZE/2) || room < FL_MAX_SIZE/4) &&
> + (count > 0 && room < FL_MAX_SIZE/2) ||
> + room < FL_MAX_SIZE/4 ||
> + atomic_read(&net->ipv6.flowlabel_count) >= FL_MAX_SIZE/8) &&
> !capable(CAP_NET_ADMIN)))
> return -ENOBUFS;
>
> --
> 2.34.1
>