From: Kuniyuki Iwashima <kuniyu@google.com>
To: avimalin@gmail.com
Cc: edumazet@google.com, kuniyu@google.com, netdev@vger.kernel.org,
vimal.agrawal@sophos.com, kuba@kernel.org
Subject: Re: [PATCH net-next] net: neigh: avoid calling neigh_forced_gc on every alloc when table is full
Date: Thu, 25 Jun 2026 21:45:30 +0000
Message-ID: <20260625214659.1115819-1-kuniyu@google.com>
In-Reply-To: <20260625102020.92814-1-vimal.agrawal@sophos.com>
From: Vimal Agrawal <avimalin@gmail.com>
Date: Thu, 25 Jun 2026 10:20:20 +0000
> Once the neighbour table exceeds gc_thresh3, neigh_forced_gc() is called
> on every allocation attempt with no rate limiting. In workloads with mostly
> active/reachable entries, the GC walk traverses a large portion of the
> neighbour table without reclaiming entries, holding tbl->lock for an
> extended period. This causes severe lock contention and allocation
> latencies exceeding 16ms under sustained neighbour creation.
>
> Add a pre-lock check in neigh_forced_gc() to skip the GC run if one was
> performed within the last second, avoiding repeated full table scans and
> lock acquisitions on the hot allocation path.
>
> Profiling of neigh_create() shows ~3 orders of magnitude latency
> improvement with this change.
>
> Link: https://lore.kernel.org/netdev/CALkUMdSCpx_ywYCx_ePLdm6yioO1nQWx7sSM=AEgsq0kywHxTw@mail.gmail.com/
From the thread, these look misconfigured.
---8<---
net.ipv6.neigh.default.gc_thresh2 = 32768
net.ipv6.neigh.default.gc_thresh3 = 32768
---8<---
If gc_thresh3 is large enough, gc_thresh2 will give you 5s
rate limiting.
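To make the interaction concrete, here is a minimal userspace sketch of the
forced-GC trigger as I read it in neigh_alloc(). It is not kernel code: the
model_* names and the HZ value are assumptions for illustration only.

```c
#include <stdbool.h>

#define HZ 1000UL	/* assumed tick rate for this model */

/* Wraparound-safe "a is later than b", same idea as the kernel's time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Hypothetical model of the check done before allocating a new entry:
 * at or above gc_thresh3 forced GC is always attempted; between
 * gc_thresh2 and gc_thresh3 it is attempted only if the last flush
 * is more than 5 seconds old.
 */
static bool model_should_force_gc(int entries, int gc_thresh2, int gc_thresh3,
				  unsigned long now, unsigned long last_flush)
{
	if (entries >= gc_thresh3)
		return true;

	return entries >= gc_thresh2 &&
	       model_time_after(now, last_flush + 5 * HZ);
}
```

With gc_thresh2 == gc_thresh3 the second branch can never be the deciding
one, so every allocation above the threshold attempts forced GC. With
gc_thresh3 well above the steady-state entry count, only the 5s-limited
branch applies.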
If the number of active neigh entries constantly exceeds
gc_thresh3, that number would be the correct gc_thresh2 for you.
Also, I guess you want a new kernel param for the initial
neigh_hash_alloc() shift, which is currently fixed at 3 and
is too small for some hosts.
50000 entries require calling neigh_hash_grow() 13 times.
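The count can be checked with a small userspace sketch. It assumes the table
doubles (hash_shift + 1) whenever the entry count exceeds 1 << hash_shift,
which is my reading of the neighbour creation path; model_grows_needed() is a
hypothetical helper, not a kernel function.

```c
/*
 * Count how many doublings take a table from 2^start_shift buckets to
 * the first size that is not smaller than `entries`.
 */
static unsigned int model_grows_needed(unsigned int start_shift,
				       unsigned long entries)
{
	/* Highest shift that can be applied to 1UL without overflow. */
	const unsigned int max_shift = 8 * sizeof(unsigned long) - 1;
	unsigned int shift = start_shift;
	unsigned int grows = 0;

	while (shift < max_shift && entries > (1UL << shift)) {
		shift++;
		grows++;
	}

	return grows;
}
```

Starting from shift 3 (8 buckets), 50000 entries need shift 16 (65536
buckets), i.e. 13 doublings; starting from shift 16, none are needed.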
Can you test this on your real workload, starting from
neigh_hash_shift=16 and appropriate gc_thresh2/3?
---8<---
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 1349c0eedb64..a75b3750eec9 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1817,6 +1817,22 @@ EXPORT_SYMBOL(neigh_parms_release);
static struct lock_class_key neigh_table_proxy_queue_class;
static struct neigh_table __rcu *neigh_tables[NEIGH_NR_TABLES] __read_mostly;
+static unsigned long neigh_hash_shift __ro_after_init = 3;
+
+static int __init neigh_set_hash_shift(char *str)
+{
+ int ret;
+
+ if (!str)
+ return 0;
+
+ ret = kstrtoul(str, 0, &neigh_hash_shift);
+ if (ret)
+ return 0;
+
+ return 1;
+}
+__setup("neigh_hash_shift=", neigh_set_hash_shift);
void neigh_table_init(int index, struct neigh_table *tbl)
{
@@ -1843,7 +1859,7 @@ void neigh_table_init(int index, struct neigh_table *tbl)
panic("cannot create neighbour proc dir entry");
#endif
- RCU_INIT_POINTER(tbl->nht, neigh_hash_alloc(3));
+ RCU_INIT_POINTER(tbl->nht, neigh_hash_alloc(neigh_hash_shift));
phsize = (PNEIGH_HASHMASK + 1) * sizeof(struct pneigh_entry *);
tbl->phash_buckets = kzalloc(phsize, GFP_KERNEL);
---8<---
> Signed-off-by: Vimal Agrawal <vimal.agrawal@sophos.com>
> ---
> net/core/neighbour.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 1349c0eedb64..078842db3c5f 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -260,6 +260,9 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> int shrunk = 0;
> int loop = 0;
>
> + if (!time_after(jiffies, READ_ONCE(tbl->last_flush) + HZ))
> + return 0;
> +
> NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
>
> spin_lock_bh(&tbl->lock);
> --
> 2.17.1