Linux Netfilter development
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: scott.k.mitch1@gmail.com
Cc: netfilter-devel@vger.kernel.org, pablo@netfilter.org
Subject: Re: [PATCH v6 2/2] netfilter: nfnetlink_queue: optimize verdict lookup with hash table
Date: Sun, 18 Jan 2026 00:00:07 +0100	[thread overview]
Message-ID: <aWwUd1Z8xz5Kk30j@strlen.de> (raw)
In-Reply-To: <20260117173231.88610-3-scott.k.mitch1@gmail.com>

scott.k.mitch1@gmail.com <scott.k.mitch1@gmail.com> wrote:
> From: Scott Mitchell <scott.k.mitch1@gmail.com>
> 
> The current implementation uses a linear list to find queued packets by
> ID when processing verdicts from userspace. With large queue depths and
> out-of-order verdicting, this O(n) lookup becomes a significant
> bottleneck, causing userspace verdict processing to dominate CPU time.
> 
> Replace the linear search with a hash table for O(1) average-case
> packet lookup by ID. The hash table automatically resizes based on
> queue depth: grows at 75% load factor, shrinks at 25% load factor.
> To prevent rapid resize cycling during traffic bursts, shrinking only
> occurs if at least 60 seconds have passed since the last shrink.

Ouch.  Can we first try something simpler rather than starting a
reimplementation of rhashtable?

Or just use a global rhashtable for this?

> Hash table memory is allocated with GFP_KERNEL_ACCOUNT so memory is
> attributed to the cgroup rather than kernel overhead.
> 
> The existing list data structure is retained for operations requiring
> linear iteration (e.g. flush, device down events). Hot fields
> (queue_hash_mask, queue_hash pointer, resize state) are placed in the
> same cache line as the spinlock and packet counters for optimal memory
> access patterns.
> 
> Signed-off-by: Scott Mitchell <scott.k.mitch1@gmail.com>
> ---
>  include/net/netfilter/nf_queue.h |   1 +
>  net/netfilter/nfnetlink_queue.c  | 237 +++++++++++++++++++++++++++++--
>  2 files changed, 229 insertions(+), 9 deletions(-)
> 
> diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h
> index 4aeffddb7586..3d0def310523 100644
> --- a/include/net/netfilter/nf_queue.h
> +++ b/include/net/netfilter/nf_queue.h
> @@ -11,6 +11,7 @@
>  /* Each queued (to userspace) skbuff has one of these. */
>  struct nf_queue_entry {
>  	struct list_head	list;
> +	struct hlist_node	hash_node;
>  	struct sk_buff		*skb;
>  	unsigned int		id;
>  	unsigned int		hook_index;	/* index in hook_entries->hook[] */
> diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
> index 7b2cabf08fdf..772d2a7d0d7c 100644
> --- a/net/netfilter/nfnetlink_queue.c
> +++ b/net/netfilter/nfnetlink_queue.c
> @@ -30,6 +30,11 @@
>  #include <linux/netfilter/nf_conntrack_common.h>
>  #include <linux/list.h>
>  #include <linux/cgroup-defs.h>
> +#include <linux/workqueue.h>
> +#include <linux/jiffies.h>
> +#include <linux/log2.h>
> +#include <linux/memcontrol.h>
> +#include <linux/sched/mm.h>
>  #include <net/gso.h>
>  #include <net/sock.h>
>  #include <net/tcp_states.h>
> @@ -46,7 +51,11 @@
>  #include <net/netfilter/nf_conntrack.h>
>  #endif
>  
> -#define NFQNL_QMAX_DEFAULT 1024
> +#define NFQNL_QMAX_DEFAULT         1024
> +#define NFQNL_HASH_MIN_SIZE        16
> +#define NFQNL_HASH_MAX_SIZE        131072

Is there a use case for such a large table?

> +#define NFQNL_HASH_DEFAULT_SIZE    NFQNL_HASH_MIN_SIZE
> +#define NFQNL_HASH_SHRINK_INTERVAL (60 * HZ)	/* Only shrink every 60 seconds */

>  /* We're using struct nlattr which has 16bit nla_len. Note that nla_len
>   * includes the header length. Thus, the maximum packet length that we
> @@ -59,6 +68,11 @@
>  struct nfqnl_instance {
>  	struct hlist_node hlist;		/* global list of queues */
>  	struct rcu_head rcu;
> +	struct work_struct destroy_work;
> +	struct work_struct resize_work;
> +#ifdef CONFIG_MEMCG
> +	struct mem_cgroup *resize_memcg;
> +#endif

I feel this is way too complicated and over-the-top.

Can we either
1). use a global rhashtable, shared by all netns + all queues (so we
    have no extra memory tied down per queue).

OR

2). Try with a simple, statically sized hash table (16? 32? 64?) without
    any magic resizing?

And, if we go route 2), how much confidence is there that its good
enough?

Because if you already suspect you need all this extra grow/shrink logic
then then 1) is my preferred choice.

What is the deal-breaker wrt. rhashtable so that one would start to
reimplement the features it already offers?

  reply	other threads:[~2026-01-17 23:00 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-17 17:32 [PATCH v6 0/2] netfilter: nfnetlink_queue: optimize verdict lookup with hash table scott.k.mitch1
2026-01-17 17:32 ` [PATCH v6 1/2] netfilter: nfnetlink_queue: nfqnl_instance GFP_ATOMIC -> GFP_KERNEL_ACCOUNT allocation scott.k.mitch1
2026-01-17 22:45   ` Florian Westphal
2026-01-17 23:25     ` Scott Mitchell
2026-01-19  0:39       ` Florian Westphal
2026-01-23 14:02         ` Scott Mitchell
2026-01-17 17:32 ` [PATCH v6 2/2] netfilter: nfnetlink_queue: optimize verdict lookup with hash table scott.k.mitch1
2026-01-17 23:00   ` Florian Westphal [this message]
2026-01-21 15:25     ` Scott Mitchell
2026-01-21 15:49       ` Florian Westphal
2026-01-23  1:58         ` Scott Mitchell
2026-01-23  6:54           ` Florian Westphal
2026-01-23 13:38             ` Scott Mitchell
2026-01-24 16:48               ` Florian Westphal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aWwUd1Z8xz5Kk30j@strlen.de \
    --to=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=scott.k.mitch1@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox