All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Florian Westphal <fw@strlen.de>
Cc: netfilter-devel@vger.kernel.org
Subject: Re: [PATCH nf-next 5/6] netfilter: nf_conntrack_helper: add refcounting from datapath
Date: Wed, 27 May 2026 00:19:19 +0200	[thread overview]
Message-ID: <ahYcZ_dFZpAV3B1Z@chamomile> (raw)
In-Reply-To: <ahXea1N1w40Siqin@strlen.de>

Hi Florian,

On Tue, May 26, 2026 at 07:54:51PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
[...]
> > diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
> > index 1956bc12bf56..a03cb4e59ea9 100644
> > --- a/include/net/netfilter/nf_conntrack_helper.h
> > +++ b/include/net/netfilter/nf_conntrack_helper.h
> > @@ -35,20 +35,23 @@ enum nf_ct_helper_flags {
> >  struct nf_conntrack_helper {
> >  	struct hlist_node hnode;	/* Internal use. */
> >  
> > +	struct rcu_head rcu;
> > +
> >  	char name[NF_CT_HELPER_NAME_LEN]; /* name of the module */
> >  	refcount_t refcnt;
> >  	struct module *me;		/* pointer to self */
> >  	struct nf_conntrack_expect_policy expect_policy[NF_CT_MAX_EXPECT_CLASSES];
> >  
> > +	refcount_t ct_refcnt;
> 
> Why do we need two reference counts?  I find this very confusing.
> Which refcount frees the structure?  And can one refcount hit 0 while
> other one is still in use?

The existing refcnt tracks references from the control plane, ie.
rules that point to helper. The new ct_refcnt tracks references from
ct extension.

If the ruleset is flushed, then it is possible to unregister the
helper, otherwise EBUSY is reported. On the other hand, the ct_refcnt
tracks references by ct extension, eg. skb sitting in nfqueue with a
ct helper. The idea is to allow track the helper in memory so it does
not go away even in module is removed/userspace helper is destroyed.

Thus, helper can be removed anytime if ruleset does not use it, but
ct extension that still use the helper hold a reference on ct_refcnt
so it cannot be release. It is a two-level refcount strategy: The
control plane refcnt allows to remove the helper anytime, but the
dataplane refcnt can only be removed if control plane removed the
helper and no ct extension use it.

If I use a single refcnt (the existing one), then a packet sitting in
nfqueue can postpone the module removal of the helper indefinitely.

BTW, there is one single fix I can target to nf.git which is to
disallow userspace helpers in init_user_ns. Userspace helpers have no
netns support, I can post such small patch for inclusion. Still, the
unconfirmed ct race + nfqueue can theoretically still happen without
this series.

> >  	/* Function to call when data passes; return verdict, or -1 to
> >             invalidate. */
> > -	int (*help)(struct sk_buff *skb,
> > -		    unsigned int protoff,
> > -		    struct nf_conn *ct,
> > -		    enum ip_conntrack_info conntrackinfo);
> > +	int __rcu (*help)(struct sk_buff *skb, unsigned int protoff,
> > +			  struct nf_conn *ct,
> > +			  enum ip_conntrack_info conntrackinfo);
> >  
> >  	void (*destroy)(struct nf_conn *ct);
> 
> Why is help RCU protected while other callbacks are not?

As you said above, this is the "dying flag".

> 'destroy' not being rcu protected implies that the helper module must
> remain in memory until after kfree_rcu has released the underlying
> storage anyway.

The only existing .destroy function is not moved in this series to
nf_conntrack_proto_gre.c which is part of the nf_conntrack module.

> If thats true, why do we need rcu head and kfree_rcu in the first place?

Because of userspace helpers, they do not depend on modules, they can
be removed via nfnetlink_cthelper anytime.

> module has to remain in memory until after last possible caller has
> called me->destroy(), no?  If that is correct, then there is no need for
> dynamically allocated storage.

Maybe, but not related to ->destroy(), userspace helpers still are
there.

> > @@ -445,19 +432,18 @@ void nf_conntrack_helper_unregister(struct nf_conntrack_helper *me)
> >  	nf_ct_helper_count--;
> >  	mutex_unlock(&nf_ct_helper_mutex);
> >  
> > +	/* This helper is going away, disable it. */
> > +	rcu_assign_pointer(me->help, NULL);
> > +
> 
> OK, so this signals pending removal (refcnt can still be elevated) to
> prevent new packets/expectations from grabbing another reference.
> Correct?  Is this a 'dying' flag or is there more to it?

Yes, this is a "dying flag".

> I looked at patched 'nf_conntrack_ftp_fini', but I don't see anything
> that spins/waits for completion of referencing entries.

Given the helper object is allocated dynamically, it will remain
around until last reference is dropped.

> How does ->destroy/to_nlattr/from_nlattr etc. work?
> 
> I expected to find something that does a busywait until refcount has
> hit 0 to avoid any calls to the removed module.
> 
> The existing conntracks still hold a pointer to struct
> nf_conntrack_helper, and its refcount can be elevated too, while
> function pointers (not help, but others) are stale.

.help is set to NULL.
.to_nlattr and .from_nlattr are disabled.
.destroy, I moved it to nf_conntrack with the intention that this
pointer is not stale.

> I suspect you need to move the function pointers to an 'op' sub-struct,
> so that it can be cleared via single rcu_assign_pointer(me->help_ops, NULL) ?

I think I cannot disable .destroy that way, it clears the GRE entries
which release the pptp mappings.

> But that still has one problem: if helper module is gone, how can you
> call the destructor?

In this series, the only .destroy callback resides in nf_conntrack.

> Maybe we need to accelerate pptp removal so the only user of destroy
> is removed?

Flagging it as deprecated is convenient for distributors to stop
compiling this, noone should be using this pptp in 2026 I think.

I'd rather see a more simple fix, but I am not sure this can be fixed
for all scenarios (sashiko mentioned also a skb could be sitting in
nf_defrag with a template conntrack with helper/timeout reference, so
nfqueue is no the only queue around).

Let me know, thanks for you comments.

  reply	other threads:[~2026-05-26 22:19 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-26 16:40 [PATCH nf-next 0/6] add refcount to ct timeout/helper Pablo Neira Ayuso
2026-05-26 16:40 ` [PATCH nf-next 1/6] netfilter: nfnetlink_cthelper: use {READ,WRITE}_ONCE for accessing helper flags Pablo Neira Ayuso
2026-05-26 16:40 ` [PATCH nf-next 2/6] netfilter: cttimeout: detach dataplane timeout policy and add refcount Pablo Neira Ayuso
2026-05-26 16:40 ` [PATCH nf-next 3/6] netfilter: nf_conntrack_helper: dynamically allocate struct nf_conntrack_helper Pablo Neira Ayuso
2026-05-26 16:40 ` [PATCH nf-next 4/6] netfilter: nf_conntrack_pptp: move GRE specific cleanup to GRE tracker Pablo Neira Ayuso
2026-05-26 16:40 ` [PATCH nf-next 5/6] netfilter: nf_conntrack_helper: add refcounting from datapath Pablo Neira Ayuso
2026-05-26 17:54   ` Florian Westphal
2026-05-26 22:19     ` Pablo Neira Ayuso [this message]
2026-05-27 13:39       ` Florian Westphal
2026-05-26 16:40 ` [PATCH nf-next 6/6] netfilter: conntrack: revert ct extension genid infrastructure Pablo Neira Ayuso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ahYcZ_dFZpAV3B1Z@chamomile \
    --to=pablo@netfilter.org \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.