All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Florian Westphal <fw@strlen.de>
Cc: netfilter-devel <netfilter-devel@vger.kernel.org>,
	syzkaller-bugs@googlegroups.com,
	syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com
Subject: Re: [PATCH nf] netfilter: nf_tables: do not defer rule destruction via call_rcu
Date: Mon, 9 Dec 2024 22:27:18 +0100	[thread overview]
Message-ID: <Z1dgtm5IhoJW5vGL@calendula> (raw)
In-Reply-To: <20241207111459.7191-1-fw@strlen.de>

Hi Florian,

Thanks a lot for your quick fix.

On Sat, Dec 07, 2024 at 12:14:48PM +0100, Florian Westphal wrote:
> nf_tables_chain_destroy can sleep, it can't be used from call_rcu
> callbacks.
> 
> Moreover, nf_tables_rule_release() is only safe for error unwinding,
> while transaction mutex is held and the to-be-desroyed rule was not
> exposed to either dataplane or dumps, as it deactives+frees without
> the required synchronize_rcu() in-between.
> 
> nft_rule_expr_deactivate() callbacks will change ->use counters
> of other chains/sets, see e.g. nft_lookup .deactivate callback, these
> must be serialized via transaction mutex.
> 
> Also add a few lockdep asserts to make this more explicit.
> 
> Calling synchronize_rcu() isn't ideal, but fixing this without is hard
> and way more intrusive.  As-is, we can get:
> 
> WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x..

Right, rhash needs this, that is why there is a workqueue to release
objects from nftables commit path.

> Workqueue: events nf_tables_trans_destroy_work
> RIP: 0010:nft_set_destroy+0x3fe/0x5c0
> Call Trace:
>  <TASK>
>  nf_tables_trans_destroy_work+0x6b7/0xad0
>  process_one_work+0x64a/0xce0
>  worker_thread+0x613/0x10d0
> 
> In case the synchronize_rcu becomes an issue, we can explore alternatives.
> 
> One way would be to allocate nft_trans_rule objects + one nft_trans_chain
> object, deactivate the rules + the chain and then defer the freeing to the
> nft destroy workqueue.  We'd still need to keep the synchronize_rcu path as
> a fallback to handle -ENOMEM corner cases though.

I think it can be done _without_ nft_trans objects.

Since the commit mutex is held in this netdev event path: Remove this
basechain, deactivate rules and add basechain to global list protected
with spinlock, it invokes worker. Then, worker zaps this list
basechains, it calls synchronize_rcu() and it destroys rules and then
basechain. No memory allocations needed in this case?

Thanks.

> Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/
> Fixes: c03d278fdf35 ("netfilter: nf_tables: wait for rcu grace period on net_device removal")
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/netfilter/nf_tables_api.c | 31 +++++++++++++++----------------
>  1 file changed, 15 insertions(+), 16 deletions(-)
> 
> diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> index 21b6f7410a1f..a3b6b6b32f72 100644
> --- a/net/netfilter/nf_tables_api.c
> +++ b/net/netfilter/nf_tables_api.c
> @@ -3987,8 +3987,11 @@ void nf_tables_rule_destroy(const struct nft_ctx *ctx, struct nft_rule *rule)
>  	kfree(rule);
>  }
>  
> +/* can only be used if rule is no longer visible to dumps */
>  static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *rule)
>  {
> +	lockdep_commit_lock_is_held(ctx->net);
> +
>  	nft_rule_expr_deactivate(ctx, rule, NFT_TRANS_RELEASE);
>  	nf_tables_rule_destroy(ctx, rule);
>  }
> @@ -5757,6 +5760,8 @@ void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set,
>  			      struct nft_set_binding *binding,
>  			      enum nft_trans_phase phase)
>  {
> +	lockdep_commit_lock_is_held(ctx->net);
> +
>  	switch (phase) {
>  	case NFT_TRANS_PREPARE_ERROR:
>  		nft_set_trans_unbind(ctx, set);
> @@ -11695,19 +11700,6 @@ static void __nft_release_basechain_now(struct nft_ctx *ctx)
>  	nf_tables_chain_destroy(ctx->chain);
>  }
>  
> -static void nft_release_basechain_rcu(struct rcu_head *head)
> -{
> -	struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head);
> -	struct nft_ctx ctx = {
> -		.family	= chain->table->family,
> -		.chain	= chain,
> -		.net	= read_pnet(&chain->table->net),
> -	};
> -
> -	__nft_release_basechain_now(&ctx);
> -	put_net(ctx.net);
> -}
> -
>  int __nft_release_basechain(struct nft_ctx *ctx)
>  {
>  	struct nft_rule *rule;
> @@ -11722,11 +11714,18 @@ int __nft_release_basechain(struct nft_ctx *ctx)
>  	nft_chain_del(ctx->chain);
>  	nft_use_dec(&ctx->table->use);
>  
> -	if (maybe_get_net(ctx->net))
> -		call_rcu(&ctx->chain->rcu_head, nft_release_basechain_rcu);
> -	else
> +	if (!maybe_get_net(ctx->net)) {
>  		__nft_release_basechain_now(ctx);
> +		return 0;
> +	}
> +
> +	/* wait for ruleset dumps to complete.  Owning chain is no longer in
> +	 * lists, so new dumps can't find any of these rules anymore.
> +	 */
> +	synchronize_rcu();
>  
> +	__nft_release_basechain_now(ctx);
> +	put_net(ctx->net);
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(__nft_release_basechain);
> -- 
> 2.47.1
> 
> 

  reply	other threads:[~2024-12-09 21:36 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-27 21:22 [syzbot] [kernel?] BUG: sleeping function called from invalid context in static_key_slow_dec syzbot
2024-11-29 10:47 ` Florian Westphal
2024-12-07 11:14 ` [PATCH nf] netfilter: nf_tables: do not defer rule destruction via call_rcu Florian Westphal
2024-12-09 21:27   ` Pablo Neira Ayuso [this message]
2024-12-09 22:04     ` Florian Westphal
2024-12-11 16:16       ` Pablo Neira Ayuso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z1dgtm5IhoJW5vGL@calendula \
    --to=pablo@netfilter.org \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.