From: Florian Westphal <fw@strlen.de>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>, netfilter-devel@vger.kernel.org
Subject: Re: [PATCH nf-next v4 4/5] netfilter: nf_tables: switch trans_elem to real flex array
Date: Wed, 13 Nov 2024 12:04:05 +0100
Message-ID: <20241113110405.GA19651@breakpoint.cc>
In-Reply-To: <ZzR8W7oQ_3wD-osu@calendula>
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> I'm making another pass on this series, a few things I would like to
> ask, see below.
>
> On Thu, Nov 07, 2024 at 06:44:08PM +0100, Florian Westphal wrote:
> > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> > index bdf5ba21c76d..e96e538fe2eb 100644
> > --- a/net/netfilter/nf_tables_api.c
> > +++ b/net/netfilter/nf_tables_api.c
> > @@ -25,6 +25,7 @@
> >
> > #define NFT_MODULE_AUTOLOAD_LIMIT (MODULE_NAME_LEN - sizeof("nft-expr-255-"))
> > #define NFT_SET_MAX_ANONLEN 16
> > +#define NFT_MAX_SET_NELEMS ((2048 - sizeof(struct nft_trans_elem)) / sizeof(struct nft_trans_one_elem))
>
> This NFT_MAX_SET_NELEMS is to stay in a specific kmalloc-X?
>
> What is the logic behind this NFT_MAX_SET_NELEMS?
I want to avoid huge kmalloc requests, and also avoid the krealloc
overhead that comes with huge requests.
I think the kmalloc-2048 slab is a good fit.
I can add a comment, or increase the limit to kmalloc-4096, but I'd
prefer not to go beyond that, since kmalloc allocations > 1 page are
more prone to allocation failure.
> > unsigned int nf_tables_net_id __read_mostly;
> >
> > @@ -391,6 +392,69 @@ static void nf_tables_unregister_hook(struct net *net,
> > return __nf_tables_unregister_hook(net, table, chain, false);
> > }
> >
> > +static bool nft_trans_collapse_set_elem_allowed(const struct nft_trans_elem *a, const struct nft_trans_elem *b)
> > +{
> > + return a->set == b->set && a->bound == b->bound && a->nelems < NFT_MAX_SET_NELEMS;
>
> I think this a->bound == b->bound check is defensive.
>
> This code is collapsing only two consecutive transactions, the one at
> the tail (where nelems > 1) and the new transaction (where nelems ==
> 1).
Yes.
> bound state should only change in case there is a NEWRULE transaction
> in between.
Yes.
> I am trying to find an error scenario where a->bound == b->bound
> evaluates false. I considered the following:
>
> newelem -> newrule -> newelem
>
> where newrule has these expressions:
>
> lookup -> error
>
> in this case, newrule error path is exercised:
>
> nft_rule_expr_deactivate(&ctx, rule, NFT_TRANS_PREPARE_ERROR);
>
> this calls nf_tables_deactivate_set() that calls
> nft_set_trans_unbind(), then a->bound is restored to false. Rule is
> released and no transaction is added.
>
> Because if this succeeds:
>
> newelem -> newrule -> newelem
>
> then no element collapsing can happen, because we only collapse what
> is at the tail.
>
> TL;DR: the check does no harm, but it looks unlikely to trigger to me.
Yes, it's a defensive check. I could add a comment.
The WARN_ON_ONCE for trans->nelems != 1 exists for the same reason.
> > +}
> > +
> > +static bool nft_trans_collapse_set_elem(struct nftables_pernet *nft_net,
> > + struct nft_trans_elem *tail,
> > + struct nft_trans_elem *trans,
> > + gfp_t gfp)
> > +{
> > + unsigned int nelems, old_nelems = tail->nelems;
> > + struct nft_trans_elem *new_trans;
> > +
> > + if (!nft_trans_collapse_set_elem_allowed(tail, trans))
> > + return false;
> > +
> > + if (WARN_ON_ONCE(trans->nelems != 1))
> > + return false;
> > +
> > + if (check_add_overflow(old_nelems, trans->nelems, &nelems))
> > + return false;
> > +
> > + /* krealloc might free tail which invalidates list pointers */
> > + list_del_init(&tail->nft_trans.list);
> > +
> > + new_trans = krealloc(tail, struct_size(tail, elems, nelems), gfp);
> > + if (!new_trans) {
> > + list_add_tail(&tail->nft_trans.list, &nft_net->commit_list);
> > + return false;
> > + }
> > +
> > + INIT_LIST_HEAD(&new_trans->nft_trans.list);
>
> This initialization is also defensive, this element is added via
> list_add_tail().
Yes, the first arg to list_add(_tail) can live without initialisation.