From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: netfilter-devel@vger.kernel.org
Subject: Re: [PATCH nf] nft_set_rbtree: Move clauses for expired nodes, last active node as leaf
Date: Mon, 16 May 2022 20:16:53 +0200 [thread overview]
Message-ID: <YoKVFRR1gggECpiZ@salvia> (raw)
In-Reply-To: <20220512183421.712556-1-sbrivio@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 7301 bytes --]
Hi Stefano,
On Thu, May 12, 2022 at 08:34:21PM +0200, Stefano Brivio wrote:
> In the overlap detection performed as part of the insertion operation,
> we have to skip nodes that are not active in the current generation or
> expired. This was done by adding several conditions in overlap decision
> clauses, which made it hard to check for correctness, and debug the
> actual issue this patch is fixing.
>
> Consolidate these conditions into a single clause, so that we can
> proceed with debugging and fixing the following issue.
>
> Case b3. described in comments covers the insertion of a start
> element after an existing end, as a leaf. If all the descendants of
> a given element are inactive, functionally, for the purposes of
> overlap detection, we still have to treat this as a leaf, but we don't.
>
> If, in the same transaction, we remove a start element to the right,
> remove an end element to the left, and add a start element to the right
> of an existing, active end element, we don't hit case b3. For example:
>
> - existing intervals: 1-2, 3-5, 6-7
> - transaction: delete 3-5, insert 4-5
>
> node traversal might happen as follows:
> - 2 (active end)
> - 5 (inactive end)
> - 3 (inactive start)
>
> when we add 4 as start element, we should simply consider 2 as
> preceding end, and consider it as a leaf, because interval 3-5 has been
> deleted. If we don't, we'll report an overlap because we forgot about
> this.
>
> Add an additional flag which is set as we find an active end, and reset
> it if we find an active start (to the left). If we finish the traversal
> with this flag set, it means we need to functionally consider the
> previous active end as a leaf, and allow insertion instead of reporting
> overlap.
I can still trigger EEXIST with deletion of existing interval. It
became harder to reproduce after this patch.
After hitting EEXIST, if I do:
echo "flush ruleset" > test.nft
nft list ruleset >> test.nft
to dump the existing ruleset, then I run the delete element command
again to remove the interval and it works. Before this patch I could
reproduce it by reloading an existing ruleset dump.
I'm running the script that I'm attaching manually, just one manual
invocation after another.
I ocassionally hit ENOBUFS, but that sounds like a different issue
that I'm looking into.
Thanks.
> Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
> Fixes: 7c84d41416d8 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
> net/netfilter/nft_set_rbtree.c | 92 ++++++++++++++++++++--------------
> 1 file changed, 54 insertions(+), 38 deletions(-)
>
> diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
> index 7325bee7d144..dc2184bbe722 100644
> --- a/net/netfilter/nft_set_rbtree.c
> +++ b/net/netfilter/nft_set_rbtree.c
> @@ -222,6 +222,7 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
> bool overlap = false, dup_end_left = false, dup_end_right = false;
> struct nft_rbtree *priv = nft_set_priv(set);
> u8 genmask = nft_genmask_next(net);
> + bool last_left_node_is_end = false;
> struct nft_rbtree_elem *rbe;
> struct rb_node *parent, **p;
> int d;
> @@ -287,80 +288,95 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
> d = memcmp(nft_set_ext_key(&rbe->ext),
> nft_set_ext_key(&new->ext),
> set->klen);
> - if (d < 0) {
> +
> + if (d < 0)
> p = &parent->rb_left;
> + else if (d > 0)
> + p = &parent->rb_right;
> + else if (nft_rbtree_interval_end(rbe))
> + p = &parent->rb_left;
> + else
> + p = &parent->rb_right;
>
> + /* There might be inactive elements in the tree: ignore them by
> + * traversing them without affecting flags.
> + *
> + * We need to reset the dup_end_left and dup_end_right flags,
> + * though, because those only apply to adjacent nodes.
> + */
> + if (!nft_set_elem_active(&rbe->ext, genmask) ||
> + nft_set_elem_expired(&rbe->ext)) {
> + dup_end_left = dup_end_right = false;
> + continue;
> + }
> +
> + if (d < 0) {
> if (nft_rbtree_interval_start(new)) {
> - if (nft_rbtree_interval_end(rbe) &&
> - nft_set_elem_active(&rbe->ext, genmask) &&
> - !nft_set_elem_expired(&rbe->ext) && !*p)
> - overlap = false;
> + /* See case b3. described above.
> + *
> + * If this is not a leaf, but all nodes below
> + * this one are inactive, except for a leaf, we
> + * still have to consider it a leaf for the
> + * purposes of overlap detection.
> + *
> + * Set last_left_node_is_end if this is not a
> + * leaf and an active end element, and reset it
> + * if we find an active start element to the
> + * left.
> + *
> + * If we end the traversal with this flag set,
> + * this node is a leaf for the purposes of case
> + * b3., and no overlap will be reported.
> + */
> + if (nft_rbtree_interval_end(rbe)) {
> + if (!*p)
> + overlap = false;
> + else
> + last_left_node_is_end = true;
> + } else {
> + last_left_node_is_end = false;
> + }
> } else {
> +
> if (dup_end_left && !*p)
> return -ENOTEMPTY;
>
> - overlap = nft_rbtree_interval_end(rbe) &&
> - nft_set_elem_active(&rbe->ext,
> - genmask) &&
> - !nft_set_elem_expired(&rbe->ext);
> -
> + overlap = nft_rbtree_interval_end(rbe);
> if (overlap) {
> dup_end_right = true;
> continue;
> }
> }
> } else if (d > 0) {
> - p = &parent->rb_right;
> -
> if (nft_rbtree_interval_end(new)) {
> if (dup_end_right && !*p)
> return -ENOTEMPTY;
>
> - overlap = nft_rbtree_interval_end(rbe) &&
> - nft_set_elem_active(&rbe->ext,
> - genmask) &&
> - !nft_set_elem_expired(&rbe->ext);
> -
> + overlap = nft_rbtree_interval_end(rbe);
> if (overlap) {
> dup_end_left = true;
> continue;
> }
> - } else if (nft_set_elem_active(&rbe->ext, genmask) &&
> - !nft_set_elem_expired(&rbe->ext)) {
> + } else {
> overlap = nft_rbtree_interval_end(rbe);
> }
> } else {
> if (nft_rbtree_interval_end(rbe) &&
> nft_rbtree_interval_start(new)) {
> - p = &parent->rb_left;
> -
> - if (nft_set_elem_active(&rbe->ext, genmask) &&
> - !nft_set_elem_expired(&rbe->ext))
> - overlap = false;
> + overlap = false;
> } else if (nft_rbtree_interval_start(rbe) &&
> nft_rbtree_interval_end(new)) {
> - p = &parent->rb_right;
> -
> - if (nft_set_elem_active(&rbe->ext, genmask) &&
> - !nft_set_elem_expired(&rbe->ext))
> - overlap = false;
> - } else if (nft_set_elem_active(&rbe->ext, genmask) &&
> - !nft_set_elem_expired(&rbe->ext)) {
> + overlap = false;
> + } else {
> *ext = &rbe->ext;
> return -EEXIST;
> - } else {
> - overlap = false;
> - if (nft_rbtree_interval_end(rbe))
> - p = &parent->rb_left;
> - else
> - p = &parent->rb_right;
> }
> }
>
> dup_end_left = dup_end_right = false;
> }
>
> - if (overlap)
> + if (overlap && !last_left_node_is_end)
> return -ENOTEMPTY;
>
> rb_link_node_rcu(&new->node, parent, p);
> --
> 2.35.1
>
[-- Attachment #2: rbtree-bug.sh --]
[-- Type: application/x-sh, Size: 1241 bytes --]
next prev parent reply other threads:[~2022-05-16 18:17 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-12 18:34 [PATCH nf] nft_set_rbtree: Move clauses for expired nodes, last active node as leaf Stefano Brivio
2022-05-16 18:16 ` Pablo Neira Ayuso [this message]
2022-05-17 12:57 ` Stefano Brivio
2022-05-20 15:45 ` Stefano Brivio
2022-05-23 14:59 ` Pablo Neira Ayuso
2022-05-25 12:15 ` Stefano Brivio
2022-06-01 11:15 ` Pablo Neira Ayuso
2022-06-03 13:04 ` Stefano Brivio
2022-06-06 9:01 ` Pablo Neira Ayuso
2022-06-14 9:58 ` Stefano Brivio
2022-06-16 9:08 ` Pablo Neira Ayuso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YoKVFRR1gggECpiZ@salvia \
--to=pablo@netfilter.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).