All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Emil Tsalapatis" <emil@etsalapatis.com>
To: <bot+bpf-ci@kernel.org>, <emil@etsalapatis.com>, <bpf@vger.kernel.org>
Cc: <ast@kernel.org>, <andrii@kernel.org>, <memxor@gmail.com>,
	<daniel@iogearbox.net>, <eddyz87@gmail.com>, <song@kernel.org>,
	<mattbobrowski@google.com>, <martin.lau@kernel.org>,
	<yonghong.song@linux.dev>, <clm@meta.com>,
	<ihor.solodrai@linux.dev>
Subject: Re: [RESEND PATCH bpf-next 1/2] selftests/bpf: libarena: Add rbtree data structure
Date: Tue, 12 May 2026 16:18:16 -0400	[thread overview]
Message-ID: <DIGZ521R07O0.21CT010M1703B@etsalapatis.com> (raw)
In-Reply-To: <5c0b69fc93d31fa42e47ee102dfd9b88fbcc8c7d0431e617c02a0e5f27025d36@mail.kernel.org>

On Mon May 11, 2026 at 6:24 PM EDT, bot+bpf-ci wrote:
>> diff --git a/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c b/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c
>> new file mode 100644
>> index 000000000000..17d46c4af332
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c
>> @@ -0,0 +1,974 @@
>
> [ ... ]
>
>> +SEC("syscall")
>> +__weak int test_rbtree_find_nonexistent(void)
>> +{
>> +	u64 key = 0xdeadbeef;
>> +	u64 value = 0;
>> +	int ret;
>> +
>> +	rbtree_t *rbtree;
>> +
>> +	rbtree = rb_create(RB_ALLOC, RB_DEFAULT);
>> +	if (!rbtree)
>> +		return 1;
>> +
>> +	/* Should return -EINVAL */
>> +	ret = rb_find(rbtree, key, &value);
>> +	if (!ret)
>> +		return 2;
>> +
>> +	return rb_destroy(rbtree);
>> +}
>
> The comment says rb_find should return -EINVAL when looking for a
> non-existent key in an empty tree, but looking at rb_find in
> rbtree.bpf.c:
>
> int rb_find(rbtree_t *rbtree, u64 key, u64 *value)
> {
>     rbnode_t *node;
>
>     if (unlikely(!rbtree))
>         return -EINVAL;
>
>     if (unlikely(!value))
>         return -EINVAL;
>
>     node = rbnode_find(rbtree->root, key, RBN_FIND_DFL);
>     if (!node || node->key != key)
>         return -ENOENT;
>         ^^^^
>
> rb_find returns -EINVAL only when rbtree or value is NULL. For a valid
> rbtree with no matching node, rbnode_find returns NULL and rb_find
> returns -ENOENT. Should the comment say "Should return -ENOENT" instead?
>

Ack.

> [ ... ]
>
>> diff --git a/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c b/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c
>> new file mode 100644
>> index 000000000000..2ce2dfe5a223
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c
>> @@ -0,0 +1,1032 @@
>
> [ ... ]
>
>> +static
>> +rbnode_t *rbnode_find(rbnode_t *subtree, u64 key, enum rbnode_find_mode mode)
>> +{
>> +	rbnode_t *node = subtree;
>> +	int dir;
>> +
>> +	if (!subtree)
>> +		return NULL;
>> +
>> +	while (can_loop) {
>> +		/*
>> +		 * If we are not looking for the least upper bound,
>> +		 * we have found the only key. Otherwise keep going
>> +		 * until we find the "leftmost" instance of the key.
>> +		 */
>> +		if (mode == RBN_FIND_DFL && node->key == key)
>> +			break;
>> +
>> +		dir = (key <= node->key) ? 0 : 1;
>> +
>> +		if (!node->child[dir])
>> +			break;
>> +
>> +		/*
>> +		 * For least upper bound lookups, ensure that we
>> +		 * stop searching once we find the rightmost key.
>> +		 */
>> +		if (node->key == key && node->child[dir]->key != key)
>> +			break;
>> +
>> +		node = node->child[dir];
>> +	}
>> +
>> +	return node;
>> +}
>
> [ ... ]
>
>> +static
>> +int rb_node_insert(rbtree_t *rbtree __arg_arena, rbnode_t *node __arg_arena)
>> +{
>> +	rbnode_t *grandparent, *parent = rbtree->root;
>> +	enum rbnode_find_mode mode;
>> +	u64 key = node->key;
>> +	rbnode_t *uncle;
>> +	int dir;
>> +	int ret;
>> +
>> +	if (unlikely(!rbtree))
>> +		return -EINVAL;
>> +
>> +	if (!parent) {
>> +		rbtree->root = node;
>> +		return 0;
>> +	}
>> +
>> +	mode = RBN_FIND_DFL;
>> +	/* Always find the first node with the key when allowing duplicates. */
>> +	if (rbtree->insert == RB_DUPLICATE)
>> +		mode = RBN_FIND_LUB;
>> +	parent = rbnode_find(parent, key, mode);
>> +
>> +	if (key == parent->key && rbtree->insert != RB_DUPLICATE) {
>> +		if (rbtree->insert == RB_UPDATE) {
>> +			/*
>> +			 * Replace the old node with the new one.
>> +			 * Free up the old node.
>> +			 */
>> +			ret = rbnode_replace(rbtree, parent, node);
>> +			if (ret)
>> +				return ret;
>> +
>> +			if (rbtree->alloc == RB_ALLOC)
>> +				rb_node_free(rbtree, parent);
>> +
>> +			return 0;
>> +		}
>> +
>> +		/* Otherwise it's RB_DEFAULT. */
>> +		return -EALREADY;
>> +	}
>> +
>> +	node->parent = parent;
>> +	/* Also works if key == parent->key. */
>> +	if (key <= parent->key)
>> +		parent->left = node;
>> +	else
>> +		parent->right = node;
>
> Can this overwrite an existing left subtree in RB_DUPLICATE mode?
>
> When rbnode_find is called with RBN_FIND_LUB mode, it has a third break
> condition that stops the search when it finds a node with a matching key
> whose next-direction child has a different key:
>
> rbnode_find() {
>     ...
>     if (node->key == key && node->child[dir]->key != key)
>         break;
>     ...
> }
>
> With key == node->key, dir is computed as (key <= node->key) ? 0 : 1 = 0.
> The search stops at a matching node that has a non-NULL left child with a
> smaller key.
>
> Back in rb_node_insert, the code skips the RB_DEFAULT check when
> rbtree->insert == RB_DUPLICATE, and falls through to:
>
>     node->parent = parent;
>     /* Also works if key == parent->key. */
>     if (key <= parent->key)
>         parent->left = node;
>
> Since key == parent->key, the condition is true and parent->left is
> overwritten with the new node. But parent->left already points to an
> existing subtree (the smaller node that triggered the third break
> condition). That existing left child is overwritten, orphaning its entire
> subtree.
>
> Reproduction:
>   1) rb_insert(rbtree, 5, v1) in RB_DUPLICATE mode -> tree: {5}
>   2) rb_insert(rbtree, 3, v2)                       -> tree: {5->left = 3}
>   3) rb_insert(rbtree, 5, v3)
>
> In step 3, rbnode_find returns the root node 5 via the third break
> condition (because child 3 has key != 5). Then rb_node_insert sets
> parent->left = new_node, overwriting the pointer to node 3. Node 3 is now
> orphaned and leaked.
>
> The comment claims "Also works if key == parent->key", but this only holds
> when rbnode_find returns a node with an empty left slot, as in the typical
> duplicate-chain case where all keys are identical. As soon as a duplicate
> insertion meets a matching node that has a pre-existing smaller left child,
> the assignment corrupts the tree.
>
> The tree integrity check in rb_integrity_check walks only nodes reachable
> from the root, so it does not detect orphaned nodes. The existing test
> test_rbtree_duplicate uses a single key value for all insertions, so it
> never exercises the interleaved case where a smaller key exists between
> duplicates.
>

This is accurate. This is a bug introduced when refactoring the
least_upper_bound() and find() internal methods that were almost
identical. Will fix in the next iteration.

>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25699461713


  reply	other threads:[~2026-05-12 20:18 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 21:40 [RESEND PATCH bpf-next 0/2] selftests/bpf: libarena: Add initial data structures Emil Tsalapatis
2026-05-11 21:40 ` [RESEND PATCH bpf-next 1/2] selftests/bpf: libarena: Add rbtree data structure Emil Tsalapatis
2026-05-11 22:24   ` bot+bpf-ci
2026-05-12 20:18     ` Emil Tsalapatis [this message]
2026-05-13  0:42   ` sashiko-bot
2026-05-15  0:08   ` Alexei Starovoitov
2026-05-11 21:41 ` [RESEND PATCH bpf-next 2/2] selftests/bpf: libarena: Add Lev-Chase queue " Emil Tsalapatis
2026-05-11 22:12   ` bot+bpf-ci
2026-05-12 20:20     ` Emil Tsalapatis
2026-05-13  2:23   ` sashiko-bot
2026-05-15  0:11   ` Alexei Starovoitov
2026-05-15  0:13   ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DIGZ521R07O0.21CT010M1703B@etsalapatis.com \
    --to=emil@etsalapatis.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bot+bpf-ci@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=ihor.solodrai@linux.dev \
    --cc=martin.lau@kernel.org \
    --cc=mattbobrowski@google.com \
    --cc=memxor@gmail.com \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.