BPF List
 help / color / mirror / Atom feed
From: "Emil Tsalapatis" <emil@etsalapatis.com>
To: <bot+bpf-ci@kernel.org>, <emil@etsalapatis.com>, <bpf@vger.kernel.org>
Cc: <ast@kernel.org>, <andrii@kernel.org>, <memxor@gmail.com>,
	<daniel@iogearbox.net>, <eddyz87@gmail.com>, <song@kernel.org>,
	<mattbobrowski@google.com>, <martin.lau@kernel.org>,
	<yonghong.song@linux.dev>, <clm@meta.com>,
	<ihor.solodrai@linux.dev>
Subject: Re: [RESEND PATCH bpf-next 1/2] selftests/bpf: libarena: Add rbtree data structure
Date: Tue, 12 May 2026 16:18:16 -0400	[thread overview]
Message-ID: <DIGZ521R07O0.21CT010M1703B@etsalapatis.com> (raw)
In-Reply-To: <5c0b69fc93d31fa42e47ee102dfd9b88fbcc8c7d0431e617c02a0e5f27025d36@mail.kernel.org>

On Mon May 11, 2026 at 6:24 PM EDT, bot+bpf-ci wrote:
>> diff --git a/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c b/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c
>> new file mode 100644
>> index 000000000000..17d46c4af332
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c
>> @@ -0,0 +1,974 @@
>
> [ ... ]
>
>> +SEC("syscall")
>> +__weak int test_rbtree_find_nonexistent(void)
>> +{
>> +	u64 key = 0xdeadbeef;
>> +	u64 value = 0;
>> +	int ret;
>> +
>> +	rbtree_t *rbtree;
>> +
>> +	rbtree = rb_create(RB_ALLOC, RB_DEFAULT);
>> +	if (!rbtree)
>> +		return 1;
>> +
>> +	/* Should return -EINVAL */
>> +	ret = rb_find(rbtree, key, &value);
>> +	if (!ret)
>> +		return 2;
>> +
>> +	return rb_destroy(rbtree);
>> +}
>
> The comment says rb_find should return -EINVAL when looking for a
> non-existent key in an empty tree, but looking at rb_find in
> rbtree.bpf.c:
>
> int rb_find(rbtree_t *rbtree, u64 key, u64 *value)
> {
>     rbnode_t *node;
>
>     if (unlikely(!rbtree))
>         return -EINVAL;
>
>     if (unlikely(!value))
>         return -EINVAL;
>
>     node = rbnode_find(rbtree->root, key, RBN_FIND_DFL);
>     if (!node || node->key != key)
>         return -ENOENT;
>         ^^^^
>
> rb_find returns -EINVAL only when rbtree or value is NULL. For a valid
> rbtree with no matching node, rbnode_find returns NULL and rb_find
> returns -ENOENT. Should the comment say "Should return -ENOENT" instead?
>

Ack.

> [ ... ]
>
>> diff --git a/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c b/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c
>> new file mode 100644
>> index 000000000000..2ce2dfe5a223
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c
>> @@ -0,0 +1,1032 @@
>
> [ ... ]
>
>> +static
>> +rbnode_t *rbnode_find(rbnode_t *subtree, u64 key, enum rbnode_find_mode mode)
>> +{
>> +	rbnode_t *node = subtree;
>> +	int dir;
>> +
>> +	if (!subtree)
>> +		return NULL;
>> +
>> +	while (can_loop) {
>> +		/*
>> +		 * If we are not looking for the least upper bound,
>> +		 * we have found the only key. Otherwise keep going
>> +		 * until we find the "leftmost" instance of the key.
>> +		 */
>> +		if (mode == RBN_FIND_DFL && node->key == key)
>> +			break;
>> +
>> +		dir = (key <= node->key) ? 0 : 1;
>> +
>> +		if (!node->child[dir])
>> +			break;
>> +
>> +		/*
>> +		 * For least upper bound lookups, ensure that we
>> +		 * stop searching once we find the rightmost key.
>> +		 */
>> +		if (node->key == key && node->child[dir]->key != key)
>> +			break;
>> +
>> +		node = node->child[dir];
>> +	}
>> +
>> +	return node;
>> +}
>
> [ ... ]
>
>> +static
>> +int rb_node_insert(rbtree_t *rbtree __arg_arena, rbnode_t *node __arg_arena)
>> +{
>> +	rbnode_t *grandparent, *parent = rbtree->root;
>> +	enum rbnode_find_mode mode;
>> +	u64 key = node->key;
>> +	rbnode_t *uncle;
>> +	int dir;
>> +	int ret;
>> +
>> +	if (unlikely(!rbtree))
>> +		return -EINVAL;
>> +
>> +	if (!parent) {
>> +		rbtree->root = node;
>> +		return 0;
>> +	}
>> +
>> +	mode = RBN_FIND_DFL;
>> +	/* Always find the first node with the key when allowing duplicates. */
>> +	if (rbtree->insert == RB_DUPLICATE)
>> +		mode = RBN_FIND_LUB;
>> +	parent = rbnode_find(parent, key, mode);
>> +
>> +	if (key == parent->key && rbtree->insert != RB_DUPLICATE) {
>> +		if (rbtree->insert == RB_UPDATE) {
>> +			/*
>> +			 * Replace the old node with the new one.
>> +			 * Free up the old node.
>> +			 */
>> +			ret = rbnode_replace(rbtree, parent, node);
>> +			if (ret)
>> +				return ret;
>> +
>> +			if (rbtree->alloc == RB_ALLOC)
>> +				rb_node_free(rbtree, parent);
>> +
>> +			return 0;
>> +		}
>> +
>> +		/* Otherwise it's RB_DEFAULT. */
>> +		return -EALREADY;
>> +	}
>> +
>> +	node->parent = parent;
>> +	/* Also works if key == parent->key. */
>> +	if (key <= parent->key)
>> +		parent->left = node;
>> +	else
>> +		parent->right = node;
>
> Can this overwrite an existing left subtree in RB_DUPLICATE mode?
>
> When rbnode_find is called with RBN_FIND_LUB mode, it has a third break
> condition that stops the search when it finds a node with a matching key
> whose next-direction child has a different key:
>
> rbnode_find() {
>     ...
>     if (node->key == key && node->child[dir]->key != key)
>         break;
>     ...
> }
>
> With key == node->key, dir is computed as (key <= node->key) ? 0 : 1 = 0.
> The search stops at a matching node that has a non-NULL left child with a
> smaller key.
>
> Back in rb_node_insert, the code skips the RB_DEFAULT check when
> rbtree->insert == RB_DUPLICATE, and falls through to:
>
>     node->parent = parent;
>     /* Also works if key == parent->key. */
>     if (key <= parent->key)
>         parent->left = node;
>
> Since key == parent->key, the condition is true and parent->left is
> overwritten with the new node. But parent->left already points to an
> existing subtree (the smaller node that triggered the third break
> condition). That existing left child is overwritten, orphaning its entire
> subtree.
>
> Reproduction:
>   1) rb_insert(rbtree, 5, v1) in RB_DUPLICATE mode -> tree: {5}
>   2) rb_insert(rbtree, 3, v2)                       -> tree: {5->left = 3}
>   3) rb_insert(rbtree, 5, v3)
>
> In step 3, rbnode_find returns the root node 5 via the third break
> condition (because child 3 has key != 5). Then rb_node_insert sets
> parent->left = new_node, overwriting the pointer to node 3. Node 3 is now
> orphaned and leaked.
>
> The comment claims "Also works if key == parent->key", but this only holds
> when rbnode_find returns a node with an empty left slot, as in the typical
> duplicate-chain case where all keys are identical. As soon as a duplicate
> insertion meets a matching node that has a pre-existing smaller left child,
> the assignment corrupts the tree.
>
> The tree integrity check in rb_integrity_check walks only nodes reachable
> from the root, so it does not detect orphaned nodes. The existing test
> test_rbtree_duplicate uses a single key value for all insertions, so it
> never exercises the interleaved case where a smaller key exists between
> duplicates.
>

This is accurate. This is a bug introduced when refactoring the
least_upper_bound() and find() internal methods that were almost
identical. Will fix in the next iteration.

>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25699461713


  reply	other threads:[~2026-05-12 20:18 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 21:40 [RESEND PATCH bpf-next 0/2] selftests/bpf: libarena: Add initial data structures Emil Tsalapatis
2026-05-11 21:40 ` [RESEND PATCH bpf-next 1/2] selftests/bpf: libarena: Add rbtree data structure Emil Tsalapatis
2026-05-11 22:24   ` bot+bpf-ci
2026-05-12 20:18     ` Emil Tsalapatis [this message]
2026-05-13  0:42   ` sashiko-bot
2026-05-11 21:41 ` [RESEND PATCH bpf-next 2/2] selftests/bpf: libarena: Add Lev-Chase queue " Emil Tsalapatis
2026-05-11 22:12   ` bot+bpf-ci
2026-05-12 20:20     ` Emil Tsalapatis
2026-05-13  2:23   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DIGZ521R07O0.21CT010M1703B@etsalapatis.com \
    --to=emil@etsalapatis.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bot+bpf-ci@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=ihor.solodrai@linux.dev \
    --cc=martin.lau@kernel.org \
    --cc=mattbobrowski@google.com \
    --cc=memxor@gmail.com \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox