From: "Emil Tsalapatis" <emil@etsalapatis.com>
To: <bot+bpf-ci@kernel.org>, <emil@etsalapatis.com>, <bpf@vger.kernel.org>
Cc: <ast@kernel.org>, <andrii@kernel.org>, <memxor@gmail.com>,
<daniel@iogearbox.net>, <eddyz87@gmail.com>, <song@kernel.org>,
<mattbobrowski@google.com>, <martin.lau@kernel.org>,
<yonghong.song@linux.dev>, <clm@meta.com>,
<ihor.solodrai@linux.dev>
Subject: Re: [RESEND PATCH bpf-next 1/2] selftests/bpf: libarena: Add rbtree data structure
Date: Tue, 12 May 2026 16:18:16 -0400 [thread overview]
Message-ID: <DIGZ521R07O0.21CT010M1703B@etsalapatis.com> (raw)
In-Reply-To: <5c0b69fc93d31fa42e47ee102dfd9b88fbcc8c7d0431e617c02a0e5f27025d36@mail.kernel.org>
On Mon May 11, 2026 at 6:24 PM EDT, bot+bpf-ci wrote:
>> diff --git a/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c b/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c
>> new file mode 100644
>> index 000000000000..17d46c4af332
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/libarena/selftests/st_rbtree.bpf.c
>> @@ -0,0 +1,974 @@
>
> [ ... ]
>
>> +SEC("syscall")
>> +__weak int test_rbtree_find_nonexistent(void)
>> +{
>> + u64 key = 0xdeadbeef;
>> + u64 value = 0;
>> + int ret;
>> +
>> + rbtree_t *rbtree;
>> +
>> + rbtree = rb_create(RB_ALLOC, RB_DEFAULT);
>> + if (!rbtree)
>> + return 1;
>> +
>> + /* Should return -EINVAL */
>> + ret = rb_find(rbtree, key, &value);
>> + if (!ret)
>> + return 2;
>> +
>> + return rb_destroy(rbtree);
>> +}
>
> The comment says rb_find should return -EINVAL when looking for a
> non-existent key in an empty tree, but looking at rb_find in
> rbtree.bpf.c:
>
> int rb_find(rbtree_t *rbtree, u64 key, u64 *value)
> {
> rbnode_t *node;
>
> if (unlikely(!rbtree))
> return -EINVAL;
>
> if (unlikely(!value))
> return -EINVAL;
>
> node = rbnode_find(rbtree->root, key, RBN_FIND_DFL);
> if (!node || node->key != key)
> return -ENOENT;
> ^^^^
>
> rb_find returns -EINVAL only when rbtree or value is NULL. For a valid
> rbtree with no matching node, rbnode_find returns NULL and rb_find
> returns -ENOENT. Should the comment say "Should return -ENOENT" instead?
>
Ack.
> [ ... ]
>
>> diff --git a/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c b/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c
>> new file mode 100644
>> index 000000000000..2ce2dfe5a223
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/libarena/src/rbtree.bpf.c
>> @@ -0,0 +1,1032 @@
>
> [ ... ]
>
>> +static
>> +rbnode_t *rbnode_find(rbnode_t *subtree, u64 key, enum rbnode_find_mode mode)
>> +{
>> + rbnode_t *node = subtree;
>> + int dir;
>> +
>> + if (!subtree)
>> + return NULL;
>> +
>> + while (can_loop) {
>> + /*
>> + * If we are not looking for the least upper bound,
>> + * we have found the only key. Otherwise keep going
>> + * until we find the "leftmost" instance of the key.
>> + */
>> + if (mode == RBN_FIND_DFL && node->key == key)
>> + break;
>> +
>> + dir = (key <= node->key) ? 0 : 1;
>> +
>> + if (!node->child[dir])
>> + break;
>> +
>> + /*
>> + * For least upper bound lookups, ensure that we
>> + * stop searching once we find the rightmost key.
>> + */
>> + if (node->key == key && node->child[dir]->key != key)
>> + break;
>> +
>> + node = node->child[dir];
>> + }
>> +
>> + return node;
>> +}
>
> [ ... ]
>
>> +static
>> +int rb_node_insert(rbtree_t *rbtree __arg_arena, rbnode_t *node __arg_arena)
>> +{
>> + rbnode_t *grandparent, *parent = rbtree->root;
>> + enum rbnode_find_mode mode;
>> + u64 key = node->key;
>> + rbnode_t *uncle;
>> + int dir;
>> + int ret;
>> +
>> + if (unlikely(!rbtree))
>> + return -EINVAL;
>> +
>> + if (!parent) {
>> + rbtree->root = node;
>> + return 0;
>> + }
>> +
>> + mode = RBN_FIND_DFL;
>> + /* Always find the first node with the key when allowing duplicates. */
>> + if (rbtree->insert == RB_DUPLICATE)
>> + mode = RBN_FIND_LUB;
>> + parent = rbnode_find(parent, key, mode);
>> +
>> + if (key == parent->key && rbtree->insert != RB_DUPLICATE) {
>> + if (rbtree->insert == RB_UPDATE) {
>> + /*
>> + * Replace the old node with the new one.
>> + * Free up the old node.
>> + */
>> + ret = rbnode_replace(rbtree, parent, node);
>> + if (ret)
>> + return ret;
>> +
>> + if (rbtree->alloc == RB_ALLOC)
>> + rb_node_free(rbtree, parent);
>> +
>> + return 0;
>> + }
>> +
>> + /* Otherwise it's RB_DEFAULT. */
>> + return -EALREADY;
>> + }
>> +
>> + node->parent = parent;
>> + /* Also works if key == parent->key. */
>> + if (key <= parent->key)
>> + parent->left = node;
>> + else
>> + parent->right = node;
>
> Can this overwrite an existing left subtree in RB_DUPLICATE mode?
>
> When rbnode_find is called with RBN_FIND_LUB mode, it has a third break
> condition that stops the search when it finds a node with a matching key
> whose next-direction child has a different key:
>
> rbnode_find() {
> ...
> if (node->key == key && node->child[dir]->key != key)
> break;
> ...
> }
>
> With key == node->key, dir is computed as (key <= node->key) ? 0 : 1 = 0.
> The search stops at a matching node that has a non-NULL left child with a
> smaller key.
>
> Back in rb_node_insert, the code skips the RB_DEFAULT check when
> rbtree->insert == RB_DUPLICATE, and falls through to:
>
> node->parent = parent;
> /* Also works if key == parent->key. */
> if (key <= parent->key)
> parent->left = node;
>
> Since key == parent->key, the condition is true and parent->left is
> overwritten with the new node. But parent->left already points to an
> existing subtree (the smaller node that triggered the third break
> condition). That existing left child is overwritten, orphaning its entire
> subtree.
>
> Reproduction:
> 1) rb_insert(rbtree, 5, v1) in RB_DUPLICATE mode -> tree: {5}
> 2) rb_insert(rbtree, 3, v2) -> tree: {5->left = 3}
> 3) rb_insert(rbtree, 5, v3)
>
> In step 3, rbnode_find returns the root node 5 via the third break
> condition (because child 3 has key != 5). Then rb_node_insert sets
> parent->left = new_node, overwriting the pointer to node 3. Node 3 is now
> orphaned and leaked.
>
> The comment claims "Also works if key == parent->key", but this only holds
> when rbnode_find returns a node with an empty left slot, as in the typical
> duplicate-chain case where all keys are identical. As soon as a duplicate
> insertion meets a matching node that has a pre-existing smaller left child,
> the assignment corrupts the tree.
>
> The tree integrity check in rb_integrity_check walks only nodes reachable
> from the root, so it does not detect orphaned nodes. The existing test
> test_rbtree_duplicate uses a single key value for all insertions, so it
> never exercises the interleaved case where a smaller key exists between
> duplicates.
>
This is accurate. This is a bug introduced when refactoring the
least_upper_bound() and find() internal methods that were almost
identical. Will fix in the next iteration.
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25699461713
next prev parent reply other threads:[~2026-05-12 20:18 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-11 21:40 [RESEND PATCH bpf-next 0/2] selftests/bpf: libarena: Add initial data structures Emil Tsalapatis
2026-05-11 21:40 ` [RESEND PATCH bpf-next 1/2] selftests/bpf: libarena: Add rbtree data structure Emil Tsalapatis
2026-05-11 22:24 ` bot+bpf-ci
2026-05-12 20:18 ` Emil Tsalapatis [this message]
2026-05-13 0:42 ` sashiko-bot
2026-05-11 21:41 ` [RESEND PATCH bpf-next 2/2] selftests/bpf: libarena: Add Lev-Chase queue " Emil Tsalapatis
2026-05-11 22:12 ` bot+bpf-ci
2026-05-12 20:20 ` Emil Tsalapatis
2026-05-13 2:23 ` sashiko-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DIGZ521R07O0.21CT010M1703B@etsalapatis.com \
--to=emil@etsalapatis.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bot+bpf-ci@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=clm@meta.com \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=ihor.solodrai@linux.dev \
--cc=martin.lau@kernel.org \
--cc=mattbobrowski@google.com \
--cc=memxor@gmail.com \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox