From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: sashiko-reviews@lists.linux.dev, Carlos Llamas <cmllamas@google.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH] libbpf: fix UAF in strset__add_str()
Date: Thu, 14 May 2026 14:38:01 +0100
Message-ID: <89d34016-cf82-4beb-989c-e4fc2e3cd29e@gmail.com>
In-Reply-To: <20260514115711.133F4C2BCB3@smtp.kernel.org>
On 5/14/26 12:57 PM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
> - [High] Unpatched Use-After-Free in strset__find_str()
> - [High] Use-After-Free in BTF add APIs due to btf_ensure_modifiable() freeing raw_data
> - [Medium] Early return in strset__add_str() breaks hash map deduplication invariants
> --
>
> commit 2a7bfcb093726b849608e340f9bda7c5e3f56ac2
> Author: Carlos Llamas <cmllamas@google.com>
>
> libbpf: fix UAF in strset__add_str()
>
> This commit addresses a heap-use-after-free in strset__add_str() that occurs
> when the provided string points into the internal strset data buffer.
> It adds a boundary check to return the offset directly if the string is
> already within the buffer, avoiding a dangling pointer during reallocation.
>
> [ ... ]
>> | #0 0x55d25155a2a7 in memcpy (tools/bpf/resolve_btfids/resolve_btfids+0xcf2a7)
>> | #1 0x55d2515d708e in strset__add_str tools/lib/bpf/strset.c:162:2
>> | #2 0x55d2515c730b in btf__add_str tools/lib/bpf/btf.c:2109:8
>> | #3 0x55d2515c9020 in btf__add_func_param tools/lib/bpf/btf.c:3108:14
> [ ... ]
>
> Does btf__add_str() have another use-after-free when operating on a
> non-modifiable BTF object?
>
> If a user calls btf__add_str(btf, s) where s is a pointer returned by
> btf__name_by_offset() (which points into btf->raw_data),
> btf_ensure_modifiable(btf) is called.
>
> btf_ensure_modifiable() allocates new buffers and calls
> btf_invalidate_raw_data(btf), which frees btf->raw_data.
>
> At this point, the argument s becomes a dangling pointer before
> strset__add_str() is even called, making the boundary check added by this
> patch evaluate against the freed raw_data.
>
>> diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
>> index 2464bcbd04e03..7d2b2784172e5 100644
>> --- a/tools/lib/bpf/strset.c
>> +++ b/tools/lib/bpf/strset.c
>> @@ -141,10 +141,15 @@ int strset__find_str(struct strset *set, const char *s)
>
> Does strset__find_str() suffer from the same vulnerability being fixed here?
>
> It follows the exact same pattern as strset__add_str(): it temporarily
> appends s by calling strset_add_str_mem(set, len), which may realloc() the
> internal buffer.
>
> If s points into the internal buffer, the realloc() frees the old memory,
> leaving s dangling before the memcpy() occurs.
>
>> */
>> int strset__add_str(struct strset *set, const char *s)
>> {
>> + const char *strs = strset__data(set);
>> long old_off, new_off, len;
>> void *p;
>> int err;
>>
>> + /* Check whether 's' is already in the strset data buffer */
>> + if (strs && s >= strs && s < strs + set->strs_data_len)
>> + return s - strs;
>
> Can this early return break hash map deduplication invariants if s is a
> substring?
>
> If a caller passes a pointer to a substring of an existing string, the
> function returns the substring's offset but skips inserting it into
> set->strs_hash.
>
> Because the substring is never added to the hash map, subsequent lookups for
> it via strset__find_str() will fail, and future additions from an external
> pointer will duplicate the string.
>
> Would it be safer to calculate the offset, perform the realloc(), and then
> re-calculate the s pointer instead of returning early?
>
It looks like sashiko is right: fixing the same issue in strset__find_str()
and handling the substring case would be useful.
Maybe add a helper like this:
/*
 * Returns the offset of the string in set and length which set needs to
 * grow by to include the string.
 */
int strset__offset(struct strset *set, const char *s, long *off, long *grow)
{
	long len;
	const char *strs = strset__data(set);
	void *p;

	if (s >= strs && s < strs + set->strs_data_len) {
		*off = s - strs;
		*grow = 0;
		return 0;
	}

	len = strlen(s) + 1;
	p = strset_add_str_mem(set, len);
	if (!p)
		return -ENOMEM;

	memcpy(p, s, len);
	*off = set->strs_data_len;
	*grow = len;
	return 0;
}
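
For comparison, here is a minimal standalone sketch of the other option
sashiko raises: remember the offset, realloc(), then recompute the pointer
before copying. struct toy_strs and toy_strs_add() are hypothetical names
made up for this illustration, not libbpf's strset API, and the hash map
deduplication step is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy buffer for illustration; not libbpf's struct strset. */
struct toy_strs {
	char *data;	/* NUL-terminated strings stored back to back */
	size_t len;	/* bytes in use */
	size_t cap;	/* bytes allocated */
};

/*
 * Append a copy of 's' and return its offset, or -1 if allocation fails.
 * 's' may point into buf->data: its offset is saved before realloc() and
 * the pointer is recomputed afterwards, so the old block is never read.
 */
static long toy_strs_add(struct toy_strs *buf, const char *s)
{
	size_t n, src_off = 0;
	int internal = 0;
	long new_off;

	assert(buf && s);
	n = strlen(s) + 1;	/* measured while 's' is still valid */

	/* Compare as integers; relational ops on unrelated pointers are UB. */
	if (buf->data && (uintptr_t)s >= (uintptr_t)buf->data &&
	    (uintptr_t)s < (uintptr_t)buf->data + buf->len) {
		internal = 1;
		src_off = (size_t)(s - buf->data);
	}

	if (buf->len + n > buf->cap) {
		size_t cap = (buf->len + n) * 2;
		char *p = realloc(buf->data, cap);

		if (!p)
			return -1;
		buf->data = p;
		buf->cap = cap;
	}

	if (internal)
		s = buf->data + src_off;	/* re-derive from the new base */

	/* Source ends at or before buf->len, destination starts at it. */
	memcpy(buf->data + buf->len, s, n);
	new_off = (long)buf->len;
	buf->len += n;
	return new_off;
}
```

Starting from a zeroed struct toy_strs, adding "hello" and "world" and then
passing buf.data + 2 appends a fresh copy of "llo" even though that call
has to grow the buffer. A real fix would still need to decide what happens
with the hash map in the substring case.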
> [ ... ]
>
Thread overview: 8+ messages
2026-05-13 23:20 [PATCH] libbpf: fix UAF in strset__add_str() Carlos Llamas
2026-05-13 23:55 ` bot+bpf-ci
2026-05-14 1:10 ` Carlos Llamas
2026-05-14 11:57 ` sashiko-bot
2026-05-14 13:38 ` Mykyta Yatsenko [this message]
2026-05-14 18:39 ` Carlos Llamas
2026-05-15 4:47 ` [PATCH v2] " Carlos Llamas
2026-05-15 22:08 ` Andrii Nakryiko