BPF List
 help / color / mirror / Atom feed
* [PATCH] libbpf: fix UAF in strset__add_str()
@ 2026-05-13 23:20 Carlos Llamas
  2026-05-13 23:55 ` bot+bpf-ci
  2026-05-14 11:57 ` sashiko-bot
  0 siblings, 2 replies; 6+ messages in thread
From: Carlos Llamas @ 2026-05-13 23:20 UTC (permalink / raw)
  To: Andrii Nakryiko, Eduard Zingerman, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend
  Cc: kernel-team, linux-kernel, Carlos Llamas, Andrii Nakryiko,
	open list:BPF [GENERAL] (Safe Dynamic Programs and Tools)

strset_add_str_mem() might reallocate the strset data buffer in order to
accommodate the provided string 's'. However, if 's' points to a string
already present in the buffer, it becomes dangling after the realloc.
This leads to a use-after-free when attempting to memcpy() the string
into the new buffer.

One scenario that triggers this problematic path is when resolve_btfids
attempts to patch kfunc prototypes using existing BTF parameter names:

 | resolve_btfids: function bpf_list_push_back_impl already exists in BTF
 | Segmentation fault (core dumped)

Compiling resolve_btfids with fsanitize=address generates a detailed
report of the UAF:

 | =================================================================
 | ERROR: AddressSanitizer: heap-use-after-free on address 0x7f4c4a500bd4
 | ==1507892==ERROR: AddressSanitizer: heap-use-after-free on address 0x7f4c4a500bd4 at pc 0x55d25155a2a8 bp 0x7ffcef879060 sp 0x7ffcef878818
 | READ of size 5 at 0x7f4c4a500bd4 thread T0
 |     #0 0x55d25155a2a7 in memcpy (tools/bpf/resolve_btfids/resolve_btfids+0xcf2a7)
 |     #1 0x55d2515d708e in strset__add_str tools/lib/bpf/strset.c:162:2
 |     #2 0x55d2515c730b in btf__add_str tools/lib/bpf/btf.c:2109:8
 |     #3 0x55d2515c9020 in btf__add_func_param tools/lib/bpf/btf.c:3108:14
 |     #4 0x55d25159f0b5 in process_kfunc_with_implicit_args tools/bpf/resolve_btfids/main.c:1196:9
 |     #5 0x55d25159e004 in btf2btf tools/bpf/resolve_btfids/main.c:1229:9
 |     #6 0x55d25159cee7 in main tools/bpf/resolve_btfids/main.c:1535:6
 |     #7 0x7f4c78e29f76 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
 |     #8 0x7f4c78e2a026 in __libc_start_main csu/../csu/libc-start.c:360:3
 |     #9 0x55d2514bb860 in _start (tools/bpf/resolve_btfids/resolve_btfids+0x30860)
 |
 | 0x7f4c4a500bd4 is located 13268 bytes inside of 2829000-byte region [0x7f4c4a4fd800,0x7f4c4a7b02c8)
 | freed by thread T0 here:
 |     #0 0x55d25155b700 in realloc (tools/bpf/resolve_btfids/resolve_btfids+0xd0700)
 |     #1 0x55d2515c426c in libbpf_reallocarray tools/lib/bpf/./libbpf_internal.h:220:9
 |     #2 0x55d2515c426c in libbpf_add_mem tools/lib/bpf/btf.c:224:13
 |
 | previously allocated by thread T0 here:
 |     #0 0x55d25155b2e3 in malloc (tools/bpf/resolve_btfids/resolve_btfids+0xd02e3)
 |     #1 0x55d2515d6e7d in strset__new tools/lib/bpf/strset.c:58:20

While resolve_btfids could be refactored to avoid this call path, let's
instead fix this issue at the source in strset__add_str() and avoid
similar scenarios. Let's simply check whether 's' is already within the
strset data buffer boundaries, and return the offset directly if so.

Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 tools/lib/bpf/strset.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
index 2464bcbd04e0..7d2b2784172e 100644
--- a/tools/lib/bpf/strset.c
+++ b/tools/lib/bpf/strset.c
@@ -141,10 +141,15 @@ int strset__find_str(struct strset *set, const char *s)
  */
 int strset__add_str(struct strset *set, const char *s)
 {
+	const char *strs = strset__data(set);
 	long old_off, new_off, len;
 	void *p;
 	int err;
 
+	/* Check whether 's' is already in the strset data buffer */
+	if (strs && s >= strs && s < strs + set->strs_data_len)
+		return s - strs;
+
 	/* Hashmap keys are always offsets within set->strs_data, so to even
 	 * look up some string from the "outside", we need to first append it
 	 * at the end, so that it can be addressed with an offset. Luckily,
-- 
2.54.0.563.g4f69b47b94-goog


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] libbpf: fix UAF in strset__add_str()
  2026-05-13 23:20 [PATCH] libbpf: fix UAF in strset__add_str() Carlos Llamas
@ 2026-05-13 23:55 ` bot+bpf-ci
  2026-05-14  1:10   ` Carlos Llamas
  2026-05-14 11:57 ` sashiko-bot
  1 sibling, 1 reply; 6+ messages in thread
From: bot+bpf-ci @ 2026-05-13 23:55 UTC (permalink / raw)
  To: cmllamas, andrii, eddyz87, ast, daniel, martin.lau, memxor, song,
	yonghong.song, jolsa, john.fastabend
  Cc: kernel-team, linux-kernel, cmllamas, andriin, bpf, ast, andrii,
	daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 3668 bytes --]

> strset_add_str_mem() might reallocate the strset data buffer in order to
> accommodate the provided string 's'. However, if 's' points to a string
> already present in the buffer, it becomes dangling after the realloc.
> This leads to a use-after-free when attempting to memcpy() the string
> into the new buffer.
>
> One scenario that triggers this problematic path is when resolve_btfids
> attempts to patch kfunc prototypes using existing BTF parameter names:
>
>  | resolve_btfids: function bpf_list_push_back_impl already exists in BTF
>  | Segmentation fault (core dumped)
>
> Compiling resolve_btfids with fsanitize=address generates a detailed
> report of the UAF:
>
>  | =================================================================
>  | ERROR: AddressSanitizer: heap-use-after-free on address 0x7f4c4a500bd4
>  | ==1507892==ERROR: AddressSanitizer: heap-use-after-free on address 0x7f4c4a500bd4 at pc 0x55d25155a2a8 bp 0x7ffcef879060 sp 0x7ffcef878818
>  | READ of size 5 at 0x7f4c4a500bd4 thread T0
>  |     #0 0x55d25155a2a7 in memcpy (tools/bpf/resolve_btfids/resolve_btfids+0xcf2a7)
>  |     #1 0x55d2515d708e in strset__add_str tools/lib/bpf/strset.c:162:2
>  |     #2 0x55d2515c730b in btf__add_str tools/lib/bpf/btf.c:2109:8
>  |     #3 0x55d2515c9020 in btf__add_func_param tools/lib/bpf/btf.c:3108:14
>  |     #4 0x55d25159f0b5 in process_kfunc_with_implicit_args tools/bpf/resolve_btfids/main.c:1196:9
>  |     #5 0x55d25159e004 in btf2btf tools/bpf/resolve_btfids/main.c:1229:9
>  |     #6 0x55d25159cee7 in main tools/bpf/resolve_btfids/main.c:1535:6
>  |     #7 0x7f4c78e29f76 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
>  |     #8 0x7f4c78e2a026 in __libc_start_main csu/../csu/libc-start.c:360:3
>  |     #9 0x55d2514bb860 in _start (tools/bpf/resolve_btfids/resolve_btfids+0x30860)
>  |
>  | 0x7f4c4a500bd4 is located 13268 bytes inside of 2829000-byte region [0x7f4c4a4fd800,0x7f4c4a7b02c8)
>  | freed by thread T0 here:
>  |     #0 0x55d25155b700 in realloc (tools/bpf/resolve_btfids/resolve_btfids+0xd0700)
>  |     #1 0x55d2515c426c in libbpf_reallocarray tools/lib/bpf/./libbpf_internal.h:220:9
>  |     #2 0x55d2515c426c in libbpf_add_mem tools/lib/bpf/btf.c:224:13
>  |
>  | previously allocated by thread T0 here:
>  |     #0 0x55d25155b2e3 in malloc (tools/bpf/resolve_btfids/resolve_btfids+0xd02e3)
>  |     #1 0x55d2515d6e7d in strset__new tools/lib/bpf/strset.c:58:20
>
> While resolve_btfids could be refactored to avoid this call path, let's
> instead fix this issue at the source in strset__add_str() and avoid
> similar scenarios. Let's simply check whether 's' is already within the
> strset data buffer boundaries, and return the offset directly if so.
>
> Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API")
> Signed-off-by: Carlos Llamas <cmllamas@google.com>

The existing Fixes: tag points to commit 919d2b1dbb07, but the bug is
actually in strset__add_str() which was introduced by commit 90d76d3ececc
("libbpf: Extract internal set-of-strings datastructure APIs").

While the vulnerable pattern originated in btf__add_str() from 919d2b1dbb07,
the code being fixed resides in tools/lib/bpf/strset.c which was created by
90d76d3ececc when the string set functionality was extracted into a separate
module.

Should this use:

  Fixes: 90d76d3ececc ("libbpf: Extract internal set-of-strings datastructure APIs")


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25832614669

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libbpf: fix UAF in strset__add_str()
  2026-05-13 23:55 ` bot+bpf-ci
@ 2026-05-14  1:10   ` Carlos Llamas
  0 siblings, 0 replies; 6+ messages in thread
From: Carlos Llamas @ 2026-05-14  1:10 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: andrii, eddyz87, ast, daniel, martin.lau, memxor, song,
	yonghong.song, jolsa, john.fastabend, kernel-team, linux-kernel,
	andriin, bpf, martin.lau, clm, ihor.solodrai

On Wed, May 13, 2026 at 11:55:53PM +0000, bot+bpf-ci@kernel.org wrote:
> >
> > Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API")
> > Signed-off-by: Carlos Llamas <cmllamas@google.com>
> 
> The existing Fixes: tag points to commit 919d2b1dbb07, but the bug is
> actually in strset__add_str() which was introduced by commit 90d76d3ececc
> ("libbpf: Extract internal set-of-strings datastructure APIs").

Yes, the bug is currently in strset__add_str(). But it used to be in
btf__add_str() before the problematic code got migrated into a separate
file.

> While the vulnerable pattern originated in btf__add_str() from 919d2b1dbb07,
> the code being fixed resides in tools/lib/bpf/strset.c which was created by
> 90d76d3ececc when the string set functionality was extracted into a separate
> module.

Right, you are making my point here.

> Should this use:
> 
>   Fixes: 90d76d3ececc ("libbpf: Extract internal set-of-strings datastructure APIs")

I would think the commit tagged as "Fixes:" should be the one that
introduced the issue and not the one that better fits the file name or
function no?

Another idea would be to tag instead the commit that introduced the
specific pattern that I ran into, which would be commit 9d199965990c
("resolve_btfids: Support for KF_IMPLICIT_ARGS").

Anyway, I'm happy to use any of these. It would be nice if a human can
confirm a preference though lol.

Regards,
--
Carlos Llamas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libbpf: fix UAF in strset__add_str()
  2026-05-13 23:20 [PATCH] libbpf: fix UAF in strset__add_str() Carlos Llamas
  2026-05-13 23:55 ` bot+bpf-ci
@ 2026-05-14 11:57 ` sashiko-bot
  2026-05-14 13:38   ` Mykyta Yatsenko
  1 sibling, 1 reply; 6+ messages in thread
From: sashiko-bot @ 2026-05-14 11:57 UTC (permalink / raw)
  To: Carlos Llamas; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Unpatched Use-After-Free in strset__find_str()
- [High] Use-After-Free in BTF add APIs due to btf_ensure_modifiable() freeing raw_data
- [Medium] Early return in strset__add_str() breaks hash map deduplication invariants
--

commit 2a7bfcb093726b849608e340f9bda7c5e3f56ac2
Author: Carlos Llamas <cmllamas@google.com>

libbpf: fix UAF in strset__add_str()

This commit addresses a heap-use-after-free in strset__add_str() that occurs
when the provided string points into the internal strset data buffer.
It adds a boundary check to return the offset directly if the string is
already within the buffer, avoiding a dangling pointer during reallocation.

[ ... ]
>  |     #0 0x55d25155a2a7 in memcpy (tools/bpf/resolve_btfids/resolve_btfids+0xcf2a7)
>  |     #1 0x55d2515d708e in strset__add_str tools/lib/bpf/strset.c:162:2
>  |     #2 0x55d2515c730b in btf__add_str tools/lib/bpf/btf.c:2109:8
>  |     #3 0x55d2515c9020 in btf__add_func_param tools/lib/bpf/btf.c:3108:14
[ ... ]

Does btf__add_str() have another use-after-free when operating on a
non-modifiable BTF object?

If a user calls btf__add_str(btf, s) where s is a pointer returned by
btf__name_by_offset() (which points into btf->raw_data),
btf_ensure_modifiable(btf) is called.

btf_ensure_modifiable() allocates new buffers and calls
btf_invalidate_raw_data(btf), which frees btf->raw_data.

At this point, the argument s becomes a dangling pointer before
strset__add_str() is even called, making the boundary check added by this
patch evaluate against the freed raw_data.

> diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
> index 2464bcbd04e03..7d2b2784172e5 100644
> --- a/tools/lib/bpf/strset.c
> +++ b/tools/lib/bpf/strset.c
> @@ -141,10 +141,15 @@ int strset__find_str(struct strset *set, const char *s)

Does strset__find_str() suffer from the same vulnerability being fixed here?

It follows the exact same pattern as strset__add_str(): it temporarily
appends s by calling strset_add_str_mem(set, len), which may realloc() the
internal buffer.

If s points into the internal buffer, the realloc() frees the old memory,
leaving s dangling before the memcpy() occurs.

>   */
>  int strset__add_str(struct strset *set, const char *s)
>  {
> +	const char *strs = strset__data(set);
>  	long old_off, new_off, len;
>  	void *p;
>  	int err;
>  
> +	/* Check whether 's' is already in the strset data buffer */
> +	if (strs && s >= strs && s < strs + set->strs_data_len)
> +		return s - strs;

Can this early return break hash map deduplication invariants if s is a
substring?

If a caller passes a pointer to a substring of an existing string, the
function returns the substring's offset but skips inserting it into
set->strs_hash.

Because the substring is never added to the hash map, subsequent lookups for
it via strset__find_str() will fail, and future additions from an external
pointer will duplicate the string.

Would it be safer to calculate the offset, perform the realloc(), and then
re-calculate the s pointer instead of returning early?

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260513232055.1681859-1-cmllamas@google.com?part=1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libbpf: fix UAF in strset__add_str()
  2026-05-14 11:57 ` sashiko-bot
@ 2026-05-14 13:38   ` Mykyta Yatsenko
  2026-05-14 18:39     ` Carlos Llamas
  0 siblings, 1 reply; 6+ messages in thread
From: Mykyta Yatsenko @ 2026-05-14 13:38 UTC (permalink / raw)
  To: sashiko-reviews, Carlos Llamas; +Cc: bpf



On 5/14/26 12:57 PM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
> - [High] Unpatched Use-After-Free in strset__find_str()
> - [High] Use-After-Free in BTF add APIs due to btf_ensure_modifiable() freeing raw_data
> - [Medium] Early return in strset__add_str() breaks hash map deduplication invariants
> --
> 
> commit 2a7bfcb093726b849608e340f9bda7c5e3f56ac2
> Author: Carlos Llamas <cmllamas@google.com>
> 
> libbpf: fix UAF in strset__add_str()
> 
> This commit addresses a heap-use-after-free in strset__add_str() that occurs
> when the provided string points into the internal strset data buffer.
> It adds a boundary check to return the offset directly if the string is
> already within the buffer, avoiding a dangling pointer during reallocation.
> 
> [ ... ]
>>  |     #0 0x55d25155a2a7 in memcpy (tools/bpf/resolve_btfids/resolve_btfids+0xcf2a7)
>>  |     #1 0x55d2515d708e in strset__add_str tools/lib/bpf/strset.c:162:2
>>  |     #2 0x55d2515c730b in btf__add_str tools/lib/bpf/btf.c:2109:8
>>  |     #3 0x55d2515c9020 in btf__add_func_param tools/lib/bpf/btf.c:3108:14
> [ ... ]
> 
> Does btf__add_str() have another use-after-free when operating on a
> non-modifiable BTF object?
> 
> If a user calls btf__add_str(btf, s) where s is a pointer returned by
> btf__name_by_offset() (which points into btf->raw_data),
> btf_ensure_modifiable(btf) is called.
> 
> btf_ensure_modifiable() allocates new buffers and calls
> btf_invalidate_raw_data(btf), which frees btf->raw_data.
> 
> At this point, the argument s becomes a dangling pointer before
> strset__add_str() is even called, making the boundary check added by this
> patch evaluate against the freed raw_data.
> 
>> diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
>> index 2464bcbd04e03..7d2b2784172e5 100644
>> --- a/tools/lib/bpf/strset.c
>> +++ b/tools/lib/bpf/strset.c
>> @@ -141,10 +141,15 @@ int strset__find_str(struct strset *set, const char *s)
> 
> Does strset__find_str() suffer from the same vulnerability being fixed here?
> 
> It follows the exact same pattern as strset__add_str(): it temporarily
> appends s by calling strset_add_str_mem(set, len), which may realloc() the
> internal buffer.
> 
> If s points into the internal buffer, the realloc() frees the old memory,
> leaving s dangling before the memcpy() occurs.
> 
>>   */
>>  int strset__add_str(struct strset *set, const char *s)
>>  {
>> +	const char *strs = strset__data(set);
>>  	long old_off, new_off, len;
>>  	void *p;
>>  	int err;
>>  
>> +	/* Check whether 's' is already in the strset data buffer */
>> +	if (strs && s >= strs && s < strs + set->strs_data_len)
>> +		return s - strs;
> 
> Can this early return break hash map deduplication invariants if s is a
> substring?
> 
> If a caller passes a pointer to a substring of an existing string, the
> function returns the substring's offset but skips inserting it into
> set->strs_hash.
> 
> Because the substring is never added to the hash map, subsequent lookups for
> it via strset__find_str() will fail, and future additions from an external
> pointer will duplicate the string.
> 
> Would it be safer to calculate the offset, perform the realloc(), and then
> re-calculate the s pointer instead of returning early?
> 

It looks like sashiko is right, addressing the same issue in strset__find_str()
and handling substring case could be useful.
maybe adding a helper like this:

/* 
 * Returns the offset of the string in set and length which set needs to
 * grow by to include the string.
 */
int strset__offset(struct strset *set, const char *s, long *off, long *grow)
{
	long len;
	const char *strs = strset__data(set);
	void *p;

	if (s >= strs && s < strs + set->strs_data_len) {
		*off = s - strs;
		*grow = 0;
		return 0;
	}

	len = strlen(s) + 1;
	p = strset_add_str_mem(set, len);
	if (!p)
		return -ENOMEM;
	memcpy(p, s, len);
	*off = set->strs_data_len;
	*grow = len;
	return 0;
} 

> [ ... ]
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libbpf: fix UAF in strset__add_str()
  2026-05-14 13:38   ` Mykyta Yatsenko
@ 2026-05-14 18:39     ` Carlos Llamas
  0 siblings, 0 replies; 6+ messages in thread
From: Carlos Llamas @ 2026-05-14 18:39 UTC (permalink / raw)
  To: Mykyta Yatsenko; +Cc: sashiko-reviews, bpf

On Thu, May 14, 2026 at 02:38:01PM +0100, Mykyta Yatsenko wrote:
> 
> 
> On 5/14/26 12:57 PM, sashiko-bot@kernel.org wrote:
> > Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
> > - [High] Unpatched Use-After-Free in strset__find_str()
> > - [High] Use-After-Free in BTF add APIs due to btf_ensure_modifiable() freeing raw_data
> > - [Medium] Early return in strset__add_str() breaks hash map deduplication invariants
> > --
> > 
> > commit 2a7bfcb093726b849608e340f9bda7c5e3f56ac2
> > Author: Carlos Llamas <cmllamas@google.com>
> > 
> > libbpf: fix UAF in strset__add_str()
> > 
> > This commit addresses a heap-use-after-free in strset__add_str() that occurs
> > when the provided string points into the internal strset data buffer.
> > It adds a boundary check to return the offset directly if the string is
> > already within the buffer, avoiding a dangling pointer during reallocation.
> > 
> > [ ... ]
> >>  |     #0 0x55d25155a2a7 in memcpy (tools/bpf/resolve_btfids/resolve_btfids+0xcf2a7)
> >>  |     #1 0x55d2515d708e in strset__add_str tools/lib/bpf/strset.c:162:2
> >>  |     #2 0x55d2515c730b in btf__add_str tools/lib/bpf/btf.c:2109:8
> >>  |     #3 0x55d2515c9020 in btf__add_func_param tools/lib/bpf/btf.c:3108:14
> > [ ... ]
> > 
> > Does btf__add_str() have another use-after-free when operating on a
> > non-modifiable BTF object?
> > 
> > If a user calls btf__add_str(btf, s) where s is a pointer returned by
> > btf__name_by_offset() (which points into btf->raw_data),
> > btf_ensure_modifiable(btf) is called.
> > 
> > btf_ensure_modifiable() allocates new buffers and calls
> > btf_invalidate_raw_data(btf), which frees btf->raw_data.
> > 
> > At this point, the argument s becomes a dangling pointer before
> > strset__add_str() is even called, making the boundary check added by this
> > patch evaluate against the freed raw_data.
> > 
> >> diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
> >> index 2464bcbd04e03..7d2b2784172e5 100644
> >> --- a/tools/lib/bpf/strset.c
> >> +++ b/tools/lib/bpf/strset.c
> >> @@ -141,10 +141,15 @@ int strset__find_str(struct strset *set, const char *s)
> > 
> > Does strset__find_str() suffer from the same vulnerability being fixed here?
> > 
> > It follows the exact same pattern as strset__add_str(): it temporarily
> > appends s by calling strset_add_str_mem(set, len), which may realloc() the
> > internal buffer.
> > 
> > If s points into the internal buffer, the realloc() frees the old memory,
> > leaving s dangling before the memcpy() occurs.
> > 
> >>   */
> >>  int strset__add_str(struct strset *set, const char *s)
> >>  {
> >> +	const char *strs = strset__data(set);
> >>  	long old_off, new_off, len;
> >>  	void *p;
> >>  	int err;
> >>  
> >> +	/* Check whether 's' is already in the strset data buffer */
> >> +	if (strs && s >= strs && s < strs + set->strs_data_len)
> >> +		return s - strs;
> > 
> > Can this early return break hash map deduplication invariants if s is a
> > substring?
> > 
> > If a caller passes a pointer to a substring of an existing string, the
> > function returns the substring's offset but skips inserting it into
> > set->strs_hash.
> > 
> > Because the substring is never added to the hash map, subsequent lookups for
> > it via strset__find_str() will fail, and future additions from an external
> > pointer will duplicate the string.
> > 
> > Would it be safer to calculate the offset, perform the realloc(), and then
> > re-calculate the s pointer instead of returning early?
> > 
> 
> It looks like sashiko is right, addressing the same issue in strset__find_str()
> and handling substring case could be useful.

The substring case is interesting, it can save memory space I suppose.
Although it sounds like a brand new use-case, since users would need to
perform some pointer manipulation on 's' first e.g. strstr(). Also, per
the current logic doing so would still trigger the same UAF.

It does seem like more than just fixing the bug though. Would adding the
support for substrings still get backported to stable without a user?

Yeah, it seems strset__find_str() suffers from the same issue. In
general, it feels wrong to call these two APIs with a string that is
already present in the buffer.

Anyway, thanks for the feedback and the example. I'll send a v2 with the
suggestions.

--
Carlos Llamas

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-05-14 18:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-13 23:20 [PATCH] libbpf: fix UAF in strset__add_str() Carlos Llamas
2026-05-13 23:55 ` bot+bpf-ci
2026-05-14  1:10   ` Carlos Llamas
2026-05-14 11:57 ` sashiko-bot
2026-05-14 13:38   ` Mykyta Yatsenko
2026-05-14 18:39     ` Carlos Llamas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox