From: sashiko-bot@kernel.org
To: "Carlos Llamas" <cmllamas@google.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH v3] libbpf: fix UAF in strset__add_str()
Date: Mon, 18 May 2026 05:31:23 +0000
Message-ID: <20260518053124.2EF28C2BCB7@smtp.kernel.org>
In-Reply-To: <20260518050550.2600101-1-cmllamas@google.com>

Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
- [High] Use-After-Free in btf__add_* APIs when called with strings from btf__name_by_offset on an unmodifiable BTF.
- [Medium] Undefined Behavior in pointer comparison and arithmetic potentially re-introducing UAF.
--

commit c0c79bff673db7048c82410ff03ad09f2f25dbab
Author: Carlos Llamas <cmllamas@google.com>

    libbpf: fix UAF in strset__add_str()

    strset_add_str_mem() might reallocate the strset data buffer in order to
    accommodate the provided string 's'. However, if 's' points to a string
    already present in the buffer, it becomes dangling after the realloc.

> diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
> index 2464bcbd04e03..d229961ff2fcd 100644
> --- a/tools/lib/bpf/strset.c
> +++ b/tools/lib/bpf/strset.c

[ ... ]

> @@ -107,6 +107,37 @@ static void *strset_add_str_mem(struct strset *set, size_t add_sz)
>  			      set->strs_data_len, set->strs_data_max_len, add_sz);
>  }
>  
> +static long strset_str_append(struct strset *set, const char *s)
> +{
> +	const char *old_data = strset__data(set);
> +	long len = strlen(s) + 1;
> +	void *p;
> +
> +	/* Hashmap keys are always offsets within set->strs_data, so to even
> +	 * look up some string from the "outside", we need to first append it
> +	 * at the end, so that it can be addressed with an offset. Luckily,
> +	 * until set->strs_data_len is incremented, that string is just a piece
> +	 * of garbage for the rest of the code, so no harm, no foul. On the
> +	 * other hand, if the string is unique, it's already appended and
> +	 * ready to be used, only a simple set->strs_data_len increment away.
> +	 */
> +	p = strset_add_str_mem(set, len);
> +	if (!p)
> +		return -ENOMEM;
> +
> +	/* The set->strs_data might have reallocated and if 's' pointed
> +	 * to an internal string within the old buffer, then it became
> +	 * dangling and needs to be reconstructed before the copy.
> +	 */
> +	if (old_data && old_data != strset__data(set) &&
> +	    s >= old_data && s < old_data + set->strs_data_len)

Does this comparison invoke undefined behavior if old_data was freed by
realloc() inside strset_add_str_mem() or if s is an unrelated pointer?

According to the C standard, applying relational operators to, or
subtracting, pointers that do not point into the same object is undefined
behavior, and a pointer's value becomes indeterminate once the object it
refers to has been freed. A compiler that tracks pointer provenance might
deduce that the bounds check can never be true and optimize it away,
leaving the dangling 's' in place.

Could these pointers be cast to uintptr_t before comparison to safely check
the bounds?
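
A minimal sketch of one way to do that, under stated assumptions: struct
buf, internal_off() and buf_append() are hypothetical stand-ins and not
libbpf code. The membership test compares integer addresses and runs
before any realloc(), so the old buffer pointer is never inspected after
it may have been freed.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct strset; not the libbpf definition. */
struct buf {
	char *data;
	size_t len;	/* bytes in use */
	size_t cap;	/* bytes allocated */
};

/* Returns the offset of s inside b->data, or -1 if s is external.
 * Addresses are compared as integers rather than as pointers.
 */
static long internal_off(const struct buf *b, const char *s)
{
	uintptr_t base = (uintptr_t)b->data;
	uintptr_t addr = (uintptr_t)s;

	if (!b->data || addr < base || addr >= base + b->len)
		return -1;
	return (long)(addr - base);
}

/* Copies s (including its NUL) past b->len without bumping b->len. */
static long buf_append(struct buf *b, const char *s)
{
	long off = internal_off(b, s);	/* taken before any realloc */
	size_t len = strlen(s) + 1;	/* also read while s is valid */

	if (b->len + len > b->cap) {
		size_t new_cap = (b->len + len) * 2;
		char *p = realloc(b->data, new_cap);

		if (!p)
			return -ENOMEM;
		b->data = p;
		b->cap = new_cap;
	}

	if (off >= 0)
		s = b->data + off;	/* rebuild from the live buffer */

	memcpy(b->data + b->len, s, len);
	return (long)len;
}
```

Recording the offset up front also removes the need to keep the old base
pointer around for a post-realloc comparison.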

Separately, does a similar use-after-free still exist in the btf__add_*
APIs (e.g. btf__add_str) when they are called with strings obtained from
btf__name_by_offset() on an unmodifiable BTF?

If a user passes a string retrieved via btf__name_by_offset(), the
string pointer points directly into btf->raw_data.

Inside the btf__add_* APIs, btf_ensure_modifiable(btf) is called first,
which calls btf_invalidate_raw_data() and frees btf->raw_data.

Could this leave the 's' pointer dangling before it even reaches this
check? The check added here uses old_data, which corresponds to the
btf->strs_set buffer, not the freed btf->raw_data, so it wouldn't rescue
the pointer.
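
For illustration only, the ordering hazard described above can be
sketched as follows. struct obj, obj_invalidate_raw() and obj_add_str()
are hypothetical names, not the libbpf API, and this is just one possible
mitigation: snapshot the caller's string before the cached blob is
dropped, so the later copy never reads freed memory.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object whose read-only cache is dropped on first write. */
struct obj {
	char *raw;		/* cached blob; callers may point into it */
	char *strs;		/* modifiable string section */
	size_t strs_len;
};

/* Drops the cache, as a "make modifiable" step might. */
static void obj_invalidate_raw(struct obj *o)
{
	free(o->raw);
	o->raw = NULL;
}

/* Adds s to o->strs; returns its offset or a negative errno. */
static long obj_add_str(struct obj *o, const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy, *p;
	long off;

	copy = malloc(len);		/* snapshot while s is still valid */
	if (!copy)
		return -ENOMEM;
	memcpy(copy, s, len);

	obj_invalidate_raw(o);		/* s may dangle from here on */

	p = realloc(o->strs, o->strs_len + len);
	if (!p) {
		free(copy);
		return -ENOMEM;
	}
	memcpy(p + o->strs_len, copy, len);
	off = (long)o->strs_len;
	o->strs = p;
	o->strs_len += len;

	free(copy);
	return off;
}
```

The temporary copy costs an extra allocation per call; whether that
trade-off is acceptable here is a question for the maintainers.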

> +		s = strset__data(set) + (s - old_data);
> +
> +	memcpy(p, s, len);
> +
> +	return len;
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260518050550.2600101-1-cmllamas@google.com?part=1
