From: Leon Hwang <leon.hwang@linux.dev>
To: bot+bpf-ci@kernel.org, bpf@vger.kernel.org
Cc: martin.lau@linux.dev, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, eddyz87@gmail.com, song@kernel.org,
	yonghong.song@linux.dev, john.fastabend@gmail.com,
	kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com,
	jolsa@kernel.org, shuah@kernel.org, skb99@linux.ibm.com,
	davem@davemloft.net, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, kernel-patches-bot@fb.com,
	martin.lau@kernel.org, clm@meta.com, ihor.solodrai@linux.dev
Subject: Re: [PATCH bpf-next v2 4/5] bpf: lru: Fix unintended eviction when updating lru hash maps
Date: Tue, 6 Jan 2026 00:16:43 +0800
Message-ID: <f67e288a-7c66-4e7c-8335-2d4f6376e8cb@linux.dev>
In-Reply-To: <b9810ef33fea33c42cae606c6e3673147668f16b9e88303d47ceeeeea200712e@mail.kernel.org>



On 2026/1/5 23:43, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> index d029690246f8..8665eb6b8a7d 100644
>> --- a/kernel/bpf/hashtab.c
>> +++ b/kernel/bpf/hashtab.c
> 
> [ ... ]
> 
>> +static int htab_lru_map_update_elem_in_place(struct bpf_htab *htab, void *key, void *value,
>> +					     u64 map_flags, struct bucket *b,
>> +					     struct hlist_nulls_head *head, u32 hash,
>> +					     bool percpu, bool onallcpus)
>> +{
>> +	struct htab_elem *l_new, *l_old, *l_free;
>> +	struct bpf_map *map = &htab->map;
>> +	u32 key_size = map->key_size;
>> +	struct bpf_lru_node *node;
>> +	unsigned long flags;
>> +	void *l_val;
>> +	int ret;
>> +
>> +	node = bpf_lru_pop_extra(&htab->lru);
>> +	if (!node)
>> +		return -ENOENT;
>> +
>> +	l_new = container_of(node, struct htab_elem, lru_node);
>> +	l_new->hash = hash;
>> +	memcpy(l_new->key, key, key_size);
>> +	if (!percpu) {
>> +		l_val = htab_elem_value(l_new, map->key_size);
>> +		copy_map_value(map, l_val, value);
>> +		bpf_obj_free_fields(map->record, l_val);
>> +	}
> 
> Should bpf_obj_free_fields() be called here on l_new? The value was just
> copied into l_new and l_new will be inserted into the hash table when
> l_old is found. Calling bpf_obj_free_fields() on l_new's value frees the
> special fields (kptr, bpf_list_head, etc.) that were just copied, then
> l_new is inserted with freed fields. This appears to be a use-after-free
> for maps with special fields.
> 
> Looking at htab_lru_map_update_elem(), the pattern is to copy the value
> and NOT call bpf_obj_free_fields() on the new element. The fields are
> only freed when an element is removed from the hash table.
> 

That's not right: the copied value must be cleaned by
bpf_obj_free_fields().

>> +
>> +	ret = htab_lock_bucket(b, &flags);
>> +	if (ret)
>> +		goto err_lock_bucket;
>> +
>> +	l_old = lookup_elem_raw(head, hash, key, key_size);
>> +
>> +	ret = check_flags(htab, l_old, map_flags);
>> +	if (ret)
>> +		goto err;
>> +
>> +	if (l_old) {
>> +		bpf_lru_node_set_ref(&l_new->lru_node);
>> +		if (percpu) {
>> +			/* per-cpu hash map can update value in-place.
>> +			 * Keep the same logic in __htab_lru_percpu_map_update_elem().
>> +			 */
>> +			pcpu_copy_value(htab, htab_elem_get_ptr(l_old, key_size),
>> +					value, onallcpus);
>> +			l_free = l_new;
>> +		} else {
>> +			hlist_nulls_add_head_rcu(&l_new->hash_node, head);
>> +			hlist_nulls_del_rcu(&l_old->hash_node);
>> +			l_free = l_old;
>> +		}
>> +	} else {
>> +		ret = -ENOENT;
>> +	}
>> +
>> +err:
>> +	htab_unlock_bucket(b, flags);
>> +
>> +err_lock_bucket:
>> +	bpf_lru_push_free(&htab->lru, ret ? node : &l_free->lru_node);
>                                               ^^^^^^^^^^^^^^^^^^
> 
> When ret == 0 and !percpu, l_free points to l_old which is being removed
> from the hash table. The code calls bpf_lru_push_free() directly instead
> of htab_lru_push_free(). Looking at htab_lru_push_free():
> 
>     static void htab_lru_push_free(struct bpf_htab *htab, struct htab_elem *elem)
>     {
>         check_and_free_fields(htab, elem);
>         bpf_map_dec_elem_count(&htab->map);
>         bpf_lru_push_free(&htab->lru, &elem->lru_node);
>     }
> 
> It calls check_and_free_fields() to free special fields in the element
> before pushing to the free list. The code here bypasses that, which would
> leak special fields (kptr, etc.) in l_old when it has them.
> 
> Should this use htab_lru_push_free() instead of bpf_lru_push_free() for
> the non-error path where l_old is being freed?
> 

You're right that I did not free the special fields of l_old. However,
htab_lru_push_free() shouldn't be used on the non-error path, because
bpf_map_dec_elem_count() is not required here.
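
As a rough userspace sketch of that distinction (all toy_* types and
helpers below are hypothetical stand-ins, not the kernel's code, and not
necessarily what the next revision will look like): freeing an element's
special fields and decrementing the element count are separate steps, and
the replace path wants only the first one.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for illustration only; not the kernel's real types. */
struct toy_elem {
	int *special;		/* models a special field such as a kptr */
	int on_free_list;	/* set once the element is back on the free list */
};

struct toy_htab {
	long elem_count;	/* models the map's element counter */
	long live_fields;	/* special fields that have not been released */
};

/* Attach a special field to an element. */
static void toy_set_field(struct toy_htab *htab, struct toy_elem *e, int v)
{
	e->special = malloc(sizeof(*e->special));
	if (!e->special)
		abort();
	*e->special = v;
	htab->live_fields++;
}

/* Models the "free special fields" step on its own. */
static void toy_free_fields(struct toy_htab *htab, struct toy_elem *e)
{
	if (e->special) {
		free(e->special);
		e->special = NULL;
		htab->live_fields--;
	}
}

/* Models returning a node to the LRU free list. */
static void toy_lru_push_free(struct toy_elem *e)
{
	e->on_free_list = 1;
}

/* Models the delete path: free fields, drop the count, push free. */
static void toy_delete_push_free(struct toy_htab *htab, struct toy_elem *e)
{
	toy_free_fields(htab, e);
	htab->elem_count--;
	toy_lru_push_free(e);
}

/*
 * Hypothetical replace path: the old element is swapped out for a new
 * one, so its fields are freed but the element count stays the same.
 */
static void toy_replace_push_free(struct toy_htab *htab, struct toy_elem *old)
{
	toy_free_fields(htab, old);
	toy_lru_push_free(old);
}
```

With one counted element, toy_replace_push_free() leaves elem_count at 1
and releases the field, whereas toy_delete_push_free() would drop the
count to 0.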

I'll fix it in the next revision.

Thanks,
Leon

>> +
>> +	return ret;
>> +}
> 
> 
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> 
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20720201621


