[PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation

public inbox for linux-kernel@vger.kernel.org

* [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation
@ 2025-11-07 10:03 Alexei Safin
  2025-11-07 11:35 ` Yafang Shao
  2025-11-07 11:41 ` David Laight
  0 siblings, 2 replies; 7+ messages in thread
From: Alexei Safin @ 2025-11-07 10:03 UTC
  To: Alexei Starovoitov
  Cc: Alexei Safin, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Yafang Shao,
	bpf, linux-kernel, lvc-patches, stable

The intermediate product value_size * num_possible_cpus() is evaluated
in 32-bit arithmetic and only then promoted to 64 bits. On systems with
large value_size and many possible CPUs this can overflow and lead to
an underestimated memory usage.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 304849a27b34 ("bpf: hashtab memory usage")
Cc: stable@vger.kernel.org
Suggested-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Alexei Safin <a.safin@rosa.ru>
---
v2: Promote value_size to u64 at declaration to avoid 32-bit overflow
in all arithmetic using this variable (suggested by Yafang Shao)
 kernel/bpf/hashtab.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 570e2f723144..1f0add26ba3f 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -2252,7 +2252,7 @@ static long bpf_for_each_hash_elem(struct bpf_map *map, bpf_callback_t callback_
 static u64 htab_map_mem_usage(const struct bpf_map *map)
 {
 	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
-	u32 value_size = round_up(htab->map.value_size, 8);
+	u64 value_size = round_up(htab->map.value_size, 8);
 	bool prealloc = htab_is_prealloc(htab);
 	bool percpu = htab_is_percpu(htab);
 	bool lru = htab_is_lru(htab);
-- 
2.50.1 (Apple Git-155)



* Re: [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation
  2025-11-07 10:03 [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation Alexei Safin
@ 2025-11-07 11:35 ` Yafang Shao
  2025-11-07 11:41 ` David Laight
  1 sibling, 0 replies; 7+ messages in thread
From: Yafang Shao @ 2025-11-07 11:35 UTC
  To: Alexei Safin
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	bpf, linux-kernel, lvc-patches, stable

On Fri, Nov 7, 2025 at 6:03 PM Alexei Safin <a.safin@rosa.ru> wrote:
>
> The intermediate product value_size * num_possible_cpus() is evaluated
> in 32-bit arithmetic and only then promoted to 64 bits. On systems with
> large value_size and many possible CPUs this can overflow and lead to
> an underestimated memory usage.
>
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>
> Fixes: 304849a27b34 ("bpf: hashtab memory usage")
> Cc: stable@vger.kernel.org
> Suggested-by: Yafang Shao <laoar.shao@gmail.com>
> Signed-off-by: Alexei Safin <a.safin@rosa.ru>

Acked-by: Yafang Shao <laoar.shao@gmail.com>

> ---
> v2: Promote value_size to u64 at declaration to avoid 32-bit overflow
> in all arithmetic using this variable (suggested by Yafang Shao)
>  kernel/bpf/hashtab.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 570e2f723144..1f0add26ba3f 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -2252,7 +2252,7 @@ static long bpf_for_each_hash_elem(struct bpf_map *map, bpf_callback_t callback_
>  static u64 htab_map_mem_usage(const struct bpf_map *map)
>  {
>         struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> -       u32 value_size = round_up(htab->map.value_size, 8);
> +       u64 value_size = round_up(htab->map.value_size, 8);
>         bool prealloc = htab_is_prealloc(htab);
>         bool percpu = htab_is_percpu(htab);
>         bool lru = htab_is_lru(htab);
> --
> 2.50.1 (Apple Git-155)
>


-- 
Regards
Yafang


* Re: [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation
  2025-11-07 10:03 [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation Alexei Safin
  2025-11-07 11:35 ` Yafang Shao
@ 2025-11-07 11:41 ` David Laight
  2025-11-09  3:00   ` Yafang Shao
  1 sibling, 1 reply; 7+ messages in thread
From: David Laight @ 2025-11-07 11:41 UTC
  To: Alexei Safin
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Yafang Shao, bpf, linux-kernel, lvc-patches, stable

On Fri,  7 Nov 2025 13:03:05 +0300
Alexei Safin <a.safin@rosa.ru> wrote:

> The intermediate product value_size * num_possible_cpus() is evaluated
> in 32-bit arithmetic and only then promoted to 64 bits. On systems with
> large value_size and many possible CPUs this can overflow and lead to
> an underestimated memory usage.
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.

That code is insane.
The size being calculated looks like a kernel memory size.
You really don't want to be allocating single structures that exceed 4GB.

	David

> 
> Fixes: 304849a27b34 ("bpf: hashtab memory usage")
> Cc: stable@vger.kernel.org
> Suggested-by: Yafang Shao <laoar.shao@gmail.com>
> Signed-off-by: Alexei Safin <a.safin@rosa.ru>
> ---
> v2: Promote value_size to u64 at declaration to avoid 32-bit overflow
> in all arithmetic using this variable (suggested by Yafang Shao)
>  kernel/bpf/hashtab.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 570e2f723144..1f0add26ba3f 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -2252,7 +2252,7 @@ static long bpf_for_each_hash_elem(struct bpf_map *map, bpf_callback_t callback_
>  static u64 htab_map_mem_usage(const struct bpf_map *map)
>  {
>  	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> -	u32 value_size = round_up(htab->map.value_size, 8);
> +	u64 value_size = round_up(htab->map.value_size, 8);
>  	bool prealloc = htab_is_prealloc(htab);
>  	bool percpu = htab_is_percpu(htab);
>  	bool lru = htab_is_lru(htab);



* Re: [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation
  2025-11-07 11:41 ` David Laight
@ 2025-11-09  3:00   ` Yafang Shao
  2025-11-09  8:20     ` Yafang Shao
  0 siblings, 1 reply; 7+ messages in thread
From: Yafang Shao @ 2025-11-09  3:00 UTC
  To: David Laight
  Cc: Alexei Safin, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, bpf, linux-kernel, lvc-patches, stable

On Fri, Nov 7, 2025 at 7:41 PM David Laight
<david.laight.linux@gmail.com> wrote:
>
> On Fri,  7 Nov 2025 13:03:05 +0300
> Alexei Safin <a.safin@rosa.ru> wrote:
>
> > The intermediate product value_size * num_possible_cpus() is evaluated
> > in 32-bit arithmetic and only then promoted to 64 bits. On systems with
> > large value_size and many possible CPUs this can overflow and lead to
> > an underestimated memory usage.
> >
> > Found by Linux Verification Center (linuxtesting.org) with SVACE.
>
> That code is insane.
> The size being calculated looks like a kernel memory size.
> You really don't want to be allocating single structures that exceed 4GB.

I failed to get your point.
The calculation `value_size * num_possible_cpus() * num_entries` can
overflow. While the creation of a hashmap limits `value_size *
num_entries` to U32_MAX, this new formula can easily exceed that
limit. For example, on my test server with just 64 CPUs, the following
operation will trigger an overflow:

	map_fd = bpf_map_create(BPF_MAP_TYPE_PERCPU_HASH, "count_map", 4, 4,
				1 << 27, &map_opts);

-- 
Regards
Yafang


* Re: [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation
  2025-11-09  3:00   ` Yafang Shao
@ 2025-11-09  8:20     ` Yafang Shao
  2025-11-09 11:00       ` Алексей Сафин
  0 siblings, 1 reply; 7+ messages in thread
From: Yafang Shao @ 2025-11-09  8:20 UTC
  To: David Laight, Alexei Safin
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	bpf, linux-kernel, lvc-patches, stable

On Sun, Nov 9, 2025 at 11:00 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Fri, Nov 7, 2025 at 7:41 PM David Laight
> <david.laight.linux@gmail.com> wrote:
> >
> > On Fri,  7 Nov 2025 13:03:05 +0300
> > Alexei Safin <a.safin@rosa.ru> wrote:
> >
> > > The intermediate product value_size * num_possible_cpus() is evaluated
> > > in 32-bit arithmetic and only then promoted to 64 bits. On systems with
> > > large value_size and many possible CPUs this can overflow and lead to
> > > an underestimated memory usage.
> > >
> > > Found by Linux Verification Center (linuxtesting.org) with SVACE.
> >
> > That code is insane.
> > The size being calculated looks like a kernel memory size.
> > You really don't want to be allocating single structures that exceed 4GB.
>
> I failed to get your point.
> The calculation `value_size * num_possible_cpus() * num_entries` can
> overflow. While the creation of a hashmap limits `value_size *
> num_entries` to U32_MAX, this new formula can easily exceed that
> limit. For example, on my test server with just 64 CPUs, the following
> operation will trigger an overflow:
>
>           map_fd = bpf_map_create(BPF_MAP_TYPE_PERCPU_HASH, "count_map", 4, 4,
>                                                      1 << 27, &map_opts)

Upon reviewing the code, I see that `num_entries` is declared as u64,
which prevents overflow in the calculation `value_size *
num_possible_cpus() * num_entries`. Therefore, this change is
unnecessary.
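
To make the arithmetic for the 64-CPU example concrete, here is a minimal
sketch in plain userspace C. The function name is hypothetical and only
mirrors the shape of the expression quoted above (a 32-bit value_size, a
32-bit CPU count, a 64-bit num_entries); it is not the kernel code. It
covers only these particular numbers and says nothing about larger
value_size or CPU counts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the per-CPU term discussed in this thread:
 * value_size * cpus * num_entries, with a 32-bit value_size and a
 * 64-bit num_entries.
 */
static uint64_t percpu_values_bytes(uint32_t value_size, unsigned int cpus,
				    uint64_t num_entries)
{
	assert(cpus > 0);	/* a system has at least one possible CPU */

	/*
	 * Evaluated left to right: (value_size * cpus) is a 32-bit product,
	 * and only the second multiplication is done in 64 bits.
	 */
	return value_size * cpus * num_entries;
}
```

With value_size rounded up to 8, 64 CPUs and 1 << 27 entries, the 32-bit
intermediate is 8 * 64 = 512, which fits comfortably, and the 64-bit result
is 512 * 2^27 = 2^36 bytes (64 GiB).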

It seems that the Linux Verification Center (linuxtesting.org) needs
to be improved ;-)

-- 
Regards
Yafang


* Re: [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation
  2025-11-09  8:20     ` Yafang Shao
@ 2025-11-09 11:00       ` Алексей Сафин
  2025-11-09 12:10         ` Yafang Shao
  0 siblings, 1 reply; 7+ messages in thread
From: Алексей Сафин @ 2025-11-09 11:00 UTC
  To: Yafang Shao, David Laight
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	bpf, linux-kernel, lvc-patches, stable

Thanks for the follow-up.

Just to clarify: the overflow happens before the multiplication by
num_entries. In C, the * operator is left-associative, so the expression is
evaluated as (value_size * num_possible_cpus()) * num_entries. Since
value_size was u32 and num_possible_cpus() returns a 32-bit integer type, the
usual arithmetic conversions make the first product a 32-bit unsigned
multiplication, which wraps modulo 2^32. If that intermediate product wraps,
the result is already incorrect before it is converted to u64 for the
multiplication by num_entries.

A concrete example within allowed limits:
value_size = 1,048,576 (1 MiB), num_possible_cpus() = 4096
=> 1,048,576 * 4096 = 2^32 => wraps to 0 in 32 bits, even with 
num_entries = 1.

This isn’t about a single >4GiB allocation - it’s about aggregated memory
usage (percpu), which can legitimately exceed 4GiB in total.

v2 promotes value_size to u64 at declaration, which avoids the 32-bit
intermediate overflow cleanly.
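
The difference can be reproduced in userspace with a minimal sketch. The
two function names are hypothetical, the CPU count is passed in as a plain
parameter instead of calling num_possible_cpus(), and unsigned int is
assumed to be 32 bits wide; only the type of value_size differs between
the two variants.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch shape: value_size is 32-bit, so the first product is 32-bit. */
static uint64_t usage_u32(uint32_t value_size, unsigned int cpus,
			  uint64_t num_entries)
{
	assert(cpus > 0);
	/* (value_size * cpus) wraps modulo 2^32 before it is widened. */
	return value_size * cpus * num_entries;
}

/* v2 shape: value_size is 64-bit, so every product is 64-bit. */
static uint64_t usage_u64(uint64_t value_size, unsigned int cpus,
			  uint64_t num_entries)
{
	assert(cpus > 0);
	/* cpus is converted to 64 bits before the first multiplication. */
	return value_size * cpus * num_entries;
}
```

For value_size = 1048576, cpus = 4096 and num_entries = 1, usage_u32()
returns 0 while usage_u64() returns 4294967296 (2^32).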

On 09.11.2025 11:20, Yafang Shao wrote:
> On Sun, Nov 9, 2025 at 11:00 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>> On Fri, Nov 7, 2025 at 7:41 PM David Laight
>> <david.laight.linux@gmail.com> wrote:
>>> On Fri,  7 Nov 2025 13:03:05 +0300
>>> Alexei Safin <a.safin@rosa.ru> wrote:
>>>
>>>> The intermediate product value_size * num_possible_cpus() is evaluated
>>>> in 32-bit arithmetic and only then promoted to 64 bits. On systems with
>>>> large value_size and many possible CPUs this can overflow and lead to
>>>> an underestimated memory usage.
>>>>
>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>> That code is insane.
>>> The size being calculated looks like a kernel memory size.
>>> You really don't want to be allocating single structures that exceed 4GB.
>> I failed to get your point.
>> The calculation `value_size * num_possible_cpus() * num_entries` can
>> overflow. While the creation of a hashmap limits `value_size *
>> num_entries` to U32_MAX, this new formula can easily exceed that
>> limit. For example, on my test server with just 64 CPUs, the following
>> operation will trigger an overflow:
>>
>>            map_fd = bpf_map_create(BPF_MAP_TYPE_PERCPU_HASH, "count_map", 4, 4,
>>                                                       1 << 27, &map_opts)
> Upon reviewing the code, I see that `num_entries` is declared as u64,
> which prevents overflow in the calculation `value_size *
> num_possible_cpus() * num_entries`. Therefore, this change is
> unnecessary.
>
> It seems that the Linux Verification Center (linuxtesting.org) needs
> to be improved ;-)
>


* Re: [PATCH v2] bpf: hashtab: fix 32-bit overflow in memory usage calculation
  2025-11-09 11:00       ` Алексей Сафин
@ 2025-11-09 12:10         ` Yafang Shao
  0 siblings, 0 replies; 7+ messages in thread
From: Yafang Shao @ 2025-11-09 12:10 UTC
  To: Алексей Сафин
  Cc: David Laight, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, bpf, linux-kernel, lvc-patches, stable

On Sun, Nov 9, 2025 at 7:00 PM Алексей Сафин <a.safin@rosa.ru> wrote:
>
> Thanks for the follow-up.
>
> Just to clarify: the overflow happens before the multiplication by
> num_entries. In C, the * operator is left-associative, so the expression is
> evaluated as (value_size * num_possible_cpus()) * num_entries. Since
> value_size was u32 and num_possible_cpus() returns a 32-bit integer type, the
> usual arithmetic conversions make the first product a 32-bit unsigned
> multiplication. If that intermediate product wraps, the result is already
> incorrect before it is converted to u64 for the multiplication by num_entries.
>
> A concrete example within allowed limits:
> value_size = 1,048,576 (1 MiB), num_possible_cpus() = 4096
> => 1,048,576 * 4096 = 2^32 => wraps to 0 in 32 bits, even with
> num_entries = 1.

Thank you for the clarification.

Based on my understanding, the maximum value_size for a percpu hashmap
appears to be constrained by PCPU_MIN_UNIT_SIZE (32768), as referenced
in htab_map_alloc_check():

  https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/bpf/hashtab.c#n457

With value_size capped at 32768 (2^15), num_possible_cpus() would have to
reach 131072 (2^17) before the 32-bit product could reach 2^32 and wrap.
However, the maximum number of CPUs supported on x86_64 is typically 8192
in standard kernel configurations. I'm uncertain if any architectures
actually support systems at this scale.
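
As a rough check of that bound, the following userspace sketch computes
the smallest CPU count at which a 32-bit product value_size * cpus stops
fitting. The helper name is hypothetical, and the 32768 cap is taken from
the discussion above rather than read from kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Smallest CPU count for which value_size * cpus no longer fits in
 * 32 bits, i.e. value_size * cpus >= 2^32.
 */
static uint64_t min_cpus_to_wrap(uint32_t value_size)
{
	assert(value_size != 0);	/* avoid dividing by zero */

	/* ceil(2^32 / value_size), computed in 64-bit arithmetic */
	return ((UINT64_C(1) << 32) + value_size - 1) / value_size;
}
```

min_cpus_to_wrap(32768) gives 131072, whereas with 8192 CPUs the product
is only 32768 * 8192 = 2^28.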


>
> This isn’t about a single >4GiB allocation - it’s about aggregated memory
> usage (percpu), which can legitimately exceed 4GiB in total.
>
> v2 promotes value_size to u64 at declaration, which avoids the 32-bit
> intermediate overflow cleanly.


--
Regards
Yafang

