BPF List
* [PATCH bpf v2 0/2] Fix hashmap overflow checks for 32-bit arches
@ 2024-02-29 11:22 Toke Høiland-Jørgensen
  2024-02-29 11:22 ` [PATCH bpf v2 1/2] bpf: Fix DEVMAP_HASH overflow check on " Toke Høiland-Jørgensen
  2024-02-29 11:22 ` [PATCH bpf v2 2/2] bpf: Fix hashtab " Toke Høiland-Jørgensen
  0 siblings, 2 replies; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-29 11:22 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, David S. Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Toke Høiland-Jørgensen
  Cc: Jesper Dangaard Brouer, netdev, bpf

Syzbot managed to trigger a crash by creating a DEVMAP_HASH map with a
large number of buckets, because the overflow check relies on shift
behaviour that is only well-defined on 64-bit arches.

Fix the overflow checks to happen before values are rounded up.

v2:
- Fix off-by-one error in overflow check
- Apply the same fix to hashtab, where the devmap_hash code was copied
  from (John)

Toke Høiland-Jørgensen (2):
  bpf: Fix DEVMAP_HASH overflow check on 32-bit arches
  bpf: Fix hashtab overflow check on 32-bit arches

 kernel/bpf/devmap.c  |  8 +++-----
 kernel/bpf/hashtab.c | 10 +++++-----
 2 files changed, 8 insertions(+), 10 deletions(-)

-- 
2.43.2



* [PATCH bpf v2 1/2] bpf: Fix DEVMAP_HASH overflow check on 32-bit arches
  2024-02-29 11:22 [PATCH bpf v2 0/2] Fix hashmap overflow checks for 32-bit arches Toke Høiland-Jørgensen
@ 2024-02-29 11:22 ` Toke Høiland-Jørgensen
  2024-02-29 11:22 ` [PATCH bpf v2 2/2] bpf: Fix hashtab " Toke Høiland-Jørgensen
  1 sibling, 0 replies; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-29 11:22 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, David S. Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Toke Høiland-Jørgensen
  Cc: syzbot+8cd36f6b65f3cafd400a, netdev, bpf

The devmap code allocates a number of hash buckets equal to the next power of
two of the max_entries value provided when creating the map. When rounding
up to the next power of two, the 32-bit variable storing the number of
buckets can overflow, and the code checks for overflow by checking if the
truncated 32-bit value is equal to 0. However, on 32-bit arches the
rounding up itself can overflow mid-way through, because it ends up doing a
left-shift of 32 bits on an unsigned long value. If the size of an unsigned
long is four bytes, this is undefined behaviour, so there is no guarantee
that we'll end up with a nice and tidy 0-value at the end.
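For illustration, the fixed logic can be sketched in userspace like this
(the helper names here are made up for the example; the kernel implements
the roundup via fls()/ilog2() internally):

```c
#include <stdint.h>

/* Portable stand-in for the kernel's fls(): position of the highest
 * set bit, counting from 1 (fls32(0) == 0). */
static int fls32(uint32_t x)
{
	int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* Sketch of the fixed devmap logic: reject max_entries before the
 * roundup can overflow.  With max_entries capped at U32_MAX / 2 + 1
 * (0x80000000), the shift count below is at most 31, so the shift is
 * well-defined even when unsigned long is only 32 bits wide. */
static int n_buckets_or_err(uint32_t max_entries, uint32_t *n_buckets)
{
	if (max_entries > UINT32_MAX / 2 + 1)
		return -1; /* the kernel returns -EINVAL here */

	/* roundup_pow_of_two(n) == 1 << fls(n - 1) for n > 1 */
	*n_buckets = (max_entries <= 1) ?
		1 : (uint32_t)1 << fls32(max_entries - 1);
	return 0;
}
```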

Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH
with max_entries > 0x80000000 and then trying to update it. Fix this by
moving the overflow check to before the rounding up operation.

Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 kernel/bpf/devmap.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index a936c704d4e7..d08888e5f994 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -130,13 +130,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 	bpf_map_init_from_attr(&dtab->map, attr);
 
 	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
-		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
-
-		if (!dtab->n_buckets) /* Overflow check */
+		if (dtab->map.max_entries > U32_MAX / 2 + 1)
 			return -EINVAL;
-	}
 
-	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
+		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
+
 		dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets,
 							   dtab->map.numa_node);
 		if (!dtab->dev_index_head)
-- 
2.43.2



* [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-02-29 11:22 [PATCH bpf v2 0/2] Fix hashmap overflow checks for 32-bit arches Toke Høiland-Jørgensen
  2024-02-29 11:22 ` [PATCH bpf v2 1/2] bpf: Fix DEVMAP_HASH overflow check on " Toke Høiland-Jørgensen
@ 2024-02-29 11:22 ` Toke Høiland-Jørgensen
  2024-02-29 17:07   ` Alexei Starovoitov
  1 sibling, 1 reply; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-29 11:22 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller
  Cc: Toke Høiland-Jørgensen, bpf

The hashtab code relies on roundup_pow_of_two() to compute the number of
hash buckets, and contains an overflow check by checking if the resulting
value is 0. However, on 32-bit arches, the roundup code itself can overflow
by doing a 32-bit left-shift of an unsigned long value, which is undefined
behaviour, so it is not guaranteed to truncate neatly. This was triggered
by syzbot on the DEVMAP_HASH type, which contains the same check, copied
from the hashtab code. So apply the same fix to hashtab, by moving the
overflow check to before the roundup.

The hashtab code also contained a check that prevents the total allocation
size for the buckets from overflowing a 32-bit value, but since all the
allocation code uses u64s, this does not really seem to be necessary, so
drop it and keep only the strict overflow check of the n_buckets variable.
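To see why the u32 allocation-size check is no longer needed, consider the
worst case the new check admits. This is a userspace sketch; the 32-byte
sizeof(struct bucket) is an assumed illustrative value (the real size is
config-dependent):

```c
#include <stdint.h>

/* The kernel's bucket-array allocation path computes sizes in u64, so
 * even the new worst case of 0x80000000 buckets cannot overflow: the
 * product stays far below U64_MAX. */
static uint64_t bucket_array_bytes(uint32_t n_buckets, uint32_t bucket_size)
{
	return (uint64_t)n_buckets * bucket_size;
}
```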

Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 kernel/bpf/hashtab.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 03a6a2500b6a..4caf8dab18b0 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 							  num_possible_cpus());
 	}
 
-	/* hash table size must be power of 2 */
-	htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
 
 	htab->elem_size = sizeof(struct htab_elem) +
 			  round_up(htab->map.key_size, 8);
@@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 		htab->elem_size += round_up(htab->map.value_size, 8);
 
 	err = -E2BIG;
-	/* prevent zero size kmalloc and check for u32 overflow */
-	if (htab->n_buckets == 0 ||
-	    htab->n_buckets > U32_MAX / sizeof(struct bucket))
+	/* prevent overflow in roundup below */
+	if (htab->map.max_entries > U32_MAX / 2 + 1)
 		goto free_htab;
 
+	/* hash table size must be power of 2 */
+	htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
+
 	err = bpf_map_init_elem_count(&htab->map);
 	if (err)
 		goto free_htab;
-- 
2.43.2



* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-02-29 11:22 ` [PATCH bpf v2 2/2] bpf: Fix hashtab " Toke Høiland-Jørgensen
@ 2024-02-29 17:07   ` Alexei Starovoitov
  2024-02-29 22:21     ` John Fastabend
  0 siblings, 1 reply; 13+ messages in thread
From: Alexei Starovoitov @ 2024-02-29 17:07 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> The hashtab code relies on roundup_pow_of_two() to compute the number of
> hash buckets, and contains an overflow check by checking if the resulting
> value is 0. However, on 32-bit arches, the roundup code itself can overflow
> by doing a 32-bit left-shift of an unsigned long value, which is undefined
> behaviour, so it is not guaranteed to truncate neatly. This was triggered
> by syzbot on the DEVMAP_HASH type, which contains the same check, copied
> from the hashtab code. So apply the same fix to hashtab, by moving the
> overflow check to before the roundup.
>
> The hashtab code also contained a check that prevents the total allocation
> size for the buckets from overflowing a 32-bit value, but since all the
> allocation code uses u64s, this does not really seem to be necessary, so
> drop it and keep only the strict overflow check of the n_buckets variable.
>
> Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  kernel/bpf/hashtab.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 03a6a2500b6a..4caf8dab18b0 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>                                                           num_possible_cpus());
>         }
>
> -       /* hash table size must be power of 2 */
> -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
>
>         htab->elem_size = sizeof(struct htab_elem) +
>                           round_up(htab->map.key_size, 8);
> @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>                 htab->elem_size += round_up(htab->map.value_size, 8);
>
>         err = -E2BIG;
> -       /* prevent zero size kmalloc and check for u32 overflow */
> -       if (htab->n_buckets == 0 ||
> -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
> +       /* prevent overflow in roundup below */
> +       if (htab->map.max_entries > U32_MAX / 2 + 1)
>                 goto free_htab;

No. We cannot artificially reduce max_entries that will break real users.
Hash table with 4B elements is not that uncommon.

pw-bot: cr


* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-02-29 17:07   ` Alexei Starovoitov
@ 2024-02-29 22:21     ` John Fastabend
  2024-03-01 12:35       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 13+ messages in thread
From: John Fastabend @ 2024-02-29 22:21 UTC (permalink / raw)
  To: Alexei Starovoitov, Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

Alexei Starovoitov wrote:
> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > The hashtab code relies on roundup_pow_of_two() to compute the number of
> > hash buckets, and contains an overflow check by checking if the resulting
> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
> > from the hashtab code. So apply the same fix to hashtab, by moving the
> > overflow check to before the roundup.
> >
> > The hashtab code also contained a check that prevents the total allocation
> > size for the buckets from overflowing a 32-bit value, but since all the
> > allocation code uses u64s, this does not really seem to be necessary, so
> > drop it and keep only the strict overflow check of the n_buckets variable.
> >
> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> > ---
> >  kernel/bpf/hashtab.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > index 03a6a2500b6a..4caf8dab18b0 100644
> > --- a/kernel/bpf/hashtab.c
> > +++ b/kernel/bpf/hashtab.c
> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >                                                           num_possible_cpus());
> >         }
> >
> > -       /* hash table size must be power of 2 */
> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> >
> >         htab->elem_size = sizeof(struct htab_elem) +
> >                           round_up(htab->map.key_size, 8);
> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >                 htab->elem_size += round_up(htab->map.value_size, 8);
> >
> >         err = -E2BIG;
> > -       /* prevent zero size kmalloc and check for u32 overflow */
> > -       if (htab->n_buckets == 0 ||
> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
> > +       /* prevent overflow in roundup below */
> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
> >                 goto free_htab;
> 
> No. We cannot artificially reduce max_entries that will break real users.
> Hash table with 4B elements is not that uncommon.

Agree how about return E2BIG in these cases (32bit arch and overflow) and 
let user figure it out. That makes more sense to me.

> 
> pw-bot: cr




* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-02-29 22:21     ` John Fastabend
@ 2024-03-01 12:35       ` Toke Høiland-Jørgensen
  2024-03-01 17:15         ` Alexei Starovoitov
  2024-03-01 17:21         ` John Fastabend
  0 siblings, 2 replies; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-03-01 12:35 UTC (permalink / raw)
  To: John Fastabend, Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

John Fastabend <john.fastabend@gmail.com> writes:

> Alexei Starovoitov wrote:
>> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >
>> > The hashtab code relies on roundup_pow_of_two() to compute the number of
>> > hash buckets, and contains an overflow check by checking if the resulting
>> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
>> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
>> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
>> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
>> > from the hashtab code. So apply the same fix to hashtab, by moving the
>> > overflow check to before the roundup.
>> >
>> > The hashtab code also contained a check that prevents the total allocation
>> > size for the buckets from overflowing a 32-bit value, but since all the
>> > allocation code uses u64s, this does not really seem to be necessary, so
>> > drop it and keep only the strict overflow check of the n_buckets variable.
>> >
>> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
>> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> > ---
>> >  kernel/bpf/hashtab.c | 10 +++++-----
>> >  1 file changed, 5 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> > index 03a6a2500b6a..4caf8dab18b0 100644
>> > --- a/kernel/bpf/hashtab.c
>> > +++ b/kernel/bpf/hashtab.c
>> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>> >                                                           num_possible_cpus());
>> >         }
>> >
>> > -       /* hash table size must be power of 2 */
>> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
>> >
>> >         htab->elem_size = sizeof(struct htab_elem) +
>> >                           round_up(htab->map.key_size, 8);
>> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>> >                 htab->elem_size += round_up(htab->map.value_size, 8);
>> >
>> >         err = -E2BIG;
>> > -       /* prevent zero size kmalloc and check for u32 overflow */
>> > -       if (htab->n_buckets == 0 ||
>> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
>> > +       /* prevent overflow in roundup below */
>> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
>> >                 goto free_htab;
>> 
>> No. We cannot artificially reduce max_entries that will break real users.
>> Hash table with 4B elements is not that uncommon.

Erm, huh? The existing code has the n_buckets > U32_MAX / sizeof(struct
bucket) check, which limits max_entries to 134M (0x8000000). This patch
is *increasing* the maximum allowable size by a factor of 16 (to 2.1B or
0x80000000).
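The arithmetic behind those two ceilings can be checked directly. This
sketch assumes sizeof(struct bucket) == 32 on the configuration being
discussed, which is what the 134M figure implies:

```c
#include <stdint.h>

/* Ceiling imposed by the old check:
 * n_buckets > U32_MAX / sizeof(struct bucket) */
static uint32_t old_limit(uint32_t bucket_size)
{
	return UINT32_MAX / bucket_size;
}

/* Ceiling imposed by the new check on max_entries. */
static uint32_t new_limit(void)
{
	return UINT32_MAX / 2 + 1;
}
```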

> Agree how about return E2BIG in these cases (32bit arch and overflow) and 
> let user figure it out. That makes more sense to me.

Isn't that exactly what this patch does? What am I missing here?

-Toke



* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-03-01 12:35       ` Toke Høiland-Jørgensen
@ 2024-03-01 17:15         ` Alexei Starovoitov
  2024-03-04 13:02           ` Toke Høiland-Jørgensen
  2024-03-01 17:21         ` John Fastabend
  1 sibling, 1 reply; 13+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 17:15 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: John Fastabend, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

On Fri, Mar 1, 2024 at 4:35 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> John Fastabend <john.fastabend@gmail.com> writes:
>
> > Alexei Starovoitov wrote:
> >> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >
> >> > The hashtab code relies on roundup_pow_of_two() to compute the number of
> >> > hash buckets, and contains an overflow check by checking if the resulting
> >> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
> >> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
> >> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
> >> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
> >> > from the hashtab code. So apply the same fix to hashtab, by moving the
> >> > overflow check to before the roundup.
> >> >
> >> > The hashtab code also contained a check that prevents the total allocation
> >> > size for the buckets from overflowing a 32-bit value, but since all the
> >> > allocation code uses u64s, this does not really seem to be necessary, so
> >> > drop it and keep only the strict overflow check of the n_buckets variable.
> >> >
> >> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
> >> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >> > ---
> >> >  kernel/bpf/hashtab.c | 10 +++++-----
> >> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> >> > index 03a6a2500b6a..4caf8dab18b0 100644
> >> > --- a/kernel/bpf/hashtab.c
> >> > +++ b/kernel/bpf/hashtab.c
> >> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >> >                                                           num_possible_cpus());
> >> >         }
> >> >
> >> > -       /* hash table size must be power of 2 */
> >> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> >> >
> >> >         htab->elem_size = sizeof(struct htab_elem) +
> >> >                           round_up(htab->map.key_size, 8);
> >> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >> >                 htab->elem_size += round_up(htab->map.value_size, 8);
> >> >
> >> >         err = -E2BIG;
> >> > -       /* prevent zero size kmalloc and check for u32 overflow */
> >> > -       if (htab->n_buckets == 0 ||
> >> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
> >> > +       /* prevent overflow in roundup below */
> >> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
> >> >                 goto free_htab;
> >>
> >> No. We cannot artificially reduce max_entries that will break real users.
> >> Hash table with 4B elements is not that uncommon.
>
> Erm, huh? The existing code has the n_buckets > U32_MAX / sizeof(struct
> bucket) check, which limits max_entries to 134M (0x8000000). This patch
> is *increasing* the maximum allowable size by a factor of 16 (to 2.1B or
> 0x80000000).
>
> > Agree how about return E2BIG in these cases (32bit arch and overflow) and
> > let user figure it out. That makes more sense to me.
>
> Isn't that exactly what this patch does? What am I missing here?

I see. Then what are you fixing?
roundup_pow_of_two() will return 0 and existing code is fine as-is.


* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-03-01 12:35       ` Toke Høiland-Jørgensen
  2024-03-01 17:15         ` Alexei Starovoitov
@ 2024-03-01 17:21         ` John Fastabend
  1 sibling, 0 replies; 13+ messages in thread
From: John Fastabend @ 2024-03-01 17:21 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, John Fastabend,
	Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

Toke Høiland-Jørgensen wrote:
> John Fastabend <john.fastabend@gmail.com> writes:
> 
> > Alexei Starovoitov wrote:
> >> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >
> >> > The hashtab code relies on roundup_pow_of_two() to compute the number of
> >> > hash buckets, and contains an overflow check by checking if the resulting
> >> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
> >> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
> >> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
> >> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
> >> > from the hashtab code. So apply the same fix to hashtab, by moving the
> >> > overflow check to before the roundup.
> >> >
> >> > The hashtab code also contained a check that prevents the total allocation
> >> > size for the buckets from overflowing a 32-bit value, but since all the
> >> > allocation code uses u64s, this does not really seem to be necessary, so
> >> > drop it and keep only the strict overflow check of the n_buckets variable.
> >> >
> >> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
> >> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >> > ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

> >> >  kernel/bpf/hashtab.c | 10 +++++-----
> >> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> >> > index 03a6a2500b6a..4caf8dab18b0 100644
> >> > --- a/kernel/bpf/hashtab.c
> >> > +++ b/kernel/bpf/hashtab.c
> >> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >> >                                                           num_possible_cpus());
> >> >         }
> >> >
> >> > -       /* hash table size must be power of 2 */
> >> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> >> >
> >> >         htab->elem_size = sizeof(struct htab_elem) +
> >> >                           round_up(htab->map.key_size, 8);
> >> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >> >                 htab->elem_size += round_up(htab->map.value_size, 8);
> >> >
> >> >         err = -E2BIG;
> >> > -       /* prevent zero size kmalloc and check for u32 overflow */
> >> > -       if (htab->n_buckets == 0 ||
> >> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
> >> > +       /* prevent overflow in roundup below */
> >> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
> >> >                 goto free_htab;
> >> 
> >> No. We cannot artificially reduce max_entries that will break real users.
> >> Hash table with 4B elements is not that uncommon.
> 
> Erm, huh? The existing code has the n_buckets > U32_MAX / sizeof(struct
> bucket) check, which limits max_entries to 134M (0x8000000). This patch
> is *increasing* the maximum allowable size by a factor of 16 (to 2.1B or
> 0x80000000).

Yep, makes sense from my side, so ACK from me. Maybe put it in the commit
message, since it wasn't obvious.


> 
> > Agree how about return E2BIG in these cases (32bit arch and overflow) and 
> > let user figure it out. That makes more sense to me.
> 
> Isn't that exactly what this patch does? What am I missing here?

Nothing, it was me; I must have been tired. Sorry about that.

> 
> -Toke
> 




* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-03-01 17:15         ` Alexei Starovoitov
@ 2024-03-04 13:02           ` Toke Høiland-Jørgensen
  2024-03-06  5:29             ` Alexei Starovoitov
  0 siblings, 1 reply; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-03-04 13:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: John Fastabend, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Fri, Mar 1, 2024 at 4:35 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> John Fastabend <john.fastabend@gmail.com> writes:
>>
>> > Alexei Starovoitov wrote:
>> >> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> >
>> >> > The hashtab code relies on roundup_pow_of_two() to compute the number of
>> >> > hash buckets, and contains an overflow check by checking if the resulting
>> >> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
>> >> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
>> >> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
>> >> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
>> >> > from the hashtab code. So apply the same fix to hashtab, by moving the
>> >> > overflow check to before the roundup.
>> >> >
>> >> > The hashtab code also contained a check that prevents the total allocation
>> >> > size for the buckets from overflowing a 32-bit value, but since all the
>> >> > allocation code uses u64s, this does not really seem to be necessary, so
>> >> > drop it and keep only the strict overflow check of the n_buckets variable.
>> >> >
>> >> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
>> >> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> >> > ---
>> >> >  kernel/bpf/hashtab.c | 10 +++++-----
>> >> >  1 file changed, 5 insertions(+), 5 deletions(-)
>> >> >
>> >> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> >> > index 03a6a2500b6a..4caf8dab18b0 100644
>> >> > --- a/kernel/bpf/hashtab.c
>> >> > +++ b/kernel/bpf/hashtab.c
>> >> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>> >> >                                                           num_possible_cpus());
>> >> >         }
>> >> >
>> >> > -       /* hash table size must be power of 2 */
>> >> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
>> >> >
>> >> >         htab->elem_size = sizeof(struct htab_elem) +
>> >> >                           round_up(htab->map.key_size, 8);
>> >> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>> >> >                 htab->elem_size += round_up(htab->map.value_size, 8);
>> >> >
>> >> >         err = -E2BIG;
>> >> > -       /* prevent zero size kmalloc and check for u32 overflow */
>> >> > -       if (htab->n_buckets == 0 ||
>> >> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
>> >> > +       /* prevent overflow in roundup below */
>> >> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
>> >> >                 goto free_htab;
>> >>
>> >> No. We cannot artificially reduce max_entries that will break real users.
>> >> Hash table with 4B elements is not that uncommon.
>>
>> Erm, huh? The existing code has the n_buckets > U32_MAX / sizeof(struct
>> bucket) check, which limits max_entries to 134M (0x8000000). This patch
>> is *increasing* the maximum allowable size by a factor of 16 (to 2.1B or
>> 0x80000000).
>>
>> > Agree how about return E2BIG in these cases (32bit arch and overflow) and
>> > let user figure it out. That makes more sense to me.
>>
>> Isn't that exactly what this patch does? What am I missing here?
>
> I see. Then what are you fixing?
> roundup_pow_of_two() will return 0 and existing code is fine as-is.

On 64-bit arches it will, yes. On 32-bit arches it ends up doing a
32-bit left-shift (1UL << 32) of a 32-bit type (unsigned long), which is
UB, so there's no guarantee that it truncates down to 0. And it seems at
least on arm32 it does not: syzbot managed to trigger a crash in the
DEVMAP_HASH code by creating a map with more than 0x80000000 entries:

https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
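The failure mode can be sketched in userspace without executing the UB
itself. roundup_pow_of_two(n) evaluates 1UL << fls(n - 1); for any
n > 0x80000000 the shift count is 32, the full width of a 32-bit unsigned
long, which C leaves undefined (C11 6.5.7p3). What actually happens is
target-dependent: x86, for instance, masks the shift count to 5 bits in
hardware, which would yield 1 rather than the 0 the old "!n_buckets"
check expects (fls32 below is a portable stand-in for the kernel's fls()):

```c
#include <stdint.h>

/* Portable stand-in for the kernel's fls(). */
static int fls32(uint32_t x)
{
	int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* Shift count that roundup_pow_of_two(n) == 1UL << fls(n - 1) would
 * perform for a given max_entries value. */
static int roundup_shift(uint32_t n)
{
	return fls32(n - 1);
}
```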

This patch just preemptively applies the same fix to the hashtab code,
since I could not find any reason why it shouldn't be possible to hit
the same issue there. I haven't actually managed to trigger a crash
there, though (I don't have any arm32 hardware to test this on), so in
that sense it's a bit theoretical for hashtab. So up to you if you want
to take this, but even if you don't, could you please apply the first
patch? That does fix the issue reported by syzbot (cf the
reported-and-tested-by tag).

-Toke



* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-03-04 13:02           ` Toke Høiland-Jørgensen
@ 2024-03-06  5:29             ` Alexei Starovoitov
  2024-03-06 10:32               ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 13+ messages in thread
From: Alexei Starovoitov @ 2024-03-06  5:29 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: John Fastabend, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

On Mon, Mar 4, 2024 at 5:02 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>
> > On Fri, Mar 1, 2024 at 4:35 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> John Fastabend <john.fastabend@gmail.com> writes:
> >>
> >> > Alexei Starovoitov wrote:
> >> >> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >> >
> >> >> > The hashtab code relies on roundup_pow_of_two() to compute the number of
> >> >> > hash buckets, and contains an overflow check by checking if the resulting
> >> >> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
> >> >> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
> >> >> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
> >> >> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
> >> >> > from the hashtab code. So apply the same fix to hashtab, by moving the
> >> >> > overflow check to before the roundup.
> >> >> >
> >> >> > The hashtab code also contained a check that prevents the total allocation
> >> >> > size for the buckets from overflowing a 32-bit value, but since all the
> >> >> > allocation code uses u64s, this does not really seem to be necessary, so
> >> >> > drop it and keep only the strict overflow check of the n_buckets variable.
> >> >> >
> >> >> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
> >> >> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >> >> > ---
> >> >> >  kernel/bpf/hashtab.c | 10 +++++-----
> >> >> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >> >> >
> >> >> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> >> >> > index 03a6a2500b6a..4caf8dab18b0 100644
> >> >> > --- a/kernel/bpf/hashtab.c
> >> >> > +++ b/kernel/bpf/hashtab.c
> >> >> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >> >> >                                                           num_possible_cpus());
> >> >> >         }
> >> >> >
> >> >> > -       /* hash table size must be power of 2 */
> >> >> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> >> >> >
> >> >> >         htab->elem_size = sizeof(struct htab_elem) +
> >> >> >                           round_up(htab->map.key_size, 8);
> >> >> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >> >> >                 htab->elem_size += round_up(htab->map.value_size, 8);
> >> >> >
> >> >> >         err = -E2BIG;
> >> >> > -       /* prevent zero size kmalloc and check for u32 overflow */
> >> >> > -       if (htab->n_buckets == 0 ||
> >> >> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
> >> >> > +       /* prevent overflow in roundup below */
> >> >> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
> >> >> >                 goto free_htab;
> >> >>
> >> >> No. We cannot artificially reduce max_entries that will break real users.
> >> >> Hash table with 4B elements is not that uncommon.
> >>
> >> Erm, huh? The existing code has the n_buckets > U32_MAX / sizeof(struct
> >> bucket) check, which limits max_entries to 134M (0x8000000). This patch
> >> is *increasing* the maximum allowable size by a factor of 16 (to 2.1B or
> >> 0x80000000).
> >>
> >> > Agree how about return E2BIG in these cases (32bit arch and overflow) and
> >> > let user figure it out. That makes more sense to me.
> >>
> >> Isn't that exactly what this patch does? What am I missing here?
> >
> > I see. Then what are you fixing?
> > roundup_pow_of_two() will return 0 and existing code is fine as-is.
>
> On 64-bit arches it will, yes. On 32-bit arches it ends up doing a
> 32-bit left-shift (1UL << 32) of a 32-bit type (unsigned long), which is
> UB, so there's no guarantee that it truncates down to 0. And it seems at
> least on arm32 it does not: syzbot managed to trigger a crash in the
> DEVMAP_HASH code by creating a map with more than 0x80000000 entries:
>
> https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
>
> This patch just preemptively applies the same fix to the hashtab code,
> since I could not find any reason why it shouldn't be possible to hit
> the same issue there. I haven't actually managed to trigger a crash
> there, though (I don't have any arm32 hardware to test this on), so in
> that sense it's a bit theoretical for hashtab. So up to you if you want
> to take this, but even if you don't, could you please apply the first
> patch? That does fix the issue reported by syzbot (cf the
> reported-and-tested-by tag).

I see.
Since roundup_pow_of_two() is non-deterministic on 32-bit archs,
let's fix them all.

We have at least 5 to fix:
bloom_filter.c:                 nr_bits = roundup_pow_of_two(nr_bits);
devmap.c:               dtab->n_buckets =
roundup_pow_of_two(dtab->map.max_entries);
hashtab.c:      htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
stackmap.c:     n_buckets = roundup_pow_of_two(attr->max_entries);

hashtab.c:           htab->map.max_entries = roundup(attr->max_entries,
                                                num_possible_cpus());

bloom_filter looks ok as-is,
but stack_map has the same issue as devmap and hashtab.

Let's check for
if (max_entries > (1u << 31))
in 3 maps and that should be enough to cover all 5 cases?

imo 1u << 31 is much easier to visualize than U32_MAX/2+1

and don't touch other checks.
This patch is removing the U32_MAX / sizeof(struct bucket) check,
and with that it introduces an overflow just a few lines below, in bpf_map_area_alloc().

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-03-06  5:29             ` Alexei Starovoitov
@ 2024-03-06 10:32               ` Toke Høiland-Jørgensen
  2024-03-06 16:53                 ` Alexei Starovoitov
  0 siblings, 1 reply; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-03-06 10:32 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: John Fastabend, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Mon, Mar 4, 2024 at 5:02 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Fri, Mar 1, 2024 at 4:35 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> John Fastabend <john.fastabend@gmail.com> writes:
>> >>
>> >> > Alexei Starovoitov wrote:
>> >> >> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> >> >
>> >> >> > The hashtab code relies on roundup_pow_of_two() to compute the number of
>> >> >> > hash buckets, and contains an overflow check by checking if the resulting
>> >> >> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
>> >> >> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
>> >> >> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
>> >> >> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
>> >> >> > from the hashtab code. So apply the same fix to hashtab, by moving the
>> >> >> > overflow check to before the roundup.
>> >> >> >
>> >> >> > The hashtab code also contained a check that prevents the total allocation
>> >> >> > size for the buckets from overflowing a 32-bit value, but since all the
>> >> >> > allocation code uses u64s, this does not really seem to be necessary, so
>> >> >> > drop it and keep only the strict overflow check of the n_buckets variable.
>> >> >> >
>> >> >> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
>> >> >> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> >> >> > ---
>> >> >> >  kernel/bpf/hashtab.c | 10 +++++-----
>> >> >> >  1 file changed, 5 insertions(+), 5 deletions(-)
>> >> >> >
>> >> >> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> >> >> > index 03a6a2500b6a..4caf8dab18b0 100644
>> >> >> > --- a/kernel/bpf/hashtab.c
>> >> >> > +++ b/kernel/bpf/hashtab.c
>> >> >> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>> >> >> >                                                           num_possible_cpus());
>> >> >> >         }
>> >> >> >
>> >> >> > -       /* hash table size must be power of 2 */
>> >> >> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
>> >> >> >
>> >> >> >         htab->elem_size = sizeof(struct htab_elem) +
>> >> >> >                           round_up(htab->map.key_size, 8);
>> >> >> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>> >> >> >                 htab->elem_size += round_up(htab->map.value_size, 8);
>> >> >> >
>> >> >> >         err = -E2BIG;
>> >> >> > -       /* prevent zero size kmalloc and check for u32 overflow */
>> >> >> > -       if (htab->n_buckets == 0 ||
>> >> >> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
>> >> >> > +       /* prevent overflow in roundup below */
>> >> >> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
>> >> >> >                 goto free_htab;
>> >> >>
>> >> >> No. We cannot artificially reduce max_entries that will break real users.
>> >> >> Hash table with 4B elements is not that uncommon.
>> >>
>> >> Erm, huh? The existing code has the n_buckets > U32_MAX / sizeof(struct
>> >> bucket) check, which limits max_entries to 134M (0x8000000). This patch
>> >> is *increasing* the maximum allowable size by a factor of 16 (to 2.1B or
>> >> 0x80000000).
>> >>
>> >> > Agree how about return E2BIG in these cases (32bit arch and overflow) and
>> >> > let user figure it out. That makes more sense to me.
>> >>
>> >> Isn't that exactly what this patch does? What am I missing here?
>> >
>> > I see. Then what are you fixing?
>> > roundup_pow_of_two() will return 0 and existing code is fine as-is.
>>
>> On 64-bit arches it will, yes. On 32-bit arches it ends up doing a
>> 32-bit left-shift (1UL << 32) of a 32-bit type (unsigned long), which is
>> UB, so there's no guarantee that it truncates down to 0. And it seems at
>> least on arm32 it does not: syzbot managed to trigger a crash in the
>> DEVMAP_HASH code by creating a map with more than 0x80000000 entries:
>>
>> https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
>>
>> This patch just preemptively applies the same fix to the hashtab code,
>> since I could not find any reason why it shouldn't be possible to hit
>> the same issue there. I haven't actually managed to trigger a crash
>> there, though (I don't have any arm32 hardware to test this on), so in
>> that sense it's a bit theoretical for hashtab. So up to you if you want
>> to take this, but even if you don't, could you please apply the first
>> patch? That does fix the issue reported by syzbot (cf the
>> reported-and-tested-by tag).
>
> I see.
> Since roundup_pow_of_two() is non deterministic on 32-bit archs,
> let's fix them all.
>
> We have at least 5 to fix:
> bloom_filter.c:                 nr_bits = roundup_pow_of_two(nr_bits);
> devmap.c:               dtab->n_buckets =
> roundup_pow_of_two(dtab->map.max_entries);
> hashtab.c:      htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> stackmap.c:     n_buckets = roundup_pow_of_two(attr->max_entries);
>
> hashtab.c:           htab->map.max_entries = roundup(attr->max_entries,
>                                                 num_possible_cpus());
>
> bloom_filter looks ok as-is,
> but stack_map has the same issue as devmap and hashtab.
>
> Let's check for
> if (max_entries > (1u << 31))
> in 3 maps and that should be enough to cover all 5 cases?
>
> imo 1u << 31 is much easier to visualize than U32_MAX/2+1
>
> and don't touch other checks.
> This patch is removing U32_MAX / sizeof(struct bucket) check
> and with that introduces overflow just few lines below in bpf_map_area_alloc.

Are you sure there's an overflow there? I did look at that and concluded
that, since bpf_map_area_alloc() uses a u64 for the size, it would
not actually overflow even with n_buckets == 1<<31. There's a check in
__bpf_map_area_alloc() for the size:

	if (size >= SIZE_MAX)
		return NULL;

with

#define SIZE_MAX	(~(size_t)0)

in limits.h. So if sizeof(size_t) == 4, that check against SIZE_MAX
should trip and the allocation will just fail; but there's no overflow
anywhere AFAICT?

Anyway, I'm OK with keeping the check; I'll respin with the changed
constant and add the check to stackmap.c as well.

-Toke


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-03-06 10:32               ` Toke Høiland-Jørgensen
@ 2024-03-06 16:53                 ` Alexei Starovoitov
  2024-03-07 12:00                   ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 13+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 16:53 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: John Fastabend, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

On Wed, Mar 6, 2024 at 2:32 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>
> > On Mon, Mar 4, 2024 at 5:02 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> >>
> >> > On Fri, Mar 1, 2024 at 4:35 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >>
> >> >> John Fastabend <john.fastabend@gmail.com> writes:
> >> >>
> >> >> > Alexei Starovoitov wrote:
> >> >> >> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >> >> >
> >> >> >> > The hashtab code relies on roundup_pow_of_two() to compute the number of
> >> >> >> > hash buckets, and contains an overflow check by checking if the resulting
> >> >> >> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
> >> >> >> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
> >> >> >> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
> >> >> >> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
> >> >> >> > from the hashtab code. So apply the same fix to hashtab, by moving the
> >> >> >> > overflow check to before the roundup.
> >> >> >> >
> >> >> >> > The hashtab code also contained a check that prevents the total allocation
> >> >> >> > size for the buckets from overflowing a 32-bit value, but since all the
> >> >> >> > allocation code uses u64s, this does not really seem to be necessary, so
> >> >> >> > drop it and keep only the strict overflow check of the n_buckets variable.
> >> >> >> >
> >> >> >> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
> >> >> >> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >> >> >> > ---
> >> >> >> >  kernel/bpf/hashtab.c | 10 +++++-----
> >> >> >> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >> >> >> >
> >> >> >> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> >> >> >> > index 03a6a2500b6a..4caf8dab18b0 100644
> >> >> >> > --- a/kernel/bpf/hashtab.c
> >> >> >> > +++ b/kernel/bpf/hashtab.c
> >> >> >> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >> >> >> >                                                           num_possible_cpus());
> >> >> >> >         }
> >> >> >> >
> >> >> >> > -       /* hash table size must be power of 2 */
> >> >> >> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> >> >> >> >
> >> >> >> >         htab->elem_size = sizeof(struct htab_elem) +
> >> >> >> >                           round_up(htab->map.key_size, 8);
> >> >> >> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
> >> >> >> >                 htab->elem_size += round_up(htab->map.value_size, 8);
> >> >> >> >
> >> >> >> >         err = -E2BIG;
> >> >> >> > -       /* prevent zero size kmalloc and check for u32 overflow */
> >> >> >> > -       if (htab->n_buckets == 0 ||
> >> >> >> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
> >> >> >> > +       /* prevent overflow in roundup below */
> >> >> >> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
> >> >> >> >                 goto free_htab;
> >> >> >>
> >> >> >> No. We cannot artificially reduce max_entries that will break real users.
> >> >> >> Hash table with 4B elements is not that uncommon.
> >> >>
> >> >> Erm, huh? The existing code has the n_buckets > U32_MAX / sizeof(struct
> >> >> bucket) check, which limits max_entries to 134M (0x8000000). This patch
> >> >> is *increasing* the maximum allowable size by a factor of 16 (to 2.1B or
> >> >> 0x80000000).
> >> >>
> >> >> > Agree how about return E2BIG in these cases (32bit arch and overflow) and
> >> >> > let user figure it out. That makes more sense to me.
> >> >>
> >> >> Isn't that exactly what this patch does? What am I missing here?
> >> >
> >> > I see. Then what are you fixing?
> >> > roundup_pow_of_two() will return 0 and existing code is fine as-is.
> >>
> >> On 64-bit arches it will, yes. On 32-bit arches it ends up doing a
> >> 32-bit left-shift (1UL << 32) of a 32-bit type (unsigned long), which is
> >> UB, so there's no guarantee that it truncates down to 0. And it seems at
> >> least on arm32 it does not: syzbot managed to trigger a crash in the
> >> DEVMAP_HASH code by creating a map with more than 0x80000000 entries:
> >>
> >> https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
> >>
> >> This patch just preemptively applies the same fix to the hashtab code,
> >> since I could not find any reason why it shouldn't be possible to hit
> >> the same issue there. I haven't actually managed to trigger a crash
> >> there, though (I don't have any arm32 hardware to test this on), so in
> >> that sense it's a bit theoretical for hashtab. So up to you if you want
> >> to take this, but even if you don't, could you please apply the first
> >> patch? That does fix the issue reported by syzbot (cf the
> >> reported-and-tested-by tag).
> >
> > I see.
> > Since roundup_pow_of_two() is non deterministic on 32-bit archs,
> > let's fix them all.
> >
> > We have at least 5 to fix:
> > bloom_filter.c:                 nr_bits = roundup_pow_of_two(nr_bits);
> > devmap.c:               dtab->n_buckets =
> > roundup_pow_of_two(dtab->map.max_entries);
> > hashtab.c:      htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> > stackmap.c:     n_buckets = roundup_pow_of_two(attr->max_entries);
> >
> > hashtab.c:           htab->map.max_entries = roundup(attr->max_entries,
> >                                                 num_possible_cpus());
> >
> > bloom_filter looks ok as-is,
> > but stack_map has the same issue as devmap and hashtab.
> >
> > Let's check for
> > if (max_entries > (1u << 31))
> > in 3 maps and that should be enough to cover all 5 cases?
> >
> > imo 1u << 31 is much easier to visualize than U32_MAX/2+1
> >
> > and don't touch other checks.
> > This patch is removing U32_MAX / sizeof(struct bucket) check
> > and with that introduces overflow just few lines below in bpf_map_area_alloc.
>
> Are you sure there's an overflow there? I did look at that and concluded
> that since bpf_map_area_alloc() uses a u64 for the size that it would
> not actually overflow even with n_buckets == 1<<31. There's a check in
> __bpf_map_area_alloc() for the size:
>
>         if (size >= SIZE_MAX)
>                 return NULL;
>
> with
>
> #define SIZE_MAX        (~(size_t)0)
>
> in limits.h. So if sizeof(size_t) == 4, that check against SIZE_MAX
> should trip and the allocation will just fail; but there's no overflow
> anywhere AFAICT?

There is an overflow _before_ it calls into bpf_map_area_alloc().
Here is the line:
        htab->buckets = bpf_map_area_alloc(htab->n_buckets *
                                           sizeof(struct bucket),
                                           htab->map.numa_node);
that's why we have:
if (htab->n_buckets > U32_MAX / sizeof(struct bucket))
before that.


> Anyway, I'm OK with keeping the check; I'll respin with the changed
> constant and add the check to stackmap.c as well.

Thanks!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH bpf v2 2/2] bpf: Fix hashtab overflow check on 32-bit arches
  2024-03-06 16:53                 ` Alexei Starovoitov
@ 2024-03-07 12:00                   ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-03-07 12:00 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: John Fastabend, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	bpf

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Wed, Mar 6, 2024 at 2:32 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Mon, Mar 4, 2024 at 5:02 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> >>
>> >> > On Fri, Mar 1, 2024 at 4:35 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> >>
>> >> >> John Fastabend <john.fastabend@gmail.com> writes:
>> >> >>
>> >> >> > Alexei Starovoitov wrote:
>> >> >> >> On Thu, Feb 29, 2024 at 3:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> >> >> >
>> >> >> >> > The hashtab code relies on roundup_pow_of_two() to compute the number of
>> >> >> >> > hash buckets, and contains an overflow check by checking if the resulting
>> >> >> >> > value is 0. However, on 32-bit arches, the roundup code itself can overflow
>> >> >> >> > by doing a 32-bit left-shift of an unsigned long value, which is undefined
>> >> >> >> > behaviour, so it is not guaranteed to truncate neatly. This was triggered
>> >> >> >> > by syzbot on the DEVMAP_HASH type, which contains the same check, copied
>> >> >> >> > from the hashtab code. So apply the same fix to hashtab, by moving the
>> >> >> >> > overflow check to before the roundup.
>> >> >> >> >
>> >> >> >> > The hashtab code also contained a check that prevents the total allocation
>> >> >> >> > size for the buckets from overflowing a 32-bit value, but since all the
>> >> >> >> > allocation code uses u64s, this does not really seem to be necessary, so
>> >> >> >> > drop it and keep only the strict overflow check of the n_buckets variable.
>> >> >> >> >
>> >> >> >> > Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
>> >> >> >> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> >> >> >> > ---
>> >> >> >> >  kernel/bpf/hashtab.c | 10 +++++-----
>> >> >> >> >  1 file changed, 5 insertions(+), 5 deletions(-)
>> >> >> >> >
>> >> >> >> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> >> >> >> > index 03a6a2500b6a..4caf8dab18b0 100644
>> >> >> >> > --- a/kernel/bpf/hashtab.c
>> >> >> >> > +++ b/kernel/bpf/hashtab.c
>> >> >> >> > @@ -499,8 +499,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>> >> >> >> >                                                           num_possible_cpus());
>> >> >> >> >         }
>> >> >> >> >
>> >> >> >> > -       /* hash table size must be power of 2 */
>> >> >> >> > -       htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
>> >> >> >> >
>> >> >> >> >         htab->elem_size = sizeof(struct htab_elem) +
>> >> >> >> >                           round_up(htab->map.key_size, 8);
>> >> >> >> > @@ -510,11 +508,13 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
>> >> >> >> >                 htab->elem_size += round_up(htab->map.value_size, 8);
>> >> >> >> >
>> >> >> >> >         err = -E2BIG;
>> >> >> >> > -       /* prevent zero size kmalloc and check for u32 overflow */
>> >> >> >> > -       if (htab->n_buckets == 0 ||
>> >> >> >> > -           htab->n_buckets > U32_MAX / sizeof(struct bucket))
>> >> >> >> > +       /* prevent overflow in roundup below */
>> >> >> >> > +       if (htab->map.max_entries > U32_MAX / 2 + 1)
>> >> >> >> >                 goto free_htab;
>> >> >> >>
>> >> >> >> No. We cannot artificially reduce max_entries that will break real users.
>> >> >> >> Hash table with 4B elements is not that uncommon.
>> >> >>
>> >> >> Erm, huh? The existing code has the n_buckets > U32_MAX / sizeof(struct
>> >> >> bucket) check, which limits max_entries to 134M (0x8000000). This patch
>> >> >> is *increasing* the maximum allowable size by a factor of 16 (to 2.1B or
>> >> >> 0x80000000).
>> >> >>
>> >> >> > Agree how about return E2BIG in these cases (32bit arch and overflow) and
>> >> >> > let user figure it out. That makes more sense to me.
>> >> >>
>> >> >> Isn't that exactly what this patch does? What am I missing here?
>> >> >
>> >> > I see. Then what are you fixing?
>> >> > roundup_pow_of_two() will return 0 and existing code is fine as-is.
>> >>
>> >> On 64-bit arches it will, yes. On 32-bit arches it ends up doing a
>> >> 32-bit left-shift (1UL << 32) of a 32-bit type (unsigned long), which is
>> >> UB, so there's no guarantee that it truncates down to 0. And it seems at
>> >> least on arm32 it does not: syzbot managed to trigger a crash in the
>> >> DEVMAP_HASH code by creating a map with more than 0x80000000 entries:
>> >>
>> >> https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
>> >>
>> >> This patch just preemptively applies the same fix to the hashtab code,
>> >> since I could not find any reason why it shouldn't be possible to hit
>> >> the same issue there. I haven't actually managed to trigger a crash
>> >> there, though (I don't have any arm32 hardware to test this on), so in
>> >> that sense it's a bit theoretical for hashtab. So up to you if you want
>> >> to take this, but even if you don't, could you please apply the first
>> >> patch? That does fix the issue reported by syzbot (cf the
>> >> reported-and-tested-by tag).
>> >
>> > I see.
>> > Since roundup_pow_of_two() is non deterministic on 32-bit archs,
>> > let's fix them all.
>> >
>> > We have at least 5 to fix:
>> > bloom_filter.c:                 nr_bits = roundup_pow_of_two(nr_bits);
>> > devmap.c:               dtab->n_buckets =
>> > roundup_pow_of_two(dtab->map.max_entries);
>> > hashtab.c:      htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
>> > stackmap.c:     n_buckets = roundup_pow_of_two(attr->max_entries);
>> >
>> > hashtab.c:           htab->map.max_entries = roundup(attr->max_entries,
>> >                                                 num_possible_cpus());
>> >
>> > bloom_filter looks ok as-is,
>> > but stack_map has the same issue as devmap and hashtab.
>> >
>> > Let's check for
>> > if (max_entries > (1u << 31))
>> > in 3 maps and that should be enough to cover all 5 cases?
>> >
>> > imo 1u << 31 is much easier to visualize than U32_MAX/2+1
>> >
>> > and don't touch other checks.
>> > This patch is removing U32_MAX / sizeof(struct bucket) check
>> > and with that introduces overflow just few lines below in bpf_map_area_alloc.
>>
>> Are you sure there's an overflow there? I did look at that and concluded
>> that since bpf_map_area_alloc() uses a u64 for the size that it would
>> not actually overflow even with n_buckets == 1<<31. There's a check in
>> __bpf_map_area_alloc() for the size:
>>
>>         if (size >= SIZE_MAX)
>>                 return NULL;
>>
>> with
>>
>> #define SIZE_MAX        (~(size_t)0)
>>
>> in limits.h. So if sizeof(size_t) == 4, that check against SIZE_MAX
>> should trip and the allocation will just fail; but there's no overflow
>> anywhere AFAICT?
>
> There is an overflow _before_ it calls into bpf_map_area_alloc().
> Here is the line:
>         htab->buckets = bpf_map_area_alloc(htab->n_buckets *
>                                            sizeof(struct bucket),
>                                            htab->map.numa_node);
> that's why we have:
> if (htab->n_buckets > U32_MAX / sizeof(struct bucket))
> before that.

Ah, right. I was assuming that the compiler was smart enough to
implicitly convert that into the type of the function parameter before
doing the multiplication, but of course that's not the case. Thanks!

-Toke


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-03-07 12:01 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-29 11:22 [PATCH bpf v2 0/2] Fix hashmap overflow checks for 32-bit arches Toke Høiland-Jørgensen
2024-02-29 11:22 ` [PATCH bpf v2 1/2] bpf: Fix DEVMAP_HASH overflow check on " Toke Høiland-Jørgensen
2024-02-29 11:22 ` [PATCH bpf v2 2/2] bpf: Fix hashtab " Toke Høiland-Jørgensen
2024-02-29 17:07   ` Alexei Starovoitov
2024-02-29 22:21     ` John Fastabend
2024-03-01 12:35       ` Toke Høiland-Jørgensen
2024-03-01 17:15         ` Alexei Starovoitov
2024-03-04 13:02           ` Toke Høiland-Jørgensen
2024-03-06  5:29             ` Alexei Starovoitov
2024-03-06 10:32               ` Toke Høiland-Jørgensen
2024-03-06 16:53                 ` Alexei Starovoitov
2024-03-07 12:00                   ` Toke Høiland-Jørgensen
2024-03-01 17:21         ` John Fastabend

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox