From: Yonghong Song <yonghong.song@linux.dev>
To: Yifei Liu <yifei.l.liu@oracle.com>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "ast@kernel.org" <ast@kernel.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"daniel@iogearbox.net" <daniel@iogearbox.net>
Subject: Re: [External] : Re: Potential BPF Arena Security Vulnerability, Possible Memory Access and Overflow Issues
Date: Wed, 9 Jul 2025 07:51:17 -0700 [thread overview]
Message-ID: <61597973-dd18-4cd1-ba7c-9b2f8862d86b@linux.dev> (raw)
In-Reply-To: <78feeb4e-be58-4e02-ae08-119003b1b494@linux.dev>
On 7/8/25 10:39 PM, Yonghong Song wrote:
>
>
> On 7/8/25 1:56 PM, Yifei Liu wrote:
>>
>>> On Jul 8, 2025, at 1:46 PM, yifei.l.liu@oracle.com wrote:
>>>
>>>
>>>
>>>> On Jul 8, 2025, at 12:53 PM, Yonghong Song
>>>> <yonghong.song@linux.dev> wrote:
>>>>
>>>>
>>>>
>>>> On 7/7/25 4:06 PM, Alexei Starovoitov wrote:
>>>>> On Mon, Jul 7, 2025 at 2:43 PM Yifei Liu <yifei.l.liu@oracle.com>
>>>>> wrote:
>>>>>>
>>>>>>> On Jul 7, 2025, at 2:19 PM, Alexei Starovoitov
>>>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>>
>>>>>>> On Mon, Jul 7, 2025 at 1:44 PM Yifei Liu
>>>>>>> <yifei.l.liu@oracle.com> wrote:
>>>>>>>> Hi Alexei,
>>>>>>>>
>>>>>>>> I recently noticed that the verifier_arena_large selftest would
>>>>>>>> fail on the overflow and underflow section for 64k page size
>>>>>>>> kernels. After a deeper investigation, the similar issue is
>>>>>>>> also reproducible on 4k page size over both x86 and aarch64
>>>>>>>> platforms.
>>>>>>>>
>>>>>>>> The root reason of this failure looks to be a failed or missing
>>>>>>>> check of the pointer upper 32-bit from the user space. User
>>>>>>>> space could access the arena space value even the pointer is
>>>>>>>> not in the assigned user space pointer range. For example, if
>>>>>>>> the user_vm_start is 7f7d26200000 and arena size is 4G (end
>>>>>>>> upper bound is 7f7e26200000), when I set *(7f7e26200000 -
>>>>>>>> 65536) = 20, I could also get the value of (7f7d26200000 -
>>>>>>>> 65536) as 20. It should be 0 if that is out of the range.
>>>>>>>>
>>>>>>>> Could you please take a look at this issue? Or could you please
>>>>>>>> point me where is the place doing the address translation and I
>>>>>>>> could try to provide a patch for this?
>>>>>>>>
>>>>>>>> Thank you very much.
>>>>>>>> Yifei
>>>>>>>>
>>>>>>>> Methods on reproduce:
>>>>>>>> 1. Use a 64k page size arm based kernel and run
>>>>>>>> verifier_arena_large selftest, it would failed on return 12 and
>>>>>>>> 13. Or
>>>>>>> Are you sure you're running the latest kernel ?
>>>>>>> This sounds like issue fixed in commit 517e8a7835e8 ("bpf: Fix
>>>>>>> softlockup in arena_map_free on 64k page kernel”)
>>>>>> Thanks for the reply. I do check this fix and it is not related
>>>>>> to the one I mentioned above. It just fix the guard
>>>>>> range so that it would not set the start address without page
>>>>>> alignment.
>>>>>>
>>>>>>> In general this is not a security vulnerability in any way.
>>>>>>> 32-bit wraparound is there by design.
>>>>>> If we do not check the upper 32-bit value, it would be wide open
>>>>>> for user-space to access the arena space.
>>>>>> And maybe even the user-space process cannot access the memory
>>>>>> outside the 4G area because it would
>>>>>> try to translate all the pointers to that area.
>>>>> No idea what you're trying to say.
>>>>>
>>>>>> Plus, it would consistently fail the verifier_arena_large
>>>>>> selftest for 64k page size kernels. Maybe we want to
>>>>>> skip some of the overflow/underflow tests if the page size is 64k?
>>>> I tried on my aarch64 machine which has 64K page size. The
>>>> verifier_arena_large works fine for
>>>> me with either gcc11 or clang21. Could you give more details
>>>> (traces) about the failure you observed?
>>> Sure, I paste a detailed output below.
>>>
>>> # ./test_progs -vv -t verifier_arena_large
>>> bpf_testmod.ko is already unloaded.
>>> Loading bpf_testmod.ko...
>>> Successfully loaded bpf_testmod.ko.
>>> tester_init:PASS:tester_log_buf 0 nsec
>>> libbpf: loading object 'verifier_arena_large' from buffer
>>> libbpf: elf: section(2) .symtab, size 192, link 1, flags 0, type=2
>>> libbpf: elf: section(3) syscall, size 760, link 0, flags 6, type=1
>>> libbpf: sec 'syscall': found program 'big_alloc1' at insn offset 0
>>> (0 bytes), code size 95 insns (760 bytes)
>>> libbpf: elf: section(4) .maps, size 24, link 0, flags 3, type=1
>>> libbpf: elf: section(5) license, size 4, link 0, flags 3, type=1
>>> libbpf: license of verifier_arena_large is GPL
>>> libbpf: elf: section(6) .relsyscall, size 160, link 2, flags 40, type=9
>>> libbpf: elf: section(7) .BTF, size 1624, link 0, flags 0, type=1
>>> libbpf: elf: section(8) .BTF.ext, size 672, link 0, flags 0, type=1
>>> libbpf: looking for externs among 8 symbols...
>>> libbpf: collected 2 externs total
>>> libbpf: extern (ksym) #0: symbol 5, name bpf_arena_alloc_pages
>>> libbpf: extern (ksym) #1: symbol 6, name bpf_arena_free_pages
>>> libbpf: map 'arena': at sec_idx 4, offset 0.
>>> libbpf: map 'arena': found type = 33.
>>> libbpf: map 'arena': found max_entries = 65536.
>>> libbpf: map 'arena': found map_flags = 0x400.
>>> libbpf: sec '.relsyscall': collecting relocation for section(3)
>>> 'syscall'
>>> libbpf: sec '.relsyscall': relo #0: insn #1 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #1
>>> libbpf: sec '.relsyscall': relo #1: insn #7 against
>>> 'bpf_arena_alloc_pages'
>>> libbpf: prog 'big_alloc1': found extern #0 'bpf_arena_alloc_pages'
>>> (sym 5) for insn #7
>>> libbpf: sec '.relsyscall': relo #2: insn #17 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #17
>>> libbpf: sec '.relsyscall': relo #3: insn #22 against
>>> 'bpf_arena_alloc_pages'
>>> libbpf: prog 'big_alloc1': found extern #0 'bpf_arena_alloc_pages'
>>> (sym 5) for insn #22
>>> libbpf: sec '.relsyscall': relo #4: insn #32 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #32
>>> libbpf: sec '.relsyscall': relo #5: insn #37 against
>>> 'bpf_arena_alloc_pages'
>>> libbpf: prog 'big_alloc1': found extern #0 'bpf_arena_alloc_pages'
>>> (sym 5) for insn #37
>>> libbpf: sec '.relsyscall': relo #6: insn #46 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #46
>>> libbpf: sec '.relsyscall': relo #7: insn #50 against
>>> 'bpf_arena_free_pages'
>>> libbpf: prog 'big_alloc1': found extern #1 'bpf_arena_free_pages'
>>> (sym 6) for insn #50
>>> libbpf: sec '.relsyscall': relo #8: insn #57 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #57
>>> libbpf: sec '.relsyscall': relo #9: insn #63 against
>>> 'bpf_arena_alloc_pages'
>>> libbpf: prog 'big_alloc1': found extern #0 'bpf_arena_alloc_pages'
>>> (sym 5) for insn #63
>>> process_subtest:PASS:obj_open_mem 0 nsec
>>> process_subtest:PASS:specs_alloc 0 nsec
>>> libbpf: loading object 'verifier_arena_large' from buffer
>>> libbpf: elf: section(2) .symtab, size 192, link 1, flags 0, type=2
>>> libbpf: elf: section(3) syscall, size 760, link 0, flags 6, type=1
>>> libbpf: sec 'syscall': found program 'big_alloc1' at insn offset 0
>>> (0 bytes), code size 95 insns (760 bytes)
>>> libbpf: elf: section(4) .maps, size 24, link 0, flags 3, type=1
>>> libbpf: elf: section(5) license, size 4, link 0, flags 3, type=1
>>> libbpf: license of verifier_arena_large is GPL
>>> libbpf: elf: section(6) .relsyscall, size 160, link 2, flags 40, type=9
>>> libbpf: elf: section(7) .BTF, size 1624, link 0, flags 0, type=1
>>> libbpf: elf: section(8) .BTF.ext, size 672, link 0, flags 0, type=1
>>> libbpf: looking for externs among 8 symbols...
>>> libbpf: collected 2 externs total
>>> libbpf: extern (ksym) #0: symbol 5, name bpf_arena_alloc_pages
>>> libbpf: extern (ksym) #1: symbol 6, name bpf_arena_free_pages
>>> libbpf: map 'arena': at sec_idx 4, offset 0.
>>> libbpf: map 'arena': found type = 33.
>>> libbpf: map 'arena': found max_entries = 65536.
>>> libbpf: map 'arena': found map_flags = 0x400.
>>> libbpf: sec '.relsyscall': collecting relocation for section(3)
>>> 'syscall'
>>> libbpf: sec '.relsyscall': relo #0: insn #1 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #1
>>> libbpf: sec '.relsyscall': relo #1: insn #7 against
>>> 'bpf_arena_alloc_pages'
>>> libbpf: prog 'big_alloc1': found extern #0 'bpf_arena_alloc_pages'
>>> (sym 5) for insn #7
>>> libbpf: sec '.relsyscall': relo #2: insn #17 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #17
>>> libbpf: sec '.relsyscall': relo #3: insn #22 against
>>> 'bpf_arena_alloc_pages'
>>> libbpf: prog 'big_alloc1': found extern #0 'bpf_arena_alloc_pages'
>>> (sym 5) for insn #22
>>> libbpf: sec '.relsyscall': relo #4: insn #32 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #32
>>> libbpf: sec '.relsyscall': relo #5: insn #37 against
>>> 'bpf_arena_alloc_pages'
>>> libbpf: prog 'big_alloc1': found extern #0 'bpf_arena_alloc_pages'
>>> (sym 5) for insn #37
>>> libbpf: sec '.relsyscall': relo #6: insn #46 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #46
>>> libbpf: sec '.relsyscall': relo #7: insn #50 against
>>> 'bpf_arena_free_pages'
>>> libbpf: prog 'big_alloc1': found extern #1 'bpf_arena_free_pages'
>>> (sym 6) for insn #50
>>> libbpf: sec '.relsyscall': relo #8: insn #57 against 'arena'
>>> libbpf: prog 'big_alloc1': found map 0 (arena, sec 4, off 0) for
>>> insn #57
>>> libbpf: sec '.relsyscall': relo #9: insn #63 against
>>> 'bpf_arena_alloc_pages'
>>> libbpf: prog 'big_alloc1': found extern #0 'bpf_arena_alloc_pages'
>>> (sym 5) for insn #63
>>> run_subtest:PASS:obj_open_mem 0 nsec
>>> libbpf: object 'verifier_arena_': failed (-95) to create BPF token
>>> from '/sys/fs/bpf', skipping optional step...
>>> libbpf: loaded kernel BTF from '/sys/kernel/btf/vmlinux'
>>> libbpf: extern (func ksym) 'bpf_arena_alloc_pages': resolved to
>>> vmlinux [136281]
>>> libbpf: extern (func ksym) 'bpf_arena_free_pages': resolved to
>>> vmlinux [136283]
>>> libbpf: map 'arena': created successfully, fd=5
>>> run_subtest:PASS:unexpected_load_failure 0 nsec
>>> VERIFIER LOG:
>>> =============
>>> processed 105 insns (limit 1000000) max_states_per_insn 0
>>> total_states 8 peak_states 8 mark_read 2
>>> =============
>>> do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
>>> run_subtest:FAIL:1037 Unexpected retval: 12 != 0
>>> #431/1 verifier_arena_large/big_alloc1:FAIL
>>> #431 verifier_arena_large:FAIL
>>> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
>>> Successfully unloaded bpf_testmod.ko.
>>>
>>>
>>>>> Skip without full understanding is not a good idea.
>>>>> This test does:
>>>>> if (*(page1 + PAGE_SIZE) != 0)
>>>>> return 11;
>>>>> if (*(page1 - PAGE_SIZE) != 0)
>>>>> return 12;
>>>>> if (*(page2 + PAGE_SIZE) != 0)
>>>>> return 13;
>>>>> if (*(page2 - PAGE_SIZE) != 0)
>>>>>
>>>>> which gets compiled into bpf insns with positive and negative
>>>>> 16-bit offsets.
>>>>> When PAGE_SIZE is 64k the code is compiled into some other form,
>>>>> since constant doesn't fit into 'off' field.
>>>>> So the code is not checking what it is supposed to.
>>>>> One way is to use inline asm. Another is to replace PAGE_SIZE
>>>>> with an actual 4k constant in big_alloc1() test.
>> I would believe it is not at the instruction part but at the run
>> time. For the underflow, I have a simple patch
>> diff --git a/tools/testing/selftests/bpf/progs/verifier_arena_large.c
>> b/tools/testing/selftests/bpf/progs/verifier_arena_large.c
>> index 6065f862d9643..292dc5712a47e 100644
>> --- a/tools/testing/selftests/bpf/progs/verifier_arena_large.c
>> +++ b/tools/testing/selftests/bpf/progs/verifier_arena_large.c
>> @@ -57,7 +57,8 @@ int big_alloc1(void *ctx)
>> return 10;
>> if (*(page1 + PAGE_SIZE) != 0)
>> return 11;
>> - if (*(page1 - PAGE_SIZE) != 0)
>> + if (*(page1 - ARENA_SIZE - PAGE_SIZE) != 0)
>> return 12;
>> The checked pointer (page1 - ARENA_SIZE - PAGE_SIZE) is definitely
>> out of the arena range in user space (page1 is the left boundary of
>> the arena), but the value of it is not zero and it will get the value
>> I set for *page2.
>
> I did some investigation. The following is what I am finding.
> First, the following code
> #define PAGE_CNT 100
> __u8 __arena * __arena page[PAGE_CNT]; /* occupies the first page */
> tells us the first page is occupied for page[PAGE_CNT].
>
> In big_alloc1(), we have
> volatile char __arena *page1, *page2, *no_page, *page3;
> void __arena *base;
>
> page1 = base = bpf_arena_alloc_pages(&arena, NULL, 1,
> NUMA_NO_NODE, 0);
> if (!page1)
> return 1;
> *page1 = 1;
> Here page1 is the second page.
>
> When later on, when we do
> if (*(page1 - PAGE_SIZE) != 0)
> return 12;
> Here 'page1 - PAGE_SIZE' will be equal to the *first* page, i.e., the
> page
> occupied by page[PAGE_CNT].
>
> So the key thing is about the first byte value in page[PAGE_CNT].
>
> The following is related skeleton header file:
>
> struct verifier_arena_large {
> struct bpf_object_skeleton *skeleton;
> struct bpf_object *obj;
> struct {
> struct bpf_map *arena;
> struct bpf_map *bss;
> } maps;
> struct {
> struct bpf_program *big_alloc1;
> struct bpf_program *big_alloc2;
> } progs;
> struct {
> struct bpf_link *big_alloc1;
> struct bpf_link *big_alloc2;
> } links;
> struct verifier_arena_large__arena {
> __u8 *page[100];
> } *arena;
> struct verifier_arena_large__bss {
> __u8 *base;
> } *bss;
>
> 'bss' map is an array map. But looks like struct
> verifier_arena_large__arena
> is not creeating an array map and it relies page fault to allocate
> memories
> for page[100]. The following is the page fault call stack:
> ...
> [ 223.787995] arena_vm_fault+0xc0/0x2a8
> [ 223.787996] __do_fault+0x48/0xf0
> [ 223.788000] do_pte_missing+0x3f8/0x1100
> [ 223.788002] handle_mm_fault+0x4a8/0x7e0
> [ 223.788004] do_page_fault+0x2cc/0x5f8
> [ 223.788006] do_translation_fault+0x54/0x78
> [ 223.788008] do_mem_abort+0x48/0xa0
> [ 223.788010] el0_da+0x5c/0xb8
> [ 223.788012] el0t_64_sync_handler+0x84/0x108
> [ 223.788013] el0t_64_sync+0x198/0x1a0
>
> I have not figured out why the above happen. Probably somewhere triggered
> in libbpf. Did user space does anything with page[100]?
Did some further investigation. The above arena_vm_fault is caused by
libbpf copying init data for page[100] (total 800 bytes) to arena first page.
In my specific case,
$ llvm-readelf -s verifier_arena_large.bpf.o
...
22: 0000000000000000 800 OBJECT GLOBAL DEFAULT 8 page
...
$ llvm-readelf -S verifier_arena_large.bpf.o
...
[ 8] .addr_space.1 PROGBITS 0000000000000000 0008a8 000320 00 WA 0 0 8
...
$ llvm-readelf -x 8 verifier_arena_large.bpf.o
Hex dump of section '.addr_space.1':
0x00000000 00000000 00000000 00000000 00000000 ................
0x00000010 00000000 00000000 00000000 00000000 ................
...
From the above, we can see that the initial value for page[100]
are all zeros and for
if (*(page1 - PAGE_SIZE) != 0)
return 12
the if condition will be false.
If I hacked below to modify the first byte of page[100] from 0 to 1:
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index aee36402f0a3..30a7545384f0 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -5512,6 +5512,8 @@ bpf_object__create_maps(struct bpf_object *obj)
return err;
}
if (obj->arena_data) {
+ pr_debug("YHS: copy arena_data, line %d, size %ld\n", __LINE__, obj->arena_data_sz);
+ ((char *)obj->arena_data)[0] = 1;
memcpy(map->mmaped, obj->arena_data, obj->arena_data_sz);
zfree(&obj->arena_data);
Then the test will fail as expected.
So the key thing is related to initial value of page[100].
Could you check your verifier_arena_large.bpf.o to see
whether page[100] initial value is 0 or not?
Which compiler did you use?
[...]
next prev parent reply other threads:[~2025-07-09 14:51 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-07 20:44 Potential BPF Arena Security Vulnerability, Possible Memory Access and Overflow Issues Yifei Liu
2025-07-07 21:19 ` Alexei Starovoitov
2025-07-07 21:43 ` [External] : " Yifei Liu
2025-07-07 23:06 ` Alexei Starovoitov
2025-07-08 19:53 ` Yonghong Song
2025-07-08 20:46 ` Yifei Liu
2025-07-08 20:56 ` Yifei Liu
2025-07-09 5:39 ` Yonghong Song
2025-07-09 14:51 ` Yonghong Song [this message]
2025-07-11 23:54 ` Yifei Liu
2025-07-12 1:33 ` Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=61597973-dd18-4cd1-ba7c-9b2f8862d86b@linux.dev \
--to=yonghong.song@linux.dev \
--cc=alexei.starovoitov@gmail.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=yifei.l.liu@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.