BPF List
From: Yonghong Song <yonghong.song@linux.dev>
To: Matt Bobrowski <mattbobrowski@google.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>
Cc: 梅开彦 <kaiyanm@hust.edu.cn>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	bpf <bpf@vger.kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	hust-os-kernel-patches@googlegroups.com,
	"Yinhao Hu" <dddddd@hust.edu.cn>,
	dzm91@hust.edu.cn, "KP Singh" <kpsingh@kernel.org>,
	"Alexei Starovoitov" <alexei.starovoitov@gmail.com>
Subject: Re: bpf: mmap_file LSM hook allows NULL pointer dereference
Date: Thu, 18 Dec 2025 14:51:27 -0800
Message-ID: <9e402939-40ea-4da2-aad1-43d2afb74a83@linux.dev>
In-Reply-To: <aTs6JTBrzEa0WJwd@google.com>



On 12/11/25 1:39 PM, Matt Bobrowski wrote:
> On Wed, Dec 10, 2025 at 10:02:16AM +0000, Matt Bobrowski wrote:
>> On Wed, Dec 03, 2025 at 10:23:43AM -0800, Alexei Starovoitov wrote:
>>> On Wed, Dec 3, 2025 at 12:47 AM Matt Bobrowski <mattbobrowski@google.com> wrote:
>>>>> We can play tricks with __weak. Like:
>>>>>
>>>>> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
>>>>> index 7cb6e8d4282c..60d269a85bf1 100644
>>>>> --- a/kernel/bpf/bpf_lsm.c
>>>>> +++ b/kernel/bpf/bpf_lsm.c
>>>>> @@ -21,7 +21,7 @@
>>>>>    * function where a BPF program can be attached.
>>>>>    */
>>>>>   #define LSM_HOOK(RET, DEFAULT, NAME, ...)      \
>>>>> -noinline RET bpf_lsm_##NAME(__VA_ARGS__)       \
>>>>> +__weak noinline RET bpf_lsm_##NAME(__VA_ARGS__)        \
>>>>>
>>>>> diff kernel/bpf/bpf_lsm_proto.c
>>>>>
>>>>> +int bpf_lsm_mmap_file(struct file *file__nullable, unsigned long reqprot,
>>>>> +                     unsigned long prot, unsigned long flags)
>>>>> +{
>>>>> +       return 0;
>>>>> +}
>>>>>
>>>>> and above one with __nullable will be in vmlinux BTF.
>>>>>
>>>>> afaik __weak functions are not removed by linker when in non-LTO,
>>>>> but it's still better than
>>>>> +#define bpf_lsm_mmap_file bpf_lsm_mmap_file__original
>>>>> No need to change bpf_lsm.h either.
>>>> Annotating with a weak attribute would be quite nice, but the compiler
>>>> will complain about the redefinition of the symbol
>>>> bpf_lsm_mmap_file. To avoid this, we'd still need to rely on the
>>>> rename and ignore dance by using the aforementioned define, which at
>>>> that point would still result in both symbols being exposed in both
>>>> BTF and the .text section.
>>> Not quite. You missed this part in the above:
>>>
>>>>> diff kernel/bpf/bpf_lsm_proto.c
>>> it's a different file.
>> Yes, yes, this will work. However, as discussed, it's fundamentally
>> reliant on a small "hack" which I've implemented within
>> kernel/bpf/Makefile here [0] to work around pahole's current
>> deduplication logic.
>>
>> Andrii and Eduard,
>>
>> I’d like your input on a pahole BTF generation issue which I've
>> recently come across. In the series I just sent [0], I had to
>> implement a workaround to force pahole to process bpf_lsm_proto.o
>> before bpf_lsm.o.
>>
>> This was necessary to ensure pahole generates BTF for the strong
>> definition of bpf_lsm_mmap_file() (in bpf_lsm_proto.c) rather than the
>> weak definition (in bpf_lsm.c). Without this forced ordering, pahole
>> processed the weak definition first, resulting in a state array like
>> this:
>>
>> ```
>> btf_encoder.func_states.array[N] = bpf_lsm_mmap_file (weak
>> definition from bpf_lsm.o)
>>
>> btf_encoder.func_states.array[N+1] = bpf_lsm_mmap_file (strong
>> definition from bpf_lsm_proto.o)
>> ```
>>
>> Because the deduplication logic in btf_encoder__add_saved_funcs()
>> folds duplicates (those determined by saved_functions_combine()) into
>> the first occurrence, the resulting BTF was derived from the weak
>> definition. This is incorrect, as the strong definition is the one
>> actually linked into the final vmlinux image.
>>
>> An obvious fix that immediately came to mind here was to teach
>> pahole about strong function definitions, and have it prefer
>> emitting BTF for those over any weakly defined counterparts.
> Thinking about this a little more: perhaps whilst in
> btf_encoder__add_saved_funcs() we should only emit BTF for the
> duplicated function within a CU that matches the corresponding
> entry in the backing ELF symtab? We can do this by checking whether
> the virtual address stored in DW_AT_low_pc matches the st_value
> field of the corresponding ELF symtab entry. For example, for
> bpf_lsm_mmap_file we

I think this is the correct way to do it. Basically, we should
pick the DWARF subprogram entry whose DW_AT_low_pc matches the
address of the ksym entry with the same name.

> have:
>
> Output from reading the vmlinux symbol table:
> ```
> $ readelf -s <input> | grep bpf_lsm_mmap_file
> 165360: ffffffff8152f9b0    16 FUNC    GLOBAL DEFAULT    1 bpf_lsm_mmap_file
> ```
> Output from reading the vmlinux DWARF debugging information:
> ```
> <2a40982>   DW_AT_name        : (indirect string, offset: 0x1352ea): bpf_lsm_mmap_file
> <2a40986>   DW_AT_decl_file   : 4
> <2a40987>   DW_AT_decl_line   : 199
> <2a40988>   DW_AT_decl_column : 1
> <2a40989>   DW_AT_prototyped  : 1
> <2a40989>   DW_AT_type        : <0x2a1b010>
> <2a4098d>   DW_AT_low_pc      : 0xffffffff8152e260
> <2a40995>   DW_AT_high_pc     : 0x10
> <2a4099d>   DW_AT_frame_base  : 1 byte block: 9c    (DW_OP_call_frame_cfa)
> <2a4099f>   DW_AT_call_all_calls: 1
> <2a4099f>   DW_AT_sibling     : <0x2a409d8>
> <2><2a409a3>: Abbrev Number: 10 (DW_TAG_formal_parameter)
> <2a409a4>   DW_AT_name        : (indirect string, offset: 0x3623df): file
> <2a409a8>   DW_AT_decl_file   : 4
> <2a409a9>   DW_AT_decl_line   : 199
> <2a409aa>   DW_AT_decl_column : 1
> <2a409aa>   DW_AT_type        : <0x2a234ef>
> <2a409ae>   DW_AT_location    : 1 byte block: 55    (DW_OP_reg5 (rdi))
> <2><2a409b0>: Abbrev Number: 10 (DW_TAG_formal_parameter)
> <2a409b1>   DW_AT_name        : (indirect string, offset: 0x23a09d): reqprot
> <2a409b5>   DW_AT_decl_file   : 4
> --
> <2a60e0a>   DW_AT_name        : (indirect string, offset: 0x1352ea): bpf_lsm_mmap_file
> <2a60e0e>   DW_AT_decl_file   : 1
> <2a60e0f>   DW_AT_decl_line   : 15
> <2a60e10>   DW_AT_decl_column : 5
> <2a60e11>   DW_AT_prototyped  : 1
> <2a60e11>   DW_AT_type        : <0x2a42713>
> <2a60e15>   DW_AT_low_pc      : 0xffffffff8152f9b0
> <2a60e1d>   DW_AT_high_pc     : 0x10
> <2a60e25>   DW_AT_frame_base  : 1 byte block: 9c    (DW_OP_call_frame_cfa)
> <2a60e27>   DW_AT_call_all_calls: 1
> <2><2a60e27>: Abbrev Number: 82 (DW_TAG_formal_parameter)
> <2a60e28>   DW_AT_name        : (indirect string, offset: 0x135ede): file__nullable
> <2a60e2c>   DW_AT_decl_file   : 1
> <2a60e2c>   DW_AT_decl_line   : 15
> <2a60e2d>   DW_AT_decl_column : 36
> <2a60e2e>   DW_AT_type        : <0x2a49f59>
> <2a60e32>   DW_AT_location    : 1 byte block: 55    (DW_OP_reg5 (rdi))
> ```
>
>> [0] https://lore.kernel.org/bpf/20251210090701.2753545-1-mattbobrowski@google.com/T/#me14d534fb559a349c46e094f18c63d477644d511
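
To make the low_pc/st_value comparison concrete, here is a minimal
sketch of the selection rule. It is not pahole code: struct elf_sym,
struct dwarf_func and pick_linked_func() are hypothetical names made
up for illustration, and the real change would have to live in the
btf_encoder dedup path. The addresses are the ones from the readelf
and DWARF output quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not pahole's real data structures. */
struct elf_sym {
	const char *name;
	uint64_t st_value;	/* address from the ELF symbol table */
};

struct dwarf_func {
	const char *name;
	uint64_t low_pc;	/* DW_AT_low_pc of the subprogram DIE */
	const char *first_param; /* kept only to tell the candidates apart */
};

/*
 * Among same-name DWARF candidates, return the one whose low_pc equals
 * the st_value of the linked symbol, or NULL if none matches.
 */
static const struct dwarf_func *
pick_linked_func(const struct elf_sym *sym,
		 const struct dwarf_func *funcs, size_t nr_funcs)
{
	assert(sym && funcs);

	for (size_t i = 0; i < nr_funcs; i++) {
		if (strcmp(funcs[i].name, sym->name) != 0)
			continue;
		if (funcs[i].low_pc == sym->st_value)
			return &funcs[i];
	}
	return NULL;
}

/* Values copied from the readelf and DWARF output quoted in this thread. */
static const struct elf_sym mmap_file_sym = {
	.name = "bpf_lsm_mmap_file",
	.st_value = 0xffffffff8152f9b0ULL,
};

static const struct dwarf_func mmap_file_candidates[] = {
	{ "bpf_lsm_mmap_file", 0xffffffff8152e260ULL, "file" },
	{ "bpf_lsm_mmap_file", 0xffffffff8152f9b0ULL, "file__nullable" },
};
```

With these inputs the helper selects the second candidate, the one
whose first parameter is file__nullable, regardless of the order in
which the two entries were saved.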


Thread overview: 12+ messages
2025-12-02  7:09 bpf: mmap_file LSM hook allows NULL pointer dereference 梅开彦
2025-12-02 10:38 ` Matt Bobrowski
2025-12-02 14:54   ` Matt Bobrowski
2025-12-02 17:27     ` Alexei Starovoitov
2025-12-02 19:17       ` Matt Bobrowski
2025-12-02 21:40         ` Alexei Starovoitov
2025-12-03  8:47           ` Matt Bobrowski
2025-12-03 18:23             ` Alexei Starovoitov
2025-12-10 10:02               ` Matt Bobrowski
2025-12-11 21:39                 ` Matt Bobrowski
2025-12-18 22:51                   ` Yonghong Song [this message]
2025-12-29 10:33                     ` Matt Bobrowski
