From: Alan Maguire <alan.maguire@oracle.com>
To: Cong Wang <xiyou.wangcong@gmail.com>, Uros Bizjak <ubizjak@gmail.com>
Cc: Laura Nao <laura.nao@collabora.com>,
bpf@vger.kernel.org, chrome-platform@lists.linux.dev,
kernel@collabora.com, linux-kernel@vger.kernel.org,
regressions@lists.linux.dev,
Stephen Brennan <stephen.s.brennan@oracle.com>,
"dwarves@vger.kernel.org" <dwarves@vger.kernel.org>
Subject: Re: [REGRESSION] module BTF validation failure (Error -22) on next
Date: Sat, 14 Dec 2024 12:15:11 +0000 [thread overview]
Message-ID: <3be0346a-8bc9-4be1-8418-b26c7aa4a862@oracle.com> (raw)
In-Reply-To: <Z10MkXtzyY9RDqSp@pop-os.localdomain>
On 14/12/2024 04:41, Cong Wang wrote:
> On Thu, Dec 05, 2024 at 08:36:33AM +0100, Uros Bizjak wrote:
>> On Wed, Dec 4, 2024 at 4:52 PM Laura Nao <laura.nao@collabora.com> wrote:
>>>
>>> On 11/15/24 18:17, Laura Nao wrote:
>>>> I managed to reproduce the issue locally and I've uploaded the vmlinux[1]
>>>> (stripped of DWARF data) and vmlinux.raw[2] files, as well as one of the
>>>> modules[3] and its btf data[4] extracted with:
>>>>
>>>> bpftool -B vmlinux btf dump file cros_kbd_led_backlight.ko > cros_kbd_led_backlight.ko.raw
>>>>
>>>> Looking again at the logs[5], I've noticed the following is reported:
>>>>
>>>> [ 0.415885] BPF: type_id=115803 offset=177920 size=1152
>>>> [ 0.416029] BPF:
>>>> [ 0.416083] BPF: Invalid offset
>>>> [ 0.416165] BPF:
>>>>
>>>> There are two different definitions of rcu_data in '.data..percpu', one
>>>> is a struct and the other is an integer:
>>>>
>>>> type_id=115801 offset=177920 size=1152 (VAR 'rcu_data')
>>>> type_id=115803 offset=177920 size=1152 (VAR 'rcu_data')
>>>>
>>>> [115801] VAR 'rcu_data' type_id=115572, linkage=static
>>>> [115803] VAR 'rcu_data' type_id=1, linkage=static
>>>>
>>>> [115572] STRUCT 'rcu_data' size=1152 vlen=69
>>>> [1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
>>>>
>>>> I assume that's not expected, correct?
>>>>
>>>> I'll dig a bit deeper and report back if I can find anything else.
>>>
>>> I ran a bisection, and it appears the culprit commit is:
>>> https://lore.kernel.org/all/20241021080856.48746-2-ubizjak@gmail.com/
>>>
>>> Hi Uros, do you have any suggestions or insights on resolving this issue?
>>
>> There is a stray ";" at the end of the #define, perhaps this makes a difference:
>>
>> +#define PERCPU_PTR(__p) \
>> + (typeof(*(__p)) __force __kernel *)(__p);
>> +
>>
>> and SHIFT_PERCPU_PTR macro now expands to:
>>
>> RELOC_HIDE((typeof(*(p)) __force __kernel *)(p);, (offset))
>>
>> A follow-up patch in the series changes PERCPU_PTR macro to:
>>
>> #define PERCPU_PTR(__p) \
>> ({ \
>> unsigned long __pcpu_ptr = (__force unsigned long)(__p); \
>> (typeof(*(__p)) __force __kernel *)(__pcpu_ptr); \
>> })
>>
>> so this should again correctly cast the value.
>
> Hm, I saw a similar bug but with pahole 1.28. My kernel complains about
> BTF invalid offset:
>
> [ 7.785788] BPF: type_id=2394 offset=0 size=1
> [ 7.786411] BPF:
> [ 7.786703] BPF: Invalid offset
> [ 7.787119] BPF:
>
> Dumping the vmlinux (there is no module invovled), I saw it is related to
> percpu pointer too:
>
> [2394] VAR '__pcpu_unique_cpu_hw_events' type_id=2, linkage=global
> ...
> [163643] DATASEC '.data..percpu' size=2123280 vlen=808
> type_id=2393 offset=0 size=1 (VAR '__pcpu_scope_cpu_hw_events')
> type_id=2394 offset=0 size=1 (VAR '__pcpu_unique_cpu_hw_events')
> ...
>
> I compiled and installed the latest pahole from its git repo:
>
> $ pahole --version
> v1.28
>
> Thanks.
Thanks for the report! Looking at percpu-defs.h it looks like the
existence of such variables requires either
#if defined(ARCH_NEEDS_WEAK_PER_CPU) ||
defined(CONFIG_DEBUG_FORCE_WEAK_PER_CPU)
...
#define DEFINE_PER_CPU_SECTION(type, name, sec) \
__PCPU_DUMMY_ATTRS char __pcpu_scope_##name; \
extern __PCPU_DUMMY_ATTRS char __pcpu_unique_##name; \
__PCPU_DUMMY_ATTRS char __pcpu_unique_##name; \
extern __PCPU_ATTRS(sec) __typeof__(type) name; \
__PCPU_ATTRS(sec) __weak __typeof__(type) name
I'm guessing your .config has CONFIG_DEBUG_FORCE_WEAK_PER_CPU, or are
you building on s390/alpha?
I've reproduced this on bpf-next with CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y,
pahole v1.28 and gcc-12; I see ~900 __pcpu_ variables and get the same
BTF errors since multipe __pcpu_ vars share the offset 0.
A simple workaround in dwarves - and I verified this resolved the issue
for me - would be
diff --git a/btf_encoder.c b/btf_encoder.c
index 3754884..4a1799a 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -2174,7 +2174,8 @@ static bool filter_variable_name(const char *name)
X("__UNIQUE_ID"),
X("__tpstrtab_"),
X("__exitcall_"),
- X("__func_stack_frame_non_standard_")
+ X("__func_stack_frame_non_standard_"),
+ X("__pcpu_")
#undef X
};
int i;
...but I'd like us to understand further why variables which were
supposed to be in a .discard section end up being encoded as there may
be other problems lurking here aside from this one. More soon hopefully...
Alan
next prev parent reply other threads:[~2024-12-14 12:15 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-06 16:08 [REGRESSION] module BTF validation failure (Error -22) on next Laura Nao
2024-11-07 15:05 ` Alan Maguire
2024-11-13 9:37 ` Laura Nao
2024-11-15 17:17 ` Laura Nao
2024-12-04 15:53 ` Laura Nao
2024-12-05 7:36 ` Uros Bizjak
2024-12-14 4:41 ` Cong Wang
2024-12-14 12:15 ` Alan Maguire [this message]
2024-12-16 15:19 ` Alan Maguire
2024-12-16 21:28 ` Stephen Brennan
2024-12-17 8:02 ` Jiri Olsa
2025-01-15 17:38 ` Cong Wang
2025-01-16 9:51 ` Alan Maguire
2024-12-05 10:33 ` Jiri Olsa
2024-12-05 10:35 ` Uros Bizjak
2024-12-06 12:35 ` Jiri Olsa
2024-12-10 13:55 ` [REGRESSION] module BTF validation failure (Error -22) on Laura Nao
2024-12-11 21:10 ` Jiri Olsa
2024-12-12 9:22 ` Jiri Olsa
2024-12-12 21:49 ` Stephen Brennan
2024-12-13 9:26 ` Laura Nao
2024-12-10 20:42 ` [REGRESSION] module BTF validation failure (Error -22) on next Kumar Kartikeya Dwivedi
2024-12-05 10:01 ` Jiri Olsa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3be0346a-8bc9-4be1-8418-b26c7aa4a862@oracle.com \
--to=alan.maguire@oracle.com \
--cc=bpf@vger.kernel.org \
--cc=chrome-platform@lists.linux.dev \
--cc=dwarves@vger.kernel.org \
--cc=kernel@collabora.com \
--cc=laura.nao@collabora.com \
--cc=linux-kernel@vger.kernel.org \
--cc=regressions@lists.linux.dev \
--cc=stephen.s.brennan@oracle.com \
--cc=ubizjak@gmail.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox