From: Stephen Brennan <stephen.s.brennan@oracle.com>
To: Alan Maguire <alan.maguire@oracle.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>
Cc: dwarves@vger.kernel.org, linux-debuggers@vger.kernel.org,
Eduard Zingerman <eddyz87@gmail.com>
Subject: Re: [PATCH 0/3] Fix duplicated VAR and secinfo
Date: Tue, 18 Feb 2025 08:54:36 -0800 [thread overview]
Message-ID: <87v7t7fb1v.fsf@oracle.com> (raw)
In-Reply-To: <377e9ec2-35fe-440d-923c-8918756fc635@oracle.com>
Alan Maguire <alan.maguire@oracle.com> writes:
> On 12/02/2025 00:49, Stephen Brennan wrote:
>> [...]
>> to feedback or suggestions regarding this. I implemented it in pahole
>> because I'm most familiar with that code, and because it seems to me like
>> it's reasonable for libbpf to expect that the input variable information is
>> already deduplicated.
>>
>
> I've been thinking about whether core BTF deduplication should handle
> this case - I'll try and lay out the process and maybe we can think
> about whether it's better to solve in core dedup or within pahole.
>
> The deduplication of VARs should be straightforward - they are
> considered reference types and since in cases like this they share a
> name and refer to the same type, the reference type deduplication could
> be extended to cover them.
I wonder if there's a hidden catch here... DECL_TAGs can refer to VARs,
but as far as I can tell, the semantics are that the DECL_TAG actually
*modifies* the VAR - it says "this variable declaration had this
annotation / tag / whatever". This is fundamentally because the
var_secinfo doesn't point at a chain of DECL_TAGs, it points directly at
the VAR. (Compare that to, e.g. const/volatile/restrict qualifiers)
So, in a hypothetical situation where there are two variables of the
same name & type, but one of them has a DECL_TAG referring to it, would
the deduplication keep these separate? It seems to me that would be the
correct behavior.
> However, the DATASEC references to such
> variables are a bit trickier.
>
> As seen above, the kernel currently disallows DATASEC btf_var_secinfo
> references that are overlapping, i.e. if I have
>
> [123520] DATASEC '.data..percpu' size=229632 vlen=392
> type_id=7708 offset=0 size=48 (VAR 'fixed_percpu_data')
> type_id=5559 offset=4096 size=4096 (VAR 'cpu_debug_store')
> type_id=7060 offset=8192 size=16384 (VAR 'irq_stack_backing_store')
>
>
> ..the kernel enforces that irq_stack_backing_store starts at >= the
> offset of the previous var (cpu_debug_store) + its size (4096+4096 in
> this case). This is why the kernel rejects BTF with multiple instances
> of the same variable, since they overlap.
>
> So if we consider the case of deduplicating variables; before dedup we
> would have something like this in the DATASEC
>
> type_id=8188 offset=256 size=48 (VAR 'foo')
> type_id=9190 offset=256 size=48 (VAR 'foo')
>
>
> ...and after dedup + remapping it would be:
>
> type_id=8188 offset=256 size=48 (VAR 'foo')
> type_id=8188 offset=256 size=48 (VAR 'foo')
>
> This still violates the overlap check. So there are a few options here:
>
> - change the kernel to relax overlap check when multiple references have
> same type id/offset/size; i.e. they refer to the same variable
> - have pahole weed out such occurrences
> - something else?
>
> Ideally we don't want to have to resize DATASECs with such duplicate
> entries as that would add more complexity to dedup.
Do you think you could share a bit more about the added complexity &
cost? I admittedly don't have any knowledge about the deduplicator, but
naively it seems like the ideal solution would be to delete the
duplicated var_secinfo and resize the DATASEC.
> Anyway I'd be interested to hear what others think about whether solving
> this in BTF dedup itself or pahole makes most sense. Thanks!
Thanks for taking a look into this! Hoping to hear from other
experienced voices on this question as well.
Thanks,
Stephen
next prev parent reply other threads:[~2025-02-18 16:54 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-12 0:49 [PATCH 0/3] Fix duplicated VAR and secinfo Stephen Brennan
2025-02-12 0:49 ` [PATCH 1/3] btf_encoder: move btf_encoder__add_decl_tag() Stephen Brennan
2025-02-12 0:49 ` [PATCH 2/3] btf_encoder: postpone VARs until encoding DATASEC Stephen Brennan
2025-02-19 23:51 ` Eduard Zingerman
2025-02-12 0:49 ` [PATCH 3/3] btf_encoder: don't encode duplicate variables Stephen Brennan
2025-02-12 17:57 ` Alan Maguire
2025-02-12 18:21 ` Stephen Brennan
2025-02-18 10:36 ` [PATCH 0/3] Fix duplicated VAR and secinfo Alan Maguire
2025-02-18 16:54 ` Stephen Brennan [this message]
2025-02-19 7:48 ` Eduard Zingerman
2025-02-20 2:26 ` Eduard Zingerman
2025-02-21 1:05 ` Stephen Brennan
2025-02-21 1:30 ` Eduard Zingerman
2025-02-25 1:16 ` Stephen Brennan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87v7t7fb1v.fsf@oracle.com \
--to=stephen.s.brennan@oracle.com \
--cc=acme@kernel.org \
--cc=alan.maguire@oracle.com \
--cc=andrii@kernel.org \
--cc=dwarves@vger.kernel.org \
--cc=eddyz87@gmail.com \
--cc=linux-debuggers@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).