All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stephen Brennan <stephen.s.brennan@oracle.com>
To: Alan Maguire <alan.maguire@oracle.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>
Cc: dwarves@vger.kernel.org, linux-debuggers@vger.kernel.org,
	Eduard Zingerman <eddyz87@gmail.com>
Subject: Re: [PATCH 0/3] Fix duplicated VAR and secinfo
Date: Tue, 18 Feb 2025 08:54:36 -0800	[thread overview]
Message-ID: <87v7t7fb1v.fsf@oracle.com> (raw)
In-Reply-To: <377e9ec2-35fe-440d-923c-8918756fc635@oracle.com>

Alan Maguire <alan.maguire@oracle.com> writes:
> On 12/02/2025 00:49, Stephen Brennan wrote:
>> [...]
>> to feedback or suggestions regarding this. I implemented it in pahole
>> because I'm most familiar with that code, and because it seems to me like
>> it's reasonable for libbpf to expect that the input variable information is
>> already deduplicated.
>> 
>
> I've been thinking about whether core BTF deduplication should handle
> this case - I'll try and lay out the process and maybe we can think
> about whether it's better to solve in core dedup or within pahole.
>
> The deduplication of VARs should be straightforward - they are
> considered reference types and since in cases like this they share a
> name and refer to the same type, the reference type deduplication could
> be extended to cover them.

I wonder if there's a hidden catch here... DECL_TAGs can refer to VARs,
but as far as I can tell, the semantics are that the DECL_TAG actually
*modifies* the VAR - it says "this variable declaration had this
annotation / tag / whatever". This is fundamentally because the
var_secinfo doesn't point at a chain of DECL_TAGs, it points directly at
the VAR. (Compare that to, e.g. const/volatile/restrict qualifiers)

So, in a hypothetical situation where there are two variables of the
same name & type, but one of them has a DECL_TAG referring to it, would
the deduplication keep these separate?  It seems to me that would be the
correct behavior.

> However, the DATASEC references to such
> variables are a bit trickier.
>
> As seen above, the kernel currently disallows DATASEC btf_var_secinfo
> references that are overlapping, i.e. if I have
>
> [123520] DATASEC '.data..percpu' size=229632 vlen=392
> 	type_id=7708 offset=0 size=48 (VAR 'fixed_percpu_data')
> 	type_id=5559 offset=4096 size=4096 (VAR 'cpu_debug_store')
> 	type_id=7060 offset=8192 size=16384 (VAR 'irq_stack_backing_store')
>
>
> ..the kernel enforces that irq_stack_backing_store starts at >= the
> offset of the previous var (cpu_debug_store) + its size (4096+4096 in
> this case). This is why the kernel rejects BTF with multiple instances
> of the same variable, since they overlap.
>
> So if we consider the case of deduplicating variables; before dedup we
> would have something like this in the DATASEC
>
> 	type_id=8188 offset=256 size=48 (VAR 'foo')
> 	type_id=9190 offset=256 size=48 (VAR 'foo')
>
>
> ...and after dedup + remapping it would be:
>
> 	type_id=8188 offset=256 size=48 (VAR 'foo')
> 	type_id=8188 offset=256 size=48 (VAR 'foo')
>
> This still violates the overlap check. So there are a few options here:
>
> - change the kernel to relax overlap check when multiple references have
> same type id/offset/size; i.e. they refer to the same variable
> - have pahole weed out such occurrences
> - something else?
>
> Ideally we don't want to have to resize DATASECs with such duplicate
> entries as that would add more complexity to dedup.

Do you think you could share a bit more about the added complexity &
cost? I admittedly don't have any knowledge about the deduplicator, but
naively it seems like the ideal solution would be to delete the
duplicated var_secinfo and resize the DATASEC.

> Anyway I'd be interested to hear what others think about whether solving
> this in BTF dedup itself or pahole makes most sense. Thanks!

Thanks for taking a look into this! Hoping to hear from other
experienced voices on this question as well.

Thanks,
Stephen

  reply	other threads:[~2025-02-18 16:54 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-12  0:49 [PATCH 0/3] Fix duplicated VAR and secinfo Stephen Brennan
2025-02-12  0:49 ` [PATCH 1/3] btf_encoder: move btf_encoder__add_decl_tag() Stephen Brennan
2025-02-12  0:49 ` [PATCH 2/3] btf_encoder: postpone VARs until encoding DATASEC Stephen Brennan
2025-02-19 23:51   ` Eduard Zingerman
2025-02-12  0:49 ` [PATCH 3/3] btf_encoder: don't encode duplicate variables Stephen Brennan
2025-02-12 17:57   ` Alan Maguire
2025-02-12 18:21     ` Stephen Brennan
2025-02-18 10:36 ` [PATCH 0/3] Fix duplicated VAR and secinfo Alan Maguire
2025-02-18 16:54   ` Stephen Brennan [this message]
2025-02-19  7:48     ` Eduard Zingerman
2025-02-20  2:26 ` Eduard Zingerman
2025-02-21  1:05   ` Stephen Brennan
2025-02-21  1:30     ` Eduard Zingerman
2025-02-25  1:16       ` Stephen Brennan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v7t7fb1v.fsf@oracle.com \
    --to=stephen.s.brennan@oracle.com \
    --cc=acme@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=andrii@kernel.org \
    --cc=dwarves@vger.kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=linux-debuggers@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.