All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiri Olsa <olsajiri@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Jiri Olsa <olsajiri@gmail.com>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Marcus Seyfarth <m.seyfarth@gmail.com>, bpf <bpf@vger.kernel.org>,
	clang-built-linux <llvm@lists.linux.dev>,
	Stanislav Fomichev <sdf@google.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Yonghong Song <yonghong.song@linux.dev>,
	Song Liu <song@kernel.org>
Subject: Re: duplicate BTF_IDs leading to symbol redefinition errors?
Date: Sun, 17 Sep 2023 16:09:31 +0200	[thread overview]
Message-ID: <ZQcIm/TZH73IklIf@krava> (raw)
In-Reply-To: <CAEf4BzZe_27FPzMwjaU3d5gPuyXX3iTQJGzT64CCTZLEfpQUvQ@mail.gmail.com>

On Fri, Sep 15, 2023 at 01:41:25PM -0700, Andrii Nakryiko wrote:
> On Fri, Sep 15, 2023 at 1:28 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Thu, Sep 14, 2023 at 11:14:19AM -0700, Andrii Nakryiko wrote:
> > > On Thu, Sep 14, 2023 at 2:52 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 14, 2023 at 05:30:51PM +0900, Masahiro Yamada wrote:
> > > >
> > > > SNIP
> > > >
> > > > > > > so the change is about adding unique id that's basically path of
> > > > > > > the object stored in base32 so it could be used as symbol, so we
> > > > > > > don't really need to read the actual file
> > > > > > >
> > > > > > > the problem is when BTF_ID definition like:
> > > > > > >
> > > > > > > BTF_ID(struct, cgroup)
> > > > > > >
> > > > > > > translates in 2 separate objects into same symbol name because of
> > > > > > > the matching __COUNTER__ macro values (like 380 below)
> > > > > > >
> > > > > > >   __BTF_ID__struct__cgroup__380
> > > > > > >
> > > > > > > this change just adds unique id of the path name at the end of the
> > > > > > > symbol with:
> > > > > > >
> > > > > > >   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > > > >
> > > > > > > so the symbol looks like:
> > > > > > >
> > > > > > >   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > > > >
> > > > > > > and is unique over the sources
> > > > > > >
> > > > > > > but I still hope we could come up with some better solution ;-)
> > > > > >
> > > > > > so far the only better solution I could come up with is to use
> > > > > > cksum (also from coreutils) instead of base32, which makes the
> > > > > > BTF_ID_BASE value compact
> > > > > >
> > > > > > I'll run test to find out how much it hurts the build time
> > > > > >
> > > > > > jirka
> > > > >
> > > > >
> > > > >
> > > > > Seems a bad idea to me.
> > > > >
> > > > > It would fork a new shell and chsum for all files,
> > > > > while only a few of them need it.
> > > >
> > > > right, I have a change to limit this on kernel and net directories,
> > > > but it's still bad
> > > >
> > > > >
> > > > > Better to consult BTF forks.
> > > >
> > > > perhaps there's better way within kbuild to get unique id/value
> > > > for each object file?
> > >
> > > let's just use __LINE__ + __COUNTER__ for now and teach resolve_btfids
> > > to fail and complain loudly about duplicate symbols?
> >
> > ok, will send that.. but it fails during link before resolve_btfids
> > takes place
> >
> > >
> > >
> > > This will give us time and opportunity to implement a better approach
> > > to .BTF_ids overall. Encoding the desired type name in the symbol name
> > > always felt off. Maybe it's better to encode type + name as data,
> > > which is discarded at the latest stage during vmlinux linking? Either
> >
> > hum, so maybe having a special section (.BTF_ids_desc below)
> > that would have record for each ID placed in .BTF_ids section:
> >
> > asm(                                           \
> > ".pushsection .BTF_ids,\"a\";        \n"       \
> > "1:                                  \n"       \
> > ".zero 4                             \n"       \
> > ".popsection;                        \n"       \
> > ".pushsection .BTF_ids_desc,\"a\";   \n"       \
> > ".long 1b                            \n"       \
> >
> > and somehow get prefix and name pointers in here:
> >
> > ".long prefix
> > ".long name
> >
> > ".popsection;                        \n");
> >
> > so resolve_btfids would iterate .BTF_ids_desc records and fix
> > up .BTF_ids data..
> >
> 
> Something like that. I don't think it's even a regression in terms of
> vmlinux space usage, because right now we spend as much space on
> storing symbol names. So just adding string pointers would be already
> a win due to repeating "struct", "func", etc strings.
> 
> 
> > we might need to do one extra link phase to get rid of the
> > .BTF_ids_desc secion
> 
> Hopefully we can find a way to avoid this, we already do like 3 link
> phases at least (for kallsyms), so doing all that on the first one and
> then stripping it out using link script at subsequent one would be
> best.

perhaps we could move that section under .init.data and
get rid of it on startup

jirka

  reply	other threads:[~2023-09-17 14:09 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-07 19:01 duplicate BTF_IDs leading to symbol redefinition errors? Nick Desaulniers
2023-09-07 20:33 ` Jiri Olsa
2023-09-08 11:47   ` Jiri Olsa
2023-09-08 17:14     ` Nick Desaulniers
2023-09-08 20:15       ` Jiri Olsa
2023-09-11 16:21         ` Nick Desaulniers
     [not found]         ` <CA+FbhJNz4i4pU+8nT7JBvQKSa0VCkzcNzaJ=dRdRn+JCSTdgKQ@mail.gmail.com>
2023-09-11 18:17           ` Marcus Seyfarth
2023-09-14  8:17         ` Jiri Olsa
2023-09-14  8:30           ` Masahiro Yamada
2023-09-14  9:52             ` Jiri Olsa
2023-09-14 18:14               ` Andrii Nakryiko
2023-09-15  8:28                 ` Jiri Olsa
2023-09-15 16:47                   ` Nick Desaulniers
2023-09-15 20:41                   ` Andrii Nakryiko
2023-09-17 14:09                     ` Jiri Olsa [this message]
2023-09-24 13:27                       ` Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZQcIm/TZH73IklIf@krava \
    --to=olsajiri@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=m.seyfarth@gmail.com \
    --cc=masahiroy@kernel.org \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.