From: Jiri Olsa <olsajiri@gmail.com>
To: Jiri Olsa <olsajiri@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>,
Marcus Seyfarth <m.seyfarth@gmail.com>,
Masahiro Yamada <masahiroy@kernel.org>, bpf <bpf@vger.kernel.org>,
clang-built-linux <llvm@lists.linux.dev>,
Stanislav Fomichev <sdf@google.com>,
Nathan Chancellor <nathan@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
Song Liu <song@kernel.org>
Subject: Re: duplicate BTF_IDs leading to symbol redefinition errors?
Date: Thu, 14 Sep 2023 10:17:31 +0200
Message-ID: <ZQLBm8sC+V53CIzD@krava>
In-Reply-To: <ZPuA5+HmbcdBLbIq@krava>

On Fri, Sep 08, 2023 at 10:15:35PM +0200, Jiri Olsa wrote:
> On Fri, Sep 08, 2023 at 10:14:56AM -0700, Nick Desaulniers wrote:
> > Thanks for the patch!
> >
> > + Marcus
> >
> > Marcus can you please test the below patch and provide your tested-by
> > and reported-by tags?
> >
> > On Fri, Sep 8, 2023 at 4:47 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Thu, Sep 07, 2023 at 10:33:00PM +0200, Jiri Olsa wrote:
> > > > On Thu, Sep 07, 2023 at 12:01:18PM -0700, Nick Desaulniers wrote:
> > > > > So we've got a curious report recently:
> > > > > https://github.com/ClangBuiltLinux/linux/issues/1913
> > > > >
> > > > > ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
> > > > > '__BTF_ID__struct__cgroup__624' is already defined
> > > > > __BTF_ID__struct__cgroup__624:
> > > > > ^
> > > > >
> > > > > It's been hard to pin down a SHA and .config to reproduce this, but
> > > > > looking at the definition of BTF_ID's usage of __ID's usage of
> > > > > __COUNTER__, and the two statements:
> > > > >
> > > > > kernel/bpf/helpers.c:2460:BTF_ID(struct, cgroup)
> > > > > kernel/bpf/verifier.c:5075:BTF_ID(struct, cgroup)
> > > > >
> > > > > Is it possible that __COUNTER__ could evaluate to the same value
> > > > > across 2 different translation units, leading to a name collision like
> > > > > the above?
> > > >
> > > > hum, that's probably the case, I see the same counter values on different
> > > > __BTF_ID_ symbols:
> > > >
> > > > ffffffff833fe540 r __BTF_ID__struct__bpf_bloom_filter__380
> > > > ffffffff833fe548 r __BTF_ID__struct__bpf_queue_stack__380
> > > > ffffffff833fe578 r __BTF_ID__struct__cgroup__380
> > > >
> > > > perhaps we were just lucky not to hit that :-\
> > > >
> > > > >
> > > > > looking at another usage of BTF_ID other than struct cgroup;
> > > > > kernel/bpf/helpers.c:2461:BTF_ID(func, bpf_cgroup_release)
> > > > > is only defined in one translation unit
> > > > >
> > > > > Should one of those two `BTF_ID(struct, cgroup)` be removed? Is there
> > > > > some other way we can avoid these collisions in the future?
> > > >
> > > > need to find some way to make the symbol unique, will check
> > >
> > > the change below uses the object's path as the __BTF_ID_.. symbol suffix
> > > to make it unique
> > >
> > > I'm still looking, but can't think of a better way so far, perhaps somebody
> > > will have a better idea
> >
> > Another good approach; I had simply added __LINE__ into the paste.
> > https://github.com/ClangBuiltLinux/linux/issues/1913#issuecomment-1710794319
> > Which just makes the probability of this occurring again smaller, but
> > still non-zero.
>
> yes, there's still possibility of the match
>
> >
> > + Masahiro for thoughts on the invocation of echo and base32. Looks
> > like base32 is part of coreutils. Kind of strange that coreutils isn't
> > listed in Documentation/process/changes.rst. Would adding the usage
> > of base32 add a new dependency on coreutils?
> >
> > >
> > > jirka
> > >
> > >
> > > ---
> > > diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> > > index a3462a9b8e18..564953f9cbc7 100644
> > > --- a/include/linux/btf_ids.h
> > > +++ b/include/linux/btf_ids.h
> > > @@ -49,7 +49,7 @@ word \
> > > ____BTF_ID(symbol, word)
> > >
> > > #define __ID(prefix) \
> > > - __PASTE(prefix, __COUNTER__)
> > > + __PASTE(__PASTE(prefix, __COUNTER__), BTF_ID_BASE)
> >
> > Do we still need __COUNTER__ if we're now using BTF_ID_BASE?
>
> yes we still need that because we could have same __BTF_ID__...
> symbol used multiple times within same object, and that's where
> __COUNTER__ makes the difference
>
> >
> > >
> > > /*
> > > * The BTF_ID defines unique symbol for each ID pointing
> > > diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
> > > index 68d0134bdbf9..2ef8b2798be0 100644
> > > --- a/scripts/Makefile.lib
> > > +++ b/scripts/Makefile.lib
> > > @@ -200,6 +200,10 @@ _c_flags += $(if $(patsubst n%,, \
> > > -D__KCSAN_INSTRUMENT_BARRIERS__)
> > > endif
> > >
> > > +ifeq ($(CONFIG_DEBUG_INFO_BTF),y)
> > > +_c_flags += -DBTF_ID_BASE=$(subst =,,$(shell echo -n $(modfile) | base32 -w0))
> >
> > `man 1 base32` shows it can just read a file. Could the above be:
> >
> > _c_flags += -DBTF_ID_BASE=$(subst =,,$(shell base32 -w0 $(modfile)))
> >
> > ? (untested)
> >
> > Also, the output of
> >
> > $ base32 -w0 Documentation/process/changes.rst
> >
> > is 24456 characters. This is going to blow up symbol tables. I
> > suppose ELF probably has some length limit on symbol names, too. I
> > was nervous about my approach of appending __LINE__.
> >
> > Perhaps pipe the output to `head -c <n bytes>`?
>
> so the change is about adding unique id that's basically path of
> the object stored in base32 so it could be used as symbol, so we
> don't really need to read the actual file
>
> the problem is when BTF_ID definition like:
>
> BTF_ID(struct, cgroup)
>
> translates in 2 separate objects into same symbol name because of
> the matching __COUNTER__ macro values (like 380 below)
>
> __BTF_ID__struct__cgroup__380
>
> this change just adds unique id of the path name at the end of the
> symbol with:
>
> echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
>
> so the symbol looks like:
>
> __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
>
> and is unique over the sources
>
> but I still hope we could come up with some better solution ;-)

so far the only better solution I could come up with is to use
cksum (also from coreutils) instead of base32, which makes the
BTF_ID_BASE value compact

I'll run a test to find out how much it hurts the build time

jirka
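
A minimal shell sketch of the two suffix encodings discussed above, assuming
coreutils' base32 and cksum are on PATH. The helper names suffix_base32 and
suffix_cksum are made up for illustration and are not part of the patch;
printf '%s' stands in for echo -n.

```shell
#!/bin/sh
# Hypothetical helpers, named here only for illustration.

# base32 variant from the earlier patch: encode the path, drop '=' padding.
suffix_base32() {
    printf '%s' "$1" | base32 -w0 | tr -d '='
}

# cksum variant from this patch: keep only the first field (the CRC).
suffix_cksum() {
    printf '%s' "$1" | cksum | cut -d' ' -f1
}

path='kernel/bpf/helpers'
printf 'base32: %s\n' "$(suffix_base32 "$path")"
printf 'cksum:  %s\n' "$(suffix_cksum "$path")"
```

The base32 suffix grows with the length of the path, while the cksum value
is a 32-bit CRC printed in decimal, so it never exceeds ten digits.
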
---
diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index a3462a9b8e18..564953f9cbc7 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -49,7 +49,7 @@ word \
 	____BTF_ID(symbol, word)

 #define __ID(prefix) \
-	__PASTE(prefix, __COUNTER__)
+	__PASTE(__PASTE(prefix, __COUNTER__), BTF_ID_BASE)

 /*
  * The BTF_ID defines unique symbol for each ID pointing
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 68d0134bdbf9..01b14e6a7df3 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -200,6 +200,10 @@ _c_flags += $(if $(patsubst n%,, \
 	-D__KCSAN_INSTRUMENT_BARRIERS__)
 endif

+ifeq ($(CONFIG_DEBUG_INFO_BTF),y)
+_c_flags += -DBTF_ID_BASE=$(firstword $(shell echo -n $(modfile) | cksum))
+endif
+
 # $(srctree)/$(src) for including checkin headers from generated source files
 # $(objtree)/$(obj) for including generated headers from checkin source files
 ifeq ($(KBUILD_EXTMOD),)
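
For illustration, the symbol name that results from the nested __PASTE can
be mimicked in shell. This is only a sketch: btf_symbol is a hypothetical
helper, 380 is the counter value quoted earlier in the thread, and the real
name is produced by the C preprocessor, not by a script.

```shell
#!/bin/sh
# Hypothetical helper: mimic __BTF_ID__<type>__<name>__<counter><BTF_ID_BASE>.
btf_symbol() {
    # $1 = type, $2 = name, $3 = __COUNTER__ value, $4 = object path
    base=$(printf '%s' "$4" | cksum | cut -d' ' -f1)
    printf '__BTF_ID__%s__%s__%s%s\n' "$1" "$2" "$3" "$base"
}

# Same BTF_ID(struct, cgroup) and same counter value in two objects.
btf_symbol struct cgroup 380 kernel/bpf/helpers
btf_symbol struct cgroup 380 kernel/bpf/verifier
```

With the same counter value, the two printed names share their prefix and
differ only in the trailing checksum, provided the two paths do not happen
to share a CRC.
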