From: Jacob Kroon <jacob.kroon@gmail.com>
To: Richard Purdie <richard.purdie@linuxfoundation.org>,
openembedded-core@lists.openembedded.org
Subject: Re: [OE-core] [RFC PATCH v2 1/2] bitbake.conf: Pad rpath and remove build ID in native binaries
Date: Thu, 2 Dec 2021 15:49:26 +0100
Message-ID: <e388fadb-04d7-e653-2dfa-ccbd6e589251@gmail.com>
In-Reply-To: <c95df2f5084fd93bb10e308e5b501c92c0779d44.camel@linuxfoundation.org>
On 12/2/21 12:09, Richard Purdie wrote:
> On Thu, 2021-12-02 at 12:03 +0100, Jacob Kroon wrote:
>> On 12/2/21 11:51, Richard Purdie wrote:
>>> On Thu, 2021-12-02 at 11:19 +0100, Jacob Kroon wrote:
>>>> On 12/2/21 00:11, Richard Purdie wrote:
>>>>> On Tue, 2021-11-30 at 23:37 +0100, Jacob Kroon wrote:
>>>>>> Try to make sure that the RUNTIME dynamic entry size is the same for all
>>>>>> binaries produced with the native compiler. This is necessary in order to
>>>>>> produce identical binaries when using differently sized buildpaths. I've
>>>>>> tried using only patchelf, and keeping the linker flags as they are, but
>>>>>> I am unable to produce identical binaries. Has anyone else managed to do
>>>>>> this with patchelf ? If not, maybe we can write a new tool that can handle it ?
>>>>>>
>>>>>> The build-id also needs to be removed since it is calculated based on
>>>>>> the data present at link time. This includes STAGING_LIBDIR_NATIVE
>>>>>> and STAGING_BASE_LIBDIR_NATIVE. Both will differ and they need to be temporarily
>>>>>> preserved since some recipes will execute the binaries during do_install()
>>>>>> (for example python3-native). Later on these are removed in chrpath.bbclass.
>>>>>>
>>>>>> This hack is the first step for producing identical native binaries when using
>>>>>> different build paths. 'zstd-native' is a working example.
>>>>>>
>>>>>> Signed-off-by: Jacob Kroon <jacob.kroon@gmail.com>
>>>>>> ---
>>>>>> meta/classes/chrpath.bbclass | 3 +++
>>>>>> meta/conf/bitbake.conf | 5 ++++-
>>>>>> 2 files changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> I'm a little torn on this. Our other option would be to hardcode a specific
>>>>> dummy path and then edit it later to the correct value. That may be neater than
>>>>> adding the padding. It will change the end binaries but hopefully only after
>>>>> they're installed so should give the same net end result more neatly?
>>>>>
>>>>
>>>> Hmm not sure I follow. This patch adds a new dummy rpath entry,
>>>> "/rpath-padding-xxx...", then we remove it in chrpath. I don't know what
>>>> other value we would like to put there. If I understand you correctly,
>>>> we could perhaps pad one of the ones we already pass
>>>>
>>>> -Wl,-rpath,${STAGING_LIBDIR_NATIVE}
>>>> -Wl,-rpath,${STAGING_BASE_LIBDIR_NATIVE}
>>>>
>>>> with spaces, like:
>>>>
>>>> -Wl,-rpath,${STAGING_LIBDIR_NATIVE}
>>>> -Wl,-rpath,"${STAGING_BASE_LIBDIR_NATIVE}${RPATH_PADDING}"
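The padding idea above can be sketched in a few lines of Python. This is
only an illustration: RPATH_PADDING is not an existing variable, and the
helper name and the fixed length of 256 are assumptions picked for the
example. The point is just that the padded entry ends up the same length
whatever the build path is.

```python
# Sketch only: TARGET_LEN and pad_rpath() are hypothetical, not part of OE-core.
TARGET_LEN = 256  # assumed fixed length for the padded rpath entry


def pad_rpath(path: str, target_len: int = TARGET_LEN, fill: str = " ") -> str:
    """Return path padded with `fill` so the result is exactly target_len long."""
    if len(path) > target_len:
        raise ValueError(f"rpath {path!r} is longer than {target_len} characters")
    return path + fill * (target_len - len(path))


# Two differently sized build paths give rpath entries of identical length.
short = pad_rpath("/home/a/build/sysroot/lib")
long_ = pad_rpath("/srv/much/longer/build/path/sysroot/lib")
print(len(short), len(long_))
```

A build path longer than the chosen length cannot be padded, so the sketch
raises an error in that case rather than silently producing a longer entry.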
>>>
>>>
>>> I'm wondering if:
>>>
>>> -Wl,-rpath,/not/exist/our-native-libdir-marker
>>> -Wl,-rpath,/not/exist/our-native-base-libdir-marker
>>>
>>> would work.
>>>
>>
>> Right, I'll give it a try.
>>
Unfortunately this breaks building python3-native. Although it compiles,
during the build the Python build script tries to import the newly built
modules, and if an import fails (which it does) it renames the module:
> *** WARNING: renaming "_curses" since importing it failed: libncurses.so.5: cannot open shared object file: No such file or directory
> *** WARNING: renaming "_curses_panel" since importing it failed: libpanel.so.5: cannot open shared object file: No such file or directory
> *** WARNING: renaming "_ssl" since importing it failed: libssl.so.3: cannot open shared object file: No such file or directory
> *** WARNING: renaming "_hashlib" since importing it failed: libssl.so.3: cannot open shared object file: No such file or directory
> *** WARNING: renaming "nis" since importing it failed: libnsl.so.3: cannot open shared object file: No such file or directory
> *** WARNING: renaming "_ctypes" since importing it failed: libffi.so.8: cannot open shared object file: No such file or directory
I suppose it imports them using the newly built python, which has those
phony rpaths, so it can't find the per-recipe-sysroot
libncurses.so.5/libpanel.so.5/etc. and fails.
The renamed modules end up as:
> sysroots-components/x86_64/python3-native/usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu_failed.so
> sysroots-components/x86_64/python3-native/usr/lib/python3.10/lib-dynload/nis.cpython-310-x86_64-linux-gnu_failed.so
> sysroots-components/x86_64/python3-native/usr/lib/python3.10/lib-dynload/_hashlib.cpython-310-x86_64-linux-gnu_failed.so
> sysroots-components/x86_64/python3-native/usr/lib/python3.10/lib-dynload/_ssl.cpython-310-x86_64-linux-gnu_failed.so
> sysroots-components/x86_64/python3-native/usr/lib/python3.10/lib-dynload/_curses_panel.cpython-310-x86_64-linux-gnu_failed.so
> sysroots-components/x86_64/python3-native/usr/lib/python3.10/lib-dynload/_curses.cpython-310-x86_64-linux-gnu_failed.so
which means any subsequent recipe that uses python3-native will fail to
import any of those modules.
I suspect it might not just be python that wants to run the produced
binaries during the build itself.
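A quick way to spot this kind of breakage is to look for the "_failed"
suffix in lib-dynload. The following is a minimal sketch, assuming the
renamed files always end in "_failed.so" as in the listing above; the
function name is made up for the example.

```python
from pathlib import Path


def find_failed_modules(lib_dynload: Path) -> list[str]:
    """Return names of extension modules that were renamed with a '_failed' suffix."""
    failed = []
    for so in sorted(lib_dynload.glob("*_failed.so")):
        # e.g. "_ssl.cpython-310-x86_64-linux-gnu_failed.so" -> "_ssl"
        failed.append(so.name.split(".", 1)[0])
    return failed
```

Pointing it at the python3-native lib-dynload directory in
sysroots-components would list the modules that later recipes can no
longer import.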
>>>> If that works that would be less intrusive I think.
>>>>
>>>>> If we separate out the build-id patch we could hopefully get that piece merged
>>>>> as that shouldn't be controversial?
>>>>>
>>>>
>>>> Yes, I can split it out into a separate patch.
>>>>
>>>> But now that I've looked at this for a while, I've asked myself what
>>>> good does all this do ? The only optimization I can think of is that if
>>>> we rebuild a native recipe, and the sysroot component turns out the
>>>> same, then we don't need to create a new sstate cache entry. So we save
>>>> disk space, but disk space is cheap. We still need to build it. What I
>>>> would like is to have a common sstate dir for multiple build
>>>> directories. So if I build libtool-native in one build path, then at my
>>>> other build path it would just pick it up from sstate cache when I build
>>>> there. In the end, is that something that would be possible ?
>>>
>>> We originally started here with gcc-cross so let's consider that and multiple
>>> build directories where a patch changes gcc-cross in a way that is irrelevant to
>>> the output.
>>>
>>> The "win" is that regardless of whether I build in location A or B, I get the
>>> same gcc-cross binary. Hash-equiv will then not rebuild the target binaries.
>>> Yes, I pay the price of a gcc-cross rebuild but hashequiv saves the targets
>>> rebuilding.
>>>
>>> Currently it would only happen if you always build gcc-cross in a specific build
>>> path.
>>>
>>
>> I know the build path will change if I upgrade to a new version of gcc,
>> but then the output is most definitely gonna change as well.
>>
>>> Like everything, it is a question of looking at the changes and deciding whether
>>> they are worth any maintenance burden/code complication or additional overhead
>>> they generate. I don't know the answer here yet but I do appreciate the research
>>> in helping get us data to make decisions on!
>>>
>>
>> I was thinking if it was possible to add a "build-path-does-not-matter"
>> .bbclass that would make the signatures independent of build path and
>> then scan the output to make sure it didn't contain any references to
>> the build path. Then recipes that don't depend on the build path could
>> inherit from that class, and then maybe their sstate could be reused
>> from multiple build directories ? Not sure how reliable it would be though..
>
> Another crazy thought is our sstate really is already path independent,
> regardless of the binary content. You could therefore make the hash function
> replace the path with a fixed string. The downside is that doesn't work well on
> binaries due to offsets, alignment and so on.
>
> As I read the above I was reminded that insane.bbclass does sanity check the
> output for build paths and does have a configurable control mechanism. It
> doesn't do that for the populate_sysroot output though since it is for
> do_package.
>
> Lots to think about here but you're right that adding some kind of scanner to
> mark up recipes over time would help us preserve this.
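Such a scanner could be as simple as searching every output file for the
build path as a byte string. A rough sketch follows; it is not how
insane.bbclass does its check, and the function name and the idea of
running it over the populate_sysroot output are assumptions for
illustration.

```python
from pathlib import Path


def files_containing_path(root: Path, build_path: str) -> list[Path]:
    """Return regular files under root whose contents include build_path."""
    needle = build_path.encode()
    hits = []
    for f in sorted(root.rglob("*")):
        # Skip directories and symlinks; read everything else as raw bytes
        # so that binaries and text files are treated alike.
        if f.is_file() and not f.is_symlink() and needle in f.read_bytes():
            hits.append(f)
    return hits
```

An empty result would suggest the recipe output is free of build path
references, although paths that are compressed or otherwise encoded in
the output would not be caught by a plain byte search.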
Jacob
Thread overview: 9+ messages
2021-11-30 22:37 [RFC PATCH v2 0/2] Improve native/cross reproducibility Jacob Kroon
2021-11-30 22:37 ` [RFC PATCH v2 1/2] bitbake.conf: Pad rpath and remove build ID in native binaries Jacob Kroon
2021-12-01 23:11 ` [OE-core] " Richard Purdie
2021-12-02 10:19 ` Jacob Kroon
2021-12-02 10:51 ` Richard Purdie
2021-12-02 11:03 ` Jacob Kroon
2021-12-02 11:09 ` Richard Purdie
2021-12-02 14:49 ` Jacob Kroon [this message]
2021-11-30 22:37 ` [RFC PATCH v2 2/2] Improve native reproducibility in recipes Jacob Kroon