From: Stefan Agner <stefan@agner.ch>
To: openembedded-core@lists.openembedded.org,
richard.purdie@linuxfoundation.org
Cc: Brandon Shibley <brandon.shibley@toradex.com>,
samuel.bissig@toradex.com, ricardo@foundries.io
Subject: Re: Build failure with parallel build and opkg
Date: Wed, 26 Sep 2018 11:34:31 +0200
Message-ID: <abbffc6e7fa60468423b2cf1ec39f311@agner.ch>
In-Reply-To: <ae881b799c12d2514345825d8fa5297f@agner.ch>
Hi,
On 12.09.2018 00:49, Stefan Agner wrote:
> Hi,
>
> We experience build errors as follows every now and then:
>
> ...
> ERROR: full-container-image-0.1-r0 do_populate_sdk: Unable to install
> packages. Command
> '/workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/recipe-sysroot-native/usr/bin/opkg
> --volatile-cache -f
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/opkg.conf
> -t
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/temp/ipktemp/
> -o
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi
> --force_postinstall --prefer-arch-to-version install 96boards-tools
> aktualizr aktualizr-host-tools aktualizr-runtime-prov base-passwd
> coreutils cpufrequtils docker gptfdisk haveged hostapd htop iptables
> kernel-modules ldd less lmp-device-register networkmanager
> networkmanager-nmtui openssh-sftp-server os-release ostree
> packagegroup-base-extended packagegroup-core-boot
> packagegroup-core-full-cmdline-extended
> packagegroup-core-full-cmdline-multiuser
> packagegroup-core-full-cmdline-utils packagegroup-core-ssh-openssh
> packagegroup-core-standalone-sdk-target pciutils python3-compression
> python3-distutils python3-docker python3-docker-compose python3-json
> python3-netclient python3-pkgutil python3-shell python3-unixadmin rsync
> run-postinsts shadow sshfs-fuse strace sudo target-sdk-provides-dummy
> tcpdump vim-tiny' returned 255:
> ...
> Downloading
> file:/workdir/oe/tmp/deploy/ipk/armv7at2hf-neon/nss_3.38-r0_armv7at2hf-neon.ipk.
> Removing corrupt package file
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi//var/cache/opkg/volatile/8e392ecd3611e24a6a49a8b22ad6e1ff_nss_3.38-r0_armv7at2hf-neon.ipk.
> ...
> Installing pam-plugin-faildelay (1.3.0) on root
> Downloading
> file:/workdir/oe/tmp/deploy/ipk/armv7at2hf-neon/pam-plugin-faildelay_1.3.0-r5_armv7at2hf-neon.ipk.
> Removing corrupt package file
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi//var/cache/opkg/volatile/0df6a8bc594a581f6ca3bcfa55e860e2_pam-plugin-faildelay_1.3.0-r5_armv7at2hf-neon.ipk.
> ...
> Collected errors:
> * opkg_install_pkg: Failed to download nss. Perhaps you need to run
> 'opkg update'?
> * opkg_install_pkg: Failed to download pam-plugin-faildelay. Perhaps
> you need to run 'opkg update'?
> .
> ...
>
> We build our own OpenEmbedded core based distribution currently based on
> a recent master state. But we have seen this on and off back since
> rocko.
>
> We build the image using Jenkins with multiple builders running in
> parallel and sharing sstate. I think the fact that we run similar images
> in parallel is the culprit: Looking closer at the failed build directory
> reveals that the tmp-glibc/deploy/ipk/armv7at2hf-neon/Packages has a
> different MD5Sum than the actual package. We start with two builders
> simultaneously building an image, and it seems that they build the same
> package around the same time. I assume that the two builders somehow
> have a race between when the package get assembled and when the Package
> index gets built...
>
> We start with a clean sstate, and this typically only happens for the
> very first builds, when the sstate is cold.
We discussed the issue briefly at Linaro Connect.
To recap, we build in two steps:
1. bitbake full-container-image
2. bitbake -c populate_sdk full-container-image
The issue always happens in the second step.
We also see that, in the second step, the do_package_write_ipk_setscene
task is executed for every recipe.
The current assumption is
I tried to reproduce the issue manually, using only openembedded-core
master, by building a recipe in two build directories with shared sstate:
1. build1 $ bitbake eudev
2. build2 $ bitbake -c cleansstate eudev
3. build2 $ bitbake eudev
4. build1 $ bitbake core-image-minimal
This sequence does not seem to have triggered
do_package_write_ipk_setscene for eudev.
I then tried:
5. build1 $ bitbake -c populate_sdk core-image-minimal
This did trigger do_package_write_ipk_setscene. However, the issue
did not appear.
I even tried rebuilding and replacing the file manually and then running
bitbake -c populate_sdk -f core-image-minimal, but the issue still does
not appear.
The last time I saw it was with oe-core
f6634581fa0a81c4d68dc9179a755ad7b9d99357. I will revert to that revision
to see whether that helps reproduce the issue.
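A minimal sketch for checking a feed directory for the mismatch described
in the quoted report (an MD5Sum in the Packages index that differs from
the package file on disk). It assumes the index consists of
blank-line-separated stanzas with "Filename:" and "MD5Sum:" fields; the
function names are made up for illustration and are not part of opkg or
bitbake.

```python
import hashlib
from pathlib import Path


def parse_index(text):
    """Yield one dict of fields per stanza of a Packages index (assumed format)."""
    for stanza in text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines (leading whitespace) and lines without a key.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if fields:
            yield fields


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(feed_dir):
    """Return (filename, indexed_md5, actual_md5 or None) for each stale entry.

    actual_md5 is None when the file named in the index does not exist.
    """
    feed = Path(feed_dir)
    mismatches = []
    for entry in parse_index((feed / "Packages").read_text()):
        name = entry.get("Filename")
        indexed = entry.get("MD5Sum")
        if not name or not indexed:
            continue
        pkg = feed / name
        actual = md5_of(pkg) if pkg.is_file() else None
        if actual != indexed.lower():
            mismatches.append((name, indexed, actual))
    return mismatches
```

Calling find_mismatches() on a directory such as
tmp-glibc/deploy/ipk/armv7at2hf-neon right after a failed build would
list the packages whose index entry is stale. It only detects the
symptom; it does not show which task produced the inconsistent state.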
--
Stefan
>
> I guess there is some race/asynchronous operation going on around
> building index/getting package from sstate/pushing package to sstate.
>
> It seems an issue others have seen in the past too:
> https://www.yoctoproject.org/irc/%23yocto.2018-07-05.log.html#t2018-07-05T10:07:25
>
> Any idea?
>
> --
> Stefan