Openembedded Core Discussions
From: Stefan Agner <stefan@agner.ch>
To: openembedded-core@lists.openembedded.org,
	richard.purdie@linuxfoundation.org
Cc: Brandon Shibley <brandon.shibley@toradex.com>,
	samuel.bissig@toradex.com, ricardo@foundries.io
Subject: Re: Build failure with parallel build and opkg
Date: Wed, 26 Sep 2018 11:34:31 +0200
Message-ID: <abbffc6e7fa60468423b2cf1ec39f311@agner.ch>
In-Reply-To: <ae881b799c12d2514345825d8fa5297f@agner.ch>

Hi,

On 12.09.2018 00:49, Stefan Agner wrote:
> Hi,
> 
> We experience build errors as follows every now and then:
> 
> ...
> ERROR: full-container-image-0.1-r0 do_populate_sdk: Unable to install
> packages. Command
> '/workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/recipe-sysroot-native/usr/bin/opkg
> --volatile-cache -f
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/opkg.conf
> -t
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/temp/ipktemp/
> -o
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi
>  --force_postinstall --prefer-arch-to-version   install 96boards-tools
> aktualizr aktualizr-host-tools aktualizr-runtime-prov base-passwd
> coreutils cpufrequtils docker gptfdisk haveged hostapd htop iptables
> kernel-modules ldd less lmp-device-register networkmanager
> networkmanager-nmtui openssh-sftp-server os-release ostree
> packagegroup-base-extended packagegroup-core-boot
> packagegroup-core-full-cmdline-extended
> packagegroup-core-full-cmdline-multiuser
> packagegroup-core-full-cmdline-utils packagegroup-core-ssh-openssh
> packagegroup-core-standalone-sdk-target pciutils python3-compression
> python3-distutils python3-docker python3-docker-compose python3-json
> python3-netclient python3-pkgutil python3-shell python3-unixadmin rsync
> run-postinsts shadow sshfs-fuse strace sudo target-sdk-provides-dummy
> tcpdump vim-tiny' returned 255:
> ...
> Downloading
> file:/workdir/oe/tmp/deploy/ipk/armv7at2hf-neon/nss_3.38-r0_armv7at2hf-neon.ipk.
> Removing corrupt package file
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi//var/cache/opkg/volatile/8e392ecd3611e24a6a49a8b22ad6e1ff_nss_3.38-r0_armv7at2hf-neon.ipk.
> ...
> Installing pam-plugin-faildelay (1.3.0) on root
> Downloading
> file:/workdir/oe/tmp/deploy/ipk/armv7at2hf-neon/pam-plugin-faildelay_1.3.0-r5_armv7at2hf-neon.ipk.
> Removing corrupt package file
> /workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi//var/cache/opkg/volatile/0df6a8bc594a581f6ca3bcfa55e860e2_pam-plugin-faildelay_1.3.0-r5_armv7at2hf-neon.ipk.
> ...
> Collected errors:
>  * opkg_install_pkg: Failed to download nss. Perhaps you need to run
> 'opkg update'?
>  * opkg_install_pkg: Failed to download pam-plugin-faildelay. Perhaps
> you need to run 'opkg update'?
> .
> ...
> 
> We build our own OpenEmbedded core based distribution currently based on
> a recent master state. But we have seen this on and off back since
> rocko.
> 
> We build the image using Jenkins with multiple builders running in
> parallel and sharing sstate. I think the fact that we run similar images
> in parallel is the culprit: Looking closer at the failed build directory
> reveals that the tmp-glibc/deploy/ipk/armv7at2hf-neon/Packages has a
> different MD5Sum than the actual package. We start with two builders
> simultaneously building an image, and it seems that they build the same
> package around the same time. I assume that the two builders somehow
> race between when the package gets assembled and when the Packages
> index gets built...
> 
> We start with a clean sstate, and this typically only happens for the
> very first builds, when the sstate is cold.

We discussed the issue at Linaro Connect a bit.

To recap, we build in two steps:

1. bitbake full-container-image
2. bitbake -c populate_sdk full-container-image

The issue always happens in the second step.
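
To see which packages were hit in a failed populate_sdk run, the opkg
output can be scanned for the "Removing corrupt package file" lines shown
in the quoted log. The following is only a rough sketch: the function name
corrupt_packages is made up for illustration, and stripping a leading
32-hex-digit prefix from the cached file name is an assumption based on
the two file names visible in that log.

```python
import re
from pathlib import PurePosixPath

# Matches the opkg message; the path may follow on the next line and
# ends with a trailing period in the log excerpt.
CORRUPT_RE = re.compile(r"Removing corrupt package file\s+(\S+?\.ipk)\.?(?=\s|$)")

# Assumption: cached files are named "<32 hex digits>_<original name>.ipk",
# as seen in the quoted log.
CACHE_PREFIX_RE = re.compile(r"^[0-9a-f]{32}_")


def corrupt_packages(log_text):
    """Return the .ipk file names that opkg reported as corrupt."""
    names = []
    for match in CORRUPT_RE.finditer(log_text):
        basename = PurePosixPath(match.group(1)).name
        names.append(CACHE_PREFIX_RE.sub("", basename))
    return names
```

Feeding it the text of the failing task log returns a list of file names
that can then be compared against the feed directory.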

We also see that in the second step, the do_package_write_ipk_setscene
task for every recipe is executed.

The current assumption is

I tried to reproduce the issue manually by building a recipe with plain
openembedded-core master in two build directories that share sstate:

1. build1 $ bitbake eudev
2. build2 $ bitbake -c cleansstate eudev
3. build2 $ bitbake eudev
4. build1 $ bitbake core-image-minimal

This sequence does not seem to have triggered a
do_package_write_ipk_setscene for eudev.

I then tried:

5. build1 $ bitbake -c populate_sdk core-image-minimal

That did trigger a do_package_write_ipk_setscene. However, the issue
did not appear...

I even tried rebuilding the package, replacing the file manually, and
running bitbake -c populate_sdk -f core-image-minimal, but the issue
still does not appear.

The last time I saw it was with oe-core
f6634581fa0a81c4d68dc9179a755ad7b9d99357. I will go back to that revision
to see whether it helps reproduce the issue.
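
For checking a feed directory after a failed build, the mismatch between
the Packages index and the actual package files described in the quoted
report can be detected with a short script. This is a minimal sketch, not
part of OE-Core: the helper names are hypothetical, and it assumes each
index stanza carries "Filename:" and "MD5Sum:" fields with the file name
relative to the directory that holds the Packages file.

```python
import hashlib
from pathlib import Path


def parse_index(text):
    """Split a Packages index into one dict of fields per stanza."""
    stanzas = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
                current = {}
        elif line[0] in " \t":
            continue  # continuation line, e.g. a multi-line Description
        elif ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def md5_of(path):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(feed_dir):
    """Return (filename, indexed_md5, actual_md5) for every bad entry.

    actual_md5 is None when the file listed in the index is missing.
    """
    feed_dir = Path(feed_dir)
    index = (feed_dir / "Packages").read_text(encoding="utf-8", errors="replace")
    problems = []
    for stanza in parse_index(index):
        name = stanza.get("Filename")
        expected = stanza.get("MD5Sum")
        if not name or not expected:
            continue  # stanza without the fields this sketch relies on
        pkg = feed_dir / name
        if not pkg.exists():
            problems.append((name, expected, None))
            continue
        actual = md5_of(pkg)
        if actual != expected.lower():
            problems.append((name, expected, actual))
    return problems
```

Calling find_mismatches() on the architecture directory of the ipk deploy
area (for example the armv7at2hf-neon directory from the log) lists every
package whose checksum in the index no longer matches the file on disk.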

--
Stefan


> 
> I guess there is some race/asynchronous operation going on around
> building index/getting package from sstate/pushing package to sstate.
> 
> It seems an issue others have seen in the past too:
> https://www.yoctoproject.org/irc/%23yocto.2018-07-05.log.html#t2018-07-05T10:07:25
> 
> Any idea?
> 
> --
> Stefan

