From: Trevor Woerner <twoerner@gmail.com>
To: openembedded-core@lists.openembedded.org
Subject: build failures due to pigz host tool
Date: Wed, 3 Jul 2019 11:04:06 -0400
Message-ID: <20190703150405.GA23280@linux-uys3>

Hi,

This came up as a topic in yesterday's Engineering Sync meeting. For roughly a
year I've been seeing random build failures on my Jenkins setup due to pigz
failing; apparently the project is now seeing them on their builds, so I'll
share what I know of them.

At the time I started seeing these failures (Aug 2018) I had just upgraded my
system to openSUSE 15.0. Since nobody else was seeing them, I assumed they
were related to my setup. When I went out searching for an answer, I found
there wasn't very much out there to help me. But I did notice that there were
reports of other people seeing the issue who weren't using openSUSE and who
weren't doing anything related to OE builds using Jenkins.

The build failure looks something like this:

	| DEBUG: Executing shell function sstate_create_package
	| pigz: abort: internal threads error
	| tar: /z/jenkins-workspace/nightly/cubietruck/build/sstate-cache/8a/sstate:linux-mainline:cubietruck-oe-linux-gnueabi:4.19.46:r0:cubietruck:3:8a159ba1ffefb5fc2feeeff5b40abf8ad67658e5ff3ed3bf67d25d9c8f2805e0_package.tgz.9bA8tCje: Wrote only 6144 of 10240 bytes
	| tar: Child returned status 16
	| tar: Error is not recoverable: exiting now
	| WARNING: /z/jenkins-workspace/nightly/cubietruck/build/tmp-glibc/work/cubietruck-oe-linux-gnueabi/linux-mainline/4.19.46-r0/temp/run.sstate_create_package.19996:1 exit 1 from 'exit 1'
	| DEBUG: Python function sstate_task_postfunc finished
	| ERROR: Function failed: sstate_create_package (log file is located at /z/jenkins-workspace/nightly/cubietruck/build/tmp-glibc/work/cubietruck-oe-linux-gnueabi/linux-mainline/4.19.46-r0/temp/log.do_package.19996)
	NOTE: recipe linux-mainline-4.19.46-r0: task do_package: Failed
	ERROR: Task (/opt/oe/configs/z/jenkins-workspace/nightly/cubietruck/layers/meta-sunxi/recipes-kernel/linux/linux-mainline_4.19.46.bb:do_package) failed with exit code '1'

Here's another example:

	| DEBUG: Executing shell function sstate_create_package
	| pigz: abort: internal threads error
	| tar: /z/jenkins-workspace/nightly/odroid-xu4/build/sstate-cache/d4/sstate:sqlite3:cortexa15t2hf-neon-vfpv4-oe-linux-gnueabi:3.28.0:r0:cortexa15t2hf-neon-vfpv4:3:d4eb5692a1756a832d72fb2003a3d431108fbc736044747d33698ad7b6881dd9_package.tgz.herLUpYQ: Wrote only 2048 of 10240 bytes
	| tar: Child returned status 16
	| tar: Error is not recoverable: exiting now
	| WARNING: /z/jenkins-workspace/nightly/odroid-xu4/build/tmp-glibc/work/cortexa15t2hf-neon-vfpv4-oe-linux-gnueabi/sqlite3/3_3.28.0-r0/temp/run.sstate_create_package.24136:1 exit 1 from 'exit 1'
	| DEBUG: Python function sstate_task_postfunc finished
	| ERROR: Function failed: sstate_create_package (log file is located at /z/jenkins-workspace/nightly/odroid-xu4/build/tmp-glibc/work/cortexa15t2hf-neon-vfpv4-oe-linux-gnueabi/sqlite3/3_3.28.0-r0/temp/log.do_package.24136)
	NOTE: recipe sqlite3-3_3.28.0-r0: task do_package: Failed
	ERROR: Task (/opt/oe/configs/z/jenkins-workspace/nightly/odroid-xu4/layers/openembedded-core/meta/recipes-support/sqlite/sqlite3_3.28.0.bb:do_package) failed with exit code '1'
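
For anyone who wants to count how often this signature shows up across a set
of saved build logs, here is a minimal sketch. The function name and the
matching rules are assumptions made for illustration; they are based only on
the two excerpts above and are not part of any OE tooling.

```python
import re

# Signature strings copied from the log excerpts above.
PIGZ_ABORT = "pigz: abort: internal threads error"
FAILED_TASK = re.compile(r"^\s*NOTE: recipe (\S+): task (\S+): Failed")


def find_pigz_failures(log_text):
    """Return (recipe, task) pairs for failed tasks preceded by a pigz abort.

    Hypothetical helper: a task counts as a pigz failure only if the abort
    message was seen since the previous "task ...: Failed" line.
    """
    failures = []
    saw_abort = False
    for line in log_text.splitlines():
        if PIGZ_ABORT in line:
            saw_abort = True
        match = FAILED_TASK.match(line)
        if match:
            if saw_abort:
                failures.append((match.group(1), match.group(2)))
            # Reset so an unrelated later failure is not attributed to pigz.
            saw_abort = False
    return failures
```

Feeding it the text of a console log returns a list such as
[("sqlite3-3_3.28.0-r0", "do_package")], which can then be tallied per night.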

When I first started seeing this problem, I would see it quite frequently.
Every morning, out of roughly 15 nightly builds, around 4-5 of them would have
failed in this way. Back then I would also get a lot of errors that would
report something along the lines of the following:

	fork: Resource temporarily unavailable
	Cannot spawn thread (?)

I don't have an example of that error on hand, but I used to get a lot of
those around the same time too.

My observations are:
- I've never seen any of these errors with builds that I run by hand. Oddly
  enough, they only ever happen to builds that are run by Jenkins. I have no
  idea whether this is just a coincidence, or whether something is going on
  related to kicking off a build from a large program (Jenkins).

- Back then these failures were quite frequent. Today, with roughly 20
  Jenkins builds kicked off every night, I have seen only 2 such failures in
  a 2-week span. So it seems that I've been able to reduce the occurrence
  rate, but not eliminate it completely.

- I haven't seen the "resource" failure in a while. I don't know if these are
  two separate issues that just happened to start at the same time, or if
  they're related in some way.

From what little information I was able to find online, here are the things I
tweaked (which may or may not have contributed to the reduction in the rate of
occurrence):

- At that time, I had been setting a "barrier=6000" on the disk I was using
  for the builds. I removed that tweak.

- I edited /etc/systemd/logind.conf and set:
	UserTasksMax=infinity

- I edited /etc/systemd/system.conf and set:
	DefaultTasksAccounting=no
	DefaultTasksMax=infinity

- I edited /etc/sysconfig/jenkins and added/set:
	JENKINS_JAVA_OPTIONS="-Djava.awt.headless=true -Xmx1g"
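
To double-check that tweaks like these are actually present in a config file
(and not just commented out), a small sketch along these lines may help. The
helper name is hypothetical, and the key spellings in the example are the ones
documented in the systemd man pages for logind.conf and system.conf; pass
whatever keys you actually set.

```python
def read_settings(conf_text, wanted):
    """Return {key: value} for the wanted keys that are set in conf_text.

    Hypothetical helper: comment lines (# or ;) and section headers are
    skipped, so a commented-out default does not count as being set.
    """
    found = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() in wanted:
            found[key.strip()] = value.strip()
    return found


# Example input mimicking a fragment of /etc/systemd/system.conf.
sample = """[Manager]
#DefaultTasksMax=512
DefaultTasksAccounting=no
DefaultTasksMax=infinity
"""
print(read_settings(sample, {"DefaultTasksAccounting", "DefaultTasksMax"}))
```

On a real system the text would come from reading the file itself, for
example open("/etc/systemd/system.conf").read().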

Since this build failure is so intermittent, it's quite hard to dig into.
As I said above, of the roughly 280 builds my system has done in the last
2 weeks, only 2 failed this way.
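
To put those numbers side by side, here is the arithmetic as a short sketch.
The inputs are the rough figures quoted in this message (4-5 failures out of
about 15 nightly builds back then, 2 out of about 280 now), so the results
are only approximate.

```python
def failure_rate(failures, builds):
    """Fraction of builds that failed."""
    return failures / builds


early_low = failure_rate(4, 15)    # "around 4-5" of "roughly 15" builds
early_high = failure_rate(5, 15)
recent = failure_rate(2, 280)      # 2 failures in roughly 280 builds

print(f"early:  {early_low:.0%} to {early_high:.0%}")  # early:  27% to 33%
print(f"recent: {recent:.1%}")                         # recent: 0.7%
```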

Overriding CONVERSION_CMD_gz in my builds to not use pigz would probably fix
the issue, at the cost of losing the parallelism of the
sstate_create_package task.

My host machine's version of pigz is: 2.3.3

Best regards,
	Trevor

