* [PATCH v2 1/4] lib/oe/utils: allow to set a lower bound on returned cpu_count()
@ 2020-03-03 16:05 André Draszik
2020-03-03 16:05 ` [PATCH v2 2/4] bitbake.conf: more deterministic xz compression (threads) André Draszik
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: André Draszik @ 2020-03-03 16:05 UTC (permalink / raw)
To: openembedded-core

This will be needed for making xz compression more deterministic,
as xz archives are created differently in single- vs multi-threaded
modes.

This means that due to bitbake's default of using as many threads
as there are cores in the system, files compressed with xz
will be different if built on a multi-core system compared to
single-core systems.

Allowing cpu_count() to enforce a lower bound on the value it
returns makes it possible to force xz to always use multi-threaded
operation.

Signed-off-by: André Draszik <git@andred.net>
---
meta/lib/oe/utils.py | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/meta/lib/oe/utils.py b/meta/lib/oe/utils.py
index e350b05ddf..aee4336482 100644
--- a/meta/lib/oe/utils.py
+++ b/meta/lib/oe/utils.py
@@ -248,9 +248,10 @@ def trim_version(version, num_parts=2):
     trimmed = ".".join(parts[:num_parts])
     return trimmed
 
-def cpu_count():
+def cpu_count(at_least=1):
     import multiprocessing
-    return multiprocessing.cpu_count()
+    cpus = multiprocessing.cpu_count()
+    return max(cpus, at_least)
 
 def execute_pre_post_process(d, cmds):
     if cmds is None:
--
2.23.0.rc1
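
To make the effect of the new parameter concrete, here is a minimal standalone sketch of the patched helper. The `fake_cores` argument is hypothetical and exists only so the behaviour can be shown for a machine with a given core count; it is not part of oe.utils.

```python
import multiprocessing


def cpu_count(at_least=1, fake_cores=None):
    """Return the number of CPUs, but never less than `at_least`.

    `fake_cores` is a hypothetical override used only for this illustration;
    the real helper always asks multiprocessing for the core count.
    """
    cpus = multiprocessing.cpu_count() if fake_cores is None else fake_cores
    return max(cpus, at_least)


# A single-core machine is reported as 2 when a lower bound of 2 is requested.
single_core = cpu_count(at_least=2, fake_cores=1)

# A larger machine is unaffected by the lower bound.
many_cores = cpu_count(at_least=2, fake_cores=8)

print(single_core, many_cores)
```

With the default of `at_least=1` the function behaves as before, so existing callers such as BB_NUMBER_THREADS and PARALLEL_MAKE should see no change.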
* [PATCH v2 2/4] bitbake.conf: more deterministic xz compression (threads)
2020-03-03 16:05 [PATCH v2 1/4] lib/oe/utils: allow to set a lower bound on returned cpu_count() André Draszik
@ 2020-03-03 16:05 ` André Draszik
2020-03-03 16:05 ` [PATCH v2 3/4] bitbake.conf: omit XZ threads and RAM from sstate signatures André Draszik
2020-03-03 16:05 ` [PATCH v2 4/4] reproducible: try to ensure reproducible xz archives André Draszik
2 siblings, 0 replies; 7+ messages in thread
From: André Draszik @ 2020-03-03 16:05 UTC (permalink / raw)
To: openembedded-core

xz archives can be non-deterministic / non-reproducible:
a) archives are created differently in single- vs multi-threaded
   modes
b) xz will scale down the compression level so as to try to
   work within any memory limit given to it when operating in
   single-threaded mode

This means that due to bitbake's default of using as many threads
as there are cores in the system, files compressed with xz
will be different if built on a multi-core system compared to
single-core systems. They will also potentially be different
if built on single-core systems with different amounts of
physical memory, due to bitbake's default of limiting xz's
memory consumption.

Force multi-threaded operation by default, even on single-core
systems, so as to ensure archives are created in the same way
in all cases.

Signed-off-by: André Draszik <git@andred.net>
---
 meta/conf/bitbake.conf | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
index e201b671bb..131ba296d3 100644
--- a/meta/conf/bitbake.conf
+++ b/meta/conf/bitbake.conf
@@ -795,7 +795,7 @@ BB_NUMBER_THREADS ?= "${@oe.utils.cpu_count()}"
 PARALLEL_MAKE ?= "-j ${@oe.utils.cpu_count()}"
 
 # Default parallelism and resource usage for xz
-XZ_DEFAULTS ?= "--memlimit=50% --threads=${@oe.utils.cpu_count()}"
+XZ_DEFAULTS ?= "--memlimit=50% --threads=${@oe.utils.cpu_count(at_least=2)}"
 
 ##################################################################
 # Magic Cookie for SANITY CHECK
--
2.23.0.rc1
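
As an illustration of what the changed default expands to, the sketch below rebuilds the option string for a few core counts. The `xz_defaults` helper is hypothetical; it only mirrors the expression in bitbake.conf, with `cores` standing in for the value multiprocessing.cpu_count() would return.

```python
def xz_defaults(cores, at_least=2, memlimit="50%"):
    """Build the xz option string the way the patched default would expand.

    Hypothetical helper: `cores` replaces the real core-count lookup so the
    result can be shown for machines of different sizes.
    """
    threads = max(cores, at_least)
    return "--memlimit={} --threads={}".format(memlimit, threads)


for cores in (1, 2, 16):
    # On a single-core machine the thread count is raised to 2, so xz
    # starts out in multi-threaded mode everywhere.
    print(cores, "->", xz_defaults(cores))
```

The point of the lower bound is that no machine ends up with `--threads=1`, which is the case the commit message identifies as producing different archives.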
* [PATCH v2 3/4] bitbake.conf: omit XZ threads and RAM from sstate signatures
2020-03-03 16:05 [PATCH v2 1/4] lib/oe/utils: allow to set a lower bound on returned cpu_count() André Draszik
2020-03-03 16:05 ` [PATCH v2 2/4] bitbake.conf: more deterministic xz compression (threads) André Draszik
@ 2020-03-03 16:05 ` André Draszik
2020-03-03 16:05 ` [PATCH v2 4/4] reproducible: try to ensure reproducible xz archives André Draszik
2 siblings, 0 replies; 7+ messages in thread
From: André Draszik @ 2020-03-03 16:05 UTC (permalink / raw)
To: openembedded-core

The number of threads used, and the amount of memory allowed
to be used, should not affect sstate signatures, as they
don't affect the outcome of the compression if xz operates
in multi-threaded mode [1].

Otherwise, it becomes impossible to re-use sstate from
automated builders on developers' machines (as the former
might run bitbake under constraints that differ from those
on developers' machines).

This is in particular a problem with the opkg package writing
backend, as the OPKGBUILDCMD depends on XZ_DEFAULTS. Without
the vardepsexclude, there is no re-use possible of the
package_write_ipk sstate.

Whitelist the maximum number of threads and the memory limit
given the assumptions outlined in [2] below.

Signed-off-by: André Draszik <git@andred.net>

[1] When starting out in multi-threaded mode, the output is always
    deterministic, as even if xz scales down to single-threaded
    later, the archives are still split into blocks and size
    information is still added, thus keeping them compatible with
    multi-threaded mode.
    Also, when starting out in multi-threaded mode, xz never
    scales down the compression level to accommodate memory usage
    restrictions; it just scales down the number of threads and
    errors out if it cannot accommodate the memory limit.

[2] Assumptions
    * We only support multi-threaded mode (threads >= 2); builds
      should not try to use xz in single-threaded mode
    * The thread limit should be set via XZ_THREADS, not via
      modifying XZ_DEFAULTS or XZ_OPTS, or any other way
    * The thread limit should not be set to xz's magic value
      zero (0), as that will lead to single-threaded mode on
      single-core systems.
---
 meta/conf/bitbake.conf | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
index 131ba296d3..4b544a22cd 100644
--- a/meta/conf/bitbake.conf
+++ b/meta/conf/bitbake.conf
@@ -795,7 +795,10 @@ BB_NUMBER_THREADS ?= "${@oe.utils.cpu_count()}"
 PARALLEL_MAKE ?= "-j ${@oe.utils.cpu_count()}"
 
 # Default parallelism and resource usage for xz
-XZ_DEFAULTS ?= "--memlimit=50% --threads=${@oe.utils.cpu_count(at_least=2)}"
+XZ_MEMLIMIT ?= "50%"
+XZ_THREADS ?= "${@oe.utils.cpu_count(at_least=2)}"
+XZ_DEFAULTS ?= "--memlimit=${XZ_MEMLIMIT} --threads=${XZ_THREADS}"
+XZ_DEFAULTS[vardepsexclude] += "XZ_MEMLIMIT XZ_THREADS"
 
 ##################################################################
 # Magic Cookie for SANITY CHECK
--
2.23.0.rc1
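
The idea behind the vardepsexclude entry can be sketched with a toy model. This is not bitbake's real signature generator; the `toy_signature` function is hypothetical and assumes only that a signature covers a variable's unexpanded value plus the values of the variables it references, minus any excluded names.

```python
import hashlib
import re


def toy_signature(variables, name, vardepsexclude=()):
    """Hash a variable's unexpanded value together with its dependencies.

    Toy model only: dependencies are the ${VAR} references found in the
    value, and names listed in `vardepsexclude` are left out of the hash.
    """
    value = variables[name]
    deps = sorted(set(re.findall(r"\$\{(\w+)\}", value)) - set(vardepsexclude))
    digest = hashlib.sha256(value.encode())
    for dep in deps:
        digest.update("{}={}".format(dep, variables[dep]).encode())
    return digest.hexdigest()


# Two machines that differ only in the number of xz threads.
builder = {
    "XZ_MEMLIMIT": "50%",
    "XZ_THREADS": "32",
    "XZ_DEFAULTS": "--memlimit=${XZ_MEMLIMIT} --threads=${XZ_THREADS}",
}
developer = dict(builder, XZ_THREADS="4")

excluded = ("XZ_MEMLIMIT", "XZ_THREADS")

# With the exclusion, both machines agree on the signature.
same_with_exclude = (toy_signature(builder, "XZ_DEFAULTS", excluded)
                     == toy_signature(developer, "XZ_DEFAULTS", excluded))

# Without it, the differing thread count changes the signature.
differs_without_exclude = (toy_signature(builder, "XZ_DEFAULTS")
                           != toy_signature(developer, "XZ_DEFAULTS"))

print(same_with_exclude, differs_without_exclude)
```

Splitting the values out into XZ_MEMLIMIT and XZ_THREADS is what makes this work: the unexpanded XZ_DEFAULTS string is then identical on every machine, and only the excluded variables carry the machine-specific numbers.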
* [PATCH v2 4/4] reproducible: try to ensure reproducible xz archives
2020-03-03 16:05 [PATCH v2 1/4] lib/oe/utils: allow to set a lower bound on returned cpu_count() André Draszik
2020-03-03 16:05 ` [PATCH v2 2/4] bitbake.conf: more deterministic xz compression (threads) André Draszik
2020-03-03 16:05 ` [PATCH v2 3/4] bitbake.conf: omit XZ threads and RAM from sstate signatures André Draszik
@ 2020-03-03 16:05 ` André Draszik
2020-03-03 16:08 ` André Draszik
2 siblings, 1 reply; 7+ messages in thread
From: André Draszik @ 2020-03-03 16:05 UTC (permalink / raw)
To: openembedded-core

xz suffers from a reproducibility problem when not using
multi-threaded mode:
a) archives are created differently in single- vs multi-threaded
   modes
b) xz will scale down the compression level so as to be able to
   work within any memory limit given to it when being launched
   in single-threaded mode.

Thus, for reproducible xz archives we need to launch xz with
at least two threads.

Add a little sanity test, and error out otherwise, so as to
guarantee no difference due to this fact.

Assumptions:
* The thread limit should be set via XZ_THREADS, not via
  modifying XZ_DEFAULTS or XZ_OPTS, or any other way
* The thread limit should not be set to xz's magic value
  zero (0), as that will lead to single-threaded mode on
  single-core systems

This patch doesn't prevent people from shooting themselves
in the foot by changing XZ_DEFAULTS to change the number
of threads directly, but it can serve as a hint at least.

Signed-off-by: André Draszik <git@andred.net>
---
 meta/classes/reproducible_build.bbclass | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/meta/classes/reproducible_build.bbclass b/meta/classes/reproducible_build.bbclass
index 750eb950f2..e07bef87d8 100644
--- a/meta/classes/reproducible_build.bbclass
+++ b/meta/classes/reproducible_build.bbclass
@@ -35,6 +35,7 @@
 # SOURCE_DATE_EPOCH is set for all tasks that might use it (do_configure, do_compile, do_package, ...)
 
 BUILD_REPRODUCIBLE_BINARIES ??= '1'
+BUILD_REPRODUCIBLE_XZ_ARCHIVES ??= '1'
 inherit ${@oe.utils.ifelse(d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1', 'reproducible_build_simple', '')}
 
 SDE_DIR ="${WORKDIR}/source-date-epoch"
@@ -198,4 +199,8 @@ BB_HASHBASE_WHITELIST += "SOURCE_DATE_EPOCH"
 python () {
     if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
         d.appendVarFlag("do_unpack", "postfuncs", " do_create_source_date_epoch_stamp")
+
+    if d.getVar('BUILD_REPRODUCIBLE_XZ_ARCHIVES') == '1':
+        if int(d.getVar('XZ_THREADS')) < 2:
+            bb.fatal("Can not build reproducible XZ archives without threading")
 }
--
2.23.0.rc1
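
The sanity test can be tried outside of bitbake with a standalone sketch. The `check_xz_threads` function is hypothetical: it takes plain strings instead of a datastore and returns a message instead of calling bb.fatal(). Unlike the patch, it also reports an unset or non-numeric XZ_THREADS rather than letting int() raise, which is an assumption about desirable behaviour and not something the patch does.

```python
def check_xz_threads(xz_threads, reproducible="1"):
    """Return an error message, or None if the setting is acceptable.

    Hypothetical standalone version of the check: `xz_threads` and
    `reproducible` stand in for the values of XZ_THREADS and
    BUILD_REPRODUCIBLE_XZ_ARCHIVES.
    """
    if reproducible != "1":
        # The check is only active when reproducible archives are requested.
        return None
    try:
        threads = int(xz_threads)
    except (TypeError, ValueError):
        # Not in the patch: int(None) or int("auto") would raise there.
        return "XZ_THREADS must be an integer, got {!r}".format(xz_threads)
    if threads < 2:
        # Covers both "1" and xz's magic value "0".
        return "Can not build reproducible XZ archives without threading"
    return None


for value in ("0", "1", "2", None):
    print(repr(value), "->", check_xz_threads(value))
```

As the commit message notes, a check of this kind only looks at XZ_THREADS, so a thread count passed through XZ_DEFAULTS or XZ_OPTS directly would go unnoticed.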
* Re: [PATCH v2 4/4] reproducible: try to ensure reproducible xz archives
2020-03-03 16:05 ` [PATCH v2 4/4] reproducible: try to ensure reproducible xz archives André Draszik
@ 2020-03-03 16:08 ` André Draszik
2020-03-04 0:31 ` Otavio Salvador
0 siblings, 1 reply; 7+ messages in thread
From: André Draszik @ 2020-03-03 16:08 UTC (permalink / raw)
To: openembedded-core

On Tue, 2020-03-03 at 16:05 +0000, André Draszik wrote:
> xz suffers from a reproducibility problem when not using multi-
> threaded mode:
> a) archives are created differently in single- vs multi-threaded
>    modes
> b) xz will scale down the compression level so as to be able to
>    work within any memory limit given to it when being launched
>    in single-threaded mode.
>
> Thus, for reproducible xz archives we need to launch xz with
> at least two threads.
>
> Add a little sanity test, and error out otherwise, so as to
> guarantee no difference due to this fact.
>
> Assumptions:
> * The thread limit should be set via XZ_THREADS, not via
>   modifying XZ_DEFAULTS or XZ_OPTS, or any other way
> * The thread limit should not be set to xz's magic value
>   zero (0), as that will lead to single-threaded mode on
>   single-core systems
>
> This patch doesn't prevent people from shooting themselves
> in the foot by changing XZ_DEFAULTS to change the number
> of threads directly, but it can serve as a hint at least.

I don't know if this patch is useful; feel free to drop it.

In an ideal world, it'd parse the output of xz --verbose --verbose, to catch
all possible ways people might be adjusting the thread limit, but that's
non-trivial.

Cheers,
Andre'

>
> Signed-off-by: André Draszik <git@andred.net>
> ---
>  meta/classes/reproducible_build.bbclass | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/meta/classes/reproducible_build.bbclass b/meta/classes/reproducible_build.bbclass
> index 750eb950f2..e07bef87d8 100644
> --- a/meta/classes/reproducible_build.bbclass
> +++ b/meta/classes/reproducible_build.bbclass
> @@ -35,6 +35,7 @@
>  # SOURCE_DATE_EPOCH is set for all tasks that might use it (do_configure, do_compile, do_package, ...)
>
>  BUILD_REPRODUCIBLE_BINARIES ??= '1'
> +BUILD_REPRODUCIBLE_XZ_ARCHIVES ??= '1'
>  inherit ${@oe.utils.ifelse(d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1', 'reproducible_build_simple', '')}
>
>  SDE_DIR ="${WORKDIR}/source-date-epoch"
> @@ -198,4 +199,8 @@ BB_HASHBASE_WHITELIST += "SOURCE_DATE_EPOCH"
>  python () {
>      if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
>          d.appendVarFlag("do_unpack", "postfuncs", " do_create_source_date_epoch_stamp")
> +
> +    if d.getVar('BUILD_REPRODUCIBLE_XZ_ARCHIVES') == '1':
> +        if int(d.getVar('XZ_THREADS')) < 2:
> +            bb.fatal("Can not build reproducible XZ archives without threading")
>  }
* Re: [PATCH v2 4/4] reproducible: try to ensure reproducible xz archives
2020-03-03 16:08 ` André Draszik
@ 2020-03-04 0:31 ` Otavio Salvador
2020-03-04 6:03 ` Richard Purdie
0 siblings, 1 reply; 7+ messages in thread
From: Otavio Salvador @ 2020-03-04 0:31 UTC (permalink / raw)
To: André Draszik; +Cc: Patches and discussions about the oe-core layer

On Tue, Mar 3, 2020 at 1:08 PM André Draszik <git@andred.net> wrote:
> On Tue, 2020-03-03 at 16:05 +0000, André Draszik wrote:
> In an ideal world, it'd parse the output of xz --verbose --verbose, to catch
> all possible ways people might be adjusting the thread limit, but that's
> non-trivial.

Couldn't we just "enforce" at least two threads? It is quite unlikely
we ever use OE on a single-core machine (as it'd take a few years to
finish the build hehe), so it seems like a reasonable assumption.

--
Otavio Salvador O.S. Systems
http://www.ossystems.com.br http://code.ossystems.com.br
Mobile: +55 (53) 9 9981-7854 Mobile: +1 (347) 903-9750
* Re: [PATCH v2 4/4] reproducible: try to ensure reproducible xz archives
2020-03-04 0:31 ` Otavio Salvador
@ 2020-03-04 6:03 ` Richard Purdie
0 siblings, 0 replies; 7+ messages in thread
From: Richard Purdie @ 2020-03-04 6:03 UTC (permalink / raw)
To: Otavio Salvador, André Draszik
Cc: Patches and discussions about the oe-core layer

On Tue, 2020-03-03 at 21:31 -0300, Otavio Salvador wrote:
> On Tue, Mar 3, 2020 at 1:08 PM André Draszik <git@andred.net> wrote:
> > On Tue, 2020-03-03 at 16:05 +0000, André Draszik wrote:
> > In an ideal world, it'd parse the output of xz --verbose --verbose, to catch
> > all possible ways people might be adjusting the thread limit, but that's
> > non-trivial.
>
> Couldn't we just "enforce" at least two threads? It is quite unlikely
> we ever use OE on a single-core machine (as it'd take a few years to
> finish the build hehe), so it seems like a reasonable assumption.

An earlier patch does, unless you actually set XZ_THREADS = "1". If you
do that, things are still reproducible in that the output will be
consistent, just not with any other value of XZ_THREADS.

So I think we should be fine without this patch.

Cheers,

Richard