* [PATCH] doc: ensure sphinx output is reproducible
@ 2023-06-29 12:58 christian.ehrhardt
  2023-06-29 13:02 ` Christian Ehrhardt
  2023-07-03 15:29 ` Thomas Monjalon
  0 siblings, 2 replies; 16+ messages in thread
From: christian.ehrhardt @ 2023-06-29 12:58 UTC (permalink / raw)
  To: dev; +Cc: Luca Boccassi, Christian Ehrhardt
From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
By adding -j we build in parallel, to make building on multiprocessor
machines more effective. While that works it does also break
reproducible builds as the order of the sphinx generated searchindex.js
is depending on execution speed of the individual processes.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
---
 buildtools/call-sphinx-build.py | 5 -----
 1 file changed, 5 deletions(-)
diff --git a/buildtools/call-sphinx-build.py b/buildtools/call-sphinx-build.py
index 39a60d09fa..d8879306de 100755
--- a/buildtools/call-sphinx-build.py
+++ b/buildtools/call-sphinx-build.py
@@ -15,12 +15,7 @@
 # set the version in environment for sphinx to pick up
 os.environ['DPDK_VERSION'] = version
 
-# for sphinx version >= 1.7 add parallelism using "-j auto"
-ver = run([sphinx, '--version'], stdout=PIPE,
-          stderr=STDOUT).stdout.decode().split()[-1]
 sphinx_cmd = [sphinx] + extra_args
-if Version(ver) >= Version('1.7'):
-    sphinx_cmd += ['-j', 'auto']
 
 # find all the files sphinx will process so we can write them as dependencies
 srcfiles = []
-- 
2.41.0
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2023-06-29 12:58 [PATCH] doc: ensure sphinx output is reproducible christian.ehrhardt
@ 2023-06-29 13:02 ` Christian Ehrhardt
  2023-07-03 15:29 ` Thomas Monjalon
  1 sibling, 0 replies; 16+ messages in thread
From: Christian Ehrhardt @ 2023-06-29 13:02 UTC (permalink / raw)
  To: dev; +Cc: Luca Boccassi
On Thu, Jun 29, 2023 at 2:58 PM <christian.ehrhardt@canonical.com> wrote:
>
> From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
>
> By adding -j we build in parallel, to make building on multiprocessor
> machines more effective. While that works it does also break
> reproducible builds as the order of the sphinx generated searchindex.js
> is depending on execution speed of the individual processes.
Just FYI (this didn't fit well in the commit message), an example
of such a failure can be seen at
https://salsa.debian.org/paelzer-guest/dpdk/-/jobs/4372883
If you download the artifact, extract dpdk-doc, apply js-beautify for
readability and then diff the result, you'll find it has the
same content in a different order.
Examples of two builds:
- https://paste.ubuntu.com/p/VhWYNRv7kN/
- https://paste.ubuntu.com/p/KcQk4Km9xM/
> Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> ---
>  buildtools/call-sphinx-build.py | 5 -----
>  1 file changed, 5 deletions(-)
>
> diff --git a/buildtools/call-sphinx-build.py b/buildtools/call-sphinx-build.py
> index 39a60d09fa..d8879306de 100755
> --- a/buildtools/call-sphinx-build.py
> +++ b/buildtools/call-sphinx-build.py
> @@ -15,12 +15,7 @@
>  # set the version in environment for sphinx to pick up
>  os.environ['DPDK_VERSION'] = version
>
> -# for sphinx version >= 1.7 add parallelism using "-j auto"
> -ver = run([sphinx, '--version'], stdout=PIPE,
> -          stderr=STDOUT).stdout.decode().split()[-1]
>  sphinx_cmd = [sphinx] + extra_args
> -if Version(ver) >= Version('1.7'):
> -    sphinx_cmd += ['-j', 'auto']
>
>  # find all the files sphinx will process so we can write them as dependencies
>  srcfiles = []
> --
> 2.41.0
>
-- 
Christian Ehrhardt
Senior Staff Engineer and acting Director, Ubuntu Server
Canonical Ltd
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2023-06-29 12:58 [PATCH] doc: ensure sphinx output is reproducible christian.ehrhardt
  2023-06-29 13:02 ` Christian Ehrhardt
@ 2023-07-03 15:29 ` Thomas Monjalon
  2023-07-06 12:49   ` Christian Ehrhardt
  1 sibling, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2023-07-03 15:29 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: dev, Luca Boccassi, david.marchand
29/06/2023 14:58, christian.ehrhardt@canonical.com:
> From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> 
> By adding -j we build in parallel, to make building on multiprocessor
> machines more effective. While that works it does also break
> reproducible builds as the order of the sphinx generated searchindex.js
> is depending on execution speed of the individual processes.
[...]
> -if Version(ver) >= Version('1.7'):
> -    sphinx_cmd += ['-j', 'auto']
What is the impact on build speed on an average machine?
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2023-07-03 15:29 ` Thomas Monjalon
@ 2023-07-06 12:49   ` Christian Ehrhardt
  2023-11-27 16:45     ` Thomas Monjalon
  0 siblings, 1 reply; 16+ messages in thread
From: Christian Ehrhardt @ 2023-07-06 12:49 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Luca Boccassi, david.marchand
On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> >
> > By adding -j we build in parallel, to make building on multiprocessor
> > machines more effective. While that works it does also break
> > reproducible builds as the order of the sphinx generated searchindex.js
> > is depending on execution speed of the individual processes.
> [...]
> > -if Version(ver) >= Version('1.7'):
> > -    sphinx_cmd += ['-j', 'auto']
>
> What is the impact on build speed on an average machine?
Hi,
I haven't tested this in isolation as it was just a mandatory change
on the Debian/Ubuntu side.
And the time for exactly and only the doc build is hidden inside the
concurrency of meson.
But I can compare a full build [1] and a full build with the change [2].
That is an average build machine and it is 35 seconds slower with the
change to no more do doc builds in parallel.
[1]: https://launchpadlibrarian.net/673520160/buildlog_ubuntu-mantic-amd64.dpdk_22.11.2-2_BUILDING.txt.gz
[2]: https://launchpadlibrarian.net/674783718/buildlog_ubuntu-mantic-amd64.dpdk_22.11.2-3_BUILDING.txt.gz
-- 
Christian Ehrhardt
Senior Staff Engineer and acting Director, Ubuntu Server
Canonical Ltd
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2023-07-06 12:49   ` Christian Ehrhardt
@ 2023-11-27 16:45     ` Thomas Monjalon
  2023-11-27 17:00       ` Bruce Richardson
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2023-11-27 16:45 UTC (permalink / raw)
  To: Luca Boccassi, Christian Ehrhardt; +Cc: dev, david.marchand
06/07/2023 14:49, Christian Ehrhardt:
> On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > >
> > > By adding -j we build in parallel, to make building on multiprocessor
> > > machines more effective. While that works it does also break
> > > reproducible builds as the order of the sphinx generated searchindex.js
> > > is depending on execution speed of the individual processes.
> > [...]
> > > -if Version(ver) >= Version('1.7'):
> > > -    sphinx_cmd += ['-j', 'auto']
> >
> > What is the impact on build speed on an average machine?
> 
> Hi,
> I haven't tested this in isolation as it was just a mandatory change
> on the Debian/Ubuntu side.
> And the time for exactly and only the doc build is hidden inside the
> concurrency of meson.
> But I can compare a full build [1] and a full build with the change [2].
> 
> That is an average build machine and it is 35 seconds slower with the
> change to no more do doc builds in parallel.
I would prefer adding an option for reproducible build
(which is not a common requirement).
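
A minimal sketch of what such an option could look like, assuming the
command list is assembled the same way as in call-sphinx-build.py.
The helper name build_sphinx_cmd and its reproducible parameter are
hypothetical and not part of the patch; the version check from the
existing script is left out for brevity.

```python
def build_sphinx_cmd(sphinx, extra_args, reproducible=False):
    """Assemble the sphinx-build command line.

    Hypothetical helper: 'reproducible' is an assumed switch that a
    build option could feed in; it is not an existing DPDK setting.
    """
    cmd = [sphinx] + list(extra_args)
    if not reproducible:
        # parallel build: faster, but the output order may vary
        cmd += ['-j', 'auto']
    return cmd
```

With reproducible=True the command stays serial, which matches the
behaviour of the patch; the default keeps "-j auto".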
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2023-11-27 16:45     ` Thomas Monjalon
@ 2023-11-27 17:00       ` Bruce Richardson
  2024-05-17 11:29         ` Luca Boccassi
  2024-05-26 11:30         ` Thomas Monjalon
  0 siblings, 2 replies; 16+ messages in thread
From: Bruce Richardson @ 2023-11-27 17:00 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Luca Boccassi, Christian Ehrhardt, dev, david.marchand
On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> 06/07/2023 14:49, Christian Ehrhardt:
> > On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > > > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > >
> > > > By adding -j we build in parallel, to make building on multiprocessor
> > > > machines more effective. While that works it does also break
> > > > reproducible builds as the order of the sphinx generated searchindex.js
> > > > is depending on execution speed of the individual processes.
> > > [...]
> > > > -if Version(ver) >= Version('1.7'):
> > > > -    sphinx_cmd += ['-j', 'auto']
> > >
> > > What is the impact on build speed on an average machine?
> > 
> > Hi,
> > I haven't tested this in isolation as it was just a mandatory change
> > on the Debian/Ubuntu side.
> > And the time for exactly and only the doc build is hidden inside the
> > concurrency of meson.
> > But I can compare a full build [1] and a full build with the change [2].
> > 
> > That is an average build machine and it is 35 seconds slower with the
> > change to no more do doc builds in parallel.
> 
> I would prefer adding an option for reproducible build
> (which is not a common requirement).
> 
Taking a slightly different tack, is it possible to sort the searchindex.js
file post-build, so that even reproducible builds get the benefits of
parallelism?
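
A rough sketch of such a post-build step, under two assumptions that
have not been verified against the DPDK docs: that searchindex.js has
the form Search.setIndex(<JSON>), and that only the key order differs
between builds. If the document numbering itself varies, sorting keys
alone would not be enough. The function name sort_search_index is
hypothetical.

```python
import json

PREFIX = "Search.setIndex("


def sort_search_index(text):
    """Return the searchindex.js content with all object keys sorted.

    Assumes the payload between 'Search.setIndex(' and the closing ')'
    is valid JSON; raises ValueError otherwise.
    """
    body = text.strip().rstrip(";")
    if not (body.startswith(PREFIX) and body.endswith(")")):
        raise ValueError("unexpected searchindex.js layout")
    data = json.loads(body[len(PREFIX):-1])
    # sort_keys makes the serialised key order independent of the
    # order in which the worker processes delivered their results
    return PREFIX + json.dumps(data, sort_keys=True,
                               separators=(",", ":")) + ")"
```

Two inputs that differ only in key order then serialise to the same
string. List contents are left untouched.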
/Bruce
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2023-11-27 17:00       ` Bruce Richardson
@ 2024-05-17 11:29         ` Luca Boccassi
  2024-05-19 13:54           ` Thomas Monjalon
  2024-05-26 11:30         ` Thomas Monjalon
  1 sibling, 1 reply; 16+ messages in thread
From: Luca Boccassi @ 2024-05-17 11:29 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: Thomas Monjalon, Christian Ehrhardt, dev, david.marchand
On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> > 06/07/2023 14:49, Christian Ehrhardt:
> > > On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > > > > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > > >
> > > > > By adding -j we build in parallel, to make building on multiprocessor
> > > > > machines more effective. While that works it does also break
> > > > > reproducible builds as the order of the sphinx generated searchindex.js
> > > > > is depending on execution speed of the individual processes.
> > > > [...]
> > > > > -if Version(ver) >= Version('1.7'):
> > > > > -    sphinx_cmd += ['-j', 'auto']
> > > >
> > > > What is the impact on build speed on an average machine?
> > >
> > > Hi,
> > > I haven't tested this in isolation as it was just a mandatory change
> > > on the Debian/Ubuntu side.
> > > And the time for exactly and only the doc build is hidden inside the
> > > concurrency of meson.
> > > But I can compare a full build [1] and a full build with the change [2].
> > >
> > > That is an average build machine and it is 35 seconds slower with the
> > > change to no more do doc builds in parallel.
> >
> > I would prefer adding an option for reproducible build
> > (which is not a common requirement).
> >
> Taking a slightly different tack, is it possible to sort the searchindex.js
> file post-build, so that even reproducible builds get the benefits of
> parallelism?
Given the recent attacks with malicious sources being injected in open
source projects, reproducible builds are more important than ever and
should just be the default. Could we please take this patch as-is?
If someone wants to try and fix this searchindex.js in a different way
separately it can then be done later and on top of this.
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2024-05-17 11:29         ` Luca Boccassi
@ 2024-05-19 13:54           ` Thomas Monjalon
  2024-05-19 16:36             ` Luca Boccassi
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2024-05-19 13:54 UTC (permalink / raw)
  To: Christian Ehrhardt, Luca Boccassi; +Cc: Bruce Richardson, dev, david.marchand
17/05/2024 13:29, Luca Boccassi:
> On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> > > 06/07/2023 14:49, Christian Ehrhardt:
> > > > On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > >
> > > > > 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > > > > > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > > > >
> > > > > > By adding -j we build in parallel, to make building on multiprocessor
> > > > > > machines more effective. While that works it does also break
> > > > > > reproducible builds as the order of the sphinx generated searchindex.js
> > > > > > is depending on execution speed of the individual processes.
> > > > > [...]
> > > > > > -if Version(ver) >= Version('1.7'):
> > > > > > -    sphinx_cmd += ['-j', 'auto']
> > > > >
> > > > > What is the impact on build speed on an average machine?
> > > >
> > > > Hi,
> > > > I haven't tested this in isolation as it was just a mandatory change
> > > > on the Debian/Ubuntu side.
> > > > And the time for exactly and only the doc build is hidden inside the
> > > > concurrency of meson.
> > > > But I can compare a full build [1] and a full build with the change [2].
> > > >
> > > > That is an average build machine and it is 35 seconds slower with the
> > > > change to no more do doc builds in parallel.
> > >
> > > I would prefer adding an option for reproducible build
> > > (which is not a common requirement).
> > >
> > Taking a slightly different tack, is it possible to sort the searchindex.js
> > file post-build, so that even reproducible builds get the benefits of
> > parallelism?
> 
> Given the recent attacks with malicious sources being injected in open
> source projects, reproducible builds are more important than ever and
> should just be the default.
Yes it should be the default when packaging.
Why should it be the default for normal builds?
> Could we please take this patch as-is?
It's a pity nobody tried a different approach.
Considering the activity on this, it does not look like a high priority.
> If someone wants to try and fix this searchindex.js in a different way
> separately it can then be done later and on top of this.
Removing something and asking others to re-add it later is strange reasoning.
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2024-05-19 13:54           ` Thomas Monjalon
@ 2024-05-19 16:36             ` Luca Boccassi
  2024-05-19 17:13               ` Thomas Monjalon
  0 siblings, 1 reply; 16+ messages in thread
From: Luca Boccassi @ 2024-05-19 16:36 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Christian Ehrhardt, Bruce Richardson, dev, david.marchand,
	Mcnamara, John
On Sun, 19 May 2024 at 15:01, Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 17/05/2024 13:29, Luca Boccassi:
> > On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > >
> > > On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> > > > 06/07/2023 14:49, Christian Ehrhardt:
> > > > > On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > >
> > > > > > 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > > > > > > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > > > > >
> > > > > > > By adding -j we build in parallel, to make building on multiprocessor
> > > > > > > machines more effective. While that works it does also break
> > > > > > > reproducible builds as the order of the sphinx generated searchindex.js
> > > > > > > is depending on execution speed of the individual processes.
> > > > > > [...]
> > > > > > > -if Version(ver) >= Version('1.7'):
> > > > > > > -    sphinx_cmd += ['-j', 'auto']
> > > > > >
> > > > > > What is the impact on build speed on an average machine?
> > > > >
> > > > > Hi,
> > > > > I haven't tested this in isolation as it was just a mandatory change
> > > > > on the Debian/Ubuntu side.
> > > > > And the time for exactly and only the doc build is hidden inside the
> > > > > concurrency of meson.
> > > > > But I can compare a full build [1] and a full build with the change [2].
> > > > >
> > > > > That is an average build machine and it is 35 seconds slower with the
> > > > > change to no more do doc builds in parallel.
> > > >
> > > > I would prefer adding an option for reproducible build
> > > > (which is not a common requirement).
> > > >
> > > Taking a slightly different tack, is it possible to sort the searchindex.js
> > > file post-build, so that even reproducible builds get the benefits of
> > > parallelism?
> >
> > Given the recent attacks with malicious sources being injected in open
> > source projects, reproducible builds are more important than ever and
> > should just be the default.
>
> Yes it should be the default when packaging.
> Why should it be the default for normal builds?
Build reproducibility is everyone's responsibility, not just Linux
distributions. There should be no difference between a "normal build"
and a "packaging build". As far as I know, it is still fully supported
for DPDK consumers to take the git repository, build it and ship it
themselves - those cases also need their builds to be reproducible.
Nowadays reproducibility is no longer a "nice-to-have", it's table
stakes, as especially after the cybersecurity executive order of the
US govt from some time ago, procurement rules are getting stricter.
See the "Reproducible Builds" paragraph under the "2.4 Harden the
Build Environment" section in this CISA document on supply chain
security recommendations:
https://www.cisa.gov/sites/default/files/publications/ESF_SECURING_THE_SOFTWARE_SUPPLY_CHAIN_DEVELOPERS.PDF
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2024-05-19 16:36             ` Luca Boccassi
@ 2024-05-19 17:13               ` Thomas Monjalon
  2024-05-19 17:23                 ` Luca Boccassi
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2024-05-19 17:13 UTC (permalink / raw)
  To: Luca Boccassi
  Cc: Christian Ehrhardt, Bruce Richardson, dev, david.marchand,
	Mcnamara, John
19/05/2024 18:36, Luca Boccassi:
> On Sun, 19 May 2024 at 15:01, Thomas Monjalon <thomas@monjalon.net> wrote:
> > 17/05/2024 13:29, Luca Boccassi:
> > > On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
> > > <bruce.richardson@intel.com> wrote:
> > > >
> > > > On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> > > > > I would prefer adding an option for reproducible build
> > > > > (which is not a common requirement).
> > > > >
> > > > Taking a slightly different tack, is it possible to sort the searchindex.js
> > > > file post-build, so that even reproducible builds get the benefits of
> > > > parallelism?
> > >
> > > Given the recent attacks with malicious sources being injected in open
> > > source projects, reproducible builds are more important than ever and
> > > should just be the default.
> >
> > Yes it should be the default when packaging.
> > Why should it be the default for normal builds?
> 
> Build reproducibility is everyone's responsibility, not just Linux
> distributions. There should be no difference between a "normal build"
> and a "packaging build". As far as I know, it is still fully supported
> for DPDK consumers to take the git repository, build it and ship it
> themselves - those cases also need their builds to be reproducible.
Sorry I really don't understand this point.
The goal of a reproducible build is to maintain a stable hash, right?
This hash needs to be stable only when it is published, isn't it?
So isn't it enough to give a build option for having a reproducible build?
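
To make the "stable hash" idea concrete, here is a minimal sketch of
how two build output trees could be compared by digest. It is only an
illustration; tree_digest is a hypothetical helper, not a tool used by
DPDK or by the distributions.

```python
import hashlib
import os


def tree_digest(root):
    """Return one SHA-256 digest over all files below 'root'.

    Directories and files are visited in sorted order so that the
    result depends only on relative paths and file contents.
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort fixes the traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            digest.update(rel.encode() + b"\0")
            with open(path, "rb") as f:
                digest.update(f.read())
            digest.update(b"\0")
    return digest.hexdigest()
```

Two builds are reproducible in this sense if tree_digest() returns the
same value for both output directories.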
> Nowadays reproducibility is no longer a "nice-to-have", it's table
> stakes, as especially after the cybersecurity executive order of the
> US govt from some time ago, procurement rules are getting stricter.
> See the "Reproducible Builds" paragraph under the "2.4 Harden the
> Build Environment" section in this CISA document on supply chain
> security recommendations:
> 
> https://www.cisa.gov/sites/default/files/publications/ESF_SECURING_THE_SOFTWARE_SUPPLY_CHAIN_DEVELOPERS.PDF
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2024-05-19 17:13               ` Thomas Monjalon
@ 2024-05-19 17:23                 ` Luca Boccassi
  2024-05-19 21:10                   ` Thomas Monjalon
  0 siblings, 1 reply; 16+ messages in thread
From: Luca Boccassi @ 2024-05-19 17:23 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Christian Ehrhardt, Bruce Richardson, dev, david.marchand,
	Mcnamara, John
On Sun, 19 May 2024 at 18:13, Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 19/05/2024 18:36, Luca Boccassi:
> > On Sun, 19 May 2024 at 15:01, Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 17/05/2024 13:29, Luca Boccassi:
> > > > On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
> > > > <bruce.richardson@intel.com> wrote:
> > > > >
> > > > > On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> > > > > > I would prefer adding an option for reproducible build
> > > > > > (which is not a common requirement).
> > > > > >
> > > > > Taking a slightly different tack, is it possible to sort the searchindex.js
> > > > > file post-build, so that even reproducible builds get the benefits of
> > > > > parallelism?
> > > >
> > > > Given the recent attacks with malicious sources being injected in open
> > > > source projects, reproducible builds are more important than ever and
> > > > should just be the default.
> > >
> > > Yes it should be the default when packaging.
> > > Why should it be the default for normal builds?
> >
> > Build reproducibility is everyone's responsibility, not just Linux
> > distributions. There should be no difference between a "normal build"
> > and a "packaging build". As far as I know, it is still fully supported
> > for DPDK consumers to take the git repository, build it and ship it
> > themselves - those cases also need their builds to be reproducible.
>
> Sorry I really don't understand this point.
> The goal of a reproducible build is to maintain a stable hash, right?
> This hash needs to be stable only when it is published, isn't it?
> So isn't it enough to give a build option for having a reproducible build?
The goal is that issues breaking reproducibility are bugs and treated
as such. You wouldn't have a build option to allow buffer overflows or
null pointer dereferences, and so on. "The program builds
reproducibly" today and in the future has the same importance as "the
program doesn't write beyond bounds" or "the program doesn't crash" -
they are not optional qualities, they are table stakes, and
regulations are only going to get stricter.
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2024-05-19 17:23                 ` Luca Boccassi
@ 2024-05-19 21:10                   ` Thomas Monjalon
  2024-05-20  9:53                     ` Luca Boccassi
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Monjalon @ 2024-05-19 21:10 UTC (permalink / raw)
  To: Luca Boccassi
  Cc: Christian Ehrhardt, Bruce Richardson, dev, david.marchand,
	Mcnamara, John
19/05/2024 19:23, Luca Boccassi:
> On Sun, 19 May 2024 at 18:13, Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 19/05/2024 18:36, Luca Boccassi:
> > > On Sun, 19 May 2024 at 15:01, Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 17/05/2024 13:29, Luca Boccassi:
> > > > > On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
> > > > > <bruce.richardson@intel.com> wrote:
> > > > > >
> > > > > > On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> > > > > > > I would prefer adding an option for reproducible build
> > > > > > > (which is not a common requirement).
> > > > > > >
> > > > > > Taking a slightly different tack, is it possible to sort the searchindex.js
> > > > > > file post-build, so that even reproducible builds get the benefits of
> > > > > > parallelism?
> > > > >
> > > > > Given the recent attacks with malicious sources being injected in open
> > > > > source projects, reproducible builds are more important than ever and
> > > > > should just be the default.
> > > >
> > > > Yes it should be the default when packaging.
> > > > Why should it be the default for normal builds?
> > >
> > > Build reproducibility is everyone's responsibility, not just Linux
> > > distributions. There should be no difference between a "normal build"
> > > and a "packaging build". As far as I know, it is still fully supported
> > > for DPDK consumers to take the git repository, build it and ship it
> > > themselves - those cases also need their builds to be reproducible.
> >
> > Sorry I really don't understand this point.
> > The goal of a reproducible build is to maintain a stable hash, right?
> > This hash needs to be stable only when it is published, isn't it?
> > So isn't it enough to give a build option for having a reproducible build?
> 
> The goal is that issues breaking reproducibility are bugs and treated
> as such. You wouldn't have a build option to allow buffer overflows or
> null pointer dereferences, and so on. "The program builds
> reproducibly" today and in the future has the same importance as "the
> program doesn't write beyond bounds" or "the program doesn't crash" -
> they are not optional qualities, they are table stakes, and
> regulations are only going to get stricter.
I hear the technical reasons and want to address them, but
I don't understand how regulation comes in an open source project.
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2024-05-19 21:10                   ` Thomas Monjalon
@ 2024-05-20  9:53                     ` Luca Boccassi
  2024-05-20 15:39                       ` Stephen Hemminger
  0 siblings, 1 reply; 16+ messages in thread
From: Luca Boccassi @ 2024-05-20  9:53 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Christian Ehrhardt, Bruce Richardson, dev, david.marchand,
	Mcnamara, John
On Sun, 19 May 2024 at 22:11, Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 19/05/2024 19:23, Luca Boccassi:
> > On Sun, 19 May 2024 at 18:13, Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 19/05/2024 18:36, Luca Boccassi:
> > > > On Sun, 19 May 2024 at 15:01, Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > 17/05/2024 13:29, Luca Boccassi:
> > > > > > On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
> > > > > > <bruce.richardson@intel.com> wrote:
> > > > > > >
> > > > > > > On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> > > > > > > > I would prefer adding an option for reproducible build
> > > > > > > > (which is not a common requirement).
> > > > > > > >
> > > > > > > Taking a slightly different tack, is it possible to sort the searchindex.js
> > > > > > > file post-build, so that even reproducible builds get the benefits of
> > > > > > > parallelism?
> > > > > >
> > > > > > Given the recent attacks with malicious sources being injected in open
> > > > > > source projects, reproducible builds are more important than ever and
> > > > > > should just be the default.
> > > > >
> > > > > Yes it should be the default when packaging.
> > > > > Why should it be the default for normal builds?
> > > >
> > > > Build reproducibility is everyone's responsibility, not just Linux
> > > > distributions. There should be no difference between a "normal build"
> > > > and a "packaging build". As far as I know, it is still fully supported
> > > > for DPDK consumers to take the git repository, build it and ship it
> > > > themselves - those cases also need their builds to be reproducible.
> > >
> > > Sorry I really don't understand this point.
> > > The goal of a reproducible build is to maintain a stable hash, right?
> > > This hash needs to be stable only when it is published, isn't it?
> > > So isn't it enough to give a build option for having a reproducible build?
> >
> > The goal is that issues breaking reproducibility are bugs and treated
> > as such. You wouldn't have a build option to allow buffer overflows or
> > null pointer dereferences, and so on. "The program builds
> > reproducibly" today and in the future has the same importance as "the
> > program doesn't write beyond bounds" or "the program doesn't crash" -
> > they are not optional qualities, they are table stakes, and
> > regulations are only going to get stricter.
>
> I hear the technical reasons and want to address them, but
> I don't understand how regulation comes in an open source project.
Because they will start affecting the companies using DPDK in their
products. There are some things in supply chain security that are
purely the purview of companies shipping the final products, like
providing SBOMs, but there are things that aren't, like for example
having processes to handle security issues, or anything that requires
code changes, like this issue.
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2024-05-20  9:53                     ` Luca Boccassi
@ 2024-05-20 15:39                       ` Stephen Hemminger
  2024-05-20 18:59                         ` Thomas Monjalon
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2024-05-20 15:39 UTC (permalink / raw)
  To: Luca Boccassi
  Cc: Thomas Monjalon, Christian Ehrhardt, Bruce Richardson, dev,
	david.marchand, Mcnamara, John
On Mon, 20 May 2024 10:53:07 +0100
Luca Boccassi <bluca@debian.org> wrote:
> On Sun, 19 May 2024 at 22:11, Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 19/05/2024 19:23, Luca Boccassi:  
> > > On Sun, 19 May 2024 at 18:13, Thomas Monjalon <thomas@monjalon.net> wrote:  
> > > >
> > > > 19/05/2024 18:36, Luca Boccassi:  
> > > > > On Sun, 19 May 2024 at 15:01, Thomas Monjalon <thomas@monjalon.net> wrote:  
> > > > > > 17/05/2024 13:29, Luca Boccassi:  
> > > > > > > On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
> > > > > > > <bruce.richardson@intel.com> wrote:  
> > > > > > > >
> > > > > > > > On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:  
> > > > > > > > > I would prefer adding an option for reproducible build
> > > > > > > > > (which is not a common requirement).
> > > > > > > > >  
> > > > > > > > Taking a slightly different tack, is it possible to sort the searchindex.js
> > > > > > > > file post-build, so that even reproducible builds get the benefits of
> > > > > > > > parallelism?  
> > > > > > >
> > > > > > > Given the recent attacks with malicious sources being injected in open
> > > > > > > source projects, reproducible builds are more important than ever and
> > > > > > > should just be the default.  
> > > > > >
> > > > > > Yes it should be the default when packaging.
> > > > > > Why should it be the default for normal builds?  
> > > > >
> > > > > Build reproducibility is everyone's responsibility, not just Linux
> > > > > distributions. There should be no difference between a "normal build"
> > > > > and a "packaging build". As far as I know, it is still fully supported
> > > > > for DPDK consumers to take the git repository, build it and ship it
> > > > > themselves - those cases also need their builds to be reproducible.  
> > > >
> > > > Sorry I really don't understand this point.
> > > > The goal of a reproducible build is to maintain a stable hash, right?
> > > > This hash needs to be stable only when it is published, isn't it?
> > > > So isn't it enough to give a build option for having a reproducible build?  
> > >
> > > The goal is that issues breaking reproducibility are bugs and treated
> > > as such. You wouldn't have a build option to allow buffer overflows or
> > > null pointer dereferences, and so on. "The program builds
> > > reproducibly" today and in the future has the same importance as "the
> > > program doesn't write beyond bounds" or "the program doesn't crash" -
> > > they are not optional qualities, they are table stakes, and
> > > regulations are only going to get stricter.  
> >
> > I hear the technical reasons and want to address them, but
> > I don't understand how regulation comes in an open source project.  
> 
> Because they will start affecting the companies using DPDK in their
> products. There are some things in supply chain security that are
> purely the purview of companies shipping the final products, like
> providing SBOMs, but there are things that aren't, like for example
> having processes to handle security issues, or anything that requires
> code changes, like this issue.
Reproducibility must be the default. It should not be an option.
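
The "stable hash" criterion raised earlier in the thread can be checked mechanically. What follows is only a minimal sketch, assuming two finished build output trees on disk; `tree_digest` is a hypothetical helper and not part of DPDK's build tooling.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Return one SHA-256 digest covering every file below root.

    Files are visited in sorted order of their relative path, so the
    result does not depend on the order in which the files were written.
    """
    root = Path(root)
    files = sorted(
        (p.relative_to(root).as_posix(), p)
        for p in root.rglob("*")
        if p.is_file()
    )
    digest = hashlib.sha256()
    for rel, path in files:
        # Hash the relative path as well, so renamed files change the digest.
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Two builds of the same source would count as reproducible under this check when `tree_digest` returns the same value for both output directories.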
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2024-05-20 15:39                       ` Stephen Hemminger
@ 2024-05-20 18:59                         ` Thomas Monjalon
  0 siblings, 0 replies; 16+ messages in thread
From: Thomas Monjalon @ 2024-05-20 18:59 UTC (permalink / raw)
  To: Luca Boccassi, Stephen Hemminger
  Cc: Christian Ehrhardt, Bruce Richardson, dev, david.marchand,
	Mcnamara, John
20/05/2024 17:39, Stephen Hemminger:
> On Mon, 20 May 2024 10:53:07 +0100
> Luca Boccassi <bluca@debian.org> wrote:
> 
> > On Sun, 19 May 2024 at 22:11, Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 19/05/2024 19:23, Luca Boccassi:  
> > > > On Sun, 19 May 2024 at 18:13, Thomas Monjalon <thomas@monjalon.net> wrote:  
> > > > >
> > > > > 19/05/2024 18:36, Luca Boccassi:  
> > > > > > On Sun, 19 May 2024 at 15:01, Thomas Monjalon <thomas@monjalon.net> wrote:  
> > > > > > > 17/05/2024 13:29, Luca Boccassi:  
> > > > > > > > On Mon, 27 Nov 2023 at 17:04, Bruce Richardson
> > > > > > > > <bruce.richardson@intel.com> wrote:  
> > > > > > > > >
> > > > > > > > > On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:  
> > > > > > > > > > I would prefer adding an option for reproducible build
> > > > > > > > > > (which is not a common requirement).
> > > > > > > > > >  
> > > > > > > > > Taking a slightly different tack, is it possible to sort the searchindex.js
> > > > > > > > > file post-build, so that even reproducible builds get the benefits of
> > > > > > > > > parallelism?  
> > > > > > > >
> > > > > > > > Given the recent attacks with malicious sources being injected in open
> > > > > > > > source projects, reproducible builds are more important than ever and
> > > > > > > > should just be the default.  
> > > > > > >
> > > > > > > Yes it should be the default when packaging.
> > > > > > > Why should it be the default for normal builds?  
> > > > > >
> > > > > > Build reproducibility is everyone's responsibility, not just Linux
> > > > > > distributions. There should be no difference between a "normal build"
> > > > > > and a "packaging build". As far as I know, it is still fully supported
> > > > > > for DPDK consumers to take the git repository, build it and ship it
> > > > > > themselves - those cases also need their builds to be reproducible.  
> > > > >
> > > > > Sorry I really don't understand this point.
> > > > > The goal of a reproducible build is to maintain a stable hash, right?
> > > > > This hash needs to be stable only when it is published, isn't it?
> > > > > So isn't it enough to give a build option for having a reproducible build?  
> > > >
> > > > The goal is that issues breaking reproducibility are bugs and treated
> > > > as such. You wouldn't have a build option to allow buffer overflows or
> > > > null pointer dereferences, and so on. "The program builds
> > > > reproducibly" today and in the future has the same importance as "the
> > > > program doesn't write beyond bounds" or "the program doesn't crash" -
> > > > they are not optional qualities, they are table stakes, and
> > > > regulations are only going to get stricter.  
> > >
> > > I hear the technical reasons and want to address them, but
> > > I don't understand how regulation comes in an open source project.  
> > 
> > Because they will start affecting the companies using DPDK in their
> > products. There are some things in supply chain security that are
> > purely the purview of companies shipping the final products, like
> > providing SBOMs, but there are things that aren't, like for example
> > having processes to handle security issues, or anything that requires
> > code changes, like this issue.
> 
> Reproducible must be the default. It should not be an option
OK, I think I understand better now, thanks.
* Re: [PATCH] doc: ensure sphinx output is reproducible
  2023-11-27 17:00       ` Bruce Richardson
  2024-05-17 11:29         ` Luca Boccassi
@ 2024-05-26 11:30         ` Thomas Monjalon
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Monjalon @ 2024-05-26 11:30 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: dev, Luca Boccassi, david.marchand, Bruce Richardson
27/11/2023 18:00, Bruce Richardson:
> On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> > 06/07/2023 14:49, Christian Ehrhardt:
> > > On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > > > > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > > >
> > > > > By adding -j we build in parallel, to make building on multiprocessor
> > > > > machines more effective. While that works it does also break
> > > > > reproducible builds as the order of the sphinx generated searchindex.js
> > > > > is depending on execution speed of the individual processes.
> > > > [...]
> > > > > -if Version(ver) >= Version('1.7'):
> > > > > -    sphinx_cmd += ['-j', 'auto']
> > > >
> > > > What is the impact on build speed on an average machine?
> > > 
> > > Hi,
> > > I haven't tested this in isolation as it was just a mandatory change
> > > on the Debian/Ubuntu side.
> > > And the time for exactly and only the doc build is hidden inside the
> > > concurrency of meson.
> > > But I can compare a full build [1] and a full build with the change [2].
> > > 
> > > That is an average build machine and it is 35 seconds slower with the
> > > change to no more do doc builds in parallel.
> > 
> > I would prefer adding an option for reproducible build
> > (which is not a common requirement).
> > 
> Taking a slightly different tack, is it possible to sort the searchindex.js
> file post-build, so that even reproducible builds get the benefits of
> parallelism?
We never got a response to this question.
I am applying this patch, but I am still waiting for an effort in the
direction Bruce is proposing.
I am also removing the line "from packaging.version import Version",
which becomes unused. The same goes for the PIPE and STDOUT imports.
For now we are slowing down the docs build because there is a strong push
for reproducibility but not enough effort to achieve it efficiently.
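
The post-build sorting that Bruce suggests could look roughly like the sketch below. It is not a tested fix for DPDK. It assumes that searchindex.js consists of a single `Search.setIndex(...)` call wrapping a valid JSON object, which is an assumption about the installed Sphinx version, and `canonicalize_searchindex` is a hypothetical helper name. Sorting keys only normalises the order of object members; if the parallel build also changes list order or document numbering, this alone would not be enough.

```python
import json

# Assumed wrapper around the JSON payload in searchindex.js.
PREFIX = "Search.setIndex("
SUFFIX = ")"


def canonicalize_searchindex(text):
    """Return the index text with its JSON payload re-serialised with sorted keys."""
    body = text.strip()
    if not (body.startswith(PREFIX) and body.endswith(SUFFIX)):
        raise ValueError("unexpected searchindex.js layout")
    payload = json.loads(body[len(PREFIX):-len(SUFFIX)])
    # sort_keys gives a stable member order; compact separators avoid
    # whitespace differences between runs.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return PREFIX + canonical + SUFFIX


def rewrite_searchindex(path):
    """Rewrite the file at path in place with the canonical form."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(canonicalize_searchindex(text))
```

Such a step would run after sphinx-build returns, so "-j auto" could stay enabled, provided the remaining differences really are limited to key order.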
end of thread, other threads:[~2024-05-26 11:30 UTC | newest]
Thread overview: 16+ messages
2023-06-29 12:58 [PATCH] doc: ensure sphinx output is reproducible christian.ehrhardt
2023-06-29 13:02 ` Christian Ehrhardt
2023-07-03 15:29 ` Thomas Monjalon
2023-07-06 12:49   ` Christian Ehrhardt
2023-11-27 16:45     ` Thomas Monjalon
2023-11-27 17:00       ` Bruce Richardson
2024-05-17 11:29         ` Luca Boccassi
2024-05-19 13:54           ` Thomas Monjalon
2024-05-19 16:36             ` Luca Boccassi
2024-05-19 17:13               ` Thomas Monjalon
2024-05-19 17:23                 ` Luca Boccassi
2024-05-19 21:10                   ` Thomas Monjalon
2024-05-20  9:53                     ` Luca Boccassi
2024-05-20 15:39                       ` Stephen Hemminger
2024-05-20 18:59                         ` Thomas Monjalon
2024-05-26 11:30         ` Thomas Monjalon