public inbox for openembedded-core@lists.openembedded.org
* Autobuilder reproducibility target changes
@ 2021-02-14 12:19 Richard Purdie
  2021-02-14 15:04 ` Alexander Kanavin
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Richard Purdie @ 2021-02-14 12:19 UTC (permalink / raw)
  To: openembedded-core
  Cc: swat, Mittal, Anuj, Steve Sakoman, Joshua Watt, Alexander Kanavin

Regular users of the autobuilder will note that I've split the
reproducible builds test out of the main oe-selftest build and into its
own target build. This is because that test tends to run for much
longer, and it helps to see the result separately.

I've only done this for master. If gatesgarth and dunfell want to
follow, that should be straightforward with a change to the branch in
autobuilder-helper. Obviously we should ensure this is working ok with
master first but so far so good.

It has already highlighted the difference between a successful run:

https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/2
https://autobuilder.yoctoproject.org/typhoon/#/builders/119/builds/2
(took 3-4 hours)

and two failing runs:

https://autobuilder.yoctoproject.org/typhoon/#/builders/116/builds/2
https://autobuilder.yoctoproject.org/typhoon/#/builders/118/builds/2
(took 9 hours)

the time difference being the system trying to run diffoscope on
vim-common :/.

I'm aware I removed some recipes from the exclusions list after seeing
multiple passing builds for all distros and we're now seeing test
failures. My mistake was not waiting for the date to change and for
builds to run on an autobuilder worker with a different umask.
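
To illustrate why a worker's umask can matter here, the following is a minimal Python sketch and not the oe-selftest code. The helper name `mode_under_umask` is made up for this example. It creates the same file under two different umasks and reports the permission bits, which is the kind of difference that can leak into a package when a recipe does not set modes explicitly.

```python
import os
import stat
import tempfile


def mode_under_umask(umask: int) -> int:
    """Create a file under the given umask and return its permission bits."""
    old = os.umask(umask)  # os.umask returns the previous mask
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "example")
            # Request 0o666; the kernel clears the bits that are set in the umask.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)  # always restore the caller's umask


if __name__ == "__main__":
    for mask in (0o022, 0o002):
        print(f"umask {mask:03o} -> mode {mode_under_umask(mask):03o}")
```

On a typical Linux filesystem the two masks should give modes 644 and 664, so two otherwise identical builds can differ only because of the worker they ran on.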

Meson is failing with a pyc file mismatch which diffoscope can't decode
and despite trying for 5 hours, diffoscope hasn't given any data on why
vim-common differs. I should have fixes in for quilt, valgrind,
kernel-devsrc and cwautomacros. The umask fix may fix other issues too.
Alex has improved the reporting so we can spot cases where exclusion is
no longer needed.
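
The thread does not say what actually differs in the Meson pyc file. As background only, one well-known source of pyc differences is that the default pyc header embeds the source file's mtime, whereas hash-based pycs (PEP 552) embed a hash of the source instead. The sketch below assumes nothing about Meson; `pyc_header` is a hypothetical helper that compiles the same source after forcing two different mtimes and returns the 16-byte header for comparison.

```python
import os
import py_compile
import tempfile


def pyc_header(source: str, mtime: int, mode: py_compile.PycInvalidationMode) -> bytes:
    """Force the source mtime, compile it, and return the 16-byte pyc header."""
    os.utime(source, (mtime, mtime))
    with tempfile.TemporaryDirectory() as out:
        cfile = os.path.join(out, "mod.pyc")
        py_compile.compile(source, cfile=cfile, doraise=True, invalidation_mode=mode)
        with open(cfile, "rb") as f:
            # Header layout: magic, flags, then either mtime + size or a source hash.
            return f.read(16)


if __name__ == "__main__":
    stamp = py_compile.PycInvalidationMode.TIMESTAMP
    hashed = py_compile.PycInvalidationMode.CHECKED_HASH
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "mod.py")
        with open(src, "w") as f:
            f.write("x = 1\n")
        same_ts = pyc_header(src, 1_000_000_000, stamp) == pyc_header(src, 1_600_000_000, stamp)
        same_hash = pyc_header(src, 1_000_000_000, hashed) == pyc_header(src, 1_600_000_000, hashed)
        print("timestamp headers equal:", same_ts)
        print("hash headers equal:", same_hash)
```

Comparing only the header keeps the example independent of how the code object itself is serialised, which can be a separate source of differences.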

Cheers,

Richard



* Re: Autobuilder reproducibility target changes
  2021-02-14 12:19 Autobuilder reproducibility target changes Richard Purdie
@ 2021-02-14 15:04 ` Alexander Kanavin
  2021-02-14 15:17   ` Richard Purdie
  2021-02-14 18:47 ` Joshua Watt
  2021-02-14 19:17 ` Joshua Watt
  2 siblings, 1 reply; 9+ messages in thread
From: Alexander Kanavin @ 2021-02-14 15:04 UTC (permalink / raw)
  To: Richard Purdie
  Cc: openembedded-core, swat, Mittal, Anuj, Steve Sakoman, Joshua Watt

Cheers :) If there's something else I could help with, let me know.

One item is getting to the bottom of why it takes diffoscope beyond the
heat death of the universe to render its verdict on some items.

Alex



* Re: Autobuilder reproducibility target changes
  2021-02-14 15:04 ` Alexander Kanavin
@ 2021-02-14 15:17   ` Richard Purdie
  2021-02-14 16:30     ` Alexander Kanavin
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Purdie @ 2021-02-14 15:17 UTC (permalink / raw)
  To: Alexander Kanavin
  Cc: openembedded-core, swat, Mittal, Anuj, Steve Sakoman, Joshua Watt

On Sun, 2021-02-14 at 16:04 +0100, Alexander Kanavin wrote:
> Cheers :) If there's something else I could help with, let me know.

I'm going to try and get diffs of the remaining big package differences
and see where we stand. The two big ones I know of are go as a language
for reproducibility and perf. Perf will just be a sheer pain to fix,
hopefully the kernel will take patches.

> One item is getting to the bottom of why it takes diffoscope beyond
> the heat death of the universe to render its verdict on some items.

That would be really helpful to get to the bottom of. The vim-common
difference is actually really simple:

https://autobuilder.yocto.io/pub/repro-fail/oe-reproducible-20210213-0djxo1sn/packages/diff-html/

so I don't know why it took 5 hours to compute that. It suggests
something really silly/stupid is going on. diffoscope should be
amenable to fixes so it would be worth talking to them too...
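
One way to keep a slow comparison from consuming hours, sketched here in Python under the assumption that the comparison tool is launched as a subprocess, is to bound it with a timeout. This is not what oe-selftest does; `run_with_timeout` is a hypothetical helper, and the demo command is a stand-in rather than a real diffoscope call.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_with_timeout(cmd: Sequence[str], seconds: float) -> Optional[int]:
    """Run cmd; return its exit code, or None if it exceeded `seconds`."""
    try:
        # subprocess.run kills the child and waits for it before raising.
        return subprocess.run(list(cmd), timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Stand-in for a long-running comparison: sleeps longer than the limit.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print("slow command result:", run_with_timeout(slow, 0.5))
```

With diffoscope the command list might look like `["diffoscope", "--html-dir", "diff-html", "a.ipk", "b.ipk"]`, where the file names are placeholders. Only the direct child is killed on timeout, so helper processes it spawned may need separate handling.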

(I have a fix for the locale problem in vim brewing, it needs a new
buildtools-extended-tarball)

Cheers,

Richard





* Re: Autobuilder reproducibility target changes
  2021-02-14 15:17   ` Richard Purdie
@ 2021-02-14 16:30     ` Alexander Kanavin
  0 siblings, 0 replies; 9+ messages in thread
From: Alexander Kanavin @ 2021-02-14 16:30 UTC (permalink / raw)
  To: Richard Purdie
  Cc: openembedded-core, swat, Mittal, Anuj, Steve Sakoman, Joshua Watt

On Sun, 14 Feb 2021 at 16:17, Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> On Sun, 2021-02-14 at 16:04 +0100, Alexander Kanavin wrote:
> > Cheers :) If there's something else I could help with, let me know.
>
> I'm going to try and get diffs of the remaining big package differences
> and see where we stand. The two big ones I know of are go as a language
> for reproducibility and perf. Perf will just be a sheer pain to fix,
> hopefully the kernel will take patches.
>

I did a bit of work on the go reproducibility, that has been preserved here:
http://git.yoctoproject.org/cgit.cgi/poky-contrib/log/?h=akanavin/go-repro

I got there, but it took quite a bit of time (debugging the go build process is
extremely painful) and I'm not at all happy with the hacky/brittle things
in the patch, so it's on hold for now - but anyone is welcome to take it
and make it better, especially if they're go specialists.

Alex



* Re: Autobuilder reproducibility target changes
  2021-02-14 12:19 Autobuilder reproducibility target changes Richard Purdie
  2021-02-14 15:04 ` Alexander Kanavin
@ 2021-02-14 18:47 ` Joshua Watt
  2021-02-14 19:17 ` Joshua Watt
  2 siblings, 0 replies; 9+ messages in thread
From: Joshua Watt @ 2021-02-14 18:47 UTC (permalink / raw)
  To: Richard Purdie
  Cc: openembedded-core, swat, Mittal, Anuj, Steve Sakoman,
	Alexander Kanavin

On Sun, Feb 14, 2021 at 6:19 AM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> Regular users of the autobuilder will note that I've split the
> reproducible builds test out of the main oe-selftest build and into its
> own target build. This is because that test tends to run for a lot
> longer time period and it helps to see the result separately.
>
> I've only done this for master. If gatesgarth and dunfell want to
> follow, that should be straightforward with a change to the branch in
> autobuilder-helper. Obviously we should ensure this is working ok with
> master first but so far so good.
>
> It has already highlighted the difference between a successful run:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/2
> https://autobuilder.yoctoproject.org/typhoon/#/builders/119/builds/2
> (took 3-4 hours)
>
> and two failing runs:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/116/builds/2
> https://autobuilder.yoctoproject.org/typhoon/#/builders/118/builds/2
> (took 9 hours)
>
> the time difference being the system trying to run diffoscope on
> vim-common :/.

I'm not sure that diffoscope is the culprit here. If you look at the
logs, you can see that there is only about 30 seconds between the
"Running diffoscope" log message and the end of the test. I suspect
something else is going wrong here. I can try to write up a patch to
add more logging so we can more accurately pinpoint where it's
taking so long.
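
As a rough idea of what such logging could look like (a sketch only, not the eventual patch), a small context manager can log the start and end of each phase and record its duration. The name `timed` and the phase labels are invented for illustration.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("repro-timing")


@contextmanager
def timed(phase: str, results: dict):
    """Log when the enclosed block starts and ends, and store its duration."""
    start = time.monotonic()
    logger.info("start %s", phase)
    try:
        yield
    finally:
        # Recorded even if the block raises, so failed phases are still timed.
        results[phase] = time.monotonic() - start
        logger.info("end %s after %.1fs", phase, results[phase])


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    durations = {}
    with timed("build B", durations):
        time.sleep(0.2)  # stand-in for the real build step
    with timed("compare packages", durations):
        time.sleep(0.1)  # stand-in for the comparison step
    print("slowest phase:", max(durations, key=durations.get))
```

Wrapping each stage of the test this way would show in the log which stage accounts for the extra hours.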




* Re: Autobuilder reproducibility target changes
  2021-02-14 12:19 Autobuilder reproducibility target changes Richard Purdie
  2021-02-14 15:04 ` Alexander Kanavin
  2021-02-14 18:47 ` Joshua Watt
@ 2021-02-14 19:17 ` Joshua Watt
  2021-02-15  6:21   ` Alexander Kanavin
  2021-02-15 16:31   ` Richard Purdie
  2 siblings, 2 replies; 9+ messages in thread
From: Joshua Watt @ 2021-02-14 19:17 UTC (permalink / raw)
  To: Richard Purdie
  Cc: openembedded-core, swat, Mittal, Anuj, Steve Sakoman,
	Alexander Kanavin

On Sun, Feb 14, 2021 at 6:19 AM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> Regular users of the autobuilder will note that I've split the
> reproducible builds test out of the main oe-selftest build and into its
> own target build. This is because that test tends to run for a lot
> longer time period and it helps to see the result separately.
>
> I've only done this for master. If gatesgarth and dunfell want to
> follow, that should be straightforward with a change to the branch in
> autobuilder-helper. Obviously we should ensure this is working ok with
> master first but so far so good.
>
> It has already highlighted the difference between a successful run:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/2
> https://autobuilder.yoctoproject.org/typhoon/#/builders/119/builds/2
> (took 3-4 hours)
>
> and two failing runs:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/116/builds/2
> https://autobuilder.yoctoproject.org/typhoon/#/builders/118/builds/2
> (took 9 hours)

OK, I read through the code and unfortunately found a bug: when
attempting to make sure the "B" build doesn't use sstate, I misspelled
SSTATE_MIRRORS, which means that the B build could have been
pulling from the sstate mirror when it was not supposed to. This has a
few implications:

 1) It might explain why some of the reproducible results seem intermittent
 2) It might explain why there is such a time disparity between the tests

Unfortunately, while it probably will help the intermittent results,
it probably means that the tests taking 9 hours is what is "supposed"
to happen, and they happen to be shorter sometimes because the B build
is pulling from sstate when it's not supposed to.
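
A misspelled variable name is easy to miss because assigning to an unknown name is not an error. The Python sketch below shows one generic way to catch near-miss spellings before they are written into a test configuration. It is not the fix that was applied: `KNOWN_VARS` is a small illustrative subset, `suspicious_names` is a hypothetical helper, and the misspelling `SSTATE_MIRROR` is invented, since the thread does not say what the actual typo was.

```python
import difflib

# Illustrative subset only; a real check would need the full set of names.
KNOWN_VARS = {"SSTATE_DIR", "SSTATE_MIRRORS", "TMPDIR", "DL_DIR"}


def suspicious_names(overrides: dict, known=KNOWN_VARS) -> dict:
    """Map each unknown variable name to its closest known spelling, if any."""
    flagged = {}
    for name in overrides:
        if name not in known:
            close = difflib.get_close_matches(name, sorted(known), n=1, cutoff=0.8)
            if close:
                flagged[name] = close[0]
    return flagged


if __name__ == "__main__":
    # Hypothetical overrides with an invented typo in the first key.
    overrides = {"SSTATE_MIRROR": "", "TMPDIR": "/tmp/build-b"}
    for bad, good in suspicious_names(overrides).items():
        print(f"{bad} is not a known variable; did you mean {good}?")
```

Names with no close match are left alone, so genuinely new variables are not flagged.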



* Re: Autobuilder reproducibility target changes
  2021-02-14 19:17 ` Joshua Watt
@ 2021-02-15  6:21   ` Alexander Kanavin
  2021-02-15 16:23     ` Joshua Watt
  2021-02-15 16:31   ` Richard Purdie
  1 sibling, 1 reply; 9+ messages in thread
From: Alexander Kanavin @ 2021-02-15  6:21 UTC (permalink / raw)
  To: Joshua Watt
  Cc: Mittal, Anuj, Richard Purdie, Steve Sakoman, openembedded-core,
	swat

I’ve definitely seen the diffoscope process take hours and hours in
local builds. Trying it with these vim packages locally should still be
done.

Alex



* Re: Autobuilder reproducibility target changes
  2021-02-15  6:21   ` Alexander Kanavin
@ 2021-02-15 16:23     ` Joshua Watt
  0 siblings, 0 replies; 9+ messages in thread
From: Joshua Watt @ 2021-02-15 16:23 UTC (permalink / raw)
  To: Alexander Kanavin
  Cc: Mittal, Anuj, Richard Purdie, Steve Sakoman, openembedded-core,
	swat

On Mon, Feb 15, 2021 at 12:21 AM Alexander Kanavin
<alex.kanavin@gmail.com> wrote:
>
> I’ve definitely seen diffoscope process take hours and hours and hours in local builds. Trying it with these vim packages locally should still be done.

I forgot to mention that I did run diffoscope locally with the
offending vim packages and it took about 30 seconds (the same as the
autobuilder logs showed).



* Re: Autobuilder reproducibility target changes
  2021-02-14 19:17 ` Joshua Watt
  2021-02-15  6:21   ` Alexander Kanavin
@ 2021-02-15 16:31   ` Richard Purdie
  1 sibling, 0 replies; 9+ messages in thread
From: Richard Purdie @ 2021-02-15 16:31 UTC (permalink / raw)
  To: Joshua Watt
  Cc: openembedded-core, swat, Mittal, Anuj, Steve Sakoman,
	Alexander Kanavin

On Sun, 2021-02-14 at 13:17 -0600, Joshua Watt wrote:
> On Sun, Feb 14, 2021 at 6:19 AM Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> > 
> > Regular users of the autobuilder will note that I've split the
> > reproducible builds test out of the main oe-selftest build and into its
> > own target build. This is because that test tends to run for a lot
> > longer time period and it helps to see the result separately.
> > 
> > I've only done this for master. If gatesgarth and dunfell want to
> > follow, that should be straightforward with a change to the branch in
> > autobuilder-helper. Obviously we should ensure this is working ok with
> > master first but so far so good.
> > 
> > It has already highlighted the difference between a successful run:
> > 
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/2
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/119/builds/2
> > (took 3-4 hours)
> > 
> > and two failing runs:
> > 
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/116/builds/2
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/118/builds/2
> > (took 9 hours)
> 
> OK, I read through the code and unfortunately found a bug: when
> attempting to make sure the "B" build doesn't use sstate, I misspelled
> the SSTATE_MIRRORS, which means that the B build could have been
> pulling from the sstate mirror when it was not supposed to. This has a
> few implications:
> 
>  1) It might explain why some of the reproducible results seem intermittent
>  2) It might explain why there is such a time disparity between the tests

The "good" news is that this didn't affect the autobuilder as it sets
SSTATE_DIR to a common directory and doesn't use SSTATE_MIRRORS.

> Unfortunately, while it probably will help the intermittent results,
> it probably means that the tests taking 9 hours is what is "supposed"
> to happen, and they happen to be shorter sometimes because the B build
> is pulling from sstate when it's not supposed to.

I don't think we've got to the bottom of this. If it's not spending the
time in diffoscope, something seems to cause builds with differences to
take much longer...

Cheers,

Richard





