From: Mike Crowe <mac@mcrowe.com>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: bitbake-devel@lists.openembedded.org
Subject: Re: [bitbake-devel] Should stamp files for different versions of a recipe exist at the same time?
Date: Sun, 8 Jun 2025 20:20:26 +0100
Message-ID: <aEXier1YxgCzOv1F@mcrowe.com>
In-Reply-To: <a9296bd359abc522a135cbaa37995d5b5633046e.camel@linuxfoundation.org>

[
 Snip explanation of building first for MACHINE1, then MACHINE2, then back
 to MACHINE1 again with the same PACKAGE_ARCH and relying only on the task
 hashes being different to ensure that the right files end up being used.
]

On 5/30/25 7:17 PM, Mike Crowe via lists.openembedded.org wrote:
>>>>> My guess is that the problem here is that the stamps from the first
>>>>> machine build weren't removed during the "SECOND MACHINE BUILD 1" step
>>>>> above. If I remove them myself then the problem goes away. Is that
>>>>> theory correct? If so, then I can start trying to work out why and any
>>>>> advice would be welcome. If that theory is not correct then does anyone
>>>>> have any idea where I should start investigating?

On Mon, 2025-06-02 at 17:37 +0100, Mike Crowe wrote:
>> I was hoping that someone would just know the answer to my first question
>> above off the top of their head. If so then I'm willing to try digging into
>> this myself to see what I can discover. I just need to make sure that I'm
>> chasing the right part of the problem.

On Monday 02 June 2025 at 22:04:39 +0100, Richard Purdie wrote:
> I've avoided answering this as I think you're right but I'm not 100%
> sure and I don't want to send you off in the wrong direction. There are
> reasons it might not wipe out stamps and those reasons may be
> intentional but without looking at what is going on I couldn't be
> completely sure. So I think you're right and I'd chase that way but...

It turns out that you were right to be hesitant.

Stamps are only removed if they were generated by the current MACHINE due
to oe-core#5634f2fb1740732056d2c1a22717184ef94405bf from 2018:

| sstate: Ensure a given machine only removes things which it created
|
| Currently if you build qemux86 and then generic86, the latter will
| remove all of the former from deploy and workdir. This is because
| qemux86 is i586, genericx86 is i686 and the architctures are compatible
| therefore the sstate 'cleaup' code kicks in.
|
| There was a valid reason for this to ensure i586 packages didn't get into
| an i686 rootfs for example. With the rootfs creation being filtered now, this
| is no longer necessary.
|
| Instead, save out a list of stamps which a give machine has ever seen in
| a given build and only clean up these things if they're no longer
| "reachable".
|
| In particular this means the autobuilder should no longer spend a load of time
| deleting files when switching MACHINE, improving build times.

I'm not sure if the situation the change describes can still occur.
genericx86 (which is presumably what was meant by generic86) no longer
seems to exist and TUNE_PKGARCH should be either i586 or i686, so the
stamps and work directories won't overlap with each other.

This commit was also in Dunfell though, so the next question is why Dunfell
doesn't suffer from the same problem. The answer surprised me: Dunfell
appears (for me at least) to look for stamp file names that don't match
the ones it created.

In Dunfell, the build for MACHINE1 creates a stamp named:

 1-r0.do_populate_lic.sigdata.14708a9f021d264f7c0a9c5c33a93417c6cb20ba936c5805ff96083a9cb05ee9

The first build for MACHINE2 creates a stamp named:

 2-r0.do_populate_lic.sigdata.23cae22087dda59b5d9aad048411200dd4a3abb3308a241f9a89be65b7071d01

Yet the debug log for the second build for MACHINE1 contains:

 DEBUG: Stampfile /.../build/tmp-glibc/stamps/aarch64-oe-linux/lictest/1-r0.do_populate_lic.14708a9f021d264f7c0a9c5c33a93417c6cb20ba936c5805ff96083a9cb05ee9 not available

It's looking for the stamp file without the ".sigdata" in the middle and
doesn't find one.

In Scarthgap, the stamp files are always created without the ".sigdata" in
the filename for me. This means that the stamp file _is_ found during the
second build for MACHINE1, which stops the do_populate_lic (or
do_populate_lic_setscene) task from running.

If I hack sstate.bbclass's sstate_eventhandler2's line:

 if stamp not in stamps and stamp not in preservestamps and stamp in machineindex:

to remove the "and stamp in machineindex", to effectively make
oe-core#5634f2fb17 have no effect, then my problem goes away in Scarthgap:
the do_populate_lic or do_populate_lic_setscene tasks run as expected when
switching MACHINEs. This of course would reintroduce the original problem
that commit was trying to fix though.
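
To make the effect of that condition concrete, here is a toy model of the
check. It is only a sketch and not the real sstate.bbclass code: the function
name stamps_to_remove, the per_machine switch and the sample stamp names are
all made up for illustration, and machineindex is modelled as a plain set of
stamps that the current MACHINE has recorded.

```python
def stamps_to_remove(seen, stamps, preservestamps, machineindex, per_machine=True):
    """Return the stamps that the quoted condition would select for removal.

    seen:           stamps found on disk (hypothetical input)
    stamps:         stamps still reachable in the current build
    preservestamps: stamps that must never be removed
    machineindex:   stamps the current MACHINE has recorded
    per_machine:    False drops the "stamp in machineindex" check
    """
    stale = []
    for stamp in seen:
        if stamp not in stamps and stamp not in preservestamps:
            if not per_machine or stamp in machineindex:
                stale.append(stamp)
    return stale


# Hypothetical names: "aaaa" was left behind by MACHINE1, "bbbb" belongs
# to the MACHINE2 build that is running now.
on_disk = ["lictest/1-r0.do_populate_lic.aaaa", "lictest/1-r0.do_populate_lic.bbbb"]
reachable = {"lictest/1-r0.do_populate_lic.bbbb"}
machine2_index = {"lictest/1-r0.do_populate_lic.bbbb"}

with_check = stamps_to_remove(on_disk, reachable, set(), machine2_index)
without_check = stamps_to_remove(on_disk, reachable, set(), machine2_index,
                                 per_machine=False)
print(with_check)
print(without_check)
```

With the check in place the MACHINE1 stamp survives the MACHINE2 build because
MACHINE2 never recorded it; without the check it is selected for removal.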

The ".sigdata" part is not present in Dunfell stamp filenames for
do_populate_lic_setscene tasks. This means that the stamp check would match
and I would have expected the problem to occur on a third pair of builds
(the first pair don't find anything in sstate, the second ones do and run
do_populate_lic_setscene, the third pair should have the
do_populate_lic_setscene stamps left over from the second pair). It does
not. The do_populate_lic_setscene stamps are being removed by
sstate_installpkg calling sstate_clean where the pattern for removal is
tmp-glibc/stamps/aarch64-oe-linux/lictest/*-*.do_populate_lic*. As far as I
can see, this pattern matches the stamps from all builds that share the
same PACKAGE_ARCH. This step only runs for _setscene tasks, so it doesn't
happen for the real do_populate_lic task in the first build for MACHINE2.
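
The breadth of that pattern can be checked in isolation. The sketch below
tests only the final path component of the pattern against some file names;
it does not reproduce whatever else sstate_clean does. The file names and the
HASH placeholders are invented for illustration.

```python
from fnmatch import fnmatchcase

# Final path component of the removal pattern quoted above.
PATTERN = "*-*.do_populate_lic*"

# Hypothetical stamp names in one stamps/<PACKAGE_ARCH>/lictest directory.
names = [
    "1-r0.do_populate_lic.sigdata.HASH1",
    "2-r0.do_populate_lic_setscene.HASH2",
    "1-r0.do_compile.HASH3",
]

matched = [name for name in names if fnmatchcase(name, PATTERN)]
print(matched)
```

Both do_populate_lic names match regardless of version or hash, while the
do_compile name does not, which is consistent with the pattern sweeping up
the stamps from every build that shares the directory.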

I think all of that suggests three potential solutions:

1. Revert the optimisation in oe-core#5634f2fb17 because it isn't safe in
   the general case.

2. Ensure that running a real task removes any other stamps for that task
   with different hashes (perhaps by calling sstate_clean in the same way
   that sstate_installpkg does?)

3. Create a tool that runs "bitbake -S none world" for each MACHINE and
   then looks for stamp directories with multiple hashes for the same task.
   Assuming the `name.rsplit(".", 3)` trick to extract the task name and
   hash from the stamp filename for arbitrary PV in
   meta/lib/oeqa/selftest/cases/sstatetests.py is correct this doesn't look
   particularly difficult to do.
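
A rough sketch of the checking half of option 3 might look like the code
below. It assumes the files follow the "PV-PR.task.sigdata.hash" naming seen
in the Dunfell example above and that they can be found by walking the stamps
directory; the function names parse_stamp, find_conflicts and walk_stamps are
made up here, and none of this has been run against a real build.

```python
import os
from collections import defaultdict


def parse_stamp(name):
    """Split 'PV-PR.task.sigdata.hash' into (task, hash), else return None."""
    parts = name.rsplit(".", 3)
    if len(parts) != 4 or parts[2] != "sigdata":
        return None
    return parts[1], parts[3]


def find_conflicts(entries):
    """Map (recipe_dir, task) to its sorted hashes when there is more than one.

    entries is an iterable of (recipe_dir, filename) pairs.
    """
    hashes = defaultdict(set)
    for recipe_dir, name in entries:
        parsed = parse_stamp(name)
        if parsed:
            task, shash = parsed
            hashes[(recipe_dir, task)].add(shash)
    return {key: sorted(vals) for key, vals in hashes.items() if len(vals) > 1}


def walk_stamps(stamps_dir):
    """Yield (directory relative to stamps_dir, filename) for every file."""
    for root, _dirs, files in os.walk(stamps_dir):
        for name in files:
            yield os.path.relpath(root, stamps_dir), name


# The two file names quoted earlier in this message.
H1 = "14708a9f021d264f7c0a9c5c33a93417c6cb20ba936c5805ff96083a9cb05ee9"
H2 = "23cae22087dda59b5d9aad048411200dd4a3abb3308a241f9a89be65b7071d01"
entries = [
    ("aarch64-oe-linux/lictest", "1-r0.do_populate_lic.sigdata." + H1),
    ("aarch64-oe-linux/lictest", "2-r0.do_populate_lic.sigdata." + H2),
]
conflicts = find_conflicts(entries)
print(conflicts)
```

On a real tree the entries would come from walk_stamps("tmp-glibc/stamps")
after running "bitbake -S none world" for each MACHINE. Splitting from the
right means a PV containing dots ends up in the first field and is ignored.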

I shall continue to dig, but if anyone has any more ideas then they'd be
gratefully received.

Thanks.

Mike.

