From: Rasmus Villemoes <ravi@prevas.dk>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: openembedded-core@lists.openembedded.org,  chbs@prevas.dk,
	 emkan@prevas.dk
Subject: Re: [OE-core] [RFC PATCH] license.bbclass: only create hardlinks within $TMPDIR
Date: Mon, 04 May 2026 11:58:12 +0200
Message-ID: <878q9zsdhn.fsf@prevas.dk>
In-Reply-To: <0a68ea544451d546991ca5b532204af6fe3b6e69.camel@linuxfoundation.org> (Richard Purdie's message of "Mon, 04 May 2026 09:35:24 +0100")

On Mon, May 04 2026, Richard Purdie <richard.purdie@linuxfoundation.org> wrote:

> On Mon, 2026-05-04 at 10:30 +0200, Rasmus Villemoes wrote:
>> On Sat, May 02 2026, Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
>> 
>> > On Fri, 2026-05-01 at 20:36 +0200, Rasmus Villemoes via lists.openembedded.org wrote:
>> > > From: Rasmus Villemoes <ravi@prevas.dk>
>> > > 
>> > > This is _not_ meant to be applied, but merely acts as a place to start
>> > > a discussion.
>> > > 
>> > > In our CI setup (which is gitlab-based, but I'm not sure that's too
>> > > relevant), we sometimes see spurious errors like
>> > > 
>> > >   touch: setting times of '[...]/rootfs/usr/share/common-licenses/go2rtc/COPYING.MIT': Operation not permitted
>> > > 
>> > > 
>> > > I do understand that using hard links is quite valuable for saving
>> > > time and space (I see GPL-2.0-only having 135 links, so that alone is
>> > > a few MB), which is why I don't think this should be applied as-is.
>> > > 
>> > > Could we do something like create a $TMPDIR/license-pool/, and
>> > > whenever encountering a src not inside TMPDIR, copy that file to
>> > > $TMPDIR/license-pool/<basename>-<sha256 of source> [if it doesn't
>> > > already exist], and then use the latter as src in the subsequent "can
>> > > we hardlink" logic? That would then also improve the case where the
>> > > meta-layers are (bind)mounted R/O, and we thus always end up copying.
>> > 
>> > I agree it does look like there is a potential issue here.
>> > 
>> > I'm not sure we have ever claimed isolation between the build metadata
>> > and the builds. I know that, for example, patch files are linked in, so
>> > that when you update patches, it updates the metadata more easily. I
>> > can see why you might want that isolation; equally, the ability to
>> > update patches that way is a nice usability feature.
>> > 
>> > Did you track down where the utime of the files was being changed? From
>> > the link, it looks like:
>> > 
>> >     find  ${IMAGE_ROOTFS} -print0 | xargs -0 touch -h  --date=@$REPRODUCIBLE_TIMESTAMP_ROOTFS
>> > 
>> > in reproducible_final_image_task in image.bbclass
>> 
>> Yes, it is precisely that command/function where it triggers.
>> 
>> > I think that function could be changed to clamp to anything newer than
>> > REPRODUCIBLE_TIMESTAMP_ROOTFS rather than changing all files, and that
>> > might actually avoid the issue.
>> 
>> I don't see how. First, how would that then serve the purpose of
>> producing something reproducible? It seems that the timestamps would or
>> could then depend on when the meta-data repo was cloned. Second, if the
>> mtime is still going to be changed sometimes under some conditions, the
>> problem is not really gone. And third, I think it would make the
>> implementation much more complicated, if one would need to make a
>> decision for each file for whether to touch it or not (i.e. the simple
>> find|xargs would no longer suffice).
>
> Pretty much all the rest of the system works like this for
> reproducibility. Whether you make everything
> REPRODUCIBLE_TIMESTAMP_ROOTFS or just clamp to a max value of
> REPRODUCIBLE_TIMESTAMP_ROOTFS doesn't really matter since you're using
> that value regardless.
>
> Basically anything touched as part of the build process (newer than the
> timestamp) would be reset, everything else (older than that) should be
> deterministic.
>
> The value of the actual timestamp used is a different discussion but
> we're not changing that.
>
> Having thought about this a bit, I do think that is the right solution,
> and it does match how we handle timestamps elsewhere like the packages.
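
For reference, the clamping described above could look roughly like the
sketch below. It is only an illustration in Python, not what
image.bbclass does: the name clamp_mtimes and its arguments are made up
here, and it sets atime along with mtime the way touch --date does.
Entries whose mtime is not newer than the timestamp are never passed to
utime, so no timestamp syscall is attempted on them.

```python
import os


def clamp_mtimes(root, timestamp):
    """Set timestamps to `timestamp` only on entries newer than it.

    Hypothetical helper: returns the list of paths that were changed.
    """
    touched = []
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    for path in paths:
        # lstat/follow_symlinks=False mirror "touch -h": act on the link itself.
        if os.lstat(path).st_mtime > timestamp:
            os.utime(path, (timestamp, timestamp), follow_symlinks=False)
            touched.append(path)
    return touched
```

Any entry that is newer than the timestamp still gets a utime call, so
the call can still fail on an inode the build user does not own.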

OK, but then that still wouldn't solve the problem we see, where the
build has decided that it can create a hard link, but then later fails
to perform the utimensat syscall because the user running bitbake is
(apparently) not the user owning the inode.
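
To make the failure mode concrete: setting explicit timestamps (as
opposed to "now") with utimensat requires that the caller's effective
UID matches the file owner, or that the caller is privileged; otherwise
the call fails with EPERM, which is the "Operation not permitted" above.
A rough pre-check could be sketched as follows. The function name
may_set_times is hypothetical, and the check is an approximation: it
ignores CAP_FOWNER held by a non-root user, user namespaces, and
whatever pseudo does to the view of ownership.

```python
import os


def may_set_times(path):
    """Approximate test for whether os.utime() with explicit times can succeed.

    Hypothetical helper: only compares the effective UID with the inode
    owner and treats UID 0 as privileged.
    """
    st = os.lstat(path)
    euid = os.geteuid()
    return euid == 0 or st.st_uid == euid
```

Something like this could be one more condition in the "can we
hardlink" decision, falling back to a copy when it returns False.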

As I said, I don't really understand how that sometimes happens when
most of the time it doesn't. Perhaps it is some weird interaction with
how gitlab starts docker containers, possibly with some pseudo in the
mix.
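
For completeness, the $TMPDIR/license-pool/ idea from the RFC quoted
above could be sketched like this. It is a minimal illustration, not a
patch against license.bbclass: the function name pooled_source is made
up, and "sha256 of source" is assumed here to mean the digest of the
file contents. The pooled copy is created by the user running the
build, so under that assumption it is owned by that user and lives on
the same filesystem as the rootfs.

```python
import hashlib
import os
import shutil


def pooled_source(src, tmpdir):
    """Return a path inside tmpdir holding the same content as src.

    Hypothetical helper: files already under tmpdir are returned as-is,
    anything else is copied once to tmpdir/license-pool/<basename>-<sha256>.
    """
    src = os.path.realpath(src)
    tmpdir = os.path.realpath(tmpdir)
    if os.path.commonpath([src, tmpdir]) == tmpdir:
        return src  # already inside TMPDIR
    with open(src, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    pool = os.path.join(tmpdir, "license-pool")
    os.makedirs(pool, exist_ok=True)
    dst = os.path.join(pool, "%s-%s" % (os.path.basename(src), digest))
    if not os.path.exists(dst):
        tmp = "%s.%d" % (dst, os.getpid())
        shutil.copyfile(src, tmp)
        # rename is atomic, so concurrent tasks never see a partial file
        os.rename(tmp, dst)
    return dst
```

The returned path would then be used as src in the existing hardlink
logic, e.g. os.link(pooled_source(src, tmpdir), dst).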

Rasmus

