From: Anand Jain <anand.jain@oracle.com>
To: Chris Murphy <lists@colorremedies.com>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: reproducible builds with btrfs seed feature
Date: Tue, 16 Oct 2018 16:13:53 +0800
Message-ID: <11171664-790c-40ff-3925-1c179ab8e74b@oracle.com>
In-Reply-To: <CAJCQCtTPwQnzwkpk=4ZsZXfWTC7HymYETxp-9xUU_tsvOTW0ZQ@mail.gmail.com>
On 10/14/2018 06:28 AM, Chris Murphy wrote:
> Is it practical and desirable to make Btrfs based OS installation
> images reproducible? Or is Btrfs simply too complex and
> non-deterministic? [1]
>
> The main three problems with Btrfs right now for reproducibility are:
> a. many objects have uuids other than the volume uuid; and mkfs only
> lets us set the volume uuid
> b. atime, ctime, mtime, otime; and no way to make them all the same
> c. non-deterministic allocation of file extents, compression, inode
> assignment, logical and physical address allocation
>
> I'm imagining reproducible image creation would be a mkfs feature that
> builds on Btrfs seed and --rootdir concepts to constrain Btrfs
> features to maybe make reproducible Btrfs volumes possible:
>
> - No raid
> - Either all objects needing uuids can have those uuids specified by
> switch, or possibly a defined set of uuids expressly for this use
> case, or possibly all of them can just be zeros (eek? not sure)
> - A flag to set all times the same
> - Possibly require that target block device is zero filled before
> creation of the Btrfs
> - Possibly disallow subvolumes and snapshots
> - Require the resulting image is seed/ro and maybe also a new
> compat_ro flag to enforce that such Btrfs file systems cannot be
> modified after the fact.
> - Enforce a consistent means of allocation and compression
>
> The end result is creating two Btrfs volumes would yield image files
> with matching hashes.
> If I had to guess, the biggest challenge would be allocation. But it's
> also possible that such an image may have problems with "sprouts". A
> non-removable sprout seems fairly straightforward and safe; but if a
> "reproducible build" type of seed is removed, it seems like removal
> needs to be smart enough to refresh *all* uuids found in the sprout: a
> hard break from the seed.
Right. The seed fsid will be gone in a detached sprout.
> Competing file systems, ext4 with make_ext4 fork, and squashfs. At the
> moment I'm thinking it might be easier to teach squashfs integrity
> checking than to make Btrfs reproducible. But then I also think
> restricting Btrfs features, and applying some requirements to
> constrain Btrfs to make it reproducible, really enhances the Btrfs
> seed-sprout feature.
> Any thoughts? Useful? Difficult to implement?
Recently Nikolay sent a patch to change the fsid on a mounted btrfs.
However, reproducible builds also need neutralized uuids, times, and
bytenr(s). Furthermore, although the on-disk layout won't change
without notice, the block bytenr might.
One question: why don't reproducible builds get the file data extents
from the image and stitch their hashes together to verify a combined
hash? There could also be a vfs ioctl to import and export filesystem
images, for better supportability of use cases similar to reproducible
builds.
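A minimal sketch of the hash-stitching idea, assuming the image is
mounted read-only and walked through the normal file API rather than
read as raw extents. The function names and the choice of SHA-256 are
illustrative assumptions, not anything btrfs provides. It hashes each
regular file's data, then combines the per-file digests in sorted path
order, so the result does not depend on uuids, timestamps, or block
placement. It deliberately ignores ownership, modes, xattrs, and
symlinks.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 digest (bytes) of one file's data."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def tree_digest(root):
    """Stitch per-file digests into one hex digest for the tree.

    Only relative paths and file contents feed the hash, so two
    images with identical data but different on-disk layout match.
    """
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Regular files only; symlinks and special files are skipped.
            if os.path.isfile(full) and not os.path.islink(full):
                entries.append((os.path.relpath(full, root), full))

    outer = hashlib.sha256()
    for rel, full in sorted(entries):
        rel_bytes = os.fsencode(rel)
        # Length-prefix the path so path/data boundaries are unambiguous.
        outer.update(len(rel_bytes).to_bytes(8, "big"))
        outer.update(rel_bytes)
        outer.update(file_digest(full))
    return outer.hexdigest()
```

With two images mounted at hypothetical paths such as /mnt/image-a and
/mnt/image-b, comparing tree_digest("/mnt/image-a") with
tree_digest("/mnt/image-b") would verify the data without requiring
bit-identical image files.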
For the seed-sprout feature, one thing I have in mind is to make it
image and subvolume granular rather than disk and fsid granular, with
the ability to propagate golden image (seed) updates, but I haven't
checked the feasibility yet.
Thanks, Anand
>
> Squashfs might be a better fit for this use case *if* it can be taught
> about integrity checking.
> It does per file checksums for the purpose
> of deduplication but those checksums aren't retained for later
> integrity checking.
>
> [1] problems of reproducible system images
> https://reproducible-builds.org/docs/system-images/
>
> [2] purpose and motivation for reproducible builds
> https://reproducible-builds.org/
>
> [3] who is involved?
> https://reproducible-builds.org/who/#Qubes%20OS