git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Simon Josefsson <simon@josefsson.org>,  git@vger.kernel.org
Subject: Re: Making bit-by-bit reproducible Git Bundles?
Date: Thu, 13 Mar 2025 06:36:39 -0700
Message-ID: <xmqqzfhpqcgo.fsf@gitster.g>
In-Reply-To: <20250313051538.GA94015@coredump.intra.peff.net> (Jeff King's message of "Thu, 13 Mar 2025 01:15:38 -0400")

Jeff King <peff@peff.net> writes:

> .... But there are some gotchas:
>
>   1. It's stable only for a given Git version, and with a particular set
> ...
>   2. There is no way to pass pack-objects options down through
> ...
>   3. It will be really slow. We're throwing out all of the deltas and
> ...

There is also a fourth gotcha:

    4. We do not control zlib, so even with the same Git binary, the
       zlib implementation that is dynamically linked to us is free
       to produce a better-compressed base object (or delta), which
       changes the bytes of the resulting pack.
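
To make point 4 concrete, here is a rough sketch. It does not swap the
zlib library; as a stand-in it varies the deflate level via
core.compression, which is enough to show that one object name can map
to different compressed bytes. The temporary directory, the payload and
the loose_path helper are made up for illustration.

```shell
#!/bin/sh
# Sketch: same object, two deflate settings, compare the bytes on disk.
work=$(mktemp -d)

# Highly repetitive payload, so that the compression level matters.
yes 'the same line, over and over' | head -n 1000 >"$work/payload"

for level in 0 9
do
	git init -q "$work/repo-$level"
done

# core.compression is the zlib level Git hands to deflate (0 = store only).
oid0=$(git -C "$work/repo-0" -c core.compression=0 hash-object -w "$work/payload")
oid9=$(git -C "$work/repo-9" -c core.compression=9 hash-object -w "$work/payload")

# Hypothetical helper: loose objects live at
# .git/objects/<first two hex digits>/<remaining digits>.
loose_path () {
	printf '%s/.git/objects/%s/%s\n' "$1" \
		"$(printf '%s' "$2" | cut -c1-2)" \
		"$(printf '%s' "$2" | cut -c3-)"
}
obj0=$(loose_path "$work/repo-0" "$oid0")
obj9=$(loose_path "$work/repo-9" "$oid9")

echo "object names: $oid0 / $oid9"
if cmp -s "$obj0" "$obj9"
then
	echo "compressed bytes are identical"
else
	echo "compressed bytes differ"
fi
```

The object name is computed over the uncompressed content, so it is the
same in both repositories; only the deflated representation is free to
vary.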

Gotcha 3 is not a downside if the requestor's priority is
bit-for-bit reproducibility (iow, "no matter what the cost").

>   # print all commits in topological order, with ties broken by
>   # committer date, which should be stable. And then follow up with the
>   # trees and blobs for each.
>   git rev-list --topo-order --objects HEAD >objects
>
>   # now print the contents of each object (preceded by its name, type,
>   # and length, so there's no chance of weird prepending or appending
>   # attacks). We cut off the path information from rev-list here, since
>   # the ordered set of objects is all we care about.
>   cut -d' ' -f1 objects |
>   git cat-file --batch >content
>
>   # and then take a hash over that content; this will be unambiguous.
>   sha256sum <content

Gross but probably stable ;-)
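
For illustration only, the quoted recipe could be wrapped up roughly as
below and run against a repository and a repacked clone of it. The
content_hash function name, the throwaway repositories and the commit
identity are all assumptions made for this sketch; the pipeline itself
is the one quoted above, minus the temporary files.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): hash the logical content reachable
# from a revision, independent of how it is packed or compressed on disk.
content_hash () {
	repo=$1
	rev=${2:-HEAD}
	# Objects in topological order, paths cut off, then name/type/size
	# plus contents of each object, then one hash over the whole stream.
	git -C "$repo" rev-list --topo-order --objects "$rev" |
	cut -d' ' -f1 |
	git -C "$repo" cat-file --batch |
	sha256sum |
	cut -d' ' -f1
}

# Throwaway source repository with a single commit.
work=$(mktemp -d)
git init -q "$work/src"
echo hello >"$work/src/file"
git -C "$work/src" add file
git -C "$work/src" -c user.name=Example -c user.email=example@example.invalid \
	commit -q -m initial

# Clone it and force a full repack with a different compression level,
# so the on-disk representation is not the same as in the source.
git clone -q "$work/src" "$work/dst"
git -C "$work/dst" -c core.compression=9 repack -a -d -f -q

src_hash=$(content_hash "$work/src")
dst_hash=$(content_hash "$work/dst")
echo "src: $src_hash"
echo "dst: $dst_hash"
```

The two hashes should agree, since neither the pack layout nor the
compression level enters the stream that gets hashed. The value itself
depends on the commit timestamp and on the object format of the
repository, so it will differ from run to run.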
