From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Subject: [NOTES 01/11] SHA-256 and interoperability work
Date: Mon, 6 Oct 2025 15:18:02 -0400
Message-ID: <aOQV6iM49QDhcC+C@nand.local>
In-Reply-To: <aOQVeVYY6zadPjln@nand.local>
Topic: SHA256 and interoperability work
Leader: brian
10:15am-10:45am PT
* lot of work to do
* brian is working on it
* it's progressing, not sure if we can get everything done by 3.0
* how to deal with submodules
* you can produce a split history
* accept, document, ?
* we need to have mapping on server or client
* if someone pushed one commit in sha1 and a different one in 256, we can end
up with divergent histories that could produce security issues
* private repos used as open-core type submodules make this difficult
* could have the server query, client derive mapping
* server could also be malicious
* if you're converting, how does that work in gpg signatures?
* we have a way to map both signatures
* if you're in compatibility mode, it will produce signatures for both
* what about for older histories, how can it be verified if it's only valid
for sha1?
* it can be verified but can't be resigned
* for converting, can that work?
* converting will retain the sha1 signature
* what is the simplest user journey?
* I have a clone of a repo in sha1, am I expected to run a conversion locally
and then I can talk to GH in 256 protocol?
* you will create a new 256 repo with sha1 compatibility and clone
into that, which will convert it into both algos
* download the data again?
* clone it to another directory locally
* it will preserve the sha1 repo and create the compatibility layer
* let's say the local one has a submodule, clone locally including the
submodule?
* yes, the conversion script will convert the submodule as well and
you'll have both ids
* if I do a fetch, which do I need
* you need a mapping if you're talking to a server with the other algo
* the mapping is only needed for the server if it wants to be forward
facing?
* with mapping, is it only commits or all objects?
* all objects
* if someone trusts github, they can just consume its mapping?
* the server and client will do their own mapping
* what happens if nobody has the submodule anymore? commit from 10 years ago but
nobody has that submodule anymore, how do you make a 256 tree out of that
* pick one at random, it doesn't matter
* but you can't match everyone else
* we've chosen to use divergent history in this case
* Same issue exists with LFS objects
* if you have the old submodules,
* recursive/cyclic submodules?
* it's something we need to handle, don't have a great plan but it could be
done
* plan is to maybe have some pool
* you have to convert the submodule up until that point, then convert them
piecewise
* have you thought about mix/match where one uses sha1 and the other uses 256
* we can't distinguish the size of the object id vs filename
* right now you're doing the work, are you thinking of allowing another hash
algo without having these issues again?
* the way the design works now is that we have two algos - main and
compatibility, but it's designed to accept multiple algos. if we switch to
SHA3-512 at some point for example, we could add another compat algo - it's
some work but the approach doesn't assume much about the specific algorithm
* steiny thought it could be useful to add a third algo not for security but
speed
* gh has the insecure non-crypto variants
* problem is always client support
* corporate controlled repo often also has control of the clients - so maybe
less of a security issue but depends
* can you put a sha1 link inside a 256 tree
* maybe an extra bit in the mode, some other interesting horrible thoughts
* would it make submodule problems go away if you could just carry the other
forever until the downstream decides to switch
* solves the submodule problem but not LFS problem?
* LFS might be easier, you don't need to have the object to convert yours
* assuming you have the object still
* brian not 100% against it
* if I could do a 256 repo with a 256 submodule, you could parse it back,
but if you do that, it's a different size and not usable by older
versions of git
* if we were clever, sha1 trees hold sha1, 256 holds 256 and only when you
have a sha1 tree inside a 256 would we use some new format
* the problem is you still end up with stuff that doesn't work with older
versions
* degrades gracefully like a mode bit, worst case is that it checks out
weird filenames?
* write it out, take it to the list
* we discussed upgrading the tree object format, but it's so tight
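
To make the mapping and tree-format points above concrete, here is a minimal
Python sketch. It is not Git's implementation. It relies only on two facts
about the object format: an object ID is the hash of "<type> <size>\0<content>",
and a tree entry is "<mode> <name>\0" followed by the raw ID bytes, with mode
160000 marking a submodule link. The helper names (object_id, tree_entry,
parse_tree) and the placeholder submodule commit are assumptions made up for
illustration.

```python
import hashlib


def object_id(obj_type: str, payload: bytes, algo: str) -> bytes:
    """Hash "<type> <size>\\0<payload>" with the given algorithm."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.new(algo, header + payload).digest()


def tree_entry(mode: str, name: str, oid: bytes) -> bytes:
    """Serialize one tree entry: "<mode> <name>\\0" plus the raw ID bytes."""
    return f"{mode} {name}".encode() + b"\0" + oid


def parse_tree(data: bytes, oid_len: int) -> list[tuple[str, str, str]]:
    """Parse tree entries. The caller must already know the ID width,
    because the entry itself does not record it."""
    entries = []
    pos = 0
    while pos < len(data):
        nul = data.index(b"\0", pos)
        mode, name = data[pos:nul].decode().split(" ", 1)
        oid = data[nul + 1 : nul + 1 + oid_len]
        entries.append((mode, name, oid.hex()))
        pos = nul + 1 + oid_len
    return entries


# The same blob has one ID per algorithm; a mapping pairs them up.
blob = b"hello\n"
mapping = {
    object_id("blob", blob, "sha1").hex(): object_id("blob", blob, "sha256").hex()
}

# Placeholder bytes standing in for a submodule commit (hypothetical, not a
# valid commit). A real commit embeds tree and parent IDs, so its content
# differs between the two algorithms as well.
sub_commit = b"placeholder submodule commit\n"
tree_sha1 = tree_entry("160000", "sub", object_id("commit", sub_commit, "sha1"))
tree_sha256 = tree_entry("160000", "sub", object_id("commit", sub_commit, "sha256"))

print(mapping)
print(len(tree_sha1), len(tree_sha256))
print(parse_tree(tree_sha256, oid_len=32))
```

The two serialized trees differ only in the width of the embedded ID (20 versus
32 raw bytes), and nothing in the entry says which width is in use, so a parser
has to be told. The sketch also shows why a missing submodule blocks conversion:
the 256 tree embeds the 256 ID of the submodule commit, which cannot be
computed without that commit's contents.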
2025-10-06 19:16 Notes from the Git Contributor's Summit, 2025 Taylor Blau