* [NOTES 01/11] SHA-256 and interoperability work
From: Taylor Blau @ 2025-10-06 19:18 UTC (permalink / raw)
To: git
Topic: SHA-256 and interoperability work
Leader: brian
10:15am-10:45am PT
* lot of work to do
* brian is working on it
* it's progressing, not sure if we can get everything done by 3.0
* how to deal with submodules
* you can produce a split history
* accept, document, ?
* we need to have mapping on server or client
* if someone pushed one commit in sha1 and a different one in sha256, we can
end up with divergent histories that could produce security issues
* some private repos for open-core-type submodules make this difficult
* could have the server query, client derive mapping
* server could also be malicious
* if you're converting, how does that work with GPG signatures?
* we have a way to map both signatures
* if you're in compatibility mode, it will produce signatures for both
(sketched below)
* what about for older histories, how can it be verified if it's only valid
for sha1?
* it can be verified but can't be resigned
* for converting, can that work?
* converting will retain the sha1 signature
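A hedged illustration of the dual-signature behavior described above,
assuming a repository with a SHA-1 compatibility hash configured. The
gpgsig-sha256 field comes from Git's hash-function-transition design;
whether current releases emit both signatures in compat mode is exactly the
in-progress work:

    $ git commit -S -m "signed change"
    $ git cat-file commit HEAD
    tree ...
    parent ...
    gpgsig -----BEGIN PGP SIGNATURE-----        # over the SHA-1 form
     ...
    gpgsig-sha256 -----BEGIN PGP SIGNATURE----- # over the SHA-256 form
     ...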
* what is the simplest user journey?
* I have a clone of a repo in sha1, am I expected to run a conversion locally
and then I can talk to GH in the 256 protocol?
* you will create a new repo in 256 with sha1 compatibility and clone into
that, which will convert it into both algorithms (sketched below)
* download the data again?
* clone it to another directory locally
* it will preserve the sha1 repo and create the compatibility layer
* let's say the local one has a submodule, clone locally including the
submodule?
* yes, the conversion script will convert the submodule as well and
you'll have both ids
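A minimal sketch of that conversion journey. The flags shown here
(--object-format, extensions.compatObjectFormat, rev-parse
--output-object-format) exist in recent Git, though the compat machinery is
still experimental; whether the fetch step works end-to-end across
algorithms is precisely the interop work under discussion:

    # create a new SHA-256 repo with a SHA-1 compatibility mapping
    git init --object-format=sha256 converted
    git -C converted config extensions.compatObjectFormat sha1

    # re-import history from the existing local SHA-1 clone (path illustrative)
    git -C converted fetch ../original 'refs/heads/*:refs/heads/*'

    # ask for the SHA-1 name of a SHA-256 object via the mapping
    git -C converted rev-parse --output-object-format=sha1 HEAD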
* if I do a fetch, which do I need?
* you need a mapping if you're talking to a server with the other algo
* the mapping is only needed for the server if it wants to be forward
facing?
* with the mapping, is it only commits or all objects?
* all objects
* if someone trusts github, they can just consume its mapping?
* the server and client will do their own mapping
* what happens if nobody has the submodule anymore? a commit from 10 years ago
points at a submodule nobody has anymore, how do you make a 256 tree out of
that?
* pick one at random, it doesn't matter
* but you can't match everyone else
* we've chosen to use divergent history in this case
* Same issue exists with LFS objects
* if you have the old submodules,
* recursive/cyclic submodules?
* it's something we need to handle, don't have a great plan but it could be
done
* plan is to maybe have some pool
* you have to convert the submodule up until that point, then convert them
piecewise
* have you thought about mix/match where one uses sha1 and the other uses 256?
* we can't distinguish the size of the object id vs filename
* right now you're doing the work, are you thinking of allowing another hash
algo without having these issues again?
* the way the design works now is that we have two algos - main and
compatibility - but it is designed to accept multiple algos. if we switch to
SHA3-512 at some point, for example, we could add another compat algo - it's
some work but the approach doesn't assume much about the specific algorithm
* steiny thought it could be useful to add a third algo not for security but
speed
* gh has the insecure non-crypto variants
* problem is always client support
* corporate controlled repo often also has control of the clients - so maybe
less of a security issue but depends
* can you put a sha1 link inside a 256 tree?
* maybe an extra bit in the mode, some other interesting horrible thoughts
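To make the "extra bit in the mode" idea concrete, a purely speculative
ls-tree style view; the first entry is a gitlink as it exists today, the
second mode is invented for illustration and is not an accepted design:

    160000 commit 3f786850e387550fdab836ed7e6dc881de23001b	sub  # today: gitlink in the repo-native hash
    160010 commit 3f786850e387550fdab836ed7e6dc881de23001b	sub  # hypothetical: SHA-1 gitlink inside a SHA-256 tree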
* would it make submodule problems go away if you could just carry the other
hash forever, until the downstream decides to switch?
* solves the submodule problem but not LFS problem?
* LFS might be easier, you don't need to have the object to convert yours
* assuming you have the object still
* brian not 100% against it
* if I could do a 256 repo with a 256 submodule, you could parse it back,
but if you do that, it's a different size and not usable by older
versions of git
* if we were clever, sha1 trees hold sha1, 256 holds 256, and only when you
have a sha1 tree inside a 256 tree would we use some new format
* the problem is you still end up with stuff that doesn't work with older
versions
* degrades gracefully like a mode bit, worst case is that it checks out
weird filenames?
* write it out, take it to the list
* we discussed upgrading the tree object format, but it's so tight
* [NOTES 02/11] First-class conflicts in Git?
From: Taylor Blau @ 2025-10-06 19:18 UTC (permalink / raw)
To: git
Topic: First-class conflicts
Leader: Martin Z
10:50am-11:15am PT
* how interested is Git in adopting first-class conflicts?
* can rebase descendants easily
* maybe we can use it only internally during rebasing merge commits?
* could mean you don't have to do rebase --continue etc if we expose it to users
in the future. is it appealing?
* taylor: what's the goal of having first-class conflicts in git? do we want to
enable certain jj-like workflows or is there another reason?
* elijah: would like first-class conflicts so i can save context while editing
changes in a stack, to handle later or hand off conflict resolution to
collaborators.
* really helpful to be able to divide and conquer when dealing with a massive
merge conflict
* so we want to be able to publish conflicts to the server for exchange? ->
eventually, yes
* first-class conflict means a separately stored commit header that is
understood more deeply by git (e.g. by fsck). jj uses a special header on
trees + OS-hidden dirs; it's a convention to store conflicts into .left/
.right/ etc., with some human-readable warnings in the commit that stores
the conflicts.
* then the client refuses to push the special header
* jj puts it into this special tree because the conflict needs to not get
GC'd; if Git learned how to not GC those conflict objects on its own, jj and
GitButler would have less magic to do
* should the conflict objects really live forever?
* They should live as long as the commit referencing them lives
* what about adding a non-tree object for the conflict markers? e.g. add it to
the commit header as a conflict object instead
* having the conflicts in the commit header is nice because you don't have to
walk the whole commit tree to find whether there are conflicts
* the commit object just then starts having 3+ trees instead of 1 tree
(hypothetical sketch below)
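A purely hypothetical sketch of a conflicted commit carrying extra trees in
its header; none of these field names are settled, they only illustrate the
"3+ trees" shape (and the "try me again" idea from later in this session):

    $ git cat-file commit <conflicted>
    tree 8f93e1b...           # best-effort automerged tree
    conflict-base 27cc224...  # hypothetical: tree of the merge base
    conflict-side 41b1ec1...  # hypothetical: tree of the other side
    parent aaa...
    parent bbb...

    Merge of aaa and bbb did not fully resolve; try me again.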
* brian: what about a special tree object with file mode that indicates it's a
weird special thing, then treat that specially in clients that are aware of
what it's for?
* does this fit into the way we might extend the tree for storing weird
gitlink sha256 things also?
* what does it look like for merge commits? "i meant to merge commits aaa and
bbb and it didn't work, try me again"
* then just hang onto that conflicted state, other tools could resolve it, or
a rebase later could resolve it
* partial resolution - apply as much as possible, then only write down the
still-unresolved parts
* how to keep people from submitting conflicts? conflicts as a first-class
object makes it easier to prevent (or to render correctly on the client if it
was submitted)
* would including this in git make so many git commands obsolete?
* elijah already working on dropping rebase and starting over with replay
* new commands means we can also make the UX not suck this time around
* patrick: same thing for git history
* junio: clapping emoji 🙂
* any concerns with first-class conflicts?
* is it possible to commit those in history and work on top of them
incrementally, so subsequent commits fix only part of the original first-class
conflicts?
* that's how jj works already
* with binary files it's hard to do conflict markers, that's not an issue
inherent to the conflict marker storage method though
* in jj we stick conflict markers inside the binary. it's... not great...
* for many-sided conflicts we use more types of conflict markers, even on
binaries
* iteratively removing one side of the multi-side conflict until there's
only a simple conflict or no conflict at all
* this requires you to have an appropriate merge tool to resolve binary
conflicts 🙂
* sounds like no broad opposition
* should we aim for 3.0?
* is it possible for people using git without the first-class conflicts to
keep using it the same way, if the git binary supports it?
* not having these conflict objects be pushable makes this much easier
* could mean that initially we can't mail those conflicts around and we
use a major release to make it possible to ship them
* patrick: please be careful putting too many things onto 3.0 gating, so we
can actually finish 3.0 🙂 should we stick to things that are already ready
or at least underway?
* taylor: i think it depends on if we think it can land in the next <10
months
* local-only means it probably doesn't need to be behind a major/breaking
release
* people can share conflicts via continuous sync (not through git protocol)
in the meantime, with other tooling
* would be nice to get branch-level acceptance of conflict objects from the
server side
* helps to understand what the target format should be. if we did it, what would
it look like? then we can start working on it
* on the list let's figure out what it should look like, and then we can
start working on it but not in a breaking way. then we could start to
notice places where it breaks old commands
* but how do we know the format is right before we start developing tooling
against it?
* is the object format a reversible decision?
* maybe we can depend on jj / git butler having forged the path already a
bit
* interesting to think about how tools like jj and git butler would ideally
want to store conflicts if they didn't have to worry about wedging it into
git's current formats
* is the path to getting first-class to start by wedging it into git? that
seems to be what jj and git butler have been doing already; are we ready
to move into git first class?
* "first-class" is in the eye of the client, so we're talking about the
way to make them first class to git, not first class to wrappers (who
already know how to do their own first class thing)
* needs to store the tree and not gc it, any other reqs?
* how does the plan work?
* lock in data format, then only git replay can work with it, everybody else
ignores it?
* would be very difficult to teach rebase/cherrypick to understand these
without breaking for people who use git the way they do now
* or could put a flag to fork rebase into a different handler if it sees a
conflict object
* cherry-pick already has(?) a replay mode (or maybe just in elijah's tree)
* scripting support becomes weird if you're using config flags to change
behavior of porcelain that already exists
* [NOTES 03/11] The future of history rewriting - rebase, replay and history (+Change-IDs)
From: Taylor Blau @ 2025-10-06 19:18 UTC (permalink / raw)
To: git
Topic: The future of history rewriting
Leader: Phillip Wood
11:30am-12:00pm PT
* Number of methods of history rewriting
* What do we want the future UI and operations to look like, and how do we
make them easy?
* Want good commit histories
* JJ is always in Elijah's edit-mode demo state: always an interactive rebase.
* Always easy to rewrite commits.
* Always rebase descendants
* Use dedicated commands instead of verbs on the rebase command
* Git would have a hard time adopting the 'always rebasing' model
* For new users, git rebase is too complicated for simple use cases.
* Top level commands to easily do common operations
* 'Git history'?
* Git history vs git replay
* Replay is plumbing, used by servers.
* History is porcelain, used by users (hypothetical usage sketched below).
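The split sketched above, in command form. git replay exists today as
worktree-less plumbing; "git history" does not exist yet, so that line is
hypothetical:

    # plumbing: replay a range onto a new base without touching the
    # worktree, emitting ref updates suitable for update-ref --stdin
    git replay --onto main origin/main..topic | git update-ref --stdin

    # porcelain (hypothetical): a friendlier verb for a common operation
    git history reword HEAD~2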
* Elijah - Building on rebase depends on sequencer
* Underpinnings more important than naming
* Lots of backwards compatibility assumptions
* Hooks and dependencies are pervasive, hard to clean up.
* When update-refs was added, it broke hooks
* Don't want to keep doing that.
* Wants to move beyond sequencer (git history uses sequencer)
* Sequencer frequently updates the work tree, not desirable
* Git history can move to use Git replay
* A cleaner version of git history would be nice so others can try it out
* Git replay is lacking features needed for git history
* Try landing an experimental version with sequencer underpinnings
* No promise of compatibility
* Phillip - users noticed when he broke Sequencer Hooks
* Disable hooks with flags?
* Way forward - land UI then iterate on underpinnings?
* Sequencer depends on shell-parseable state files
* Lots to clean up
* Minh - does this help solve the problem of the server rewriting history
(i.e. force push), leaving clients with incompatible forks?
* Out of scope
* Maybe change id is the more relevant conversation
* Conversation ended up on ChangeID
* Change ID loses predecessor tracking, which is more precise
* Hard to propagate without Mercurial style logs
* Mercurial predecessor graphs are independent of commits
* Change IDs would also help with first class conflicts
* Finding range-diffs is cheaper
* Range-diffs used fairly widely
* Git, rust, most mailing list flows
* Change IDs useful for tracking across repos, bugs, etc.
* Why are change IDs stalled in the mailing list?
* Disagreement on tracking predecessors
* Requires a protocol change
* Sending predecessors over protocol has lots of implications
* Gitster - disagreement on what it means to be a predecessor
* Parent? Cherrypick?
* Brian - changeId should be deterministic. Reject non-well-formed ids
* Workflows rely on repetition
* ChangeIds should be optional, disableable.
* May track too much information unintentionally across commits, projects.
* Gitster - needs to be possible to expose changeId, predecessor without
exposing private information about private repos.
* ChangeID exposes less than predecessors do
* JJ can't access predecessors from ChangeID
* When rewriting commits, maybe we don't want the predecessor to be
viewable (eg secret keys)
* JJ can bump changeId when rewriting
* Gerrit keeps ChangeID in commit body
* Rebase and Cherrypick don't support arbitrary key:value pairs in commit
body
* ChangeID should propagate to be useful
* Eg across mailing list
* Can Git more generally and globally support headers in the commit?
* ChangeID should be more 1st class than other headers.
* Hard for client to tell when a ChangeID should change.
* Recent JJ commits were pushed with ChangeIDs
* Colleague branched off. Rewriting ids would have been useful.
* Squashes, amends etc lead to ambiguity about which ChangeID to keep.
* JJ keeps the parent.
* Gitster thinks it would be nicer for ChangeIDs to be kept even when
there are 2.
* When commits split, the children get 2 new ChangeIDs instead of keeping
old one.
* [NOTES 04/11] Rust
From: Taylor Blau @ 2025-10-06 19:18 UTC (permalink / raw)
To: git
Topic: Rust
Leader: Patrick Steinhardt
1:05pm-1:30pm PT
* Recurring topic from past years, but sparked again by Ezekiel's contributions
on xdiff
* We're favorable towards it, but we haven't previously agreed on a timeline
* Platforms that don't have Rust support: NonStop, Alpha, Cygwin, and some
others brought up by Gentoo
* Patrick has a series up to let us provide notification to users that Git will
start depending on Rust
* Led to lots of discussion both on the mailing list and outside, which had the
good effect of making more people aware of the upcoming change
* Ezekiel is trying to pass some of the blame to a big brother - he's happy to
take it ;-)
* Ezekiel is more interested in the technical details than the policy details,
though we need the policy details figured out
* Having Rust be optional leads to code being written twice and increases the
maintenance burden; making Rust support mandatory is needed to avoid that
* brian wrote sha256 interop code in Rust
* Would be nice to hand over maintenance for some kind of (Rust-optional) LTS
release to someone else in the community
* We have lots of global state that we need to get rid of, and lots of other
cleanup
* Long term goal may be to eventually replace all of C, though it's not clear if
we should take that whole goal or just start with pieces that make sense.
Also, we've got a learning process ahead of us, so our goalposts may need to
change as we learn.
* Rust might be helpful for libification reasons, but tying libification to an
already big change might make it too big
* Rust rewrite could mean implementing new subcommands (as discussed earlier) in
Rust instead of rewriting bug-for-bug existing code
* There is lots of updating that can be done before switching to Rust, e.g.
switching to unambiguous types
* Rust can be used to replace things at an individual function level
* Just rewriting in Rust doesn't turn the existing system into nice abstraction
boundaries or reusable modules. We have existing efforts to try to clean
those up in various ways, e.g. the pluggable object store work.
* Rust makes unit tests much easier and ergonomic, and starting by writing tests
of existing C code makes a lot of sense as a way to begin a migration.
* Large organizations and governments are going to start pushing for people to
move away from C for security reasons.
* Major reason(s) to adopt Rust
* Threading
* Error propagation
* Difficult to know who owns what in C - Rust improves maintainability
* Attracting more contributors (it's the most popular according to
StackOverflow)
* [NOTES 05/11] Pluggable object databases
From: Taylor Blau @ 2025-10-06 19:19 UTC (permalink / raw)
To: git
Topic: Pluggable object databases
Leader: Patrick Steinhardt
* Work towards this has been ongoing since Git 2.50.
* Allow innovation on the server side for large binaries.
* The design will soon be up for discussion.
* Allow migration between different object formats, and allow the format to be
picked later by the implementer.
* The planned work is to make the new db more pluggable; right now the work is
still about refactoring. 2.53 will have a proof of concept. Might take until
the second half of 2026 to be done.
* Blocker 1: the new db format is still not clear, particularly latency/perf
related issues.
* Might use content-defined chunking and hashing; might use an existing db
implementation like Cassandra.
* Blocker 2: how to generate the packfile.
* Taylor wonders whether we can reuse the current object db, but Patrick
thinks the current impl is too large/complex to adapt. The current refactoring
effort with better abstractions might speed up future changes.
* Gitster wonders whether we can just use the hash of the chunks' hashes (toy
sketch below).
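A toy shell illustration (not Git's design) of the "hash of the chunks'
hashes" idea, with fixed-size chunks standing in for content-defined ones:

    split -b 1M big.bin chunk.
    sha256sum chunk.* | awk '{print $1}' | sha256sum  # outer hash over the ordered chunk hashes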
* Taylor also thinks a new obj db might become just as complex.
* Patrick thinks the new obj db can be more maintainable. Starting off with a
brand new abstraction allows faster iteration.
* Rewriting the obj db in a new world might be challenging because the pack
format is so intimately tied to so much usage and optimization (e.g. bitmaps),
plus the need to identify big binary objects over the wire.
* Taylor thinks maybe we don't need to rewrite pack obj, but abstracting the
packfile could make it worse and more verbose.
* Patrick mentions there are already many other adjacent projects that
abstract away from the pack format, e.g. jgit, libgit2. JGit already
identified early on that Cassandra's perf would never work due to latency
overhead.
* Taylor suggests we identify a proof of concept with comparable latency to
existing obj db before doing additional refactoring.
* Ezekiel is refocusing the discussion on targeting large binary files. Maybe
with large binary files, latency degradation is not as important.
* In git we already have a divergent code path for large binary files; we just
chose to store them in the packfile. Technically people could change the
storage selection without refactoring.
* Patrick still thinks having sub-system abstraction would make code more
maintainable.
* Taylor is supportive of having some objects use the current db while only
the large binary files use the new db; at least we don't impose the overhead
on all objects.
* The obj chunk design Patrick is proposing is meant to benefit both
client-side and server-side storage.
* We should resume this discussion with more concrete usage, right now we are
still talking about potential scenarios.
* The promisor feature on the server side cannot satisfy all clients, since
some clients don't want to use promisors, so the server side might still be
expected to have the large binary files on disk.
* Packfile-uris might still be the main direction we can use to fix the large
binary issue without exploding objects into chunks.
* Another benefit of obj chunking is reducing hash time for large binary
files. Gerrit currently sees 50% of clone time spent on hashing. Parallel
hashing is also possible with obj chunking (toy sketch below).
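Chunking makes the hashing embarrassingly parallel, which is the clone-time
win mentioned above; a toy sketch reusing the chunks from the earlier
example:

    printf '%s\n' chunk.* | xargs -P8 -n1 sha256sum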
* [NOTES 06/11] Repository maintenance long-term goals
From: Taylor Blau @ 2025-10-06 19:19 UTC (permalink / raw)
To: git
Topic: Repository maintenance long-term goals
Leader: Taylor Blau
* Taylor's talk ran short towards the end; this session could expand on that
future work.
* Constant repacking into a single pack was historically the major problem.
* Doing that less often (because of geometric repacking) helps, but it's still
a potential issue when it does occur. Gets them 98% of the way (the geometric
strategy as exposed today is shown below).
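The geometric strategy being discussed, as Git already exposes it:

    # merge packs so that pack sizes form a geometric progression
    # (factor 2), and write a multi-pack index covering the result
    git repack --geometric=2 -d --write-midx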
* Future items were geometric reachability, ?, best effort gc
* Previously, geometric repacking used to accumulate loose objects too. 6
months ago they changed to an approach where the big cruft pack could be
excluded from the midx.
* Challenge would be to do a full complete repack without rewriting all of the
midx chain.
* Because bitmap is tied to object order in a pack, need something like
tombstones to not break the bitmaps. Need the tombstone to know that we don't
have the data.
* Unitary midx idea - Taylor designed the chained midx before he figured out
the repacking strategy. MIDX and pack index duplicate the data; no reason to
de-dup other than space savings. Could even skip having the idx, but plenty of
old git versions can't read the midx.
* brian - there may be other implementations, such as git lfs, that don't use
the midx, and the object id mappings in pack idx v3 aren't supported in the
midx either.
* Nothing preventing you from having two parallel repacks, one that's geometric
and one that's trying to do an all-into-one.
* [NOTES 07/11] Change-ID Header in Git
From: Taylor Blau @ 2025-10-06 19:19 UTC (permalink / raw)
To: git
Topic: Change-ID Header in Git
Leader: Philip Metzger
* How do we store the Change-ID? Store it in a header? Some auxiliary metadata
store?
* Happens to work in a header for GitHub because they survive rebases since
GitHub uses replay, not all forges do this.
* Want a standard interoperable way to associate Change-IDs with commits.
* Storage discussion has largely been covered; one possible on-disk shape is
sketched below.
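One possible on-disk shape, purely illustrative: a dedicated commit header
field. The header name and value format are not settled upstream (Gerrit
today uses a Change-Id: trailer in the message body instead, and the value
here is a made-up jj-style identifier):

    $ git cat-file commit HEAD
    tree 83baae6...
    parent 7d1a0b0...
    author A U Thor <author@example.com> 1759780000 -0700
    committer A U Thor <author@example.com> 1759780000 -0700
    change-id kqxpzvwtlsyznnrxnpso    # hypothetical header

    commit subject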
* Taylor: what's less clear to me is the semantics of when we keep Change-IDs
across operations, when we assign new ones.
* Cherry-picking and equivalents assign a new Change-ID
* Almost everything else retains that Change-ID
* Taylor: we need to agree on the storage, but not necessarily on the semantics
of when we keep versus assign new Change-IDs.
* Caleb: Assigning a new Change-ID when cherry-picking is interesting, since we
(GitButler) retain those.
* Philip: Gerrit does the same thing, but JJ does something differently. Their
approach was to have an optional header that describes the "origin" (in some
sense) of the commit.
* Caleb: I wonder if the semantics are important if we are trying to use these
in the same sandbox?
* Taylor: we need to understand and agree on them when we are working on the
same repository (regardless of using the same tool), but not in general at
the tool level.
* What's the next step?
* Martin: experiment with it, see if we like the semantics. Don't want to
emphasize the divergence table.
* Taylor: do we need a version associated with the change-id? Philip: no, we
treat it as an opaque identifier, versioning not necessary.
* Elijah: given that multiple players want this and have agreed on a common way
to represent it, maybe we'll have a more productive discussion on it in a year
after they've experienced working with that header for a year
* Jonathan: does it matter what forges do with automatic squash/rebase?
* Philip: for JJ we don't want to use that information, but we're just
another Git client in the ecosystem, so that's just our perspective.
* Martin: Should there be agreement on the semantics?
* Elijah: depends on the usage.
* Elijah: semantics get fuzzy because of splitting and merging, so it's not
clear what to do there. We need to clarify it eventually, but probably not
here.
* [NOTES 08/11] Resumable fetch / push
From: Taylor Blau @ 2025-10-06 19:20 UTC (permalink / raw)
To: git
Topic: Resumable fetch/push
Leader: Caleb (was Scott, but he's not here)
* Is this only client side or server side too?
* Applies to both as GitButler has a forge too. Would be nice to have protocol
improvements.
* Both bundle-uris and packfile-uris exist and at least packfile-uris are
resumable. Both are fetch-only, so push is unsolved.
* Could use single-threaded output or server-side caching to make pushing work.
* Maybe make it so servers could receive a bundle and make that resumable.
* Use cases: Pushing a repo for the first time to a new server, once there's
good large file support, android/chromium. Also a problem that's independent
of size in environments with poor connectivity (some countries, Caltrain, …).
* Servers could hand out some kind of opaque data with the fetch to indicate
what it has cached, clients can re-share that when attempting to resume and
the server can choose to do something with it or not.
* GitHub support has told people to create a branch with N commits at a time to
fetch.
Scrambly notes (Jack's notes):
* Specific Forge implementation, http based communication -> easier to set up;
keen on improvements to the protocol that allow large packfiles to be sent
between client and server
* For packfile-uris, at least the packfile part that is in the uri is already
resumable; for bundle-uris it may not be the same. Might be low-hanging fruit.
* Taylor: push side more interesting: need some way for the server to say "I
already have the first m bytes of x" so the client can resume the push
* Consider implications as an attack vector
* Brian: git's pack implementation is deterministic if you don't do
multithreading. Could use a resumable mode the way gzip has an --rsyncable
mode. For the client side, pack to a temporary file; that is resumable with an
offset, and since the pack is cached locally it should be something you could
resume a push with. Some possibilities if we cache on the server side or use
single-threaded output (sketch below).
* an idea from packfile-uris which could help solve the fetch problem: the
server provides a url to the client; let the server be the fetcher
* Emily: that would work pretty OK using a commit cloud server that is already
serving those objects. The server side can resume as necessary.
* Servers don't receive bundles, so this would mean adding support for servers
to receive bundles. What's the real use case for this? It's worth its own
protocol, not just a push protocol. When we try to mirror things in Gerrit it
fails due to the large number of refs - we would need an enhancement to handle
large numbers of refs.
* Caleb: So you suggest some sort of TCP protocol for handling these transfers?
* We have users with stored binaries that time out uploading to the server;
it's not just a migration path
* Having some way of guaranteeing forward progress on a push or a pull as long
as you can get some smaller unit of data transfer, don't know how small to go,
but would be very useful
* We talked about chunk format before, would introducing chunk format, small
enough chunks help?
* If it's small enough and reproducible
* Elijah: Even if you have small chunks, if they are part of the same
communication you'll still need to restart it
* If you have to resume now, say you have sent X chunks, then you have N - X
left
* Peff: All you need to know is the byte offset.
* Elijah: Take the objects that you have received and say "I have these objects"
* What if you hash what you got, "I asked for this", the hash was this length,
give me the rest
* Peff: Has to be able to regenerate everything from scratch - or are you
caching it? Kind of wasteful
* Doesn't need to be cached, just needs to be stable, so if there was a way to
ask for it in a specific order
* Disable multithreading
* Peff: Looked into this with resumable clones. The server can hand out some
cache tag: here's an opaque tag that may or may not be valid in the future;
"I got X bytes of this tag, can you send the rest?". Becomes a heuristic on
the server - "I'll choose how much to cache" - and git doesn't need to know
about that, it's an implementation issue (sketched below)
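Peff's cache-tag idea reduces to standard HTTP resumption once the server
promises stability for a tag. packfile-uris are plain HTTP(S) today, so the
fetch half can already look like this (URL illustrative):

    # first attempt died after 1 MiB; ask for the rest of the same file
    curl -C 1048576 -o pack.part https://cdn.example.com/pack-1234.pack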
* With a packfile-uri you stop the in-protocol conversation with the server
and fetch the pack out of band
* If you were trying to brute force it today, you would brute force sending a
ref
* Peff: GitHub support has told people to do that
* [NOTES 09/11] Git 3.0
From: Taylor Blau @ 2025-10-06 19:20 UTC (permalink / raw)
To: git
Topic: Git 3.0
Leader: Patrick Steinhardt
* Any questions?
* Emily: Patrick, you had proposed the end of next year as the cut date, are
people happy with that?
* Taylor: we've been using that internally as a benchmark for when we need to
deliver SHA-256. If Git 3.0 came later and we had more time, certainly
wouldn't complain, but also wouldn't ask the project to push it back purely on
that basis.
* Caleb: we need community support for SHA-256.
* Emily: feels like everybody is playing chicken.
* Taylor: ultimately the users need to tell us.
* Patrick: is there work going on in GitOxide?
* Caleb: nobody is asking for it from GitButler's perspective
* Elijah: if we don't push a date out, nobody will ask for it.
* brian: the cost of creating a SHA-1 collision is roughly $10k USD. I don't
want to spend my bonus check on it, but someone could do it and spam us with
alerts.
* Taylor: sure, but we could just silence those alerts. Also, who would spend
$10k on this? ;-)
* brian: fair, though not all implementations are using SHA1-DC?
* brian: we should include this in Git 3.0, and we should set a hard date for
it. We should plan the interop work around that, but can't guarantee that it
will land by then.
* Martin: what's in scope for Git 3.0?
* Elijah: SHA-256 (and maybe interop) is the main thing, some deprecations
* Patrick: we have a BreakingChanges that lists what we want to remove.
Default reference backend is going to become reftable.
* Taylor: we should be doing brown-outs for deprecated features
* Elijah: we should delay for interop
* Peff: how important is interop really? What is the use-case?
* Elijah: will forges actually support SHA-256 once we enable it? Otherwise we
have people create SHA-256 repos and then they can't push them anywhere.
* Peff: how do we push forges versus not?
* Peff: When we release Git 3.0 should not depend on whether or not interop
works, but whether or not real-world forges and plugins support SHA-256
* brian: smaller forges aren't there yet and won't undertake it until it's in
3.0
* Taylor: sure, but that's not the vast majority of users. Ultimately there
are always going to be some stragglers. The reality is that what "we" consider
to be Git and what the rest of the world considers to be Git are not the same
thing. So if we release without good support on the forge side, users will be
mad at us.
* Let's figure it out on the list?
* brian: I'll start that off.
* [NOTES 10/11] How can companies respectfully engage contractors to work on Git?
From: Taylor Blau @ 2025-10-06 19:20 UTC (permalink / raw)
To: git
Topic: How can companies respectfully engage contractors to work on Git?
Leader: Emily Shaffer
* Google hired Collabora to work on patches on the list
* Should they be doing something specific to indicate they're pursuing these
patches on behalf of someone else?
* Taylor: So long as they understand there's no obligation from the project to
accept the work
* Having a short note in the cover letter to indicate who is sponsoring the
work (if it's not already obvious) would help. Mention during review if you
think there's a conflict of interest.
* [NOTES 11/11] Conservancy 2025 updates
From: Taylor Blau @ 2025-10-06 19:20 UTC (permalink / raw)
To: git
Topic: Conservancy 2025 updates
Leader: Taylor Blau
* More trademark requests than typical this year
* Asked Perforce to stop using a logo very similar to the Git logo
* Git holds a fairly restrictive trademark policy, but often doesn't enforce it.
Some risk the trademark office could flag that.
* Git project has a significant amount of money that could be spent ($100k?).
* Emily: Could sponsor git-related projects (ex: gitoxide)
* Outreachy costs money per-intern
* Not guaranteed that GitHub or GitLab would always be able to sponsor all
the interns the Git project desires. Could use $ for this. Also depends on
the future of Outreachy.
* Git ambassador program, with stipends?
* Needs someone with interest and skills to organize