* How does git track history overwrites?
From: Jens Tröger @ 2026-05-24 23:41 UTC
To: git

Hello,

I’m looking for details and some clarification on a `git fetch` behavior I observed, but can’t quite explain. More context is in this GitHub comment:

https://github.com/jenstroeger/python-package-template/pull/1190#discussion_r3288253713

but it boils down to this:

/tmp/bla > git -c protocol.version=2 fetch origin dda8db18cfc68df532abf33b185ecd12d5b7b326 --depth=1

It seems that sha dda8db1 (tag 1.20.0 previously pointed at it) was replaced due to a suspected history overwrite with fda7769 (tag 1.20.0 now points at it) and git figures that out:

...

From https://github.com/adamchainz/blacken-docs
* branch dda8db18cfc68df532abf33b185ecd12d5b7b326 -> FETCH_HEAD

And then:

/tmp/bla > git checkout FETCH_HEAD
Note: switching to 'FETCH_HEAD'

...

HEAD is now at fda7769 Version 1.20.0

And:

/tmp/bla > cat .git/HEAD
fda77690955e9b63c6687d8806bafd56a526e45f
/tmp/bla > cat .git/FETCH_HEAD
dda8db18cfc68df532abf33b185ecd12d5b7b326 'dda8db18cfc68df532abf33b185ecd12d5b7b326' of https://github.com/adamchainz/blacken-docs

I’d like to understand the details some more, and how I could manually make that connection?

Thank you!
Jens
* Re: How does git track history overwrites?
From: Chris Torek @ 2026-05-25 3:46 UTC
To: Jens Tröger; +Cc: git

On Sun, May 24, 2026 at 4:44 PM Jens Tröger <jens.troeger@light-speed.de> wrote:
> I’m looking for details and some clarification on a `git fetch` behavior I observed, but can’t quite explain. ...

This isn't really specific to "git fetch" at all, except for the usage of FETCH_HEAD. To really understand this properly, we need to understand the root of a seeming contradiction:

1. Once saved in Git, no commit (in fact, no internal object of any sort) can ever be changed.
2. And yet, "git rebase" and force-push operations seem to rewrite history.

How can commits be immutable and yet rewrite-able? The trick here lies in how we (humans) *find* commits.

Inside a Git repository, the "true name" of any commit (or indeed any internal object) is its raw hash ID, such as your example of dda8db18cfc68df532abf33b185ecd12d5b7b326. The hash ID (or "object ID", though right now there are only two forms, a SHA1 hash or a SHA256 hash) is specific to that one object once it is created, and forever more can never be used for any other object. It will always mean that original object, as long as that object exists. Thus, as long as that commit exists, it's *that* commit, with *that* ID, and no other.

But we (humans) don't *use* hash IDs. They're too cumbersome. So Git provides us with the ability to translate a name to an ID:

> It seems that sha dda8db1 (tag 1.20.0 previously pointed at it)

The *name* refs/tags/1.20.0 used to produce the above ID.

> was replaced ... with fda7769 (tag 1.20.0 now points at it)

Some human directed Git to forcibly replace the hash ID associated with the tag, in some repository or repositories.
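The name-to-ID mapping described above can be watched directly. What follows is a minimal sketch, assuming a throwaway repository in a temporary directory and a made-up tag name `v1`; neither comes from the thread, and it says nothing about what actually happened to the 1.20.0 tag.

```shell
#!/bin/sh
# Sketch only: a scratch repository and a hypothetical tag name "v1".
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Supply an identity here so the sketch does not depend on global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m "first"
git tag v1                          # the name refs/tags/v1 now maps to a hash ID
old=$(git rev-parse refs/tags/v1)

git commit -q --allow-empty -m "second"
git tag -f v1 >/dev/null            # forcibly re-point the same name
new=$(git rev-parse refs/tags/v1)

echo "v1 used to name: $old"
echo "v1 now names:    $new"

# The object behind the old ID is untouched; only the name moved.
git cat-file -t "$old"
git log -1 --format=%s "$old"
```

`git rev-parse` turns a name into a hash ID, and `git cat-file -t` reports what kind of object an ID refers to, so the same two commands can be pointed at any ID or name in a real clone.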
(As the manuals note, this kind of forcible replacement of tags is often a bad idea. It's usually better, once the tag has escaped the confinement of a single repository anyway, to just admit that you goofed up and make a new tag.)

If you use raw hash IDs, you can never be bitten by this kind of tag replacement, but of course that's a bad idea for different (and presumably obvious) reasons. I couldn't possibly name the hash ID without using cut-and-paste here. I can *type* "1.20.0" repeatedly without error though.

(There are additional considerations, having to do with how Git cleans up unwanted leftover junk, via git gc / git maintenance. In particular Git uses the human-readable names to figure out which objects are useful, and which are unwanted junk. So you have to identify *some* commits with names, or they'll eventually get garbage-collected.)

[At this point, you ran git fetch with a raw hash ID, and:]

> From https://github.com/adamchainz/blacken-docs
> * branch dda8db18cfc68df532abf33b185ecd12d5b7b326 -> FETCH_HEAD

When git fetch obtains something from another Git repository, the new things have the same IDs in both repositories. Normally we do this by *name* (branch or tag name), but for historical reasons, the fetch operation deposits a hash ID (often along with additional information) in the file `.git/FETCH_HEAD`. This file then works as a pseudo-name for the branch, tag, or commit(s) thus obtained:

> And then:
>
> /tmp/bla > git checkout FETCH_HEAD
> Note: switching to 'FETCH_HEAD'

This gives you a "detached HEAD" state, using the hash ID stored in .git/FETCH_HEAD. That hash ID will be overwritten (thus lost) by the *next* git fetch, so you're expected to save it in some more-permanent name if you want it to stick around.
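The point about FETCH_HEAD being overwritten can be tried out with two scratch repositories. This is a sketch under assumptions: the directory names `upstream` and `local` and the tag name `saved-one` are invented for illustration, and a tag is just one possible "more-permanent name" (a branch would do as well).

```shell
#!/bin/sh
# Sketch only: "upstream", "local" and "saved-one" are hypothetical names.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream
git -C upstream commit -q --allow-empty -m "one"
first=$(git -C upstream rev-parse HEAD)

git init -q local
cd local
git fetch -q ../upstream HEAD       # records the fetched ID in .git/FETCH_HEAD
git tag saved-one FETCH_HEAD        # keep that ID under a lasting name

git -C ../upstream commit -q --allow-empty -m "two"
git fetch -q ../upstream HEAD       # FETCH_HEAD is rewritten by this fetch

echo "FETCH_HEAD: $(git rev-parse FETCH_HEAD)"
echo "saved-one:  $(git rev-parse saved-one)"
```

After the second fetch, FETCH_HEAD resolves to the newer commit, while `saved-one` still resolves to the first one.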
The key difference between a branch name and a tag name is that branch names are *expected* to map to different hash IDs over time, with updates adding new commits to the branch causing the branch name to remember the latest commit's ID. Each commit in turn remembers the IDs of its parent commit or commits, so knowing the *last* one suffices to allow Git to find *every* one.

Rewriting history with rebase consists of copying old (presumably bad) commits to new (presumably good/better) ones, whose backwards links to each previous commit chain through the new-and-improved commits until you reach the point where the rewrite joins existing history. Then we update the branch name to remember the latest of the new-and-improved commits, and it *seems* that we've changed history. The old history is still in there, and will stick around for quite a while (at least a month by default, in standard clones) "just in case".

Tag names are not supposed to move, and whether someone else's tag update to their clone changes your own clone's tags is something you can control to some extent. It's not a good idea to depend on other people's clones to follow tag changes, but it's also not a good idea to depend on your own or other people's clones *not* to follow such changes, since both behaviors are possible.

Chris
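The "copy, then move the name" mechanics described for rebase can be seen on a small scale. The sketch below uses `git commit --amend` rather than `git rebase`, as the smallest rewrite available (it replaces a single commit); the scratch repository and commit messages are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: --amend stands in for rebase as a one-commit rewrite.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "mesage with typo"
before=$(git rev-parse HEAD)

# This creates a new commit and moves the branch name; the old commit is not edited.
git commit -q --amend --allow-empty -m "message without typo"
after=$(git rev-parse HEAD)

git log --format='%h %s'                  # the branch now shows only the replacement
git log -1 --format='%h %s' "$before"     # the original is still reachable by its ID
git rev-parse 'HEAD@{1}'                  # and the reflog still remembers it
```

Both the old and the new commit name the same parent, which is the point where the rewritten chain rejoins existing history.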
* Re: How does git track history overwrites?
From: Junio C Hamano @ 2026-05-25 6:51 UTC
To: Jens Tröger; +Cc: git

Jens Tröger <jens.troeger@light-speed.de> writes:

> Hello,
>
> I’m looking for details and some clarification on a `git fetch` behavior I observed, but can’t quite explain. More context is in this Github comment:
>
> https://github.com/jenstroeger/python-package-template/pull/1190#discussion_r3288253713
>
> but it boils down to this:
>
> /tmp/bla > git -c protocol.version=2 fetch origin dda8db18cfc68df532abf33b185ecd12d5b7b326 --depth=1
>
> It seems that sha dda8db1 (tag 1.20.0 previously pointed at it) was replaced due to a suspected history overwrite with fda7769 (tag 1.20.0 now points at it) and git figures that out:
>
> ...
>
> From https://github.com/adamchainz/blacken-docs
> * branch dda8db18cfc68df532abf33b185ecd12d5b7b326 -> FETCH_HEAD
>
> And then:
>
> /tmp/bla > git checkout FETCH_HEAD
> Note: switching to 'FETCH_HEAD'
>
> ...
>
> HEAD is now at fda7769 Version 1.20.0
>
> And:
>
> /tmp/bla > cat .git/HEAD
> fda77690955e9b63c6687d8806bafd56a526e45f
> /tmp/bla > cat .git/FETCH_HEAD
> dda8db18cfc68df532abf33b185ecd12d5b7b326 'dda8db18cfc68df532abf33b185ecd12d5b7b326' of https://github.com/adamchainz/blacken-docs
>
> I’d like to understand the details some more, and how I could manually make that connection?

Where does this line in your discussion page at GitHub (which is omitted from the post to this list) come from?

commit fda77690955e9b63c6687d8806bafd56a526e45f (grafted, HEAD)

Are you doing anything funky with .git/info/grafts by any chance?
* Re: How does git track history overwrites?
From: Jens Tröger @ 2026-05-25 22:47 UTC
To: Junio C Hamano, Chris Torek; +Cc: git

Thank you Chris and Junio!

> [Junio] Where does this line in your discussion page at GitHub (which is
> omitted from the post to this list) come from?
>
> commit fda77690955e9b63c6687d8806bafd56a526e45f (grafted, HEAD)
>
> Are you doing anything funky with .git/info/grafts by any chance?

That line is the result of a `git log` after the `git fetch` I mentioned in my initial email.

> [Chris] To really understand this properly, we need to understand
> the root of a seeming contradiction:
>
> [...]

Thank you for your elaborate explanation, Chris, that all makes a lot of sense. A few follow-up questions:

• Is all the object information stored with a repo clone locally as well, or does some/most/all of it stay on the remote server repo?
• How exactly does git connect the dots between commit dda8db1 and fda7769, how does it “know” the former was superseded by the latter (i.e. I fetch the former and Git uses the latter for head)?
• Based on the previous question, can I manually find such a connection between two commit objects too?

It sounds like reading some internals would be helpful. I’m noodling through https://git-scm.com/book/en/v2/Git-Internals-Git-Objects but perhaps you have some more recommendations?

With many greetings,
Jens
* Re: How does git track history overwrites?
From: Junio C Hamano @ 2026-05-25 23:09 UTC
To: Jens Tröger; +Cc: Chris Torek, git

Jens Tröger <jens.troeger@light-speed.de> writes:

> Thank you Chris and Junio!
>
>> [Junio] Where does this line in your discussion page at GitHub (which is
>> omitted from the post to this list) come from?
>>
>> commit fda77690955e9b63c6687d8806bafd56a526e45f (grafted, HEAD)
>>
>> Are you doing anything funky with .git/info/grafts by any chance?
>
> That line is the result of a `git log` after the `git fetch` I mentioned in my initial email.

Sorry, I may have been unclear. I specifically meant the "grafted, " part in the message. I know what "git log" output looks like ;-)
* Re: How does git track history overwrites?
From: Jens Tröger @ 2026-05-25 23:20 UTC
To: Junio C Hamano; +Cc: Chris Torek, git

Hello Junio,

> Sorry, I may have been unclear. I specifically meant the "grafted,
> " part in the message. I know what "git log" output looks like ;-)

The “grafted” was also part of the git log output; here is the complete command-line output:

/tmp/bla > git log
commit fda77690955e9b63c6687d8806bafd56a526e45f (grafted, HEAD)
Author: Adam Johnson <me@adamj.eu>
Date: Mon Sep 8 16:31:35 2025 +0100

Version 1.20.0

As for *why* the “grafted” is there in the first place, I suspect it has to do with the fetch --depth=1, and with the previous history not being available in the shallow repo clone.

Cheers,
Jens
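The suspicion about --depth=1 can be checked locally. This is a minimal sketch, assuming two scratch repositories with the made-up names `src` and `shallow`; it only looks at where the "grafted" decoration comes from and does not address the dda8db1/fda7769 pairing.

```shell
#!/bin/sh
# Sketch only: "src" and "shallow" are hypothetical scratch repositories.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q src
git -C src commit -q --allow-empty -m "older"
git -C src commit -q --allow-empty -m "newer"

# A file:// URL is used because --depth is ignored for plain local-path clones.
git clone -q --depth=1 "file://$work/src" shallow
cd shallow

git rev-parse --is-shallow-repository   # prints "true"
cat .git/shallow                        # the commit(s) at which history was cut off
git log --decorate                      # the cut-off commit is decorated as "grafted"
```

In such a clone there is no .git/info/grafts file involved; the boundary is recorded in `.git/shallow`, and `git log` shows only the one commit that was fetched.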