Git development
* How does git track history overwrites?
@ 2026-05-24 23:41 Jens Tröger
  2026-05-25  3:46 ` Chris Torek
  2026-05-25  6:51 ` Junio C Hamano
  0 siblings, 2 replies; 3+ messages in thread
From: Jens Tröger @ 2026-05-24 23:41 UTC (permalink / raw)
  To: git

Hello,

I’m looking for details and some clarification on a `git fetch` behavior I observed, but can’t quite explain. More context is in this GitHub comment:

  https://github.com/jenstroeger/python-package-template/pull/1190#discussion_r3288253713

but it boils down to this:

  /tmp/bla > git -c protocol.version=2 fetch origin dda8db18cfc68df532abf33b185ecd12d5b7b326 --depth=1

It seems that sha dda8db1 (tag 1.20.0 previously pointed at it) was replaced due to a suspected history overwrite with fda7769 (tag 1.20.0 now points at it) and git figures that out:

  ...

  From https://github.com/adamchainz/blacken-docs
  * branch dda8db18cfc68df532abf33b185ecd12d5b7b326 -> FETCH_HEAD

And then:

  /tmp/bla > git checkout FETCH_HEAD
  Note: switching to 'FETCH_HEAD'

  ...

  HEAD is now at fda7769 Version 1.20.0

And:

  /tmp/bla > cat .git/HEAD 
  fda77690955e9b63c6687d8806bafd56a526e45f
  /tmp/bla > cat .git/FETCH_HEAD 
  dda8db18cfc68df532abf33b185ecd12d5b7b326 'dda8db18cfc68df532abf33b185ecd12d5b7b326' of https://github.com/adamchainz/blacken-docs

I’d like to understand the details some more, and how I could manually make that connection?

Thank you!
Jens



* Re: How does git track history overwrites?
  2026-05-24 23:41 How does git track history overwrites? Jens Tröger
@ 2026-05-25  3:46 ` Chris Torek
  2026-05-25  6:51 ` Junio C Hamano
  1 sibling, 0 replies; 3+ messages in thread
From: Chris Torek @ 2026-05-25  3:46 UTC (permalink / raw)
  To: Jens Tröger; +Cc: git

On Sun, May 24, 2026 at 4:44 PM Jens Tröger <jens.troeger@light-speed.de> wrote:
> I’m looking for details and some clarification on a `git fetch` behavior I observed, but can’t quite explain. ...

This isn't really specific to "git fetch" at all, except for the
usage of FETCH_HEAD.

To understand this properly, we need to get to
the root of a seeming contradiction:

1. Once saved in Git, no commit (in fact, no internal object of any sort)
   can ever be changed.
2. And yet, "git rebase" and force-push operations seem to rewrite
   history.

How can commits be immutable and yet rewrite-able? The trick here
lies in how we (humans) *find* commits.

Inside a Git repository, the "true name" of any commit (or indeed
any internal object) is its raw hash ID, such as your example of
dda8db18cfc68df532abf33b185ecd12d5b7b326. The hash ID (or
"object ID"; right now there are only two forms, a SHA-1 hash or
a SHA-256 hash) is specific to that one object once it is created,
and can never be used for any other object. It will always mean
that original object, as long as that object exists.

Thus, as long as that commit exists, it's *that* commit, with *that*
ID, and no other.
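
As a rough illustration of why an ID is tied to one object: the ID is
computed from the object's own content. The sketch below uses a blob
(the simplest object type) rather than a commit, and assumes the SHA-1
object format and that `sha1sum` is installed. The variable names are
made up for the example.

```shell
set -eu
# A blob's ID is the SHA-1 of the header "blob <size>\0" plus the content.
content='hello'
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# Let Git compute the ID (this works outside any repository).
git_id=$(printf '%s\n' "$content" | git hash-object --stdin)

# Compute the same ID by hand from the header and the content.
manual_id=$( { printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } \
    | sha1sum | cut -d' ' -f1)

echo "git:    $git_id"
echo "manual: $manual_id"
```

Both lines should show the same ID. Changing even one byte of the
content gives a different ID, which is why an existing object cannot
be altered in place.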

But we (humans) don't *use* hash IDs. They're too cumbersome.
So Git provides us with the ability to translate a name to an ID:

> It seems that sha dda8db1 (tag 1.20.0 previously pointed at it)

The *name* refs/tags/1.20.0 used to produce the above ID.

> was replaced ... with fda7769 (tag 1.20.0 now points at it)

Some human directed Git to forcibly replace the hash ID associated
with the tag, in some repository or repositories.

(As the manuals note, this kind of forcible replacement of tags is
often a bad idea. It's usually better, once the tag has escaped the
confinement of a single repository anyway, to just admit that you
goofed up and make a new tag.)
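
A minimal sketch of such a replacement, in a throwaway repository
(the commit messages and the demo identity are invented for the
example; only the tag name is taken from this thread):

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m 'first attempt at the release'
git tag 1.20.0
before=$(git rev-parse refs/tags/1.20.0)

git commit -q --allow-empty -m 'second attempt at the release'
git tag -f 1.20.0 >/dev/null     # forcibly re-point the existing tag
after=$(git rev-parse refs/tags/1.20.0)

echo "1.20.0 used to name $before"
echo "1.20.0 now names    $after"

# The old commit was not modified; the name just stopped leading to it.
old_type=$(git cat-file -t "$before")
echo "old object is still a $old_type"
```

The same name yields two different IDs before and after the forced
update, while the first commit remains available under its raw hash.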

If you use raw hash IDs, you can never be bitten by this kind of
tag replacement, but using raw hash IDs everywhere is a bad idea for
different (and presumably obvious) reasons. I couldn't possibly name
the hash ID here without using cut-and-paste. I can *type* "1.20.0"
repeatedly without error, though.

(There are additional considerations, having to do with how Git
cleans up unwanted leftover junk, via git gc / git maintenance. In
particular Git uses the human-readable names to figure out which
objects are useful, and which are unwanted junk. So you have to
identify *some* commits with names, or they'll eventually get
garbage-collected.)
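
A small sketch of that cleanup, under some assumptions: it builds a
commit with no name pointing at it (via the plumbing command
`git commit-tree`) and then runs `git gc --prune=now`, which skips the
usual grace period so the effect is visible immediately. The
repository and messages are invented for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m 'named by the current branch'
kept=$(git rev-parse HEAD)

# Create a second commit object without attaching any name to it.
orphan=$(git commit-tree "$kept^{tree}" -p "$kept" -m 'no name points here')

git fsck --unreachable --no-reflogs   # should list the unnamed commit
git gc -q --prune=now                 # collect garbage with no grace period

# Check which of the two commits is still in the object database.
if git cat-file -e "$kept" 2>/dev/null; then kept_survived=yes; else kept_survived=no; fi
if git cat-file -e "$orphan" 2>/dev/null; then orphan_survived=yes; else orphan_survived=no; fi
echo "named commit survived:   $kept_survived"
echo "unnamed commit survived: $orphan_survived"
```

In normal use the grace period (and the reflogs) keep recently
orphaned commits around for a while, so this immediate removal is the
exception rather than the rule.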

[At this point, you ran git fetch with a raw hash ID, and:]

>   From https://github.com/adamchainz/blacken-docs
>   * branch dda8db18cfc68df532abf33b185ecd12d5b7b326 -> FETCH_HEAD

When git fetch obtains something from a different Git repository,
the new objects have the same IDs in both repositories. Normally we
fetch by *name* (branch or tag name). For historical reasons, the fetch
operation also deposits a hash ID (often along with additional information)
in the file `.git/FETCH_HEAD`. This file then works as a pseudo-name
for the branch, tag, or commit(s) thus obtained:

> And then:
>
>   /tmp/bla > git checkout FETCH_HEAD
>   Note: switching to 'FETCH_HEAD'

This gives you a "detached HEAD" state, using the hash ID stored in
.git/FETCH_HEAD. That hash ID will be overwritten (thus lost) by the
*next* git fetch, so you're expected to save it in some more-permanent
name if you want it to stick around.
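
Here is a sketch of fetching by raw hash and then saving the result
under a longer-lived name. It assumes a local stand-in for the remote
(the `upstream` and `local` directories and the `kept-1.20.0` branch
name are invented), and it sets `uploadpack.allowAnySHA1InWant` on the
stand-in so that it serves requests for arbitrary hashes:

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "upstream" stands in for the remote repository.
git init -q "$work/upstream"
git -C "$work/upstream" commit -q --allow-empty -m 'Version 1.20.0'
git -C "$work/upstream" config uploadpack.allowAnySHA1InWant true
wanted=$(git -C "$work/upstream" rev-parse HEAD)

git init -q "$work/local"
cd "$work/local"
git fetch -q --depth=1 "file://$work/upstream" "$wanted"

cat .git/FETCH_HEAD                    # the hash, then where it came from
resolved=$(git rev-parse FETCH_HEAD)   # FETCH_HEAD used as a pseudo-name

# Save the commit under a more permanent name before the next fetch.
git branch kept-1.20.0 FETCH_HEAD
saved=$(git rev-parse kept-1.20.0)
echo "FETCH_HEAD resolves to $resolved, saved as kept-1.20.0"
```

A branch is used here only because it is the quickest name to create;
a tag or any other ref would protect the commit just as well.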

The key difference between a branch name and a tag name is that
branch names are *expected* to map to different hash IDs over time:
each update that adds new commits to the branch makes the branch
name remember the latest commit's ID. Each commit in turn
remembers the IDs of its parent commit or commits, so knowing
the *last* one suffices to allow Git to find *every* one.
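
The backwards chain can be seen in a throwaway repository. This is
only a sketch; the three empty commits and the demo identity are
invented for the example:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

for n in 1 2 3; do
    git commit -q --allow-empty -m "commit $n"
done

tip=$(git rev-parse HEAD)          # the only ID the branch name remembers
parent=$(git rev-parse "$tip^")    # each commit records its parent's ID

# Starting from the tip alone, Git can walk back to every commit.
count=$(git rev-list --count "$tip")
echo "commits reachable from the tip: $count"
git log --format='%h  parent(s): %p  %s' "$tip"
```

The first commit shows no parent at all, which is where the walk
stops.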

Rewriting history with rebase consists of copying old (presumably
bad) commits to new (presumably good/better) ones. The backwards
links of the new commits chain through the other new-and-improved
commits until they reach the point where the rewrite joins existing
history. Then we update the branch name to remember the latest
of the new-and-improved commits, and it *seems* that we've changed
history. The old history is still in there, and will stick around for quite
a while (at least a month by default, in standard clones) "just in case".
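
A sketch of one such copy, assuming a tiny invented history (commits
A, B and C, branches `main` and `topic`, and one file per commit so
the rebase has real changes to replay):

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo base > base.txt; git add base.txt; git commit -q -m 'A: shared base'
git branch -M main
git branch topic
echo main > main.txt; git add main.txt; git commit -q -m 'B: new work on main'
base=$(git rev-parse main)

git checkout -q topic
echo topic > topic.txt; git add topic.txt; git commit -q -m 'C: work on topic'
old=$(git rev-parse topic)

git rebase -q main                 # copy C on top of B, then move "topic"
new=$(git rev-parse topic)
new_parent=$(git rev-parse "$new^")

echo "topic used to name $old"
echo "topic now names    $new (parent: $new_parent)"
old_type=$(git cat-file -t "$old") # the original C is still in the repository
git reflog show topic              # and the reflog still remembers its ID
```

The name `topic` now leads to the copy, whose parent is B, while the
original C stays reachable through the reflog until it expires.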

Tag names are not supposed to move, and whether someone else's tag
update to their clone changes your own clone's tags is something
you can control to some extent. It's not a good idea to depend on
other people's clones to follow tag changes, but it's also not a
good idea to depend on your own or other people's clones *not* to
follow such changes, since both behaviors are possible.
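
Both behaviors can be reproduced with a local stand-in for the other
person's repository. This is a sketch under stated assumptions: the
`upstream` and `clone` directories and the commit messages are
invented, and the exact defaults can differ between Git versions and
configurations.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/upstream"
git -C "$work/upstream" commit -q --allow-empty -m 'first 1.20.0'
git -C "$work/upstream" tag 1.20.0
git clone -q "file://$work/upstream" "$work/clone"

# Upstream moves the tag to a new commit.
git -C "$work/upstream" commit -q --allow-empty -m 'second 1.20.0'
git -C "$work/upstream" tag -f 1.20.0 >/dev/null
moved=$(git -C "$work/upstream" rev-parse 1.20.0)

cd "$work/clone"
mine=$(git rev-parse 1.20.0)

git fetch -q origin                  # ordinary fetch keeps the existing tag
after_plain=$(git rev-parse 1.20.0)

git fetch -q --tags --force origin   # explicitly accept the moved tag
after_force=$(git rev-parse 1.20.0)

echo "before:              $mine"
echo "after plain fetch:   $after_plain"
echo "after forced fetch:  $after_force"
```

Whether a given clone follows the move therefore depends on how its
owner fetches, which is why neither outcome should be relied upon.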

Chris


* Re: How does git track history overwrites?
  2026-05-24 23:41 How does git track history overwrites? Jens Tröger
  2026-05-25  3:46 ` Chris Torek
@ 2026-05-25  6:51 ` Junio C Hamano
  1 sibling, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2026-05-25  6:51 UTC (permalink / raw)
  To: Jens Tröger; +Cc: git

Jens Tröger <jens.troeger@light-speed.de> writes:

> Hello,
>
> I’m looking for details and some clarification on a `git fetch` behavior I observed, but can’t quite explain. More context is in this GitHub comment:
>
>   https://github.com/jenstroeger/python-package-template/pull/1190#discussion_r3288253713
>
> but it boils down to this:
>
>   /tmp/bla > git -c protocol.version=2 fetch origin dda8db18cfc68df532abf33b185ecd12d5b7b326 --depth=1
>
> It seems that sha dda8db1 (tag 1.20.0 previously pointed at it) was replaced due to a suspected history overwrite with fda7769 (tag 1.20.0 now points at it) and git figures that out:
>
>   ...
>
>   From https://github.com/adamchainz/blacken-docs
>   * branch dda8db18cfc68df532abf33b185ecd12d5b7b326 -> FETCH_HEAD
>
> And then:
>
>   /tmp/bla > git checkout FETCH_HEAD
>   Note: switching to 'FETCH_HEAD'
>
>   ...
>
>   HEAD is now at fda7769 Version 1.20.0
>
> And:
>
>   /tmp/bla > cat .git/HEAD 
>   fda77690955e9b63c6687d8806bafd56a526e45f
>   /tmp/bla > cat .git/FETCH_HEAD 
>   dda8db18cfc68df532abf33b185ecd12d5b7b326 'dda8db18cfc68df532abf33b185ecd12d5b7b326' of https://github.com/adamchainz/blacken-docs
>
> I’d like to understand the details some more, and how I could manually make that connection?

Where does this line in your discussion page at GitHub (which is
omitted from the post to this list) come from?

    commit fda77690955e9b63c6687d8806bafd56a526e45f (grafted, HEAD)

Are you doing anything funky with .git/info/grafts by any chance?
