Git development
* Strange behaviour when pushing a commit object to remote's refs/HEAD
@ 2024-01-15 19:08 Pratyush Yadav
  2024-01-16  9:54 ` Karthik Nayak
From: Pratyush Yadav @ 2024-01-15 19:08 UTC
  To: git

Hi,

I ran into a strange Magit bug, where when I ran magit-show-refs on a
particular repo it threw an error. The details of the Magit bug are not
very interesting, but when attempting to reproduce it, I also saw git
misbehaving for such repos.

The strange behaviour happens when you push a commit object to the remote's
refs/heads/HEAD instead of pushing a symbolic ref. Such a repository can be
found at https://github.com/prati0100/magit-reproducer. I roughly used
the below steps to create such a repo:

    $ git init
    $ echo 1 > foo && git add foo && git commit
    $ echo 2 > bar && git add bar && git commit
    $ git push
    $ git checkout 79264c3
    $ echo 2.1 > bar && git add bar && git commit
    $ git push origin 707a3d5:refs/heads/HEAD
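
For anyone who wants to try this without GitHub, here is a rough
self-contained sketch of the same steps. It assumes a local bare repository
stands in for the remote; the directory names, the placeholder identity, and
the `first`/`side` variables are made up for illustration and are not part of
the original setup.

```shell
set -e
# Placeholder identity so "git commit" works in a clean environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

top=$(mktemp -d)
# A bare repository plays the role of the hosted remote.
git init -q --bare "$top/remote.git"
git --git-dir="$top/remote.git" symbolic-ref HEAD refs/heads/main

git init -q "$top/work"
cd "$top/work"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$top/remote.git"

echo 1 > foo && git add foo && git commit -q -m 1
first=$(git rev-parse HEAD)
echo 2 > bar && git add bar && git commit -q -m 2
git push -q origin main

# Detach at the first commit and create the side commit "2.1".
git checkout -q "$first"
echo 2.1 > bar && git add bar && git commit -q -m 2.1
side=$(git rev-parse HEAD)

# Push the side commit to a branch that is literally named "HEAD".
git push -q origin "$side:refs/heads/HEAD"

# Inspect the remote: its HEAD file and the new branch are separate things.
remote_head=$(git --git-dir="$top/remote.git" symbolic-ref HEAD)
remote_branch=$(git --git-dir="$top/remote.git" rev-parse refs/heads/HEAD)
echo "remote HEAD -> $remote_head"
echo "remote refs/heads/HEAD = $remote_branch"
```

Cloning "$top/remote.git" afterwards should give a starting point for
comparing the clone and `git remote update` paths described below.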

Now with such a repo, if you do `git log --all --oneline` it would look
something like:

    707a3d5 (origin/HEAD) 2.1
    86e1c97 (HEAD -> main, origin/main) 2
    79264c3 1

And running `git for-each-ref --format='%(symref:short),%(refname:short),%(refname),%(subject)' refs/remotes/origin` gives:

    ,origin,refs/remotes/origin/HEAD,2.1
    ,origin/main,refs/remotes/origin/main,2

All well and good so far. Now delete the repo and attempt to clone it.
This time `git log --all --oneline` gives:

    86e1c97 (HEAD -> main, origin/main, origin/HEAD) 2
    79264c3 1

And running `git for-each-ref --format='%(symref:short),%(refname:short),%(refname),%(subject)' refs/remotes/origin` gives:

    origin/main,origin,refs/remotes/origin/HEAD,2
    ,origin/main,refs/remotes/origin/main,2

So suddenly the remote's HEAD becomes origin/main (symbolic ref) and the
commit (707a3d5, "2.1") is nowhere to be found. It neither shows up in
`git rev-list --all` nor in `git log --all`. The files and trees
associated with it also do not show up in `git rev-list --all --objects`.
Yet if you do `git show 707a3d5` it shows up. So it does exist and did
get cloned, but git cannot properly see it.
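
The "exists but is not reachable from any ref" state can be illustrated in
isolation. The sketch below does not use the reproducer repo; it builds a
throwaway repository and creates a commit object with `git commit-tree` that
no ref points to. The variable names and the identity are hypothetical.

```shell
set -e
# Placeholder identity so commits can be created in a clean environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo 1 > foo && git add foo && git commit -q -m 1

# Create a commit object that no ref points to.
orphan=$(git commit-tree -m orphan "HEAD^{tree}")

# The object database has it ...
kind=$(git cat-file -t "$orphan")

# ... but a walk that starts from the refs never reaches it.
if git rev-list --all | grep -q "$orphan"; then
    reachable=yes
else
    reachable=no
fi
echo "type=$kind reachable=$reachable"
```

In the cloned reproducer, 707a3d5 would play the role of `$orphan`: `git show`
finds the object directly, while `--all` only walks from refs.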

Interestingly enough, even the GitHub UI is confused and it won't show
you the repo correctly. It will show the commit (86e1c97, "2") for both
"branches" main and HEAD. cgit's UI [0] seems to work fine with this,
though cloning from cgit still suffers from this bug.

There _is_ a way to clone the repo correctly. If you do:

    $ git init magit-reproducer
    $ git remote add origin https://github.com/prati0100/magit-reproducer.git
    $ git remote update

Now if you do git log --all or git for-each-ref, you see the correct
result.

I don't really know how to fix this but it certainly is a bug in git
since it can't clone the repo correctly. And at least one major Git host
can't display such a repo properly (I haven't tried others).

I used Git v2.40.1 to do most of this but I did compile the latest
master d4dbce1db5 ("The seventh batch") and attempted to clone using it
and I see the same problem.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/pratyush/magit-reproducer.git/

-- 
Regards,
Pratyush Yadav


* Re: Strange behaviour when pushing a commit object to remote's refs/HEAD
  2024-01-15 19:08 Strange behaviour when pushing a commit object to remote's refs/HEAD Pratyush Yadav
@ 2024-01-16  9:54 ` Karthik Nayak
  2024-01-16 11:33   ` Pratyush Yadav
From: Karthik Nayak @ 2024-01-16  9:54 UTC
  To: Pratyush Yadav, git

Pratyush Yadav <me@yadavpratyush.com> writes:

> Hi,
>

Hello,

> I ran into a strange Magit bug, where when I ran magit-show-refs on a
> particular repo it threw an error. The details of the Magit bug are not
> very interesting, but when attempting to reproduce it, I also saw git
> misbehaving for such repos.
>
> The strange behaviour happens when you push a commit object to the remote's
> refs/heads/HEAD instead of pushing a symbolic ref. Such a repository can be
> found at https://github.com/prati0100/magit-reproducer. I roughly used
> the below steps to create such a repo:
>
>     $ git init
>     $ echo 1 > foo && git add foo && git commit
>     $ echo 2 > bar && git add bar && git commit
>     $ git push
>     $ git checkout 79264c3
>     $ echo 2.1 > bar && git add bar && git commit
>     $ git push origin 707a3d5:refs/heads/HEAD
>

Just to note that pushing to "refs/heads/HEAD" does not actually update
the remote repository's $GIT_DIR/HEAD file; rather, it creates a new
reference, $GIT_DIR/refs/heads/HEAD.

With this understanding you'll see that this is not a bug, because the
remote HEAD was never updated, but only a new branch called HEAD was
created [0].
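
The distinction can be checked with plumbing commands. The following is a
minimal sketch in a throwaway repository, not the reproducer itself; it
creates the branch with `git update-ref` instead of a push, and the variable
names and identity are illustrative assumptions.

```shell
set -e
# Placeholder identity so "git commit" works in a clean environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
echo 1 > foo && git add foo && git commit -q -m 1
first=$(git rev-parse main)
echo 2 > bar && git add bar && git commit -q -m 2

# Create a branch literally named "HEAD", pointing at the first commit.
git update-ref refs/heads/HEAD "$first"

# $GIT_DIR/HEAD is untouched: it is still a symref to main.
head_target=$(git symbolic-ref HEAD)

# refs/heads/HEAD is an ordinary branch with its own tip.
branch_tip=$(git rev-parse --verify refs/heads/HEAD)
main_tip=$(git rev-parse --verify refs/heads/main)

# Roughly what a client is told: HEAD (with its symref target) and the branch.
git ls-remote --symref . HEAD
```

Full ref names are used on purpose here, since a bare "HEAD" is ambiguous
once both exist.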

> Now with such a repo, if you do `git log --all --oneline` it would look
> something like:
>
>     707a3d5 (origin/HEAD) 2.1
>     86e1c97 (HEAD -> main, origin/main) 2
>     79264c3 1
>
> And running `git for-each-ref --format='%(symref:short),%(refname:short),%(refname),%(subject)' refs/remotes/origin` gives:
>
>     ,origin,refs/remotes/origin/HEAD,2.1
>     ,origin/main,refs/remotes/origin/main,2
>
> All well and good so far. Now delete the repo and attempt to clone it.
> This time `git log --all --oneline` gives:
>
>     86e1c97 (HEAD -> main, origin/main, origin/HEAD) 2
>     79264c3 1
>

This is expected since you cloned the repository and you got the default
branch 'main'.

> And running `git for-each-ref --format='%(symref:short),%(refname:short),%(refname),%(subject)' refs/remotes/origin` gives:
>
>     origin/main,origin,refs/remotes/origin/HEAD,2
>     ,origin/main,refs/remotes/origin/main,2
>
> So suddenly the remote's HEAD becomes origin/main (symbolic ref) and the
> commit (707a3d5, "2.1") is nowhere to be found. It neither shows up in
> `git rev-list --all` nor in `git log --all`. The files and trees
> associated with it also do not show up in `git rev-list --all --objects`.


That is because rev-list's `--all` iterates over all refs. Since you only
cloned, the HEAD branch is not pulled.

Everything else is a consequence of the subtle but important difference
between updating $GIT_DIR/HEAD vs creating $GIT_DIR/refs/heads/HEAD.

[0]: https://github.com/prati0100/magit-reproducer/branches/all

Thanks,
Karthik


* Re: Strange behaviour when pushing a commit object to remote's refs/HEAD
  2024-01-16  9:54 ` Karthik Nayak
@ 2024-01-16 11:33   ` Pratyush Yadav
  2024-01-16 13:24     ` Karthik Nayak
From: Pratyush Yadav @ 2024-01-16 11:33 UTC
  To: Karthik Nayak; +Cc: Pratyush Yadav, git

On Tue, Jan 16 2024, Karthik Nayak wrote:

> Pratyush Yadav <me@yadavpratyush.com> writes:
>
>> Hi,
>>
>
> Hello,
>
>> I ran into a strange Magit bug, where when I ran magit-show-refs on a
>> particular repo it threw an error. The details of the Magit bug are not
>> very interesting, but when attempting to reproduce it, I also saw git
>> misbehaving for such repos.
>>
>> The strange behaviour happens when you push a commit object to the remote's
>> refs/heads/HEAD instead of pushing a symbolic ref. Such a repository can be
>> found at https://github.com/prati0100/magit-reproducer. I roughly used
>> the below steps to create such a repo:
>>
>>     $ git init
>>     $ echo 1 > foo && git add foo && git commit
>>     $ echo 2 > bar && git add bar && git commit
>>     $ git push
>>     $ git checkout 79264c3
>>     $ echo 2.1 > bar && git add bar && git commit
>>     $ git push origin 707a3d5:refs/heads/HEAD
>>
>
> Just to note that pushing to "refs/heads/HEAD" does not actually update
> the remote repository's $GIT_DIR/HEAD file; rather, it creates a new
> reference, $GIT_DIR/refs/heads/HEAD.

Yes, that is what I would also expect. I checked one of the Git servers
we have and this is exactly what happens. $GIT_DIR/HEAD is a symref
pointing to refs/heads/main and $GIT_DIR/refs/heads/HEAD points to the
commit. But behaviour from client side is not consistent.

>
> With this understanding you'll see that this is not a bug, because the
> remote HEAD was never updated, but only a new branch called HEAD was
> created [0].

GitHub thinks so, but try opening the branch. It won't show you the
commit (707a3d5, "2.1") but instead shows you 86e1c97 ("2"). So
something is wrong _at least_ with GitHub.

>
>> Now with such a repo, if you do `git log --all --oneline` it would look
>> something like:
>>
>>     707a3d5 (origin/HEAD) 2.1
>>     86e1c97 (HEAD -> main, origin/main) 2
>>     79264c3 1
>>
>> And running `git for-each-ref --format='%(symref:short),%(refname:short),%(refname),%(subject)' refs/remotes/origin` gives:
>>
>>     ,origin,refs/remotes/origin/HEAD,2.1
>>     ,origin/main,refs/remotes/origin/main,2
>>
>> All well and good so far. Now delete the repo and attempt to clone it.
>> This time `git log --all --oneline` gives:
>>
>>     86e1c97 (HEAD -> main, origin/main, origin/HEAD) 2
>>     79264c3 1
>>
>
> This is expected since you cloned the repository and you got the default
> branch 'main'.

No.

First, if I clone a repo with multiple branches (say
https://github.com/prati0100/git-gui) I get _all_ the remote branches.
Yet here I clearly don't get the so called "HEAD" branch. This is not
expected behaviour.

Second, git really does misunderstand refs/remotes/origin/HEAD. For
example, when running git for-each-ref command with the clone method, I
get:

    origin/main,origin,refs/remotes/origin/HEAD,2

So it clearly thinks refs/remotes/origin/HEAD is at 86e1c97 ("2"). Or,
to be more specific, it thinks the ref points to origin/main which is at
86e1c97 ("2"). But we set it at (707a3d5, "2.1"). So it tells me the
wrong thing. Now if I do the git remote add && git remote update method,
git for-each-ref says:

    ,origin,refs/remotes/origin/HEAD,2.1

So now it thinks refs/remotes/origin/HEAD points at (707a3d5, "2.1"). I
do not see it as expected behaviour.

We can also see this when inspecting the contents of
.git/refs/remotes/origin/HEAD. With clone it says:

    ref: refs/remotes/origin/main

With git remote add && git remote update it says:

    707a3d587c61c089710e3924eb63a51763b5a4c8

The same ref points to different places based on how you pull the repo.

Looking deeper, if you clone a repo that does not have a branch called
"HEAD" (like git-gui), git creates a file in
.git/refs/remotes/origin/HEAD that says:

    ref: refs/remotes/origin/master

So it certainly seems to use refs/remotes/origin/HEAD to point to the
remote's HEAD, and not as a regular branch.
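
A small sketch of how to inspect this without reading files under .git/
directly (refs may be packed, so asking Git is more reliable). It assumes a
throwaway local upstream with a single branch named main; the paths and
identity are hypothetical.

```shell
set -e
# Placeholder identity so "git commit" works in a clean environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

top=$(mktemp -d)
git init -q "$top/upstream"
git -C "$top/upstream" symbolic-ref HEAD refs/heads/main
( cd "$top/upstream" && echo 1 > foo && git add foo && git commit -q -m 1 )

git clone -q "$top/upstream" "$top/clone"
cd "$top/clone"

# After a clone, origin/HEAD is a symref to the remote's default branch.
origin_head=$(git symbolic-ref refs/remotes/origin/HEAD)
echo "refs/remotes/origin/HEAD -> $origin_head"
```

`git remote set-head origin --auto` re-queries the remote and rewrites this
symref, and `git remote set-head origin --delete` removes it.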

I find this to be inconsistent behaviour on git's part and do not think
it is (or should be) expected behaviour.

>
>> And running `git for-each-ref --format='%(symref:short),%(refname:short),%(refname),%(subject)' refs/remotes/origin` gives:
>>
>>     origin/main,origin,refs/remotes/origin/HEAD,2
>>     ,origin/main,refs/remotes/origin/main,2
>>
>> So suddenly the remote's HEAD becomes origin/main (symbolic ref) and the
>> commit (707a3d5, "2.1") is nowhere to be found. It neither shows up in
>> `git rev-list --all` nor in `git log --all`. The files and trees
>> associated with it also do not show up in `git rev-list --all --objects`.
>
>
> That is because rev-list's `--all` iterates over all refs. Since you only
> cloned, the HEAD branch is not pulled.

Why not? When you clone all branches should get pulled.

>
> Everything else is a consequence of the subtle but important difference
> between updating $GIT_DIR/HEAD vs creating $GIT_DIR/refs/heads/HEAD.
>
> [0]: https://github.com/prati0100/magit-reproducer/branches/all

-- 
Regards,
Pratyush Yadav


* Re: Strange behaviour when pushing a commit object to remote's refs/HEAD
  2024-01-16 11:33   ` Pratyush Yadav
@ 2024-01-16 13:24     ` Karthik Nayak
  2024-01-16 15:00       ` Jeff King
From: Karthik Nayak @ 2024-01-16 13:24 UTC
  To: Pratyush Yadav; +Cc: git

Pratyush Yadav <me@yadavpratyush.com> writes:


>> Just to note that pushing to "refs/heads/HEAD" does not actually update
>> the remote repository's $GIT_DIR/HEAD file; rather, it creates a new
>> reference, $GIT_DIR/refs/heads/HEAD.
>
> Yes, that is what I would also expect. I checked one of the Git servers
> we have and this is exactly what happens. $GIT_DIR/HEAD is a symref
> pointing to refs/heads/main and $GIT_DIR/refs/heads/HEAD points to the
> commit. But behaviour from client side is not consistent.
>

What is the _inconsistent_ part?

>>
>> With this understanding you'll see that this is not a bug, because the
>> remote HEAD was never updated, but only a new branch called HEAD was
>> created [0].
>
> GitHub thinks so, but try opening the branch. It won't show you the
> commit (707a3d5, "2.1") but instead shows you 86e1c97 ("2"). So
> something is wrong _at least_ with GitHub.
>

I don't know how GitHub operates, but I'm guessing this happens because
there is ambiguity between a branch called HEAD and the actual HEAD.

>>
>>> Now with such a repo, if you do `git log --all --oneline` it would look
>>> something like:
>>>
>>>     707a3d5 (origin/HEAD) 2.1
>>>     86e1c97 (HEAD -> main, origin/main) 2
>>>     79264c3 1
>>>
>>> And running `git for-each-ref --format='%(symref:short),%(refname:short),%(refname),%(subject)' refs/remotes/origin` gives:
>>>
>>>     ,origin,refs/remotes/origin/HEAD,2.1
>>>     ,origin/main,refs/remotes/origin/main,2
>>>
>>> All well and good so far. Now delete the repo and attempt to clone it.
>>> This time `git log --all --oneline` gives:
>>>
>>>     86e1c97 (HEAD -> main, origin/main, origin/HEAD) 2
>>>     79264c3 1
>>>
>>
>> This is expected since you cloned the repository and you got the default
>> branch 'main'.
>
> No.
>
> First, if I clone a repo with multiple branches (say
> https://github.com/prati0100/git-gui) I get _all_ the remote branches.
> Yet here I clearly don't get the so called "HEAD" branch. This is not
> expected behaviour.
>

You're right, I meant to say that the remote branches don't have the
corresponding local branches. But that does not matter here.

I'm not saying that there is a path for git to work properly when
creating a branch called "HEAD". It's just that "HEAD" is more of a
reserved word for git and creating a branch with the same name has
unintended effects.

> Second, git really does misunderstand refs/remotes/origin/HEAD. For
> example, when running git for-each-ref command with the clone method, I
> get:
>
>     origin/main,origin,refs/remotes/origin/HEAD,2
>
> So it clearly thinks refs/remotes/origin/HEAD is at 86e1c97 ("2"). Or,
> to be more specific, it thinks the ref points to origin/main which is at
> 86e1c97 ("2"). But we set it at (707a3d5, "2.1"). So it tells me the
> wrong thing. Now if I do the git remote add && git remote update method,
> git for-each-ref says:
>
>     ,origin,refs/remotes/origin/HEAD,2.1
>

This is one of those ambiguities, we store HEAD for remotes as
     $GIT_DIR/refs/remotes/<remote>/HEAD
and remote branches as
     $GIT_DIR/refs/remotes/<remote>/<branch>

So what happens if there is a branch named HEAD? This is the problem
you're facing...
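
To make the collision concrete, here is a toy sketch in plain shell. The
`map_ref` function is a hypothetical helper that mimics the default fetch
refspec (refs/heads/*:refs/remotes/<remote>/*); it is not Git code and only
illustrates the name mapping.

```shell
# Hypothetical helper: map a remote branch ref to its remote-tracking name,
# the way the default fetch refspec does.
map_ref() {
    remote=$1
    ref=$2
    case $ref in
        refs/heads/*)
            printf 'refs/remotes/%s/%s\n' "$remote" "${ref#refs/heads/}"
            ;;
        *)
            return 1
            ;;
    esac
}

# Where the remote's HEAD symref is recorded locally.
head_slot=refs/remotes/origin/HEAD

# Where a remote branch named "HEAD" lands after the refspec is applied.
branch_slot=$(map_ref origin refs/heads/HEAD)

# Any other branch gets a slot of its own.
main_slot=$(map_ref origin refs/heads/main)

echo "remote HEAD   -> $head_slot"
echo "branch 'HEAD' -> $branch_slot"
echo "branch 'main' -> $main_slot"
```

Both names end up in the same slot, so whichever is written last wins. That
would be consistent with the clone and `git remote update` results differing.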

> So now it thinks refs/remotes/origin/HEAD points at (707a3d5, "2.1"). I
> do not see it as expected behaviour.
>
> We can also see this when inspecting the contents of
> .git/refs/remotes/origin/HEAD. With clone it says:
>
>     ref: refs/remotes/origin/main
>
> With git remote add && git remote update it says:
>
>     707a3d587c61c089710e3924eb63a51763b5a4c8
>
> The same ref points to different places based on how you pull the repo.
>
> Looking deeper, if you clone a repo that does not have a branch called
> "HEAD" (like git-gui), git creates a file in
> .git/refs/remotes/origin/HEAD that says:
>
>     ref: refs/remotes/origin/master
>
> So it certainly seems to use refs/remotes/origin/HEAD to point to the
> remote's HEAD, and not as a regular branch.
>
> I find this to be inconsistent behaviour on git's part and do not think
> it is (or should be) expected behaviour.
>

Maybe we should explicitly mention that using HEAD as the branch name
has unintended effects and should be avoided.

>>
>>> And running `git for-each-ref --format='%(symref:short),%(refname:short),%(refname),%(subject)' refs/remotes/origin` gives:
>>>
>>>     origin/main,origin,refs/remotes/origin/HEAD,2
>>>     ,origin/main,refs/remotes/origin/main,2
>>>
>>> So suddenly the remote's HEAD becomes origin/main (symbolic ref) and the
>>> commit (707a3d5, "2.1") is nowhere to be found. It neither shows up in
>>> `git rev-list --all` nor in `git log --all`. The files and trees
>>> associated with it also do not show up in `git rev-list --all --objects`.
>>
>>
>> That is because rev-list's `--all` iterates over all refs. Since you only
>> cloned, the HEAD branch is not pulled.
>
> Why not? When you clone all branches should get pulled.
>

I think I jumped too quickly here. It is because the branch HEAD is never
realized locally, as I explained above.


* Re: Strange behaviour when pushing a commit object to remote's refs/HEAD
  2024-01-16 13:24     ` Karthik Nayak
@ 2024-01-16 15:00       ` Jeff King
From: Jeff King @ 2024-01-16 15:00 UTC
  To: Karthik Nayak; +Cc: Pratyush Yadav, git

On Tue, Jan 16, 2024 at 08:24:04AM -0500, Karthik Nayak wrote:

> This is one of those ambiguities, we store HEAD for remotes as
>      $GIT_DIR/refs/remotes/<remote>/HEAD
> and remote branches as
>      $GIT_DIR/refs/remotes/<remote>/<branch>
> 
> So what happens if there is a branch named HEAD? This is the problem
> you're facing...

Yeah, this is a long-standing issue. The reason we have not fixed it is
that it would require a new refs/remotes layout, which implies new
lookup rules (e.g., dwim_ref() will convert the name "foo" to
"refs/remotes/foo/HEAD", but would need to be taught about the new
layout). Likewise, a new layout should probably store per-remote tags
(rather than splatting them into the main refs/tags/) along with new
dwim_ref() rules to make lookup work more or less as it does now.
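
The lookup rule mentioned above can be observed from the command line. This
is a small sketch under the assumption of a throwaway local upstream whose
default branch is main; the paths, identity, and variable names are
illustrative.

```shell
set -e
# Placeholder identity so "git commit" works in a clean environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

top=$(mktemp -d)
git init -q "$top/upstream"
git -C "$top/upstream" symbolic-ref HEAD refs/heads/main
( cd "$top/upstream" && echo 1 > foo && git add foo && git commit -q -m 1 )

git clone -q "$top/upstream" "$top/clone"
cd "$top/clone"

# The bare remote name "origin" is looked up as refs/remotes/origin/HEAD ...
via_dwim=$(git rev-parse --verify origin)

# ... which the clone set up as a symref to the remote's default branch.
via_branch=$(git rev-parse --verify refs/remotes/origin/main)

echo "origin             = $via_dwim"
echo "origin/main (full) = $via_branch"
```

Any new refs/remotes layout would have to keep a short name like "origin"
resolving this way, which is part of the compatibility work described above.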

So it's not impossible, but some care has to be given to the design and
to handling compatibility. If anybody is interested, there are probably
some nuggets of wisdom to mine from this old thread:

  https://lore.kernel.org/git/AANLkTi=yFwOAQMHhvLsB1_xmYOE9HHP2YB4H4TQzwwc8@mail.gmail.com/

In the meantime, I think the current wisdom is "don't name a branch
HEAD". ;) We even added logic to "git branch" to forbid this, but tools
like "git push" are a bit more flexible.
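
The `git branch` safeguard is easy to check. The sketch below assumes a Git
version recent enough to include that check and uses a throwaway repository;
the `porcelain` variable and the identity are made up for illustration.

```shell
set -e
# Placeholder identity so "git commit" works in a clean environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo 1 > foo && git add foo && git commit -q -m 1

# Try to create a branch named "HEAD" through the porcelain command.
if git branch HEAD 2>/dev/null; then
    porcelain=accepted
else
    porcelain=refused
fi
echo "git branch HEAD: $porcelain"
```

A push with an explicit refs/heads/HEAD destination goes through a different
code path, which is how the reproducer repo ended up with such a branch.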

-Peff
