git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Bryan Turner <bturner@atlassian.com>,
	Waleed Khan <me@waleedkhan.name>,
	git@vger.kernel.org
Subject: Re: Bug report: Strange behavior with `git gc` and `reference-transaction` hook
Date: Tue, 7 Dec 2021 09:24:00 +0100	[thread overview]
Message-ID: <Ya8aIAAOlValUL2o@ncase> (raw)
In-Reply-To: <YZaWqTwPOyQz0/mu@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 4393 bytes --]

On Thu, Nov 18, 2021 at 01:08:41PM -0500, Jeff King wrote:
> [+cc pks]
> 
> On Wed, Nov 17, 2021 at 04:52:46PM -0800, Bryan Turner wrote:
> 
> > > The expected behavior would be that the latest reference transaction
> > > hook refers to the state of the references on disk. That is, either
> > > `master` should point to 0 (be deleted), or it should have said that
> > > `master` pointed to `e197d1`.
> > >
> > > But if we actually examine `master`, it's set to `e197d1`, just as you
> > > would expect. The GC should have been a no-op overall.
> > 
> > One of the subtasks of "git gc" is "git pack-refs". If you inspect in
> > more detail, I suspect you'll find that "refs/heads/master" was loose
> > before "git gc" ran (as in, there was a file
> > "$GIT_DIR/refs/heads/master") and "packed-refs" either didn't have a
> > "refs/heads/master" entry or had a different hash. (Loose refs always
> > "win" over packed, since ref updates only write loose refs.)
> 
> It seems totally broken to me that we'd trigger the
> reference-transaction hook for ref packing. The point of the hook is to
> track logical updates to the refs. But during ref packing that does not
> change at all; the value remains the same. So I don't think we should be
> triggering the hook at all, let alone with confusing values.
> 
> This snippet shows a simple case that I think is wrong:
> 
> -- >8 --
> git init -q repo
> cd repo
> 
> cat >.git/hooks/reference-transaction <<\EOF
> #!/bin/sh
> echo >&2 "==> reference-transaction $*"
> sed 's/^/  /'
> EOF
> chmod +x .git/hooks/reference-transaction
> 
> echo >&2 "running commit..."
> git commit --allow-empty -qm foo
> echo >&2 "running pack-refs..."
> git pack-refs --all --prune
> -- >8 --
> 
> It produces:
> 
>   running commit...
>   ==> reference-transaction prepared
>     0000000000000000000000000000000000000000 77bcab0d950aee3021e8aa13a15d40e7a9a5f71b HEAD
>     0000000000000000000000000000000000000000 77bcab0d950aee3021e8aa13a15d40e7a9a5f71b refs/heads/main
>   ==> reference-transaction committed
>     0000000000000000000000000000000000000000 77bcab0d950aee3021e8aa13a15d40e7a9a5f71b HEAD
>     0000000000000000000000000000000000000000 77bcab0d950aee3021e8aa13a15d40e7a9a5f71b refs/heads/main
>   running pack-refs...
>   ==> reference-transaction prepared
>     0000000000000000000000000000000000000000 77bcab0d950aee3021e8aa13a15d40e7a9a5f71b refs/heads/main
>   ==> reference-transaction committed
>     0000000000000000000000000000000000000000 77bcab0d950aee3021e8aa13a15d40e7a9a5f71b refs/heads/main
>   ==> reference-transaction prepared
>     77bcab0d950aee3021e8aa13a15d40e7a9a5f71b 0000000000000000000000000000000000000000 refs/heads/main
>   ==> reference-transaction committed
>     77bcab0d950aee3021e8aa13a15d40e7a9a5f71b 0000000000000000000000000000000000000000 refs/heads/main
> 
> I think the final four invocations should be skipped entirely. They're
> pointless at best (nothing actually changed), and extremely misleading
> at worst (they look like the ref ended up deleted!).
> 
> -Peff

Yeah, I agree that this is something that is totally misleading.

For what it's worth, we also hit a similar case in production at GitLab,
where we use the hook to do voting on ref updates across different
nodes. Sometimes we observed different votes on a subset of nodes, and
it took me quite some time to figure out that this was dependent on
whether a ref was packed or not. We're now filtering out transactions
which consist only of force-deletions [1], which are _likely_ to be
cleanups of such packed refs. But this is very clearly a hack, and I
agree that calling the hook for 

In the end, these really are special cases of how the "files" backend
works and thus are implementation details which shouldn't be exposed to
the user at all. With these implementation details exposed, we'll start
to see different behaviour of when the hook is executed depending on
which ref backend you use, which is even worse compared to the current
state where it's at least consistently misleading.

As Peff said, the hook should really only track logical changes and not
expose any implementation details.

Patrick

[1]: https://gitlab.com/gitlab-org/gitaly/-/blob/3ef55853e9e161204464868390d97d1a1577042d/internal/gitaly/hook/referencetransaction.go#L58

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

      reply	other threads:[~2021-12-07  8:24 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-18  0:41 Bug report: Strange behavior with `git gc` and `reference-transaction` hook Waleed Khan
2021-11-18  0:52 ` Bryan Turner
     [not found]   ` <CAKjfCeD1C2DTnT0cLj4hC9Nq90TOFLiPnd_SLQi0HuMQHZcCaw@mail.gmail.com>
2021-11-18  1:28     ` Bryan Turner
2021-11-18 18:08   ` Jeff King
2021-12-07  8:24     ` Patrick Steinhardt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ya8aIAAOlValUL2o@ncase \
    --to=ps@pks.im \
    --cc=bturner@atlassian.com \
    --cc=git@vger.kernel.org \
    --cc=me@waleedkhan.name \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).