Destructive pre-commit behaviour and "--all"

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Destructive pre-commit behaviour and "--all"
@ 2023-03-22 10:34 Jan Rüegg
  2023-03-22 20:00 ` Chris Torek
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Rüegg @ 2023-03-22 10:34 UTC (permalink / raw)
  To: git

Why doesn't "git commit --all" keep the files added to the staging
area, so "git commit --all" would have the same behaviour as "git add
--all && git commit"?

As described here a pre-commit hook which changes files, when run with
"git commit --all", will make it impossible to see what the hook
changed.

Making "git commit --all" and "git add --all && git commit" have the
same behaviour (or at least having that configurable) would allow
looking (and reverting if necessary) the changes generated by the
hooks.

- Jan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Destructive pre-commit behaviour and "--all"
  2023-03-22 10:34 Destructive pre-commit behaviour and "--all" Jan Rüegg
@ 2023-03-22 20:00 ` Chris Torek
  2023-03-22 21:26   ` Chris Torek
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Torek @ 2023-03-22 20:00 UTC (permalink / raw)
  To: Jan Rüegg; +Cc: git

On Wed, Mar 22, 2023 at 3:35 AM Jan Rüegg <rggjan@gmail.com> wrote:
> Why doesn't "git commit --all" keep the files added to the staging
> area, so "git commit --all" would have the same behaviour as "git add
> --all && git commit"?

It's not completely clear to me what behavior you *want* here,
but I can answer the "why doesn't" question.

Remember that `git commit` can be aborted (in several ways).  If
it *is* aborted, whatever it did must be rolled back.  This means
that `git commit -a`, which is otherwise a lot like `git add -u &&
git commit`, is fundamentally different from a successful `git add
-u` followed by a `git commit`.  In particular, if the `git
commit` *is* aborted, the `git commit -a` command *must not* leave
the updated files in the index.  The `git commit` without `-a` did
not update any files in the index, so it does not have to "roll
back" the index update.

To achieve this, `git commit -a` doesn't actually update *the*
index at all.  Instead, it prepares a *new* index and updates
that new index.  It then proceeds to do the committing, using
the new index, rather than the main index, as if it were *the*
index.  If all goes well so that the commit is done, the code
then swaps in the *new* index for *the* index.  If the commit
is aborted, the code simply *deletes* the new index, leaving
*the* index unchanged.

This is fundamentally different from "update *the* index, then
start committing, then either roll back *the* index or leave it in
place": there's a secondary temporary index involved here.

Things get even more complicated if you use `git commit --only`
(vs `git commit --include`) as in this case there are *three*
indices simultaneously active ("the" index, "proposed index for
commit", and "proposed index if commit succeeds").  All three
must be managed carefully and correctly.

It is unwise to invoke `git add` in a pre-commit hook precisely
because there may be two or three indices, and `git add` cannot
affect all of them correctly.

Note that Git could be designed differently (with real databases
that have roll-forward and roll-back options), but that's a much
bigger change.  The process described here is merely how Git works
now, which explains the constraints on using `git add` in pre-
commit hooks.

Chris

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Destructive pre-commit behaviour and "--all"
  2023-03-22 20:00 ` Chris Torek
@ 2023-03-22 21:26   ` Chris Torek
  2023-03-22 21:58     ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Torek @ 2023-03-22 21:26 UTC (permalink / raw)
  To: Jan Rüegg; +Cc: git

On Wed, Mar 22, 2023 at 1:00 PM Chris Torek <chris.torek@gmail.com> wrote:
[snip]
> Note that Git could be designed differently (with real databases
> that have roll-forward and roll-back options), but that's a much
> bigger change. ...

I realized on re-reading my reply here that there's an odd bit at
the end.  It's because I decided to simplify something away, but
that left the oddness: bigger change than what?  So I'll bring the
extra part back in.

For a `git commit` *without* `-a`, Git has a special case: after
it runs the pre-commit hook, it reloads *the* index, as there's
only the one index.  This makes it possible to run `git add`
successfully in a pre-commit hook.  But when using `--only`,
`--include`, or `--all` / `-a`, there are multiple index files.
Git therefore *doesn't* re-load "the" index.

It would in theory be possible for Git to load *the* index twice,
once before and once after the hook, and compare them to see what
changed, then perhaps try to use that change to update the
additional indices.  That would be a pretty big change, but if it
were done right, it might get what you want.

Chris

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Destructive pre-commit behaviour and "--all"
  2023-03-22 21:26   ` Chris Torek
@ 2023-03-22 21:58     ` Junio C Hamano
  2023-03-28  9:33       ` Jan Rüegg
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2023-03-22 21:58 UTC (permalink / raw)
  To: Chris Torek; +Cc: Jan Rüegg, git

Chris Torek <chris.torek@gmail.com> writes:

> It would in theory be possible for Git to load *the* index twice,
> once before and once after the hook, and compare them to see what
> changed, then perhaps try to use that change to update the
> additional indices.  That would be a pretty big change, but if it
> were done right, it might get what you want.

FYI, we tried not to do the extra re-reading, because pre-commit
hook was designed to be a mechanism to allow users to validate, but
not correct, what gets committed.  As the system originally was
designed, users who correctly use Git would *not* be modifying the
index.  Because it is an error to modify the index in the hook, (1)
re-reading the index just in case the user commits such a mistake is
waste of resources, and (2) checking the index to make sure it did
not change before and after invoking the hook, again, is also waste
of resources.

It may have been a mistake that we re-read the index in some case,
which adds to the confusion, but not others.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Destructive pre-commit behaviour and "--all"
  2023-03-22 21:58     ` Junio C Hamano
@ 2023-03-28  9:33       ` Jan Rüegg
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Rüegg @ 2023-03-28  9:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Chris Torek, git

Thanks Junio, Chris for the detailed explanations

> FYI, we tried not to do the extra re-reading, because pre-commit
hook was designed to be a mechanism to allow users to validate, but
not correct, what gets committed.

I see, and wasn't quite aware of this. The issue I see is that, in
contrast to what it was designed for, pre-commit is in practice used
to change files.

The pre-commit project, for example (https://pre-commit.com/) has many
hooks (as seen here:
https://pre-commit.com/#4-optional-run-against-all-the-files) has many
useful file-changing hooks. Small cleanup tasks like running code
formatters, fixing end of line spaces etc. are very useful to run
while committing. As this has become more common, also more dangerous
hooks are run as part of pre-commit, like ruff
(https://github.com/charliermarsh/ruff-pre-commit) which has pretty
heavy auto-fixes like automatically removing unused imports.

This is usually not a problem. When you add all your changes to the
staging area ("git add --all") and then commit, you will see what
additional changes happened in the auto-fixes during the pre-commit.
With a "git commit -a", however, this can potentially break your files
without you knowing what exactly was changed in your code. Since that
code was not committed or added to an index beforehand, it's
impossible to find out what the hook changed.

As a developer working on a project using "destructive" pre-commit
hooks, I was just wondering what can be done to make this "safer".
Looking at the issue and comments in
https://github.com/pre-commit/pre-commit/issues/1499 it seems the
pre-commit hooks themselves don't have a way to do that. So it seems
the only option to do this would be to add it somehow to git itself.

I was thinking of something like a setting "git config
commit.add_before_precommit_runs true" which would emulate the
behaviour of "git add --all && git commit". Or is there some other way
(in git, the pre-commit hook, or anywhere else) to make this existing
workflow safer without resorting to bash aliases and stuff like that?

Jan

On Wed, 22 Mar 2023 at 22:58, Junio C Hamano <gitster@pobox.com> wrote:
>
> Chris Torek <chris.torek@gmail.com> writes:
>
> > It would in theory be possible for Git to load *the* index twice,
> > once before and once after the hook, and compare them to see what
> > changed, then perhaps try to use that change to update the
> > additional indices.  That would be a pretty big change, but if it
> > were done right, it might get what you want.
>
> FYI, we tried not to do the extra re-reading, because pre-commit
> hook was designed to be a mechanism to allow users to validate, but
> not correct, what gets committed.  As the system originally was
> designed, users who correctly use Git would *not* be modifying the
> index.  Because it is an error to modify the index in the hook, (1)
> re-reading the index just in case the user commits such a mistake is
> waste of resources, and (2) checking the index to make sure it did
> not change before and after invoking the hook, again, is also waste
> of resources.
>
> It may have been a mistake that we re-read the index in some case,
> which adds to the confusion, but not others.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-03-28  9:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-22 10:34 Destructive pre-commit behaviour and "--all" Jan Rüegg
2023-03-22 20:00 ` Chris Torek
2023-03-22 21:26   ` Chris Torek
2023-03-22 21:58     ` Junio C Hamano
2023-03-28  9:33       ` Jan Rüegg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).