* smudge filters during checkout & crash consistency
From: Derek Moore @ 2014-11-12 17:46 UTC (permalink / raw)
To: git
I have a case where I would like to smudge files according to the
reflog information of the switching-to branch.
This is difficult to achieve because updating HEAD to the new
switched-to refname or commit hash is the last step performed in a
checkout prior to calling the post-checkout hook, and smudge filters
process content during the rewriting of the index and work-tree before
HEAD is updated.
I believe this weakness of checkout & filters also exposes a crash
consistency concern. Suppose power is lost during a long-running
checkout while the index/worktree is being updated but before the new
HEAD file is written.
Upon coming back up, your git status will show edits against your
switching-from branch, and possibilities of recovery would rely on
your memory of what you were doing (instead of git-status reporting
"Incomplete checkout to {branch,commit}, 'git checkout --continue' to
continue").
Maybe git could record a CHECKOUT_HEAD at the start of a checkout,
then at the end of the checkout update_refs_for_switch() would move
CHECKOUT_HEAD over top of HEAD instead of rewriting HEAD (but,
presumably, a lot of logic in update_refs_for_switch() would have to
be relocated to when CHECKOUT_HEAD is written, other implications
notwithstanding).
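
A minimal sketch of the CHECKOUT_HEAD idea, modeled on plain files in a temporary directory rather than a real repository. CHECKOUT_HEAD is only the proposal from this message, not an existing git feature, and the function names are hypothetical. The sketch relies on a rename within one filesystem being atomic on POSIX systems.

```python
import os
import tempfile


def begin_checkout(git_dir, target_ref):
    """Record the intended destination before touching index or work tree."""
    path = os.path.join(git_dir, "CHECKOUT_HEAD")  # hypothetical file name
    with open(path, "w") as f:
        f.write("ref: " + target_ref + "\n")
        f.flush()
        os.fsync(f.fileno())


def pending_checkout(git_dir):
    """Return the recorded destination if a checkout did not finish, else None."""
    path = os.path.join(git_dir, "CHECKOUT_HEAD")
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip()


def finish_checkout(git_dir):
    """Move CHECKOUT_HEAD over HEAD in a single rename."""
    os.replace(os.path.join(git_dir, "CHECKOUT_HEAD"),
               os.path.join(git_dir, "HEAD"))


def demo():
    with tempfile.TemporaryDirectory() as git_dir:
        with open(os.path.join(git_dir, "HEAD"), "w") as f:
            f.write("ref: refs/heads/old\n")
        begin_checkout(git_dir, "refs/heads/new")
        # This is what a status command could report after a crash.
        interrupted = pending_checkout(git_dir)
        finish_checkout(git_dir)
        with open(os.path.join(git_dir, "HEAD")) as f:
            head = f.read().strip()
        return interrupted, pending_checkout(git_dir), head
```

In this model a leftover CHECKOUT_HEAD is the signal that a checkout was interrupted, which is the information the hypothetical "Incomplete checkout" status message would need.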
Crash consistency aside, my workaround for filtering will probably be
to use a fake smudge filter that records the file paths of all
to-be-smudged files to a file under .git/, and then use a post-checkout
hook that will process those files from within the newly checked-out
branch (where I'll be using git-archive to overwrite files).
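
The "fake" smudge filter described above could look roughly like the following sketch. It assumes the filter is configured so that git passes the path as an argument (the `%f` placeholder in the filter command line), with blob content arriving on stdin and the work-tree content expected on stdout. The function names and the log location are hypothetical.

```python
import io
import sys


def record_and_pass_through(path, src, dst, log):
    """Copy blob content unchanged and note which path was smudged."""
    dst.write(src.read())
    log.write(path + "\n")


def main(argv, log_path):
    # Hypothetical wiring: the smudge command is "<this script> %f", and
    # log_path is a file under .git/ that a later hook reads.
    with open(log_path, "a") as log:
        record_and_pass_through(argv[1], sys.stdin.buffer, sys.stdout.buffer, log)


def demo():
    src, dst, log = io.BytesIO(b"data\n"), io.BytesIO(), io.StringIO()
    record_and_pass_through("a/b.txt", src, dst, log)
    return dst.getvalue(), log.getvalue()
```

The filter itself never alters content, so it stays a pure function of the blob; all branch-dependent work is deferred to the hook that reads the recorded paths.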
Seems git could fix these two concerns in one fell swoop.
Thanks,
Derek
* Re: smudge filters during checkout & crash consistency
From: Junio C Hamano @ 2014-11-12 18:30 UTC (permalink / raw)
To: Derek Moore; +Cc: git
Derek Moore <derek.p.moore@gmail.com> writes:
> I have a case where I would like to smudge files according to the
> reflog information of the switching-to branch.
Don't do that.
When you have branches A, B and C, and a path F is the same between
branches A and B but different in branch C, if you start from branch
C and switch to branch A, F will be updated and obtain your smudge
tailored for "branch A's instance of F".
But if you then switch to B from that state, F will not even be
modified (i.e. it will keep the contents you prepared for "branch
A's instance of F").
In short, do not make clean/smudge depend on anything but blob
contents.
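
The A/B/C scenario can be replayed with a toy model. This is only an illustration of the argument, not git's checkout code: trees are plain dicts, and the model assumes a checkout rewrites exactly those paths whose blob differs between the two trees. All names are hypothetical.

```python
def checkout(worktree, current_tree, target_tree, smudge, branch):
    """Toy model: rewrite only paths whose blob differs between the two trees."""
    rewritten = []
    for path, blob in target_tree.items():
        if current_tree.get(path) != blob:
            worktree[path] = smudge(blob, branch)
            rewritten.append(path)
    return rewritten


def branch_dependent_smudge(blob, branch):
    # Deliberately breaks the rule: output depends on the branch, not only the blob.
    return blob + " [smudged for " + branch + "]"


trees = {
    "A": {"F": "blob-1"},
    "B": {"F": "blob-1"},  # same content as on A
    "C": {"F": "blob-2"},
}
worktree = {"F": "blob-2 [smudged for C]"}

# C -> A rewrites F; A -> B then leaves it alone because the blobs match.
first = checkout(worktree, trees["C"], trees["A"], branch_dependent_smudge, "A")
second = checkout(worktree, trees["A"], trees["B"], branch_dependent_smudge, "B")
```

After both switches the work tree sits on B but F still carries the text prepared for A, which is the stale state described above.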
* Re: smudge filters during checkout & crash consistency
From: Derek Moore @ 2014-11-12 19:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Here's a solution that depends only/mostly on blob contents:
1) construct the ident of the blob via an `(echo -e -n "blob <size>\0"
; cat file) | sha1sum` equivalent if an $Id$ string is not found in
its contents,
2) look up the earliest commit with that blob hash at that path, and
3) use the reflog metadata from that earliest commit.
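
Step 1 above can be sketched in a few lines. The sketch assumes a SHA-1 repository and that the bytes passed in are the clean (repository) form of the file; the function name is hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Hash b"blob <size>\\0" followed by the content, as git does for blobs."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
hello_id = git_blob_id(b"hello\n")
```

Steps 2 and 3 need history traversal and are not covered here.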
Then switching C-to-A or C-to-B gives F the same contents, so the
no-op on a later A-to-B switch leaves F correct (although,
conceivably, you may get a commit that is in neither A nor B, but you
will have the earliest introduction of that file at that state).
In other words, always use the earliest occurrence of a specific
content at a given path, earliest commit wins irrespective of
branches. Not the most elegant solution.
I may have to go back and let these people know that outside of build
scripts they can't get what they think they want.
Thanks!
:D
* Re: smudge filters during checkout & crash consistency
From: Derek Moore @ 2014-11-12 20:30 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
> But if you then switch to B from that state, F will not even be
> modified (i.e. it will keep the contents you prepared for "branch
> A's instance of F").
Or: the post-checkout hook used in the workaround looks up the prior
branch via @{-1}, finds all files common between @ & @{-1} that don't
share a latest commit, deletes those files and replaces them singly
with the results of git-archive using the latest commits of those
files relative to @. ("All files common between @ & @{-1}" would need
to be either all non-locally-modified files or making use of git-stash
{save,pop} to preserve local modifications.) All this assumes having
reversible $Format$ strings, so the clean filter can restore the
proper $Format$ string.
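
What "reversible" means here can be shown with a small sketch. It uses an RCS-style `$Id$` keyword as a stand-in for the `$Format$` strings mentioned above, and it assumes the expanded value contains neither `$` nor a newline. The function names are hypothetical.

```python
import re

# Matches both the collapsed form "$Id$" and an expanded "$Id: ... $".
KEYWORD = re.compile(r"\$Id(?::[^$\n]*)?\$")


def smudge(text, value):
    """Expand the keyword for the work tree."""
    return KEYWORD.sub(lambda m: "$Id: " + value + " $", text)


def clean(text):
    """Collapse the keyword back to the form stored in the repository."""
    return KEYWORD.sub("$Id$", text)


stored = "# $Id$\nprint(1)\n"
expanded = smudge(stored, "abc123")
```

The property that matters is `clean(smudge(x, v)) == x` for any value `v`, so the stored blob never depends on what was expanded into the work tree.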
Might be worth doing just so there's at least 1 accurate and
maybe-fast "git rcs keywords substitution using smudge/clean filters"
project on github. ;) Otherwise, users of "git-keyword-substitution"
and "git-rcs-keywords" are being led astray.
* Re: smudge filters during checkout & crash consistency
From: Junio C Hamano @ 2014-11-12 20:51 UTC (permalink / raw)
To: Derek Moore; +Cc: git
Derek Moore <derek.p.moore@gmail.com> writes:
>> But if you then switch to B from that state, F will not even be
>> modified (i.e. it will keep the contents you prepared for "branch
>> A's instance of F").
>
> Or: the post-checkout hook used in the workaround looks up the prior
> branch via @{-1}, finds all files common between @ & @{-1} that don't
> share a latest commit, deletes those files and replaces them singly
> with the results of git-archive using the latest commits of those
> files relative to @. ("All files common between @ & @{-1}" would need
> to be either all non-locally-modified files or making use of git-stash
> {save,pop} to preserve local modifications.) All this assumes having
> reversible $Format$ strings, so the clean filter can restore the
> proper $Format$ string.
>
> Might be worth doing...
I still do not see what you are trying to record in the checked out
source files with your smudge filter, so I won't comment if it might
be "worth" doing.
Your use of reflog suggests to me that whatever you are recording
depends on how you acquired your history in your specific repository
you work in, and your result is not reproducible by other people who
work with you by fetching from a repository that is different from
the repository you work in. E.g. perhaps you have a repository at
GitHub and push into there, and others fetch from there into their
repository. What is in their reflog has no relation to what you
have in your reflog.
That's the nature of distributed life. More generally, in a
distributed world with merges, even between two people who agree
that the tip of the 'master' branch of the project is at a certain
commit, there is no single sensible answer to the question "which
commit changed this path last?" We wouldn't mind anything you may
do to emulate RCS $Id$, but it would be futile.
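
The merge case can be made concrete with a toy history. This is a simplified model, not how git's log machinery simplifies history: each commit maps to the blob it holds for one path, and the walk follows every parent that has the same blob. All names are hypothetical.

```python
def last_changers(commit, parents, blob_at):
    """Toy model: commits that introduced the blob seen at `commit` for one path."""
    blob = blob_at[commit]
    same = [p for p in parents.get(commit, []) if blob_at[p] == blob]
    if not same:
        return {commit}  # differs from every parent, or is a root commit
    found = set()
    for p in same:
        found |= last_changers(p, parents, blob_at)
    return found


# Two branches made the same change independently and were then merged.
parents = {"base": [], "p1": ["base"], "p2": ["base"], "merge": ["p1", "p2"]}
blob_at = {"base": "old", "p1": "new", "p2": "new", "merge": "new"}

answers = last_changers("merge", parents, blob_at)
```

Both `p1` and `p2` come back as equally valid answers, so any keyword expansion that needs exactly one "last commit" has to pick arbitrarily.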