git.vger.kernel.org archive mirror
* Using ‘git replace’ to replace blobs
@ 2010-02-07  1:10 Jonathan Nieder
  2010-02-07  6:45 ` Christian Couder
  0 siblings, 1 reply; 2+ messages in thread
From: Jonathan Nieder @ 2010-02-07  1:10 UTC (permalink / raw)
  To: git; +Cc: Christian Couder

I think it is a known problem that ‘git replace’ cannot be used safely
to replace blobs used in the currently checked out commit.  The man
page says:

	Comparing blobs or trees that have been replaced with
	those that replace them will not work properly.

Indeed, in practice it produces problems. [1]

I would like to start to fix this.  But the correct semantics are not
obvious to me:

 - When writing a tree from an index that includes replaced blobs,
   should the result use the original blobs or the replaced ones?

 - When reading a tree that includes replaced blobs, should the
   resulting cache entries use the original blobs or the replaced
   ones?

My hunch is to say both should use the replaced blobs.  This way,
replacing a blob in a checked-out index would behave in a more
intuitive way, and git filter-branch would make permanent any
substitutions requested through replaced blob entries.

I have not thought it through completely, though.

Thoughts?
Jonathan

[1] For example,

 git init repo
 cd repo
 echo first > 1.txt
 echo second > 2.txt
 git add 1.txt 2.txt
 git commit -m demonstration
 git show --raw
 # Replace the blob for 1.txt (first entry) with the blob for 2.txt.
 git ls-tree HEAD | awk '
	NR == 1 { first = $3 }
	NR == 2 { system("git replace " first " " $3) }
 '
 git status
 rm *
 git checkout -f
 git status

which one would expect to result in a clean tree, produces

 Initialized empty Git repository in /tmp/repo/.git/
 [master (root-commit) 998cc27] demonstration
  2 files changed, 2 insertions(+), 0 deletions(-)
  create mode 100644 1.txt
  create mode 100644 2.txt
 commit 998cc270986f68450f00bda5e5db62f31367ff96
 Author: Jonathan Nieder <jrnieder@gmail.com>
 Date:   Sat Feb 6 18:48:50 2010 -0600

     demonstration

 :000000 100644 0000000... 9c59e24... A  1.txt
 :000000 100644 0000000... e019be0... A  2.txt
 # On branch master
 nothing to commit (working directory clean)
 # On branch master
 # Changed but not updated:
 #   (use "git add <file>..." to update what will be committed)
 #   (use "git checkout -- <file>..." to discard changes in working directory)
 #
 #       modified:   1.txt
 #
 no changes added to commit (use "git add" and/or "git commit -a")


* Re: Using ‘git replace’ to replace blobs
  2010-02-07  1:10 Using ‘git replace’ to replace blobs Jonathan Nieder
@ 2010-02-07  6:45 ` Christian Couder
  0 siblings, 0 replies; 2+ messages in thread
From: Christian Couder @ 2010-02-07  6:45 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Nick Edelen, Sam Vilain

On Sunday 07 February 2010, Jonathan Nieder wrote:
> I think it is a known problem that ‘git replace’ cannot be used safely
> to replace blobs used in the currently checked out commit.  The man
> page says:
>
> 	Comparing blobs or trees that have been replaced with
> 	those that replace them will not work properly.
>
> Indeed, in practice it produces problems. [1]
>
> I would like to start to fix this.  

One way to fix it may be to use a bit in "struct object" that records
whether the object was replaced. I think that in the "Add caching support
to git-daemon" GSoC patches, Nick Edelen did something like that for grafts.
(See http://thread.gmane.org/gmane.comp.version-control.git/127932/)

> But the correct semantics are not 
> obvious to me:
>
>  - When writing a tree from an index that includes replaced blobs,
>    should the result use the original blobs or the replaced ones?

It may depend on why the original blob was replaced in the first place.
I have not thought much about this, though.

>  - When reading a tree that includes replaced blobs, should the
>    resulting cache entries use the original blobs or the replaced
>    ones?

I think it should depend on whether the global variable read_replace_refs is 
set or not.
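
For illustration, here is a minimal sketch of how that switch looks from the
command line, assuming a git that has the --no-replace-objects option (which
turns read_replace_refs off for that invocation). The throwaway repository
and the variable names are made up for the example:

```shell
#!/bin/sh
# Sketch only: read one blob with and without replace refs being honored.
set -e
repo=$(mktemp -d)            # scratch repository, not part of the thread
cd "$repo"
git init -q .

# Store two blobs directly in the object database.
first=$(echo first | git hash-object -w --stdin)
second=$(echo second | git hash-object -w --stdin)

# Ask git to substitute $second whenever $first is read.
git replace "$first" "$second"

# Default: replace refs are honored, so the replacement content is returned.
with_replace=$(git cat-file -p "$first")
# With the option, the original blob is read instead.
without_replace=$(git --no-replace-objects cat-file -p "$first")

echo "default:    $with_replace"
echo "no-replace: $without_replace"
```

The same blob name yields different content depending on the switch, which
is the choice a tree reader would have to make for its cache entries.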

> My hunch is to say both should use the replaced blobs.  This way,
> replacing a blob in a checked-out index would behave in a more
> intuitive way, and git filter-branch would make permanent any
> substitutions requested through replaced blob entries.

It might not always be a good idea to make any substitution permanent.
For example, if you use git replace to improve the bisectability of your
commit history, you may want to keep the original commits.

I know you are talking about blobs, not commits, but perhaps there are some 
similar use cases of replaced blobs.
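
As a small sketch of why a non-permanent substitution can be useful: the
replacement is only a ref, the original blob stays in the object database,
and deleting the ref brings it back. The scratch repository and variable
names below are assumptions made for the example:

```shell
#!/bin/sh
# Sketch only: a replacement can be listed and then undone with "git replace -d".
set -e
repo=$(mktemp -d)            # scratch repository, not part of the thread
cd "$repo"
git init -q .

orig=$(echo original | git hash-object -w --stdin)
new=$(echo substitute | git hash-object -w --stdin)

git replace "$orig" "$new"

# "git replace -l" lists the names of the objects that have a replacement.
listed=$(git replace -l)

# Drop the replace ref; the original blob was never modified or removed.
git replace -d "$orig" >/dev/null
after=$(git cat-file -p "$orig")

echo "replaced object: $listed"
echo "content after deleting the replace ref: $after"
```

Making the substitution permanent (for example by rewriting history) would
give up this ability to go back to the original object.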

Thanks for looking at that,
Christian.

