git.vger.kernel.org archive mirror
* Rename detection at git log
@ 2006-11-20  5:57 Alexander Litvinov
  2006-11-20  9:50 ` Andy Parkins
  2006-11-20 10:06 ` Alex Riesen
  0 siblings, 2 replies; 17+ messages in thread
From: Alexander Litvinov @ 2006-11-20  5:57 UTC (permalink / raw)
  To: git

How can I see all changes for one file, including renames/copies? Currently
I don't know how to see them:

> mkdir 1 && cd 1 && git init-db
defaulting to local storage area
> date >> a
> git add a
> git commit -a -m "1"
Committing initial tree c47d83a6544612309aad57519ca831cf62a489d5
> mkdir b
> git mv a b/
> git commit -a -m "2"
> PAGER=cat git log -M -C --pretty=oneline b/a
3b591f7147ee8dbe15fdf456db5730072d41bed8 2
>

On the last line I would like to see two commits: the rename a -> b/a and the
creation of a. By the way, how can I see the commit message with git log?

Thanks for help.


* Re: Rename detection at git log
  2006-11-20  5:57 Rename detection at git log Alexander Litvinov
@ 2006-11-20  9:50 ` Andy Parkins
  2006-11-20 10:07   ` Junio C Hamano
  2006-11-20 10:06 ` Alex Riesen
  1 sibling, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2006-11-20  9:50 UTC (permalink / raw)
  To: git; +Cc: Alexander Litvinov

On Monday 2006 November 20 05:57, Alexander Litvinov wrote:

> > PAGER=cat git log -M -C --pretty=oneline b/a

I've come across this too.  Personally I'm not sure what use "-C" is.  From 
the manpage, man git-diff-files (no, this isn't the place I'd look either).

--find-copies-harder
For performance reasons, by default, -C option finds copies only if the 
original file of the copy was modified in the same changeset. This flag makes 
the command inspect unmodified files as candidates for the source of copy. 
This is a very expensive operation for large projects, so use it with 
caution.

That is to say, unless the file you are copying was modified AND copied in
the same commit, it won't be searched as a potential source for the copy
operation.  I think it would be rare to make a copy of a file I had just
modified; surely I'd want to check in the modifications before making a copy?

Regardless, to get the results you want, use the stronger
switch --find-copies-harder, heeding the warning that on big projects it will
be very slow.



Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE


* Re: Rename detection at git log
  2006-11-20  5:57 Rename detection at git log Alexander Litvinov
  2006-11-20  9:50 ` Andy Parkins
@ 2006-11-20 10:06 ` Alex Riesen
  2006-11-20 10:23   ` Andy Parkins
  1 sibling, 1 reply; 17+ messages in thread
From: Alex Riesen @ 2006-11-20 10:06 UTC (permalink / raw)
  To: Alexander Litvinov; +Cc: git

On 11/20/06, Alexander Litvinov <litvinov2004@gmail.com> wrote:
> How can I see all changes for one file ? Including renames/copies ?

git log -M -C -r --name-status

> PAGER=cat git log -M -C --pretty=oneline b/a
>
> At lastline I would like to see two commits : renaming a -> b/a and creation
> of a. By the way, how can I see commit message with git log ?



* Re: Rename detection at git log
  2006-11-20  9:50 ` Andy Parkins
@ 2006-11-20 10:07   ` Junio C Hamano
  2006-11-20 10:11     ` Jakub Narebski
                       ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Junio C Hamano @ 2006-11-20 10:07 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git, Alexander Litvinov

Andy Parkins <andyparkins@gmail.com> writes:

> On Monday 2006 November 20 05:57, Alexander Litvinov wrote:
>
>> > PAGER=cat git log -M -C --pretty=oneline b/a
>
> I've come across this too.  Personally I'm not sure what use "-C" is.  From 
> the manpage, man git-diff-files (no, this isn't the place I'd look either).

The real issue here is that the b/a on the command line
applies on the input side, and does not act as an output
filter.  This comes from a _very_ early design decision, and if you
dig through the list archive you will see Linus and me arguing about
diffcore-pathspec (which later died off).

What it means is that "git log" will look at paths that match
b/a (that means b/a/c and b/a/d are looked at, if b/a were a
directory).  Since path "a", which is where the file was
originally, is not something the pattern b/a matches, there is
no way b/a is noticed as a rename from a.

I've been meaning to resurrect Fredrik's --single-follow=path
patch but haven't had time to recently, with all the other
interesting discussion happening on the list.



* Re: Rename detection at git log
  2006-11-20 10:07   ` Junio C Hamano
@ 2006-11-20 10:11     ` Jakub Narebski
  2006-11-20 10:22     ` Andy Parkins
  2006-11-20 11:33     ` Alexander Litvinov
  2 siblings, 0 replies; 17+ messages in thread
From: Jakub Narebski @ 2006-11-20 10:11 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Andy Parkins <andyparkins@gmail.com> writes:
> 
>> On Monday 2006 November 20 05:57, Alexander Litvinov wrote:
>>
>>> > PAGER=cat git log -M -C --pretty=oneline b/a
>>
>> I've come across this too.  Personally I'm not sure what use "-C" is.  From 
>> the manpage, man git-diff-files (no, this isn't the place I'd look either).
> 
> The real issue here is because the b/a on the command line
> applies on the input-side, and does not act as the output
> filter.  This comes from _very_ early design decision and if you
> dig the list archive you will see Linus and I arguing about
> diffcore-pathspec (which later died off).
> 
> What it means is that "git log" will look at path that matches
> b/a (that means b/a/c and b/a/d are looked at, if b/a were a
> directory).  Since path "a" which is what the file was
> originally at is not something the pattern b/a matches, there is
> no way b/a is noticed as a rename from a.
> 
> I've been meaning to resurrect Fredrik's --single-follow=path
> patch but haven't had time to recently, with all the other
> interesting discussion happening on the list.

But for now, you can use

  PAGER= git log -M -C -- b/a a
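
A minimal sketch of the difference, replaying the original poster's steps in
a scratch repository. It assumes a reasonably recent Git (git init instead of
git init-db); the identity values are placeholders, and --follow is an option
from later Git releases that is not discussed in this thread.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity

date > a
git add a
git commit -q -m "1"
mkdir b
git mv a b/
git commit -q -m "2"

# Pathspec names only the new path: the history walk never looks at "a".
new_only=$(git log --pretty=oneline -M -C -- b/a | wc -l | tr -d ' ')

# Pathspec names both the new and the old path: both commits are listed.
both=$(git log --pretty=oneline -M -C -- b/a a | wc -l | tr -d ' ')

# Assumption: a newer Git that has --follow (not part of this thread).
followed=$(git log --follow --pretty=oneline -- b/a | wc -l | tr -d ' ')

echo "b/a only: $new_only   b/a and a: $both   --follow b/a: $followed"

# With both paths in the pathspec, the rename shows up in --name-status.
git --no-pager log -M -C --name-status --pretty=oneline -- b/a a
```

The drawback of listing both paths by hand is that you have to know the old
name already, which is exactly what you usually want git to tell you.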

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Rename detection at git log
  2006-11-20 10:07   ` Junio C Hamano
  2006-11-20 10:11     ` Jakub Narebski
@ 2006-11-20 10:22     ` Andy Parkins
  2006-11-20 10:48       ` Junio C Hamano
  2006-11-20 11:33     ` Alexander Litvinov
  2 siblings, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2006-11-20 10:22 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alexander Litvinov

On Monday 2006 November 20 10:07, Junio C Hamano wrote:

> The real issue here is because the b/a on the command line
> applies on the input-side, and does not act as the output
> filter.  This comes from _very_ early design decision and if you
> dig the list archive you will see Linus and I arguing about
> diffcore-pathspec (which later died off).

I don't think so; even without the b/a on the command line, git does not find 
copies made in this way...

$ git init-db
defaulting to local storage area
$ date > fileA
$ git add fileA
$ git commit -a -m "fileA"
Committing initial tree 3ef607fd139dd955f868305462d99dfc4cfff70f
$ cp fileA fileB
$ git add fileB
$ git commit -a -m "fileA -> fileB"

Now let's try and get git-diff to notice this was a copy...

$ git diff HEAD^..HEAD | cat
diff --git a/fileB b/fileB
new file mode 100644
index 0000000..ec620df
--- /dev/null
+++ b/fileB
@@ -0,0 +1 @@
+Mon Nov 20 10:16:29 GMT 2006
$ git diff -C HEAD^..HEAD | cat
diff --git a/fileB b/fileB
new file mode 100644
index 0000000..ec620df
--- /dev/null
+++ b/fileB
@@ -0,0 +1 @@
+Mon Nov 20 10:16:29 GMT 2006
$ git diff --find-copies-harder HEAD^..HEAD | cat
diff --git a/fileA b/fileB
similarity index 100%
copy from fileA
copy to fileB

As I said - I don't see what "-C" ever does for you in all but the rarest of 
uses.  --find-copies-harder is the only way to list copies successfully.  
It's nothing to do with any input or output filtering.



Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE


* Re: Rename detection at git log
  2006-11-20 10:06 ` Alex Riesen
@ 2006-11-20 10:23   ` Andy Parkins
  2006-11-20 10:51     ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2006-11-20 10:23 UTC (permalink / raw)
  To: git

On Monday 2006 November 20 10:06, Alex Riesen wrote:

> remove --pretty=oneline, it is default behavior of git log.

No it's not; are you confusing it with --pretty=short?


Andy

-- 
Dr Andy Parkins, M Eng (hons), MIEE


* Re: Rename detection at git log
  2006-11-20 10:22     ` Andy Parkins
@ 2006-11-20 10:48       ` Junio C Hamano
  2006-11-20 11:01         ` Andy Parkins
  2006-11-20 11:28         ` Junio C Hamano
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2006-11-20 10:48 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

> On Monday 2006 November 20 10:07, Junio C Hamano wrote:
>
>> The real issue here is because the b/a on the command line
>> applies on the input-side, and does not act as the output
>> filter.  This comes from _very_ early design decision and if you
>> dig the list archive you will see Linus and I arguing about
>> diffcore-pathspec (which later died off).
>
> I don't think so; even without the b/a on the command line,
> git does not find copies made in this way...

I wrote the code and you contradict me ;-)?

Trust me, I know this area reasonably well, to the point that
sometimes I wonder if there is a sane and cheap way to change
the meaning of the pathspec to be an output filter and then
quickly say "Nah" to myself.

If you say

	git diff --find-copies-harder HEAD^..HEAD -- fileB

in your example, it would give you the creation of fileB, not
copy.

There are a few things we need to be careful about rename/copy.

 - Typically too small files are not treated as copies unless
   they are identical copies (does not apply to this case,
   luckily).

 - Renames are only picked up from files that were lost in the
   same change (i.e. "mv fileA fileB" creates fileB and loses
   fileA; fileB is checked if it is similar to fileA in the
   original).

 - Copies are only picked up from files that were changed in the
   same change (i.e. splitting major part of original file and
   moving it to somewhere else, while leaving a skeleton in the
   original file).  "harder" is needed if the copy original was
   untouched, as you found out.

The last one is a compromise between performance and thoroughness,
and the "harder" is one knob to tweak its behaviour.
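
The copy rules above can be tried in a scratch repository. This is only a
sketch under assumptions: the file names and identity values are placeholders,
and it was written against a recent Git, so output details may differ on
older versions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity

for i in 1 2 3 4 5 6 7 8 9 10 11 12; do echo "line number $i"; done > fileA
git add fileA
git commit -q -m "fileA"

# Case 1: copy made while the source is left untouched.
cp fileA fileB
git add fileB
git commit -q -m "fileA -> fileB"
plain=$(git diff -C --name-status HEAD^ HEAD)
harder=$(git diff --find-copies-harder --name-status HEAD^ HEAD)

# Case 2: copy made while the source is modified in the same commit.
cp fileA fileC
echo "a new line in the original" >> fileA
git add fileA fileC
git commit -q -m "fileA -> fileC, fileA edited"
touched=$(git diff -C --name-status HEAD^ HEAD)

printf 'source untouched, -C:\n%s\n' "$plain"
printf 'source untouched, --find-copies-harder:\n%s\n' "$harder"
printf 'source modified, -C:\n%s\n' "$touched"
```

In the first case plain -C reports fileB as an addition and only
--find-copies-harder reports a copy; in the second case plain -C is enough,
because the modified fileA is already a candidate source.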

In the kernel archive, 

	git show -C ad2f931d

tells us that:

 - drivers/i2c/chips/Kconfig lost major part of it and only
   skeletal part of the original remains in it;

 - major part of it went to drivers/hwmon/Kconfig;

The story is similar to the Makefile next door.


* Re: Rename detection at git log
  2006-11-20 10:23   ` Andy Parkins
@ 2006-11-20 10:51     ` Junio C Hamano
  2006-11-20 11:17       ` Andy Parkins
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2006-11-20 10:51 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

> On Monday 2006 November 20 10:06, Alex Riesen wrote:
>
>> remove --pretty=oneline, it is default behavior of git log.
>
> No it's not; are you confusing it with --pretty=short?

I think Alex (Riesen) is saying "you (Alex Litvinov) were
wondering why you do not see the commit log message but only the
first line. That is because you are using --pretty=oneline.
Lose it, then you would get what you want because giving the log
message _is_ the default".


* Re: Rename detection at git log
  2006-11-20 10:48       ` Junio C Hamano
@ 2006-11-20 11:01         ` Andy Parkins
  2006-11-20 11:15           ` Jakub Narebski
  2006-11-20 11:28         ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2006-11-20 11:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

On Monday 2006 November 20 10:48, Junio C Hamano wrote:

> I wrote the code and you contradict me ;-)?

Sorry; I wasn't so much contradicting that the filtering works exactly as you 
say (of course it must - I don't know anywhere near enough to make that sort 
of assertion).

However, I do think that the problem is not one of filtering.  I was saying 
that "-C" has no practical use.

> in your example, it would give you the creation of fileB, not
> copy.

I'm sure it would - but you had to use --find-copies-harder; -C would not find 
it as a copy.

>  - Renames are only picked up from files that were lost in the
>    same change (i.e. "mv fileA fileB" creates fileB and loses
>    fileA; fileB is checked if it is similar to fileA in the
>    original).

I've found rename detection to be flawless in all my uses.

>  - Copies are only picked up from files that were changed in the
>    same change (i.e. splitting major part of original file and
>    moving it to somewhere else, while leaving a skeleton in the
>    original file).  "harder" is needed if the copy original was
>    untouched, as you found out.

Yep; I understand that.  I also understand that it is done for performance 
reasons.  However, since the typical copy will be one where the source 
doesn't change at the same time, I am arguing that the non-hard copy 
detection isn't much use.

> The last one is a compromise between performance and thoroughness,
> and the "harder" is one knob to tweak its behaviour.

I've been poking in tree-diff.c to see if I can understand why it is such a
performance hog.  I still haven't.  Each file is stored under its hash, right?
So for copy detection why can't you just search for other files with the same
hash, which I presume is very fast (as it is the basis of what makes git so
fast)?

I am probably misunderstanding git, but I guess that a copy isn't even needed 
in the database because two files with the same hash in the working copy only 
need storing once and then referencing twice.  So for a copy (again, with my 
simple understanding of git) we'd have:

 commit1 -> tree1 -> fileA = fileA_hash
    ^
    |
 commit2 -> tree2 -> fileA = fileA_hash
                     fileB = fileB_hash

Doesn't that mean that copy detection is just a matter of searching the parent 
commit trees for references to the same hash?
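
The storage half of this guess can be checked directly. A small sketch
(scratch repository, placeholder identity; git ls-tree and git rev-parse are
used here only to print object names) suggests that an exact copy adds a
second tree entry pointing at the same blob:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity

date > fileA
git add fileA
git commit -q -m "fileA"
cp fileA fileB
git add fileB
git commit -q -m "fileA -> fileB"

# Each tree entry is printed as "<mode> blob <hash>	<path>".
git ls-tree HEAD

# Resolve the blob each path refers to in the latest commit.
hash_a=$(git rev-parse HEAD:fileA)
hash_b=$(git rev-parse HEAD:fileB)
echo "fileA=$hash_a"
echo "fileB=$hash_b"
```

So for a pure copy, fileB in the diagram would carry the same hash as fileA.
Matching on identical hashes only covers copies that were not edited.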


Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE


* Re: Rename detection at git log
  2006-11-20 11:01         ` Andy Parkins
@ 2006-11-20 11:15           ` Jakub Narebski
  2006-11-20 11:32             ` Junio C Hamano
  2006-11-20 11:59             ` Andy Parkins
  0 siblings, 2 replies; 17+ messages in thread
From: Jakub Narebski @ 2006-11-20 11:15 UTC (permalink / raw)
  To: git

Andy Parkins wrote:

> On Monday 2006 November 20 10:48, Junio C Hamano wrote:
>
>>  - Copies are only picked up from files that were changed in the
>>    same change (i.e. splitting major part of original file and
>>    moving it to somewhere else, while leaving a skeleton in the
>>    original file).  "harder" is needed if the copy original was
>>    untouched, as you found out.
> 
> Yep; I understand that.  I also understand that it is done for performance 
> reasons.  However, since the typical copy will be one where the source 
> doesn't change at the same time, I am arguing that the non-hard copy 
> detection isn't much use.

I'm not sure about this. You usually do both pure renames (to reorganize
files, to give a file a better name) and renames with modification, but
I don't think that copy without modification is very common. Usually you
copy a file because you take one file as a template for the other, or you
split a file, or you join files into one file.
 
>> The last one is a compromise between performance and thoroughness,
>> and the "harder" is one knob to tweak its behaviour.
> 
> I've been poking in tree-diff.c to see if I can understand why it is such a
> performance hog.  I still haven't.  Each file is stored under its hash right?  
> So for copy detection why can't you just search for other files with the same 
> hash, which I presume is very fast (as it is the basis of what makes git so 
> fast)?

Copy and rename detection are done by comparing the contents and calculating
similarity. So to check if files A and B are copies (not necessarily pure
copies) it is not enough to compare hashes.

That said, it should be fairly easy (if not that useful in real projects,
as I understand it, as stated above) to extend copy detection to find
pure copies by comparing hashes. Even so, --find-copies-harder would still be
needed if the copy original was untouched, while the copy itself was modified.

> I am probably misunderstanding git, but I guess that a copy isn't even needed 
> in the database because two files with the same hash in the working copy only 
> need storing once and then referencing twice.  So for a copy (again, with my 
> simple understanding of git) we'd have:
> 
>  commit1 -> tree1 -> fileA = fileA_hash
>     ^
>     |
>  commit2 -> tree2 -> fileA = fileA_hash
>                      fileB = fileB_hash
> 
> Doesn't that mean that copy detection is just a matter of searching the parent 
> commit trees for references to the same hash?

Think copy'n'change.
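
A rough illustration of copy'n'change, as a sketch in a scratch repository
(file names and identity are placeholders; the similarity score git prints
depends on the contents and possibly on the Git version):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity

for i in 1 2 3 4 5 6 7 8 9 10 11 12; do echo "line number $i"; done > fileA
git add fileA
git commit -q -m "fileA"

# Copy the file, then edit only the copy; the original stays untouched.
cp fileA fileB
echo "an extra line only in the copy" >> fileB
git add fileB
git commit -q -m "copy and change"

# The two blobs now have unrelated hashes ...
hash_a=$(git rev-parse HEAD:fileA)
hash_b=$(git rev-parse HEAD:fileB)

# ... yet content comparison still reports fileB as a copy of fileA.
status=$(git diff --find-copies-harder --name-status HEAD^ HEAD)

echo "fileA=$hash_a"
echo "fileB=$hash_b"
echo "$status"
```

One appended line is enough to make the hashes differ, so a hash lookup alone
would miss this copy; the similarity estimate is what catches it.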
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Rename detection at git log
  2006-11-20 10:51     ` Junio C Hamano
@ 2006-11-20 11:17       ` Andy Parkins
  0 siblings, 0 replies; 17+ messages in thread
From: Andy Parkins @ 2006-11-20 11:17 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

On Monday 2006 November 20 10:51, Junio C Hamano wrote:

> I think Alex (Riesen) is saying "you (Alex Litvinov) were
> wondering why you do not see the commit log message but only the
> first line. That is because you are using --pretty=oneline.
> Lose it, then you would get what you want because giving the log
> message _is_ the default".

You're right.  Apologies to Alex for my misunderstanding.

-- 
Dr Andy Parkins, M Eng (hons), MIEE


* Re: Rename detection at git log
  2006-11-20 10:48       ` Junio C Hamano
  2006-11-20 11:01         ` Andy Parkins
@ 2006-11-20 11:28         ` Junio C Hamano
  2006-11-20 12:16           ` Andy Parkins
  1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2006-11-20 11:28 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> There are a few things we need to be careful about rename/copy.
>...
>  - Copies are only picked up from files that were changed in the
>    same change (i.e. splitting major part of original file and
>    moving it to somewhere else, while leaving a skeleton in the
>    original file).  "harder" is needed if the copy original was
>    untouched, as you found out.
>
> The last one is a compromise between performance and thoroughness,
> and the "harder" is one knob to tweak its behaviour.

If people are well disciplined, code refactoring (which can
trigger rename/copy detection) tends to affect both source and
destination files at the same time, so many times -C finds what
you want without --find-copies-harder.

But sometimes the source stays the same and you literally have a
duplicate (possibly with some modifications) in the new
destination.  Finding an exact copy is cheap (diffcore-rename has a
double loop that first finds exact copies without similarity
estimation, which is very cheap, and then goes on to open blobs
and does its similarity magic for destinations whose origin is
still unknown) but copy/rename with edit is not, and the "harder"
variant feeds _everything_ from the older tree as a candidate
copy source, so it is very expensive for huge projects.

> In the kernel archive, 
>
> 	git show -C ad2f931d
>
> tells us that:
>
>  - drivers/i2c/chips/Kconfig lost major part of it and only
>    skeletal part of the original remains in it;
>
>  - major part of it went to drivers/hwmon/Kconfig;
>
> The story is similar to the Makefile next door.

Having said all that, I think the rename/copy as a wholesale
operation on one file is an uninteresting special case.  The
generic case that happens far more often in practice is
lines moving around across files, and the new "git blame" gives
you a better picture to answer the "where the heck did this come
from" question.

For example,

	git blame -f -n -C 'ad2f931d^!' -- drivers/hwmon/Kconfig

on the same commit would show that many of its lines came from
i2c/chips/Kconfig but not all of them.

There are quite a few other things I should probably mention for
new people on the list about rename/copy/break heuristics but it
is getting late so I'd defer it to some other time.


* Re: Rename detection at git log
  2006-11-20 11:15           ` Jakub Narebski
@ 2006-11-20 11:32             ` Junio C Hamano
  2006-11-20 11:59             ` Andy Parkins
  1 sibling, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2006-11-20 11:32 UTC (permalink / raw)
  To: jnareb; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> That said, it should be fairly easy (if not that useful in true projects
> as I understand it, as stated above) to add to copy detection detection of
> pure copies by comparing hashes.

That is already done as a performance measure (notice the double
loop controlled with "contents_too" in diffcore_rename()).


* Re: Rename detection at git log
  2006-11-20 10:07   ` Junio C Hamano
  2006-11-20 10:11     ` Jakub Narebski
  2006-11-20 10:22     ` Andy Parkins
@ 2006-11-20 11:33     ` Alexander Litvinov
  2 siblings, 0 replies; 17+ messages in thread
From: Alexander Litvinov @ 2006-11-20 11:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

> What it means is that "git log" will look at path that matches
> b/a (that means b/a/c and b/a/d are looked at, if b/a were a
> directory).  Since path "a" which is what the file was
> originally at is not something the pattern b/a matches, there is
> no way b/a is noticed as a rename from a.

I have found that git blame shows the correct commits for this case. But I am
still in trouble when examining a file's history. I have found I can use
git show -C -M commit-sha1
on the commit where the file was created to see if this was a rename :-)



* Re: Rename detection at git log
  2006-11-20 11:15           ` Jakub Narebski
  2006-11-20 11:32             ` Junio C Hamano
@ 2006-11-20 11:59             ` Andy Parkins
  1 sibling, 0 replies; 17+ messages in thread
From: Andy Parkins @ 2006-11-20 11:59 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski

On Monday 2006 November 20 11:15, Jakub Narebski wrote:

> I'm not sure about this. You usually both do pure renames (to reorganize
> files, to give file a better name) and renames with modification, but
> I don't think that copy without modification is very common. Usually you
> copy a file because you take one file as template for the other, or you
> split file, or you join files into one file.

Exactly - unfortunately it's the /source/ that has to be modified to be 
included in the potential list.  Who copies a file then modifies the 
original?  The copy is by definition already one of the modified files.

"For performance reasons, by default, -C option finds copies only if the 
original file of the copy was modified in the same changeset. This flag makes"

Your points about copy-and-change accepted.  Hash comparison is not 
sufficient.


Andy

-- 
Dr Andy Parkins, M Eng (hons), MIEE


* Re: Rename detection at git log
  2006-11-20 11:28         ` Junio C Hamano
@ 2006-11-20 12:16           ` Andy Parkins
  0 siblings, 0 replies; 17+ messages in thread
From: Andy Parkins @ 2006-11-20 12:16 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

On Monday 2006 November 20 11:28, Junio C Hamano wrote:

> If people are well disciplined, code refactoring (which can
> trigger rename/copy detection) tend to affect both source and
> destination files at the same time, so many times -C finds what
> you want without --find-copies-harder.

That's true; however, I don't think that refactoring is the common operation.  
Usually it's (as Jakub says) copy-and-modify-the-copy.  In that case the 
original is untouched.

> Having said all that, I think the rename/copy as a wholesale
> operation on one file is an uninteresting special case.  The
> generic case that happens far more often in practice is the
> lines moving around across files, and the new "git blame" gives
> you better picture to answer "where the heck did this come from"
> question.

To help the version control system underneath, I have always obeyed the
discipline of not copying/moving and modifying in the same commit.  git has the
potential to remove this necessity, but I'd still like all my old commits to
have the copies detected correctly.

As an example: I've got a colleague who works on a project where each new 
version begins as a copy of the old one (it's not the way I'd work, but I 
think git is flexible enough to cope with anything).  So, project1/ exists 
and is copied to project2/ to begin work.  I suppose this is effectively 
branching using the filesystem rather than the version control system.  I 
noticed (and was surprised) that git didn't detect this as a copy.  No files 
were changed in the copy, so I thought git would easily spot this.

The problem is that the next project can be a copy of either project1/ or 
project2/.  All this has already gone on for a few years.  I've recently 
imported this into git and was examining the history.  I wanted to know for a 
particular subdirectory (of many) which of the others it was based off.  I 
was in qgit, and found that the commit didn't show as a copy, it showed as a 
create, and hence I couldn't tell which was the parent project.  It's a shame 
because all the mechanisms are there to show the operation, it just isn't 
shown (without --find-copies-harder).

git-blame is obviously of huge use for these detailed analyses of individual 
line history.  However, in the simple case of a commit being a 100% copy of 
another file, git lets me down.  In fact, in the case described above, it 
wouldn't necessarily help me.  What if it went like this:

project1/ copied to project2/
project2/ copied to project3/

git-blame on a file in project3/ will show that its contents came from a
project1 commit, whereas I want to know its direct parent.


Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE
