git.vger.kernel.org archive mirror
* Proper tracking of copies in git log and others
@ 2009-07-04 16:24 Lasse Kärkkäinen
  2009-07-04 18:31 ` Sean Estabrooks
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Lasse Kärkkäinen @ 2009-07-04 16:24 UTC
  To: git

Getting full history of a file, including that beyond copies, is rather 
important and apparently not currently supported by git tools.

An example use case:

# Create repository
mkdir courses
cd courses
git init
# Course 2007
mkdir -p 2007/exercise1
echo Hello > 2007/exercise1/description.txt
mkdir 2007/exercise2
echo World > 2007/exercise2/description.txt
git add 2007
git commit -m "Course 2007"
# Course 2008
mkdir -p 2008/exercise1
echo New one > 2008/exercise1/description.txt
git add 2008
git commit -m "Course 2008 exercise 1 (new)"
cp -R 2007/exercise1 2008/exercise2
git add 2008/exercise2
git commit -a -m "Course 2008 exercise 2 (from 2007 exercise 1)"
# Course 2009
cp -R 2008 2009
git add 2009
git commit -m "Course 2009 recycled entirely from 2008"

Now, if we do git log --follow 2009/exercise2/description.txt or 
2009/exercise2, it only prints the "Course 2009" commit instead of the 
full history because --follow doesn't follow copies. What we actually 
want is:

commit 9e17341497b29735bc55b6631b43db6e2f50ed30
Author: Lasse Karkkainen <tronic@trn.iki.fi>
Date:   Sat Jul 4 19:05:57 2009 +0300

     Course 2009 recycled entirely from 2008

commit 8fd13a8667f0bc5c4851b366864a207fa85519bc
Author: Lasse Karkkainen <tronic@trn.iki.fi>
Date:   Sat Jul 4 19:05:57 2009 +0300

     Course 2008 exercise 2 (from 2007 exercise 1)

commit 593346660872ada80ba751688fffc7af7a31e124
Author: Lasse Karkkainen <tronic@trn.iki.fi>
Date:   Sat Jul 4 19:05:57 2009 +0300

     Course 2007

Note: the "Course 2008 exercise 1 (new)" commit is not listed, as it is 
unrelated to 2009/exercise2.

Some nice people from #git suggested various commands that would find 
the previous version (e.g. 2008/exercise2) etc, but none of those got 
even close to getting this full history over multiple copies, with log 
messages.

It would be useful if the git tools could produce history like this with 
all the tools (log, blame, gitk, etc), preferably with proper branching 
guesses (guesses because there is no info on where the copy came from), 
but even a linear history (sorted by commit time?) would do much better 
than not having anything.


* Re: Proper tracking of copies in git log and others
  2009-07-04 16:24 Proper tracking of copies in git log and others Lasse Kärkkäinen
@ 2009-07-04 18:31 ` Sean Estabrooks
  2009-07-14 12:19   ` Lasse Kärkkäinen
  2009-07-07  6:53 ` Graeme Geldenhuys
  2009-07-07 13:48 ` Michael J Gruber
  2 siblings, 1 reply; 6+ messages in thread
From: Sean Estabrooks @ 2009-07-04 18:31 UTC
  To: Lasse Kärkkäinen; +Cc: git

On Sat, 04 Jul 2009 19:24:56 +0300
Lasse Kärkkäinen <tronic+dfhy@trn.iki.fi> wrote:

> Getting full history of a file, including that beyond copies, is rather 
> important and apparently not currently supported by git tools.

[...]

> It would be useful if the git tools could produce history like this with 
> all the tools (log, blame, gitk, etc), preferably with proper branching 
> guesses (guesses because there is no info on where the copy came from), 
> but even a linear history (sorted by commit time?) would do much better 
> than not having anything.

Lasse,

You're right, it appears that Git does not currently support what you are
trying to do.   However, if you were to run the command "git log -C -C --raw"
on your test case, you would see that Git can actually detect the copies
in question.  The detection just isn't applied in the "follow" case.
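
As a minimal sketch of the detection described above (not part of the
original message): the helper below, under the made-up name list_copies,
asks git diff-tree for every copy or rename it detects in a single commit.
--find-copies-harder is the long spelling of "-C -C"; the remaining options
only trim the output. When several identical files could have been the
source, Git reports just one of them.

```shell
# list_copies [REV]: print "source -> destination" for each copy or rename
# that Git detects in commit REV (default: HEAD).
# The function name is hypothetical; the git options are standard ones.
list_copies() {
    git diff-tree -r --root --no-commit-id --name-status \
        --find-copies-harder --diff-filter=CR "${1:-HEAD}" |
    awk -F '\t' '{ print $2 " -> " $3 }'   # fields: status, source, destination
}
```

Run inside the test repository, "list_copies HEAD" should print one line for
each file added by the "Course 2009" commit.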

There was a patch sent to the list in January that would have enabled the
functionality you're asking about:

http://kerneltrap.org/mailarchive/git/2009/1/22/4792454

It seems, though, that the author never followed up after Junio questioned
one aspect of the patch.  It also wouldn't do _exactly_ what you requested:
intermediate copies are not shown; copies are linked back to the original.

You could apply that patch and test it out.  Perhaps you could address
Junio's concern or reignite some interest from the original author.

HTH,
Sean


* Re: Proper tracking of copies in git log and others
  2009-07-04 16:24 Proper tracking of copies in git log and others Lasse Kärkkäinen
  2009-07-04 18:31 ` Sean Estabrooks
@ 2009-07-07  6:53 ` Graeme Geldenhuys
  2009-07-07 13:48 ` Michael J Gruber
  2 siblings, 0 replies; 6+ messages in thread
From: Graeme Geldenhuys @ 2009-07-07  6:53 UTC
  To: git

Lasse Kärkkäinen wrote:
> Getting full history of a file, including that beyond copies, is rather 
> important and apparently not currently supported by git tools.

You seem to be right. I'm quite new to Git, so I don't know all the 
commands, but this is what I came up with:


$ git log --raw 2009/exercise2/description.txt
commit edfd954572e844b05a7489bcaa149afb679bc1f6
Author: Graeme Geldenhuys <graeme@mastermaths.co.za>
Date:   Tue Jul 7 08:44:20 2009 +0200

     Course 2009 recycled entirely from 2008

:000000 100644 0000000... e965047... A  2009/exercise2/description.txt


We then take note of the new blob ID (e965047 above) and search the
complete log for it.


$ git log -C --raw | grep e965047
:000000 100644 0000000... e965047... A	2009/exercise2/description.txt
:000000 100644 0000000... e965047... A	2008/exercise2/description.txt
:000000 100644 0000000... e965047... A	2007/exercise1/description.txt


This now shows you the history of the file as it was copied from one
directory to another.
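
The same blob-based search can be sketched without grep, assuming a Git
recent enough to have the --find-object diff option (it postdates this
thread). The function name blob_history is made up for illustration. It
resolves the blob ID of a path and lists the commits that add or remove
that exact blob anywhere in the tree, log messages included. Like the grep
above, it only matches byte-identical content, so a copy that was edited
afterwards ends the trail.

```shell
# blob_history PATH [REV]: list the commits that change the number of
# occurrences of the blob stored at PATH in REV (default: HEAD).
# Hypothetical helper; requires a Git that supports --find-object.
blob_history() {
    blob=$(git rev-parse --verify "${2:-HEAD}:$1") || return 1
    git log --format='%h %s' --find-object="$blob" "${2:-HEAD}"
}
```

For example: blob_history 2009/exercise2/description.txt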


Regards,
   - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://opensoft.homeip.net/fpgui/


* Re: Proper tracking of copies in git log and others
  2009-07-04 16:24 Proper tracking of copies in git log and others Lasse Kärkkäinen
  2009-07-04 18:31 ` Sean Estabrooks
  2009-07-07  6:53 ` Graeme Geldenhuys
@ 2009-07-07 13:48 ` Michael J Gruber
  2009-07-07 14:59   ` Avery Pennarun
  2 siblings, 1 reply; 6+ messages in thread
From: Michael J Gruber @ 2009-07-07 13:48 UTC
  To: Lasse Kärkkäinen; +Cc: git

Lasse Kärkkäinen venit, vidit, dixit 04.07.2009 18:24:
> Getting full history of a file, including that beyond copies, is rather 
> important and apparently not currently supported by git tools.
> 
> An example use case:
> 
> # Create repository
> mkdir courses
> cd courses
> git init
> # Course 2007
> mkdir -p 2007/exercise1
> echo Hello > 2007/exercise1/description.txt
> mkdir 2007/exercise2
> echo World > 2007/exercise2/description.txt
> git add 2007
> git commit -m "Course 2007"
> # Course 2008
> mkdir -p 2008/exercise1
> echo New one > 2008/exercise1/description.txt
> git add 2008
> git commit -m "Course 2008 exercise 1 (new)"
> cp -R 2007/exercise1 2008/exercise2
> git add 2008/exercise2
> git commit -a -m "Course 2008 exercise 2 (from 2007 exercise 1)"
> # Course 2009
> cp -R 2008 2009
> git add 2009
> git commit -m "Course 2009 recycled entirely from 2008"
> 
> Now, if we do git log --follow 2009/exercise2/description.txt or 
> 2009/exercise2, it only prints the "Course 2009" commit instead of the 
> full history because --follow doesn't follow copies. What we actually 
> want is:
> 
> commit 9e17341497b29735bc55b6631b43db6e2f50ed30
> Author: Lasse Karkkainen <tronic@trn.iki.fi>
> Date:   Sat Jul 4 19:05:57 2009 +0300
> 
>      Course 2009 recycled entirely from 2008
> 
> commit 8fd13a8667f0bc5c4851b366864a207fa85519bc
> Author: Lasse Karkkainen <tronic@trn.iki.fi>
> Date:   Sat Jul 4 19:05:57 2009 +0300
> 
>      Course 2008 exercise 2 (from 2007 exercise 1)
> 
> commit 593346660872ada80ba751688fffc7af7a31e124
> Author: Lasse Karkkainen <tronic@trn.iki.fi>
> Date:   Sat Jul 4 19:05:57 2009 +0300
> 
>      Course 2007
> 
> Note: the "Course 2008 exercise 1 (new)" commit is not listed, as it is 
> unrelated to 2009/exercise2.
> 
> Some nice people from #git suggested various commands that would find 
> the previous version (e.g. 2008/exercise2) etc, but none of those got 
> even close to getting this full history over multiple copies, with log 
> messages.
> 
> It would be useful if the git tools could produce history like this with 
> all the tools (log, blame, gitk, etc), preferably with proper branching 
> guesses (guesses because there is no info on where the copy came from), 
> but even a linear history (sorted by commit time?) would do much better 
> than not having anything.

While there is a pending patch, as Sean pointed out, it still has issues.
For moves, "there is always only one", for copies there are at least
two. So which one do you follow?

If you want to know where the current contents of
2009/exercise2/description.txt come from then a simpler approach may be

git log -S"$(git show HEAD:2009/exercise2/description.txt)"

This gives you three commits. Guess which ones ;)

Michael


* Re: Proper tracking of copies in git log and others
  2009-07-07 13:48 ` Michael J Gruber
@ 2009-07-07 14:59   ` Avery Pennarun
  0 siblings, 0 replies; 6+ messages in thread
From: Avery Pennarun @ 2009-07-07 14:59 UTC
  To: Michael J Gruber; +Cc: Lasse Kärkkäinen, git

On Tue, Jul 7, 2009 at 9:48 AM, Michael J
Gruber<git@drmicha.warpmail.net> wrote:
> While there is a pending patch as Sean pointed out it still has issues.
> For moves, "there is always only one", for copies there are at least
> two. So which one do you follow?

I think this is relatively easy, although the answer depends on
whether you're merging or logging.

Let's take the general case of n:m copies: n is the number of copies
in the second commit, and m is the number in the base commit.  So 1:1
is a move, and 2:1 is a copy, and (>2):1 is a multiple copy, and eg.
3:2 is an unclear case where you had more than one identical (or
near-identical) file in the first place.

A) Merges: if it's anything other than 1:1, it's totally non-obvious
what to do, so I'll forgive git for doing pretty much anything.  Bonus
points for triggering a conflict so that I have to look at it.

B) Logs:

 - if m > 1, choose the base file that's closest to the new one; its
history should be pretty close to the right one, and arguably the new
file could have come from there anyway.  If there's more than one file
of the same closeness, just pick one, because their histories are
interchangeable anyway.

 - if n > 1, each file needs to have its base selected separately.
It's totally okay for multiple files to have the same base.  If files
b, c, and d were all copied from a, I'd expect "git log --follow b c
d" to show the copying operation, and then the history of a.
Similarly for "git log --stat --follow b" for example.
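
The log rules above can be approximated in a few lines of shell. This is
only a sketch under stated assumptions, not how --follow or the pending
patch works: the function name follow_copies is invented, it tracks one
path at a time, and it leaves the choice of base to Git's own copy
detection (--find-copies-harder, i.e. "-C -C"). When several identical
bases exist, Git picks one of them, so an intermediate copy may or may not
appear in the output. Merge commits are not handled.

```shell
# follow_copies PATH [REV]: print the history of PATH and, once the commit
# that introduced it is reached, continue with the file it was copied or
# renamed from. Hypothetical helper built only on standard git commands.
follow_copies() {
    path=$1
    rev=${2:-HEAD}
    while [ -n "$path" ]; do
        # Commits reachable from $rev that touched the current path.
        commits=$(git log --format=%H "$rev" -- "$path")
        [ -n "$commits" ] || break
        git log --format='%h %s' "$rev" -- "$path"
        # The oldest of those commits is the one that introduced the path.
        added=$(printf '%s\n' "$commits" | tail -n 1)
        # Ask Git which file, if any, it considers the copy/rename source.
        src=$(git diff-tree -r --root --no-commit-id --name-status \
                  --find-copies-harder --diff-filter=CR "$added" |
              awk -F '\t' -v dst="$path" '$3 == dst { print $2; exit }')
        [ -n "$src" ] || break
        # Continue from the parent of the copying commit, on the source path.
        path=$src
        rev="$added^"
    done
}
```

For example: follow_copies 2009/exercise2/description.txt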

All of this seems to work (at least well enough for me) with the 1:1
move case, and also for copies if you use "git log --stat -C -C" (not
restricting by filename).  "git log --stat -C -C b" (restricting by
filename) doesn't work for copies, and --follow doesn't help (the
documentation for that option suggests it's only useful for moves).

Unfortunately this also causes problems with following "effective
move" operations: copy the file in one commit, and delete the original
in the next.  I've been known to make this mistake, and it breaks 'git
log -C -C --follow filename', which is no fun.

Thanks,

Avery


* Re: Proper tracking of copies in git log and others
  2009-07-04 18:31 ` Sean Estabrooks
@ 2009-07-14 12:19   ` Lasse Kärkkäinen
  0 siblings, 0 replies; 6+ messages in thread
From: Lasse Kärkkäinen @ 2009-07-14 12:19 UTC
  To: Sean Estabrooks; +Cc: git

> You could apply that patch and test it out.  Perhaps you could address
> Junio's concern or reignite some interest from the original author.

The patch works fine here and solves my issue. Hopefully it can be made 
part of the distribution.

Tracking other (intermediate) copies would be a plus but I have 
absolutely no time to work on git as I am trying to use VCS to reduce my 
workload, not to increase it :)
