git.vger.kernel.org archive mirror
From: Paul Jakma <paul@clubi.ie>
To: Junio C Hamano <junkio@cox.net>
Cc: Andreas Ericsson <ae@op5.se>, git list <git@vger.kernel.org>
Subject: Re: impure renames / history tracking
Date: Wed, 1 Mar 2006 21:25:08 +0000 (GMT)
Message-ID: <Pine.LNX.4.64.0603012105230.13612@sheen.jakma.org>
In-Reply-To: <7v3bi2ey63.fsf@assigned-by-dhcp.cox.net>

Hi Junio,

On Wed, 1 Mar 2006, Junio C Hamano wrote:

> Interestingly enough, there are two levels of "rename tracking" the 
> current git does.  When you run "git whatchanged -M", you are 
> looking at renames between each commit in the commit chain, one 
> step at a time.  There, as long as the rename+rewrite does not 
> amount to too much rewrite, you would see what should be detected 
> as a rename actually detected as a rename.

Right.

> I found the current default threshold parameters to be about right, 
> maybe a bit too tight sometimes, though.  If you want to loosen the 
> default, you can specify a similarity index after -M.

That's one option.
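
To make the threshold idea concrete, here is a minimal sketch in 
Python. It is not git's actual algorithm: difflib's ratio stands in 
for git's similarity index, and detect_renames, its arguments and the 
0.5 default are hypothetical names and values picked for illustration.

```python
from difflib import SequenceMatcher


def similarity(old: str, new: str) -> float:
    """Return a 0..1 score (difflib's ratio, NOT git's own metric)."""
    return SequenceMatcher(None, old, new).ratio()


def detect_renames(deleted, added, threshold=0.5):
    """Pair each deleted path with its most similar added path.

    deleted and added map path -> file content. A pair is reported
    only when the best score reaches the threshold, so lowering the
    threshold accepts renames that carry a heavier rewrite.
    """
    renames = {}
    remaining = dict(added)  # added paths not yet claimed by a rename
    for old_path, old_text in deleted.items():
        best = max(
            remaining,
            key=lambda path: similarity(old_text, remaining[path]),
            default=None,
        )
        if best is not None and similarity(old_text, remaining[best]) >= threshold:
            renames[old_path] = best
            del remaining[best]
    return renames


if __name__ == "__main__":
    deleted = {"a.c": "int main(void)\n{\n    return 0;\n}\n"}
    added = {
        "b.c": "int main(void)\n{\n    return 1;\n}\n",
        "README": "hello world docs\n",
    }
    print(detect_renames(deleted, added))
```

Loosening the threshold only widens what counts as "similar enough"; 
it still has nothing but file content to go on.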

I'm wondering though if we couldn't also allow users to encode 
naming 'hints', to aid this 'similarity' detection process.

> The way recursive merge strategy uses the rename detection, unlike 
> what whatchanged shows you, does not use chains of commits down to 
> the common merge base in order to detect renames (my recollection 
> may be wrong here -- it's a while since I looked at the recursive 
> merge the last time).  It just looks at the two heads being merged, 
> and detects similarity between them.  So it does not make _any_ 
> difference with the current implementation of recursive merge if 
> you kept a history full of "honest but disgusting" commits or 
> collapsed them into a history with small number of "cleaned up" 
> commits.

I'm going to have to stare at this paragraph a lot longer and harder 
to understand it :).

> One thing it _could_ do (and you _could_ implement as another merge 
> strategy and call it "pauls-rename" merge) is to follow the commit 
> chain one by one down to the common merge base from both heads 
> being merged, and analyze rename history on the both commit chains.

Right, I was just thinking that while making tea actually. This could 
be part of the 'collapsing' process. (or call it "coalesce 
too-detailed commits" process if that is less offensive to one's sense 
of process ;) ).
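
The difference between comparing only the two endpoints and following 
the chain one commit at a time can be sketched as below. Again this is 
an illustration under assumptions, not how any git merge strategy is 
implemented: snapshots are plain dicts of path -> tuple of lines, 
difflib's ratio replaces git's similarity index, and 
renames_between / renames_along_chain are hypothetical helpers.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.5):
    """True when difflib's ratio (a stand-in metric) reaches threshold."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def renames_between(old_tree, new_tree, threshold=0.5):
    """Detect renames between two snapshots (path -> tuple of lines)."""
    gone = [p for p in old_tree if p not in new_tree]
    new = [p for p in new_tree if p not in old_tree]
    found = {}
    for g in gone:
        for n in new:
            if n not in found.values() and similar(old_tree[g], new_tree[n], threshold):
                found[g] = n
                break
    return found


def renames_along_chain(chain, threshold=0.5):
    """Walk snapshots oldest-first and compose the per-step renames."""
    current = {p: p for p in chain[0]}  # original path -> current path
    for old_tree, new_tree in zip(chain, chain[1:]):
        step = renames_between(old_tree, new_tree, threshold)
        current = {orig: step.get(now, now) for orig, now in current.items()}
    # Report only paths that moved and still exist in the last snapshot.
    return {o: n for o, n in current.items() if o != n and n in chain[-1]}


# Rename with a light rewrite first, then a heavy rewrite in place.
base = {"old.c": ("a", "b", "c", "d", "e", "f")}
mid = {"new.c": ("a", "b", "c", "d", "x", "y")}
tip = {"new.c": ("a", "p", "q", "r", "x", "y")}

if __name__ == "__main__":
    print("endpoints only:", renames_between(base, tip))
    print("along chain:   ", renames_along_chain([base, mid, tip]))
```

With this data the endpoint comparison finds nothing, because base and 
tip share too little, while the chain walk still reports old.c as 
new.c. Collapse base..tip into one commit and that intermediate 
evidence is gone.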

Actually, you're sort of suggesting following the chains in parallel, 
right? I.e. in wall-clock time order, rather than chain order. And 
doing name resolution across the 'to-be-merged' chains at each step 
of the way? Sort of a lesser subset of how other SCMs maintain state 
for names globally?

It's not so much /resolving/ names I'm worried about. It's that 
there is simply no information in the first place to indicate (from 
one single-parent commit to the next) which names were renamed.

> Then, you would get better rename+rewrite detection than what it 
> currently does.

But if I follow the commit chain in order to try extract

> HOWEVER.

> If you have that kind of rename-following merge, a workflow that 
> collapses a useful history into a single huge commit "Ok, this 
> commit is a roll-up patch between version 2.6.14 and 2.6.15" 
> becomes far less attractive than it currently already is.  At that 
> point, you _are_ throwing away useful history.

Yes, I agree. And as part of arguing git's case (several SCMs are 
being evaluated and considered; I'm the git proponent at the 
moment), I'm going to suggest the workflow ought to be re-evaluated 
to ensure it is generally reasonable, rather than be kept for the 
sake of keeping it (particularly as it may be tailored to the 
needs/limitations of $TRADITIONAL_SCM).

However, I suspect at least some level of collapsing will be desired 
(just as it is with Linux and git).

The workflow issue is separate from the 'impure rename' issue though. 
Even if the workflow I gave as an example exacerbates the issue, 
"rename and rewrite half of it" and hard-to-detect renames can still 
occur in the detailed git/linux workflows, surely?

regards,
-- 
Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
If you really knew C++, you wouldn't even joke about putting it
in the kernel.

 	- Richard Johnson on linux-kernel
