git.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@osdl.org>
To: Paul Jakma <paul@clubi.ie>
Cc: Andreas Ericsson <ae@op5.se>, git list <git@vger.kernel.org>
Subject: Re: impure renames / history tracking
Date: Wed, 1 Mar 2006 09:13:53 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0603010859200.22647@g5.osdl.org>
In-Reply-To: <Pine.LNX.4.64.0603011558390.13612@sheen.jakma.org>



On Wed, 1 Mar 2006, Paul Jakma wrote:
> 
> FWIW, I think git's rename handling is really nice. It's just I suspect, being
> a heuristic, it won't be able to follow history reliably across 'very impure'
> renames.

The thing is, it does better than anything that _tries_ to be "reliable".

I can pretty much _guarantee_ that you can't do it better.

Tracking "inodes" - aka file identities - (which is what BK does, and I 
assume what SVN does) is fundamentally problematic. In particular, it's a 
horrible problem when two inodes "meet" under the same name. You now have 
two identities for the same file, and you're fundamentally screwed.

And don't tell me it doesn't happen. It _does_ happen, and it did happen 
with the kernel under BK.

It doesn't even need renames to be a problem. JUST THE FACT THAT YOU TRY 
TO TRACK FILE "IDENTITY" HISTORY IS BROKEN. For example, take CVS, which 
doesn't actually try to do renames, but _does_ try to track the identity 
of a file, since all the history is tied into that identity: think about 
what happens in Attic when a file is deleted. Completely broken model.

Now, CVS doesn't tend to show the problems very much, because people don't 
actually use branches that much (they are a pain in the neck), and they 
sure as hell try to avoid deleting and creating the same filename under a 
branch and on HEAD. I'm sure you can do it, but I'm also pretty sure 
there's a lot of old projects around that have ended up moving the ,v 
files around to play rename/delete games.

And that's really fundamental. CVS doesn't show the problems so much, 
because CVS actively tries to make it hard to do these things.

With renames-tracking-file-identities, it's _really_ easy to get some 
major confusion going. What happens when one branch creates a file, and 
another one renames a file to that same name, and they merge?

Don't tell me it doesn't happen. It happened under BK. The way BK "solved" 
it was to keep the two separate identities: one of them got resolved to 
the new filename, the other one went into the "deleted" directory. Guess 
what happens when the side that got merged into "deleted" continues to 
edit the file? That's right - their edits happen on the deleted file, and 
never show up in the real tree in a subsequent merge ever again.
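
To make the failure mode above concrete, here is a toy sketch in Python. It is not BK's actual algorithm: the `merge_by_identity` function, the `deleted/` prefix, the identity labels, and the file names are all hypothetical, and "take their content" stands in for a real three-way merge. It only shows how edits can end up on a parked identity once two identities have collided on one name.

```python
def merge_by_identity(ours, theirs):
    """Toy identity-based merge (illustrative only, not BK's real logic).

    Each tree maps a file identity to a (path, content) pair.
    """
    merged = dict(ours)
    taken = {path for path, _ in merged.values()}
    for ident, (path, content) in theirs.items():
        if ident in merged:
            # Known identity: keep the path we already resolved it to,
            # and take their content (stand-in for a real content merge).
            kept_path, _ = merged[ident]
            merged[ident] = (kept_path, content)
        elif path in taken:
            # A different identity already owns this name: park this one.
            merged[ident] = ("deleted/" + path, content)
        else:
            merged[ident] = (path, content)
            taken.add(path)
    return merged


def visible(tree):
    """The 'real tree': every path that is not parked under deleted/."""
    return {path: content for path, content in tree.values()
            if not path.startswith("deleted/")}


# One side creates util.c; the other renames an existing file to util.c.
side_a = {"id-A": ("util.c", "new file from A")}
side_b = {"id-B": ("util.c", "renamed from old.c by B")}

first = merge_by_identity(side_a, side_b)

# Side B keeps editing what it still sees as util.c, then merges again.
side_b["id-B"] = ("util.c", "B's later edit")
second = merge_by_identity(first, side_b)

print(visible(second))   # B's later edit is absent from the visible tree
print(second["id-B"])    # it landed on the parked identity instead
```

In this sketch the second merge succeeds without any conflict, which is exactly the problem: the edit is recorded, but only against the identity that was moved out of the way.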

And as far as I can tell, BK really did the best you can do. Following 
file identities really _is_ fundamentally broken. It sounds like a nice 
idea, but while you might solve a few problems, you create a whole raft of 
much more fundamental problems.

So next time you think about a merge that might have been improved by 
tracking renames, please also think about a merge where one of the 
filenames came from two or more different sources through an earlier 
merge, and thank your benevolent Gods that they instructed me to make git 
be based purely on file contents.

		Linus
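
As an illustration of what "based purely on file contents" can mean in practice, the following Python sketch guesses renames after the fact by comparing the contents of paths that disappeared with paths that appeared. This is not git's actual similarity computation: `detect_renames` is a hypothetical helper, `difflib.SequenceMatcher` is used as a stand-in similarity measure, and the 0.5 threshold is an arbitrary choice for the sketch.

```python
from difflib import SequenceMatcher


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Guess renames between two {path: content} snapshots.

    No file identity is stored anywhere; a rename is inferred only
    from how similar a removed path's content is to an added path's.
    """
    deleted = {p: c for p, c in old_tree.items() if p not in new_tree}
    added = {p: c for p, c in new_tree.items() if p not in old_tree}
    renames = []
    for old_path, old_content in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_content in added.items():
            score = SequenceMatcher(None, old_content, new_content).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames.append((old_path, best_path, best_score))
    return renames


old = {
    "old.c": "int main(void)\n{\n    return 0;\n}\n",
    "README": "hello\n",
}
new = {
    # Renamed and edited at the same time: an "impure" rename.
    "new.c": "int main(void)\n{\n    puts(\"hi\");\n    return 0;\n}\n",
    "README": "hello\n",
}

for old_path, new_path, score in detect_renames(old, new):
    print(f"{old_path} -> {new_path} (similarity {score:.2f})")
```

Because nothing is recorded at rename time, two files that later "meet" under one name are just content in a tree, and there is no second identity left over to receive stray edits.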
