From: Linus Torvalds <torvalds@osdl.org>
To: Paul Jakma <paul@clubi.ie>
Cc: Andreas Ericsson <ae@op5.se>, git list <git@vger.kernel.org>
Subject: Re: impure renames / history tracking
Date: Wed, 1 Mar 2006 09:13:53 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0603010859200.22647@g5.osdl.org>
In-Reply-To: <Pine.LNX.4.64.0603011558390.13612@sheen.jakma.org>
On Wed, 1 Mar 2006, Paul Jakma wrote:
>
> FWIW, I think git's rename handling is really nice. It's just I suspect, being
> a heuristic, it won't be able to follow history reliably across 'very impure'
> renames.
The thing is, it does better than anything that _tries_ to be "reliable".
I can pretty much _guarantee_ that you can't do it better.
Tracking "inodes" - aka file identities - (which is what BK does, and I
assume what SVN does) is fundamentally problematic. In particular, it's a
horrible problem when two inodes "meet" under the same name. You now have
two identities for the same file, and you're fundamentally screwed.
And don't tell me it doesn't happen. It _does_ happen, and it did happen
with the kernel under BK.
It doesn't even need renames to be a problem. JUST THE FACT THAT YOU TRY
TO TRACK FILE "IDENTITY" HISTORY IS BROKEN. For example, take CVS, which
doesn't actually try to do renames, but _does_ try to track the identity
of a file, since all the history is tied into that identity: think about
what happens in Attic when a file is deleted. Completely broken model.
Now, CVS doesn't tend to show the problems very much, because people don't
actually use branches that much (they are a pain in the neck), and they
sure as hell try to avoid deleting and creating the same filename under a
branch and on HEAD. I'm sure you can do it, but I'm also pretty sure
there's a lot of old projects around that have ended up moving the ,v
files around to play rename/delete games.
And that's really fundamental. CVS doesn't show the problems so much,
because CVS actively tries to make it hard to do these things.
With renames-tracking-file-identities, it's _really_ easy to get some
major confusion going. What happens when one branch creates a file, and
another one renames a file to that same name, and they merge?
Don't tell me it doesn't happen. It happened under BK. The way BK "solved"
it was to keep the two separate identities: one of them got resolved to
the new filename, the other one went into the "deleted" directory. Guess
what happens when the side that got merged into "deleted" continues to
edit the file? That's right - their edits happen on the deleted file, and
never show up in the real tree in a subsequent merge ever again.
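[Editorial illustration, not part of the original message.] The failure mode described above can be sketched with a toy model. Everything here is hypothetical: the `{file_id: path}` tree, the `merge_identity_trees` helper, and the `deleted/` prefix are illustrative stand-ins, not BK's actual data model or on-disk layout.

```python
def merge_identity_trees(ours, theirs):
    """Merge two {file_id: path} maps in a toy identity-tracking SCM.

    When two different identities claim the same path, keep ours at the
    real path and park theirs under 'deleted/' (the workaround the
    message describes, in simplified form).
    """
    merged = dict(ours)
    taken = set(ours.values())
    for file_id, path in theirs.items():
        if file_id in merged:
            continue  # same identity on both sides, nothing to resolve
        if path in taken:
            merged[file_id] = "deleted/" + path  # name clash: shunt aside
        else:
            merged[file_id] = path
            taken.add(path)
    return merged


# Branch 1 creates foo.c, which gets a fresh identity.
ours = {"id-A": "foo.c"}
# Branch 2 renames an existing file (identity id-B) to foo.c.
theirs = {"id-B": "foo.c"}

merged = merge_identity_trees(ours, theirs)

# Branch 2 keeps editing "its" foo.c, i.e. identity id-B. After the merge,
# those edits are attached to the parked path, not to the visible foo.c.
path_receiving_branch2_edits = merged["id-B"]
```

In this sketch `merged["id-B"]` ends up under `deleted/`, so any later edit tracked by identity on branch 2 never reaches the `foo.c` that users actually see.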
And as far as I can tell, BK really did the best you can do. Following
file identities really _is_ fundamentally broken. It sounds like a nice
idea, but while you might solve a few problems, you create a whole raft of
much more fundamental problems.
So next time you think about a merge that might have been improved by
tracking renames, please also think about a merge where one of the
filenames came from two or more different sources through an earlier
merge, and thank your benevolent Gods that they instructed me to make git
be based purely on file contents.
Linus
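[Editorial illustration, not part of the original message.] For contrast, a content-based approach stores no file identity at all and infers renames after the fact by comparing contents. The sketch below is a minimal stand-in under stated assumptions: it uses Python's `difflib.SequenceMatcher` as the similarity measure, which is not the algorithm git uses, and the `detect_renames` helper and its `threshold` parameter are hypothetical names chosen for this example.

```python
import difflib


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Guess renames between two {path: content} snapshots.

    A path that vanished is paired with the most similar path that
    appeared, provided the similarity ratio reaches the threshold.
    """
    deleted = [p for p in old_tree if p not in new_tree]
    added = [p for p in new_tree if p not in old_tree]
    renames = []
    for old_path in deleted:
        best_path, best_score = None, 0.0
        for new_path in added:
            score = difflib.SequenceMatcher(
                None, old_tree[old_path], new_tree[new_path]
            ).ratio()
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames.append((old_path, best_path, best_score))
            added.remove(best_path)  # each new path is matched at most once
    return renames


old = {"a.c": "int main(void) { return 0; }\n"}
new = {"b.c": "int main(void) { return 1; }\n", "README": "hello"}
found = detect_renames(old, new)
```

Because nothing but the snapshots is consulted, two files that "meet" under one name simply become one blob in the new tree; there is no second identity left over to park somewhere.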