From: Paul Jakma <paul@clubi.ie>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Andreas Ericsson <ae@op5.se>, git list <git@vger.kernel.org>
Subject: Re: impure renames / history tracking
Date: Wed, 1 Mar 2006 18:50:21 +0000 (GMT)
Message-ID: <Pine.LNX.4.64.0603011815150.13612@sheen.jakma.org>
In-Reply-To: <Pine.LNX.4.64.0603010859200.22647@g5.osdl.org>
Hi Linus,
On Wed, 1 Mar 2006, Linus Torvalds wrote:
> The thing is, it does better than anything that _tries_ to be
> "reliable".
>
> I can pretty much _guarantee_ that you can't do it better.
I'm willing to take that argument to the 'project' concerned, I just
need to be pretty sure of it.
> Tracking "inodes" - aka file identities - (which is what BK does,
> and I assume what SVN does) is fundamentally problematic. In
> particular, it's a horrible problem when two inodes "meet" under
> the same name. You now have two identities for the same file, and
> you're fundamentally screwed.
Yes, in that model it is. Interestingly, I suspect this is not the
BK model (see below).
> It doesn't even need renames to be a problem. JUST THE FACT THAT
> YOU TRY TO TRACK FILE "IDENTITY" HISTORY IS BROKEN.
If it's "file identity" globally across the lifetime of the project,
I agree 100%. The 'traditional' SCM concerned does this.
That's not a solution I'd want to explore either; I'm only
interested in the identity of files for any /one/ commit. In
saying that, I recognise it's pointless to try to annotate file-change
information in multi-parent commits (merges).
> For example, take CVS, which doesn't actually try to do renames,
> but _does_ try to track the identity of a file, since all the
> history is tied into that identity: think about what happens in
> Attic when a file is deleted. Completely broken model.
ACK, {Attic,deleted_files}/ is just horrid.
> And that's really fundamental. CVS doesn't show the problems so
> much, because CVS actively tries to make it hard to do these
> things.
ACK.
> With renames-tracking-file-identities, it's _really_ easy to get
> some major confusion going. What happens when one branch creates a
> file, and another one renames a file to that same name, and they
> merge?
Well, the conflict has to be resolved somehow, even today.
> Don't tell me it doesn't happen. It happened under BK. The way BK
> "solved" it was to keep the two separate identities: one of them
> got resolved to the new filename, the other one went into the
> "deleted" directory.
Right. That's what the 'traditional workflow' SCM I'm thinking of
does - not BK funnily enough, but an SCM predating BK which also
happens to use SCCS files, and with some of the same high-level
push/pull constructs as BK (interestingly).
It also tracks name history globally using a deleted_files/ history,
which is maintained, but I don't think it does this for name merges
like the above.
In the one I'm thinking of, it does (I /think/, I'm not an expert in
it) the following:
Given two files, say:
old:
  1.1---1.2---1.3
new:
  1.1
- constructs a 'fake' base SCCS revision, empty
- adds the top 'old' version as a branch
- adds the top new version as a new delta
        1.1.1.1
       /
  1.1---------1.2
Where in the merged file:
  1.1: empty
  1.1.1.1: was 1.3 from 'old'
  1.2: is 1.1 from 'new'
However, it does /not/ create a deleted_files entry for the 'old'
file. (AFAICT - I may not have a sufficiently full understanding of
this SCM)
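To make the construction above concrete, here is a minimal Python
sketch of the resulting revision graph. It is a toy model only: the
`Delta` class and `merge_name_clash` function are hypothetical names
invented for illustration, and nothing here reflects the actual SCCS
file format or the SCM's real implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Delta:
    rev: str               # SCCS-style revision id, e.g. "1.1" or "1.1.1.1"
    content: str           # file content at this revision
    parent: Optional[str]  # revision this delta is based on (None for the base)


def merge_name_clash(old_tip: str, new_tip: str) -> Dict[str, Delta]:
    """Build the merged history for two files that meet under one name.

    Hypothetical helper: mirrors the three steps described in the mail.
    """
    return {
        # Step 1: a 'fake' base revision, empty.
        "1.1": Delta("1.1", "", None),
        # Step 2: the top of 'old' hangs off the fake base as a branch.
        "1.1.1.1": Delta("1.1.1.1", old_tip, "1.1"),
        # Step 3: the top of 'new' becomes the next delta on the trunk.
        "1.2": Delta("1.2", new_tip, "1.1"),
    }


merged = merge_name_clash(old_tip="contents of old 1.3",
                          new_tip="contents of new 1.1")
for rev, delta in merged.items():
    print(rev, "<-", delta.parent, repr(delta.content))
```

Only the tip of each file is carried over in this sketch; whether the
real tool also preserves the intermediate 'old' revisions (1.1, 1.2)
is not something the description above settles.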
> Guess what happens when the side that got merged into "deleted"
> continues to edit the file? That's right - their edits happen on
> the deleted file, and never show up in the real tree in a
> subsequent merge ever again.
Indeed - horrid.
> And as far as I can tell, BK really did the best you can do.
> Following file identities really _is_ fundamentally broken. It
> sounds like a nice idea, but while you might solve a few problems,
> you create a whole raft of much more fundamental problems.
For tracking identity across more than one commit - I fully agree.
That's not quite what I'm thinking of though. Is it worth going on
with the discussion on a:
'track identities *only* from context of /the/ parent to
this commit'
> So next time you think about a merge that might have been improved
> by tracking renames, please also think about a merge where one of
> the filenames came from two or more different sources through an
> earlier merge, and thank your benevolent Gods that they instructed
> me to make git be based purely on file contents.
Oh, I agree muchely here.
I wouldn't change git. I only wonder if it could give its rename heuristics
an additional advisory-only hint (for single-parent commits at least
- never merges - and only on a per-commit basis)?
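As a rough illustration of what "advisory-only" could mean, here is a
Python sketch under stated assumptions. It is not git's rename
detection: it uses `difflib.SequenceMatcher` as a stand-in similarity
measure, and the `detect_renames` function, its `hints` parameter and
the 0.5 threshold are all hypothetical. The one property it tries to
show is that a hint can only choose among candidates that content
similarity already supports; it never creates a rename by itself.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def detect_renames(deleted: Dict[str, str], added: Dict[str, str],
                   hints: Optional[Dict[str, str]] = None,
                   threshold: float = 0.5) -> Dict[str, str]:
    """Pair deleted paths with added paths by content similarity.

    `hints` maps old path -> new path and is advisory only: a hinted
    pair is preferred, but only if its similarity clears `threshold`.
    """
    hints = hints or {}
    renames: Dict[str, str] = {}
    taken = set()
    for old_path, old_text in deleted.items():
        # Score every still-unclaimed added file against this deleted one.
        scores = {
            new_path: SequenceMatcher(None, old_text, new_text).ratio()
            for new_path, new_text in added.items()
            if new_path not in taken
        }
        candidates = {p: s for p, s in scores.items() if s >= threshold}
        if not candidates:
            continue
        hinted = hints.get(old_path)
        # The hint wins only when content similarity already allows it.
        best = hinted if hinted in candidates else max(candidates,
                                                       key=candidates.get)
        renames[old_path] = best
        taken.add(best)
    return renames


deleted = {"a.c": "int main(void) { return 0; }\n"}
added = {
    "b.c": "int main(void) { return 0; }\n",
    "c.c": "int main(void) { return 1; }\n",
    "d.txt": "zzzzzzzz",
}
print(detect_renames(deleted, added))
print(detect_renames(deleted, added, hints={"a.c": "c.c"}))
print(detect_renames(deleted, added, hints={"a.c": "d.txt"}))
```

In this toy run the hint towards c.c is honoured because c.c is
similar enough, while the hint towards d.txt is ignored and the
content-based choice stands.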
I probably should first explore how git deals with rename clashes..
regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune:
I'm glad I was not born before tea.
-- Sidney Smith (1771-1845)