From: Andreas Ericsson <ae@op5.se>
To: Paul Jakma <paul@clubi.ie>
Cc: git list <git@vger.kernel.org>
Subject: Re: impure renames / history tracking
Date: Wed, 01 Mar 2006 16:38:58 +0100
Message-ID: <4405C012.6080407@op5.se>
In-Reply-To: <Pine.LNX.4.64.0603011343170.13612@sheen.jakma.org>

Paul Jakma wrote:
> 
> - git obviously detects pure renames perfectly well
> 
> - git doesn't however record renames, so 'impure' renames may not be
>   detected
> 
> My question is:
> 
> - why not record rename information explicitly in the commit object?
> 

Mainly for two reasons, iirc:
1. Extensive metadata is evil.
2. Backwards compatibility. Old repos should always work with new tools.
Old tools should work with new repos, at least until the next major
release.


> I.e. so as to be able to follow history information through 'impure' 
> renames without having to resort to heuristics.
> 
> E.g. imagine a project where development typically occurs through:
> 
> o: commit
> m: merge
> 
>    o---o-m--o-o-o--o----m <- project
>   /     /              /
> o-o-o-o-o--o-o-o--o-o-o <- main branch
> 
> The project merge back to main in one 'big' combined merge (collapsing 
> all of the commits on 'project' into one commit). This leads to 'impure 
> renames' being not uncommon. The desired end-result of merging back to 
> 'main' being to rebase 'project' as one commit against 'main', and merge 
> that single commit back, a la:
> 
>    o---o-m--o-o-o--o----m <- project
>   /     /              /
> o-o-o-o-o--o-o-o--o-o-o---m <- main branch
>                        \ /
>                         o <- project_collapsed
> 
> So that 'm' on 'main' is that one commit[1].
> 

I think you're misunderstanding what git means by rebase here. "git
rebase" takes all the commits made on "project" since it forked from
"main branch" and replays them on top of the tip of "main branch".
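
To make that concrete, here is a minimal sketch of a rebase in a
throwaway repo. The branch names follow the drawing above; the file
names and the scratch directory are made up for illustration.

```shell
#!/bin/sh
# Sketch: rebase a topic branch onto the tip of the main branch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch "main"
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# "project" forks here and gets a commit of its own.
git checkout -q -b project
echo feature > feature.txt
git add feature.txt
git commit -q -m "project work"

# Meanwhile main moves on.
git checkout -q main
echo more > more.txt
git add more.txt
git commit -q -m "main work"

# Replay the project commits on top of the current tip of main.
git checkout -q project
git rebase -q main

# After the rebase, the fork point is the tip of main itself.
fork_point=$(git merge-base main project)
main_tip=$(git rev-parse main)
echo "fork point: $fork_point"
echo "main tip:   $main_tip"
```

Note that nothing is collapsed: every commit on "project" is still
there, just sitting on a new base.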

Other than that, this is the recommended workflow, and exactly how Linux 
and git both are managed (i.e. topic branches eventually merged into 
'master').

In your drawings, 'main branch' would be 'master' and 'project' would be
any number of topic branches (or just one, if you like that better).

I'm not sure what you mean by 'project_collapsed' though. If I 
understand you correctly, each branch-head represents one 'collapse'. I 
suggest you clone the git repo and do

	$ gitk master
	$ gitk next
	$ gitk pu

gitk is great for visualizing what you've done and what the repo looks
like. Use and abuse it whenever you're unsure what you just did. It's
the best way to quickly learn what happens, really.

If you just want to distribute snapshots, I suggest you take a look at
git-tar-tree. Junio makes nice use of it in the git Makefile (the dist:
target).
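
A rough sketch of making such a snapshot follows. It assumes a git
recent enough to ship "git archive", which took over the job of
git-tar-tree in later releases; the prefix "snapshot-0.1/" and the
scratch repo are invented for the example.

```shell
#!/bin/sh
# Sketch: export the tree of HEAD as a tarball, without any history.
set -e
repo=$(mktemp -d)
tarball=$(mktemp)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo hello > README
git add README
git commit -q -m "initial"

# --prefix puts every path under one top-level directory in the archive.
git archive --format=tar --prefix=snapshot-0.1/ HEAD > "$tarball"

# List what ended up in the snapshot.
listing=$(tar -tf "$tarball")
echo "$listing"
```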


> The merits or demerits of such merging practice aside, what reason would 
> there be /against/ recording explicit rename information in the commit 
> object, so as to help browsers follow history (particularly impure 
> renames) better in a commit?
> 
> I.e. would there be resistance to adding meta-info rename headers commit 
> objects, and having diffcore and other tools to use those headers to 
> /augment/ their existing heuristics in detecting renames?
> 

Personally I think metadata is evil. Renames will still be auto-detected
anyway, and with the distributed repo setup the only reason git
shouldn't be able to detect a rename is if you rename a file and hack it
up so it doesn't even come close to matching its origin (close in this
case means at least 50% similar by default, I think). In those cases it
isn't so much a rename as a rewrite. If you find the commit where the
file was renamed, the rename should be listed in that commit, like so:

	similarity index 92%
	rename from Documentation/git-log-script.txt
	rename to Documentation/git-log.txt

(this is gitk output from the git repo. Search for "Big tool rename")

IMO this is far better than having to tell git "I renamed this file to 
that", since it also detects code-copying with modifications, and it's 
usually quick enough to find those renames as well.
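
If you want to see the detection at work, something like the following
sketch should do. The file names and contents are made up; -M is the
diff option that asks for rename detection.

```shell
#!/bin/sh
# Sketch: git pairs up an "impure" rename from file content alone.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

# A file with enough content for similarity scoring to be meaningful.
seq 1 40 > old-name.txt
git add old-name.txt
git commit -q -m "add old-name.txt"

# Rename the file and edit it in the same commit (an impure rename).
git mv old-name.txt new-name.txt
echo "one extra line" >> new-name.txt
git commit -q -a -m "rename and edit"

# No rename was recorded anywhere; -M detects it when the diff is made.
summary=$(git diff -M --summary HEAD~1 HEAD)
echo "$summary"
```

The summary line names both paths and the similarity score, which is
the same information gitk shows in the excerpt above.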

> Thanks!
> 
> 1. Git currently doesn't have 'porcelain' to do this, presumably there'd 
> be no objection to one?
> 

	$ git checkout master
	$ git pull . project

The dot means "pull from the local repo". "project" is the branch you
want to merge into master. You can pull an arbitrary number of branches
in one go (an "octopus" merge). The current tested limit is 12 (thanks,
Len ;) ).
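
Here is a small sketch of an octopus merge in a scratch repo. It uses
"git merge" with several branch names, which for local branches does
the same job as the pull above; the topic names and files are
hypothetical.

```shell
#!/bin/sh
# Sketch: merge two topic branches into master in a single merge commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch "master"
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two hypothetical topic branches, each touching its own file.
for topic in topic-a topic-b; do
    git checkout -q -b "$topic" master
    echo "$topic" > "$topic.txt"
    git add "$topic.txt"
    git commit -q -m "work on $topic"
done

# Let master move on as well, so a real merge commit is needed.
git checkout -q master
echo more > master.txt
git add master.txt
git commit -q -m "more work on master"

# Naming several branches at once gives an octopus merge.
git merge -q -m "merge topic-a and topic-b" topic-a topic-b

# The merge commit lists master's old tip plus one parent per topic.
parents=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')
echo "merge commit has $parents parents"
```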

If, for some reason, you want to combine lots of commits into a single 
mega-patch (like Linus does for each release of the kernel), you can do:

	$ git diff $(git merge-base main project) project > patch-file

Then you can apply patch-file to whatever branch you want and make the
commit as if it were a single change-set. I'd recommend against it
unless you're just toying around though. It's a bad idea to lie in a
project's history.
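
For completeness, a sketch of the whole round trip in a scratch repo.
The diff command is the one above; applying it with "git apply --index"
is just one way to do it, and the file names and commit messages are
invented.

```shell
#!/bin/sh
# Sketch: collapse all commits on "project" into one commit on "main".
set -e
repo=$(mktemp -d)
patch_file=$(mktemp)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch "main"
git config user.name "Example"
git config user.email "example@example.invalid"

echo one > file.txt
git add file.txt
git commit -q -m "base"

# Two separate commits on the project branch.
git checkout -q -b project
echo two >> file.txt
git commit -q -a -m "project: step 1"
echo three > other.txt
git add other.txt
git commit -q -m "project: step 2"

# Everything project did since it forked from main, as one patch.
git diff $(git merge-base main project) project > "$patch_file"

# Apply it to main (work tree and index) and commit it as one change.
git checkout -q main
git apply --index "$patch_file"
git commit -q -m "project, collapsed into one commit"

commits_on_main=$(git rev-list --count main)
echo "main now has $commits_on_main commits"
```

The trees of main and project end up identical, but main carries no
trace of the individual steps, which is exactly the lie in question.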

Hope that helps.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
