From: Linus Torvalds <torvalds@osdl.org>
To: Ben Clifford <benc@hawaga.org.uk>
Cc: git@vger.kernel.org
Subject: Re: maildir / read-tree trivial merging getting in the way?
Date: Mon, 13 Feb 2006 18:32:35 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0602131820490.3691@g5.osdl.org>
In-Reply-To: <Pine.LNX.4.60.0602140217380.19093@mundungus.clifford.ac>



On Tue, 14 Feb 2006, Ben Clifford wrote:
> 
> I've spent a few hours playing round with maildir-aware merging.
> 
> The basic idea I'm trying to implement is to flip the index round so that
> instead of looking at how the content has changed for a particular filename,
> I'm looking at how the filenames have changed for a particular content.
> 
> So I'm using git read-tree -m to populate the index with entries for the
> branches to merge so that I can then diddle round with those.
> 
> But the read-tree trivial merge logic seems to be getting in the way a bit.

You are much better off working with "git-ls-tree", or perhaps 
"git-diff-tree".

The latter in particular will show you what got added and what got 
deleted, but will quickly ignore any common entries (which is probably 
exactly what you want).

> So basically my question is: should I feel dirty about doing this and diddle
> read-tree so that there's a flag to not do the trivial merges automatically?

You should try to avoid git-read-tree entirely, I suspect.

All the things git-read-tree does are wrong for you. Notably, it very much 
on purpose will match things up name-by-name, and it does a lot of extra 
work to create a sorted version of the index to do the trivial merges 
quickly. The thing is, it doesn't even do that the smart way.

Now, git-read-tree actually does a _great_ job - don't get me wrong. It's 
just that the job it does isn't really suitable for your usage, and it's 
doing some things the "simple and stupid" way instead of being very smart 
about them, just because they aren't that important under normal loads.

For example, in a three-way merge (with an index), it will basically have 
four sorted inputs that it needs to interleave. Now, there's a _smart_ way 
to interleave sorted input, and there's a stupid one. The smart way is to 
read the sources all together, and just pick the right sorted order, and 
interleave them all together.

That's not what git-read-tree does.

git-read-tree will read them one by one, and use "insertion sort" to 
maintain the result in sorted order. Now, insertion sort isn't totally 
idiotic (it's not doing a bogo-sort, at least), but it _is_ pretty damn 
silly when all the sources are already sorted and known ahead of time.
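
To make the contrast concrete, here is a minimal sketch of the two strategies. It is not git-read-tree's actual code: the path lists are hypothetical stand-ins for the index and the three trees, and it only models the ordering step, not the per-name stage matching that a real merge does.

```python
import bisect
import heapq


def merge_by_insertion(sources):
    """Read the sources one by one, inserting each entry into a sorted list.

    Every insert may shift many existing entries, so the total cost grows
    roughly quadratically with the number of entries.
    """
    result = []
    for source in sources:
        for name in source:
            bisect.insort(result, name)
    return result


def merge_interleaved(sources):
    """Read all already-sorted sources together and interleave them once."""
    return list(heapq.merge(*sources))


# Hypothetical, already-sorted path lists (index, base, ours, theirs).
index = ["Makefile", "README", "fs/ext2.c"]
base = ["Makefile", "fs/ext2.c"]
ours = ["Makefile", "README", "fs/ext3.c"]
theirs = ["Makefile", "kernel/sched.c"]
sources = [index, base, ours, theirs]

merged = merge_interleaved(sources)
```

Both functions produce the same sorted result; the difference is only in how much work they do to get there once the inputs become large.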

So git-read-tree does some stupid things, and scales badly with really big 
trees. The good news is that we can fix it - the bad news is that my 
motivation for it is pretty low, since "really big" means "much bigger 
than the kernel" ;)

In contrast, "git-diff-tree -r a b" does the _smart_ thing, and scales 
linearly with tree size _and_ can take advantage of subdirectories not 
changing (the latter is apparently not an issue for you, but can be one in 
other circumstances).

(The "raw output" from git-diff-tree is also very easy to parse. Don't use 
the "-p" (patch) form: the raw "this is how the SHA1s changed" output sounds 
like exactly what you want, assuming you're interested in renames 
with no content change.)
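
As an illustration of how little parsing the raw form needs, here is a rough sketch that pairs deleted and added paths sharing the same blob SHA1, which is one way to spot renames with no content change. The sample text, the SHA1 values and the maildir-style paths are made up for the example; in practice the input would be the output of "git-diff-tree -r a b", and paths containing unusual characters may need the "-z" form instead of the simple line splitting assumed here.

```python
ZERO = "0" * 40   # all-zero SHA1 marks the missing side of an add or delete
SHA_A = "a" * 40  # hypothetical blob SHA1s, for illustration only
SHA_B = "b" * 40

# Hypothetical raw output: one message moved from new/ to cur/, one arrived.
SAMPLE = "\n".join([
    f":000000 100644 {ZERO} {SHA_A} A\tcur/1139883155.msg:2,S",
    f":100644 000000 {SHA_A} {ZERO} D\tnew/1139883155.msg",
    f":000000 100644 {ZERO} {SHA_B} A\tnew/1139883999.msg",
])


def parse_raw(text):
    """Turn raw diff lines into (status, old_sha, new_sha, path) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.startswith(":"):
            continue  # skip anything that is not a raw diff line
        meta, path = line[1:].split("\t", 1)
        old_mode, new_mode, old_sha, new_sha, status = meta.split(" ")
        entries.append((status, old_sha, new_sha, path))
    return entries


def exact_renames(entries):
    """Pair each added path with a deleted path that had the same content."""
    deleted = {}
    for status, old_sha, new_sha, path in entries:
        if status == "D":
            deleted.setdefault(old_sha, []).append(path)
    renames = []
    for status, old_sha, new_sha, path in entries:
        if status == "A" and deleted.get(new_sha):
            renames.append((deleted[new_sha].pop(0), path))
    return renames


renames = exact_renames(parse_raw(SAMPLE))
```

Indexing by content first and by name second is the "flipped" view Ben describes, done outside the index rather than by bending git-read-tree.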

		Linus
