git.vger.kernel.org archive mirror
* maildir / read-tree trivial merging getting in the way?
@ 2006-02-14  2:18 Ben Clifford
  2006-02-14  2:28 ` Junio C Hamano
  2006-02-14  2:32 ` Linus Torvalds
  0 siblings, 2 replies; 5+ messages in thread
From: Ben Clifford @ 2006-02-14  2:18 UTC (permalink / raw)
  To: git


I've spent a few hours playing round with maildir-aware merging.

The basic idea I'm trying to implement is to flip the index round so that 
instead of looking at how the content has changed for a particular filename, 
I'm looking at how the filenames have changed for a particular content.

So I'm using git read-tree -m to populate the index with entries for the 
branches to merge so that I can then diddle round with those.

But the read-tree trivial merge logic seems to be getting in the way a bit.

In my test repo, I have two branches ('master' and 'red') forked from the base 
point 'base':

in 'base':

$ ls
A    fish one

in 'red':

$ ls
B         billygoat one

in 'master':

$ ls
A    lion two

From base, I renamed and cg add / cg rm'd to change A to B, one to two, 
and fish to billygoat and lion to give the above.

When I read in the tree I get automatic resolving (down to stage 0) for the 
added files. But actually in the output of my merge, I'm not always going to 
want that to happen: In the A->B case, I do want to keep B (and need to remove 
A), likewise in the one->two case.

But for fish->{billygoat,lion}, I only want one file to end up at stage 0, and 
it might not be called either billygoat or lion - in maildir, the filenames are 
more structured, and given filenames like foo:2,SR and foo:2,SF I would want to 
compose them to give me foo:2,SRF.
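
A minimal sketch of that filename composition, assuming the usual maildir layout of `<base>:2,<flags>` with single-letter flags. The function name `merge_maildir_names` is hypothetical. Note that it sorts the flag letters, which is what the maildir convention expects as far as I know, so it yields `foo:2,FRS` rather than the `SRF` spelling used above.

```python
def merge_maildir_names(name_a, name_b):
    """Union the flag letters of two maildir names for the same message.

    Assumes names of the form '<base>:2,<flags>'; a name without the
    ':2,' marker is treated as having no flags.
    """
    base_a, sep_a, flags_a = name_a.partition(":2,")
    base_b, sep_b, flags_b = name_b.partition(":2,")
    if base_a != base_b:
        raise ValueError("names do not refer to the same message")
    if not (sep_a or sep_b):
        # Neither side carries an info section, so there is nothing to merge.
        return base_a
    # Set union drops duplicates; sorting gives a stable ASCII order.
    flags = "".join(sorted(set(flags_a) | set(flags_b)))
    return f"{base_a}:2,{flags}"
```

A merge driver would call this only after deciding that both names point at the same blob, which is the content-first view described above.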


$ git read-tree -m base master red

$ git ls-files  --stage
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 1       A
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 2       A
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 0       B
100644 a8150e61a3a4c9941d29169ee639396547f40de2 0       billygoat
100644 a8150e61a3a4c9941d29169ee639396547f40de2 1       fish
100644 a8150e61a3a4c9941d29169ee639396547f40de2 0       lion
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 1       one
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 3       one
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 0       two
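
To make the "flipped" view concrete, here is a rough sketch that regroups the listing above by blob instead of by filename. It assumes the normal `git ls-files --stage` layout, where a tab separates the stage number from the path; `group_by_blob` and `SAMPLE` are illustrative names, not part of git.

```python
from collections import defaultdict

# Entries copied from the listing above, with the tab written explicitly.
SAMPLE = (
    "100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 1\tA\n"
    "100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 2\tA\n"
    "100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 0\tB\n"
    "100644 a8150e61a3a4c9941d29169ee639396547f40de2 0\tbillygoat\n"
    "100644 a8150e61a3a4c9941d29169ee639396547f40de2 1\tfish\n"
    "100644 a8150e61a3a4c9941d29169ee639396547f40de2 0\tlion\n"
    "100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 1\tone\n"
    "100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 3\tone\n"
    "100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 0\ttwo\n"
)


def group_by_blob(stage_listing):
    """Map each blob SHA to the (stage, path) pairs that reference it."""
    by_blob = defaultdict(list)
    for line in stage_listing.splitlines():
        if not line:
            continue
        meta, path = line.split("\t", 1)
        _mode, sha, stage = meta.split()
        by_blob[sha].append((int(stage), path))
    return dict(by_blob)
```

For the `a8150e6...` blob this gives `fish` at stage 1 plus `billygoat` and `lion` both at stage 0, which is exactly the case where the trivial merge has already declared two names resolved.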

Now, I think maybe I can just look at what has made it to stage 0 and play 
round with those, but it makes me feel a little dirty - if anything, the index 
indicates that a bunch of stuff has been correctly merged (by being at stage 0) 
when in fact it hasn't.

So basically my question is: should I feel dirty about doing this and diddle 
read-tree so that there's a flag to not do the trivial merges automatically?

-- 


* Re: maildir / read-tree trivial merging getting in the way?
  2006-02-14  2:18 maildir / read-tree trivial merging getting in the way? Ben Clifford
@ 2006-02-14  2:28 ` Junio C Hamano
  2006-02-14  2:35   ` Ben Clifford
  2006-02-14  2:36   ` Linus Torvalds
  2006-02-14  2:32 ` Linus Torvalds
  1 sibling, 2 replies; 5+ messages in thread
From: Junio C Hamano @ 2006-02-14  2:28 UTC (permalink / raw)
  To: Ben Clifford; +Cc: git

Ben Clifford <benc@hawaga.org.uk> writes:

> So basically my question is: should I feel dirty about doing this and
> diddle read-tree so that there's a flag to not do the trivial merges
> automatically?

I am mildly negative about touching read-tree for this kind of
non-SCM'ish usage.

If you are doing read-tree without doing any trivial merge, then
you would use ls-files to inspect each stage, decide on the
final shape of the tree you want, and construct such a tree in
the index.

That would be more naturally done by writing that thing in a
more reasonable scripting language (not shell, but Perl or
Python), call ls-tree three times, do whatever merge to come up
with the final shape of the tree, and then construct the tree
with a single invocation of "update-index --index-info", maybe
even starting from an empty index file.
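
The plumbing side of that suggestion might look roughly like the sketch below. It assumes `git ls-tree -r` prints `mode SP type SP sha TAB path` and that `git update-index --index-info` accepts `mode SP sha TAB path` lines on stdin; the helper names are made up for illustration, and the merge policy itself is deliberately left out.

```python
import subprocess


def parse_ls_tree(text):
    """Map path -> (mode, sha) from `git ls-tree -r` output."""
    tree = {}
    for line in text.splitlines():
        if not line:
            continue
        meta, path = line.split("\t", 1)
        mode, _type, sha = meta.split()
        tree[path] = (mode, sha)
    return tree


def index_info(tree):
    """Render path -> (mode, sha) as input lines for --index-info."""
    return "".join(
        f"{mode} {sha}\t{path}\n" for path, (mode, sha) in sorted(tree.items())
    )


def read_tree_listing(treeish):
    """Run `git ls-tree -r` on one tree-ish and parse the result."""
    result = subprocess.run(
        ["git", "ls-tree", "-r", treeish],
        check=True, capture_output=True, text=True,
    )
    return parse_ls_tree(result.stdout)


def write_index(tree):
    """Feed the merged entries to a single update-index invocation."""
    subprocess.run(
        ["git", "update-index", "--index-info"],
        input=index_info(tree), check=True, text=True,
    )
```

A script would call `read_tree_listing` for base, master and red, build the merged dictionary however the maildir rules require, and then call `write_index` once. Paths with unusual characters may come back quoted unless `-z` is used, which this sketch does not handle.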


* Re: maildir / read-tree trivial merging getting in the way?
  2006-02-14  2:18 maildir / read-tree trivial merging getting in the way? Ben Clifford
  2006-02-14  2:28 ` Junio C Hamano
@ 2006-02-14  2:32 ` Linus Torvalds
  1 sibling, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2006-02-14  2:32 UTC (permalink / raw)
  To: Ben Clifford; +Cc: git



On Tue, 14 Feb 2006, Ben Clifford wrote:
> 
> I've spent a few hours playing round with maildir-aware merging.
> 
> The basic idea I'm trying to implement is to flip the index round so that
> instead of looking at how the content has changed for a particular filename,
> I'm looking at how the filenames have changed for a particular content.
> 
> So I'm using git read-tree -m to populate the index with entries for the
> branches to merge so that I can then diddle round with those.
> 
> But the read-tree trivial merge logic seems to be getting in the way a bit.

You are much better off working with "git-ls-tree", or perhaps 
"git-diff-tree".

The latter in particular will show you what got added and what got 
deleted, but will quickly ignore any common entries (which is probably 
exactly what you want).

> So basically my question is: should I feel dirty about doing this and diddle
> read-tree so that there's a flag to not do the trivial merges automatically?

You should try to avoid git-read-tree entirely, I suspect.

All the things git-read-tree does are wrong for you. Notably, it very much 
on purpose will match things up name-by-name, and it does a lot of extra 
work to create a sorted version of the index to do the trivial merges 
quickly. The thing is, it doesn't even do that the smart way.

Now, git-read-tree actually does a _great_ job - don't get me wrong. It's 
just that the job it does isn't really suitable for your usage, and it's 
doing some things the "simple and stupid" way instead of being very smart 
about them, just because they aren't that important under normal loads.

For example, in a three-way merge (with an index), it will basically have 
four sorted inputs that it needs to interleave. Now, there's a _smart_ way 
to interleave sorted input, and there's a stupid one. The smart way is to 
read the sources all together, and just pick the right sorted order, and 
interleave them all together.

That's not what git-read-tree does.

git-read-tree will read them one by one, and use "insertion sort" to 
maintain the result in sorted order. Now, insertion sort isn't totally 
idiotic (it's not doing a bogo-sort, at least), but it _is_ pretty damn 
silly when all the sources are already sorted and known ahead of time.
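
As an illustration only (this is Python, not what git-read-tree's C code looks like), the two strategies can be contrasted as follows. Entries are modelled as `(path, stage)` tuples, and plain tuple ordering stands in for git's index ordering; both of those are simplifying assumptions.

```python
import bisect
import heapq


def interleave_by_insertion(sources):
    """Read the sources one by one, inserting each entry in sorted position.

    Finding the slot is cheap, but every insert may shift the tail of the
    list, so the total cost grows quadratically in the worst case.
    """
    result = []
    for source in sources:
        for entry in source:
            bisect.insort(result, entry)
    return result


def interleave_by_merge(sources):
    """Walk all sorted sources together and always take the smallest head."""
    return list(heapq.merge(*sources))


# Stage 1 = base, stage 2 = master, stage 3 = red, as in the test repo.
BASE = [("A", 1), ("fish", 1), ("one", 1)]
MASTER = [("A", 2), ("lion", 2), ("two", 2)]
RED = [("B", 3), ("billygoat", 3), ("one", 3)]
```

Both functions return the same list for `[BASE, MASTER, RED]`; the difference is only in how much work it takes to get there as the trees grow.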

So git-read-tree does some stupid things, and scales badly with really big 
trees. The good news is that we can fix it - the bad news is that my 
motivation for it is pretty low, since "really big" means "much bigger 
than the kernel" ;)

In contrast "git-diff-tree -r a b" does the _smart_ thing, and scales 
linearly with tree size _and_ can take advantage of subdirectories not 
changing (the latter is apparently not an issue for you, but can be one in 
other circumstances).

(The "raw output" from git-diff-tree is also very easy to parse. Don't do 
the "-p" (patch) form, the raw "this is how the SHA's changed" sounds 
like it's exactly what you want, assuming you're interested in renames 
with no content change)
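
A rough sketch of parsing that raw output, assuming the usual `:<src mode> <dst mode> <src sha> <dst sha> <status>TAB<path>` layout from `git diff-tree -r a b`. Without rename detection a pure rename shows up as a delete plus an add of the same blob, so the second helper pairs those up. `Change`, `parse_raw_diff` and `pure_renames` are hypothetical names.

```python
from collections import defaultdict
from typing import NamedTuple


class Change(NamedTuple):
    src_mode: str
    dst_mode: str
    src_sha: str
    dst_sha: str
    status: str
    path: str


def parse_raw_diff(text):
    """Parse raw diff-tree lines; lines not starting with ':' are skipped."""
    changes = []
    for line in text.splitlines():
        if not line.startswith(":"):
            continue
        meta, path = line[1:].split("\t", 1)
        src_mode, dst_mode, src_sha, dst_sha, status = meta.split()
        changes.append(Change(src_mode, dst_mode, src_sha, dst_sha, status, path))
    return changes


def pure_renames(changes):
    """Pair each added path with deleted paths that had the same blob."""
    deleted = defaultdict(list)
    for change in changes:
        if change.status == "D":
            deleted[change.src_sha].append(change.path)
    pairs = []
    for change in changes:
        if change.status == "A":
            for old_path in deleted.get(change.dst_sha, []):
                pairs.append((old_path, change.path))
    return sorted(pairs)
```

This sketch does not handle the two-path lines produced when rename detection (`-M`) is switched on, nor quoted paths; `-z` output would need a different splitter.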

		Linus


* Re: maildir / read-tree trivial merging getting in the way?
  2006-02-14  2:28 ` Junio C Hamano
@ 2006-02-14  2:35   ` Ben Clifford
  2006-02-14  2:36   ` Linus Torvalds
  1 sibling, 0 replies; 5+ messages in thread
From: Ben Clifford @ 2006-02-14  2:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, 13 Feb 2006, Junio C Hamano wrote:

>
> That would be more naturally done by writing that thing in a
> more reasonable scripting language (not shell, but Perl or
> Python), call ls-tree three times, do whatever merge to come up
> with the final shape of the tree, and then construct the tree
> with a single invocation of "update-index --index-info", maybe
> even starting from an empty index file.

yeah, looks like ls-tree x 3 is what I want and quite possibly I'll end up 
constructing a new index from scratch.

-- 
Ben
http://www.hawaga.org.uk/ben/


* Re: maildir / read-tree trivial merging getting in the way?
  2006-02-14  2:28 ` Junio C Hamano
  2006-02-14  2:35   ` Ben Clifford
@ 2006-02-14  2:36   ` Linus Torvalds
  1 sibling, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2006-02-14  2:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ben Clifford, git



On Mon, 13 Feb 2006, Junio C Hamano wrote:
> 
> That would be more naturally done by writing that thing in a
> more reasonable scripting language (not shell, but Perl or
> Python), call ls-tree three times, do whatever merge to come up
> with the final shape of the tree, and then construct the tree
> with a single invocation of "update-index --index-info", maybe
> even starting from an empty index file.

Exactly. Except that it probably makes sense to use "git-diff-tree" to try 
to avoid doing lots of unnecessary work in a script, if the normal case is 
that there's still a lot of stuff that hasn't changed.

So conceptually you would do three "git-ls-tree" invocations, but in 
_practice_ it's probably better to do just one "git-ls-tree", and then use 
"git-diff-tree" to basically generate the differences from that one 
ls-tree to the other cases of interest.

So start with the merge-base, for example, and then basically generate the 
"what changed" between the merge base and the two branch heads. 

That was the plan for doing merges initially, it just turned out that 
doing them in the index made things easier.
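
One way to read that suggestion: list the merge base once, then rebuild each branch head by replaying its changes on top of that listing. The sketch below assumes the changes have already been reduced to `(status, sha, path)` tuples taken from diff-tree's raw output; `apply_changes` is a made-up helper and only knows about adds, deletes and modifications.

```python
def apply_changes(base_tree, changes):
    """Return a new path -> sha mapping with the changes applied.

    base_tree: path -> blob sha, as listed from the merge base.
    changes:   iterable of (status, sha, path); sha is ignored for 'D'.
    """
    tree = dict(base_tree)
    for status, sha, path in changes:
        if status == "D":
            tree.pop(path, None)
        elif status in ("A", "M"):
            tree[path] = sha
        else:
            raise ValueError(f"unsupported status: {status}")
    return tree


# Data taken from the test repo earlier in the thread.
BASE = {
    "A": "40e0a6f540b1b457c61315f3ccf2f5ed628e2f36",
    "fish": "a8150e61a3a4c9941d29169ee639396547f40de2",
    "one": "b67e17aeb5938def7ee105c2afe9fbb30a28a872",
}
BASE_TO_RED = [
    ("D", "0" * 40, "A"),
    ("A", BASE["A"], "B"),
    ("D", "0" * 40, "fish"),
    ("A", BASE["fish"], "billygoat"),
]
BASE_TO_MASTER = [
    ("D", "0" * 40, "fish"),
    ("A", BASE["fish"], "lion"),
    ("D", "0" * 40, "one"),
    ("A", BASE["one"], "two"),
]
```

With this data, `apply_changes(BASE, BASE_TO_RED)` ends up with the paths `B`, `billygoat` and `one`, and the master variant with `A`, `lion` and `two`, matching the `ls` listings in the original message. When most of the tree is unchanged, the change lists stay short even though the base listing is large.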

			Linus

