* maildir / read-tree trivial merging getting in the way?
From: Ben Clifford @ 2006-02-14 2:18 UTC (permalink / raw)
To: git
I've spent a few hours playing round with maildir-aware merging.
The basic idea I'm trying to implement is to flip the index round so that
instead of looking at how the content has changed for a particular filename,
I'm looking at how the filenames have changed for a particular content.
So I'm using git read-tree -m to populate the index with entries for the
branches to merge so that I can then diddle round with those.
But the read-tree trivial merge logic seems to be getting in the way a bit.
In my test repo, I have two branches ('master' and 'red') forked from the base
point 'base':
in 'base':
$ ls
A fish one
in 'red':
$ ls
B billygoat one
in 'master':
$ ls
A lion two
From base, I renamed and cg add / cg rm'd to change A to B, one to two,
and fish to billygoat and lion to give the above.
When I read in the tree I get automatic resolving (down to stage 0) for the
added files. But actually in the output of my merge, I'm not always going to
want that to happen: In the A->B case, I do want to keep B (and need to remove
A), likewise in the one->two case.
But for fish->{billygoat,lion}, I only want one file to end up at stage 0, and
it might not be called either billygoat or lion - in maildir, the filenames are
more structured, and given a filename like
foo:2,SR and foo:2,SF I would want to compose the filenames to give me
foo:2,SRF.
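A minimal sketch of the flag composition described above, in Python. The function names are hypothetical, and the sketch assumes that "composing" means taking the union of the flags after ":2,", kept in first-seen order so that it reproduces the foo:2,SRF example. It does not look at the base version, so a flag removed on one branch would come back; a real merge would probably need the three-way view. Maildir convention lists flags in ASCII order, so a real tool might sort them instead.

```python
def split_maildir_name(name):
    """Split 'foo:2,SR' into ('foo', 'SR'); names without ':2,' have no flags."""
    base, sep, flags = name.partition(":2,")
    return base, (flags if sep else "")


def compose_flags(name_a, name_b):
    """Union the flags of two names for the same message, in first-seen order."""
    base_a, flags_a = split_maildir_name(name_a)
    base_b, flags_b = split_maildir_name(name_b)
    if base_a != base_b:
        raise ValueError("names refer to different messages")
    merged = []
    for flag in flags_a + flags_b:
        if flag not in merged:
            merged.append(flag)
    return f"{base_a}:2,{''.join(merged)}"


print(compose_flags("foo:2,SR", "foo:2,SF"))
```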
$ git read-tree -m base master red
$ git ls-files --stage
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 1 A
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 2 A
100644 40e0a6f540b1b457c61315f3ccf2f5ed628e2f36 0 B
100644 a8150e61a3a4c9941d29169ee639396547f40de2 0 billygoat
100644 a8150e61a3a4c9941d29169ee639396547f40de2 1 fish
100644 a8150e61a3a4c9941d29169ee639396547f40de2 0 lion
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 1 one
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 3 one
100644 b67e17aeb5938def7ee105c2afe9fbb30a28a872 0 two
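To illustrate "flipping the index round", here is a rough Python sketch that groups the listing above by blob SHA instead of by filename. The helper names are made up for this example. It assumes the usual "mode sha stage<TAB>path" layout of git ls-files --stage, with a fallback for listings (like the one quoted here) where the tab has been flattened to a space.

```python
from collections import defaultdict


def parse_stage_line(line):
    """Parse one 'git ls-files --stage' line into (mode, sha, stage, path)."""
    line = line.rstrip("\n")
    if "\t" in line:
        meta, path = line.split("\t", 1)
        mode, sha, stage = meta.split()
    else:
        # Fallback for listings where the tab was flattened to a space.
        mode, sha, stage, path = line.split(None, 3)
    return mode, sha, int(stage), path


def group_by_content(lines):
    """Map each blob SHA to the sorted (stage, path) pairs that carry it."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        _mode, sha, stage, path = parse_stage_line(line)
        groups[sha].append((stage, path))
    return {sha: sorted(entries) for sha, entries in groups.items()}
```

For the listing above, the entry for a8150e61... would hold billygoat and lion at stage 0 and fish at stage 1, which makes the fish->{billygoat,lion} conflict visible as a single group.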
Now, I think maybe I can just look at what has made it to stage 0 and play
round with those, but it makes me feel a little dirty - if anything, the index
indicates that a bunch of stuff has been correctly merged (by being at stage 0)
when in fact it hasn't.
So basically my question is: should I feel dirty about doing this and diddle
read-tree so that there's a flag to not do the trivial merges automatically?
--
* Re: maildir / read-tree trivial merging getting in the way?
From: Junio C Hamano @ 2006-02-14 2:28 UTC (permalink / raw)
To: Ben Clifford; +Cc: git
Ben Clifford <benc@hawaga.org.uk> writes:
> So basically my question is: should I feel dirty about doing this and
> diddle read-tree so that there's a flag to not do the trivial merges
> automatically?
I am mildly negative about touching read-tree for this kind of
non-SCM'ish usage.
If you are doing read-tree without doing any trivial merge, then
you would use ls-files to inspect each stage, decide what final
shape of the tree you want, and construct such a tree in the
index.
That would be more naturally done by writing that thing in a
more reasonable scripting language (not shell, but Perl or
Python), call ls-tree three times, do whatever merge to come up
with the final shape of the tree, and then construct the tree
with a single invocation of "update-index --index-info", maybe
even starting from an empty index file.
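The approach suggested above could be sketched roughly as follows in Python. This is only an outline under stated assumptions: the function names are invented for illustration, the actual merge policy is left to the caller, and paths are assumed to contain no characters that git would quote (otherwise the -z variants of both commands would be the safer choice).

```python
import subprocess


def parse_ls_tree(output):
    """Turn 'git ls-tree -r <tree>' output into {path: (mode, sha)}."""
    entries = {}
    for line in output.splitlines():
        meta, path = line.split("\t", 1)
        mode, _type, sha = meta.split()
        entries[path] = (mode, sha)
    return entries


def ls_tree(treeish):
    """Run git ls-tree -r on one tree-ish and parse the result."""
    out = subprocess.run(["git", "ls-tree", "-r", treeish],
                         check=True, capture_output=True, text=True).stdout
    return parse_ls_tree(out)


def index_info(entries):
    """Format {path: (mode, sha)} as 'mode sha<TAB>path' lines."""
    return "".join(f"{mode} {sha}\t{path}\n"
                   for path, (mode, sha) in sorted(entries.items()))


def write_index(entries):
    """Feed the merged entries to update-index in a single invocation."""
    subprocess.run(["git", "update-index", "--index-info"],
                   input=index_info(entries), check=True, text=True)
```

A caller would run ls_tree() on base, master and red, compute the merged {path: (mode, sha)} mapping however it likes, and hand that to write_index().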
* Re: maildir / read-tree trivial merging getting in the way?
From: Ben Clifford @ 2006-02-14 2:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Mon, 13 Feb 2006, Junio C Hamano wrote:
>
> That would be more naturally done by writing that thing in a
> more reasonable scripting language (not shell, but Perl or
> Python), call ls-tree three times, do whatever merge to come up
> with the final shape of the tree, and then construct the tree
> with a single invocation of "update-index --index-info", maybe
> even starting from an empty index file.
yeah, looks like ls-tree x 3 is what I want and quite possibly I'll end up
constructing a new index from scratch.
--
Ben
http://www.hawaga.org.uk/ben/
* Re: maildir / read-tree trivial merging getting in the way?
From: Linus Torvalds @ 2006-02-14 2:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Ben Clifford, git
On Mon, 13 Feb 2006, Junio C Hamano wrote:
>
> That would be more naturally done by writing that thing in a
> more reasonable scripting language (not shell, but Perl or
> Python), call ls-tree three times, do whatever merge to come up
> with the final shape of the tree, and then construct the tree
> with a single invocation of "update-index --index-info", maybe
> even starting from an empty index file.
Exactly. Except that it probably makes sense to use "git-diff-tree" to try
to avoid doing lots of unnecessary work in a script, if the normal case is
that there's still a lot of stuff that hasn't changed.
So conceptually you would do three "git-ls-tree" invocations, but in
_practice_ it's probably better to do just one "git-ls-tree", and then use
"git-diff-tree" to basically generate the differences from that one
ls-tree to the other cases of interest.
So start with the merge-base, for example, and then basically generate the
"what changed" between the merge base and the two branch heads.
That was the plan for doing merges initially, it just turned out that
doing them in the index made things easier.
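As an illustration of "one ls-tree plus diff-tree", the following Python sketch derives a branch head's listing from the merge-base listing and the raw output of "git-diff-tree -r base head". The function name is hypothetical. It assumes no rename detection is requested, so only the A, D, M and T statuses appear, and that paths are not quoted.

```python
def apply_raw_diff(base_entries, raw_lines):
    """Derive a head's {path: (mode, sha)} from the base listing plus a raw diff."""
    head = dict(base_entries)
    for line in raw_lines:
        if not line.startswith(":"):
            continue  # skip anything that is not a raw diff record
        meta, path = line.rstrip("\n").split("\t", 1)
        _old_mode, new_mode, _old_sha, new_sha, status = meta[1:].split()
        if status == "D":
            head.pop(path, None)
        else:
            # "A", "M" or "T": the path now carries the new mode and blob.
            head[path] = (new_mode, new_sha)
    return head
```

Unchanged entries are never touched, so the script does work proportional to what changed rather than to the size of the tree.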
Linus
* Re: maildir / read-tree trivial merging getting in the way?
From: Linus Torvalds @ 2006-02-14 2:32 UTC (permalink / raw)
To: Ben Clifford; +Cc: git
On Tue, 14 Feb 2006, Ben Clifford wrote:
>
> I've spent a few hours playing round with maildir-aware merging.
>
> The basic idea I'm trying to implement is to flip the index round so that
> instead of looking at how the content has changed for a particular filename,
> I'm looking at how the filenames have changed for a particular content.
>
> So I'm using git read-tree -m to populate the index with entries for the
> branches to merge so that I can then diddle round with those.
>
> But the read-tree trivial merge logic seems to be getting in the way a bit.
You are much better off working with "git-ls-tree", or perhaps
"git-diff-tree".
The latter in particular will show you what got added and what got
deleted, but will quickly ignore any common entries (which is probably
exactly what you want).
> So basically my question is: should I feel dirty about doing this and diddle
> read-tree so that there's a flag to not do the trivial merges automatically?
You should try to avoid git-read-tree entirely, I suspect.
All the things git-read-tree does are wrong for you. Notably, it very much
on purpose will match things up name-by-name, and it does a lot of extra
work to create a sorted version of the index to do the trivial merges
quickly. The thing is, it doesn't even do that the smart way.
Now, git-read-tree actually does a _great_ job - don't get me wrong. It's
just that the job it does isn't really suitable for your usage, and it's
doing some things the "simple and stupid" way instead of being very smart
about them, just because they aren't that important under normal loads.
For example, in a three-way merge (with an index), it will basically have
four sorted inputs that it needs to interleave. Now, there's a _smart_ way
to interleave sorted input, and there's a stupid one. The smart way is to
read the sources all together, and just pick the right sorted order, and
interleave them all together.
That's not what git-read-tree does.
git-read-tree will read them one by one, and use "insertion sort" to
maintain the result in sorted order. Now, insertion sort isn't totally
idiotic (it's not doing a bogo-sort, at least), but it _is_ pretty damn
silly when all the sources are already sorted and known ahead of time.
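The difference between the two strategies can be shown with a small Python illustration. This is not git's C code, just a sketch of the idea: heapq.merge interleaves already-sorted inputs in a single pass, while the second function inserts entries one by one into a sorted list, which has to shift elements on every insertion. The (path, stage) tuples are an assumed stand-in for index entries.

```python
import bisect
import heapq


def interleave(*sorted_sources):
    """Merge already-sorted (path, stage) lists in one pass."""
    return list(heapq.merge(*sorted_sources))


def insertion_sorted(*sources):
    """Read the sources one by one, keeping the result sorted after each insert."""
    result = []
    for source in sources:
        for entry in source:
            bisect.insort(result, entry)
    return result


base = [("A", 1), ("fish", 1), ("one", 1)]
master = [("A", 2), ("lion", 2), ("two", 2)]
red = [("B", 3), ("billygoat", 3), ("one", 3)]

# Both give the same order; only the amount of work differs.
assert interleave(base, master, red) == insertion_sorted(base, master, red)
```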
So git-read-tree does some stupid things, and scales badly with really big
trees. The good news is that we can fix it - the bad news is that my
motivation for it is pretty low, since "really big" means "much bigger
than the kernel" ;)
In contrast "git-diff-tree -r a b" does the _smart_ thing, and scales
linearly with tree size _and_ can take advantage of subdirectories not
changing (the latter is apparently not an issue for you, but can be one in
other circumstances).
(The "raw output" from git-diff-tree is also very easy to parse. Don't use
the "-p" (patch) form; the raw "this is how the SHAs changed" output sounds
like it's exactly what you want, assuming you're interested in renames
with no content change)
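For renames with no content change, the raw output can be matched up by SHA alone. The sketch below is a hedged Python example with invented function names: it assumes raw lines of the form ":oldmode newmode oldsha newsha status<TAB>path" from "git-diff-tree -r a b" without rename detection, and pairs each deleted path with the added paths that carry the same blob.

```python
from collections import defaultdict


def parse_raw(line):
    """Parse one raw diff-tree line into (status, old_sha, new_sha, path)."""
    meta, path = line.rstrip("\n").split("\t", 1)
    _old_mode, _new_mode, old_sha, new_sha, status = meta[1:].split()
    return status, old_sha, new_sha, path


def exact_renames(raw_lines):
    """Map each deleted path to the added paths that carry the same blob SHA."""
    deleted = {}
    added = defaultdict(list)
    for line in raw_lines:
        if not line.startswith(":"):
            continue
        status, old_sha, new_sha, path = parse_raw(line)
        if status == "D":
            deleted[path] = old_sha
        elif status == "A":
            added[new_sha].append(path)
    return {path: added[sha] for path, sha in deleted.items() if sha in added}
```

Run once per branch against the merge base, this would report fish -> billygoat on one side and fish -> lion on the other, leaving the maildir-specific decision about the final name to the caller.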
Linus