From: Catalin Marinas <catalin.marinas@gmail.com>
To: Blaisorblade <blaisorblade@yahoo.it>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: git-merge-cache / StGIT - gitmergeonefile.py: merging one tree into another rather than two trees into merge base
Date: Thu, 15 Sep 2005 11:06:50 +0100 [thread overview]
Message-ID: <tnxslw6d4qt.fsf@arm.com> (raw)
In-Reply-To: <200509142019.04667.blaisorblade@yahoo.it> (blaisorblade@yahoo.it's message of "Wed, 14 Sep 2005 20:19:04 +0200")
Blaisorblade <blaisorblade@yahoo.it> wrote:
> On Sunday 11 September 2005 10:24, Catalin Marinas wrote:
>> That's indeed very slow. How may files are modified in each patch?
>
> 1 files, in most cases - biggest one had 5 files changed (and is big
> 4,8K).
OK, this doesn't really matters since the diff between the new HEAD
and the old bottom of the patch is big.
>> Actually, it will three-way merge the diff between top and bottom with
>> the diff between HEAD and bottom. The time will depend on the size of
>> both diffs, not just the maximum.
>
> Yes, true, but the problem in this case was that one diff (the
> upstream one) was extremely huge. And it even involved some files
> removal - I almost never caught, with top, anything else than
> gitmergeonefile.py or git-update-cache --remove.
git-read-tree -m doesn't handle the case when a file is removed from
one branch and unmodified in the other, which is what happens in your
test. For each of these removed files, git-merge-cache will call
gitmergeonefile.py which calls 'git-update-cache --remove'.
An improvement would be to make git-read-tree smarter and only leave
out the three-way merges and the conflicts. Otherwise, implementing my
own git-merge-cache which includes gitmergeonefile.py or implementing
gitmergeonefile in C would be an improvement but I'm not sure it is
that significant.
> I guess that moves between stages don't have the big memmove()
> problem, while calling --remove shows it.
>
> And, with master being the SHA1 I pushed onto
> (8920e8f94c44e31a73bdf923b04721e26e88cadd),
>
> git-diff-tree -r v2.6.13 master |grep ' D'|wc -l
>
> gives 189. I guess that 2.5 seconds per each removal are enough to
> justify the time which was needed above. Actually, it even had to be
> less than a second per removal, otherwise it'd have been even
> slower.
The 2.5 figure was for trivial merges and most of the time was spent
in git-read-tree.
>> That's not a three-way diff algorithm. I could generate the diff of the
>> patch and apply it with git-apply but this might fail and you wouldn't
>> even get reject files (not sure whether git-apply supports this).
>
> Well, true, but either diff3 or merge would work fairly well. Also,
> trying to apply the patch and resorting to something more complex
> wouldn't create so many problems.
But this would slow things down for people pulling changes on a daily
(or even more often) basis. It could be added as an option to 'stg
push', i.e. when you expect something like this, just pass a
'--nothreeway' option.
>> Another problem is that it won't detect upstream merges of your patch,
>> you would need to check it by hand. If all you need is a way to apply a
>> patch using the 'patch' tool, have a look at Quilt. It has similar
>> commands (since StGIT is based on its ideas) but uses the diff/patch
>> tools and is much faster. It also generates rejects if a patch doesn't
>> apply cleanly but I find the diff3 information, together with the
>> original files left in place, much more useful.
>
> Yes, in most cases. I've had something to complain with diff3,
> however (that's a separate story). Maybe merge would be even better?
I couldn't find the difference between merge and diff3 -E. Is there
any?
>> > In the git-read-tree manpage (which is probably outdated), this is
>> > documented as follows:
>> >
>> > o stage 1 and stage 3 are the same and stage 2 is different:
>> > take stage 2 (some work has been done on stage 2)
>> >
>> > But it didn't happen, which is strange.
>
>> This means that the file is not modified by an StGIT patch but it is
>> modified upstream. This case works fine for me (it's a bug if it
>> doesn't). Did you get a conflict? Are the original files left in place?
>
> No, it's not a correctness bug for me - just a performance bug. Why did
> gitupdateonefile.py need to call git-update-cache --remove? If git-read-tree
> had done his duty, this wouldn't be needed.
Do you mean it wasn't modified by the StGIT patch (and hence stage 3
is the same as stage 1) but it was removed in stage 2. This case is
*not* handled by git-read-tree, though I think it should deal with
this trivial case as well, it's not that different from the case
below.
If stage 2 is a simple SHA1 modification (not file removal) the case
was already handled by git-read-tree and there should only be a stage
0 entry for this file. git-merge-cache should *not* call
gitmergeonefile.py.
You can run git-read-tree manually and check the stages for this case.
>> > So, would stgit delete a file created in patch #1 when merge-applying
>> > patch #2?
>
>> Definitely not.
>
> Gonna test - but if you look at the manpage of read-tree, this would
> probably be the expected behaviour.
If patch #2 didn't explicitely removed this file and if patch #1 is
applied, it won't delete it (unless there is a bug).
--
Catalin
next prev parent reply other threads:[~2005-09-15 10:07 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-09-10 18:27 git-merge-cache / StGIT - gitmergeonefile.py: merging one tree into another rather than two trees into merge base Blaisorblade
2005-09-11 8:24 ` Catalin Marinas
2005-09-12 12:59 ` Chuck Lever
2005-09-14 17:46 ` Blaisorblade
2005-09-14 18:16 ` Blaisorblade
2005-09-14 18:19 ` Blaisorblade
2005-09-15 10:06 ` Catalin Marinas [this message]
2005-09-16 18:45 ` Blaisorblade
2005-09-16 19:59 ` Junio C Hamano
2005-09-17 9:31 ` Catalin Marinas
2005-09-19 15:50 ` omitted test scripts? Chuck Lever
2005-09-19 16:19 ` Junio C Hamano
2005-09-19 17:01 ` Daniel Barkalow
2005-09-19 18:54 ` Matthias Urlichs
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tnxslw6d4qt.fsf@arm.com \
--to=catalin.marinas@gmail.com \
--cc=blaisorblade@yahoo.it \
--cc=git@vger.kernel.org \
--cc=junkio@cox.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.