* [RFH] git cherry-pick takes forever
@ 2008-09-10  8:26 Michal Vitecek
  2008-09-10 10:00 ` Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Vitecek @ 2008-09-10  8:26 UTC
  To: git list

 Hello everyone,

 I have two git repositories: one is the origin of the other. However, no
 merging is done because the projects in the repositories differ quite a
 bit while still sharing the same core. So to propagate changes I
 cherry-pick the useful ones from one repository to the other.

 However, 'git cherry-pick' has lately started to take almost forever:

 $ time git cherry-pick b42b77e66a83f1298d9900a9bb1078b9b42e8618
 Finished one cherry-pick.
 Created commit 7caef83: - removed some superfluous newlines
 2 files changed, 0 insertions(+), 2 deletions(-)
 git cherry-pick b42b77e66a83f1298d9900a9bb1078b9b42e8618  282.97s user 34.69s system 100% cpu 5:17.63 total

 Both repositories have approximately 16k commits and their fork
 point (merge base) is 250 to 490 commits back. 'git gc' (even
 with --prune) has been run.

 What can I do to make the 'git cherry-pick' instant again?

        Thank you,

 P.S.: I'm using git-1.6.0.1.
-- 
		Michal Vitecek		(fuf@mageo.cz)


* Re: [RFH] git cherry-pick takes forever
  2008-09-10  8:26 [RFH] git cherry-pick takes forever Michal Vitecek
@ 2008-09-10 10:00 ` Junio C Hamano
  2008-09-11  7:56   ` Michal Vitecek
  0 siblings, 1 reply; 3+ messages in thread
From: Junio C Hamano @ 2008-09-10 10:00 UTC
  To: Michal Vitecek; +Cc: git list

Michal Vitecek <fuf@mageo.cz> writes:

>  Hello everyone,
>
>  I have two git repositories: one is the origin of the other. However, no
>  merging is done because the projects in the repositories differ quite a
>  bit while still sharing the same core. So to propagate changes I
>  cherry-pick the useful ones from one repository to the other.
>
>  However, 'git cherry-pick' has lately started to take almost forever:

Can you define "lately"?  Is it a function of your git version, or is it a
function of the age of your repositories?

>  $ time git cherry-pick b42b77e66a83f1298d9900a9bb1078b9b42e8618
>  Finished one cherry-pick.
>  Created commit 7caef83: - removed some superfluous newlines
>  2 files changed, 0 insertions(+), 2 deletions(-)
>  git cherry-pick b42b77e66a83f1298d9900a9bb1078b9b42e8618  282.97s user 34.69s system 100% cpu 5:17.63 total
>
>  Both repositories have approximately 16k commits and their fork
>  point (merge base) is 250 to 490 commits back.

When talking about cherry-pick, the size of the history (unless the
repository has too many objects and is badly packed) does not matter; the
operation is purely about your current state, the cherry-picked commit
itself, and the parent commit of the cherry-picked one.

Taking 5 minutes to cherry-pick a change to only two paths, one line
deletion each, is plain ridiculous, but if the tree state of cherry-picked
commit and the tree state of the target is vastly different (e.g. almost
no common pathnames), the behaviour is certainly understandable.  Ancient
git used straight three-way merge for cherry-pick, but recent ones use
more expensive "recursive-merge", which tries to detect renames.  If the
states of trees are very dissimilar, you can end up wasting a lot of time.
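To make "tries to detect renames" concrete, here is a minimal sketch in a throwaway repository. The file names and commit messages are made up for illustration and are not from the reported repositories. It only shows how the same change is reported with and without rename detection; it does not reproduce the slowdown.

```shell
#!/bin/sh
# Sketch only: a scratch repository with hypothetical file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.invalid"

# One file with a few lines, committed, then renamed without changes.
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > old.txt
git add old.txt
git commit -q -m "add old.txt"
git mv old.txt new.txt
git commit -q -m "rename old.txt to new.txt"

# With -M, git pairs the deleted and added paths and reports one rename.
with_renames=$(git diff -M --name-status HEAD^ HEAD)
# With --no-renames, the same change is reported as a delete plus an add.
without_renames=$(git diff --no-renames --name-status HEAD^ HEAD)

printf '%s\n' "with -M:" "$with_renames"
printf '%s\n' "with --no-renames:" "$without_renames"
```

Pairing deleted and added paths is the extra work: the more paths that exist on only one side, the more candidate pairs there are to compare.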

    $ H=$(git rev-parse 7caef83^) ;# the commit before cherry-pick
    $ C=b42b77e6 ;# the cherry-picked one

cherry-pick operation roughly runs these two diffs:

    $ time git diff --shortstat -M $H $C
    $ time git diff --shortstat -M $H $C^1

and uses the result to perform its work.  Can you clock these?

If you rarely have renames, it may be much more efficient to run "git
format-patch -1 --stdout $C | git am -3" instead of cherry-pick.
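The format-patch/am route can be tried safely in a scratch repository first. The following is a sketch under assumptions: the branch name "source", the file names, and the commit messages are invented for the example, and C stands for the commit to carry over, as above.

```shell
#!/bin/sh
# Sketch only: scratch repository, hypothetical branch and file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.invalid"

# Shared starting point.
printf 'a\n\nb\n' > core.txt
git add core.txt
git commit -q -m "base"
target=$(git symbolic-ref --short HEAD)

# The commit to carry over lives on another branch.
git checkout -q -b source
printf 'a\nb\n' > core.txt
git commit -q -a -m "removed a superfluous newline"
C=$(git rev-parse HEAD)

# The target branch has moved on with an unrelated change.
git checkout -q "$target"
printf 'unrelated\n' > other.txt
git add other.txt
git commit -q -m "unrelated change on target"

# Export the single commit as a patch and apply it here; -3 lets
# git am fall back to a three-way merge if the patch does not apply cleanly.
git format-patch -1 --stdout "$C" | git am -3 -q

git log -1 --format=%s
```

The patch is applied path by path, so paths the patch does not touch are not examined for renames.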


* Re: [RFH] git cherry-pick takes forever
  2008-09-10 10:00 ` Junio C Hamano
@ 2008-09-11  7:56   ` Michal Vitecek
  0 siblings, 0 replies; 3+ messages in thread
From: Michal Vitecek @ 2008-09-11  7:56 UTC
  To: git list; +Cc: Junio C Hamano

 Hello,

Junio C Hamano wrote:
>Michal Vitecek <fuf@mageo.cz> writes:
>>  I have two git repositories: one is the origin of the other. However, no
>>  merging is done because the projects in the repositories differ quite a
>>  bit while still sharing the same core. So to propagate changes I
>>  cherry-pick the useful ones from one repository to the other.
>>
>>  However, 'git cherry-pick' has lately started to take almost forever:
>
>Can you define "lately"?  Is it a function of your git version, or is it a
>function of the age of your repositories?

 It's a function of cherry-picking some commits: afterwards
 cherry-picking crawls. One of the commits removes a number of files
 (614) and also renames some (303).

>>  $ time git cherry-pick b42b77e66a83f1298d9900a9bb1078b9b42e8618
>>  Finished one cherry-pick.
>>  Created commit 7caef83: - removed some superfluous newlines
>>  2 files changed, 0 insertions(+), 2 deletions(-)
>>  git cherry-pick b42b77e66a83f1298d9900a9bb1078b9b42e8618  282.97s user 34.69s system 100% cpu 5:17.63 total
>>
>>  Both repositories have approximately 16k commits and their fork
>>  point (merge base) is 250 to 490 commits back.
>
>When talking about cherry-pick, the size of the history (unless the
>repository has too many objects and is badly packed) does not matter; the
>operation is purely about your current state, the cherry-picked commit
>itself, and the parent commit of the cherry-picked one.

 I thought so too, but after cherry-picking started taking so long I
 began to doubt my thoughts :)

>Taking 5 minutes to cherry-pick a change to only two paths, one line
>deletion each, is plain ridiculous, but if the tree state of cherry-picked
>commit and the tree state of the target is vastly different (e.g. almost
>no common pathnames), the behaviour is certainly understandable.  Ancient
>git used straight three-way merge for cherry-pick, but recent ones use
>more expensive "recursive-merge", which tries to detect renames.  If the
>states of trees are very dissimilar, you can end up wasting a lot of time.
>
>    $ H=$(git rev-parse 7caef83^) ;# the commit before cherry-pick
>    $ C=b42b77e6 ;# the cherry-picked one
>
>cherry-pick operation roughly runs these two diffs:
>
>    $ time git diff --shortstat -M $H $C

 $ time git diff --shortstat -M $H $C
 2 files changed, 0 insertions(+), 2 deletions(-)
 git diff --shortstat -M $H $C  0.00s user 0.00s system 72% cpu 0.006 total

>    $ time git diff --shortstat -M $H $C^1

 $ time git diff --shortstat -M $H $C\^1
 git diff --shortstat -M $H $C\^1  0.00s user 0.00s system 0% cpu 0.003 total

>and uses the result to perform its work.  Can you clock these?
>
>If you rarely have renames, it may be much more efficient to run "git
>format-patch -1 --stdout $C | git am -3" instead of cherry-pick.

 Turning off rename detection in diff (via 'git config --add diff.renames
 false') helped and 'git cherry-pick' is instant now.
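 For anyone who wants to try the same workaround, here is a sketch in a scratch repository. The repository-wide setting is the one described above. The per-command form, 'git cherry-pick -X no-renames', is an assumption that only holds for much newer git releases than the 1.6.0.1 used in this thread, so check that your version accepts it. Branch, file, and commit names are made up for the example.

```shell
#!/bin/sh
# Sketch only: scratch repository, hypothetical branch and file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.invalid"

printf 'a\n\nb\n' > core.txt
git add core.txt
git commit -q -m "base"
target=$(git symbolic-ref --short HEAD)

git checkout -q -b source
printf 'a\nb\n' > core.txt
git commit -q -a -m "removed a superfluous newline"
C=$(git rev-parse HEAD)

git checkout -q "$target"
printf 'unrelated\n' > other.txt
git add other.txt
git commit -q -m "unrelated change on target"

# Workaround from the thread: turn rename detection off for this repository.
git config diff.renames false

# Assumed alternative on newer git: disable rename detection for this
# one cherry-pick only, leaving the repository configuration alone.
git cherry-pick -X no-renames "$C" >/dev/null

git log -1 --format=%s
```

 The configuration switch also changes what 'git diff' and 'git log' report in that repository, while the per-command option affects only the one cherry-pick.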

 Maybe my repositories are "strange" in some way. I would be more than
 happy to provide more information if needed.

        Thank you,
-- 
		Michal Vitecek		(fuf@mageo.cz)
