git.vger.kernel.org archive mirror
* git-svn performance
@ 2014-10-17 20:47 Fabian Schmied
  2014-10-19  0:32 ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Fabian Schmied @ 2014-10-17 20:47 UTC (permalink / raw)
  To: git

Hi,

I'm currently migrating an SVN repository to Git using git-svn (Git
for Windows 1.8.3-preview20130601), and I'm experiencing severe
performance problems with "git svn fetch". Commits to the SVN "trunk"
are fetched very fast (a few seconds or so per SVN revision), but
commits to some branches ("hotfix" branches) are currently taking
about 9 minutes per revision. I fear that the time per commit on these
branches keeps increasing and that the migration might never finish at
all.

For the commits that take such a long time, git-svn always outputs
lots of warnings about ignored SVN cherry-picks, and it tells me it
can't find a revmap for the path being imported. (See [1].)

AFAICS, the offending commits take place on some branches that include
a lot of manually merged ("SVN cherry-picked") revisions. Git-svn
seems to be checking something (though I don't know what) that makes
importing these revisions really slow. And it repeats this for every
revision on these branches with increasing work to do.

Is there anything I can do to speed this up? (I already tried
increasing --log-window-size to 500, but it had no effect.)

Thank you, best regards,
Fabian

[1]
        M       foo/bar/XXX.xml
        M       foo/bar/YYY.xml
W:svn cherry-pick ignored (/branches/frob:6940-7068) - missing 12
commit(s) (eg abeaece820ceae44ebf2c06011cf43bbcbf4b1ce)
W:svn cherry-pick ignored (/branches/feature:3316-4798,4811,4827) -
missing 10 commit(s) (eg e255fff14ab1e581f21671ca8b36c0747869cf8c)
W:svn cherry-pick ignored
(/hotfixes/ZZZ.159:2131,2133,2145-2146,2148,2169) - missing 10
commit(s) (eg e04b0326c998f0611c18144b3ed8f686d3b52f4c)
W:svn cherry-pick ignored
(/hotfixes/ZZZ.333:4536,4610-4611,4625,4665,4669,4685,4713,4745,4785,4788,4908-4917,4920,4933-4944,4955,5003,5103,5174,5222,5227,
5261,5267,5306,5310,5321,5360,5416,5467,5501,5508,5599-5614,5650-5651,5757,5761-5762,5764,5778-5779,5784,5811,5814,5819,5823,5825,5836-5838,5860,5862,5873,5889,
5910,5924,5948) - missing 137 commit(s) (eg
9daec24cbdf55200d2cdfb0cd6b3f10485e296ac)
C:\Program Files (x86)\Git\bin\perl.exe: *** WFSO timed out
W:svn cherry-pick ignored (/hotfixes/ZZZ.333.39:5696,5847) - missing
84 commit(s) (eg 9daec24cbdf55200d2cdfb0cd6b3f10485e296ac)
W:svn cherry-pick ignored (/hotfixes/AAA:5905,6095) - missing 119
commit(s) (eg 9daec24cbdf55200d2cdfb0cd6b3f10485e296ac)
W:svn cherry-pick ignored (/hotfixes/BBB_1.1:6971) - missing 198
commit(s) (eg 9daec24cbdf55200d2cdfb0cd6b3f10485e296ac)
W:svn cherry-pick ignored
(/hotfixes/CCC:6134,6164,6168,6174,6206,6211,6237,6239,6244-6245,6250,6257,6269,6271,6276,6289-6292,6294,6296,6301-6302,6313,6315-6316,6329,6333,6379,6383,6394,6405,6411,6456,6478,6483,6491,6519,6537,6557)
- missing 194 commit(s) (eg 9daec24cbdf55200d2cdfb0cd6b3f10485e296ac)
W:svn cherry-pick ignored (/hotfixes/DDD:7635) - missing 1 commit(s)
(eg 6a3ba817635eb3a9411a307924dec393311d93be)
W:svn cherry-pick ignored
(/hotfixes/EEE_1.2:7786,7794,7797,7803,7829-7830,7843,7886,7889,7933,7937,7949,7953)
- missing 80 commit(s) (eg e78b1bc68f7a9b041588a39f3fa5e1a61f98942b)
W:svn cherry-pick ignored
(/hotfixes/EEE_1.3:8159,8170,8173-8174,8177,8181-8182,8185,8187,8194-8195,8201,8203,8206,8251,8255,8257,8259-8262,8265,8280,8286,8294,8296,8304-8305,8312,8318,8323,8327,8363,8387-8388,8390,8422-8423,8432,8446,8536-8537,8548-8549,8556,8559,8566,8569,8572,8578,8597-8598,8602,8617,8619,8655,8687,8720)
- missing 104 commit(s) (eg 33febd4591f42a9d871ba330432840917b157f9e)
W:svn cherry-pick ignored
(/hotfixes/EEE_1.4:8766,8768,8770,8777-8779,8795-8796,8802-8809,8812-8814,8816-8817,8820,8823,8825,8827,8831,8836,8841,8845,8848-8852,8854-8855,8866,8868-8869,8871-8873,8875-8878,8880,8888,8892,8911-8912,8917-8918,8946,8956-8957,8964,8984,8994,9003,9008,9011,9029,9038,9040,9046-9048,9055,9086,9101,9108,9111,9113,9124,9129,9133,9138-9139,9150,9152,9154,9156,9172,9174,9188-9189,9208,9211,9217)
- missing 44 commit(s) (eg 0621fb44de682650d762c707b102bc2472c088f8)
W:svn cherry-pick ignored
(/hotfixes/EEE_1.5:9412,9421,9430,9433-9436,9439,9441,9449,9459,9468,9529,9548,9561,9568,9605-9606,9612,9614,9617,9628,9630-9631,9637,9687,9807)
- missing 41 commit(s) (eg 1bd1a9b72336bf4d3839a00348b7f2a52368c16c)
W:svn cherry-pick ignored
(/trunk:9852-9853,9857,9859,9862,9868,9872,9876,9879,9890,9895,9926-9927,9933,9953,9956,9960-9962)
- missing 60 commit(s) (eg 3322e7ffc6ab49181976d9e94c91a4556951f38a)
Couldn't find revmap for https://the-svn-server/svn/something/trunk/foo
r9963 = 597df48cb830825f9029d1cfdf45df024d7fd3dd (refs/remotes/EEE_1.6)
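
To see which merge sources account for most of the missing commits, the warnings in [1] can be tallied with a short script. This is only a minimal sketch and not part of git-svn: the regular expression and the function name `summarize_ignored_cherry_picks` are assumptions derived solely from the warning format shown above, including the way the mail client wrapped long lines.

```python
import re

# Assumed format, taken from the log above:
#   W:svn cherry-pick ignored (<path>:<revs>) - missing <N> commit(s) (eg <sha1>)
# \s also matches newlines, so warnings wrapped across lines still match.
WARNING_RE = re.compile(
    r"W:svn cherry-pick ignored\s*\((?P<path>/[^:()]+):(?P<revs>[^)]*)\)"
    r"\s*-\s*missing\s+(?P<missing>\d+)\s+commit"
)


def summarize_ignored_cherry_picks(log_text):
    """Return {merge source path: total 'missing' commits reported}."""
    summary = {}
    for match in WARNING_RE.finditer(log_text):
        path = match.group("path")
        summary[path] = summary.get(path, 0) + int(match.group("missing"))
    return summary


# Two warnings copied from the log above, one of them wrapped.
SAMPLE = """\
W:svn cherry-pick ignored (/branches/frob:6940-7068) - missing 12
commit(s) (eg abeaece820ceae44ebf2c06011cf43bbcbf4b1ce)
W:svn cherry-pick ignored
(/hotfixes/ZZZ.159:2131,2133,2145-2146,2148,2169) - missing 10
commit(s) (eg e04b0326c998f0611c18144b3ed8f686d3b52f4c)
"""

if __name__ == "__main__":
    for path, missing in sorted(summarize_ignored_cherry_picks(SAMPLE).items()):
        print(f"{missing:5d}  {path}")
```

Feeding the full "git svn fetch" output through this would show whether a few merge sources dominate the warnings.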

* Re: git-svn performance
@ 2014-10-22 17:38 Hin-Tak Leung
  2014-10-25  0:02 ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Hin-Tak Leung @ 2014-10-22 17:38 UTC (permalink / raw)
  To: normalperson, stoklund
  Cc: fabian.schmied, git, sam, stevenrwalter, waste.manager, amyrick

------------------------------
On Tue, Oct 21, 2014 10:00 BST Eric Wong wrote:

>Jakob Stoklund Olesen <stoklund@2pi.dk> wrote:
>> Yes, but I think you can remove cached_mergeinfo_rev too. 
>
>Thanks, pushed the patch at the bottom, too.
>Also started working on some memory reductions here:
> http://mid.gmane.org/20141021033912.GA27462@dcvr.yhbt.net
>But there seem to be more problems :<
>
>----------------------------8<-----------------------------
>From: Eric Wong <normalperson@yhbt.net>
>Date: Tue, 21 Oct 2014 06:23:22 +0000
>Subject: [PATCH] git-svn: remove mergeinfo rev caching
>
>This should further reduce memory usage from the new mergeinfo
>speedups without hurting performance too much, assuming
>reasonable latency to the SVN server.
>
>Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
>Suggested-by: Jakob Stoklund Olesen <stoklund@2pi.dk>
>Signed-off-by: Eric Wong <normalperson@yhbt.net>
>---
> perl/Git/SVN.pm | 30 +++++++++---------------------
> 1 file changed, 9 insertions(+), 21 deletions(-)
>
>diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
>index f8a75b1..4364506 100644
>--- a/perl/Git/SVN.pm
>+++ b/perl/Git/SVN.pm
>@@ -1710,32 +1710,20 @@ sub mergeinfo_changes {
>     my %minfo = map {split ":", $_ } split "\n", $mergeinfo_prop;
>     my $old_minfo = {};
> 
>-    # Initialize cache on the first call.
>-    unless (defined $self->{cached_mergeinfo_rev}) {
>-        $self->{cached_mergeinfo_rev} = {};
>-    }
>-
>-    my $cached_rev = $self->{cached_mergeinfo_rev}{$old_path};
>-    unless (defined $cached_rev && $cached_rev == $old_rev) {
>-        my $ra = $self->ra;
>-        # Give up if $old_path isn't in the repo.
>-        # This is probably a merge on a subtree.
>-        if ($ra->check_path($old_path, $old_rev) != $SVN::Node::dir) {
>-            warn "W: ignoring svn:mergeinfo on $old_path, ",
>-                "directory didn't exist in r$old_rev\n";
>-            return {};
>-        }
>-    }
>-    my (undef, undef, $props) = $self->ra->get_dir($old_path, $old_rev);
>+    my $ra = $self->ra;
>+    # Give up if $old_path isn't in the repo.
>+    # This is probably a merge on a subtree.
>+    if ($ra->check_path($old_path, $old_rev) != $SVN::Node::dir) {
>+        warn "W: ignoring svn:mergeinfo on $old_path, ",
>+            "directory didn't exist in r$old_rev\n";
>+        return {};
>+    }
>+    my (undef, undef, $props) = $ra->get_dir($old_path, $old_rev);
>     if (defined $props->{"svn:mergeinfo"}) {
>         my %omi = map {split ":", $_ } split "\n",
>             $props->{"svn:mergeinfo"};
>         $old_minfo = \%omi;
>     }
>-    $self->{cached_mergeinfo_rev}{$old_path} = $old_rev;
>-
>-    # Cache the new mergeinfo.
>-    $self->{cached_mergeinfo_rev}{$path} = $rev;
> 
>     my %changes = ();
>     foreach my $p (keys %minfo) {
>-- 
>EW
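
For readers unfamiliar with the property being handled in the patch above: svn:mergeinfo is a newline-separated list of "path:revision-ranges" entries, and the quoted hunk parses the old and the new value into hashes. The following Python sketch restates that idea. It is not git-svn's code; the function names are hypothetical, and the comparison step is an assumption about what happens after the quoted hunk, which is not shown here.

```python
def parse_mergeinfo(prop):
    """Map each merge source path to its revision-range string."""
    info = {}
    for line in prop.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the last colon only. (The Perl one-liner in the patch
        # splits on every colon; this is a deliberate simplification.)
        path, _, ranges = line.rpartition(":")
        info[path] = ranges
    return info


def changed_mergeinfo(old_prop, new_prop):
    """Return {path: (old_ranges, new_ranges)} for entries that differ."""
    old = parse_mergeinfo(old_prop)
    new = parse_mergeinfo(new_prop)
    return {
        path: (old.get(path, ""), ranges)
        for path, ranges in new.items()
        if old.get(path, "") != ranges
    }


# Illustrative property values; the paths and ranges are borrowed from
# warnings earlier in this thread, not from a real property dump.
OLD = "/branches/feature:3316-4798\n/trunk:9852-9853"
NEW = "/branches/feature:3316-4798,4811\n/trunk:9852-9853\n/hotfixes/DDD:7635"

if __name__ == "__main__":
    for path, (before, after) in sorted(changed_mergeinfo(OLD, NEW).items()):
        print(f"{path}: {before!r} -> {after!r}")
```

Only entries whose range string changed between the two revisions are returned, which is the information a mergeinfo-based parent lookup would need.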

I'll have a look at the new changes at some point - I am still keeping the old
clone and the new clone and just fetching from time to time to keep them
in sync. I just tried that, and fetching the same 50 commits took
1.7 GB of memory on the old clone vs 1.0 GB on the new one. Details below.
This is with just the 2 earliest patches - I'll put the 3 new ones in at some point.
So I see some need for retroactively fixing old clones (maybe as part
of garbage collection?), since most people would simply keep using an old clone
through the ages...

Comparing trunk of old and new, I see one difference -  One short
commit message is missing in the *old* (the "Add checkPoFiles etc." part)
and so all the sha1 afterwards differed. Is that an old bug that's fixed
and therefore I should throw away the old clone? 

Date:   Wed Apr 25 18:21:29 2012 +0000
    Add checkPoFiles etc.
        git-svn-id: https://svn.r-project.org/R/trunk@59188 
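
A difference like this can be located mechanically by keying each commit message on the SVN revision in its git-svn-id line and comparing the remaining text between the two clones. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it works on message strings that have already been extracted from each clone, and the sample trailer is reproduced as shown above.

```python
import re

# Matches "git-svn-id: <url>@<revision>" at the start of a line.
SVN_ID_RE = re.compile(r"^[ \t]*git-svn-id:\s+\S+@(\d+)\b", re.M)


def split_message(message):
    """Return (svn_revision, message_text_without_the_git_svn_id_line)."""
    match = SVN_ID_RE.search(message)
    if match is None:
        return None, message.strip()
    return int(match.group(1)), message[:match.start()].strip()


def differing_revisions(old_messages, new_messages):
    """List SVN revisions present in both clones whose message text differs."""
    old = dict(split_message(msg) for msg in old_messages)
    new = dict(split_message(msg) for msg in new_messages)
    return sorted(
        rev for rev in old.keys() & new.keys()
        if rev is not None and old[rev] != new[rev]
    )


# The one commit described above: blank message in the old clone,
# three words in the new one.
OLD_CLONE = ["\n    git-svn-id: https://svn.r-project.org/R/trunk@59188 \n"]
NEW_CLONE = [
    "Add checkPoFiles etc.\n\n"
    "    git-svn-id: https://svn.r-project.org/R/trunk@59188 \n"
]

if __name__ == "__main__":
    print(differing_revisions(OLD_CLONE, NEW_CLONE))
```

Because the commit message is part of the hashed commit object, one differing message is enough to change the sha1 of that commit and of every descendant.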

Here are the details of fetching old and new:

<---
$ /usr/bin/time -v git svn fetch --all
	M	doc/manual/R-admin.texi
r66784 = fc20374f26f8e03bb88c00933982e29138a6f929 (refs/remotes/trunk)
...
	M	configure
r66834 = d8d1876f6aa71b3fe3773cd28a760ff945d30bdf (refs/remotes/R-3-1-branch)
	Command being timed: "git svn fetch --all"
	User time (seconds): 1520.77
	System time (seconds): 156.32
	Percent of CPU this job got: 98%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 28:15.82
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 1738276
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 613
	Minor (reclaiming a frame) page faults: 2039305
	Voluntary context switches: 11243
	Involuntary context switches: 181507
	Swaps: 0
	File system inputs: 658328
	File system outputs: 754688
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

$ cd ../R-2/
[Hin-Tak@localhost R-2]$ /usr/bin/time -v git svn fetch --all
	M	doc/manual/R-admin.texi
r66784 = 6a08d94b456d33d85add914a1b780a972689443a (refs/remotes/trunk)
...
	M	configure
r66834 = 370a6484c2a65be78dfae184b50d8f08685d389c (refs/remotes/R-3-1-branch)
	Command being timed: "git svn fetch --all"
	User time (seconds): 1507.89
	System time (seconds): 134.25
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 27:38.49
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 1026656
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 1110
	Minor (reclaiming a frame) page faults: 1630150
	Voluntary context switches: 10280
	Involuntary context switches: 176444
	Swaps: 0
	File system inputs: 361472
	File system outputs: 477912
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
---->
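
The two runs above differ mainly in peak memory, and the relative saving can be computed from the "Maximum resident set size" lines. This is a small sketch with hypothetical helper names; it assumes only the `/usr/bin/time -v` line format visible in the output above.

```python
import re

RSS_RE = re.compile(r"Maximum resident set size \(kbytes\):\s*(\d+)")


def max_rss_kbytes(time_output):
    """Extract the peak resident set size from `/usr/bin/time -v` output."""
    match = RSS_RE.search(time_output)
    if match is None:
        raise ValueError("no 'Maximum resident set size' line found")
    return int(match.group(1))


def reduction_percent(old_value, new_value):
    """Relative reduction from old_value to new_value, in percent."""
    return 100.0 * (old_value - new_value) / old_value


# The relevant lines copied from the two runs above.
OLD_RUN = "\tMaximum resident set size (kbytes): 1738276\n"
NEW_RUN = "\tMaximum resident set size (kbytes): 1026656\n"

if __name__ == "__main__":
    old_rss = max_rss_kbytes(OLD_RUN)
    new_rss = max_rss_kbytes(NEW_RUN)
    print(f"old clone: {old_rss} kbytes")
    print(f"new clone: {new_rss} kbytes")
    print(f"reduction: {reduction_percent(old_rss, new_rss):.1f}%")
```

For these two runs the peak resident set size drops by roughly 41%, while the elapsed time is nearly unchanged (28:15.82 vs 27:38.49).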

* Re: git-svn performance
@ 2014-10-25  5:23 Hin-Tak Leung
  2014-10-25  5:32 ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Hin-Tak Leung @ 2014-10-25  5:23 UTC (permalink / raw)
  To: normalperson
  Cc: stoklund, fabian.schmied, git, sam, stevenrwalter, waste.manager,
	amyrick



------------------------------
On Sat, Oct 25, 2014 01:02 BST Eric Wong wrote:

>Hin-Tak Leung <htl10@users.sourceforge.net> wrote:
>> Comparing trunk of old and new, I see one difference -  One short
>> commit message is missing in the *old* (the "Add checkPoFiles etc." part)
>> and so all the sha1 afterwards differed. Is that an old bug that's fixed
>> and therefore I should throw away the old clone? 
>
>I don't recall a bug which would cause a revision to be skipped.
>I suppose it's alright now the new clone has that revision.
>Perhaps there was a power outage or improper shutdown?
>
>At least we can be glad current versions see this revision...

The old clone wasn't missing a revision - just a revision 'message', which is blank instead of 3 words, above the git-svn-id. I suppose it is possible that some power problem or the like caused this. I'll check the other branches as well, and possibly clone again to be sure. (The new clone did have one break...)

* Re: git-svn performance
@ 2014-10-25  5:47 Hin-Tak Leung
  2014-10-25  6:01 ` Eric Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Hin-Tak Leung @ 2014-10-25  5:47 UTC (permalink / raw)
  To: normalperson
  Cc: stoklund, fabian.schmied, git, sam, stevenrwalter, waste.manager,
	amyrick



------------------------------
On Sat, Oct 25, 2014 06:32 BST Eric Wong wrote:

>Hin-Tak Leung <htl10@users.sourceforge.net> wrote:
>> The old clone wasn't missing a revision - just a revision 'message',
>> which is blank instead of 3 words, above the git-svn-id. I suppose it
>> is possible that some power problem or the like caused this. I'll check
>> the other branches as well, and possibly clone again to be sure. (The
>> new clone did have one break...)
>
>Oh, there's a possibility the commit message in SVN was edited/added
>after-the-fact, but that depends on the SVN admin (most never allow
>or do it).

That's a possibility - the old clone was created by fetching every few days. It is possible that the author committed a blank message and edited it only after I had fetched.

BTW, git svn seems to disallow single-word commit messages (or is it an svn config?). I found that I could not do git svn dcommit when I had merely done git commit -m 'typos', for example, on an svn repo I have write access to. (I don't have many such repos, so it is difficult to tell whether it is a repo config or a git svn strangeness.) In the past I just rebased and changed the message to 'typo correction' or something similar before running dcommit again.


Thread overview: 15+ messages
2014-10-17 20:47 git-svn performance Fabian Schmied
2014-10-19  0:32 ` Eric Wong
2014-10-19  2:29   ` Eric Wong
2014-10-19  2:33     ` Eric Wong
2014-10-19 14:56       ` Jakob Stoklund Olesen
2014-10-20  1:16         ` Eric Wong
2014-10-20 13:46           ` Jakob Stoklund Olesen
2014-10-21  9:00             ` Eric Wong
2014-10-19  9:38   ` Fabian Schmied
  -- strict thread matches above, loose matches on Subject: below --
2014-10-22 17:38 Hin-Tak Leung
2014-10-25  0:02 ` Eric Wong
2014-10-25  5:23 Hin-Tak Leung
2014-10-25  5:32 ` Eric Wong
2014-10-25  5:47 Hin-Tak Leung
2014-10-25  6:01 ` Eric Wong
