git.vger.kernel.org archive mirror
* Excruciatingly slow git-svn imports
@ 2008-04-24 18:54 Geert Bosch
  2008-04-24 19:57 ` Steven Grimm
  2008-04-29  7:03 ` Eric Wong
From: Geert Bosch @ 2008-04-24 18:54 UTC
  To: git@vger.kernel.org List

I'm trying to import a 9.7G, 130K-revision svn repository,
but it seems to import only about 6K revisions per day on fast hardware
using a recent git (1.5.5).

At that rate the import will take about 20 days, or more if things slow
down as the repo gets bigger.
Are there any tips/tricks for converting large repos most efficiently?
I'm using the svn+ssh protocol to access the repository, but the slowness
seems to be due to local inefficiency. An strace -fcp <pid> run for one
minute gives the following results:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  52.46   21.392640       17607      1215           clone
  47.47   19.358882        3983      4860      3645 execve
   0.05    0.019571          16      1216           wait4
   0.01    0.003944           0     14582      1215 open
   0.01    0.002458           0     14580     12150 access
   0.00    0.000797           0      8500           write
   0.00    0.000694           0     26013           read
   0.00    0.000574           0      3693           munmap
   0.00    0.000513           0     20659           close
   0.00    0.000452           0     21918           mmap
   0.00    0.000353           0      1215           stat
   0.00    0.000234           0     12158      1215 lseek
   0.00    0.000155           0     17013           fstat
   0.00    0.000077           0      6075           mprotect
   0.00    0.000076           0      8511           rt_sigaction
   0.00    0.000074           0      6078      6078 ioctl
   0.00    0.000049           0      2432           unlink
   0.00    0.000033           0      2430           dup2
   0.00    0.000033           0      7293           fcntl
   0.00    0.000022           0      3681           brk
   0.00    0.000022           0      1215           getppid
   0.00    0.000019           0      1215           uname
   0.00    0.000019           0      1215           arch_prctl
   0.00    0.000000           0      1215           lstat
   0.00    0.000000           0      1216           pipe
   0.00    0.000000           0        22           mremap
   0.00    0.000000           0      2431           dup
   0.00    0.000000           0      1215           getcwd
   0.00    0.000000           0      2430           getdents64
------ ----------- ----------- --------- --------- ----------------
100.00   40.781691                196296     24303 total

So, 99.93% of the time seems to be in clone/execve
(including actual work done by the forked programs)
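
For readers who want to redo the arithmetic, here is a minimal Python sketch
that parses a few rows of the summary above and adds up the clone and execve
shares. The helper name parse_summary is hypothetical, and the per-second
figure assumes the trace covered exactly 60 seconds, as the "one minute"
above suggests.

```python
# A few rows copied verbatim from the strace -c summary above.
STRACE_SUMMARY = """\
  52.46   21.392640       17607      1215           clone
  47.47   19.358882        3983      4860      3645 execve
   0.05    0.019571          16      1216           wait4
   0.01    0.003944           0     14582      1215 open
"""


def parse_summary(text):
    """Return {syscall: (percent, seconds, calls, errors)} from strace -c rows.

    Hypothetical helper for illustration only.
    """
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        percent, seconds, _usecs, calls = fields[:4]
        # The errors column is blank when no call failed, so it is optional.
        errors = int(fields[4]) if len(fields) == 6 else 0
        rows[fields[-1]] = (float(percent), float(seconds), int(calls), errors)
    return rows


rows = parse_summary(STRACE_SUMMARY)
spawn_share = rows["clone"][0] + rows["execve"][0]
print(f"clone + execve share of traced time: {spawn_share:.2f}%")

# Assumption: the trace ran for 60 seconds.
print(f"process spawns per second: {rows['clone'][2] / 60:.2f}")
```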

In another trace, I found the following execve calls were made:
      22 execve("/homes/bosch/x86_64-linux/bin/git",
       2 execve("/homes/bosch/x86_64-linux/bin/git-commit-tree",
    2842 execve("/homes/bosch/x86_64-linux/bin/git-hash-object",
      22 execve("/opt/gnu/bin/git",
       2 execve("/opt/gnu/bin/git-commit-tree",
    2842 execve("/opt/gnu/bin/git-hash-object",
      22 execve("/opt/local/bin/git",
       2 execve("/opt/local/bin/git-commit-tree",
    2842 execve("/opt/local/bin/git-hash-object",
      22 execve("/opt/local/sbin/git",
       2 execve("/opt/local/sbin/git-commit-tree",
    2842 execve("/opt/local/sbin/git-hash-object",

I don't have git installed in any of /opt/gnu/bin, /opt/local/bin,
or /opt/local/sbin.
These three directories just happen to come before the one containing
git in my path:

bosch:~/git$ echo $PATH
/opt/gnu/bin:/opt/local/bin:/opt/local/sbin:/homes/bosch/x86_64-linux/bin ...
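
The failed execve count in the strace summary is consistent with this PATH
order. The Python sketch below models it, assuming an execvp-style search
that tries each directory in turn and stops at the first hit. The function
exec_attempts is hypothetical, and only the four leading PATH entries shown
above are modeled.

```python
def exec_attempts(path_dirs, install_dir):
    """Count execve attempts made by an execvp-style PATH search.

    Returns (attempts, failures). Assumes the search tries each directory
    in order and stops at the first one that contains the program.
    """
    attempts = path_dirs.index(install_dir) + 1
    return attempts, attempts - 1


# Directories from the $PATH shown above, up to the one containing git.
PATH_DIRS = [
    "/opt/gnu/bin",
    "/opt/local/bin",
    "/opt/local/sbin",
    "/homes/bosch/x86_64-linux/bin",
]

attempts, failures = exec_attempts(PATH_DIRS, "/homes/bosch/x86_64-linux/bin")
spawns = 1215  # clone count from the strace summary
print(attempts * spawns, failures * spawns)  # prints "4860 3645"
```

Those two numbers match the calls and errors columns of the execve row:
four attempts per spawned process, three of which fail.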

Before brushing up my Perl and proposing patches for this
(I doubt the extra execve calls take much time at all), I was wondering why
we don't open a single stream to git-fast-import and have it do
the heavy lifting. Are there fundamental issues with this?
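
To make the single-stream idea concrete, here is a minimal Python sketch
that serializes a couple of made-up revisions into the git fast-import
input format. It is not how git-svn works. The function build_stream, the
revision dictionaries, and the committer strings are all hypothetical, and
it ignores everything svn-specific (properties, branches, deletions).

```python
def data_block(payload: bytes) -> bytes:
    """Encode a fast-import 'data' command with an exact byte count."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def build_stream(revisions, ref=b"refs/heads/master"):
    """Serialize revisions into one fast-import stream (illustrative only).

    Each revision is a dict with 'who', 'when', 'log', and 'files' keys;
    this layout is an assumption made for the example.
    """
    out = []
    mark = 0
    for rev in revisions:
        file_marks = []
        for path, content in rev["files"].items():
            mark += 1
            out.append(b"blob\nmark :%d\n" % mark + data_block(content))
            file_marks.append((path, mark))
        mark += 1
        out.append(b"commit %s\nmark :%d\n" % (ref, mark))
        out.append(b"committer %s %d +0000\n" % (rev["who"], rev["when"]))
        out.append(data_block(rev["log"]))
        # Attach each blob written above to the commit by its mark.
        for path, blob_mark in file_marks:
            out.append(b"M 100644 :%d %s\n" % (blob_mark, path))
        out.append(b"\n")
    return b"".join(out)


# Made-up sample data, not taken from any real repository.
REVS = [
    {"who": b"A U Thor <author@example.com>", "when": 1209063240,
     "log": b"r1: add README\n", "files": {b"README": b"hello\n"}},
    {"who": b"A U Thor <author@example.com>", "when": 1209066840,
     "log": b"r2: update README\n", "files": {b"README": b"hello again\n"}},
]

stream = build_stream(REVS)
print(stream.decode())
```

Inside an initialized repository, the whole stream could then be fed to one
process, for example with
subprocess.run(["git", "fast-import"], input=stream, check=True),
instead of spawning git-hash-object once per file.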

   -Geert

Thread overview: 9+ messages
2008-04-24 18:54 Excruciatingly slow git-svn imports Geert Bosch
2008-04-24 19:57 ` Steven Grimm
2008-04-29  7:11   ` Eric Wong
2008-05-05  4:29     ` Geert Bosch
2008-05-06  3:28       ` Eric Wong
2008-05-06  3:56         ` Avery Pennarun
2008-05-06  4:25           ` Eric Wong
2008-05-06 11:23             ` Geert Bosch
2008-04-29  7:03 ` Eric Wong
