git.vger.kernel.org archive mirror
* oprofile on svn import
@ 2006-06-14  1:10 Jon Smirl
  2006-06-14  2:01 ` Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jon Smirl @ 2006-06-14  1:10 UTC (permalink / raw)
  To: git

I'm going back to cvsimport tomorrow. My svn import that had been
running for five days got killed this morning when the city decided to
move the telephone pole that provides my electricity.

Some oprofile data; this doesn't make a lot of sense to me. Why is it
in libcrypto so much?

 12632739 30.6077 /lib/libcrypto.so.0.9.8a
 11762639 28.4995 /home/good/vmlinux
  6310191 15.2889 /lib/libc-2.4.so
  2498812  6.0543 /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
  2079975  5.0395 /usr/local/bin/git-update-index
  1103116  2.6727 /usr/lib/libz.so.1.2.3
   617395  1.4959 /usr/lib/libapr-1.so.0.2.2
   484625  1.1742 /usr/local/bin/git-read-tree

kernel breakdown

2035561  16.4450  copy_page_range
1110813   8.9741  get_page_from_freelist
851064    6.8756  check_poison_obj
759296    6.1342  unmap_vmas
670659    5.4181  release_pages
667657    5.3939  page_remove_rmap
595826    4.8136  page_fault
241962    1.9548  __copy_from_user_ll
185876    1.5017  do_wp_page
176506    1.4260  do_page_fault


I reset the statistics and took another snapshot half an hour later.

  2232310 44.3485 /home/good/vmlinux
   757114 15.0413 /lib/libcrypto.so.0.9.8a
   507282 10.0780 /lib/libc-2.4.so
   203440  4.0417 /usr/lib/libz.so.1.2.3
   179105  3.5582 /usr/lib/libapr-1.so.0.2.2
   169724  3.3718 /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
   114384  2.2724 /usr/local/bin/git-update-index
   102350  2.0334 /usr/lib/libsvn_subr-1.so.0.0.0
    74673  1.4835 /usr/lib/libaprutil-1.so.0.2.2
    69987  1.3904 /usr/lib/libsvn_fs_fs-1.so.0.0.0

Kernel:

543264   21.2518  copy_page_range
243383    9.5208  check_poison_obj
227788    8.9108  unmap_vmas
161806    6.3296  page_remove_rmap
153201    5.9930  release_pages
119092    4.6587  page_fault
100116    3.9164  get_page_from_freelist
45014     1.7609  do_wp_page
42130     1.6481  vm_normal_page
34804     1.3615  poison_obj
28231     1.1044  do_page_fault
27403     1.0720  __handle_mm_fault
24558     0.9607  __copy_to_user_ll
20618     0.8066  flush_tlb_page


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: oprofile on svn import
  2006-06-14  1:10 oprofile on svn import Jon Smirl
@ 2006-06-14  2:01 ` Eric Wong
  2006-06-14  2:39   ` Jon Smirl
  2006-06-14  4:48   ` Ryan Anderson
  2006-06-14  2:32 ` Jon Smirl
  2006-06-14  3:32 ` Martin Langhoff
  2 siblings, 2 replies; 10+ messages in thread
From: Eric Wong @ 2006-06-14  2:01 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git

Jon Smirl <jonsmirl@gmail.com> wrote:
> I'm going back to cvsimport tomorrow. My svn import that had been
> running for five days got killed this morning when the city decided to
> move the telephone pole that provides my electricity.
> 
> Some oprofile data; this doesn't make a lot of sense to me. Why is it
> in libcrypto so much?

The sha1 calculation is done in libcrypto, afaik.
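
To illustrate why SHA-1 shows up for every file written, here is a minimal
sketch of how git names a blob: it hashes a short header ("blob", the content
length, a NUL byte) followed by the content itself. This is Python for
illustration only, not git's C code, and the function name git_blob_id is
made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name git assigns to a blob with this content.

    git hashes the header b"blob <size>\\0" followed by the raw bytes, so
    every file revision stored during an import costs one full SHA-1 pass
    over its content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Same value as: printf 'hello\n' | git hash-object --stdin
    print(git_blob_id(b"hello\n"))
```

An import that writes millions of file revisions repeats this for each one,
so a noticeable share of CPU time in the hashing library is plausible.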

Anybody want to see how my latest patches to git-svn (and using SVN perl
libraries) stacks up against the mozilla repo?  Speedwise, I don't
expect git-svn to be too different than git-svnimport, but it should use
much less memory (I'll probably port the hacks to git-svnimport, too).

I'll see about freeing up one of my machines to test the mozilla repo.
Unfortunately, all of my hardware is a few years old and not extremely
fast.

-- 
Eric Wong


* Re: oprofile on svn import
  2006-06-14  1:10 oprofile on svn import Jon Smirl
  2006-06-14  2:01 ` Eric Wong
@ 2006-06-14  2:32 ` Jon Smirl
  2006-06-14 19:25   ` Jon Smirl
  2006-06-14  3:32 ` Martin Langhoff
  2 siblings, 1 reply; 10+ messages in thread
From: Jon Smirl @ 2006-06-14  2:32 UTC (permalink / raw)
  To: git

From the previous data it is obvious that I had slab debugging
enabled. I usually never notice having it turned on, but in this case it
makes a lot of difference.

New numbers without slab debug. Could forking off the git tasks be
causing all of this vm load?

[root@jonsmirl jonsmirl]# vmstat 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
 2  0      0  13504  91220 563280    0    0   299   232  244   426 23 18 53  6  0
 2  0      0  10900  91344 565128    0    0   169   464  481   737 26 23 48  2  0
 2  0      0  10804  91436 564832    0    0   196   650  478   780 25 24 49  3  0
 4  0      0  13516  91512 561696    0    0   166   612  474   790 26 23 49  2  0
 1  0      0  10928  91632 563548    0    0   124   471  464   789 24 25 48  2  0
 1  0      0  12312  91684 562000    0    0   179   688  472   783 26 23 48  3  0
 1  0      0  13232  91748 560712    0    0    51   198  445   794 25 26 48  1  0

  9951967 44.5102 /home/good/vmlinux
  3192131 14.2768 /lib/libcrypto.so.0.9.8a
  2207857  9.8747 /lib/libc-2.4.so
  1587518  7.1002 /usr/lib/libz.so.1.2.3
   663114  2.9658 /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
   517463  2.3144 /lib/ld-2.4.so
   435100  1.9460 /usr/lib/libapr-1.so.0.2.2
   430292  1.9245 /usr/local/bin/git-update-index
   285157  1.2754 /usr/local/bin/git-read-tree

2331728  22.8834  copy_page_range
1076769  10.5673  unmap_vmas
667975    6.5555  page_remove_rmap
663844    6.5149  page_fault
654668    6.4249  release_pages
440547    4.3235  get_page_from_freelist
245142    2.4058  do_wp_page
174656    1.7141  vm_normal_page
155185    1.5230  __handle_mm_fault
133584    1.3110  do_page_fault
131456    1.2901  __d_lookup
94194     0.9244  __link_path_walk
92927     0.9120  flush_tlb_page
91775     0.9007  find_get_page
85927     0.8433  copy_process


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: oprofile on svn import
  2006-06-14  2:01 ` Eric Wong
@ 2006-06-14  2:39   ` Jon Smirl
  2006-06-14  3:02     ` Eric Wong
  2006-06-14  4:48   ` Ryan Anderson
  1 sibling, 1 reply; 10+ messages in thread
From: Jon Smirl @ 2006-06-14  2:39 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

On 6/13/06, Eric Wong <normalperson@yhbt.net> wrote:
> Jon Smirl <jonsmirl@gmail.com> wrote:
> > I'm going back to cvsimport tomorrow. My svn import that had been
> > running for five days got killed this morning when the city decided to
> > move the telephone pole that provides my electricity.
> >
> > Some oprofile data; this doesn't make a lot of sense to me. Why is it
> > in libcrypto so much?
>
> The sha1 calculation is done in libcrypto, afaik.

That makes sense, but it's eating up 14% of my CPU in a long sample.

> Anybody want to see how my latest patches to git-svn (and using SVN perl
> libraries) stacks up against the mozilla repo?  Speedwise, I don't
> expect git-svn to be too different than git-svnimport, but it should use
> much less memory (I'll probably port the hacks to git-svnimport, too).

Can svnimport be rewritten to avoid calling fork? If I am reading the
oprofiles correctly, fork is very expensive, especially when the
svnimport task grows to 600MB.
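
One way to check this outside the importer is to time fork() in a process
holding different amounts of touched memory. The sketch below is Python for
illustration (the function name time_fork and the sizes are made up); on
Linux, fork() has to copy the page tables of the parent, which is the work
copy_page_range does, so the time would be expected to grow with the mapped
size. Actual numbers depend on the kernel and machine.

```python
import os
import time


def time_fork(extra_mb: int) -> float:
    """Touch extra_mb MiB of memory, then time one fork() in the parent."""
    ballast = bytearray(extra_mb * 1024 * 1024)
    # Write to one byte per 4 KiB so each page is really mapped, not just
    # reserved (4 KiB is an assumed page size).
    for offset in range(0, len(ballast), 4096):
        ballast[offset] = 1

    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit at once, we only measure the fork itself
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return elapsed


if __name__ == "__main__":
    for size_mb in (1, 64, 256):
        print(f"{size_mb:4d} MiB touched: fork took {time_fork(size_mb):.6f} s")
```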

I have an import running but post your code when it is ready and I can
try it on the next run. They always seem to fail so there will
probably be another run.

> I'll see about freeing up one of my machines to test the mozilla repo.
> Unfortunately, all of my hardware is a few years old and not extremely
> fast.
>
> --
> Eric Wong
>


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: oprofile on svn import
  2006-06-14  2:39   ` Jon Smirl
@ 2006-06-14  3:02     ` Eric Wong
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2006-06-14  3:02 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git, Matthias Urlichs, Linus Torvalds

Linus: I hope I'm right on [1] (the stuff about fork).

Jon Smirl <jonsmirl@gmail.com> wrote:
> On 6/13/06, Eric Wong <normalperson@yhbt.net> wrote:
> >Jon Smirl <jonsmirl@gmail.com> wrote:
> >> I'm going back to cvsimport tomorrow. My svn import that had been
> >> running for five days got killed this morning when the city decided to
> >> move the telephone pole that provides my electricity.
> >>
> >> Some oprofile data; this doesn't make a lot of sense to me. Why is it
> >> in libcrypto so much?
> >
> >The sha1 calculation is done in libcrypto, afaik.
> 
> That makes sense, but it's eating up 14% of my CPU in a long sample.
> 
> >Anybody want to see how my latest patches to git-svn (and using SVN perl
> >libraries) stacks up against the mozilla repo?  Speedwise, I don't
> >expect git-svn to be too different than git-svnimport, but it should use
> >much less memory (I'll probably port the hacks to git-svnimport, too).
> 
> Can svnimport be rewritten to avoid calling fork? If I am reading the
> oprofiles correctly, fork is very expensive, especially when the
> svnimport task grows to 600MB.

I think the problem is the process growing to 600MB, and not the fork :)
git-svn avoids process growth pretty well from my tests with the gcc
repo.

See the fetch_lib() function in this patch on how I avoid process
growth by _using_ fork():

Subject: [PATCH 12/13] git-svn: add support for Perl SVN::* libraries
	(<115022175180-git-send-email-normalperson@yhbt.net>)

Perl processes (at least on my machines: 5.8.x, Linux x86) don't like to
release memory back to the OS when they're done using it (although they
can reuse the memory within the process itself).  This is why SVN::Pool
isn't very effective in many cases.

fork() will only duplicate memory for the pages that are changed by the
child, not the entire process[1].  So I fork children that run temporarily
to avoid accumulating memory usage inside the process.
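
A minimal sketch of that pattern, in Python rather than Perl and not taken
from fetch_lib(): the memory-hungry step runs in a forked child, only a small
result comes back over a pipe, and everything the child allocated goes back
to the OS when it exits. The names run_in_child and heavy_step are
hypothetical.

```python
import os
import pickle


def run_in_child(func, *args):
    """Run func(*args) in a forked child and return its picklable result."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the work, send the result to the parent, then exit
        # without running the parent's cleanup handlers.
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(func(*args), out)
            status = 0
        finally:
            os._exit(status)

    # Parent: read until the child closes its end, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as inp:
        payload = inp.read()
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0 or not payload:
        raise RuntimeError("child process failed")
    return pickle.loads(payload)


def heavy_step(n):
    """Stand-in for a step that needs a lot of temporary memory."""
    scratch = list(range(n))  # large temporary structure
    return sum(scratch)


if __name__ == "__main__":
    # The parent never holds the list built inside heavy_step.
    print(run_in_child(heavy_step, 1_000_000))
```

The parent stays small across many such calls, at the price of one fork per
step.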

This technique should probably be added to git-svnimport as well.

> I have an import running but post your code when it is ready and I can
> try it on the next run. They always seem to fail so there will
> probably be another run.

I've posted two series of patches over the past few days that have yet
to be merged by Junio:

Subject: [PATCH] git-svn: bug fixes (some resends)
	<11500094252972-git-send-email-normalperson@yhbt.net>
Subject: [PATCH 0/13] git-svn: better branch support, SVN:: lib usage, feature additions
	<11502217352245-git-send-email-normalperson@yhbt.net>

-- 
Eric Wong


* Re: oprofile on svn import
  2006-06-14  1:10 oprofile on svn import Jon Smirl
  2006-06-14  2:01 ` Eric Wong
  2006-06-14  2:32 ` Jon Smirl
@ 2006-06-14  3:32 ` Martin Langhoff
  2 siblings, 0 replies; 10+ messages in thread
From: Martin Langhoff @ 2006-06-14  3:32 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git

On 6/14/06, Jon Smirl <jonsmirl@gmail.com> wrote:
> I'm going back to cvsimport tomorrow. My svn import that had been

For best results, make sure you remove the -a from the git-repack
line. Once it's done, run git-repack -a -d manually.

cheers,


martin


* Re: oprofile on svn import
  2006-06-14  2:01 ` Eric Wong
  2006-06-14  2:39   ` Jon Smirl
@ 2006-06-14  4:48   ` Ryan Anderson
  2006-06-14  5:26     ` Jon Smirl
  1 sibling, 1 reply; 10+ messages in thread
From: Ryan Anderson @ 2006-06-14  4:48 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jon Smirl, git

On Tue, Jun 13, 2006 at 07:01:08PM -0700, Eric Wong wrote:
> Anybody want to see how my latest patches to git-svn (and using SVN perl
> libraries) stacks up against the mozilla repo?  Speedwise, I don't
> expect git-svn to be too different than git-svnimport, but it should use
> much less memory (I'll probably port the hacks to git-svnimport, too).

I've got access to a pretty good machine to run this on - where can I
grab the svn repo from?
(I can just grab the CVS one and convert it, first, as well, just point
me at that, if that's got more bandwidth.)

-- 

Ryan Anderson
  sometimes Pug Majere


* Re: oprofile on svn import
  2006-06-14  4:48   ` Ryan Anderson
@ 2006-06-14  5:26     ` Jon Smirl
  0 siblings, 0 replies; 10+ messages in thread
From: Jon Smirl @ 2006-06-14  5:26 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: Eric Wong, git

On 6/14/06, Ryan Anderson <ryan@michonline.com> wrote:
> On Tue, Jun 13, 2006 at 07:01:08PM -0700, Eric Wong wrote:
> > Anybody want to see how my latest patches to git-svn (and using SVN perl
> > libraries) stacks up against the mozilla repo?  Speedwise, I don't
> > expect git-svn to be too different than git-svnimport, but it should use
> > much less memory (I'll probably port the hacks to git-svnimport, too).
>
> I've got access to a pretty good machine to run this on - where can I
> grab the svn repo from?
> (I can just grab the CVS one and convert it, first, as well, just point
> me at that, if that's got more bandwidth.)

rsync -az cvs-mirror.mozilla.org::mozilla ~/mozilla/cvs-mirror
It took about three days for my machine to convert that cvs to svn.

I have the converted repo local but it is 8.2GB and I have 256kb up.

There is no real purpose in converting mozilla cvs to svn to git other
than to test the tools. My last attempt at svn to git ran five days
before I lost power. Towards the end it was getting significantly slower,
implying some kind of n-squared problem in the import process. The
idea was to see if cvsimport and svnimport both end up with the same
output.

I am going to use git-cvsimport on the mozilla repo but that tool
needs 2GB+ of physical RAM to run. I ordered 2GB more and it will be
here tomorrow. I have just been playing with the svn conversion while
I wait five days for my 2nd day air package to show up.


>
> --
>
> Ryan Anderson
>   sometimes Pug Majere
>


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: oprofile on svn import
  2006-06-14  2:32 ` Jon Smirl
@ 2006-06-14 19:25   ` Jon Smirl
  2006-06-14 19:38     ` Jakub Narebski
  0 siblings, 1 reply; 10+ messages in thread
From: Jon Smirl @ 2006-06-14 19:25 UTC (permalink / raw)
  To: git

Stats after 18 hours into git-svnimport. Process is now stuck in the
kernel 64% of the time. All of the kernel time is in page management.
Perl svnimport process is 290MB now.

My top candidates for causing the problem are the fork in the perl
code or the execing of a million tiny git processes.

The key low level git functions could be made into a library to avoid
the need to exec them continuously. The svn functions are libraries
and they hardly show up.

   606218  2.4143 /usr/local/bin/git-update-index
   127170  0.5065 /usr/local/bin/git-write-tree
    81153  0.3232 /usr/local/bin/git-read-tree
    13065  0.0520 /usr/local/bin/git-ls-files
     2624  0.0105 /usr/local/bin/git-hash-object
      754  0.0030 /usr/local/bin/git-commit-tree
      462  0.0018 /usr/local/bin/git-ls-tree
      398  0.0016 /usr/local/bin/git-rev-parse

versus

   102784  0.3641 /usr/lib/libsvn_subr-1.so.0.0.0
    70235  0.2488 /usr/lib/libsvn_fs_fs-1.so.0.0.0
    67081  0.2376 /usr/lib/libsvn_delta-1.so.0.0.0
      848  0.0030 /usr/lib/libsvn_swig_perl-1.so.0.0.0
      512  0.0018 /usr/lib/libsvn_ra_local-1.so.0.0.0
      350  0.0012 /usr/lib/libsvn_fs-1.so.0.0.0
      222 7.9e-04 /usr/lib/libsvn_repos-1.so.0.0.0
      124 4.4e-04 /usr/lib/libsvn_ra-1.so.0.0.0

------------------------------------------------------------------------------------------------------------

  4093890 64.3711 /home/good/vmlinux
   906014 14.2459 /lib/libcrypto.so.0.9.8a
   435744  6.8515 /lib/libc-2.4.so
   158325  2.4895 /usr/lib/libz.so.1.2.3
   139995  2.2012 /usr/local/bin/git-update-index
    75322  1.1843 /nvidia
    64349  1.0118 /usr/bin/oprofiled
    52825  0.8306 /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
    51930  0.8165 /usr/lib/libapr-1.so.0.2.2
    42771  0.6725 /usr/local/bin/git-read-tree
    37774  0.5939 /lib/ld-2.4.so
    34761  0.5466 /usr/local/bin/git-write-tree
    29560  0.4648 /usr/lib/libsvn_subr-1.so.0.0.0
    28210  0.4436 /usr/lib/libaprutil-1.so.0.2.2

-----------------------------------------------------------------------------------------------------------------

2471826  32.8741  copy_page_range
1375260  18.2903  unmap_vmas
574208    7.6367  release_pages
572189    7.6098  page_remove_rmap
233367    3.1037  free_pages_and_swap_cache
191051    2.5409  get_page_from_freelist
169058    2.2484  unlock_page
162027    2.1549  vm_normal_page
155691    2.0706  swap_info_get
136324    1.8130  swap_duplicate
119227    1.5857  page_fault
99729     1.3263  page_waitqueue
49288     0.6555  remove_exclusive_swap_page
39611     0.5268  do_wp_page
39142     0.5206  __wake_up_bit
34384     0.4573  __copy_from_user_ll
31111     0.4138  __handle_mm_fault
29990     0.3989  find_get_page
29682     0.3948  do_page_fault


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: oprofile on svn import
  2006-06-14 19:25   ` Jon Smirl
@ 2006-06-14 19:38     ` Jakub Narebski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Narebski @ 2006-06-14 19:38 UTC (permalink / raw)
  To: git

Jon Smirl wrote:

> Stats after 18 hours into git-svnimport. Process is now stuck in the
> kernel 64% of the time. All of the kernel time is in page management.
> Perl svnimport process is 290MB now.
> 
> My top candidates for causing the problem are the fork in the perl
> code or the execing of a million tiny git processes.
> 
> The key low level git functions could be made into a library to avoid
> the need to exec them continuously. The svn functions are libraries
> and they hardly show up.

There is ongoing effort to translate git functions into builtins.
Still you would need to translate git-svnimport Perl code into C,
or somehow access git library from Perl.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
