From: Shawn Pearce <spearce@spearce.org>
To: Robin Rosenberg <robin.rosenberg.lists@dewire.com>
Cc: git@vger.kernel.org
Subject: Re: jgit performance update
Date: Sun, 3 Dec 2006 17:59:48 -0500
Message-ID: <20061203225947.GD15965@spearce.org>
In-Reply-To: <200612031455.48032.robin.rosenberg.lists@dewire.com>

Robin Rosenberg <robin.rosenberg.lists@dewire.com> wrote:
> So, just go on to the next case. I added filtering on filenames (yes, 
> CVS-induced brain damage, I should track the content. Next version. Filenames 
> are so much handier to work with). That gives me 4.5s to retrieve a filtered 
> history (from 10800 commits). Half of the time is spent in re-sorting tree 
> entries. Is that really necessary?

Yea, I was looking at that code while doing the other performance
improvements and thought it might start to become a bottleneck. I
guess I was right.

What is happening here is jgit wants to store the items in the tree
in name ordering, but Git stores the items in the tree sorted such
that subtrees sort with a '/' on the end of their name.  This is a
different ordering...

The reason I'm re-sorting them is so we can find an entry without
knowing what its type is first.  Looks like that's going to have
to change somehow.
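
To make the difference between the two orderings concrete, here is a
minimal sketch in Java. It is not jgit's actual code: the class
TreeOrder, the Entry type and the comparator names are hypothetical,
and it assumes names are compared as unsigned UTF-8 bytes, with a
subtree treated as if its name ended in '/'.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TreeOrder {
    // Hypothetical tree entry: just a name and a "this is a subtree" flag.
    static final class Entry {
        final String name;
        final boolean isTree;

        Entry(String name, boolean isTree) {
            this.name = name;
            this.isTree = isTree;
        }
    }

    // Byte at position i; past the end, a subtree yields '/' and a file yields 0.
    static int byteAt(byte[] n, int i, boolean isTree) {
        if (i < n.length) {
            return n[i] & 0xff;
        }
        return isTree ? '/' : 0;
    }

    // Plain name ordering, ignoring the entry type.
    static final Comparator<Entry> NAME_ORDER =
            Comparator.comparing((Entry e) -> e.name);

    // Git-style ordering: subtrees sort as if their name had a trailing '/'.
    static final Comparator<Entry> GIT_ORDER = (a, b) -> {
        byte[] x = a.name.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.name.getBytes(StandardCharsets.UTF_8);
        int min = Math.min(x.length, y.length);
        for (int i = 0; i < min; i++) {
            int c = (x[i] & 0xff) - (y[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return byteAt(x, min, a.isTree) - byteAt(y, min, b.isTree);
    };

    // Sort a copy of the entries and return display names ('/' marks subtrees).
    static List<String> sortedNames(List<Entry> in, Comparator<Entry> order) {
        List<Entry> copy = new ArrayList<>(in);
        copy.sort(order);
        List<String> names = new ArrayList<>();
        for (Entry e : copy) {
            names.add(e.isTree ? e.name + "/" : e.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("a.c", false),
                new Entry("a", true),
                new Entry("a-b", false));
        System.out.println(sortedNames(entries, NAME_ORDER)); // [a/, a-b, a.c]
        System.out.println(sortedNames(entries, GIT_ORDER));  // [a-b, a.c, a/]
    }
}
```

The subtree "a" moves from first to last because '/' (0x2f) sorts
after both '-' (0x2d) and '.' (0x2e).  That is also why a lookup by
bare name in the Git ordering needs to know whether it is looking for
a file or a subtree.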
 
> Most of java's slowness comes from the programmers using it. (Lutz Prechelt. 
> Technical opinion: comparing Java vs. C/C++ efficiency differences to 
> interpersonal differences. ACM, Vol 42,#10, 1999)

Yes, that was clearly the case here with jgit!  :-)

_This_ programmer made jgit slow.  Learned from the mistake, and
made it faster.
 
> > One of the biggest annoyances has been the fact that although Java 
> > 1.4 offers a way to mmap a file into the process, the overhead to
> > access that data seems to be far higher than just reading the file
> > content into a very large byte array, especially if we are going
> > to access that file content multiple times.  So jgit performs worse
> > than core Git early on while it copies everything from the OS buffer
> > cache into the Java process, but then performs reasonably well once
> > the internal cache is hot.  On the other hand using the mmap call
> > reduces early latency but hurts the access times so much that we're
> > talking closer to 3s average read times for the same log operation.
> 
> Have you tried that with different JVMs?

No, I'm on Mac OS X so I don't have a huge JVM selection (that I
know of).  And I haven't tried jgit or egit on any other system yet.
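
For anyone who wants to compare the two access strategies on another
JVM, a rough sketch of both follows.  This is not jgit's code: the
class PackAccess and its methods are hypothetical, and it uses the
java.nio.file API from later Java releases rather than what Java 1.4
offered.  It only shows the two ways of getting at the bytes; it
makes no claim about which is faster on a given JVM.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PackAccess {
    // Strategy 1: map the file read-only; pages are faulted in on access.
    static MappedByteBuffer mapFile(Path p) throws IOException {
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Strategy 2: copy the whole file into a byte array up front.
    static byte[] readFully(Path p) throws IOException {
        return Files.readAllBytes(p);
    }

    // Touch every byte through the mapped buffer (absolute gets).
    static long sumMapped(MappedByteBuffer buf) {
        long total = 0;
        for (int i = 0; i < buf.limit(); i++) {
            total += buf.get(i) & 0xff;
        }
        return total;
    }

    // Touch every byte through the plain array.
    static long sumArray(byte[] data) {
        long total = 0;
        for (byte b : data) {
            total += b & 0xff;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path p = Path.of(args[0]);

        long t0 = System.nanoTime();
        long viaMap = sumMapped(mapFile(p));
        long t1 = System.nanoTime();
        long viaArray = sumArray(readFully(p));
        long t2 = System.nanoTime();

        System.out.println("mmap:  sum=" + viaMap + " ns=" + (t1 - t0));
        System.out.println("array: sum=" + viaArray + " ns=" + (t2 - t1));
    }
}
```

A single run like this mostly measures the first, cold pass over the
file; repeating each access loop several times would be closer to the
"cache is hot" case described in the quoted text.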
