From: Joey Hess <joey@kitenet.net>
To: git@vger.kernel.org
Subject: gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat)
Date: Thu, 8 Jan 2009 14:54:46 -0500
Message-ID: <20090108195446.GB18025@gnu.kitenet.net>
In-Reply-To: <gk4bk5$9dq$1@ger.gmane.org>

Giuseppe Bilotta wrote:
> > There is a small overhead in including the microformat on project list
> > and forks list pages, but getting the project descriptions for those pages
> > already incurs a similar overhead, and the ability to get every repo url
> > in one place seems worthwhile.
> 
> I agree with this, although people with very large project lists may
> differ ... do we have timings on these?

AFAICS, when displaying the project list, gitweb reads each project's
description file, falling back to reading its config file if there is no
description file.

If performance were a problem here, the thing to do would be to add
project descriptions to the $project_list file, and use those in
preference to the description files. If a large site has done that,
they've not sent in the patch. :-)
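
A minimal sketch of that idea, written in Python rather than gitweb's
Perl. It assumes a hypothetical third, URL-encoded description field on
each line of the index file (gitweb's existing format has only path and
owner); the function names are illustrative and not part of gitweb.

```python
from pathlib import Path
from urllib.parse import unquote_plus


def parse_project_list(text):
    """Parse index lines of the form 'path owner [description]'.

    The third field is a hypothetical extension, not an existing
    gitweb feature. All fields are URL-encoded, with '+' for spaces.
    """
    projects = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        projects.append({
            "path": unquote_plus(fields[0]),
            "owner": unquote_plus(fields[1]) if len(fields) > 1 else None,
            "descr": unquote_plus(fields[2]) if len(fields) > 2 else None,
        })
    return projects


def project_description(project, projectroot):
    """Prefer the indexed description; touch the disk only as a fallback."""
    if project["descr"] is not None:
        return project["descr"]
    descr_file = Path(projectroot) / project["path"] / "description"
    try:
        return descr_file.read_text().strip()
    except OSError:
        return None  # no description file for this project
```

With every description present in the index, the per-project file read
never happens; projects without one still fall back to the description
file.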

With my patch, it will read each cloneurl file too. The best way to
optimise that for large sites seems to be to add an option that would
ignore the cloneurl files and config file and always use
@git_base_url_list.
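
Such an option could look roughly like the Python sketch below. The
`base_urls_only` flag and the function name are hypothetical, the
config-file lookup is left out for brevity, and the fallback simply
joins each base URL with the project path.

```python
from pathlib import Path


def clone_urls(projectroot, project, base_urls, base_urls_only=False):
    """Return the clone URLs to advertise for one project.

    base_urls_only is a hypothetical knob: when set, the per-project
    cloneurl file is never opened.
    """
    if not base_urls_only:
        cloneurl = Path(projectroot) / project / "cloneurl"
        try:
            lines = cloneurl.read_text().splitlines()
        except OSError:
            lines = []  # no cloneurl file for this project
        urls = [u.strip() for u in lines if u.strip()]
        if urls:
            return urls
    # Fall back to (or go straight to) the site-wide base URLs.
    return [f"{base.rstrip('/')}/{project}" for base in base_urls]
```

With the flag set, building the index costs no extra file reads per
project, at the price of ignoring per-project overrides.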

I checked the only large site I have access to (git.debian.org) and they
use a $project_list file, but I see no other performance tuning. That's
a 2 GHz machine; it takes gitweb 28 (!) seconds to generate the nearly 1
MB index web page for 1671 repositories:

/srv/git.debian.org/http/cgi-bin/gitweb.cgi  3.04s user 9.24s system 43% cpu 28.515 total

Notice that most of the time is spent by child processes. For each
repository, gitweb runs git-for-each-ref to determine the time of the
last commit.

If that is removed (say if there were a way to get the info w/o
forking), performance improves nicely:

./gitweb.cgi > /dev/null  1.29s user 1.08s system 69% cpu 3.389 total

Making it not read description files for each project, as I suggest above,
is the next best optimisation:

./gitweb.cgi > /dev/null  1.08s user 0.05s system 96% cpu 1.170 total
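
As a rough breakdown of the three wall-clock totals above (single runs,
so treat the figures as approximate), the arithmetic can be sketched in
Python:

```python
repos = 1671

# Wall-clock totals from the three runs, in seconds.
baseline = 28.515          # unmodified gitweb
no_for_each_ref = 3.389    # git-for-each-ref call removed
no_descriptions = 1.170    # description reads removed as well

# Time attributable to each step, by subtraction.
for_each_ref_cost = baseline - no_for_each_ref      # about 25.1 s
description_cost = no_for_each_ref - no_descriptions  # about 2.2 s

# Share of the total run and average cost per repository.
for_each_ref_share = for_each_ref_cost / baseline     # about 0.88
per_repo_ms = 1000 * for_each_ref_cost / repos        # about 15 ms

print(f"for-each-ref: {for_each_ref_cost:.3f} s "
      f"({for_each_ref_share:.0%}), {per_repo_ms:.1f} ms per repo")
print(f"descriptions: {description_cost:.3f} s")
```

By this estimate the forks account for roughly 88% of the run, or about
15 ms per repository, and the description reads for about 2.2 seconds.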

So, I think it makes sense to optimise gitweb and offer knobs for performance
tuning at the expense of the flexibility of description and cloneurl files.
But, git-for-each-ref is swamping everything else.

-- 
see shy jo
