From: Petr Baudis <pasky@suse.cz>
To: Jakub Narebski <jnareb@gmail.com>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: repo.or.cz renovated
Date: Mon, 17 Mar 2008 19:10:15 +0100
Message-ID: <20080317181015.GC10335@machine.or.cz>
In-Reply-To: <m3ve3nwtl3.fsf@localhost.localdomain>
On Sat, Mar 15, 2008 at 02:44:42PM -0700, Jakub Narebski wrote:
> Petr Baudis <pasky@suse.cz> writes:
>
> > On repo.or.cz (permanently I/O overloaded and hosting 1050 project +
> > forks),
>
> It looks like repo.or.cz is overwhelmed by its success. I hope that
> now that there are other software hosting sites with git hosting
> (Savannah, GitHub, Gitorious,...) the number of projects wouldn't grow
> as rapidly.
Actually, it was overwhelmed not so much by its success as by a lack of
good maintenance. ;-) I gave it some love again over the past week and
the improvement was, well, overwhelming. :-)
I finally fixed tons of failures and broken repositories and, most
importantly, repacked some of the big repositories whose object
databases were in pretty horrid shape. The effect has been immense:
having everything in a database one third the size, in a single big
pack, drastically reduced the I/O load.
Scenario: Site with about 1100 repositories weighing 13GB, running a
fetch job for about 200 of them hourly. About two git-daemon requests
per minute and 10 gitweb requests per minute (the last two numbers are
taken quite sloppily over a small sample of the last ten minutes ;-).
Site is running on 2x1GHz P3 with 2G RAM, repository is on hw RAID5.
(We are currently preparing to migrate it to a more powerful machine.)
Before, the load on the server would normally be about 6 to 15 _all the
time_, and a bunch of git-related processes would be permanently eating
some CPU and crunching on the disk.
After introducing the index caching and repacking the repositories, the
load seems to be around 1 most of the time and hardly ever climbs above
3; everything feels very snappy.
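
The index caching itself is not described in this message. As a rough
sketch only, assuming it amounts to keeping the computed projects list
around for a fixed lifetime instead of rebuilding it on every request,
the idea could look like this. The names TimedCache, build and lifetime
are hypothetical and are not taken from gitweb or from the patch.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TimedCache(Generic[T]):
    """Keep one expensive result and rebuild it only after `lifetime` seconds."""

    def __init__(
        self,
        build: Callable[[], T],
        lifetime: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._build = build        # hypothetical: scans all repositories
        self._lifetime = lifetime  # seconds a cached result stays valid
        self._clock = clock        # injectable so the expiry can be tested
        self._value: Optional[T] = None
        self._built_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        expired = self._built_at is None or now - self._built_at >= self._lifetime
        if expired:
            # Only this branch pays the full I/O cost of the scan.
            self._value = self._build()
            self._built_at = now
        return self._value
```

With a lifetime of a few minutes, most requests for the index would be
served from memory and only an occasional one would touch the disk.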
So for anyone running a hosting site, make sure your repositories are
nicely packed. It makes a huge difference to the I/O load!
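
For illustration, here is a minimal sketch of how a site could find
badly packed repositories and repack them. It is not the script used on
repo.or.cz; the helper names, the thresholds and the assumption that all
repositories are bare and live under one root directory are mine. The
only git invocation is `git repack -a -d`, which packs everything
referenced into a single pack and removes the packs made redundant.

```python
import subprocess
from pathlib import Path
from typing import Iterator, List

HEX = set("0123456789abcdef")


def find_bare_repos(root: Path) -> Iterator[Path]:
    """Yield directories under `root` that look like bare repositories."""
    for head in sorted(root.rglob("HEAD")):
        repo = head.parent
        if (repo / "objects").is_dir() and (repo / "refs").is_dir():
            yield repo


def count_packs(repo: Path) -> int:
    """Number of pack files in the object database."""
    return len(list((repo / "objects" / "pack").glob("*.pack")))


def count_loose_objects(repo: Path) -> int:
    """Number of loose objects, stored in two-hex-digit fan-out directories."""
    total = 0
    for fanout in (repo / "objects").iterdir():
        if fanout.is_dir() and len(fanout.name) == 2 and set(fanout.name) <= HEX:
            total += sum(1 for f in fanout.iterdir() if f.is_file())
    return total


def repack_command(repo: Path) -> List[str]:
    """Command line that consolidates the repository into one pack."""
    return ["git", f"--git-dir={repo}", "repack", "-a", "-d"]


def repack_fragmented(
    root: Path, max_packs: int = 1, max_loose: int = 0, dry_run: bool = True
) -> List[Path]:
    """Repack (or, with dry_run, just list) repositories over the thresholds."""
    selected = []
    for repo in find_bare_repos(root):
        fragmented = (
            count_packs(repo) > max_packs or count_loose_objects(repo) > max_loose
        )
        if fragmented:
            selected.append(repo)
            if not dry_run:
                subprocess.run(repack_command(repo), check=True)
    return selected
```

Running it first with dry_run=True shows which repositories would be
touched before anything gets rewritten.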
> Another solution would be to divide projects list page into pages,
> perhaps adding search box for searching for a project (by name, by
> description and by owner).
>
> Nevertheless even with pagination, if we want to have "sort by last
> update" we do need caching.
Yes, I'm pondering pagination, but because of web clients, not the
server load; it already takes Firefox on my notebook a noticeable time
to render this list, and the page is rather big too. Ideas are welcome
here.
My current plan is to have a [Search project] box on the front page,
together with a direct link to 'show all'. Other than that, what makes
sense to display on the front page? I think recently added projects (age
< 1 week) for sure. I'm not so sure about recently changed projects -
maybe it is better to keep the front page cruft-free.
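
To make the front page idea concrete, a small sketch of the three
pieces discussed here: search by name, description and owner, the
"age < 1 week" filter, and plain pagination for the full list. The
Project record and all function names are assumptions made up for this
example; gitweb stores and renders its project list differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Sequence


@dataclass(frozen=True)
class Project:
    # Hypothetical record; field names are not gitweb's.
    name: str
    description: str
    owner: str
    created: datetime


def search(projects: Sequence[Project], query: str) -> List[Project]:
    """Case-insensitive substring match on name, description and owner."""
    needle = query.casefold()
    return [
        p
        for p in projects
        if needle in p.name.casefold()
        or needle in p.description.casefold()
        or needle in p.owner.casefold()
    ]


def recently_added(
    projects: Sequence[Project],
    now: datetime,
    max_age: timedelta = timedelta(weeks=1),
) -> List[Project]:
    """Projects younger than `max_age`, newest first."""
    recent = [p for p in projects if now - p.created < max_age]
    return sorted(recent, key=lambda p: p.created, reverse=True)


def page(items: Sequence[Project], number: int, per_page: int = 100) -> List[Project]:
    """Return page `number` (1-based) of `items`."""
    if number < 1 or per_page < 1:
        raise ValueError("number and per_page must be positive")
    start = (number - 1) * per_page
    return list(items[start:start + per_page])
```

The front page would then render only recently_added(...), while
'show all' walks through page(...) so that the browser never has to lay
out the whole list at once.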
--
Petr "Pasky" Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it. -- J. W. von Goethe
Thread overview: 30+ messages
2008-03-13 23:14 [PATCH] gitweb: Support caching projects list Petr Baudis
2008-03-14 0:07 ` Jay Soffian
2008-03-14 0:22 ` Petr Baudis
2008-03-14 0:27 ` Jay Soffian
2008-03-14 0:30 ` J.H.
2008-03-14 12:17 ` Jakub Narebski
2008-03-14 0:36 ` J.H.
2008-03-17 17:49 ` repo.or.cz renovation Petr Baudis
2008-03-17 18:11 ` Petr Baudis
2008-03-17 18:44 ` J.H.
2008-03-17 20:41 ` Jakub Narebski
2008-03-17 21:09 ` Jakub Narebski
2008-03-14 15:29 ` [PATCH] gitweb: Support caching projects list Jakub Narebski
2008-03-14 21:11 ` Jay Soffian
2008-03-14 0:19 ` Junio C Hamano
2008-03-14 8:35 ` Frank Lichtenheld
2008-03-14 12:14 ` Jakub Narebski
2008-03-17 17:40 ` Petr Baudis
2008-03-15 21:44 ` Jakub Narebski
2008-03-16 0:56 ` Miklos Vajna
2008-03-16 11:41 ` Frank Lichtenheld
2008-03-16 16:52 ` J.H.
2008-03-16 18:37 ` Jakub Narebski
2008-03-16 22:37 ` J.H.
2008-03-16 23:39 ` Jakub Narebski
2008-03-17 18:10 ` Petr Baudis [this message]
2008-03-17 19:09 ` repo.or.cz renovated Junio C Hamano
2008-03-17 19:25 ` Petr Baudis
2008-03-17 19:34 ` Theodore Tso
2008-03-17 19:54 ` Petr Baudis