From: Petr Baudis <pasky@suse.cz>
To: Theodore Tso <tytso@MIT.EDU>
Cc: Jakub Narebski <jnareb@gmail.com>,
Junio C Hamano <junkio@cox.net>,
git@vger.kernel.org
Subject: Re: repo.or.cz renovated
Date: Mon, 17 Mar 2008 20:54:22 +0100
Message-ID: <20080317195422.GF10335@machine.or.cz>
In-Reply-To: <20080317193423.GI8368@mit.edu>
On Mon, Mar 17, 2008 at 03:34:23PM -0400, Theodore Tso wrote:
> On Mon, Mar 17, 2008 at 07:10:15PM +0100, Petr Baudis wrote:
> > Actually, it was overwhelmed not so much by its success but by lack of
> > good maintenance. ;-) I gave it some love again for the past week and
> > the improvement was, well, overwhelming. :-)
> >
> > I finally fixed tons of failures and broken repositories, and most
> > importantly repacked some of the big repositories with object databases
> > in pretty horrid shape. The effect has been immense: having everything
> > in a database 1/3 the size and in a single big pack drastically reduced
> > the I/O load.
>
> Are you making sure that repositories which are forks off of some
> parent repository are using objects/info/alternates to share objects?
> (If so you have to be careful when you prune not to drop objects, but
> it can make a huge difference in disk utilization and I/O bandwidth).
Yes, I reuse objects from parent projects, that has always been so.
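
For readers unfamiliar with the mechanism: a fork borrows objects by listing the parent's object directory, one path per line, in its own objects/info/alternates file. Below is a minimal sketch of setting that up. The helper name `add_alternate` and the example paths are made up for illustration, and this is not necessarily how repo.or.cz does it.

```python
import os
import tempfile


def add_alternate(fork_git_dir, parent_git_dir):
    """Record the parent's object directory in the fork's alternates file.

    Hypothetical helper: Git reads objects/info/alternates and treats each
    listed directory as an additional, read-only object store.
    """
    parent_objects = os.path.abspath(os.path.join(parent_git_dir, "objects"))
    info_dir = os.path.join(fork_git_dir, "objects", "info")
    os.makedirs(info_dir, exist_ok=True)
    alternates = os.path.join(info_dir, "alternates")

    # Keep any entries already present and avoid adding a duplicate.
    existing = []
    if os.path.exists(alternates):
        with open(alternates) as f:
            existing = [line.strip() for line in f if line.strip()]
    if parent_objects not in existing:
        existing.append(parent_objects)
        with open(alternates, "w") as f:
            f.write("\n".join(existing) + "\n")
    return alternates


if __name__ == "__main__":
    # Demonstration on throwaway directories rather than real repositories.
    root = tempfile.mkdtemp()
    parent = os.path.join(root, "parent.git")
    fork = os.path.join(root, "fork.git")
    os.makedirs(os.path.join(parent, "objects"))
    print(open(add_alternate(fork, parent)).read(), end="")
```

Once the file exists, the fork only needs to store objects the parent lacks, which is where the disk and I/O savings come from. It is also why the parent must never drop an object that a fork still relies on.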
> At least for master.kernel.org, and for those git repositories which I
> own, I make a point of periodically logging in and running git gc,
> copying over the object packs so I can do a prune operation safely,
> etc. --- and I suspect most of the master.kernel.org git users do
> something similar. On repo.or.cz we don't have shell access, so the
> project administrators can't do that for you.
>
> > So for anyone running a hosting site, make sure your repositories are
> > nicely packed. It makes a huge difference to the I/O load!
>
> It seems that a Really Good Idea would be to do the packing and
> pruning via cron scripts that run during the off hours...
Yes, this was done before too; however, repo.or.cz has been around for
a long time and historically the scripts weren't working very well,
especially since I had to be careful about the forks problem.
Since I am repacking on a live system, I think the current repacking
strategy is still not completely error-proof; however, I believe I have
encountered no breakage because of pruned objects in the at least half a
year or so that it has been running with the current setup (all of the
breakages I have encountered seem to be caused by a child process of
git-repack dying). Besides, if some fork breaks, it should be possible
to fix that very easily (I do not back up the object stores at all
anyway - if the server burns down, you will have to re-push ;-).
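
To make the forks problem concrete, here is a rough sketch of one conservative policy a nightly job could follow. It is an assumption for illustration, not the scripts repo.or.cz actually runs. Repositories that lend objects to forks are repacked with `git repack -A -d -l`, which turns unreachable objects into loose objects instead of dropping them, and are never pruned. Repositories nobody borrows from get `git repack -a -d -l` followed by `git prune`. The function name `plan_maintenance` and the paths are hypothetical, and the function only builds the command lists without running anything.

```python
def plan_maintenance(repos, alternates_of):
    """Return the git commands a maintenance run would execute.

    repos: list of repository (git dir) paths.
    alternates_of: dict mapping a fork's path to the list of repositories
        whose object stores it borrows from.
    Both arguments are assumptions of this sketch; a real job would have to
    discover them, e.g. by reading each objects/info/alternates file.
    """
    # Every repository that some fork borrows objects from.
    lenders = {parent for parents in alternates_of.values() for parent in parents}

    plan = []
    for repo in repos:
        git = ["git", "--git-dir=" + repo]
        if repo in lenders:
            # -A keeps unreachable objects as loose objects, so nothing a
            # fork may still need disappears; no prune for lenders.
            plan.append(git + ["repack", "-A", "-d", "-l"])
        else:
            # Nobody borrows from this repository, so pruning is safe here.
            plan.append(git + ["repack", "-a", "-d", "-l"])
            plan.append(git + ["prune"])
    return plan


if __name__ == "__main__":
    for command in plan_maintenance(
        ["/srv/git/parent.git", "/srv/git/fork.git"],
        {"/srv/git/fork.git": ["/srv/git/parent.git"]},
    ):
        print(" ".join(command))
```

The price of this policy is that unreachable loose objects accumulate in the lending repositories. Cleaning those up safely needs an extra step, such as first checking every fork's refs, which this sketch leaves out.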
> > My current plan is to have a [Search project] box at the front page,
> > together with direct link to 'show all'. Other than that, what makes
> > sense to display on the front page? I think recently added projects (age
> > < 1 week) for sure. I'm not so sure about recently changed projects -
> > maybe it is better to keep the front page cruft-free.
>
> There are plenty of ways which sites like freshmeat and sourceforge
> have come up to make it easy to browse a large number of software
> projects. One way that might make sense is Sourceforge's Software Map
> (i.e., http://sourceforge.net/softwaremap/).
This all feels like real overkill; besides, my main doubt is whether
repo.or.cz needs something like this *at all* - but I think I will try
the tagging system and see how people like it.
--
Petr "Pasky" Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it. -- J. W. von Goethe