From: Charles McGarvey <chazmcgarvey@brokenzipper.com>
To: "Constantine A. Murenin" <mureninc@gmail.com>
Cc: Fredrik Gustafsson <iveqy@iveqy.com>, git@vger.kernel.org
Subject: Re: is there a fast web-interface to git for huge repos?
Date: Fri, 07 Jun 2013 14:13:53 -0600
Message-ID: <51B23F01.5020608@brokenzipper.com>
In-Reply-To: <CAPKkNb4myh9MPNSgLqs5Mku-z1EOsHyWrgK2Qy_3_UOivXvcnw@mail.gmail.com>

On 06/07/2013 01:02 PM, Constantine A. Murenin wrote:
>> That's a one-time penalty. Why would that be a problem? And why is wget
>> even mentioned? Did we misunderstand each other?
> 
> `wget` or `curl --head` would be used to trigger the caching.
> 
> I don't understand how it's a one-time penalty.  No one wants to look
> at an old copy of the repository, so, pretty much, if, say, I want to
> have a gitweb of all 4 BSDs, updated daily, then, pretty much, even
> with lots of ram (e.g. to eliminate the cold-case 5s penalty, and
> reduce each page to 0.5s), on a quad-core box, I'd kinda be lucky
> to complete a generation of all the pages within 12h or so, obviously
> using the machine at, or above, 50% capacity just for the caching.  Or
> several days or even a couple of weeks on an Intel Atom or VIA Nano
> with 2GB of RAM or so.  Obviously not acceptable, there has to be a
> better solution.
> 
> One could, I guess, only regenerate the pages which have changed, but
> it still sounds like an ugly solution, where you'd have to be
> generating a list of files that have changed between one gen and the
> next, and you'd still have very high CPU, cache and storage
> requirements.

Have you already ruled out caching on a proxy?  Pages would only be generated
on demand, so the first visitor would still experience the delay but the rest
would be fast until the page expires.  Even expiring pages every five minutes
or less would probably provide significant processing savings
(depending on how many users you have), and that level of staleness and the
occasional delays may be acceptable to your users.
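
To make the on-demand idea concrete, here is a minimal sketch in Python of a
time-based page cache.  It is not how gitweb or any particular proxy works;
the class name TTLPageCache, the render callback and the 300-second default
are all assumptions chosen for illustration.  The first request for a URL pays
the rendering cost, later requests are served from memory until the entry
expires.

```python
import time
from typing import Callable, Dict, Tuple


class TTLPageCache:
    """Hypothetical on-demand cache: render on a miss, reuse until expiry."""

    def __init__(
        self,
        render: Callable[[str], str],
        ttl_seconds: float = 300.0,  # assumed five-minute expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._render = render
        self._ttl = ttl_seconds
        self._clock = clock
        # url -> (expiry timestamp, rendered page body)
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            # Still fresh: serve the stored copy without rendering.
            self.hits += 1
            return entry[1]
        # Missing or expired: this visitor pays the rendering delay.
        self.misses += 1
        body = self._render(url)
        self._entries[url] = (now + self._ttl, body)
        return body
```

In a real deployment this role would be played by a caching reverse proxy in
front of the web interface rather than by application code; the sketch only
shows why pages nobody requests never cost any CPU time.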

As you say, generating the entire cache upfront and continuously is wasteful
and probably unrealistic, but any type of caching, by definition, is going to
involve users seeing stale content, and I don't see that you have any other
option but some type of caching.  Well, you could reproduce what git does in a
bunch of distributed algorithms and run your app on a farm--which, I guess, is
probably what GitHub is doing--but throwing up a caching reverse proxy is a
lot quicker if you can accept the caveats.

-- 
Charles McGarvey

Thread overview: 8+ messages
2013-06-07  1:35 is there a fast web-interface to git for huge repos? Constantine A. Murenin
2013-06-07  6:33 ` Fredrik Gustafsson
2013-06-07 17:05   ` Constantine A. Murenin
2013-06-07 17:57     ` Fredrik Gustafsson
2013-06-07 19:02       ` Constantine A. Murenin
2013-06-07 20:13         ` Charles McGarvey [this message]
2013-06-07 20:21           ` Constantine A. Murenin
2013-06-14 10:55             ` Holger Hellmuth (IKS)