From: Jakub Narebski <jnareb@gmail.com>
To: Lea Wiemann <lewiemann@gmail.com>
Cc: git@vger.kernel.org, John Hawley <warthog19@eaglescrag.net>,
	Junio C Hamano <gitster@pobox.com>, Petr Baudis <pasky@suse.cz>,
	Lars Hjemli <hjemli@gmail.com>
Subject: Re: Gitweb caching: Google Summer of Code project
Date: Fri, 30 May 2008 01:27:07 +0200
Message-ID: <200805300127.10454.jnareb@gmail.com>
In-Reply-To: <483DA594.5040803@gmail.com>

On Wed, 28 May 2008, Lea Wiemann wrote:
> Jakub Narebski wrote:
> >
> > 1. Caching data
> >  * disadvantages:
> >    - more CPU
> >    - need to serialize and deserialize (parse) data
> >    - more complicated
> 
> CPU: John told me that so far CPU has *never* been an issue on k.org. 
> Unless someone tells me they've had CPU problems, I'll assume that CPU 
> is a non-issue until I actually run into it (and then I can optimize the 
> particular pieces where CPU is actually an issue).

True.

What you do have to take care of (although I don't think it would be
particularly difficult) is not repeating bad I/O patterns with the
cache...

> Serialization: I was planning to use Storable (memcached's Perl API uses 
> it transparently I think).  I'm hoping that this'll just solve it.

While Storable is, I think, part of any modern Perl installation, there
might be a problem with the memcached API and with memcached API
wrappers such as the CHI one.  Namely, you cannot assume that the
memcached API is installed, so you have to provide some kind of fallback.
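
A minimal sketch of such a fallback, written in Python purely for
illustration (gitweb itself is Perl).  The names LocalCache and
make_cache are hypothetical, and pymemcache is only an assumed example
of a client library that may or may not be installed.  A real setup
would also have to configure serialization on the memcached client so
that both backends store the same kind of values.

```python
import importlib
import pickle
import time


class LocalCache:
    """In-process fallback used when no memcached client library is importable."""

    def __init__(self):
        self._store = {}  # key -> (expiry timestamp or None, pickled value)

    def set(self, key, value, expire=0):
        # expire=0 means "never expires", mirroring memcached's convention.
        expires_at = time.time() + expire if expire else None
        # pickle plays the role that Storable would play in Perl.
        self._store[key] = (expires_at, pickle.dumps(value))

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, blob = entry
        if expires_at is not None and time.time() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return pickle.loads(blob)


def make_cache(client_module="pymemcache.client.base", server=("localhost", 11211)):
    """Return a memcached client if its library is importable, else LocalCache.

    The module name and server address are assumptions for this sketch.
    """
    try:
        module = importlib.import_module(client_module)
    except ImportError:
        return LocalCache()
    return module.Client(server)
```

The point of the design is that the rest of the code only ever calls
get() and set(), so it does not need to know which backend it got.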
 
> It's true that it's more complicated.  It'll require quite a bit of 
> refactoring, and maybe I'll just back off if I find that it's too hard.

What's more, if you want to support If-Modified-Since and
If-None-Match, you would have to implement them yourself, while
for static pages (caching HTML output) the web server would do this
for us "for free".
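
To give an idea of what "implementing it yourself" involves, here is a
rough sketch in Python (for illustration only; gitweb is Perl).  The
function name is_not_modified and its parameters are hypothetical; it
only decides whether a "304 Not Modified" reply is appropriate and
leaves generating the ETag and Last-Modified values to the caller.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _strip_weak(tag):
    # Weak comparison: a leading W/ marker is ignored.
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, last_modified):
    """Return True if the request can be answered with 304 Not Modified.

    request_headers: dict mapping header names to values
    etag:            current entity tag including quotes, e.g. '"abc"'
    last_modified:   timezone-aware datetime of the last change
    """
    headers = {name.lower(): value for name, value in request_headers.items()}

    # When If-None-Match is present it takes precedence and
    # If-Modified-Since is ignored.
    if_none_match = headers.get("if-none-match")
    if if_none_match is not None:
        if if_none_match.strip() == "*":
            return True
        sent = {_strip_weak(t.strip()) for t in if_none_match.split(",")}
        return _strip_weak(etag) in sent

    if_modified_since = headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: behave as if the header was absent
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution.
        return last_modified.replace(microsecond=0) <= since

    return False
```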

> > I'm afraid that implementing kernel.org caching in mainline in
> > a generic way would be enough work for a whole GSoC 2008.
> 
> I probably won't reimplement the current caching mechanism.  Do you 
> think that a solution using memcached is generic enough?  I'll still 
> need to add some abstraction layer in the code, but when I'm finished 
> the user will either get the normal uncached gitweb, or activate 
> memcached caching with some configuration setting.

That's good enough, although I think that the current caching mechanism
in kernel.org's gitweb (your implementation follows more closely what
repo.or.cz's gitweb does) has some good ideas, for example adaptive
(load-dependent) expiry time.
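
As an illustration of the adaptive expiry idea only (this is not the
formula kernel.org's gitweb uses, which is not given here), a sketch in
Python could scale the cache lifetime linearly with system load.  The
function name and all default values below are made-up assumptions.

```python
def adaptive_expiry(load, min_ttl=20, max_ttl=1200, max_load=10.0):
    """Map system load to a cache lifetime in seconds.

    Low load gives a short lifetime (fresher pages); high load gives a
    long lifetime (fewer regenerations).  The result is clamped to the
    range [min_ttl, max_ttl].  All defaults are hypothetical.
    """
    fraction = min(max(load / max_load, 0.0), 1.0)
    return min_ttl + fraction * (max_ttl - min_ttl)
```

On a Unix system the load argument could come from os.getloadavg()[0],
the one-minute load average.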

By the way, what do you think about adding (as an option) information
about gitweb performance to the output, in the form of a
  "Site generated in 0.01 seconds, 2 calls to git commands"
or
  "Site generated in 0.0023 seconds, cached output, 1m31s old"
line somewhere in the page footer?
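
A sketch of how such a footer line could be produced, in Python for
illustration (gitweb is Perl).  The function name format_footer and its
parameters are hypothetical; the caller is assumed to measure the
elapsed time and to count git invocations itself.

```python
def format_footer(elapsed, git_calls=0, cache_age=None):
    """Build the performance line for the page footer.

    elapsed:   seconds spent generating (or fetching) the page
    git_calls: number of git commands run for this request
    cache_age: age of the cached output in seconds, or None if not cached
    """
    if cache_age is not None:
        minutes, seconds = divmod(int(cache_age), 60)
        return (f"Site generated in {elapsed:.4f} seconds, "
                f"cached output, {minutes}m{seconds:02d}s old")
    return (f"Site generated in {elapsed:.2f} seconds, "
            f"{git_calls} calls to git commands")
```

For example, format_footer(0.01, git_calls=2) and
format_footer(0.0023, cache_age=91) give the two lines quoted above.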

I hope you have some idea of the gitweb access statistics from
kernel.org, repo.or.cz, and perhaps other large git hosting sites (e.g.
freedesktop.org), and that you plan on benchmarking gitweb caching
using the average / amortized time to generate a page, ApacheBench or
equivalent, the load average on the server depending on the number of
requests, the I/O load (using the fio tool, for example) depending on
the number of requests, etc.

> By the way, I'll be posting about gitweb on this mailing list 
> occasionally.  If any of you would like to receive CC's on such 
> messages, please let me know, otherwise I'll assume you get them through 
> the mailing list.

I read the git mailing list via the Usenet / news interface (NNTP
gateway) from GMane.

-- 
Jakub Narebski
Poland
