All of lore.kernel.org
 help / color / mirror / Atom feed
From: "J.H." <warthog9@kernel.org>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 0/6] Gitweb caching changes v2
Date: Fri, 11 Dec 2009 10:26:27 -0800	[thread overview]
Message-ID: <4B228ED3.3030901@kernel.org> (raw)
In-Reply-To: <200912111901.35781.jnareb@gmail.com>

Jakub Narebski wrote:
> On Fri, 11 Dec 2009, J.H. (John 'Warthog9' Hawley) wrote:
>> Jakub Narebski wrote:
>>> "John 'Warthog9' Hawley" <warthog9@kernel.org> writes:
> 
>>>> John 'Warthog9' Hawley (6):
>>>>   GITWEB - Load Checking
>>>>   GITWEB - Missmatching git w/ gitweb
>>>>   GITWEB - Add git:// link to summary pages
>>>>   GITWEB - Makefile changes
>>>>   GITWEB - File based caching layer
>>> This patch didn't made it to git mailing list.  I suspect that you ran
>>> afoul vger anti-SPAM filter.
>>>
>>> Does this "File based caching layer" have anything common with GSoC
>>> 2008 project, available at git://repo.or.cz/git/gitweb-caching.git ?
>> Yeah, it does seem that way (like I said eaten by a grue), it 
>> *currently* has nothing to do with Lea's GSoC code but it is still my 
>> intention, long term, to integrate the two.
>>
>> The patch, in all it's glory can be viewed at: 
>> http://git.kernel.org/?p=git/warthog9/gitweb.git;a=commitdiff;h=42641b1e3bfae14d5cc2e0150355e89cb87951db
>>
>> It is anything but a small patch to gitweb, the patch is 117K and 
>> comprises 3539 lines (including git header commit information).  There's 
>> not any real good way to break it up as it's a bit of an all or nothing 
>> patch.
> 
> First, why do you reinvent the wheel instead of using one of existing
> caching interfaces like CHI or Cache::Cache (perhaps creating a custom
> backend or middle layer which incorporates required features, like being
> load-aware)?

Well for starters this isn't exactly a reinvention of the wheel, and 
this isn't something "new" per-se.  This code has been actively running 
on git.kernel.org for something like 3 - 4 years so there's something to 
be said for the devil we know and understand.  As well using the other 
caching strategies involves adding dramatically more complex 
interactions with caching layer.  The caching layer is actually quite 
specific to how git + gitweb works and solves more than just "caching" 
on the surface.  Specifically it solves the stampeding herd problem 
which would have to be solved either way even if I didn't implement my 
own caching, and since I had to do that caching was barely a step beyond 
that to implement.

>  This way changing from file-based cache to e.g. mmap based
> one or to memcached would be very simple.

True but these are *VERY* different caching strategies than the one I've 
got here, yes it's using files as a backend but it's doing so with 
specific goals in mind.  As I've said I plan to integrate Lea's 
memcached based caching into this in the future and that has different 
advantages and disadvantages.

At the end of the day the "normal" caching engines aren't as efficient 
as mine and there is the case the very high performance sites are going 
to have to investigate a number of different solutions to see what works 
best for them.  Mine is also *dramatically* simpler to setup as well, 
turn it on, point it at a directory and your done.

>  And you would avoid pitfals
> in doing your own cache management.  perl-Cache-Cache should be available
> package in extras repositories.

There's pitfalls if I do it myself, or I use one of the other "common" 
perl modules.  I did it this way years ago, I've maintained it and it 
works pretty well.  I won't admit that it's the smartest caching engine 
on the planet, far from it, but it has evolved specifically for gitweb 
and that itself saves me a lot of pitfalls from cache engine + gitweb 
integration.

> If module is no available this would simply mean no caching, like in many
> (or not so many) other cases with optional features in gitweb.

Yes, but as can be seen from how you enable various other caching 
engines the setup of those is non-trivial, this is and either way 
caching *HAS* to be explicitly turned on by the admin/user since they 
are going to have to do *some* configuration, or at least be aware that 
their webapp is going to chew up some sort of resource.

> Second, if you can't use CGI::Cache directly, you can always steal the
> idea from it, then the change to gitweb itself would be minimal:
> 
>   "Internally, the CGI::Cache module ties the output file descriptor
>   (usually STDOUT) to an internal variable to which all output is saved."

I thought about that 3 years ago, and decided it wasn't a good option 
for gitweb.  Why?  There's too many assumptions throughout the code that 
when you do a print it will go immediately out.  Things like error 
messages and such.  Breaking out the prints into prints (which will do 
what is expected) and passing around the output in the $output variables 
makes it a lot simpler easier to differentiate about how / what your 
looking at and a *LOT* easier to debug.

> P.S. I'll postpone critique of the patch itself for now.  The above issues
> are much more important.

That's fine.  The issues your raising aren't new though, and stem back 
to before I created gitweb-caching, got rehashed with Lea's patches and 
not surprisingly are back on the table now.  Like I said above, there is 
no one caching strategy that's perfect in all cases here and that's 
again why I eventually plan to merge Lea's changes (which uses 
memcached) in as well, I'm just trying to get code that I'm getting 
considerable demand for, that's proven, upstream.

- John 'Warthog9' Hawley

  reply	other threads:[~2009-12-11 18:26 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-10 23:45 [PATCH 0/6] Gitweb caching changes v2 John 'Warthog9' Hawley
2009-12-10 23:45 ` [PATCH 1/6] GITWEB - Load Checking John 'Warthog9' Hawley
2009-12-10 23:45   ` [PATCH 2/6] GITWEB - Missmatching git w/ gitweb John 'Warthog9' Hawley
2009-12-10 23:45     ` [PATCH 3/6] GITWEB - Add git:// link to summary pages John 'Warthog9' Hawley
2009-12-10 23:45       ` [PATCH 4/6] GITWEB - Makefile changes John 'Warthog9' Hawley
     [not found]         ` <1260488743-25855-6-git-send-email-warthog9@kernel.org>
2009-12-10 23:45           ` [PATCH 6/6] GITWEB - Separate defaults from main file John 'Warthog9' Hawley
2009-12-11 15:46             ` Jakub Narebski
2009-12-11 15:58               ` J.H.
2009-12-11 22:53                 ` Jakub Narebski
2009-12-16  1:22                   ` Junio C Hamano
2009-12-16  2:00                     ` J.H.
2009-12-16 19:52                       ` Jakub Narebski
2009-12-16 20:04                         ` J.H.
2009-12-16  2:22                     ` Jakub Narebski
2009-12-11 14:28         ` [PATCH 4/6] GITWEB - Makefile changes Jakub Narebski
2009-12-11 16:22           ` J.H.
2009-12-11 16:41             ` Jakub Narebski
2009-12-19 13:32               ` [PATCH/RFCv2 4/6] gitweb: Makefile improvements Jakub Narebski
2009-12-11 12:52       ` [PATCH 3/6] GITWEB - Add git:// link to summary pages Johannes Schindelin
2009-12-11 13:44       ` Jakub Narebski
2009-12-18 21:02         ` [PATCHv2 3/6] gitweb: Optionally add "git" links in project list page Jakub Narebski
2009-12-11 10:52     ` [PATCH 2/6] GITWEB - Missmatching git w/ gitweb Jakub Narebski
2009-12-18 19:18       ` [RFC/PATCHv2 2/6] gitweb: Add option to force version match Jakub Narebski
2009-12-11 12:49     ` [PATCH 2/6] GITWEB - Missmatching git w/ gitweb Johannes Schindelin
2009-12-10 23:54   ` [PATCH 1/6] GITWEB - Load Checking Sverre Rabbelier
2009-12-11  0:52   ` Jakub Narebski
2009-12-11  1:10     ` Junio C Hamano
2009-12-11  2:19     ` J.H.
2009-12-11  2:50       ` Junio C Hamano
2009-12-11  2:58         ` J.H.
2009-12-11  3:07           ` J.H.
2009-12-11  3:09           ` Junio C Hamano
2009-12-11 10:09       ` Jakub Narebski
2009-12-18 16:36         ` [PATCHv2 1/6] gitweb: Load checking Jakub Narebski
2009-12-11 13:53   ` [PATCH 1/6] GITWEB - Load Checking Mihamina Rakotomandimby
2009-12-10 23:53 ` [PATCH 0/6] Gitweb caching changes v2 Sverre Rabbelier
2009-12-11 15:51 ` Jakub Narebski
     [not found]   ` <4B226D56.7000004@kernel.org>
2009-12-11 18:01     ` Jakub Narebski
2009-12-11 18:26       ` J.H. [this message]
2009-12-12  1:37         ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B228ED3.3030901@kernel.org \
    --to=warthog9@kernel.org \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.