* gitweb: Simple file based output caching - current state of series
@ 2010-04-11 21:17 Jakub Narebski
From: Jakub Narebski @ 2010-04-11 21:17 UTC (permalink / raw)
To: git
Cc: Petr Baudis, John 'Warthog9' Hawley
I am still working on adding support for output (response) caching to
gitweb, based originally on patch series by John 'Warthog9' Hawley aka J.H.
The idea was to start small, and add features piece by piece for easier
review. As you can see the idea snowballed a bit, with 16 patches in
the series now (4 of those are more or less independent of caching support).
The current version of my work on simple file-based output caching in
gitweb can be found at:
git://repo.or.cz/git/jnareb-git.git gitweb/cache-kernel-pu
http://repo.or.cz/w/git/jnareb-git.git/log/refs/heads/gitweb/cache-kernel-pu
Note that 'gitweb/cache-kernel-pu' will be undergoing rewriting!
Previous versions of my take on the series can be found as branches
'gitweb/cache-kernel', 'gitweb/cache-kernel-v2', 'gitweb/cache-kernel-v3'.
J.H.'s series can be found in git://git.kernel.org/pub/scm/git/warthog9/gitweb.git
as branches 'gitweb-ml-v2', 'gitweb-ml-v3', 'gitweb-ml-v4'.
Shortlog:
~~~~~~~~~
1. Export more test-related variables when running external tests
This patch is actually independent of caching. I needed
TEST_DIRECTORY to find gitweb/cache.pm for later tests of
caching (which are written in Perl).
2. gitweb: href(..., -path_info => 0|1)
This is needed to have a unique key for a gitweb page, independent
of whether we used a query string or a path_info URL to arrive at it.
In other words it is used to implement canonical URL for a gitweb
page, which would not depend on gitweb configuration.
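To illustrate the idea of a canonical key, here is a minimal sketch in
Python (not the Perl code from the patch). The parameter names and the
canonical_key helper are hypothetical; the only point is that the key is
built from the parsed request parameters, so the URL style used to reach
the page does not matter.

```python
from urllib.parse import urlencode


def canonical_key(params):
    """Build a cache key from parsed request parameters.

    `params` is a hypothetical dict of parameters (project, action, ...).
    How they were parsed (query string or path_info) must not change the key.
    """
    # Drop unset values and sort, so equivalent requests give the same key.
    items = sorted((k, v) for k, v in params.items() if v not in (None, ""))
    return urlencode(items)


# The same page reached through a query-string URL and a path_info URL
# (both already parsed into parameters by the application).
from_query = {"p": "git.git", "a": "log"}
from_path_info = {"a": "log", "p": "git.git", "h": None}
assert canonical_key(from_query) == canonical_key(from_path_info)
```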
3. gitweb: Add custom error handler using die_error (WIP)
This patch was created for gitweb to provide similar output for its
own errors (thrown using die_error() subroutine), errors from the
caching layer (thrown using plain "die"), and errors from Perl
itself and its modules (like mkdir from File::Path, or tempfile from
File::Temp). The idea was to use die_error as error handler for
the CGI layer via set_message from CGI::Carp.
Unfortunately I did not realize until later that error message
handler set using set_message from CGI::Carp is responsible only
for the HTTP response body part: you can't set HTTP headers in
CGI::Carp error handler (fatalsToBrowser case).
Therefore this patch, if it is to stay, should be fixed so that
die_error does not produce HTTP headers when it is invoked as an error
handler rather than called explicitly.
Note also that due to the fact that die_error uses exit, errors
handled by die_error as error handler would not be logged. Fixing
*this* would require additional patch, removing 'exit' from gitweb
and replacing it by non-local jump (non-local goto)... which I guess
would also improve gitweb performance in persistent environments
such as mod_perl, mod_perlite (WIP), or (if gitweb would in the
future support it) FastCGI.
4. gitweb/cache.pm - Very simple file based cache
5. gitweb/cache.pm - Stat-based cache expiration
Those two patches didn't change much from previous versions.
The idea is to provide a file-based cache package that is compatible
with the Cache::Cache and CHI caching interfaces, so that one would be
able to use CHI or Cache::Cache instead of it.
I have this time actually tested that Cache::FileCache seems to
work as gitweb caching engine.
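A minimal sketch of what a file-based cache with a get/set interface
and stat-based expiration can look like, written in Python for
illustration only (it is not gitweb/cache.pm). The class name, the
default expires_in value and the use of SHA-1 for file names are
assumptions made for this sketch.

```python
import hashlib
import os
import tempfile
import time


class SimpleFileCache:
    """Hypothetical get/set cache storing one file per key."""

    def __init__(self, root, expires_in=20):
        self.root = root
        self.expires_in = expires_in  # seconds; negative means "never expire"
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        # Hash the key so any string maps to a safe file name.
        return os.path.join(self.root, hashlib.sha1(key.encode()).hexdigest())

    def set(self, key, data):
        # Write to a temporary file, then rename, so readers never see
        # a partially written entry.
        fd, tmp = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "w") as fh:
            fh.write(data)
        os.replace(tmp, self._path(key))

    def is_valid(self, key):
        # Stat-based expiration: compare the file's mtime with the clock.
        try:
            mtime = os.stat(self._path(key)).st_mtime
        except FileNotFoundError:
            return False
        if self.expires_in < 0:
            return True
        return time.time() - mtime < self.expires_in

    def get(self, key):
        if not self.is_valid(key):
            return None
        with open(self._path(key)) as fh:
            return fh.read()
```

Because expiration is decided from the file's modification time, no
separate metadata has to be stored next to the cached data.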
6. gitweb/cache.pm - Simple select(FH) based output capture
Compared to previous versions, this time capturing is separated
into two packages. One implements the capturing interface; the second
is a select(FH)-based driver that actually does the "capturing".
The idea is to be able to replace simplistic select(FH)-based
"capturing" with capturing using tied filehandle, or capturing
using push_layer method from non-standard PerlIO::Util module.
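The split between a capturing interface and a replaceable driver can be
sketched as follows. This is a Python illustration, not the Perl code:
contextlib.redirect_stdout stands in as a rough analogue of switching
the default output handle with select(FH), and the class names are
hypothetical.

```python
import contextlib
import io


class RedirectDriver:
    """Driver: temporarily swaps the default output stream."""

    def capture(self, code):
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            code()
        return buf.getvalue()


class Capture:
    """Interface: callers use this and do not care which driver is used."""

    def __init__(self, driver=None):
        self.driver = driver or RedirectDriver()

    def capture(self, code):
        return self.driver.capture(code)


def render_page():
    # Stand-in for code that prints a page to the default output.
    print("<html>hello</html>")


html = Capture().capture(render_page)
assert html == "<html>hello</html>\n"
```

Swapping in another driver only requires an object with a compatible
capture method; the callers of Capture stay unchanged. Note that this
Python driver only sees output written through sys.stdout, not output
written directly to the underlying file descriptor.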
7. gitweb/cache.pm - Cache captured output (using get/set)
8. gitweb: Add optional output caching (from gitweb/cache.pm)
Compared to previous version of gitweb caching series, adding
caching captured output (as yet another package in gitweb/cache.pm)
is separated from adding output caching to gitweb itself.
At this point in the series our simple file-based cache has enough
features to be useful for caching gitweb output.
9. gitweb/cache.pm - Adaptive cache expiration time
Not changed much from previous version. The idea is to lengthen
cache expiration time when there is heavy load on server. More
configurable than in the original patch by J.H. Compare Cache::Adaptive.
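One possible shape of such a rule, as a Python sketch. The linear
interpolation and all threshold values below are assumptions chosen for
illustration; they are not taken from the patch or from J.H.'s
original.

```python
def adaptive_expires(load, min_expires=20, max_expires=1200,
                     low_load=2.0, high_load=10.0):
    """Return a cache lifetime in seconds that grows with server load.

    All defaults are hypothetical. Below `low_load` the shortest lifetime
    is used, above `high_load` the longest, with linear interpolation
    in between.
    """
    if load <= low_load:
        return min_expires
    if load >= high_load:
        return max_expires
    frac = (load - low_load) / (high_load - low_load)
    return min_expires + frac * (max_expires - min_expires)
```

On a Unix system the load argument could come from os.getloadavg()[0];
passing it in explicitly keeps the rule easy to test and to configure.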
10. gitweb/cache.pm - Use CHI compatible (compute method) caching
interface
Just updated to newer way of capturing output. It is needed because
locking requires ->compute API, and would not work with only
->get / ->set API.
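The difference between the two interfaces can be sketched in Python
with a hypothetical in-memory cache (not the code from the series).
With only get and set, the caller generates the data between the two
calls, outside the cache's control. With compute, the cache itself runs
the generating code, which gives it a place to take and release a lock.

```python
class DictCache:
    """Hypothetical in-memory cache offering both API styles."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def compute(self, key, producer):
        value = self.get(key)
        if value is None:
            # A locking cache would acquire a per-key lock here and
            # release it after set(); with bare get/set there is no
            # single place that spans the whole miss-generate-store step.
            value = producer()
            self.set(key, value)
        return value


cache = DictCache()
calls = []


def make_page():
    calls.append(1)
    return "<html>page</html>"


assert cache.compute("log", make_page) == "<html>page</html>"
assert cache.compute("log", make_page) == "<html>page</html>"
assert len(calls) == 1  # the second call was served from the cache
```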
11. gitweb/cache.pm - Use locking to avoid 'cache miss stampede' problem
12. gitweb/cache.pm - Serve stale data when waiting for filling cache
13. gitweb/cache.pm - Regenerate (refresh) cache in background
Those three patches didn't change much from previous versions.
Note that compared to original J.H. patches caching engine now
daemonizes the process that does background cache generation; this should
make it work correctly also in persistent environments like mod_perl
(and perhaps in the future also FastCGI).
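A sketch of how locking and serving stale data fit together, in Python
for illustration (Unix only, since it uses fcntl.flock). The fetch
function, the ".lock" and ".tmp" file names, and the is_fresh callback
are hypothetical; background regeneration and daemonizing are not
covered here.

```python
import fcntl
import os


def fetch(path, is_fresh, generate):
    """Return data for `path`, letting only one process regenerate it."""
    if is_fresh(path):
        with open(path) as fh:
            return fh.read()
    with open(path + ".lock", "w") as lock:
        try:
            # Try to become the single writer without blocking.
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Somebody else is already regenerating this entry.
            if os.path.exists(path):
                with open(path) as fh:
                    return fh.read()  # serve stale data instead of waiting
            # No stale copy exists: wait until the writer has finished.
            fcntl.flock(lock, fcntl.LOCK_SH)
            with open(path) as fh:
                return fh.read()
        # We hold the lock: regenerate and replace the entry atomically.
        data = generate()
        tmp = path + ".tmp"
        with open(tmp, "w") as fh:
            fh.write(data)
        os.replace(tmp, path)
        return data
```

The lock is released when the lock file is closed at the end of the
with block. Only the process that obtains the exclusive lock runs
generate; the others either return the old copy or wait, which is what
avoids the stampede on a cache miss.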
14. gitweb: Move generating page title to separate subroutine
This patch is actually independent of caching, although without
"progress info" support for cache in subsequent patches the new
subroutine would be used only in one place, in git_header_html().
Nevertheless, refactoring title generation makes it possible to reduce
the indent level.
15. gitweb: Show appropriate "Generating..." page when regenerating cache
It should actually work now. The problem was that flock can be
interrupted by SIGALRM from timer used to print progress info.
Might need extra tests: not tests of the generic support for the
generating_info subroutine in the caching engine, but tests that a
subroutine similar to git_generating_html in gitweb works correctly.
16. gitweb: Add startup delay to activity indicator for cache
In the previous version of the series it was lumped together with the previous
patch, even though it is a new feature, not present in original
patch series by J.H.
This feature was inspired (semi-requested) by comments from Petr
'Pasky' Baudis during discussion on some previous version of my
series, or J.H. series about gitweb caching.
Diffstat:
~~~~~~~~~
 gitweb/INSTALL                         |    6 +
 gitweb/README                          |   70 +++
 gitweb/cache.pm                        |  779 ++++++++++++++++++++++++++++++++
 gitweb/gitweb.perl                     |  391 ++++++++++++++--
 t/gitweb-lib.sh                        |    2 +
 t/t9500-gitweb-standalone-no-errors.sh |   19 +
 t/t9503-gitweb-caching.sh              |   38 ++
 t/t9503/test_cache_interface.pl        |  339 ++++++++++++++
 t/t9503/test_cache_output.pl           |   76 +++
 t/t9503/test_capture.pl                |   74 +++
 t/test-lib.sh                          |    3 +
 11 files changed, 1756 insertions(+), 41 deletions(-)
create mode 100644 gitweb/cache.pm
create mode 100755 t/t9503-gitweb-caching.sh
create mode 100755 t/t9503/test_cache_interface.pl
create mode 100755 t/t9503/test_cache_output.pl
create mode 100755 t/t9503/test_capture.pl
--
Jakub Narebski