git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Narebski <jnareb@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "John 'Warthog9' Hawley" <warthog9@eaglescrag.net>, git@vger.kernel.org
Subject: Re: [PATCH 0/3] Gitweb caching v7
Date: Sun, 31 Oct 2010 11:21:54 +0200	[thread overview]
Message-ID: <201010311021.55917.jnareb@gmail.com> (raw)
In-Reply-To: <7v1v773s7e.fsf@alter.siamese.dyndns.org>

On Sun, 31 Oct 2010, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>>> I am getting this in the gitweb.log:
>>> 
>>>     [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine 
>>>     &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124.
>>> 
>>> which seems to cause t9500 to fail.
>>
>> This is caused by three issues (bugs) in v7 caching code.
>>
>> First is the reason why t9500 exhibits this bug.  The gitweb caching
>> v7 includes file with subroutines related to caching in the following
>> way:
>>
>>   do 'cache.pm';
>>
>> Note that the relative path is used; for t9500 it is relative from
>> somewhere witjin 't/', and not from 'gitweb/', so "do 'cache.pm';"
>> doesn't find it.
> 
> John, where should cache.pm go in the installed system?  Does it typically
> go next to gitweb.perl?
> 
> I think "do 'that-file'" honors path specified by the -I option, so I do
> not think "do $cache_pm" is necessary.  My preference is to run gitweb
> tests with appropriate -I pointing at the cache.pm in the directory.

>From `perldoc -f do`

      do 'stat.pl';

   is just like

      eval 'cat stat.pl';

   except that it's more efficient and concise, keeps track of the current
   filename for error messages, searches the @INC libraries, and updates %INC
   if the file is found.        ^^^^^^^^^^^^^^^^^^^^^^^^^^^

So I think it respects '-I<directory>' given to perl interpreter.

On the other hand I don't know how it would work with mod_perl (uwing
ModPerl::Registry handler), whether it wouldn't too require extra
configuration.

So I think a better solution would be to base 'Gitweb caching v7' plus
necessary fixes and improvements on top of 

  gitweb: Prepare for splitting gitweb

This means that the directory that gitweb is in would be added to @INC
via

  use lib __DIR__.'/lib';

The 'cache.pm' or 'cache.pl' file would be moved to 'gitweb/lib', but
that is cosmetic change.

> 
>> ...  It should be
>>
>>   do $cache_pm;
>>   die $@ if $@;
>>
>> at least.
> 
> Catching failure is a good thing to do.

Right.  And this is only one-line addition.
 
>> Perhaps even better would be to simply turn off caching
>> support if there is an error in 'cache.pm'
> 
> That can come later.
> 
> Jakub, can we have an absolute minimum fix-up, so that we can give
> this wider exposure?  I think there are only
> four issues:
> 
>  (1) exclude Ajax-y stuff from caching;

Easy, but only if check whether to do capturing and caching is moved
out of cache_fetch to caller, i.e. to gitweb script.  See also comments
below about what need and should be done.

Another solution is to turn off Ajax-y stuff when caching is enabled.
It can be done quite easily, just sprinkle some !$caching_enabled
(see comment below about naming and semantic of this config variable)
in appropriate place.

>  (2) install cache.pm the same way gitweb.perl is installed via
>      the Makefile;

With "gitweb: Prepare for splitting gitweb" as introductory patch this
would be just adding

  # gitweb output caching
  GITWEB_MODULES += cache.pm

(or cache.pl - it is not a proper Perl package!) e.g. above GITWEB_REPLACE
sed script in gitweb/Makefile.

>  (3) running tests with appropriate -I so that cache.pm is found; and

No change to test necessary if we use "use lib __DIR__.'/lib';" in
gitweb.perl

I'd like to port from my rewrite change to t/t9500-gitweb-standalone-no-errors.sh
adding minimal test to check if running gitweb with caching enabled
doesn't generate any warnings, and (modified) change to the
t/t9502-gitweb-standalone-parse-output.sh, adding minimal test that
gitweb produces the same output with and without caching.

>  (4) die if 'cache.pm' cannot be "done".

O.K. (4) is one liner.

There is are also other issues

   (5) naming and semantic of gitweb config variables configuring caching;
       at least change $cache_enabled enum to $caching_enabled boolean
   (6) do not change anything in gitweb behavior if caching is disabled;
       move 'if ($caching_enabled)' test to gitweb.perl, and remove code
       from cache_fetch

About (5): names and semantics of configuration variables are gitweb API,
so we should at least try to keep backward compatibility with old names
(see for example 'backward compatibility: legacy gitweb config support'
section in %known_snapshot_format_aliases).  

There is no much problem with $minCacheTime, $maxCacheTime, $backgroundCache
etc. (besides bleh, camelCase ;-) but I would prefer to have *boolean*
$caching_enabled (true-ish value means cache is enabled, false-y value
including undef means caching is disabled) to currently two-value *"enum"*
$cache_enable called "binary" (sic!) in docs and comments.  See also
issue (6).


About (6): I would prefer to move check if caching should be enabled
higher, to gitweb.perl (to its own configure_caching() subroutine, rather
than sprinkling it in top scope), and not even include 'cache.pm'
if caching is disabled (default).  But because we use 'do' and not
'require', we would have to check %INC to not include it twice...

Anyway, whether we include 'cache.pm' conditionally or not, I'd prefer
to use

  if ($caching_enabled) {
  	cache_fetch($action);
  } else {
  	$actions{$action}->();
  }

in dispatch() subroutine, and simply remove all code dealing with
'$cache_enable == 0' case from 'cache.pm'.

Remember, config variables are forever :-)

> 
> I think the change in gitweb-cache v7 is small and safe enough that we
> should fast-track it to give usability to the real world sites.  It may be
> a low-risk "obviously correct" approach that is quick-and-dirty, but that
> is exactly why this should be fast-tracked.  It does not touch the logic
> or formatting in any way, just bypasses the page generation altogether
> when it can clearly do so when it can tell the output cannot possibly be
> incorrect (albeit sometimes it might be stale if in certain cases, e.g. it
> is relative to HEAD).

My rewrite also does the same.  I have changed it to do the same STDOUT
redirection that J.H. gitweb caching v7 does (I used select($fh) based
capture) - work in progress, not yet pushed to my repository:

  git://repo.or.cz/git/jnareb-git.git
  http://repo.or.cz/w/git/jnareb-git.git (gitweb)

> 
> I know you and others were aiming to split things up, but I think the
> amount of the effort that is needed for that line of work on top of the
> current codebase is not much different from what is needed on top of
> gitweb-cache v7.

Well, I'd have to add support for configuration variables used in this
patch series, but with exception of $cache_enable -> $caching_enabled
it wouldn't be much extra work.

-- 
Jakub Narebski
Poland

  reply	other threads:[~2010-10-31  9:22 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-28  0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley
2010-10-28  0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley
2010-10-28  9:52   ` Jakub Narebski
2010-10-28 22:08   ` Junio C Hamano
2010-10-28  0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley
2010-10-28  9:56   ` Jakub Narebski
2010-10-28  0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley
2010-10-28 16:10   ` Jakub Narebski
2010-10-28 22:29 ` [PATCH 0/3] Gitweb caching v7 Junio C Hamano
2010-10-29 22:25 ` Junio C Hamano
2010-10-30  8:58   ` Jakub Narebski
2010-10-31  4:24     ` Junio C Hamano
2010-10-31  9:21       ` Jakub Narebski [this message]
2010-11-01 10:24         ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski
2010-11-12 23:35           ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski
2010-11-12 23:41             ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
2010-11-17 23:10               ` Junio C Hamano
2010-11-18 13:37                 ` Jakub Narebski
2010-12-02 10:17               ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski
2010-12-02 17:37                 ` Junio C Hamano
2010-12-02 19:01                   ` Jakub Narebski
2010-12-02 19:21                     ` Junio C Hamano
2010-12-02 19:36                       ` Jakub Narebski
2010-11-12 23:44             ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski
2010-11-12 23:56             ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
2010-11-28 11:22               ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski
2010-11-28 11:31                 ` Jakub Narebski
2010-11-29 22:13                   ` Junio C Hamano
2010-11-29 22:20                     ` Junio C Hamano
2010-11-29 23:09                       ` [PATCHv7.1 3/4 (amend v2)] " Jakub Narebski
2010-11-30  0:51                         ` Junio C Hamano
2010-11-30 10:21                           ` Jakub Narebski
2010-11-29 23:07               ` [PATCHv7.1 3/4] " demerphq
2010-11-29 23:26                 ` demerphq
2010-11-29 23:54                   ` Jakub Narebski
2010-11-13  0:01             ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski
2010-11-17 22:37               ` Junio C Hamano
2010-11-17 23:12                 ` Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
2010-11-01 18:50           ` [PATCHv7.1b " Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201010311021.55917.jnareb@gmail.com \
    --to=jnareb@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=warthog9@eaglescrag.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).