From: Jakub Narebski <jnareb@gmail.com>
To: Lea Wiemann <lewiemann@gmail.com>
Cc: git@vger.kernel.org, John Hawley <warthog19@eaglescrag.net>,
Petr Baudis <pasky@suse.cz>
Subject: Re: [PATCH 3/3] gitweb: use new Git::Repo API, and add optional caching
Date: Mon, 14 Jul 2008 23:23:21 +0200 [thread overview]
Message-ID: <200807142323.22761.jnareb@gmail.com> (raw)
In-Reply-To: <1215738708-5212-1-git-send-email-LeWiemann@gmail.com>
On Fri, 11 July 2008, Lea Wiemann wrote:
> Gitweb now uses the Git::Repo API; this change is behavior-preserving,
> except for slightly more aggressive error handling; see below.
Good.
It was suggested to split this into separate commit from the following
change, for making it easier to test (you can check that behavior is
the same, with the exception of error handling) and review (smaller
patch to read and review).
> This patch also adds an optional caching layer for caching repository
> data in memory and (for larger cacheable items, like blobs, snapshots,
> or diffs) on disk.
As it was said, if feasible it would be good idea to put this change
into separate commit.
> Other minor changes:
>
> - Gitweb would previously accept invalid input and either (a) display
> nothing, (b) display an obscure error message, or (c) proceed as
> normal since the parameter happens to be unused in the particular
> code path used. This has changed in that gitweb will check for
> parameter correctness more aggressively, and display meaningful
> error messages. This change is only relevant if you manually edit
> gitweb's CGI parameters, since gitweb only generates valid links.
I understand that this change deals with treating invalid specifiers,
which point to either object that do not exists, are ambiguous, or point
to object of invalid type. Gitweb does check "syntactic" validity of
input (of CGI parameters) already, even those that are not used for
selected action.
BTW. such check was not feasible before implementing --batch and/or
--batch-check options to git-cat-file; I think that possibly one more
fork is not much price to pay for better error checking.
> - Empty projects:
>
> - Only display summary link for empty projects in project list to
> avoid broken links (yielding 404).
>
> - Slim down summary page for empty projects to avoid some broken
> links and unnecessary vertical space.
>
> - Sort empty projects at the bottom of the project list when sorting
> by last change.
>
> - Add test for empty projects to t9503 (the Mechanize test), now
> that there no broken links anymore.
Good. The only thing that *might* be controversial is putting empty
projects at the bottom of sorted by age (by last change) projects list,
instead of at top.
> - For HTML pages, remove the "Expires" HTTP response header, and add
> "Cache-Control: no-cache" instead. This is because pages can
> contain dynamic content (like the subject of the latest commit), so
> the Expires headers would be wrong.
>
> This makes gitweb's responsiveness slightly worse, but it will get
> much better once If-Last-Modified is implemented. It's better to be
> correct than to be convenient here, since having to press the reload
> button makes for lousy user experience (IOW, users should be able to
> always trust gitweb's output).
>
> Raw diffs and blobs still get the Expires header, where appropriate.
I don't think it is a good change.
Gitweb generates two types of views (pages): transient and immutable.
An example of transient view (transient page/action) is for example RSS
feed, or summary page. When project (repository) is updated, they can
change.
The opposite are immutable pages. They are pages/actions/views where
all specifiers are given by full SHA-1; to be more exact all specifiers
that are needed to reconstruct object are given by SHA-1. (It is
enough to have sufficient check for immutability, i.e. such that if
check succeeds, then page is immutable, but it doesn't need to be true
in reverse.)
Gitweb sets expires to '+1d' which is one day to pages it considers
immutable, while not defining expires for other pages (which results,
I think, in lack of expires header). We could have set it to "forever",
which in terms of Expires: HTTP header is half a year (from what
I remember).
Now I don't see *any* reason to not set long expires for immutable
pages; I don't know if forbidding to cache transient pages even if in
fact they are generated dynamically is a good idea... Note that if
caching is enabled, you can set expires to either time-to-expire of
cache entries (simpler), or time left to live to invalidation of item
in cache (better, but more complicated) perhaps also setting Age:
header to appropriate value.
Sidenote: we would probably want to use Expires: for HTTP/1.0 requests,
and Cache-Control: max-age=<seconds> for HTTP/1.1 requests. But that
might be left as improvement for later...
> - Add a $page_info option to display cache stats at the bottom of each
> page; the option is named generically to allow for adding non-cache
> page info there at some point (timings perhaps?).
Great idea!
> ---
> It's all documented of course :-), but for the impatient here's a
> snippet for gitweb_config.perl to activate caching:
Nice.
> use Cache::Memcached;
> $cache = Cache::Memcached->new( { servers => ['localhost:11211'],
> compress_threshold => 1000 } );
IIRC you can use any Cache::Cache compatibile (is it explained later
what it means?) cache here; IMVHO it would be nice if this info would
be also in commit message.
> $large_cache_root = '/home/lewiemann/gitweb-cache';
> $large_cache_case_sensitive = 1;
Errr... I understand that it is your _private_ configuration, just
copied here verbatim, but I don't think '/home/lewiemann/gitweb-cache'
is a good example: '/tmp/gitweb-cache' perhaps, that I can understand.
> # Invalidate cache on changes to gitweb without version number bump;
> # useful for development.
> $cache_key = (stat '/home/lewiemann/gitweb')[9] .
> (stat '/home/lewiemann/gitweb/gitweb.cgi')[9];
What should be used in production? "$cache_key = $version;"?
Besides hardcoding those paths is not a good idea. You can always
use $ENV{'SCRIPT_FILENAME'}, or dirname of it.
> # Display detailed cache info at the bottom of each page.
> $page_info = 2;
Errr... what does "$page_info = <n>;" mean?
> A live demo is here: http://odin3.kernel.org/git-lewiemann/
Nice. Thanks.
[...]
> gitweb/README | 14 +
Very good.
[Comments on patch itself in separate email, later]
--
Jakub Narebski
Poland
next prev parent reply other threads:[~2008-07-14 21:24 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-11 1:06 [PATCH 0/3] Git::Repo API and gitweb caching Lea Wiemann
2008-07-11 1:10 ` [PATCH 1/3 v9] gitweb: add test suite with Test::WWW::Mechanize::CGI Lea Wiemann
2008-07-11 1:11 ` [PATCH 2/3] add new Git::Repo API Lea Wiemann
2008-07-13 21:38 ` Junio C Hamano
2008-07-14 1:04 ` Lea Wiemann
2008-07-13 23:28 ` Jakub Narebski
2008-07-14 2:29 ` Lea Wiemann
2008-07-14 1:40 ` Petr Baudis
2008-07-14 22:19 ` Lea Wiemann
2008-07-18 16:48 ` Petr Baudis
2008-07-18 17:05 ` Jakub Narebski
2008-07-18 17:17 ` Petr Baudis
2008-07-18 18:09 ` Lea Wiemann
2008-07-18 18:19 ` Petr Baudis
2008-07-18 18:23 ` Johannes Schindelin
2008-07-19 20:54 ` Statictics on Git.pm usage in git commands (was: [PATCH 2/3] add new Git::Repo API) Jakub Narebski
2008-07-19 21:14 ` Petr Baudis
2008-07-20 0:16 ` Jakub Narebski
2008-07-20 21:38 ` Petr Baudis
2008-07-20 10:38 ` Johannes Schindelin
2008-07-20 10:49 ` Petr Baudis
2008-07-20 12:33 ` Johannes Schindelin
2008-07-20 12:58 ` Petr Baudis
2008-07-20 13:21 ` Johannes Schindelin
2008-07-14 23:41 ` [PATCH 2/3] add new Git::Repo API Jakub Narebski
2008-07-15 0:11 ` Lea Wiemann
2008-07-18 16:54 ` Petr Baudis
2008-07-19 0:03 ` Jakub Narebski
2008-07-19 19:07 ` Jakub Narebski
2008-07-20 21:36 ` Petr Baudis
2008-07-20 21:50 ` Jakub Narebski
2008-07-16 18:21 ` Jakub Narebski
2008-07-16 20:32 ` Lea Wiemann
2008-07-17 23:49 ` Jakub Narebski
2008-07-18 13:40 ` Lea Wiemann
2008-07-18 15:35 ` Jakub Narebski
2008-07-18 16:51 ` Lea Wiemann
2008-07-11 1:11 ` [PATCH 3/3] gitweb: use new Git::Repo API, and add optional caching Lea Wiemann
2008-07-14 21:23 ` Jakub Narebski [this message]
2008-07-14 23:03 ` Lea Wiemann
2008-07-14 23:14 ` Jakub Narebski
2008-07-14 23:56 ` Lea Wiemann
2008-07-15 0:52 ` Jakub Narebski
2008-07-15 1:16 ` Lea Wiemann
2008-07-15 1:28 ` Johannes Schindelin
2008-07-15 1:44 ` J.H.
2008-07-15 1:50 ` Lea Wiemann
2008-07-15 2:03 ` J.H.
2008-07-11 1:21 ` [PATCH 0/3] Git::Repo API and gitweb caching Johannes Schindelin
2008-07-11 9:33 ` Jakub Narebski
2008-07-11 14:07 ` Lea Wiemann
2008-07-11 16:27 ` Abhijit Menon-Sen
2008-07-12 15:08 ` Jakub Narebski
2008-07-19 5:35 ` Lea Wiemann
2008-08-18 19:34 ` Lea Wiemann
2008-08-18 19:39 ` [PATCH 1/3 v10] gitweb: add test suite with Test::WWW::Mechanize::CGI Lea Wiemann
2008-08-19 1:17 ` Junio C Hamano
2008-08-19 14:37 ` Lea Wiemann
2008-08-18 19:39 ` [PATCH 2/3 v2] add new Perl API: Git::Repo, Git::Commit, Git::Tag, and Git::RepoRoot Lea Wiemann
2008-08-19 1:32 ` Junio C Hamano
2008-08-19 15:06 ` Lea Wiemann
2008-08-19 13:51 ` Lea Wiemann
2008-08-18 19:39 ` [PATCH 3/3 v2] gitweb: use new Git::Repo API, and add optional caching Lea Wiemann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200807142323.22761.jnareb@gmail.com \
--to=jnareb@gmail.com \
--cc=git@vger.kernel.org \
--cc=lewiemann@gmail.com \
--cc=pasky@suse.cz \
--cc=warthog19@eaglescrag.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).