From: Jonathan Nieder <jrnieder@gmail.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org, "J.H." <warthog9@eaglescrag.net>,
John 'Warthog9' Hawley <warthog9@kernel.org>
Subject: Re: [RFD] My thoughts about implementing gitweb output caching
Date: Fri, 7 Jan 2011 18:26:43 -0600
Message-ID: <20110108002643.GD15495@burratino>
In-Reply-To: <201101080042.36156.jnareb@gmail.com>
Hi,
Thanks for these design notes. A few uninformed reactions.
Jakub Narebski wrote:
> There was request to support installing gitweb modules in a separate
> directory, but that would require changes to "gitweb: Prepare for
> splitting gitweb" patch (but it is doable). Is there wider interest
> in supporting such feature?
If you are referring to my suggestion, I see no reason to wait on
that. The lib/ dir can be made configurable later.
> Simplest solution is to use $cgi->self_url() (note that what J.H. v8
> uses, i.e.: "$my_url?". $ENV{'QUERY_STRING'}, where $my_url is just
> $cgi->url() is not enough - it doesn't take path_info into account).
>
> Alternate solution, which I used in my rewrite, is to come up with
> "canonical" URL, e.g. href(-replay => 1, -full => 1, -path_info => 0);
> with this solution using path_info vs query parameters or reordering
> query parameters still gives the same key.
It is easy to miss dependencies on parts of the URL that are being
fuzzed out. For example, the <base href...> tag is only inserted with
path_info. Maybe it would be less risky to first use self_url(), then
canonicalize it in a separate patch?
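To make the risk concrete, here is a minimal sketch in Python (not gitweb's Perl). Everything in it is hypothetical: canonical_key(), render(), the "/project/action" path_info layout, and the example.org URL are stand-ins for illustration, not gitweb's actual code. It shows two requests that canonicalize to the same key while the rendered output differs.

```python
def canonical_key(path_info, params):
    """Hypothetical canonical cache key: fold path_info into the query
    parameters and sort them, so both URL styles give the same key."""
    merged = dict(params)
    if path_info:
        # Assumed layout for this sketch only: /<project>/<action>
        project, _, action = path_info.strip("/").partition("/")
        merged.setdefault("p", project)
        if action:
            merged.setdefault("a", action)
    return "&".join(f"{k}={v}" for k, v in sorted(merged.items()))


def render(path_info, params):
    """Stand-in page generator: emits <base href> only for path_info
    requests, mirroring the dependency mentioned above."""
    head = '<base href="http://example.org/gitweb.cgi">' if path_info else ""
    return head + "<body>summary</body>"


via_path = ("/git.git/summary", {})
via_query = ("", {"p": "git.git", "a": "summary"})

# Same key, different output: whichever page is cached first gets
# served for both URL styles.
same_key = canonical_key(*via_path) == canonical_key(*via_query)
same_page = render(*via_path) == render(*via_query)
```

Keying on the URL as requested avoids the collision, at the cost of caching the same page under several keys.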
> J.H. patches up to and including v7, and my rewrite up to and including v6,
> excluded error pages from caching. I think that the original reasoning
> behind choosing to do it this way was that A.) each specific error
> page is usually accessed only once, so caching them would only take up
> space, bloating the cache, but what is more important B.) that you can't
> cache errors from the caching engine.
Perhaps there is a user experience reason? If I receive an error page
due to a problem with my repository, all else being equal, I would
prefer that the next time I reload it is fixed. By comparison, having
to reload multiple times to forget an obsolete non-error response
would be less aggravating and perhaps expected.
But the benefit from caching e.g. a response from a broken link would
outweigh that.
> Second is if there is no stale data to serve (or data is too stale), but
> we have progress indicator. In this case the foreground process is
> responsible for rendering progress indicator, and background process is
> responsible for generating data. In this case foreground process waits
> for data to be generated (unless progress info subroutine exits), so
> strictly speaking we don't need to detach background process in this
> case.
What happens when the client gets tired of waiting and closes the
connection?
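The concern can be illustrated with a small Python sketch. It uses a thread in place of gitweb's background process, and every name in it (cache, generate(), handle_request(), the patience parameter) is made up for the illustration. It shows the outcome one would hope for, where generation is independent of the waiting client and the cache still gets filled after the client gives up. It says nothing about what the patches actually do.

```python
import threading
import time

cache = {}  # stand-in for the on-disk cache


def generate(key):
    """Pretend to do the expensive page generation, then store the result."""
    time.sleep(0.05)
    cache[key] = f"page for {key}"


def handle_request(key, patience):
    """Start generation and wait at most `patience` seconds, as an
    impatient client would. Returns (response or None, worker)."""
    worker = threading.Thread(target=generate, args=(key,))
    worker.start()
    worker.join(timeout=patience)  # the client stops waiting here
    return cache.get(key), worker


# The client gives up almost immediately and gets nothing...
response, worker = handle_request("summary", patience=0.001)
# ...but the worker was not tied to the client, so it finishes anyway
# and the next request can be served from the cache.
worker.join()
```

If the generating process is instead killed along with the foreground one, the work is lost and the next visitor starts from scratch.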
> With output caching gitweb can also support 'Range' requests, which
> means that it would support resumable download. This would mean that we
> would be able to resume downloading of snapshot (or in the future
> bundle)... if we cannot do this now. This would require some more code
> to be added.
Exciting stuff.
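As a rough idea of the extra code involved, here is a Python sketch (not gitweb's Perl) of serving a single byte range from an already-cached response. parse_range() and serve_range() are hypothetical helpers. The sketch handles only one "bytes=" range and deliberately simplifies the HTTP rules: it falls back to a full 200 response where a real server might answer 416.

```python
def parse_range(header, size):
    """Return (start, end), inclusive, for a single 'bytes=' range,
    or None if the header is unsupported or cannot be satisfied."""
    if size <= 0 or not header.startswith("bytes=") or "," in header:
        return None
    first, sep, last = header[len("bytes="):].partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes.
            n = int(last)
            if n <= 0:
                return None
            return (max(size - n, 0), size - 1)
        start = int(first)
        end = int(last) if last else size - 1
    except ValueError:
        return None
    if start >= size or start > end:
        return None
    return (start, min(end, size - 1))


def serve_range(data, header):
    """Return (status, body) for cached bytes `data`.
    Simplification: anything unusable yields the full body with 200."""
    rng = parse_range(header, len(data)) if header else None
    if rng is None:
        return 200, data
    start, end = rng
    return 206, data[start:end + 1]
```

A client resuming a snapshot download at byte 100 would send "Range: bytes=100-", and serve_range() would answer with the remainder of the cached file.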
Teaching gitweb to generate bundles sounds like a recipe for high server
loads, though. I suspect manual (or by cronjob) generation would work
better, with a possible exception of very frequently cloned and
infrequently pushed-to repos like Linus's linux-2.6.
Jonathan