From: Jonathan Nieder <jrnieder@gmail.com>
To: Olaf Alders <olaf@wundersolutions.com>
Cc: Jakub Narebski <jnareb@gmail.com>,
"J.H." <warthog9@eaglescrag.net>,
git@vger.kernel.org, John 'Warthog9' Hawley <warthog9@kernel.org>,
Junio C Hamano <gitster@pobox.com>, Petr Baudis <pasky@ucw.cz>,
admin@repo.or.cz
Subject: Re: [RFC] Implementing gitweb output caching - issues to solve
Date: Thu, 9 Dec 2010 22:11:44 -0600
Message-ID: <20101210041144.GA28166@burratino>
In-Reply-To: <88CF82F1-0363-47B4-8C6F-AE4A2DA1714B@wundersolutions.com>
Olaf Alders wrote:
> On 2010-12-09, at 5:52 PM, Jonathan Nieder wrote:
>> HTTP::BrowserDetect uses a blacklist as far as I can tell. Maybe in
>> the long term it would be nice to add a whitelist ->human() method.
>>
>> Cc-ing Olaf Alders for ideas.
>
> Thanks for including me in this. :) I'm certainly open to patching
> the module, but I'm not 100% clear on how you would want to
> implement this. How is ->is_human different from !->is_robot? To
> clarify, I should say that from the snippet above, I'm not 100%
> clear on what the problem is which needs to be solved.
Context (sorry I did not include this in the first place):

The caching code (in development) for git's web interface uses a page
that says "Generating..." for cache misses, with an HTTP refresh
redirecting to the generated content. The big downside is that, if
done naively, this breaks wget, curl, and similar user agents that are
not patient enough to grab the actual content instead of the redirect
page.
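
To make the mechanism concrete, here is a rough sketch of such a placeholder page. It is an illustration only, not the code from the caching series: the function name, the URL, and the use of a meta refresh tag (rather than a Refresh header) are all assumptions.

```python
from html import escape


def generating_page(target_url: str, delay_seconds: int = 0) -> str:
    """Return a placeholder page that asks the client to load target_url.

    Hypothetical helper for illustration; not part of gitweb.
    """
    url = escape(target_url, quote=True)
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={url}">'
        "</head><body>Generating...</body></html>"
    )


# Hypothetical URL, chosen only to show that the query string is escaped.
page = generating_page("/cache/result?id=1&fmt=html")
```

A browser acts on the refresh and ends up at the real content. A client that simply saves the response body stops here and keeps only the "Generating..." text.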

The first solution tried was to explicitly special-case wget and curl.
But in this case it is better to be more inclusive[2]; when in doubt,
leave out the nice "Generating..." page and serve the actual content
slowly, just in case.

In other words, the idea was that user agents fall into three
categories:

A. definitely will not replace content with target of HTTP refresh
B. definitely will replace content with target of HTTP refresh
C. unknown

and maybe ->is_robot could return true for A and ->is_human return
true for B (leaving C as !->is_human && !->is_robot). In this case,
we should show the "Generating..." page only in the ->is_human (B)
case.
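
The three-way split above could be sketched roughly as follows. This is only an illustration of the idea, not HTTP::BrowserDetect's API or its detection logic: the token lists, the enum, and both function names are made up for the example, and a real whitelist would need far more care.

```python
from enum import Enum


class Agent(Enum):
    NO_REFRESH = "A"       # definitely will not follow an HTTP refresh
    FOLLOWS_REFRESH = "B"  # definitely will follow an HTTP refresh
    UNKNOWN = "C"          # anything we cannot place with confidence


# Hypothetical token lists, for illustration only.
ROBOT_TOKENS = ("wget", "curl")
HUMAN_TOKENS = ("firefox", "chrome", "safari", "opera", "msie")


def classify(user_agent: str) -> Agent:
    """Place a User-Agent string into category A, B, or C."""
    ua = user_agent.lower()
    # Check the blacklist first so a known tool is never treated as a browser.
    if any(token in ua for token in ROBOT_TOKENS):
        return Agent.NO_REFRESH
    if any(token in ua for token in HUMAN_TOKENS):
        return Agent.FOLLOWS_REFRESH
    return Agent.UNKNOWN


def show_generating_page(user_agent: str) -> bool:
    """Serve the "Generating..." page only for category B."""
    return classify(user_agent) is Agent.FOLLOWS_REFRESH
```

The point of the sketch is the last function: categories A and C are treated alike, so an unknown agent gets the real content (slowly) rather than the placeholder.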

That said, I know almost nothing about this subject, so it is likely
this analysis misses something. J.H. or Jakub can likely say more.

Thanks,
Jonathan

Thread overview: 9+ messages
2010-11-04 16:21 [RFC] Implementing gitweb output caching - issues to solve Jakub Narebski
2010-12-09 1:31 ` J.H.
2010-12-09 5:22 ` Junio C Hamano
2010-12-09 5:28 ` J.H.
2010-12-09 22:30 ` Jakub Narebski
2010-12-09 22:52 ` Jonathan Nieder
2010-12-10 3:17 ` Olaf Alders
2010-12-10 4:11 ` Jonathan Nieder [this message]
2010-12-10 4:46 ` J.H.