git.vger.kernel.org archive mirror
From: Jakub Narebski <jnareb@gmail.com>
To: "Martin Langhoff" <martin.langhoff@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>
Cc: "Linus Torvalds" <torvalds@osdl.org>,
	"Jeff Garzik" <jeff@garzik.org>, "H. Peter Anvin" <hpa@zytor.com>,
	"Rogan Dawes" <discard@dawes.za.net>,
	"Kernel Org Admin" <ftpadmin@kernel.org>
Subject: Re: kernel.org mirroring (Re: [GIT PULL] MMC update)
Date: Sat, 9 Dec 2006 12:51:15 +0100
Message-ID: <200612091251.16460.jnareb@gmail.com>
In-Reply-To: <46a038f90612081756w1ab4609epcb4a2cbd9f4d8205@mail.gmail.com>

Martin Langhoff wrote:

> We can make gitweb to detect mod_perl and a few smarter things if it
> is running inside of it. In fact, we can (ab)use mod_perl and perl
> facilities a bit to do some serialization which will be a big win for
> some pages. What we need for that is to set a sensible ETag and
> use some IPC to announce/check if other apache/modperl processes are
> preparing content for the same ETag. The first-process-to-announce a
> given ETag can then write it to a common temp directory (atomically -
> write to a temp-name and move to the expected name) while other
> processes wait, polling for the file. Once the file is in place the
> latecomers can just serve the content of the file and exit.

First, it would (and could) work only when gitweb is served under
mod_perl. I'm not sure the IPC overhead and the implementation
complexity are worth it; this would perhaps be better solved by a
caching engine.
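
For reference, the write-to-a-temp-name-and-move step from the quoted
proposal can be sketched as below. This is only an illustration in
Python, not gitweb code: the function names, the cache directory layout
and the polling parameters are all assumptions, and the "announce" IPC
between processes is left out entirely.

```python
import os
import tempfile
import time


def publish_atomically(cache_dir, key, content):
    """Write content (bytes) under cache_dir/key so that readers never
    see a partially written file. Names here are hypothetical."""
    final_path = os.path.join(cache_dir, key)
    # The temp file must live in the same directory (same filesystem)
    # for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, prefix=key + ".tmp.")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(content)
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


def wait_for(cache_dir, key, timeout=5.0, interval=0.05):
    """Poll for cache_dir/key; return its bytes, or None on timeout."""
    path = os.path.join(cache_dir, key)
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path, "rb") as f:
                return f.read()
        except FileNotFoundError:
            if time.monotonic() >= deadline:
                return None
            time.sleep(interval)
```

A latecomer would call wait_for() with the same key and fall back to
generating the page itself if it gets None back.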

But let us put actual caching aside for a while (writing the HTML
version of the page to a common temp directory, and serving this static
page if possible), and talk a bit about what gitweb can do with respect
to cache validation.

In addition to setting either the Expires: header or Cache-Control:
max-age, gitweb should also set the Last-Modified: and ETag: headers,
and probably also respond to If-Modified-Since: and If-None-Match:
requests.

Would it be worth implementing this?
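
To make the question concrete, here is a minimal sketch of the
validation logic, written in Python for illustration only (gitweb
itself is Perl). The function names and the plain dict used for the
request headers are assumptions. The ETag list is split naively on
commas, which is good enough for tags that contain none.

```python
from email.utils import formatdate, parsedate_to_datetime


def validator_headers(etag, mtime):
    """Build the response validators; mtime is seconds since the epoch."""
    return {"ETag": etag, "Last-Modified": formatdate(mtime, usegmt=True)}


def _strip_weak(tag):
    # Compare weakly: W/"x" and "x" count as the same tag.
    return tag[2:] if tag.startswith("W/") else tag


def check_conditional(request_headers, etag, mtime):
    """Return 304 if the client's copy is still valid, else 200."""
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # If-None-Match, when present, decides the outcome on its own.
        candidates = [_strip_weak(t.strip()) for t in inm.split(",")]
        if "*" in candidates or _strip_weak(etag) in candidates:
            return 304
        return 200
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            since = parsedate_to_datetime(ims).timestamp()
        except (TypeError, ValueError):
            return 200  # unparsable date: ignore the header
        if mtime <= since:
            return 304
    return 200
```

On a 304 the page body does not have to be generated at all, which is
where the savings would come from.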
 
> (I am calling the "state we are serving" identifier ETag because I
> think we should also set it as the ETag in the HTTP headers, so we'll
> be able to check the ETag of future requests for staleness - all we
> need is a ref lookup, and if the SHA1 matches, we are sorted). So
> having this 'unique request identifier' doubles up nicely...

For some pages ETag is natural; for others Last-Modified: would be more
natural.

> The ETag should probably be:
>  - SHA1+displaytype+args for pages that display an object identified
>    by SHA1

What uniquely identifies the contents in "object" views ("commit", "tag",
"tree", "blob") is either h=SHA1, or hb=SHA1;f=FILENAME (in the absence
of h=SHA1). If both h=SHA1 and hb=SHA1 are present, hb=SHA1 serves as a
backlink. The "diff" views ("commitdiff", "blobdiff") are uniquely
identified by a pair of object identifiers (a pair of SHA1s, or a pair
of hb SHA1 + FILENAME).

Three of those views ("blob", "commitdiff", "blobdiff") also have a
"plain" version, so the ETag should include the displaytype (action,
the 'a' parameter).

The hb=SHA1;f=FILENAME identifier can be converted to h=SHA1 at the
cost of one call to a git command, namely git-ls-tree (which is a bit
expensive, as it recurses trees).

The ETag can simply be the args (query), if all h/hb/hbp parameters are
SHA1s. Or the ETag can be the SHA1 of an object (or a pair of SHA1s in
the case of a diff), but this is a little more costly to verify,
although we usually (always?) convert hb=SHA1;f=FILENAME to h=SHA1
anyway when generating the page.

Usually you can compare ETags based on the URL alone.
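
A URL-only ETag along these lines might look like the sketch below
(Python, for illustration; the function name url_etag and the choice to
hash the sorted query are assumptions, not existing gitweb behaviour).
It returns no ETag when any of h/hb/hbp is a ref name or an abbreviated
hash, since such a URL does not pin down the content.

```python
import hashlib
import re

FULL_SHA1 = re.compile(r"^[0-9a-f]{40}$")


def url_etag(query):
    """Return a quoted ETag derived from the query string, or None if
    the query does not identify its objects by full SHA1."""
    params = {}
    # gitweb separates parameters with ';'; accept '&' as well.
    for part in query.replace("&", ";").split(";"):
        if part:
            key, _, value = part.partition("=")
            params[key] = value
    ids = [params[k] for k in ("h", "hb", "hbp") if k in params]
    if not ids or not all(FULL_SHA1.match(v) for v in ids):
        return None
    # Sort so that parameter order does not change the tag; the action
    # ('a') stays part of the input, so "blob" and "blob_plain" differ.
    canonical = ";".join(f"{k}={params[k]}" for k in sorted(params))
    return '"' + hashlib.sha1(canonical.encode("utf-8")).hexdigest() + '"'
```

Validation is then a string comparison against If-None-Match, with no
git command run at all.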
   
>  - refname+SHA1+displaytype+args for pages that display something
>    identified by a ref

For object views we can simply convert the refname to a SHA1. I'm not
sure it is worth it. In cases where the view has to calculate the SHA1
of the object anyway, we can return (and validate) an ETag with the
SHA1 as above.

- ETag and/or Last-Modified: headers for the "log" views: "log",
"shortlog" (which is part of the summary view), "history", and
"rss"/"atom".

On the one hand, all log views (at least now) are identified by their
parameters (action/view name, and filename in the case of the history
view) and the SHA1 of the top commit. On the other hand, it might be
easier to use Last-Modified: with the date of the top commit...
Verifying a SHA1-based ETag could add some overhead in the case of a
miss.

>  - SHA1(names and sha1s of all refs) for the summary page

Wouldn't it be simpler to just set the Last-Modified: header (and check
it)?


P.S. Can anyone post a benchmark comparing gitweb deployed under
mod_perl with gitweb deployed as a CGI script? Does kernel.org use the
mod_perl or the CGI version of gitweb?

-- 
Jakub Narebski
