git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Martin Langhoff" <martin.langhoff@gmail.com>
To: "Linus Torvalds" <torvalds@osdl.org>
Cc: "Jeff Garzik" <jeff@garzik.org>, "H. Peter Anvin" <hpa@zytor.com>,
	"Rogan Dawes" <discard@dawes.za.net>,
	"Kernel Org Admin" <ftpadmin@kernel.org>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Jakub Narebski" <jnareb@gmail.com>
Subject: Re: kernel.org mirroring (Re: [GIT PULL] MMC update)
Date: Sat, 9 Dec 2006 14:56:28 +1300	[thread overview]
Message-ID: <46a038f90612081756w1ab4609epcb4a2cbd9f4d8205@mail.gmail.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0612081453430.3516@woody.osdl.org>

On 12/9/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Actually, just looking at the examples, it looks like memcached is
> fundamentally flawed, exactly the same way Apache mod_cache is
> fundamentally flawed.

I don't know if fundamentally flawed but (having used memcached) I
don't think it's a big win for this at all.

We can make gitweb to detect mod_perl and a few smarter things if it
is running inside of it. In fact, we can (ab)use mod_perl and perl
facilities a bit to do some serialization which will be a big win for
some pages. What we need for that is to set a sensible the ETag and
use some IPC to announce/check if other apache/modperl processes are
preparing content for the same ETag. The first-process-to-announce a
given ETag can then write it to a common temp directory (atomically -
write to a temp-name and move to the expected name) while other
processes wait, polling for the file. Once the file is in place the
latecomers can just serve the content of the file and exit.

(I am calling the "state we are serving" identifier ETag because I
think we should also set it as the ETag in the HTTP headers, so well
be able to check the ETag of future requests for staleness - all we
need is a ref lookup, and if the SHA1 matches, we are sorted). So
having this 'unique request identifier' doubles up nicely...

The ETag should probably be:
 - SHA1+displaytype+args for pages that display an object identified by SHA1
 - refname+SHA!+displaytype+args for pages that display something
identified by a ref
 - SHA1(names and sha1s of all refs) for the summary page

> You can't have a cache architecture where the client just does a "get",
> like memcached does. You need to have a "read-for-fill" operation, which
> says:

You _could_ make do with a convention of polling for "entryname" and
"workingon-entryname" and if "workingon-entryname" is set to 1, you
can expect entryname to be filled real soon now. However, memcached is
completely memorybound, so it is only nice for really small stuff or
for a large server farm which has gobs of spare ram.

(Note that memcached does have timeouts which means that the
'workingon' value could have a short timeout in case the request is
cancelled or the process dies - the nasty bit in the above plan would
be the polling.)

> I still don't understand why apache doesn't do it. I guess it wants to be
> stateless or something.

Apache doesn't do it because most web applications don't use the HTTP
procol correctly - specially when it comes to the idempotency of GET.
So in 99% of the cases, web apps serve truly different pages for the
same GET request, depending on your cookie, IP address, time-of-day,
etc.

Most websites deal with very little traffic, so this isn't a problem.
And many large sites that serve a lot of traffic from a dynamic web
app want to be serving custom ads, let you login and see your
personalised toolbar, etc,etc, so this wouldn't work for them either.

So in practice, serialising speculatively on GET requests for the same
URL has very little payoff except for static content. And that's quite
fast anyway.... specially if the underlying OS is smokin' fast ;-)

cheers,




  parent reply	other threads:[~2006-12-09  1:56 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <45708A56.3040508@drzeus.cx>
     [not found] ` <Pine.LNX.4.64.0612011639240.3695@woody.osdl.org>
     [not found]   ` <457151A0.8090203@drzeus.cx>
     [not found]     ` <Pine.LNX.4.64.0612020835110.3476@woody.osdl.org>
     [not found]       ` <45744FA3.7020908@zytor.com>
     [not found]         ` <Pine.LNX.4.64.0612061847190.3615@woody.osdl.org>
     [not found]           ` <45778AA3.7080709@zytor.com>
     [not found]             ` <Pine.LNX.4.64.0612061940170.3615@woody.osdl.org>
     [not found]               ` <4577A84C.3010601@zytor.com>
     [not found]                 ` <Pine.LNX.4.64.0612070953290.3615@woody.osdl.org>
     [not found]                   ` <45785697.1060001@zytor.com>
2006-12-07 19:05                     ` kernel.org mirroring (Re: [GIT PULL] MMC update) Linus Torvalds
2006-12-07 19:16                       ` H. Peter Anvin
2006-12-07 19:30                         ` Olivier Galibert
2006-12-07 19:57                           ` H. Peter Anvin
2006-12-07 23:50                             ` Olivier Galibert
2006-12-07 23:56                               ` H. Peter Anvin
2006-12-08 11:25                               ` Jakub Narebski
2006-12-08 12:57                             ` Rogan Dawes
2006-12-08 13:38                               ` Jakub Narebski
2006-12-08 14:31                                 ` Rogan Dawes
2006-12-08 15:38                                   ` Jonas Fonseca
2006-12-09  1:28                                 ` Martin Langhoff
2006-12-09  2:03                                   ` H. Peter Anvin
2006-12-09  2:52                                     ` Martin Langhoff
2006-12-09  5:09                                       ` H. Peter Anvin
2006-12-09  5:34                                         ` Martin Langhoff
2006-12-09 16:26                                           ` H. Peter Anvin
2006-12-08 16:16                               ` H. Peter Anvin
2006-12-08 16:35                                 ` Linus Torvalds
2006-12-08 16:42                                   ` H. Peter Anvin
2006-12-08 19:49                                     ` Lars Hjemli
2006-12-08 19:51                                       ` H. Peter Anvin
2006-12-08 19:59                                         ` Lars Hjemli
2006-12-08 20:02                                           ` H. Peter Anvin
2006-12-10  9:43                                     ` rda
2006-12-08 16:54                                   ` Jeff Garzik
2006-12-08 17:04                                     ` H. Peter Anvin
2006-12-08 17:40                                       ` Jeff Garzik
2006-12-08 23:27                                     ` Linus Torvalds
2006-12-08 23:46                                       ` Michael K. Edwards
2006-12-08 23:49                                         ` H. Peter Anvin
2006-12-09  0:18                                           ` Michael K. Edwards
2006-12-09  0:23                                             ` H. Peter Anvin
2006-12-09  0:49                                         ` Linus Torvalds
2006-12-09  0:51                                           ` H. Peter Anvin
2006-12-09  4:36                                           ` Michael K. Edwards
2006-12-09  9:27                                           ` Jeff Garzik
     [not found]                                       ` <4579FABC.5070509@garzik.org>
2006-12-09  0:45                                         ` Linus Torvalds
2006-12-09  0:47                                           ` H. Peter Anvin
2006-12-09  9:16                                           ` Jeff Garzik
2006-12-09  1:56                                       ` Martin Langhoff [this message]
2006-12-09 11:51                                         ` Jakub Narebski
2006-12-09 12:42                                           ` Jeff Garzik
2006-12-09 13:37                                             ` Jakub Narebski
2006-12-09 14:43                                               ` Jeff Garzik
2006-12-09 17:02                                                 ` Jakub Narebski
2006-12-09 17:27                                                   ` Jeff Garzik
2006-12-10  4:07                                               ` Martin Langhoff
2006-12-10 10:09                                                 ` Jakub Narebski
2006-12-10 12:41                                                   ` Jeff Garzik
2006-12-10 13:02                                                     ` Jakub Narebski
2006-12-10 13:45                                                       ` Jeff Garzik
2006-12-10 19:11                                                         ` Jakub Narebski
2006-12-10 19:50                                                           ` Linus Torvalds
2006-12-10 20:27                                                             ` Jakub Narebski
2006-12-10 20:30                                                               ` Linus Torvalds
2006-12-10 22:01                                                                 ` Martin Langhoff
2006-12-10 22:14                                                                   ` Jeff Garzik
2006-12-10 22:08                                                                 ` Jeff Garzik
2006-12-10 21:01                                                             ` H. Peter Anvin
2006-12-10 22:05                                                           ` Jeff Garzik
2006-12-10 22:59                                                             ` Jakub Narebski
2006-12-11  2:16                                                               ` Martin Langhoff
2006-12-11  8:59                                                                 ` Jakub Narebski
2006-12-11 10:18                                                                   ` Martin Langhoff
2006-12-09 18:04                                             ` Linus Torvalds
2006-12-09 18:30                                               ` H. Peter Anvin
2006-12-10  3:55                                             ` Martin Langhoff
2006-12-10  7:05                                               ` H. Peter Anvin
2006-12-12 21:19                                                 ` Jakub Narebski
2006-12-09  7:56                                       ` Steven Grimm
2006-12-07 19:30                         ` Linus Torvalds
2006-12-07 19:39                           ` Shawn Pearce
2006-12-07 19:58                             ` Linus Torvalds
2006-12-07 23:33                               ` Michael K. Edwards
2006-12-07 19:58                             ` H. Peter Anvin
2006-12-07 20:05                           ` Junio C Hamano
2006-12-07 20:09                             ` H. Peter Anvin
2006-12-07 22:11                               ` Junio C Hamano
2006-12-08  9:43                       ` Jakub Narebski
2006-12-11  3:40 linux
2006-12-11  9:30 ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46a038f90612081756w1ab4609epcb4a2cbd9f4d8205@mail.gmail.com \
    --to=martin.langhoff@gmail.com \
    --cc=discard@dawes.za.net \
    --cc=ftpadmin@kernel.org \
    --cc=git@vger.kernel.org \
    --cc=hpa@zytor.com \
    --cc=jeff@garzik.org \
    --cc=jnareb@gmail.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).