From: Jeff Garzik
Subject: Re: kernel.org mirroring (Re: [GIT PULL] MMC update)
Date: Sat, 09 Dec 2006 09:43:45 -0500
Message-ID: <457ACBA1.4090007@garzik.org>
References: <200612091251.16460.jnareb@gmail.com> <457AAF31.2050002@garzik.org> <200612091437.01183.jnareb@gmail.com>
In-Reply-To: <200612091437.01183.jnareb@gmail.com>
To: Jakub Narebski
Cc: Martin Langhoff, Git Mailing List, Linus Torvalds, "H. Peter Anvin", Rogan Dawes, Kernel Org Admin
X-Mailing-List: git@vger.kernel.org

Jakub Narebski wrote:
>
> Sending Last-Modified: should be easy; sending ETag needs some consensus
> on the contents: mainly about validation. Responding to If-Modified-Since:
> and If-None-Match: should cut at least _some_ of the page generating time.

Definitely.

> As I said, I'm not talking (at least now) about saving generated HTML
> output. This I think is better solved in caching engine like Squid can
> be. Although even here some git specific can be of help: we can invalidate
> cache on push, and we know that some results doesn't ever change (well,
> with exception of changing output of gitweb).

It depends on how creatively you think ;-)  Consider generating static
HTML files on each push, via a hook, for many of the toplevel files.
The static HTML would then link to the CGI for further dynamic querying
of the git database.

> What can be _easily_ done:
>  * Use post 1.4.4 gitweb, which uses git-for-each-ref to generate summary
>    page; this leads to around 3 times faster summary page.

This re-opens the question mentioned earlier: is Kay (or anyone?) still
actively maintaining gitweb on k.org?

>  * Perhaps using projects list file (which can be now generated by gitweb)
>    instead of scanning directories and stat()-ing for owner would help
>    with time to generate projects list page

This could be statically generated by a robot.  I think everybody would
shrink in horror if a human needed to maintain such a file.

> What can be quite easy incorporated into gitweb:
>  * For immutable pages set Expires: or Cache-Control: max-age (or both)
>    to infinity

nice!

>  * Generate Last-Modified: for those views where it can be calculated,
>    and respond with 304 Not Modified as soon as it can.
agreed

> What can be easily done using caching engine:
>  * Select top 10 of common queries, and cache them, invalidating cache on push
>    (depending on query: for example invalidate project list on push to any
>    project, invalidate RSS/Atom feed and summary pages only on push to specific
>    project) - can be done with git hooks.

Or simply generate regular filesystem files into the webspace, as
triggered by a hook.  Let the standard filesystem mirroring/caching work
its magic.

	Jeff
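
For illustration, the conditional-request handling discussed in this
thread (send Last-Modified: and ETag, answer If-Modified-Since: and
If-None-Match: with 304 Not Modified) might look roughly like the sketch
below.  It is not gitweb code: gitweb is written in Perl, and this is
Python purely to show the logic.  The function names are hypothetical,
and using the commit ID as the ETag is only an assumption, since the
thread notes that the ETag contents still need consensus.  Weak
validators (W/"...") are not handled.

```python
from email.utils import formatdate, mktime_tz, parsedate_tz


def conditional_headers(commit_id, commit_time):
    """Build validator headers for a view tied to a single commit.

    commit_id   -- object name of the commit (assumed ETag content)
    commit_time -- committer timestamp, in seconds since the epoch
    """
    return {
        "ETag": '"%s"' % commit_id,
        "Last-Modified": formatdate(commit_time, usegmt=True),
    }


def is_not_modified(request_headers, commit_id, commit_time):
    """Return True if the client's cached copy is still valid (send 304)."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When If-None-Match is present it decides the outcome and
        # If-Modified-Since is ignored.
        tags = [tag.strip() for tag in if_none_match.split(",")]
        return "*" in tags or '"%s"' % commit_id in tags

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        parsed = parsedate_tz(if_modified_since)
        if parsed is None:
            # Unparseable date: fall back to generating the full page.
            return False
        return int(commit_time) <= mktime_tz(parsed)

    # No validators sent: this is an unconditional request.
    return False
```

A CGI would call is_not_modified() before doing any expensive page
generation and emit "Status: 304 Not Modified" with no body when it
returns True; otherwise it would generate the page and attach the
headers from conditional_headers().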