From: "J.H."
Subject: Re: [RFC PATCH v7 11/9] [PoC] gitweb/lib - tee, i.e. print and capture during cache entry generation
Date: Mon, 03 Jan 2011 15:31:58 -0800
Message-ID: <4D225C6E.9000108@eaglescrag.net>
References: <20101222234843.7998.87068.stgit@localhost.localdomain> <201101032233.16174.jnareb@gmail.com>
In-Reply-To: <201101032233.16174.jnareb@gmail.com>
To: Jakub Narebski
Cc: git@vger.kernel.org, "John 'Warthog9' Hawley"

On 01/03/2011 01:33 PM, Jakub Narebski wrote:
> Instead of having gitweb use progress info indicator / throbber to
> notify user that data is being generated by current process, gitweb
> can now (provided that PerlIO::tee from PerlIO::Util is available)
> send page to web browser while simultaneously saving it to cache
> (print and capture, i.e. tee), thus having incremental generating of
> page serve as a progress indicator.

In general, and particularly for the large sites that caching is
targeted at, teeing is a really bad idea. I've mentioned this several
times before, and the progress indicator is a *MUCH* better idea. I'm
not sure how many times I can say that. Even if this were added, it
would have the potential to exacerbate disk thrashing and would make
things a lot more complex overall.

1) Errors may still be generated in flight as the cache is being
generated. It would be better to let the cache run with a progress
indicator and, should an error occur, display the error instead of
whatever output had already been generated (which would likely be a
broken page).

2) Having multiple clients all waiting on the same page (in particular
the index page) can lead to invalid output. If you are teeing the
output, a reading client must come in, read the current contents of the
file (as written so far), then pick up on the tee after that. It's
actually possible for the reading client to miss data, since some of it
may be in flight to be written while the client is switching from
reading the file to reading the tee. I don't see anything in your code
to handle that kind of switchover.

3) This makes no allowance for the file to be generated completely in
the background while serving stale data in the interim. Keep in mind
that it can (as Fedora has experienced) take *HOURS* to generate the
index page; teeing that output just means brokenness and isn't useful.

It's much better to have a simple, lightweight waiting message
displayed while things happen.
When they are done, output the completed page to all waiting clients.

- John 'Warthog9' Hawley

P.S. I'm back to work full-time on Wednesday, at which point I'll be
catching up on gitweb and trying to make forward progress on my gitweb
code again.
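
For illustration only, point 1 and the closing suggestion above (generate
the whole page first, then hand either the finished page or the error to
the waiting clients) could be sketched roughly as follows. This is not
gitweb code: it is Python rather than Perl, and build_cache_entry,
cache_path and generate_chunks are hypothetical names chosen for the
example.

```python
import os
import tempfile


def build_cache_entry(cache_path, generate_chunks):
    """Generate a page into a temp file and publish it only when complete.

    Hypothetical helper: returns (True, page) on success or
    (False, error_message) on failure. Partial output is never
    visible at cache_path.
    """
    directory = os.path.dirname(cache_path) or "."
    # The temp file lives in the cache directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            for chunk in generate_chunks():
                tmp.write(chunk)
        # Publish the finished entry in one step.
        os.replace(tmp_path, cache_path)
    except Exception as exc:
        # Discard whatever was generated and report the error instead.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return False, f"error generating page: {exc}"
    with open(cache_path, encoding="utf-8") as cached:
        return True, cached.read()
```

In this sketch a client that is shown a waiting message would only ever
be handed the second element of the returned pair, so it sees either a
complete page or an error, never a half-written one.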