From mboxrd@z Thu Jan 1 00:00:00 1970
From: Johan Herland
Subject: Re: [PATCHv4 08/12] Teach the notes lookup code to parse notes trees with various fanout schemes
Date: Fri, 28 Aug 2009 16:15:48 +0200
Message-ID: <200908281615.49465.johan@herland.net>
References: <1251337437-16947-1-git-send-email-johan@herland.net> <200908281240.13311.johan@herland.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Cc: git@vger.kernel.org, Junio C Hamano, "Shawn O. Pearce", trast@student.ethz.ch, tavestbo@trolltech.com, git@drmicha.warpmail.net, chriscool@tuxfamily.org
To: Johannes Schindelin
X-From: git-owner@vger.kernel.org Fri Aug 28 16:17:43 2009
Return-path:
Envelope-to: gcvg-git-2@lo.gmane.org
Received: from vger.kernel.org ([209.132.176.167]) by lo.gmane.org with esmtp (Exim 4.50) id 1Mh2Gs-00050Y-Bz for gcvg-git-2@lo.gmane.org; Fri, 28 Aug 2009 16:17:42 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751189AbZH1ORb (ORCPT ); Fri, 28 Aug 2009 10:17:31 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1750785AbZH1ORa (ORCPT ); Fri, 28 Aug 2009 10:17:30 -0400
Received: from sam.opera.com ([213.236.208.81]:33028 "EHLO smtp.opera.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750757AbZH1ORa (ORCPT ); Fri, 28 Aug 2009 10:17:30 -0400
Received: from pc107.coreteam.oslo.opera.com (pat-tdc.opera.com [213.236.208.22]) (authenticated bits=0) by smtp.opera.com (8.13.4/8.13.4/Debian-3sarge3) with ESMTP id n7SEFnMs023863 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Fri, 28 Aug 2009 14:15:55 GMT
User-Agent: KMail/1.9.9
In-Reply-To:
Content-Disposition: inline
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

On Friday 28 August 2009, Johannes Schindelin wrote:
> On Fri, 28 Aug 2009, Johan Herland wrote:
> > On Friday 28 August 2009, Johannes Schindelin wrote:
> > > And I can easily imagine a repository that has a daily note
> > > generated by an automatic build, and no other notes. The
> > > date-based fan-out just wastes our time here, and even hurts
> > > performance.
> >
> > What about a month-based fanout?
>
> Well, I hoped to convince you that the date-based approach is too
> rigid. You basically cannot adapt the optimal data layout to the
> available data.
>
> (I like to think of this issue as related to storing deltas: we let
> Git choose relatively freely what to delta against, and do not force
> a delta against the parent commit like others do; I think it is
> pretty obvious that our approach is more powerful.)
>
> So the simplest (yet powerful-enough) way I could imagine is to teach
> the reading part to accept any fan-out (but that fan-out is really
> only based on the object name, nothing else), and to adjust the
> writing/merging part such that it has a maximum bin size (i.e. it
> starts a new fan-out whenever a tree object contains more than a
> config-specifyable limit).

I agree with your points on flexibility and not nailing down a
structure that might prove too rigid in the future. But it seems the
date-based approach might offer wins that an object-name-based
approach (flexible or not) simply cannot hope to match...

Also a rigid organization (with unique note locations) makes the
implementation simpler and faster: If you allow notes for a given
commit at several places in the notes tree (and require the result to
be the concatenation of those notes, which seems to be the saner
choice), the lookup procedure must keep looking even after it has
found the first match. This affects both runtime and memory
consumption negatively (more subtrees must be unpacked, etc.)

I guess I'll code up both alternatives so that we can get some actual
numbers...


...Johan

-- 
Johan Herland, www.herland.net
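
To make the lookup-cost argument above concrete, here is a minimal sketch in Python. It is not the code from the patch series. It assumes a notes tree modelled as nested dicts (a hypothetical stand-in for tree objects), where a dict value is a subtree, a string value is a note blob, and each entry name is a fragment of the object name. The function name `find_notes` and the shortened 8-digit object name are made up for illustration.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical model: entry name -> subtree (dict) or note text (str).
Tree = Dict[str, Union[str, "Tree"]]


def find_notes(tree: Tree, name: str) -> Tuple[List[str], int]:
    """Collect every note for `name` under any object-name-based fan-out.

    Returns (notes, trees_visited). Because notes may live at several
    places, the search cannot stop at the first match.
    """
    notes: List[str] = []
    visited = 1  # this tree had to be unpacked
    for entry, value in sorted(tree.items()):
        if isinstance(value, dict):
            # Subtree: descend only if its name is a proper prefix of
            # what is left of the object name.
            if name.startswith(entry) and len(entry) < len(name):
                sub_notes, sub_visited = find_notes(value, name[len(entry):])
                notes.extend(sub_notes)
                visited += sub_visited
        elif entry == name:
            # Blob whose name is exactly the rest of the object name.
            notes.append(value)
    return notes, visited


# The same (shortened, made-up) object name annotated at three depths.
example_tree: Tree = {
    "abcdef12": "note at top level\n",
    "ab": {
        "cdef12": "note one level down\n",
        "cd": {"ef12": "note two levels down\n"},
    },
    "ff": {"000000": "unrelated note\n"},
}

notes, visited = find_notes(example_tree, "abcdef12")
combined = "".join(notes)  # the concatenation discussed above
```

With unique note locations, a lookup could return as soon as it hits the first matching blob. In this sketch every matching subtree is unpacked (`visited` counts them), which is the extra runtime and memory cost in question.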