From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Shawn O. Pearce"
Subject: Re: pack operation is thrashing my server
Date: Wed, 13 Aug 2008 10:19:21 -0700
Message-ID: <20080813171921.GF3782@spearce.org>
References: <20080811030444.GC27195@spearce.org>
 <87vdy71i6w.fsf@basil.nowhere.org>
 <1EE44425-6910-4C37-9242-54D0078FC377@adacore.com>
 <20080813145944.GB3782@spearce.org>
 <20080813155016.GD3782@spearce.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: Geert Bosch, Andi Kleen, Ken Pratt, git@vger.kernel.org
To: Nicolas Pitre
X-From: git-owner@vger.kernel.org Wed Aug 13 19:20:28 2008
Return-path:
Envelope-to: gcvg-git-2@gmane.org
Received: from vger.kernel.org ([209.132.176.167])
 by lo.gmane.org with esmtp (Exim 4.50) id 1KTK1L-0000N5-O2
 for gcvg-git-2@gmane.org; Wed, 13 Aug 2008 19:20:28 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1751799AbYHMRTX (ORCPT ); Wed, 13 Aug 2008 13:19:23 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
 id S1751722AbYHMRTW (ORCPT ); Wed, 13 Aug 2008 13:19:22 -0400
Received: from george.spearce.org ([209.20.77.23]:50730
 "EHLO george.spearce.org" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1750869AbYHMRTV (ORCPT );
 Wed, 13 Aug 2008 13:19:21 -0400
Received: by george.spearce.org (Postfix, from userid 1001)
 id 543D738375; Wed, 13 Aug 2008 17:19:21 +0000 (UTC)
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.17+20080114 (2008-01-14)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

Nicolas Pitre wrote:
> On Wed, 13 Aug 2008, Shawn O. Pearce wrote:
> > Doing the object
> > enumeration is pointless as a security measure.
>
> It is good for network bandwidth efficiency as I mentioned.

The network bandwidth efficiency is the most valid argument for
the enumeration.

> > I'm too busy to write a pack concat implementation proposal
>
> A much better solution would consist of finding just _why_ object
> enumeration is so slow.  This is indeed my biggest gripe with git
> performance at the moment.
...
> |nico@xanadu:gcc> time git rev-list --objects --all > /dev/null
> |
> |real 1m51.591s
> |user 1m50.757s
> |sys 0m0.810s
>
> That's for 1267993 objects, or about 11400 objects/sec.
>
> Clearly something is not scaling here.

Yikes.

Last time I was looking at this sort of thing I think we spent
around 60% of our time dealing with inflating, patching and parsing
commit and tree objects.  pack v4's formatting spawned out of that
particular point, but we never really finished that.  It's been years,
so I can't trust my memory enough to say pack v4 is the solution to
this without redoing the profiling.  But I think that is what one
would find.

Though the decreasing objects/sec rate with increased total number
of objects suggests the object hash isn't scaling.

-- 
Shawn.