From mboxrd@z Thu Jan 1 00:00:00 1970
From: James Pickens
Subject: Re: Why Git is so fast
Date: Thu, 30 Apr 2009 18:25:21 -0700
Message-ID: <885649360904301825i40b6b7b7o9874ee3df2809a21@mail.gmail.com>
References: <46a038f90904270155i6c802fceoffc73eb5ab57130e@mail.gmail.com>
	<200904301728.06989.jnareb@gmail.com>
	<20090430185244.GR23604@spearce.org>
	<86iqkllw0c.fsf@broadpark.no>
	<20090430204033.GV23604@spearce.org>
	<8663gllt88.fsf@broadpark.no>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: Kjetil Barvik , "Shawn O. Pearce" , git@vger.kernel.org
To: Steven Noonan
X-From: git-owner@vger.kernel.org Fri May 01 03:25:33 2009
Return-path:
Envelope-to: gcvg-git-2@gmane.org
Received: from vger.kernel.org ([209.132.176.167])
	by lo.gmane.org with esmtp (Exim 4.50)
	id 1LzhVM-00030V-M4
	for gcvg-git-2@gmane.org; Fri, 01 May 2009 03:25:33 +0200
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753560AbZEABZX (ORCPT );
	Thu, 30 Apr 2009 21:25:23 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1753323AbZEABZW (ORCPT );
	Thu, 30 Apr 2009 21:25:22 -0400
Received: from yw-out-2324.google.com ([74.125.46.31]:64253 "EHLO
	yw-out-2324.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751054AbZEABZW (ORCPT );
	Thu, 30 Apr 2009 21:25:22 -0400
Received: by yw-out-2324.google.com with SMTP id 5so1250327ywb.1
	for ; Thu, 30 Apr 2009 18:25:21 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:in-reply-to:references
	 :date:message-id:subject:from:to:cc:content-type
	 :content-transfer-encoding;
	bh=4KgpImCi3I4ln+C9pRDrdJF7FX9Zz5wyXoSctW7c8aM=;
	b=p+SJDW1/eIVDZKcF1fxgchOhRxxRdZxrXPM5SYbN2ojwWE06qQPDSEoIYr76kBIGCi
	 trkWHSnhNM5xYpwm1+ImGd8Zn0GtC0I6tMzd/LBf/av/AZEv8bxiFkBLslekOiS9nChT
	 231nabPPA3kNRTili9GXVhETboFlsOFt7VFvo=
DomainKey-Signature: a=rsa-sha1; c=nofws;
	d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	 :cc:content-type:content-transfer-encoding;
	b=XjBPQBPkab4RZEnVgOYy+WYwnJMC9wrDwoDnla3ukoe3AwCKJ7xj/HCuZhowo6ys6s
	 XfYD27GO4iRwkrYKWs9QnHg/qNX7wZS/dGvIw1LZgx3iKJf2dhhEhiWFh89oX996JBZw
	 viuCaQ9TBnbf0Cd3ztTB7zcDPxy1SZfDWcJ9Y=
Received: by 10.151.134.5 with SMTP id l5mr4671786ybn.146.1241141121901;
	Thu, 30 Apr 2009 18:25:21 -0700 (PDT)
In-Reply-To:
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:

On Thu, Apr 30, 2009, Steven Noonan wrote:
> A bit off topic, but the results are rather interesting to me, and I
> think I see a weakness in how GCC is doing this on Intel. Someone
> please correct me if I'm wrong, but the PowerPC code seems much better
> because it can yield very high instruction-level parallelism. It does
> 5 loads and then 5 stores, using 4 registers for temporary storage and
> 2 registers for pointers.
>
> I realize the Intel x86 architecture is quite constrained in that it
> has so few general purpose registers, but there has to be better code
> than what GCC emitted above. It seems like the processor would stall
> because of the quantity of sequential inter-dependent instructions
> that can't be done in parallel (mov to memory that depends on a mov to
> eax, etc).

There aren't any unnecessary dependencies.  Take this sequence:

1: movl (%edx), %eax
2: movl %eax, (%ecx)
3: movl 4(%edx), %eax
4: movl %eax, 4(%ecx)

There are two unavoidable dependencies - #2 depends on #1, and #4
depends on #3.  #3 does not depend on #2, even though they both use
%eax, because #3 is a write to %eax.  So whatever was in %eax before
#3 is irrelevant.  The processor knows this and will use register
renaming to execute #1 and #3 in parallel, and #2 and #4 in parallel.

James
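
The dependency argument above can be checked with a small toy model. The
sketch below is only an illustration in Python, not a description of any
real processor: the names Insn, PROGRAM, true_dependencies and schedule are
made up for this example, and it assumes unlimited execution units, a
one-cycle latency for every instruction, and no memory aliasing between
the source and destination buffers. It tracks only read-after-write
dependencies on registers, which is what remains once renaming gives each
write to %eax its own physical register.

```python
from dataclasses import dataclass


@dataclass
class Insn:
    label: int
    reads: tuple   # architectural registers this instruction reads
    writes: tuple  # architectural registers this instruction writes


# The four moves from the message. Memory operands are not modelled
# (assumption: the loads and stores never touch the same address).
PROGRAM = [
    Insn(1, reads=("edx",), writes=("eax",)),   # movl (%edx), %eax
    Insn(2, reads=("eax", "ecx"), writes=()),   # movl %eax, (%ecx)
    Insn(3, reads=("edx",), writes=("eax",)),   # movl 4(%edx), %eax
    Insn(4, reads=("eax", "ecx"), writes=()),   # movl %eax, 4(%ecx)
]


def true_dependencies(program):
    """Map each label to the labels whose results it reads (read-after-write)."""
    last_writer = {}  # architectural register -> label of its latest writer
    deps = {}
    for insn in program:
        deps[insn.label] = {last_writer[r] for r in insn.reads if r in last_writer}
        for r in insn.writes:
            # A new write starts a fresh value; with renaming it gets its own
            # physical register, so earlier readers and writers do not block it.
            last_writer[r] = insn.label
    return deps


def schedule(program):
    """Earliest cycle for each instruction under the assumptions stated above."""
    deps = true_dependencies(program)
    cycle = {}
    for insn in program:
        cycle[insn.label] = 1 + max((cycle[d] for d in deps[insn.label]), default=0)
    return cycle


print(true_dependencies(PROGRAM))
print(schedule(PROGRAM))
```

In this model #2 depends only on #1 and #4 only on #3, so #1 and #3 land
in cycle 1 and #2 and #4 in cycle 2, matching the pairing described in
the message.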