From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mike Hommey Subject: Re: Why Git is so fast Date: Fri, 1 May 2009 11:34:27 +0200 Message-ID: <20090501093427.GA13264@glandium.org> References: <46a038f90904270155i6c802fceoffc73eb5ab57130e@mail.gmail.com> <200904301728.06989.jnareb@gmail.com> <20090430185244.GR23604@spearce.org> <86iqkllw0c.fsf@broadpark.no> <20090430204033.GV23604@spearce.org> <8663gllt88.fsf@broadpark.no> <864ow59o53.fsf@broadpark.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Steven Noonan , "Shawn O. Pearce" , git@vger.kernel.org To: Kjetil Barvik X-From: git-owner@vger.kernel.org Fri May 01 11:34:44 2009 Return-path: Envelope-to: gcvg-git-2@gmane.org Received: from vger.kernel.org ([209.132.176.167]) by lo.gmane.org with esmtp (Exim 4.50) id 1Lzp8k-00067M-7e for gcvg-git-2@gmane.org; Fri, 01 May 2009 11:34:42 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754045AbZEAJef (ORCPT ); Fri, 1 May 2009 05:34:35 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753645AbZEAJef (ORCPT ); Fri, 1 May 2009 05:34:35 -0400 Received: from vuizook.err.no ([85.19.221.46]:57771 "EHLO vuizook.err.no" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752880AbZEAJee (ORCPT ); Fri, 1 May 2009 05:34:34 -0400 Received: from cha92-13-88-165-248-19.fbx.proxad.net ([88.165.248.19] helo=jigen) by vuizook.err.no with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1Lzp8U-0007c8-FU; Fri, 01 May 2009 11:34:29 +0200 Received: from mh by jigen with local (Exim 4.69) (envelope-from ) id 1Lzp8V-0003Sy-4u; Fri, 01 May 2009 11:34:27 +0200 Content-Disposition: inline In-Reply-To: <864ow59o53.fsf@broadpark.no> X-GPG-Fingerprint: A479 A824 265C B2A5 FC54 8D1E DE4B DA2C 54FD 2A58 User-Agent: Mutt/1.5.18 (2008-05-17) X-Spam-Status: (score 0.1): No, score=0.1 required=5.0 tests=RDNS_DYNAMIC autolearn=disabled version=3.2.4 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: On Fri, May 01, 2009 at 11:19:04AM +0200, Kjetil Barvik wrote: > * Steven Noonan writes: > | On Thu, Apr 30, 2009 at 2:36 PM, Kjetil Barvik wrote: > |> * "Shawn O. Pearce" writes: > |> |> 4) The "static inline void hashcpy(....)" in cache.h could then > |> |> maybe be written like this: > |> | > |> | Its already done as "memcpy(a, b, 20)" which most compilers will > |> | inline and probably reduce to 5 word moves anyway. That's why > |> | hashcpy() itself is inline. > |> > |> But would the compiler be able to trust that the hashcpy() is always > |> called with correct word alignment on variables a and b? > > > > | Well, I just tested this with GCC myself. I used this segment of code: > | > | #include > | void hashcpy(unsigned char *sha_dst, const unsigned char *sha_src) > | { > | memcpy(sha_dst, sha_src, 20); > | } > > OK, here is a smal test, which maybe shows at least one difference > between using "unsigned char sha1[20]" and "unsigned long sha1[5]". > Given the following file, memcpy_test.c: > > #include > extern void hashcpy_uchar(unsigned char *sha_dst, const unsigned char *sha_src); > void hashcpy_uchar(unsigned char *sha_dst, const unsigned char *sha_src) > { > memcpy(sha_dst, sha_src, 20); > } > extern void hashcpy_ulong(unsigned long *sha_dst, const unsigned long *sha_src); > void hashcpy_ulong(unsigned long *sha_dst, const unsigned long *sha_src) > { > memcpy(sha_dst, sha_src, 5); > } > > And, compiled with the following: > > gcc -O2 -mtune=core2 -march=core2 -S -fomit-frame-pointer memcpy_test.c > > It produced the following memcpy_test.s file: > > .file "memcpy_test.c" > .text > .p2align 4,,15 > .globl hashcpy_ulong > .type hashcpy_ulong, @function > hashcpy_ulong: > movl 8(%esp), %edx > movl 4(%esp), %ecx > movl (%edx), %eax > movl %eax, (%ecx) > movzbl 4(%edx), %eax > movb %al, 4(%ecx) > ret > .size hashcpy_ulong, .-hashcpy_ulong > .p2align 4,,15 > .globl hashcpy_uchar > .type hashcpy_uchar, @function > hashcpy_uchar: > movl 8(%esp), %edx > movl 4(%esp), %ecx > movl (%edx), %eax > movl %eax, (%ecx) > movl 4(%edx), %eax > movl %eax, 4(%ecx) > movl 8(%edx), %eax > movl %eax, 8(%ecx) > movl 12(%edx), %eax > movl %eax, 12(%ecx) > movl 16(%edx), %eax > movl %eax, 16(%ecx) > ret > .size hashcpy_uchar, .-hashcpy_uchar > .ident "GCC: (Gentoo 4.3.3-r2 p1.1, pie-10.1.5) 4.3.3" > .section .note.GNU-stack,"",@progbits > > So, the "unsigned long" type hashcpy() used 7 instructions, compared > to 13 for the "unsigned char" type hascpy(). But your "unsigned long" version only copies 5 bytes... Mike