From mboxrd@z Thu Jan 1 00:00:00 1970
From: Linus Torvalds
Subject: Re: block-sha1: improve code on large-register-set machines
Date: Tue, 11 Aug 2009 08:04:29 -0700 (PDT)
To: Nicolas Pitre
Cc: Git Mailing List, Junio C Hamano

On Tue, 11 Aug 2009, Nicolas Pitre wrote:
>
> #define SHA_SRC(t) \
>    ({ unsigned char *__d = (unsigned char *)&data[t]; \
>       (__d[0] << 24) | (__d[1] << 16) | (__d[2] << 8) | (__d[3] << 0); })
>
> And this provides the exact same performance as the ntohl() based
> version (4.980s) except that this now cope with unaligned buffers too.

Is it better to do a (conditional) memcpy up front? Or is the byte-based
one better just because you always end up doing the shifting anyway due to
most ARM situations being little-endian?

I _suspect_ that most large SHA1 calls from git are pre-aligned. The big
SHA1 calls are for pack-file verification in fsck, which should all be
aligned. Same goes for index file integrity checking.

The actual object SHA1 calculations are likely not aligned (we do that
object header thing), and if you can't do the htonl() any better way I
guess the byte-based thing is the way to go..

		Linus

---
 block-sha1/sha1.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index 9bc8b8a..df27e66 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -25,6 +25,12 @@ void blk_SHA1_Init(blk_SHA_CTX *ctx)
 	ctx->H[4] = 0xc3d2e1f0;
 }
 
+#ifdef REALLY_SLOW_UNALIGNED
+  #define is_unaligned(ptr) (3 & (unsigned long)(ptr))
+#else
+  #define is_unaligned(ptr) 0
+#endif
+
 
 void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len)
 {
@@ -47,7 +53,12 @@ void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len)
 		blk_SHA1Block(ctx, ctx->W);
 	}
 	while (len >= 64) {
-		blk_SHA1Block(ctx, data);
+		const unsigned int *block = data;
+		if (is_unaligned(data)) {
+			memcpy(ctx->W, data, 64);
+			block = ctx->W;
+		}
+		blk_SHA1Block(ctx, block);
 		data += 64;
 		len -= 64;
 	}
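
For illustration only, and not part of the patch: a minimal standalone sketch of the two ways of reading a big-endian 32-bit word that this thread compares. The helper names load_be32_bytes() and load_be32_copy() are hypothetical, the function form of is_unaligned() is an adaptation of the macro (using uintptr_t rather than unsigned long), and ntohl() is assumed to come from the POSIX header <arpa/inet.h>.

```c
#include <arpa/inet.h>	/* ntohl(); assumes a POSIX system */
#include <stdint.h>
#include <string.h>

/*
 * Byte-based load, in the spirit of the quoted SHA_SRC() macro:
 * assembles the word from individual bytes, so it works at any
 * alignment and on either host endianness.
 */
uint32_t load_be32_bytes(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  | ((uint32_t)p[3] << 0);
}

/*
 * Copy-then-swap load: memcpy() moves the bytes into an aligned
 * temporary, then ntohl() converts from big-endian to host order.
 */
uint32_t load_be32_copy(const unsigned char *p)
{
	uint32_t w;

	memcpy(&w, p, sizeof(w));
	return ntohl(w);
}

/* Nonzero when ptr is not on a 4-byte boundary (same test as the macro). */
int is_unaligned(const void *ptr)
{
	return (3 & (uintptr_t)ptr) != 0;
}
```

Both loaders should return the same value for the same four input bytes, whatever the pointer's alignment. The patch applies the copy idea to a whole 64-byte block at a time, and only when is_unaligned() reports a misaligned pointer, so already-aligned callers skip the copy.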