From: Linus Torvalds <torvalds@linux-foundation.org>
To: Artur Skawina <art.08.09@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 0/7] block-sha1: improved SHA1 hashing
Date: Thu, 6 Aug 2009 13:53:11 -0700 (PDT)
Message-ID: <alpine.LFD.2.01.0908061329320.3390@localhost.localdomain>
In-Reply-To: <4A7B384C.2020407@gmail.com>
On Thu, 6 Aug 2009, Artur Skawina wrote:
>
> it's a bit slower (P4):
>
> before: linus 0.6288 97.06
> after: linus 0.6604 92.42
Hmm. Ok, I just tested with your harness, and I get
#          TIME[s]  SPEED[MB/s]
rfc3174    5.1      119.7
rfc3174    5.097    119.7
linus      1.836    332.5
linusas    2.006    304.3
linusas2   1.879    324.9
mozilla    5.562    109.7
mozillaas  5.913    103.2
openssl    1.613    378.5
spelvin    1.698    359.5
spelvina   1.602    381
nettle     1.594    382.9
with it, so it is faster for me. So your slowdown seems to be yet another
P4 thing. Dang crazy micro-architecture.
Of course, it might be a compiler version difference too. I'm using
gcc-4.4.0.
With the cpp variable renaming, the compiler really has less to be smart
about, but spill decisions will still matter a lot.
(My old 32-bit numbers were
linus 2.092 291.8
so it's a clear improvement on my machine and with my compiler).
It also seems to improve the 64-bit numbers a small bit, I'm getting
#          TIME[s]  SPEED[MB/s]
rfc3174    3.98     153.3
rfc3174    3.972    153.7
linus      1.514    403.1
linusas    1.555    392.6
linusas2   1.599    381.7
mozilla    4.34     140.6
mozillaas  4.223    144.5
with my 64-bit compile, so on a Nehalem it's the best one of the C ones by
a noticeable margin. (My original 64-bit numbers were
linus 1.54 396.3
and while the numbers seem to fluctuate a bit, the fluctuation is roughly
in the 1% range, so that improvement seems to be statistically
significant).
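To put rough numbers on the before/after comparisons quoted above, here is a minimal C sketch. The helper name time_reduction_pct is made up for illustration and is not part of the benchmark harness; it only computes the relative drop in run time from the TIME[s] column. A single pair of timings does not establish statistical significance by itself; the sketch merely shows how the gain compares with the roughly 1% fluctuation mentioned.

```c
/*
 * Hypothetical helper (not from the harness): percentage by which the
 * run time dropped when going from before_s to after_s seconds.
 */
static double time_reduction_pct(double before_s, double after_s)
{
	return (before_s - after_s) / before_s * 100.0;
}
```

With the figures from this mail, time_reduction_pct(2.092, 1.836) comes out a little over 12% for the 32-bit build, and time_reduction_pct(1.54, 1.514) comes out a bit under 2% for the 64-bit build.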
Oh, I did make a small change, but I doubt it matters. Instead of doing
TEMP += E + SHA_ROL(A,5) + (fn) + (constant); \
B = SHA_ROR(B, 2); E = TEMP; } while (0)
I now do
E += TEMP + SHA_ROL(A,5) + (fn) + (constant); \
B = SHA_ROR(B, 2); } while (0)
which is a bit more logical (the old TEMP usage was just due to a fairly
mindless conversion). That _might_ have lower register pressure if the
compiler is silly enough to not notice that it can do it. Maybe that
matters.
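A small sketch of why the two forms compute the same thing, assuming TEMP already holds the message word for the round when the statement runs. The names ROUND_OLD, ROUND_NEW and rounds_agree are invented for this illustration, and the SHA_ROL/SHA_ROR definitions here are plain-C stand-ins rather than the ones from the patch series.

```c
#include <stdint.h>

/* Plain-C rotate stand-ins (assumption: 32-bit words, 0 < n < 32). */
#define SHA_ROL(x, n) (((x) << (n)) | ((x) >> (32 - (n))))
#define SHA_ROR(x, n) (((x) >> (n)) | ((x) << (32 - (n))))

/* Old form: accumulate into TEMP, then copy TEMP back into E. */
#define ROUND_OLD(A, B, E, TEMP, fn, constant) do { \
	TEMP += E + SHA_ROL(A, 5) + (fn) + (constant); \
	B = SHA_ROR(B, 2); E = TEMP; } while (0)

/* New form: accumulate straight into E; TEMP is only read. */
#define ROUND_NEW(A, B, E, TEMP, fn, constant) do { \
	E += TEMP + SHA_ROL(A, 5) + (fn) + (constant); \
	B = SHA_ROR(B, 2); } while (0)

/*
 * Run one round both ways on copies of the same state and report
 * whether E and B end up identical. w stands in for the message word.
 */
static int rounds_agree(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
			uint32_t e, uint32_t w)
{
	uint32_t b1 = b, e1 = e, t1 = w;
	uint32_t b2 = b, e2 = e, t2 = w;

	/* Round-1 style function and constant, used only as an example. */
	ROUND_OLD(a, b1, e1, t1, (((c ^ d) & b1) ^ d), 0x5a827999u);
	ROUND_NEW(a, b2, e2, t2, (((c ^ d) & b2) ^ d), 0x5a827999u);

	return e1 == e2 && b1 == b2;
}
```

Since unsigned addition is commutative and wraps modulo 2^32, both forms leave the same value in E. The only difference is that the old form also overwrites TEMP, which is one more live value the compiler has to see through.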
Linus