From: Linus Torvalds <torvalds@linux-foundation.org>
To: Artur Skawina <art.08.09@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 0/7] block-sha1: improved SHA1 hashing
Date: Thu, 6 Aug 2009 14:24:30 -0700 (PDT)
Message-ID: <alpine.LFD.2.01.0908061406330.3390@localhost.localdomain>
In-Reply-To: <alpine.LFD.2.01.0908061329320.3390@localhost.localdomain>



On Thu, 6 Aug 2009, Linus Torvalds wrote:
> 
> Hmm. Ok, I just tested with your harness, and I get
> 
> 	#             TIME[s] SPEED[MB/s]
> 	rfc3174           5.1       119.7
> 	rfc3174         5.097       119.7
> 	linus           1.836       332.5
> 	linusas         2.006       304.3
> 	linusas2        1.879       324.9
> 	mozilla         5.562       109.7
> 	mozillaas       5.913       103.2
> 	openssl         1.613       378.5
> 	spelvin         1.698       359.5
> 	spelvina        1.602         381
> 	nettle          1.594       382.9

On atom, I get things like:

	#             TIME[s] SPEED[MB/s]
	rfc3174         2.186       27.92
	rfc3174         2.186       27.92
	linus          0.9492        64.3
	linusas        0.9656       63.21
	linusas2        1.012       60.29
	mozilla         2.492       24.49
	mozillaas         2.5       24.41
	openssl        0.6411        95.2
	spelvin        0.6052       100.8
	spelvina       0.6655       91.71
	nettle         0.7149       85.37

but quite frankly, those timings aren't stable enough to say anything. 
Another few runs got me:

	#             TIME[s] SPEED[MB/s]
	rfc3174         2.207       27.65
	rfc3174          2.21       27.62
	linus           1.022       59.74
	linusas         1.058        57.7
	linusas2        1.008       60.58
	mozilla         2.485       24.56
	mozillaas       2.522        24.2
	openssl        0.6421       95.06
	spelvin        0.5989       101.9
	spelvina       0.6638       91.94
	nettle         0.7132       85.58

	#             TIME[s] SPEED[MB/s]
	rfc3174         2.224       27.44
	rfc3174         2.205       27.68
	linus          0.9727       62.75
	linusas        0.9766        62.5
	linusas2        1.026        59.5
	mozilla          2.52       24.22
	mozillaas       2.547       23.96
	openssl        0.6459        94.5
	spelvin        0.6074       100.5
	spelvina       0.6751       90.41
	nettle         0.7254       84.14

so whatever differences there are between linus*, they seem to be in the 
noise, and the hand-scheduled asm beats all the C versions senseless.

I'd like to get closer to the hand-tuned ones, but I don't see anything
left to do. It's all about gcc register choice and avoiding spilling, so
small changes in compiler flags can make _huge_ differences in
performance. Here are the Atom numbers with gcc given the "-Os" flag (just
because I wanted to try):

	#             TIME[s] SPEED[MB/s]
	linus           1.072       56.94
	linusas        0.9573       63.76
	linusas2       0.9906       61.61

Why did 'linus' numbers go down? No idea. With -O3, it's the other way 
around:

	#             TIME[s] SPEED[MB/s]
	linus          0.9537          64
	linusas        0.9566        63.8
	linusas2        1.013       60.26

but again, there's enough variation that I'd probably need to do ten runs
just to see how much is noise. But the "linusas2 sucks with -O3" thing is
clear, as is the "linus sucks with -Os" thing. Very odd, and very random.

		Linus
