From: Artur Skawina <art.08.09@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 0/7] block-sha1: improved SHA1 hashing
Date: Fri, 07 Aug 2009 01:19:13 +0200
Message-ID: <4A7B64F1.2000309@gmail.com>
In-Reply-To: <alpine.LFD.2.01.0908061531310.3390@localhost.localdomain>

Linus Torvalds wrote:
> 
> Yeah, verified. Google for
> 
> 	northwood "barrel shifter"
> 
> and you'll find a lot of it.
> 
> Basically, older P4's will I think shift one bit at a time. So while even 
> Prescott is relatively weak in the shifter department, pre-prescott 
> (Willamette and Northwood) are _really_ weak. If your P4 is one of those, 
> you really shouldn't use it to decide on optimizations.

Actually that's even more of a reason to make sure the code doesn't suck :)
The difference on less perverse cpus will usually be small, but on P4 it
can be huge.

A few years back I found my old IP checksum microbenchmark, and when I ran
it on a P4 (Prescott, iirc) I didn't believe my eyes. The straightforward
32-bit C implementation was running circles around the in-kernel one...
And a few tweaks to the assembler version got me another ~100% speedup.[1]

After that the P4 became the very first cpu to test any code on... :)

artur

[1] Just reran the benchmark on this P4; true on Northwood too:

IACCK 0.9.30  Artur Skawina <...>
[ exec time; lower is better  ] [speed ] [ time ]  [ok?]
TIME-N+S TIME32 TIME33 TIME1480 MBYTES/S TIMEXXXX  CSUM FUNCTION ( rdtsc_overhead=0  null=0 )
   17901    510    557     3010   393.36    59772  56dd csum_partial_cdumb16
    3019    154    156      431  2747.10    43106  56dd csum_partial_c32
    2413    170    177      328  3609.76    37501  56dd csum_partial_c32l
    2437    170    170      328  3609.76    37488  56dd csum_partial_c32i
    5078    205    254      767  1543.68    48117  56dd csum_partial_std
    5612    299    291      851  1391.30    53673  56dd csum_partial_686
    1584     99    127      227  5215.86    14495  56dd csum_partial_586f
    1738    107    121      229  5170.31    14785  56dd csum_partial_586fs
    4893    175    171      759  1559.95    52347  56dd csum_partial_copy_generic_std
    4949    151    189      756  1566.14    67847  56dd csum_partial_copy_generic_686
    2072    110    134      302  3920.53    39061  56dd csum_partial_copy_generic_p4as1
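
For readers without the benchmark source, here is a minimal sketch of what a
"dumb" 16-bit checksum and a straightforward 32-bit C checksum could look
like, assuming the usual RFC 1071 one's-complement sum. The names fold16,
csum16_sketch and csum32_sketch are hypothetical; they are not the IACCK or
kernel functions listed above, and unlike the kernel's csum_partial they take
no initial sum and return an already folded 16-bit value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a wide one's-complement accumulator down to 16 bits
 * by repeatedly adding the carries back in (end-around carry). */
static uint16_t fold16(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Reference version: add one 16-bit word per iteration.
 * Words are loaded in native byte order via memcpy, so unaligned
 * buffers are fine. */
static uint16_t csum16_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t sum = 0;
    uint16_t w;

    for (; len >= 2; p += 2, len -= 2) {
        memcpy(&w, p, 2);
        sum += w;
    }
    if (len) {                  /* trailing odd byte, zero padded */
        w = 0;
        memcpy(&w, p, 1);
        sum += w;
    }
    return fold16(sum);
}

/* Same sum, but loading 32 bits per iteration. Because 2^16 is
 * congruent to 1 modulo 0xffff, adding a 32-bit word is equivalent
 * to adding its two 16-bit halves once the result is folded. */
static uint16_t csum32_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t sum = 0;           /* 64-bit accumulator: carries cannot be lost */
    uint32_t w;

    for (; len >= 4; p += 4, len -= 4) {
        memcpy(&w, p, 4);
        sum += w;
    }
    if (len) {                  /* 1..3 trailing bytes, zero padded */
        w = 0;
        memcpy(&w, p, len);
        sum += w;
    }
    return fold16(sum);
}
```

Both functions return the (uncomplemented) sum in native byte order, so the
two bytes of the result, as stored in memory, are the same on little-endian
and big-endian machines. The 32-bit variant does half as many loads and adds
and needs no shifts in its main loop.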
