public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "David Schwartz" <davids@webmaster.com>
To: "David S. Miller" <davem@redhat.com>, <willy@w.ods.org>
Cc: <linux-kernel@vger.kernel.org>
Subject: RE: Algoritmic Complexity Attacks and 2.4.20 the dcache code
Date: Sat, 31 May 2003 01:58:09 -0700	[thread overview]
Message-ID: <MDEHLPKNGKAHNMBLJOLKAECGDFAA.davids@webmaster.com> (raw)
In-Reply-To: <20030531.011210.34750891.davem@redhat.com>


> __jhash_mix takes ~23 cycles on sparc64 in the original version for
> me.  I get the same measurement for your version.  Maybe your gcc
> version just stinks :-(

> Oh wait, yes, it's the memory operations it can't eliminate.
> It can't do that because it has no idea whether certain pointers
> alias or not.  (ie. it doesn't know whether 'a' and 'b' point
> to the same value)

	It's a macro, and the way it inlines, it should be obvious in most cases
that 'a', 'b', and 'c' can't be in the same place in memory.

	I've been messing with optimizing the Jenkins hash lately. I've found two
interesting optimizations.

	One is to check if the input pointer happens to be aligned on a double-word
boundary, and if so  optimize the inner loop to use dword fetches. Even most
strings, assuming you hash from the beginning, will be aligned on a dword
boundary. For very short strings, this has no effect. The longer the string,
the more of an affect this has. Basically, you change this:

 // handle most of the key
 while(len >= 12)
 {
  a += ((unsigned)ptr[0] + ((unsigned)ptr[1]<<8) +
       ((unsigned)ptr[2]<<16) + ((unsigned)ptr[3]<<24));
  b += ((unsigned)ptr[4] + ((unsigned)ptr[5]<<8) +
       ((unsigned)ptr[6]<<16) + ((unsigned)ptr[7]<<24));
  c += ((unsigned)ptr[8] + ((unsigned)ptr[9]<<8) +
       ((unsigned)ptr[10]<<16) + ((unsigned)ptr[11]<<24));
  J_MIX(a, b, c);
  ptr += 12;
  len -= 12;
 }

	to:

 // handle most of the key
 if( (((unsigned)data)&3) == 0)
 {
  while(len >= 12)
  {
   a += *  (unsigned *) ptr;
   b += *(((unsigned *) ptr)+1);
   c += *(((unsigned *) ptr)+2);
   J_MIX(a, b, c);
   ptr += 12;
   len -= 12;
  }
 }
 else
 {
  while(len >= 12)
  {
   a += ((unsigned)ptr[0] + ((unsigned)ptr[1]<<8) +
        ((unsigned)ptr[2]<<16) + ((unsigned)ptr[3]<<24));
   b += ((unsigned)ptr[4] + ((unsigned)ptr[5]<<8) +
        ((unsigned)ptr[6]<<16) + ((unsigned)ptr[7]<<24));
   c += ((unsigned)ptr[8] + ((unsigned)ptr[9]<<8) +
        ((unsigned)ptr[10]<<16) + ((unsigned)ptr[11]<<24));
   J_MIX(a, b, c);
   ptr += 12;
   len -= 12;
  }
 }

	How much this helps depends upon how often your inputs are dword aligned.
If you're hashing strings from the beginning, they almost always are.

	The other is to rework the switch/case statement into a tree of 'if(len>0)
{ len--; ...'. You have to reverse the order of the additions, basically
changing:

 switch(len) // all the case statements fall through
 {
  case 11: c+=((unsigned)ptr[10]<<24);
  case 10: c+=((unsigned)ptr[9]<<16);
  case 9 : c+=((unsigned)ptr[8]<<8);
    /* the first byte of c is reserved for the length */
  case 8 : b+=((unsigned)ptr[7]<<24);
  case 7 : b+=((unsigned)ptr[6]<<16);
  case 6 : b+=((unsigned)ptr[5]<<8);
  case 5 : b+=ptr[4];
  case 4 : a+=((unsigned)ptr[3]<<24);
  case 3 : a+=((unsigned)ptr[2]<<16);
  case 2 : a+=((unsigned)ptr[1]<<8);
  case 1 : a+=ptr[0];
//  case 0 : // nothing left to add
 }

	to

 if(len!=0) { len--; a+=ptr[0];
 if(len!=0) { len--; a+=((unsigned)ptr[1]<<8);
 if(len!=0) { len--; a+=((unsigned)ptr[2]<<16);
 if(len!=0) { len--; a+=((unsigned)ptr[3]<<24);
 if(len!=0) { len--; b+=ptr[4];
 if(len!=0) { len--; b+=((unsigned)ptr[5]<<8);
 if(len!=0) { len--; b+=((unsigned)ptr[6]<<16);
 if(len!=0) { len--; b+=((unsigned)ptr[7]<<24);
 if(len!=0) { len--; c+=((unsigned)ptr[8]<<8);
 if(len!=0) { len--; c+=((unsigned)ptr[9]<<16);
 if(len!=0) { len--; c+=((unsigned)ptr[10]<<24);
 } } } } } } } } } } }

	Yeah, it's uglier, and you'd think the extra conditionals would slow things
down, but actually, the branches predict better and the comparisons are
cheaper. The original switch/case statement translates into a
subtract/compare for nothing (to see if len>=12) followed by a totally
unpredictable indirect jump. With my version, the most frequently executed
comparisons are the easiest to predict (assuming random distribution of
lengths).

	I've confirmed by benchmark that both of these are in fact optimizations
(on my processor/compiler/options/etcetera). YMMV.

	DS


  parent reply	other threads:[~2003-05-31  8:44 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-05-29 20:42 Algoritmic Complexity Attacks and 2.4.20 the dcache code Scott A Crosby
2003-05-30  3:57 ` David S. Miller
2003-05-30  4:29   ` Ingo Molnar
2003-05-30 18:16     ` Timothy Miller
2003-05-30 18:53       ` Scott A Crosby
2003-05-30  5:04   ` Scott A Crosby
2003-05-30  6:24     ` David S. Miller
2003-05-30  6:46       ` Scott A Crosby
2003-05-30  6:56         ` David S. Miller
2003-05-30  8:59       ` Alex Riesen
2003-05-30  9:00         ` David S. Miller
2003-05-30 15:05           ` Scott A Crosby
2003-05-31  6:18             ` David S. Miller
2003-05-31  8:02               ` Willy TARREAU
2003-05-31  8:12                 ` David S. Miller
2003-05-31  8:56                   ` Willy Tarreau
2003-05-31  8:58                     ` David S. Miller
2003-05-31  8:58                   ` David Schwartz [this message]
2003-05-31  9:01                     ` David S. Miller
2003-05-31  6:30           ` William Lee Irwin III
2003-05-31  6:33             ` David S. Miller
2003-05-31  6:41               ` William Lee Irwin III
2003-05-31  6:45                 ` David S. Miller
2003-05-31 18:40                   ` Aaron Lehmann
2003-05-30  4:02 ` Ingo Molnar
2003-05-30  4:42   ` Scott A Crosby
2003-05-30  5:01     ` David S. Miller
2003-05-30 13:48 ` Nikita Danilov
2003-06-01  1:15 ` Daniel Phillips

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MDEHLPKNGKAHNMBLJOLKAECGDFAA.davids@webmaster.com \
    --to=davids@webmaster.com \
    --cc=davem@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=willy@w.ods.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox