From: Denys Vlasenko <vda.linux@googlemail.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: linux-crypto@vger.kernel.org
Subject: Re: [PATCH] do not unroll big stuff in twofish key setup if OPTIMIZE_FOR_SIZE
Date: Thu, 25 Oct 2007 19:33:55 +0100 [thread overview]
Message-ID: <200710251933.55853.vda.linux@googlemail.com> (raw)
In-Reply-To: <20071025010822.GB15154@gondor.apana.org.au>
[-- Attachment #1: Type: text/plain, Size: 785 bytes --]
On Thursday 25 October 2007 02:08, Herbert Xu wrote:
> On Wed, Oct 24, 2007 at 06:44:54PM +0100, Denys Vlasenko wrote:
> >
> > 7% slower key setup (see patch - there is a comment about it).
>
> Is it just the keying? If so please simply delete the unrolled
> version because keying is supposed to be a rare event.
In some crypto applications key setup takes much longer
than encryption itself (e.g. password check).
Not sure whether this kind of behavior ever happens in kernel,
though.
And second, there will be people which want speed at all costs
and they surely will see 7% speedup as significant.
I personally am in the -Os camp and prefer smaller code,
so I personally have no objections.
See attached patch.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
--
vda
[-- Attachment #2: twofish_common02.diff --]
[-- Type: text/x-diff, Size: 4773 bytes --]
--- linux-2.6.23.src/crypto/twofish_common0.c 2007-10-09 21:31:38.000000000 +0100
+++ linux-2.6.23.src/crypto/twofish_common.c 2007-10-25 19:28:28.000000000 +0100
@@ -655,84 +655,48 @@ int twofish_setkey(struct crypto_tfm *tf
CALC_SB256_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
- /* Calculate whitening and round subkeys. The constants are
- * indices of subkeys, preprocessed through q0 and q1. */
- CALC_K256 (w, 0, 0xA9, 0x75, 0x67, 0xF3);
- CALC_K256 (w, 2, 0xB3, 0xC6, 0xE8, 0xF4);
- CALC_K256 (w, 4, 0x04, 0xDB, 0xFD, 0x7B);
- CALC_K256 (w, 6, 0xA3, 0xFB, 0x76, 0xC8);
- CALC_K256 (k, 0, 0x9A, 0x4A, 0x92, 0xD3);
- CALC_K256 (k, 2, 0x80, 0xE6, 0x78, 0x6B);
- CALC_K256 (k, 4, 0xE4, 0x45, 0xDD, 0x7D);
- CALC_K256 (k, 6, 0xD1, 0xE8, 0x38, 0x4B);
- CALC_K256 (k, 8, 0x0D, 0xD6, 0xC6, 0x32);
- CALC_K256 (k, 10, 0x35, 0xD8, 0x98, 0xFD);
- CALC_K256 (k, 12, 0x18, 0x37, 0xF7, 0x71);
- CALC_K256 (k, 14, 0xEC, 0xF1, 0x6C, 0xE1);
- CALC_K256 (k, 16, 0x43, 0x30, 0x75, 0x0F);
- CALC_K256 (k, 18, 0x37, 0xF8, 0x26, 0x1B);
- CALC_K256 (k, 20, 0xFA, 0x87, 0x13, 0xFA);
- CALC_K256 (k, 22, 0x94, 0x06, 0x48, 0x3F);
- CALC_K256 (k, 24, 0xF2, 0x5E, 0xD0, 0xBA);
- CALC_K256 (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
- CALC_K256 (k, 28, 0x84, 0x8A, 0x54, 0x00);
- CALC_K256 (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+ /* CALC_K256/CALC_K192/CALC_K loops were unrolled.
+ * Unrolling produced x2.5 more code (+18k on i386),
+ * and speeded up key setup by 7%:
+ * unrolled: twofish_setkey/sec: 41128
+ * loop: twofish_setkey/sec: 38148
+ * CALC_K256: ~100 insns each
+ * CALC_K192: ~90 insns
+ * CALC_K: ~70 insns
+ */
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K256 (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K256 (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
} else if (key_len == 24) { /* 192-bit key */
/* Compute the S-boxes. */
for ( i = j = 0, k = 1; i < 256; i++, j += 2, k += 2 ) {
CALC_SB192_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
- /* Calculate whitening and round subkeys. The constants are
- * indices of subkeys, preprocessed through q0 and q1. */
- CALC_K192 (w, 0, 0xA9, 0x75, 0x67, 0xF3);
- CALC_K192 (w, 2, 0xB3, 0xC6, 0xE8, 0xF4);
- CALC_K192 (w, 4, 0x04, 0xDB, 0xFD, 0x7B);
- CALC_K192 (w, 6, 0xA3, 0xFB, 0x76, 0xC8);
- CALC_K192 (k, 0, 0x9A, 0x4A, 0x92, 0xD3);
- CALC_K192 (k, 2, 0x80, 0xE6, 0x78, 0x6B);
- CALC_K192 (k, 4, 0xE4, 0x45, 0xDD, 0x7D);
- CALC_K192 (k, 6, 0xD1, 0xE8, 0x38, 0x4B);
- CALC_K192 (k, 8, 0x0D, 0xD6, 0xC6, 0x32);
- CALC_K192 (k, 10, 0x35, 0xD8, 0x98, 0xFD);
- CALC_K192 (k, 12, 0x18, 0x37, 0xF7, 0x71);
- CALC_K192 (k, 14, 0xEC, 0xF1, 0x6C, 0xE1);
- CALC_K192 (k, 16, 0x43, 0x30, 0x75, 0x0F);
- CALC_K192 (k, 18, 0x37, 0xF8, 0x26, 0x1B);
- CALC_K192 (k, 20, 0xFA, 0x87, 0x13, 0xFA);
- CALC_K192 (k, 22, 0x94, 0x06, 0x48, 0x3F);
- CALC_K192 (k, 24, 0xF2, 0x5E, 0xD0, 0xBA);
- CALC_K192 (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
- CALC_K192 (k, 28, 0x84, 0x8A, 0x54, 0x00);
- CALC_K192 (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K192 (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K192 (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
} else { /* 128-bit key */
/* Compute the S-boxes. */
for ( i = j = 0, k = 1; i < 256; i++, j += 2, k += 2 ) {
CALC_SB_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
- /* Calculate whitening and round subkeys. The constants are
- * indices of subkeys, preprocessed through q0 and q1. */
- CALC_K (w, 0, 0xA9, 0x75, 0x67, 0xF3);
- CALC_K (w, 2, 0xB3, 0xC6, 0xE8, 0xF4);
- CALC_K (w, 4, 0x04, 0xDB, 0xFD, 0x7B);
- CALC_K (w, 6, 0xA3, 0xFB, 0x76, 0xC8);
- CALC_K (k, 0, 0x9A, 0x4A, 0x92, 0xD3);
- CALC_K (k, 2, 0x80, 0xE6, 0x78, 0x6B);
- CALC_K (k, 4, 0xE4, 0x45, 0xDD, 0x7D);
- CALC_K (k, 6, 0xD1, 0xE8, 0x38, 0x4B);
- CALC_K (k, 8, 0x0D, 0xD6, 0xC6, 0x32);
- CALC_K (k, 10, 0x35, 0xD8, 0x98, 0xFD);
- CALC_K (k, 12, 0x18, 0x37, 0xF7, 0x71);
- CALC_K (k, 14, 0xEC, 0xF1, 0x6C, 0xE1);
- CALC_K (k, 16, 0x43, 0x30, 0x75, 0x0F);
- CALC_K (k, 18, 0x37, 0xF8, 0x26, 0x1B);
- CALC_K (k, 20, 0xFA, 0x87, 0x13, 0xFA);
- CALC_K (k, 22, 0x94, 0x06, 0x48, 0x3F);
- CALC_K (k, 24, 0xF2, 0x5E, 0xD0, 0xBA);
- CALC_K (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
- CALC_K (k, 28, 0x84, 0x8A, 0x54, 0x00);
- CALC_K (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
}
return 0;
next prev parent reply other threads:[~2007-10-25 18:34 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-10-21 19:16 [PATCH] do not unroll big stuff in twofish key setup if OPTIMIZE_FOR_SIZE Denys Vlasenko
2007-10-23 6:07 ` Herbert Xu
2007-10-24 17:44 ` Denys Vlasenko
2007-10-25 1:08 ` Herbert Xu
2007-10-25 18:33 ` Denys Vlasenko [this message]
2007-10-26 8:24 ` Herbert Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200710251933.55853.vda.linux@googlemail.com \
--to=vda.linux@googlemail.com \
--cc=herbert@gondor.apana.org.au \
--cc=linux-crypto@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.