* [PATCH] do not unroll big stuff in twofish key setup if OPTIMIZE_FOR_SIZE
From: Denys Vlasenko @ 2007-10-21 19:16 UTC
To: Herbert Xu; +Cc: linux-crypto
Hello Herbert,

Currently the twofish cipher key setup code
has unrolled loops: approximately 70-100
instructions are repeated 40 times.

As a result, the twofish module is the biggest module
in crypto/*.

The attached patch keeps the unrolled code only when
CONFIG_CC_OPTIMIZE_FOR_SIZE is not set. Presumably, people who
want to use -Os will also prefer not to have these loops
unrolled:

$ size */twofish_common.o
   text    data     bss     dec     hex filename
  37920       0       0   37920    9420 crypto.org/twofish_common.o
  13209       0       0   13209    3399 crypto/twofish_common.o

Run tested (modprobe tcrypt reports ok). Please apply.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
--
vda
[-- Attachment #2: twofish_common.diff --]
[-- Type: text/x-diff, Size: 2741 bytes --]
--- linux-2.6.23.crypto/crypto/twofish_common0.c Sun Oct 21 18:30:14 2007
+++ linux-2.6.23.crypto/crypto/twofish_common.c Sun Oct 21 18:17:36 2007
@@ -655,6 +655,23 @@
CALC_SB256_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
+ /* Unrolling produces x2.5 more code (+18k on i386),
+ * and speeds up key setup by 7%:
+ * unrolled: twofish_setkey/sec: 41128
+ * loop: twofish_setkey/sec: 38148
+ * CALC_K256: ~100 insns each
+ * CALC_K192: ~90 insns
+ * CALC_K: ~70 insns
+ */
+#ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K256 (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K256 (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
+#else
/* Calculate whitening and round subkeys. The constants are
* indices of subkeys, preprocessed through q0 and q1. */
CALC_K256 (w, 0, 0xA9, 0x75, 0x67, 0xF3);
@@ -677,12 +694,22 @@
CALC_K256 (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
CALC_K256 (k, 28, 0x84, 0x8A, 0x54, 0x00);
CALC_K256 (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+#endif
} else if (key_len == 24) { /* 192-bit key */
/* Compute the S-boxes. */
for ( i = j = 0, k = 1; i < 256; i++, j += 2, k += 2 ) {
CALC_SB192_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
+#ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K192 (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K192 (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
+#else
/* Calculate whitening and round subkeys. The constants are
* indices of subkeys, preprocessed through q0 and q1. */
CALC_K192 (w, 0, 0xA9, 0x75, 0x67, 0xF3);
@@ -705,12 +732,22 @@
CALC_K192 (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
CALC_K192 (k, 28, 0x84, 0x8A, 0x54, 0x00);
CALC_K192 (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+#endif
} else { /* 128-bit key */
/* Compute the S-boxes. */
for ( i = j = 0, k = 1; i < 256; i++, j += 2, k += 2 ) {
CALC_SB_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
+#ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
+#else
/* Calculate whitening and round subkeys. The constants are
* indices of subkeys, preprocessed through q0 and q1. */
CALC_K (w, 0, 0xA9, 0x75, 0x67, 0xF3);
@@ -733,6 +770,7 @@
CALC_K (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
CALC_K (k, 28, 0x84, 0x8A, 0x54, 0x00);
CALC_K (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+#endif
}
return 0;
* Re: [PATCH] do not unroll big stuff in twofish key setup if OPTIMIZE_FOR_SIZE
From: Herbert Xu @ 2007-10-23 6:07 UTC
To: Denys Vlasenko; +Cc: linux-crypto
On Sun, Oct 21, 2007 at 08:16:25PM +0100, Denys Vlasenko wrote:
> Hello Herbert,
>
> Currently the twofish cipher key setup code
> has unrolled loops: approximately 70-100
> instructions are repeated 40 times.
>
> As a result, the twofish module is the biggest module
> in crypto/*.
>
> The attached patch keeps the unrolled code only when
> CONFIG_CC_OPTIMIZE_FOR_SIZE is not set. Presumably, people who
> want to use -Os will also prefer not to have these loops
> unrolled:
Thanks for the patch Denys.
Have you looked at the performance figures on x86 for the two
variants? If the difference is small we could just get rid of
the unrolled version altogether.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH] do not unroll big stuff in twofish key setup if OPTIMIZE_FOR_SIZE
From: Denys Vlasenko @ 2007-10-24 17:44 UTC
To: Herbert Xu; +Cc: linux-crypto
On Tuesday 23 October 2007 07:07, Herbert Xu wrote:
> On Sun, Oct 21, 2007 at 08:16:25PM +0100, Denys Vlasenko wrote:
> > Hello Herbert,
> >
> > Currently the twofish cipher key setup code
> > has unrolled loops: approximately 70-100
> > instructions are repeated 40 times.
> >
> > As a result, the twofish module is the biggest module
> > in crypto/*.
> >
> > The attached patch keeps the unrolled code only when
> > CONFIG_CC_OPTIMIZE_FOR_SIZE is not set. Presumably, people who
> > want to use -Os will also prefer not to have these loops
> > unrolled:
>
> Thanks for the patch Denys.
>
> Have you looked at the performance figures on x86 for the two
> variants? If the difference is small we could just get rid of
> the unrolled version altogether.
7% slower key setup (see patch - there is a comment about it).
--
vda
* Re: [PATCH] do not unroll big stuff in twofish key setup if OPTIMIZE_FOR_SIZE
From: Herbert Xu @ 2007-10-25 1:08 UTC
To: Denys Vlasenko; +Cc: linux-crypto
On Wed, Oct 24, 2007 at 06:44:54PM +0100, Denys Vlasenko wrote:
>
> 7% slower key setup (see patch - there is a comment about it).
Is it just the keying? If so please simply delete the unrolled
version because keying is supposed to be a rare event.
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH] do not unroll big stuff in twofish key setup if OPTIMIZE_FOR_SIZE
From: Denys Vlasenko @ 2007-10-25 18:33 UTC
To: Herbert Xu; +Cc: linux-crypto
On Thursday 25 October 2007 02:08, Herbert Xu wrote:
> On Wed, Oct 24, 2007 at 06:44:54PM +0100, Denys Vlasenko wrote:
> >
> > 7% slower key setup (see patch - there is a comment about it).
>
> Is it just the keying? If so please simply delete the unrolled
> version because keying is supposed to be a rare event.

First, in some crypto applications key setup takes much longer
than the encryption itself (e.g. a password check).
I am not sure whether this kind of behavior ever happens in the kernel,
though.

Second, there will be people who want speed at all costs,
and they will surely see a 7% speedup as significant.

I personally am in the -Os camp and prefer smaller code,
so I have no objections.

See attached patch.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
--
vda
[-- Attachment #2: twofish_common02.diff --]
[-- Type: text/x-diff, Size: 4773 bytes --]
--- linux-2.6.23.src/crypto/twofish_common0.c 2007-10-09 21:31:38.000000000 +0100
+++ linux-2.6.23.src/crypto/twofish_common.c 2007-10-25 19:28:28.000000000 +0100
@@ -655,84 +655,48 @@ int twofish_setkey(struct crypto_tfm *tf
CALC_SB256_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
- /* Calculate whitening and round subkeys. The constants are
- * indices of subkeys, preprocessed through q0 and q1. */
- CALC_K256 (w, 0, 0xA9, 0x75, 0x67, 0xF3);
- CALC_K256 (w, 2, 0xB3, 0xC6, 0xE8, 0xF4);
- CALC_K256 (w, 4, 0x04, 0xDB, 0xFD, 0x7B);
- CALC_K256 (w, 6, 0xA3, 0xFB, 0x76, 0xC8);
- CALC_K256 (k, 0, 0x9A, 0x4A, 0x92, 0xD3);
- CALC_K256 (k, 2, 0x80, 0xE6, 0x78, 0x6B);
- CALC_K256 (k, 4, 0xE4, 0x45, 0xDD, 0x7D);
- CALC_K256 (k, 6, 0xD1, 0xE8, 0x38, 0x4B);
- CALC_K256 (k, 8, 0x0D, 0xD6, 0xC6, 0x32);
- CALC_K256 (k, 10, 0x35, 0xD8, 0x98, 0xFD);
- CALC_K256 (k, 12, 0x18, 0x37, 0xF7, 0x71);
- CALC_K256 (k, 14, 0xEC, 0xF1, 0x6C, 0xE1);
- CALC_K256 (k, 16, 0x43, 0x30, 0x75, 0x0F);
- CALC_K256 (k, 18, 0x37, 0xF8, 0x26, 0x1B);
- CALC_K256 (k, 20, 0xFA, 0x87, 0x13, 0xFA);
- CALC_K256 (k, 22, 0x94, 0x06, 0x48, 0x3F);
- CALC_K256 (k, 24, 0xF2, 0x5E, 0xD0, 0xBA);
- CALC_K256 (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
- CALC_K256 (k, 28, 0x84, 0x8A, 0x54, 0x00);
- CALC_K256 (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+ /* CALC_K256/CALC_K192/CALC_K loops were unrolled.
+ * Unrolling produced x2.5 more code (+18k on i386),
+ * and speeded up key setup by 7%:
+ * unrolled: twofish_setkey/sec: 41128
+ * loop: twofish_setkey/sec: 38148
+ * CALC_K256: ~100 insns each
+ * CALC_K192: ~90 insns
+ * CALC_K: ~70 insns
+ */
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K256 (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K256 (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
} else if (key_len == 24) { /* 192-bit key */
/* Compute the S-boxes. */
for ( i = j = 0, k = 1; i < 256; i++, j += 2, k += 2 ) {
CALC_SB192_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
- /* Calculate whitening and round subkeys. The constants are
- * indices of subkeys, preprocessed through q0 and q1. */
- CALC_K192 (w, 0, 0xA9, 0x75, 0x67, 0xF3);
- CALC_K192 (w, 2, 0xB3, 0xC6, 0xE8, 0xF4);
- CALC_K192 (w, 4, 0x04, 0xDB, 0xFD, 0x7B);
- CALC_K192 (w, 6, 0xA3, 0xFB, 0x76, 0xC8);
- CALC_K192 (k, 0, 0x9A, 0x4A, 0x92, 0xD3);
- CALC_K192 (k, 2, 0x80, 0xE6, 0x78, 0x6B);
- CALC_K192 (k, 4, 0xE4, 0x45, 0xDD, 0x7D);
- CALC_K192 (k, 6, 0xD1, 0xE8, 0x38, 0x4B);
- CALC_K192 (k, 8, 0x0D, 0xD6, 0xC6, 0x32);
- CALC_K192 (k, 10, 0x35, 0xD8, 0x98, 0xFD);
- CALC_K192 (k, 12, 0x18, 0x37, 0xF7, 0x71);
- CALC_K192 (k, 14, 0xEC, 0xF1, 0x6C, 0xE1);
- CALC_K192 (k, 16, 0x43, 0x30, 0x75, 0x0F);
- CALC_K192 (k, 18, 0x37, 0xF8, 0x26, 0x1B);
- CALC_K192 (k, 20, 0xFA, 0x87, 0x13, 0xFA);
- CALC_K192 (k, 22, 0x94, 0x06, 0x48, 0x3F);
- CALC_K192 (k, 24, 0xF2, 0x5E, 0xD0, 0xBA);
- CALC_K192 (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
- CALC_K192 (k, 28, 0x84, 0x8A, 0x54, 0x00);
- CALC_K192 (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K192 (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K192 (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
} else { /* 128-bit key */
/* Compute the S-boxes. */
for ( i = j = 0, k = 1; i < 256; i++, j += 2, k += 2 ) {
CALC_SB_2( i, calc_sb_tbl[j], calc_sb_tbl[k] );
}
- /* Calculate whitening and round subkeys. The constants are
- * indices of subkeys, preprocessed through q0 and q1. */
- CALC_K (w, 0, 0xA9, 0x75, 0x67, 0xF3);
- CALC_K (w, 2, 0xB3, 0xC6, 0xE8, 0xF4);
- CALC_K (w, 4, 0x04, 0xDB, 0xFD, 0x7B);
- CALC_K (w, 6, 0xA3, 0xFB, 0x76, 0xC8);
- CALC_K (k, 0, 0x9A, 0x4A, 0x92, 0xD3);
- CALC_K (k, 2, 0x80, 0xE6, 0x78, 0x6B);
- CALC_K (k, 4, 0xE4, 0x45, 0xDD, 0x7D);
- CALC_K (k, 6, 0xD1, 0xE8, 0x38, 0x4B);
- CALC_K (k, 8, 0x0D, 0xD6, 0xC6, 0x32);
- CALC_K (k, 10, 0x35, 0xD8, 0x98, 0xFD);
- CALC_K (k, 12, 0x18, 0x37, 0xF7, 0x71);
- CALC_K (k, 14, 0xEC, 0xF1, 0x6C, 0xE1);
- CALC_K (k, 16, 0x43, 0x30, 0x75, 0x0F);
- CALC_K (k, 18, 0x37, 0xF8, 0x26, 0x1B);
- CALC_K (k, 20, 0xFA, 0x87, 0x13, 0xFA);
- CALC_K (k, 22, 0x94, 0x06, 0x48, 0x3F);
- CALC_K (k, 24, 0xF2, 0x5E, 0xD0, 0xBA);
- CALC_K (k, 26, 0x8B, 0xAE, 0x30, 0x5B);
- CALC_K (k, 28, 0x84, 0x8A, 0x54, 0x00);
- CALC_K (k, 30, 0xDF, 0xBC, 0x23, 0x9D);
+ /* Calculate whitening and round subkeys */
+ for ( i = 0; i < 8; i += 2 ) {
+ CALC_K (w, i, q0[i], q1[i], q0[i+1], q1[i+1]);
+ }
+ for ( i = 0; i < 32; i += 2 ) {
+ CALC_K (k, i, q0[i+8], q1[i+8], q0[i+9], q1[i+9]);
+ }
}
return 0;
* Re: [PATCH] do not unroll big stuff in twofish key setup if OPTIMIZE_FOR_SIZE
From: Herbert Xu @ 2007-10-26 8:24 UTC
To: Denys Vlasenko; +Cc: linux-crypto
On Thu, Oct 25, 2007 at 07:33:55PM +0100, Denys Vlasenko wrote:
>
> See attached patch.
>
> Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Patch applied. Thanks a lot Denys.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt