* [PATCH] crypto: make michael_block a function
@ 2008-05-16  6:17 Harvey Harrison
  2008-05-16  7:01 ` Sebastian Siewior
  0 siblings, 1 reply; 5+ messages in thread
From: Harvey Harrison @ 2008-05-16  6:17 UTC
  To: Herbert Xu; +Cc: Johannes Berg, LKML

Make the michael_block macro a function and change the calling
function to take a struct michael_mic_ctx * and the value for
the initial xor with ctx->l.

Also open-code xswap in its one use in michael_block.
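
As an illustration of the refactor described above, here is a minimal
userspace sketch that runs the old macro form and the new function form
side by side and checks that they leave the same state.  The names
struct mic_state, michael_block_macro, michael_block_fn and
michael_forms_agree are made up for this sketch, and rol32/ror32 are
local stand-ins for the kernel helpers; none of this is kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's rol32()/ror32(); shift must be 1..31. */
static uint32_t rol32(uint32_t v, unsigned int s)
{
	assert(s > 0 && s < 32);
	return (v << s) | (v >> (32 - s));
}

static uint32_t ror32(uint32_t v, unsigned int s)
{
	assert(s > 0 && s < 32);
	return (v >> s) | (v << (32 - s));
}

static uint32_t xswap(uint32_t v)
{
	return ((v & 0x00ff00ff) << 8) | ((v & 0xff00ff00) >> 8);
}

/* Hypothetical stand-in for the l and r members of struct michael_mic_ctx. */
struct mic_state {
	uint32_t l, r;
};

/* Pre-patch form: the caller xors the data word into l before using this. */
#define michael_block_macro(l, r)	\
do {					\
	r ^= rol32(l, 17);		\
	l += r;				\
	r ^= xswap(l);			\
	l += r;				\
	r ^= rol32(l, 3);		\
	l += r;				\
	r ^= ror32(l, 2);		\
	l += r;				\
} while (0)

/* Post-patch form: the xor with the data word moves into the function. */
void michael_block_fn(struct mic_state *ctx, uint32_t val)
{
	ctx->l ^= val;
	ctx->r ^= rol32(ctx->l, 17);
	ctx->l += ctx->r;
	ctx->r ^= ((ctx->l & 0x00ff00ff) << 8) | ((ctx->l & 0xff00ff00) >> 8);
	ctx->l += ctx->r;
	ctx->r ^= rol32(ctx->l, 3);
	ctx->l += ctx->r;
	ctx->r ^= ror32(ctx->l, 2);
	ctx->l += ctx->r;
}

/* Feed n pseudo-random words through both forms; return 1 if the states always match. */
int michael_forms_agree(uint32_t seed, int n)
{
	struct mic_state a = { seed, ~seed };
	struct mic_state b = a;
	uint32_t w = seed;
	int i;

	for (i = 0; i < n; i++) {
		w = w * 1664525u + 1013904223u;	/* simple LCG, only for test input */
		a.l ^= w;
		michael_block_macro(a.l, a.r);
		michael_block_fn(&b, w);
		if (a.l != b.l || a.r != b.r)
			return 0;
	}
	return 1;
}
```

Calling michael_forms_agree() with any seed from a small test harness
is enough to exercise both forms.  The sketch says nothing about speed;
it only checks that the two forms compute the same thing.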

Some use of get_unaligned is probably still needed for the 32-bit loads;
that can come as an add-on patch.
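
To illustrate the alignment concern: the code casts a byte pointer to
const __le32 * and dereferences it, which can misbehave on
architectures that do not tolerate unaligned 32-bit loads.  In the
kernel the usual remedy would be one of the unaligned accessors (for
example get_unaligned_le32(), assuming the target tree provides it).
The sketch below is a userspace analogue only; load_le32 and
load_le32_at are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assemble a little-endian 32-bit word one byte at a time.  This works
 * for any pointer alignment and any host byte order.
 */
uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Bounds-checked variant: read the word that starts at byte offset off. */
uint32_t load_le32_at(const uint8_t *buf, size_t len, size_t off)
{
	assert(off <= len && len - off >= 4);
	return load_le32(buf + off);
}
```

A loop such as the one in michael_update could then read each word with
load_le32(data) and advance data by 4, instead of dereferencing a cast
pointer.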

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
---
 crypto/michael_mic.c |   55 +++++++++++++++++++++----------------------------
 1 files changed, 24 insertions(+), 31 deletions(-)

diff --git a/crypto/michael_mic.c b/crypto/michael_mic.c
index 9e917b8..792dbf9 100644
--- a/crypto/michael_mic.c
+++ b/crypto/michael_mic.c
@@ -31,19 +31,18 @@ static inline u32 xswap(u32 val)
 	return ((val & 0x00ff00ff) << 8) | ((val & 0xff00ff00) >> 8);
 }
 
-
-#define michael_block(l, r)	\
-do {				\
-	r ^= rol32(l, 17);	\
-	l += r;			\
-	r ^= xswap(l);		\
-	l += r;			\
-	r ^= rol32(l, 3);	\
-	l += r;			\
-	r ^= ror32(l, 2);	\
-	l += r;			\
-} while (0)
-
+static void michael_block(struct michael_mic_ctx *ctx, u32 val)
+{
+	ctx->l ^= val;
+	ctx->r ^= rol32(ctx->l, 17);
+	ctx->l += ctx->r;
+	ctx->r ^= ((ctx->l & 0x00ff00ff) << 8) | ((ctx->l & 0xff00ff00) >> 8);
+	ctx->l += ctx->r;
+	ctx->r ^= rol32(ctx->l, 3);
+	ctx->l += ctx->r;
+	ctx->r ^= ror32(ctx->l, 2);
+	ctx->l += ctx->r;
+}
 
 static void michael_init(struct crypto_tfm *tfm)
 {
@@ -71,16 +70,14 @@ static void michael_update(struct crypto_tfm *tfm, const u8 *data,
 			return;
 
 		src = (const __le32 *)mctx->pending;
-		mctx->l ^= le32_to_cpup(src);
-		michael_block(mctx->l, mctx->r);
+		michael_block(mctx, le32_to_cpup(src));
 		mctx->pending_len = 0;
 	}
 
 	src = (const __le32 *)data;
 
 	while (len >= 4) {
-		mctx->l ^= le32_to_cpup(src++);
-		michael_block(mctx->l, mctx->r);
+		michael_block(mctx, le32_to_cpup(src++));
 		len -= 4;
 	}
 
@@ -96,26 +93,22 @@ static void michael_final(struct crypto_tfm *tfm, u8 *out)
 	struct michael_mic_ctx *mctx = crypto_tfm_ctx(tfm);
 	u8 *data = mctx->pending;
 	__le32 *dst = (__le32 *)out;
+	u32 tmp;
 
 	/* Last block and padding (0x5a, 4..7 x 0) */
+	tmp = 0x5a;
 	switch (mctx->pending_len) {
-	case 0:
-		mctx->l ^= 0x5a;
-		break;
-	case 1:
-		mctx->l ^= data[0] | 0x5a00;
-		break;
-	case 2:
-		mctx->l ^= data[0] | (data[1] << 8) | 0x5a0000;
-		break;
 	case 3:
-		mctx->l ^= data[0] | (data[1] << 8) | (data[2] << 16) |
-			0x5a000000;
+		tmp = (tmp << 8) | data[2];
+	case 2:
+		tmp = (tmp << 8) | data[1];
+	case 1:
+		tmp = (tmp << 8) | data[0];
+	case 0:
 		break;
 	}
-	michael_block(mctx->l, mctx->r);
-	/* l ^= 0; */
-	michael_block(mctx->l, mctx->r);
+	michael_block(mctx, tmp);
+	michael_block(mctx, 0);
 
 	dst[0] = cpu_to_le32(mctx->l);
 	dst[1] = cpu_to_le32(mctx->r);
-- 
1.5.5.1.570.g26b5e
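
The michael_final hunk above replaces four explicit cases with a switch
that relies on fall-through to build the padded last word.  The
following userspace sketch compares the two ways of building that word.
The function names pad_word_cases and pad_word_fallthrough are invented
for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch form: one explicit expression per pending length (0..3). */
uint32_t pad_word_cases(const uint8_t *data, unsigned int pending_len)
{
	assert(pending_len < 4);
	switch (pending_len) {
	case 0:
		return 0x5a;
	case 1:
		return (uint32_t)data[0] | 0x5a00;
	case 2:
		return (uint32_t)data[0] | ((uint32_t)data[1] << 8) | 0x5a0000;
	default:
		return (uint32_t)data[0] | ((uint32_t)data[1] << 8) |
		       ((uint32_t)data[2] << 16) | 0x5a000000;
	}
}

/*
 * Post-patch form: start from the 0x5a pad byte and shift in the pending
 * bytes from last to first, so 0x5a ends up just above the data bytes.
 */
uint32_t pad_word_fallthrough(const uint8_t *data, unsigned int pending_len)
{
	uint32_t tmp = 0x5a;

	assert(pending_len < 4);
	switch (pending_len) {
	case 3:
		tmp = (tmp << 8) | data[2];
		/* fall through */
	case 2:
		tmp = (tmp << 8) | data[1];
		/* fall through */
	case 1:
		tmp = (tmp << 8) | data[0];
		/* fall through */
	case 0:
		break;
	}
	return tmp;
}
```

With pending bytes 0x11, 0x22, 0x33, both forms give 0x5a, 0x5a11,
0x5a2211 and 0x5a332211 for pending lengths 0 to 3.  The explicit
fall-through comments are an addition of this sketch; the patch itself
leaves the fall-through unannotated.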





* Re: [PATCH] crypto: make michael_block a function
  2008-05-16  6:17 [PATCH] crypto: make michael_block a function Harvey Harrison
@ 2008-05-16  7:01 ` Sebastian Siewior
  2008-05-16  8:10   ` Johannes Berg
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Siewior @ 2008-05-16  7:01 UTC
  To: Harvey Harrison; +Cc: Herbert Xu, Johannes Berg, LKML

* Harvey Harrison | 2008-05-15 23:17:17 [-0700]:

>Make the michael_block macro a function and change the calling
>function to take a struct michael_mic_ctx * and the value for
>the initial xor with ctx->l.
>
>Also open-code xswap in its one use in michael_block.
Does this change have any performance impact?
The only user is wireless (I guess). Is it used frequently (sign every
packet for instance) or once in a while (in every re-keying)?

Sebastian


* Re: [PATCH] crypto: make michael_block a function
  2008-05-16  7:01 ` Sebastian Siewior
@ 2008-05-16  8:10   ` Johannes Berg
  2008-05-16 15:52     ` Harvey Harrison
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Berg @ 2008-05-16  8:10 UTC
  To: Sebastian Siewior; +Cc: Harvey Harrison, Herbert Xu, LKML


On Fri, 2008-05-16 at 09:01 +0200, Sebastian Siewior wrote:
> * Harvey Harrison | 2008-05-15 23:17:17 [-0700]:
> 
> >Make the michael_block macro a function and change the calling
> >function to take a struct michael_mic_ctx * and the value for
> >the initial xor with ctx->l.
> >
> >Also open-code xswap in its one use in michael_block.
> Does this change have any performance impact?
> The only user is wireless (I guess). Is it used frequently (sign every
> packet for instance) or once in a while (in every re-keying)?

Every packet. I have no idea whether it has a performance impact; that is
very hard to even guess.

johannes



* Re: [PATCH] crypto: make michael_block a function
  2008-05-16  8:10   ` Johannes Berg
@ 2008-05-16 15:52     ` Harvey Harrison
  2008-05-18 21:57       ` Sebastian Siewior
  0 siblings, 1 reply; 5+ messages in thread
From: Harvey Harrison @ 2008-05-16 15:52 UTC
  To: Johannes Berg; +Cc: Sebastian Siewior, Herbert Xu, LKML, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1131 bytes --]

On Fri, 2008-05-16 at 10:10 +0200, Johannes Berg wrote:
> On Fri, 2008-05-16 at 09:01 +0200, Sebastian Siewior wrote:
> > * Harvey Harrison | 2008-05-15 23:17:17 [-0700]:
> > 
> > >Make the michael_block macro a function and change the calling
> > >function to take a struct michael_mic_ctx * and the value for
> > >the initial xor with ctx->l.
> > >
> > >Also open-code xswap in its one use in michael_block.
> > Does this change have any performance impact?
> > The only user is wireless (I guess). Is it used frequently (sign every
> > packet for instance) or once in a while (in every re-keying)?
> 
> Every packet. I have no idea whether it has a performance impact; that is
> very hard to even guess.
> 

Well, the code-size difference is significant (about 200 bytes smaller
on X86-32).  This macro is essentially an inline function that is pretty
large.

Attached is the objdump -d output for the original and the patched
object.  As it stands, I can't find any user of this code, since mac80211
has its own implementation.  I'm trying to see whether there is much
advantage in keeping the private version over moving to the crypto one.
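
The "about 200 bytes" figure can be checked against the two attached
listings.  In both, .text starts at offset 0 and ends with the one-byte
ret that closes michael_final (at 0x25a in the original, at 0x17d after
the patch).  The small sketch below does that arithmetic; the helper
names are invented here, and the addresses are read off the listings.

```c
#include <assert.h>

/*
 * Size of a section that starts at offset 0 and whose last instruction
 * starts at last_addr and is last_len bytes long.
 */
unsigned int section_size(unsigned int last_addr, unsigned int last_len)
{
	assert(last_len > 0);
	return last_addr + last_len;
}

/* .text bytes saved by the patch, using the final ret in each listing. */
unsigned int text_bytes_saved(void)
{
	unsigned int orig = section_size(0x25a, 1);	/* michael_mic.orig */
	unsigned int patched = section_size(0x17d, 1);	/* michael_mic.patched */

	return orig - patched;
}
```

That gives 603 bytes of .text before and 382 bytes after, a saving of
221 bytes.  The saving comes from michael_update and michael_final no
longer carrying their own copies of the block body; the 76-byte
michael_block function is added once.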

Harvey

[-- Attachment #2: michael_mic.orig --]
[-- Type: text/plain, Size: 11197 bytes --]


crypto/michael_mic.o:     file format elf32-i386

Disassembly of section .text:

00000000 <michael_setkey>:
   0:	83 f9 08             	cmp    $0x8,%ecx
   3:	53                   	push   %ebx
   4:	89 d3                	mov    %edx,%ebx
   6:	74 0d                	je     15 <michael_setkey+0x15>
   8:	81 08 00 00 20 00    	orl    $0x200000,(%eax)
   e:	b8 ea ff ff ff       	mov    $0xffffffea,%eax
  13:	eb 10                	jmp    25 <michael_setkey+0x25>
  15:	8b 12                	mov    (%edx),%edx
  17:	83 c0 30             	add    $0x30,%eax
  1a:	89 50 08             	mov    %edx,0x8(%eax)
  1d:	8b 53 04             	mov    0x4(%ebx),%edx
  20:	89 50 0c             	mov    %edx,0xc(%eax)
  23:	31 c0                	xor    %eax,%eax
  25:	5b                   	pop    %ebx
  26:	c3                   	ret    

00000027 <michael_init>:
  27:	c7 40 34 00 00 00 00 	movl   $0x0,0x34(%eax)
  2e:	c3                   	ret    

0000002f <michael_update>:
  2f:	55                   	push   %ebp
  30:	89 cd                	mov    %ecx,%ebp
  32:	57                   	push   %edi
  33:	56                   	push   %esi
  34:	53                   	push   %ebx
  35:	89 c3                	mov    %eax,%ebx
  37:	83 ec 08             	sub    $0x8,%esp
  3a:	83 c3 30             	add    $0x30,%ebx
  3d:	89 44 24 04          	mov    %eax,0x4(%esp)
  41:	89 14 24             	mov    %edx,(%esp)
  44:	8b 73 04             	mov    0x4(%ebx),%esi
  47:	85 f6                	test   %esi,%esi
  49:	0f 84 93 00 00 00    	je     e2 <michael_update+0xb3>
  4f:	b8 04 00 00 00       	mov    $0x4,%eax
  54:	29 f0                	sub    %esi,%eax
  56:	39 c8                	cmp    %ecx,%eax
  58:	89 c2                	mov    %eax,%edx
  5a:	0f 47 d1             	cmova  %ecx,%edx
  5d:	89 d1                	mov    %edx,%ecx
  5f:	8d 3c 33             	lea    (%ebx,%esi,1),%edi
  62:	8b 34 24             	mov    (%esp),%esi
  65:	c1 e9 02             	shr    $0x2,%ecx
  68:	f3 a5                	rep movsl %ds:(%esi),%es:(%edi)
  6a:	89 d1                	mov    %edx,%ecx
  6c:	83 e1 03             	and    $0x3,%ecx
  6f:	74 02                	je     73 <michael_update+0x44>
  71:	f3 a4                	rep movsb %ds:(%esi),%es:(%edi)
  73:	89 d0                	mov    %edx,%eax
  75:	03 43 04             	add    0x4(%ebx),%eax
  78:	83 f8 03             	cmp    $0x3,%eax
  7b:	89 43 04             	mov    %eax,0x4(%ebx)
  7e:	0f 86 d2 00 00 00    	jbe    156 <michael_update+0x127>
  84:	8b 44 24 04          	mov    0x4(%esp),%eax
  88:	29 d5                	sub    %edx,%ebp
  8a:	01 14 24             	add    %edx,(%esp)
  8d:	8b 53 08             	mov    0x8(%ebx),%edx
  90:	33 50 30             	xor    0x30(%eax),%edx
  93:	c7 43 04 00 00 00 00 	movl   $0x0,0x4(%ebx)
  9a:	89 53 08             	mov    %edx,0x8(%ebx)
  9d:	c1 ca 0f             	ror    $0xf,%edx
  a0:	33 53 0c             	xor    0xc(%ebx),%edx
  a3:	89 d0                	mov    %edx,%eax
  a5:	03 43 08             	add    0x8(%ebx),%eax
  a8:	89 c1                	mov    %eax,%ecx
  aa:	89 43 08             	mov    %eax,0x8(%ebx)
  ad:	81 e1 00 ff 00 ff    	and    $0xff00ff00,%ecx
  b3:	25 ff 00 ff 00       	and    $0xff00ff,%eax
  b8:	c1 e0 08             	shl    $0x8,%eax
  bb:	c1 e9 08             	shr    $0x8,%ecx
  be:	09 c1                	or     %eax,%ecx
  c0:	31 d1                	xor    %edx,%ecx
  c2:	89 ca                	mov    %ecx,%edx
  c4:	03 53 08             	add    0x8(%ebx),%edx
  c7:	89 53 08             	mov    %edx,0x8(%ebx)
  ca:	c1 ca 1d             	ror    $0x1d,%edx
  cd:	31 ca                	xor    %ecx,%edx
  cf:	89 d0                	mov    %edx,%eax
  d1:	03 43 08             	add    0x8(%ebx),%eax
  d4:	89 43 08             	mov    %eax,0x8(%ebx)
  d7:	c1 c8 02             	ror    $0x2,%eax
  da:	31 d0                	xor    %edx,%eax
  dc:	01 43 08             	add    %eax,0x8(%ebx)
  df:	89 43 0c             	mov    %eax,0xc(%ebx)
  e2:	8b 34 24             	mov    (%esp),%esi
  e5:	eb 54                	jmp    13b <michael_update+0x10c>
  e7:	8b 53 08             	mov    0x8(%ebx),%edx
  ea:	83 c6 04             	add    $0x4,%esi
  ed:	83 ed 04             	sub    $0x4,%ebp
  f0:	33 56 fc             	xor    -0x4(%esi),%edx
  f3:	89 53 08             	mov    %edx,0x8(%ebx)
  f6:	c1 ca 0f             	ror    $0xf,%edx
  f9:	33 53 0c             	xor    0xc(%ebx),%edx
  fc:	89 d0                	mov    %edx,%eax
  fe:	03 43 08             	add    0x8(%ebx),%eax
 101:	89 c1                	mov    %eax,%ecx
 103:	89 43 08             	mov    %eax,0x8(%ebx)
 106:	81 e1 00 ff 00 ff    	and    $0xff00ff00,%ecx
 10c:	25 ff 00 ff 00       	and    $0xff00ff,%eax
 111:	c1 e0 08             	shl    $0x8,%eax
 114:	c1 e9 08             	shr    $0x8,%ecx
 117:	09 c1                	or     %eax,%ecx
 119:	31 d1                	xor    %edx,%ecx
 11b:	89 ca                	mov    %ecx,%edx
 11d:	03 53 08             	add    0x8(%ebx),%edx
 120:	89 53 08             	mov    %edx,0x8(%ebx)
 123:	c1 ca 1d             	ror    $0x1d,%edx
 126:	31 ca                	xor    %ecx,%edx
 128:	89 d0                	mov    %edx,%eax
 12a:	03 43 08             	add    0x8(%ebx),%eax
 12d:	89 43 08             	mov    %eax,0x8(%ebx)
 130:	c1 c8 02             	ror    $0x2,%eax
 133:	31 d0                	xor    %edx,%eax
 135:	01 43 08             	add    %eax,0x8(%ebx)
 138:	89 43 0c             	mov    %eax,0xc(%ebx)
 13b:	83 fd 03             	cmp    $0x3,%ebp
 13e:	77 a7                	ja     e7 <michael_update+0xb8>
 140:	85 ed                	test   %ebp,%ebp
 142:	74 12                	je     156 <michael_update+0x127>
 144:	89 6b 04             	mov    %ebp,0x4(%ebx)
 147:	31 c9                	xor    %ecx,%ecx
 149:	89 df                	mov    %ebx,%edi
 14b:	f3 a5                	rep movsl %ds:(%esi),%es:(%edi)
 14d:	89 e9                	mov    %ebp,%ecx
 14f:	83 e1 03             	and    $0x3,%ecx
 152:	74 02                	je     156 <michael_update+0x127>
 154:	f3 a4                	rep movsb %ds:(%esi),%es:(%edi)
 156:	58                   	pop    %eax
 157:	5a                   	pop    %edx
 158:	5b                   	pop    %ebx
 159:	5e                   	pop    %esi
 15a:	5f                   	pop    %edi
 15b:	5d                   	pop    %ebp
 15c:	c3                   	ret    

0000015d <michael_final>:
 15d:	57                   	push   %edi
 15e:	89 d7                	mov    %edx,%edi
 160:	56                   	push   %esi
 161:	8d 70 30             	lea    0x30(%eax),%esi
 164:	53                   	push   %ebx
 165:	8b 56 04             	mov    0x4(%esi),%edx
 168:	83 fa 01             	cmp    $0x1,%edx
 16b:	74 14                	je     181 <michael_final+0x24>
 16d:	72 0c                	jb     17b <michael_final+0x1e>
 16f:	83 fa 02             	cmp    $0x2,%edx
 172:	74 16                	je     18a <michael_final+0x2d>
 174:	83 fa 03             	cmp    $0x3,%edx
 177:	75 47                	jne    1c0 <michael_final+0x63>
 179:	eb 27                	jmp    1a2 <michael_final+0x45>
 17b:	83 76 08 5a          	xorl   $0x5a,0x8(%esi)
 17f:	eb 3f                	jmp    1c0 <michael_final+0x63>
 181:	0f b6 40 30          	movzbl 0x30(%eax),%eax
 185:	80 cc 5a             	or     $0x5a,%ah
 188:	eb 33                	jmp    1bd <michael_final+0x60>
 18a:	0f b6 50 30          	movzbl 0x30(%eax),%edx
 18e:	0f b6 46 01          	movzbl 0x1(%esi),%eax
 192:	81 ca 00 00 5a 00    	or     $0x5a0000,%edx
 198:	c1 e0 08             	shl    $0x8,%eax
 19b:	09 c2                	or     %eax,%edx
 19d:	31 56 08             	xor    %edx,0x8(%esi)
 1a0:	eb 1e                	jmp    1c0 <michael_final+0x63>
 1a2:	0f b6 40 30          	movzbl 0x30(%eax),%eax
 1a6:	0f b6 56 01          	movzbl 0x1(%esi),%edx
 1aa:	0d 00 00 00 5a       	or     $0x5a000000,%eax
 1af:	c1 e2 08             	shl    $0x8,%edx
 1b2:	09 d0                	or     %edx,%eax
 1b4:	0f b6 56 02          	movzbl 0x2(%esi),%edx
 1b8:	c1 e2 10             	shl    $0x10,%edx
 1bb:	09 d0                	or     %edx,%eax
 1bd:	31 46 08             	xor    %eax,0x8(%esi)
 1c0:	8b 4e 08             	mov    0x8(%esi),%ecx
 1c3:	c1 c9 0f             	ror    $0xf,%ecx
 1c6:	33 4e 0c             	xor    0xc(%esi),%ecx
 1c9:	89 c8                	mov    %ecx,%eax
 1cb:	03 46 08             	add    0x8(%esi),%eax
 1ce:	89 c2                	mov    %eax,%edx
 1d0:	89 46 08             	mov    %eax,0x8(%esi)
 1d3:	81 e2 00 ff 00 ff    	and    $0xff00ff00,%edx
 1d9:	25 ff 00 ff 00       	and    $0xff00ff,%eax
 1de:	c1 e0 08             	shl    $0x8,%eax
 1e1:	c1 ea 08             	shr    $0x8,%edx
 1e4:	09 c2                	or     %eax,%edx
 1e6:	31 ca                	xor    %ecx,%edx
 1e8:	89 d0                	mov    %edx,%eax
 1ea:	03 46 08             	add    0x8(%esi),%eax
 1ed:	89 46 08             	mov    %eax,0x8(%esi)
 1f0:	c1 c8 1d             	ror    $0x1d,%eax
 1f3:	31 d0                	xor    %edx,%eax
 1f5:	89 c2                	mov    %eax,%edx
 1f7:	03 56 08             	add    0x8(%esi),%edx
 1fa:	89 56 08             	mov    %edx,0x8(%esi)
 1fd:	c1 ca 02             	ror    $0x2,%edx
 200:	31 c2                	xor    %eax,%edx
 202:	89 d3                	mov    %edx,%ebx
 204:	03 5e 08             	add    0x8(%esi),%ebx
 207:	89 5e 08             	mov    %ebx,0x8(%esi)
 20a:	c1 cb 0f             	ror    $0xf,%ebx
 20d:	31 d3                	xor    %edx,%ebx
 20f:	89 d8                	mov    %ebx,%eax
 211:	03 46 08             	add    0x8(%esi),%eax
 214:	89 c1                	mov    %eax,%ecx
 216:	89 46 08             	mov    %eax,0x8(%esi)
 219:	81 e1 00 ff 00 ff    	and    $0xff00ff00,%ecx
 21f:	25 ff 00 ff 00       	and    $0xff00ff,%eax
 224:	c1 e0 08             	shl    $0x8,%eax
 227:	c1 e9 08             	shr    $0x8,%ecx
 22a:	09 c1                	or     %eax,%ecx
 22c:	31 d9                	xor    %ebx,%ecx
 22e:	89 ca                	mov    %ecx,%edx
 230:	03 56 08             	add    0x8(%esi),%edx
 233:	89 56 08             	mov    %edx,0x8(%esi)
 236:	c1 ca 1d             	ror    $0x1d,%edx
 239:	31 ca                	xor    %ecx,%edx
 23b:	89 d1                	mov    %edx,%ecx
 23d:	03 4e 08             	add    0x8(%esi),%ecx
 240:	89 c8                	mov    %ecx,%eax
 242:	c1 c8 02             	ror    $0x2,%eax
 245:	31 d0                	xor    %edx,%eax
 247:	89 46 0c             	mov    %eax,0xc(%esi)
 24a:	01 c8                	add    %ecx,%eax
 24c:	89 46 08             	mov    %eax,0x8(%esi)
 24f:	89 07                	mov    %eax,(%edi)
 251:	8b 46 0c             	mov    0xc(%esi),%eax
 254:	5b                   	pop    %ebx
 255:	5e                   	pop    %esi
 256:	89 47 04             	mov    %eax,0x4(%edi)
 259:	5f                   	pop    %edi
 25a:	c3                   	ret    
Disassembly of section .init.text:

00000000 <michael_mic_init>:
   0:	b8 00 00 00 00       	mov    $0x0,%eax
   5:	e9 fc ff ff ff       	jmp    6 <michael_mic_init+0x6>
Disassembly of section .exit.text:

00000000 <michael_mic_exit>:
   0:	b8 00 00 00 00       	mov    $0x0,%eax
   5:	e9 fc ff ff ff       	jmp    6 <michael_mic_exit+0x6>

[-- Attachment #3: michael_mic.patched --]
[-- Type: text/plain, Size: 7598 bytes --]


crypto/michael_mic.o:     file format elf32-i386

Disassembly of section .text:

00000000 <michael_block>:
   0:	33 50 08             	xor    0x8(%eax),%edx
   3:	53                   	push   %ebx
   4:	89 d1                	mov    %edx,%ecx
   6:	c1 c9 0f             	ror    $0xf,%ecx
   9:	33 48 0c             	xor    0xc(%eax),%ecx
   c:	8d 14 11             	lea    (%ecx,%edx,1),%edx
   f:	89 d3                	mov    %edx,%ebx
  11:	89 50 08             	mov    %edx,0x8(%eax)
  14:	81 e3 00 ff 00 ff    	and    $0xff00ff00,%ebx
  1a:	81 e2 ff 00 ff 00    	and    $0xff00ff,%edx
  20:	c1 e2 08             	shl    $0x8,%edx
  23:	c1 eb 08             	shr    $0x8,%ebx
  26:	09 d3                	or     %edx,%ebx
  28:	31 cb                	xor    %ecx,%ebx
  2a:	89 d9                	mov    %ebx,%ecx
  2c:	03 48 08             	add    0x8(%eax),%ecx
  2f:	89 48 08             	mov    %ecx,0x8(%eax)
  32:	c1 c9 1d             	ror    $0x1d,%ecx
  35:	31 d9                	xor    %ebx,%ecx
  37:	89 ca                	mov    %ecx,%edx
  39:	03 50 08             	add    0x8(%eax),%edx
  3c:	5b                   	pop    %ebx
  3d:	89 50 08             	mov    %edx,0x8(%eax)
  40:	c1 ca 02             	ror    $0x2,%edx
  43:	31 ca                	xor    %ecx,%edx
  45:	01 50 08             	add    %edx,0x8(%eax)
  48:	89 50 0c             	mov    %edx,0xc(%eax)
  4b:	c3                   	ret    

0000004c <michael_setkey>:
  4c:	83 f9 08             	cmp    $0x8,%ecx
  4f:	53                   	push   %ebx
  50:	89 d3                	mov    %edx,%ebx
  52:	74 0d                	je     61 <michael_setkey+0x15>
  54:	81 08 00 00 20 00    	orl    $0x200000,(%eax)
  5a:	b8 ea ff ff ff       	mov    $0xffffffea,%eax
  5f:	eb 10                	jmp    71 <michael_setkey+0x25>
  61:	8b 12                	mov    (%edx),%edx
  63:	83 c0 30             	add    $0x30,%eax
  66:	89 50 08             	mov    %edx,0x8(%eax)
  69:	8b 53 04             	mov    0x4(%ebx),%edx
  6c:	89 50 0c             	mov    %edx,0xc(%eax)
  6f:	31 c0                	xor    %eax,%eax
  71:	5b                   	pop    %ebx
  72:	c3                   	ret    

00000073 <michael_init>:
  73:	c7 40 34 00 00 00 00 	movl   $0x0,0x34(%eax)
  7a:	c3                   	ret    

0000007b <michael_update>:
  7b:	55                   	push   %ebp
  7c:	89 c5                	mov    %eax,%ebp
  7e:	57                   	push   %edi
  7f:	83 c5 30             	add    $0x30,%ebp
  82:	56                   	push   %esi
  83:	53                   	push   %ebx
  84:	89 cb                	mov    %ecx,%ebx
  86:	83 ec 08             	sub    $0x8,%esp
  89:	89 44 24 04          	mov    %eax,0x4(%esp)
  8d:	89 14 24             	mov    %edx,(%esp)
  90:	8b 75 04             	mov    0x4(%ebp),%esi
  93:	85 f6                	test   %esi,%esi
  95:	74 4c                	je     e3 <michael_update+0x68>
  97:	b8 04 00 00 00       	mov    $0x4,%eax
  9c:	29 f0                	sub    %esi,%eax
  9e:	39 c8                	cmp    %ecx,%eax
  a0:	89 c2                	mov    %eax,%edx
  a2:	0f 47 d1             	cmova  %ecx,%edx
  a5:	89 d1                	mov    %edx,%ecx
  a7:	8d 7c 35 00          	lea    0x0(%ebp,%esi,1),%edi
  ab:	8b 34 24             	mov    (%esp),%esi
  ae:	c1 e9 02             	shr    $0x2,%ecx
  b1:	f3 a5                	rep movsl %ds:(%esi),%es:(%edi)
  b3:	89 d1                	mov    %edx,%ecx
  b5:	83 e1 03             	and    $0x3,%ecx
  b8:	74 02                	je     bc <michael_update+0x41>
  ba:	f3 a4                	rep movsb %ds:(%esi),%es:(%edi)
  bc:	89 d0                	mov    %edx,%eax
  be:	03 45 04             	add    0x4(%ebp),%eax
  c1:	83 f8 03             	cmp    $0x3,%eax
  c4:	89 45 04             	mov    %eax,0x4(%ebp)
  c7:	76 4a                	jbe    113 <michael_update+0x98>
  c9:	8b 44 24 04          	mov    0x4(%esp),%eax
  cd:	29 d3                	sub    %edx,%ebx
  cf:	01 14 24             	add    %edx,(%esp)
  d2:	8b 50 30             	mov    0x30(%eax),%edx
  d5:	89 e8                	mov    %ebp,%eax
  d7:	e8 24 ff ff ff       	call   0 <michael_block>
  dc:	c7 45 04 00 00 00 00 	movl   $0x0,0x4(%ebp)
  e3:	8b 34 24             	mov    (%esp),%esi
  e6:	eb 10                	jmp    f8 <michael_update+0x7d>
  e8:	83 c6 04             	add    $0x4,%esi
  eb:	89 e8                	mov    %ebp,%eax
  ed:	8b 56 fc             	mov    -0x4(%esi),%edx
  f0:	83 eb 04             	sub    $0x4,%ebx
  f3:	e8 08 ff ff ff       	call   0 <michael_block>
  f8:	83 fb 03             	cmp    $0x3,%ebx
  fb:	77 eb                	ja     e8 <michael_update+0x6d>
  fd:	85 db                	test   %ebx,%ebx
  ff:	74 12                	je     113 <michael_update+0x98>
 101:	89 5d 04             	mov    %ebx,0x4(%ebp)
 104:	31 c9                	xor    %ecx,%ecx
 106:	89 ef                	mov    %ebp,%edi
 108:	f3 a5                	rep movsl %ds:(%esi),%es:(%edi)
 10a:	89 d9                	mov    %ebx,%ecx
 10c:	83 e1 03             	and    $0x3,%ecx
 10f:	74 02                	je     113 <michael_update+0x98>
 111:	f3 a4                	rep movsb %ds:(%esi),%es:(%edi)
 113:	58                   	pop    %eax
 114:	5a                   	pop    %edx
 115:	5b                   	pop    %ebx
 116:	5e                   	pop    %esi
 117:	5f                   	pop    %edi
 118:	5d                   	pop    %ebp
 119:	c3                   	ret    

0000011a <michael_final>:
 11a:	56                   	push   %esi
 11b:	89 d6                	mov    %edx,%esi
 11d:	53                   	push   %ebx
 11e:	8d 58 30             	lea    0x30(%eax),%ebx
 121:	8b 43 04             	mov    0x4(%ebx),%eax
 124:	83 f8 02             	cmp    $0x2,%eax
 127:	74 14                	je     13d <michael_final+0x23>
 129:	83 f8 03             	cmp    $0x3,%eax
 12c:	74 16                	je     144 <michael_final+0x2a>
 12e:	48                   	dec    %eax
 12f:	ba 5a 00 00 00       	mov    $0x5a,%edx
 134:	b9 5a 00 00 00       	mov    $0x5a,%ecx
 139:	74 1b                	je     156 <michael_final+0x3c>
 13b:	eb 23                	jmp    160 <michael_final+0x46>
 13d:	b8 5a 00 00 00       	mov    $0x5a,%eax
 142:	eb 07                	jmp    14b <michael_final+0x31>
 144:	0f b6 43 02          	movzbl 0x2(%ebx),%eax
 148:	80 cc 5a             	or     $0x5a,%ah
 14b:	89 c1                	mov    %eax,%ecx
 14d:	0f b6 43 01          	movzbl 0x1(%ebx),%eax
 151:	c1 e1 08             	shl    $0x8,%ecx
 154:	09 c1                	or     %eax,%ecx
 156:	0f b6 03             	movzbl (%ebx),%eax
 159:	89 ca                	mov    %ecx,%edx
 15b:	c1 e2 08             	shl    $0x8,%edx
 15e:	09 c2                	or     %eax,%edx
 160:	89 d8                	mov    %ebx,%eax
 162:	e8 99 fe ff ff       	call   0 <michael_block>
 167:	89 d8                	mov    %ebx,%eax
 169:	31 d2                	xor    %edx,%edx
 16b:	e8 90 fe ff ff       	call   0 <michael_block>
 170:	8b 43 08             	mov    0x8(%ebx),%eax
 173:	89 06                	mov    %eax,(%esi)
 175:	8b 43 0c             	mov    0xc(%ebx),%eax
 178:	5b                   	pop    %ebx
 179:	89 46 04             	mov    %eax,0x4(%esi)
 17c:	5e                   	pop    %esi
 17d:	c3                   	ret    
Disassembly of section .init.text:

00000000 <michael_mic_init>:
   0:	b8 00 00 00 00       	mov    $0x0,%eax
   5:	e9 fc ff ff ff       	jmp    6 <michael_mic_init+0x6>
Disassembly of section .exit.text:

00000000 <michael_mic_exit>:
   0:	b8 00 00 00 00       	mov    $0x0,%eax
   5:	e9 fc ff ff ff       	jmp    6 <michael_mic_exit+0x6>


* Re: [PATCH] crypto: make michael_block a function
  2008-05-16 15:52     ` Harvey Harrison
@ 2008-05-18 21:57       ` Sebastian Siewior
  0 siblings, 0 replies; 5+ messages in thread
From: Sebastian Siewior @ 2008-05-18 21:57 UTC
  To: Harvey Harrison; +Cc: Johannes Berg, Herbert Xu, LKML, Andrew Morton

* Harvey Harrison | 2008-05-16 08:52:03 [-0700]:

>Well, the code-size difference is significant (about 200 bytes smaller
>on X86-32).  This macro is essentially an inline function that is pretty
>large.

Can you please apply the attached patch and run
| modprobe tcrypt mode=314

before and after your patch?

I get the following numbers on one of my machines:

before:
~~~~~~~
|test  0 (   16 byte blocks,   16 bytes per update,   1 updates):    122 cycles/operation,    7 cycles/byte
|test  1 (   64 byte blocks,   16 bytes per update,   4 updates):    370 cycles/operation,    5 cycles/byte
|test  2 (   64 byte blocks,   64 bytes per update,   1 updates):    216 cycles/operation,    3 cycles/byte
|test  3 (  256 byte blocks,   16 bytes per update,  16 updates):   1304 cycles/operation,    5 cycles/byte
|test  4 (  256 byte blocks,   64 bytes per update,   4 updates):    746 cycles/operation,    2 cycles/byte
|test  5 (  256 byte blocks,  256 bytes per update,   1 updates):    592 cycles/operation,    2 cycles/byte
|test  6 ( 1024 byte blocks,   16 bytes per update,  64 updates):   5040 cycles/operation,    4 cycles/byte
|test  7 ( 1024 byte blocks,  256 bytes per update,   4 updates):   2252 cycles/operation,    2 cycles/byte
|test  8 ( 1024 byte blocks, 1024 bytes per update,   1 updates):   2098 cycles/operation,    2 cycles/byte
|test  9 ( 2048 byte blocks,   16 bytes per update, 128 updates):  10021 cycles/operation,    4 cycles/byte
|test 10 ( 2048 byte blocks,  256 bytes per update,   8 updates):   4446 cycles/operation,    2 cycles/byte
|test 11 ( 2048 byte blocks, 1024 bytes per update,   2 updates):   4167 cycles/operation,    2 cycles/byte
|test 12 ( 2048 byte blocks, 2048 bytes per update,   1 updates):   4106 cycles/operation,    2 cycles/byte
|test 13 ( 4096 byte blocks,   16 bytes per update, 256 updates):  19982 cycles/operation,    4 cycles/byte
|test 14 ( 4096 byte blocks,  256 bytes per update,  16 updates):   8833 cycles/operation,    2 cycles/byte
|test 15 ( 4096 byte blocks, 1024 bytes per update,   4 updates):   8275 cycles/operation,    2 cycles/byte
|test 16 ( 4096 byte blocks, 4096 bytes per update,   1 updates):   8122 cycles/operation,    1 cycles/byte
|test 17 ( 8192 byte blocks,   16 bytes per update, 512 updates):  39906 cycles/operation,    4 cycles/byte
|test 18 ( 8192 byte blocks,  256 bytes per update,  32 updates):  17607 cycles/operation,    2 cycles/byte
|test 19 ( 8192 byte blocks, 1024 bytes per update,   8 updates):  16492 cycles/operation,    2 cycles/byte
|test 20 ( 8192 byte blocks, 4096 bytes per update,   2 updates):  16214 cycles/operation,    1 cycles/byte
|test 21 ( 8192 byte blocks, 8192 bytes per update,   1 updates):  16173 cycles/operation,    1 cycles/byte

after:
~~~~~~
|test  0 (   16 byte blocks,   16 bytes per update,   1 updates):    175 cycles/operation,   10 cycles/byte
|test  1 (   64 byte blocks,   16 bytes per update,   4 updates):    510 cycles/operation,    7 cycles/byte
|test  2 (   64 byte blocks,   64 bytes per update,   1 updates):    342 cycles/operation,    5 cycles/byte
|test  3 (  256 byte blocks,   16 bytes per update,  16 updates):   1813 cycles/operation,    7 cycles/byte
|test  4 (  256 byte blocks,   64 bytes per update,   4 updates):   1182 cycles/operation,    4 cycles/byte
|test  5 (  256 byte blocks,  256 bytes per update,   1 updates):   1013 cycles/operation,    3 cycles/byte
|test  6 ( 1024 byte blocks,   16 bytes per update,  64 updates):   7026 cycles/operation,    6 cycles/byte
|test  7 ( 1024 byte blocks,  256 bytes per update,   4 updates):   3869 cycles/operation,    3 cycles/byte
|test  8 ( 1024 byte blocks, 1024 bytes per update,   1 updates):   3701 cycles/operation,    3 cycles/byte
|test  9 ( 2048 byte blocks,   16 bytes per update, 128 updates):  13975 cycles/operation,    6 cycles/byte
|test 10 ( 2048 byte blocks,  256 bytes per update,   8 updates):   7662 cycles/operation,    3 cycles/byte
|test 11 ( 2048 byte blocks, 1024 bytes per update,   2 updates):   7346 cycles/operation,    3 cycles/byte
|test 12 ( 2048 byte blocks, 2048 bytes per update,   1 updates):   7282 cycles/operation,    3 cycles/byte
|test 13 ( 4096 byte blocks,   16 bytes per update, 256 updates):  27875 cycles/operation,    6 cycles/byte
|test 14 ( 4096 byte blocks,  256 bytes per update,  16 updates):  15247 cycles/operation,    3 cycles/byte
|test 15 ( 4096 byte blocks, 1024 bytes per update,   4 updates):  14615 cycles/operation,    3 cycles/byte
|test 16 ( 4096 byte blocks, 4096 bytes per update,   1 updates):  14447 cycles/operation,    3 cycles/byte
|test 17 ( 8192 byte blocks,   16 bytes per update, 512 updates):  55673 cycles/operation,    6 cycles/byte
|test 18 ( 8192 byte blocks,  256 bytes per update,  32 updates):  30416 cycles/operation,    3 cycles/byte
|test 19 ( 8192 byte blocks, 1024 bytes per update,   8 updates):  29155 cycles/operation,    3 cycles/byte
|test 20 ( 8192 byte blocks, 4096 bytes per update,   2 updates):  28839 cycles/operation,    3 cycles/byte
|test 21 ( 8192 byte blocks, 8192 bytes per update,   1 updates):  28794 cycles/operation,    3 cycles/byte
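
To make the two tables easier to compare, the sketch below turns a few
rows into an after/before ratio.  It uses the single-update rows (tests
0, 8, 12 and 21), copied from the tables above.  The struct and function
names are invented for this sketch, and the numbers are only the ones
reported for this one machine.

```c
#include <assert.h>

/* One benchmark row: block size in bytes, cycles/operation before and after. */
struct speed_row {
	unsigned int bytes;
	unsigned int before;
	unsigned int after;
};

/* Tests 0, 8, 12 and 21 from the tables (one update per block). */
const struct speed_row rows[] = {
	{   16,   122,   175 },
	{ 1024,  2098,  3701 },
	{ 2048,  4106,  7282 },
	{ 8192, 16173, 28794 },
};

/* Cycles after the patch as a percentage of cycles before, rounded down. */
unsigned int slowdown_percent(const struct speed_row *r)
{
	assert(r->before > 0);
	return (r->after * 100u) / r->before;
}
```

For these four rows the result is 143, 176, 177 and 178 percent.  On
this machine the out-of-line michael_block therefore costs roughly 43%
more cycles for a 16-byte block and close to 80% more for the larger
blocks.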


>
>Harvey

Sebastian

---
 crypto/tcrypt.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 1ab8c01..c0040d5 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1761,6 +1761,10 @@ static void do_test(void)
 		test_hash_speed("sha224", sec, generic_hash_speed_template);
 		if (mode > 300 && mode < 400) break;
 
+	case 314:
+		test_hash_speed("michael_mic", sec, generic_hash_speed_template);
+		if (mode > 300 && mode < 400) break;
+
 	case 399:
 		break;
 
-- 
1.5.4.3

