* [PATCH] ipt_connlimit / speed-up.
@ 2005-04-21  3:10 Pawel Sikora
  2005-04-21  8:44 ` Jonas Berlin
  2005-04-24 16:26 ` Patrick McHardy
  0 siblings, 2 replies; 3+ messages in thread
From: Pawel Sikora @ 2005-04-21  3:10 UTC
  To: netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 882 bytes --]

Hi All,

I've attached a small patch that reduces register usage and removes the
redundant &0xff masks.

Regards,
Paweł.

ipt_iphash: (original)
        movl    %eax, %ecx
        movzbl  %ah, %edx
        pushl   %ebx
        movl    %eax, %ebx
        shrl    $16, %ecx
        andl    $255, %eax
        xorl    %eax, %edx
        andl    $255, %ecx
        shrl    $24, %ebx
        xorl    %edx, %ecx
        xorl    %ecx, %ebx
        movl    %ebx, %eax
        popl    %ebx
        ret

ipt_iphash: (fixed)
        movl    %eax, %edx
        shrl    $8, %edx
        xorl    %eax, %edx
        shrl    $16, %eax
        xorl    %eax, %edx
        shrl    $8, %eax
        xorl    %eax, %edx
        movzbl  %dl,%eax
        ret
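
A minimal sketch, assuming a userspace build with uint32_t standing in for
the kernel's u_int32_t, of why a single mask is enough: XOR works on each bit
independently, so the low byte of the combined value is the same whether each
term is masked or only the final result. The names hash_mask_each,
hash_mask_once and check_equivalence are hypothetical and do not appear in
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: the pre-patch form, masking every shifted term. */
static unsigned hash_mask_each(uint32_t addr)
{
	unsigned hash;

	hash  =  addr        & 0xff;
	hash ^= (addr >>  8) & 0xff;
	hash ^= (addr >> 16) & 0xff;
	hash ^= (addr >> 24) & 0xff;
	return hash;
}

/* Hypothetical name: the patched form, masking once after the XORs. */
static unsigned hash_mask_once(uint32_t addr)
{
	return (addr ^ (addr >> 8) ^ (addr >> 16) ^ (addr >> 24)) & 0xff;
}

/* Spot-check a spread of addresses; aborts on the first mismatch. */
static void check_equivalence(void)
{
	uint32_t addr = 0;
	unsigned i;

	for (i = 0; i < 1000000; i++) {
		assert(hash_mask_each(addr) == hash_mask_once(addr));
		addr += 0x01000193u;	/* arbitrary odd stride, wraps mod 2^32 */
	}
}
```

This is a spot check over sampled addresses rather than a proof; the bitwise
argument above is what covers the full 32-bit range.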

-- 
/* Copyright (C) 2003, SCO, Inc. This is valuable Intellectual Property. */

                           #define say(x) lie(x)

[-- Attachment #2: ipt_connlimit.diff --]
[-- Type: text/x-diff, Size: 774 bytes --]

Index: connlimit/linux-2.6.11/net/ipv4/netfilter/ipt_connlimit.c
===================================================================
--- connlimit/linux-2.6.11/net/ipv4/netfilter/ipt_connlimit.c	(revision 3884)
+++ connlimit/linux-2.6.11/net/ipv4/netfilter/ipt_connlimit.c	(working copy)
@@ -35,15 +35,13 @@
 	struct list_head iphash[256];
 };
 
-static int ipt_iphash(u_int32_t addr)
+static inline unsigned __attribute__((regparm(1), const))
+ipt_iphash(const unsigned addr)
 {
-	int hash;
-
-	hash  =  addr        & 0xff;
-	hash ^= (addr >>  8) & 0xff;
-	hash ^= (addr >> 16) & 0xff;
-	hash ^= (addr >> 24) & 0xff;
-	return hash;
+	return ((addr ^
+		(addr >>  8) ^
+		(addr >> 16) ^
+		(addr >> 24)) & 0xff);
 }
 
 static int count_them(struct ipt_connlimit_data *data,


* Re: [PATCH] ipt_connlimit / speed-up.
  2005-04-21  3:10 [PATCH] ipt_connlimit / speed-up Pawel Sikora
@ 2005-04-21  8:44 ` Jonas Berlin
  2005-04-24 16:26 ` Patrick McHardy
  1 sibling, 0 replies; 3+ messages in thread
From: Jonas Berlin @ 2005-04-21  8:44 UTC
  To: Pawel Sikora; +Cc: netfilter-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Quoting Pawel Sikora on 2005-04-21 03:10 UTC:

I just had to test one alternative.. :)

> ipt_iphash: (fixed)

>         movl    %eax, %edx

>         shrl    $8, %edx

>         xorl    %eax, %edx
>         shrl    $16, %eax

>         xorl    %eax, %edx
>         shrl    $8, %eax

>         xorl    %eax, %edx

>         movzbl  %dl,%eax

>         ret

I grouped the instructions of your solution according to my outdated and
maybe incomplete knowledge of how Pentium processors execute
instructions in parallel. This gives 7 groups.

> +	return ((addr ^
> +		(addr >>  8) ^
> +		(addr >> 16) ^
> +		(addr >> 24)) & 0xff);

My approach:

	unsigned tmp = addr ^ (addr >> 16);
	return ((tmp >> 8) ^ tmp) & 255;

Which gives:

        movl    %eax, %edx
        shrl    $16, %eax

        xorl    %edx, %eax

        movl    %eax, %edx

        shrl    $8, %edx

        xorl    %edx, %eax

        andl    $255, %eax

        ret

This also gives 7 groups. If only the compiler had used %eax instead of
%edx in "shrl $8, %edx", that instruction could have been grouped with
the previous one and my solution would have beaten yours, but no, the
compiler chose to modify the copy instead of the original.. :)

So both my solution and yours end up taking the same amount of CPU time,
although my code has one instruction less and is also one byte shorter
as machine code.. :)

However, on my processor, your solution results in 7 groups of code
while mine results in 8 (again, the compiler chose to modify a copy
instead of the original), so I'm going to vote for your version :D
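
As a sanity check on the folded form under "My approach", here is a small
sketch, assuming userspace C with uint32_t in place of the kernel types.
Expanding tmp gives (addr >> 8) ^ (addr >> 24) ^ addr ^ (addr >> 16), which
is the same four terms as the patched expression, so both must agree for
every input. The names hash_four_terms, hash_folded and check_fold are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: the four-term expression from the patch. */
static unsigned hash_four_terms(uint32_t addr)
{
	return (addr ^ (addr >> 8) ^ (addr >> 16) ^ (addr >> 24)) & 0xff;
}

/* Hypothetical name: the two-step fold, upper half first, then bytes. */
static unsigned hash_folded(uint32_t addr)
{
	uint32_t tmp = addr ^ (addr >> 16);

	return ((tmp >> 8) ^ tmp) & 255;
}

/* Spot-check a spread of addresses; aborts on the first mismatch. */
static void check_fold(void)
{
	uint32_t addr = 0;
	unsigned i;

	for (i = 0; i < 1000000; i++) {
		assert(hash_four_terms(addr) == hash_folded(addr));
		addr += 0x9E3779B1u;	/* arbitrary odd stride, wraps mod 2^32 */
	}
}
```

The fold needs two shifts and two XORs where the four-term form needs three
of each, which matches the one-instruction difference seen in the listings.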

For the fun of it, here's what your version looks like on x86_64:

	movl	%edi, %eax
	movl	%edi, %edx

	shrl	$8, %eax
	shrl	$16, %edx

	xorl	%edi, %eax
	shrl	$24, %edi

	xorl	%edx, %eax

	xorl	%edi, %eax

	andl	$255, %eax

	ret

Keep up the good work :)

- --
- - xkr47
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFCZ2fYxyF48ZTvn+4RAgHyAJ9B3vcxnPRHZOhHT3Eh6U/p8Mjh8QCeJwL2
OPiZ1VtkT8Y+Jug3UwCDdIg=
=UwcV
-----END PGP SIGNATURE-----


* Re: [PATCH] ipt_connlimit / speed-up.
  2005-04-21  3:10 [PATCH] ipt_connlimit / speed-up Pawel Sikora
  2005-04-21  8:44 ` Jonas Berlin
@ 2005-04-24 16:26 ` Patrick McHardy
  1 sibling, 0 replies; 3+ messages in thread
From: Patrick McHardy @ 2005-04-24 16:26 UTC
  To: Pawel Sikora; +Cc: netfilter-devel

Pawel Sikora wrote:
> Hi All,
> 
> I've attached a small patch that reduces register usage and removes the
> redundant &0xff masks.
>
> -static int ipt_iphash(u_int32_t addr)
> +static inline unsigned __attribute__((regparm(1), const))

Why do you need attribute(regparm) for an inline function?

Regards
Patrick
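
For context on the question above, a sketch of the same helper with only the
const attribute, assuming a compiler that accepts GCC-style attributes.
regparm changes how arguments are passed at a call, so at any call site where
the compiler inlines the body there is no call left for it to affect. The
names iphash_no_regparm and bucket_for are hypothetical and are not part of
the patch.

```c
#include <assert.h>

/* Sketch only: same body as the patched ipt_iphash(), regparm dropped. */
static inline unsigned __attribute__((const))
iphash_no_regparm(const unsigned addr)
{
	return (addr ^ (addr >> 8) ^ (addr >> 16) ^ (addr >> 24)) & 0xff;
}

/*
 * Hypothetical caller: picks an index into a 256-entry table such as
 * the iphash[256] array shown in the diff context.
 */
static unsigned bucket_for(unsigned addr)
{
	return iphash_no_regparm(addr);	/* always in the range 0..255 */
}
```

Whether the generated code differs from the regparm(1) variant depends on the
compiler and on whether the function is actually inlined at each call site.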

