public inbox for linux-kernel@vger.kernel.org
* hex_to_bin speedup
@ 2010-05-27 10:07 Fredrik Gustafsson
       [not found] ` <AANLkTinn7aEv0ueW4uRN06XdUbEltTfmjRXEABMRN-7e@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Fredrik Gustafsson @ 2010-05-27 10:07 UTC (permalink / raw)
  To: linux-kernel

Hi,
I looked a bit at the newly added hex_to_bin function in lib/hexdump.c.
I believe the patch below [2] is a speedup (at least according to my
benchmark [1]).

However, in the commit message of
903788892ea0fc7fcaf7e8e5fac9a77379fc215b you can read
"[akpm@linux-foundation.org: use tolower(), saving 3 bytes, test the more common case first - it's quicker]"

I don't know the reasoning behind akpm's change, so I'm unsure whether
my patch has problems that I've missed.

--
iveqy

[1] Benchmark

#include <ctype.h>
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
	struct timeval start, end;
	int i, itr;
	itr = 100000;
	char c = 'A';
	long diff;	/* elapsed time in microseconds */

	gettimeofday(&start, NULL);
	for (i = 0; i < itr; i++) {
		if ((c >= 'A') && (c <= 'F')) {}
	}
	gettimeofday(&end, NULL);
	/* include tv_sec so the result survives a second boundary */
	diff = (end.tv_sec - start.tv_sec) * 1000000L
		+ (end.tv_usec - start.tv_usec);
	printf("if-statement: %ld us\n", diff);

	gettimeofday(&start, NULL);
	for (i = 0; i < itr; i++) {
		c = tolower(c);
	}
	gettimeofday(&end, NULL);
	diff = (end.tv_sec - start.tv_sec) * 1000000L
		+ (end.tv_usec - start.tv_usec);
	printf("tolower(): %ld us\n", diff);
	return 0;
}

[2] Patch

This is faster (at least on i686).
---
 lib/hexdump.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/lib/hexdump.c b/lib/hexdump.c
index 5d7a480..f01d11c 100644
--- a/lib/hexdump.c
+++ b/lib/hexdump.c
@@ -26,9 +26,10 @@ int hex_to_bin(char ch)
 {
 	if ((ch >= '0') && (ch <= '9'))
 		return ch - '0';
-	ch = tolower(ch);
 	if ((ch >= 'a') && (ch <= 'f'))
 		return ch - 'a' + 10;
+	if ((ch >= 'A') && (ch <= 'F'))
+		return ch - 'A' + 10;
 	return -1;
 }
 EXPORT_SYMBOL(hex_to_bin);
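
Since the patch is only safe if it leaves the results unchanged, here is a minimal userspace sketch that compares the two variants over every possible char value. The names hex_to_bin_if, hex_to_bin_to, count_mismatches and self_check are illustrative only, and the unsigned char cast in front of tolower() is an assumption needed for the userspace libc version; it is not part of the kernel code.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Variant from the patch: explicit range tests for both cases. */
int hex_to_bin_if(char ch)
{
	if ((ch >= '0') && (ch <= '9'))
		return ch - '0';
	if ((ch >= 'a') && (ch <= 'f'))
		return ch - 'a' + 10;
	if ((ch >= 'A') && (ch <= 'F'))
		return ch - 'A' + 10;
	return -1;
}

/* Variant with tolower(): fold the case first, then one range test. */
int hex_to_bin_to(char ch)
{
	if ((ch >= '0') && (ch <= '9'))
		return ch - '0';
	/* Assumption: userspace tolower() wants an unsigned char value. */
	ch = (char)tolower((unsigned char)ch);
	if ((ch >= 'a') && (ch <= 'f'))
		return ch - 'a' + 10;
	return -1;
}

/* Count the inputs on which the two variants disagree. */
int count_mismatches(void)
{
	int c, bad = 0;

	for (c = CHAR_MIN; c <= CHAR_MAX; c++)
		if (hex_to_bin_if((char)c) != hex_to_bin_to((char)c))
			bad++;
	return bad;
}

/* Aborts if the variants differ anywhere (default "C" locale assumed). */
void self_check(void)
{
	assert(count_mismatches() == 0);
	assert(hex_to_bin_if('F') == 15);
	assert(hex_to_bin_to('F') == 15);
}
```

Calling self_check() from a small main() is enough to confirm that both versions return the same value for every input, which leaves speed and code size as the only open questions.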


* Re: hex_to_bin speedup
       [not found]   ` <20100527142848.GA23543@iveqy.com>
@ 2010-05-27 16:35     ` Andy Shevchenko
  2010-05-28 21:05       ` Fredrik Gustafsson
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Shevchenko @ 2010-05-27 16:35 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 11103 bytes --]

On Thu, May 27, 2010 at 5:28 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> I saw from the mailing list that the initial patch suggested for this
> function did not use tolower() but an if statement.

Exactly, and there was a small discussion around it.

> On Thu, May 27, 2010 at 02:22:30PM +0300, Andy Shevchenko wrote:
>> To your test case:
>>  - the data is not random (CPU cache hits distribution is differ to real)
> I thought that because the best case scenario was equal, only the worst
> case was of interest. Of course you're right. I rewrote the testcase to
> use random numbers instead.
>
>>  - the compiler call and parameters are not published (gcc -O2
>> probably removes unused parts of code)
> I didn't use any parameters at all. However with the improved benchmark
> I tested both without and with -O2 and got different results. I'm not
> qualified to judge what this can do for the kernel.
>
>>  - please, provide assembler excerpts to compare implementations
> Please see [3]-[6]
>
> P.S. How come we don't CC the LKML?

My fault. Return back to public.

> --
> iveqy
>
> [1] Benchmark results
> $ gcc -O2 main.c && ./a.out
> tolower() was faster 2657 times
> if-statement was faster 2571 times
> Equal fast 4772 times

So, optimizer do the job for tolower(). Thus, nothing to worry about.

> $ gcc  main.c && ./a.out
> tolower() was faster 1532 times
> if-statement was faster 8465 times
> Equal fast 3 times
>
> [2] Benchmark source
>
> #include <sys/time.h>
> #include <stdio.h>
>
> int hex_to_bin_if(char ch)
> {
>        if ((ch >= '0') && (ch <= '9'))
>                return ch - '0';
>        if ((ch >= 'a') && (ch <= 'f'))
>                return ch - 'a' + 10;
>        if ((ch >= 'A') && (ch <= 'F'))
>                return ch - 'A' + 10;
>        return -1;
> }
> int hex_to_bin_to(char ch)
> {
>        if ((ch >= '0') && (ch <= '9'))
>                return ch - '0';
>        ch = tolower(ch);
>        if ((ch >= 'a') && (ch <= 'f'))
>                return ch - 'a' + 10;
>        return -1;
> }
> char bin_to_hex(int i)
> {
>        switch(i) {
>                case 0:
>                        return '0';
>                case 1:
>                        return '1';
>                case 2:
>                        return '2';
>                case 3:
>                        return '3';
>                case 4:
>                        return '4';
>                case 5:
>                        return '5';
>                case 6:
>                        return '6';
>                case 7:
>                        return '7';
>                case 8:
>                        return '8';
>                case 9:
>                        return '9';
>                case 10:
>                        return 'a';
>                case 11:
>                        return 'b';
>                case 12:
>                        return 'c';
>                case 13:
>                        return 'd';
>                case 14:
>                        return 'e';
>                case 15:
>                        return 'f';
>                case 16:
>                        return '0';
>                case 17:
>                        return '1';
>                case 18:
>                        return '2';
>                case 19:
>                        return '3';
>                case 20:
>                        return '4';
>                case 21:
>                        return '5';
>                case 22:
>                        return '6';
>                case 23:
>                        return '7';
>                case 24:
>                        return '8';
>                case 25:
>                        return '9';
>                case 26:
>                        return 'A';
>                case 27:
>                        return 'B';
>                case 28:
>                        return 'C';
>                case 29:
>                        return 'D';
>                case 30:
>                        return 'E';
>                case 31:
>                        return 'F';
>        }
> }
> int main()
> {
>        // Needed variables
>        struct timeval start, end;
>        int i,itr,res,ran,j,won_if = 0, won_to = 0,equal = 0;
>        itr = 10000;
>        char c;
>        time_t tif,tto,diff;
>        unsigned int iseed = (unsigned int)time(NULL);
>        srand(iseed);
>
>        for(j = 0; j < itr; j++) {
>                // Test if-statement
>                gettimeofday(&start,NULL);
>                for(i = 0; i < itr; i++) {
>                        ran = rand() % 32;
>                        c = bin_to_hex(ran);
>                        res = hex_to_bin_if(c);
>                }
>                gettimeofday(&end,NULL);
>                tif = end.tv_usec - start.tv_usec;
>
>                // Test tolower()
>                gettimeofday(&start,NULL);
>                for(i = 0; i < itr; i++) {
>                        ran = rand() % 32;
>                        c = bin_to_hex(ran);
>                        res = hex_to_bin_to(c);
>                }
>                gettimeofday(&end,NULL);
>                tto = end.tv_usec - start.tv_usec;
>
>                // Calculate winner
>                diff = tto - tif;
>                if(diff == 0) {
>                        equal++;
>                } else if(tto - tif > 0) {
>                        won_if++;
>                } else {
>                        diff *= -1;
>                        won_to++;
>                }
>        }
>        printf("tolower() was faster %d times\n",won_to);
>        printf("if-statement was faster %d times\n",won_if);
>        printf("Equal fast %d times\n",equal);
> }
>
> [3] Assembler code if-statement (gcc -S main.c)
> hex_to_bin_if:
>        pushl   %ebp
>        movl    %esp, %ebp
>        subl    $8, %esp
>        movl    8(%ebp), %eax
>        movb    %al, -4(%ebp)
>        cmpb    $47, -4(%ebp)
>        jle     .L2
>        cmpb    $57, -4(%ebp)
>        jg      .L2
>        movsbl  -4(%ebp),%eax
>        subl    $48, %eax
>        movl    %eax, -8(%ebp)
>        jmp     .L3
> .L2:
>        cmpb    $96, -4(%ebp)
>        jle     .L4
>        cmpb    $102, -4(%ebp)
>        jg      .L4
>        movsbl  -4(%ebp),%eax
>        subl    $87, %eax
>        movl    %eax, -8(%ebp)
>        jmp     .L3
> .L4:
>        cmpb    $64, -4(%ebp)
>        jle     .L5
>        cmpb    $70, -4(%ebp)
>        jg      .L5
>        movsbl  -4(%ebp),%eax
>        subl    $55, %eax
>        movl    %eax, -8(%ebp)
>        jmp     .L3
> .L5:
>        movl    $-1, -8(%ebp)
> .L3:
>        movl    -8(%ebp), %eax
>        leave
>        ret
>        .size   hex_to_bin_if, .-hex_to_bin_if
> .globl hex_to_bin_to
>        .type   hex_to_bin_to, @function
>
> [4] Assembler code tolower() (gcc -S main.c)
> hex_to_bin_to:
>        pushl   %ebp
>        movl    %esp, %ebp
>        subl    $24, %esp
>        movl    8(%ebp), %eax
>        movb    %al, -4(%ebp)
>        cmpb    $47, -4(%ebp)
>        jle     .L8
>        cmpb    $57, -4(%ebp)
>        jg      .L8
>        movsbl  -4(%ebp),%eax
>        subl    $48, %eax
>        movl    %eax, -8(%ebp)
>        jmp     .L9
> .L8:
>        movsbl  -4(%ebp),%eax
>        movl    %eax, (%esp)
>        call    tolower
>        movb    %al, -4(%ebp)
>        cmpb    $96, -4(%ebp)
>        jle     .L10
>        cmpb    $102, -4(%ebp)
>        jg      .L10
>        movsbl  -4(%ebp),%eax
>        subl    $87, %eax
>        movl    %eax, -8(%ebp)
>        jmp     .L9
> .L10:
>        movl    $-1, -8(%ebp)
> .L9:
>        movl    -8(%ebp), %eax
>        leave
>        ret
>        .size   hex_to_bin_to, .-hex_to_bin_to
>
> [5] Assembler code if-statement (gcc -S -O2 main.c)
> hex_to_bin_if:
>        pushl   %ebp
>        movl    %esp, %ebp
>        movzbl  8(%ebp), %edx
>        leal    -48(%edx), %eax
>        cmpb    $9, %al
>        jbe     .L8
>        leal    -97(%edx), %eax
>        cmpb    $5, %al
>        ja      .L4
>        movsbl  %dl,%eax
>        leal    -87(%eax), %ecx
> .L3:
>        movl    %ecx, %eax
>        popl    %ebp
>        ret
>        .p2align 4,,7
>        .p2align 3
> .L8:
>        movsbl  %dl,%eax
>        leal    -48(%eax), %ecx
>        movl    %ecx, %eax
>        popl    %ebp
>        ret
>        .p2align 4,,7
>        .p2align 3
> .L4:
>        leal    -65(%edx), %eax
>        movl    $-1, %ecx
>        cmpb    $5, %al
>        ja      .L3
>        movsbl  %dl,%eax
>        leal    -55(%eax), %ecx
>        jmp     .L3
>        .size   hex_to_bin_if, .-hex_to_bin_if
>        .p2align 4,,15
>
> [6] Assembler code tolower() (gcc -S -O2 main.c)
> hex_to_bin_to:
>        pushl   %ebp
>        movl    %esp, %ebp
>        subl    $8, %esp
>        movzbl  8(%ebp), %edx
>        leal    -48(%edx), %eax
>        cmpb    $9, %al
>        ja      .L38
>        movsbl  %dl,%eax
>        leal    -48(%eax), %edx
> .L39:
>        movl    %edx, %eax
>        leave
>        ret
>        .p2align 4,,7
>        .p2align 3
> .L38:
>        movsbl  %dl,%eax
>        movl    %eax, (%esp)
>        call    tolower
>        movl    $-1, %edx
>        movl    %eax, %ecx
>        leal    -97(%ecx), %eax
>        cmpb    $5, %al
>        ja      .L39
>        movsbl  %cl,%eax
>        leal    -87(%eax), %edx
>        jmp     .L39
>        .size   hex_to_bin_to, .-hex_to_bin_to
>        .section        .rodata.str1.4,"aMS",@progbits,1
>        .align 4
>
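
The objections to the test case (input data, code the optimizer may drop, timing) can be addressed with a harness along the following lines. This is only a sketch under stated assumptions: fill_hex_digits, time_variant and NSAMPLES are hypothetical names, the input mix is an arbitrary choice rather than a measured real-world distribution, and calling through a function pointer keeps the compiler from inlining the variants, which also affects the numbers.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define NSAMPLES 4096	/* hypothetical buffer size for this sketch */

int hex_to_bin_if(char ch)
{
	if ((ch >= '0') && (ch <= '9'))
		return ch - '0';
	if ((ch >= 'a') && (ch <= 'f'))
		return ch - 'a' + 10;
	if ((ch >= 'A') && (ch <= 'F'))
		return ch - 'A' + 10;
	return -1;
}

int hex_to_bin_to(char ch)
{
	if ((ch >= '0') && (ch <= '9'))
		return ch - '0';
	ch = (char)tolower((unsigned char)ch);
	if ((ch >= 'a') && (ch <= 'f'))
		return ch - 'a' + 10;
	return -1;
}

/* Fill buf with mixed-case hex digits. A fixed seed gives both variants
 * exactly the same input, and rand() stays out of the timed loop. */
void fill_hex_digits(char *buf, size_t n, unsigned int seed)
{
	static const char digits[] = "0123456789abcdefABCDEF";
	size_t i;

	srand(seed);
	for (i = 0; i < n; i++)
		buf[i] = digits[(size_t)rand() % (sizeof(digits) - 1)];
}

/* Run fn over buf `rounds` times. Returns the elapsed nanoseconds and
 * stores the sum of all results in *sum as a cross-check. */
long long time_variant(int (*fn)(char), const char *buf, size_t n,
		       int rounds, long *sum)
{
	struct timespec t0, t1;
	volatile long sink = 0;	/* keeps the calls from being optimized away */
	size_t i;
	int r;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (r = 0; r < rounds; r++)
		for (i = 0; i < n; i++)
			sink += fn(buf[i]);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	*sum = sink;
	/* include tv_sec so a second boundary does not corrupt the result */
	return (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}
```

A caller would fill one buffer, pass it to time_variant() once per variant, and compare the returned times only after checking that both sums are equal. Timing one long run per variant avoids counting "wins" between intervals that are close to the clock resolution.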


--
With Best Regards,
Andy Shevchenko


* Re: hex_to_bin speedup
  2010-05-27 16:35     ` Andy Shevchenko
@ 2010-05-28 21:05       ` Fredrik Gustafsson
  0 siblings, 0 replies; 4+ messages in thread
From: Fredrik Gustafsson @ 2010-05-28 21:05 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: linux-kernel

On Thu, May 27, 2010 at 07:35:05PM +0300, Andy Shevchenko wrote:
> > [1] Benchmark results
> > $ gcc -O2 main.c && ./a.out
> > tolower() was faster 2657 times
> > if-statement was faster 2571 times
> > Equal fast 4772 times
> So, optimizer do the job for tolower(). Thus, nothing to worry about.
Yes I guess so. I tested with -Os too and that also does the trick.
Thank you for your time.

--
iveqy


* Re: hex_to_bin speedup
@ 2010-05-28 21:38 George Spelvin
  0 siblings, 0 replies; 4+ messages in thread
From: George Spelvin @ 2010-05-28 21:38 UTC (permalink / raw)
  To: andy.shevchenko, iveqy; +Cc: linux, linux-kernel

1) First of all, I'd worry about code size far more than speed.
   This is not fast-path code.
2) Second, given that you're already doing a range test,
   the fastest way to perform a tolower() is "c |= 0x20".

Generally it's something like:
int hex_to_bin(char ch)
{
	ch -= '0';
	if ((unsigned char)ch <= 9)
		return ch;
	ch |= 0x20;			/* fold upper case onto lower case */
	ch -= 'a' - '0';
	if ((unsigned char)ch <= 5)	/* 'a'..'f' are now 0..5 */
		return ch + 10;
	return -1;
}


That produces even smaller code:
hex_to_bin_or:
        movb    4(%esp), %dl
        subl    $48, %edx
        cmpb    $9, %dl
        movsbl  %dl,%eax
        jbe     .L10
        orl     $32, %edx
        orl     $-1, %eax
        subl    $49, %edx
        cmpb    $5, %dl
        ja      .L10
        movsbl  %dl,%eax
        addl    $10, %eax
.L10:
        ret
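
To see why "|= 0x20" can stand in for tolower() here, the following sketch checks the subtract-and-OR approach against plain range tests for every char value. It uses an unsigned char temporary instead of casting at each compare, and the upper bound of the letter test is 5 so that only 'a'..'f' are accepted. The names hex_to_bin_or, hex_to_bin_ref and count_or_mismatches are illustrative, not kernel symbols.

```c
#include <assert.h>
#include <limits.h>

/* Subtract-and-OR variant: one unsigned compare per range. */
int hex_to_bin_or(char ch)
{
	unsigned char c = (unsigned char)ch - '0';

	if (c <= 9)			/* '0'..'9' map to 0..9 */
		return c;
	c |= 0x20;			/* fold 'A'..'F' onto 'a'..'f' */
	c -= 'a' - '0';
	if (c <= 5)			/* 'a'..'f' map to 0..5 */
		return c + 10;
	return -1;
}

/* Reference: explicit range tests, no case folding tricks. */
int hex_to_bin_ref(char ch)
{
	if ((ch >= '0') && (ch <= '9'))
		return ch - '0';
	if ((ch >= 'a') && (ch <= 'f'))
		return ch - 'a' + 10;
	if ((ch >= 'A') && (ch <= 'F'))
		return ch - 'A' + 10;
	return -1;
}

/* Count the inputs on which the two functions disagree. */
int count_or_mismatches(void)
{
	int c, bad = 0;

	for (c = CHAR_MIN; c <= CHAR_MAX; c++)
		if (hex_to_bin_or((char)c) != hex_to_bin_ref((char)c))
			bad++;
	return bad;
}

/* Aborts if the bit trick accepts or rejects anything unexpectedly. */
void check_or_variant(void)
{
	assert(count_or_mismatches() == 0);
	assert(hex_to_bin_or('g') == -1);
}
```

The OR is only safe because a range test follows it: setting bit 0x20 also changes characters that are not letters, but none of them can land in the six accepted values after the final subtraction.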


