* hex_to_bin speedup
@ 2010-05-27 10:07 Fredrik Gustafsson
[not found] ` <AANLkTinn7aEv0ueW4uRN06XdUbEltTfmjRXEABMRN-7e@mail.gmail.com>
0 siblings, 1 reply; 4+ messages in thread
From: Fredrik Gustafsson @ 2010-05-27 10:07 UTC (permalink / raw)
To: linux-kernel
Hi,
I looked a bit at the newly added hex_to_bin function in lib/hexdump.c.
I believe the patch below [2] is a speedup (at least according to my benchmark [1]).
However, the log message of commit
903788892ea0fc7fcaf7e8e5fac9a77379fc215b says
"[akpm@linux-foundation.org: use tolower(), saving 3 bytes, test the more common case first - it's quicker]"
I don't know the details of akpm's change, so I'm unsure whether my patch
has problems that I have missed.
--
iveqy
[1] Benchmark
#include <ctype.h>
#include <stdio.h>
#include <sys/time.h>
int main()
{
    struct timeval start, end;
    int i,itr;
    itr = 100000;
    char c = 'A';
    long diff;
    gettimeofday(&start,NULL);
    for(i = 0; i < itr; i++) {
        if((c >= 'A') && (c <= 'F')) {}
    }
    gettimeofday(&end,NULL);
    /* tv_usec counts microseconds, so the result is in us, not ms */
    diff = end.tv_usec - start.tv_usec;
    printf("if-statement: %ld us\n",diff);
    gettimeofday(&start,NULL);
    for(i = 0; i < itr; i++) {
        c = tolower(c);
    }
    gettimeofday(&end,NULL);
    diff = end.tv_usec - start.tv_usec;
    printf("tolower(): %ld us\n",diff);
    return 0;
}
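The benchmark above subtracts only the tv_usec fields, so the difference can go negative whenever a second boundary falls between the two gettimeofday() calls. A minimal sketch of a helper that also folds in tv_sec follows; the name elapsed_us is hypothetical and not part of the original posting.

```c
#include <sys/time.h>

/* Hypothetical helper (not from the thread): microseconds elapsed between
 * two gettimeofday() samples. Including tv_sec keeps the result correct
 * when the samples straddle a second boundary. */
static long long elapsed_us(const struct timeval *start,
                            const struct timeval *end)
{
    long long sec = (long long)end->tv_sec - (long long)start->tv_sec;
    long long usec = (long long)end->tv_usec - (long long)start->tv_usec;

    return sec * 1000000LL + usec;
}
```

It would be called as elapsed_us(&start, &end) in place of the tv_usec subtraction.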
[2] Patch
This is faster (at least on i686).
---
lib/hexdump.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/lib/hexdump.c b/lib/hexdump.c
index 5d7a480..f01d11c 100644
--- a/lib/hexdump.c
+++ b/lib/hexdump.c
@@ -26,9 +26,10 @@ int hex_to_bin(char ch)
 {
 	if ((ch >= '0') && (ch <= '9'))
 		return ch - '0';
-	ch = tolower(ch);
 	if ((ch >= 'a') && (ch <= 'f'))
 		return ch - 'a' + 10;
+	if ((ch >= 'A') && (ch <= 'F'))
+		return ch - 'A' + 10;
 	return -1;
 }
 EXPORT_SYMBOL(hex_to_bin);
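Before comparing speed, it may be worth confirming that the two variants agree on every input. The sketch below is a userspace approximation, not kernel code: it assumes the default "C" locale and uses tolower() from <ctype.h> rather than the kernel's own. The names hex_to_bin_if, hex_to_bin_to and count_mismatches are chosen for illustration.

```c
#include <ctype.h>

/* Variant proposed by the patch: explicit range tests for both cases. */
static int hex_to_bin_if(char ch)
{
    if ((ch >= '0') && (ch <= '9'))
        return ch - '0';
    if ((ch >= 'a') && (ch <= 'f'))
        return ch - 'a' + 10;
    if ((ch >= 'A') && (ch <= 'F'))
        return ch - 'A' + 10;
    return -1;
}

/* Variant removed by the patch: fold case with tolower() first. */
static int hex_to_bin_to(char ch)
{
    if ((ch >= '0') && (ch <= '9'))
        return ch - '0';
    /* The cast avoids passing a negative value to tolower(). */
    ch = (char)tolower((unsigned char)ch);
    if ((ch >= 'a') && (ch <= 'f'))
        return ch - 'a' + 10;
    return -1;
}

/* Count the byte values on which the two variants disagree. */
static int count_mismatches(void)
{
    int c, bad = 0;

    for (c = 0; c < 256; c++)
        if (hex_to_bin_if((char)c) != hex_to_bin_to((char)c))
            bad++;
    return bad;
}
```

A result of zero from count_mismatches() means the patch changes only how the value is computed, not what is returned.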
* Re: hex_to_bin speedup
[not found] ` <20100527142848.GA23543@iveqy.com>
@ 2010-05-27 16:35 ` Andy Shevchenko
2010-05-28 21:05 ` Fredrik Gustafsson
0 siblings, 1 reply; 4+ messages in thread
From: Andy Shevchenko @ 2010-05-27 16:35 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: linux-kernel
On Thu, May 27, 2010 at 5:28 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> I saw from the mailinglist that the initial patch suggested for this
> function did not use tolower() but an if statement.
Exactly, and there was a small discussion around it.
> On Thu, May 27, 2010 at 02:22:30PM +0300, Andy Shevchenko wrote:
>> To your test case:
>>  - the data is not random (CPU cache hits distribution is differ to real)
> I thought that because the best case senario was equal, only the worst
> case was of interest. Of course you're right. I rewrote the testcase to
> use random numbers instead.
>
>>  - the compiler call and parameters are not published (gcc -O2
>> probably removes unused parts of code)
> I didn't use any parameters at all. However with the improved benchmark
> I tested both without and with -O2 and got different results. I'm not
> qualified to judge what this can do for the kernel.
>
>>  - please, provide assembler excerpts to compare implementations
> Please se [3]-[6]
>
> P.S. How come we don't CC the LKML?
My fault. Return back to public.
> --
> iveqy
>
> [1] Benchmark results
> $ gcc -O2 main.c && ./a.out
> tolower() was faster 2657 times
> if-statement was faster 2571 times
> Equal fast 4772 times
So, optimizer do the job for tolower(). Thus, nothing to worry about.
> $ gcc main.c && ./a.out
> tolower() was faster 1532 times
> if-statement was faster 8465 times
> Equal fast 3 times
>
> [2] Benchmark source
>
> #include <sys/time.h>
> #include <stdio.h>
>
> int hex_to_bin_if(char ch)
> {
>     if ((ch >= '0') && (ch <= '9'))
>         return ch - '0';
>     if ((ch >= 'a') && (ch <= 'f'))
>         return ch - 'a' + 10;
>     if ((ch >= 'A') && (ch <= 'F'))
>         return ch - 'A' + 10;
>     return -1;
> }
> int hex_to_bin_to(char ch)
> {
>     if ((ch >= '0') && (ch <= '9'))
>         return ch - '0';
>     ch = tolower(ch);
>     if ((ch >= 'a') && (ch <= 'f'))
>         return ch - 'a' + 10;
>     return -1;
> }
> char bin_to_hex(int i)
> {
>     switch(i) {
>         case 0:
>             return '0';
>         case 1:
>             return '1';
>         case 2:
>             return '2';
>         case 3:
>             return '3';
>         case 4:
>             return '4';
>         case 5:
>             return '5';
>         case 6:
>             return '6';
>         case 7:
>             return '7';
>         case 8:
>             return '8';
>         case 9:
>             return '9';
>         case 10:
>             return 'a';
>         case 11:
>             return 'b';
>         case 12:
>             return 'c';
>         case 13:
>             return 'd';
>         case 14:
>             return 'e';
>         case 15:
>             return 'f';
>         case 16:
>             return '0';
>         case 17:
>             return '1';
>         case 18:
>             return '2';
>         case 19:
>             return '3';
>         case 20:
>             return '4';
>         case 21:
>             return '5';
>         case 22:
>             return '6';
>         case 23:
>             return '7';
>         case 24:
>             return '8';
>         case 25:
>             return '9';
>         case 26:
>             return 'A';
>         case 27:
>             return 'B';
>         case 28:
>             return 'C';
>         case 29:
>             return 'D';
>         case 30:
>             return 'E';
>         case 31:
>             return 'F';
>     }
> }
> int main()
> {
>     // Needed variables
>     struct timeval start, end;
>     int i,itr,res,ran,j,won_if = 0, won_to = 0,equal = 0;
>     itr = 10000;
>     char c;
>     time_t tif,tto,diff;
>     unsigned int iseed = (unsigned int)time(NULL);
>     srand(iseed);
>
>     for(j = 0; j < itr; j++) {
>         // Test if-statement
>         gettimeofday(&start,NULL);
>         for(i = 0; i < itr; i++) {
>             ran = rand() % 32;
>             c = bin_to_hex(ran);
>             res = hex_to_bin_if(c);
>         }
>         gettimeofday(&end,NULL);
>         tif = end.tv_usec - start.tv_usec;
>
>         // Test tolower()
>         gettimeofday(&start,NULL);
>         for(i = 0; i < itr; i++) {
>             ran = rand() % 32;
>             c = bin_to_hex(ran);
>             res = hex_to_bin_to(c);
>         }
>         gettimeofday(&end,NULL);
>         tto = end.tv_usec - start.tv_usec;
>
>         // Calculate winner
>         diff = tto - tif;
>         if(diff == 0) {
>             equal++;
>         } else if(tto - tif > 0) {
>             won_if++;
>         } else {
>             diff *= -1;
>             won_to++;
>         }
>     }
>     printf("tolower() was faster %d times\n",won_to);
>     printf("if-statement was faster %d times\n",won_if);
>     printf("Equal fast %d times\n",equal);
> }
> [3] Assembler code if-statement (gcc -S main.c)
> hex_to_bin_if:
>     pushl   %ebp
>     movl    %esp, %ebp
>     subl    $8, %esp
>     movl    8(%ebp), %eax
>     movb    %al, -4(%ebp)
>     cmpb    $47, -4(%ebp)
>     jle     .L2
>     cmpb    $57, -4(%ebp)
>     jg      .L2
>     movsbl  -4(%ebp),%eax
>     subl    $48, %eax
>     movl    %eax, -8(%ebp)
>     jmp     .L3
> .L2:
>     cmpb    $96, -4(%ebp)
>     jle     .L4
>     cmpb    $102, -4(%ebp)
>     jg      .L4
>     movsbl  -4(%ebp),%eax
>     subl    $87, %eax
>     movl    %eax, -8(%ebp)
>     jmp     .L3
> .L4:
>     cmpb    $64, -4(%ebp)
>     jle     .L5
>     cmpb    $70, -4(%ebp)
>     jg      .L5
>     movsbl  -4(%ebp),%eax
>     subl    $55, %eax
>     movl    %eax, -8(%ebp)
>     jmp     .L3
> .L5:
>     movl    $-1, -8(%ebp)
> .L3:
>     movl    -8(%ebp), %eax
>     leave
>     ret
>     .size   hex_to_bin_if, .-hex_to_bin_if
> .globl hex_to_bin_to
>     .type   hex_to_bin_to, @function
>
> [4] Assembler code tolower() (gcc -S main.c)
> hex_to_bin_to:
>     pushl   %ebp
>     movl    %esp, %ebp
>     subl    $24, %esp
>     movl    8(%ebp), %eax
>     movb    %al, -4(%ebp)
>     cmpb    $47, -4(%ebp)
>     jle     .L8
>     cmpb    $57, -4(%ebp)
>     jg      .L8
>     movsbl  -4(%ebp),%eax
>     subl    $48, %eax
>     movl    %eax, -8(%ebp)
>     jmp     .L9
> .L8:
>     movsbl  -4(%ebp),%eax
>     movl    %eax, (%esp)
>     call    tolower
>     movb    %al, -4(%ebp)
>     cmpb    $96, -4(%ebp)
>     jle     .L10
>     cmpb    $102, -4(%ebp)
>     jg      .L10
>     movsbl  -4(%ebp),%eax
>     subl    $87, %eax
>     movl    %eax, -8(%ebp)
>     jmp     .L9
> .L10:
>     movl    $-1, -8(%ebp)
> .L9:
>     movl    -8(%ebp), %eax
>     leave
>     ret
>     .size   hex_to_bin_to, .-hex_to_bin_to
>
> [5] Assembler code if-statement (gcc -S -O2 main.c)
> hex_to_bin_if:
>     pushl   %ebp
>     movl    %esp, %ebp
>     movzbl  8(%ebp), %edx
>     leal    -48(%edx), %eax
>     cmpb    $9, %al
>     jbe     .L8
>     leal    -97(%edx), %eax
>     cmpb    $5, %al
>     ja      .L4
>     movsbl  %dl,%eax
>     leal    -87(%eax), %ecx
> .L3:
>     movl    %ecx, %eax
>     popl    %ebp
>     ret
>     .p2align 4,,7
>     .p2align 3
> .L8:
>     movsbl  %dl,%eax
>     leal    -48(%eax), %ecx
>     movl    %ecx, %eax
>     popl    %ebp
>     ret
>     .p2align 4,,7
>     .p2align 3
> .L4:
>     leal    -65(%edx), %eax
>     movl    $-1, %ecx
>     cmpb    $5, %al
>     ja      .L3
>     movsbl  %dl,%eax
>     leal    -55(%eax), %ecx
>     jmp     .L3
>     .size   hex_to_bin_if, .-hex_to_bin_if
>     .p2align 4,,15
>
> [6] Assembler code tolower() (gcc -S -O2 main.c)
> hex_to_bin_to:
>     pushl   %ebp
>     movl    %esp, %ebp
>     subl    $8, %esp
>     movzbl  8(%ebp), %edx
>     leal    -48(%edx), %eax
>     cmpb    $9, %al
>     ja      .L38
>     movsbl  %dl,%eax
>     leal    -48(%eax), %edx
> .L39:
>     movl    %edx, %eax
>     leave
>     ret
>     .p2align 4,,7
>     .p2align 3
> .L38:
>     movsbl  %dl,%eax
>     movl    %eax, (%esp)
>     call    tolower
>     movl    $-1, %edx
>     movl    %eax, %ecx
>     leal    -97(%ecx), %eax
>     cmpb    $5, %al
>     ja      .L39
>     movsbl  %cl,%eax
>     leal    -87(%eax), %edx
>     jmp     .L39
>     .size   hex_to_bin_to, .-hex_to_bin_to
>     .section        .rodata.str1.4,"aMS",@progbits,1
>     .align 4
>
--
With Best Regards,
Andy Shevchenko
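The remark that gcc -O2 probably removes unused parts of the code applies to any loop whose result is never read, as with res in the quoted benchmark. Whether that actually happened in the posted runs is not established in this thread. One common way to rule it out is to store each result through a volatile object; the sketch below assumes that approach, and the names sink and run_once are hypothetical.

```c
#include <stddef.h>

/* Same if-statement variant as in the quoted benchmark. */
static int hex_to_bin_if(char ch)
{
    if ((ch >= '0') && (ch <= '9'))
        return ch - '0';
    if ((ch >= 'a') && (ch <= 'f'))
        return ch - 'a' + 10;
    if ((ch >= 'A') && (ch <= 'F'))
        return ch - 'A' + 10;
    return -1;
}

/* Stores to a volatile object are observable side effects, so the
 * compiler has to keep the computation that feeds each store. */
static volatile int sink;

/* Convert every character in buf, publish each result through sink,
 * and return the sum so the caller can sanity-check the run. */
static long run_once(const char *buf, size_t len)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        int v = hex_to_bin_if(buf[i]);

        sink = v;
        sum += v;
    }
    return sum;
}
```

Timing would then wrap repeated calls to run_once() over a buffer of pre-generated characters, which also keeps rand() out of the measured loop.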
* Re: hex_to_bin speedup
2010-05-27 16:35 ` Andy Shevchenko
@ 2010-05-28 21:05 ` Fredrik Gustafsson
0 siblings, 0 replies; 4+ messages in thread
From: Fredrik Gustafsson @ 2010-05-28 21:05 UTC (permalink / raw)
To: Andy Shevchenko; +Cc: linux-kernel
On Thu, May 27, 2010 at 07:35:05PM +0300, Andy Shevchenko wrote:
> > [1] Benchmark results
> > $ gcc -O2 main.c && ./a.out
> > tolower() was faster 2657 times
> > if-statement was faster 2571 times
> > Equal fast 4772 times
> So, optimizer do the job for tolower(). Thus, nothing to worry about.
Yes I guess so. I tested with -Os too and that also does the trick.
Thank you for your time.
--
iveqy
* Re: hex_to_bin speedup
@ 2010-05-28 21:38 George Spelvin
0 siblings, 0 replies; 4+ messages in thread
From: George Spelvin @ 2010-05-28 21:38 UTC (permalink / raw)
To: andy.shevchenko, iveqy; +Cc: linux, linux-kernel
1) First of all, I'd worry about code size far more than speed.
This is not fast-path code.
2) Second, given that you're already doing a range test,
the fastest way to perform a tolower() is "c |= 0x20"
(ASCII upper- and lowercase letters differ only in bit 0x20).
Generally it's something like:
int hex_to_bin(char ch)
{
	ch -= '0';
	if ((unsigned char)ch <= 9)
		return ch;
	ch |= 0x20;
	ch -= 'a' - '0';
	if ((unsigned char)ch <= 5)	/* 'a'..'f' map to 0..5 */
		return ch + 10;
	return -1;
}
that produces the even smaller code:
hex_to_bin_or:
	movb	4(%esp), %dl
	subl	$48, %edx
	cmpb	$9, %dl
	movsbl	%dl,%eax
	jbe	.L10
	orl	$32, %edx
	orl	$-1, %eax
	subl	$49, %edx
	cmpb	$5, %dl
	ja	.L10
	movsbl	%dl,%eax
	addl	$10, %eax
.L10:
	ret
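A quick way to gain confidence in the OR-based variant is to compare it with plain range tests over all 256 byte values. The sketch below is a userspace check under that assumption, not kernel code. It uses an upper bound of 5 for the letter range, since 'a'..'f' map to 0..5 after the subtraction, and the names hex_to_bin_ref, hex_to_bin_or and count_mismatches_or are chosen for illustration.

```c
/* Reference implementation: plain range tests. */
static int hex_to_bin_ref(char ch)
{
    if ((ch >= '0') && (ch <= '9'))
        return ch - '0';
    if ((ch >= 'a') && (ch <= 'f'))
        return ch - 'a' + 10;
    if ((ch >= 'A') && (ch <= 'F'))
        return ch - 'A' + 10;
    return -1;
}

/* OR-based variant: one unsigned compare per range, case folded by
 * setting bit 0x20 after the '0' offset has been removed. */
static int hex_to_bin_or(char ch)
{
    unsigned char c = (unsigned char)ch - '0';

    if (c <= 9)
        return c;
    c |= 0x20;          /* 'A'-'0' (0x11) becomes 'a'-'0' (0x31) */
    c -= 'a' - '0';     /* 'a'..'f' now map to 0..5 */
    if (c <= 5)
        return c + 10;
    return -1;
}

/* Count the byte values on which the two implementations disagree. */
static int count_mismatches_or(void)
{
    int c, bad = 0;

    for (c = 0; c < 256; c++)
        if (hex_to_bin_ref((char)c) != hex_to_bin_or((char)c))
            bad++;
    return bad;
}
```

With a bound of 6 instead of 5, 'g' and 'G' would be accepted and converted to 16, which the exhaustive comparison reports as two mismatches.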