* [Patch] performance enhancement for simple_strtoul
@ 2000-12-20 14:09 Steve Grubb
2000-12-20 16:04 ` Jeff Epler
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Steve Grubb @ 2000-12-20 14:09 UTC (permalink / raw)
To: linux-kernel
Hello,
The following patch is a faster implementation of the simple_strtoul
function. This function differs from the original in that it reduces the
multiplies to shifts and logical operations wherever possible. My testing
shows that it adds around 100 bytes, but is about 6% faster on a K6-2. (It
is 40% faster that glibc's strtoul, but that's a different story.) My guess
is that the performance gain will be higher on platforms with slower
multiplication instructions. I have tested it for numerical accuracy so I
think this is safe to apply. If anyone is interested, I can also supply a
test application that demonstrates the performance gain. This patch was
generated against 2.2.16, but should apply to 2.2.19 cleanly. In
2.4.0-test9, simple_strtoul starts on line 19 rather than 17, hopefully
that's not a problem.
Cheers,
Steve Grubb
-------------------------
--- lib/vsprintf.orig Fri Dec 1 08:58:02 2000
+++ lib/vsprintf.c Wed Dec 20 08:42:26 2000
@@ -14,10 +14,13 @@
#include <linux/string.h>
#include <linux/ctype.h>
+/*
+* This function converts base 8, 10, or 16 only - Steve Grubb
+*/
unsigned long simple_strtoul(const char *cp,char **endp,unsigned int base)
{
- unsigned long result = 0,value;
-
+ unsigned char c;
+ unsigned long result = 0;
if (!base) {
base = 10;
if (*cp == '0') {
@@ -29,11 +32,36 @@
}
}
}
- while (isxdigit(*cp) && (value = isdigit(*cp) ? *cp-'0' : (islower(*cp)
- ? toupper(*cp) : *cp)-'A'+10) < base) {
- result = result*base + value;
- cp++;
- }
+ c = *cp;
+ switch (base) {
+ case 10:
+ while (isdigit(c)) {
+ result = (result*10) + (c & 0x0f);
+ c = *(++cp);
+ }
+ break;
+ case 16:
+ while (isxdigit(c)) {
+ result = result<<4;
+ if (c&0x40)
+ result += (c & 0x07) + 9;
+ else
+ result += c & 0x0f;
+ c = *(++cp);
+ }
+ break;
+ case 8:
+ while (isdigit(c)) {
+ if ((c&0x37) == c)
+ result = (result<<3) + (c & 0x07);
+ else
+ break;
+ c = *(++cp);
+ }
+ break;
+ default: /* Is anything else used by the kernel? */
+ break;
+ }
if (endp)
*endp = (char *)cp;
return result;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-20 14:09 [Patch] performance enhancement for simple_strtoul Steve Grubb
@ 2000-12-20 16:04 ` Jeff Epler
2000-12-20 16:35 ` Steve Grubb
2000-12-20 18:42 ` Steve Grubb
2000-12-21 9:16 ` Matthias Andree
2000-12-21 20:05 ` Pavel Machek
2 siblings, 2 replies; 11+ messages in thread
From: Jeff Epler @ 2000-12-20 16:04 UTC (permalink / raw)
To: Steve Grubb; +Cc: linux-kernel
On Wed, Dec 20, 2000 at 09:09:03AM -0500, Steve Grubb wrote:
> Hello,
>
> The following patch is a faster implementation of the simple_strtoul
> function.
[snip]
Why not preserve the existing code for bases other than 8, 10, and 16?
Admittedly, the only other case that is likely to be used would be base
2, but surely there's only a penalty of a few dozen bytes for the
following code..
> - while (isxdigit(*cp) && (value = isdigit(*cp) ? *cp-'0' : (islower(*cp)
> - ? toupper(*cp) : *cp)-'A'+10) < base) {
> - result = result*base + value;
> - cp++;
> - }
Jeff
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-20 16:04 ` Jeff Epler
@ 2000-12-20 16:35 ` Steve Grubb
2000-12-20 18:42 ` Steve Grubb
1 sibling, 0 replies; 11+ messages in thread
From: Steve Grubb @ 2000-12-20 16:35 UTC (permalink / raw)
To: Jeff Epler; +Cc: linux-kernel
Hello,
I thought about that. This would be my recommendation for glibc where the
general public may be doing scientific applications. But this is the kernel
and there are people that would reject my patch purely on the basis that it
adds precious bytes to the kernel. But since the kernel is "controllable" &
printf() and its variants only support 8, 10, & 16, perhaps a better
solution might be to trap the odd case and write something for it if its
that important, or simply don't allow it.
The base guessing part at the beginning of the function only supports base
8, 10, & 16. Therefore, the only way to require another base is to specify
it in the function call (param - unsigned int base). A quick scan of the
current linux source shows no one using something odd. So...
If the maintainers of vsprintf.c want support for all number bases, that's
fine with me. Just say the word & I'll gen up another patch...but it will be
more bytes.
Cheers,
Steve Grubb
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-20 16:04 ` Jeff Epler
2000-12-20 16:35 ` Steve Grubb
@ 2000-12-20 18:42 ` Steve Grubb
2000-12-21 0:09 ` Jamie Lokier
1 sibling, 1 reply; 11+ messages in thread
From: Steve Grubb @ 2000-12-20 18:42 UTC (permalink / raw)
To: linux-kernel
Hello,
I continued experimenting with the Test Case and found a further speed
improvement & I am re-submiting the patch. It is the same as the first one
with the two local variables changed to register storage types.
On a K6-2, I now see:
Base 10 - 28% speedup
Base 16 - 24% speedup
Base 8 - 30% speedup
On a P3 system, I now see:
Base 10 - 25% speedup
Base 16 - 17% speedup
Base 8 - 20% speedup
It seems gcc creates much better code with the variables set to register
types. Please apply the following patch. It should apply to any recent 2.2.x
without problems. In 2.4 the function starts 2 lines later.
Cheers,
Steve Grubb
----------------------------------
--- lib/vsprintf.orig Fri Dec 1 08:58:02 2000
+++ lib/vsprintf.c Wed Dec 20 13:14:13 2000
@@ -14,10 +14,13 @@
#include <linux/string.h>
#include <linux/ctype.h>
+/*
+* This function converts base 8, 10, or 16 only - Steve Grubb
+*/
unsigned long simple_strtoul(const char *cp,char **endp,unsigned int base)
{
- unsigned long result = 0,value;
-
+ register unsigned char c;
+ register unsigned long result = 0;
if (!base) {
base = 10;
if (*cp == '0') {
@@ -29,11 +32,36 @@
}
}
}
- while (isxdigit(*cp) && (value = isdigit(*cp) ? *cp-'0' : (islower(*cp)
- ? toupper(*cp) : *cp)-'A'+10) < base) {
- result = result*base + value;
- cp++;
- }
+ c = *cp;
+ switch (base) {
+ case 10:
+ while (isdigit(c)) {
+ result = (result*10) + (c & 0x0f);
+ c = *(++cp);
+ }
+ break;
+ case 16:
+ while (isxdigit(c)) {
+ result = result<<4;
+ if (c&0x40)
+ result += (c & 0x07) + 9;
+ else
+ result += c & 0x0f;
+ c = *(++cp);
+ }
+ break;
+ case 8:
+ while (isdigit(c)) {
+ if ((c&0x37) == c)
+ result = (result<<3) + (c & 0x07);
+ else
+ break;
+ c = *(++cp);
+ }
+ break;
+ default: /* Is anything else used by the kernel? */
+ break;
+ }
if (endp)
*endp = (char *)cp;
return result;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-20 18:42 ` Steve Grubb
@ 2000-12-21 0:09 ` Jamie Lokier
2000-12-21 20:06 ` Pavel Machek
0 siblings, 1 reply; 11+ messages in thread
From: Jamie Lokier @ 2000-12-21 0:09 UTC (permalink / raw)
To: Steve Grubb; +Cc: linux-kernel
Steve Grubb wrote:
> It seems gcc creates much better code with the variables set to register
> types.
Curious. GCC should be generating the same code regardless; ah well.
Is strtoul actually used in the kernel other than for the occasional
(rare) write to /proc/sys and parsing boot options?
> But this is the kernel and there are people that would reject my patch
> purely on the basis that it adds precious bytes to the kernel.
Perhaps I am mistaken but I'd expect it to be called what, ten times at
boot time, and a couple of times when X loads the MTRRs?
Sounds like the neatest trick would be reducing bytes used here...
-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-20 14:09 [Patch] performance enhancement for simple_strtoul Steve Grubb
2000-12-20 16:04 ` Jeff Epler
@ 2000-12-21 9:16 ` Matthias Andree
2000-12-21 10:28 ` Bernd Schmidt
2000-12-21 16:07 ` Alan Cox
2000-12-21 20:05 ` Pavel Machek
2 siblings, 2 replies; 11+ messages in thread
From: Matthias Andree @ 2000-12-21 9:16 UTC (permalink / raw)
To: Steve Grubb; +Cc: Linux-Kernel mailing list
On Wed, 20 Dec 2000, Steve Grubb wrote:
> + while (isdigit(c)) {
> + result = (result*10) + (c & 0x0f);
> + c = *(++cp);
> + }
x * 10 can be written as:
(x << 2 + x) << 1 = (4x+x) * 2
(x << 3) + (x << 1) = 8x + 2x
Not sure if that/which one is faster, you may want to benchmark.
However, on machines that I have seen, multiplication times are either
constant or depend on the count of set bits in the second divisor, so
it's something like 6 + 2s.
However, I have only m68k data books here, and it will gain nothing on
an 'C68060 since those beasts ram down multiplications in 2 cycles, so
we'd gain nothing on those chips (OK, the shifts take 1 cycles each and
are scheduled in parallel, and the add takes an additional cycle after
the shifts have completed). Not sure about the ix86, alpha or sparc
series.
--
Matthias Andree
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-21 9:16 ` Matthias Andree
@ 2000-12-21 10:28 ` Bernd Schmidt
2000-12-21 16:07 ` Alan Cox
1 sibling, 0 replies; 11+ messages in thread
From: Bernd Schmidt @ 2000-12-21 10:28 UTC (permalink / raw)
To: Matthias Andree; +Cc: Steve Grubb, Linux-Kernel mailing list
On Thu, 21 Dec 2000, Matthias Andree wrote:
>
> x * 10 can be written as:
>
> (x << 2 + x) << 1 = (4x+x) * 2
> (x << 3) + (x << 1) = 8x + 2x
Or as "x * 10". Which has the advantage of actually being readable, and
letting the compiler optimize it into one of the other forms if that's
profitable on the machine you are compiling for.
Why do you guys bother making strtoul run fast anyway? Surely it's not on
any critical path in the kernel?
Bernd
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-21 9:16 ` Matthias Andree
2000-12-21 10:28 ` Bernd Schmidt
@ 2000-12-21 16:07 ` Alan Cox
1 sibling, 0 replies; 11+ messages in thread
From: Alan Cox @ 2000-12-21 16:07 UTC (permalink / raw)
To: Matthias Andree; +Cc: Steve Grubb, Linux-Kernel mailing list
> On Wed, 20 Dec 2000, Steve Grubb wrote:
>
> > + while (isdigit(c)) {
> > + result = (result*10) + (c & 0x0f);
> > + c = *(++cp);
> > + }
>
> x * 10 can be written as:
>
> (x << 2 + x) << 1 = (4x+x) * 2
> (x << 3) + (x << 1) = 8x + 2x
Since when has printk been performance critical. It isnt worth microoptimising
(or in your case for some cpus micropessimising) that stuff. Besides, gcc should
work it out if its worth doing
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-20 14:09 [Patch] performance enhancement for simple_strtoul Steve Grubb
2000-12-20 16:04 ` Jeff Epler
2000-12-21 9:16 ` Matthias Andree
@ 2000-12-21 20:05 ` Pavel Machek
2 siblings, 0 replies; 11+ messages in thread
From: Pavel Machek @ 2000-12-21 20:05 UTC (permalink / raw)
To: Steve Grubb, linux-kernel
Hi!
> The following patch is a faster implementation of the simple_strtoul
> function. This function differs from the original in that it reduces the
> multiplies to shifts and logical operations wherever possible. My testing
> shows that it adds around 100 bytes, but is about 6% faster on a K6-2. (It
> is 40% faster that glibc's strtoul, but that's a different story.) My guess
> is that the performance gain will be higher on platforms with slower
> multiplication instructions. I have tested it for numerical accuracy so I
> think this is safe to apply. If anyone is interested, I can also supply a
> test application that demonstrates the performance gain. This patch was
> generated against 2.2.16, but should apply to 2.2.19 cleanly. In
> 2.4.0-test9, simple_strtoul starts on line 19 rather than 17, hopefully
> that's not a problem.
Simple question: who cares about performance of simple_strtoul?
Original is shorter, and simple_strtoul is not performace
critical. Keep it as it is.
--
I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-21 0:09 ` Jamie Lokier
@ 2000-12-21 20:06 ` Pavel Machek
2000-12-28 5:15 ` Jamie Lokier
0 siblings, 1 reply; 11+ messages in thread
From: Pavel Machek @ 2000-12-21 20:06 UTC (permalink / raw)
To: Jamie Lokier, Steve Grubb; +Cc: linux-kernel
Hi!
> > It seems gcc creates much better code with the variables set to register
> > types.
>
> Curious. GCC should be generating the same code regardless; ah well.
>
> Is strtoul actually used in the kernel other than for the occasional
> (rare) write to /proc/sys and parsing boot options?
>
> > But this is the kernel and there are people that would reject my patch
> > purely on the basis that it adds precious bytes to the kernel.
>
> Perhaps I am mistaken but I'd expect it to be called what, ten times at
> boot time, and a couple of times when X loads the MTRRs?
On second thought, ps -auxl maybe stresses simple_strtoul a little
bit. Not sure.
> Sounds like the neatest trick would be reducing bytes used here...
Pavel
--
I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Patch] performance enhancement for simple_strtoul
2000-12-21 20:06 ` Pavel Machek
@ 2000-12-28 5:15 ` Jamie Lokier
0 siblings, 0 replies; 11+ messages in thread
From: Jamie Lokier @ 2000-12-28 5:15 UTC (permalink / raw)
To: Pavel Machek; +Cc: Steve Grubb, linux-kernel
Pavel Machek wrote:
> > [about strtoul]
> > Perhaps I am mistaken but I'd expect it to be called what, ten times at
> > boot time, and a couple of times when X loads the MTRRs?
>
> On second thought, ps -auxl maybe stresses simple_strtoul a little
> bit. Not sure.
Nah. proc_pid_lookup does its own conversion from string to number, and
the rest are conversions from numbers to strings in sprintf.
-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2000-12-28 5:45 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2000-12-20 14:09 [Patch] performance enhancement for simple_strtoul Steve Grubb
2000-12-20 16:04 ` Jeff Epler
2000-12-20 16:35 ` Steve Grubb
2000-12-20 18:42 ` Steve Grubb
2000-12-21 0:09 ` Jamie Lokier
2000-12-21 20:06 ` Pavel Machek
2000-12-28 5:15 ` Jamie Lokier
2000-12-21 9:16 ` Matthias Andree
2000-12-21 10:28 ` Bernd Schmidt
2000-12-21 16:07 ` Alan Cox
2000-12-21 20:05 ` Pavel Machek
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox