From: Michal Nazarewicz <mpn@google.com>
To: George Spelvin <linux@horizon.com>,
linux@horizon.com, vda.linux@googlemail.com
Cc: hughd@google.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8
Date: Mon, 24 Sep 2012 14:29:56 +0200
Message-ID: <xa1tipb3d2nv.fsf@mina86.com>
In-Reply-To: <20120924114602.501.qmail@science.horizon.com>
>>> @@ -174,20 +174,12 @@ char *put_dec_trunc8(char *buf, unsigned r)
>>>  	unsigned q;
>>>  	/* Copy of previous function's body with added early returns */
>>> -	q = (r * (uint64_t)0x1999999a) >> 32;
>>> -	*buf++ = (r - 10 * q) + '0'; /* 2 */
>>> -	if (q == 0)
>>> -		return buf;
>>> -	r = (q * (uint64_t)0x1999999a) >> 32;
>>> -	*buf++ = (q - 10 * r) + '0'; /* 3 */
>>> -	if (r == 0)
>>> -		return buf;
>>> -	q = (r * (uint64_t)0x1999999a) >> 32;
>>> -	*buf++ = (r - 10 * q) + '0'; /* 4 */
>>> -	if (q == 0)
>>> -		return buf;
>>> -	r = (q * (uint64_t)0x1999999a) >> 32;
>>> -	*buf++ = (q - 10 * r) + '0'; /* 5 */
>>> +	while (r >= 10000) {
>>> +		q = r + '0';
>>> +		r = (r * (uint64_t)0x1999999a) >> 32;
>>> +		*buf++ = q - 10*r;
>>> +	}
All right, I now see what the loop is doing (I couldn't grasp it
yesterday), and except for r=0 it looks legit.
On Mon, Sep 24 2012, George Spelvin wrote:
> Truthfully, it would have made *more* sense to swap q and r globally,
> so the loop had a more sensible q=quotient/r=remainder assignment,
> but I wanted to show that the unmodified tail was in fact unmodified.
The original has it a bit awkwardly because it just copies code from
put_dec_full9() with the first iteration skipped.
> The big saving from using a loop is that it avoids unnecessary
> 32x32->64-bit multiplies, falling through to the 16x16->32-bit
> code as early as possible. Given that most numbers are small,
> this seemed like a significant win.
Ah, makes sense.
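To make the range argument concrete, here is a minimal user-space sketch (not kernel code) that brute-force checks each of the three multiply-and-shift constants used in put_dec_trunc8 on the range where the function relies on it. The names div10_wide, div10_narrow, div10_tiny, count_mismatches and check_div10_ranges are made up for illustration, and the sketch assumes unsigned is at least 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* 32x32->64 multiply: used while r can still have up to 8 digits. */
static unsigned div10_wide(unsigned r)
{
	return (r * (uint64_t)0x1999999a) >> 32;
}

/* 16x16->32 multiply: only relied on once r < 10000. */
static unsigned div10_narrow(unsigned r)
{
	return (r * 0x199a) >> 16;
}

/* Smaller constant still: only relied on once r < 1000. */
static unsigned div10_tiny(unsigned r)
{
	return (r * 0xcd) >> 11;
}

/* Returns the number of inputs below limit where f(r) != r / 10. */
static unsigned long count_mismatches(unsigned (*f)(unsigned), unsigned limit)
{
	unsigned long bad = 0;
	unsigned r;

	for (r = 0; r < limit; r++)
		if (f(r) != r / 10)
			bad++;
	return bad;
}

/* Exhaustively checks each constant on the range it is used for. */
static void check_div10_ranges(void)
{
	assert(count_mismatches(div10_wide, 100000000) == 0);
	assert(count_mismatches(div10_narrow, 10000) == 0);
	assert(count_mismatches(div10_tiny, 1000) == 0);
}
```

The narrow constants are only approximations of 1/10, so they stop being exact once the input grows: div10_narrow(16389), for instance, yields 1639 rather than 1638. That is why the wide multiply has to keep running until r has dropped below 10000 before the cheaper code takes over.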
I guess the following should work, even though it's not so pretty:
static noinline_for_stack
char *put_dec_trunc8(char *buf, unsigned r) {
	unsigned q;

	/*
	 * Must be >=, not >: the tail below emits at most four digits,
	 * so r == 10000 has to go through the loop once.
	 */
	if (r >= 10000) {
		do {
			q = r + '0';
			r = (r * (uint64_t)0x1999999a) >> 32;
			*buf++ = q - 10 * r;
		} while (r >= 10000);
		if (r == 0)
			return buf;
	}

	q = (r * 0x199a) >> 16;
	*buf++ = (r - 10 * q) + '0'; /* 6 */
	if (q == 0)
		return buf;
	r = (q * 0xcd) >> 11;
	*buf++ = (q - 10 * r) + '0'; /* 7 */
	if (r == 0)
		return buf;
	q = (r * 0xcd) >> 11;
	*buf++ = (r - 10 * q) + '0'; /* 8 */
	if (q == 0)
		return buf;
	*buf++ = q + '0'; /* 9 */
	return buf;
}
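
A quick way to sanity-check a function like this outside the kernel is to compare it against snprintf(). The following is only a sketch under a few assumptions: noinline_for_stack is dropped because it is a kernel-only annotation, the guard is written as r >= 10000 so that r == 10000 takes the loop path, and the helper names trunc8_matches_snprintf and trunc8_selftest are hypothetical. The digits come out least significant first, so the comparison walks the reference string backwards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* User-space copy of the function; emits digits in reverse order. */
static char *put_dec_trunc8(char *buf, unsigned r)
{
	unsigned q;

	if (r >= 10000) {	/* assumption: >= rather than > */
		do {
			q = r + '0';
			r = (r * (uint64_t)0x1999999a) >> 32;
			*buf++ = q - 10 * r;
		} while (r >= 10000);
		if (r == 0)
			return buf;
	}

	q = (r * 0x199a) >> 16;
	*buf++ = (r - 10 * q) + '0';
	if (q == 0)
		return buf;
	r = (q * 0xcd) >> 11;
	*buf++ = (q - 10 * r) + '0';
	if (r == 0)
		return buf;
	q = (r * 0xcd) >> 11;
	*buf++ = (r - 10 * q) + '0';
	if (q == 0)
		return buf;
	*buf++ = q + '0';
	return buf;
}

/* Hypothetical helper: 1 if the reversed output equals snprintf("%u"). */
static int trunc8_matches_snprintf(unsigned n)
{
	char got[16], want[16];
	char *end = put_dec_trunc8(got, n);
	size_t len = (size_t)(end - got);
	int wlen = snprintf(want, sizeof(want), "%u", n);
	size_t i;

	if (wlen < 0 || (size_t)wlen != len)
		return 0;
	for (i = 0; i < len; i++)
		if (got[i] != want[len - 1 - i])
			return 0;
	return 1;
}

/* Hypothetical self-test over the small values and the 8-digit maximum. */
static void trunc8_selftest(void)
{
	unsigned n;

	for (n = 0; n < 200000; n++)
		assert(trunc8_matches_snprintf(n));
	assert(trunc8_matches_snprintf(99999999));
}
```

Calling trunc8_selftest() from a small main() covers the boundary at 10000, where the loop and the four-digit tail meet, as well as r = 0.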
--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michał “mina86” Nazarewicz (o o)
ooo +----<email/xmpp: mpn@google.com>--------------ooO--(_)--Ooo--