From: Michal Nazarewicz <mpn@google.com>
To: George Spelvin <linux@horizon.com>,
	linux@horizon.com, vda.linux@googlemail.com
Cc: hughd@google.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8
Date: Mon, 24 Sep 2012 14:29:56 +0200
Message-ID: <xa1tipb3d2nv.fsf@mina86.com>
In-Reply-To: <20120924114602.501.qmail@science.horizon.com>

>>> @@ -174,20 +174,12 @@ char *put_dec_trunc8(char *buf, unsigned r)
>>>  	unsigned q;
>>>  	/* Copy of previous function's body with added early returns */
>>> -	q      = (r * (uint64_t)0x1999999a) >> 32;
>>> -	*buf++ = (r - 10 * q) + '0'; /* 2 */
>>> -	if (q == 0)
>>> -		return buf;
>>> -	r      = (q * (uint64_t)0x1999999a) >> 32;
>>> -	*buf++ = (q - 10 * r) + '0'; /* 3 */
>>> -	if (r == 0)
>>> -		return buf;
>>> -	q      = (r * (uint64_t)0x1999999a) >> 32;
>>> -	*buf++ = (r - 10 * q) + '0'; /* 4 */
>>> -	if (q == 0)
>>> -		return buf;
>>> -	r      = (q * (uint64_t)0x1999999a) >> 32;
>>> -	*buf++ = (q - 10 * r) + '0'; /* 5 */
>>> +	while (r >= 10000) {
>>> +		q = r + '0';
>>> +		r  = (r * (uint64_t)0x1999999a) >> 32;
>>> +		*buf++ = q - 10*r;
>>> +	}

All right, I now see what the loop is doing (I couldn't grasp it
yesterday) and except for r=0 it looks legit.

On Mon, Sep 24 2012, George Spelvin wrote:
> Truthfully, it would have made *more* sense to swap q and r globally,
> so the loop had a more sensible q=quotient/r=remainder assignment,
> but I wanted to show that the unmodified tail was in fact unmodified.

The original is a bit awkward because it just copies the code from
put_dec_full9() with the first iteration skipped.

> The big saving from using a loop is that it avoids unnecessary
> 32x32->64-bit multiplies, falling through to the 16x16->32-bit
> code as early as possible.  Given that most numbers are small,
> this seemed like a significant win.

Ah, makes sense.
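
To make the "fall through to cheaper multiplies" point concrete, here is a
small userspace sketch that checks each multiply-and-shift against a plain
x / 10 over the range where the function uses it. The helper name
count_div10_mismatches() is made up for illustration and is not part of
lib/vsprintf.c; the constants and shifts are the ones from the code under
discussion.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): count the inputs in [0, limit)
 * for which (x * mul) >> shift differs from x / 10.  The product is
 * computed in 64 bits so the same helper covers all three constants.
 */
static unsigned count_div10_mismatches(uint32_t limit, uint32_t mul,
				       unsigned shift)
{
	unsigned bad = 0;
	uint32_t x;

	assert(shift < 64);
	for (x = 0; x < limit; x++) {
		uint32_t approx = (uint32_t)(((uint64_t)x * mul) >> shift);

		if (approx != x / 10)
			bad++;
	}
	return bad;
}
```

With this, count_div10_mismatches(10000, 0x199a, 16) and
count_div10_mismatches(1000, 0xcd, 11) should both return 0, which is why
the tail can get away with 16x16->32-bit multiplies once r has dropped
below 10000.  The 0x1999999a/32 pair can be checked the same way with a
limit of 100000000, though that run takes noticeably longer.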

I guess the following should work, even though it's not so pretty:

static noinline_for_stack
char *put_dec_trunc8(char *buf, unsigned r) {
	unsigned q;

	if (r >= 10000) {
		do {
			q = r + '0';
			r = (r * (uint64_t)0x1999999a) >> 32;
			*buf++ = q - 10 * r;
		} while (r >= 10000);
		if (r == 0)
			return buf;
	}

	q      = (r * 0x199a) >> 16;
	*buf++ = (r - 10 * q)  + '0'; /* 6 */
	if (q == 0)
		return buf;
	r      = (q * 0xcd) >> 11;
	*buf++ = (q - 10 * r)  + '0'; /* 7 */
	if (r == 0)
		return buf;
	q      = (r * 0xcd) >> 11;
	*buf++ = (r - 10 * q) + '0'; /* 8 */
	if (q == 0)
		return buf;
	*buf++ = q + '0'; /* 9 */
	return buf;
}
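
For anyone who wants to try this outside the kernel, below is a rough
userspace sketch.  It is a copy of the function above entered with
r >= 10000, so that it matches the bound of the quoted while loop, plus a
hypothetical helper matches_snprintf() that reverses the output (digits are
emitted least significant first) and compares it with snprintf("%u").  The
assert() on the input range and the helper are my additions for
illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Userspace copy; emits the decimal digits of r in reverse order. */
static char *put_dec_trunc8(char *buf, unsigned r)
{
	unsigned q;

	assert(r < 100000000);	/* at most 8 decimal digits */

	if (r >= 10000) {
		do {
			q = r + '0';
			r = (r * (uint64_t)0x1999999a) >> 32;
			*buf++ = q - 10 * r;
		} while (r >= 10000);
	}

	q      = (r * 0x199a) >> 16;
	*buf++ = (r - 10 * q) + '0';
	if (q == 0)
		return buf;
	r      = (q * 0xcd) >> 11;
	*buf++ = (q - 10 * r) + '0';
	if (r == 0)
		return buf;
	q      = (r * 0xcd) >> 11;
	*buf++ = (r - 10 * q) + '0';
	if (q == 0)
		return buf;
	*buf++ = q + '0';
	return buf;
}

/* Hypothetical helper: 1 if the reversed output equals snprintf("%u"). */
static int matches_snprintf(unsigned v)
{
	char rev[16], fwd[16], ref[16];
	char *end = put_dec_trunc8(rev, v);
	size_t n = (size_t)(end - rev), i;

	for (i = 0; i < n; i++)
		fwd[i] = rev[n - 1 - i];
	fwd[n] = '\0';
	snprintf(ref, sizeof(ref), "%u", v);
	return strcmp(fwd, ref) == 0;
}
```

The interesting inputs are the boundaries: 0, 9999, 10000 and 99999999.
Looping matches_snprintf() over every value below 100000000 would be the
thorough check.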

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +----<email/xmpp: mpn@google.com>--------------ooO--(_)--Ooo--

Thread overview: 29+ messages
2012-08-03  5:21 [PATCH 1/4] lib: vsprintf: Optimize division by 10 for small integers George Spelvin
2012-08-03  5:21 ` [PATCH 2/4] lib: vsprintf: Optimize division by 10000 George Spelvin
2012-09-23 17:30   ` Michal Nazarewicz
2012-09-24 12:16     ` George Spelvin
2012-09-24 12:41       ` Michal Nazarewicz
2012-09-24 13:56         ` George Spelvin
2012-09-24 15:14           ` Geert Uytterhoeven
2012-09-24 15:48             ` George Spelvin
2012-09-24  9:03   ` Denys Vlasenko
2012-09-24 12:35     ` George Spelvin
2012-09-24 15:02       ` Denys Vlasenko
2012-08-03  5:21 ` [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8 George Spelvin
2012-09-23 14:18   ` Rabin Vincent
2012-09-24 11:13     ` George Spelvin
2012-09-24 14:33     ` George Spelvin
2012-09-24 14:53       ` Michal Nazarewicz
2012-09-24 14:57         ` Michal Nazarewicz
2012-09-23 18:22   ` Michal Nazarewicz
2012-09-24 11:46     ` George Spelvin
2012-09-24 12:29       ` Michal Nazarewicz [this message]
2012-09-24 13:49         ` George Spelvin
2012-09-24 15:06           ` Michal Nazarewicz
2012-09-25 11:44           ` George Spelvin
2012-09-25 13:00             ` Denys Vlasenko
2012-08-03  5:21 ` [PATCH 4/4] lib: vsprintf: Fix broken comments George Spelvin
2012-09-23 17:22 ` [PATCH 1/4] lib: vsprintf: Optimize division by 10 for small integers Michal Nazarewicz
2012-09-24 14:18   ` George Spelvin
2012-09-24  9:06 ` Denys Vlasenko
2012-09-24 11:27   ` George Spelvin
