From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52936)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Yql2v-0002Vz-K2 for qemu-devel@nongnu.org;
	Fri, 08 May 2015 12:22:42 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Yql2q-0004uX-GI for qemu-devel@nongnu.org;
	Fri, 08 May 2015 12:22:41 -0400
Received: from mx1.redhat.com ([209.132.183.28]:42420)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Yql2q-0004uO-7d for qemu-devel@nongnu.org;
	Fri, 08 May 2015 12:22:36 -0400
Message-ID: <554CE2CA.2080005@redhat.com>
Date: Fri, 08 May 2015 12:22:34 -0400
From: John Snow
MIME-Version: 1.0
References: <1430864578-22072-1-git-send-email-jsnow@redhat.com>
 <1430864578-22072-5-git-send-email-jsnow@redhat.com>
 <87pp6eusrz.fsf@blackfin.pond.sub.org> <554A2142.7090006@redhat.com>
 <87y4l13f8z.fsf@blackfin.pond.sub.org> <554A3EE1.6050207@redhat.com>
 <554BCAB0.9030707@redhat.com> <87lhgzr3g1.fsf@blackfin.pond.sub.org>
In-Reply-To: <87lhgzr3g1.fsf@blackfin.pond.sub.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v3 4/5] qtest: precompute hex nibs
List-Id: 
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
To: Markus Armbruster , Eric Blake 
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, afaerber@suse.de

On 05/08/2015 02:25 AM, Markus Armbruster wrote:
> Eric Blake writes:
> 
>> On 05/06/2015 10:18 AM, John Snow wrote:
>> 
>>>> To find out, add just buffering. Something like this in your patch
>>>> instead of byte2hex():
>>>> 
>>>>     for (i = 0; i < len; i++) {
>>>> -       qtest_sendf(chr, "%02x", data[i]);
>>>> +       snprintf(&enc[i * 2], 3, "%02x", data[i]);
>>>>     }
>>>> 
>>>> If the speedup is pretty much entirely due to buffering (which I
>>>> suspect), then your commit message could use a bit of love :)
>>>> 
>>> 
>>> When you're right, you're right.
>>> The difference may not be statistically meaningful, but with today's
>>> planetary alignment, using snprintf() to batch the sends instead of my
>>> home-rolled nib computation function, I can eke out a few more tenths
>>> of a second.
>> 
>> I'm a bit surprised - making a function call per byte generally
>> executes more instructions than open-coding the conversion (albeit the
>> branch prediction in the hardware probably does fairly well over long
>> strings, since it is a tight and predictable loop). Remember, sprintf()
>> has to decode the format string on every call (unless the compiler is
>> smart enough to open-code what sprintf would do).
> 
> John's measurements show that the speed difference between snprintf()
> and a local copy of the formatting code gets thoroughly drowned in
> noise.
> 
> The snprintf() version is 18 lines shorter, according to diffstat. Less
> code, same measured performance, what's not to like?
> 
> However, if you feel strongly about avoiding snprintf() here, I won't
> argue further. Except for the commit message: it needs to be fixed not
> to claim that avoiding "printf and friends" makes a speed difference.
> 

My reasoning was the same as Markus's: the difference was so negligible
that I went with the "less home-rolled code" version.

I already staged this series without the nib functions and submitted the
snprintf version as its own patch, with a commit message that is less
disparaging to printf and friends.

Any further micro-optimization isn't worth the time it would take to
benchmark properly; I already dropped the test's runtime from ~14s to
~4s. Good enough.

--js