From: Rene Herman <rene.herman@keyaccess.nl>
To: Soumyadip Das Mahapatra <kernelhacker@visualserver.org>
Cc: Benoit Boissinot <bboissin@gmail.com>,
Akinobu Mita <akinobu.mita@gmail.com>,
Harvey Harrison <harvey.harrison@gmail.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] bitreversal program
Date: Wed, 21 May 2008 13:11:25 +0200
Message-ID: <4834035D.5090703@keyaccess.nl>
In-Reply-To: <Pine.LNX.4.64.0805211029250.1736@visualserver.org>
On 21-05-08 10:54, Soumyadip Das Mahapatra wrote:
> Sorry to disturb you again. But i tested my code against Akinobu's one
> and the test result shows my code takes less cpu time than that of
> Akinobu's.
The unfortunate thing about these kinds of changes is that they're not
all that easily tested. On the face of it, the current table-driven
method should obviously be faster since it needs fewer cycles. Cache
considerations add to that in the sense of instruction cache and
can (!) detract from it in the sense of data cache; sometimes they
detract dramatically, with cache misses basically dwarfing most
anything else.
However, in this case the table is a tiny 256-byte one which isn't even
going to be pulled in completely in normal usage (just the cache-lines
needed) while on the other hand the extra i-cache pressure from the
increased code in your version is always there.
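
To make the trade-off concrete, here is a minimal sketch of the two
styles being compared. It is not the kernel's lib/bitrev.c and not the
posted patch; the names rev_table, rev_table_init, bitrev32_table and
bitrev32_swap are made up for illustration, and the table-free variant
is just the well-known mask-and-shift swap, assumed here as a stand-in
for "more code, no table".

```c
#include <stdint.h>

/* Reverse one byte bit by bit; only used to fill the table. */
static uint8_t rev8_compute(uint8_t b)
{
	uint8_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (uint8_t)((r << 1) | (b & 1));
		b >>= 1;
	}
	return r;
}

/* Hypothetical 256-byte lookup table: rev_table[i] is i bit-reversed. */
static uint8_t rev_table[256];

static void rev_table_init(void)
{
	for (int i = 0; i < 256; i++)
		rev_table[i] = rev8_compute((uint8_t)i);
}

/* Table-driven: four byte lookups, little code, some d-cache use. */
static uint32_t bitrev32_table(uint32_t x)
{
	return ((uint32_t)rev_table[x & 0xff] << 24) |
	       ((uint32_t)rev_table[(x >> 8) & 0xff] << 16) |
	       ((uint32_t)rev_table[(x >> 16) & 0xff] << 8) |
	       (uint32_t)rev_table[(x >> 24) & 0xff];
}

/* Table-free: swap bits, pairs, nibbles, bytes, then halves. */
static uint32_t bitrev32_swap(uint32_t x)
{
	x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
	x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
	x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
	x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);
	return (x >> 16) | (x << 16);
}
```

Both functions return the same value for any input once
rev_table_init() has run; which one is faster depends on the cache
effects described above, not on anything visible in the source.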
It's unexpected that you would get better results from your new code
(and I'm not; I took Benoit's posted test and get 15 seconds for your
version versus 9 for the original table-driven one) and in this case,
reality wouldn't contradict the micro-benchmark either. It's when the
table grows and, especially, more of it is needed on a regular basis
that you'd start to worry.
PS: If you're going to go really micro, there are even going to be
differences between bitreversing 0x00000000 which is just going to need
the first byte (hence cacheline) and say 0x004080c0 which is going to
occupy 4 cachelines. Again not in the isolated test though; the data in
this case is small enough that you should be having a hard time getting
your version to perform better -- forking off a competing process that
does its best to dirty cache might do it, but then you're in a situation
which is no longer real-world with respect to this "call once" bit of API...
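
The cacheline arithmetic in that PS can be checked with a small
sketch. It assumes 64-byte cachelines and a 256-byte table aligned to
a cacheline boundary; both are assumptions for illustration, as are
the names CACHE_LINE_BYTES and lines_touched.

```c
#include <stdint.h>

#define CACHE_LINE_BYTES 64u	/* assumption: 64-byte cachelines */

/*
 * Count the distinct cachelines of a 256-byte, line-aligned table
 * that are touched when the table is indexed by each byte of x.
 */
static unsigned lines_touched(uint32_t x)
{
	unsigned seen = 0;	/* bitmask of cachelines already counted */
	unsigned count = 0;

	for (int i = 0; i < 4; i++) {
		unsigned idx = (x >> (8 * i)) & 0xffu;
		unsigned line = idx / CACHE_LINE_BYTES;

		if (!(seen & (1u << line))) {
			seen |= 1u << line;
			count++;
		}
	}
	return count;
}
```

Under those assumptions lines_touched(0x00000000) is 1, since all four
lookups hit index 0, while lines_touched(0x004080c0) is 4, since the
indices 0x00, 0x40, 0x80 and 0xc0 each fall in a different 64-byte
line.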
Rene.
2008-05-19 17:04 [PATCH 1/2] bitreversal program Soumyadip Das Mahapatra
2008-05-19 20:42 ` Harvey Harrison
2008-05-20 6:53 ` John Hubbard
2008-05-20 11:01 ` Soumyadip Das Mahapatra
2008-05-20 12:13 ` Akinobu Mita
2008-05-20 15:25 ` Soumyadip Das Mahapatra
2008-05-20 15:47 ` Benoit Boissinot
2008-05-20 15:57 ` Soumyadip Das Mahapatra
2008-05-20 16:39 ` Benoit Boissinot
2008-05-21 8:54 ` Soumyadip Das Mahapatra
2008-05-21 9:11 ` Benoit Boissinot
2008-05-21 11:11 ` Rene Herman [this message]
2008-05-21 16:52 ` Tilman Schmidt