From mboxrd@z Thu Jan  1 00:00:00 1970
From: michi1@michaelblizek.twilightparadox.com (michi1 at michaelblizek.twilightparadox.com)
Date: Sat, 11 Feb 2012 08:34:29 +0100
Subject: How to measure performance inside Kernel?
In-Reply-To: <CA+MoWDr-7pAffmkC2YQfaZFU6u5ojstxHcKwJ8D6jPBd1HPyKA@mail.gmail.com>
References: <CA+MoWDrO0PbKfs=qgzthoEuob2uAirOfhjpdU__GLG34eSaqpw@mail.gmail.com>
	<CA+MoWDqXn=fS_Wf+vsvYH40MufaN7GTBW2MogaE8GMwUz6=fPg@mail.gmail.com>
	<6F5DE7538AFCDA45A114F5E7510424A702F8C9FC@hq-exchange01.bytemobile.com>
	<CA+MoWDr-7pAffmkC2YQfaZFU6u5ojstxHcKwJ8D6jPBd1HPyKA@mail.gmail.com>
Message-ID: <20120211073428.GB2232@grml>
To: kernelnewbies@lists.kernelnewbies.org
List-Id: kernelnewbies.lists.kernelnewbies.org

Hi!

On 22:22 Fri 10 Feb     , Peter Senna Tschudin wrote:
...
> I'm measuring the running time of that portion of code 512 times. Then
> calculate the geometrical mean of results. See one output example:
> 
> Original_code:,
> 514,110,92,104,107,101,101,
...
> The geometrical mean of the values is: 104.7623578604
> 
> Isn't it enough?

Averaging over many runs should reduce the influence of the scheduler, but you
can see a different effect here: The first run takes ~5 times longer than any
run that follows. This is most likely caused by CPU cache effects: on the first
run the data has to be fetched from memory, on later runs it is already cached.
The question is now whether you can expect the data to be in the CPU cache when
this code is run in the real world. If not, you might want to add prefetch
instructions (look for "__builtin_prefetch"). These instructions will make the
first run faster, but further runs slightly slower, because prefetching data
that is already cached is wasted work.
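
A minimal user-space sketch of what such a prefetch could look like, assuming
GCC or Clang. The struct, the function name and the prefetch distance are made
up for illustration and are not taken from your code. Inside the kernel you
would more likely use prefetch() from <linux/prefetch.h> instead of calling the
builtin directly.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: how many elements ahead to prefetch. Tune this by measuring. */
#define PREFETCH_DISTANCE 8

/* Hypothetical record type, stands in for whatever the measured code reads. */
struct item {
	long value;
	char pad[56];	/* pad to roughly one cache line (64 bytes assumed) */
};

/*
 * Sum all values. While working on element i, ask the CPU to start loading
 * element i + PREFETCH_DISTANCE, so it is hopefully cached when we get there.
 */
long sum_items(const struct item *items, size_t n)
{
	long sum = 0;
	size_t i;

	assert(items != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* 2nd argument 0 = read access, 3rd argument 3 = keep in all cache levels */
		if (i + PREFETCH_DISTANCE < n)
			__builtin_prefetch(&items[i + PREFETCH_DISTANCE], 0, 3);
		sum += items[i].value;
	}
	return sum;
}
```

The prefetch is only a hint and does not change the result, so you can time the
loop with and without it and compare both the first run and the later runs.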

	-Michi
-- 
programming a layer 3+4 network protocol for mesh networks
see http://michaelblizek.twilightparadox.com