From mboxrd@z Thu Jan 1 00:00:00 1970
From: michi1@michaelblizek.twilightparadox.com (michi1 at michaelblizek.twilightparadox.com)
Date: Sat, 11 Feb 2012 08:34:29 +0100
Subject: How to measure performance inside Kernel?
In-Reply-To:
References: <6F5DE7538AFCDA45A114F5E7510424A702F8C9FC@hq-exchange01.bytemobile.com>
Message-ID: <20120211073428.GB2232@grml>
To: kernelnewbies@lists.kernelnewbies.org
List-Id: kernelnewbies.lists.kernelnewbies.org

Hi!

On 22:22 Fri 10 Feb , Peter Senna Tschudin wrote:
...
> I'm measuring the running time of that portion of code 512 times. Then
> calculate the geometrical mean of results. See one output example:
>
> Original_code:,
> 514,110,92,104,107,101,101,
...
> The geometrical mean of the values is: 104.7623578604
>
> Isn't it enough?

It should reduce the influence of the scheduler, but you can see a different
effect here: the first run takes ~5 times longer than any run that follows.
This is most likely caused by CPU cache effects. The question now is whether
you can expect the data to be in the CPU cache when this code runs in the real
world. If not, you might want to add prefetch instructions (look for
"__builtin_prefetch"). These instructions will make the first run faster, but
can make further runs slightly slower, because prefetching data that is
already cached only adds overhead.

	-Michi
-- 
programming a layer 3+4 network protocol for mesh networks
see http://michaelblizek.twilightparadox.com
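
P.S. A minimal sketch of the prefetch idea, written as plain C for GCC or
Clang rather than as kernel code. The function name sum_with_prefetch and the
PREFETCH_AHEAD distance are made up for illustration; the distance in
particular is a guess that would need tuning on the real workload. Inside the
kernel, prefetch() from <linux/prefetch.h> is the usual wrapper.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_AHEAD 16

/*
 * Sum an array while hinting the CPU to start loading later elements
 * before they are needed. The result is identical with or without the
 * hint; only the timing of a cold-cache run may change.
 */
long sum_with_prefetch(const long *data, size_t n)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* rw = 0: prefetch for reading; locality = 3: keep in cache */
		if (i + PREFETCH_AHEAD < n)
			__builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 3);
		sum += data[i];
	}
	return sum;
}
```

For a simple linear scan like this the hardware prefetcher may already do the
job, so it is worth measuring both the first run and the later runs with and
without the hint before keeping it.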