From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [patch 9/9] Scheduler profiling - Use conditional calls
Date: Fri, 1 Jun 2007 11:54:13 -0400 [thread overview]
Message-ID: <20070601155413.GA1216@Krystal> (raw)
In-Reply-To: <20070530133407.4f5789a0.akpm@linux-foundation.org>
* Andrew Morton (akpm@linux-foundation.org) wrote:
> On Wed, 30 May 2007 10:00:34 -0400
> Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
>
> > @@ -2990,7 +2991,8 @@
> > print_irqtrace_events(prev);
> > dump_stack();
> > }
> > - profile_hit(SCHED_PROFILING, __builtin_return_address(0));
> > + cond_call(profile_on,
> > + profile_hit(SCHED_PROFILING, __builtin_return_address(0)));
> >
>
> That's looking pretty neat. Do you have any before-and-after performance
> figures for i386 and for a non-optimised architecture?
Sure, here is the result of a small test comparing:
1 - Branch depending on a cache miss (has to fetch in memory, caused by a 128
bytes stride)). This is the test that is likely to look like what
side-effect the original profile_hit code was causing, under the
assumption that the kernel is already using L1 and L2 caches at
their full capacity and that a supplementary data load would cause
cache trashing.
2 - Branch depending on L1 cache hit. Just for comparison.
3 - Branch depending on a load immediate in the instruction stream.
It has been compiled with gcc -O2. Tests done on a 3GHz P4.
In the first test series, the branch is not taken:
number of tests : 1000
number of branches per test : 81920
memory hit cycles per iteration (mean) : 48.252
L1 cache hit cycles per iteration (mean) : 16.1693
instruction stream based test, cycles per iteration (mean) : 16.0432
In the second test series, the branch is taken and an integer is
incremented within the block:
number of tests : 1000
number of branches per test : 81920
memory hit cycles per iteration (mean) : 48.2691
L1 cache hit cycles per iteration (mean) : 16.396
instruction stream based test, cycles per iteration (mean) : 16.0441
Therefore, the memory fetch based test seems to be 200% slower than the
load immediate based test.
(I am adding these results to the documentation)
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
next prev parent reply other threads:[~2007-06-01 15:54 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-05-30 14:00 [patch 0/9] Conditional Calls - for 2.6.22-rc2-mm1 Mathieu Desnoyers
2007-05-30 14:00 ` [patch 1/9] Conditional Calls - Architecture Independent Code Mathieu Desnoyers
2007-05-30 20:32 ` Andrew Morton
2007-05-31 16:34 ` Mathieu Desnoyers
2007-05-31 13:47 ` Andi Kleen
2007-06-05 18:40 ` Mathieu Desnoyers
2007-06-04 19:01 ` Adrian Bunk
2007-06-13 15:57 ` Mathieu Desnoyers
2007-06-13 21:51 ` Adrian Bunk
2007-06-14 16:02 ` Mathieu Desnoyers
2007-06-14 21:06 ` Adrian Bunk
2007-06-20 21:59 ` Mathieu Desnoyers
2007-06-21 13:00 ` Adrian Bunk
2007-06-21 13:55 ` Mathieu Desnoyers
2007-05-30 14:00 ` [patch 2/9] Conditional Calls - Hash Table Mathieu Desnoyers
2007-05-30 20:32 ` Andrew Morton
2007-05-31 13:42 ` Andi Kleen
2007-06-01 16:08 ` Matt Mackall
2007-06-01 16:46 ` Mathieu Desnoyers
2007-06-01 17:07 ` Matt Mackall
2007-06-01 17:45 ` Andi Kleen
2007-06-01 18:06 ` Mathieu Desnoyers
2007-06-01 18:49 ` Matt Mackall
2007-06-01 19:35 ` Andi Kleen
2007-06-01 20:33 ` Mathieu Desnoyers
2007-06-01 20:44 ` Andi Kleen
2007-06-04 22:26 ` Mathieu Desnoyers
2007-06-01 18:03 ` Mathieu Desnoyers
2007-05-30 14:00 ` [patch 3/9] Conditional Calls - Non Optimized Architectures Mathieu Desnoyers
2007-05-30 14:00 ` [patch 4/9] Conditional Calls - Add kconfig menus Mathieu Desnoyers
2007-05-30 14:00 ` [patch 5/9] Conditional Calls - i386 Optimization Mathieu Desnoyers
2007-05-30 20:33 ` Andrew Morton
2007-05-31 13:54 ` Andi Kleen
2007-06-05 19:02 ` Mathieu Desnoyers
2007-05-30 14:00 ` [patch 6/9] Conditional Calls - PowerPC Optimization Mathieu Desnoyers
2007-05-30 14:00 ` [patch 7/9] Conditional Calls - Documentation Mathieu Desnoyers
2007-05-30 14:00 ` [patch 8/9] F00F bug fixup for i386 - use conditional calls Mathieu Desnoyers
2007-05-30 20:33 ` Andrew Morton
2007-05-31 21:07 ` Mathieu Desnoyers
2007-05-31 21:21 ` Andrew Morton
2007-05-31 21:38 ` Mathieu Desnoyers
2007-05-30 14:00 ` [patch 9/9] Scheduler profiling - Use " Mathieu Desnoyers
2007-05-30 20:34 ` Andrew Morton
2007-06-01 15:54 ` Mathieu Desnoyers [this message]
2007-06-01 16:19 ` Andrew Morton
2007-06-01 16:33 ` Mathieu Desnoyers
2007-05-30 20:59 ` William Lee Irwin III
2007-05-31 21:12 ` Mathieu Desnoyers
2007-05-31 23:41 ` William Lee Irwin III
2007-06-04 22:20 ` Mathieu Desnoyers
2007-05-30 21:44 ` Matt Mackall
2007-05-31 21:36 ` Mathieu Desnoyers
2007-05-31 13:39 ` Andi Kleen
2007-05-31 22:07 ` Mathieu Desnoyers
2007-05-31 22:33 ` Andi Kleen
2007-06-04 22:20 ` Mathieu Desnoyers
-- strict thread matches above, loose matches on Subject: below --
2007-05-29 18:33 [patch 0/9] Conditional Calls Mathieu Desnoyers
2007-05-29 18:34 ` [patch 9/9] Scheduler profiling - Use conditional calls Mathieu Desnoyers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070601155413.GA1216@Krystal \
--to=mathieu.desnoyers@polymtl.ca \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.