From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Subject: [patch 6/8] Immediate Value - Documentation
Date: Fri, 15 Jun 2007 16:23:16 -0400
Message-ID: <20070615202926.780859816@polymtl.ca>
In-Reply-To: 20070615202310.178466032@polymtl.ca

[-- Attachment #1: immediate-values-documentation.patch --]
[-- Type: text/plain, Size: 4142 bytes --]

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 Documentation/immediate.txt |  103 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)

Index: linux-2.6-lttng/Documentation/immediate.txt
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6-lttng/Documentation/immediate.txt	2007-06-15 16:14:05.000000000 -0400
@@ -0,0 +1,103 @@
+		        Using the Immediate Values
+
+			    Mathieu Desnoyers
+
+
+This document introduces Immediate Values and their use.
+
+* Purpose of immediate values
+
+An immediate value lets you compile into the kernel a branch that is disabled
+at compile time and has almost no measurable performance impact on the kernel.
+The branch can then be enabled dynamically at runtime.
+
+It can be used for code in the kernel that is seldom meant to be dynamically
+activated. This is the case for CPU-specific workarounds, profiling,
+tracing, etc.
+
+This infrastructure specializes in dynamically patching the values in the
+instruction stream while multiple CPUs are running, without disturbing
+normal system behavior.
+
+
+* Usage
+
+To use the immediate() macro, include linux/immediate.h and declare the value:
+
+#include <linux/immediate.h>
+
+immediate_t __read_mostly this_immediate;
+EXPORT_SYMBOL(this_immediate);
+
+
+Then add, in your code:
+
+if (unlikely(immediate(this_immediate))) {
+	some code...
+}
+
+To control the immediate value at runtime:
+
+Use immediate_arm(&this_immediate) to activate the immediate value.
+
+Use immediate_disarm(&this_immediate) to deactivate the immediate value.
+
+Use immediate_query(&this_immediate) to query the immediate value state.
+
+The immediate mechanism supports inserting multiple instances of the same
+immediate. Immediate values can be put in inline functions, inlined static
+functions, and unrolled loops.
+
+
+* Optimization for a given architecture
+
+One can implement optimized immediate values for a given architecture by
+replacing asm-$ARCH/immediate.h.
+
+The IF_* flags can be used to control the type of immediate value. See the
+include/linux/immediate.h header for the list of flags. They can be specified as
+the first parameter of the _immediate() macro, as in the following example,
+which uses flags to declare an immediate that always uses the generic version
+of the immediates. This can be useful when immediates are placed in kernel
+code presenting particular reentrancy challenges.
+
+if (unlikely(_immediate(IF_DEFAULT & ~IF_OPTIMIZED, this_immediate))) {
+	some code...
+}
+
+
+* Performance improvement
+
+Result of a small test comparing:
+
+1 - Branch depending on a cache miss (has to fetch from memory, caused by a
+    128-byte stride). This test is likely to resemble the side effect
+    that the original profile_hit code was causing, under the
+    assumption that the kernel is already using L1 and L2 caches at
+    their full capacity and that a supplementary data load would cause
+    cache thrashing.
+2 - Branch depending on L1 cache hit. Just for comparison.
+3 - Branch depending on a load immediate in the instruction stream.
+
+The test was compiled with gcc -O2 and run on a 3GHz P4.
+
+In the first test series, the branch is not taken:
+
+number of tests : 1000
+number of branches per test : 81920
+memory hit cycles per iteration (mean) : 48.252
+L1 cache hit cycles per iteration (mean) : 16.1693
+instruction stream based test, cycles per iteration (mean) : 16.0432
+
+
+In the second test series, the branch is taken and an integer is
+incremented within the block:
+
+number of tests : 1000
+number of branches per test : 81920
+memory hit cycles per iteration (mean) : 48.2691
+L1 cache hit cycles per iteration (mean) : 16.396
+instruction stream based test, cycles per iteration (mean) : 16.0441
+
+Therefore, the memory fetch based test seems to be about 200% slower than the
+load immediate based test (roughly 48 versus 16 cycles per iteration).
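
For readers who want to try the calling pattern from the Usage section
outside the kernel, here is a minimal user-space sketch. It is not the
implementation from this patch series: immediate_t, immediate(),
immediate_arm(), immediate_disarm() and immediate_query() are hypothetical
stand-ins backed by a plain variable read, their signatures are assumed from
the documentation text, and unlikely() is redefined with the GCC/Clang
__builtin_expect builtin. The names hot_path(), slow_path_hits and
run_demo() exist only for illustration.

```c
/* User-space stand-ins (assumptions): NOT the kernel implementation. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical layout: the real immediate_t is defined by the patch series. */
typedef struct {
	volatile int value;
} immediate_t;

/* Stand-in for the documented macro: here it is a plain variable read. */
#define immediate(var) ((var).value)

static inline void immediate_arm(immediate_t *imv)
{
	imv->value = 1;
}

static inline void immediate_disarm(immediate_t *imv)
{
	imv->value = 0;
}

static inline int immediate_query(const immediate_t *imv)
{
	return imv->value;
}

static immediate_t this_immediate;	/* zero-initialized: disabled */
static unsigned long slow_path_hits;	/* counts executions of the guarded block */

/* Frequently executed code containing a seldom-enabled branch. */
static void hot_path(void)
{
	if (unlikely(immediate(this_immediate)))
		slow_path_hits++;
}

/* Runs the hot path while disarmed, armed twice, then disarmed again. */
static unsigned long run_demo(void)
{
	hot_path();				/* disarmed: block skipped */
	immediate_arm(&this_immediate);
	hot_path();				/* armed: block runs */
	hot_path();				/* armed: block runs */
	immediate_disarm(&this_immediate);
	hot_path();				/* disarmed: block skipped */
	return slow_path_hits;
}
```

Calling run_demo() once executes the guarded block only for the two calls
made between immediate_arm() and immediate_disarm(). The sketch shows the
call pattern only; it says nothing about the cost of the branch, which in
the optimized versions depends on patching the instruction stream.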

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
