From: Rusty Russell <rusty@rustcorp.com.au>
To: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>,
tglx@linutronix.de, x86@kernel.org, linux-kernel@vger.kernel.org,
hpa@zytor.com, jeremy@goop.org, cpw@sgi.com,
nickpiggin@yahoo.com.au, ink@jurassic.park.msu.ru
Subject: Re: [PATCHSET x86/core/percpu] improve the first percpu chunk allocation
Date: Wed, 4 Mar 2009 10:33:08 +1030
Message-ID: <200903041033.09502.rusty@rustcorp.com.au>
In-Reply-To: <49A40624.4000100@kernel.org>
On Wednesday 25 February 2009 01:07:24 Tejun Heo wrote:
> it always
> saves a 2MB TLB entry for all the non-NUMA machines out there.
Note that everyone keeps talking about "a" TLB entry. I wanted to make
it clear (esp. for those of us reading from the sidelines) that it's not:
it's up to num_possible_cpus() TLB entries. Of course, many paths won't
access other CPUs' data, but it'd be interesting (and pretty easy) to
actually instrument how rare this is...
Hmm, fairly rare, but not incredibly so (about 1.9% of the accesses
counted below):
percpu: measure use
With the idea of using virtual mappings for percpu regions, we wonder
how often we access other CPUs' per-cpu variables.
32-bit 4-way SMP (under kvm), kernel make -j4:

    get_cpu_var():          52,358,618
    raw_get_cpu_var():         287,191
    per_cpu():              17,371,648
    per_cpu(same):          16,020,390

    Total same-cpu calls:   68,666,199
    Cross-per-cpu calls:     1,351,258
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
arch/x86/Makefile_32.cpu | 2 +-
include/asm-generic/percpu.h | 10 +++++++---
kernel/module.c | 11 +++++++++++
kernel/smp.c | 21 +++++++++++++++++++++
4 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/arch/x86/Makefile_32.cpu b/arch/x86/Makefile_32.cpu
--- a/arch/x86/Makefile_32.cpu
+++ b/arch/x86/Makefile_32.cpu
@@ -47,5 +47,5 @@ cflags-$(CONFIG_X86_GENERIC) += $(call
# Bug fix for binutils: this option is required in order to keep
# binutils from generating NOPL instructions against our will.
ifneq ($(CONFIG_X86_P6_NOP),y)
-cflags-y += $(call cc-option,-Wa$(comma)-mtune=generic32,)
+#cflags-y += $(call cc-option,-Wa$(comma)-mtune=generic32,)
endif
diff --git a/include/asm-generic/percpu.h b/include/asm-generic/percpu.h
--- a/include/asm-generic/percpu.h
+++ b/include/asm-generic/percpu.h
@@ -53,12 +53,16 @@ extern unsigned long __per_cpu_offset[NR
* established ways to produce a usable pointer from the percpu variable
* offset.
*/
+void count_per_cpu(unsigned int cpu);
+void count_get_cpu_var(void);
+void count_raw_get_cpu_var(void);
+
#define per_cpu(var, cpu) \
-	(*SHIFT_PERCPU_PTR(&per_cpu_var(var), per_cpu_offset(cpu)))
+	(*(count_per_cpu(cpu), SHIFT_PERCPU_PTR(&per_cpu_var(var), per_cpu_offset(cpu))))
#define __get_cpu_var(var) \
-	(*SHIFT_PERCPU_PTR(&per_cpu_var(var), my_cpu_offset))
+	(*(count_get_cpu_var(), SHIFT_PERCPU_PTR(&per_cpu_var(var), my_cpu_offset)))
#define __raw_get_cpu_var(var) \
-	(*SHIFT_PERCPU_PTR(&per_cpu_var(var), __my_cpu_offset))
+	(*(count_raw_get_cpu_var(), SHIFT_PERCPU_PTR(&per_cpu_var(var), __my_cpu_offset)))
#ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA
diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2705,6 +2705,17 @@ static const struct seq_operations modul
static int modules_open(struct inode *inode, struct file *file)
{
+	extern atomic_t get_cpu_var_count, raw_get_cpu_var_count, per_cpu_count[], unnecessary_count[];
+	unsigned int i;
+
+	printk("get_cpu_var_count: %i\n", atomic_xchg(&get_cpu_var_count, 0));
+	printk("raw_get_cpu_var_count: %i\n",
+	       atomic_xchg(&raw_get_cpu_var_count, 0));
+	for_each_online_cpu(i)
+		printk("per_cpu %i: %u (%u self)\n",
+		       i, atomic_xchg(&per_cpu_count[i], 0),
+		       atomic_xchg(&unnecessary_count[i], 0));
+
return seq_open(file, &modules_op);
}
diff --git a/kernel/smp.c b/kernel/smp.c
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -10,6 +10,27 @@
#include <linux/rcupdate.h>
#include <linux/rculist.h>
#include <linux/smp.h>
+
+atomic_t get_cpu_var_count, raw_get_cpu_var_count, per_cpu_count[CONFIG_NR_CPUS], unnecessary_count[CONFIG_NR_CPUS];
+void count_per_cpu(unsigned int cpu)
+{
+	if (cpu == raw_smp_processor_id())
+		atomic_inc(&unnecessary_count[cpu]);
+	atomic_inc(&per_cpu_count[cpu]);
+}
+EXPORT_SYMBOL(count_per_cpu);
+
+void count_get_cpu_var(void)
+{
+	atomic_inc(&get_cpu_var_count);
+}
+EXPORT_SYMBOL(count_get_cpu_var);
+
+void count_raw_get_cpu_var(void)
+{
+	atomic_inc(&raw_get_cpu_var_count);
+}
+EXPORT_SYMBOL(count_raw_get_cpu_var);
static DEFINE_PER_CPU(struct call_single_queue, call_single_queue);
static LIST_HEAD(call_function_queue);