* [patch 0/2] Immediate Values - in kernel users
@ 2007-07-14 1:26 Mathieu Desnoyers
2007-07-14 1:26 ` [patch 1/2] F00F bug fixup for i386 - use immediate values Mathieu Desnoyers
2007-07-14 1:26 ` [patch 2/2] Scheduler profiling - Use " Mathieu Desnoyers
0 siblings, 2 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2007-07-14 1:26 UTC
To: akpm, linux-kernel
Here are the first users of the immediate values: the F00F bug handler and
profiling.
This series applies to 2.6.22-rc6-mm1 and depends on the Immediate Values patches.
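As a rough illustration only (this is not the kernel implementation), the
pattern converted by both patches is a read-mostly flag guarding a rarely
taken branch. The sketch below models it in plain user-space C. The names
imm_flag_t, imm_set(), imm_read() and fault_path_model() are hypothetical
stand-ins for immediate_char_t, immediate_set_early(), _immediate_read() and
the guarded call site. They read an ordinary variable, whereas the optimized
immediate values are meant to avoid that data load.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for immediate_char_t. */
typedef struct {
	volatile char value;
} imm_flag_t;

/* Stand-in for immediate_set_early(): store the new flag value. */
static void imm_set(imm_flag_t *flag, char value)
{
	assert(flag != NULL);
	flag->value = value;
}

/* Stand-in for _immediate_read(): here just a plain load from memory. */
static char imm_read(const imm_flag_t *flag)
{
	assert(flag != NULL);
	return flag->value;
}

/* Zero-initialized, so the workaround starts out disabled. */
static imm_flag_t f00f_bug_fix_model;

/* Returns 1 if the guarded workaround branch would run, 0 otherwise. */
static int fault_path_model(void)
{
	if (imm_read(&f00f_bug_fix_model))
		return 1;	/* workaround branch taken */
	return 0;		/* common case: branch skipped */
}
```

Calling imm_set(&f00f_bug_fix_model, 1) once at init time is what flips
fault_path_model() from the skipped branch to the taken one.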
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 1/2] F00F bug fixup for i386 - use immediate values
2007-07-14 1:26 [patch 0/2] Immediate Values - in kernel users Mathieu Desnoyers
@ 2007-07-14 1:26 ` Mathieu Desnoyers
2007-07-14 7:22 ` Alexey Dobriyan
2007-07-14 1:26 ` [patch 2/2] Scheduler profiling - Use " Mathieu Desnoyers
1 sibling, 1 reply; 6+ messages in thread
From: Mathieu Desnoyers @ 2007-07-14 1:26 UTC
To: akpm, linux-kernel; +Cc: Mathieu Desnoyers
[-- Attachment #1: f00f-bug-use-immediate-values.patch --]
[-- Type: text/plain, Size: 2749 bytes --]
Use the faster immediate values for F00F bug handling in do_page_fault.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
arch/i386/kernel/traps.c | 4 ++++
arch/i386/mm/fault.c | 3 ++-
include/asm-i386/processor.h | 3 +++
3 files changed, 9 insertions(+), 1 deletion(-)
Index: linux-2.6-lttng/arch/i386/kernel/traps.c
===================================================================
--- linux-2.6-lttng.orig/arch/i386/kernel/traps.c 2007-07-13 19:30:53.000000000 -0400
+++ linux-2.6-lttng/arch/i386/kernel/traps.c 2007-07-13 19:31:21.000000000 -0400
@@ -31,6 +31,7 @@
#include <linux/uaccess.h>
#include <linux/nmi.h>
#include <linux/bug.h>
+#include <linux/immediate.h>
#ifdef CONFIG_EISA
#include <linux/ioport.h>
@@ -1158,6 +1159,8 @@
#endif /* CONFIG_MATH_EMULATION */
#ifdef CONFIG_X86_F00F_BUG
+immediate_char_t f00f_bug_fix __read_mostly;
+
void __init trap_init_f00f_bug(void)
{
__set_fixmap(FIX_F00F_IDT, __pa(&idt_table), PAGE_KERNEL_RO);
@@ -1168,6 +1171,7 @@
*/
idt_descr.address = fix_to_virt(FIX_F00F_IDT);
load_idt(&idt_descr);
+ immediate_set_early(&f00f_bug_fix, 1);
}
#endif
Index: linux-2.6-lttng/arch/i386/mm/fault.c
===================================================================
--- linux-2.6-lttng.orig/arch/i386/mm/fault.c 2007-07-13 19:25:38.000000000 -0400
+++ linux-2.6-lttng/arch/i386/mm/fault.c 2007-07-13 19:30:55.000000000 -0400
@@ -25,6 +25,7 @@
#include <linux/kprobes.h>
#include <linux/uaccess.h>
#include <linux/kdebug.h>
+#include <linux/immediate.h>
#include <asm/system.h>
#include <asm/desc.h>
@@ -492,7 +493,7 @@
/*
* Pentium F0 0F C7 C8 bug workaround.
*/
- if (boot_cpu_data.f00f_bug) {
+ immediate_if (&f00f_bug_fix) {
unsigned long nr;
nr = (address - idt_descr.address) >> 3;
Index: linux-2.6-lttng/include/asm-i386/processor.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-i386/processor.h 2007-07-13 19:25:38.000000000 -0400
+++ linux-2.6-lttng/include/asm-i386/processor.h 2007-07-13 19:30:55.000000000 -0400
@@ -21,6 +21,7 @@
#include <asm/percpu.h>
#include <linux/cpumask.h>
#include <linux/init.h>
+#include <linux/immediate.h>
#include <asm/processor-flags.h>
/* flag for disabling the tsc */
@@ -102,6 +103,8 @@
extern struct tss_struct doublefault_tss;
DECLARE_PER_CPU(struct tss_struct, init_tss);
+extern immediate_char_t f00f_bug_fix;
+
#ifdef CONFIG_SMP
extern struct cpuinfo_x86 cpu_data[];
#define current_cpu_data cpu_data[smp_processor_id()]
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 2/2] Scheduler profiling - Use immediate values
2007-07-14 1:26 [patch 0/2] Immediate Values - in kernel users Mathieu Desnoyers
2007-07-14 1:26 ` [patch 1/2] F00F bug fixup for i386 - use immediate values Mathieu Desnoyers
@ 2007-07-14 1:26 ` Mathieu Desnoyers
1 sibling, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2007-07-14 1:26 UTC
To: akpm, linux-kernel; +Cc: Mathieu Desnoyers
[-- Attachment #1: profiling-use-immediate-values.patch --]
[-- Type: text/plain, Size: 6570 bytes --]
Use immediate values, which hit the d-cache less in their optimized version,
as the condition guarding the scheduler profiling call.
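A minimal user-space sketch of the conversion, under stated assumptions: the
single prof_on integer becomes one flag per profiling type, the old
"if (!prof_on)" test becomes "is any flag set", and the call site checks only
its own flag. The enum prof_kind, prof_flags[], prof_enable(),
prof_any_enabled() and schedule_model() are hypothetical names used for
illustration, and plain chars stand in for immediate_char_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical indices, one per profiling type handled by the patch. */
enum prof_kind { PROF_SLEEP, PROF_SCHED, PROF_KVM, PROF_CPU, PROF_KINDS };

/* One flag per type, replacing the single prof_on integer. */
static char prof_flags[PROF_KINDS];

/* Counter standing in for the profiling buffer. */
static unsigned long sched_hits;

/* Models the boot-time setting of one flag in profile_setup(). */
static void prof_enable(enum prof_kind kind)
{
	assert(kind < PROF_KINDS);
	prof_flags[kind] = 1;
}

/* Equivalent of the old "if (!prof_on)" test: is any type enabled? */
static int prof_any_enabled(void)
{
	size_t i;

	for (i = 0; i < PROF_KINDS; i++)
		if (prof_flags[i])
			return 1;
	return 0;
}

/* Models the call site in schedule(): only its own flag is checked. */
static void schedule_model(void)
{
	if (prof_flags[PROF_SCHED])
		sched_hits++;	/* stands in for profile_hit() */
}
```

With the type check moved to the call site, the callee no longer compares
against a type, so every caller has to carry its own guard.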
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
drivers/kvm/svm.c | 2 +-
drivers/kvm/vmx.c | 2 +-
include/linux/profile.h | 10 ++++------
kernel/profile.c | 38 ++++++++++++++++++++++++++------------
kernel/sched.c | 3 ++-
5 files changed, 34 insertions(+), 21 deletions(-)
Index: linux-2.6-lttng/kernel/profile.c
===================================================================
--- linux-2.6-lttng.orig/kernel/profile.c 2007-07-13 19:30:58.000000000 -0400
+++ linux-2.6-lttng/kernel/profile.c 2007-07-13 19:32:49.000000000 -0400
@@ -42,9 +42,6 @@
static atomic_t *prof_buffer;
static unsigned long prof_len, prof_shift;
-int prof_on __read_mostly;
-EXPORT_SYMBOL_GPL(prof_on);
-
static cpumask_t prof_cpu_mask = CPU_MASK_ALL;
#ifdef CONFIG_SMP
static DEFINE_PER_CPU(struct profile_hit *[2], cpu_profile_hits);
@@ -52,6 +49,14 @@
static DEFINE_MUTEX(profile_flip_mutex);
#endif /* CONFIG_SMP */
+/* Immediate values */
+immediate_char_t sleep_profiling __read_mostly,
+ sched_profiling __read_mostly,
+ kvm_profiling __read_mostly,
+ cpu_profiling __read_mostly;
+EXPORT_SYMBOL_GPL(kvm_profiling);
+EXPORT_SYMBOL_GPL(cpu_profiling);
+
static int __init profile_setup(char * str)
{
static char __initdata schedstr[] = "schedule";
@@ -60,7 +65,7 @@
int par;
if (!strncmp(str, sleepstr, strlen(sleepstr))) {
- prof_on = SLEEP_PROFILING;
+ immediate_set_early(&sleep_profiling, 1);
if (str[strlen(sleepstr)] == ',')
str += strlen(sleepstr) + 1;
if (get_option(&str, &par))
@@ -69,7 +74,7 @@
"kernel sleep profiling enabled (shift: %ld)\n",
prof_shift);
} else if (!strncmp(str, schedstr, strlen(schedstr))) {
- prof_on = SCHED_PROFILING;
+ immediate_set_early(&sched_profiling, 1);
if (str[strlen(schedstr)] == ',')
str += strlen(schedstr) + 1;
if (get_option(&str, &par))
@@ -78,7 +83,7 @@
"kernel schedule profiling enabled (shift: %ld)\n",
prof_shift);
} else if (!strncmp(str, kvmstr, strlen(kvmstr))) {
- prof_on = KVM_PROFILING;
+ immediate_set_early(&kvm_profiling, 1);
if (str[strlen(kvmstr)] == ',')
str += strlen(kvmstr) + 1;
if (get_option(&str, &par))
@@ -88,7 +93,7 @@
prof_shift);
} else if (get_option(&str, &par)) {
prof_shift = par;
- prof_on = CPU_PROFILING;
+ immediate_set_early(&cpu_profiling, 1);
printk(KERN_INFO "kernel profiling enabled (shift: %ld)\n",
prof_shift);
}
@@ -99,7 +104,10 @@
void __init profile_init(void)
{
- if (!prof_on)
+ if (!_immediate_read(&sleep_profiling) &&
+ !_immediate_read(&sched_profiling) &&
+ !_immediate_read(&kvm_profiling) &&
+ !_immediate_read(&cpu_profiling))
return;
/* only text is profiled */
@@ -288,7 +296,7 @@
int i, j, cpu;
struct profile_hit *hits;
- if (prof_on != type || !prof_buffer)
+ if (!prof_buffer)
return;
pc = min((pc - (unsigned long)_stext) >> prof_shift, prof_len - 1);
i = primary = (pc & (NR_PROFILE_GRP - 1)) << PROFILE_GRPSHIFT;
@@ -398,7 +406,7 @@
{
unsigned long pc;
- if (prof_on != type || !prof_buffer)
+ if (!prof_buffer)
return;
pc = ((unsigned long)__pc - (unsigned long)_stext) >> prof_shift;
atomic_add(nr_hits, &prof_buffer[min(pc, prof_len - 1)]);
@@ -555,7 +563,10 @@
}
return 0;
out_cleanup:
- prof_on = 0;
+ immediate_set_early(&sleep_profiling, 0);
+ immediate_set_early(&sched_profiling, 0);
+ immediate_set_early(&kvm_profiling, 0);
+ immediate_set_early(&cpu_profiling, 0);
smp_mb();
on_each_cpu(profile_nop, NULL, 0, 1);
for_each_online_cpu(cpu) {
@@ -582,7 +593,10 @@
{
struct proc_dir_entry *entry;
- if (!prof_on)
+ if (!_immediate_read(&sleep_profiling) &&
+ !_immediate_read(&sched_profiling) &&
+ !_immediate_read(&kvm_profiling) &&
+ !_immediate_read(&cpu_profiling))
return 0;
if (create_hash_tables())
return -1;
Index: linux-2.6-lttng/include/linux/profile.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/profile.h 2007-07-13 19:30:58.000000000 -0400
+++ linux-2.6-lttng/include/linux/profile.h 2007-07-13 19:32:00.000000000 -0400
@@ -7,10 +7,12 @@
#include <linux/init.h>
#include <linux/cpumask.h>
#include <linux/cache.h>
+#include <linux/immediate.h>
#include <asm/errno.h>
-extern int prof_on __read_mostly;
+extern immediate_char_t sleep_profiling, sched_profiling, kvm_profiling,
+ cpu_profiling;
#define CPU_PROFILING 1
#define SCHED_PROFILING 2
@@ -35,11 +37,7 @@
*/
static inline void profile_hit(int type, void *ip)
{
- /*
- * Speedup for the common (no profiling enabled) case:
- */
- if (unlikely(prof_on == type))
- profile_hits(type, ip, 1);
+ profile_hits(type, ip, 1);
}
#ifdef CONFIG_PROC_FS
Index: linux-2.6-lttng/kernel/sched.c
===================================================================
--- linux-2.6-lttng.orig/kernel/sched.c 2007-07-13 19:30:58.000000000 -0400
+++ linux-2.6-lttng/kernel/sched.c 2007-07-13 19:32:00.000000000 -0400
@@ -3241,7 +3241,8 @@
if (unlikely(in_atomic_preempt_off()) && unlikely(!prev->exit_state))
__schedule_bug(prev);
- profile_hit(SCHED_PROFILING, __builtin_return_address(0));
+ immediate_if (&sched_profiling)
+ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
schedstat_inc(this_rq(), sched_cnt);
}
Index: linux-2.6-lttng/drivers/kvm/svm.c
===================================================================
--- linux-2.6-lttng.orig/drivers/kvm/svm.c 2007-07-13 19:30:58.000000000 -0400
+++ linux-2.6-lttng/drivers/kvm/svm.c 2007-07-13 19:32:00.000000000 -0400
@@ -1654,7 +1654,7 @@
/*
* Profile KVM exit RIPs:
*/
- if (unlikely(prof_on == KVM_PROFILING))
+ immediate_if (&kvm_profiling)
profile_hit(KVM_PROFILING,
(void *)(unsigned long)vcpu->svm->vmcb->save.rip);
Index: linux-2.6-lttng/drivers/kvm/vmx.c
===================================================================
--- linux-2.6-lttng.orig/drivers/kvm/vmx.c 2007-07-13 19:30:58.000000000 -0400
+++ linux-2.6-lttng/drivers/kvm/vmx.c 2007-07-13 19:32:00.000000000 -0400
@@ -2156,7 +2156,7 @@
/*
* Profile KVM exit RIPs:
*/
- if (unlikely(prof_on == KVM_PROFILING))
+ immediate_if (&kvm_profiling)
profile_hit(KVM_PROFILING, (void *)vmcs_readl(GUEST_RIP));
vcpu->launched = 1;
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Re: [patch 1/2] F00F bug fixup for i386 - use immediate values
2007-07-14 1:26 ` [patch 1/2] F00F bug fixup for i386 - use immediate values Mathieu Desnoyers
@ 2007-07-14 7:22 ` Alexey Dobriyan
2007-07-15 2:49 ` Mathieu Desnoyers
0 siblings, 1 reply; 6+ messages in thread
From: Alexey Dobriyan @ 2007-07-14 7:22 UTC
To: Mathieu Desnoyers; +Cc: akpm, linux-kernel
On Fri, Jul 13, 2007 at 09:26:43PM -0400, Mathieu Desnoyers wrote:
> Use the faster immediate values for F00F bug handling in do_page_fault.
> --- linux-2.6-lttng.orig/arch/i386/mm/fault.c
> +++ linux-2.6-lttng/arch/i386/mm/fault.c
> @@ -492,7 +493,7 @@
> /*
> * Pentium F0 0F C7 C8 bug workaround.
> */
> - if (boot_cpu_data.f00f_bug) {
> + immediate_if (&f00f_bug_fix) {
This code is not called during normal pagefaults and even during invalid
userspace accesses.
Out of curiosity, I inserted printk() at this place to see where I was
wrong. I got only two hits:
Checking if this processor honours the WP bit even in supervisor mode... do_page_fault:
Freeing unused kernel memory: 116k freed
do_page_fault:
Resume: nobody gives a fuck about performance of this particular if,
so conversion is totally pointless.
* Re: [patch 1/2] F00F bug fixup for i386 - use immediate values
2007-07-14 7:22 ` Alexey Dobriyan
@ 2007-07-15 2:49 ` Mathieu Desnoyers
2007-07-15 2:54 ` Mathieu Desnoyers
0 siblings, 1 reply; 6+ messages in thread
From: Mathieu Desnoyers @ 2007-07-15 2:49 UTC
To: Alexey Dobriyan; +Cc: akpm, linux-kernel
* Alexey Dobriyan (adobriyan@gmail.com) wrote:
> On Fri, Jul 13, 2007 at 09:26:43PM -0400, Mathieu Desnoyers wrote:
> > Use the faster immediate values for F00F bug handling in do_page_fault.
>
> > --- linux-2.6-lttng.orig/arch/i386/mm/fault.c
> > +++ linux-2.6-lttng/arch/i386/mm/fault.c
> > @@ -492,7 +493,7 @@
> > /*
> > * Pentium F0 0F C7 C8 bug workaround.
> > */
> > - if (boot_cpu_data.f00f_bug) {
> > + immediate_if (&f00f_bug_fix) {
>
> This code is not called during normal pagefaults and even during invalid
> userspace accesses.
>
> Out of curiosity, I inserted printk() at this place to see where I was
> wrong. I got only two hits:
>
> Checking if this processor honours the WP bit even in supervisor mode... do_page_fault:
> Freeing unused kernel memory: 116k freed
> do_page_fault:
>
> Resume: nobody gives a fuck about performance of this particular if,
> so conversion is totally pointless.
>
Interesting investigation, let's push it further:
instrumenting the f00f test site with a printk, I get:
[ 0.000000] Checking if this processor honours the WP bit even in
supervisor mode... TEST: would test f00f bug at vadd ffecc000, eip c011928e
[ 0.000000] Ok.
... and (whenever xdm restarts) :
[ 64.768165] TEST: would test f00f bug at vadd 00000000, eip c0237596
[ 64.787136] TEST: would test f00f bug at vadd 0000004c, eip c02375a2
Those EIPs are:
0xc011928e <do_test_wp_bit+20>: mov %cl,0xffecd000(%edx)
-> Will trigger fixup_exception.
0xc0237596 <__copy_from_user_ll+53>: rep movsl %ds:(%esi),%es:(%edi)
0xc02375a2 <__copy_from_user_ll+65>: mov 0x20(%esi),%eax
-> Those look like user-space programs that gave NULL pointers to kernel
system calls.
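For illustration, this kind of in-kernel fault can be provoked from user
space by handing a NULL buffer to a system call that copies from it. The
sketch below is an assumption-laden example rather than what xdm actually
does: it uses write(2) on a pipe (my choice, since the pipe has to read the
buffer), and write_null_buffer_errno() is a hypothetical helper name. On
Linux I would expect the faulting copy to be caught by the exception fixup,
so the call fails with EFAULT instead of the process receiving SIGSEGV.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Pass a NULL buffer to write(2) on a pipe and report the resulting errno.
 * Returns 0 if the write unexpectedly succeeds, -1 if no pipe can be created.
 */
static int write_null_buffer_errno(void)
{
	/* volatile: keep the compiler from special-casing the NULL argument */
	const void *volatile buf = NULL;
	int fds[2];
	int err = 0;

	if (pipe(fds) != 0)
		return -1;

	/* The kernel must copy one byte from address 0, which faults. */
	if (write(fds[1], buf, 1) < 0)
		err = errno;

	close(fds[0]);
	close(fds[1]);
	return err;
}
```

The caller keeps running after the failed call, which is the difference from
a direct user-space NULL dereference.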
I agree with you that this is not a "hot path". It was mostly a
straightforward test conversion.
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Re: [patch 1/2] F00F bug fixup for i386 - use immediate values
2007-07-15 2:49 ` Mathieu Desnoyers
@ 2007-07-15 2:54 ` Mathieu Desnoyers
0 siblings, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2007-07-15 2:54 UTC
To: Alexey Dobriyan; +Cc: akpm, linux-kernel
* Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca) wrote:
> * Alexey Dobriyan (adobriyan@gmail.com) wrote:
> > On Fri, Jul 13, 2007 at 09:26:43PM -0400, Mathieu Desnoyers wrote:
> > > Use the faster immediate values for F00F bug handling in do_page_fault.
> >
> > > --- linux-2.6-lttng.orig/arch/i386/mm/fault.c
> > > +++ linux-2.6-lttng/arch/i386/mm/fault.c
> > > @@ -492,7 +493,7 @@
> > > /*
> > > * Pentium F0 0F C7 C8 bug workaround.
> > > */
> > > - if (boot_cpu_data.f00f_bug) {
> > > + immediate_if (&f00f_bug_fix) {
> >
> > This code is not called during normal pagefaults and even during invalid
> > userspace accesses.
> >
> > Out of curiosity, I inserted printk() at this place to see where I was
> > wrong. I got only two hits:
> >
> > Checking if this processor honours the WP bit even in supervisor mode... do_page_fault:
> > Freeing unused kernel memory: 116k freed
> > do_page_fault:
> >
> > Resume: nobody gives a fuck about performance of this particular if,
> > so conversion is totally pointless.
> >
>
> Interesting investigation, let's push it further:
>
> instrumenting the f00f test site with a printk, I get:
>
> [ 0.000000] Checking if this processor honours the WP bit even in
> supervisor mode... TEST: would test f00f bug at vadd ffecc000, eip c011928e
> [ 0.000000] Ok.
> ... and (whenever xdm restarts) :
> [ 64.768165] TEST: would test f00f bug at vadd 00000000, eip c0237596
> [ 64.787136] TEST: would test f00f bug at vadd 0000004c, eip c02375a2
>
> Those EIPs are:
>
> 0xc011928e <do_test_wp_bit+20>: mov %cl,0xffecd000(%edx)
> -> Will trigger fixup_exception.
>
> 0xc0237596 <__copy_from_user_ll+53>: rep movsl %ds:(%esi),%es:(%edi)
> 0xc02375a2 <__copy_from_user_ll+65>: mov 0x20(%esi),%eax
> -> Those look like user-space programs that gave NULL pointers to kernel
> system calls.
>
> I agree with you that this is not a "hot path". It was mostly a
> straightforward test conversion.
>
> Mathieu
Actually, what is even more weird is that my system is configured to
output the segfaults, i.e. a small user-space program that does a memory
read of *(char*)NULL causes:
[ 1413.164212] null[3740]: segfault at 00000000 eip 0804836a esp
bffde2c0 error 4
But there is no message for the __copy_from_user_ll fault, so I wonder
1 - has the process been killed ?
2 - Is it just the printk that is missing ?
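For reference, a test program of the kind described above could look like
the following sketch. The original program is not shown in this thread, so
this is a reconstruction under assumptions: the read of address 0 is done in
a forked child so that the parent can report which signal ended it, and
null_read_signal() is a hypothetical name.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that reads one byte from address 0.
 * Returns the signal that terminated the child, 0 if it exited normally,
 * or -1 if fork() or waitpid() failed.
 */
static int null_read_signal(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* volatile: the compiler cannot assume the pointer is NULL */
		char *volatile p = NULL;
		char c = *p;	/* user-space read of address 0 */

		_exit(c);	/* not reached when the read faults */
	}

	if (waitpid(pid, &status, 0) != pid)
		return -1;

	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a system where page 0 is not mapped, the child should be terminated by
SIGSEGV, which is the case that produces the segfault message above.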
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68