* [PATCH 0/4] percpu: Optimize percpu accesses
@ 2008-06-04 0:30 Mike Travis
From: Mike Travis @ 2008-06-04 0:30 UTC
To: Ingo Molnar
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

This patchset provides the following:

  * Generic: Percpu infrastructure to rebase the per cpu area to zero

    This provides for the capability of accessing the percpu variables
    using a local register instead of having to go through a table
    on node 0 to find the cpu-specific offsets.  It also would allow
    atomic operations on percpu variables to reduce required locking.
    Uses a new config var HAVE_ZERO_BASED_PER_CPU to indicate to the
    generic code that the arch has this new basing.

  * x86_64: Fold pda into per cpu area

    Declare the pda as a per cpu variable.  This will move the pda
    area to an address accessible by the x86_64 per cpu macros.
    Subtraction of __per_cpu_start will make the offset based from
    the beginning of the per cpu area.  Since %gs is pointing to the
    pda, it will then also point to the per cpu variables and can be
    accessed thusly:

	%gs:[&per_cpu_xxxx - __per_cpu_start]

  * x86_64: Rebase per cpu variables to zero

    Take advantage of the zero-based per cpu area provided above.
    Then we can directly use the x86_32 percpu operations. x86_32
    offsets %fs by __per_cpu_start. x86_64 has %gs pointing directly
    to the pda and the per cpu area thereby allowing access to the
    pda with the x86_64 pda operations and access to the per cpu
    variables using x86_32 percpu operations.

Based on linux-2.6.tip

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
--
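
For readers following the thread, the addressing scheme in the cover letter can be modelled in plain userspace C. This is only a sketch under stated assumptions: DEMO_NR_CPUS, demo_percpu_layout and the demo_* helpers are hypothetical names, not kernel symbols, and an array of per-cpu copies stands in for the segment base that %gs would supply.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Layout of the per-cpu "section"; offsets inside it start at zero. */
struct demo_percpu_layout {
	long counter;
	int state;
};

/* One private copy of the area per CPU; stands in for the %gs base. */
static struct demo_percpu_layout demo_area[DEMO_NR_CPUS];

/* Zero-based "address" of a variable: its offset from the area start. */
#define DEMO_OFFSET(field) offsetof(struct demo_percpu_layout, field)

/* Final address = per-cpu base + zero-based offset (what %gs:offset does). */
static void *demo_ptr(int cpu, size_t offset)
{
	return (unsigned char *)&demo_area[cpu] + offset;
}

static void demo_add_counter(int cpu, long val)
{
	long *p = demo_ptr(cpu, DEMO_OFFSET(counter));

	*p += val;
}

static long demo_read_counter(int cpu)
{
	long *p = demo_ptr(cpu, DEMO_OFFSET(counter));

	return *p;
}
```

The point of the model is that the offset of a variable is the same on every CPU, so only the base differs; no per-cpu offset table lookup is needed to form the address.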

* [PATCH 1/4] Zero based percpu: Infrastructure to rebase the per cpu area to zero
From: Mike Travis @ 2008-06-04 0:30 UTC
To: Ingo Molnar
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

[-- Attachment #1: zero_based_infrastructure --]
[-- Type: text/plain, Size: 7044 bytes --]

  * Support an option

	CONFIG_HAVE_ZERO_BASED_PER_CPU

    to make offsets for per cpu variables to start at zero.

    If a percpu area starts at zero then:

    -  We do not need RELOC_HIDE anymore

    -  Provides for the future capability of architectures providing
       a per cpu allocator that returns offsets instead of pointers.
       The offsets would be independent of the processor so that
       address calculations can be done in a processor independent way.
       Per cpu instructions can then add the processor specific offset
       at the last minute possibly in an atomic instruction.

    The data the linker provides is different for zero based percpu segments:

	__per_cpu_load	-> The address at which the percpu area was loaded
	__per_cpu_size	-> The length of the per cpu area

  * Removes the &__per_cpu_x in lockdep. The __per_cpu_x are already
    pointers. There is no need to take the address.

  * Updates kernel/module.c to be able to deal with a percpu area that
    is loaded at __per_cpu_load but is accessed at __per_cpu_start.

Based on linux-2.6.tip

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
 include/asm-generic/percpu.h      |    9 ++++++++-
 include/asm-generic/sections.h    |   10 ++++++++++
 include/asm-generic/vmlinux.lds.h |   16 ++++++++++++++++
 include/linux/percpu.h            |   17 ++++++++++++++++-
 kernel/lockdep.c                  |    4 ++--
 kernel/module.c                   |    7 ++++---
 6 files changed, 56 insertions(+), 7 deletions(-)

--- linux-2.6.tip.orig/include/asm-generic/percpu.h
+++ linux-2.6.tip/include/asm-generic/percpu.h
@@ -45,7 +45,12 @@ extern unsigned long __per_cpu_offset[NR
  * Only S390 provides its own means of moving the pointer.
  */
 #ifndef SHIFT_PERCPU_PTR
-#define SHIFT_PERCPU_PTR(__p, __offset)	RELOC_HIDE((__p), (__offset))
+# ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
+# define SHIFT_PERCPU_PTR(__p, __offset) \
+	((__typeof(__p))(((void *)(__p)) + (__offset)))
+# else
+# define SHIFT_PERCPU_PTR(__p, __offset)	RELOC_HIDE((__p), (__offset))
+# endif /* CONFIG_HAVE_ZERO_BASED_PER_CPU */
 #endif
 
 /*
@@ -70,6 +75,8 @@ extern void setup_per_cpu_areas(void);
 #define per_cpu(var, cpu)			(*((void)(cpu), &per_cpu_var(var)))
 #define __get_cpu_var(var)			per_cpu_var(var)
 #define __raw_get_cpu_var(var)			per_cpu_var(var)
+#define SHIFT_PERCPU_PTR(__p, __offset)		(__p)
+#define per_cpu_offset(x)			0L
 
 #endif	/* SMP */
 
--- linux-2.6.tip.orig/include/asm-generic/sections.h
+++ linux-2.6.tip/include/asm-generic/sections.h
@@ -9,7 +9,17 @@ extern char __bss_start[], __bss_stop[];
 extern char __init_begin[], __init_end[];
 extern char _sinittext[], _einittext[];
 extern char _end[];
+#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
+extern char __per_cpu_load[];
+extern char ____per_cpu_size[];
+#define __per_cpu_size ((unsigned long)&____per_cpu_size)
+#define __per_cpu_start ((char *)0)
+#define __per_cpu_end ((char *)__per_cpu_size)
+#else
 extern char __per_cpu_start[], __per_cpu_end[];
+#define __per_cpu_load __per_cpu_start
+#define __per_cpu_size (__per_cpu_end - __per_cpu_start)
+#endif
 extern char __kprobes_text_start[], __kprobes_text_end[];
 extern char __initdata_begin[], __initdata_end[];
 extern char __start_rodata[], __end_rodata[];
--- linux-2.6.tip.orig/include/asm-generic/vmlinux.lds.h
+++ linux-2.6.tip/include/asm-generic/vmlinux.lds.h
@@ -371,6 +371,21 @@
 	*(.initcall7.init)						\
 	*(.initcall7s.init)
 
+#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
+#define PERCPU(align)							\
+	. = ALIGN(align);						\
+	percpu : { } :percpu						\
+	__per_cpu_load = .;						\
+	.data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) {		\
+		*(.data.percpu.first)					\
+		*(.data.percpu.shared_aligned)				\
+		*(.data.percpu)						\
+		*(.data.percpu.page_aligned)				\
+		____per_cpu_size = .;					\
+	}								\
+	. = __per_cpu_load + ____per_cpu_size;				\
+	data : { } :data
+#else
 #define PERCPU(align)							\
 	. = ALIGN(align);						\
 	__per_cpu_start = .;						\
@@ -380,3 +395,4 @@
 		*(.data.percpu.shared_aligned)				\
 	}								\
 	__per_cpu_end = .;
+#endif
--- linux-2.6.tip.orig/include/linux/percpu.h
+++ linux-2.6.tip/include/linux/percpu.h
@@ -27,7 +27,18 @@
 #define DEFINE_PER_CPU_PAGE_ALIGNED(type, name)			\
 	__attribute__((__section__(".data.percpu.page_aligned")))	\
 	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name
+
+#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
+#define DEFINE_PER_CPU_FIRST(type, name)				\
+	__attribute__((__section__(".data.percpu.first")))		\
+	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name
 #else
+#define DEFINE_PER_CPU_FIRST(type, name)				\
+	DEFINE_PER_CPU(type, name)
+#endif
+
+#else /* !CONFIG_SMP */
+
 #define DEFINE_PER_CPU(type, name)					\
 	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name
 
@@ -36,7 +47,11 @@
 
 #define DEFINE_PER_CPU_PAGE_ALIGNED(type, name)		      \
 	DEFINE_PER_CPU(type, name)
-#endif
+
+#define DEFINE_PER_CPU_FIRST(type, name)				\
+	DEFINE_PER_CPU(type, name)
+
+#endif /* !CONFIG_SMP */
 
 #define EXPORT_PER_CPU_SYMBOL(var) EXPORT_SYMBOL(per_cpu__##var)
 #define EXPORT_PER_CPU_SYMBOL_GPL(var) EXPORT_SYMBOL_GPL(per_cpu__##var)
--- linux-2.6.tip.orig/kernel/lockdep.c
+++ linux-2.6.tip/kernel/lockdep.c
@@ -614,8 +614,8 @@ static int static_obj(void *obj)
 	 * percpu var?
 	 */
 	for_each_possible_cpu(i) {
-		start = (unsigned long) &__per_cpu_start + per_cpu_offset(i);
-		end   = (unsigned long) &__per_cpu_start + PERCPU_ENOUGH_ROOM
+		start = (unsigned long) __per_cpu_start + per_cpu_offset(i);
+		end   = (unsigned long) __per_cpu_start + PERCPU_ENOUGH_ROOM
 					+ per_cpu_offset(i);
 
 		if ((addr >= start) && (addr < end))
--- linux-2.6.tip.orig/kernel/module.c
+++ linux-2.6.tip/kernel/module.c
@@ -45,6 +45,7 @@
 #include <linux/unwind.h>
 #include <asm/uaccess.h>
 #include <asm/cacheflush.h>
+#include <asm/sections.h>
 #include <linux/license.h>
 #include <asm/sections.h>
 #include <linux/marker.h>
@@ -367,7 +368,7 @@ static void *percpu_modalloc(unsigned lo
 		align = PAGE_SIZE;
 	}
 
-	ptr = __per_cpu_start;
+	ptr = __per_cpu_load;
 	for (i = 0; i < pcpu_num_used; ptr += block_size(pcpu_size[i]), i++) {
 		/* Extra for alignment requirement. */
 		extra = ALIGN((unsigned long)ptr, align) - (unsigned long)ptr;
@@ -402,7 +403,7 @@ static void *percpu_modalloc(unsigned lo
 static void percpu_modfree(void *freeme)
 {
 	unsigned int i;
-	void *ptr = __per_cpu_start + block_size(pcpu_size[0]);
+	void *ptr = __per_cpu_load + block_size(pcpu_size[0]);
 
 	/* First entry is core kernel percpu data. */
 	for (i = 1; i < pcpu_num_used; ptr += block_size(pcpu_size[i]), i++) {
@@ -453,7 +454,7 @@ static int percpu_modinit(void)
 	pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated,
 			    GFP_KERNEL);
 	/* Static in-kernel percpu data (used). */
-	pcpu_size[0] = -(__per_cpu_end-__per_cpu_start);
+	pcpu_size[0] = -__per_cpu_size;
 	/* Free room. */
 	pcpu_size[1] = PERCPU_ENOUGH_ROOM + pcpu_size[0];
 	if (pcpu_size[1] < 0) {

-- 
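
The split between where the per cpu image is loaded and where it is addressed can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: demo_image, demo_per_cpu_load, demo_setup_per_cpu_areas() and the other demo_* names are hypothetical, malloc() stands in for the bootmem allocator, and integer arithmetic on uintptr_t stands in for the pointer shift that SHIFT_PERCPU_PTR performs. The areas are intentionally never freed in this model.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NR_CPUS 2

/* Stand-in for the linked image of the per-cpu section. */
struct demo_image {
	int a;
	long b;
};

/* Plays the role of __per_cpu_load: where the initial image lives. */
static const struct demo_image demo_per_cpu_load = { 11, 22 };

/* Plays the role of __per_cpu_size. */
#define DEMO_PER_CPU_SIZE sizeof(struct demo_image)

/* Plays the role of __per_cpu_start in the zero-based case. */
#define DEMO_PER_CPU_START ((uintptr_t)0)

static uintptr_t demo_per_cpu_offset[DEMO_NR_CPUS];

/* Copy the load image into a fresh area per CPU and record its offset. */
static int demo_setup_per_cpu_areas(void)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		void *ptr = malloc(DEMO_PER_CPU_SIZE);

		if (!ptr)
			return -1;
		memcpy(ptr, &demo_per_cpu_load, DEMO_PER_CPU_SIZE);
		/* With a zero start, the offset is simply the area address. */
		demo_per_cpu_offset[cpu] = (uintptr_t)ptr - DEMO_PER_CPU_START;
	}
	return 0;
}

/* Zero-based variable "address" plus per-cpu offset: plain addition. */
static void *demo_shift_percpu_ptr(uintptr_t zero_based, uintptr_t offset)
{
	return (void *)(zero_based + offset);
}

/* Pointer to this CPU's copy of field b. */
static long *demo_b_ptr(int cpu)
{
	return demo_shift_percpu_ptr(offsetof(struct demo_image, b),
				     demo_per_cpu_offset[cpu]);
}
```

Writing through demo_b_ptr(1) leaves both the load image and CPU 0's copy untouched, which is the property the kernel/module.c changes rely on when they walk the area starting from __per_cpu_load.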

* Re: [PATCH 1/4] Zero based percpu: Infrastructure to rebase the per cpu area to zero
From: Ingo Molnar @ 2008-06-10 10:06 UTC
To: Mike Travis
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

* Mike Travis <travis@sgi.com> wrote:

>   * Support an option
>
> 	CONFIG_HAVE_ZERO_BASED_PER_CPU
>
>     to make offsets for per cpu variables to start at zero.
>
>     If a percpu area starts at zero then:
>
>     -  We do not need RELOC_HIDE anymore
>
>     -  Provides for the future capability of architectures providing
>        a per cpu allocator that returns offsets instead of pointers.
>        The offsets would be independent of the processor so that
>        address calculations can be done in a processor independent way.
>        Per cpu instructions can then add the processor specific offset
>        at the last minute possibly in an atomic instruction.
>
>     The data the linker provides is different for zero based percpu segments:
>
> 	__per_cpu_load	-> The address at which the percpu area was loaded
> 	__per_cpu_size	-> The length of the per cpu area
>
>   * Removes the &__per_cpu_x in lockdep. The __per_cpu_x are already
>     pointers. There is no need to take the address.
>
>   * Updates kernel/module.c to be able to deal with a percpu area that
>     is loaded at __per_cpu_load but is accessed at __per_cpu_start.
>
> Based on linux-2.6.tip

applied to tip/core/percpu, thanks.

	Ingo

* [PATCH 2/4] x86: Extend percpu ops to 64 bit
From: Mike Travis @ 2008-06-04 0:30 UTC
To: Ingo Molnar
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

[-- Attachment #1: zero_based_percpu_64bit --]
[-- Type: text/plain, Size: 3858 bytes --]

  * x86 percpu ops now will work on 64 bit too, so add the missing 8 byte cases.

  * Add a few atomic ops that will be useful in the future:

	x86_xchg_percpu()
	x86_cmpxchg_percpu().

	x86_inc_percpu() - Increment by one can generate more efficient
	x86_dec_percpu()   instructions and inc/dec will be supported by
			   cpu ops later.

  * Use per_cpu_var() instead of per_cpu__##xxx.

Based on linux-2.6.tip

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
 include/asm-x86/percpu.h |   83 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 78 insertions(+), 5 deletions(-)

--- linux-2.6.tip.orig/include/asm-x86/percpu.h
+++ linux-2.6.tip/include/asm-x86/percpu.h
@@ -108,6 +108,11 @@ do {							\
 		    : "+m" (var)			\
 		    : "ri" ((T__)val));			\
 		break;					\
+	case 8:						\
+		asm(op "q %1,"__percpu_seg"%0"		\
+		    : "+m" (var)			\
+		    : "ri" ((T__)val));			\
+		break;					\
 	default: __bad_percpu_size();			\
 	}						\
 } while (0)
@@ -131,16 +136,84 @@ do {							\
 		    : "=r" (ret__)			\
 		    : "m" (var));			\
 		break;					\
+	case 8:						\
+		asm(op "q "__percpu_seg"%1,%0"		\
+		    : "=r" (ret__)			\
+		    : "m" (var));			\
+		break;					\
 	default: __bad_percpu_size();			\
 	}						\
 	ret__;						\
 })
 
-#define x86_read_percpu(var) percpu_from_op("mov", per_cpu__##var)
-#define x86_write_percpu(var, val) percpu_to_op("mov", per_cpu__##var, val)
-#define x86_add_percpu(var, val) percpu_to_op("add", per_cpu__##var, val)
-#define x86_sub_percpu(var, val) percpu_to_op("sub", per_cpu__##var, val)
-#define x86_or_percpu(var, val) percpu_to_op("or", per_cpu__##var, val)
+#define percpu_addr_op(op, var)				\
+({							\
+	switch (sizeof(var)) {				\
+	case 1:						\
+		asm(op "b "__percpu_seg"%0"		\
+				: : "m"(var));		\
+		break;					\
+	case 2:						\
+		asm(op "w "__percpu_seg"%0"		\
+				: : "m"(var));		\
+		break;					\
+	case 4:						\
+		asm(op "l "__percpu_seg"%0"		\
+				: : "m"(var));		\
+		break;					\
+	case 8:						\
+		asm(op "q "__percpu_seg"%0"		\
+				: : "m"(var));		\
+		break;					\
+	default: __bad_percpu_size();			\
+	}						\
+})
+
+#define percpu_cmpxchg_op(var, old, new)				\
+({									\
+	typeof(var) prev;						\
+	switch (sizeof(var)) {						\
+	case 1:								\
+		asm("cmpxchgb %b1, "__percpu_seg"%2"			\
+				     : "=a"(prev)			\
+				     : "q"(new), "m"(var), "0"(old)	\
+				     : "memory");			\
+		break;							\
+	case 2:								\
+		asm("cmpxchgw %w1, "__percpu_seg"%2"			\
+				     : "=a"(prev)			\
+				     : "r"(new), "m"(var), "0"(old)	\
+				     : "memory");			\
+		break;							\
+	case 4:								\
+		asm("cmpxchgl %k1, "__percpu_seg"%2"			\
+				     : "=a"(prev)			\
+				     : "r"(new), "m"(var), "0"(old)	\
+				     : "memory");			\
+		break;							\
+	case 8:								\
+		asm("cmpxchgq %1, "__percpu_seg"%2"			\
+				     : "=a"(prev)			\
+				     : "r"(new), "m"(var), "0"(old)	\
+				     : "memory");			\
+		break;							\
+	default:							\
+		__bad_percpu_size();					\
+	}								\
+	return prev;							\
+})
+
+#define x86_read_percpu(var) percpu_from_op("mov", per_cpu_var(var))
+#define x86_write_percpu(var, val) percpu_to_op("mov", per_cpu_var(var), val)
+#define x86_add_percpu(var, val) percpu_to_op("add", per_cpu_var(var), val)
+#define x86_sub_percpu(var, val) percpu_to_op("sub", per_cpu_var(var), val)
+#define x86_inc_percpu(var) percpu_addr_op("inc", per_cpu_var(var))
+#define x86_dec_percpu(var) percpu_addr_op("dec", per_cpu_var(var))
+#define x86_or_percpu(var, val) percpu_to_op("or", per_cpu_var(var), val)
+#define x86_xchg_percpu(var, val) percpu_to_op("xchg", per_cpu_var(var), val)
+#define x86_cmpxchg_percpu(var, old, new) \
+	percpu_cmpxchg_op(per_cpu_var(var), old, new)
+
 #endif /* !__ASSEMBLY__ */
 #endif /* !CONFIG_X86_64 */
 

-- 

* Re: [PATCH 2/4] x86: Extend percpu ops to 64 bit
From: Ingo Molnar @ 2008-06-10 10:04 UTC
To: Mike Travis
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

* Mike Travis <travis@sgi.com> wrote:

>   * x86 percpu ops now will work on 64 bit too, so add the missing 8 byte cases.
>
>   * Add a few atomic ops that will be useful in the future:
>
> 	x86_xchg_percpu()
> 	x86_cmpxchg_percpu().
>
> 	x86_inc_percpu() - Increment by one can generate more efficient
> 	x86_dec_percpu()   instructions and inc/dec will be supported by
> 			   cpu ops later.
>
>   * Use per_cpu_var() instead of per_cpu__##xxx.
>
> Based on linux-2.6.tip

applied to tip/cpus4096, thanks Mike.

	Ingo

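
As a reading aid for the compare-and-exchange helper in patch 2/4, here is a non-atomic C model of what cmpxchg returns, written in the same GNU C statement-expression style as the percpu macros (so it assumes GCC or Clang). demo_cmpxchg, demo_slot and demo_try_update are hypothetical names used only for this sketch. One detail worth noting: a statement expression takes its value from its final expression, so the model ends in `prev__;`. The posted percpu_cmpxchg_op ends in `return prev;`, which inside a statement expression would return from the calling function instead, so it appears that a plain `prev;` was intended there.

```c
/*
 * Non-atomic model of compare-and-exchange as a GNU C statement
 * expression. The real instruction does the compare and the store as
 * one step; this sketch only reproduces the returned value and the
 * conditional update.
 */
#define demo_cmpxchg(var, old, new)				\
({								\
	__typeof__(var) prev__ = (var);				\
	if (prev__ == (old))					\
		(var) = (new);					\
	prev__;		/* value of the whole expression */	\
})

static long demo_slot = 3;

/* Returns 1 if the swap happened, 0 if demo_slot held another value. */
static int demo_try_update(long expected, long desired)
{
	return demo_cmpxchg(demo_slot, expected, desired) == expected;
}
```

Callers compare the returned previous value with the value they expected, as demo_try_update() does, to learn whether the store took place.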
* [PATCH 3/4] x86_64: Fold pda into per cpu area
From: Mike Travis @ 2008-06-04 0:30 UTC
To: Ingo Molnar
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

[-- Attachment #1: zero_based_fold --]
[-- Type: text/plain, Size: 16555 bytes --]

  * Declare the pda as a per cpu variable.

  * Make the x86_64 per cpu area start at zero.

  * Since the pda is now the first element of the per_cpu area, cpu_pda()
    is no longer needed and per_cpu() can be used instead.  This also makes
    the _cpu_pda[] table obsolete.

  * Since %gs is pointing to the pda, it will then also point to the per cpu
    variables and can be accessed thusly:

	%gs:[&per_cpu_xxxx - __per_cpu_start]

Based on linux-2.6.tip

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/Kconfig                 |    3 +
 arch/x86/kernel/head64.c         |   34 ++++++--------
 arch/x86/kernel/irq_64.c         |   36 ++++++++-------
 arch/x86/kernel/setup.c          |   90 ++++++++++++---------------------------
 arch/x86/kernel/setup64.c        |    5 --
 arch/x86/kernel/smpboot.c        |   51 ----------------------
 arch/x86/kernel/traps_64.c       |   11 +++-
 arch/x86/kernel/vmlinux_64.lds.S |    1
 include/asm-x86/percpu.h         |   48 ++++++--------------
 9 files changed, 89 insertions(+), 190 deletions(-)

--- linux-2.6.tip.orig/arch/x86/Kconfig
+++ linux-2.6.tip/arch/x86/Kconfig
@@ -129,6 +129,9 @@ config HAVE_SETUP_PER_CPU_AREA
 config HAVE_CPUMASK_OF_CPU_MAP
 	def_bool X86_64_SMP
 
+config HAVE_ZERO_BASED_PER_CPU
+	def_bool X86_64_SMP
+
 config ARCH_HIBERNATION_POSSIBLE
 	def_bool y
 	depends on !SMP || !X86_VOYAGER
--- linux-2.6.tip.orig/arch/x86/kernel/head64.c
+++ linux-2.6.tip/arch/x86/kernel/head64.c
@@ -25,20 +25,6 @@
 #include <asm/e820.h>
 #include <asm/bios_ebda.h>
 
-/* boot cpu pda */
-static struct x8664_pda _boot_cpu_pda __read_mostly;
-
-#ifdef CONFIG_SMP
-/*
- * We install an empty cpu_pda pointer table to indicate to early users
- * (numa_set_node) that the cpu_pda pointer table for cpus other than
- * the boot cpu is not yet setup.
- */
-static struct x8664_pda *__cpu_pda[NR_CPUS] __initdata;
-#else
-static struct x8664_pda *__cpu_pda[NR_CPUS] __read_mostly;
-#endif
-
 static void __init zap_identity_mappings(void)
 {
 	pgd_t *pgd = pgd_offset_k(0UL);
@@ -159,6 +145,20 @@ void __init x86_64_start_kernel(char * r
 	/* Cleanup the over mapped high alias */
 	cleanup_highmap();
 
+	/* point to boot pda which is the first element in the percpu area */
+	{
+		struct x8664_pda *pda;
+#ifdef CONFIG_SMP
+		pda = (struct x8664_pda *)__per_cpu_load;
+		pda->data_offset = per_cpu_offset(0) = (unsigned long)pda;
+#else
+		pda = &per_cpu(pda, 0);
+		pda->data_offset = (unsigned long)pda;
+#endif
+	}
+	/* initialize boot cpu_pda data */
+	pda_init(0);
+
 	for (i = 0; i < NUM_EXCEPTION_VECTORS; i++) {
 #ifdef CONFIG_EARLY_PRINTK
 		set_intr_gate(i, &early_idt_handlers[i]);
@@ -170,12 +170,6 @@ void __init x86_64_start_kernel(char * r
 
 	early_printk("Kernel alive\n");
 
-	_cpu_pda = __cpu_pda;
-	cpu_pda(0) = &_boot_cpu_pda;
-	pda_init(0);
-
-	early_printk("Kernel really alive\n");
-
 	copy_bootdata(__va(real_mode_data));
 
 	reserve_early(__pa_symbol(&_text), __pa_symbol(&_end), "TEXT DATA BSS");
--- linux-2.6.tip.orig/arch/x86/kernel/irq_64.c
+++ linux-2.6.tip/arch/x86/kernel/irq_64.c
@@ -115,39 +115,43 @@ skip:
 	} else if (i == NR_IRQS) {
 		seq_printf(p, "NMI: ");
 		for_each_online_cpu(j)
-			seq_printf(p, "%10u ", cpu_pda(j)->__nmi_count);
+			seq_printf(p, "%10u ", per_cpu(pda.__nmi_count, j));
 		seq_printf(p, " Non-maskable interrupts\n");
 		seq_printf(p, "LOC: ");
 		for_each_online_cpu(j)
-			seq_printf(p, "%10u ", cpu_pda(j)->apic_timer_irqs);
+			seq_printf(p, "%10u ", per_cpu(pda.apic_timer_irqs, j));
 		seq_printf(p, " Local timer interrupts\n");
 #ifdef CONFIG_SMP
 		seq_printf(p, "RES: ");
 		for_each_online_cpu(j)
-			seq_printf(p, "%10u ", cpu_pda(j)->irq_resched_count);
+			seq_printf(p, "%10u ",
+				per_cpu(pda.irq_resched_count, j));
 		seq_printf(p, " Rescheduling interrupts\n");
 		seq_printf(p, "CAL: ");
 		for_each_online_cpu(j)
-			seq_printf(p, "%10u ", cpu_pda(j)->irq_call_count);
+			seq_printf(p, "%10u ", per_cpu(pda.irq_call_count, j));
 		seq_printf(p, " function call interrupts\n");
 		seq_printf(p, "TLB: ");
 		for_each_online_cpu(j)
-			seq_printf(p, "%10u ", cpu_pda(j)->irq_tlb_count);
+			seq_printf(p, "%10u ", per_cpu(pda.irq_tlb_count, j));
 		seq_printf(p, " TLB shootdowns\n");
 #endif
 #ifdef CONFIG_X86_MCE
 		seq_printf(p, "TRM: ");
 		for_each_online_cpu(j)
-			seq_printf(p, "%10u ", cpu_pda(j)->irq_thermal_count);
+			seq_printf(p, "%10u ",
+				per_cpu(pda.irq_thermal_count, j));
 		seq_printf(p, " Thermal event interrupts\n");
 		seq_printf(p, "THR: ");
 		for_each_online_cpu(j)
-			seq_printf(p, "%10u ", cpu_pda(j)->irq_threshold_count);
+			seq_printf(p, "%10u ",
+				per_cpu(pda.irq_threshold_count, j));
 		seq_printf(p, " Threshold APIC interrupts\n");
 #endif
 		seq_printf(p, "SPU: ");
 		for_each_online_cpu(j)
-			seq_printf(p, "%10u ", cpu_pda(j)->irq_spurious_count);
+			seq_printf(p, "%10u ",
+				per_cpu(pda.irq_spurious_count, j));
 		seq_printf(p, " Spurious interrupts\n");
 		seq_printf(p, "ERR: %10u\n", atomic_read(&irq_err_count));
 	}
@@ -159,19 +163,19 @@ skip:
  */
 u64 arch_irq_stat_cpu(unsigned int cpu)
 {
-	u64 sum = cpu_pda(cpu)->__nmi_count;
+	u64 sum = per_cpu(pda.__nmi_count, cpu);
 
-	sum += cpu_pda(cpu)->apic_timer_irqs;
+	sum += per_cpu(pda.apic_timer_irqs, cpu);
 #ifdef CONFIG_SMP
-	sum += cpu_pda(cpu)->irq_resched_count;
-	sum += cpu_pda(cpu)->irq_call_count;
-	sum += cpu_pda(cpu)->irq_tlb_count;
+	sum += per_cpu(pda.irq_resched_count, cpu);
+	sum += per_cpu(pda.irq_call_count, cpu);
+	sum += per_cpu(pda.irq_tlb_count, cpu);
 #endif
 #ifdef CONFIG_X86_MCE
-	sum += cpu_pda(cpu)->irq_thermal_count;
-	sum += cpu_pda(cpu)->irq_threshold_count;
+	sum += per_cpu(pda.irq_thermal_count, cpu);
+	sum += per_cpu(pda.irq_threshold_count, cpu);
 #endif
-	sum += cpu_pda(cpu)->irq_spurious_count;
+	sum += per_cpu(pda.irq_spurious_count, cpu);
 	return sum;
 }
 
--- linux-2.6.tip.orig/arch/x86/kernel/setup.c
+++ linux-2.6.tip/arch/x86/kernel/setup.c
@@ -29,6 +29,11 @@ DEFINE_EARLY_PER_CPU(u16, x86_bios_cpu_a
 EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_apicid);
 EXPORT_EARLY_PER_CPU_SYMBOL(x86_bios_cpu_apicid);
 
+#ifdef CONFIG_X86_64
+DEFINE_PER_CPU_FIRST(struct x8664_pda, pda);
+EXPORT_PER_CPU_SYMBOL(pda);
+#endif
+
 #if defined(CONFIG_NUMA) && defined(CONFIG_X86_64)
 #define	X86_64_NUMA	1
 
@@ -47,7 +52,7 @@ static void __init setup_node_to_cpumask
 static inline void setup_node_to_cpumask_map(void) { }
 #endif
 
-#if defined(CONFIG_HAVE_SETUP_PER_CPU_AREA) && defined(CONFIG_SMP)
+#ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA
 /*
  * Copy data used in early init routines from the initial arrays to the
  * per cpu data areas.  These arrays then become expendable and the
@@ -94,64 +99,9 @@ static void __init setup_cpumask_of_cpu(
 static inline void setup_cpumask_of_cpu(void) { }
 #endif
 
-#ifdef CONFIG_X86_32
-/*
- * Great future not-so-futuristic plan: make i386 and x86_64 do it
- * the same way
- */
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);
 
-static inline void setup_cpu_pda_map(void) { }
-
-#elif !defined(CONFIG_SMP)
-static inline void setup_cpu_pda_map(void) { }
-
-#else /* CONFIG_SMP && CONFIG_X86_64 */
-
-/*
- * Allocate cpu_pda pointer table and array via alloc_bootmem.
- */
-static void __init setup_cpu_pda_map(void)
-{
-	char *pda;
-	struct x8664_pda **new_cpu_pda;
-	unsigned long size;
-	int cpu;
-
-	size = roundup(sizeof(struct x8664_pda), cache_line_size());
-
-	/* allocate cpu_pda array and pointer table */
-	{
-		unsigned long tsize = nr_cpu_ids * sizeof(void *);
-		unsigned long asize = size * (nr_cpu_ids - 1);
-
-		tsize = roundup(tsize, cache_line_size());
-		new_cpu_pda = alloc_bootmem(tsize + asize);
-		pda = (char *)new_cpu_pda + tsize;
-	}
-	/* initialize pointer table to static pda's */
-	for_each_possible_cpu(cpu) {
-		if (cpu == 0) {
-			/* leave boot cpu pda in place */
-			new_cpu_pda[0] = cpu_pda(0);
-			continue;
-		}
-		new_cpu_pda[cpu] = (struct x8664_pda *)pda;
-		new_cpu_pda[cpu]->in_bootmem = 1;
-		pda += size;
-	}
-
-	/* point to new pointer table */
-	_cpu_pda = new_cpu_pda;
-}
-#endif
-
-/*
- * Great future plan:
- * Declare PDA itself and support (irqstack,tss,pgd) as per cpu data.
- * Always point %gs to its beginning
- */
 void __init setup_per_cpu_areas(void)
 {
 	ssize_t size = PERCPU_ENOUGH_ROOM;
@@ -164,9 +114,6 @@ void __init setup_per_cpu_areas(void)
 	nr_cpu_ids = num_processors;
 #endif
 
-	/* Setup cpu_pda map */
-	setup_cpu_pda_map();
-
 	/* Copy section for each CPU (we discard the original) */
 	size = PERCPU_ENOUGH_ROOM;
 	printk(KERN_INFO "PERCPU: Allocating %lu bytes of per cpu data\n",
@@ -186,9 +133,28 @@ void __init setup_per_cpu_areas(void)
 		else
 			ptr = alloc_bootmem_pages_node(NODE_DATA(node), size);
 #endif
+		/* Initialize each cpu's per_cpu area and save pointer */
+		memcpy(ptr, __per_cpu_load, __per_cpu_size);
 		per_cpu_offset(cpu) = ptr - __per_cpu_start;
-		memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
 
+#ifdef CONFIG_X86_64
+		/*
+		 * Note the boot cpu has been using the static per_cpu load
+		 * area for it's pda. We need to zero out the pda's for the
+		 * other cpu's that are coming online.
+		 */
+		{
+			/* we rely on the fact that pda is the first element */
+			struct x8664_pda *pda = (struct x8664_pda *)ptr;
+
+			if (cpu)
+				memset(pda, 0, sizeof(struct x8664_pda));
+			else
+				pda_init(0);
+
+			pda->data_offset = (unsigned long)ptr;
+		}
+#endif
 	}
 
 	printk(KERN_DEBUG "NR_CPUS: %d, nr_cpu_ids: %d, nr_node_ids %d\n",
@@ -240,8 +206,8 @@ void __cpuinit numa_set_node(int cpu, in
 {
 	int *cpu_to_node_map = early_per_cpu_ptr(x86_cpu_to_node_map);
 
-	if (cpu_pda(cpu) && node != NUMA_NO_NODE)
-		cpu_pda(cpu)->nodenumber = node;
+	if (per_cpu_offset(cpu))
+		per_cpu(pda.nodenumber, cpu) = node;
 
 	if (cpu_to_node_map)
 		cpu_to_node_map[cpu] = node;
--- linux-2.6.tip.orig/arch/x86/kernel/setup64.c
+++ linux-2.6.tip/arch/x86/kernel/setup64.c
@@ -35,9 +35,6 @@ struct boot_params boot_params;
 
 cpumask_t cpu_initialized __cpuinitdata = CPU_MASK_NONE;
 
-struct x8664_pda **_cpu_pda __read_mostly;
-EXPORT_SYMBOL(_cpu_pda);
-
 struct desc_ptr idt_descr = { 256 * 16 - 1, (unsigned long) idt_table };
 
 char boot_cpu_stack[IRQSTACKSIZE] __attribute__((section(".bss.page_aligned")));
@@ -89,7 +86,7 @@ __setup("noexec32=", nonx32_setup);
 
 void pda_init(int cpu)
 {
-	struct x8664_pda *pda = cpu_pda(cpu);
+	struct x8664_pda *pda = &per_cpu(pda, cpu);
 
 	/* Setup up data that may be needed in __get_free_pages early */
 	asm volatile("movl %0,%%fs ; movl %0,%%gs" :: "r" (0));
--- linux-2.6.tip.orig/arch/x86/kernel/smpboot.c
+++ linux-2.6.tip/arch/x86/kernel/smpboot.c
@@ -798,45 +798,6 @@ static void __cpuinit do_fork_idle(struc
 	complete(&c_idle->done);
 }
 
-#ifdef CONFIG_X86_64
-/*
- * Allocate node local memory for the AP pda.
- *
- * Must be called after the _cpu_pda pointer table is initialized.
- */
-static int __cpuinit get_local_pda(int cpu)
-{
-	struct x8664_pda *oldpda, *newpda;
-	unsigned long size = sizeof(struct x8664_pda);
-	int node = cpu_to_node(cpu);
-
-	if (cpu_pda(cpu) && !cpu_pda(cpu)->in_bootmem)
-		return 0;
-
-	oldpda = cpu_pda(cpu);
-	newpda = kmalloc_node(size, GFP_ATOMIC, node);
-	if (!newpda) {
-		printk(KERN_ERR "Could not allocate node local PDA "
-			"for CPU %d on node %d\n", cpu, node);
-
-		if (oldpda)
-			return 0;	/* have a usable pda */
-		else
-			return -1;
-	}
-
-	if (oldpda) {
-		memcpy(newpda, oldpda, size);
-		if (!after_bootmem)
-			free_bootmem((unsigned long)oldpda, size);
-	}
-
-	newpda->in_bootmem = 0;
-	cpu_pda(cpu) = newpda;
-	return 0;
-}
-#endif /* CONFIG_X86_64 */
-
 static int __cpuinit do_boot_cpu(int apicid, int cpu)
 /*
  * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
@@ -860,14 +821,6 @@ static int __cpuinit do_boot_cpu(int api
 		printk(KERN_ERR "Failed to allocate GDT for CPU %d\n", cpu);
 		return -1;
 	}
-
-	/* Allocate node local memory for AP pdas */
-	if (cpu > 0) {
-		boot_error = get_local_pda(cpu);
-		if (boot_error)
-			goto restore_state;
-			/* if can't get pda memory, can't start cpu */
-	}
 #endif
 
 	alternatives_smp_switch(1);
@@ -908,7 +861,7 @@ do_rest:
 	stack_start.sp = (void *) c_idle.idle->thread.sp;
 	irq_ctx_init(cpu);
 #else
-	cpu_pda(cpu)->pcurrent = c_idle.idle;
+	per_cpu(pda.pcurrent, cpu) = c_idle.idle;
 	init_rsp = c_idle.idle->thread.sp;
 	load_sp0(&per_cpu(init_tss, cpu), &c_idle.idle->thread);
 	initial_code = (unsigned long)start_secondary;
@@ -985,8 +938,6 @@ do_rest:
 		}
 	}
 
-restore_state:
-
 	if (boot_error) {
 		/* Try to put things back the way they were before ... */
 		unmap_cpu_to_logical_apicid(cpu);
--- linux-2.6.tip.orig/arch/x86/kernel/traps_64.c
+++ linux-2.6.tip/arch/x86/kernel/traps_64.c
@@ -265,7 +265,8 @@ void dump_trace(struct task_struct *tsk,
 		const struct stacktrace_ops *ops, void *data)
 {
 	const unsigned cpu = get_cpu();
-	unsigned long *irqstack_end = (unsigned long*)cpu_pda(cpu)->irqstackptr;
+	unsigned long *irqstack_end =
+		(unsigned long*)per_cpu(pda.irqstackptr, cpu);
 	unsigned used = 0;
 	struct thread_info *tinfo;
 
@@ -399,8 +400,10 @@ _show_stack(struct task_struct *tsk, str
 	unsigned long *stack;
 	int i;
 	const int cpu = smp_processor_id();
-	unsigned long *irqstack_end = (unsigned long *) (cpu_pda(cpu)->irqstackptr);
-	unsigned long *irqstack = (unsigned long *) (cpu_pda(cpu)->irqstackptr - IRQSTACKSIZE);
+	unsigned long *irqstack_end =
+		(unsigned long *)per_cpu(pda.irqstackptr, cpu);
+	unsigned long *irqstack =
+		(unsigned long *)(per_cpu(pda.irqstackptr, cpu) - IRQSTACKSIZE);
 
 	// debugging aid: "show_stack(NULL, NULL);" prints the
 	// back trace for this cpu.
@@ -464,7 +467,7 @@ void show_registers(struct pt_regs *regs
 	int i;
 	unsigned long sp;
 	const int cpu = smp_processor_id();
-	struct task_struct *cur = cpu_pda(cpu)->pcurrent;
+	struct task_struct *cur = __get_cpu_var(pda.pcurrent);
 	u8 *ip;
 	unsigned int code_prologue = code_bytes * 43 / 64;
 	unsigned int code_len = code_bytes;
--- linux-2.6.tip.orig/arch/x86/kernel/vmlinux_64.lds.S
+++ linux-2.6.tip/arch/x86/kernel/vmlinux_64.lds.S
@@ -16,6 +16,7 @@ jiffies_64 = jiffies;
 _proxy_pda = 1;
 PHDRS {
 	text PT_LOAD FLAGS(5);	/* R_E */
+	percpu PT_LOAD FLAGS(4);	/* R__ */
 	data PT_LOAD FLAGS(7);	/* RWE */
 	user PT_LOAD FLAGS(7);	/* RWE */
 	data.init PT_LOAD FLAGS(7);	/* RWE */
--- linux-2.6.tip.orig/include/asm-x86/percpu.h
+++ linux-2.6.tip/include/asm-x86/percpu.h
@@ -3,26 +3,20 @@
 
 #ifdef CONFIG_X86_64
 #include <linux/compiler.h>
-
-/* Same as asm-generic/percpu.h, except that we store the per cpu offset
-   in the PDA. Longer term the PDA and every per cpu variable
-   should be just put into a single section and referenced directly
-   from %gs */
-
-#ifdef CONFIG_SMP
 #include <asm/pda.h>
 
-#define __per_cpu_offset(cpu) (cpu_pda(cpu)->data_offset)
-#define __my_cpu_offset read_pda(data_offset)
-
-#define per_cpu_offset(x) (__per_cpu_offset(x))
-
+#ifdef CONFIG_SMP
+#define __my_cpu_offset (x86_read_percpu(pda.data_offset))
+#define __percpu_seg "%%gs:"
+#else
+#define __percpu_seg ""
 #endif
+
 #include <asm-generic/percpu.h>
 
 DECLARE_PER_CPU(struct x8664_pda, pda);
 
-#else /* CONFIG_X86_64 */
+#else /* !CONFIG_X86_64 */
 
 #ifdef __ASSEMBLY__
 
@@ -51,36 +45,23 @@ DECLARE_PER_CPU(struct x8664_pda, pda);
 
 #else /* ...!ASSEMBLY */
 
-/*
- * PER_CPU finds an address of a per-cpu variable.
- *
- * Args:
- *    var - variable name
- *    cpu - 32bit register containing the current CPU number
- *
- * The resulting address is stored in the "cpu" argument.
- *
- * Example:
- *    PER_CPU(cpu_gdt_descr, %ebx)
- */
 #ifdef CONFIG_SMP
-
 #define __my_cpu_offset x86_read_percpu(this_cpu_off)
-
-/* fs segment starts at (positive) offset == __per_cpu_offset[cpu] */
 #define __percpu_seg "%%fs:"
-
-#else /* !SMP */
-
+#else
 #define __percpu_seg ""
-
-#endif /* SMP */
+#endif
 
 #include <asm-generic/percpu.h>
 
 /* We can use this directly for local CPU (faster). */
 DECLARE_PER_CPU(unsigned long, this_cpu_off);
 
+#endif /* __ASSEMBLY__ */
+#endif /* !CONFIG_X86_64 */
+
+#ifndef __ASSEMBLY__
+
 /* For arch-specific code, we can use direct single-insn ops (they
  * don't give an lvalue though). */
 extern void __bad_percpu_size(void);
@@ -215,7 +196,6 @@ do {							\
 	percpu_cmpxchg_op(per_cpu_var(var), old, new)
 
 #endif /* !__ASSEMBLY__ */
-#endif /* !CONFIG_X86_64 */
 
 #ifdef CONFIG_SMP
 

-- 
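
The effect of placing the pda first in the per cpu area can be shown with a small userspace model. This is a sketch under stated assumptions, not the kernel code: demo_pda is a cut-down stand-in for struct x8664_pda, demo_area stands in for one CPU's per cpu area, and all demo_* helpers are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CPUS 2

/* Cut-down stand-in for struct x8664_pda. */
struct demo_pda {
	uintptr_t data_offset;	/* base address of this CPU's area */
	unsigned int nmi_count;
};

/* One CPU's per-cpu area. */
struct demo_area {
	struct demo_pda pda;	/* must stay first, like DEFINE_PER_CPU_FIRST */
	long other_var;
};

static struct demo_area demo_areas[DEMO_CPUS];

/* Record each area's own base in its pda, as setup_per_cpu_areas() does. */
static void demo_init_areas(void)
{
	for (int cpu = 0; cpu < DEMO_CPUS; cpu++)
		demo_areas[cpu].pda.data_offset = (uintptr_t)&demo_areas[cpu];
}

/* Because pda sits at offset 0, the area base doubles as the pda pointer. */
static struct demo_pda *demo_pda_of(int cpu)
{
	return (struct demo_pda *)&demo_areas[cpu];
}

/* Same idea as summing per_cpu(pda.__nmi_count, cpu) over all CPUs. */
static unsigned int demo_total_nmi(void)
{
	unsigned int sum = 0;

	for (int cpu = 0; cpu < DEMO_CPUS; cpu++)
		sum += demo_pda_of(cpu)->nmi_count;
	return sum;
}
```

Since the pda is found at the start of every area, a separate pointer table such as _cpu_pda[] carries no extra information in this model, which mirrors the reason the patch removes it.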
* Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-04 0:30 ` [PATCH 3/4] x86_64: Fold pda into per cpu area Mike Travis
@ 2008-06-04 12:59 ` Jeremy Fitzhardinge
2008-06-04 13:48 ` Mike Travis
2008-06-09 23:18 ` Christoph Lameter
2008-06-05 10:22 ` [crash, bisected] " Ingo Molnar
1 sibling, 2 replies; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-04 12:59 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel, Rusty Russell

Mike Travis wrote:
> * Declare the pda as a per cpu variable.
>
> * Make the x86_64 per cpu area start at zero.
>
> * Since the pda is now the first element of the per_cpu area, cpu_pda()
> is no longer needed and per_cpu() can be used instead. This also makes
> the _cpu_pda[] table obsolete.
>
> * Since %gs is pointing to the pda, it will then also point to the per cpu
> variables and can be accessed thusly:
>
> %gs:[&per_cpu_xxxx - __per_cpu_start]
>

Unfortunately that doesn't actually work, because you can't have a reloc
with two variables.

In something like:

	mov %gs:per_cpu__foo - 12345, %rax
	mov %gs:per_cpu__foo, %rax
	mov %gs:per_cpu__foo - 12345(%rip), %rax
	mov %gs:per_cpu__foo(%rip), %rax
	mov %gs:per_cpu__foo - __per_cpu_start, %rax
	mov %gs:per_cpu__foo - __per_cpu_start(%rip), %rax

the last two lines will not assemble:

t.S:5: Error: can't resolve `per_cpu__foo' {*UND* section} - `__per_cpu_start' {*UND* section}
t.S:6: Error: can't resolve `per_cpu__foo' {*UND* section} - `__per_cpu_start' {*UND* section}

Unfortunately, the only way I can think of fixing this is to compute the
offset into a temp register, then use that:

	lea per_cpu__foo(%rip), %rax
	mov %gs:__per_cpu_offset(%rax), %rax

(where __per_cpu_offset is defined in the linker script as
-__per_cpu_start).

This seems to be a general problem with zero-offset per-cpu. And it's
unfortunate, because no-register access to per-cpu variables is nice to
have.

The other alternative - and I have no idea whether this is practical or
possible - is to define a complete set of pre-offset per_cpu symbols.

J
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-04 12:59 ` Jeremy Fitzhardinge
@ 2008-06-04 13:48 ` Mike Travis
2008-06-04 13:58 ` Jeremy Fitzhardinge
2008-06-09 23:18 ` Christoph Lameter
1 sibling, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-06-04 13:48 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel, Rusty Russell

Jeremy Fitzhardinge wrote:
> Mike Travis wrote:
>> * Declare the pda as a per cpu variable.
>>
>> * Make the x86_64 per cpu area start at zero.
>>
>> * Since the pda is now the first element of the per_cpu area, cpu_pda()
>> is no longer needed and per_cpu() can be used instead. This also
>> makes
>> the _cpu_pda[] table obsolete.
>>
>> * Since %gs is pointing to the pda, it will then also point to the
>> per cpu
>> variables and can be accessed thusly:
>>
>> %gs:[&per_cpu_xxxx - __per_cpu_start]
>>

The above is only a partial story (I folded the two patches but didn't
update the comments correctly). The variables are already offset from
__per_cpu_start by virtue of the .data.percpu section being based at
zero. Therefore only the %gs register needs to be set to the base of
each cpu's percpu section to resolve the target address:

	%gs:&per_cpu_xxxx

And the .data.percpu.first forces the pda percpu variable to the front.

>
> Unfortunately that doesn't actually work, because you can't have a reloc
> with two variables.
>
> In something like:
>
> mov %gs:per_cpu__foo - 12345, %rax
> mov %gs:per_cpu__foo, %rax
> mov %gs:per_cpu__foo - 12345(%rip), %rax
> mov %gs:per_cpu__foo(%rip), %rax
> mov %gs:per_cpu__foo - __per_cpu_start, %rax
> mov %gs:per_cpu__foo - __per_cpu_start(%rip), %rax
>
> the last two lines will not assemble:
>
> t.S:5: Error: can't resolve `per_cpu__foo' {*UND* section} -
> `__per_cpu_start' {*UND* section}
> t.S:6: Error: can't resolve `per_cpu__foo' {*UND* section} -
> `__per_cpu_start' {*UND* section}
>
> Unfortunately, the only way I can think of fixing this is to compute the
> offset into a temp register, then use that:
>
> lea per_cpu__foo(%rip), %rax
> mov %gs:__per_cpu_offset(%rax), %rax
>
> (where __per_cpu_offset is defined in the linker script as
> -__per_cpu_start).
>
> This seems to be a general problem with zero-offset per-cpu. And its
> unfortunate, because no-register access to per-cpu variables is nice to
> have.
>
> The other alternative - and I have no idea whether this is practical or
> possible - is to define a complete set of pre-offset per_cpu symbols.
>
> J
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-04 13:48 ` Mike Travis @ 2008-06-04 13:58 ` Jeremy Fitzhardinge 2008-06-04 14:17 ` Mike Travis 0 siblings, 1 reply; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-06-04 13:58 UTC (permalink / raw) To: Mike Travis Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet, linux-kernel, Rusty Russell Mike Travis wrote: > Jeremy Fitzhardinge wrote: > >> Mike Travis wrote: >> >>> * Declare the pda as a per cpu variable. >>> >>> * Make the x86_64 per cpu area start at zero. >>> >>> * Since the pda is now the first element of the per_cpu area, cpu_pda() >>> is no longer needed and per_cpu() can be used instead. This also >>> makes >>> the _cpu_pda[] table obsolete. >>> >>> * Since %gs is pointing to the pda, it will then also point to the >>> per cpu >>> variables and can be accessed thusly: >>> >>> %gs:[&per_cpu_xxxx - __per_cpu_start] >>> >>> > > > The above is only a partial story (I folded the two patches but didn't > update the comments correctly.] The variables are already offset from > __per_cpu_start by virtue of the .data.percpu section being based at > zero. Therefore only the %gs register needs to be set to the base of > each cpu's percpu section to resolve the target address: > > %gs:&per_cpu_xxxx > Oh, good. I'd played with trying to make that work at one point, and got lost in linker bugs and/or random version-specific strangeness. J ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-04 13:58 ` Jeremy Fitzhardinge
@ 2008-06-04 14:17 ` Mike Travis
0 siblings, 0 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-04 14:17 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel, Rusty Russell

Jeremy Fitzhardinge wrote:
> Mike Travis wrote:
>> Jeremy Fitzhardinge wrote:
>>
>>> Mike Travis wrote:
>>>
>>>> * Declare the pda as a per cpu variable.
>>>>
>>>> * Make the x86_64 per cpu area start at zero.
>>>>
>>>> * Since the pda is now the first element of the per_cpu area,
>>>> cpu_pda()
>>>> is no longer needed and per_cpu() can be used instead. This also
>>>> makes
>>>> the _cpu_pda[] table obsolete.
>>>>
>>>> * Since %gs is pointing to the pda, it will then also point to the
>>>> per cpu
>>>> variables and can be accessed thusly:
>>>>
>>>> %gs:[&per_cpu_xxxx - __per_cpu_start]
>>>>
>>>>
>>
>> The above is only a partial story (I folded the two patches but didn't
>> update the comments correctly.] The variables are already offset from
>> __per_cpu_start by virtue of the .data.percpu section being based at
>> zero. Therefore only the %gs register needs to be set to the base of
>> each cpu's percpu section to resolve the target address:
>>
>> %gs:&per_cpu_xxxx
>>
>
> Oh, good. I'd played with trying to make that work at one point, and
> got lost in linker bugs and/or random version-specific strangeness.
> J

Incidentally, this is why the following load is needed in
x86_64_start_kernel():

	pda = (struct x8664_pda *)__per_cpu_load;
	pda->data_offset = per_cpu_offset(0) = (unsigned long)pda;

	/* initialize boot cpu_pda data */
	pda_init(0);

pda_init() loads the %gs reg so early accesses to the static per_cpu
section can be executed before the percpu areas are allocated.
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-04 12:59 ` Jeremy Fitzhardinge
2008-06-04 13:48 ` Mike Travis
@ 2008-06-09 23:18 ` Christoph Lameter
1 sibling, 0 replies; 108+ messages in thread
From: Christoph Lameter @ 2008-06-09 23:18 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Mike Travis, Ingo Molnar, Andrew Morton, David Miller,
Eric Dumazet, linux-kernel, Rusty Russell

On Wed, 4 Jun 2008, Jeremy Fitzhardinge wrote:
> > %gs:[&per_cpu_xxxx - __per_cpu_start]
> >
>
> Unfortunately that doesn't actually work, because you can't have a reloc with
> two variables.

That is just a conceptual discussion. __per_cpu_start is 0 with the zero
based patch. And thus this reduces to

	%gs:[&per_cpu_xxx]
^ permalink raw reply [flat|nested] 108+ messages in thread
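A minimal user-space sketch of the addressing scheme discussed above, under stated assumptions: `struct percpu_image`, `cpu_base[]`, `setup_percpu_areas()` and `percpu_ptr()` are hypothetical names made up for illustration and are not kernel interfaces. The array `cpu_base[]` only models the role %gs plays on each cpu. The point it tries to show is that once the per-cpu section is based at zero, a variable's link-time "address" is already its offset, so the access is base plus offset with no `__per_cpu_start` subtraction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the linked per-cpu image, based at zero. */
struct percpu_image {
	long pda_data_offset;	/* first member, like the pda placed at the front */
	long counter;		/* some other per-cpu variable */
};

/* With a zero-based section, a variable's "address" is just its offset. */
#define PERCPU_OFFSET(member)	offsetof(struct percpu_image, member)

#define NR_CPUS 4

/* Models the per-cpu base that %gs would hold on each cpu (assumption). */
static char *cpu_base[NR_CPUS];

/* Give every cpu its own copy of the initial (load) image. */
static int setup_percpu_areas(const struct percpu_image *load_image)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpu_base[cpu] = malloc(sizeof(*load_image));
		if (!cpu_base[cpu])
			return -1;
		memcpy(cpu_base[cpu], load_image, sizeof(*load_image));
	}
	return 0;
}

/* Models "%gs:offset": per-cpu base plus zero-based offset, nothing else. */
static long *percpu_ptr(int cpu, size_t offset)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (long *)(cpu_base[cpu] + offset);
}
```

A caller would fill in one `struct percpu_image`, pass it to `setup_percpu_areas()`, and then reach a given cpu's copy of a variable with `percpu_ptr(cpu, PERCPU_OFFSET(counter))`. Each cpu sees its own value, and the first member sits at offset 0, which is what lets the same base serve both the pda and the other per-cpu variables.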
* [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-04 0:30 ` [PATCH 3/4] x86_64: Fold pda into per cpu area Mike Travis
2008-06-04 12:59 ` Jeremy Fitzhardinge
@ 2008-06-05 10:22 ` Ingo Molnar
2008-06-05 16:02 ` Mike Travis
2008-06-10 21:31 ` Mike Travis
1 sibling, 2 replies; 108+ messages in thread
From: Ingo Molnar @ 2008-06-05 10:22 UTC (permalink / raw)
To: Mike Travis
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel, the arch/x86 maintainers

* Mike Travis <travis@sgi.com> wrote:

> * Declare the pda as a per cpu variable.
>
> * Make the x86_64 per cpu area start at zero.
>
> * Since the pda is now the first element of the per_cpu area, cpu_pda()
> is no longer needed and per_cpu() can be used instead. This also makes
> the _cpu_pda[] table obsolete.
>
> * Since %gs is pointing to the pda, it will then also point to the per cpu
> variables and can be accessed thusly:
>
> %gs:[&per_cpu_xxxx - __per_cpu_start]
>
> Based on linux-2.6.tip

-tip testing found an instantaneous reboot crash on 64-bit x86, with
this config:

http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad

there is no boot log as the instantaneous reboot happens before anything
is printed to the (early-) serial console. I have bisected it down to:

| 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f is first bad commit
| commit 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f
| Author: Mike Travis <travis@sgi.com>
| Date: Tue Jun 3 17:30:21 2008 -0700
|
| x86_64: Fold pda into per cpu area

the big problem is not just this crash, but that the patch is _way_ too
big:

 arch/x86/Kconfig                 |    3 +
 arch/x86/kernel/head64.c         |   34 ++++++--------
 arch/x86/kernel/irq_64.c         |   36 ++++++++-------
 arch/x86/kernel/setup.c          |   90 ++++++++++++---------------------------
 arch/x86/kernel/setup64.c        |    5 --
 arch/x86/kernel/smpboot.c        |   51 ----------------------
 arch/x86/kernel/traps_64.c       |   11 +++-
 arch/x86/kernel/vmlinux_64.lds.S |    1
 include/asm-x86/percpu.h         |   48 ++++++--------------
 9 files changed, 89 insertions(+), 190 deletions(-)

considering the danger involved, this is just way too large, and there's
no reasonable debugging i can do in the bisection to narrow it down any
further.

Please resubmit with the bug fixed and with a proper splitup, the more
patches you manage to create, the better. For a dangerous code area like
this, with a track record of frequent breakages in the past, i would not
mind a "one line of code changed per patch" splitup either. (Feel free
to send a git tree link for us to try as well.)

Ingo
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-05 10:22 ` [crash, bisected] " Ingo Molnar @ 2008-06-05 16:02 ` Mike Travis 2008-06-06 8:29 ` Jeremy Fitzhardinge 2008-06-10 21:31 ` Mike Travis 1 sibling, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-06-05 16:02 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet, Jeremy Fitzhardinge, linux-kernel, the arch/x86 maintainers Ingo Molnar wrote: > * Mike Travis <travis@sgi.com> wrote: > >> * Declare the pda as a per cpu variable. >> >> * Make the x86_64 per cpu area start at zero. >> >> * Since the pda is now the first element of the per_cpu area, cpu_pda() >> is no longer needed and per_cpu() can be used instead. This also makes >> the _cpu_pda[] table obsolete. >> >> * Since %gs is pointing to the pda, it will then also point to the per cpu >> variables and can be accessed thusly: >> >> %gs:[&per_cpu_xxxx - __per_cpu_start] >> >> Based on linux-2.6.tip > > -tip testing found an instantaneous reboot crash on 64-bit x86, with > this config: > > http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad > > there is no boot log as the instantaneous reboot happens before anything > is printed to the (early-) serial console. 
I have bisected it down to: > > | 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f is first bad commit > | commit 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f > | Author: Mike Travis <travis@sgi.com> > | Date: Tue Jun 3 17:30:21 2008 -0700 > | > | x86_64: Fold pda into per cpu area > > the big problem is not just this crash, but that the patch is _way_ too > big: > > arch/x86/Kconfig | 3 + > arch/x86/kernel/head64.c | 34 ++++++-------- > arch/x86/kernel/irq_64.c | 36 ++++++++------- > arch/x86/kernel/setup.c | 90 ++++++++++++--------------------------- > arch/x86/kernel/setup64.c | 5 -- > arch/x86/kernel/smpboot.c | 51 ---------------------- > arch/x86/kernel/traps_64.c | 11 +++- > arch/x86/kernel/vmlinux_64.lds.S | 1 > include/asm-x86/percpu.h | 48 ++++++-------------- > 9 files changed, 89 insertions(+), 190 deletions(-) > > considering the danger involved, this is just way too large, and there's > no reasonable debugging i can do in the bisection to narrow it down any > further. > > Please resubmit with the bug fixed and with a proper splitup, the more > patches you manage to create, the better. For a dangerous code area like > this, with a track record of frequent breakages in the past, i would not > mind a "one line of code changed per patch" splitup either. (Feel free > to send a git tree link for us to try as well.) > > Ingo Thanks for the feedback Ingo. I'll test the above config and look at splitting up the patch. The difficulty is making each patch independently compilable and testable. Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-05 16:02 ` Mike Travis @ 2008-06-06 8:29 ` Jeremy Fitzhardinge 2008-06-06 13:15 ` Mike Travis 0 siblings, 1 reply; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-06-06 8:29 UTC (permalink / raw) To: Mike Travis Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet, linux-kernel, the arch/x86 maintainers Mike Travis wrote: > Ingo Molnar wrote: > >> * Mike Travis <travis@sgi.com> wrote: >> >> >>> * Declare the pda as a per cpu variable. >>> >>> * Make the x86_64 per cpu area start at zero. >>> >>> * Since the pda is now the first element of the per_cpu area, cpu_pda() >>> is no longer needed and per_cpu() can be used instead. This also makes >>> the _cpu_pda[] table obsolete. >>> >>> * Since %gs is pointing to the pda, it will then also point to the per cpu >>> variables and can be accessed thusly: >>> >>> %gs:[&per_cpu_xxxx - __per_cpu_start] >>> >>> Based on linux-2.6.tip >>> >> -tip testing found an instantaneous reboot crash on 64-bit x86, with >> this config: >> >> http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad >> >> there is no boot log as the instantaneous reboot happens before anything >> is printed to the (early-) serial console. 
I have bisected it down to: >> >> | 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f is first bad commit >> | commit 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f >> | Author: Mike Travis <travis@sgi.com> >> | Date: Tue Jun 3 17:30:21 2008 -0700 >> | >> | x86_64: Fold pda into per cpu area >> >> the big problem is not just this crash, but that the patch is _way_ too >> big: >> >> arch/x86/Kconfig | 3 + >> arch/x86/kernel/head64.c | 34 ++++++-------- >> arch/x86/kernel/irq_64.c | 36 ++++++++------- >> arch/x86/kernel/setup.c | 90 ++++++++++++--------------------------- >> arch/x86/kernel/setup64.c | 5 -- >> arch/x86/kernel/smpboot.c | 51 ---------------------- >> arch/x86/kernel/traps_64.c | 11 +++- >> arch/x86/kernel/vmlinux_64.lds.S | 1 >> include/asm-x86/percpu.h | 48 ++++++-------------- >> 9 files changed, 89 insertions(+), 190 deletions(-) >> >> considering the danger involved, this is just way too large, and there's >> no reasonable debugging i can do in the bisection to narrow it down any >> further. >> >> Please resubmit with the bug fixed and with a proper splitup, the more >> patches you manage to create, the better. For a dangerous code area like >> this, with a track record of frequent breakages in the past, i would not >> mind a "one line of code changed per patch" splitup either. (Feel free >> to send a git tree link for us to try as well.) >> >> Ingo >> > > Thanks for the feedback Ingo. I'll test the above config and look at > splitting up the patch. The difficulty is making each patch independently > compilable and testable. FWIW, I'm getting past the "crashes very, very early" stage with this series applied when booting under Xen. Then it crashes pretty early, but that's not your fault... J ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-06 8:29 ` Jeremy Fitzhardinge @ 2008-06-06 13:15 ` Mike Travis 2008-06-18 5:34 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-06-06 13:15 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet, linux-kernel, the arch/x86 maintainers Jeremy Fitzhardinge wrote: > Mike Travis wrote: >> Ingo Molnar wrote: >> >>> * Mike Travis <travis@sgi.com> wrote: >>> >>> >>>> * Declare the pda as a per cpu variable. >>>> >>>> * Make the x86_64 per cpu area start at zero. >>>> >>>> * Since the pda is now the first element of the per_cpu area, >>>> cpu_pda() >>>> is no longer needed and per_cpu() can be used instead. This >>>> also makes >>>> the _cpu_pda[] table obsolete. >>>> >>>> * Since %gs is pointing to the pda, it will then also point to the >>>> per cpu >>>> variables and can be accessed thusly: >>>> >>>> %gs:[&per_cpu_xxxx - __per_cpu_start] >>>> >>>> Based on linux-2.6.tip >>>> >>> -tip testing found an instantaneous reboot crash on 64-bit x86, with >>> this config: >>> >>> http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad >>> >>> there is no boot log as the instantaneous reboot happens before >>> anything is printed to the (early-) serial console. 
I have bisected >>> it down to: >>> >>> | 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f is first bad commit >>> | commit 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f >>> | Author: Mike Travis <travis@sgi.com> >>> | Date: Tue Jun 3 17:30:21 2008 -0700 >>> | >>> | x86_64: Fold pda into per cpu area >>> >>> the big problem is not just this crash, but that the patch is _way_ >>> too big: >>> >>> arch/x86/Kconfig | 3 + >>> arch/x86/kernel/head64.c | 34 ++++++-------- >>> arch/x86/kernel/irq_64.c | 36 ++++++++------- >>> arch/x86/kernel/setup.c | 90 >>> ++++++++++++--------------------------- >>> arch/x86/kernel/setup64.c | 5 -- >>> arch/x86/kernel/smpboot.c | 51 ---------------------- >>> arch/x86/kernel/traps_64.c | 11 +++- >>> arch/x86/kernel/vmlinux_64.lds.S | 1 >>> include/asm-x86/percpu.h | 48 ++++++-------------- >>> 9 files changed, 89 insertions(+), 190 deletions(-) >>> >>> considering the danger involved, this is just way too large, and >>> there's no reasonable debugging i can do in the bisection to narrow >>> it down any further. >>> >>> Please resubmit with the bug fixed and with a proper splitup, the >>> more patches you manage to create, the better. For a dangerous code >>> area like this, with a track record of frequent breakages in the >>> past, i would not mind a "one line of code changed per patch" splitup >>> either. (Feel free to send a git tree link for us to try as well.) >>> >>> Ingo >>> >> >> Thanks for the feedback Ingo. I'll test the above config and look at >> splitting up the patch. The difficulty is making each patch >> independently >> compilable and testable. > > FWIW, I'm getting past the "crashes very, very early" stage with this > series applied when booting under Xen. Then it crashes pretty early, > but that's not your fault... > > J Hi Jeremy, Yes we have a simulator for Nahelem that also breezes past the boot up problem (actually makes it to the kernel login prompt.) 
Weirdly, the problem doesn't exist in an earlier code base so my changes
are tickling something else newly introduced. I'm attempting to see if I
can use GRUB 2 with the GDB stubs to track it down (which is
time-consuming in itself to set up).

It is definitely related to basing percpu variable offsets from %gs and
(I think) interrupts.

Thanks,
Mike
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-06 13:15 ` Mike Travis @ 2008-06-18 5:34 ` Jeremy Fitzhardinge 0 siblings, 0 replies; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-06-18 5:34 UTC (permalink / raw) To: Mike Travis Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet, linux-kernel, the arch/x86 maintainers Mike Travis wrote: > Jeremy Fitzhardinge wrote: > >> Mike Travis wrote: >> >>> Ingo Molnar wrote: >>> >>> >>>> * Mike Travis <travis@sgi.com> wrote: >>>> >>>> >>>> >>>>> * Declare the pda as a per cpu variable. >>>>> >>>>> * Make the x86_64 per cpu area start at zero. >>>>> >>>>> * Since the pda is now the first element of the per_cpu area, >>>>> cpu_pda() >>>>> is no longer needed and per_cpu() can be used instead. This >>>>> also makes >>>>> the _cpu_pda[] table obsolete. >>>>> >>>>> * Since %gs is pointing to the pda, it will then also point to the >>>>> per cpu >>>>> variables and can be accessed thusly: >>>>> >>>>> %gs:[&per_cpu_xxxx - __per_cpu_start] >>>>> >>>>> Based on linux-2.6.tip >>>>> >>>>> >>>> -tip testing found an instantaneous reboot crash on 64-bit x86, with >>>> this config: >>>> >>>> http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad >>>> >>>> there is no boot log as the instantaneous reboot happens before >>>> anything is printed to the (early-) serial console. 
I have bisected >>>> it down to: >>>> >>>> | 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f is first bad commit >>>> | commit 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f >>>> | Author: Mike Travis <travis@sgi.com> >>>> | Date: Tue Jun 3 17:30:21 2008 -0700 >>>> | >>>> | x86_64: Fold pda into per cpu area >>>> >>>> the big problem is not just this crash, but that the patch is _way_ >>>> too big: >>>> >>>> arch/x86/Kconfig | 3 + >>>> arch/x86/kernel/head64.c | 34 ++++++-------- >>>> arch/x86/kernel/irq_64.c | 36 ++++++++------- >>>> arch/x86/kernel/setup.c | 90 >>>> ++++++++++++--------------------------- >>>> arch/x86/kernel/setup64.c | 5 -- >>>> arch/x86/kernel/smpboot.c | 51 ---------------------- >>>> arch/x86/kernel/traps_64.c | 11 +++- >>>> arch/x86/kernel/vmlinux_64.lds.S | 1 >>>> include/asm-x86/percpu.h | 48 ++++++-------------- >>>> 9 files changed, 89 insertions(+), 190 deletions(-) >>>> >>>> considering the danger involved, this is just way too large, and >>>> there's no reasonable debugging i can do in the bisection to narrow >>>> it down any further. >>>> >>>> Please resubmit with the bug fixed and with a proper splitup, the >>>> more patches you manage to create, the better. For a dangerous code >>>> area like this, with a track record of frequent breakages in the >>>> past, i would not mind a "one line of code changed per patch" splitup >>>> either. (Feel free to send a git tree link for us to try as well.) >>>> >>>> Ingo >>>> >>>> >>> Thanks for the feedback Ingo. I'll test the above config and look at >>> splitting up the patch. The difficulty is making each patch >>> independently >>> compilable and testable. >>> >> FWIW, I'm getting past the "crashes very, very early" stage with this >> series applied when booting under Xen. Then it crashes pretty early, >> but that's not your fault... >> >> J >> > > Hi Jeremy, > > Yes we have a simulator for Nahelem that also breezes past the boot up > problem (actually makes it to the kernel login prompt.) 
Weirdly, the > problem doesn't exist in an earlier code base so my changes are tickling > something else newly introduced. I'm attempting to see if I can use > GRUB 2 with the GDB stubs to track it down (which is time consuming in > itself to setup.) > > It is definitely related to basing percpu variable offsets from %gs and > (I think) interrupts. > Hi Mike, Have you made any progress on this? I'm bumping up against it when I run on native hardware (as opposed to under Xen). J ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-05 10:22 ` [crash, bisected] " Ingo Molnar 2008-06-05 16:02 ` Mike Travis @ 2008-06-10 21:31 ` Mike Travis 2008-06-18 17:36 ` Jeremy Fitzhardinge 1 sibling, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-06-10 21:31 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet, Jeremy Fitzhardinge, linux-kernel, the arch/x86 maintainers Ingo Molnar wrote: > * Mike Travis <travis@sgi.com> wrote: > >> * Declare the pda as a per cpu variable. >> >> * Make the x86_64 per cpu area start at zero. >> >> * Since the pda is now the first element of the per_cpu area, cpu_pda() >> is no longer needed and per_cpu() can be used instead. This also makes >> the _cpu_pda[] table obsolete. >> >> * Since %gs is pointing to the pda, it will then also point to the per cpu >> variables and can be accessed thusly: >> >> %gs:[&per_cpu_xxxx - __per_cpu_start] >> >> Based on linux-2.6.tip > > -tip testing found an instantaneous reboot crash on 64-bit x86, with > this config: > > http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad I'm still stuck on this one. One new development is that the current -tip branch without the patches boots to the kernel prompt then hangs after a few moments and then reboots. It seems you can tickle it using ^C to abort a process. -Mike > > there is no boot log as the instantaneous reboot happens before anything > is printed to the (early-) serial console. 
I have bisected it down to: > > | 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f is first bad commit > | commit 7670dc09e89a2b151a1cf49eccebc07c41c2ce9f > | Author: Mike Travis <travis@sgi.com> > | Date: Tue Jun 3 17:30:21 2008 -0700 > | > | x86_64: Fold pda into per cpu area > > the big problem is not just this crash, but that the patch is _way_ too > big: > > arch/x86/Kconfig | 3 + > arch/x86/kernel/head64.c | 34 ++++++-------- > arch/x86/kernel/irq_64.c | 36 ++++++++------- > arch/x86/kernel/setup.c | 90 ++++++++++++--------------------------- > arch/x86/kernel/setup64.c | 5 -- > arch/x86/kernel/smpboot.c | 51 ---------------------- > arch/x86/kernel/traps_64.c | 11 +++- > arch/x86/kernel/vmlinux_64.lds.S | 1 > include/asm-x86/percpu.h | 48 ++++++-------------- > 9 files changed, 89 insertions(+), 190 deletions(-) > > considering the danger involved, this is just way too large, and there's > no reasonable debugging i can do in the bisection to narrow it down any > further. > > Please resubmit with the bug fixed and with a proper splitup, the more > patches you manage to create, the better. For a dangerous code area like > this, with a track record of frequent breakages in the past, i would not > mind a "one line of code changed per patch" splitup either. (Feel free > to send a git tree link for us to try as well.) > > Ingo ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-10 21:31 ` Mike Travis
@ 2008-06-18 17:36 ` Jeremy Fitzhardinge
2008-06-18 18:17 ` Mike Travis
0 siblings, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-18 17:36 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel, the arch/x86 maintainers

Mike Travis wrote:
> Ingo Molnar wrote:
>
>> * Mike Travis <travis@sgi.com> wrote:
>>
>>
>>> * Declare the pda as a per cpu variable.
>>>
>>> * Make the x86_64 per cpu area start at zero.
>>>
>>> * Since the pda is now the first element of the per_cpu area, cpu_pda()
>>> is no longer needed and per_cpu() can be used instead. This also makes
>>> the _cpu_pda[] table obsolete.
>>>
>>> * Since %gs is pointing to the pda, it will then also point to the per cpu
>>> variables and can be accessed thusly:
>>>
>>> %gs:[&per_cpu_xxxx - __per_cpu_start]
>>>
>>> Based on linux-2.6.tip
>>>
>> -tip testing found an instantaneous reboot crash on 64-bit x86, with
>> this config:
>>
>> http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad
>>
>
> I'm still stuck on this one. One new development is that the current -tip
> branch without the patches boots to the kernel prompt then hangs after a few
> moments and then reboots. It seems you can tickle it using ^C to abort a
> process.

Hi Mike,

I added some instrumentation to Xen to print the cpu state on
triple-fault, which highlights an obvious-looking problem.

(XEN) hvm.c:767:d1 Triple fault on VCPU0 - invoking HVM system reset.
(XEN) ----[ Xen-3.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: 0010:[<ffffffff80200160>]
(XEN) RFLAGS: 0000000000010002 CONTEXT: hvm
(XEN) rax: 0000000000000018 rbx: 0000000000000000 rcx: 00000000c0000080
(XEN) rdx: 0000000000000000 rsi: 0000000000092f40 rdi: 0000000020100800
(XEN) rbp: 0000000000000000 rsp: ffffffff807dfff8 r8: 0000000000208000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 00000000000000de
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000000a0
(XEN) cr3: 0000000000201000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 0010

The rip is:

(gdb) x/i 0xffffffff80200160
0xffffffff80200160 <secondary_startup_64+96>: movl %eax,%ds

which is:

	lgdt early_gdt_descr(%rip)

	/* set up data segments. actually 0 would do too */
	movl $__KERNEL_DS,%eax
	movl %eax,%ds
	movl %eax,%ss
	movl %eax,%es

And early_gdt_descr is:

	.globl early_gdt_descr
early_gdt_descr:
	.word GDT_ENTRIES*8-1
	.quad per_cpu__gdt_page

and per_cpu__gdt_page is zero-based, and therefore not a directly
addressable symbol.

I tried this patch, but it didn't work. Perhaps I'm missing something.

diff -r bf5a46e13f78 arch/x86/kernel/head_64.S
--- a/arch/x86/kernel/head_64.S Tue Jun 17 22:10:51 2008 -0700
+++ b/arch/x86/kernel/head_64.S Wed Jun 18 10:34:24 2008 -0700
@@ -94,6 +94,8 @@
 	addq %rbp, level2_fixmap_pgt + (506*8)(%rip)
+	addq $__per_cpu_load, early_gdt_descr+2(%rip)
+
 	/* Add an Identity mapping if I am above 1G */
 	leaq _text(%rip), %rdi
 	andq $PMD_PAGE_MASK, %rdi

J
^ permalink raw reply [flat|nested] 108+ messages in thread
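A small user-space sketch of the observation above, under stated assumptions: `struct desc_ptr_model`, `percpu_symbol_to_absolute()` and `fixup_descriptor()` are hypothetical names invented for illustration, and the struct only loosely models a limit-plus-base descriptor such as early_gdt_descr. It illustrates why a zero-based per-cpu symbol value cannot be used directly as an absolute address and has to be combined with the address the per-cpu image was loaded at (the role __per_cpu_load plays in the thread). It does not attempt to explain why the head_64.S change above did not work.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a descriptor: a 16-bit limit plus a 64-bit base. */
struct desc_ptr_model {
	uint16_t size;
	uint64_t address;
};

/*
 * A zero-based per-cpu symbol is only an offset into the per-cpu image.
 * Adding the load address of that image yields an absolute address.
 */
static uint64_t percpu_symbol_to_absolute(uint64_t load_addr,
					  uint64_t zero_based_sym)
{
	/* Guard against wrap-around in this illustrative model. */
	assert(zero_based_sym <= UINT64_MAX - load_addr);
	return load_addr + zero_based_sym;
}

/* Rewrite the base field in place, as a boot-time fixup would have to. */
static void fixup_descriptor(struct desc_ptr_model *d, uint64_t load_addr)
{
	d->address = percpu_symbol_to_absolute(load_addr, d->address);
}
```

In this model a descriptor whose base field still holds the raw zero-based value points into low memory. Only after `fixup_descriptor()` has been applied with the load address does it refer to the real location of the data.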
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-18 17:36 ` Jeremy Fitzhardinge
@ 2008-06-18 18:17 ` Mike Travis
2008-06-18 18:33 ` Ingo Molnar
2008-06-18 19:33 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-18 18:17 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel, the arch/x86 maintainers

Jeremy Fitzhardinge wrote:
> Mike Travis wrote:
>> Ingo Molnar wrote:
>>
>>> * Mike Travis <travis@sgi.com> wrote:
>>>
>>>
>>>> * Declare the pda as a per cpu variable.
>>>>
>>>> * Make the x86_64 per cpu area start at zero.
>>>>
>>>> * Since the pda is now the first element of the per_cpu area,
>>>> cpu_pda()
>>>> is no longer needed and per_cpu() can be used instead. This
>>>> also makes
>>>> the _cpu_pda[] table obsolete.
>>>>
>>>> * Since %gs is pointing to the pda, it will then also point to the
>>>> per cpu
>>>> variables and can be accessed thusly:
>>>>
>>>> %gs:[&per_cpu_xxxx - __per_cpu_start]
>>>>
>>>> Based on linux-2.6.tip
>>>>
>>> -tip testing found an instantaneous reboot crash on 64-bit x86, with
>>> this config:
>>>
>>> http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad
>>>
>>
>> I'm still stuck on this one. One new development is that the current
>> -tip
>> branch without the patches boots to the kernel prompt then hangs after
>> a few
>> moments and then reboots. It seems you can tickle it using ^C to abort a
>> process.
>
> Hi Mike,
>
> I added some instrumentation to Xen to print the cpu state on
> triple-fault, which highlights an obvious-looking problem.
>
> (XEN) hvm.c:767:d1 Triple fault on VCPU0 - invoking HVM system reset.
> (XEN) ----[ Xen-3.3-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 1
> (XEN) RIP: 0010:[<ffffffff80200160>]
> (XEN) RFLAGS: 0000000000010002 CONTEXT: hvm
> (XEN) rax: 0000000000000018 rbx: 0000000000000000 rcx: 00000000c0000080
> (XEN) rdx: 0000000000000000 rsi: 0000000000092f40 rdi: 0000000020100800
> (XEN) rbp: 0000000000000000 rsp: ffffffff807dfff8 r8: 0000000000208000
> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 00000000000000de
> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
> (XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000000a0
> (XEN) cr3: 0000000000201000 cr2: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 0010
>
> The rip is:
>
> (gdb) x/i 0xffffffff80200160
> 0xffffffff80200160 <secondary_startup_64+96>: movl %eax,%ds
>
> which is:
>
>     lgdt early_gdt_descr(%rip)
>
>     /* set up data segments. actually 0 would do too */
>     movl $__KERNEL_DS,%eax
>     movl %eax,%ds
>     movl %eax,%ss
>     movl %eax,%es
>
> And early_gdt_descr is:
>
>     .globl early_gdt_descr
> early_gdt_descr:
>     .word GDT_ENTRIES*8-1
>     .quad per_cpu__gdt_page
>
> and per_cpu__gdt_page is zero-based, and therefore not a directly
> addressable symbol.
>
> I tried this patch, but it didn't work. Perhaps I'm missing something.
>
> diff -r bf5a46e13f78 arch/x86/kernel/head_64.S
> --- a/arch/x86/kernel/head_64.S Tue Jun 17 22:10:51 2008 -0700
> +++ b/arch/x86/kernel/head_64.S Wed Jun 18 10:34:24 2008 -0700
> @@ -94,6 +94,8 @@
>
>      addq %rbp, level2_fixmap_pgt + (506*8)(%rip)
>
> +    addq $__per_cpu_load, early_gdt_descr+2(%rip)
> +
>      /* Add an Identity mapping if I am above 1G */
>      leaq _text(%rip), %rdi
>      andq $PMD_PAGE_MASK, %rdi
>
>
> J

Hi Jeremy,

I'm not finding that code in the tip/latest or linux-next branches... ?

I can send you my latest version of the patch which is better than
the previous but still is having problems with the config file that
Ingo sent out. (It also has a weird quirk that it will hang and
reboot after about 30 seconds with or without my patch.)

Thanks,
Mike
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-18 18:17 ` Mike Travis
@ 2008-06-18 18:33 ` Ingo Molnar
2008-06-18 19:33 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 108+ messages in thread
From: Ingo Molnar @ 2008-06-18 18:33 UTC (permalink / raw)
To: Mike Travis
Cc: Jeremy Fitzhardinge, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel, the arch/x86 maintainers

* Mike Travis <travis@sgi.com> wrote:

> Hi Jeremy,
>
> I'm not finding that code in the tip/latest or linux-next branches...
> ?
>
> I can send you my latest version of the patch which is better than the
> previous but still is having problems with the config file that Ingo
> sent out. (It also has a weird quirk that it will hang and reboot
> after about 30 seconds with or without my patch.)

The patch is not in -tip yet because we don't keep known-broken patches
applied unless there's some really strong reason to do so.

Ingo
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-18 18:17 ` Mike Travis
2008-06-18 18:33 ` Ingo Molnar
@ 2008-06-18 19:33 ` Jeremy Fitzhardinge
[not found] ` <48596893.4040908@sgi.com>
1 sibling, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-18 19:33 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel, the arch/x86 maintainers

Mike Travis wrote:
> Jeremy Fitzhardinge wrote:
>
>> Mike Travis wrote:
>>
>>> Ingo Molnar wrote:
>>>
>>>
>>>> * Mike Travis <travis@sgi.com> wrote:
>>>>
>>>>
>>>>
>>>>> * Declare the pda as a per cpu variable.
>>>>>
>>>>> * Make the x86_64 per cpu area start at zero.
>>>>>
>>>>> * Since the pda is now the first element of the per_cpu area,
>>>>> cpu_pda()
>>>>> is no longer needed and per_cpu() can be used instead. This
>>>>> also makes
>>>>> the _cpu_pda[] table obsolete.
>>>>>
>>>>> * Since %gs is pointing to the pda, it will then also point to the
>>>>> per cpu
>>>>> variables and can be accessed thusly:
>>>>>
>>>>> %gs:[&per_cpu_xxxx - __per_cpu_start]
>>>>>
>>>>> Based on linux-2.6.tip
>>>>>
>>>>>
>>>> -tip testing found an instantaneous reboot crash on 64-bit x86, with
>>>> this config:
>>>>
>>>> http://redhat.com/~mingo/misc/config-Thu_Jun__5_11_43_51_CEST_2008.bad
>>>>
>>>>
>>> I'm still stuck on this one. One new development is that the current
>>> -tip
>>> branch without the patches boots to the kernel prompt then hangs after
>>> a few
>>> moments and then reboots. It seems you can tickle it using ^C to abort a
>>> process.
>>>
>> Hi Mike,
>>
>> I added some instrumentation to Xen to print the cpu state on
>> triple-fault, which highlights an obvious-looking problem.
>>
>> (XEN) hvm.c:767:d1 Triple fault on VCPU0 - invoking HVM system reset.
>> (XEN) ----[ Xen-3.3-unstable x86_64 debug=y Not tainted ]----
>> (XEN) CPU: 1
>> (XEN) RIP: 0010:[<ffffffff80200160>]
>> (XEN) RFLAGS: 0000000000010002 CONTEXT: hvm
>> (XEN) rax: 0000000000000018 rbx: 0000000000000000 rcx: 00000000c0000080
>> (XEN) rdx: 0000000000000000 rsi: 0000000000092f40 rdi: 0000000020100800
>> (XEN) rbp: 0000000000000000 rsp: ffffffff807dfff8 r8: 0000000000208000
>> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 00000000000000de
>> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
>> (XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000000a0
>> (XEN) cr3: 0000000000201000 cr2: 0000000000000000
>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 0010
>>
>> The rip is:
>>
>> (gdb) x/i 0xffffffff80200160
>> 0xffffffff80200160 <secondary_startup_64+96>: movl %eax,%ds
>>
>> which is:
>>
>>     lgdt early_gdt_descr(%rip)
>>
>>     /* set up data segments. actually 0 would do too */
>>     movl $__KERNEL_DS,%eax
>>     movl %eax,%ds
>>     movl %eax,%ss
>>     movl %eax,%es
>>
>> And early_gdt_descr is:
>>
>>     .globl early_gdt_descr
>> early_gdt_descr:
>>     .word GDT_ENTRIES*8-1
>>     .quad per_cpu__gdt_page
>>
>> and per_cpu__gdt_page is zero-based, and therefore not a directly
>> addressable symbol.
>>
>> I tried this patch, but it didn't work. Perhaps I'm missing something.
>>
>> diff -r bf5a46e13f78 arch/x86/kernel/head_64.S
>> --- a/arch/x86/kernel/head_64.S Tue Jun 17 22:10:51 2008 -0700
>> +++ b/arch/x86/kernel/head_64.S Wed Jun 18 10:34:24 2008 -0700
>> @@ -94,6 +94,8 @@
>>
>>      addq %rbp, level2_fixmap_pgt + (506*8)(%rip)
>>
>> +    addq $__per_cpu_load, early_gdt_descr+2(%rip)
>> +
>>      /* Add an Identity mapping if I am above 1G */
>>      leaq _text(%rip), %rdi
>>      andq $PMD_PAGE_MASK, %rdi
>>
>>
>> J
>>
>
> Hi Jeremy,
>
> I'm not finding that code in the tip/latest or linux-next branches... ?
>

You mean your percpu/pda code? No, I'm carrying it locally because I
need it as a base for my Xen work. Xen bypasses these early boot
stages, so I haven't seen any problems so far. But I'd also like to
make sure that my Xen changes don't break native boots, too...

> I can send you my latest version of the patch which is better than
> the previous but still is having problems with the config file that
> Ingo sent out. (It also has a weird quirk that it will hang and
> reboot after about 30 seconds with or without my patch.)
>

Yes, keep me up to date with the percpu work.

J
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
[not found] ` <485ACD92.8050109@sgi.com>
@ 2008-06-19 21:35 ` Jeremy Fitzhardinge
2008-06-19 21:54 ` Jeremy Fitzhardinge
2008-06-19 22:13 ` Mike Travis
0 siblings, 2 replies; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-19 21:35 UTC (permalink / raw)
To: Mike Travis; +Cc: Rusty Russell, Linux Kernel Mailing List

Mike Travis wrote:
> Oh yeah, is it alright to re-use the pda in the static percpu load area
> for each startup cpu, or should it be adjusted to use the areas allocated
> by setup_per_cpu_areas()? pda_init() is called in x86_64_start_kernel
> so it would only be for anything that occurs before then. (And I moved
> the call to pda_init() to before the early_idt_handlers are setup.)
>

Why not use the real pda for all cpus?

Do you move the boot-cpu's per-cpu data? (Please don't) If not, you can
just use percpu__pda from the start without having to do anything else,
and then set up %gs pointing to the pda base for each secondary cpu.

64-bit inherits 32-bit's use of per-cpu gdts, though it's mostly
useless on 64-bit.

More important is to have a:

startup_percpu_base:
    .quad __per_cpu_load

which you stick the processor's initial %gs into, and then load from it
in secondary_startup_64:

    mov $X86_MSR_GSBASE, %ecx
    mov startup_percpu_base, %eax
    mov startup_percpu_base+4, %edx
    wrmsr

and put

    startup_percpu_base = new_cpus_percpu_base;

in do_boot_cpu().

> I hadn't realized that this code is executed for cpus other than the
> boot cpu. Is there a way to find out if this is the boot cpu (and/or
> the initial execution)?
>

Don't think so. If you want something to happen only at boot time, do
it in startup_64.

> If it's the boot cpu, then this would work for the gdt, yes?
>
>     leaq early_gdt_descr_base(%rip), %edi
>     movq 0(%edi), %rax
>     addq $__per_cpu_load, %rax
>     movq %rax, 0(%edi)
>     lgdt early_gdt_descr(%rip)
>

As I mentioned in my other mail, a simple add should be enough.

> But it should only be executed for the boot because do_boot_cpu()
> does this:
>
>     early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
>
> static inline struct desc_struct *get_cpu_gdt_table(unsigned int cpu)
> {
>     return per_cpu(gdt_page, cpu).gdt;
> }
>

Right, do it in startup_64.

> Btw, I've only been testing on an x86_64 system. I'm sure I've got
> things to fix up for i386.
>

It should be possible to share almost everything, at least in C.

J
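The wrmsr sequence sketched in this message takes the low 32 bits of the new GS base in %eax and the high 32 bits in %edx, which is why it reads the quad at startup_percpu_base and at startup_percpu_base+4. A minimal C illustration of that split follows; the names msr_halves, split_msr_value and join_msr_value are hypothetical and do not come from the kernel:

```c
#include <assert.h>
#include <stdint.h>

/* The two 32-bit halves that wrmsr takes in %eax (low) and %edx (high). */
struct msr_halves {
    uint32_t eax;
    uint32_t edx;
};

static_assert(sizeof(struct msr_halves) == 8, "two 32-bit registers");

/*
 * Split a 64-bit value the way reading the quad at +0 and +4 does on
 * little-endian x86: low word first, high word second.
 */
static inline struct msr_halves split_msr_value(uint64_t value)
{
    struct msr_halves h;

    h.eax = (uint32_t)value;          /* low 32 bits */
    h.edx = (uint32_t)(value >> 32);  /* high 32 bits */
    return h;
}

/* Recombine the halves; useful for checking that the split is lossless. */
static inline uint64_t join_msr_value(struct msr_halves h)
{
    return ((uint64_t)h.edx << 32) | h.eax;
}
```

For a kernel-space address, every bit of the upper half matters, so loading only %eax would leave the MSR pointing somewhere else entirely.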
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-19 21:35 ` Jeremy Fitzhardinge
@ 2008-06-19 21:54 ` Jeremy Fitzhardinge
2008-06-19 22:13 ` Mike Travis
1 sibling, 0 replies; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-19 21:54 UTC (permalink / raw)
To: Mike Travis; +Cc: Rusty Russell, Linux Kernel Mailing List

Jeremy Fitzhardinge wrote:
> 64-bit inherits 32-bit's use of per-cpu gdts, though it's mostly
> useless on 64-bit.

Note to self: Not really true; they're still needed for 32-on-64.

J
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-19 21:35 ` Jeremy Fitzhardinge
2008-06-19 21:54 ` Jeremy Fitzhardinge
@ 2008-06-19 22:13 ` Mike Travis
2008-06-19 22:21 ` Jeremy Fitzhardinge
2008-06-19 22:23 ` Jeremy Fitzhardinge
1 sibling, 2 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-19 22:13 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Rusty Russell, Linux Kernel Mailing List, Christoph Lameter,
Jack Steiner

Jeremy Fitzhardinge wrote:
>
> Why not use the real pda for all cpus?

Yeah, I figured that out after doing some more thinking... ;-)

>
> Do you move the boot-cpu's per-cpu data? (Please don't) If not, you can
> just use percpu__pda from the start without having to do anything else,
> and then set up %gs pointing to the pda base for each secondary cpu.

The problem is that the static percpu area is removed as it lies
in the initdata section, so the pda is removed as well.

But I took your suggestion to move the fixup to before secondary_startup.

Below is a revised version. It builds but I'll have to test it tomorrow.
Note the addition of:

+    initial_pda = (unsigned long)get_percpu_pda(cpu);

in do_boot_cpu.

I'm not sure yet what to put into acpi_save_state_mem:

     initial_code = (unsigned long)wakeup_long64;
+    /* ZZZ initial_pda = (unsigned long)?; */

Thanks again for your help!
Based on linux-2.6.tip/master Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> --- arch/x86/Kconfig | 3 + arch/x86/kernel/acpi/sleep.c | 1 arch/x86/kernel/head64.c | 34 ++++++--------- arch/x86/kernel/head_64.S | 13 +++++ arch/x86/kernel/setup.c | 86 +++++++++++---------------------------- arch/x86/kernel/setup64.c | 3 - arch/x86/kernel/smpboot.c | 52 ----------------------- arch/x86/kernel/vmlinux_64.lds.S | 1 include/asm-x86/desc.h | 5 ++ include/asm-x86/pda.h | 3 - include/asm-x86/percpu.h | 46 +++++--------------- include/asm-x86/trampoline.h | 1 12 files changed, 78 insertions(+), 170 deletions(-) --- linux-2.6.tip.orig/arch/x86/Kconfig +++ linux-2.6.tip/arch/x86/Kconfig @@ -129,6 +129,9 @@ config HAVE_SETUP_PER_CPU_AREA config HAVE_CPUMASK_OF_CPU_MAP def_bool X86_64_SMP +config HAVE_ZERO_BASED_PER_CPU + def_bool X86_64_SMP + config ARCH_HIBERNATION_POSSIBLE def_bool y depends on !SMP || !X86_VOYAGER --- linux-2.6.tip.orig/arch/x86/kernel/acpi/sleep.c +++ linux-2.6.tip/arch/x86/kernel/acpi/sleep.c @@ -76,6 +76,7 @@ int acpi_save_state_mem(void) stack_start.sp = temp_stack + 4096; #endif initial_code = (unsigned long)wakeup_long64; + /* ZZZ initial_pda = (unsigned long)?; */ saved_magic = 0x123456789abcdef0; #endif /* CONFIG_64BIT */ --- linux-2.6.tip.orig/arch/x86/kernel/head64.c +++ linux-2.6.tip/arch/x86/kernel/head64.c @@ -25,20 +25,6 @@ #include <asm/e820.h> #include <asm/bios_ebda.h> -/* boot cpu pda */ -static struct x8664_pda _boot_cpu_pda __read_mostly; - -#ifdef CONFIG_SMP -/* - * We install an empty cpu_pda pointer table to indicate to early users - * (numa_set_node) that the cpu_pda pointer table for cpus other than - * the boot cpu is not yet setup. 
- */ -static struct x8664_pda *__cpu_pda[NR_CPUS] __initdata; -#else -static struct x8664_pda *__cpu_pda[NR_CPUS] __read_mostly; -#endif - static void __init zap_identity_mappings(void) { pgd_t *pgd = pgd_offset_k(0UL); @@ -91,6 +77,20 @@ void __init x86_64_start_kernel(char * r /* Cleanup the over mapped high alias */ cleanup_highmap(); + /* point to boot pda which is the first element in the percpu area */ + { + struct x8664_pda *pda; +#ifdef CONFIG_SMP + pda = (struct x8664_pda *)__per_cpu_load; + pda->data_offset = per_cpu_offset(0) = (unsigned long)pda; +#else + pda = &per_cpu(pda, 0); + pda->data_offset = (unsigned long)pda; +#endif + } + /* initialize boot cpu_pda data */ + pda_init(0); + for (i = 0; i < NUM_EXCEPTION_VECTORS; i++) { #ifdef CONFIG_EARLY_PRINTK set_intr_gate(i, &early_idt_handlers[i]); @@ -102,12 +102,6 @@ void __init x86_64_start_kernel(char * r early_printk("Kernel alive\n"); - _cpu_pda = __cpu_pda; - cpu_pda(0) = &_boot_cpu_pda; - pda_init(0); - - early_printk("Kernel really alive\n"); - copy_bootdata(__va(real_mode_data)); reserve_early(__pa_symbol(&_text), __pa_symbol(&_end), "TEXT DATA BSS"); --- linux-2.6.tip.orig/arch/x86/kernel/head_64.S +++ linux-2.6.tip/arch/x86/kernel/head_64.S @@ -12,6 +12,7 @@ #include <linux/linkage.h> #include <linux/threads.h> #include <linux/init.h> +#include <asm/asm-offsets.h> #include <asm/desc.h> #include <asm/segment.h> #include <asm/pgtable.h> @@ -132,6 +133,12 @@ ident_complete: #ifdef CONFIG_SMP addq %rbp, trampoline_level4_pgt + 0(%rip) addq %rbp, trampoline_level4_pgt + (511*8)(%rip) + + /* + * Fix up per_cpu__gdt_page offset when basing percpu + * variables at zero. This is only needed for the boot cpu. 
+ */ + addq $__per_cpu_load, early_gdt_descr_base #endif /* Due to ENTRY(), sometimes the empty space gets filled with @@ -224,10 +231,11 @@ ENTRY(secondary_startup_64) * that does in_interrupt() */ movl $MSR_GS_BASE,%ecx - movq $empty_zero_page,%rax + movq initial_pda(%rip), %rax movq %rax,%rdx shrq $32,%rdx wrmsr + movq %rax,%gs:pda_data_offset /* esi is pointer to real mode structure with interesting info. pass it to C */ @@ -250,6 +258,8 @@ ENTRY(secondary_startup_64) .align 8 ENTRY(initial_code) .quad x86_64_start_kernel + ENTRY(initial_pda) + .quad __per_cpu_load __FINITDATA ENTRY(stack_start) @@ -394,6 +404,7 @@ NEXT_PAGE(level2_spare_pgt) .globl early_gdt_descr early_gdt_descr: .word GDT_ENTRIES*8-1 +early_gdt_descr_base: .quad per_cpu__gdt_page ENTRY(phys_base) --- linux-2.6.tip.orig/arch/x86/kernel/setup.c +++ linux-2.6.tip/arch/x86/kernel/setup.c @@ -30,6 +30,11 @@ DEFINE_EARLY_PER_CPU(u16, x86_bios_cpu_a EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_apicid); EXPORT_EARLY_PER_CPU_SYMBOL(x86_bios_cpu_apicid); +#ifdef CONFIG_X86_64 +DEFINE_PER_CPU_FIRST(struct x8664_pda, pda); +EXPORT_PER_CPU_SYMBOL(pda); +#endif + #if defined(CONFIG_NUMA) && defined(CONFIG_X86_64) #define X86_64_NUMA 1 @@ -48,7 +53,7 @@ static void __init setup_node_to_cpumask static inline void setup_node_to_cpumask_map(void) { } #endif -#if defined(CONFIG_HAVE_SETUP_PER_CPU_AREA) && defined(CONFIG_SMP) +#ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA /* * Copy data used in early init routines from the initial arrays to the * per cpu data areas. 
These arrays then become expendable and the @@ -95,64 +100,9 @@ static void __init setup_cpumask_of_cpu( static inline void setup_cpumask_of_cpu(void) { } #endif -#ifdef CONFIG_X86_32 -/* - * Great future not-so-futuristic plan: make i386 and x86_64 do it - * the same way - */ unsigned long __per_cpu_offset[NR_CPUS] __read_mostly; EXPORT_SYMBOL(__per_cpu_offset); -static inline void setup_cpu_pda_map(void) { } - -#elif !defined(CONFIG_SMP) -static inline void setup_cpu_pda_map(void) { } - -#else /* CONFIG_SMP && CONFIG_X86_64 */ - -/* - * Allocate cpu_pda pointer table and array via alloc_bootmem. - */ -static void __init setup_cpu_pda_map(void) -{ - char *pda; - struct x8664_pda **new_cpu_pda; - unsigned long size; - int cpu; - - size = roundup(sizeof(struct x8664_pda), cache_line_size()); - - /* allocate cpu_pda array and pointer table */ - { - unsigned long tsize = nr_cpu_ids * sizeof(void *); - unsigned long asize = size * (nr_cpu_ids - 1); - - tsize = roundup(tsize, cache_line_size()); - new_cpu_pda = alloc_bootmem(tsize + asize); - pda = (char *)new_cpu_pda + tsize; - } - - /* initialize pointer table to static pda's */ - for_each_possible_cpu(cpu) { - if (cpu == 0) { - /* leave boot cpu pda in place */ - new_cpu_pda[0] = cpu_pda(0); - continue; - } - new_cpu_pda[cpu] = (struct x8664_pda *)pda; - new_cpu_pda[cpu]->in_bootmem = 1; - pda += size; - } - - /* point to new pointer table */ - _cpu_pda = new_cpu_pda; -} -#endif -/* - * Great future plan: - * Declare PDA itself and support (irqstack,tss,pgd) as per cpu data. 
- * Always point %gs to its beginning - */ void __init setup_per_cpu_areas(void) { ssize_t size = PERCPU_ENOUGH_ROOM; @@ -165,9 +115,6 @@ void __init setup_per_cpu_areas(void) nr_cpu_ids = num_processors; #endif - /* Setup cpu_pda map */ - setup_cpu_pda_map(); - /* Copy section for each CPU (we discard the original) */ size = PERCPU_ENOUGH_ROOM; printk(KERN_INFO "PERCPU: Allocating %zd bytes of per cpu data\n", @@ -187,9 +134,28 @@ void __init setup_per_cpu_areas(void) else ptr = alloc_bootmem_pages_node(NODE_DATA(node), size); #endif + /* Initialize each cpu's per_cpu area and save pointer */ + memcpy(ptr, __per_cpu_load, __per_cpu_size); per_cpu_offset(cpu) = ptr - __per_cpu_start; - memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start); +#ifdef CONFIG_X86_64 + /* + * Note the boot cpu has been using the static per_cpu load + * area for it's pda. We need to zero out the pda's for the + * other cpu's that are coming online. + */ + { + /* we rely on the fact that pda is the first element */ + struct x8664_pda *pda = (struct x8664_pda *)ptr; + + if (cpu) + memset(pda, 0, sizeof(struct x8664_pda)); + else + pda_init(0); + + pda->data_offset = (unsigned long)ptr; + } +#endif } printk(KERN_DEBUG "NR_CPUS: %d, nr_cpu_ids: %d, nr_node_ids %d\n", --- linux-2.6.tip.orig/arch/x86/kernel/setup64.c +++ linux-2.6.tip/arch/x86/kernel/setup64.c @@ -35,9 +35,6 @@ struct boot_params boot_params; cpumask_t cpu_initialized __cpuinitdata = CPU_MASK_NONE; -struct x8664_pda **_cpu_pda __read_mostly; -EXPORT_SYMBOL(_cpu_pda); - struct desc_ptr idt_descr = { 256 * 16 - 1, (unsigned long) idt_table }; char boot_cpu_stack[IRQSTACKSIZE] __page_aligned_bss; --- linux-2.6.tip.orig/arch/x86/kernel/smpboot.c +++ linux-2.6.tip/arch/x86/kernel/smpboot.c @@ -762,45 +762,6 @@ static void __cpuinit do_fork_idle(struc complete(&c_idle->done); } -#ifdef CONFIG_X86_64 -/* - * Allocate node local memory for the AP pda. - * - * Must be called after the _cpu_pda pointer table is initialized. 
- */ -static int __cpuinit get_local_pda(int cpu) -{ - struct x8664_pda *oldpda, *newpda; - unsigned long size = sizeof(struct x8664_pda); - int node = cpu_to_node(cpu); - - if (cpu_pda(cpu) && !cpu_pda(cpu)->in_bootmem) - return 0; - - oldpda = cpu_pda(cpu); - newpda = kmalloc_node(size, GFP_ATOMIC, node); - if (!newpda) { - printk(KERN_ERR "Could not allocate node local PDA " - "for CPU %d on node %d\n", cpu, node); - - if (oldpda) - return 0; /* have a usable pda */ - else - return -1; - } - - if (oldpda) { - memcpy(newpda, oldpda, size); - if (!after_bootmem) - free_bootmem((unsigned long)oldpda, size); - } - - newpda->in_bootmem = 0; - cpu_pda(cpu) = newpda; - return 0; -} -#endif /* CONFIG_X86_64 */ - static int __cpuinit do_boot_cpu(int apicid, int cpu) /* * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad @@ -818,16 +779,6 @@ static int __cpuinit do_boot_cpu(int api }; INIT_WORK(&c_idle.work, do_fork_idle); -#ifdef CONFIG_X86_64 - /* Allocate node local memory for AP pdas */ - if (cpu > 0) { - boot_error = get_local_pda(cpu); - if (boot_error) - goto restore_state; - /* if can't get pda memory, can't start cpu */ - } -#endif - alternatives_smp_switch(1); c_idle.idle = get_idle_for_cpu(cpu); @@ -865,6 +816,7 @@ do_rest: #else cpu_pda(cpu)->pcurrent = c_idle.idle; clear_tsk_thread_flag(c_idle.idle, TIF_FORK); + initial_pda = (unsigned long)get_percpu_pda(cpu); #endif early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu); initial_code = (unsigned long)start_secondary; @@ -940,8 +892,6 @@ do_rest: } } -restore_state: - if (boot_error) { /* Try to put things back the way they were before ... 
*/ numa_remove_cpu(cpu); /* was set by numa_add_cpu */ --- linux-2.6.tip.orig/arch/x86/kernel/vmlinux_64.lds.S +++ linux-2.6.tip/arch/x86/kernel/vmlinux_64.lds.S @@ -16,6 +16,7 @@ jiffies_64 = jiffies; _proxy_pda = 1; PHDRS { text PT_LOAD FLAGS(5); /* R_E */ + percpu PT_LOAD FLAGS(7); /* RWE */ data PT_LOAD FLAGS(7); /* RWE */ user PT_LOAD FLAGS(7); /* RWE */ data.init PT_LOAD FLAGS(7); /* RWE */ --- linux-2.6.tip.orig/include/asm-x86/desc.h +++ linux-2.6.tip/include/asm-x86/desc.h @@ -41,6 +41,11 @@ static inline struct desc_struct *get_cp #ifdef CONFIG_X86_64 +static inline struct x8664_pda *get_percpu_pda(unsigned int cpu) +{ + return &per_cpu(pda, cpu); +} + static inline void pack_gate(gate_desc *gate, unsigned type, unsigned long func, unsigned dpl, unsigned ist, unsigned seg) { --- linux-2.6.tip.orig/include/asm-x86/pda.h +++ linux-2.6.tip/include/asm-x86/pda.h @@ -37,10 +37,9 @@ struct x8664_pda { unsigned irq_spurious_count; } ____cacheline_aligned_in_smp; -extern struct x8664_pda **_cpu_pda; extern void pda_init(int); -#define cpu_pda(i) (_cpu_pda[i]) +#define cpu_pda(i) (&per_cpu(pda, i)) /* * There is no fast way to get the base address of the PDA, all the accesses --- linux-2.6.tip.orig/include/asm-x86/percpu.h +++ linux-2.6.tip/include/asm-x86/percpu.h @@ -3,26 +3,20 @@ #ifdef CONFIG_X86_64 #include <linux/compiler.h> - -/* Same as asm-generic/percpu.h, except that we store the per cpu offset - in the PDA. 
Longer term the PDA and every per cpu variable - should be just put into a single section and referenced directly - from %gs */ - -#ifdef CONFIG_SMP #include <asm/pda.h> -#define __per_cpu_offset(cpu) (cpu_pda(cpu)->data_offset) +#ifdef CONFIG_SMP #define __my_cpu_offset read_pda(data_offset) - -#define per_cpu_offset(x) (__per_cpu_offset(x)) - +#define __percpu_seg "%%gs:" +#else +#define __percpu_seg "" #endif + #include <asm-generic/percpu.h> DECLARE_PER_CPU(struct x8664_pda, pda); -#else /* CONFIG_X86_64 */ +#else /* !CONFIG_X86_64 */ #ifdef __ASSEMBLY__ @@ -51,36 +45,23 @@ DECLARE_PER_CPU(struct x8664_pda, pda); #else /* ...!ASSEMBLY */ -/* - * PER_CPU finds an address of a per-cpu variable. - * - * Args: - * var - variable name - * cpu - 32bit register containing the current CPU number - * - * The resulting address is stored in the "cpu" argument. - * - * Example: - * PER_CPU(cpu_gdt_descr, %ebx) - */ #ifdef CONFIG_SMP - #define __my_cpu_offset x86_read_percpu(this_cpu_off) - -/* fs segment starts at (positive) offset == __per_cpu_offset[cpu] */ #define __percpu_seg "%%fs:" - -#else /* !SMP */ - +#else #define __percpu_seg "" - -#endif /* SMP */ +#endif #include <asm-generic/percpu.h> /* We can use this directly for local CPU (faster). */ DECLARE_PER_CPU(unsigned long, this_cpu_off); +#endif /* __ASSEMBLY__ */ +#endif /* !CONFIG_X86_64 */ + +#ifndef __ASSEMBLY__ + /* For arch-specific code, we can use direct single-insn ops (they * don't give an lvalue though). 
*/ extern void __bad_percpu_size(void); @@ -215,7 +196,6 @@ do { \ percpu_cmpxchg_op(per_cpu_var(var), old, new) #endif /* !__ASSEMBLY__ */ -#endif /* !CONFIG_X86_64 */ #ifdef CONFIG_SMP --- linux-2.6.tip.orig/include/asm-x86/trampoline.h +++ linux-2.6.tip/include/asm-x86/trampoline.h @@ -12,6 +12,7 @@ extern unsigned char *trampoline_base; extern unsigned long init_rsp; extern unsigned long initial_code; +extern unsigned long initial_pda; #define TRAMPOLINE_BASE 0x6000 extern unsigned long setup_trampoline(void); ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-19 22:13 ` Mike Travis
@ 2008-06-19 22:21 ` Jeremy Fitzhardinge
2008-06-30 17:49 ` Mike Travis
2008-06-19 22:23 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-19 22:21 UTC (permalink / raw)
To: Mike Travis
Cc: Rusty Russell, Linux Kernel Mailing List, Christoph Lameter,
Jack Steiner

Mike Travis wrote:
> Jeremy Fitzhardinge wrote:
>
>
>> Why not use the real pda for all cpus?
>>
>
> Yeah, I figured that out after doing some more thinking... ;-)
>
>
>> Do you move the boot-cpu's per-cpu data? (Please don't) If not, you can
>> just use percpu__pda from the start without having to do anything else,
>> and then set up %gs pointing to the pda base for each secondary cpu.
>>
>
> The problem is that the static percpu area is removed as it lies
> in the initdata section, so the pda is removed as well.
>

Well, that's easy to fix...

> But I took your suggestion to move the fixup to before secondary_startup.
>
> Below is a revised version.

(Incremental diffs are easier to review and work with.)

> It builds but I'll have to test it tomorrow.
> Note the addition of:
>
> + initial_pda = (unsigned long)get_percpu_pda(cpu);
>
> in do_boot_cpu.
>
> I'm not sure yet what to put into acpi_save_state_mem:
>
> initial_code = (unsigned long)wakeup_long64;
> + /* ZZZ initial_pda = (unsigned long)?; */
>

You'll need to change wakeup_long64 to load the right value into the
GS_BASE msr anyway.

J
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-19 22:21 ` Jeremy Fitzhardinge
@ 2008-06-30 17:49 ` Mike Travis
0 siblings, 0 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-30 17:49 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Rusty Russell, Linux Kernel Mailing List, len.brown, Jack Steiner

Jeremy Fitzhardinge wrote:
> Mike Travis wrote:
...
>> I'm not sure yet what to put into acpi_save_state_mem:
>>
>> initial_code = (unsigned long)wakeup_long64;
>> + /* ZZZ initial_pda = (unsigned long)?; */
>>
>
> You'll need to change wakeup_long64 to load the right value into the
> GS_BASE msr anyway.
>
> J

I'm afraid I don't understand the transitioning of the ACPI states well
enough to figure out the correct thing to do. My first inclination
would be to:

[sorry, cut and pasted]

--- linux-2.6.tip.orig/arch/x86/kernel/acpi/sleep.c
+++ linux-2.6.tip/arch/x86/kernel/acpi/sleep.c
@@ -89,6 +89,8 @@ int acpi_save_state_mem(void)
 #ifdef CONFIG_SMP
     stack_start.sp = temp_stack + 4096;
 #endif
+    early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
+    initial_pda = (unsigned long)get_cpu_pda(cpu);
     initial_code = (unsigned long)wakeup_long64;
     saved_magic = 0x123456789abcdef0;
 #endif /* CONFIG_64BIT */

But I'd like some confirmation that this is the right thing to do...
[This mimics what smpboot.c:do_boot_cpu() does.]

Len - I'll cc you on the full patch submission shortly.

Thanks,
Mike
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-19 22:13 ` Mike Travis
2008-06-19 22:21 ` Jeremy Fitzhardinge
@ 2008-06-19 22:23 ` Jeremy Fitzhardinge
[not found] ` <485BDB04.4090709@sgi.com>
1 sibling, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-19 22:23 UTC (permalink / raw)
To: Mike Travis
Cc: Rusty Russell, Linux Kernel Mailing List, Christoph Lameter,
Jack Steiner

Mike Travis wrote:
> @@ -132,6 +133,12 @@ ident_complete:
> #ifdef CONFIG_SMP
>     addq %rbp, trampoline_level4_pgt + 0(%rip)
>     addq %rbp, trampoline_level4_pgt + (511*8)(%rip)
> +
> +    /*
> +     * Fix up per_cpu__gdt_page offset when basing percpu
> +     * variables at zero. This is only needed for the boot cpu.
> +     */
> +    addq $__per_cpu_load, early_gdt_descr_base
>

This needs to be rip-relative. An absolute reference here will fail
because you're still running in physical addresses.

J
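The point about rip-relative addressing can be modelled with plain arithmetic: an absolute operand carries the linked (virtual) address verbatim, while a rip-relative operand carries only the distance from the instruction to the symbol, so it keeps working wherever the code happens to be running. A rough sketch under that model; the function names and any addresses fed to them are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/*
 * link_ip / link_sym: addresses the linker assigned (virtual).
 * run_ip: address the code is really executing at.
 * "ip" means the address of the following instruction in both cases.
 */

/* An absolute reference uses the linked address as-is. */
static inline uint64_t absolute_ref(uint64_t link_sym)
{
    return link_sym;
}

/* A rip-relative reference adds a fixed displacement to the runtime ip. */
static inline uint64_t rip_relative_ref(uint64_t run_ip, uint64_t link_ip,
                                        uint64_t link_sym)
{
    int64_t disp = (int64_t)(link_sym - link_ip);

    /* x86-64 encodes the distance as a signed 32-bit displacement. */
    assert(disp >= INT32_MIN && disp <= INT32_MAX);
    return run_ip + (uint64_t)disp;
}
```

For example, with code linked at 0xffffffff80200100 but executing at physical 0x200100, a symbol linked at 0xffffffff80600000 resolves to 0x600000 through the rip-relative form, whereas the absolute form still yields the virtual address, which is not usable while running at physical addresses.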
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
[not found] ` <485BDB04.4090709@sgi.com>
@ 2008-06-20 17:25 ` Jeremy Fitzhardinge
2008-06-20 17:48 ` Christoph Lameter
0 siblings, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-20 17:25 UTC (permalink / raw)
To: Mike Travis; +Cc: Christoph Lameter, Linux Kernel Mailing List

Mike Travis wrote:
> Jeremy Fitzhardinge wrote:
>
>> Mike Travis wrote:
>>
>>> @@ -132,6 +133,12 @@ ident_complete:
>>> #ifdef CONFIG_SMP
>>>     addq %rbp, trampoline_level4_pgt + 0(%rip)
>>>     addq %rbp, trampoline_level4_pgt + (511*8)(%rip)
>>> +
>>> +    /*
>>> +     * Fix up per_cpu__gdt_page offset when basing percpu
>>> +     * variables at zero. This is only needed for the boot cpu.
>>> +     */
>>> +    addq $__per_cpu_load, early_gdt_descr_base
>>>
>>>
>> This needs to be rip-relative. An absolute reference here will fail
>> because you're still running in physical addresses.
>>
>> J
>>
>
> Still bombs right at boot up... ;-(
>

Yep. I see the triple-fault at the "mov %eax,%ds", which means it's
having trouble with the gdt. Either 1) the lgdt pointed to a bad
address, or 2) there's something wrong with the descriptor there.

The dump is:

(XEN) hvm.c:767:d14 Triple fault on VCPU0 - invoking HVM system reset.
(XEN) ----[ Xen-3.3-unstable x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: 0010:[<ffffffff80200167>]
(XEN) RFLAGS: 0000000000010002 CONTEXT: hvm
(XEN) rax: 0000000000000018 rbx: 0000000000000000 rcx: ffffffff808d6000
(XEN) rdx: 0000000000000000 rsi: 0000000000092f40 rdi: 0000000020100800
(XEN) rbp: 0000000000000000 rsp: ffffffff80827ff8 r8: 0000000000208000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 00000000000000d8
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000000a0
(XEN) cr3: 0000000000201000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 0010

I loaded early_gdt_descr+2 into %rcx, which looks reasonable. Hm, but
loading the __KERNEL_DS descriptor into %rdx gives all zero.

So it seems the problem is that the pre-initialized gdt_page is being
lost and replaced with zero. Linker script bug?

J

--- a/arch/x86/kernel/head_64.S Fri Jun 20 09:50:02 2008 -0700
+++ b/arch/x86/kernel/head_64.S Fri Jun 20 10:19:20 2008 -0700
@@ -213,6 +213,8 @@
      * because in 32bit we couldn't load a 64bit linear address.
      */
     lgdt early_gdt_descr(%rip)
+    movq early_gdt_descr+2(%rip), %rcx
+    movq __KERNEL_DS(%rcx), %rdx

     /* set up data segments. actually 0 would do too */
     movl $__KERNEL_DS,%eax
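The debugging step above treats the selector in %rax (0x18) as a byte offset into the GDT and finds an all-zero descriptor there. A small C sketch of why an all-zero entry cannot be loaded into a segment register: its "present" and "code/data" bits are both clear. The helper names are hypothetical, and the non-zero example encoding is a conventional flat read/write data segment chosen only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte offset of a descriptor in the table: selector with RPL/TI cleared. */
static inline uint32_t selector_to_offset(uint16_t selector)
{
    return selector & ~0x7u;
}

/* Build the 8-byte descriptor from the two 32-bit words of an initializer. */
static inline uint64_t make_descriptor(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Bit 47 of a segment descriptor is the "present" flag. */
static inline bool descriptor_present(uint64_t desc)
{
    return (desc >> 47) & 1;
}

/* Bit 44 (S) separates code/data descriptors from system descriptors. */
static inline bool descriptor_is_code_or_data(uint64_t desc)
{
    return (desc >> 44) & 1;
}

/* Self-check using the illustrative encoding and the all-zero case. */
static inline void descriptor_self_check(void)
{
    uint64_t flat_data = make_descriptor(0x0000ffff, 0x00cf9200);

    assert(descriptor_present(flat_data));
    assert(descriptor_is_code_or_data(flat_data));
    assert(!descriptor_present(0));
    assert(!descriptor_is_code_or_data(0));
}
```

Loading %ds with a selector that names such an entry raises a fault, and with no working exception handling that early in boot it plausibly escalates to the triple fault shown in the dump.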
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 17:25 ` Jeremy Fitzhardinge
@ 2008-06-20 17:48 ` Christoph Lameter
2008-06-20 18:30 ` Mike Travis
2008-06-20 18:37 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 108+ messages in thread
From: Christoph Lameter @ 2008-06-20 17:48 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Mike Travis, Linux Kernel Mailing List

On Fri, 20 Jun 2008, Jeremy Fitzhardinge wrote:

> So it seems the problem is that the pre-initialized gdt_page is being lost and
> replaced with zero. Linker script bug?

Is the pre initialized gdt page in the per cpu area? Does not look like
it. The loader setup for the percpu section changes with zero basing.
Maybe that has bad side effects?
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 17:48 ` Christoph Lameter
@ 2008-06-20 18:30 ` Mike Travis
2008-06-20 18:40 ` Jeremy Fitzhardinge
2008-06-20 18:37 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-06-20 18:30 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Jeremy Fitzhardinge, Linux Kernel Mailing List

Christoph Lameter wrote:
> On Fri, 20 Jun 2008, Jeremy Fitzhardinge wrote:
>
>> So it seems the problem is that the pre-initialized gdt_page is being lost and
>> replaced with zero. Linker script bug?
>
> Is the pre initialized gdt page in the per cpu area? Does not look like
> it. The loader setup for the percpu section changes with zero basing.
> Maybe that has bad side effects?

Yes, it is... The fixup logic is this:

0000000000004000 D per_cpu__gdt_page
ffffffff81911000 A __per_cpu_load

arch/x86/kernel/cpu/common.c:

DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
        [GDT_ENTRY_KERNEL_CS] = { { { 0x0000ffff, 0x00cf9a00 } } },
...

arch/x86/kernel/head_64.S:

startup_64:
...
        /*
         * Fix up per_cpu__gdt_page offset when basing percpu
         * variables at zero. This is only needed for the boot cpu.
         */
        addq $__per_cpu_load, early_gdt_descr_base(%rip)

ENTRY(secondary_startup_64)
...
        /*
         * We must switch to a new descriptor in kernel space for the GDT
         * because soon the kernel won't have access anymore to the userspace
         * addresses where we're currently running on. We have to do that here
         * because in 32bit we couldn't load a 64bit linear address.
         */
        lgdt early_gdt_descr(%rip)
...
        .globl early_gdt_descr
early_gdt_descr:
        .word GDT_ENTRIES*8-1
early_gdt_descr_base:
        .quad per_cpu__gdt_page
^ permalink raw reply [flat|nested] 108+ messages in thread
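A minimal sketch of the arithmetic behind this fixup, written in Python rather than assembly and not taken from the patch itself. With zero-based percpu symbols, the link-time value of per_cpu__gdt_page is only an offset, so the boot cpu's GDT base is that offset plus __per_cpu_load. The helper name boot_cpu_address is hypothetical; the two constants are the symbol values quoted in the message above.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit register wraparound


def boot_cpu_address(per_cpu_load: int, symbol_offset: int) -> int:
    """Address of a zero-based percpu symbol inside the boot-time template.

    Hypothetical helper: mirrors what `addq $__per_cpu_load, ...` does to
    the .quad stored at early_gdt_descr_base.
    """
    return (per_cpu_load + symbol_offset) & MASK64


PER_CPU_LOAD = 0xFFFFFFFF81911000  # __per_cpu_load, from the symbol listing above
GDT_PAGE_OFFSET = 0x4000           # per_cpu__gdt_page, from the symbol listing above

gdt_base = boot_cpu_address(PER_CPU_LOAD, GDT_PAGE_OFFSET)
print(hex(gdt_base))  # 0xffffffff81915000
```

If the value read back from early_gdt_descr+2 differs from this sum, the fixup did not run or was applied to the wrong location; if it matches but the descriptors there are zero, the initialized data itself was not loaded where the symbols say.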
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 18:30 ` Mike Travis
@ 2008-06-20 18:40 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-20 18:40 UTC (permalink / raw)
To: Mike Travis; +Cc: Christoph Lameter, Linux Kernel Mailing List

Mike Travis wrote:
> Yes, it is... The fixup logic is this:
>
> 0000000000004000 D per_cpu__gdt_page
> ffffffff81911000 A __per_cpu_load
>
> arch/x86/kernel/cpu/common.c:
>
> DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
> [GDT_ENTRY_KERNEL_CS] = { { { 0x0000ffff, 0x00cf9a00 } } },
>

Aha, you fell into the trap! "common" in this case means "common to
32-bit x86". Or something. But setup_64.c is what you want to be
looking at.

J
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 17:48 ` Christoph Lameter
2008-06-20 18:30 ` Mike Travis
@ 2008-06-20 18:37 ` Jeremy Fitzhardinge
2008-06-20 18:51 ` Christoph Lameter
2008-06-20 19:06 ` Mike Travis
1 sibling, 2 replies; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-20 18:37 UTC (permalink / raw)
To: Christoph Lameter
Cc: Mike Travis, Linux Kernel Mailing List, H. Peter Anvin, Eric W. Biederman

Christoph Lameter wrote:
> On Fri, 20 Jun 2008, Jeremy Fitzhardinge wrote:
>
>
>> So it seems the problem is that the pre-initialized gdt_page is being lost and
>> replaced with zero. Linker script bug?
>>
>
> Is the pre initialized gdt page in the per cpu area? Does not look like
> it.

Yes, it should be. arch/x86/kernel/setup_64.c:

DEFINE_PER_CPU(struct gdt_page, gdt_page) = { .gdt = {
        [GDT_ENTRY_KERNEL32_CS] = { { { 0x0000ffff, 0x00cf9b00 } } },
        [GDT_ENTRY_KERNEL_CS] = { { { 0x0000ffff, 0x00af9b00 } } },
        [GDT_ENTRY_KERNEL_DS] = { { { 0x0000ffff, 0x00cf9300 } } },
        [GDT_ENTRY_DEFAULT_USER32_CS] = { { { 0x0000ffff, 0x00cffb00 } } },
        [GDT_ENTRY_DEFAULT_USER_DS] = { { { 0x0000ffff, 0x00cff300 } } },
        [GDT_ENTRY_DEFAULT_USER_CS] = { { { 0x0000ffff, 0x00affb00 } } },
} };

> The loader setup for the percpu section changes with zero basing.
> Maybe that has bad side effects

How does it work? The symbols in the percpu segment are 0-based, but
where does the data for the sections which correspond to that segment
go?

The vmlinux looks like it has the gdt_page data in it.
per_cpu__gdt_page is 0x5000, and offset 0x5000 in .data.percpu has the
right stuff:

5000 00000000 00000000 ffff0000 009bcf00 ................
5010 ffff0000 009baf00 ffff0000 0093cf00 ................
5020 ffff0000 00fbcf00 ffff0000 00f3cf00 ................
5030 ffff0000 00fbaf00 00000000 00000000 ................
5040 00000000 00000000 00000000 00000000 ................
5050 00000000 00000000 00000000 00000000 ................
5060 00000000 00000000 00000000 00000000 ................

So the question is what kernel virtual address is it being loaded to?
__per_cpu_load is ffffffff808d1000, so ffffffff808d6000 is what you'd
expect...

Hm, but what happens when this gets converted to bzImage? Hm, looks
OK, I think.

BTW, I think __per_cpu_load will cause trouble if you make a
relocatable kernel, being an absolute symbol. But I have relocation
off at the moment.

J
^ permalink raw reply [flat|nested] 108+ messages in thread
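One way to cross-check the section dump against the initializer, sketched in Python. It assumes the dump lists bytes in memory order (as an objdump-style section dump does), so each 4-byte group has to be read little-endian before it can be compared with the 32-bit values in the DEFINE_PER_CPU initializer. The helper name decode_descriptor is hypothetical.

```python
import struct


def decode_descriptor(dump_words: str) -> tuple:
    """Turn two 4-byte groups from the hex dump into (low, high) dwords.

    Assumes the dump shows bytes in memory order, so each group is
    interpreted as a little-endian 32-bit value.
    """
    raw = bytes.fromhex(dump_words.replace(" ", ""))
    low, high = struct.unpack("<II", raw)
    return low, high


# Row 0x5010 of the dump holds the descriptors at byte offsets 0x10 and
# 0x18 from per_cpu__gdt_page (0x5000).
kernel_cs = decode_descriptor("ffff0000 009baf00")
kernel_ds = decode_descriptor("ffff0000 0093cf00")
print([hex(v) for v in kernel_cs])  # ['0xffff', '0xaf9b00']
print([hex(v) for v in kernel_ds])  # ['0xffff', '0xcf9300']
```

These match the GDT_ENTRY_KERNEL_CS and GDT_ENTRY_KERNEL_DS initializers, and the second one sits at byte offset 0x18, the same value that was in %rax at the faulting "mov %eax,%ds". That supports the reading that the image contents are fine and the data is going missing somewhere between the file and memory.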
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 18:37 ` Jeremy Fitzhardinge
@ 2008-06-20 18:51 ` Christoph Lameter
2008-06-20 19:04 ` Jeremy Fitzhardinge
2008-06-20 19:06 ` Mike Travis
1 sibling, 1 reply; 108+ messages in thread
From: Christoph Lameter @ 2008-06-20 18:51 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Mike Travis, Linux Kernel Mailing List, H. Peter Anvin, Eric W. Biederman

On Fri, 20 Jun 2008, Jeremy Fitzhardinge wrote:

> > The loader setup for the percpu section changes with zero basing. Maybe that
> > has bad side effects
>
> How does it work? The symbols in the percpu segment are 0-based, but where
> does the data for the sections which correspond to that segment go?

It's loaded at __per_cpu_load but the symbols have addresses starting at 0.

> So the question is what kernel virtual address is it being loaded to?
> __per_cpu_load is ffffffff808d1000, so ffffffff808d6000 is what you'd
> expect...

Correct.

> Hm, but what happens when this gets converted to bzImage? Hm, looks OK, I
> think.
>
> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
> kernel, being an absolute symbol. But I have relocation off at the moment.

Hmmm.... we could add the relocation offset to __per_cpu_load?
__per_cpu_load is used very sparingly. Basically only useful during early
boot and when a new per cpu area has to be set up. In that case we want to
copy from __per_cpu_load to the newly allocated percpu area.
^ permalink raw reply [flat|nested] 108+ messages in thread
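A toy model of the copy described above, in Python and purely illustrative: the initialized template that lives at __per_cpu_load is copied once per cpu, and a zero-based symbol value is then used directly as an offset into whichever copy belongs to the cpu in question. The function names and the 16-byte template are hypothetical stand-ins, not kernel interfaces.

```python
def setup_percpu_areas(template: bytes, ncpus: int) -> list:
    """Give every cpu its own private copy of the initialized template."""
    return [bytearray(template) for _ in range(ncpus)]


def percpu_read(areas: list, cpu: int, offset: int, size: int) -> bytes:
    """Zero-based access: the symbol value is used directly as an offset."""
    return bytes(areas[cpu][offset:offset + size])


# Toy template standing in for the data loaded at __per_cpu_load.
template = bytes(range(16))
areas = setup_percpu_areas(template, ncpus=2)

areas[1][4] = 0xFF                  # cpu 1 modifies its own copy only
print(percpu_read(areas, 0, 4, 1))  # b'\x04'
print(percpu_read(areas, 1, 4, 1))  # b'\xff'
```

The point of the model is that only the copy step ever needs the load address of the template; every later access works from the per-cpu base plus the zero-based offset.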
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 18:51 ` Christoph Lameter
@ 2008-06-20 19:04 ` Jeremy Fitzhardinge
2008-06-20 19:21 ` H. Peter Anvin
2008-06-20 19:43 ` Eric W. Biederman
0 siblings, 2 replies; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-20 19:04 UTC (permalink / raw)
To: Christoph Lameter
Cc: Mike Travis, Linux Kernel Mailing List, H. Peter Anvin, Eric W. Biederman

Christoph Lameter wrote:
> On Fri, 20 Jun 2008, Jeremy Fitzhardinge wrote:
>
>
>>> The loader setup for the percpu section changes with zero basing. Maybe that
>>> has bad side effects
>>>
>> How does it work? The symbols in the percpu segment are 0-based, but where
>> does the data for the sections which correspond to that segment go?
>>
>
> It's loaded at __per_cpu_load but the symbols have addresses starting at 0.
>

Yes, which leads to an odd-looking ELF file where the Phdrs aren't
sorted by virtual address order. I'm wondering what would happen if a
bootloader that actually understood ELF files tried to load it as an
actual ELF file...

>> So the question is what kernel virtual address is it being loaded to?
>> __per_cpu_load is ffffffff808d1000, so ffffffff808d6000 is what you'd
>> expect...
>>
>
> Correct.
>

Well, reading back from that address got zeros, so something is amiss.

>> Hm, but what happens when this gets converted to bzImage? Hm, looks OK, I
>> think.
>>
>> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
>> kernel, being an absolute symbol. But I have relocation off at the moment.
>>
>
> Hmmm.... we could add the relocation offset to __per_cpu_load?
> __per_cpu_load is used very sparingly. Basically only useful during early
> boot and when a new per cpu area has to be set up. In that case we want to
> copy from __per_cpu_load to the newly allocated percpu area.
>

Yes, it should be fairly easy to manually relocate it by applying the
(load - link) offset to it.

J
^ permalink raw reply [flat|nested] 108+ messages in thread
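The "(load - link) offset" idea can be sketched as follows, in Python with made-up numbers. The helper name relocate and both base addresses are hypothetical; which address space the delta applies to (physical or virtual) depends on the architecture and is not settled by this sketch.

```python
MASK64 = (1 << 64) - 1  # keep results within 64 bits


def relocate(symbol: int, link_base: int, load_base: int) -> int:
    """Apply the (load - link) delta to a link-time absolute address."""
    return (symbol + (load_base - link_base)) & MASK64


# Hypothetical numbers: an image linked at 2 MiB but loaded 16 MiB higher.
LINK_BASE = 0x200000
LOAD_BASE = 0x1200000

print(hex(relocate(0x8D1000, LINK_BASE, LOAD_BASE)))  # 0x18d1000
```

When the image is loaded where it was linked, the delta is zero and the symbol is returned unchanged, which is why the issue stays hidden with relocation turned off.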
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 19:04 ` Jeremy Fitzhardinge
@ 2008-06-20 19:21 ` H. Peter Anvin
2008-06-20 19:43 ` Eric W. Biederman
1 sibling, 0 replies; 108+ messages in thread
From: H. Peter Anvin @ 2008-06-20 19:21 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Christoph Lameter, Mike Travis, Linux Kernel Mailing List, Eric W. Biederman

Jeremy Fitzhardinge wrote:
>>
>> It's loaded at __per_cpu_load but the symbols have addresses starting
>> at 0.
>
> Yes, which leads to an odd-looking ELF file where the Phdrs aren't
> sorted by virtual address order. I'm wondering what would happen if a
> bootloader that actually understood ELF files tried to load it as an
> actual ELF file...
>

If it is implemented correctly, it will work. It might trigger bugs in
such loaders, however.

>> Hmmm.... we could add the relocation offset to __per_cpu_load?
>> __per_cpu_load is used very sparingly. Basically only useful during
>> early boot and when a new per cpu area has to be set up. In that case
>> we want to copy from __per_cpu_load to the newly allocated percpu area.
>
> Yes, it should be fairly easy to manually relocate it by applying the
> (load - link) offset to it.

Seems easy enough, and as already stated, this is not
performance-critical so a few extra instructions are pretty much a
non-issue.

-hpa
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 19:04 ` Jeremy Fitzhardinge
2008-06-20 19:21 ` H. Peter Anvin
@ 2008-06-20 19:43 ` Eric W. Biederman
2008-06-20 20:04 ` Mike Travis
1 sibling, 1 reply; 108+ messages in thread
From: Eric W. Biederman @ 2008-06-20 19:43 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Christoph Lameter, Mike Travis, Linux Kernel Mailing List, H. Peter Anvin

Jeremy Fitzhardinge <jeremy@goop.org> writes:

> Christoph Lameter wrote:
>> On Fri, 20 Jun 2008, Jeremy Fitzhardinge wrote:
>>
>>
>>>> The loader setup for the percpu section changes with zero basing. Maybe that
>>>> has bad side effects
>>>>
>>> How does it work? The symbols in the percpu segment are 0-based, but where
>>> does the data for the sections which correspond to that segment go?
>>>
>>
>> It's loaded at __per_cpu_load but the symbols have addresses starting at 0.
>>
>
> Yes, which leads to an odd-looking ELF file where the Phdrs aren't sorted by
> virtual address order. I'm wondering what would happen if a bootloader that
> actually understood ELF files tried to load it as an actual ELF
> file...

Well /sbin/kexec looks at the physical addresses not the virtual ones
so that may not be a problem.

>>> So the question is what kernel virtual address is it being loaded to?
>>> __per_cpu_load is ffffffff808d1000, so ffffffff808d6000 is what you'd
>>> expect...
>>>
>>
>> Correct.
>>
>
> Well, reading back from that address got zeros, so something is
> amiss.

Weird.

>>> Hm, but what happens when this gets converted to bzImage? Hm, looks OK, I
>>> think.
>>>
>>> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
>>> kernel, being an absolute symbol. But I have relocation off at the moment.
>>>
>>
>> Hmmm.... we could add the relocation offset to __per_cpu_load? __per_cpu_load
>> is used very sparingly. Basically only useful during early boot and when a new
>> per cpu area has to be set up. In that case we want to copy from __per_cpu_load
>> to the newly allocated percpu area.
>>
>
> Yes, it should be fairly easy to manually relocate it by applying the (load -
> link) offset to it.

For x86_64 all kernels are built relocatable as the only cost was
changing the physical addresses in the initial page tables. The
virtual addresses always remain the same but the physical addresses
change. So that could be part of what is going on.

Is this a change that only got tested on x86_32?

As long as we are not changing the way the kernel virtual addresses
are actually being used we should be ok with a change to make the pda
0 based. Still it is an area you need to be especially careful with.

Eric
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 19:43 ` Eric W. Biederman
@ 2008-06-20 20:04 ` Mike Travis
2008-06-20 20:37 ` Christoph Lameter
0 siblings, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-06-20 20:04 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, H. Peter Anvin

Eric W. Biederman wrote:
> Jeremy Fitzhardinge <jeremy@goop.org> writes:
>
>> Christoph Lameter wrote:
>>> On Fri, 20 Jun 2008, Jeremy Fitzhardinge wrote:
>>>
>>>
>>>>> The loader setup for the percpu section changes with zero basing. Maybe that
>>>>> has bad side effects
>>>>>
>>>> How does it work? The symbols in the percpu segment are 0-based, but where
>>>> does the data for the sections which correspond to that segment go?
>>>>
>>> It's loaded at __per_cpu_load but the symbols have addresses starting at 0.
>>>
>> Yes, which leads to an odd-looking ELF file where the Phdrs aren't sorted by
>> virtual address order. I'm wondering what would happen if a bootloader that
>> actually understood ELF files tried to load it as an actual ELF
>> file...
>
> Well /sbin/kexec looks at the physical addresses not the virtual ones
> so that may not be a problem.
>
>>>> So the question is what kernel virtual address is it being loaded to?
>>>> __per_cpu_load is ffffffff808d1000, so ffffffff808d6000 is what you'd
>>>> expect...
>>>>
>>> Correct.
>>>
>> Well, reading back from that address got zeros, so something is
>> amiss.
>
> Weird.
>
>>>> Hm, but what happens when this gets converted to bzImage? Hm, looks OK, I
>>>> think.
>>>>
>>>> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
>>>> kernel, being an absolute symbol. But I have relocation off at the moment.
>>>>
>>> Hmmm.... we could add the relocation offset to __per_cpu_load? __per_cpu_load
>>> is used very sparingly. Basically only useful during early boot and when a new
>>> per cpu area has to be set up. In that case we want to copy from __per_cpu_load
>>> to the newly allocated percpu area.
>>>
>> Yes, it should be fairly easy to manually relocate it by applying the (load -
>> link) offset to it.
>
> For x86_64 all kernels are built relocatable as the only cost was
> changing the physical addresses in the initial page tables. The
> virtual addresses always remain the same but the physical addresses
> change. So that could be part of what is going on.
>
> Is this a change that only got tested on x86_32?

I'm only testing this on x86_64. The zero-based percpu/pda changes
worked fine up until just recently. At first it was one of Ingo's
"randconfig" config files that was tripping it up, but lately it's not
working on any config.

>
> As long as we are not changing the way the kernel virtual addresses
> are actually being used we should be ok with a change to make the pda
> 0 based. Still it is an area you need to be especially careful with.

The major gotchas seem to be in referencing the per_cpu symbol
directly, though I've examined them all and nothing seems amiss.

>
>
> Eric

Thanks,
Mike
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 20:04 ` Mike Travis
@ 2008-06-20 20:37 ` Christoph Lameter
0 siblings, 0 replies; 108+ messages in thread
From: Christoph Lameter @ 2008-06-20 20:37 UTC (permalink / raw)
To: Mike Travis
Cc: Eric W. Biederman, Jeremy Fitzhardinge, Linux Kernel Mailing List, H. Peter Anvin

On Fri, 20 Jun 2008, Mike Travis wrote:

> > Is this a change that only got tested on x86_32?
>
> I'm only testing this on x86_64. The zero-based percpu/pda changes worked fine
> up until just recently. At first it was one of Ingo's "randconfig" config files
> that was tripping it up, but lately it's not working on any config.

x86_32 does not need the zero basing since it does not have a pda.
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 18:37 ` Jeremy Fitzhardinge
2008-06-20 18:51 ` Christoph Lameter
@ 2008-06-20 19:06 ` Mike Travis
2008-06-20 20:25 ` Eric W. Biederman
1 sibling, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-06-20 19:06 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Christoph Lameter, Linux Kernel Mailing List, H. Peter Anvin, Eric W. Biederman

Jeremy Fitzhardinge wrote:
>
>
> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
> kernel, being an absolute symbol. But I have relocation off at the moment.
>
...

Here's where it's defined (in include/asm-generic/vmlinux.lds.h):

#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
#define PERCPU(align) \
        . = ALIGN(align); \
        percpu : { } :percpu \
        __per_cpu_load = .; \
        .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
                *(.data.percpu.first) \
                *(.data.percpu.shared_aligned) \
                *(.data.percpu) \
                *(.data.percpu.page_aligned) \
                ____per_cpu_size = .; \
        } \
        . = __per_cpu_load + ____per_cpu_size; \
        data : { } :data
#else

Can we generate a new symbol which would account for LOAD_OFFSET?
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 19:06 ` Mike Travis
@ 2008-06-20 20:25 ` Eric W. Biederman
2008-06-20 20:55 ` Christoph Lameter
` (2 more replies)
0 siblings, 3 replies; 108+ messages in thread
From: Eric W. Biederman @ 2008-06-20 20:25 UTC (permalink / raw)
To: Mike Travis
Cc: Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, H. Peter Anvin, Eric W. Biederman

Mike Travis <travis@sgi.com> writes:

> Jeremy Fitzhardinge wrote:
>>
>>
>> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
>> kernel, being an absolute symbol. But I have relocation off at the moment.
>>
> ...
> Here's where it's defined (in include/asm-generic/vmlinux.lds.h):
>
> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
> #define PERCPU(align) \
> . = ALIGN(align); \
> percpu : { } :percpu \
> __per_cpu_load = .; \
> .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
> *(.data.percpu.first) \
> *(.data.percpu.shared_aligned) \
> *(.data.percpu) \
> *(.data.percpu.page_aligned) \
> ____per_cpu_size = .; \
> } \
> . = __per_cpu_load + ____per_cpu_size; \
> data : { } :data
> #else
>
> Can we generate a new symbol which would account for LOAD_OFFSET?

Ouch. Absolute symbols indeed. On the 32bit kernel that may play havoc
with the relocatable kernel, although we have had similar absolute logic
for the last year with __per_cpu_start and __per_cpu_end, so it may
not be a problem.

To initialize the percpu data you do want to talk to the virtual address
at __per_cpu_load. But it is absolute. Ugh.

It might be worth saying something like:

.data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) {
        DATA(0)
        . = ALIGN(align);
        __per_cpu_load = . ;
}

to make __per_cpu_load a relative symbol. ld has a bad habit of taking
symbols out of empty sections and making them absolute. Which is why
I added the DATA(0).

Still I don't think that would be the 64bit problem.

Eric
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 20:25 ` Eric W. Biederman
@ 2008-06-20 20:55 ` Christoph Lameter
2008-06-23 16:55 ` Mike Travis
2008-06-30 17:07 ` Mike Travis
2 siblings, 0 replies; 108+ messages in thread
From: Christoph Lameter @ 2008-06-20 20:55 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Mike Travis, Jeremy Fitzhardinge, Linux Kernel Mailing List, H. Peter Anvin

On Fri, 20 Jun 2008, Eric W. Biederman wrote:

> at __per_cpu_load. But it is absolute. Ugh.
>
> It might be worth saying something like:
> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) {
> DATA(0)
> . = ALIGN(align);
> __per_cpu_load = . ;
> }
> to make __per_cpu_load a relative symbol. ld has a bad habit of taking
> symbols out of empty sections and making them absolute. Which is why
> I added the DATA(0).
>
> Still I don't think that would be the 64bit problem.

Ahh.. Good idea. I had a long fight with the loader before it did the
right thing.
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 20:25 ` Eric W. Biederman
2008-06-20 20:55 ` Christoph Lameter
@ 2008-06-23 16:55 ` Mike Travis
2008-06-23 17:33 ` Jeremy Fitzhardinge
2008-06-30 17:07 ` Mike Travis
2 siblings, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-06-23 16:55 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, H. Peter Anvin

Eric W. Biederman wrote:
> Mike Travis <travis@sgi.com> writes:
>
>> Jeremy Fitzhardinge wrote:
>>>
>>> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
>>> kernel, being an absolute symbol. But I have relocation off at the moment.
>>>
>> ...
>> Here's where it's defined (in include/asm-generic/vmlinux.lds.h):
>>
>> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
>> #define PERCPU(align) \
>> . = ALIGN(align); \
>> percpu : { } :percpu \
>> __per_cpu_load = .; \
>> .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
>> *(.data.percpu.first) \
>> *(.data.percpu.shared_aligned) \
>> *(.data.percpu) \
>> *(.data.percpu.page_aligned) \
>> ____per_cpu_size = .; \
>> } \
>> . = __per_cpu_load + ____per_cpu_size; \
>> data : { } :data
>> #else
>>
>> Can we generate a new symbol which would account for LOAD_OFFSET?
>
> Ouch. Absolute symbols indeed. On the 32bit kernel that may play havoc
> with the relocatable kernel, although we have had similar absolute logic
> for the last year with __per_cpu_start and __per_cpu_end, so it may
> not be a problem.
>
> To initialize the percpu data you do want to talk to the virtual address
> at __per_cpu_load. But it is absolute. Ugh.
>
> It might be worth saying something like:
> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) {
> DATA(0)
> . = ALIGN(align);
> __per_cpu_load = . ;
> }
> to make __per_cpu_load a relative symbol. ld has a bad habit of taking
> symbols out of empty sections and making them absolute. Which is why
> I added the DATA(0).
>
> Still I don't think that would be the 64bit problem.
>
> Eric

I'm not sure I understand the linker lingo enough to fill in the rest
of the blanks... I've tried various versions around this framework and
none have been accepted yet.

#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
#define PERCPU(align) \
        .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) { \
                DATA(0) \
                . = ALIGN(align); \
                __per_cpu_load = .; \
                *(.data.percpu.first) \
                *(.data.percpu.shared_aligned) \
                *(.data.percpu) \
                *(.data.percpu.page_aligned) \
                ____per_cpu_size = . - __per_cpu_load \
        } \
#else

Thanks,
Mike
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-23 16:55 ` Mike Travis
@ 2008-06-23 17:33 ` Jeremy Fitzhardinge
2008-06-23 18:04 ` Mike Travis
0 siblings, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-23 17:33 UTC (permalink / raw)
To: Mike Travis
Cc: Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List, H. Peter Anvin

Mike Travis wrote:
> Eric W. Biederman wrote:
>
>> Mike Travis <travis@sgi.com> writes:
>>
>>
>>> Jeremy Fitzhardinge wrote:
>>>
>>>> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
>>>> kernel, being an absolute symbol. But I have relocation off at the moment.
>>>>
>>>>
>>> ...
>>> Here's where it's defined (in include/asm-generic/vmlinux.lds.h):
>>>
>>> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
>>> #define PERCPU(align) \
>>> . = ALIGN(align); \
>>> percpu : { } :percpu \
>>> __per_cpu_load = .; \
>>> .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
>>> *(.data.percpu.first) \
>>> *(.data.percpu.shared_aligned) \
>>> *(.data.percpu) \
>>> *(.data.percpu.page_aligned) \
>>> ____per_cpu_size = .; \
>>> } \
>>> . = __per_cpu_load + ____per_cpu_size; \
>>> data : { } :data
>>> #else
>>>
>>> Can we generate a new symbol which would account for LOAD_OFFSET?
>>>
>> Ouch. Absolute symbols indeed. On the 32bit kernel that may play havoc
>> with the relocatable kernel, although we have had similar absolute logic
>> for the last year with __per_cpu_start and __per_cpu_end, so it may
>> not be a problem.
>>
>> To initialize the percpu data you do want to talk to the virtual address
>> at __per_cpu_load. But it is absolute. Ugh.
>>
>> It might be worth saying something like:
>> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) {
>> DATA(0)
>> . = ALIGN(align);
>> __per_cpu_load = . ;
>> }
>> to make __per_cpu_load a relative symbol. ld has a bad habit of taking
>> symbols out of empty sections and making them absolute. Which is why
>> I added the DATA(0).
>>
>> Still I don't think that would be the 64bit problem.
>>
>> Eric
>>
>
> I'm not sure I understand the linker lingo enough to fill in the rest
> of the blanks... I've tried various versions around this framework and
> none have been accepted yet.
>
> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
> #define PERCPU(align) \
> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) { \
> DATA(0) \
> . = ALIGN(align); \
> __per_cpu_load = .; \
> *(.data.percpu.first) \
> *(.data.percpu.shared_aligned) \
> *(.data.percpu) \
> *(.data.percpu.page_aligned) \
> ____per_cpu_size = . - __per_cpu_load \
> } \
> #else
>

That looks OK to me. Does it work?

J
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-23 17:33 ` Jeremy Fitzhardinge
@ 2008-06-23 18:04 ` Mike Travis
2008-06-23 18:36 ` Mike Travis
0 siblings, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-06-23 18:04 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List, H. Peter Anvin

Jeremy Fitzhardinge wrote:
> Mike Travis wrote:
>> Eric W. Biederman wrote:
>>
>>> Mike Travis <travis@sgi.com> writes:
>>>
>>>
>>>> Jeremy Fitzhardinge wrote:
>>>>
>>>>> BTW, I think __per_cpu_load will cause trouble if you make a relocatable
>>>>> kernel, being an absolute symbol. But I have relocation off at the moment.
>>>>>
>>>>>
>>>> ...
>>>> Here's where it's defined (in include/asm-generic/vmlinux.lds.h):
>>>>
>>>> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
>>>> #define PERCPU(align) \
>>>> . = ALIGN(align); \
>>>> percpu : { } :percpu \
>>>> __per_cpu_load = .; \
>>>> .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
>>>> *(.data.percpu.first) \
>>>> *(.data.percpu.shared_aligned) \
>>>> *(.data.percpu) \
>>>> *(.data.percpu.page_aligned) \
>>>> ____per_cpu_size = .; \
>>>> } \
>>>> . = __per_cpu_load + ____per_cpu_size; \
>>>> data : { } :data
>>>> #else
>>>>
>>>> Can we generate a new symbol which would account for LOAD_OFFSET?
>>>>
>>> Ouch. Absolute symbols indeed. On the 32bit kernel that may play havoc
>>> with the relocatable kernel, although we have had similar absolute logic
>>> for the last year with __per_cpu_start and __per_cpu_end, so it may
>>> not be a problem.
>>>
>>> To initialize the percpu data you do want to talk to the virtual address
>>> at __per_cpu_load. But it is absolute. Ugh.
>>>
>>> It might be worth saying something like:
>>> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) {
>>> DATA(0)
>>> . = ALIGN(align);
>>> __per_cpu_load = . ;
>>> }
>>> to make __per_cpu_load a relative symbol. ld has a bad habit of taking
>>> symbols out of empty sections and making them absolute. Which is why
>>> I added the DATA(0).
>>>
>>> Still I don't think that would be the 64bit problem.
>>>
>>> Eric
>>>
>>
>> I'm not sure I understand the linker lingo enough to fill in the rest
>> of the blanks... I've tried various versions around this framework and
>> none have been accepted yet.
>>
>> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
>> #define PERCPU(align) \
>> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) { \
>> DATA(0) \
>> . = ALIGN(align); \
>> __per_cpu_load = .; \
>> *(.data.percpu.first) \
>> *(.data.percpu.shared_aligned) \
>> *(.data.percpu) \
>> *(.data.percpu.page_aligned) \
>> ____per_cpu_size = . - __per_cpu_load \
>> } \
>> #else
>>
>
> That looks OK to me. Does it work?
>
> J

Nope, fighting undefines and/or syntax errors in the linker.
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-23 18:04 ` Mike Travis
@ 2008-06-23 18:36 ` Mike Travis
2008-06-23 19:41 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-06-23 18:36 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List, H. Peter Anvin

>>>>
>>>> To initialize the percpu data you do want to talk to the virtual address
>>>> at __per_cpu_load. But it is absolute. Ugh.
>>>>
>>>> It might be worth saying something like:
>>>> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) {
>>>> DATA(0)
>>>> . = ALIGN(align);
>>>> __per_cpu_load = . ;
>>>> }
>>>> to make __per_cpu_load a relative symbol. ld has a bad habit of taking
>>>> symbols out of empty sections and making them absolute. Which is why
>>>> I added the DATA(0).

The syntax error is at this "DATA(0)" statement. I don't find this as a
linker script command or a macro. What is it we're trying to do with this?

Thanks,
Mike

...
>>> I'm not sure I understand the linker lingo enough to fill in the rest
>>> of the blanks... I've tried various versions around this framework and
>>> none have been accepted yet.
>>>
>>> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
>>> #define PERCPU(align) \
>>> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) { \
>>> DATA(0) \
>>> . = ALIGN(align); \
>>> __per_cpu_load = .; \
>>> *(.data.percpu.first) \
>>> *(.data.percpu.shared_aligned) \
>>> *(.data.percpu) \
>>> *(.data.percpu.page_aligned) \
>>> ____per_cpu_size = . - __per_cpu_load \
>>> } \
>>> #else
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-23 18:36 ` Mike Travis
@ 2008-06-23 19:41 ` Jeremy Fitzhardinge
2008-06-24 0:02 ` Mike Travis
0 siblings, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-23 19:41 UTC (permalink / raw)
To: Mike Travis
Cc: Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List, H. Peter Anvin

Mike Travis wrote:
> The syntax error is at this "DATA(0)" statement. I don't find this as a
> linker script command or a macro. What is it we're trying to do with this?
>

In Eric's sample, it's intended to prevent there being an empty
section, which can cause linker bugs. In your case it probably isn't
necessary, since you're also putting the percpu data in that section.

"DATA" is probably a typo. It should be "LONG" or something like that.
(See "3.6.5 Output Section Data" in the linker manual.)

J
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-23 19:41 ` Jeremy Fitzhardinge
@ 2008-06-24 0:02 ` Mike Travis
0 siblings, 0 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-24 0:02 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List,
H. Peter Anvin, Jack Steiner

Jeremy Fitzhardinge wrote:
> Mike Travis wrote:
>> The syntax error is at this "DATA(0)" statement. I don't find this as a
>> linker script command or a macro. What is it we're trying to do with
>> this?
>>
>
> In Eric's sample, it's intended to prevent there being an empty section,
> which can cause linker bugs. In your case it probably isn't necessary,
> since you're also putting the percpu data in that section.
>
> "DATA" is probably a typo. It should be "LONG" or something like that.
> (See "3.6.5 Output Section Data" in the linker manual.)
>
> J

Yes, thanks I did find that. I now have the version below which seems
to have what we need... but it hasn't had an effect on the boot startup
panic. I'm back to verifying that the assembler effective addresses are
correct in the loaded object.

ffffffff81911000 D __per_cpu_load
0000000000000000 D per_cpu__pda
0000000000000080 D per_cpu__init_tss
.
.
000000000000a2d0 d per_cpu__cookie_scratch
000000000000a470 d per_cpu__cookie_scratch
000000000000a604 D ____per_cpu_size

Btw, the "percpu : { } :percpu" below removes a linker warning about an
empty section.

#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
#define PERCPU(align) \
    .data.percpu.abs = .; \
    percpu : { } :percpu \
    .data.percpu.header : AT(.data.percpu.abs - LOAD_OFFSET) { \
        BYTE(0) \
        . = ALIGN(align); \
        __per_cpu_load = .; \
    } \
    .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
        *(.data.percpu.first) \
        *(.data.percpu.shared_aligned) \
        *(.data.percpu) \
        *(.data.percpu.page_aligned) \
        ____per_cpu_size = .; \
    } \
    . = __per_cpu_load + ____per_cpu_size;

^ permalink raw reply [flat|nested] 108+ messages in thread
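[Editorial sketch, not part of the original message.] The PERCPU() macro
above gives the per-cpu section two addresses: a link-time address (VMA)
starting at 0, and a load address set by AT(__per_cpu_load - LOAD_OFFSET).
The arithmetic can be worked through with the values from the symbol
listing. This is only a toy model: the LOAD_OFFSET value is an assumption
made for illustration (the thread never states it), and the function names
are hypothetical.

```python
# Toy model of the VMA/LMA split set up by the PERCPU() linker macro.
# LOAD_OFFSET is an ASSUMED value for illustration; it is not given in the thread.
LOAD_OFFSET = 0xFFFFFFFF80000000

# Values copied from the symbol listing in the message above.
PER_CPU_LOAD = 0xFFFFFFFF81911000   # __per_cpu_load
PER_CPU_SIZE = 0xA604               # ____per_cpu_size

# Zero-based link-time addresses (VMAs), also from the listing.
SYMBOLS = {
    "per_cpu__pda": 0x0,
    "per_cpu__init_tss": 0x80,
}


def initial_copy_vaddr(sym):
    """Kernel virtual address of the loaded (pristine) copy of a per-cpu symbol."""
    return PER_CPU_LOAD + SYMBOLS[sym]


def load_paddr(sym):
    """Load address implied by AT(__per_cpu_load - LOAD_OFFSET)."""
    return initial_copy_vaddr(sym) - LOAD_OFFSET


def location_counter_after_percpu():
    """Value of '.' after '. = __per_cpu_load + ____per_cpu_size;'."""
    return PER_CPU_LOAD + PER_CPU_SIZE


if __name__ == "__main__":
    for name, vma in SYMBOLS.items():
        print(f"{name}: vma={vma:#x} "
              f"loaded copy at {initial_copy_vaddr(name):#x} "
              f"lma={load_paddr(name):#x}")
    print(f"'.' resumes at {location_counter_after_percpu():#x}")
```

The point of the final assignment to '.' in the macro is visible in the
last function: without it, the location counter would continue from the
small zero-based value instead of from the kernel's normal address range.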
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-20 20:25 ` Eric W. Biederman
2008-06-20 20:55 ` Christoph Lameter
2008-06-23 16:55 ` Mike Travis
@ 2008-06-30 17:07 ` Mike Travis
2008-06-30 17:18 ` H. Peter Anvin
2008-06-30 17:43 ` Jeremy Fitzhardinge
2 siblings, 2 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-30 17:07 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List,
H. Peter Anvin

Eric W. Biederman wrote:
> Mike Travis <travis@sgi.com> writes:
...
>> Can we generate a new symbol which would account for LOAD_OFFSET?
>
> Ouch. Absolute symbols indeed. On the 32bit kernel that may play havoc
> with the relocatable kernel, although we have had similar absolute logic
> for the last year. With __per_cpu_start and __per_cpu_end so it may
> not be a problem.
>
> To initialize the percpu data you do want to talk to the virtual address
> at __per_cpu_load. But it is absolute. Ugh.
>
> It might be worth saying something like.
>     .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) {
>         DATA(0)
>         . = ALIGN(align);
>         __per_cpu_load = . ;
>     }
> To make __per_cpu_load a relative symbol. ld has a bad habit of taking
> symbols out of empty sections and making them absolute. Which is why
> I added the DATA(0).
>
> Still I don't think that would be the 64bit problem.
>
> Eric

FYI, I did try this out and it caused the bootloader to scramble the
loaded data. The first corruption I found was the .x86cpuvendor.init
section contained all zeroes.

Mike

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-30 17:07 ` Mike Travis @ 2008-06-30 17:18 ` H. Peter Anvin 2008-06-30 17:57 ` Mike Travis 2008-06-30 17:43 ` Jeremy Fitzhardinge 1 sibling, 1 reply; 108+ messages in thread From: H. Peter Anvin @ 2008-06-30 17:18 UTC (permalink / raw) To: Mike Travis Cc: Eric W. Biederman, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List Mike Travis wrote: > > FYI, I did try this out and it caused the bootloader to scramble the > loaded data. The first corruption I found was the .x86cpuvendor.init > section contained all zeroes. > Explain what you mean with "the bootloader" in this context. -hpa ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-30 17:18 ` H. Peter Anvin
@ 2008-06-30 17:57 ` Mike Travis
2008-06-30 20:50 ` Eric W. Biederman
0 siblings, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-06-30 17:57 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Eric W. Biederman, Jeremy Fitzhardinge, Christoph Lameter,
Linux Kernel Mailing List

H. Peter Anvin wrote:
> Mike Travis wrote:
>>
>> FYI, I did try this out and it caused the bootloader to scramble the
>> loaded data. The first corruption I found was the .x86cpuvendor.init
>> section contained all zeroes.
>>
>
> Explain what you mean with "the bootloader" in this context.
>
> -hpa

After the code was loaded (the compressed code, it seems that my GRUB
doesn't support uncompressed loading), the above section contained
zeroes. I snapped it fairly early, around secondary_startup_64, and
then printed it in x86_64_start_kernel.

The object file had the correct data (as displayed by objdump) so I'm
assuming that the bootloading process didn't load the section correctly.

Below was the linker script I used:

--- linux-2.6.tip.orig/include/asm-generic/vmlinux.lds.h
+++ linux-2.6.tip/include/asm-generic/vmlinux.lds.h
@@ -373,9 +373,13 @@
 
 #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
 #define PERCPU(align) \
-    . = ALIGN(align); \
+    .data.percpu.abs = .; \
     percpu : { } :percpu \
-    __per_cpu_load = .; \
+    .data.percpu.rel : AT(.data.percpu.abs - LOAD_OFFSET) { \
+        BYTE(0) \
+        . = ALIGN(align); \
+        __per_cpu_load = .; \
+    } \
     .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
         *(.data.percpu.first) \
         *(.data.percpu.shared_aligned) \
@@ -383,8 +387,8 @@
         *(.data.percpu.page_aligned) \
         ____per_cpu_size = .; \
     } \
-    . = __per_cpu_load + ____per_cpu_size; \
-    data : { } :data
+    . = __per_cpu_load + ____per_cpu_size;
+
 #else
 #define PERCPU(align) \
     . = ALIGN(align); \

It showed all the correct addresses in the map and __per_cpu_load was a
relative symbol (which was the objective.)

Btw, our simulator, which only loads uncompressed code, had the data correct,
so it *may* only be a result of the code being compressed.

Thanks,
Mike

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-30 17:57 ` Mike Travis
@ 2008-06-30 20:50 ` Eric W. Biederman
2008-06-30 21:08 ` Jeremy Fitzhardinge
2008-07-01 11:49 ` Mike Travis
0 siblings, 2 replies; 108+ messages in thread
From: Eric W. Biederman @ 2008-06-30 20:50 UTC (permalink / raw)
To: Mike Travis
Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter,
Linux Kernel Mailing List

Mike Travis <travis@sgi.com> writes:

> H. Peter Anvin wrote:
>> Mike Travis wrote:
>>>
>>> FYI, I did try this out and it caused the bootloader to scramble the
>>> loaded data. The first corruption I found was the .x86cpuvendor.init
>>> section contained all zeroes.
>>>
>>
>> Explain what you mean with "the bootloader" in this context.
>>
>> -hpa
>
>
> After the code was loaded (the compressed code, it seems that my GRUB
> doesn't support uncompressed loading), the above section contained
> zeroes. I snapped it fairly early, around secondary_startup_64, and
> then printed it in x86_64_start_kernel.
>
> The object file had the correct data (as displayed by objdump) so I'm
> assuming that the bootloading process didn't load the section correctly.
>
> Below was the linker script I used:
>
> --- linux-2.6.tip.orig/include/asm-generic/vmlinux.lds.h
> +++ linux-2.6.tip/include/asm-generic/vmlinux.lds.h
> @@ -373,9 +373,13 @@
>
>  #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
>  #define PERCPU(align) \
> -    . = ALIGN(align); \
> +    .data.percpu.abs = .; \
>      percpu : { } :percpu \
> -    __per_cpu_load = .; \
> +    .data.percpu.rel : AT(.data.percpu.abs - LOAD_OFFSET) { \
> +        BYTE(0) \
> +        . = ALIGN(align); \
> +        __per_cpu_load = .; \
> +    } \
>      .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
>          *(.data.percpu.first) \
>          *(.data.percpu.shared_aligned) \
> @@ -383,8 +387,8 @@
>          *(.data.percpu.page_aligned) \
>          ____per_cpu_size = .; \
>      } \
> -    . = __per_cpu_load + ____per_cpu_size; \
> -    data : { } :data
> +    . = __per_cpu_load + ____per_cpu_size;
> +
>  #else
>  #define PERCPU(align) \
>      . = ALIGN(align); \
>
> It showed all the correct address in the map and __per_cpu_load was a
> relative symbol (which was the objective.)
>
> Btw, our simulator, which only loads uncompressed code, had the data correct,
> so it *may* only be a result of the code being compressed.

Weird. Grub doesn't get involved in the decompression; the kernel does it
all itself, so we should be able to track where things go bad.

Last I looked the compressed code was formed essentially by:
    objcopy vmlinux -O binary vmlinux.bin
    gzip vmlinux.bin
And then we tack a magic header onto the gzip compressed file.

Are things only bad with the change above?

Eric

^ permalink raw reply [flat|nested] 108+ messages in thread
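[Editorial sketch, not part of the original message.] A raw binary image
of the kind described above carries no section headers, so the kernel
finds its data purely by position. The toy model below shows how a piece
of initialized data that ends up at the wrong position reads back as
zeroes at the place where it was expected, which is the symptom Mike
reports. It is not objcopy's implementation and not a diagnosis of this
bug; the segment contents, addresses and function names are all made up
for illustration.

```python
# Toy model of flattening segments into one raw image by load address.
# NOT objcopy's algorithm; addresses and contents are invented.

def flatten(segments):
    """segments: list of (load_addr, data). Returns (base, image).

    The image starts at the lowest load address. Any byte that no
    segment covers is left as zero.
    """
    base = min(addr for addr, _ in segments)
    end = max(addr + len(data) for addr, data in segments)
    image = bytearray(end - base)
    for addr, data in segments:
        image[addr - base:addr - base + len(data)] = data
    return base, bytes(image)


def read_at(base, image, addr, length):
    """Bytes a consumer sees when it reads `length` bytes at load address `addr`."""
    return image[addr - base:addr - base + length]


if __name__ == "__main__":
    text = (0x200000, b"TEXT")
    good = [text, (0x200010, b"INIT")]      # data where the reader expects it
    shifted = [text, (0x200020, b"INIT")]   # same data, placed 0x10 too far along
    for name, segs in (("good", good), ("shifted", shifted)):
        base, image = flatten(segs)
        print(name, read_at(base, image, 0x200010, 4))
```

In the "shifted" case the data is still present in the image, just not at
the address the reader uses, so checking vmlinux.bin at both the expected
offset and nearby offsets would distinguish "zeroed" from "misplaced".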
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-30 20:50 ` Eric W. Biederman @ 2008-06-30 21:08 ` Jeremy Fitzhardinge 2008-07-01 8:40 ` Eric W. Biederman 2008-07-01 12:09 ` Mike Travis 2008-07-01 11:49 ` Mike Travis 1 sibling, 2 replies; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-06-30 21:08 UTC (permalink / raw) To: Eric W. Biederman Cc: Mike Travis, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List Eric W. Biederman wrote: > Mike Travis <travis@sgi.com> writes: > > >> H. Peter Anvin wrote: >> >>> Mike Travis wrote: >>> >>>> FYI, I did try this out and it caused the bootloader to scramble the >>>> loaded data. The first corruption I found was the .x86cpuvendor.init >>>> section contained all zeroes. >>>> >>>> >>> Explain what you mean with "the bootloader" in this context. >>> >>> -hpa >>> >> After the code was loaded (the compressed code, it seems that my GRUB >> doesn't support uncompressed loading), the above section contained >> zeroes. I snapped it fairly early, around secondary_startup_64, and >> then printed it in x86_64_start_kernel. >> >> The object file had the correct data (as displayed by objdump) so I'm >> assuming that the bootloading process didn't load the section correctly. >> >> Below was the linker script I used: >> >> --- linux-2.6.tip.orig/include/asm-generic/vmlinux.lds.h >> +++ linux-2.6.tip/include/asm-generic/vmlinux.lds.h >> @@ -373,9 +373,13 @@ >> >> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU >> #define PERCPU(align) \ >> - . = ALIGN(align); \ >> + .data.percpu.abs = .; \ >> percpu : { } :percpu \ >> - __per_cpu_load = .; \ >> + .data.percpu.rel : AT(.data.percpu.abs - LOAD_OFFSET) { \ >> + BYTE(0) \ >> + . = ALIGN(align); \ >> + __per_cpu_load = .; \ >> + } \ >> .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \ >> *(.data.percpu.first) \ >> *(.data.percpu.shared_aligned) \ >> @@ -383,8 +387,8 @@ >> *(.data.percpu.page_aligned) \ >> ____per_cpu_size = .; \ >> } \ >> - . 
= __per_cpu_load + ____per_cpu_size; \ >> - data : { } :data >> + . = __per_cpu_load + ____per_cpu_size; >> + >> #else >> #define PERCPU(align) \ >> . = ALIGN(align); \ >> >> It showed all the correct address in the map and __per_cpu_load was a >> relative symbol (which was the objective.) >> >> Btw, our simulator, which only loads uncompressed code, had the data correct, >> so it *may* only be a result of the code being compressed. >> > > Weird. Grub doesn't get involved in the decompression the kernel does it > all itself so we should be able to track where things go bad. > > Last I looked the compressed code was formed by essentially. > objcopy vmlinux -O binary vmlinux.bin > gzip vmlinux.bin > And then we take on a magic header to the gzip compressed file. > > Are things only bad with the change above? No, the original crash being discussed was a GP fault in head_64.S as it tries to initialize the kernel segments. The cause was that the prototype GDT is all zero, even though it's an initialized variable, and inspection of vmlinux shows that it has the right contents. But somehow it's either 1) getting zeroed on load, or 2) is loaded to the wrong place. The zero-based PDA mechanism requires the introduction of a new ELF segment based at vaddr 0 which is sufficiently unusual that it wouldn't surprise me if its triggering some toolchain bug. Mike: what would happen if the PDA were based at 4k rather than 0? The stack canary would still be at its small offset (0x20?), but it doesn't need to be initialized. I'm not sure if doing so would fix anything, however. J ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-30 21:08 ` Jeremy Fitzhardinge @ 2008-07-01 8:40 ` Eric W. Biederman 2008-07-01 16:27 ` Jeremy Fitzhardinge 2008-07-01 16:56 ` H. Peter Anvin 2008-07-01 12:09 ` Mike Travis 1 sibling, 2 replies; 108+ messages in thread From: Eric W. Biederman @ 2008-07-01 8:40 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Mike Travis, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List Jeremy Fitzhardinge <jeremy@goop.org> writes: > No, the original crash being discussed was a GP fault in head_64.S as it tries > to initialize the kernel segments. The cause was that the prototype GDT is all > zero, even though it's an initialized variable, and inspection of vmlinux shows > that it has the right contents. But somehow it's either 1) getting zeroed on > load, or 2) is loaded to the wrong place. > > The zero-based PDA mechanism requires the introduction of a new ELF segment > based at vaddr 0 which is sufficiently unusual that it wouldn't surprise me if > its triggering some toolchain bug. Agreed. Given the previous description my hunch is that the bug is occurring during objcopy. If vmlinux is good and the compressed kernel is bad. It should be possible to look at vmlinux.bin and see if that was generated properly. > Mike: what would happen if the PDA were based at 4k rather than 0? The stack > canary would still be at its small offset (0x20?), but it doesn't need to be > initialized. I'm not sure if doing so would fix anything, however. I'm dense today. Why are we doing a zero based pda? That seems the most likely culprit of linker trouble, and we should be able to put a smaller offset in the segment register to allow for everything to work as expected. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 8:40 ` Eric W. Biederman @ 2008-07-01 16:27 ` Jeremy Fitzhardinge 2008-07-01 16:55 ` Mike Travis 2008-07-01 16:56 ` H. Peter Anvin 1 sibling, 1 reply; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-07-01 16:27 UTC (permalink / raw) To: Eric W. Biederman Cc: Mike Travis, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List Eric W. Biederman wrote: > Jeremy Fitzhardinge <jeremy@goop.org> writes: > > >> No, the original crash being discussed was a GP fault in head_64.S as it tries >> to initialize the kernel segments. The cause was that the prototype GDT is all >> zero, even though it's an initialized variable, and inspection of vmlinux shows >> that it has the right contents. But somehow it's either 1) getting zeroed on >> load, or 2) is loaded to the wrong place. >> >> The zero-based PDA mechanism requires the introduction of a new ELF segment >> based at vaddr 0 which is sufficiently unusual that it wouldn't surprise me if >> its triggering some toolchain bug. >> > > Agreed. Given the previous description my hunch is that the bug is occurring > during objcopy. If vmlinux is good and the compressed kernel is bad. > > It should be possible to look at vmlinux.bin and see if that was generated > properly. > > >> Mike: what would happen if the PDA were based at 4k rather than 0? The stack >> canary would still be at its small offset (0x20?), but it doesn't need to be >> initialized. I'm not sure if doing so would fix anything, however. >> > > I'm dense today. Why are we doing a zero based pda? That seems the most > likely culprit of linker trouble, and we should be able to put a smaller > offset in the segment register to allow for everything to work as expected. 
> The only reason we need to do a zero-based PDA is because of the boneheaded gcc/x86_64 ABI decision to put the stack canary at a fixed offset from %gs (all they had to do was define it as a weak symbol we could override). If we want to support stack-protector and unify the handling of per-cpu variables, we need to rebase the per-cpu area at zero, starting with the PDA. My own inclination would be to drop stack-protector support until gcc gets fixed, rather than letting it prevent us from unifying an area which is in need of unification... J ^ permalink raw reply [flat|nested] 108+ messages in thread
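[Editorial sketch, not part of the original message.] The constraint
Jeremy describes can be put as arithmetic. If per-cpu symbols are linked
starting at some address S, then "%gs:symbol" reaches a CPU's copy only
when the %gs base is (area - S). The compiler-emitted canary access uses
a fixed displacement of 40, so it lands inside the pda only when S is 0
and the pda is the first object in the area. The toy model below assumes
the pda occupies the first 0x80 bytes (inferred from per_cpu__init_tss
sitting at 0x80 in the earlier listing); the addresses and function names
are hypothetical.

```python
# Toy model of why a fixed %gs:40 canary forces a zero-based per-cpu area.
CANARY_DISP = 40   # fixed displacement used by the stack-protector code
PDA_SIZE = 0x80    # ASSUMED: pda spans the first 0x80 bytes of the area


def gs_base_for(area, link_start):
    """%gs base that makes '%gs:symbol' hit this CPU's copy of `symbol`
    when per-cpu symbols are linked starting at `link_start`."""
    return area - link_start


def resolve(gs_base, disp):
    """Effective address of a '%gs:disp' access."""
    return gs_base + disp


def canary_hits_pda(area, link_start):
    """True if %gs:40 falls inside the pda at the start of `area`."""
    canary = resolve(gs_base_for(area, link_start), CANARY_DISP)
    return area <= canary < area + PDA_SIZE


if __name__ == "__main__":
    area = 0x70000000                      # made-up address of one CPU's area
    for start in (0x0, 0x1000000):         # zero-based vs. made-up nonzero base
        print(f"link_start={start:#x}: canary in pda = "
              f"{canary_hits_pda(area, start)}")
```

In both layouts an ordinary symbol access still works, since the symbol's
link address includes link_start and the base subtracts it again. Only
the hard-coded displacement breaks when link_start is nonzero.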
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 16:27 ` Jeremy Fitzhardinge @ 2008-07-01 16:55 ` Mike Travis 0 siblings, 0 replies; 108+ messages in thread From: Mike Travis @ 2008-07-01 16:55 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Eric W. Biederman, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List Jeremy Fitzhardinge wrote: > Eric W. Biederman wrote: >> Jeremy Fitzhardinge <jeremy@goop.org> writes: >> >> >>> No, the original crash being discussed was a GP fault in head_64.S as >>> it tries >>> to initialize the kernel segments. The cause was that the prototype >>> GDT is all >>> zero, even though it's an initialized variable, and inspection of >>> vmlinux shows >>> that it has the right contents. But somehow it's either 1) getting >>> zeroed on >>> load, or 2) is loaded to the wrong place. >>> >>> The zero-based PDA mechanism requires the introduction of a new ELF >>> segment >>> based at vaddr 0 which is sufficiently unusual that it wouldn't >>> surprise me if >>> its triggering some toolchain bug. >>> >> >> Agreed. Given the previous description my hunch is that the bug is >> occurring >> during objcopy. If vmlinux is good and the compressed kernel is bad. >> >> It should be possible to look at vmlinux.bin and see if that was >> generated >> properly. >> >> >>> Mike: what would happen if the PDA were based at 4k rather than 0? >>> The stack >>> canary would still be at its small offset (0x20?), but it doesn't >>> need to be >>> initialized. I'm not sure if doing so would fix anything, however. >>> >> >> I'm dense today. Why are we doing a zero based pda? That seems the most >> likely culprit of linker trouble, and we should be able to put a smaller >> offset in the segment register to allow for everything to work as >> expected. 
>> > > The only reason we need to do a zero-based PDA is because of the > boneheaded gcc/x86_64 ABI decision to put the stack canary at a fixed > offset from %gs (all they had to do was define it as a weak symbol we > could override). If we want to support stack-protector and unify the > handling of per-cpu variables, we need to rebase the per-cpu area at > zero, starting with the PDA. > > My own inclination would be to drop stack-protector support until gcc > gets fixed, rather than letting it prevent us from unifying an area > which is in need of unification... > > J I might be inclined to agree except most of the past few months of finding problems caused by NR_CPUS=4096 has been stack overflow. So any help detecting this condition is very useful. I can get static stacksizes (of course), but there's not a lot of help determining call chains except via actually executing the code. Thanks, Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 8:40 ` Eric W. Biederman 2008-07-01 16:27 ` Jeremy Fitzhardinge @ 2008-07-01 16:56 ` H. Peter Anvin 2008-07-01 17:26 ` Jeremy Fitzhardinge 2008-07-01 18:41 ` Eric W. Biederman 1 sibling, 2 replies; 108+ messages in thread From: H. Peter Anvin @ 2008-07-01 16:56 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, Mike Travis, Christoph Lameter, Linux Kernel Mailing List Eric W. Biederman wrote: >> >> The zero-based PDA mechanism requires the introduction of a new ELF segment >> based at vaddr 0 which is sufficiently unusual that it wouldn't surprise me if >> its triggering some toolchain bug. > > Agreed. Given the previous description my hunch is that the bug is occurring > during objcopy. If vmlinux is good and the compressed kernel is bad. > Actually, it's not all that unusual... it's pretty common in various restricted environments. That being said, it's probably uncommon for *64-bit* code. -hpa ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 16:56 ` H. Peter Anvin @ 2008-07-01 17:26 ` Jeremy Fitzhardinge 2008-07-01 20:40 ` Eric W. Biederman 2008-07-01 18:41 ` Eric W. Biederman 1 sibling, 1 reply; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-07-01 17:26 UTC (permalink / raw) To: H. Peter Anvin Cc: Eric W. Biederman, Mike Travis, Christoph Lameter, Linux Kernel Mailing List H. Peter Anvin wrote: > Eric W. Biederman wrote: >>> >>> The zero-based PDA mechanism requires the introduction of a new ELF >>> segment >>> based at vaddr 0 which is sufficiently unusual that it wouldn't >>> surprise me if >>> its triggering some toolchain bug. >> >> Agreed. Given the previous description my hunch is that the bug is >> occurring >> during objcopy. If vmlinux is good and the compressed kernel is bad. >> > > Actually, it's not all that unusual... it's pretty common in various > restricted environments. That being said, it's probably uncommon for > *64-bit* code. Well, it's also unusual because 1) it's vaddr 0, but paddr <high>, and 2) the PHDRs are not sorted by vaddr order. 2) might actually be a bug. J ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 17:26 ` Jeremy Fitzhardinge @ 2008-07-01 20:40 ` Eric W. Biederman 2008-07-01 21:10 ` Jeremy Fitzhardinge 2008-07-01 21:11 ` Andi Kleen 0 siblings, 2 replies; 108+ messages in thread From: Eric W. Biederman @ 2008-07-01 20:40 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: H. Peter Anvin, Mike Travis, Christoph Lameter, Linux Kernel Mailing List Jeremy Fitzhardinge <jeremy@goop.org> writes: > H. Peter Anvin wrote: >> Eric W. Biederman wrote: >>>> >>>> The zero-based PDA mechanism requires the introduction of a new ELF segment >>>> based at vaddr 0 which is sufficiently unusual that it wouldn't surprise me >>>> if >>>> its triggering some toolchain bug. >>> >>> Agreed. Given the previous description my hunch is that the bug is occurring >>> during objcopy. If vmlinux is good and the compressed kernel is bad. >>> >> >> Actually, it's not all that unusual... it's pretty common in various >> restricted environments. That being said, it's probably uncommon for *64-bit* >> code. > > Well, it's also unusual because 1) it's vaddr 0, but paddr <high>, and 2) the > PHDRs are not sorted by vaddr order. 2) might actually be a bug. I just looked and gcc does not use this technique for thread local data. My initial concern about all of this was not making symbols section relative is relieved as this all appears to be a 64bit arch thing where that doesn't matter. Has anyone investigated using the technique gcc uses for thread local storage? http://people.redhat.com/drepper/tls.pdf In particular using the local exec model so we can say: movq %fs:x@tpoff,%rax To load the contents of a per cpu variable x into %rax ? If we can use that model it should make it easier to interface with things like the stack protector code. Although we would still need to be very careful about thread switches. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 20:40 ` Eric W. Biederman @ 2008-07-01 21:10 ` Jeremy Fitzhardinge 2008-07-01 21:39 ` Eric W. Biederman 2008-07-01 21:11 ` Andi Kleen 1 sibling, 1 reply; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-07-01 21:10 UTC (permalink / raw) To: Eric W. Biederman Cc: H. Peter Anvin, Mike Travis, Christoph Lameter, Linux Kernel Mailing List Eric W. Biederman wrote: > Jeremy Fitzhardinge <jeremy@goop.org> writes: > > >> H. Peter Anvin wrote: >> >>> Eric W. Biederman wrote: >>> >>>>> The zero-based PDA mechanism requires the introduction of a new ELF segment >>>>> based at vaddr 0 which is sufficiently unusual that it wouldn't surprise me >>>>> if >>>>> its triggering some toolchain bug. >>>>> >>>> Agreed. Given the previous description my hunch is that the bug is occurring >>>> during objcopy. If vmlinux is good and the compressed kernel is bad. >>>> >>>> >>> Actually, it's not all that unusual... it's pretty common in various >>> restricted environments. That being said, it's probably uncommon for *64-bit* >>> code. >>> >> Well, it's also unusual because 1) it's vaddr 0, but paddr <high>, and 2) the >> PHDRs are not sorted by vaddr order. 2) might actually be a bug. >> > > I just looked and gcc does not use this technique for thread local data. > Which technique? It does assume you put the thread-local data near %gs (%fs in userspace), and it uses a small offset (positive or negative) to reach it. At present, the x86-64 only uses %gs-relative addressing to reach the pda, which are always small positive offsets. It always accesses per-cpu data in a two-step process of getting the base of per-cpu data, then offsetting to find the particular variable. x86-32 has no pda, and arranges %fs so that %fs:variable gets the percpu variant of variable. The offsets are always quite large. 
> My initial concern about all of this was not making symbols section relative > is relieved as this all appears to be a 64bit arch thing where that doesn't > matter. > Why's that? I thought you cared particularly about making the x86-64 kernel relocatable for kdump, and that using non-absolute symbols was part of that? > Has anyone investigated using the technique gcc uses for thread local storage? > http://people.redhat.com/drepper/tls.pdf > The powerpc guys tried using gcc-level thread-local storage, but it doesn't work well. per-cpu data and per-thread data have different constraints, and its hard to tell gcc about them. For example, if you have a section of preemptable code in your function, it's hard to tell gcc not to cache a "thread-local" variable across it, even though we could have switched CPUs in the meantime. > In particular using the local exec model so we can say: > movq %fs:x@tpoff,%rax > > To load the contents of a per cpu variable x into %rax ? > > If we can use that model it should make it easier to interface with things like > the stack protector code. Although we would still need to be very careful > about thread switches. > You mean cpu switches? We don't really have a notion of thread-local data in the kernel, other than things hanging off the kernel stack. J ^ permalink raw reply [flat|nested] 108+ messages in thread
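[Editorial sketch, not part of the original message.] Two points from the
message above can be modelled together: the two-step access (fetch this
CPU's base, then apply an offset) versus a single segment-relative access,
and the hazard of holding on to the base across a point where the task may
move to another CPU. This is a toy model only; the Machine class and its
method names are hypothetical and nothing here reflects real kernel APIs.

```python
# Toy model of per-cpu access styles and the stale-base hazard.

class Machine:
    def __init__(self, ncpus, area_size):
        # One private data area per CPU.
        self.areas = [bytearray(area_size) for _ in range(ncpus)]
        self.current_cpu = 0

    def migrate(self, cpu):
        """Model the running task being moved to another CPU."""
        self.current_cpu = cpu

    def two_step_read(self, offset):
        """Step 1: look up this CPU's base. Step 2: apply the offset."""
        base = self.areas[self.current_cpu]
        return base[offset]

    def segment_read(self, offset):
        """Model a single '%gs:offset' access: the base is implied by
        whichever CPU executes the instruction."""
        return self.areas[self.current_cpu][offset]


def stale_read(machine, offset, new_cpu):
    """Cache the base, get migrated, then use the cached base.
    The result comes from the OLD CPU's area."""
    base = machine.areas[machine.current_cpu]   # cached before preemption
    machine.migrate(new_cpu)                    # preemptible window
    return base[offset]


if __name__ == "__main__":
    m = Machine(ncpus=2, area_size=16)
    m.areas[0][4] = 11
    m.areas[1][4] = 22
    print("on cpu0:", m.segment_read(4))
    print("stale after migrating to cpu1:", stale_read(m, 4, new_cpu=1))
    print("fresh on cpu1:", m.segment_read(4))
```

The stale read is the failure mode that makes compiler-managed
"thread-local" variables awkward for per-cpu data: the compiler has no
way to know that the cached base may be invalidated by a CPU switch.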
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-07-01 21:10 ` Jeremy Fitzhardinge
@ 2008-07-01 21:39 ` Eric W. Biederman
2008-07-01 21:52 ` Jeremy Fitzhardinge
2008-07-02 2:01 ` H. Peter Anvin
0 siblings, 2 replies; 108+ messages in thread
From: Eric W. Biederman @ 2008-07-01 21:39 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Eric W. Biederman, H. Peter Anvin, Mike Travis, Christoph Lameter,
Linux Kernel Mailing List

Jeremy Fitzhardinge <jeremy@goop.org> writes:

>> I just looked and gcc does not use this technique for thread local data.
>>
>
> Which technique?

A section located at 0.

> It does assume you put the thread-local data near %gs (%fs in
> userspace), and it uses a small offset (positive or negative) to
> reach it.

Nope. It achieves that effect with a magic set of relocations instead
of linker magic.

> At present, the x86-64 only uses %gs-relative addressing to reach the pda, which
> are always small positive offsets. It always accesses per-cpu data in a
> two-step process of getting the base of per-cpu data, then offsetting to find
> the particular variable.
>
> x86-32 has no pda, and arranges %fs so that %fs:variable gets the percpu variant
> of variable. The offsets are always quite large.

As a practical matter I like that approach (except for extra code size
of the offsets).

>> My initial concern about all of this was not making symbols section relative
>> is relieved as this all appears to be a 64bit arch thing where that doesn't
>> matter.
>>
>
> Why's that? I thought you cared particularly about making the x86-64 kernel
> relocatable for kdump, and that using non-absolute symbols was part of that?

That is all true but unconnected.

For x86_64 the kernel lives at a fixed virtual address. So absolute or
non absolute symbols don't matter. Only __pa and a little bit of code
in head64.S that sets up the initial page tables has to be aware of it.
So relocation on x86_64 is practically free.

For i386 since virtual address space is precious and because there were
concerns about putting code in __pa we actually relocate the kernel symbols
during load right after decompression. When we do relocations absolute
symbols are a killer.

>> Has anyone investigated using the technique gcc uses for thread local storage?
>> http://people.redhat.com/drepper/tls.pdf
>>
>
> The powerpc guys tried using gcc-level thread-local storage, but it doesn't work
> well. per-cpu data and per-thread data have different constraints, and its hard
> to tell gcc about them. For example, if you have a section of preemptable code
> in your function, it's hard to tell gcc not to cache a "thread-local" variable
> across it, even though we could have switched CPUs in the meantime.

Yes, I completely agree with that. It doesn't mean however that we
can't keep gcc ignorant and generate the same code manually.

>> In particular using the local exec model so we can say:
>> movq %fs:x@tpoff,%rax
>>
>> To load the contents of a per cpu variable x into %rax ?
>>
>> If we can use that model it should make it easier to interface with things like
>> the stack protector code. Although we would still need to be very careful
>> about thread switches.
>>
>
> You mean cpu switches? We don't really have a notion of thread-local data in
> the kernel, other than things hanging off the kernel stack.

Well I was thinking threads switching on a cpu having the kinds of problems you
described when it was tried on ppc.

Eric

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 21:39 ` Eric W. Biederman @ 2008-07-01 21:52 ` Jeremy Fitzhardinge 2008-07-02 0:20 ` H. Peter Anvin 2008-07-02 2:01 ` H. Peter Anvin 1 sibling, 1 reply; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-07-01 21:52 UTC (permalink / raw) To: Eric W. Biederman Cc: H. Peter Anvin, Mike Travis, Christoph Lameter, Linux Kernel Mailing List Eric W. Biederman wrote: > Nope. It achieves that affect with a magic set of relocations instead > of linker magic. > Well, the code gcc generates for -fstack-protector emits a literal "%gs:40", so there's no relocations at all. >> At present, the x86-64 only uses %gs-relative addressing to reach the pda, which >> are always small positive offsets. It always accesses per-cpu data in a >> two-step process of getting the base of per-cpu data, then offsetting to find >> the particular variable. >> >> x86-32 has no pda, and arranges %fs so that %fs:variable gets the percpu variant >> of variable. The offsets are always quite large. >> > > As a practical matter I like that approach (except for extra code size > of the offsets). > Yes, and there's no reason we couldn't do the same on 64-bit, aside from the stack-protector's use of %gs:40. There's no code-size cost in large offsets, since they're always 32-bits anyway (there's no short absolute addressing mode). >> The powerpc guys tried using gcc-level thread-local storage, but it doesn't work >> well. per-cpu data and per-thread data have different constraints, and its hard >> to tell gcc about them. For example, if you have a section of preemptable code >> in your function, it's hard to tell gcc not to cache a "thread-local" variable >> across it, even though we could have switched CPUs in the meantime. >> > > Yes, I completely agree with that. It doesn't mean however that we > can't keep gcc ignorant and generate the same code manually. > Yes, I see. 
I haven't looked at that specifically, but I think both Rusty and Andi
have, and it gets tricky with modules and -ve kernel addresses, or something.

> Well I was thinking threads switching on a cpu having the kinds of problems you
> described when it was tried on ppc.

Uh, I think we're having a nomenclature imprecision here. Strictly
speaking, the kernel doesn't have threads, only tasks and CPUs. We only
care about per-cpu data, not per-task data, so the concern is not
"threads switching on a CPU" but "CPUs switching on (under) a task". But
I think we understand each other regardless ;)

If we manually generate %gs-relative references to percpu data, then
it's no different to what we do with 32-bit, whether it be a specific
symbol address or using the TLS relocations.

J

^ permalink raw reply [flat|nested] 108+ messages in thread
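[Editorial sketch, not part of the original message.] The two-step access described above (fetch the per-cpu base, then add a zero-based offset) and the hazard of caching a per-cpu address across a CPU switch can be modelled in plain user-space C. This is only an illustration under stated assumptions: `NR_CPUS`, `struct percpu_area`, `PERCPU_OFFSET`, `percpu_ptr` and `irq_count_on` are hypothetical names invented for the sketch, and an array of structs stands in for what the kernel reaches through %gs or %fs.

```c
#include <stddef.h>

#define NR_CPUS 4   /* hypothetical CPU count for this sketch */

/* Stand-in for the variables the linker would place in the per-cpu section. */
struct percpu_area {
    long irq_count;
    int  preempt_depth;
};

/* One copy of the area per CPU. &areas[cpu] plays the role of the per-CPU
 * base that a segment register would supply on real hardware. */
static struct percpu_area areas[NR_CPUS];

/* Zero-based offset: the variable's address minus the start of the area. */
#define PERCPU_OFFSET(field) offsetof(struct percpu_area, field)

/* Two-step access: take the base for a CPU, then add the offset. */
void *percpu_ptr(int cpu, size_t offset)
{
    return (char *)&areas[cpu] + offset;
}

/* Address of irq_count in the copy belonging to the given CPU. */
long *irq_count_on(int cpu)
{
    return percpu_ptr(cpu, PERCPU_OFFSET(irq_count));
}
```

If a caller keeps the pointer returned by `irq_count_on(0)` and the task later resumes on CPU 1, a write through the cached pointer still lands in CPU 0's copy. That is the kind of stale access the thread worries a compiler might introduce by caching a "thread-local" address across a preemptible section.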
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 21:52 ` Jeremy Fitzhardinge @ 2008-07-02 0:20 ` H. Peter Anvin 2008-07-02 1:15 ` Mike Travis 0 siblings, 1 reply; 108+ messages in thread From: H. Peter Anvin @ 2008-07-02 0:20 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Eric W. Biederman, Mike Travis, Christoph Lameter, Linux Kernel Mailing List Jeremy Fitzhardinge wrote: > > Yes, and there's no reason we couldn't do the same on 64-bit, aside from > the stack-protector's use of %gs:40. There's no code-size cost in large > offsets, since they're always 32-bits anyway (there's no short absolute > addressing mode). > > If we manually generate %gs-relative references to percpu data, then > it's no different to what we do with 32-bit, whether it be a specific > symbol address or using the TLS relocations. > If we think the problem is the zero-basing triggering linker bugs, we should probably just use a small offset, like 64 (put a small dummy section before the .percpu.data section to occupy this section.) I'm going to play with this a bit and see if I come up with something sanish. -hpa ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-02 0:20 ` H. Peter Anvin @ 2008-07-02 1:15 ` Mike Travis 2008-07-02 1:32 ` Eric W. Biederman ` (2 more replies) 0 siblings, 3 replies; 108+ messages in thread From: Mike Travis @ 2008-07-02 1:15 UTC (permalink / raw) To: H. Peter Anvin Cc: Jeremy Fitzhardinge, Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner H. Peter Anvin wrote: > Jeremy Fitzhardinge wrote: >> >> Yes, and there's no reason we couldn't do the same on 64-bit, aside >> from the stack-protector's use of %gs:40. There's no code-size cost >> in large offsets, since they're always 32-bits anyway (there's no >> short absolute addressing mode). >> >> If we manually generate %gs-relative references to percpu data, then >> it's no different to what we do with 32-bit, whether it be a specific >> symbol address or using the TLS relocations. >> > > If we think the problem is the zero-basing triggering linker bugs, we > should probably just use a small offset, like 64 (put a small dummy > section before the .percpu.data section to occupy this section.) > > I'm going to play with this a bit and see if I come up with something > sanish. > > -hpa One interesting thing I've discovered is the gcc --version may make a difference. The kernel panic that occurred from Ingo's config, I was able to replicate with GCC 4.2.0 (which is on our devel server). But this one complained about not being able to handle the STACK-PROTECTOR option so I moved everything to another machine that has 4.2.4, and now it seems that it works fine. I'm still re-verifying that the source bits and config options are identical (it was a later git-remote update), and that in fact it is the gcc --version, but that may be the conclusion. (My code also has some patches submitted but not yet included in the tip/master tree. Curiously just enabling some debug options changed the footprint of the panic.) 
Are we allowed to insist on a specific level of GCC for compiling the
kernel?

Thanks,
Mike

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-02 1:15 ` Mike Travis @ 2008-07-02 1:32 ` Eric W. Biederman 2008-07-02 1:51 ` Mike Travis 2008-07-02 1:40 ` H. Peter Anvin 2008-07-02 1:44 ` Mike Travis 2 siblings, 1 reply; 108+ messages in thread From: Eric W. Biederman @ 2008-07-02 1:32 UTC (permalink / raw) To: Mike Travis Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Mike Travis <travis@sgi.com> writes: > H. Peter Anvin wrote: >> Jeremy Fitzhardinge wrote: >>> >>> Yes, and there's no reason we couldn't do the same on 64-bit, aside >>> from the stack-protector's use of %gs:40. There's no code-size cost >>> in large offsets, since they're always 32-bits anyway (there's no >>> short absolute addressing mode). >>> >>> If we manually generate %gs-relative references to percpu data, then >>> it's no different to what we do with 32-bit, whether it be a specific >>> symbol address or using the TLS relocations. >>> >> >> If we think the problem is the zero-basing triggering linker bugs, we >> should probably just use a small offset, like 64 (put a small dummy >> section before the .percpu.data section to occupy this section.) >> >> I'm going to play with this a bit and see if I come up with something >> sanish. >> >> -hpa > > One interesting thing I've discovered is the gcc --version may make a > difference. > > The kernel panic that occurred from Ingo's config, I was able to replicate > with GCC 4.2.0 (which is on our devel server). But this one complained > about not being able to handle the STACK-PROTECTOR option so I moved > everything to another machine that has 4.2.4, and now it seems that it > works fine. I'm still re-verifying that the source bits and config options > are identical (it was a later git-remote update), and that in fact it is > the gcc --version, but that may be the conclusion. 
(My code also has some > patches submitted but not yet included in the tip/master tree. Curiously > just enabling some debug options changed the footprint of the panic.) > > Are we allowed to insist on a specific level of GCC for compiling the > kernel? Depends on the root cause. If it turns out to be something that is buggy in gcc and we can't work around. We might do something. I don't recall that kind of thing happening often. I think our minimum gcc is currently gcc-3.4. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-02 1:32 ` Eric W. Biederman @ 2008-07-02 1:51 ` Mike Travis 2008-07-02 2:50 ` Eric W. Biederman 0 siblings, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-07-02 1:51 UTC (permalink / raw) To: Eric W. Biederman Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Eric W. Biederman wrote: > Mike Travis <travis@sgi.com> writes: > >> H. Peter Anvin wrote: >>> Jeremy Fitzhardinge wrote: >>>> Yes, and there's no reason we couldn't do the same on 64-bit, aside >>>> from the stack-protector's use of %gs:40. There's no code-size cost >>>> in large offsets, since they're always 32-bits anyway (there's no >>>> short absolute addressing mode). >>>> >>>> If we manually generate %gs-relative references to percpu data, then >>>> it's no different to what we do with 32-bit, whether it be a specific >>>> symbol address or using the TLS relocations. >>>> >>> If we think the problem is the zero-basing triggering linker bugs, we >>> should probably just use a small offset, like 64 (put a small dummy >>> section before the .percpu.data section to occupy this section.) >>> >>> I'm going to play with this a bit and see if I come up with something >>> sanish. >>> >>> -hpa >> One interesting thing I've discovered is the gcc --version may make a >> difference. >> >> The kernel panic that occurred from Ingo's config, I was able to replicate >> with GCC 4.2.0 (which is on our devel server). But this one complained >> about not being able to handle the STACK-PROTECTOR option so I moved >> everything to another machine that has 4.2.4, and now it seems that it >> works fine. I'm still re-verifying that the source bits and config options >> are identical (it was a later git-remote update), and that in fact it is >> the gcc --version, but that may be the conclusion. 
(My code also has some >> patches submitted but not yet included in the tip/master tree. Curiously >> just enabling some debug options changed the footprint of the panic.) >> >> Are we allowed to insist on a specific level of GCC for compiling the >> kernel? > > Depends on the root cause. If it turns out to be something that is buggy > in gcc and we can't work around. We might do something. I don't recall > that kind of thing happening often. I think our minimum gcc is currently > gcc-3.4. > > Eric Ouch. How far into it do we need to investigate? I can surely compare the vmlinux object files, but I'm not cognizant enough about the linker internals to examine much more than that. But hey, maybe gcc-3.4 will work ok...? ;-) [Or it may be the stack-protector thing is introducing better code? I'll try some more config options tomorrow to see if that affects anything.] Cheers, Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-07-02 1:51 ` Mike Travis
@ 2008-07-02 2:50 ` Eric W. Biederman
0 siblings, 0 replies; 108+ messages in thread
From: Eric W. Biederman @ 2008-07-02 2:50 UTC (permalink / raw)
To: Mike Travis
Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter,
Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner

Mike Travis <travis@sgi.com> writes:

> Ouch. How far into it do we need to investigate? I can surely compare the
> vmlinux object files, but I'm not cognizant enough about the linker internals
> to examine much more than that.

As a first step we just need to know what we tell gcc or the linker to do,
and what is incorrectly output. Once the problem is understood we can
think about how to deal with it.

What we really need is a recipe for success and a recipe for failure.

At that point it should be much easier for someone else to reproduce the
problem, and/or look into the specific details and see what is wrong or
to suggest patches.

The kernel is a significant enough program, and a different enough one,
that it is hard to predict how things will go.

Eric

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-07-02 1:15 ` Mike Travis
2008-07-02 1:32 ` Eric W. Biederman
@ 2008-07-02 1:40 ` H. Peter Anvin
2008-07-02 1:44 ` Mike Travis
2 siblings, 0 replies; 108+ messages in thread
From: H. Peter Anvin @ 2008-07-02 1:40 UTC (permalink / raw)
To: Mike Travis
Cc: Jeremy Fitzhardinge, Eric W. Biederman, Christoph Lameter,
Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner

Mike Travis wrote:
>
> One interesting thing I've discovered is the gcc --version may make a
> difference.
>
> The kernel panic that occurred from Ingo's config, I was able to replicate
> with GCC 4.2.0 (which is on our devel server). But this one complained
> about not being able to handle the STACK-PROTECTOR option so I moved
> everything to another machine that has 4.2.4, and now it seems that it
> works fine. I'm still re-verifying that the source bits and config options
> are identical (it was a later git-remote update), and that in fact it is
> the gcc --version, but that may be the conclusion. (My code also has some
> patches submitted but not yet included in the tip/master tree. Curiously
> just enabling some debug options changed the footprint of the panic.)
>
> Are we allowed to insist on a specific level of GCC for compiling the
> kernel?
>

Yes, but certainly not anything even close to that recent -- I think
right now we're supposed to support back to 3.2-something overall;
specific architectures might have more stringent requirements. There
are a couple of gcc versions known to miscompile the kernel that we
don't support; I don't know if 4.2.0 is one of them.

-hpa

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-07-02 1:15 ` Mike Travis
2008-07-02 1:32 ` Eric W. Biederman
2008-07-02 1:40 ` H. Peter Anvin
@ 2008-07-02 1:44 ` Mike Travis
2008-07-02 1:45 ` H. Peter Anvin
2 siblings, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-07-02 1:44 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Jeremy Fitzhardinge, Eric W. Biederman, Christoph Lameter,
Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner

Mike Travis wrote:
... I'm still re-verifying that the source bits and config options
> are identical (it was a later git-remote update), and that in fact it is
> the gcc --version, but that may be the conclusion.
...

Yup, it's the gcc --version that makes the difference. GCC 4.2.0 couldn't
boot past the grub screen, GCC 4.2.4 made it to the login prompt.

The only config changes were imposed by the make script:

stapp 125> diff ../configs/config-Tue_Jul__1_16_48_45_CEST_2008.bad ../build/ingo-test-0701/.config
4c4
< # Tue Jul 1 16:53:49 2008
---
> # Tue Jul 1 16:09:33 2008
64c64
< CONFIG_LOCALVERSION=""
---
> CONFIG_LOCALVERSION="-ingo-test-0701"
120d119
< CONFIG_USE_GENERIC_SMP_HELPERS=y
124d122
< # CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
1294d1291
< CONFIG_THERMAL_HWMON=y

Posting the complete patchset RSN... (or tomorrow am so I can test some
more configs and functionality.)

Thanks,
Mike

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-02 1:44 ` Mike Travis @ 2008-07-02 1:45 ` H. Peter Anvin 2008-07-02 1:55 ` Mike Travis 2008-07-02 22:50 ` Mike Travis 0 siblings, 2 replies; 108+ messages in thread From: H. Peter Anvin @ 2008-07-02 1:45 UTC (permalink / raw) To: Mike Travis Cc: Jeremy Fitzhardinge, Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Mike Travis wrote: > Mike Travis wrote: > ... I'm still re-verifying that the source bits and config options >> are identical (it was a later git-remote update), and that in fact it is >> the gcc --version, but that may be the conclusion. > ... > > Yup, it's the gcc --version that makes the difference. GCC 4.2.0 couldn't > boot past the grub screen, GCC 4.2.4 made it to the login prompt. > IIRC, 4.2.0, 4.2.1 and 4.3.0 are known to miscompile the kernel in one way or another, however, that is from memory so don't quote me on it. -hpa ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-02 1:45 ` H. Peter Anvin @ 2008-07-02 1:55 ` Mike Travis 2008-07-02 22:50 ` Mike Travis 1 sibling, 0 replies; 108+ messages in thread From: Mike Travis @ 2008-07-02 1:55 UTC (permalink / raw) To: H. Peter Anvin Cc: Jeremy Fitzhardinge, Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner H. Peter Anvin wrote: > Mike Travis wrote: >> Mike Travis wrote: >> ... I'm still re-verifying that the source bits and config options >>> are identical (it was a later git-remote update), and that in fact it is >>> the gcc --version, but that may be the conclusion. >> ... >> >> Yup, it's the gcc --version that makes the difference. GCC 4.2.0 >> couldn't >> boot past the grub screen, GCC 4.2.4 made it to the login prompt. >> > > IIRC, 4.2.0, 4.2.1 and 4.3.0 are known to miscompile the kernel in one > way or another, however, that is from memory so don't quote me on it. > > -hpa Great. That's what's been on my devel server for the past 3 or 4 months now... [And it's a big shared server that I'm but a small ant wandering around on it. ;-)] ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-02 1:45 ` H. Peter Anvin 2008-07-02 1:55 ` Mike Travis @ 2008-07-02 22:50 ` Mike Travis 2008-07-03 4:34 ` Eric W. Biederman 1 sibling, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-07-02 22:50 UTC (permalink / raw) To: H. Peter Anvin Cc: Jeremy Fitzhardinge, Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner H. Peter Anvin wrote: > Mike Travis wrote: >> Mike Travis wrote: >> ... I'm still re-verifying that the source bits and config options >>> are identical (it was a later git-remote update), and that in fact it is >>> the gcc --version, but that may be the conclusion. >> ... >> >> Yup, it's the gcc --version that makes the difference. GCC 4.2.0 >> couldn't >> boot past the grub screen, GCC 4.2.4 made it to the login prompt. >> > > IIRC, 4.2.0, 4.2.1 and 4.3.0 are known to miscompile the kernel in one > way or another, however, that is from memory so don't quote me on it. > > -hpa This is definitely getting strange... Ingo's randconfig at: http://redhat.com/~mingo/misc/config-Tue_Jul__1_16_48_45_CEST_2008.bad will only boot and run with gcc-4.2.4, with gcc-4.2.0 it fails at the grub screen. Other configs like: defconfig w/NR_CPUS=4096 nonuma will only boot and run with gcc-4.2.0, and does the "failure at grub" screen with gcc-4.2.4 ...! The nosmp config works with either. I'm looking at the generated vmlinux files now (well at least the assembler, I'm not really familiar enough with the linker objects to look at those.) I'm also trying to track which config options changes the behavior so radically. Any other suggestions? Thanks! Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-07-02 22:50 ` Mike Travis
@ 2008-07-03 4:34 ` Eric W. Biederman
2008-07-07 17:17 ` Mike Travis
0 siblings, 1 reply; 108+ messages in thread
From: Eric W. Biederman @ 2008-07-03 4:34 UTC (permalink / raw)
To: Mike Travis
Cc: H. Peter Anvin, Jeremy Fitzhardinge, Eric W. Biederman,
Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar,
Andrew Morton, Jack Steiner

Mike Travis <travis@sgi.com> writes:

> will only boot and run with gcc-4.2.0, and does the "failure at grub" screen
> with gcc-4.2.4 ...!

Do these tests have an early serial console enabled? I want to confirm
that the failure is very early in boot, before or just as we reach C code.

If you don't have an early console we could be failing late in the process
and just not have visibility.

> Any other suggestions?

If you have a serial console, you could add a little bit of instrumentation
super duper early. Although looking at the generated objects might be just
as interesting.

Eric

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-03 4:34 ` Eric W. Biederman @ 2008-07-07 17:17 ` Mike Travis 2008-07-07 19:46 ` Eric W. Biederman 0 siblings, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-07-07 17:17 UTC (permalink / raw) To: Eric W. Biederman Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Eric W. Biederman wrote: > Mike Travis <travis@sgi.com> writes: > >> will only boot and run with gcc-4.2.0, and does the "failure at grub" screen >> with >> gcc-4.2.4 ...! > > Do these tests have an early serial console enabled? I want to confirm > that the failure is very early in boot before or just as we reach C code. > > If you don't have an early console we could be failing late in the process just > not have visibility. > >> Any other suggestions? > > If you have a serial console to add a little bit of instrumentation super duper early. > Although looking at the generated objects might be just as interesting. > > Eric Hi, Sorry for the delay, been off the past 4 days. I'll see how closely I can narrow down where the fault is. I have been inserting very early printk's to track how far into the startup it gets before bailing. Thanks, Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-07 17:17 ` Mike Travis @ 2008-07-07 19:46 ` Eric W. Biederman 2008-07-08 18:21 ` Mike Travis 0 siblings, 1 reply; 108+ messages in thread From: Eric W. Biederman @ 2008-07-07 19:46 UTC (permalink / raw) To: Mike Travis Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Mike Travis <travis@sgi.com> writes: > Hi, > > Sorry for the delay, been off the past 4 days. > > I'll see how closely I can narrow down where the fault is. I have been > inserting > very early printk's to track how far into the startup it gets before bailing. Thanks. That should help narrow down what is going on. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-07 19:46 ` Eric W. Biederman @ 2008-07-08 18:21 ` Mike Travis 2008-07-08 23:36 ` Eric W. Biederman 0 siblings, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-07-08 18:21 UTC (permalink / raw) To: Eric W. Biederman Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Eric W. Biederman wrote: > Mike Travis <travis@sgi.com> writes: >> Hi, >> >> Sorry for the delay, been off the past 4 days. >> >> I'll see how closely I can narrow down where the fault is. I have been >> inserting >> very early printk's to track how far into the startup it gets before bailing. > > Thanks. That should help narrow down what is going on. > > Eric Unfortunately it's back to the problem of faulting before x86_64_start_kernel() and grub just immediately reboots. So I'm back at analyzing assembler and config differences. Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-07-08 18:21 ` Mike Travis
@ 2008-07-08 23:36 ` Eric W. Biederman
2008-07-08 23:49 ` Jeremy Fitzhardinge
2008-07-09 14:37 ` Mike Travis
0 siblings, 2 replies; 108+ messages in thread
From: Eric W. Biederman @ 2008-07-08 23:36 UTC (permalink / raw)
To: Mike Travis
Cc: Eric W. Biederman, H. Peter Anvin, Jeremy Fitzhardinge,
Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar,
Andrew Morton, Jack Steiner

[-- Attachment #1: Type: text/plain, Size: 434 bytes --]

Mike Travis <travis@sgi.com> writes:

> Unfortunately it's back to the problem of faulting before x86_64_start_kernel()
> and grub just immediately reboots. So I'm back at analyzing assembler and
> config differences.

Ok. That is a narrow window of code. So it shouldn't be too bad, nasty though.

If you would like to trace through it I have attached my serial port
debugging routines that I use in that part of the code.

Eric

[-- Attachment #2: linux-debug.S --]
[-- Type: text/plain, Size: 5876 bytes --]

#if 1
/* Base Address */
#define TTYS0_BASE 0x3f8
/* Data */
#define TTYS0_RBR (TTYS0_BASE+0x00)
#define TTYS0_TBR (TTYS0_BASE+0x00)
/* Control */
#define TTYS0_IER (TTYS0_BASE+0x01)
#define TTYS0_IIR (TTYS0_BASE+0x02)
#define TTYS0_FCR (TTYS0_BASE+0x02)
#define TTYS0_LCR (TTYS0_BASE+0x03)
#define TTYS0_MCR (TTYS0_BASE+0x04)

#define TTYS0_DLL (TTYS0_BASE+0x00)
#define TTYS0_DLM (TTYS0_BASE+0x01)
/* Status */
#define TTYS0_LSR (TTYS0_BASE+0x05)
#define TTYS0_MSR (TTYS0_BASE+0x06)
#define TTYS0_SCR (TTYS0_BASE+0x07)

#define TTYS0_BAUD 9600
#define TTYS0_DIV (115200/TTYS0_BAUD)
#define TTYS0_DIV_LO (TTYS0_DIV&0xFF)
#define TTYS0_DIV_HI ((TTYS0_DIV >> 8)&0xFF)

#if ((115200%TTYS0_BAUD) != 0)
#error Bad ttyS0 baud rate
#endif

#define TTYS0_INIT \
	/* disable interrupts */ \
	movb $0x00, %al ; \
	movw $TTYS0_IER, %dx ; \
	outb %al, %dx ; \
	; \
	/* enable fifos */ \
	movb $0x01, %al ; \
	movw $TTYS0_FCR, %dx ; \
	outb %al, %dx ; \
	; \
	/* Set Baud Rate Divisor to TTYS0_BAUD */ \
	movw $TTYS0_LCR, %dx ; \
	movb $0x83, %al ; \
	outb %al, %dx ; \
	; \
	movw $TTYS0_DLL, %dx ; \
	movb $TTYS0_DIV_LO, %al ; \
	outb %al, %dx ; \
	; \
	movw $TTYS0_DLM, %dx ; \
	movb $TTYS0_DIV_HI, %al ; \
	outb %al, %dx ; \
	; \
	movw $TTYS0_LCR, %dx ; \
	movb $0x03, %al ; \
	outb %al, %dx

/* uses: ax, dx */
#define TTYS0_TX_AL \
	mov %al, %ah ; \
9:	mov $TTYS0_LSR, %dx ; \
	inb %dx, %al ; \
	test $0x20, %al ; \
	je 9b ; \
	mov $TTYS0_TBR, %dx ; \
	mov %ah, %al ; \
	outb %al, %dx

/* uses: ax, dx */
#define TTYS0_TX_CHAR(byte) \
	mov byte, %al ; \
	TTYS0_TX_AL

/* uses: eax, dx */
#define TTYS0_TX_HEX32(lword) \
	mov lword, %eax ; \
	shr $28, %eax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %eax ; \
	shr $24, %eax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %eax ; \
	shr $20, %eax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %eax ; \
	shr $16, %eax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %eax ; \
	shr $12, %eax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %eax ; \
	shr $8, %eax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %eax ; \
	shr $4, %eax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %eax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL

/* uses: rax, dx */
#define TTYS0_TX_HEX64(lword) \
	mov lword, %rax ; \
	shr $60, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $56, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $52, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $48, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $44, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $40, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $36, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $32, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $28, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $24, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $20, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $16, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $12, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $8, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	shr $4, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL ; \
	; \
	mov lword, %rax ; \
	and $0x0f, %al ; \
	add $'0', %al ; \
	cmp $'9', %al ; \
	jle 9f ; \
	add $39, %al ; \
9: ; \
	TTYS0_TX_AL

#define DEBUG(x) TTYS0_TX_CHAR($x) ; TTYS0_TX_CHAR($'\r') ; TTYS0_TX_CHAR($'\n')
#define DEBUG_TX_HEX32(x) TTYS0_TX_HEX32(x); TTYS0_TX_CHAR($'\r') ; TTYS0_TX_CHAR($'\n')
#define DEBUG_TX_HEX64(x) TTYS0_TX_HEX64(x); TTYS0_TX_CHAR($'\r') ; TTYS0_TX_CHAR($'\n')
#endif

^ permalink raw reply [flat|nested] 108+ messages in thread
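[Editorial sketch, not part of the original attachment.] The per-nibble logic that TTYS0_TX_HEX32 and TTYS0_TX_HEX64 unroll in the macros above (mask to four bits, add '0', and add 39 if the result is past '9') can be written as a loop in C, which may make the macros easier to check. The function names `nibble_to_hex`, `format_hex64` and `uart_divisor` are made up for this illustration; the sketch formats into a buffer rather than writing to a UART.

```c
#include <stdint.h>

/* Convert one 4-bit value to lowercase ASCII hex, following the macro:
 * add '0', and if the result is past '9', add 39 to land on 'a'..'f'. */
char nibble_to_hex(uint8_t nibble)
{
    char c = (char)((nibble & 0x0f) + '0');
    if (c > '9')
        c += 39;            /* ':' (0x3a) + 39 == 'a' (0x61) */
    return c;
}

/* Format a 64-bit value as 16 hex digits, most significant nibble first,
 * using the same shift sequence as TTYS0_TX_HEX64: 60, 56, ..., 4, 0. */
void format_hex64(uint64_t value, char out[17])
{
    for (int i = 0; i < 16; i++)
        out[i] = nibble_to_hex((uint8_t)(value >> (60 - 4 * i)));
    out[16] = '\0';
}

/* Divisor latch value for a given baud rate, as TTYS0_DIV computes it
 * from the 115200 base rate. */
unsigned int uart_divisor(unsigned int baud)
{
    return 115200u / baud;
}
```

With TTYS0_BAUD set to 9600 the divisor is 12, so TTYS0_DIV_LO is 12 and TTYS0_DIV_HI is 0.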
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-08 23:36 ` Eric W. Biederman @ 2008-07-08 23:49 ` Jeremy Fitzhardinge 2008-07-09 14:39 ` Mike Travis 2008-07-25 20:06 ` Mike Travis 2008-07-09 14:37 ` Mike Travis 1 sibling, 2 replies; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-07-08 23:49 UTC (permalink / raw) To: Eric W. Biederman Cc: Mike Travis, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Eric W. Biederman wrote: > Mike Travis <travis@sgi.com> writes: > > >> Unfortunately it's back to the problem of faulting before x86_64_start_kernel() >> and grub just immediately reboots. So I'm back at analyzing assembler and >> config differences. >> > > Ok. That is a narrow window of code. So it shouldn't be too bad, nasty though. > > If you would like to trace through it I have attached my serial port > debugging routines that I use in that part of the code. Last time it was doing this, it was a result of a triple-fault caused by loading %ds with an all-zero gdt. I modified Xen to dump the CPU state on triple-faults, so it was easy to pinpoint. I can do that again if it helps. J ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-08 23:49 ` Jeremy Fitzhardinge @ 2008-07-09 14:39 ` Mike Travis 2008-07-25 20:06 ` Mike Travis 1 sibling, 0 replies; 108+ messages in thread From: Mike Travis @ 2008-07-09 14:39 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Eric W. Biederman, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Jeremy Fitzhardinge wrote: > Eric W. Biederman wrote: >> Mike Travis <travis@sgi.com> writes: >> >> >>> Unfortunately it's back to the problem of faulting before >>> x86_64_start_kernel() >>> and grub just immediately reboots. So I'm back at analyzing >>> assembler and >>> config differences. >>> >> >> Ok. That is a narrow window of code. So it shouldn't be too bad, >> nasty though. >> >> If you would like to trace through it I have attached my serial port >> debugging routines that I use in that part of the code. > > Last time it was doing this, it was a result of a triple-fault caused by > loading %ds with an all-zero gdt. I modified Xen to dump the CPU state > on triple-faults, so it was easy to pinpoint. I can do that again if it > helps. > > J Absolutely! I'll repost the latest version of the patchset. Still haven't gotten around to enabling a XEN boot but it sure does sound handy. Thanks, Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-07-08 23:49 ` Jeremy Fitzhardinge
2008-07-09 14:39 ` Mike Travis
@ 2008-07-25 20:06 ` Mike Travis
2008-07-25 20:12 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 108+ messages in thread
From: Mike Travis @ 2008-07-25 20:06 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Eric W. Biederman, H. Peter Anvin, Christoph Lameter,
Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner

Jeremy Fitzhardinge wrote:
...
> Last time it was doing this, it was a result of a triple-fault caused by
> loading %ds with an all-zero gdt. I modified Xen to dump the CPU state
> on triple-faults, so it was easy to pinpoint. I can do that again if it
> helps.
>
> J

Hi Jeremy,

There are two question marks for my patchset. The first is in

arch/x86/xen/smp.c:xen_cpu_up()

#ifdef CONFIG_X86_64
	/* Allocate node local memory for AP pdas */
	WARN_ON(cpu == 0);
	if (cpu > 0) {
		rc = get_local_pda(cpu);
		if (rc)
			return rc;
	}
#endif

and the second is at:

arch/x86/xen/enlighten.c:xen_start_kernel()

#ifdef CONFIG_X86_64
	/* Disable until direct per-cpu data access. */
	have_vcpu_info_placement = 0;
	x86_64_init_pda();
#endif

I believe with the pda folded into the percpu area, get_local_pda()
and x86_64_init_pda() have been removed, so these are no longer
required, yes?

Also, arch/x86/kernel/acpi/sleep.c:acpi_save_state_mem() sets up
the startup code address with:

	initial_code = (unsigned long)wakeup_long64;
	saved_magic = 0x123456789abcdef0;

Should the pda and gdt_page address also be setup as is done in
smpboot.c:do_boot_cpu():

(CONFIG_X86_64)
	initial_pda = (unsigned long)get_cpu_pda(cpu);
#endif
	early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
	initial_code = (unsigned long)start_secondary;

Thanks!
Mike

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-25 20:06 ` Mike Travis @ 2008-07-25 20:12 ` Jeremy Fitzhardinge 2008-07-25 20:34 ` Mike Travis 0 siblings, 1 reply; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-07-25 20:12 UTC (permalink / raw) To: Mike Travis Cc: Eric W. Biederman, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Mike Travis wrote: > Jeremy Fitzhardinge wrote: > ... > >> Last time it was doing this, it was a result of a triple-fault caused by >> loading %ds with an all-zero gdt. I modified Xen to dump the CPU state >> on triple-faults, so it was easy to pinpoint. I can do that again if it >> helps. >> >> J >> > > Hi Jeremy, > > There are two question marks for my patchset. The first is in > > arch/x86/xen/smp.c:xen_cpu_up() > > 287 #ifdef CONFIG_X86_64 > 288 /* Allocate node local memory for AP pdas */ > 289 WARN_ON(cpu == 0); > 290 if (cpu > 0) { > 291 rc = get_local_pda(cpu); > 292 if (rc) > 293 return rc; > 294 } > 295 #endif > > and the second is at: > > arch/x86/xen/enlighten.c:xen_start_kernel() > > 1748 #ifdef CONFIG_X86_64 > 1749 /* Disable until direct per-cpu data access. */ > 1750 have_vcpu_info_placement = 0; > 1751 x86_64_init_pda(); > 1752 #endif > > I believe with the pda folded into the percpu area, get_local_pda() > and x86_64_init_pda() have been removed, so these are no longer > required, yes? > Well, presumably they need to be replaced with whatever setup you need to do now. xen_start_kernel() is the first function called after a Xen kernel boot, and so it must make sure the early percpu setup is done before it can start using percpu variables. xen_cpu_up() needs to do whatever initialization needed for a new cpu's percpu area (presumably whatever do_boot_cpu() does). 
> Also, arch/x86/kernel/acpi/sleep.c:acpi_save_state_mem() sets up > the startup code address with: > > 102 initial_code = (unsigned long)wakeup_long64; > 103 saved_magic = 0x123456789abcdef0; > > Should the pda and gdt_page address also be setup as is done in > smpboot.c:do_boot_cpu(): > > (CONFIG_X86_64) > 801 initial_pda = (unsigned long)get_cpu_pda(cpu); > 802 #endif > 803 early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu); > 804 initial_code = (unsigned long)start_secondary; > I don't think so. It looks like it's doing its own gdt save/restore. J ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-25 20:12 ` Jeremy Fitzhardinge @ 2008-07-25 20:34 ` Mike Travis 2008-07-25 20:43 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-07-25 20:34 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Eric W. Biederman, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Jeremy Fitzhardinge wrote: > Mike Travis wrote: >>... The first is in >> >> arch/x86/xen/smp.c:xen_cpu_up() >> >> 287 #ifdef CONFIG_X86_64 >> 288 /* Allocate node local memory for AP pdas */ >> 289 WARN_ON(cpu == 0); >> 290 if (cpu > 0) { >> 291 rc = get_local_pda(cpu); >> 292 if (rc) >> 293 return rc; >> 294 } >> 295 #endif >> >> and the second is at: >> >> arch/x86/xen/enlighten.c:xen_start_kernel() >> >> 1748 #ifdef CONFIG_X86_64 >> 1749 /* Disable until direct per-cpu data access. */ >> 1750 have_vcpu_info_placement = 0; >> 1751 x86_64_init_pda(); >> 1752 #endif >> >> I believe with the pda folded into the percpu area, get_local_pda() >> and x86_64_init_pda() have been removed, so these are no longer >> required, yes? >> > > Well, presumably they need to be replaced with whatever setup you need > to do now. > > xen_start_kernel() is the first function called after a Xen kernel boot, > and so it must make sure the early percpu setup is done before it can > start using percpu variables. Is this for the boot cpu (0), or for all cpus? For the boot cpu, I have this now in arch/x86/kernel/setup_percpu.c: +#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU + +/* Initialize percpu offset for boot cpu (0) */ +unsigned long __per_cpu_offset[NR_CPUS] __read_mostly = { + [0] = (unsigned long)__per_cpu_load +}; +#else unsigned long __per_cpu_offset[NR_CPUS] __read_mostly; +#endif So this should apply as well to the xen startup? 
> > xen_cpu_up() needs to do whatever initialization needed for a new cpu's > percpu area (presumably whatever do_boot_cpu() does). > Does the startup include executing arch/x86/kernel/head_64.S:startup_64() ? I see arch/x86/xen/xen-head.S:startup_xen() so I'm guessing not? For the real startup, I do the following two things. But I'm not comfortable enough with xen to think I'll get it right putting this in xen-head.S. - lgdt early_gdt_descr(%rip) + +#ifdef CONFIG_SMP + /* + * For zero-based percpu variables, the base (__per_cpu_load) must + * be added to the offset of per_cpu__gdt_page. This is only needed + * for the boot cpu but we can't do this prior to secondary_startup_64. + * So we use a NULL gdt adrs to indicate that we are starting up the + * boot cpu and not the secondary cpus. do_boot_cpu() will fixup + * the gdt adrs for those cpus. + */ +#define PER_CPU_GDT_PAGE 0 + movq early_gdt_descr_base(%rip), %rax + testq %rax, %rax + jnz 1f + movq $__per_cpu_load, %rax + addq $per_cpu__gdt_page, %rax + movq %rax, early_gdt_descr_base(%rip) +#else +#define PER_CPU_GDT_PAGE per_cpu__gdt_page +#endif +1: lgdt early_gdt_descr(%rip) and: + * Setup up the real PDA. + * + * For SMP, the boot cpu (0) uses the static pda which is the first + * element in the percpu area (@__per_cpu_load). This pda is moved + * to the real percpu area once that is allocated. Secondary cpus + * will use the initial_pda value setup in do_boot_cpu(). */ movl $MSR_GS_BASE,%ecx - movq $empty_zero_page,%rax + movq initial_pda(%rip), %rax movq %rax,%rdx shrq $32,%rdx wrmsr +#ifdef CONFIG_SMP + movq %rax, %gs:pda_data_offset +#endif + ENTRY(initial_pda) +#ifdef CONFIG_SMP + .quad __per_cpu_load # Overwritten for secondary CPUs +#else + .quad per_cpu__pda +#endif Thanks! Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
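The boot-cpu handling described in this message can be modeled in plain user-space C. This is only a sketch under stated assumptions, not the patch's code: `struct percpu_model`, `MODEL_NR_CPUS` and the three helper functions are hypothetical names that stand in for `__per_cpu_load`, `__per_cpu_offset[]` and the `early_gdt_descr_base` test in head_64.S. It shows why, with a zero-based per cpu area, a variable's link-time address is already its offset, and why a NULL gdt base can be used to mean "boot cpu".

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4 /* hypothetical; stands in for NR_CPUS */

/* Hypothetical model of the per-cpu bookkeeping discussed in the thread. */
struct percpu_model {
	uintptr_t load_addr;             /* models __per_cpu_load */
	uintptr_t offset[MODEL_NR_CPUS]; /* models __per_cpu_offset[] */
};

/* Static boot state: only cpu 0 has a base, and it is the load image. */
static void percpu_model_init(struct percpu_model *m, uintptr_t load_addr)
{
	m->load_addr = load_addr;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		m->offset[cpu] = 0;
	m->offset[0] = load_addr;
}

/* Zero-based area: the symbol value is the offset, so just add the base. */
static uintptr_t percpu_model_addr(const struct percpu_model *m, int cpu,
				   uintptr_t var_offset)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return m->offset[cpu] + var_offset;
}

/*
 * Models the gdt fixup: a zero descriptor base means "boot cpu", so the
 * base is computed from the load address; a non-zero base (assumed to be
 * set up beforehand for secondary cpus) is left untouched.
 */
static uintptr_t gdt_base_fixup(const struct percpu_model *m,
				uintptr_t gdt_descr_base,
				uintptr_t gdt_page_offset)
{
	if (gdt_descr_base != 0)
		return gdt_descr_base;
	return m->load_addr + gdt_page_offset;
}
```

In this model a Xen-style entry path would only need to perform the equivalent of `percpu_model_init()` and the base fixup in C before touching any per cpu variable; whether that is sufficient for the real xen_start_kernel() is not established here.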
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-25 20:34 ` Mike Travis @ 2008-07-25 20:43 ` Jeremy Fitzhardinge 2008-07-25 21:05 ` Mike Travis 0 siblings, 1 reply; 108+ messages in thread From: Jeremy Fitzhardinge @ 2008-07-25 20:43 UTC (permalink / raw) To: Mike Travis Cc: Eric W. Biederman, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Mike Travis wrote: > Is this for the boot cpu (0), or for all cpus? For the boot cpu, I have > this now in arch/x86/kernel/setup_percpu.c: > > +#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU > + > +/* Initialize percpu offset for boot cpu (0) */ > +unsigned long __per_cpu_offset[NR_CPUS] __read_mostly = { > + [0] = (unsigned long)__per_cpu_load > +}; > +#else > unsigned long __per_cpu_offset[NR_CPUS] __read_mostly; > +#endif > > So this should apply as well to the xen startup? > If it's just a static initialization, then it should be fine. But some equivalent of your head_64.S changes are needed to actually set things up? >> xen_cpu_up() needs to do whatever initialization needed for a new cpu's >> percpu area (presumably whatever do_boot_cpu() does). >> >> > > Does the startup include executing arch/x86/kernel/head_64.S:startup_64() ? > I see arch/x86/xen/xen-head.S:startup_xen() so I'm guessing not? > No, it doesn't. It bypasses all that startup code. Aside from the few instructions in xen-head.S, xen_start_kernel() is the first thing to get run. But when bringing up a secondary cpu, where does the new percpu memory actually get allocated? > For the real startup, I do the following two things. But I'm not comfortable > enough with xen to think I'll get it right putting this in xen-head.S. > Yes, it needn't be in the asm code. I'll work out what to do. Looks like I just need to do an appropriate wrmsr(MSR_GS_BASE, ). J ^ permalink raw reply [flat|nested] 108+ messages in thread
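For reference, the `wrmsr` mentioned here takes the MSR index in %ecx and the 64-bit value split across %edx:%eax, which is what the `movq %rax,%rdx; shrq $32,%rdx` sequence in the quoted head_64.S change prepares. A minimal user-space sketch of that split follows; `struct msr_halves`, `msr_split()` and `msr_join()` are hypothetical helpers for illustration, and nothing here executes a privileged instruction.

```c
#include <stdint.h>

#define MSR_GS_BASE 0xc0000101u /* architectural index of the GS base MSR */

/* Hypothetical holder for the two 32-bit halves that wrmsr consumes. */
struct msr_halves {
	uint32_t lo; /* would go in %eax */
	uint32_t hi; /* would go in %edx */
};

/* Split a 64-bit value (e.g. a per-cpu base address) for wrmsr. */
static struct msr_halves msr_split(uint64_t value)
{
	struct msr_halves h;

	h.lo = (uint32_t)(value & 0xffffffffu);
	h.hi = (uint32_t)(value >> 32);
	return h;
}

/* Inverse of msr_split(), as rdmsr callers recombine %edx:%eax. */
static uint64_t msr_join(struct msr_halves h)
{
	return ((uint64_t)h.hi << 32) | h.lo;
}
```

A kernel address such as 0xffffffff80001000 splits into a high half of 0xffffffff and a low half of 0x80001000.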
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-25 20:43 ` Jeremy Fitzhardinge @ 2008-07-25 21:05 ` Mike Travis 0 siblings, 0 replies; 108+ messages in thread From: Mike Travis @ 2008-07-25 21:05 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Eric W. Biederman, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Ok, I'll just post what I have now (compiles and boots cleanly)... and then we can discuss these more extensively. Thanks, Mike Jeremy Fitzhardinge wrote: > Mike Travis wrote: >> Is this for the boot cpu (0), or for all cpus? For the boot cpu, I have >> this now in arch/x86/kernel/setup_percpu.c: >> >> +#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU >> + >> +/* Initialize percpu offset for boot cpu (0) */ >> +unsigned long __per_cpu_offset[NR_CPUS] __read_mostly = { >> + [0] = (unsigned long)__per_cpu_load >> +}; >> +#else >> unsigned long __per_cpu_offset[NR_CPUS] __read_mostly; >> +#endif >> >> So this should apply as well to the xen startup? >> > > If it's just a static initialization, then it should be fine. But some > equivalent of your head_64.S changes are needed to actually set things up? > > >>> xen_cpu_up() needs to do whatever initialization needed for a new cpu's >>> percpu area (presumably whatever do_boot_cpu() does). >>> >>> >> >> Does the startup include executing >> arch/x86/kernel/head_64.S:startup_64() ? >> I see arch/x86/xen/xen-head.S:startup_xen() so I'm guessing not? >> > > No, it doesn't. It bypasses all that startup code. Aside from the few > instructions in xen-head.S, xen_start_kernel() is the first thing to get > run. > > But when bringing up a secondary cpu, where does the new percpu memory > actually get allocated? > >> For the real startup, I do the following two things. But I'm not >> comfortable >> enough with xen to think I'll get it right putting this in xen-head.S. >> > > Yes, it needn't be in the asm code. I'll work out what to do. 
Looks > like I just need to do an appropriate wrmsr(MSR_GS_BASE, ). > > J ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-08 23:36 ` Eric W. Biederman 2008-07-08 23:49 ` Jeremy Fitzhardinge @ 2008-07-09 14:37 ` Mike Travis 2008-07-09 22:38 ` Eric W. Biederman 1 sibling, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-07-09 14:37 UTC (permalink / raw) To: Eric W. Biederman Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Eric W. Biederman wrote: > Mike Travis <travis@sgi.com> writes: > >> Unfortunately it's back to the problem of faulting before x86_64_start_kernel() >> and grub just immediately reboots. So I'm back at analyzing assembler and >> config differences. > > Ok. That is a narrow window of code. So it shouldn't be too bad, nasty though. > > If you would like to trace through it I have attached my serial port > debugging routines that I use in that part of the code. > > Eric > > Very cool, thanks!!! I will start using this. (I have been using the trick to replace printk with early_printk so messages come out immediately instead of from the log buf.) I've been able to make some more progress. I've gotten to a point where it panics from stack overflow. I've verified this by bumping THREAD_ORDER and it boots fine. Now tracking down stack usages. (I have found a couple of new functions using set_cpus_allowed(..., CPU_MASK_ALL) instead of set_cpus_allowed_ptr(... , CPU_MASK_ALL_PTR). But these are not in the calling sequence so subsequently are not the cause. One weird thing is early_idt_handler seems to have been called and that's one thing our simulator does not mimic for standard Intel FSB systems - early pending interrupts. (It's designed after all to mimic our h/w, and of course it's been booting fine under that environment.) 
Two patches are in the queue to reduce this stack usage: Subject: [PATCH 1/1] sched: Reduce stack size in isolated_cpu_setup() Subject: [PATCH 1/1] kthread: Reduce stack pressure in create_kthread and kthreadd The other stack pigs are: 1640 sched_domain_node_span 1576 tick_notify 1576 setup_IO_APIC_irq 1576 move_task_off_dead_cpu 1560 arch_setup_ht_irq 1560 __assign_irq_vector 1544 tick_handle_oneshot_broadcast 1352 zc0301_ioctl_v4l2 1336 i2o_cfg_compat_ioctl 1192 sn9c102_ioctl_v4l2 1176 __build_sched_domains 1152 e1000_check_options 1144 __build_all_zonelists 1128 setup_IO_APIC 1096 sched_balance_self 1096 _cpu_down 1080 do_ida_request 1064 sched_rt_period_timer 1064 native_smp_call_function_mask 1048 setup_timer_IRQ0_pin 1048 setup_ioapic_dest 1048 set_ioapic_affinity_irq 1048 set_ht_irq_affinity 1048 pci_device_probe 1048 native_machine_crash_shutdown 1032 tick_do_periodic_broadcast 1032 sched_setaffinity 1032 native_flush_tlb_others 1032 local_cpus_show 1032 local_cpulist_show 1032 irq_select_affinity 1032 irq_complete_move 1032 irq_affinity_write_proc 1032 ioapic_retrigger_irq 1032 flush_tlb_mm 1032 flush_tlb_current_task 1032 fixup_irqs 1032 do_cciss_request 1032 create_irq 1024 uv_vector_allocation_domain 1024 uv_send_IPI_allbutself 1024 smp_call_function_single 1024 smp_call_function 1024 physflat_send_IPI_allbutself 1024 pci_bus_show_cpuaffinity 1024 move_masked_irq 1024 flush_tlb_page 1024 flat_send_IPI_allbutself 1000 security_load_policy Only a few of these though I would think might get called early in the boot, that might also be contributing to the stack overflow. Oh yeah, I looked very closely at the differences in the assembler for vmlinux when compiled with 4.2.0 (fails) and 4.2.4 (which boots with the above mentioned THREAD_ORDER change) and except for some weirdness around ident_complete it seems to be the same code. But the per_cpu variables are in a completely different address order. 
I wouldn't think that the -j10 for make could cause this but I can verify that with -j1. But in any case, I'm sticking with 4.2.4 for now. Thanks, Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-09 14:37 ` Mike Travis @ 2008-07-09 22:38 ` Eric W. Biederman 2008-07-09 23:30 ` Mike Travis 0 siblings, 1 reply; 108+ messages in thread From: Eric W. Biederman @ 2008-07-09 22:38 UTC (permalink / raw) To: Mike Travis Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Mike Travis <travis@sgi.com> writes: > Very cool, thanks!!! I will start using this. (I have been using the trick > to replace printk with early_printk so messages come out immediately instead > of from the log buf.) Just passing early_printk=xxx on the command line should have that effect. Although I do admit you have to be a little bit into the boot before early_printk is setup. > I've been able to make some more progress. I've gotten to a point where it > panics from stack overflow. I've verified this by bumping THREAD_ORDER and > it boots fine. Now tracking down stack usages. (I have found a couple of new > functions using set_cpus_allowed(..., CPU_MASK_ALL) instead of > set_cpus_allowed_ptr(... , CPU_MASK_ALL_PTR). But these are not in the calling > sequence so subsequently are not the cause. Is stack overflow the only problem you are seeing or are there still other mysteries? > One weird thing is early_idt_handler seems to have been called and that's one > thing our simulator does not mimic for standard Intel FSB systems - early > pending > interrupts. (It's designed after all to mimic our h/w, and of course it's been > booting fine under that environment.) That usually indicates you are taking an exception during boot not that you have received an external interrupt. Something like a page fault or a division by 0 error. > Only a few of these though I would think might get called early in > the boot, that might also be contributing to the stack overflow. Still the call chain depth shouldn't really be changing. So why should it matter? 
Ah. The high cpu count is growing cpumask_t, so it matters when you put it on the stack. That makes sense. So what starts out as a 4 byte variable on the stack in a normal setup winds up being a 1k variable with 4k cpus. > Oh yeah, I looked very closely at the differences in the assembler > for vmlinux when compiled with 4.2.0 (fails) and 4.2.4 (which boots > with the above mentioned THREAD_ORDER change) and except for some > weirdness around ident_complete it seems to be the same code. But > the per_cpu variables are in a completely different address order. > I wouldn't think that the -j10 for make could cause this but I can > verify that with -j1. But in any case, I'm sticking with 4.2.4 for > now. Reasonable. The practical problem is you are mixing a lot of changes simultaneously and it confuses things. Compiling with NR_CPUS=4096 and working out the bugs from a growing cpumask_t, putting the per cpu area in a zero based segment, and putting the pda into the per cpu area all at the same time. Who knows maybe the only difference between 4.2.0 and 4.2.4 is that 4.2.4 optimizes its stack usage a little better and you don't see a stack overflow. It would be very very good if we could separate out these issues especially the segment for the per cpu variables. We need something like that. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
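The growth of an on-stack cpumask with NR_CPUS is simple arithmetic and can be checked in user space. The sketch below assumes the mask is a bitmap rounded up to whole `unsigned long` words; `cpumask_bytes()` is a hypothetical helper, not a kernel function. By this arithmetic one 4096-cpu mask occupies 512 bytes, so frames of roughly 1 KiB in the stack-usage list would be consistent with two such masks on the stack, though that attribution is a guess.

```c
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long on the build machine. */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: bytes needed for a bitmap with one bit per cpu,
 * rounded up to a whole number of unsigned longs.
 */
static size_t cpumask_bytes(size_t nr_cpus)
{
	size_t longs = (nr_cpus + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}
```

With this helper, `cpumask_bytes(256)` is 32 and `cpumask_bytes(4096)` is 512 on both 32-bit and 64-bit builds.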
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-09 22:38 ` Eric W. Biederman @ 2008-07-09 23:30 ` Mike Travis 2008-07-10 0:04 ` Eric W. Biederman 0 siblings, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-07-09 23:30 UTC (permalink / raw) To: Eric W. Biederman Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Eric W. Biederman wrote: > Mike Travis <travis@sgi.com> writes: > ... (I have been using the trick >> to replace printk with early_printk so messages come out immediately instead >> of from the log buf.) > > Just passing early_printk=xxx on the command line should have that effect. > Although I do admit you have to be a little bit into the boot before early_printk > is setup. What I meant was using early_printk in place of printk, which seems to stuff the messages into the log buf until the serial console is setup fairly late in start_kernel. I did this by removing printk() and renaming early_printk() to be printk (and a couple other things like #define early_printk printk ... > >> I've been able to make some more progress. I've gotten to a point where it >> panics from stack overflow. I've verified this by bumping THREAD_ORDER and >> it boots fine. Now tracking down stack usages. (I have found a couple of new >> functions using set_cpus_allowed(..., CPU_MASK_ALL) instead of >> set_cpus_allowed_ptr(... , CPU_MASK_ALL_PTR). But these are not in the calling >> sequence so subsequently are not the cause. > > Is stack overflow the only problem you are seeing or are there still other mysteries? I'm not entirely sure it's a stack overflow, the fault has a NULL dereference and then the stack overflow message. > >> One weird thing is early_idt_handler seems to have been called and that's one >> thing our simulator does not mimic for standard Intel FSB systems - early >> pending >> interrupts. 
(It's designed after all to mimic our h/w, and of course it's been >> booting fine under that environment.) > > That usually indicates you are taking an exception during boot not that you > have received an external interrupt. Something like a page fault or a > division by 0 error. I was thinking maybe an RTC interrupt? But a fault does sound more likely. > >> Only a few of these though I would think might get called early in >> the boot, that might also be contributing to the stack overflow. > > Still the call chain depth shouldn't really be changing. So why should it > matter? Ah. The high cpu count is growing cpumask_t so when you put > it on the stack. That makes sense. So what stars out as a 4 byte > variable on the stack in a normal setup winds up being a 1k variable > with 4k cpus. Yes, it's definitely the three related: NR_CPUS Patch_Applied THREAD_ORDER Results 256 NO 1 works (obviously ;-) 256 YES 1 works 4096 NO 1 works 4096 YES 1 panics 4096 YES 3 works (just happened to pick 3, 2 probably will work as well.) > Reasonable. The practical problem is you are mixing a lot of changes > simultaneously and it confuses things. Compiling with NR_CPUS=4096 > and working out the bugs from a growing cpumask_t, putting the per cpu > area in a zero based segment, and putting putting the pda into the > per cpu area all at the same time. I've been testing NR_CPUS=4096 for quite a while and it's been very reliable. It's just weird that this config fails with this new patch applied. (default configs and some fairly normal distro configs also work fine.) And with the zillion config straws we now have, spotting the arbitrary needle is proving difficult. ;-) > Who knows maybe the only difference between 4.2.0 and 4.2.4 is that > 4.2.4 optimizes it's stack usage a little better and you don't see > a stack overflow. I haven't tried the THREAD_ORDER=3 (or 2) under 4.2.0, but that would seem to indicate this may be true. 
> It would be very very good if we could separate out these issues > especially the segment for the per cpu variables. We need something > like that. One reason I've been sticking with 4.2.4. Thanks again for your help. Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
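The THREAD_ORDER column in the table above translates into stack size as a page-count power of two. A small sketch, assuming 4 KiB pages and the usual THREAD_SIZE = PAGE_SIZE << THREAD_ORDER relationship; `MODEL_PAGE_SIZE`, `thread_size()` and `frames_fit()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed x86_64 page size */

/* Kernel stack size in bytes for a given THREAD_ORDER. */
static size_t thread_size(unsigned int thread_order)
{
	return (size_t)MODEL_PAGE_SIZE << thread_order;
}

/*
 * Returns 1 if `depth` nested frames of `frame_bytes` each fit in a
 * stack of the given order, ignoring everything else on the stack.
 */
static int frames_fit(unsigned int thread_order, size_t frame_bytes,
		      size_t depth)
{
	return frame_bytes * depth <= thread_size(thread_order);
}
```

Under these assumptions THREAD_ORDER 1 gives an 8 KiB stack and THREAD_ORDER 3 gives 32 KiB, so a chain of six frames of about 1.5 KiB each would overflow the former and fit comfortably in the latter.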
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-09 23:30 ` Mike Travis @ 2008-07-10 0:04 ` Eric W. Biederman 0 siblings, 0 replies; 108+ messages in thread From: Eric W. Biederman @ 2008-07-10 0:04 UTC (permalink / raw) To: Mike Travis Cc: H. Peter Anvin, Jeremy Fitzhardinge, Christoph Lameter, Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, Jack Steiner Mike Travis <travis@sgi.com> writes: > What I meant was using early_printk in place of printk, which seems to stuff the > messages into the log buf until the serial console is setup fairly late in > start_kernel. > I did this by removing printk() and renaming early_printk() to be printk (and a > couple > other things like #define early_printk printk ... Last I looked after the magic early_printk setup. printk calls early_printk and stuff messages in the log buffer. It matters little though. As long as you get the print messages. Weird cases where you don't get into C code worry me much more. Once you get into C things are much easier to track. >> Is stack overflow the only problem you are seeing or are there still other > mysteries? > > I'm not entirely sure it's a stack overflow, the fault has a NULL dereference > and > then the stack overflow message. Ok. Interesting. >>> Only a few of these though I would think might get called early in >>> the boot, that might also be contributing to the stack overflow. >> >> Still the call chain depth shouldn't really be changing. So why should it >> matter? Ah. The high cpu count is growing cpumask_t so when you put >> it on the stack. That makes sense. So what stars out as a 4 byte >> variable on the stack in a normal setup winds up being a 1k variable >> with 4k cpus. > > Yes, it's definitely the three related: > > NR_CPUS Patch_Applied THREAD_ORDER Results > 256 NO 1 works (obviously ;-) > 256 YES 1 works > 4096 NO 1 works > 4096 YES 1 panics > 4096 YES 3 works (just happened to pick 3, > 2 probably will work as well.) 
> I've been testing NR_CPUS=4096 for quite a while and it's been very > reliable. It's just weird that this config fails with this new patch > applied. (default configs and some fairly normal distro configs also > work fine.) And with the zillion config straws we now have, spotting > the arbitrary needle is proving difficult. ;-) Right. Just please split your patch up. It would be good to see if simply changing the per cpu segment address to 0 is related to your problem. Or is it the other logic changes necessary to use the pda as a per cpu variable? I just noticed that we always allocate the pda in the per cpu section. > One reason I've been sticking with 4.2.4. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 21:39 ` Eric W. Biederman 2008-07-01 21:52 ` Jeremy Fitzhardinge @ 2008-07-02 2:01 ` H. Peter Anvin 2008-07-02 3:08 ` Eric W. Biederman 1 sibling, 1 reply; 108+ messages in thread From: H. Peter Anvin @ 2008-07-02 2:01 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, Mike Travis, Christoph Lameter, Linux Kernel Mailing List Eric W. Biederman wrote: > > For i386 since virtual address space is precious and because there were > concerns about putting code in __pa we actually relocate the kernel symbols > during load right after decompression. When we do relocations absolute > symbols are a killer. > Well, it means making it clear to the relocator if it should relocate those symbols or not. Since IIRC we're doing "all or nothing" relocation, the relative offsets are always the same, even between sections. -hpa ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-02 2:01 ` H. Peter Anvin @ 2008-07-02 3:08 ` Eric W. Biederman 0 siblings, 0 replies; 108+ messages in thread From: Eric W. Biederman @ 2008-07-02 3:08 UTC (permalink / raw) To: H. Peter Anvin Cc: Jeremy Fitzhardinge, Mike Travis, Christoph Lameter, Linux Kernel Mailing List "H. Peter Anvin" <hpa@zytor.com> writes: > Eric W. Biederman wrote: >> >> For i386 since virtual address space is precious and because there were >> concerns about putting code in __pa we actually relocate the kernel symbols >> during load right after decompression. When we do relocations absolute >> symbols are a killer. >> > > Well, it means making it clear to the relocator if it should relocate those > symbols or not. Since IIRC we're doing "all or nothing" relocation, the > relative offsets are always the same, even between sections. Yes the relative offsets stay the same. I don't remember if ld generates useable relocations for what it figures are absolute symbols. I remember the solution was that anything that we wanted to relocate we would make section relative, and anything else we would leave absolute, and that those were easy things to do. The nasty case is that occasionally ld has a bug where it turns section relative symbols into global symbols if there is a section without data. I don't recall us caring in those instances. While interesting, all of this is irrelevant until we start talking about unification between x86_32 and x86_64, as only x86_32 has this restriction. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 20:40 ` Eric W. Biederman 2008-07-01 21:10 ` Jeremy Fitzhardinge @ 2008-07-01 21:11 ` Andi Kleen 2008-07-01 21:42 ` Eric W. Biederman 1 sibling, 1 reply; 108+ messages in thread From: Andi Kleen @ 2008-07-01 21:11 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, H. Peter Anvin, Mike Travis, Christoph Lameter, Linux Kernel Mailing List ebiederm@xmission.com (Eric W. Biederman) writes: > > Has anyone investigated using the technique gcc uses for thread local storage? I investigated a long time ago (given when the binutils/gcc support was much more primitive) and my conclusion back then was that doing the same for kernel module (negative addresses) would need new relocation types. And the pain of a binutils change didn't seem worth it. -Andi ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 21:11 ` Andi Kleen @ 2008-07-01 21:42 ` Eric W. Biederman 0 siblings, 0 replies; 108+ messages in thread From: Eric W. Biederman @ 2008-07-01 21:42 UTC (permalink / raw) To: Andi Kleen Cc: Jeremy Fitzhardinge, H. Peter Anvin, Mike Travis, Christoph Lameter, Linux Kernel Mailing List Andi Kleen <andi@firstfloor.org> writes: > ebiederm@xmission.com (Eric W. Biederman) writes: >> >> Has anyone investigated using the technique gcc uses for thread local storage? > > I investigated a long time ago (given when the binutils/gcc support > was much more primitive) and my conclusion back then was that doing > the same for kernel module (negative addresses) would need > new relocation types. And the pain of a binutils change didn't > seem worth it. Thanks. That does seem to be the fly in the ointment of using the builtin linker support. The kernel lives at negative addresses which is necessary but weird. If the @tpoff relocation doesn't work for us we clearly can't use the support. I will have to look and see if usable relocation types are generated from @tpoff. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-07-01 16:56 ` H. Peter Anvin 2008-07-01 17:26 ` Jeremy Fitzhardinge @ 2008-07-01 18:41 ` Eric W. Biederman 1 sibling, 0 replies; 108+ messages in thread From: Eric W. Biederman @ 2008-07-01 18:41 UTC (permalink / raw) To: H. Peter Anvin Cc: Jeremy Fitzhardinge, Mike Travis, Christoph Lameter, Linux Kernel Mailing List "H. Peter Anvin" <hpa@zytor.com> writes: > Eric W. Biederman wrote: >>> >>> The zero-based PDA mechanism requires the introduction of a new ELF segment >>> based at vaddr 0 which is sufficiently unusual that it wouldn't surprise me > if >>> its triggering some toolchain bug. >> >> Agreed. Given the previous description my hunch is that the bug is occurring >> during objcopy. If vmlinux is good and the compressed kernel is bad. >> > > Actually, it's not all that unusual... it's pretty common in various restricted > environments. That being said, it's probably uncommon for *64-bit* code. It is a sensible thing to expect to work. By unusual I mean it isn't triggered by normal userspace code. In general I find that ld features if they aren't used in userspace and they aren't used in the kernel don't work reliably across versions. Eric ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area 2008-06-30 21:08 ` Jeremy Fitzhardinge 2008-07-01 8:40 ` Eric W. Biederman @ 2008-07-01 12:09 ` Mike Travis 1 sibling, 0 replies; 108+ messages in thread From: Mike Travis @ 2008-07-01 12:09 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Eric W. Biederman, H. Peter Anvin, Christoph Lameter, Linux Kernel Mailing List Jeremy Fitzhardinge wrote: > Eric W. Biederman wrote: >> Mike Travis <travis@sgi.com> writes: >> >> >>> H. Peter Anvin wrote: >>> >>>> Mike Travis wrote: >>>> >>>>> FYI, I did try this out and it caused the bootloader to scramble the >>>>> loaded data. The first corruption I found was the .x86cpuvendor.init >>>>> section contained all zeroes. >>>>> >>>>> >>>> Explain what you mean with "the bootloader" in this context. >>>> >>>> -hpa >>>> >>> After the code was loaded (the compressed code, it seems that my GRUB >>> doesn't support uncompressed loading), the above section contained >>> zeroes. I snapped it fairly early, around secondary_startup_64, and >>> then printed it in x86_64_start_kernel. >>> >>> The object file had the correct data (as displayed by objdump) so I'm >>> assuming that the bootloading process didn't load the section correctly. >>> >>> Below was the linker script I used: >>> >>> --- linux-2.6.tip.orig/include/asm-generic/vmlinux.lds.h >>> +++ linux-2.6.tip/include/asm-generic/vmlinux.lds.h >>> @@ -373,9 +373,13 @@ >>> >>> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU >>> #define PERCPU(align) \ >>> - . = ALIGN(align); \ >>> + .data.percpu.abs = .; \ >>> percpu : { } :percpu \ >>> - __per_cpu_load = .; \ >>> + .data.percpu.rel : AT(.data.percpu.abs - LOAD_OFFSET) { \ >>> + BYTE(0) \ >>> + . = ALIGN(align); \ >>> + __per_cpu_load = .; \ >>> + } \ >>> .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \ >>> *(.data.percpu.first) \ >>> *(.data.percpu.shared_aligned) \ >>> @@ -383,8 +387,8 @@ >>> *(.data.percpu.page_aligned) \ >>> ____per_cpu_size = .; \ >>> } \ >>> - . 
= __per_cpu_load + ____per_cpu_size; \ >>> - data : { } :data >>> + . = __per_cpu_load + ____per_cpu_size; >>> + >>> #else >>> #define PERCPU(align) \ >>> . = ALIGN(align); \ >>> >>> It showed all the correct address in the map and __per_cpu_load was a >>> relative symbol (which was the objective.) >>> >>> Btw, our simulator, which only loads uncompressed code, had the data >>> correct, >>> so it *may* only be a result of the code being compressed. >>> >> >> Weird. Grub doesn't get involved in the decompression the kernel does it >> all itself so we should be able to track where things go bad. >> >> Last I looked the compressed code was formed by essentially. >> objcopy vmlinux -O binary vmlinux.bin >> gzip vmlinux.bin >> And then we take on a magic header to the gzip compressed file. >> >> Are things only bad with the change above? > > No, the original crash being discussed was a GP fault in head_64.S as it > tries to initialize the kernel segments. The cause was that the > prototype GDT is all zero, even though it's an initialized variable, and > inspection of vmlinux shows that it has the right contents. But somehow > it's either 1) getting zeroed on load, or 2) is loaded to the wrong place. > > The zero-based PDA mechanism requires the introduction of a new ELF > segment based at vaddr 0 which is sufficiently unusual that it wouldn't > surprise me if its triggering some toolchain bug. > > Mike: what would happen if the PDA were based at 4k rather than 0? The > stack canary would still be at its small offset (0x20?), but it doesn't > need to be initialized. I'm not sure if doing so would fix anything, > however. 
> > J I don't know that the basing at 0 or 4k would matter. I'll post the patch in it's current form (as an RFC?) to show what was needed to initialize the pda and gdt page pointer. Thanks, Mike ^ permalink raw reply [flat|nested] 108+ messages in thread
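A minimal sketch, not part of the thread, of the address arithmetic that the quoted PERCPU() macro asks the linker to perform: the .data.percpu section is linked at address 0, its load address is __per_cpu_load minus LOAD_OFFSET, and the location counter resumes at __per_cpu_load + ____per_cpu_size. The names layout_percpu, struct percpu_layout and LOAD_OFFSET_MODEL are hypothetical, and the value chosen for LOAD_OFFSET_MODEL is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed kernel mapping offset, for illustration only. */
#define LOAD_OFFSET_MODEL UINT64_C(0xffffffff80000000)

/* Hypothetical summary of where the zero-based percpu section ends up. */
struct percpu_layout {
	uint64_t vma;      /* address the section is linked at (zero based) */
	uint64_t lma;      /* physical address the section is loaded at */
	uint64_t next_dot; /* location counter once the section is placed */
};

/*
 * Mirror the three expressions in the macro:
 *   .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { ... }
 *   . = __per_cpu_load + ____per_cpu_size;
 */
struct percpu_layout layout_percpu(uint64_t per_cpu_load, uint64_t per_cpu_size)
{
	struct percpu_layout l;

	/* The load symbol must lie inside the kernel mapping. */
	assert(per_cpu_load >= LOAD_OFFSET_MODEL);

	l.vma = 0;
	l.lma = per_cpu_load - LOAD_OFFSET_MODEL;
	l.next_dot = per_cpu_load + per_cpu_size;
	return l;
}
```

The point of the model is that the section has two unrelated addresses: code references it through the zero-based vma, while the loader only ever sees the lma. An error in either computation would leave the link map looking plausible while the data lands somewhere else.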
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-30 20:50 ` Eric W. Biederman
2008-06-30 21:08 ` Jeremy Fitzhardinge
@ 2008-07-01 11:49 ` Mike Travis
1 sibling, 0 replies; 108+ messages in thread
From: Mike Travis @ 2008-07-01 11:49 UTC (permalink / raw)
To: Eric W. Biederman
Cc: H. Peter Anvin, Jeremy Fitzhardinge, cl, Linux Kernel Mailing List

Eric W. Biederman wrote:
> Mike Travis <travis@sgi.com> writes:
>
>> H. Peter Anvin wrote:
>>> Mike Travis wrote:
>>>> FYI, I did try this out and it caused the bootloader to scramble the
>>>> loaded data. The first corruption I found was the .x86cpuvendor.init
>>>> section contained all zeroes.
>>>>
>>> Explain what you mean with "the bootloader" in this context.
>>>
>>> -hpa
>>
>> After the code was loaded (the compressed code, it seems that my GRUB
>> doesn't support uncompressed loading), the above section contained
>> zeroes. I snapped it fairly early, around secondary_startup_64, and
>> then printed it in x86_64_start_kernel.
>>
>> The object file had the correct data (as displayed by objdump) so I'm
>> assuming that the bootloading process didn't load the section correctly.
>>
>> Below was the linker script I used:
>>
>> --- linux-2.6.tip.orig/include/asm-generic/vmlinux.lds.h
>> +++ linux-2.6.tip/include/asm-generic/vmlinux.lds.h
>> @@ -373,9 +373,13 @@
>>
>>  #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
>>  #define PERCPU(align)                                           \
>> -        . = ALIGN(align);                                       \
>> +        .data.percpu.abs = .;                                   \
>>          percpu : { } :percpu                                    \
>> -        __per_cpu_load = .;                                     \
>> +        .data.percpu.rel : AT(.data.percpu.abs - LOAD_OFFSET) { \
>> +                BYTE(0)                                         \
>> +                . = ALIGN(align);                               \
>> +                __per_cpu_load = .;                             \
>> +        }                                                       \
>>          .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) {     \
>>                  *(.data.percpu.first)                           \
>>                  *(.data.percpu.shared_aligned)                  \
>> @@ -383,8 +387,8 @@
>>                  *(.data.percpu.page_aligned)                    \
>>                  ____per_cpu_size = .;                           \
>>          }                                                       \
>> -        . = __per_cpu_load + ____per_cpu_size;                  \
>> -        data : { } :data
>> +        . = __per_cpu_load + ____per_cpu_size;
>> +
>>  #else
>>  #define PERCPU(align)                                           \
>>          . = ALIGN(align);                                       \
>>
>> It showed all the correct address in the map and __per_cpu_load was a
>> relative symbol (which was the objective.)
>>
>> Btw, our simulator, which only loads uncompressed code, had the data correct,
>> so it *may* only be a result of the code being compressed.
>
> Weird. Grub doesn't get involved in the decompression the kernel does it
> all itself so we should be able to track where things go bad.
>
> Last I looked the compressed code was formed by essentially.
> objcopy vmlinux -O binary vmlinux.bin
> gzip vmlinux.bin
> And then we take on a magic header to the gzip compressed file.
>
> Are things only bad with the change above?
>
> Eric

Yes. The failure was "Unsupported CPU" (or some such) which clued me into
the vendor section.

I was able to get the zero-based variables working well for standard configs.
It's getting tripped up now by some of Ingo's random configs, in very
unusual places...

And once again, it only fails on real h/w, not on our simulator, so catching
the elusive bugger is tricky.

Thanks,
Mike

^ permalink raw reply [flat|nested] 108+ messages in thread
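A simplified model, not taken from the thread, of what the objcopy step quoted above does when it writes a raw binary: the output is a memory dump that starts at the lowest load address, every section is copied to its load address relative to that start, and gaps between sections are left as zeroes. struct section_model and flatten_sections() are hypothetical names; the real tool handles many cases this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one loadable section. */
struct section_model {
	unsigned long lma;         /* load address of the section */
	const unsigned char *data; /* section contents */
	size_t size;               /* length of the contents in bytes */
};

/*
 * Lay the sections out in a flat image by load address.
 * Returns the number of image bytes used, or 0 if there are no sections
 * or the image buffer is too small.
 */
size_t flatten_sections(const struct section_model *secs, size_t nsecs,
			unsigned char *image, size_t image_size)
{
	unsigned long base;
	size_t end = 0;
	size_t i;

	if (nsecs == 0)
		return 0;

	/* The image starts at the lowest load address. */
	base = secs[0].lma;
	for (i = 1; i < nsecs; i++)
		if (secs[i].lma < base)
			base = secs[i].lma;

	/* The image ends where the highest section ends. */
	for (i = 0; i < nsecs; i++) {
		size_t sec_end = (size_t)(secs[i].lma - base) + secs[i].size;

		if (sec_end > end)
			end = sec_end;
	}
	if (end > image_size)
		return 0;

	/* Gaps between sections stay zero. */
	memset(image, 0, end);
	for (i = 0; i < nsecs; i++)
		memcpy(image + (secs[i].lma - base), secs[i].data, secs[i].size);

	return end;
}
```

In this model a section whose load address is computed wrongly is still copied, just to a different place, and the spot where the kernel later looks for it reads back as zeroes. That would match the reported symptom, but it is only one possible explanation.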
* Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area
2008-06-30 17:07 ` Mike Travis
2008-06-30 17:18 ` H. Peter Anvin
@ 2008-06-30 17:43 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-30 17:43 UTC (permalink / raw)
To: Mike Travis
Cc: Eric W. Biederman, Christoph Lameter, Linux Kernel Mailing List,
H. Peter Anvin

Mike Travis wrote:
> Eric W. Biederman wrote:
>
>> Mike Travis <travis@sgi.com> writes:
>>
> ...
>
>>> Can we generate a new symbol which would account for LOAD_OFFSET?
>>>
>> Ouch. Absolute symbols indeed. On the 32bit kernel that may play havoc
>> with the relocatable kernel, although we have had similar absolute logic
>> for the last year. With __per_cpu_start and __per_cpu_end so it may
>> not be a problem.
>>
>> To initialize the percpu data you do want to talk to the virtual address
>> at __per_cpu_load. But it is absolute Ugh.
>>
>> It might be worth saying something like.
>> .data.percpu.start : AT(.data.percpu.dummy - LOAD_OFFSET) {
>> DATA(0)
>> . = ALIGN(align);
>> __per_cpu_load = . ;
>> }
>> To make __per_cpu_load a relative symbol. ld has a bad habit of taking
>> symbols out of empty sections and making them absolute. Which is why
>> I added the DATA(0).
>>
>> Still I don't think that would be the 64bit problem.
>>
>> Eric
>>
>
> FYI, I did try this out and it caused the bootloader to scramble the
> loaded data. The first corruption I found was the .x86cpuvendor.init
> section contained all zeroes.

Well, that's what appeared to be happening with the pre-initialized GDT
as well, so I'm not sure that's a new symptom.

J

^ permalink raw reply [flat|nested] 108+ messages in thread
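A toy illustration, not from the thread, of why the absolute versus relative distinction matters for a relocatable kernel: when the image is moved by some delta, section-relative symbols move with it, while absolute symbols keep their link-time value. struct symbol_model and relocate_symbols() are hypothetical; real relocation processing works on relocation entries rather than on a symbol list like this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified symbol record. */
struct symbol_model {
	const char *name;
	unsigned long value; /* link-time address */
	int is_absolute;     /* nonzero: not tied to any section */
};

/*
 * Apply a load-time displacement to every section-relative symbol.
 * Absolute symbols are deliberately left alone, which is exactly the
 * behaviour that makes an accidentally absolute __per_cpu_load stale.
 */
void relocate_symbols(struct symbol_model *syms, size_t nsyms,
		      unsigned long delta)
{
	size_t i;

	for (i = 0; i < nsyms; i++)
		if (!syms[i].is_absolute)
			syms[i].value += delta;
}
```

Under this model, a symbol that ld has quietly turned absolute would still point at the link-time address after the kernel has been moved, which is the hazard the suggestion of a non-empty dummy section is meant to avoid.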
* [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu(). 2008-06-04 0:30 [PATCH 0/4] percpu: Optimize percpu accesses Mike Travis ` (2 preceding siblings ...) 2008-06-04 0:30 ` [PATCH 3/4] x86_64: Fold pda into per cpu area Mike Travis @ 2008-06-04 0:30 ` Mike Travis 2008-06-09 13:03 ` Ingo Molnar 2008-06-04 10:18 ` [PATCH] x86: collapse the various size-dependent percpu accessors together Jeremy Fitzhardinge 4 siblings, 1 reply; 108+ messages in thread From: Mike Travis @ 2008-06-04 0:30 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet, Jeremy Fitzhardinge, linux-kernel [-- Attachment #1: zero_based_replace_pda_operations --] [-- Type: text/plain, Size: 14788 bytes --] * It is now possible to use percpu operations for pda access since the pda is in the percpu area. Drop the pda operations. Based on linux-2.6.tip Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> --- arch/x86/kernel/apic_64.c | 4 - arch/x86/kernel/cpu/mcheck/mce_amd_64.c | 2 arch/x86/kernel/cpu/mcheck/mce_intel_64.c | 2 arch/x86/kernel/nmi.c | 5 + arch/x86/kernel/process_64.c | 12 ++-- arch/x86/kernel/smp.c | 4 - arch/x86/kernel/time_64.c | 2 arch/x86/kernel/tlb_64.c | 12 ++-- arch/x86/kernel/traps_64.c | 2 arch/x86/kernel/x8664_ksyms_64.c | 2 arch/x86/xen/smp.c | 2 include/asm-x86/current.h | 3 - include/asm-x86/hardirq_64.h | 6 +- include/asm-x86/mmu_context_64.h | 12 ++-- include/asm-x86/pda.h | 80 ++---------------------------- include/asm-x86/smp.h | 2 include/asm-x86/stackprotector.h | 2 include/asm-x86/thread_info.h | 3 - include/asm-x86/topology.h | 2 19 files changed, 47 insertions(+), 112 deletions(-) --- linux-2.6.tip.orig/arch/x86/kernel/apic_64.c +++ linux-2.6.tip/arch/x86/kernel/apic_64.c @@ -481,7 +481,7 @@ static void local_apic_timer_interrupt(v /* * the NMI deadlock-detector uses this. 
*/ - add_pda(apic_timer_irqs, 1); + x86_inc_percpu(pda.apic_timer_irqs); evt->event_handler(evt); } @@ -986,7 +986,7 @@ asmlinkage void smp_spurious_interrupt(v if (v & (1 << (SPURIOUS_APIC_VECTOR & 0x1f))) ack_APIC_irq(); - add_pda(irq_spurious_count, 1); + x86_inc_percpu(pda.irq_spurious_count); irq_exit(); } --- linux-2.6.tip.orig/arch/x86/kernel/cpu/mcheck/mce_amd_64.c +++ linux-2.6.tip/arch/x86/kernel/cpu/mcheck/mce_amd_64.c @@ -237,7 +237,7 @@ asmlinkage void mce_threshold_interrupt( } } out: - add_pda(irq_threshold_count, 1); + x86_inc_percpu(pda.irq_threshold_count); irq_exit(); } --- linux-2.6.tip.orig/arch/x86/kernel/cpu/mcheck/mce_intel_64.c +++ linux-2.6.tip/arch/x86/kernel/cpu/mcheck/mce_intel_64.c @@ -26,7 +26,7 @@ asmlinkage void smp_thermal_interrupt(vo if (therm_throt_process(msr_val & 1)) mce_log_therm_throt_event(smp_processor_id(), msr_val); - add_pda(irq_thermal_count, 1); + x86_inc_percpu(pda.irq_thermal_count); irq_exit(); } --- linux-2.6.tip.orig/arch/x86/kernel/nmi.c +++ linux-2.6.tip/arch/x86/kernel/nmi.c @@ -56,7 +56,7 @@ static int endflag __initdata = 0; static inline unsigned int get_nmi_count(int cpu) { #ifdef CONFIG_X86_64 - return cpu_pda(cpu)->__nmi_count; + return x86_read_percpu(pda.__nmi_count); #else return nmi_count(cpu); #endif @@ -77,7 +77,8 @@ static inline int mce_in_progress(void) static inline unsigned int get_timer_irqs(int cpu) { #ifdef CONFIG_X86_64 - return read_pda(apic_timer_irqs) + read_pda(irq0_irqs); + return x86_read_percpu(pda.apic_timer_irqs) + + x86_read_percpu(pda.irq0_irqs); #else return per_cpu(irq_stat, cpu).apic_timer_irqs + per_cpu(irq_stat, cpu).irq0_irqs; --- linux-2.6.tip.orig/arch/x86/kernel/process_64.c +++ linux-2.6.tip/arch/x86/kernel/process_64.c @@ -75,7 +75,7 @@ void idle_notifier_register(struct notif void enter_idle(void) { - write_pda(isidle, 1); + x86_write_percpu(pda.isidle, 1); atomic_notifier_call_chain(&idle_notifier, IDLE_START, NULL); } @@ -438,7 +438,7 @@ start_thread(struct 
pt_regs *regs, unsig load_gs_index(0); regs->ip = new_ip; regs->sp = new_sp; - write_pda(oldrsp, new_sp); + x86_write_percpu(pda.oldrsp, new_sp); regs->cs = __USER_CS; regs->ss = __USER_DS; regs->flags = 0x200; @@ -674,11 +674,11 @@ __switch_to(struct task_struct *prev_p, /* * Switch the PDA and FPU contexts. */ - prev->usersp = read_pda(oldrsp); - write_pda(oldrsp, next->usersp); - write_pda(pcurrent, next_p); + prev->usersp = x86_read_percpu(pda.oldrsp); + x86_write_percpu(pda.oldrsp, next->usersp); + x86_write_percpu(pda.pcurrent, next_p); - write_pda(kernelstack, + x86_write_percpu(pda.kernelstack, (unsigned long)task_stack_page(next_p) + THREAD_SIZE - PDA_STACKOFFSET); #ifdef CONFIG_CC_STACKPROTECTOR /* --- linux-2.6.tip.orig/arch/x86/kernel/smp.c +++ linux-2.6.tip/arch/x86/kernel/smp.c @@ -295,7 +295,7 @@ void smp_reschedule_interrupt(struct pt_ #ifdef CONFIG_X86_32 __get_cpu_var(irq_stat).irq_resched_count++; #else - add_pda(irq_resched_count, 1); + x86_inc_percpu(pda.irq_resched_count); #endif } @@ -320,7 +320,7 @@ void smp_call_function_interrupt(struct #ifdef CONFIG_X86_32 __get_cpu_var(irq_stat).irq_call_count++; #else - add_pda(irq_call_count, 1); + x86_inc_percpu(pda.irq_call_count); #endif irq_exit(); --- linux-2.6.tip.orig/arch/x86/kernel/time_64.c +++ linux-2.6.tip/arch/x86/kernel/time_64.c @@ -46,7 +46,7 @@ EXPORT_SYMBOL(profile_pc); static irqreturn_t timer_event_interrupt(int irq, void *dev_id) { - add_pda(irq0_irqs, 1); + x86_inc_percpu(pda.irq0_irqs); global_clock_event->event_handler(global_clock_event); --- linux-2.6.tip.orig/arch/x86/kernel/tlb_64.c +++ linux-2.6.tip/arch/x86/kernel/tlb_64.c @@ -60,9 +60,9 @@ static DEFINE_PER_CPU(union smp_flush_st */ void leave_mm(int cpu) { - if (read_pda(mmu_state) == TLBSTATE_OK) + if (x86_read_percpu(pda.mmu_state) == TLBSTATE_OK) BUG(); - cpu_clear(cpu, read_pda(active_mm)->cpu_vm_mask); + cpu_clear(cpu, x86_read_percpu(pda.active_mm)->cpu_vm_mask); load_cr3(swapper_pg_dir); } 
EXPORT_SYMBOL_GPL(leave_mm); @@ -140,8 +140,8 @@ asmlinkage void smp_invalidate_interrupt * BUG(); */ - if (f->flush_mm == read_pda(active_mm)) { - if (read_pda(mmu_state) == TLBSTATE_OK) { + if (f->flush_mm == x86_read_percpu(pda.active_mm)) { + if (x86_read_percpu(pda.mmu_state) == TLBSTATE_OK) { if (f->flush_va == TLB_FLUSH_ALL) local_flush_tlb(); else @@ -152,7 +152,7 @@ asmlinkage void smp_invalidate_interrupt out: ack_APIC_irq(); cpu_clear(cpu, f->flush_cpumask); - add_pda(irq_tlb_count, 1); + x86_inc_percpu(pda.irq_tlb_count); } void native_flush_tlb_others(const cpumask_t *cpumaskp, struct mm_struct *mm, @@ -264,7 +264,7 @@ static void do_flush_tlb_all(void *info) unsigned long cpu = smp_processor_id(); __flush_tlb_all(); - if (read_pda(mmu_state) == TLBSTATE_LAZY) + if (x86_read_percpu(pda.mmu_state) == TLBSTATE_LAZY) leave_mm(cpu); } --- linux-2.6.tip.orig/arch/x86/kernel/traps_64.c +++ linux-2.6.tip/arch/x86/kernel/traps_64.c @@ -878,7 +878,7 @@ asmlinkage notrace __kprobes void do_nmi(struct pt_regs *regs, long error_code) { nmi_enter(); - add_pda(__nmi_count, 1); + x86_inc_percpu(pda.__nmi_count); if (!ignore_nmis) default_do_nmi(regs); nmi_exit(); --- linux-2.6.tip.orig/arch/x86/kernel/x8664_ksyms_64.c +++ linux-2.6.tip/arch/x86/kernel/x8664_ksyms_64.c @@ -59,8 +59,6 @@ EXPORT_SYMBOL(empty_zero_page); EXPORT_SYMBOL(init_level4_pgt); EXPORT_SYMBOL(load_gs_index); -EXPORT_SYMBOL(_proxy_pda); - #ifdef CONFIG_PARAVIRT /* Virtualized guests may want to use it */ EXPORT_SYMBOL_GPL(cpu_gdt_descr); --- linux-2.6.tip.orig/arch/x86/xen/smp.c +++ linux-2.6.tip/arch/x86/xen/smp.c @@ -68,7 +68,7 @@ static irqreturn_t xen_reschedule_interr #ifdef CONFIG_X86_32 __get_cpu_var(irq_stat).irq_resched_count++; #else - add_pda(irq_resched_count, 1); + x86_inc_percpu(pda.irq_resched_count); #endif return IRQ_HANDLED; --- linux-2.6.tip.orig/include/asm-x86/current.h +++ linux-2.6.tip/include/asm-x86/current.h @@ -17,12 +17,13 @@ static __always_inline struct task_struc 
#ifndef __ASSEMBLY__ #include <asm/pda.h> +#include <asm/percpu.h> struct task_struct; static __always_inline struct task_struct *get_current(void) { - return read_pda(pcurrent); + return x86_read_percpu(pda.pcurrent); } #else /* __ASSEMBLY__ */ --- linux-2.6.tip.orig/include/asm-x86/hardirq_64.h +++ linux-2.6.tip/include/asm-x86/hardirq_64.h @@ -11,12 +11,12 @@ #define __ARCH_IRQ_STAT 1 -#define local_softirq_pending() read_pda(__softirq_pending) +#define local_softirq_pending() x86_read_percpu(pda.__softirq_pending) #define __ARCH_SET_SOFTIRQ_PENDING 1 -#define set_softirq_pending(x) write_pda(__softirq_pending, (x)) -#define or_softirq_pending(x) or_pda(__softirq_pending, (x)) +#define set_softirq_pending(x) x86_write_percpu(pda.__softirq_pending, (x)) +#define or_softirq_pending(x) x86_or_percpu(pda.__softirq_pending, (x)) extern void ack_bad_irq(unsigned int irq); --- linux-2.6.tip.orig/include/asm-x86/mmu_context_64.h +++ linux-2.6.tip/include/asm-x86/mmu_context_64.h @@ -20,8 +20,8 @@ void destroy_context(struct mm_struct *m static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) { #ifdef CONFIG_SMP - if (read_pda(mmu_state) == TLBSTATE_OK) - write_pda(mmu_state, TLBSTATE_LAZY); + if (x86_read_percpu(pda.mmu_state) == TLBSTATE_OK) + x86_write_percpu(pda.mmu_state, TLBSTATE_LAZY); #endif } @@ -33,8 +33,8 @@ static inline void switch_mm(struct mm_s /* stop flush ipis for the previous mm */ cpu_clear(cpu, prev->cpu_vm_mask); #ifdef CONFIG_SMP - write_pda(mmu_state, TLBSTATE_OK); - write_pda(active_mm, next); + x86_write_percpu(pda.mmu_state, TLBSTATE_OK); + x86_write_percpu(pda.active_mm, next); #endif cpu_set(cpu, next->cpu_vm_mask); load_cr3(next->pgd); @@ -44,8 +44,8 @@ static inline void switch_mm(struct mm_s } #ifdef CONFIG_SMP else { - write_pda(mmu_state, TLBSTATE_OK); - if (read_pda(active_mm) != next) + x86_write_percpu(pda.mmu_state, TLBSTATE_OK); + if (x86_read_percpu(pda.active_mm) != next) BUG(); if (!cpu_test_and_set(cpu, 
next->cpu_vm_mask)) { /* We were in lazy tlb mode and leave_mm disabled --- linux-2.6.tip.orig/include/asm-x86/pda.h +++ linux-2.6.tip/include/asm-x86/pda.h @@ -21,7 +21,7 @@ struct x8664_pda { offset 40!!! */ char *irqstackptr; short nodenumber; /* number of current node (32k max) */ - short in_bootmem; /* pda lives in bootmem */ + short unused1; /* unused */ unsigned int __softirq_pending; unsigned int __nmi_count; /* number of NMI on this CPUs */ short mmu_state; @@ -37,17 +37,8 @@ struct x8664_pda { unsigned irq_spurious_count; } ____cacheline_aligned_in_smp; -extern struct x8664_pda **_cpu_pda; extern void pda_init(int); -#define cpu_pda(i) (_cpu_pda[i]) - -/* - * There is no fast way to get the base address of the PDA, all the accesses - * have to mention %fs/%gs. So it needs to be done this Torvaldian way. - */ -extern void __bad_pda_field(void) __attribute__((noreturn)); - /* * proxy_pda doesn't actually exist, but tell gcc it is accessed for * all PDA accesses so it gets read/write dependencies right. 
@@ -56,69 +47,11 @@ extern struct x8664_pda _proxy_pda; #define pda_offset(field) offsetof(struct x8664_pda, field) -#define pda_to_op(op, field, val) \ -do { \ - typedef typeof(_proxy_pda.field) T__; \ - if (0) { T__ tmp__; tmp__ = (val); } /* type checking */ \ - switch (sizeof(_proxy_pda.field)) { \ - case 2: \ - asm(op "w %1,%%gs:%c2" : \ - "+m" (_proxy_pda.field) : \ - "ri" ((T__)val), \ - "i"(pda_offset(field))); \ - break; \ - case 4: \ - asm(op "l %1,%%gs:%c2" : \ - "+m" (_proxy_pda.field) : \ - "ri" ((T__)val), \ - "i" (pda_offset(field))); \ - break; \ - case 8: \ - asm(op "q %1,%%gs:%c2": \ - "+m" (_proxy_pda.field) : \ - "ri" ((T__)val), \ - "i"(pda_offset(field))); \ - break; \ - default: \ - __bad_pda_field(); \ - } \ -} while (0) - -#define pda_from_op(op, field) \ -({ \ - typeof(_proxy_pda.field) ret__; \ - switch (sizeof(_proxy_pda.field)) { \ - case 2: \ - asm(op "w %%gs:%c1,%0" : \ - "=r" (ret__) : \ - "i" (pda_offset(field)), \ - "m" (_proxy_pda.field)); \ - break; \ - case 4: \ - asm(op "l %%gs:%c1,%0": \ - "=r" (ret__): \ - "i" (pda_offset(field)), \ - "m" (_proxy_pda.field)); \ - break; \ - case 8: \ - asm(op "q %%gs:%c1,%0": \ - "=r" (ret__) : \ - "i" (pda_offset(field)), \ - "m" (_proxy_pda.field)); \ - break; \ - default: \ - __bad_pda_field(); \ - } \ - ret__; \ -}) - -#define read_pda(field) pda_from_op("mov", field) -#define write_pda(field, val) pda_to_op("mov", field, val) -#define add_pda(field, val) pda_to_op("add", field, val) -#define sub_pda(field, val) pda_to_op("sub", field, val) -#define or_pda(field, val) pda_to_op("or", field, val) - -/* This is not atomic against other CPUs -- CPU preemption needs to be off */ +/* + * This is not atomic against other CPUs -- CPU preemption needs to be off + * NOTE: This relies on the fact that the cpu_pda is the *first* field in + * the per cpu area. Move it and you'll need to change this. 
+ */ #define test_and_clear_bit_pda(bit, field) \ ({ \ int old__; \ @@ -128,6 +61,7 @@ do { \ old__; \ }) + #endif #define PDA_STACKOFFSET (5*8) --- linux-2.6.tip.orig/include/asm-x86/smp.h +++ linux-2.6.tip/include/asm-x86/smp.h @@ -134,7 +134,7 @@ DECLARE_PER_CPU(int, cpu_number); extern int safe_smp_processor_id(void); #elif defined(CONFIG_X86_64_SMP) -#define raw_smp_processor_id() read_pda(cpunumber) +#define raw_smp_processor_id() x86_read_percpu(pda.cpunumber) #define stack_smp_processor_id() \ ({ \ --- linux-2.6.tip.orig/include/asm-x86/stackprotector.h +++ linux-2.6.tip/include/asm-x86/stackprotector.h @@ -32,7 +32,7 @@ static __always_inline void boot_init_st canary += tsc + (tsc << 32UL); current->stack_canary = canary; - write_pda(stack_canary, canary); + x86_write_percpu(pda.stack_canary, canary); } #endif --- linux-2.6.tip.orig/include/asm-x86/thread_info.h +++ linux-2.6.tip/include/asm-x86/thread_info.h @@ -200,7 +200,8 @@ static inline struct thread_info *curren static inline struct thread_info *current_thread_info(void) { struct thread_info *ti; - ti = (void *)(read_pda(kernelstack) + PDA_STACKOFFSET - THREAD_SIZE); + ti = (void *)(x86_read_percpu(pda.kernelstack) + + PDA_STACKOFFSET - THREAD_SIZE); return ti; } --- linux-2.6.tip.orig/include/asm-x86/topology.h +++ linux-2.6.tip/include/asm-x86/topology.h @@ -72,7 +72,7 @@ extern cpumask_t *node_to_cpumask_map; DECLARE_EARLY_PER_CPU(int, x86_cpu_to_node_map); /* Returns the number of the current Node. */ -#define numa_node_id() read_pda(nodenumber) +#define numa_node_id() x86_read_percpu(pda.nodenumber) #ifdef CONFIG_DEBUG_PER_CPU_MAPS extern int cpu_to_node(int cpu); -- ^ permalink raw reply [flat|nested] 108+ messages in thread
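A userspace sketch, not part of the patch, of the idea the conversion relies on: once the pda sits at the start of a zero-based per cpu area, a field is named by its offset alone and the running CPU's base supplies the rest. Everything here is a hypothetical model (struct pda_model, the model_*_percpu helpers, current_cpu_model standing in for the %gs base); the real x86_read_percpu(), x86_write_percpu() and x86_inc_percpu() are segment-prefixed single instructions, not function calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS_MODEL    4   /* hypothetical number of modelled CPUs */
#define PERCPU_AREA_SIZE 256 /* hypothetical size of one per cpu area */

/* Hypothetical stand-in for struct x8664_pda; only two fields modelled. */
struct pda_model {
	unsigned int irq0_irqs;
	unsigned int apic_timer_irqs;
};

/* The pda is the first object in the area, so its offsets are zero based. */
#define PDA_OFFSET(field) offsetof(struct pda_model, field)

/* One private copy of the per cpu area for each modelled CPU. */
static unsigned char percpu_area[NR_CPUS_MODEL][PERCPU_AREA_SIZE];

/* Stand-in for the %gs base: the CPU the caller is "running" on. */
int current_cpu_model;

/* Read a field from a named CPU's area (the cpu_pda(cpu)->field case). */
unsigned int model_read_percpu_of(int cpu, size_t offset)
{
	unsigned int val;

	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	assert(offset + sizeof(val) <= PERCPU_AREA_SIZE);
	memcpy(&val, &percpu_area[cpu][offset], sizeof(val));
	return val;
}

/* Write a field in a named CPU's area. */
void model_write_percpu_of(int cpu, size_t offset, unsigned int val)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	assert(offset + sizeof(val) <= PERCPU_AREA_SIZE);
	memcpy(&percpu_area[cpu][offset], &val, sizeof(val));
}

/* Current-CPU accessors: the base is implicit, only the offset is given. */
unsigned int model_read_percpu(size_t offset)
{
	return model_read_percpu_of(current_cpu_model, offset);
}

void model_write_percpu(size_t offset, unsigned int val)
{
	model_write_percpu_of(current_cpu_model, offset, val);
}

void model_inc_percpu(size_t offset)
{
	model_write_percpu(offset, model_read_percpu(offset) + 1);
}
```

The model keeps the implicit-base accessors separate from the ones that name a CPU. An accessor that goes through the current CPU's base is not a drop-in replacement for a lookup that takes a cpu argument, so any caller that passes a CPU number needs the explicit form.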
* Re: [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu(). 2008-06-04 0:30 ` [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu() Mike Travis @ 2008-06-09 13:03 ` Ingo Molnar 2008-06-09 16:08 ` Mike Travis 2008-06-09 17:36 ` Mike Travis 0 siblings, 2 replies; 108+ messages in thread From: Ingo Molnar @ 2008-06-09 13:03 UTC (permalink / raw) To: Mike Travis Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet, Jeremy Fitzhardinge, linux-kernel [-- Attachment #1: Type: text/plain, Size: 233 bytes --] * Mike Travis <travis@sgi.com> wrote: > * It is now possible to use percpu operations for pda access > since the pda is in the percpu area. Drop the pda operations. FYI, this one didnt build with the attached config. Ingo [-- Attachment #2: config --] [-- Type: text/plain, Size: 32775 bytes --] # # Automatically generated make config: don't edit # Linux kernel version: 2.6.26-rc5 # Mon Jun 9 14:59:39 2008 # CONFIG_64BIT=y # CONFIG_X86_32 is not set CONFIG_X86_64=y CONFIG_X86=y CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig" # CONFIG_GENERIC_LOCKBREAK is not set CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y CONFIG_FAST_CMPXCHG_LOCAL=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_HWEIGHT=y # CONFIG_GENERIC_GPIO is not set CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_RWSEM_GENERIC_SPINLOCK=y # CONFIG_RWSEM_XCHGADD_ALGORITHM is not set # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_HAVE_CPUMASK_OF_CPU_MAP=y CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y 
CONFIG_ZONE_DMA32=y CONFIG_ARCH_POPULATES_NODE_MAP=y CONFIG_AUDIT_ARCH=y CONFIG_ARCH_SUPPORTS_AOUT=y CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_X86_SMP=y CONFIG_X86_64_SMP=y CONFIG_X86_HT=y CONFIG_X86_BIOS_REBOOT=y CONFIG_X86_TRAMPOLINE=y # CONFIG_KTIME_SCALAR is not set # CONFIG_BOOTPARAM_SUPPORT_WANTED is not set CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_BOOT_ALLOWED3=y CONFIG_BROKEN_BOOT_ALLOWED2=y CONFIG_BROKEN_BOOT_ALLOWED=y CONFIG_BROKEN_BOOT=y CONFIG_BROKEN_BOOT_EUROPE=y CONFIG_BROKEN_BOOT_TITAN=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set # CONFIG_SWAP is not set # CONFIG_SYSVIPC is not set # CONFIG_POSIX_MQUEUE is not set CONFIG_BSD_PROCESS_ACCT=y # CONFIG_BSD_PROCESS_ACCT_V3 is not set # CONFIG_TASKSTATS is not set # CONFIG_AUDIT is not set # CONFIG_IKCONFIG is not set CONFIG_LOG_BUF_SHIFT=20 CONFIG_CGROUPS=y CONFIG_CGROUP_DEBUG=y # CONFIG_CGROUP_NS is not set CONFIG_CGROUP_DEVICE=y # CONFIG_CPUSETS is not set CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y # CONFIG_GROUP_SCHED is not set # CONFIG_CGROUP_CPUACCT is not set CONFIG_RESOURCE_COUNTERS=y # CONFIG_CGROUP_MEM_RES_CTLR is not set CONFIG_RELAY=y # CONFIG_NAMESPACES is not set # CONFIG_BLK_DEV_INITRD is not set # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_EMBEDDED=y # CONFIG_UID16 is not set # CONFIG_SYSCTL_SYSCALL is not set CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set # CONFIG_HOTPLUG is not set # CONFIG_PRINTK is not set # CONFIG_BUG is not set CONFIG_ELF_CORE=y CONFIG_PCSPKR_PLATFORM=y CONFIG_COMPAT_BRK=y CONFIG_BASE_FULL=y # CONFIG_FUTEX is not set CONFIG_ANON_INODES=y # CONFIG_EPOLL is not set CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLAB is not set CONFIG_SLUB=y # 
CONFIG_SLOB is not set # CONFIG_PROFILING is not set CONFIG_MARKERS=y CONFIG_HAVE_OPROFILE=y CONFIG_HAVE_KPROBES=y CONFIG_HAVE_KRETPROBES=y # CONFIG_HAVE_DMA_ATTRS is not set CONFIG_HAVE_IMMEDIATE=y # CONFIG_IMMEDIATE is not set # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 # CONFIG_MODULES is not set CONFIG_BLOCK=y # CONFIG_BLK_DEV_BSG is not set CONFIG_BLOCK_COMPAT=y # # IO Schedulers # CONFIG_IOSCHED_NOOP=y # CONFIG_IOSCHED_AS is not set # CONFIG_IOSCHED_DEADLINE is not set CONFIG_IOSCHED_CFQ=y # CONFIG_DEFAULT_AS is not set # CONFIG_DEFAULT_DEADLINE is not set CONFIG_DEFAULT_CFQ=y # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="cfq" CONFIG_CLASSIC_RCU=y # # Processor type and features # CONFIG_TICK_ONESHOT=y # CONFIG_NO_HZ is not set CONFIG_HIGH_RES_TIMERS=y CONFIG_GENERIC_CLOCKEVENTS_BUILD=y CONFIG_SMP_SUPPORT=y CONFIG_UP_WANTED_1=y # CONFIG_UP_WANTED_2 is not set CONFIG_SMP=y CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_RDC321X is not set # CONFIG_X86_VSMP is not set CONFIG_PARAVIRT_GUEST=y # CONFIG_KVM_CLOCK is not set # CONFIG_KVM_GUEST is not set CONFIG_PARAVIRT=y # CONFIG_MEMTEST is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUMM is not set # CONFIG_MPENTIUM4 is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_MVIAC7 is not set # CONFIG_MPSC is not set CONFIG_MCORE2=y # CONFIG_GENERIC_CPU is not set CONFIG_X86_CPU=y 
CONFIG_X86_L1_CACHE_BYTES=64 CONFIG_X86_INTERNODE_CACHE_BYTES=64 CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=6 CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_P6_NOP=y CONFIG_X86_TSC=y CONFIG_X86_CMPXCHG64=y CONFIG_X86_CMOV=y CONFIG_X86_MINIMUM_CPU_FAMILY=64 CONFIG_X86_DEBUGCTLMSR=y # CONFIG_X86_DS is not set CONFIG_HPET_TIMER=y # CONFIG_DMI is not set CONFIG_GART_IOMMU=y # CONFIG_CALGARY_IOMMU is not set CONFIG_SWIOTLB=y CONFIG_IOMMU_HELPER=y # CONFIG_MAXSMP is not set CONFIG_NR_CPUS=8 # CONFIG_SCHED_SMT is not set # CONFIG_SCHED_MC is not set CONFIG_PREEMPT_NONE=y # CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_MCE=y CONFIG_X86_MCE_INTEL=y CONFIG_X86_MCE_AMD=y CONFIG_I8K=y CONFIG_MICROCODE=y CONFIG_MICROCODE_OLD_INTERFACE=y CONFIG_X86_MSR=y # CONFIG_X86_CPUID is not set # CONFIG_NUMA is not set CONFIG_ARCH_SPARSEMEM_DEFAULT=y CONFIG_ARCH_SPARSEMEM_ENABLE=y CONFIG_ARCH_SELECT_MEMORY_MODEL=y CONFIG_ILLEGAL_POINTER_VALUE=0xffffc10000000000 CONFIG_SELECT_MEMORY_MODEL=y # CONFIG_FLATMEM_MANUAL is not set # CONFIG_DISCONTIGMEM_MANUAL is not set CONFIG_SPARSEMEM_MANUAL=y CONFIG_SPARSEMEM=y CONFIG_HAVE_MEMORY_PRESENT=y # CONFIG_SPARSEMEM_STATIC is not set CONFIG_SPARSEMEM_EXTREME=y CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y # CONFIG_SPARSEMEM_VMEMMAP is not set CONFIG_PAGEFLAGS_EXTENDED=y CONFIG_SPLIT_PTLOCK_CPUS=4 CONFIG_RESOURCES_64BIT=y CONFIG_ZONE_DMA_FLAG=1 CONFIG_BOUNCE=y CONFIG_VIRT_TO_BUS=y # CONFIG_MTRR is not set # CONFIG_CC_STACKPROTECTOR is not set # CONFIG_HZ_100 is not set # CONFIG_HZ_250 is not set CONFIG_HZ_300=y # CONFIG_HZ_1000 is not set CONFIG_HZ=300 CONFIG_SCHED_HRTICK=y CONFIG_KEXEC=y # CONFIG_CRASH_DUMP is not set CONFIG_PHYSICAL_START=0x200000 CONFIG_RELOCATABLE=y CONFIG_PHYSICAL_ALIGN=0x200000 # CONFIG_COMPAT_VDSO is not set CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y # # Power management options # # CONFIG_PM is not set # # CPU Frequency 
scaling # CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y # CONFIG_CPU_FREQ_DEBUG is not set CONFIG_CPU_FREQ_STAT=y # CONFIG_CPU_FREQ_STAT_DETAILS is not set CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y # CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set # CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set # CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set # CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set CONFIG_CPU_FREQ_GOV_PERFORMANCE=y CONFIG_CPU_FREQ_GOV_POWERSAVE=y CONFIG_CPU_FREQ_GOV_USERSPACE=y # CONFIG_CPU_FREQ_GOV_ONDEMAND is not set # CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set # # CPUFreq processor drivers # CONFIG_X86_POWERNOW_K8=y # CONFIG_X86_P4_CLOCKMOD is not set # # shared options # # CONFIG_X86_SPEEDSTEP_LIB is not set # CONFIG_CPU_IDLE is not set # # Bus options (PCI etc.) # CONFIG_PCI=y CONFIG_PCI_DIRECT=y CONFIG_PCI_DOMAINS=y CONFIG_PCIEPORTBUS=y # CONFIG_PCIEAER is not set # CONFIG_PCIEASPM is not set CONFIG_ARCH_SUPPORTS_MSI=y # CONFIG_PCI_MSI is not set # CONFIG_PCI_LEGACY is not set # CONFIG_PCI_DEBUG is not set # CONFIG_HT_IRQ is not set CONFIG_ISA_DMA_API=y CONFIG_K8_NB=y # # Executable file formats / Emulations # CONFIG_BINFMT_ELF=y CONFIG_COMPAT_BINFMT_ELF=y # CONFIG_BINFMT_MISC is not set CONFIG_IA32_EMULATION=y CONFIG_IA32_AOUT=y CONFIG_COMPAT=y CONFIG_COMPAT_FOR_U64_ALIGNMENT=y # # Networking # CONFIG_NET=y # # Networking options # CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_UNIX=y CONFIG_XFRM=y # CONFIG_XFRM_SUB_POLICY is not set CONFIG_XFRM_MIGRATE=y CONFIG_NET_KEY=y # CONFIG_NET_KEY_MIGRATE is not set # CONFIG_INET is not set CONFIG_NETWORK_SECMARK=y # CONFIG_NETFILTER is not set CONFIG_ATM=y CONFIG_ATM_LANE=y # CONFIG_BRIDGE is not set # CONFIG_VLAN_8021Q is not set CONFIG_DECNET=y # CONFIG_DECNET_ROUTER is not set CONFIG_LLC=y CONFIG_LLC2=y # CONFIG_IPX is not set CONFIG_ATALK=y # CONFIG_DEV_APPLETALK is not set CONFIG_X25=y # CONFIG_LAPB is not set CONFIG_WAN_ROUTER=y # CONFIG_NET_SCHED is not set # # Network testing # CONFIG_HAMRADIO=y 
# # Packet Radio protocols # # CONFIG_AX25 is not set CONFIG_CAN=y CONFIG_CAN_RAW=y CONFIG_CAN_BCM=y # # CAN Device Drivers # # CONFIG_CAN_VCAN is not set CONFIG_CAN_DEBUG_DEVICES=y CONFIG_IRDA=y # # IrDA protocols # CONFIG_IRLAN=y # CONFIG_IRCOMM is not set # CONFIG_IRDA_ULTRA is not set # # IrDA options # CONFIG_IRDA_CACHE_LAST_LSAP=y CONFIG_IRDA_FAST_RR=y # CONFIG_IRDA_DEBUG is not set # # Infrared-port device drivers # # # SIR device drivers # # CONFIG_IRTTY_SIR is not set # # Dongle support # # # FIR device drivers # CONFIG_NSC_FIR=y # CONFIG_WINBOND_FIR is not set CONFIG_SMC_IRCC_FIR=y CONFIG_ALI_FIR=y # CONFIG_VLSI_FIR is not set CONFIG_VIA_FIR=y CONFIG_BT=y # CONFIG_BT_L2CAP is not set # CONFIG_BT_SCO is not set # # Bluetooth device drivers # # CONFIG_BT_HCIUART is not set # CONFIG_BT_HCIVHCI is not set # # Wireless # # CONFIG_CFG80211 is not set CONFIG_WIRELESS_EXT=y # CONFIG_MAC80211 is not set CONFIG_IEEE80211=y # CONFIG_IEEE80211_DEBUG is not set CONFIG_IEEE80211_CRYPT_WEP=y # CONFIG_IEEE80211_CRYPT_CCMP is not set CONFIG_IEEE80211_CRYPT_TKIP=y CONFIG_RFKILL=y CONFIG_RFKILL_INPUT=y CONFIG_NET_9P=y # CONFIG_NET_9P_DEBUG is not set # # Device Drivers # # # Generic Driver Options # CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=y # CONFIG_DEBUG_DRIVER is not set # CONFIG_DEBUG_DEVRES is not set # CONFIG_SYS_HYPERVISOR is not set # CONFIG_CONNECTOR is not set CONFIG_MTD=y # CONFIG_MTD_DEBUG is not set CONFIG_MTD_CONCAT=y CONFIG_MTD_PARTITIONS=y # CONFIG_MTD_REDBOOT_PARTS is not set # CONFIG_MTD_CMDLINE_PARTS is not set # CONFIG_MTD_AR7_PARTS is not set # # User Modules And Translation Layers # # CONFIG_MTD_CHAR is not set CONFIG_MTD_BLKDEVS=y CONFIG_MTD_BLOCK=y # CONFIG_FTL is not set CONFIG_NFTL=y CONFIG_NFTL_RW=y CONFIG_INFTL=y # CONFIG_RFD_FTL is not set CONFIG_SSFDC=y # CONFIG_MTD_OOPS is not set # # RAM/ROM/Flash chip drivers # CONFIG_MTD_CFI=y CONFIG_MTD_JEDECPROBE=y CONFIG_MTD_GEN_PROBE=y CONFIG_MTD_CFI_ADV_OPTIONS=y # 
CONFIG_MTD_CFI_NOSWAP is not set # CONFIG_MTD_CFI_BE_BYTE_SWAP is not set CONFIG_MTD_CFI_LE_BYTE_SWAP=y # CONFIG_MTD_CFI_GEOMETRY is not set CONFIG_MTD_MAP_BANK_WIDTH_1=y CONFIG_MTD_MAP_BANK_WIDTH_2=y CONFIG_MTD_MAP_BANK_WIDTH_4=y # CONFIG_MTD_MAP_BANK_WIDTH_8 is not set # CONFIG_MTD_MAP_BANK_WIDTH_16 is not set # CONFIG_MTD_MAP_BANK_WIDTH_32 is not set CONFIG_MTD_CFI_I1=y CONFIG_MTD_CFI_I2=y # CONFIG_MTD_CFI_I4 is not set # CONFIG_MTD_CFI_I8 is not set CONFIG_MTD_OTP=y # CONFIG_MTD_CFI_INTELEXT is not set CONFIG_MTD_CFI_AMDSTD=y CONFIG_MTD_CFI_STAA=y CONFIG_MTD_CFI_UTIL=y # CONFIG_MTD_RAM is not set CONFIG_MTD_ROM=y # CONFIG_MTD_ABSENT is not set # # Mapping drivers for chip access # # CONFIG_MTD_COMPLEX_MAPPINGS is not set # CONFIG_MTD_PHYSMAP is not set # CONFIG_MTD_SC520CDP is not set # CONFIG_MTD_NETSC520 is not set CONFIG_MTD_TS5500=y # CONFIG_MTD_AMD76XROM is not set CONFIG_MTD_ICHXROM=y # CONFIG_MTD_ESB2ROM is not set # CONFIG_MTD_CK804XROM is not set CONFIG_MTD_SCB2_FLASH=y # CONFIG_MTD_NETtel is not set CONFIG_MTD_L440GX=y CONFIG_MTD_INTEL_VR_NOR=y # CONFIG_MTD_PLATRAM is not set # # Self-contained MTD device drivers # # CONFIG_MTD_PMC551 is not set # CONFIG_MTD_DATAFLASH is not set # CONFIG_MTD_M25P80 is not set CONFIG_MTD_SLRAM=y CONFIG_MTD_PHRAM=y # CONFIG_MTD_MTDRAM is not set CONFIG_MTD_BLOCK2MTD=y # # Disk-On-Chip Device Drivers # # CONFIG_MTD_DOC2000 is not set CONFIG_MTD_DOC2001=y CONFIG_MTD_DOC2001PLUS=y CONFIG_MTD_DOCPROBE=y CONFIG_MTD_DOCECC=y # CONFIG_MTD_DOCPROBE_ADVANCED is not set CONFIG_MTD_DOCPROBE_ADDRESS=0 # CONFIG_MTD_NAND is not set CONFIG_MTD_NAND_IDS=y CONFIG_MTD_ONENAND=y CONFIG_MTD_ONENAND_VERIFY_WRITE=y # CONFIG_MTD_ONENAND_OTP is not set # CONFIG_MTD_ONENAND_2X_PROGRAM is not set # CONFIG_MTD_ONENAND_SIM is not set # # UBI - Unsorted block images # # CONFIG_MTD_UBI is not set CONFIG_PARPORT=y # CONFIG_PARPORT_PC is not set # CONFIG_PARPORT_GSC is not set CONFIG_PARPORT_AX88796=y # CONFIG_PARPORT_1284 is not set 
CONFIG_PARPORT_NOT_PC=y # CONFIG_BLK_DEV is not set # CONFIG_MISC_DEVICES is not set CONFIG_HAVE_IDE=y CONFIG_IDE=y CONFIG_IDE_MAX_HWIFS=4 # CONFIG_BLK_DEV_IDE is not set # CONFIG_BLK_DEV_HD_ONLY is not set # CONFIG_BLK_DEV_HD is not set # # SCSI device support # # CONFIG_RAID_ATTRS is not set CONFIG_SCSI=y CONFIG_SCSI_DMA=y CONFIG_SCSI_TGT=y CONFIG_SCSI_NETLINK=y # # SCSI support type (disk, tape, CD-ROM) # CONFIG_BLK_DEV_SD=y # CONFIG_CHR_DEV_ST is not set # CONFIG_CHR_DEV_OSST is not set # CONFIG_BLK_DEV_SR is not set # CONFIG_CHR_DEV_SG is not set # CONFIG_CHR_DEV_SCH is not set # # Some SCSI devices (e.g. CD jukebox) support multiple LUNs # CONFIG_SCSI_MULTI_LUN=y CONFIG_SCSI_CONSTANTS=y # CONFIG_SCSI_LOGGING is not set CONFIG_SCSI_SCAN_ASYNC=y # # SCSI Transports # CONFIG_SCSI_SPI_ATTRS=y CONFIG_SCSI_FC_ATTRS=y CONFIG_SCSI_FC_TGT_ATTRS=y CONFIG_SCSI_ISCSI_ATTRS=y CONFIG_SCSI_SRP_ATTRS=y CONFIG_SCSI_SRP_TGT_ATTRS=y CONFIG_SCSI_LOWLEVEL=y # CONFIG_BLK_DEV_3W_XXXX_RAID is not set CONFIG_SCSI_3W_9XXX=y # CONFIG_SCSI_ACARD is not set # CONFIG_SCSI_AACRAID is not set # CONFIG_SCSI_AIC7XXX is not set # CONFIG_SCSI_AIC7XXX_OLD is not set # CONFIG_SCSI_AIC79XX is not set CONFIG_SCSI_DPT_I2O=y CONFIG_SCSI_ADVANSYS=y CONFIG_SCSI_ARCMSR=y CONFIG_MEGARAID_NEWGEN=y # CONFIG_MEGARAID_MM is not set CONFIG_MEGARAID_LEGACY=y # CONFIG_MEGARAID_SAS is not set CONFIG_SCSI_HPTIOP=y # CONFIG_SCSI_BUSLOGIC is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_EATA is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set CONFIG_SCSI_GDTH=y CONFIG_SCSI_IPS=y # CONFIG_SCSI_INITIO is not set CONFIG_SCSI_INIA100=y CONFIG_SCSI_STEX=y CONFIG_SCSI_SYM53C8XX_2=y CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1 CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 # CONFIG_SCSI_SYM53C8XX_MMIO is not set # CONFIG_SCSI_QLOGIC_1280 is not set CONFIG_SCSI_QLA_FC=y CONFIG_SCSI_QLA_ISCSI=y CONFIG_SCSI_LPFC=y CONFIG_SCSI_DC395x=y # CONFIG_SCSI_DC390T is not set # CONFIG_SCSI_DEBUG is not 
set CONFIG_SCSI_SRP=y # CONFIG_ATA is not set # CONFIG_MD is not set CONFIG_FUSION=y CONFIG_FUSION_SPI=y CONFIG_FUSION_FC=y # CONFIG_FUSION_SAS is not set CONFIG_FUSION_MAX_SGE=128 CONFIG_FUSION_CTL=y # CONFIG_FUSION_LOGGING is not set # # IEEE 1394 (FireWire) support # CONFIG_FIREWIRE=y CONFIG_FIREWIRE_OHCI=y CONFIG_FIREWIRE_OHCI_DEBUG=y # CONFIG_FIREWIRE_SBP2 is not set CONFIG_IEEE1394=y # # Subsystem Options # CONFIG_IEEE1394_VERBOSEDEBUG=y # # Controllers # # # Texas Instruments PCILynx requires I2C # # CONFIG_IEEE1394_OHCI1394 is not set # # Protocols # CONFIG_IEEE1394_SBP2=y CONFIG_IEEE1394_SBP2_PHYS_DMA=y # CONFIG_IEEE1394_ETH1394_ROM_ENTRY is not set # CONFIG_IEEE1394_RAWIO is not set CONFIG_I2O=y # CONFIG_I2O_LCT_NOTIFY_ON_CHANGES is not set CONFIG_I2O_EXT_ADAPTEC=y # CONFIG_I2O_EXT_ADAPTEC_DMA64 is not set # CONFIG_I2O_CONFIG is not set CONFIG_I2O_BUS=y CONFIG_I2O_BLOCK=y CONFIG_I2O_SCSI=y CONFIG_I2O_PROC=y CONFIG_MACINTOSH_DRIVERS=y CONFIG_MAC_EMUMOUSEBTN=y # CONFIG_NETDEVICES is not set CONFIG_MLX4_CORE=y # CONFIG_ISDN is not set CONFIG_PHONE=y # CONFIG_PHONE_IXJ is not set # # Input device support # CONFIG_INPUT=y CONFIG_INPUT_FF_MEMLESS=y CONFIG_INPUT_POLLDEV=y # # Userland interfaces # # CONFIG_INPUT_MOUSEDEV is not set CONFIG_INPUT_JOYDEV=y CONFIG_INPUT_EVDEV=y # CONFIG_INPUT_EVBUG is not set # # Input Device Drivers # # CONFIG_INPUT_KEYBOARD is not set # CONFIG_INPUT_MOUSE is not set CONFIG_INPUT_JOYSTICK=y # CONFIG_JOYSTICK_ANALOG is not set CONFIG_JOYSTICK_A3D=y # CONFIG_JOYSTICK_ADI is not set # CONFIG_JOYSTICK_COBRA is not set # CONFIG_JOYSTICK_GF2K is not set # CONFIG_JOYSTICK_GRIP is not set CONFIG_JOYSTICK_GRIP_MP=y CONFIG_JOYSTICK_GUILLEMOT=y CONFIG_JOYSTICK_INTERACT=y CONFIG_JOYSTICK_SIDEWINDER=y # CONFIG_JOYSTICK_TMDC is not set CONFIG_JOYSTICK_IFORCE=y # CONFIG_JOYSTICK_IFORCE_232 is not set # CONFIG_JOYSTICK_WARRIOR is not set # CONFIG_JOYSTICK_MAGELLAN is not set # CONFIG_JOYSTICK_SPACEORB is not set # CONFIG_JOYSTICK_SPACEBALL is not 
set # CONFIG_JOYSTICK_STINGER is not set CONFIG_JOYSTICK_TWIDJOY=y CONFIG_JOYSTICK_ZHENHUA=y CONFIG_JOYSTICK_DB9=y CONFIG_JOYSTICK_GAMECON=y CONFIG_JOYSTICK_TURBOGRAFX=y CONFIG_JOYSTICK_JOYDUMP=y CONFIG_INPUT_TABLET=y CONFIG_INPUT_TOUCHSCREEN=y # CONFIG_TOUCHSCREEN_ADS7846 is not set CONFIG_TOUCHSCREEN_FUJITSU=y CONFIG_TOUCHSCREEN_GUNZE=y CONFIG_TOUCHSCREEN_ELO=y # CONFIG_TOUCHSCREEN_MTOUCH is not set # CONFIG_TOUCHSCREEN_MK712 is not set # CONFIG_TOUCHSCREEN_PENMOUNT is not set CONFIG_TOUCHSCREEN_TOUCHRIGHT=y CONFIG_TOUCHSCREEN_TOUCHWIN=y # CONFIG_TOUCHSCREEN_UCB1400 is not set CONFIG_INPUT_MISC=y CONFIG_INPUT_PCSPKR=y # CONFIG_INPUT_UINPUT is not set # # Hardware I/O ports # CONFIG_SERIO=y # CONFIG_SERIO_I8042 is not set # CONFIG_SERIO_SERPORT is not set # CONFIG_SERIO_CT82C710 is not set CONFIG_SERIO_PARKBD=y CONFIG_SERIO_PCIPS2=y CONFIG_SERIO_LIBPS2=y # CONFIG_SERIO_RAW is not set CONFIG_GAMEPORT=y CONFIG_GAMEPORT_NS558=y # CONFIG_GAMEPORT_L4 is not set # CONFIG_GAMEPORT_EMU10K1 is not set CONFIG_GAMEPORT_FM801=y # # Character devices # # CONFIG_VT is not set # CONFIG_DEVKMEM is not set # CONFIG_SERIAL_NONSTANDARD is not set CONFIG_NOZOMI=y # # Serial drivers # CONFIG_SERIAL_8250=y # CONFIG_SERIAL_8250_CONSOLE is not set CONFIG_FIX_EARLYCON_MEM=y CONFIG_SERIAL_8250_PCI=y CONFIG_SERIAL_8250_NR_UARTS=4 CONFIG_SERIAL_8250_RUNTIME_UARTS=4 # CONFIG_SERIAL_8250_EXTENDED is not set # # Non-8250 serial port support # CONFIG_SERIAL_CORE=y CONFIG_CONSOLE_POLL=y CONFIG_SERIAL_JSM=y # CONFIG_UNIX98_PTYS is not set # CONFIG_LEGACY_PTYS is not set # CONFIG_PRINTER is not set CONFIG_PPDEV=y # CONFIG_IPMI_HANDLER is not set CONFIG_HW_RANDOM=y # CONFIG_HW_RANDOM_INTEL is not set # CONFIG_HW_RANDOM_AMD is not set # CONFIG_NVRAM is not set # CONFIG_R3964 is not set CONFIG_APPLICOM=y # CONFIG_MWAVE is not set # CONFIG_PC8736x_GPIO is not set CONFIG_RAW_DRIVER=y CONFIG_MAX_RAW_DEVS=256 # CONFIG_HANGCHECK_TIMER is not set CONFIG_TCG_TPM=y # CONFIG_TCG_NSC is not set 
CONFIG_TCG_ATMEL=y # CONFIG_TELCLOCK is not set CONFIG_DEVPORT=y # CONFIG_I2C is not set CONFIG_SPI=y CONFIG_SPI_DEBUG=y CONFIG_SPI_MASTER=y # # SPI Master Controller Drivers # CONFIG_SPI_BITBANG=y # CONFIG_SPI_BUTTERFLY is not set CONFIG_SPI_LM70_LLP=y # # SPI Protocol Masters # CONFIG_SPI_SPIDEV=y CONFIG_W1=y # # 1-wire Bus Masters # CONFIG_W1_MASTER_MATROX=y # # 1-wire Slaves # # CONFIG_W1_SLAVE_THERM is not set CONFIG_W1_SLAVE_SMEM=y # CONFIG_W1_SLAVE_DS2433 is not set CONFIG_W1_SLAVE_DS2760=y # CONFIG_POWER_SUPPLY is not set # CONFIG_HWMON is not set CONFIG_THERMAL=y CONFIG_WATCHDOG=y # CONFIG_WATCHDOG_NOWAYOUT is not set # # Watchdog Device Drivers # CONFIG_SOFT_WATCHDOG=y CONFIG_ACQUIRE_WDT=y # CONFIG_ADVANTECH_WDT is not set CONFIG_ALIM1535_WDT=y CONFIG_ALIM7101_WDT=y # CONFIG_SC520_WDT is not set # CONFIG_EUROTECH_WDT is not set CONFIG_IB700_WDT=y CONFIG_IBMASR=y CONFIG_WAFER_WDT=y # CONFIG_I6300ESB_WDT is not set CONFIG_ITCO_WDT=y CONFIG_ITCO_VENDOR_SUPPORT=y # CONFIG_IT8712F_WDT is not set # CONFIG_HP_WATCHDOG is not set # CONFIG_SC1200_WDT is not set # CONFIG_PC87413_WDT is not set CONFIG_60XX_WDT=y # CONFIG_SBC8360_WDT is not set CONFIG_CPU5_WDT=y CONFIG_SMSC37B787_WDT=y # CONFIG_W83627HF_WDT is not set CONFIG_W83697HF_WDT=y # CONFIG_W83877F_WDT is not set CONFIG_W83977F_WDT=y CONFIG_MACHZ_WDT=y CONFIG_SBC_EPX_C3_WATCHDOG=y # # PCI-based Watchdog Cards # CONFIG_PCIPCWATCHDOG=y CONFIG_WDTPCI=y CONFIG_WDT_501_PCI=y # # Sonics Silicon Backplane # CONFIG_SSB_POSSIBLE=y CONFIG_SSB=y CONFIG_SSB_SPROM=y CONFIG_SSB_PCIHOST_POSSIBLE=y CONFIG_SSB_PCIHOST=y # CONFIG_SSB_B43_PCI_BRIDGE is not set CONFIG_SSB_SILENT=y CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y CONFIG_SSB_DRIVER_PCICORE=y # # Multifunction device drivers # # CONFIG_MFD_SM501 is not set CONFIG_HTC_PASIC3=y # # Multimedia devices # # # Multimedia core support # # CONFIG_VIDEO_DEV is not set # CONFIG_VIDEO_MEDIA is not set # # Multimedia drivers # # CONFIG_DAB is not set # # Graphics support # CONFIG_AGP=y 
CONFIG_AGP_AMD64=y CONFIG_AGP_INTEL=y # CONFIG_AGP_SIS is not set CONFIG_AGP_VIA=y CONFIG_DRM=y # CONFIG_DRM_TDFX is not set # CONFIG_DRM_R128 is not set # CONFIG_DRM_RADEON is not set # CONFIG_DRM_I810 is not set # CONFIG_DRM_I830 is not set # CONFIG_DRM_I915 is not set # CONFIG_DRM_MGA is not set # CONFIG_DRM_SIS is not set CONFIG_DRM_VIA=y CONFIG_DRM_SAVAGE=y # CONFIG_VGASTATE is not set CONFIG_VIDEO_OUTPUT_CONTROL=y CONFIG_FB=y # CONFIG_FIRMWARE_EDID is not set # CONFIG_FB_DDC is not set CONFIG_FB_CFB_FILLRECT=y CONFIG_FB_CFB_COPYAREA=y CONFIG_FB_CFB_IMAGEBLIT=y # CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set CONFIG_FB_SYS_FILLRECT=y CONFIG_FB_SYS_COPYAREA=y CONFIG_FB_SYS_IMAGEBLIT=y # CONFIG_FB_FOREIGN_ENDIAN is not set CONFIG_FB_SYS_FOPS=y CONFIG_FB_DEFERRED_IO=y CONFIG_FB_HECUBA=y # CONFIG_FB_SVGALIB is not set # CONFIG_FB_MACMODES is not set CONFIG_FB_BACKLIGHT=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y # # Frame buffer hardware drivers # CONFIG_FB_CIRRUS=y # CONFIG_FB_PM2 is not set # CONFIG_FB_CYBER2000 is not set CONFIG_FB_ARC=y CONFIG_FB_ASILIANT=y # CONFIG_FB_IMSTT is not set # CONFIG_FB_VGA16 is not set # CONFIG_FB_VESA is not set # CONFIG_FB_EFI is not set CONFIG_FB_N411=y CONFIG_FB_HGA=y # CONFIG_FB_HGA_ACCEL is not set # CONFIG_FB_S1D13XXX is not set # CONFIG_FB_NVIDIA is not set # CONFIG_FB_RIVA is not set # CONFIG_FB_LE80578 is not set CONFIG_FB_INTEL=y # CONFIG_FB_INTEL_DEBUG is not set # CONFIG_FB_INTEL_I2C is not set # CONFIG_FB_MATROX is not set CONFIG_FB_RADEON=y # CONFIG_FB_RADEON_I2C is not set # CONFIG_FB_RADEON_BACKLIGHT is not set CONFIG_FB_RADEON_DEBUG=y # CONFIG_FB_ATY128 is not set CONFIG_FB_ATY=y # CONFIG_FB_ATY_CT is not set CONFIG_FB_ATY_GX=y CONFIG_FB_ATY_BACKLIGHT=y # CONFIG_FB_S3 is not set # CONFIG_FB_SAVAGE is not set CONFIG_FB_SIS=y CONFIG_FB_SIS_300=y CONFIG_FB_SIS_315=y # CONFIG_FB_NEOMAGIC is not set CONFIG_FB_KYRO=y # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set # CONFIG_FB_VT8623 is not set 
CONFIG_FB_TRIDENT=y CONFIG_FB_TRIDENT_ACCEL=y # CONFIG_FB_ARK is not set # CONFIG_FB_PM3 is not set # CONFIG_FB_GEODE is not set CONFIG_FB_VIRTUAL=y CONFIG_BACKLIGHT_LCD_SUPPORT=y # CONFIG_LCD_CLASS_DEVICE is not set CONFIG_BACKLIGHT_CLASS_DEVICE=y # CONFIG_BACKLIGHT_CORGI is not set # CONFIG_BACKLIGHT_PROGEAR is not set # # Display device support # CONFIG_DISPLAY_SUPPORT=y # # Display hardware drivers # CONFIG_LOGO=y # CONFIG_LOGO_LINUX_MONO is not set # CONFIG_LOGO_LINUX_VGA16 is not set # CONFIG_LOGO_LINUX_CLUT224 is not set # # Sound # # CONFIG_SOUND is not set CONFIG_HID_SUPPORT=y # CONFIG_HID is not set # CONFIG_USB_SUPPORT is not set # CONFIG_MMC is not set # CONFIG_MEMSTICK is not set # CONFIG_NEW_LEDS is not set # CONFIG_ACCESSIBILITY is not set CONFIG_INFINIBAND=y CONFIG_INFINIBAND_USER_MAD=y CONFIG_INFINIBAND_USER_ACCESS=y CONFIG_INFINIBAND_USER_MEM=y # CONFIG_INFINIBAND_MTHCA is not set CONFIG_INFINIBAND_IPATH=y CONFIG_MLX4_INFINIBAND=y # CONFIG_INFINIBAND_SRP is not set # CONFIG_EDAC is not set CONFIG_RTC_LIB=y CONFIG_RTC_CLASS=y CONFIG_RTC_HCTOSYS=y CONFIG_RTC_HCTOSYS_DEVICE="rtc0" # CONFIG_RTC_DEBUG is not set # # RTC interfaces # # CONFIG_RTC_INTF_DEV is not set CONFIG_RTC_DRV_TEST=y # # SPI RTC drivers # # CONFIG_RTC_DRV_MAX6902 is not set # CONFIG_RTC_DRV_R9701 is not set # CONFIG_RTC_DRV_RS5C348 is not set # # Platform RTC drivers # # CONFIG_RTC_DRV_CMOS is not set CONFIG_RTC_DRV_DS1511=y # CONFIG_RTC_DRV_DS1553 is not set CONFIG_RTC_DRV_DS1742=y # CONFIG_RTC_DRV_STK17TA8 is not set CONFIG_RTC_DRV_M48T86=y # CONFIG_RTC_DRV_M48T59 is not set CONFIG_RTC_DRV_V3020=y # # on-CPU RTC drivers # CONFIG_DMADEVICES=y # # DMA Devices # # CONFIG_INTEL_IOATDMA is not set CONFIG_AUXDISPLAY=y CONFIG_UIO=y # CONFIG_UIO_CIF is not set CONFIG_UIO_SMX=y # # Firmware Drivers # # CONFIG_EDD is not set # CONFIG_DELL_RBU is not set CONFIG_DCDBAS=y CONFIG_ISCSI_IBFT_FIND=y # CONFIG_ISCSI_IBFT is not set # # File systems # CONFIG_EXT2_FS=y # CONFIG_EXT2_FS_XATTR is not 
set # CONFIG_EXT2_FS_XIP is not set CONFIG_EXT3_FS=y CONFIG_EXT3_FS_XATTR=y # CONFIG_EXT3_FS_POSIX_ACL is not set # CONFIG_EXT3_FS_SECURITY is not set # CONFIG_EXT4DEV_FS is not set CONFIG_JBD=y CONFIG_FS_MBCACHE=y # CONFIG_REISERFS_FS is not set # CONFIG_JFS_FS is not set CONFIG_FS_POSIX_ACL=y CONFIG_XFS_FS=y # CONFIG_XFS_QUOTA is not set # CONFIG_XFS_POSIX_ACL is not set CONFIG_XFS_RT=y # CONFIG_XFS_DEBUG is not set CONFIG_GFS2_FS=y # CONFIG_GFS2_FS_LOCKING_NOLOCK is not set CONFIG_DNOTIFY=y # CONFIG_INOTIFY is not set # CONFIG_QUOTA is not set # CONFIG_AUTOFS_FS is not set # CONFIG_AUTOFS4_FS is not set CONFIG_FUSE_FS=y # # CD-ROM/DVD Filesystems # # CONFIG_ISO9660_FS is not set # CONFIG_UDF_FS is not set # # DOS/FAT/NT Filesystems # CONFIG_FAT_FS=y CONFIG_MSDOS_FS=y CONFIG_VFAT_FS=y CONFIG_FAT_DEFAULT_CODEPAGE=437 CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" CONFIG_NTFS_FS=y # CONFIG_NTFS_DEBUG is not set CONFIG_NTFS_RW=y # # Pseudo filesystems # # CONFIG_PROC_FS is not set # CONFIG_SYSFS is not set # CONFIG_TMPFS is not set # CONFIG_HUGETLBFS is not set # CONFIG_HUGETLB_PAGE is not set # # Miscellaneous filesystems # CONFIG_ADFS_FS=y # CONFIG_ADFS_FS_RW is not set CONFIG_AFFS_FS=y # CONFIG_ECRYPT_FS is not set CONFIG_HFS_FS=y CONFIG_HFSPLUS_FS=y CONFIG_BEFS_FS=y CONFIG_BEFS_DEBUG=y CONFIG_BFS_FS=y CONFIG_EFS_FS=y CONFIG_JFFS2_FS=y CONFIG_JFFS2_FS_DEBUG=0 CONFIG_JFFS2_FS_WRITEBUFFER=y # CONFIG_JFFS2_FS_WBUF_VERIFY is not set # CONFIG_JFFS2_SUMMARY is not set CONFIG_JFFS2_FS_XATTR=y # CONFIG_JFFS2_FS_POSIX_ACL is not set CONFIG_JFFS2_FS_SECURITY=y CONFIG_JFFS2_COMPRESSION_OPTIONS=y # CONFIG_JFFS2_ZLIB is not set CONFIG_JFFS2_LZO=y # CONFIG_JFFS2_RTIME is not set CONFIG_JFFS2_RUBIN=y # CONFIG_JFFS2_CMODE_NONE is not set # CONFIG_JFFS2_CMODE_PRIORITY is not set CONFIG_JFFS2_CMODE_SIZE=y # CONFIG_JFFS2_CMODE_FAVOURLZO is not set CONFIG_CRAMFS=y CONFIG_VXFS_FS=y # CONFIG_MINIX_FS is not set CONFIG_HPFS_FS=y # CONFIG_QNX4FS_FS is not set # CONFIG_ROMFS_FS is not set # 
CONFIG_SYSV_FS is not set CONFIG_UFS_FS=y CONFIG_UFS_FS_WRITE=y # CONFIG_UFS_DEBUG is not set # CONFIG_NETWORK_FILESYSTEMS is not set # # Partition Types # # CONFIG_PARTITION_ADVANCED is not set CONFIG_AMIGA_PARTITION=y CONFIG_MSDOS_PARTITION=y CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" # CONFIG_NLS_CODEPAGE_437 is not set # CONFIG_NLS_CODEPAGE_737 is not set # CONFIG_NLS_CODEPAGE_775 is not set CONFIG_NLS_CODEPAGE_850=y # CONFIG_NLS_CODEPAGE_852 is not set # CONFIG_NLS_CODEPAGE_855 is not set # CONFIG_NLS_CODEPAGE_857 is not set CONFIG_NLS_CODEPAGE_860=y CONFIG_NLS_CODEPAGE_861=y # CONFIG_NLS_CODEPAGE_862 is not set CONFIG_NLS_CODEPAGE_863=y CONFIG_NLS_CODEPAGE_864=y # CONFIG_NLS_CODEPAGE_865 is not set # CONFIG_NLS_CODEPAGE_866 is not set # CONFIG_NLS_CODEPAGE_869 is not set # CONFIG_NLS_CODEPAGE_936 is not set CONFIG_NLS_CODEPAGE_950=y # CONFIG_NLS_CODEPAGE_932 is not set # CONFIG_NLS_CODEPAGE_949 is not set CONFIG_NLS_CODEPAGE_874=y CONFIG_NLS_ISO8859_8=y CONFIG_NLS_CODEPAGE_1250=y # CONFIG_NLS_CODEPAGE_1251 is not set CONFIG_NLS_ASCII=y CONFIG_NLS_ISO8859_1=y CONFIG_NLS_ISO8859_2=y CONFIG_NLS_ISO8859_3=y # CONFIG_NLS_ISO8859_4 is not set CONFIG_NLS_ISO8859_5=y CONFIG_NLS_ISO8859_6=y # CONFIG_NLS_ISO8859_7 is not set CONFIG_NLS_ISO8859_9=y CONFIG_NLS_ISO8859_13=y CONFIG_NLS_ISO8859_14=y # CONFIG_NLS_ISO8859_15 is not set CONFIG_NLS_KOI8_R=y CONFIG_NLS_KOI8_U=y CONFIG_NLS_UTF8=y # # Kernel hacking # CONFIG_TRACE_IRQFLAGS_SUPPORT=y CONFIG_ENABLE_WARN_DEPRECATED=y CONFIG_ENABLE_MUST_CHECK=y CONFIG_FRAME_WARN=2048 CONFIG_MAGIC_SYSRQ=y # CONFIG_UNUSED_SYMBOLS is not set # CONFIG_HEADERS_CHECK is not set CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_SHIRQ=y CONFIG_DETECT_SOFTLOCKUP=y CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1 CONFIG_SCHED_DEBUG=y CONFIG_SCHEDSTATS=y # CONFIG_DEBUG_OBJECTS is not set CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y # CONFIG_PROVE_LOCKING is not set CONFIG_LOCKDEP=y CONFIG_LOCK_STAT=y 
CONFIG_DEBUG_LOCKDEP=y CONFIG_DEBUG_SPINLOCK_SLEEP=y CONFIG_DEBUG_LOCKING_API_SELFTESTS=y CONFIG_STACKTRACE=y CONFIG_DEBUG_KOBJECT=y CONFIG_DEBUG_VM=y # CONFIG_DEBUG_WRITECOUNT is not set CONFIG_DEBUG_LIST=y CONFIG_DEBUG_SG=y CONFIG_FRAME_POINTER=y # CONFIG_BACKTRACE_SELF_TEST is not set CONFIG_FAULT_INJECTION=y CONFIG_FAILSLAB=y # CONFIG_FAIL_PAGE_ALLOC is not set CONFIG_FAIL_MAKE_REQUEST=y CONFIG_LATENCYTOP=y CONFIG_HAVE_FTRACE=y CONFIG_HAVE_DYNAMIC_FTRACE=y # CONFIG_FTRACE is not set # CONFIG_IRQSOFF_TRACER is not set # CONFIG_SYSPROF_TRACER is not set # CONFIG_SCHED_TRACER is not set # CONFIG_CONTEXT_SWITCH_TRACER is not set CONFIG_PROVIDE_OHCI1394_DMA_INIT=y # CONFIG_FIREWIRE_OHCI_REMOTE_DMA is not set CONFIG_SAMPLES=y CONFIG_SAMPLE_KOBJECT=y CONFIG_HAVE_ARCH_KGDB=y CONFIG_KGDB=y CONFIG_KGDB_SERIAL_CONSOLE=y CONFIG_KGDB_TESTS=y # CONFIG_KGDB_TESTS_ON_BOOT is not set CONFIG_NONPROMISC_DEVMEM=y # CONFIG_EARLY_PRINTK is not set # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_DEBUG_STACK_USAGE is not set CONFIG_DEBUG_PAGEALLOC=y # CONFIG_DEBUG_PER_CPU_MAPS is not set # CONFIG_X86_PTDUMP is not set CONFIG_DEBUG_RODATA=y CONFIG_DIRECT_GBPAGES=y CONFIG_DEBUG_RODATA_TEST=y CONFIG_X86_MPPARSE=y # CONFIG_IOMMU_DEBUG is not set # CONFIG_MMIOTRACE is not set CONFIG_IO_DELAY_TYPE_0X80=0 CONFIG_IO_DELAY_TYPE_0XED=1 CONFIG_IO_DELAY_TYPE_UDELAY=2 CONFIG_IO_DELAY_TYPE_NONE=3 # CONFIG_IO_DELAY_0X80 is not set # CONFIG_IO_DELAY_0XED is not set CONFIG_IO_DELAY_UDELAY=y # CONFIG_IO_DELAY_NONE is not set CONFIG_DEFAULT_IO_DELAY_TYPE=2 CONFIG_CPA_DEBUG=y # # Security options # CONFIG_KEYS=y CONFIG_KEYS_DEBUG_PROC_KEYS=y CONFIG_SECURITY_FILE_CAPABILITIES=y CONFIG_CRYPTO=y # # Crypto core or helper # CONFIG_CRYPTO_ALGAPI=y CONFIG_CRYPTO_AEAD=y CONFIG_CRYPTO_BLKCIPHER=y CONFIG_CRYPTO_MANAGER=y CONFIG_CRYPTO_GF128MUL=y CONFIG_CRYPTO_NULL=y # CONFIG_CRYPTO_CRYPTD is not set # CONFIG_CRYPTO_AUTHENC is not set # # Authenticated Encryption with Associated Data # CONFIG_CRYPTO_CCM=y 
CONFIG_CRYPTO_GCM=y CONFIG_CRYPTO_SEQIV=y # # Block modes # # CONFIG_CRYPTO_CBC is not set CONFIG_CRYPTO_CTR=y CONFIG_CRYPTO_CTS=y CONFIG_CRYPTO_ECB=y # CONFIG_CRYPTO_LRW is not set # CONFIG_CRYPTO_PCBC is not set # CONFIG_CRYPTO_XTS is not set # # Hash modes # # CONFIG_CRYPTO_HMAC is not set # CONFIG_CRYPTO_XCBC is not set # # Digest # # CONFIG_CRYPTO_CRC32C is not set # CONFIG_CRYPTO_MD4 is not set # CONFIG_CRYPTO_MD5 is not set CONFIG_CRYPTO_MICHAEL_MIC=y CONFIG_CRYPTO_SHA1=y # CONFIG_CRYPTO_SHA256 is not set CONFIG_CRYPTO_SHA512=y # CONFIG_CRYPTO_TGR192 is not set CONFIG_CRYPTO_WP512=y # # Ciphers # CONFIG_CRYPTO_AES=y CONFIG_CRYPTO_AES_X86_64=y CONFIG_CRYPTO_ANUBIS=y CONFIG_CRYPTO_ARC4=y # CONFIG_CRYPTO_BLOWFISH is not set # CONFIG_CRYPTO_CAMELLIA is not set CONFIG_CRYPTO_CAST5=y CONFIG_CRYPTO_CAST6=y CONFIG_CRYPTO_DES=y # CONFIG_CRYPTO_FCRYPT is not set CONFIG_CRYPTO_KHAZAD=y CONFIG_CRYPTO_SALSA20=y CONFIG_CRYPTO_SALSA20_X86_64=y CONFIG_CRYPTO_SEED=y CONFIG_CRYPTO_SERPENT=y # CONFIG_CRYPTO_TEA is not set CONFIG_CRYPTO_TWOFISH=y CONFIG_CRYPTO_TWOFISH_COMMON=y # CONFIG_CRYPTO_TWOFISH_X86_64 is not set # # Compression # # CONFIG_CRYPTO_DEFLATE is not set CONFIG_CRYPTO_LZO=y # CONFIG_CRYPTO_HW is not set CONFIG_HAVE_KVM=y # CONFIG_VIRTUALIZATION is not set # # Library routines # CONFIG_BITREVERSE=y CONFIG_GENERIC_FIND_FIRST_BIT=y CONFIG_GENERIC_FIND_NEXT_BIT=y CONFIG_CRC_CCITT=y CONFIG_CRC16=y CONFIG_CRC_ITU_T=y CONFIG_CRC32=y CONFIG_CRC7=y CONFIG_LIBCRC32C=y CONFIG_ZLIB_INFLATE=y CONFIG_LZO_COMPRESS=y CONFIG_LZO_DECOMPRESS=y CONFIG_HAS_IOMEM=y CONFIG_HAS_IOPORT=y CONFIG_HAS_DMA=y CONFIG_FORCE_SUCCESSFUL_BUILD=y ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu().
2008-06-09 13:03 ` Ingo Molnar
@ 2008-06-09 16:08 ` Mike Travis
2008-06-09 17:36 ` Mike Travis
1 sibling, 0 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-09 16:08 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> * It is now possible to use percpu operations for pda access
>>   since the pda is in the percpu area. Drop the pda operations.
>
> FYI, this one didn't build with the attached config.
>
> 	Ingo
>

Ok, thanks, I will check it out. I'm still having problems getting
your previous "instantaneous reboot" problem working. It has something
to do with cpu_clock interrupt, but I've still not figured out why
it's failing.

Thanks,
Mike
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu().
2008-06-09 13:03 ` Ingo Molnar
2008-06-09 16:08 ` Mike Travis
@ 2008-06-09 17:36 ` Mike Travis
2008-06-09 18:20 ` Christoph Lameter
2008-06-10 10:09 ` Ingo Molnar
1 sibling, 2 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-09 17:36 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> * It is now possible to use percpu operations for pda access
>>   since the pda is in the percpu area. Drop the pda operations.
>
> FYI, this one didn't build with the attached config.
>
> 	Ingo
>

Hi Ingo,

Can you send me the output from the build? It builds fine on my
machine (a few warnings). The silentoldconfig made these changes to
the .config file. (I did a git-remote update and reapplied my changes
before building.)

Thanks,
Mike

--- ../configs/ingo-test-9 2008-06-09 10:25:32.148026511 -0700
+++ ../build/ingo-test-9/.config 2008-06-09 10:30:15.273171501 -0700
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 2.6.26-rc5
-# Mon Jun 9 14:59:39 2008
+# Linux kernel version: 2.6.26-rc4
+# Mon Jun 9 10:30:10 2008
 #
 CONFIG_64BIT=y
 # CONFIG_X86_32 is not set
@@ -36,6 +36,7 @@
 CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
 CONFIG_HAVE_SETUP_PER_CPU_AREA=y
 CONFIG_HAVE_CPUMASK_OF_CPU_MAP=y
+CONFIG_HAVE_ZERO_BASED_PER_CPU=y
 CONFIG_ARCH_HIBERNATION_POSSIBLE=y
 CONFIG_ARCH_SUSPEND_POSSIBLE=y
 CONFIG_ZONE_DMA32=y
@@ -52,23 +53,16 @@
 CONFIG_X86_BIOS_REBOOT=y
 CONFIG_X86_TRAMPOLINE=y
 # CONFIG_KTIME_SCALAR is not set
-# CONFIG_BOOTPARAM_SUPPORT_WANTED is not set
 CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

 #
 # General setup
 #
 CONFIG_EXPERIMENTAL=y
-CONFIG_BROKEN_BOOT_ALLOWED3=y
-CONFIG_BROKEN_BOOT_ALLOWED2=y
-CONFIG_BROKEN_BOOT_ALLOWED=y
-CONFIG_BROKEN_BOOT=y
-CONFIG_BROKEN_BOOT_EUROPE=y
-CONFIG_BROKEN_BOOT_TITAN=y
 CONFIG_LOCK_KERNEL=y
 CONFIG_INIT_ENV_ARG_LIMIT=32
-CONFIG_LOCALVERSION=""
-# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_LOCALVERSION="-ingo-test-9"
+CONFIG_LOCALVERSION_AUTO=y
 # CONFIG_SWAP is not set
 # CONFIG_SYSVIPC is not set
 # CONFIG_POSIX_MQUEUE is not set
@@ -152,15 +146,16 @@
 # CONFIG_NO_HZ is not set
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
-CONFIG_SMP_SUPPORT=y
-CONFIG_UP_WANTED_1=y
-# CONFIG_UP_WANTED_2 is not set
 CONFIG_SMP=y
 CONFIG_X86_PC=y
 # CONFIG_X86_ELAN is not set
 # CONFIG_X86_VOYAGER is not set
+# CONFIG_X86_NUMAQ is not set
+# CONFIG_X86_SUMMIT is not set
+# CONFIG_X86_BIGSMP is not set
 # CONFIG_X86_VISWS is not set
 # CONFIG_X86_GENERICARCH is not set
+# CONFIG_X86_ES7000 is not set
 # CONFIG_X86_RDC321X is not set
 # CONFIG_X86_VSMP is not set
 CONFIG_PARAVIRT_GUEST=y
@@ -616,6 +611,7 @@
 CONFIG_SCSI_FC_ATTRS=y
 CONFIG_SCSI_FC_TGT_ATTRS=y
 CONFIG_SCSI_ISCSI_ATTRS=y
+# CONFIG_SCSI_SAS_LIBSAS is not set
 CONFIG_SCSI_SRP_ATTRS=y
 CONFIG_SCSI_SRP_TGT_ATTRS=y
 CONFIG_SCSI_LOWLEVEL=y
@@ -626,6 +622,7 @@
 # CONFIG_SCSI_AIC7XXX is not set
 # CONFIG_SCSI_AIC7XXX_OLD is not set
 # CONFIG_SCSI_AIC79XX is not set
+# CONFIG_SCSI_AIC94XX is not set
 CONFIG_SCSI_DPT_I2O=y
 CONFIG_SCSI_ADVANSYS=y
 CONFIG_SCSI_ARCMSR=y
@@ -642,6 +639,7 @@
 CONFIG_SCSI_IPS=y
 # CONFIG_SCSI_INITIO is not set
 CONFIG_SCSI_INIA100=y
+# CONFIG_SCSI_MVSAS is not set
 CONFIG_SCSI_STEX=y
 CONFIG_SCSI_SYM53C8XX_2=y
 CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
@@ -1302,6 +1300,7 @@
 CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
 CONFIG_STACKTRACE=y
 CONFIG_DEBUG_KOBJECT=y
+# CONFIG_DEBUG_INFO is not set
 CONFIG_DEBUG_VM=y
 # CONFIG_DEBUG_WRITECOUNT is not set
 CONFIG_DEBUG_LIST=y
@@ -1460,4 +1459,3 @@
 CONFIG_HAS_IOMEM=y
 CONFIG_HAS_IOPORT=y
 CONFIG_HAS_DMA=y
-CONFIG_FORCE_SUCCESSFUL_BUILD=y
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu().
2008-06-09 17:36 ` Mike Travis
@ 2008-06-09 18:20 ` Christoph Lameter
2008-06-09 23:29 ` Jeremy Fitzhardinge
2008-06-10 10:09 ` Ingo Molnar
1 sibling, 1 reply; 108+ messages in thread
From: Christoph Lameter @ 2008-06-09 18:20 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andrew Morton, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

Paravirt support is on. We have seen an issue with that in the past. Why
was that again?

Also check that there is really no use of the segment register before
start_kernel() loads it. If the segment register is used to refer to pda
stuff before start_kernel then we need to make sure that the right value
is loaded in the asm (head64.c).
^ permalink raw reply [flat|nested] 108+ messages in thread
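The ordering concern raised above can be sketched in user space. This is only an illustration under stated assumptions, not kernel code: the struct, the `sketch_` names, and the pointer standing in for the segment base are all hypothetical. It models a zero-based per cpu area in which a variable is named purely by its offset, so any access made before the base is loaded has nothing valid to add that offset to.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4	/* hypothetical cpu count for the sketch */

/* Hypothetical stand-in for a zero-based per cpu area (pda folded in). */
struct sketch_percpu_area {
	int cpu_number;
	unsigned long irq_count;
};

/* One private copy of the area per simulated cpu. */
static struct sketch_percpu_area sketch_areas[SKETCH_NR_CPUS];

/* Stand-in for the segment base; NULL models "not loaded yet". */
static char *sketch_seg_base;

/* Models the boot-time step that points the segment base at a cpu's area. */
static void sketch_load_seg_base(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_areas[cpu].cpu_number = cpu;
	sketch_seg_base = (char *)&sketch_areas[cpu];
}

/* Zero-based access: the variable is named by its offset from the base. */
static unsigned long *sketch_irq_count(void)
{
	/* An access before the base is loaded is the case being discussed. */
	assert(sketch_seg_base != NULL);
	return (unsigned long *)(sketch_seg_base +
				 offsetof(struct sketch_percpu_area, irq_count));
}
```

Calling `sketch_load_seg_base(2)` and then incrementing `*sketch_irq_count()` touches only the copy for simulated cpu 2; calling `sketch_irq_count()` first trips the assertion, which stands in for an early access through a segment register that has not been set up.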
* Re: [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu().
2008-06-09 18:20 ` Christoph Lameter
@ 2008-06-09 23:29 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-09 23:29 UTC (permalink / raw)
To: Christoph Lameter
Cc: Mike Travis, Ingo Molnar, Andrew Morton, David Miller,
Eric Dumazet, linux-kernel

Christoph Lameter wrote:
> Paravirt support is on. We have seen an issue with that in the past. Why
> was that again?
>

I'm not aware of any paravirt-related percpu bugs (other than the fact
that the last major revision of per-cpu variables was done under the
aegis of paravirt-ops). But booting on bare hardware with pvops enabled
should be exactly the same as non-pvops with respect to percpu.

J
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu().
2008-06-09 17:36 ` Mike Travis
2008-06-09 18:20 ` Christoph Lameter
@ 2008-06-10 10:09 ` Ingo Molnar
2008-06-10 15:07 ` Mike Travis
1 sibling, 1 reply; 108+ messages in thread
From: Ingo Molnar @ 2008-06-10 10:09 UTC (permalink / raw)
To: Mike Travis
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

* Mike Travis <travis@sgi.com> wrote:

> Ingo Molnar wrote:
> > * Mike Travis <travis@sgi.com> wrote:
> >
> >> * It is now possible to use percpu operations for pda access
> >>   since the pda is in the percpu area. Drop the pda operations.
> >
> > FYI, this one didn't build with the attached config.
> >
> > 	Ingo
> >
>
> Hi Ingo,
>
> Can you send me the output from the build? It builds fine on my
> machine (a few warnings). The silentoldconfig made these changes to
> the .config file. (I did a git-remote update and reapplied my changes
> before building.)

don't have the log anymore, but i had to revert 3/4 due to the boot
crash, could there be a dependency of 4/4 on 3/4?

the first two patches look fine and survived -tip testing.

	Ingo
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu().
2008-06-10 10:09 ` Ingo Molnar
@ 2008-06-10 15:07 ` Mike Travis
0 siblings, 0 replies; 108+ messages in thread
From: Mike Travis @ 2008-06-10 15:07 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andrew Morton, Christoph Lameter, David Miller, Eric Dumazet,
Jeremy Fitzhardinge, linux-kernel

Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> Ingo Molnar wrote:
>>> * Mike Travis <travis@sgi.com> wrote:
>>>
>>>> * It is now possible to use percpu operations for pda access
>>>>   since the pda is in the percpu area. Drop the pda operations.
>>> FYI, this one didn't build with the attached config.
>>>
>>> 	Ingo
>>>
>> Hi Ingo,
>>
>> Can you send me the output from the build? It builds fine on my
>> machine (a few warnings). The silentoldconfig made these changes to
>> the .config file. (I did a git-remote update and reapplied my changes
>> before building.)
>
> don't have the log anymore, but i had to revert 3/4 due to the boot
> crash, could there be a dependency of 4/4 on 3/4?
>
> the first two patches look fine and survived -tip testing.
>
> 	Ingo

Thanks Ingo. I'm narrowing down the reboot problem by pulling config
items out. I managed to get it to boot but now am trying to understand
how those config options are causing the problem.

Btw, it does not panic/reboot on linux-next with the original config.
Perhaps I should try and bisect to an earlier patch to see what that
reveals?

Thanks,
Mike
^ permalink raw reply [flat|nested] 108+ messages in thread
* [PATCH] x86: collapse the various size-dependent percpu accessors together
2008-06-04 0:30 [PATCH 0/4] percpu: Optimize percpu accesses Mike Travis
` (3 preceding siblings ...)
2008-06-04 0:30 ` [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu() Mike Travis
@ 2008-06-04 10:18 ` Jeremy Fitzhardinge
2008-06-04 10:45 ` Jeremy Fitzhardinge
4 siblings, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-04 10:18 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel

We can use gcc's %z modifier to emit the appropriate size suffix for
an instruction, so we don't need to duplicate the asm statement for
each size.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 include/asm-x86/percpu.h | 56 +++-------------------------------------------
 1 file changed, 4 insertions(+), 52 deletions(-)

===================================================================
--- a/include/asm-x86/percpu.h
+++ b/include/asm-x86/percpu.h
@@ -75,22 +75,10 @@
 	} \
 	switch (sizeof(var)) { \
 	case 1: \
-		asm(op "b %1,"__percpu_seg"%0" \
-		    : "+m" (var) \
-		    : "ri" ((T__)val)); \
-		break; \
 	case 2: \
-		asm(op "w %1,"__percpu_seg"%0" \
-		    : "+m" (var) \
-		    : "ri" ((T__)val)); \
-		break; \
 	case 4: \
-		asm(op "l %1,"__percpu_seg"%0" \
-		    : "+m" (var) \
-		    : "ri" ((T__)val)); \
-		break; \
 	case 8: \
-		asm(op "q %1,"__percpu_seg"%0" \
+		asm(op "%z0 %1,"__percpu_seg"%0" \
 		    : "+m" (var) \
 		    : "ri" ((T__)val)); \
 		break; \
@@ -103,22 +91,10 @@
 	typeof(var) ret__; \
 	switch (sizeof(var)) { \
 	case 1: \
-		asm(op "b "__percpu_seg"%1,%0" \
-		    : "=r" (ret__) \
-		    : "m" (var)); \
-		break; \
 	case 2: \
-		asm(op "w "__percpu_seg"%1,%0" \
-		    : "=r" (ret__) \
-		    : "m" (var)); \
-		break; \
 	case 4: \
-		asm(op "l "__percpu_seg"%1,%0" \
-		    : "=r" (ret__) \
-		    : "m" (var)); \
-		break; \
 	case 8: \
-		asm(op "q "__percpu_seg"%1,%0" \
+		asm(op "%z1 "__percpu_seg"%1,%0" \
 		    : "=r" (ret__) \
 		    : "m" (var)); \
 		break; \
@@ -131,19 +107,10 @@
 ({ \
 	switch (sizeof(var)) { \
 	case 1: \
-		asm(op "b "__percpu_seg"%0" \
-		    : : "m"(var)); \
-		break; \
 	case 2: \
-		asm(op "w "__percpu_seg"%0" \
-		    : : "m"(var)); \
-		break; \
 	case 4: \
-		asm(op "l "__percpu_seg"%0" \
-		    : : "m"(var)); \
-		break; \
 	case 8: \
-		asm(op "q "__percpu_seg"%0" \
+		asm(op "%z0 "__percpu_seg"%0" \
 		    : : "m"(var)); \
 		break; \
 	default: __bad_percpu_size(); \
@@ -155,25 +122,10 @@
 	typeof(var) prev; \
 	switch (sizeof(var)) { \
 	case 1: \
-		asm("cmpxchgb %b1, "__percpu_seg"%2" \
-		    : "=a"(prev) \
-		    : "q"(new), "m"(var), "0"(old) \
-		    : "memory"); \
-		break; \
 	case 2: \
-		asm("cmpxchgw %w1, "__percpu_seg"%2" \
-		    : "=a"(prev) \
-		    : "r"(new), "m"(var), "0"(old) \
-		    : "memory"); \
-		break; \
 	case 4: \
-		asm("cmpxchgl %k1, "__percpu_seg"%2" \
-		    : "=a"(prev) \
-		    : "r"(new), "m"(var), "0"(old) \
-		    : "memory"); \
-		break; \
 	case 8: \
-		asm("cmpxchgq %1, "__percpu_seg"%2" \
+		asm("cmpxchg%z1 %1, "__percpu_seg"%2" \
 		    : "=a"(prev) \
 		    : "r"(new), "m"(var), "0"(old) \
 		    : "memory"); \
^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH] x86: collapse the various size-dependent percpu accessors together
2008-06-04 10:18 ` [PATCH] x86: collapse the various size-dependent percpu accessors together Jeremy Fitzhardinge
@ 2008-06-04 10:45 ` Jeremy Fitzhardinge
2008-06-04 11:29 ` Ingo Molnar
0 siblings, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-04 10:45 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel

Jeremy Fitzhardinge wrote:
> We can use gcc's %z modifier to emit the appropriate size suffix for
> an instruction, so we don't need to duplicate the asm statement for
> each size.

Nah, it's a disaster. Drop this one.

J

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH] x86: collapse the various size-dependent percpu accessors together
2008-06-04 10:45 ` Jeremy Fitzhardinge
@ 2008-06-04 11:29 ` Ingo Molnar
2008-06-04 12:09 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 108+ messages in thread
From: Ingo Molnar @ 2008-06-04 11:29 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Mike Travis, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel

* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Jeremy Fitzhardinge wrote:
>> We can use gcc's %z modifier to emit the appropriate size suffix for
>> an instruction, so we don't need to duplicate the asm statement for
>> each size.
>
> Nah, it's a disaster. Drop this one.

hm, what's the problem with it? What you are trying to do here looks
like a nice cleanup - assuming it results in the same instructions
emitted ;-)

Ingo

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH] x86: collapse the various size-dependent percpu accessors together
2008-06-04 11:29 ` Ingo Molnar
@ 2008-06-04 12:09 ` Jeremy Fitzhardinge
2008-06-10 17:21 ` Christoph Lameter
0 siblings, 1 reply; 108+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-04 12:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mike Travis, Andrew Morton, Christoph Lameter, David Miller,
Eric Dumazet, linux-kernel

Ingo Molnar wrote:
> * Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
>
>> Jeremy Fitzhardinge wrote:
>>
>>> We can use gcc's %z modifier to emit the appropriate size suffix for
>>> an instruction, so we don't need to duplicate the asm statement for
>>> each size.
>>>
>> Nah, it's a disaster. Drop this one.
>>
>
> hm, what's the problem with it? What you are trying to do here looks
> like a nice cleanup - assuming it results in the same instructions
> emitted ;-)

Yes, would have been lovely. But gcc emits junk:

CC arch/x86/xen/enlighten.o
{standard input}: Assembler messages:
{standard input}:637: Error: no such instruction: `movll %gs:per_cpu__xen_vcpu(%rip),%rax'
{standard input}:655: Error: no such instruction: `movll %gs:per_cpu__xen_vcpu(%rip),%rax'
{standard input}:671: Error: no such instruction: `movll %gs:per_cpu__xen_vcpu(%rip),%rax'
{standard input}:682: Error: no such instruction: `movll %gs:per_cpu__xen_vcpu(%rip),%rax'
{standard input}:783: Error: no such instruction: `movll %gs:per_cpu__pda+8(%rip),%rbx'
{standard input}:834: Error: no such instruction: `movll %gs:per_cpu__xen_mc_irq_flags(%rip),%rdi'
{standard input}:901: Error: no such instruction: `movll %gs:per_cpu__pda+8(%rip),%rbx'
{standard input}:978: Error: no such instruction: `movll %gs:per_cpu__xen_mc_irq_flags(%rip),%rdi'
{standard input}:1064: Error: no such instruction: `movll %gs:per_cpu__pda+8(%rip),%rbx'
{standard input}:1110: Error: no such instruction: `movll %gs:per_cpu__xen_mc_irq_flags(%rip),%rdi'
...
CC arch/x86/vdso/vclock_gettime.o
{standard input}: Assembler messages:
{standard input}:75: Error: suffix or operands invalid for `movs'

(all over the place)

I tried a version to do 64-bit accesses with an explicit "movq" to solve
the "movll" problem, but it generates "movs" on occasion and that was the
point I gave up.

J

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH] x86: collapse the various size-dependent percpu accessors together
2008-06-04 12:09 ` Jeremy Fitzhardinge
@ 2008-06-10 17:21 ` Christoph Lameter
0 siblings, 0 replies; 108+ messages in thread
From: Christoph Lameter @ 2008-06-10 17:21 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Ingo Molnar, Mike Travis, Andrew Morton, David Miller, Eric Dumazet,
linux-kernel

> I tried a version to do 64-bit accesses with an explicit "movq" to solve the
> "movll" problem, but it generates "movs" on occasion and that was the point I
> gave up.

Shucks. Would have been a great approach.

^ permalink raw reply [flat|nested] 108+ messages in thread
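
For reference, the mapping that the per-size switch in percpu.h spells out by hand, and that the %z patch hoped to get from the compiler, goes from operand size in bytes to an AT&T integer suffix (b, w, l, q). The sketch below is illustrative userspace C only, not kernel code and not a model of gcc internals; att_int_suffix() and sized_mnemonic() are hypothetical names introduced here. For an 8-byte operand it produces "movq", which is the mnemonic the accessor needs where the assembler log above shows the rejected "movll".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: AT&T-syntax integer suffix for an operand of
 * `size` bytes. Returns '\0' for sizes the per-size switch does not
 * handle (the kernel macros call __bad_percpu_size() in that case).
 */
char att_int_suffix(size_t size)
{
	switch (size) {
	case 1: return 'b';
	case 2: return 'w';
	case 4: return 'l';
	case 8: return 'q';
	default: return '\0';
	}
}

/*
 * Hypothetical helper: write `op` followed by the size suffix into
 * `buf`, e.g. "mov" and 8 give "movq". Returns 0 on success, -1 if
 * the size is unsupported or `buf` is too small.
 */
int sized_mnemonic(char *buf, size_t len, const char *op, size_t size)
{
	char sfx = att_int_suffix(size);
	int n;

	assert(buf != NULL && op != NULL);

	if (sfx == '\0')
		return -1;

	n = snprintf(buf, len, "%s%c", op, sfx);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling sized_mnemonic(buf, sizeof(buf), "cmpxchg", 1) fills buf with "cmpxchgb", matching the 1-byte case the patch removed; an unsupported size such as 3 returns -1 instead of emitting a mnemonic.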
Thread overview: 108+ messages
2008-06-04 0:30 [PATCH 0/4] percpu: Optimize percpu accesses Mike Travis
2008-06-04 0:30 ` [PATCH 1/4] Zero based percpu: Infrastructure to rebase the per cpu area to zero Mike Travis
2008-06-10 10:06 ` Ingo Molnar
2008-06-04 0:30 ` [PATCH 2/4] x86: Extend percpu ops to 64 bit Mike Travis
2008-06-10 10:04 ` Ingo Molnar
2008-06-04 0:30 ` [PATCH 3/4] x86_64: Fold pda into per cpu area Mike Travis
2008-06-04 12:59 ` Jeremy Fitzhardinge
2008-06-04 13:48 ` Mike Travis
2008-06-04 13:58 ` Jeremy Fitzhardinge
2008-06-04 14:17 ` Mike Travis
2008-06-09 23:18 ` Christoph Lameter
2008-06-05 10:22 ` [crash, bisected] " Ingo Molnar
2008-06-05 16:02 ` Mike Travis
2008-06-06 8:29 ` Jeremy Fitzhardinge
2008-06-06 13:15 ` Mike Travis
2008-06-18 5:34 ` Jeremy Fitzhardinge
2008-06-10 21:31 ` Mike Travis
2008-06-18 17:36 ` Jeremy Fitzhardinge
2008-06-18 18:17 ` Mike Travis
2008-06-18 18:33 ` Ingo Molnar
2008-06-18 19:33 ` Jeremy Fitzhardinge
[not found] ` <48596893.4040908@sgi.com>
[not found] ` <485AADAC.3070301@sgi.com>
[not found] ` <485AB78B.5090904@goop.org>
[not found] ` <485AC120.6010202@sgi.com>
[not found] ` <485AC5D4.6040302@goop.org>
[not found] ` <485ACA8F.10006@sgi.com>
[not found] ` <485ACD92.8050109@sgi.com>
2008-06-19 21:35 ` Jeremy Fitzhardinge
2008-06-19 21:54 ` Jeremy Fitzhardinge
2008-06-19 22:13 ` Mike Travis
2008-06-19 22:21 ` Jeremy Fitzhardinge
2008-06-30 17:49 ` Mike Travis
2008-06-19 22:23 ` Jeremy Fitzhardinge
[not found] ` <485BDB04.4090709@sgi.com>
2008-06-20 17:25 ` Jeremy Fitzhardinge
2008-06-20 17:48 ` Christoph Lameter
2008-06-20 18:30 ` Mike Travis
2008-06-20 18:40 ` Jeremy Fitzhardinge
2008-06-20 18:37 ` Jeremy Fitzhardinge
2008-06-20 18:51 ` Christoph Lameter
2008-06-20 19:04 ` Jeremy Fitzhardinge
2008-06-20 19:21 ` H. Peter Anvin
2008-06-20 19:43 ` Eric W. Biederman
2008-06-20 20:04 ` Mike Travis
2008-06-20 20:37 ` Christoph Lameter
2008-06-20 19:06 ` Mike Travis
2008-06-20 20:25 ` Eric W. Biederman
2008-06-20 20:55 ` Christoph Lameter
2008-06-23 16:55 ` Mike Travis
2008-06-23 17:33 ` Jeremy Fitzhardinge
2008-06-23 18:04 ` Mike Travis
2008-06-23 18:36 ` Mike Travis
2008-06-23 19:41 ` Jeremy Fitzhardinge
2008-06-24 0:02 ` Mike Travis
2008-06-30 17:07 ` Mike Travis
2008-06-30 17:18 ` H. Peter Anvin
2008-06-30 17:57 ` Mike Travis
2008-06-30 20:50 ` Eric W. Biederman
2008-06-30 21:08 ` Jeremy Fitzhardinge
2008-07-01 8:40 ` Eric W. Biederman
2008-07-01 16:27 ` Jeremy Fitzhardinge
2008-07-01 16:55 ` Mike Travis
2008-07-01 16:56 ` H. Peter Anvin
2008-07-01 17:26 ` Jeremy Fitzhardinge
2008-07-01 20:40 ` Eric W. Biederman
2008-07-01 21:10 ` Jeremy Fitzhardinge
2008-07-01 21:39 ` Eric W. Biederman
2008-07-01 21:52 ` Jeremy Fitzhardinge
2008-07-02 0:20 ` H. Peter Anvin
2008-07-02 1:15 ` Mike Travis
2008-07-02 1:32 ` Eric W. Biederman
2008-07-02 1:51 ` Mike Travis
2008-07-02 2:50 ` Eric W. Biederman
2008-07-02 1:40 ` H. Peter Anvin
2008-07-02 1:44 ` Mike Travis
2008-07-02 1:45 ` H. Peter Anvin
2008-07-02 1:55 ` Mike Travis
2008-07-02 22:50 ` Mike Travis
2008-07-03 4:34 ` Eric W. Biederman
2008-07-07 17:17 ` Mike Travis
2008-07-07 19:46 ` Eric W. Biederman
2008-07-08 18:21 ` Mike Travis
2008-07-08 23:36 ` Eric W. Biederman
2008-07-08 23:49 ` Jeremy Fitzhardinge
2008-07-09 14:39 ` Mike Travis
2008-07-25 20:06 ` Mike Travis
2008-07-25 20:12 ` Jeremy Fitzhardinge
2008-07-25 20:34 ` Mike Travis
2008-07-25 20:43 ` Jeremy Fitzhardinge
2008-07-25 21:05 ` Mike Travis
2008-07-09 14:37 ` Mike Travis
2008-07-09 22:38 ` Eric W. Biederman
2008-07-09 23:30 ` Mike Travis
2008-07-10 0:04 ` Eric W. Biederman
2008-07-02 2:01 ` H. Peter Anvin
2008-07-02 3:08 ` Eric W. Biederman
2008-07-01 21:11 ` Andi Kleen
2008-07-01 21:42 ` Eric W. Biederman
2008-07-01 18:41 ` Eric W. Biederman
2008-07-01 12:09 ` Mike Travis
2008-07-01 11:49 ` Mike Travis
2008-06-30 17:43 ` Jeremy Fitzhardinge
2008-06-04 0:30 ` [PATCH 4/4] x86: Replace xxx_pda() operations with x86_xx_percpu() Mike Travis
2008-06-09 13:03 ` Ingo Molnar
2008-06-09 16:08 ` Mike Travis
2008-06-09 17:36 ` Mike Travis
2008-06-09 18:20 ` Christoph Lameter
2008-06-09 23:29 ` Jeremy Fitzhardinge
2008-06-10 10:09 ` Ingo Molnar
2008-06-10 15:07 ` Mike Travis
2008-06-04 10:18 ` [PATCH] x86: collapse the various size-dependent percpu accessors together Jeremy Fitzhardinge
2008-06-04 10:45 ` Jeremy Fitzhardinge
2008-06-04 11:29 ` Ingo Molnar
2008-06-04 12:09 ` Jeremy Fitzhardinge
2008-06-10 17:21 ` Christoph Lameter