* unified percpu stuff
@ 2009-01-27 1:12 Jeremy Fitzhardinge
2009-01-27 4:17 ` Tejun Heo
2009-01-27 5:57 ` unified percpu stuff Brian Gerst
0 siblings, 2 replies; 15+ messages in thread
From: Jeremy Fitzhardinge @ 2009-01-27 1:12 UTC (permalink / raw)
To: Brian Gerst, Tejun Heo; +Cc: Ingo Molnar, Linux Kernel Mailing List
I'm really pleased to see the unified percpu stuff in the kernel, but
unfortunately its breaking Xen at the moment.
It looks like this is just a matter of initializing %gs properly in
xen_start_kernel. Is there any problem with me doing a load_gs_base(0)
somewhere early in xen_start_kernel (arch/x86/xen/enlighten.c)? Is the
initial percpu are and offset for cpu0 all set up? Do I need to make it
#ifdef CONFIG_SMP?
Do I need to do anything for 32-bit? (I haven't tested that yet.)
Thanks,
J
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: unified percpu stuff 2009-01-27 1:12 unified percpu stuff Jeremy Fitzhardinge @ 2009-01-27 4:17 ` Tejun Heo 2009-01-29 10:43 ` Jeremy Fitzhardinge 2009-01-27 5:57 ` unified percpu stuff Brian Gerst 1 sibling, 1 reply; 15+ messages in thread From: Tejun Heo @ 2009-01-27 4:17 UTC (permalink / raw) To: Jeremy Fitzhardinge; +Cc: Brian Gerst, Ingo Molnar, Linux Kernel Mailing List Hello, Jeremy Fitzhardinge wrote: > I'm really pleased to see the unified percpu stuff in the kernel, but > unfortunately its breaking Xen at the moment. > It looks like this is just a matter of initializing %gs properly in > xen_start_kernel. Is there any problem with me doing a load_gs_base(0) > somewhere early in xen_start_kernel (arch/x86/xen/enlighten.c)? No, not at all. Patches welcome. :-) > Is the initial percpu are and offset for cpu0 all set up? Do I need > to make it #ifdef CONFIG_SMP? > > Do I need to do anything for 32-bit? (I haven't tested that yet.) The current tj-percpu[1] contains Brian's commit which consolidates load_gs_base() into switch_to_new_gdt(), so you'll either need to call it instead or call wrmsrl() or loadsegment() directly if that doesn't fit the bill (but I think it will). The initial offset is different between 32 and 64. Please take a look at BOOT_PERCPU_OFFSET definition in setup_percpu.c. Thanks. -- tejun [1] git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git tj-percpu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: unified percpu stuff 2009-01-27 4:17 ` Tejun Heo @ 2009-01-29 10:43 ` Jeremy Fitzhardinge 2009-01-29 11:01 ` Tejun Heo 2009-01-29 13:43 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Brian Gerst 0 siblings, 2 replies; 15+ messages in thread From: Jeremy Fitzhardinge @ 2009-01-29 10:43 UTC (permalink / raw) To: Tejun Heo Cc: Brian Gerst, Ingo Molnar, Linux Kernel Mailing List, Gerd Hoffmann Tejun Heo wrote: > The current tj-percpu[1] contains Brian's commit which consolidates > load_gs_base() into switch_to_new_gdt(), There's a bit of a bootstrap problem using switch_to_new_gdt() on the boot CPU: it uses smp_processor_id, which ends up doing a percpu_read(cpu_number), which naturally crashes because %gs isn't set up yet. Perhaps switch_to_new_gdt() should take the cpu number as a parameter (or at least a __switch_to_new_gdt variant which does). J ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: unified percpu stuff 2009-01-29 10:43 ` Jeremy Fitzhardinge @ 2009-01-29 11:01 ` Tejun Heo 2009-01-29 13:43 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Brian Gerst 1 sibling, 0 replies; 15+ messages in thread From: Tejun Heo @ 2009-01-29 11:01 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Brian Gerst, Ingo Molnar, Linux Kernel Mailing List, Gerd Hoffmann Jeremy Fitzhardinge wrote: > Tejun Heo wrote: >> The current tj-percpu[1] contains Brian's commit which consolidates >> load_gs_base() into switch_to_new_gdt(), > > There's a bit of a bootstrap problem using switch_to_new_gdt() on the > boot CPU: it uses smp_processor_id, which ends up doing a > percpu_read(cpu_number), which naturally crashes because %gs isn't set > up yet. Perhaps switch_to_new_gdt() should take the cpu number as a > parameter (or at least a __switch_to_new_gdt variant which does). Please feel free to submit a patch. It being a pretty low level function, I think it would be better to convert it to take cpu_number. Thanks. -- tejun ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] x86: pass in cpu number to switch_to_new_gdt() 2009-01-29 10:43 ` Jeremy Fitzhardinge 2009-01-29 11:01 ` Tejun Heo @ 2009-01-29 13:43 ` Brian Gerst 2009-01-29 21:36 ` Jeremy Fitzhardinge ` (3 more replies) 1 sibling, 4 replies; 15+ messages in thread From: Brian Gerst @ 2009-01-29 13:43 UTC (permalink / raw) To: Jeremy Fitzhardinge; +Cc: Ingo Molnar, Tejun Heo, linux-kernel Impact: cleanup, prepare for xen boot fix. Xen needs to call this function very early to setup the GDT and per-cpu segments. Remove the call to smp_processor_id() and just pass in the cpu number. Signed-off-by: Brian Gerst <brgerst@gmail.com> --- arch/x86/include/asm/processor.h | 2 +- arch/x86/kernel/cpu/common.c | 7 +++---- arch/x86/kernel/setup_percpu.c | 2 +- arch/x86/kernel/smpboot.c | 2 +- arch/x86/mach-voyager/voyager_smp.c | 11 ++++++----- 5 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index befa20b..1c25eb6 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -768,7 +768,7 @@ extern int sysenter_setup(void); extern struct desc_ptr early_gdt_descr; extern void cpu_set_gdt(int); -extern void switch_to_new_gdt(void); +extern void switch_to_new_gdt(int); extern void cpu_init(void); static inline unsigned long get_debugctlmsr(void) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 652fdc9..6eacd64 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -255,10 +255,9 @@ __u32 cleared_cpu_caps[NCAPINTS] __cpuinitdata; /* Current gdt points %fs at the "master" per-cpu area: after this, * it's on the real one. */ -void switch_to_new_gdt(void) +void switch_to_new_gdt(int cpu) { struct desc_ptr gdt_descr; - int cpu = smp_processor_id(); gdt_descr.address = (long)get_cpu_gdt_table(cpu); gdt_descr.size = GDT_SIZE - 1; @@ -993,7 +992,7 @@ void __cpuinit cpu_init(void) * and set up the GDT descriptor: */ - switch_to_new_gdt(); + switch_to_new_gdt(cpu); loadsegment(fs, 0); load_idt((const struct desc_ptr *)&idt_descr); @@ -1098,7 +1097,7 @@ void __cpuinit cpu_init(void) clear_in_cr4(X86_CR4_VME|X86_CR4_PVI|X86_CR4_TSD|X86_CR4_DE); load_idt(&idt_descr); - switch_to_new_gdt(); + switch_to_new_gdt(cpu); /* * Set up and load the per-CPU TSS and LDT diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c index 0d1e7ac..ef91747 100644 --- a/arch/x86/kernel/setup_percpu.c +++ b/arch/x86/kernel/setup_percpu.c @@ -122,7 +122,7 @@ void __init setup_per_cpu_areas(void) * area. Reload any changed state for the boot CPU. */ if (cpu == boot_cpu_id) - switch_to_new_gdt(); + switch_to_new_gdt(cpu); DBG("PERCPU: cpu %4d %p\n", cpu, ptr); } diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index f9dbcff..612d3c7 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1185,7 +1185,7 @@ out: void __init native_smp_prepare_boot_cpu(void) { int me = smp_processor_id(); - switch_to_new_gdt(); + switch_to_new_gdt(me); /* already set me in cpu_online_mask in boot_cpu_init() */ cpumask_set_cpu(me, cpu_callout_mask); per_cpu(cpu_state, me) = CPU_ONLINE; diff --git a/arch/x86/mach-voyager/voyager_smp.c b/arch/x86/mach-voyager/voyager_smp.c index 331cd6d..58c7cac 100644 --- a/arch/x86/mach-voyager/voyager_smp.c +++ b/arch/x86/mach-voyager/voyager_smp.c @@ -1746,12 +1746,13 @@ static void __init voyager_smp_prepare_cpus(unsigned int max_cpus) static void __cpuinit voyager_smp_prepare_boot_cpu(void) { - switch_to_new_gdt(); + int cpu = smp_processor_id(); + switch_to_new_gdt(cpu); - cpu_set(smp_processor_id(), cpu_online_map); - cpu_set(smp_processor_id(), cpu_callout_map); - cpu_set(smp_processor_id(), cpu_possible_map); - cpu_set(smp_processor_id(), cpu_present_map); + cpu_set(cpu, cpu_online_map); + cpu_set(cpu, cpu_callout_map); + cpu_set(cpu, cpu_possible_map); + cpu_set(cpu, cpu_present_map); } static int __cpuinit voyager_cpu_up(unsigned int cpu) -- 1.6.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] x86: pass in cpu number to switch_to_new_gdt() 2009-01-29 13:43 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Brian Gerst @ 2009-01-29 21:36 ` Jeremy Fitzhardinge 2009-01-29 21:36 ` [PATCH 1/2] x86: split loading percpu segments from loading gdt Jeremy Fitzhardinge ` (2 subsequent siblings) 3 siblings, 0 replies; 15+ messages in thread From: Jeremy Fitzhardinge @ 2009-01-29 21:36 UTC (permalink / raw) To: Brian Gerst; +Cc: Ingo Molnar, Tejun Heo, linux-kernel Brian Gerst wrote: > Impact: cleanup, prepare for xen boot fix. > > Xen needs to call this function very early to setup the GDT and > per-cpu segments. Remove the call to smp_processor_id() and just > pass in the cpu number. > Unfortunate that still not sufficient for a Xen boot. I had to separate the gdt reload from the segment register reload. Patches follow. J ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] x86: split loading percpu segments from loading gdt 2009-01-29 13:43 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Brian Gerst 2009-01-29 21:36 ` Jeremy Fitzhardinge @ 2009-01-29 21:36 ` Jeremy Fitzhardinge 2009-01-29 21:36 ` [PATCH 2/2] xen: setup percpu data pointers Jeremy Fitzhardinge 2009-01-30 8:50 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Tejun Heo 3 siblings, 0 replies; 15+ messages in thread From: Jeremy Fitzhardinge @ 2009-01-29 21:36 UTC (permalink / raw) To: Brian Gerst; +Cc: Ingo Molnar, Tejun Heo, linux-kernel Xen needs to be able to access percpu data from very early on. For various reasons, it cannot also load the gdt at that time. It does, however, have a pefectly functional gdt at that point, so there's no pressing need to reload the gdt. Split the function to load the segment registers off, so Xen can call it directly. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/include/asm/processor.h | 1 + arch/x86/kernel/cpu/common.c | 18 ++++++++++++------ 2 files changed, 13 insertions(+), 6 deletions(-) =================================================================== --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -769,6 +769,7 @@ extern void cpu_set_gdt(int); extern void switch_to_new_gdt(int); +extern void load_percpu_segment(int); extern void cpu_init(void); static inline unsigned long get_debugctlmsr(void) =================================================================== --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -297,6 +297,16 @@ __u32 cleared_cpu_caps[NCAPINTS] __cpuinitdata; +void load_percpu_segment(int cpu) +{ +#ifdef CONFIG_X86_32 + loadsegment(fs, __KERNEL_PERCPU); +#else + loadsegment(gs, 0); + wrmsrl(MSR_GS_BASE, (unsigned long)per_cpu(irq_stack_union.gs_base, cpu)); +#endif +} + /* Current gdt points %fs at the "master" per-cpu area: after this, * it's on the real one. */ void switch_to_new_gdt(int cpu) @@ -307,12 +317,8 @@ gdt_descr.size = GDT_SIZE - 1; load_gdt(&gdt_descr); /* Reload the per-cpu base */ -#ifdef CONFIG_X86_32 - loadsegment(fs, __KERNEL_PERCPU); -#else - loadsegment(gs, 0); - wrmsrl(MSR_GS_BASE, (unsigned long)per_cpu(irq_stack_union.gs_base, cpu)); -#endif + + load_percpu_segment(cpu); } static struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {}; ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/2] xen: setup percpu data pointers 2009-01-29 13:43 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Brian Gerst 2009-01-29 21:36 ` Jeremy Fitzhardinge 2009-01-29 21:36 ` [PATCH 1/2] x86: split loading percpu segments from loading gdt Jeremy Fitzhardinge @ 2009-01-29 21:36 ` Jeremy Fitzhardinge 2009-01-30 8:50 ` Tejun Heo 2009-01-30 8:50 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Tejun Heo 3 siblings, 1 reply; 15+ messages in thread From: Jeremy Fitzhardinge @ 2009-01-29 21:36 UTC (permalink / raw) To: Brian Gerst; +Cc: Ingo Molnar, Tejun Heo, linux-kernel We need to access percpu data fairly early, so set up the percpu registers as soon as possible. We only need to load the appropriate segment register. We already have a GDT, but its hard to change it early because we need to manipulate the pagetable to do so, and that hasn't been set up yet. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/xen/enlighten.c | 3 +++ arch/x86/xen/smp.c | 5 ++++- 2 files changed, 7 insertions(+), 1 deletion(-) =================================================================== --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1646,6 +1640,9 @@ have_vcpu_info_placement = 0; #endif + /* setup percpu state */ + load_percpu_segment(0); + xen_smp_init(); /* Get mfn list */ =================================================================== --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -170,7 +170,8 @@ /* We've switched to the "real" per-cpu gdt, so make sure the old memory can be recycled */ - make_lowmem_page_readwrite(&per_cpu_var(gdt_page)); + make_lowmem_page_readwrite(__per_cpu_load + + (unsigned long)&per_cpu_var(gdt_page)); xen_setup_vcpu_info_placement(); } @@ -235,6 +236,8 @@ ctxt->user_regs.ss = __KERNEL_DS; #ifdef CONFIG_X86_32 ctxt->user_regs.fs = __KERNEL_PERCPU; +#else + ctxt->gs_base_kernel = per_cpu_offset(cpu); #endif ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle; ctxt->user_regs.eflags = 0x1000; /* IOPL_RING1 */ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] xen: setup percpu data pointers 2009-01-29 21:36 ` [PATCH 2/2] xen: setup percpu data pointers Jeremy Fitzhardinge @ 2009-01-30 8:50 ` Tejun Heo 0 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2009-01-30 8:50 UTC (permalink / raw) To: Jeremy Fitzhardinge; +Cc: Brian Gerst, Ingo Molnar, linux-kernel Jeremy Fitzhardinge wrote: > We need to access percpu data fairly early, so set up the percpu > registers as soon as possible. We only need to load the appropriate > segment register. We already have a GDT, but its hard to change it > early because we need to manipulate the pagetable to do so, and that > hasn't been set up yet. > > Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Committed 1-2 to #tj-percpu. Thanks. -- tejun ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] x86: pass in cpu number to switch_to_new_gdt() 2009-01-29 13:43 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Brian Gerst ` (2 preceding siblings ...) 2009-01-29 21:36 ` [PATCH 2/2] xen: setup percpu data pointers Jeremy Fitzhardinge @ 2009-01-30 8:50 ` Tejun Heo 3 siblings, 0 replies; 15+ messages in thread From: Tejun Heo @ 2009-01-30 8:50 UTC (permalink / raw) To: Brian Gerst; +Cc: Jeremy Fitzhardinge, Ingo Molnar, linux-kernel Brian Gerst wrote: > Impact: cleanup, prepare for xen boot fix. > > Xen needs to call this function very early to setup the GDT and > per-cpu segments. Remove the call to smp_processor_id() and just > pass in the cpu number. > > Signed-off-by: Brian Gerst <brgerst@gmail.com> Committed to #tj-percpu. Thanks. -- tejun ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: unified percpu stuff 2009-01-27 1:12 unified percpu stuff Jeremy Fitzhardinge 2009-01-27 4:17 ` Tejun Heo @ 2009-01-27 5:57 ` Brian Gerst 2009-01-27 7:24 ` Jeremy Fitzhardinge 1 sibling, 1 reply; 15+ messages in thread From: Brian Gerst @ 2009-01-27 5:57 UTC (permalink / raw) To: Jeremy Fitzhardinge; +Cc: Tejun Heo, Ingo Molnar, Linux Kernel Mailing List On Mon, Jan 26, 2009 at 8:12 PM, Jeremy Fitzhardinge <jeremy@goop.org> wrote: > I'm really pleased to see the unified percpu stuff in the kernel, but > unfortunately its breaking Xen at the moment. > It looks like this is just a matter of initializing %gs properly in > xen_start_kernel. Is there any problem with me doing a load_gs_base(0) > somewhere early in xen_start_kernel (arch/x86/xen/enlighten.c)? Some of the changes I did made the assumption that the percpu state is set up early in head_xx.S, which apparently is skipped for xen. Is there documentation anywhere for the xen bootstrap process? Try this patch (untested): diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index bef941f..b90d061 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1647,6 +1647,8 @@ asmlinkage void __init xen_start_kernel(void) have_vcpu_info_placement = 0; #endif + switch_to_new_gdt(); + xen_smp_init(); /* Get mfn list */ diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 7735e3d..00d9265 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -235,6 +235,8 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle) ctxt->user_regs.ss = __KERNEL_DS; #ifdef CONFIG_X86_32 ctxt->user_regs.fs = __KERNEL_PERCPU; +#else + ctxt->gs_base_kernel = per_cpu_offset(cpu); #endif ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle; ctxt->user_regs.eflags = 0x1000; /* IOPL_RING1 */ -- Brian Gerst ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: unified percpu stuff 2009-01-27 5:57 ` unified percpu stuff Brian Gerst @ 2009-01-27 7:24 ` Jeremy Fitzhardinge 2009-01-27 12:35 ` Brian Gerst 0 siblings, 1 reply; 15+ messages in thread From: Jeremy Fitzhardinge @ 2009-01-27 7:24 UTC (permalink / raw) To: Brian Gerst; +Cc: Tejun Heo, Ingo Molnar, Linux Kernel Mailing List Brian Gerst wrote: > On Mon, Jan 26, 2009 at 8:12 PM, Jeremy Fitzhardinge <jeremy@goop.org> wrote: > >> I'm really pleased to see the unified percpu stuff in the kernel, but >> unfortunately its breaking Xen at the moment. >> It looks like this is just a matter of initializing %gs properly in >> xen_start_kernel. Is there any problem with me doing a load_gs_base(0) >> somewhere early in xen_start_kernel (arch/x86/xen/enlighten.c)? >> > > Some of the changes I did made the assumption that the percpu state is > set up early in head_xx.S, which apparently is skipped for xen. Is > there documentation anywhere for the xen bootstrap process? > Not really. Its quite different from native because the guest kernel starts up in protected/long mode with paging enabled, using a Xen-provided pagetable. It means that most of the normal head.S stuff is redundant; the kernel starts executing more or less exactly at xen_start_kernel (there's a 2 instruction asm part which stashes away the Xen info pointer from %[re]si, then jumps to xen_start_kernel). > Try this patch (untested): > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index bef941f..b90d061 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -1647,6 +1647,8 @@ asmlinkage void __init xen_start_kernel(void) > have_vcpu_info_placement = 0; > #endif > > + switch_to_new_gdt(); > + > xen_smp_init(); > > /* Get mfn list */ > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index 7735e3d..00d9265 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -235,6 +235,8 @@ cpu_initialize_context(unsigned int cpu, struct > task_struct *idle) > ctxt->user_regs.ss = __KERNEL_DS; > #ifdef CONFIG_X86_32 > ctxt->user_regs.fs = __KERNEL_PERCPU; > +#else > + ctxt->gs_base_kernel = per_cpu_offset(cpu); > #endif > ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle; > ctxt->user_regs.eflags = 0x1000; /* IOPL_RING1 */ > Thanks, I'll try this out. BTW, does the initial cpu0 percpu area get reallocated and moved during boot, or does cpu0 keep using the same memory forever? Thanks, J ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: unified percpu stuff 2009-01-27 7:24 ` Jeremy Fitzhardinge @ 2009-01-27 12:35 ` Brian Gerst 2009-01-27 17:50 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 15+ messages in thread From: Brian Gerst @ 2009-01-27 12:35 UTC (permalink / raw) To: Jeremy Fitzhardinge; +Cc: Tejun Heo, Ingo Molnar, Linux Kernel Mailing List On Tue, Jan 27, 2009 at 2:24 AM, Jeremy Fitzhardinge <jeremy@goop.org> wrote: > Brian Gerst wrote: >> >> On Mon, Jan 26, 2009 at 8:12 PM, Jeremy Fitzhardinge <jeremy@goop.org> >> wrote: >> >>> >>> I'm really pleased to see the unified percpu stuff in the kernel, but >>> unfortunately its breaking Xen at the moment. >>> It looks like this is just a matter of initializing %gs properly in >>> xen_start_kernel. Is there any problem with me doing a load_gs_base(0) >>> somewhere early in xen_start_kernel (arch/x86/xen/enlighten.c)? >>> >> >> Some of the changes I did made the assumption that the percpu state is >> set up early in head_xx.S, which apparently is skipped for xen. Is >> there documentation anywhere for the xen bootstrap process? >> > > Not really. Its quite different from native because the guest kernel starts > up in protected/long mode with paging enabled, using a Xen-provided > pagetable. It means that most of the normal head.S stuff is redundant; the > kernel starts executing more or less exactly at xen_start_kernel (there's a > 2 instruction asm part which stashes away the Xen info pointer from %[re]si, > then jumps to xen_start_kernel). > >> Try this patch (untested): >> >> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c >> index bef941f..b90d061 100644 >> --- a/arch/x86/xen/enlighten.c >> +++ b/arch/x86/xen/enlighten.c >> @@ -1647,6 +1647,8 @@ asmlinkage void __init xen_start_kernel(void) >> have_vcpu_info_placement = 0; >> #endif >> >> + switch_to_new_gdt(); >> + >> xen_smp_init(); >> >> /* Get mfn list */ >> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c >> index 7735e3d..00d9265 100644 >> --- a/arch/x86/xen/smp.c >> +++ b/arch/x86/xen/smp.c >> @@ -235,6 +235,8 @@ cpu_initialize_context(unsigned int cpu, struct >> task_struct *idle) >> ctxt->user_regs.ss = __KERNEL_DS; >> #ifdef CONFIG_X86_32 >> ctxt->user_regs.fs = __KERNEL_PERCPU; >> +#else >> + ctxt->gs_base_kernel = per_cpu_offset(cpu); >> #endif >> ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle; >> ctxt->user_regs.eflags = 0x1000; /* IOPL_RING1 */ >> > > Thanks, I'll try this out. > > BTW, does the initial cpu0 percpu area get reallocated and moved during > boot, or does cpu0 keep using the same memory forever? It is reallocated in setup_per_cpu_areas(). -- Brian Gerst ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: unified percpu stuff 2009-01-27 12:35 ` Brian Gerst @ 2009-01-27 17:50 ` Jeremy Fitzhardinge 2009-01-27 19:07 ` Brian Gerst 0 siblings, 1 reply; 15+ messages in thread From: Jeremy Fitzhardinge @ 2009-01-27 17:50 UTC (permalink / raw) To: Brian Gerst; +Cc: Tejun Heo, Ingo Molnar, Linux Kernel Mailing List Brian Gerst wrote: >> BTW, does the initial cpu0 percpu area get reallocated and moved during >> boot, or does cpu0 keep using the same memory forever? >> > > It is reallocated in setup_per_cpu_areas(). > Hm, OK. What's the reason we need to move cpu 0's percpu area? Its always seemed a bit awkward. Once it has been moved, how can I find the address of a variable in the original boot-time cpu 0 percpu area? There's a couple of pages which Xen will have marked RO which need to be made RW if they're being freed back into the kernel pool. Currently I use per_cpu_var(gdt_page), but guess that's a small offset rather than a directly usable address. Thanks, J ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: unified percpu stuff 2009-01-27 17:50 ` Jeremy Fitzhardinge @ 2009-01-27 19:07 ` Brian Gerst 0 siblings, 0 replies; 15+ messages in thread From: Brian Gerst @ 2009-01-27 19:07 UTC (permalink / raw) To: Jeremy Fitzhardinge; +Cc: Tejun Heo, Ingo Molnar, Linux Kernel Mailing List On Tue, Jan 27, 2009 at 12:50 PM, Jeremy Fitzhardinge <jeremy@goop.org> wrote: > Brian Gerst wrote: >>> >>> BTW, does the initial cpu0 percpu area get reallocated and moved during >>> boot, or does cpu0 keep using the same memory forever? >>> >> >> It is reallocated in setup_per_cpu_areas(). >> > > Hm, OK. What's the reason we need to move cpu 0's percpu area? Its always > seemed a bit awkward. > > Once it has been moved, how can I find the address of a variable in the > original boot-time cpu 0 percpu area? There's a couple of pages which Xen > will have marked RO which need to be made RW if they're being freed back > into the kernel pool. Currently I use per_cpu_var(gdt_page), but guess > that's a small offset rather than a directly usable address. The fact that the boot cpu area is moved isn't new. It's been that way for a while. The reason it is moved is so that extra space can be allocated at the end for dynamic allocations. It looks like Xen just avoided it by not loading the per-cpu GDT until after the copy. I guess you could get by without reloading the GDT, but you would still need to set MSR_GS_BASE for 64-bit. 32-bit happens to work because %fs is probably initialized to the same segment as %ds. If 32-bit were ever to switch to zero-based per-cpu like 64-bit, that assumption would fail. You need to add __per_cpu_load to the offset to get the address of the variable in the original boot per-cpu area. -- Brian Gerst ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2009-01-30 8:51 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-01-27 1:12 unified percpu stuff Jeremy Fitzhardinge 2009-01-27 4:17 ` Tejun Heo 2009-01-29 10:43 ` Jeremy Fitzhardinge 2009-01-29 11:01 ` Tejun Heo 2009-01-29 13:43 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Brian Gerst 2009-01-29 21:36 ` Jeremy Fitzhardinge 2009-01-29 21:36 ` [PATCH 1/2] x86: split loading percpu segments from loading gdt Jeremy Fitzhardinge 2009-01-29 21:36 ` [PATCH 2/2] xen: setup percpu data pointers Jeremy Fitzhardinge 2009-01-30 8:50 ` Tejun Heo 2009-01-30 8:50 ` [PATCH] x86: pass in cpu number to switch_to_new_gdt() Tejun Heo 2009-01-27 5:57 ` unified percpu stuff Brian Gerst 2009-01-27 7:24 ` Jeremy Fitzhardinge 2009-01-27 12:35 ` Brian Gerst 2009-01-27 17:50 ` Jeremy Fitzhardinge 2009-01-27 19:07 ` Brian Gerst
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox