* [PATCH][RFC] x86_64: Reload CS when startup_64 is used. @ 2006-08-21 9:54 Magnus Damm 2006-08-21 10:19 ` Andi Kleen 2006-08-21 21:02 ` Eric W. Biederman 0 siblings, 2 replies; 23+ messages in thread From: Magnus Damm @ 2006-08-21 9:54 UTC (permalink / raw) To: fastboot, linux-kernel; +Cc: Magnus Damm, ebiederm, ak x86_64: Reload CS when startup_64 is used. The current x86_64 startup code never reloads CS during the early boot process if the 64-bit function startup_64 is used as entry point. The 32-bit entry point startup_32 does the right thing and reloads CS, and this is what most people are using if they use bzImage. This patch fixes the case when the Linux kernel is booted into using kexec under Xen. The Xen hypervisor is using large CS values which makes the x86_64 kernel fail - but only if vmlinux is booted, bzImage works well because it is using the 32-bit entry point. The main question is if we require that the boot loader should setup CS to some certain offset to be able to boot the kernel. The sane solution IMO should be that the kernel requires that the loaded descriptors are correct, but that the exact offset within the GDT the boot loader is using should not matter. This is the way the i386 boot works if I understand things correctly. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> --- Applies on top of 2.6.18-rc4 head.S | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) --- 0001/arch/x86_64/kernel/head.S +++ work/arch/x86_64/kernel/head.S 2006-08-21 18:22:57.000000000 +0900 @@ -165,6 +165,25 @@ startup_64: */ lgdt cpu_gdt_descr + /* Reload CS with a value that is within our GDT. We need to do this + * if we were loaded by a 64 bit bootloader that happened to use a + * CS that is larger than the GDT limit. This is true if we came here + * from kexec running under Xen. + */ + movq %rsp, %rdx + movq $__KERNEL_DS, %rax + pushq %rax /* SS */ + pushq %rdx /* RSP */ + movq $__KERNEL_CS, %rax + movq $cs_reloaded, %rdx + pushq %rax /* CS */ + pushq %rdx /* RIP */ + lretq + +cs_reloaded: + /* Setup the boot time stack again */ + movq init_rsp(%rip),%rsp + /* * Setup up a dummy PDA. this is just for some early bootup code * that does in_interrupt() ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 9:54 [PATCH][RFC] x86_64: Reload CS when startup_64 is used Magnus Damm @ 2006-08-21 10:19 ` Andi Kleen 2006-08-21 13:29 ` [Fastboot] " Magnus Damm 2006-08-21 21:02 ` Eric W. Biederman 1 sibling, 1 reply; 23+ messages in thread From: Andi Kleen @ 2006-08-21 10:19 UTC (permalink / raw) To: Magnus Damm; +Cc: fastboot, linux-kernel, ebiederm > > + /* Reload CS with a value that is within our GDT. We need to do this > + * if we were loaded by a 64 bit bootloader that happened to use a > + * CS that is larger than the GDT limit. This is true if we came here > + * from kexec running under Xen. > + */ > + movq %rsp, %rdx > + movq $__KERNEL_DS, %rax > + pushq %rax /* SS */ > + pushq %rdx /* RSP */ > + movq $__KERNEL_CS, %rax > + movq $cs_reloaded, %rdx > + pushq %rax /* CS */ > + pushq %rdx /* RIP */ > + lretq Can't you just use a normal far jump? That might be simpler. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 10:19 ` Andi Kleen @ 2006-08-21 13:29 ` Magnus Damm 2006-08-21 14:16 ` Andi Kleen 2006-08-21 14:17 ` Vivek Goyal 0 siblings, 2 replies; 23+ messages in thread From: Magnus Damm @ 2006-08-21 13:29 UTC (permalink / raw) To: Andi Kleen; +Cc: Magnus Damm, fastboot, linux-kernel, ebiederm On 8/21/06, Andi Kleen <ak@suse.de> wrote: > > > > > + /* Reload CS with a value that is within our GDT. We need to do this > > + * if we were loaded by a 64 bit bootloader that happened to use a > > + * CS that is larger than the GDT limit. This is true if we came here > > + * from kexec running under Xen. > > + */ > > + movq %rsp, %rdx > > + movq $__KERNEL_DS, %rax > > + pushq %rax /* SS */ > > + pushq %rdx /* RSP */ > > + movq $__KERNEL_CS, %rax > > + movq $cs_reloaded, %rdx > > + pushq %rax /* CS */ > > + pushq %rdx /* RIP */ > > + lretq > > Can't you just use a normal far jump? That might be simpler. I couldn't find a far jump that took a 64-bit address to jump to. But I guess that the kernel will be loaded in the lowest 4G regardless so I guess 32-bit pointers are ok, right? That will make it simpler for sure. What do you think about reloading CS? Is it the right thing to do, or is it correct as it is today where we depend on that CS == _KERNEL_CS? I need to fix kexec-tools regardless, but maybe it is a good idea to make the 64-bit kernel boot a bit robust too. Thanks, / magnus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 13:29 ` [Fastboot] " Magnus Damm @ 2006-08-21 14:16 ` Andi Kleen 2006-08-22 0:47 ` Magnus Damm 2006-08-21 14:17 ` Vivek Goyal 1 sibling, 1 reply; 23+ messages in thread From: Andi Kleen @ 2006-08-21 14:16 UTC (permalink / raw) To: Magnus Damm; +Cc: Magnus Damm, fastboot, linux-kernel, ebiederm On Monday 21 August 2006 15:29, Magnus Damm wrote: > On 8/21/06, Andi Kleen <ak@suse.de> wrote: > > > > > > > > + /* Reload CS with a value that is within our GDT. We need to do this > > > + * if we were loaded by a 64 bit bootloader that happened to use a > > > + * CS that is larger than the GDT limit. This is true if we came here > > > + * from kexec running under Xen. > > > + */ > > > + movq %rsp, %rdx > > > + movq $__KERNEL_DS, %rax > > > + pushq %rax /* SS */ > > > + pushq %rdx /* RSP */ > > > + movq $__KERNEL_CS, %rax > > > + movq $cs_reloaded, %rdx > > > + pushq %rax /* CS */ > > > + pushq %rdx /* RIP */ > > > + lretq > > > > Can't you just use a normal far jump? That might be simpler. > > I couldn't find a far jump that took a 64-bit address to jump to. But > I guess that the kernel will be loaded in the lowest 4G regardless so > I guess 32-bit pointers are ok, right? That will make it simpler for > sure. Yes, that code always runs in the identity mapping and at 2MB. > > What do you think about reloading CS? Is it the right thing to do, or > is it correct as it is today where we depend on that CS == _KERNEL_CS? > I need to fix kexec-tools regardless, but maybe it is a good idea to > make the 64-bit kernel boot a bit robust too. Reloading CS is ok, although longer term I plan to switch the kernel to uncompress already in 64bit. Then you would need the same GDT anyways. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 14:16 ` Andi Kleen @ 2006-08-22 0:47 ` Magnus Damm 0 siblings, 0 replies; 23+ messages in thread From: Magnus Damm @ 2006-08-22 0:47 UTC (permalink / raw) To: Andi Kleen; +Cc: Magnus Damm, fastboot, linux-kernel, ebiederm On Mon, 2006-08-21 at 16:16 +0200, Andi Kleen wrote: > On Monday 21 August 2006 15:29, Magnus Damm wrote: > > On 8/21/06, Andi Kleen <ak@suse.de> wrote: > > > > > > > > > > > + /* Reload CS with a value that is within our GDT. We need to do this > > > > + * if we were loaded by a 64 bit bootloader that happened to use a > > > > + * CS that is larger than the GDT limit. This is true if we came here > > > > + * from kexec running under Xen. > > > > + */ > > > > + movq %rsp, %rdx > > > > + movq $__KERNEL_DS, %rax > > > > + pushq %rax /* SS */ > > > > + pushq %rdx /* RSP */ > > > > + movq $__KERNEL_CS, %rax > > > > + movq $cs_reloaded, %rdx > > > > + pushq %rax /* CS */ > > > > + pushq %rdx /* RIP */ > > > > + lretq > > > > > > Can't you just use a normal far jump? That might be simpler. > > > > I couldn't find a far jump that took a 64-bit address to jump to. But > > I guess that the kernel will be loaded in the lowest 4G regardless so > > I guess 32-bit pointers are ok, right? That will make it simpler for > > sure. > > Yes, that code always runs in the identity mapping and at 2MB. > > > > > What do you think about reloading CS? Is it the right thing to do, or > > is it correct as it is today where we depend on that CS == _KERNEL_CS? > > I need to fix kexec-tools regardless, but maybe it is a good idea to > > make the 64-bit kernel boot a bit robust too. > > Reloading CS is ok, although longer term I plan to switch the kernel > to uncompress already in 64bit. Then you would need the same GDT anyways. I think reloading CS is the right thing to do. IMO it is not sane to depend on that the 64-bit boot loader sets up CS to 0x18 for us. Having a dependency like that (unless there is a good reason and it is documented somehow) is good to avoid, regardless of 64-bit uncompress or not. I mean, if you plan on making the bzImage code 64-bit then it needs to reload CS too, right? Thanks, / magnus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 13:29 ` [Fastboot] " Magnus Damm 2006-08-21 14:16 ` Andi Kleen @ 2006-08-21 14:17 ` Vivek Goyal 2006-08-21 14:24 ` Andi Kleen 1 sibling, 1 reply; 23+ messages in thread From: Vivek Goyal @ 2006-08-21 14:17 UTC (permalink / raw) To: Magnus Damm; +Cc: Andi Kleen, Magnus Damm, fastboot, linux-kernel, ebiederm On Mon, Aug 21, 2006 at 10:29:36PM +0900, Magnus Damm wrote: > On 8/21/06, Andi Kleen <ak@suse.de> wrote: > > > >> > >> + /* Reload CS with a value that is within our GDT. We need to do > >this > >> + * if we were loaded by a 64 bit bootloader that happened to use a > >> + * CS that is larger than the GDT limit. This is true if we came > >here > >> + * from kexec running under Xen. > >> + */ > >> + movq %rsp, %rdx > >> + movq $__KERNEL_DS, %rax > >> + pushq %rax /* SS */ > >> + pushq %rdx /* RSP */ > >> + movq $__KERNEL_CS, %rax > >> + movq $cs_reloaded, %rdx > >> + pushq %rax /* CS */ > >> + pushq %rdx /* RIP */ > >> + lretq > > > >Can't you just use a normal far jump? That might be simpler. > > I couldn't find a far jump that took a 64-bit address to jump to. But > I guess that the kernel will be loaded in the lowest 4G regardless so > I guess 32-bit pointers are ok, right? That will make it simpler for > sure. > Given the idea of relocatable kernel is floating around I would prefer if we are not bounded by the restriction of loading a kernel in lowest 4G. Thanks Vivek ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 14:17 ` Vivek Goyal @ 2006-08-21 14:24 ` Andi Kleen 2006-08-21 14:46 ` Vivek Goyal 0 siblings, 1 reply; 23+ messages in thread From: Andi Kleen @ 2006-08-21 14:24 UTC (permalink / raw) To: vgoyal; +Cc: Magnus Damm, Magnus Damm, fastboot, linux-kernel, ebiederm > Given the idea of relocatable kernel is floating around I would prefer if > we are not bounded by the restriction of loading a kernel in lowest 4G. There is already other code that requires this. In fact i don't think it can be above 40MB currently. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 14:24 ` Andi Kleen @ 2006-08-21 14:46 ` Vivek Goyal 2006-08-21 15:04 ` Andi Kleen 0 siblings, 1 reply; 23+ messages in thread From: Vivek Goyal @ 2006-08-21 14:46 UTC (permalink / raw) To: Andi Kleen; +Cc: Magnus Damm, Magnus Damm, fastboot, linux-kernel, ebiederm On Mon, Aug 21, 2006 at 04:24:10PM +0200, Andi Kleen wrote: > > > Given the idea of relocatable kernel is floating around I would prefer if > > we are not bounded by the restriction of loading a kernel in lowest 4G. > > There is already other code that requires this. In fact i don't think it can > be above 40MB currently. > But I think Eric's prototype patches for relocatable kernel do get over this limitation (Hope I understood the code right). Assuming that relocatable kernel patches will be merged down the line, it would be nice not to be bound by 4G limitation. Thanks Vivek ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 14:46 ` Vivek Goyal @ 2006-08-21 15:04 ` Andi Kleen 2006-08-21 20:02 ` Eric W. Biederman 0 siblings, 1 reply; 23+ messages in thread From: Andi Kleen @ 2006-08-21 15:04 UTC (permalink / raw) To: vgoyal; +Cc: Magnus Damm, Magnus Damm, fastboot, linux-kernel, ebiederm On Monday 21 August 2006 16:46, Vivek Goyal wrote: > On Mon, Aug 21, 2006 at 04:24:10PM +0200, Andi Kleen wrote: > > > > > Given the idea of relocatable kernel is floating around I would prefer if > > > we are not bounded by the restriction of loading a kernel in lowest 4G. > > > > There is already other code that requires this. In fact i don't think it can > > be above 40MB currently. > > > > But I think Eric's prototype patches for relocatable kernel do get over > this limitation (Hope I understood the code right). Assuming that relocatable > kernel patches will be merged down the line, it would be nice not to be > bound by 4G limitation. He may have fixed the 40MB issue, but I very much doubt he changed the 2GB limitation because that would be a major change. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 15:04 ` Andi Kleen @ 2006-08-21 20:02 ` Eric W. Biederman 2006-08-21 20:10 ` Andi Kleen 0 siblings, 1 reply; 23+ messages in thread From: Eric W. Biederman @ 2006-08-21 20:02 UTC (permalink / raw) To: Andi Kleen Cc: vgoyal, Magnus Damm, Magnus Damm, fastboot, linux-kernel, ebiederm Andi Kleen <ak@suse.de> writes: > On Monday 21 August 2006 16:46, Vivek Goyal wrote: >> On Mon, Aug 21, 2006 at 04:24:10PM +0200, Andi Kleen wrote: >> > >> > > Given the idea of relocatable kernel is floating around I would prefer if >> > > we are not bounded by the restriction of loading a kernel in lowest 4G. >> > >> > There is already other code that requires this. In fact i don't think it can >> > be above 40MB currently. >> > >> >> But I think Eric's prototype patches for relocatable kernel do get over >> this limitation (Hope I understood the code right). Assuming that relocatable >> kernel patches will be merged down the line, it would be nice not to be >> bound by 4G limitation. > > He may have fixed the 40MB issue, but I very much doubt he changed the 2GB > limitation because that would be a major change. I'm not certain I caught everything but as far as I know I did. Part of that was by having the code run at a fixed virtual address so we still live in the last 2GB of the virtual address space. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 20:02 ` Eric W. Biederman @ 2006-08-21 20:10 ` Andi Kleen 2006-08-21 21:00 ` Eric W. Biederman 0 siblings, 1 reply; 23+ messages in thread From: Andi Kleen @ 2006-08-21 20:10 UTC (permalink / raw) To: Eric W. Biederman Cc: vgoyal, Magnus Damm, Magnus Damm, fastboot, linux-kernel > I'm not certain I caught everything but as far as I know I did. > Part of that was by having the code run at a fixed virtual address so > we still live in the last 2GB of the virtual address space. You changed the -2GB (or rather -40MB unpatched) mapping to not necessarily be linear? There are a couple of assumptions that it is, including at boot up (it doubles as the 1:1 mapping then) and in change_page_attr() and in suspend/resume. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 20:10 ` Andi Kleen @ 2006-08-21 21:00 ` Eric W. Biederman 0 siblings, 0 replies; 23+ messages in thread From: Eric W. Biederman @ 2006-08-21 21:00 UTC (permalink / raw) To: Andi Kleen; +Cc: vgoyal, Magnus Damm, Magnus Damm, fastboot, linux-kernel Andi Kleen <ak@suse.de> writes: >> I'm not certain I caught everything but as far as I know I did. >> Part of that was by having the code run at a fixed virtual address so >> we still live in the last 2GB of the virtual address space. > > You changed the -2GB (or rather -40MB unpatched) mapping to not necessarily > be linear? I left it linear but I removed assumptions about which physical address it goes to. > There are a couple of assumptions that it is, including at boot up > (it doubles as the 1:1 mapping then) and in change_page_attr() and in > suspend/resume. Yes. Those all sound familiar. I removed the doubling as the 1:1 physical mapping. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 9:54 [PATCH][RFC] x86_64: Reload CS when startup_64 is used Magnus Damm 2006-08-21 10:19 ` Andi Kleen @ 2006-08-21 21:02 ` Eric W. Biederman 2006-08-22 0:58 ` Magnus Damm 1 sibling, 1 reply; 23+ messages in thread From: Eric W. Biederman @ 2006-08-21 21:02 UTC (permalink / raw) To: Magnus Damm; +Cc: fastboot, linux-kernel, ak Magnus Damm <magnus@valinux.co.jp> writes: > x86_64: Reload CS when startup_64 is used. > > The current x86_64 startup code never reloads CS during the early boot process > if the 64-bit function startup_64 is used as entry point. The 32-bit entry > point startup_32 does the right thing and reloads CS, and this is what most > people are using if they use bzImage. > > This patch fixes the case when the Linux kernel is booted into using kexec > under Xen. The Xen hypervisor is using large CS values which makes the x86_64 > kernel fail - but only if vmlinux is booted, bzImage works well because it > is using the 32-bit entry point. > > The main question is if we require that the boot loader should setup CS > to some certain offset to be able to boot the kernel. The sane solution IMO > should be that the kernel requires that the loaded descriptors are correct, > but that the exact offset within the GDT the boot loader is using should not > matter. This is the way the i386 boot works if I understand things correctly. What extra reload of cs does Xen introduce? I'm not really comfortable with a half virtualized case. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-21 21:02 ` Eric W. Biederman @ 2006-08-22 0:58 ` Magnus Damm 2006-08-22 3:41 ` Eric W. Biederman 0 siblings, 1 reply; 23+ messages in thread From: Magnus Damm @ 2006-08-22 0:58 UTC (permalink / raw) To: Eric W. Biederman; +Cc: fastboot, linux-kernel, ak On Mon, 2006-08-21 at 15:02 -0600, Eric W. Biederman wrote: > Magnus Damm <magnus@valinux.co.jp> writes: > > > x86_64: Reload CS when startup_64 is used. > > > > The current x86_64 startup code never reloads CS during the early boot process > > if the 64-bit function startup_64 is used as entry point. The 32-bit entry > > point startup_32 does the right thing and reloads CS, and this is what most > > people are using if they use bzImage. > > > > This patch fixes the case when the Linux kernel is booted into using kexec > > under Xen. The Xen hypervisor is using large CS values which makes the x86_64 > > kernel fail - but only if vmlinux is booted, bzImage works well because it > > is using the 32-bit entry point. > > > > The main question is if we require that the boot loader should setup CS > > to some certain offset to be able to boot the kernel. The sane solution IMO > > should be that the kernel requires that the loaded descriptors are correct, > > but that the exact offset within the GDT the boot loader is using should not > > matter. This is the way the i386 boot works if I understand things correctly. > > What extra reload of cs does Xen introduce? None, but Xen is using CS values that are very different from Linux: xen/include/public/arch-x86_64.h: #define FLAT_RING3_CS32 0xe023 /* GDT index 260 */ #define FLAT_RING3_CS64 0xe033 /* GDT index 261 */ #define FLAT_RING3_DS32 0xe02b /* GDT index 262 */ #define FLAT_RING3_DS64 0x0000 /* NULL selector */ #define FLAT_RING3_SS32 0xe02b /* GDT index 262 */ #define FLAT_RING3_SS64 0xe02b /* GDT index 262 */ The main problem is that startup_64 depends on that CS is set to __KERNEL_CS when it shouldn't. On top of that the purgatory code in kexec-tools doesn't setup CS when using a 64-bit entry point. The following (mangled) patch solves that for me: --- 0001/purgatory/arch/x86_64/entry64.S +++ work/purgatory/arch/x86_64/entry64.S 2006-08-18 15:34:23.000000000 +0900 @@ -37,8 +37,9 @@ entry64: movl %eax, %fs movl %eax, %gs - /* In 64bit mode the code segment is meaningless */ - + ljmp *new_cs_addr(%rip) +new_cs_exit: + /* Load the registers */ movq rax(%rip), %rax movq rbx(%rip), %rbx @@ -93,8 +94,11 @@ gdt: /* 0x00 unusable segment .word 0, 0, 0 /* 0x10 4GB flat code segment */ - .word 0xFFFF, 0x0000, 0x9A00, 0x00CF + .word 0xFFFF, 0x0000, 0x9A00, 0x00AF /* 0x18 4GB flat data segment */ .word 0xFFFF, 0x0000, 0x9200, 0x00CF gdt_end: +new_cs_addr: + .long new_cs_exit + .word 0x10 /* CODE_CS */ > I'm not really comfortable with a half virtualized case. That I don't understand, care to explain more? Thanks, / magnus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-22 0:58 ` Magnus Damm @ 2006-08-22 3:41 ` Eric W. Biederman 2006-08-22 4:10 ` Magnus Damm 2006-08-22 8:03 ` Andi Kleen 0 siblings, 2 replies; 23+ messages in thread From: Eric W. Biederman @ 2006-08-22 3:41 UTC (permalink / raw) To: Magnus Damm; +Cc: fastboot, linux-kernel, ak Magnus Damm <magnus@valinux.co.jp> writes: > On Mon, 2006-08-21 at 15:02 -0600, Eric W. Biederman wrote: >> Magnus Damm <magnus@valinux.co.jp> writes: >> >> > x86_64: Reload CS when startup_64 is used. >> > >> > The current x86_64 startup code never reloads CS during the early boot > process >> > if the 64-bit function startup_64 is used as entry point. The 32-bit entry >> > point startup_32 does the right thing and reloads CS, and this is what most >> > people are using if they use bzImage. >> > >> > This patch fixes the case when the Linux kernel is booted into using kexec >> > under Xen. The Xen hypervisor is using large CS values which makes the > x86_64 >> > kernel fail - but only if vmlinux is booted, bzImage works well because it >> > is using the 32-bit entry point. >> > >> > The main question is if we require that the boot loader should setup CS >> > to some certain offset to be able to boot the kernel. The sane solution IMO >> > should be that the kernel requires that the loaded descriptors are correct, >> > but that the exact offset within the GDT the boot loader is using should not >> > matter. This is the way the i386 boot works if I understand things > correctly. >> >> What extra reload of cs does Xen introduce? > > None, but Xen is using CS values that are very different from Linux: > > xen/include/public/arch-x86_64.h: > > #define FLAT_RING3_CS32 0xe023 /* GDT index 260 */ > #define FLAT_RING3_CS64 0xe033 /* GDT index 261 */ > #define FLAT_RING3_DS32 0xe02b /* GDT index 262 */ > #define FLAT_RING3_DS64 0x0000 /* NULL selector */ > #define FLAT_RING3_SS32 0xe02b /* GDT index 262 */ > #define FLAT_RING3_SS64 0xe02b /* GDT index 262 */ > > The main problem is that startup_64 depends on that CS is set to > __KERNEL_CS when it shouldn't. On top of that the purgatory code in > kexec-tools doesn't setup CS when using a 64-bit entry point. The > following (mangled) patch solves that for me: I believe the cs shadow registers are fine. > --- 0001/purgatory/arch/x86_64/entry64.S > +++ work/purgatory/arch/x86_64/entry64.S 2006-08-18 > 15:34:23.000000000 +0900 > @@ -37,8 +37,9 @@ entry64: > movl %eax, %fs > movl %eax, %gs > > - /* In 64bit mode the code segment is meaningless */ Would you care to explain to me how the above comment is not true. As I recall the only meaning %cs has is if you are a kernel or user space process. The base address and everything else mean nothing. In addition the value in %cs never means anything. It is the values in the cs shadow registers that count. The value in %cs just reflects where those values in %cs came from. So if we never reload %cs and only use the shadow values why does the value in %cs matter? >> I'm not really comfortable with a half virtualized case. > > That I don't understand, care to explain more? The only case where I can think of that the value in %cs would matter is if you change %cs. The only way I can see that is if you are half para-virtualized. Because we are running privileged instructions in startup_64 it shouldn't work for the paravirtualized case, and it should work just like it does today in the fully virtualized case. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-22 3:41 ` Eric W. Biederman @ 2006-08-22 4:10 ` Magnus Damm 2006-08-22 8:03 ` Andi Kleen 1 sibling, 0 replies; 23+ messages in thread From: Magnus Damm @ 2006-08-22 4:10 UTC (permalink / raw) To: Eric W. Biederman; +Cc: fastboot, linux-kernel, ak On Mon, 2006-08-21 at 21:41 -0600, Eric W. Biederman wrote: > Magnus Damm <magnus@valinux.co.jp> writes: > > > On Mon, 2006-08-21 at 15:02 -0600, Eric W. Biederman wrote: > >> Magnus Damm <magnus@valinux.co.jp> writes: > >> > >> > x86_64: Reload CS when startup_64 is used. > >> > > >> > The current x86_64 startup code never reloads CS during the early boot > > process > >> > if the 64-bit function startup_64 is used as entry point. The 32-bit entry > >> > point startup_32 does the right thing and reloads CS, and this is what most > >> > people are using if they use bzImage. > >> > > >> > This patch fixes the case when the Linux kernel is booted into using kexec > >> > under Xen. The Xen hypervisor is using large CS values which makes the > > x86_64 > >> > kernel fail - but only if vmlinux is booted, bzImage works well because it > >> > is using the 32-bit entry point. > >> > > >> > The main question is if we require that the boot loader should setup CS > >> > to some certain offset to be able to boot the kernel. The sane solution IMO > >> > should be that the kernel requires that the loaded descriptors are correct, > >> > but that the exact offset within the GDT the boot loader is using should not > >> > matter. This is the way the i386 boot works if I understand things > > correctly. > >> > >> What extra reload of cs does Xen introduce? > > > > None, but Xen is using CS values that are very different from Linux: > > > > xen/include/public/arch-x86_64.h: > > > > #define FLAT_RING3_CS32 0xe023 /* GDT index 260 */ > > #define FLAT_RING3_CS64 0xe033 /* GDT index 261 */ > > #define FLAT_RING3_DS32 0xe02b /* GDT index 262 */ > > #define FLAT_RING3_DS64 0x0000 /* NULL selector */ > > #define FLAT_RING3_SS32 0xe02b /* GDT index 262 */ > > #define FLAT_RING3_SS64 0xe02b /* GDT index 262 */ > > > > The main problem is that startup_64 depends on that CS is set to > > __KERNEL_CS when it shouldn't. On top of that the purgatory code in > > kexec-tools doesn't setup CS when using a 64-bit entry point. The > > following (mangled) patch solves that for me: > > I believe the cs shadow registers are fine. Yeah, I know you do. =) > > --- 0001/purgatory/arch/x86_64/entry64.S > > +++ work/purgatory/arch/x86_64/entry64.S 2006-08-18 > > 15:34:23.000000000 +0900 > > @@ -37,8 +37,9 @@ entry64: > > movl %eax, %fs > > movl %eax, %gs > > > > - /* In 64bit mode the code segment is meaningless */ > > Would you care to explain to me how the above comment is not true. > As I recall the only meaning %cs has is if you are a kernel > or user space process. The base address and everything else > mean nothing. > > In addition the value in %cs never means anything. It is the > values in the cs shadow registers that count. The value in %cs > just reflects where those values in %cs came from. > > So if we never reload %cs and only use the shadow values why does > the value in %cs matter? Well, I cannot explain it well myself. But if you try setting CS to something else than 0x18 and start the x86_64 kernel using startup_64 you will see what I am talking about. It does not work. A good example of when this doesn't work is when my page_table_a patches are used on x86_64 to boot a vmlinux. It does not work because CS is set to something else then __KERNEL_CS. Things just happen to work in most cases today (excluding being rebooted from Xen) because either a) vmlinux is booted and CS is set to 0x18 by Linux before kexec:ing. or b) bzImage is booted (32 bit entry point) > >> I'm not really comfortable with a half virtualized case. > > > > That I don't understand, care to explain more? > > The only case where I can think of that the value in %cs would matter > is if you change %cs. The only way I can see that is if you are > half para-virtualized. Because we are running privileged > instructions in startup_64 it shouldn't work for the paravirtualized > case, and it should work just like it does today in the > fully virtualized case. I'm not talking about running Linux under Xen, this is a standard non-virtualized kernel that is started _from_ a Xen environment using kexec-tools. Think kdump kernel being started by the Xen hypervisor on panic. Does that make sense? Thanks, / magnus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH][RFC] x86_64: Reload CS when startup_64 is used. 2006-08-22 3:41 ` Eric W. Biederman 2006-08-22 4:10 ` Magnus Damm @ 2006-08-22 8:03 ` Andi Kleen 2006-08-22 8:37 ` [PATCH] " Eric W. Biederman 1 sibling, 1 reply; 23+ messages in thread From: Andi Kleen @ 2006-08-22 8:03 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Magnus Damm, fastboot, linux-kernel > So if we never reload %cs and only use the shadow values why does > the value in %cs matter? We usually do at the first IRET. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] x86_64: Reload CS when startup_64 is used. 2006-08-22 8:03 ` Andi Kleen @ 2006-08-22 8:37 ` Eric W. Biederman 2006-08-22 8:53 ` [Fastboot] " Magnus Damm 2006-08-22 9:01 ` Andi Kleen 0 siblings, 2 replies; 23+ messages in thread From: Eric W. Biederman @ 2006-08-22 8:37 UTC (permalink / raw) To: Andi Kleen; +Cc: Magnus Damm, fastboot, linux-kernel In long mode the %cs is largely a relic. However there are a few cases like lret where it matters that we have a valid value. Without this patch it is possible to enter the kernel in startup_64 without setting %cs to a valid value. With this patch we don't care what %cs value we enter the kernel with, so long as the cs shadow register indicates it is a privileged code segment. Thanks to Magnus Damm for finding this problem and posting the first workable patch. I have moved the jump to set %cs down a few instructions so we don't need to take an extra jump. Which keeps the code simpler. Signed-of-by: Eric W. Biederman <ebiederm@xmission.com> --- arch/x86_64/kernel/head.S | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86_64/kernel/head.S b/arch/x86_64/kernel/head.S index 923f080..6c89f6a 100644 --- a/arch/x86_64/kernel/head.S +++ b/arch/x86_64/kernel/head.S @@ -186,10 +186,13 @@ #define CR0_PAGING (1<<31) /* Finally jump to run C code and to be on real kernel address * Since we are running on identity-mapped space we have to jump * to the full 64bit address , this is only possible as indirect - * jump + * jump. In addition we need to ensure %cs is set so we make this + * a far return. */ movq initial_code(%rip),%rax - jmp *%rax + pushq $__KERNEL_CS + pushq %rax + lretq /* SMP bootup changes these two */ .align 8 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH] x86_64: Reload CS when startup_64 is used. 2006-08-22 8:37 ` [PATCH] " Eric W. Biederman @ 2006-08-22 8:53 ` Magnus Damm 2006-08-22 9:25 ` Eric W. Biederman 2006-08-22 9:01 ` Andi Kleen 1 sibling, 1 reply; 23+ messages in thread From: Magnus Damm @ 2006-08-22 8:53 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andi Kleen, Magnus Damm, fastboot, linux-kernel Hi Eric, On 8/22/06, Eric W. Biederman <ebiederm@xmission.com> wrote: > > In long mode the %cs is largely a relic. However there are a few cases > like lret where it matters that we have a valid value. Without this > patch it is possible to enter the kernel in startup_64 without setting > %cs to a valid value. With this patch we don't care what %cs value > we enter the kernel with, so long as the cs shadow register indicates > it is a privileged code segment. > > Thanks to Magnus Damm for finding this problem and posting the > first workable patch. I have moved the jump to set %cs down a > few instructions so we don't need to take an extra jump. Which > keeps the code simpler. > > Signed-of-by: Eric W. Biederman <ebiederm@xmission.com> While at it, could you please fix up the purgatory code in kexec-tools to include this fix so we can boot older versions of the kernel too? Thanks, / magnus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH] x86_64: Reload CS when startup_64 is used. 2006-08-22 8:53 ` [Fastboot] " Magnus Damm @ 2006-08-22 9:25 ` Eric W. Biederman 2006-08-23 3:10 ` Magnus Damm 0 siblings, 1 reply; 23+ messages in thread From: Eric W. Biederman @ 2006-08-22 9:25 UTC (permalink / raw) To: Magnus Damm; +Cc: Andi Kleen, Magnus Damm, fastboot, linux-kernel "Magnus Damm" <magnus.damm@gmail.com> writes: > Hi Eric, > > On 8/22/06, Eric W. Biederman <ebiederm@xmission.com> wrote: >> >> In long mode the %cs is largely a relic. However there are a few cases >> like lret where it matters that we have a valid value. Without this >> patch it is possible to enter the kernel in startup_64 without setting >> %cs to a valid value. With this patch we don't care what %cs value >> we enter the kernel with, so long as the cs shadow register indicates >> it is a privileged code segment. >> >> Thanks to Magnus Damm for finding this problem and posting the >> first workable patch. I have moved the jump to set %cs down a >> few instructions so we don't need to take an extra jump. Which >> keeps the code simpler. >> >> Signed-of-by: Eric W. Biederman <ebiederm@xmission.com> > > While at it, could you please fix up the purgatory code in kexec-tools > to include this fix so we can boot older versions of the kernel too? I guess I'm not opposed to a patch but I have some serious reservations about a solution that limits my address to 32bits in 64bit mode, and appears to break the gdt used for entering the 32bit kernel. I'm up way to late tonight. I just wanted to resolve the situation when it was clear what was going on. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Fastboot] [PATCH] x86_64: Reload CS when startup_64 is used. 2006-08-22 9:25 ` Eric W. Biederman @ 2006-08-23 3:10 ` Magnus Damm 0 siblings, 0 replies; 23+ messages in thread From: Magnus Damm @ 2006-08-23 3:10 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andi Kleen, Magnus Damm, fastboot, linux-kernel On 8/22/06, Eric W. Biederman <ebiederm@xmission.com> wrote: > "Magnus Damm" <magnus.damm@gmail.com> writes: > > On 8/22/06, Eric W. Biederman <ebiederm@xmission.com> wrote: > >> > >> In long mode the %cs is largely a relic. However there are a few cases > >> like lret where it matters that we have a valid value. Without this > >> patch it is possible to enter the kernel in startup_64 without setting > >> %cs to a valid value. With this patch we don't care what %cs value > >> we enter the kernel with, so long as the cs shadow register indicates > >> it is a privileged code segment. > >> > >> Thanks to Magnus Damm for finding this problem and posting the > >> first workable patch. I have moved the jump to set %cs down a > >> few instructions so we don't need to take an extra jump. Which > >> keeps the code simpler. > >> > >> Signed-of-by: Eric W. Biederman <ebiederm@xmission.com> > > > > While at it, could you please fix up the purgatory code in kexec-tools > > to include this fix so we can boot older versions of the kernel too? > > I guess I'm not opposed to a patch but I have some serious reservations > about a solution that limits my address to 32bits in 64bit mode, and > appears to break the gdt used for entering the 32bit kernel. The 32-bit pointer issue can easily be resolved. I don't understand what you mean with breaking the GDT though - to me it looks like the entry for CS in entry64.S is broken without my patch. You reload the GDT to a new one anyway in entry64-32.S so I'm not sure what this 32-bit breakage is that you are talking about. > I'm up way to late tonight. I just wanted to resolve the situation > when it was clear what was going on. Getting the fix in the kernel is hopefully solved now, thanks for the help. Next step in my mind is to fix up kexec-tools - I'll send an updated patch to fastboot later on today. Thanks, / magnus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] x86_64: Reload CS when startup_64 is used. 2006-08-22 8:37 ` [PATCH] " Eric W. Biederman 2006-08-22 8:53 ` [Fastboot] " Magnus Damm @ 2006-08-22 9:01 ` Andi Kleen 2006-08-22 9:20 ` Eric W. Biederman 1 sibling, 1 reply; 23+ messages in thread From: Andi Kleen @ 2006-08-22 9:01 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Magnus Damm, fastboot, linux-kernel On Tue, 22 Aug 2006 02:37:44 -0600 ebiederm@xmission.com (Eric W. Biederman) wrote: > > In long mode the %cs is largely a relic. However there are a few cases > like lret You mean iret? > + * jump. In addition we need to ensure %cs is set so we make this > + * a far return. > */ > movq initial_code(%rip),%rax > - jmp *%rax > + pushq $__KERNEL_CS > + pushq %rax > + lretq Ok merged thanks -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] x86_64: Reload CS when startup_64 is used. 2006-08-22 9:01 ` Andi Kleen @ 2006-08-22 9:20 ` Eric W. Biederman 0 siblings, 0 replies; 23+ messages in thread From: Eric W. Biederman @ 2006-08-22 9:20 UTC (permalink / raw) To: Andi Kleen; +Cc: Magnus Damm, fastboot, linux-kernel Andi Kleen <ak@suse.de> writes: > On Tue, 22 Aug 2006 02:37:44 -0600 > ebiederm@xmission.com (Eric W. Biederman) wrote: > >> >> In long mode the %cs is largely a relic. However there are a few cases >> like lret > > You mean iret? Yes, sorry. > >> + * jump. In addition we need to ensure %cs is set so we make this >> + * a far return. >> */ >> movq initial_code(%rip),%rax >> - jmp *%rax >> + pushq $__KERNEL_CS >> + pushq %rax >> + lretq > > Ok merged thanks > > -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2006-08-23 3:10 UTC | newest] Thread overview: 23+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-08-21 9:54 [PATCH][RFC] x86_64: Reload CS when startup_64 is used Magnus Damm 2006-08-21 10:19 ` Andi Kleen 2006-08-21 13:29 ` [Fastboot] " Magnus Damm 2006-08-21 14:16 ` Andi Kleen 2006-08-22 0:47 ` Magnus Damm 2006-08-21 14:17 ` Vivek Goyal 2006-08-21 14:24 ` Andi Kleen 2006-08-21 14:46 ` Vivek Goyal 2006-08-21 15:04 ` Andi Kleen 2006-08-21 20:02 ` Eric W. Biederman 2006-08-21 20:10 ` Andi Kleen 2006-08-21 21:00 ` Eric W. Biederman 2006-08-21 21:02 ` Eric W. Biederman 2006-08-22 0:58 ` Magnus Damm 2006-08-22 3:41 ` Eric W. Biederman 2006-08-22 4:10 ` Magnus Damm 2006-08-22 8:03 ` Andi Kleen 2006-08-22 8:37 ` [PATCH] " Eric W. Biederman 2006-08-22 8:53 ` [Fastboot] " Magnus Damm 2006-08-22 9:25 ` Eric W. Biederman 2006-08-23 3:10 ` Magnus Damm 2006-08-22 9:01 ` Andi Kleen 2006-08-22 9:20 ` Eric W. Biederman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox