* [RFC, PATCH 7/24] i386 Vmi memory hole
From: Zachary Amsden @ 2006-03-13 18:04 UTC
To: Linus Torvalds, Linux Kernel Mailing List,
  Virtualization Mailing List, Xen-devel, Andrew Morton,
  Zachary Amsden, Dan Hecht, Dan Arai, Anne Holler,
  Pratap Subrahmanyam, Christopher Li, Joshua LeVasseur,
  Chris Wright, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
  Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn,
  Zachary Amsden

Create a configurable hole in the linear address space at the top
of memory. A more advanced interface is needed to negotiate how
much space the hypervisor is allowed to steal, but in the end, it
seems most likely that a fixed constant size will be chosen for
the compiled kernel, potentially propagated to an information
page used by paravirtual initialization to determine interface
compatibility.

Signed-off-by: Zachary Amsden <zach@vmware.com>

Index: linux-2.6.16-rc3/arch/i386/Kconfig
===================================================================
--- linux-2.6.16-rc3.orig/arch/i386/Kconfig	2006-02-22 16:09:04.000000000 -0800
+++ linux-2.6.16-rc3/arch/i386/Kconfig	2006-02-22 16:33:27.000000000 -0800
@@ -201,6 +201,15 @@ config VMI_DEBUG
 
 endmenu
 
+config MEMORY_HOLE
+	int "Create hole at top of memory (0-256 MB)"
+	range 0 256
+	default "64" if X86_VMI
+	default "0" if !X86_VMI
+	help
+	  Useful for creating a hole in the top of memory when running
+	  inside of a virtual machine monitor.
+
 config ACPI_SRAT
 	bool
 	default y
Index: linux-2.6.16-rc3/include/asm-i386/fixmap.h
===================================================================
--- linux-2.6.16-rc3.orig/include/asm-i386/fixmap.h	2006-02-22 15:48:23.000000000 -0800
+++ linux-2.6.16-rc3/include/asm-i386/fixmap.h	2006-02-22 16:33:27.000000000 -0800
@@ -20,7 +20,7 @@
  * Leave one empty page between vmalloc'ed areas and
  * the start of the fixmap.
  */
-#define __FIXADDR_TOP	0xfffff000
+#define __FIXADDR_TOP	(0xfffff000-(CONFIG_MEMORY_HOLE << 20))
 
 #ifndef __ASSEMBLY__
 #include <linux/kernel.h>
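
The arithmetic behind the hole can be checked outside the kernel. What follows is a minimal user-space sketch, not kernel code. It assumes 32-bit unsigned arithmetic, and the names fixaddr_top, TOP_BARE, TOP_PAREN, offset_bare and offset_paren are hypothetical, chosen only for illustration. It computes the fixmap top for a given hole size and shows why a macro body of this shape is safer with outer parentheses: without them, using the macro as the right-hand operand of a subtraction changes the result.

```c
#include <assert.h>
#include <stdint.h>

/* Default top of the i386 fixmap, as in the unpatched header. */
#define FIXMAP_CEILING 0xfffff000u

/* Fixmap top for a hole of hole_mb megabytes (0-256, the Kconfig range). */
static uint32_t fixaddr_top(uint32_t hole_mb)
{
	assert(hole_mb <= 256);
	return FIXMAP_CEILING - (hole_mb << 20);
}

/* Hypothetical macros with a 64 MB hole: one bare, one parenthesized. */
#define TOP_BARE  0xfffff000u-(64u << 20)
#define TOP_PAREN (0xfffff000u-(64u << 20))

/* Expands to addr - 0xfffff000 - (64 << 20): the hole is subtracted twice over. */
static uint32_t offset_bare(uint32_t addr)
{
	return addr - TOP_BARE;
}

/* Expands to addr - (0xfffff000 - (64 << 20)): the intended distance. */
static uint32_t offset_paren(uint32_t addr)
{
	return addr - TOP_PAREN;
}
```

With a 64 MB hole the top moves from 0xfffff000 down to 0xfbfff000. Calling offset_paren(0xfc000000) gives one page (0x1000), while offset_bare() with the same argument wraps around to an unrelated value.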
* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Chris Wright @ 2006-03-14 6:41 UTC
To: Zachary Amsden
Cc: Linus Torvalds, Linux Kernel Mailing List,
  Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
  Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
  Joshua LeVasseur, Chris Wright, Rik Van Riel, Jyothy Reddy,
  Jack Lo, Kip Macy, Jan Beulich, Ky Srinivasan, Wim Coekaerts,
  Leendert van Doorn

* Zachary Amsden (zach@vmware.com) wrote:
> Create a configurable hole in the linear address space at the top
> of memory. A more advanced interface is needed to negotiate how
> much space the hypervisor is allowed to steal, but in the end, it
> seems most likely that a fixed constant size will be chosen for
> the compiled kernel, potentially propagated to an information
> page used by paravirtual initialization to determine interface
> compatibility.
>
> Signed-off-by: Zachary Amsden <zach@vmware.com>
>
> Index: linux-2.6.16-rc3/arch/i386/Kconfig
> ===================================================================
> --- linux-2.6.16-rc3.orig/arch/i386/Kconfig	2006-02-22 16:09:04.000000000 -0800
> +++ linux-2.6.16-rc3/arch/i386/Kconfig	2006-02-22 16:33:27.000000000 -0800
> @@ -201,6 +201,15 @@ config VMI_DEBUG
>
>  endmenu
>
> +config MEMORY_HOLE
> +	int "Create hole at top of memory (0-256 MB)"
> +	range 0 256
> +	default "64" if X86_VMI
> +	default "0" if !X86_VMI

Deja-vu ;-) And still works in context of Xen, but we've just let the
subarch define the __FIXADDR_TOP. Having it be dynamic could be
interesting.
* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Zachary Amsden @ 2006-03-14 7:14 UTC
To: Chris Wright
Cc: Linus Torvalds, Linux Kernel Mailing List,
  Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
  Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
  Joshua LeVasseur, Chris Wright, Rik Van Riel, Jyothy Reddy,
  Jack Lo, Kip Macy, Jan Beulich, Ky Srinivasan, Wim Coekaerts,
  Leendert van Doorn

[-- Attachment #1: Type: text/plain, Size: 1250 bytes --]

Chris Wright wrote:
> * Zachary Amsden (zach@vmware.com) wrote:
>
>> Create a configurable hole in the linear address space at the top
>> of memory. A more advanced interface is needed to negotiate how
>> much space the hypervisor is allowed to steal, but in the end, it
>> seems most likely that a fixed constant size will be chosen for
>> the compiled kernel, potentially propagated to an information
>> page used by paravirtual initialization to determine interface
>> compatibility.
>>
>> Signed-off-by: Zachary Amsden <zach@vmware.com>
>>
>> Index: linux-2.6.16-rc3/arch/i386/Kconfig
>> ===================================================================
>> --- linux-2.6.16-rc3.orig/arch/i386/Kconfig	2006-02-22 16:09:04.000000000 -0800
>> +++ linux-2.6.16-rc3/arch/i386/Kconfig	2006-02-22 16:33:27.000000000 -0800
>> @@ -201,6 +201,15 @@ config VMI_DEBUG
>>
>>  endmenu
>>
>> +config MEMORY_HOLE
>> +	int "Create hole at top of memory (0-256 MB)"
>> +	range 0 256
>> +	default "64" if X86_VMI
>> +	default "0" if !X86_VMI
>>
>
> Deja-vu ;-) And still works in context of Xen, but we've just let the
> subarch define the __FIXADDR_TOP. Having it be dynamic could be
> interesting.
>

Here's dynamic. I hope it still applies.

[-- Attachment #2: linear-hole --]
[-- Type: text/plain, Size: 9262 bytes --]

Allow creation of a compile-time hole at the top of linear address space.

Extended to allow a dynamic hole in linear address space, 7/2005. This
required some serious hacking to get everything perfect, but the end result
appears to function quite nicely. Everyone can now share the appreciation
of pseudo-undocumented ELF OS fields, which means core dumps, debuggers
and even broken or obsolete linkers may continue to work.

Signed-off-by: Zachary Amsden <zach@vmware.com>

Index: linux-2.6.13/arch/i386/Kconfig
===================================================================
--- linux-2.6.13.orig/arch/i386/Kconfig	2005-08-04 14:14:24.000000000 -0700
+++ linux-2.6.13/arch/i386/Kconfig	2005-08-05 15:28:42.000000000 -0700
@@ -127,6 +127,20 @@
 
 endchoice
 
+config RELOCATABLE_FIXMAP
+	bool "Allow the fixmap to be placed dynamically at runtime"
+	depends on EXPERIMENTAL
+	help
+	  Crazy hackers only.
+
+config MEMORY_HOLE
+	int "Create hole at top of memory (0-512 MB)"
+	range 0 512
+	default "0"
+	help
+	  Useful for creating a hole in the top of memory when running
+	  inside of a virtual machine monitor.
+
 config ACPI_SRAT
 	bool
 	default y
Index: linux-2.6.13/arch/i386/kernel/sysenter.c
===================================================================
--- linux-2.6.13.orig/arch/i386/kernel/sysenter.c	2005-08-02 17:04:12.000000000 -0700
+++ linux-2.6.13/arch/i386/kernel/sysenter.c	2005-08-05 15:47:53.000000000 -0700
@@ -46,22 +46,90 @@
 extern const char vsyscall_int80_start, vsyscall_int80_end;
 extern const char vsyscall_sysenter_start, vsyscall_sysenter_end;
 
+#ifdef CONFIG_RELOCATABLE_FIXMAP
+extern const char SYSENTER_RETURN;
+const char *SYSENTER_RETURN_ADDR;
+
+static void fixup_vsyscall_elf(char *page)
+{
+	Elf32_Ehdr *hdr;
+	Elf32_Shdr *sechdrs;
+	Elf32_Phdr *phdr;
+	char *secstrings;
+	int i, j, n;
+
+	hdr = (Elf32_Ehdr *)page;
+
+	/* Sanity checks against insmoding binaries or wrong arch,
+	   weird elf version */
+	if (memcmp(hdr->e_ident, ELFMAG, 4) != 0 ||
+	    !elf_check_arch(hdr) ||
+	    hdr->e_type != ET_DYN)
+		panic("Bogus ELF in vsyscall DSO\n");
+
+	hdr->e_entry += VSYSCALL_RELOCATION;
+
+	sechdrs = (void *)hdr + hdr->e_shoff;
+	secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
+
+	for (i = 1; i < hdr->e_shnum; i++) {
+		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
+			continue;
+
+		sechdrs[i].sh_addr += VSYSCALL_RELOCATION;
+		if (strcmp(secstrings+sechdrs[i].sh_name, ".dynsym") == 0) {
+			Elf32_Sym *sym = (void *)hdr + sechdrs[i].sh_offset;
+			n = sechdrs[i].sh_size / sizeof(*sym);
+			for (j = 1; j < n; j++) {
+				int ndx = sym[j].st_shndx;
+				if (ndx == SHN_UNDEF || ndx == SHN_ABS)
+					continue;
+				sym[j].st_value += VSYSCALL_RELOCATION;
+			}
+		} else if (strcmp(secstrings+sechdrs[i].sh_name, ".dynamic") == 0) {
+			Elf32_Dyn *dyn = (void *)hdr + sechdrs[i].sh_offset;
+			int tag;
+			while ((tag = (++dyn)->d_tag) != DT_NULL) {
+				if (tag == DT_PLTGOT || tag == DT_HASH ||
+				    tag == DT_STRTAB || tag == DT_SYMTAB ||
+				    tag == DT_RELA || tag == DT_INIT ||
+				    tag == DT_FINI || tag == DT_REL ||
+				    tag == DT_JMPREL || tag == DT_VERSYM ||
+				    tag == DT_VERDEF || tag == DT_VERNEED)
+					dyn->d_un.d_val += VSYSCALL_RELOCATION;
+			}
+		} else if (strcmp(secstrings+sechdrs[i].sh_name, ".useless") == 0) {
+			uint32_t *got = (void *)hdr + sechdrs[i].sh_offset;
+			*got += VSYSCALL_RELOCATION;
+		}
+	}
+	phdr = (void *)hdr + hdr->e_phoff;
+	for (i = 0; i < hdr->e_phnum; i++) {
+		phdr[i].p_vaddr += VSYSCALL_RELOCATION;
+		phdr[i].p_paddr += VSYSCALL_RELOCATION;
+	}
+	SYSENTER_RETURN_ADDR = (char *)&SYSENTER_RETURN + VSYSCALL_RELOCATION;
+}
+#endif
+
 int __init sysenter_setup(void)
 {
 	void *page = (void *)get_zeroed_page(GFP_ATOMIC);
 
-	__set_fixmap(FIX_VSYSCALL, __pa(page), PAGE_READONLY_EXEC);
-
-	if (!boot_cpu_has(X86_FEATURE_SEP)) {
+	if (!boot_cpu_has(X86_FEATURE_SEP))
 		memcpy(page,
 		       &vsyscall_int80_start,
 		       &vsyscall_int80_end - &vsyscall_int80_start);
-		return 0;
-	}
+	else
+		memcpy(page,
+		       &vsyscall_sysenter_start,
+		       &vsyscall_sysenter_end - &vsyscall_sysenter_start);
 
-	memcpy(page,
-	       &vsyscall_sysenter_start,
-	       &vsyscall_sysenter_end - &vsyscall_sysenter_start);
+#ifdef CONFIG_RELOCATABLE_FIXMAP
+	fixup_vsyscall_elf((char *)page);
+#endif
+
+	__set_fixmap(FIX_VSYSCALL, __pa(page), PAGE_READONLY_EXEC);
 
 	return 0;
 }
Index: linux-2.6.13/arch/i386/kernel/asm-offsets.c
===================================================================
--- linux-2.6.13.orig/arch/i386/kernel/asm-offsets.c	2005-08-04 14:28:35.000000000 -0700
+++ linux-2.6.13/arch/i386/kernel/asm-offsets.c	2005-08-05 15:11:45.000000000 -0700
@@ -68,5 +68,9 @@
 		 sizeof(struct tss_struct));
 
 	DEFINE(PAGE_SIZE_asm, PAGE_SIZE);
+#ifdef CONFIG_RELOCATABLE_FIXMAP
+	DEFINE(VSYSCALL_BASE, 0);
+#else
 	DEFINE(VSYSCALL_BASE, __fix_to_virt(FIX_VSYSCALL));
+#endif
 }
Index: linux-2.6.13/arch/i386/kernel/signal.c
===================================================================
--- linux-2.6.13.orig/arch/i386/kernel/signal.c	2005-08-03 23:36:46.000000000 -0700
+++ linux-2.6.13/arch/i386/kernel/signal.c	2005-08-05 15:11:33.000000000 -0700
@@ -345,6 +345,8 @@
    See vsyscall-sigreturn.S. */
 extern void __user __kernel_sigreturn;
 extern void __user __kernel_rt_sigreturn;
+#define kernel_sigreturn (VSYSCALL_RELOCATION + (void __user *)&__kernel_sigreturn)
+#define kernel_rt_sigreturn (VSYSCALL_RELOCATION + (void __user *)&__kernel_rt_sigreturn)
 
 static int setup_frame(int sig, struct k_sigaction *ka,
 		       sigset_t *set, struct pt_regs * regs)
@@ -380,7 +382,7 @@
 			goto give_sigsegv;
 	}
 
-	restorer = &__kernel_sigreturn;
+	restorer = kernel_sigreturn;
 	if (ka->sa.sa_flags & SA_RESTORER)
 		restorer = ka->sa.sa_restorer;
 
@@ -476,7 +478,7 @@
 		goto give_sigsegv;
 
 	/* Set up to return from userspace. */
-	restorer = &__kernel_rt_sigreturn;
+	restorer = kernel_rt_sigreturn;
 	if (ka->sa.sa_flags & SA_RESTORER)
 		restorer = ka->sa.sa_restorer;
 	err |= __put_user(restorer, &frame->pretcode);
Index: linux-2.6.13/arch/i386/kernel/entry.S
===================================================================
--- linux-2.6.13.orig/arch/i386/kernel/entry.S	2005-08-04 14:17:15.000000000 -0700
+++ linux-2.6.13/arch/i386/kernel/entry.S	2005-08-05 14:09:15.000000000 -0700
@@ -200,7 +200,11 @@
 	pushl %ebp
 	pushfl
 	pushl $(__USER_CS)
+#ifdef CONFIG_RELOCATABLE_FIXMAP
+	pushl %ss:SYSENTER_RETURN_ADDR
+#else
 	pushl $SYSENTER_RETURN
+#endif
 
 /*
  * Load the potential sixth argument from user stack.
Index: linux-2.6.13/arch/i386/mm/init.c
===================================================================
--- linux-2.6.13.orig/arch/i386/mm/init.c	2005-08-04 14:39:17.000000000 -0700
+++ linux-2.6.13/arch/i386/mm/init.c	2005-08-05 15:20:04.000000000 -0700
@@ -42,6 +42,10 @@
 
 unsigned int __VMALLOC_RESERVE = 128 << 20;
 
+#ifdef CONFIG_RELOCATABLE_FIXMAP
+unsigned long __FIXADDR_TOP = 0;
+#endif
+
 DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
 unsigned long highstart_pfn, highend_pfn;
 
@@ -478,6 +482,12 @@
 		printk("NX (Execute Disable) protection: active\n");
 #endif
 
+#ifdef CONFIG_RELOCATABLE_FIXMAP
+	if (!__FIXADDR_TOP)
+		__FIXADDR_TOP = 0xfffff000UL-(CONFIG_MEMORY_HOLE << 20);
+	printk(KERN_INFO "Fixmap top relocated to %lxh\n", __FIXADDR_TOP);
+#endif
+
 	pagetable_init();
 
 	load_cr3(swapper_pg_dir);
Index: linux-2.6.13/include/asm-i386/fixmap.h
===================================================================
--- linux-2.6.13.orig/include/asm-i386/fixmap.h	2005-08-04 14:14:24.000000000 -0700
+++ linux-2.6.13/include/asm-i386/fixmap.h	2005-08-05 15:36:13.000000000 -0700
@@ -20,7 +20,13 @@
  * Leave one empty page between vmalloc'ed areas and
  * the start of the fixmap.
  */
-#define __FIXADDR_TOP	0xfffff000
+#ifdef CONFIG_RELOCATABLE_FIXMAP
+extern unsigned long __FIXADDR_TOP;
+#define VSYSCALL_RELOCATION __fix_to_virt(FIX_VSYSCALL)
+#else
+#define __FIXADDR_TOP	(0xfffff000-(CONFIG_MEMORY_HOLE << 20))
+#define VSYSCALL_RELOCATION 0
+#endif
 
 #ifndef __ASSEMBLY__
 #include <linux/kernel.h>
Index: linux-2.6.13/include/asm-i386/elf.h
===================================================================
--- linux-2.6.13.orig/include/asm-i386/elf.h	2005-08-02 17:06:23.000000000 -0700
+++ linux-2.6.13/include/asm-i386/elf.h	2005-08-05 15:31:32.000000000 -0700
@@ -129,7 +129,7 @@
 
 #define VSYSCALL_BASE	(__fix_to_virt(FIX_VSYSCALL))
 #define VSYSCALL_EHDR	((const struct elfhdr *) VSYSCALL_BASE)
-#define VSYSCALL_ENTRY	((unsigned long) &__kernel_vsyscall)
+#define VSYSCALL_ENTRY	((unsigned long) (VSYSCALL_RELOCATION+&__kernel_vsyscall))
 extern void __kernel_vsyscall;
 
 #define ARCH_DLINFO						\
Index: linux-2.6.13/include/linux/elf.h
===================================================================
--- linux-2.6.13.orig/include/linux/elf.h	2005-08-02 17:06:24.000000000 -0700
+++ linux-2.6.13/include/linux/elf.h	2005-08-05 12:06:17.000000000 -0700
@@ -138,6 +138,9 @@
 #define DT_DEBUG	21
 #define DT_TEXTREL	22
 #define DT_JMPREL	23
+#define DT_VERSYM	0x6ffffff0
+#define DT_VERDEF	0x6ffffffc
+#define DT_VERNEED	0x6ffffffe
 #define DT_LOPROC	0x70000000
 #define DT_HIPROC	0x7fffffff
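
The .dynamic fixup performed by fixup_vsyscall_elf() can be sketched in user space as shown here. This is an illustration under stated assumptions, not the kernel code: struct dyn_entry is a simplified stand-in for Elf32_Dyn, the TAG_* constants mirror the standard DT_* values from the ELF specification, only three address-valued tags are handled, and the loop examines every entry starting from the first one. The helper names tag_holds_address and relocate_dynamic are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for Elf32_Dyn: a tag plus a 32-bit value or address. */
struct dyn_entry {
	int32_t  tag;
	uint32_t val;
};

/* Same numeric values as DT_NULL, DT_HASH, DT_STRTAB, DT_SYMTAB, DT_STRSZ. */
enum { TAG_NULL = 0, TAG_HASH = 4, TAG_STRTAB = 5, TAG_SYMTAB = 6, TAG_STRSZ = 10 };

/* Nonzero when the entry stores an address rather than a size or a flag. */
static int tag_holds_address(int32_t tag)
{
	return tag == TAG_HASH || tag == TAG_STRTAB || tag == TAG_SYMTAB;
}

/*
 * Add delta to every address-valued entry up to the TAG_NULL terminator.
 * Sizes and other non-address entries are left alone.
 * Returns the number of entries that were patched.
 */
static size_t relocate_dynamic(struct dyn_entry *dyn, uint32_t delta)
{
	size_t patched = 0;

	assert(dyn != NULL);
	for (; dyn->tag != TAG_NULL; dyn++) {
		if (tag_holds_address(dyn->tag)) {
			dyn->val += delta;
			patched++;
		}
	}
	return patched;
}
```

The point of the tag filter is that a DSO linked at address 0 can be shifted by a single delta, but only fields that hold addresses may be touched; adding the delta to a size such as the string table length would corrupt the image.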

* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Chris Wright @ 2006-03-14 21:56 UTC
To: Zachary Amsden
Cc: Chris Wright, Linus Torvalds, Linux Kernel Mailing List,
  Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
  Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
  Joshua LeVasseur, Chris Wright, Rik Van Riel, Jyothy Reddy,
  Jack Lo, Kip Macy, Jan Beulich, Ky Srinivasan, Wim Coekaerts,
  Leendert van Doorn, Gerd Hoffmann

* Zachary Amsden (zach@vmware.com) wrote:
> Allow creation of a compile-time hole at the top of linear address space.
>
> Extended to allow a dynamic hole in linear address space, 7/2005. This
> required some serious hacking to get everything perfect, but the end result
> appears to function quite nicely. Everyone can now share the appreciation
> of pseudo-undocumented ELF OS fields, which means core dumps, debuggers
> and even broken or obsolete linkers may continue to work.

Thanks. Gerd did something similar (although I believe it's simpler,
don't recall the relocation magic) for Xen. Either way, it's useful
from Xen perspective.

thanks,
-chris

* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Zachary Amsden @ 2006-03-14 22:35 UTC
To: Chris Wright
Cc: Linus Torvalds, Linux Kernel Mailing List,
  Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
  Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
  Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
  Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn,
  Gerd Hoffmann

Chris Wright wrote:
> * Zachary Amsden (zach@vmware.com) wrote:
>
>> Allow creation of a compile-time hole at the top of linear address space.
>>
>> Extended to allow a dynamic hole in linear address space, 7/2005. This
>> required some serious hacking to get everything perfect, but the end result
>> appears to function quite nicely. Everyone can now share the appreciation
>> of pseudo-undocumented ELF OS fields, which means core dumps, debuggers
>> and even broken or obsolete linkers may continue to work.
>>
>
> Thanks. Gerd did something similar (although I believe it's simpler,
> don't recall the relocation magic) for Xen. Either way, it's useful
> from Xen perspective.
>

I believe Xen disables sysenter. The complications in my patch come
from the fact that the vsyscall page has to be relocated dynamically,
requiring, basically run time linking on the page and some tweaks to get
sysenter to work. If you don't use vsyscall (say, non-TLS glibc), then
you don't need that complexity. But I think it might be needed now,
even for Xen.

Zach

* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Chris Wright @ 2006-03-15 4:31 UTC
To: Zachary Amsden
Cc: Chris Wright, Linus Torvalds, Linux Kernel Mailing List,
  Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
  Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
  Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
  Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn,
  Gerd Hoffmann

* Zachary Amsden (zach@vmware.com) wrote:
> Chris Wright wrote:
> >* Zachary Amsden (zach@vmware.com) wrote:
> >
> >>Allow creation of a compile-time hole at the top of linear address space.
> >>
> >>Extended to allow a dynamic hole in linear address space, 7/2005. This
> >>required some serious hacking to get everything perfect, but the end
> >>result appears to function quite nicely. Everyone can now share the
> >>appreciation of pseudo-undocumented ELF OS fields, which means core
> >>dumps, debuggers and even broken or obsolete linkers may continue to work.
> >>
> >
> >Thanks. Gerd did something similar (although I believe it's simpler,
> >don't recall the relocation magic) for Xen. Either way, it's useful
> >from Xen perspective.
>
> I believe Xen disables sysenter.

Yes, so vsyscall page has int80 implementation.

> The complications in my patch come
> from the fact that the vsyscall page has to be relocated dynamically,
> requiring, basically run time linking on the page and some tweaks to get
> sysenter to work. If you don't use vsyscall (say, non-TLS glibc), then
> you don't need that complexity. But I think it might be needed now,
> even for Xen.

I believe both Xen and execshield move vsyscall out of fixmap, and then
map into userspace as normal vma.

thanks,
-chris

* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Gerd Hoffmann @ 2006-03-15 8:27 UTC
To: Chris Wright
Cc: Zachary Amsden, Linus Torvalds, Linux Kernel Mailing List,
  Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
  Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
  Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
  Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn

[-- Attachment #1: Type: text/plain, Size: 1386 bytes --]

>> The complications in my patch come
>> from the fact that the vsyscall page has to be relocated dynamically,
>> requiring, basically run time linking on the page and some tweaks to get
>> sysenter to work. If you don't use vsyscall (say, non-TLS glibc), then
>> you don't need that complexity. But I think it might be needed now,
>> even for Xen.
>
> I believe both Xen and execshield move vsyscall out of fixmap, and then
> map into userspace as normal vma.

Yep, my patch (attached below for reference) moves the vsyscall page
into user address space, just below PAGE_OFFSET. Works basically the
same way the vsyscall page is mapped in the ia32 emulation of the x86_64
architecture. Address stays fixed, thus the relocation magic isn't needed.

Once the vsyscall page is moved out of fixmap it's easy to make fixmap
movable and thus have a runtime-resizable address space hole at the top
of address space. Patch is attached too, although that one is more
proof-of-concept, it doesn't make much sense as-is. It has a kernel
command line option to specify the top of address space so you can play
around with it ...

Both patches are against -rc3 and most likely still apply just fine,
haven't tested that though.

cheers,

  Gerd

--
Gerd 'just married' Hoffmann <kraxel@suse.de>
I'm the hacker formerly known as Gerd Knorr.
http://www.suse.de/~kraxel/just-married.jpeg

[-- Attachment #2: move-gate-page.diff --]
[-- Type: text/x-patch, Size: 6955 bytes --]

Index: vanilla-2.6.16-rc3/arch/i386/kernel/asm-offsets.c
===================================================================
--- vanilla-2.6.16-rc3.orig/arch/i386/kernel/asm-offsets.c	2006-01-03 04:21:10.000000000 +0100
+++ vanilla-2.6.16-rc3/arch/i386/kernel/asm-offsets.c	2006-02-15 10:59:41.000000000 +0100
@@ -68,5 +68,5 @@ void foo(void)
 		 sizeof(struct tss_struct));
 
 	DEFINE(PAGE_SIZE_asm, PAGE_SIZE);
-	DEFINE(VSYSCALL_BASE, __fix_to_virt(FIX_VSYSCALL));
+	DEFINE(VSYSCALL_BASE, (PAGE_OFFSET - 2*PAGE_SIZE));
 }
Index: vanilla-2.6.16-rc3/arch/i386/kernel/sysenter.c
===================================================================
--- vanilla-2.6.16-rc3.orig/arch/i386/kernel/sysenter.c	2006-01-03 04:21:10.000000000 +0100
+++ vanilla-2.6.16-rc3/arch/i386/kernel/sysenter.c	2006-02-13 09:57:36.000000000 +0100
@@ -13,6 +13,7 @@
 #include <linux/gfp.h>
 #include <linux/string.h>
 #include <linux/elf.h>
+#include <linux/mm.h>
 
 #include <asm/cpufeature.h>
 #include <asm/msr.h>
@@ -45,23 +46,88 @@ void enable_sep_cpu(void)
  */
 extern const char vsyscall_int80_start, vsyscall_int80_end;
 extern const char vsyscall_sysenter_start, vsyscall_sysenter_end;
+static void *syscall_page;
 
 int __init sysenter_setup(void)
 {
-	void *page = (void *)get_zeroed_page(GFP_ATOMIC);
-
-	__set_fixmap(FIX_VSYSCALL, __pa(page), PAGE_READONLY_EXEC);
+	syscall_page = (void *)get_zeroed_page(GFP_ATOMIC);
 
 	if (!boot_cpu_has(X86_FEATURE_SEP)) {
-		memcpy(page,
+		memcpy(syscall_page,
 		       &vsyscall_int80_start,
 		       &vsyscall_int80_end - &vsyscall_int80_start);
 		return 0;
 	}
 
-	memcpy(page,
+	memcpy(syscall_page,
 	       &vsyscall_sysenter_start,
 	       &vsyscall_sysenter_end - &vsyscall_sysenter_start);
 
 	return 0;
 }
+
+static struct page*
+syscall_nopage(struct vm_area_struct *vma, unsigned long adr, int *type)
+{
+	struct page *p = virt_to_page(adr - vma->vm_start + syscall_page);
+	get_page(p);
+	return p;
+}
+
+/* Prevent VMA merging */
+static void syscall_vma_close(struct vm_area_struct *vma)
+{
+}
+
+static struct vm_operations_struct syscall_vm_ops = {
+	.close = syscall_vma_close,
+	.nopage = syscall_nopage,
+};
+
+/* Setup a VMA at program startup for the vsyscall page */
+int arch_setup_additional_pages(struct linux_binprm *bprm, int exstack)
+{
+	struct vm_area_struct *vma;
+	struct mm_struct *mm = current->mm;
+	int ret;
+
+	vma = kmem_cache_alloc(vm_area_cachep, SLAB_KERNEL);
+	if (!vma)
+		return -ENOMEM;
+
+	memset(vma, 0, sizeof(struct vm_area_struct));
+	/* Could randomize here */
+	vma->vm_start = VSYSCALL_BASE;
+	vma->vm_end = VSYSCALL_BASE + PAGE_SIZE;
+	/* MAYWRITE to allow gdb to COW and set breakpoints */
+	vma->vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYEXEC|VM_MAYWRITE;
+	vma->vm_flags |= mm->def_flags;
+	vma->vm_page_prot = protection_map[vma->vm_flags & 7];
+	vma->vm_ops = &syscall_vm_ops;
+	vma->vm_mm = mm;
+
+	down_write(&mm->mmap_sem);
+	if ((ret = insert_vm_struct(mm, vma))) {
+		up_write(&mm->mmap_sem);
+		kmem_cache_free(vm_area_cachep, vma);
+		return ret;
+	}
+	mm->total_vm++;
+	up_write(&mm->mmap_sem);
+	return 0;
+}
+
+struct vm_area_struct *get_gate_vma(struct task_struct *tsk)
+{
+	return NULL;
+}
+
+int in_gate_area(struct task_struct *task, unsigned long addr)
+{
+	return 0;
+}
+
+int in_gate_area_no_task(unsigned long addr)
+{
+	return 0;
+}
Index: vanilla-2.6.16-rc3/include/asm-i386/a.out.h
===================================================================
--- vanilla-2.6.16-rc3.orig/include/asm-i386/a.out.h	2006-01-03 04:21:10.000000000 +0100
+++ vanilla-2.6.16-rc3/include/asm-i386/a.out.h	2006-02-13 09:57:36.000000000 +0100
@@ -19,7 +19,7 @@ struct exec
 
 #ifdef __KERNEL__
 
-#define STACK_TOP	TASK_SIZE
+#define STACK_TOP	(TASK_SIZE - 3*PAGE_SIZE)
 
 #endif
 
Index: vanilla-2.6.16-rc3/include/asm-i386/elf.h
===================================================================
--- vanilla-2.6.16-rc3.orig/include/asm-i386/elf.h	2006-01-03 04:21:10.000000000 +0100
+++ vanilla-2.6.16-rc3/include/asm-i386/elf.h	2006-02-13 09:57:36.000000000 +0100
@@ -129,11 +129,16 @@ extern int dump_task_extended_fpu (struc
 #define ELF_CORE_COPY_FPREGS(tsk, elf_fpregs) dump_task_fpu(tsk, elf_fpregs)
 #define ELF_CORE_COPY_XFPREGS(tsk, elf_xfpregs) dump_task_extended_fpu(tsk, elf_xfpregs)
 
-#define VSYSCALL_BASE	(__fix_to_virt(FIX_VSYSCALL))
+#define VSYSCALL_BASE	(PAGE_OFFSET - 2*PAGE_SIZE)
 #define VSYSCALL_EHDR	((const struct elfhdr *) VSYSCALL_BASE)
 #define VSYSCALL_ENTRY	((unsigned long) &__kernel_vsyscall)
 extern void __kernel_vsyscall;
 
+#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
+struct linux_binprm;
+extern int arch_setup_additional_pages(struct linux_binprm *bprm,
+                                       int executable_stack);
+
 #define ARCH_DLINFO						\
 do {								\
 		NEW_AUX_ENT(AT_SYSINFO,	VSYSCALL_ENTRY);	\
Index: vanilla-2.6.16-rc3/include/asm-i386/fixmap.h
===================================================================
--- vanilla-2.6.16-rc3.orig/include/asm-i386/fixmap.h	2006-01-03 04:21:10.000000000 +0100
+++ vanilla-2.6.16-rc3/include/asm-i386/fixmap.h	2006-02-14 14:40:15.000000000 +0100
@@ -52,7 +52,6 @@
  */
 enum fixed_addresses {
 	FIX_HOLE,
-	FIX_VSYSCALL,
 #ifdef CONFIG_X86_LOCAL_APIC
 	FIX_APIC_BASE,	/* local (CPU) APIC) -- required for SMP or not */
 #endif
@@ -116,14 +115,6 @@ extern void __set_fixmap (enum fixed_add
 #define __fix_to_virt(x)	(FIXADDR_TOP - ((x) << PAGE_SHIFT))
 #define __virt_to_fix(x)	((FIXADDR_TOP - ((x)&PAGE_MASK)) >> PAGE_SHIFT)
 
-/*
- * This is the range that is readable by user mode, and things
- * acting like user mode such as get_user_pages.
- */
-#define FIXADDR_USER_START	(__fix_to_virt(FIX_VSYSCALL))
-#define FIXADDR_USER_END	(FIXADDR_USER_START + PAGE_SIZE)
-
-
 extern void __this_fixmap_does_not_exist(void);
 
 /*
Index: vanilla-2.6.16-rc3/include/asm-i386/page.h
===================================================================
--- vanilla-2.6.16-rc3.orig/include/asm-i386/page.h	2006-02-13 09:42:02.000000000 +0100
+++ vanilla-2.6.16-rc3/include/asm-i386/page.h	2006-02-14 14:40:15.000000000 +0100
@@ -139,6 +139,8 @@ extern int page_is_ram(unsigned long pag
 	 ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0 ) | \
 		 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 
+#define __HAVE_ARCH_GATE_AREA 1
+
 #endif /* __KERNEL__ */
 
 #include <asm-generic/page.h>
Index: vanilla-2.6.16-rc3/include/asm-i386/processor.h
===================================================================
--- vanilla-2.6.16-rc3.orig/include/asm-i386/processor.h	2006-02-13 09:42:02.000000000 +0100
+++ vanilla-2.6.16-rc3/include/asm-i386/processor.h	2006-02-14 14:43:25.000000000 +0100
@@ -318,7 +318,7 @@ extern int bootloader_type;
 /*
  * User space process size: 3GB (default).
  */
-#define TASK_SIZE	(PAGE_OFFSET)
+#define TASK_SIZE	(PAGE_OFFSET - 3*PAGE_SIZE)
 
 /* This decides where the kernel will search for a free chunk of vm
  * space during mmap's.
[-- Attachment #3: unfix-fixmap --] [-- Type: text/plain, Size: 5387 bytes --] Index: vanilla-2.6.16-rc3/arch/i386/kernel/setup.c =================================================================== --- vanilla-2.6.16-rc3.orig/arch/i386/kernel/setup.c 2006-02-13 09:39:33.000000000 +0100 +++ vanilla-2.6.16-rc3/arch/i386/kernel/setup.c 2006-02-13 09:57:36.000000000 +0100 @@ -922,6 +922,12 @@ static void __init parse_cmdline_early ( else if (!memcmp(from, "vmalloc=", 8)) __VMALLOC_RESERVE = memparse(from+8, &from); + /* + * fixmap=addr + */ + else if (!memcmp(from, "fixmap=", 7)) + set_fixaddr_top(simple_strtoul(from+7, NULL, 16)); + next_char: c = *(from++); if (!c) Index: vanilla-2.6.16-rc3/arch/i386/mm/init.c =================================================================== --- vanilla-2.6.16-rc3.orig/arch/i386/mm/init.c 2006-02-13 09:39:33.000000000 +0100 +++ vanilla-2.6.16-rc3/arch/i386/mm/init.c 2006-02-13 14:33:40.000000000 +0100 @@ -628,6 +628,42 @@ void __init mem_init(void) (unsigned long) (totalhigh_pages << (PAGE_SHIFT-10)) ); +#if 1 /* double-sanity-check paranoia */ + printk("virtual kernel memory layout:\n" + " fixmap : 0x%08lx - 0x%08lx (%4ld kB)\n" + " pkmap : 0x%08lx - 0x%08lx (%4ld kB)\n" + " vmalloc : 0x%08lx - 0x%08lx (%4ld MB)\n" + " lowmem : 0x%08lx - 0x%08lx (%4ld MB)\n" + " .init : 0x%08lx - 0x%08lx (%4ld kB)\n" + " .data : 0x%08lx - 0x%08lx (%4ld kB)\n" + " .text : 0x%08lx - 0x%08lx (%4ld kB)\n", + FIXADDR_START, FIXADDR_TOP, + (FIXADDR_TOP - FIXADDR_START) >> 10, + + PKMAP_BASE, PKMAP_BASE+LAST_PKMAP*PAGE_SIZE, + (LAST_PKMAP*PAGE_SIZE) >> 10, + + VMALLOC_START, VMALLOC_END, + (VMALLOC_END - VMALLOC_START) >> 20, + + (unsigned long)__va(0), (unsigned long)high_memory, + ((unsigned long)high_memory - (unsigned long)__va(0)) >> 20, + + (unsigned long)&__init_begin, (unsigned long)&__init_end, + ((unsigned long)&__init_end - (unsigned long)&__init_begin) >> 10, + + (unsigned long)&_etext, (unsigned long)&_edata, + ((unsigned long)&_edata - 
(unsigned long)&_etext) >> 10,
+
+		(unsigned long)&_text, (unsigned long)&_etext,
+		((unsigned long)&_etext - (unsigned long)&_text) >> 10);
+
+	BUG_ON(PKMAP_BASE+LAST_PKMAP*PAGE_SIZE > FIXADDR_START);
+	BUG_ON(VMALLOC_END > PKMAP_BASE);
+	BUG_ON(VMALLOC_START > VMALLOC_END);
+	BUG_ON((unsigned long)high_memory > VMALLOC_START);
+#endif /* double-sanity-check paranoia */
+
 #ifdef CONFIG_X86_PAE
 	if (!cpu_has_pae)
 		panic("cannot execute a PAE-enabled kernel on a PAE-less CPU!");
Index: vanilla-2.6.16-rc3/arch/i386/mm/pgtable.c
===================================================================
--- vanilla-2.6.16-rc3.orig/arch/i386/mm/pgtable.c	2006-01-03 04:21:10.000000000 +0100
+++ vanilla-2.6.16-rc3/arch/i386/mm/pgtable.c	2006-02-13 09:57:36.000000000 +0100
@@ -13,6 +13,7 @@
 #include <linux/slab.h>
 #include <linux/pagemap.h>
 #include <linux/spinlock.h>
+#include <linux/module.h>
 
 #include <asm/system.h>
 #include <asm/pgtable.h>
@@ -138,6 +139,10 @@ void set_pmd_pfn(unsigned long vaddr, un
 	__flush_tlb_one(vaddr);
 }
 
+static int fixmaps = 0;
+unsigned long __FIXADDR_TOP = 0xfffff000;
+EXPORT_SYMBOL(__FIXADDR_TOP);
+
 void __set_fixmap (enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
 {
 	unsigned long address = __fix_to_virt(idx);
@@ -147,6 +152,14 @@ void __set_fixmap (enum fixed_addresses
 		return;
 	}
 	set_pte_pfn(address, phys >> PAGE_SHIFT, flags);
+	fixmaps++;
+}
+
+void set_fixaddr_top(unsigned long top)
+{
+	BUG_ON(fixmaps > 0);
+	printk("%s: addr=0x%lx\n", __FUNCTION__, top);
+	__FIXADDR_TOP = top - PAGE_SIZE;
 }
 
 pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
Index: vanilla-2.6.16-rc3/include/asm-i386/fixmap.h
===================================================================
--- vanilla-2.6.16-rc3.orig/include/asm-i386/fixmap.h	2006-02-13 09:57:36.000000000 +0100
+++ vanilla-2.6.16-rc3/include/asm-i386/fixmap.h	2006-02-13 09:57:36.000000000 +0100
@@ -20,7 +20,7 @@
  * Leave one empty page between vmalloc'ed areas and
  * the start of the fixmap.
  */
-#define __FIXADDR_TOP	0xfffff000
+extern unsigned long __FIXADDR_TOP;
 
 #ifndef __ASSEMBLY__
 #include <linux/kernel.h>
@@ -93,6 +93,7 @@ enum fixed_addresses {
 
 extern void __set_fixmap (enum fixed_addresses idx,
 					unsigned long phys, pgprot_t flags);
+extern void set_fixaddr_top(unsigned long top);
 
 #define set_fixmap(idx, phys) \
 		__set_fixmap(idx, phys, PAGE_KERNEL)
Index: vanilla-2.6.16-rc3/include/asm-i386/page.h
===================================================================
--- vanilla-2.6.16-rc3.orig/include/asm-i386/page.h	2006-02-13 09:57:36.000000000 +0100
+++ vanilla-2.6.16-rc3/include/asm-i386/page.h	2006-02-13 14:21:36.000000000 +0100
@@ -121,7 +121,7 @@ extern int page_is_ram(unsigned long pag
 
 #define PAGE_OFFSET		((unsigned long)__PAGE_OFFSET)
 #define VMALLOC_RESERVE		((unsigned long)__VMALLOC_RESERVE)
-#define MAXMEM			(-__PAGE_OFFSET-__VMALLOC_RESERVE)
+#define MAXMEM			(__FIXADDR_TOP-__PAGE_OFFSET-__VMALLOC_RESERVE)
 #define __pa(x)			((unsigned long)(x)-PAGE_OFFSET)
 #define __va(x)			((void *)((unsigned long)(x)+PAGE_OFFSET))
 #define pfn_to_kaddr(pfn)      __va((pfn) << PAGE_SHIFT)
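
Note on the address arithmetic in this patch: once __FIXADDR_TOP is a variable, every fixmap slot address and MAXMEM depend on the chosen top. The following is a minimal user-space sketch of that arithmetic in C, not kernel code. It assumes 4 KB pages and that __fix_to_virt(idx) is FIXADDR_TOP - (idx << PAGE_SHIFT), as in the i386 fixmap.h. The names fixaddr_top and maxmem() are illustrative stand-ins, and assert() stands in for the patch's BUG_ON().

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Stand-in for the patch's __FIXADDR_TOP variable, with its initial value. */
static uint32_t fixaddr_top = 0xfffff000u;

/* Stand-in for the patch's counter, which __set_fixmap() increments. */
static int fixmaps;

/* Slot idx sits idx pages below the top, as __fix_to_virt() computes it. */
static uint32_t fix_to_virt(unsigned idx)
{
    return fixaddr_top - ((uint32_t)idx << PAGE_SHIFT);
}

/* Same logic as the patch's set_fixaddr_top(): only legal before any
 * fixmap entry exists, and the stored value is one page below 'top'. */
static void set_fixaddr_top(uint32_t top)
{
    assert(fixmaps == 0);
    fixaddr_top = top - PAGE_SIZE;
}

/* MAXMEM as redefined by the patch, in 32-bit arithmetic. */
static uint32_t maxmem(uint32_t page_offset, uint32_t vmalloc_reserve)
{
    return fixaddr_top - page_offset - vmalloc_reserve;
}
```

For example, set_fixaddr_top(0xfc000000) (a 64 MB hole) puts slot 0 at 0xfbfff000, and with PAGE_OFFSET 0xc0000000 and a 128 MB vmalloc reserve maxmem() yields 0x33fff000. With the default top the same inputs give 0x37fff000, one page less than the old (-__PAGE_OFFSET-__VMALLOC_RESERVE), which evaluates to 0x38000000.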
* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Zachary Amsden @ 2006-03-15 8:36 UTC
To: Gerd Hoffmann
Cc: Chris Wright, Linus Torvalds, Linux Kernel Mailing List,
    Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
    Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
    Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
    Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn

Gerd Hoffmann wrote:
>>> The complications in my patch come
>>> from the fact that the vsyscall page has to be relocated dynamically,
>>> requiring, basically run time linking on the page and some tweaks to get
>>> sysenter to work. If you don't use vsyscall (say, non-TLS glibc), then
>>> you don't need that complexity. But I think it might be needed now,
>>> even for Xen.
>>>
>> I believe both Xen and execshield move vsyscall out of fixmap, and then
>> map into userspace as normal vma.
>>
>
> Yep, my patch (attached below for reference) moves the vsyscall page
> into user address space, just below PAGE_OFFSET. Works basically the
> same way the vsyscall page is mapped in the ia32 emulation of the x86_64
> architecture. Address stays fixed, thus the relocation magic isn't needed.
>
> Once the vsyscall page is moved out of fixmap it's easy to make fixmap
> movable and thus have a runtime-resizable address space hole at the top
> of address space. Patch is attached too, although that one is more
> proof-of-concept, it doesn't make much sense as-is. It has a kernel
> command line option to specify the top of address space so you can play
> around with it ...
>
> Both patches are against -rc3 and most likely still apply just fine,
> haven't tested that though.
>

Your patch looks a lot cleaner and less hackish than mine. But I wonder
if it still works with kernels that support the sysenter method of
calling into the kernel. Look at the following code:

ENTRY(sysenter_entry)
        movl TSS_sysenter_esp0(%esp),%esp
sysenter_past_esp:
        STI
        pushl $(__USER_DS)
        pushl %ebp
        pushfl
        pushl $(__USER_CS)
        pushl $SYSENTER_RETURN

SYSENTER_RETURN is a link time constant that is defined based on the
location of the vsyscall page. If the vsyscall page can move, this can
not be a constant. The reason is, this "fake" exception frame is used
to return back to the EIP of the call site, and sysenter does not record
the EIP of the call site.

Zach
* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Chris Wright @ 2006-03-15 9:09 UTC
To: Zachary Amsden
Cc: Gerd Hoffmann, Chris Wright, Linus Torvalds, Linux Kernel Mailing List,
    Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
    Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
    Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
    Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn

* Zachary Amsden (zach@vmware.com) wrote:
> ENTRY(sysenter_entry)
>         movl TSS_sysenter_esp0(%esp),%esp
> sysenter_past_esp:
>         STI
>         pushl $(__USER_DS)
>         pushl %ebp
>         pushfl
>         pushl $(__USER_CS)
>         pushl $SYSENTER_RETURN
>
> SYSENTER_RETURN is a link time constant that is defined based on the
> location of the vsyscall page. If the vsyscall page can move, this can
> not be a constant. The reason is, this "fake" exception frame is used
> to return back to the EIP of the call site, and sysenter does not record
> the EIP of the call site.

It's only a real issue for something like execshield. For this it's easy
to do the fixed math since it's still at a fixed address.

+       DEFINE(VSYSCALL_BASE, (PAGE_OFFSET - 2*PAGE_SIZE));

But execshield has to make SYSENTER_RETURN context sensitive to current
since the vdso is mapped at a random location.

thanks,
-chris
* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Zachary Amsden @ 2006-03-15 9:18 UTC
To: Chris Wright
Cc: Gerd Hoffmann, Linus Torvalds, Linux Kernel Mailing List,
    Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
    Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
    Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
    Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn

Chris Wright wrote:
> * Zachary Amsden (zach@vmware.com) wrote:
>
>> ENTRY(sysenter_entry)
>>         movl TSS_sysenter_esp0(%esp),%esp
>> sysenter_past_esp:
>>         STI
>>         pushl $(__USER_DS)
>>         pushl %ebp
>>         pushfl
>>         pushl $(__USER_CS)
>>         pushl $SYSENTER_RETURN
>>
>> SYSENTER_RETURN is a link time constant that is defined based on the
>> location of the vsyscall page. If the vsyscall page can move, this can
>> not be a constant. The reason is, this "fake" exception frame is used
>> to return back to the EIP of the call site, and sysenter does not record
>> the EIP of the call site.
>>
>
> It's only a real issue for something like execshield. For this it's easy
> to do the fixed math since it's still at a fixed address.
>
> +       DEFINE(VSYSCALL_BASE, (PAGE_OFFSET - 2*PAGE_SIZE));
>

Ok, I'm confused. What fixed math? The return EIP that is pushed here
is used when sysenter is active and you have to IRET back to userspace.
If that EIP is dynamically relocatable, you can't do fixed math unless
you patch the pushl site dynamically.

Notable reasons for returning via IRET on this fake exception frame were
(until my recent submission) IOPL changes, but I believe there were more.
I will have to inspect the source to determine if that is still the case.

Zach
* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Chris Wright @ 2006-03-15 9:41 UTC
To: Zachary Amsden
Cc: Chris Wright, Gerd Hoffmann, Linus Torvalds, Linux Kernel Mailing List,
    Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
    Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
    Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
    Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn

* Zachary Amsden (zach@vmware.com) wrote:
> >+       DEFINE(VSYSCALL_BASE, (PAGE_OFFSET - 2*PAGE_SIZE));
>
> Ok, I'm confused. What fixed math?

Sorry, bad choice of words. From above, VSYSCALL_BASE is known at
compile time (in asm-offsets.h). So SYSENTER_RETURN is still a fixed
address. For execshield it's truly dynamic, so you get something like
this instead of the constant SYSENTER_RETURN:

-       pushl $SYSENTER_RETURN
+       pushl (TI_sysenter_return-THREAD_SIZE+8+4*4)(%esp)

thanks,
-chris
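
Note on the execshield operand quoted above: the expression (TI_sysenter_return-THREAD_SIZE+8+4*4)(%esp) can be read as recovering the base of the current thread_info from %esp and then indexing into it. The C sketch below models that reading in user space. It rests on assumptions not stated in the thread: 8 KB kernel stacks, thread_info at the bottom of the stack area, and an initial stack pointer (esp0) 8 bytes below the top of that area; the 4*4 term is the four words already pushed. The struct layout is hypothetical and only serves to give TI_sysenter_return a value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define THREAD_SIZE 8192u  /* assumed: 8 KB stack area per thread */

/* Hypothetical, trimmed-down thread_info; only the layout idea matters. */
struct thread_info {
    uint32_t flags;            /* placeholder field */
    uint32_t sysenter_return;  /* per-task return EIP inside the vdso */
};

#define TI_sysenter_return offsetof(struct thread_info, sysenter_return)

/* Address read by: pushl (TI_sysenter_return-THREAD_SIZE+8+4*4)(%esp) */
static uintptr_t sysenter_return_slot(uintptr_t esp)
{
    return esp + TI_sysenter_return - THREAD_SIZE + 8 + 4 * 4;
}

/* Build a fake stack area and confirm the operand lands on the field. */
static void check_sysenter_slot(void)
{
    static union {
        struct thread_info ti;            /* lives at the bottom */
        unsigned char stack[THREAD_SIZE]; /* stack grows down from the top */
    } u;
    uintptr_t base = (uintptr_t)&u;
    uintptr_t esp0 = base + THREAD_SIZE - 8; /* assumed initial %esp */
    uintptr_t esp  = esp0 - 4 * 4;           /* after the four pushes */

    assert(sysenter_return_slot(esp) == (uintptr_t)&u.ti.sysenter_return);
}
```

Calling check_sysenter_slot() from a test program exercises the arithmetic. Under these assumptions the operand is a per-task memory load, whereas $SYSENTER_RETURN is an immediate fixed at link time.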
* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Gerd Hoffmann @ 2006-03-15 9:27 UTC
To: Zachary Amsden
Cc: Chris Wright, Linus Torvalds, Linux Kernel Mailing List,
    Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
    Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
    Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
    Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn

>         pushl $SYSENTER_RETURN
>
> SYSENTER_RETURN is a link time constant that is defined based on the
> location of the vsyscall page. If the vsyscall page can move, this can
> not be a constant.

The vsyscall page is at PAGE_OFFSET - 2*PAGE_SIZE. It doesn't move. At
least not at runtime. At compile time it can change with the new
VMSPLIT config options, but that isn't a problem ;)

cheers,

  Gerd

--
Gerd 'just married' Hoffmann <kraxel@suse.de>
I'm the hacker formerly known as Gerd Knorr.
http://www.suse.de/~kraxel/just-married.jpeg
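
Note on the compile-time constant described above: since the vsyscall page sits at PAGE_OFFSET - 2*PAGE_SIZE, its address follows directly from the configured split. A small C sketch of that calculation follows, assuming 4 KB pages. The function names are illustrative, and label_offset (the position of the return label inside the page) is a hypothetical parameter, since the thread does not give it.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed: 4 KB pages */

/* VSYSCALL_BASE as defined in the thread: two pages below PAGE_OFFSET. */
static uint32_t vsyscall_base(uint32_t page_offset)
{
    assert(page_offset % PAGE_SIZE == 0);
    assert(page_offset >= 2 * PAGE_SIZE);
    return page_offset - 2 * PAGE_SIZE;
}

/* SYSENTER_RETURN is then the base plus the label's offset in the page;
 * label_offset is a hypothetical value fixed when the page is linked. */
static uint32_t sysenter_return_addr(uint32_t page_offset, uint32_t label_offset)
{
    assert(label_offset < PAGE_SIZE);
    return vsyscall_base(page_offset) + label_offset;
}
```

With the default 3G/1G split (PAGE_OFFSET 0xC0000000) this gives a base of 0xBFFFE000. A different VMSPLIT choice, for example a PAGE_OFFSET of 0x40000000, moves it to 0x3FFFE000, but the value is still settled when the kernel is built, so the pushl immediate stays valid.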
* Re: [RFC, PATCH 7/24] i386 Vmi memory hole
From: Zachary Amsden @ 2006-03-15 9:37 UTC
To: Gerd Hoffmann
Cc: Chris Wright, Linus Torvalds, Linux Kernel Mailing List,
    Virtualization Mailing List, Xen-devel, Andrew Morton, Dan Hecht,
    Dan Arai, Anne Holler, Pratap Subrahmanyam, Christopher Li,
    Joshua LeVasseur, Rik Van Riel, Jyothy Reddy, Jack Lo, Kip Macy,
    Jan Beulich, Ky Srinivasan, Wim Coekaerts, Leendert van Doorn

Gerd Hoffmann wrote:
>>         pushl $SYSENTER_RETURN
>>
>> SYSENTER_RETURN is a link time constant that is defined based on the
>> location of the vsyscall page. If the vsyscall page can move, this can
>> not be a constant.
>>
>
> The vsyscall page is at PAGE_OFFSET - 2*PAGE_SIZE. It doesn't move. At
> least not at runtime. At compile time it can change with the new
> VMSPLIT config options, but that isn't a problem ;)
>

Okay, I get it now. Thanks for the explanation. This certainly does
simplify the problem.

Zach