* Why disable vdso by default with CONFIG_PARAVIRT?
@ 2006-12-12 1:22 Jeremy Fitzhardinge
2006-12-12 1:42 ` Zachary Amsden
2006-12-12 3:02 ` Andi Kleen
0 siblings, 2 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-12 1:22 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List, Linux Kernel Mailing List
Hi Andi,
What problem do they cause together? There's certainly no problem with
Xen+vdso (in fact, it's actually very useful so that it picks up the
right libc with Xen-friendly TLS).
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 1:22 Why disable vdso by default with CONFIG_PARAVIRT? Jeremy Fitzhardinge
@ 2006-12-12 1:42 ` Zachary Amsden
2006-12-12 1:44 ` Jeremy Fitzhardinge
2006-12-12 3:02 ` Andi Kleen
1 sibling, 1 reply; 17+ messages in thread
From: Zachary Amsden @ 2006-12-12 1:42 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Andi Kleen, Virtualization Mailing List,
Linux Kernel Mailing List
Jeremy Fitzhardinge wrote:
> Hi Andi,
>
> What problem do they cause together? There's certainly no problem with
> Xen+vdso (in fact, its actually very useful so that it picks up the
> right libc with Xen-friendly TLS).
>
Methinks the compat VDSO support got broken in the config? Paravirt +
COMPAT_VDSO are incompatible.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 1:42 ` Zachary Amsden
@ 2006-12-12 1:44 ` Jeremy Fitzhardinge
2006-12-12 1:46 ` Zachary Amsden
0 siblings, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-12 1:44 UTC (permalink / raw)
To: Zachary Amsden
Cc: Andi Kleen, Virtualization Mailing List,
Linux Kernel Mailing List
Zachary Amsden wrote:
> Jeremy Fitzhardinge wrote:
>> Hi Andi,
>>
>> What problem do they cause together? There's certainly no problem with
>> Xen+vdso (in fact, its actually very useful so that it picks up the
>> right libc with Xen-friendly TLS).
>>
>
> Methinks the compat VDSO support got broken in the config? Paravirt +
> COMPAT_VDSO are incompatible.
Yes, that's true, but I'm looking at arch/i386/kernel/sysenter.c:
#ifdef CONFIG_PARAVIRT
unsigned int __read_mostly vdso_enabled = 0;
#else
unsigned int __read_mostly vdso_enabled = 1;
#endif
I can't think of any reason why that should be necessary.
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 1:44 ` Jeremy Fitzhardinge
@ 2006-12-12 1:46 ` Zachary Amsden
2006-12-12 1:49 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 17+ messages in thread
From: Zachary Amsden @ 2006-12-12 1:46 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Andi Kleen, Virtualization Mailing List,
Linux Kernel Mailing List, Rusty Russell
Jeremy Fitzhardinge wrote:
> Zachary Amsden wrote:
>
>> Jeremy Fitzhardinge wrote:
>>
>>> Hi Andi,
>>>
>>> What problem do they cause together? There's certainly no problem with
>>> Xen+vdso (in fact, its actually very useful so that it picks up the
>>> right libc with Xen-friendly TLS).
>>>
>>>
>> Methinks the compat VDSO support got broken in the config? Paravirt +
>> COMPAT_VDSO are incompatible.
>>
>
> Yes, that's true, but I'm looking at arch/i386/kernel/sysenter.c:
>
> #ifdef CONFIG_PARAVIRT
> unsigned int __read_mostly vdso_enabled = 0;
> #else
> unsigned int __read_mostly vdso_enabled = 1;
> #endif
>
> I can't think of any reason why that should be necessary.
>
It's not for us or Xen. Perhaps it came from lhype?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 1:46 ` Zachary Amsden
@ 2006-12-12 1:49 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-12 1:49 UTC (permalink / raw)
To: Zachary Amsden
Cc: Andi Kleen, Virtualization Mailing List,
Linux Kernel Mailing List, Rusty Russell
Zachary Amsden wrote:
> It's not for us or Xen. Perhaps it came from lhype?
(I suspect it came from Andi's fevered brain.) If lhype can't deal with
vdso, it can turn it off for itself - but I don't think it's a problem
for lhype.
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 1:22 Why disable vdso by default with CONFIG_PARAVIRT? Jeremy Fitzhardinge
2006-12-12 1:42 ` Zachary Amsden
@ 2006-12-12 3:02 ` Andi Kleen
2006-12-12 6:28 ` Zachary Amsden
` (2 more replies)
1 sibling, 3 replies; 17+ messages in thread
From: Andi Kleen @ 2006-12-12 3:02 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Virtualization Mailing List
On Tuesday 12 December 2006 02:22, Jeremy Fitzhardinge wrote:
> What problem do they cause together? There's certainly no problem with
> Xen+vdso
This was the change which finally got my test system (with an older
SUSE 9.0 based userland) to boot. With paravirt, older glibc's ld.so
otherwise throws assertion failures because it somehow can't deal with
the new placement. This only happens with paravirt enabled.
Binary compatibility is important.
> (in fact, its actually very useful so that it picks up the
> right libc with Xen-friendly TLS).
AFAIK libc selection comes from the aux vector, not the vdso.
-Andi
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 3:02 ` Andi Kleen
@ 2006-12-12 6:28 ` Zachary Amsden
2006-12-12 6:50 ` Jeremy Fitzhardinge
2006-12-12 23:22 ` Rusty Russell
2 siblings, 0 replies; 17+ messages in thread
From: Zachary Amsden @ 2006-12-12 6:28 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List
Andi Kleen wrote:
> On Tuesday 12 December 2006 02:22, Jeremy Fitzhardinge wrote:
>
>> What problem do they cause together? There's certainly no problem with
>> Xen+vdso
>>
>
> This was the change which finally got my test system (with an older
> SUSE 9.0 based user land to boot). With paravirt older glibc's ld.so
> otherwise throws assertation failures because it somehow can't deal with
> the new placement. This only happens with paravirt enabled.
>
> Binary compatibility is important.
>
Yes, but the old placement of the vdso is incompatible with paravirt
guests. The only solution I can think of to keep compatibility is to
dynamically place the vdso during boot, but this is complex and
introduces an indirection penalty to the fast sysenter syscall path
(unless we make that a dynamic patch).
What should we do to fix this? Breaking compatibility for paravirt
compilation is certainly the easiest thing to do.
Zach
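[Editor's sketch] The address clash described above can be illustrated with the fixmap arithmetic that i386 kernels of this era use in reserve_top_address(): the top of the fixmap is computed as -reserve - PAGE_SIZE. The following is a minimal user-space model, not kernel code. The slot index FIX_VDSO = 1 and the 64 MB hypervisor reservation are assumptions chosen for illustration only.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define FIX_VDSO   1u   /* assumed fixmap slot index, for illustration */

/* Model of "__FIXADDR_TOP = -reserve - PAGE_SIZE" in 32-bit arithmetic. */
static uint32_t fixaddr_top(uint32_t reserve)
{
    return 0u - reserve - PAGE_SIZE;
}

/* Fixmap slots grow downwards from the top, one page per index. */
static uint32_t fix_to_virt(uint32_t top, uint32_t idx)
{
    return top - (idx << PAGE_SHIFT);
}
```

With no reservation the vdso slot lands at 0xffffe000, the fixed address that old binaries expect. As soon as a hypervisor reserves the top of the address space (for example a hypothetical 64 MB), the same slot moves down to 0xfbffe000, so a fixed-address vdso and a paravirt guest cannot both have what they want.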
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 3:02 ` Andi Kleen
2006-12-12 6:28 ` Zachary Amsden
@ 2006-12-12 6:50 ` Jeremy Fitzhardinge
2006-12-12 7:27 ` Andi Kleen
2006-12-12 23:22 ` Rusty Russell
2 siblings, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-12 6:50 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List
Andi Kleen wrote:
> This was the change which finally got my test system (with an older
> SUSE 9.0 based user land to boot). With paravirt older glibc's ld.so
> otherwise throws assertation failures because it somehow can't deal with
> the new placement. This only happens with paravirt enabled.
>
> Binary compatibility is important.
>
So the SUSE 9 libc needs COMPAT_VDSO, or it just can't deal with the
vdso moving around? Can it deal with the current kernel's mobile vdso?
>> (in fact, its actually very useful so that it picks up the
>> right libc with Xen-friendly TLS).
>>
>
> AFAIK libc selection comes from the aux vector, not the vdso.
No, it seems to be done by adding a .note segment to the vdso shared
object. Without the vdso, my Xen test system doesn't boot because it
can't find the nosegneg versions of the libraries (it doesn't seem to
know where to look). I'm not sure how all this stuff fits together.
But the auxv is supposed to point to the vdso...
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 6:50 ` Jeremy Fitzhardinge
@ 2006-12-12 7:27 ` Andi Kleen
2006-12-12 10:23 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2006-12-12 7:27 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Virtualization Mailing List
> So the SUSE 9
SL9.0 (not to be confused with SLES9). But it's all distributions
with an older glibc.
> libc needs COMPAT_VDSO, or it just can't deal with the
> vdso moving around? Can it deal with the current kernel's mobile vdso?
It needs COMPAT_VDSO
[which is basically COMPAT_NO_OLD_GLIBC and imho always
was a big mistake to have a config anyways -- one shouldn't gamble
with binary compatibility so lightly]
> >> (in fact, its actually very useful so that it picks up the
> >> right libc with Xen-friendly TLS).
> >>
> >
> > AFAIK libc selection comes from the aux vector, not the vdso.
>
> No, it seems to be done by adding a .note segment to the vdso share
> object.
Hmm, I had assumed it used the same mechanism as the CPU optimized
libcs -- and that comes from the aux vector AT_PLATFORM.
> Without the vdso, my Xen test system doesn't boot because it
> can't find the nosegneg versions of the libraries (it doesn't seem to
> know where to look). I'm not sure how all this stuff fits together.
Why does it not boot? At least in the past nosegneg was only an optimization
to avoid some unnecessary traps to the hypervisor, but it should handle it.
Has that changed?
> But the auxv is supposed to point to the vdso...
create_elf_tables() puts the AT_PLATFORM string onto the stack,
unless I'm misreading the code badly. No dependency on vdso.
-Andi
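[Editor's sketch] Both mechanisms mentioned in this exchange are visible from user space on a current Linux system: the aux vector carries an AT_PLATFORM string and, separately, an AT_SYSINFO_EHDR entry holding the address of the vdso ELF image. The sketch below assumes glibc 2.16 or later, which provides getauxval(); that function did not exist in 2006, so this is an illustration of the two auxv entries rather than anything the participants could have run. The helper names are hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/* Address of the vdso ELF header from the aux vector, or NULL if absent. */
static const void *vdso_ehdr(void)
{
    return (const void *)(uintptr_t)getauxval(AT_SYSINFO_EHDR);
}

/* True if the image starts with the ELF magic bytes. */
static int looks_like_elf(const void *image)
{
    return image != NULL && memcmp(image, ELFMAG, SELFMAG) == 0;
}

/* AT_PLATFORM string placed on the stack at exec time, or "(none)". */
static const char *platform_name(void)
{
    const char *p = (const char *)(uintptr_t)getauxval(AT_PLATFORM);
    return p != NULL ? p : "(none)";
}
```

The two entries are independent: platform_name() works even when the kernel maps no vdso, while vdso_ehdr() returns NULL in that case (for example when booted with vdso=0).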
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 7:27 ` Andi Kleen
@ 2006-12-12 10:23 ` Jeremy Fitzhardinge
2006-12-12 12:01 ` Andi Kleen
0 siblings, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-12 10:23 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List
Andi Kleen wrote:
> It needs COMPAT_VDSO
>
> [which is basically COMPAT_NO_OLD_GLIBC and imho always
> was a big mistake to have a config anyways -- one shouldn't gamble
> with binary compatibility so lightly]
>
It's unfortunate; there's a fundamental address space clash, and there's
no nice resolution.
Will your system boot with vdso=0 on the kernel command line?
Presumably it will boot paravirt-native without it (since native makes
no claims on the address space), so it's something you could put in your
Xen config file, no?
> Hmm, i had assumed it used the same mechanism as the CPU optimized
> libcs -- and that comes from the aux vector AT_PLATFORM.
>
There's some magic that involves a .note segment in the vdso itself, and
a ld.so.conf entry which maps the string in there ("nosegneg") to a
pseudo-hardware capability, and ld.so uses the result of that to look
for more places for libraries. I don't really understand how all the
pieces fit together.
> Why does it not boot? At least in the past nosegneg was only a optimization
> to avoid some unnecessary traps to the hypervisor, but it should handle it.
> Has that changed?
>
No, but my (very stripped down) test system has no other libraries.
J
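[Editor's sketch] The ".note segment in the vdso" can be inspected from user space by walking the PT_NOTE program headers of the mapped vdso image. The code below is a minimal sketch under several assumptions: glibc 2.16+ for getauxval(), notes padded to 4 bytes, and the whole vdso file image mapped contiguously so that file offsets can be added to the base address. The function names are hypothetical, and the note name that glibc's dynamic linker actually looks for is not taken from this thread.

```c
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/* ELF note name and descriptor fields are padded to 4 bytes (assumed). */
static size_t note_align(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Count vdso notes whose owner name equals `name`; -1 if there is no vdso. */
static int count_vdso_notes(const char *name)
{
    uintptr_t base = (uintptr_t)getauxval(AT_SYSINFO_EHDR);
    if (base == 0)
        return -1;

    const ElfW(Ehdr) *eh = (const ElfW(Ehdr) *)base;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
        return -1;

    const ElfW(Phdr) *ph = (const ElfW(Phdr) *)(base + eh->e_phoff);
    int found = 0;

    for (int i = 0; i < eh->e_phnum; i++) {
        if (ph[i].p_type != PT_NOTE)
            continue;
        const unsigned char *p = (const unsigned char *)(base + ph[i].p_offset);
        const unsigned char *end = p + ph[i].p_filesz;

        while ((size_t)(end - p) >= sizeof(ElfW(Nhdr))) {
            const ElfW(Nhdr) *nh = (const ElfW(Nhdr) *)p;
            size_t step = sizeof(*nh) + note_align(nh->n_namesz)
                        + note_align(nh->n_descsz);
            if ((size_t)(end - p) < step)
                break;          /* truncated note: stop scanning */
            if (nh->n_namesz == strlen(name) + 1 &&
                memcmp(nh + 1, name, nh->n_namesz) == 0)
                found++;
            p += step;
        }
    }
    return found;
}
```

Calling count_vdso_notes("Linux") or count_vdso_notes("GNU") on a running system shows which note owners the kernel embedded; a result of -1 means no vdso was mapped, which is the situation Jeremy describes where the dynamic linker has nothing to read the capability string from.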
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 10:23 ` Jeremy Fitzhardinge
@ 2006-12-12 12:01 ` Andi Kleen
2006-12-12 20:11 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2006-12-12 12:01 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Virtualization Mailing List
On Tuesday 12 December 2006 11:23, Jeremy Fitzhardinge wrote:
> Will your system boot with vdso=0 on the kernel command line?
Sure, it's the same as what I did by default.
> Presumably it will boot paravirt-native without it (since native makes
> no claims on the address space), so its something you could put in your
> Xen config file, no?
I don't think being incompatible with old binaries is a sensible default. That
is why I changed the wrong default. If paravirt ops cannot supply
a compatible vdso it has to do without one.
-Andi
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 12:01 ` Andi Kleen
@ 2006-12-12 20:11 ` Jeremy Fitzhardinge
2006-12-12 21:15 ` Andi Kleen
0 siblings, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2006-12-12 20:11 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List
Andi Kleen wrote:
> I don't think being incompatible to old binaries is a sensible default. That
> is why I changed the wrong default. If paravirt ops cannot supply
> a compatible vdso it has to do without one.
Do you know what glibc2.1 actually needs from the vdso? Does it
actually interpret it as an ELF file, or does it just jump into it to
perform syscalls? I wonder if we could use a fault in the vdso memory
range to act as a syscall, or something like that?
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 20:11 ` Jeremy Fitzhardinge
@ 2006-12-12 21:15 ` Andi Kleen
2006-12-13 2:04 ` Rusty Russell
2006-12-13 4:36 ` Rusty Russell
0 siblings, 2 replies; 17+ messages in thread
From: Andi Kleen @ 2006-12-12 21:15 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Virtualization Mailing List
On Tuesday 12 December 2006 21:11, Jeremy Fitzhardinge wrote:
> Andi Kleen wrote:
> > I don't think being incompatible to old binaries is a sensible default. That
> > is why I changed the wrong default. If paravirt ops cannot supply
> > a compatible vdso it has to do without one.
>
> Do you know what glibc2.1 actually needs from the vdso? Does it
> actually interpret as an elf file,
It interprets it as an ELF file, but then has some special hacks to jump
directly anyway (or at least there is no direct linking; it goes
over a trampoline in the main glibc).
The failure is an assertion failure in ld.so.
-Andi
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 3:02 ` Andi Kleen
2006-12-12 6:28 ` Zachary Amsden
2006-12-12 6:50 ` Jeremy Fitzhardinge
@ 2006-12-12 23:22 ` Rusty Russell
2 siblings, 0 replies; 17+ messages in thread
From: Rusty Russell @ 2006-12-12 23:22 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List
On Tue, 2006-12-12 at 04:02 +0100, Andi Kleen wrote:
> On Tuesday 12 December 2006 02:22, Jeremy Fitzhardinge wrote:
> > What problem do they cause together? There's certainly no problem with
> > Xen+vdso
>
> This was the change which finally got my test system (with an older
> SUSE 9.0 based user land to boot). With paravirt older glibc's ld.so
> otherwise throws assertation failures because it somehow can't deal with
> the new placement. This only happens with paravirt enabled.
>
> Binary compatibility is important.
Yes, this goes back to the original COMPAT_VDSO config option, months
ago. FC1's buggy glibc couldn't handle the vdso being in an unusual
place, and (over my objections) the COMPAT_VDSO option was introduced.
Seems like SuSE 9.0 is similarly affected.
I don't have a system which has this problem (and, at the end of a
modem, I'm unlikely to get one soon). But I would suggest that
COMPAT_VDSO should be rewritten: catch the segv from init (presumably at or
around the old vdso 0xFFFF0000 addr?), printk a message, turn vdso off
and re-exec init.
That should make everyone happy. Now, someone who can repro this please
code it up!
Thanks!
Rusty.
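[Editor's sketch] The proposal above (run init, notice that it died in a characteristic way, switch the feature off, run it again) can be modelled in user space with fork() and waitpid(). This is only an analogy under stated assumptions: the real change would live in the kernel's signal and exec paths, and the names run_child, run_with_fallback and picky_task are hypothetical. The sketch keys on SIGABRT, which is what a failed glibc assert() raises.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*task_fn)(int feature_enabled);

/* Run the task in a child process; return its wait status, or -1 on error. */
static int run_child(task_fn task, int feature_enabled)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(task(feature_enabled));
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* First try with the feature on; if the child aborts, retry once with it off. */
static int run_with_fallback(task_fn task, int *retried)
{
    int status = run_child(task, 1);

    *retried = 0;
    if (status != -1 && WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT) {
        *retried = 1;
        status = run_child(task, 0);
    }
    return status;
}

/* Stand-in for an old ld.so that asserts when the feature is enabled. */
static int picky_task(int feature_enabled)
{
    if (feature_enabled)
        abort();
    return 0;
}
```

Passing picky_task to run_with_fallback() sets *retried to 1 and returns a clean exit status from the second run. The retry happens at most once, which mirrors the requirement that a second failure with the vdso already disabled must not loop.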
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 21:15 ` Andi Kleen
@ 2006-12-13 2:04 ` Rusty Russell
2006-12-13 4:36 ` Rusty Russell
1 sibling, 0 replies; 17+ messages in thread
From: Rusty Russell @ 2006-12-13 2:04 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List
On Tue, 2006-12-12 at 22:15 +0100, Andi Kleen wrote:
> On Tuesday 12 December 2006 21:11, Jeremy Fitzhardinge wrote:
> > Andi Kleen wrote:
> > > I don't think being incompatible to old binaries is a sensible default. That
> > > is why I changed the wrong default. If paravirt ops cannot supply
> > > a compatible vdso it has to do without one.
> >
> > Do you know what glibc2.1 actually needs from the vdso? Does it
> > actually interpret as an elf file,
>
> Interpret it as a ELF file, but then has some special hacks to jump
> directly anyways (or at least there is no direct linking, but
> it goes over a trampoline in the main glibc)
>
> The failure is an assertation failure in ld.so.
And since init is special, the SIGABRT doesn't get delivered. Andi,
does this hack come close? (Against older kernel, you'll need to take
out the #ifdef CONFIG_PARAVIRT around the vdso_enabled initialization).
Rusty.
Older glibcs assert() that the vdso will be in a particular spot
(which it can no longer be with CONFIG_PARAVIRT). As this glibc was
shipped in SuSE 9.0 and Fedora Core 1, it's not a trivial breakage.
Try to detect the failing init at runtime, turn off vdso and re-exec.
Untested, since I don't have a failing system.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
diff -r c3d6f0e043e0 arch/i386/Kconfig
--- a/arch/i386/Kconfig Wed Nov 15 19:21:22 2006 +1100
+++ b/arch/i386/Kconfig Wed Dec 13 12:20:49 2006 +1100
@@ -848,15 +848,14 @@ config HOTPLUG_CPU
/sys/devices/system/cpu.
config COMPAT_VDSO
- bool "Compat VDSO support"
- default y
- depends on !PARAVIRT
- help
- Map the VDSO to the predictable old-style address too.
- ---help---
- Say N here if you are running a sufficiently recent glibc
- version (2.3.3 or later), to remove the high-mapped
- VDSO mapping and to exclusively use the randomized VDSO.
+ bool "Disable VDSO for old glibc"
+ default y
+ ---help---
+ Old glibc does not like the modern VDSO placement (glibc
+ 2.3.3 or later is fine, Fedora Core 1 and SuSE 9.0 have
+ problems). Very old glibc versions don't use the VDSO at
+ all. This option tries to detect the glibc assertion which
+ occurs and then disables the VDSO.
If unsure, say Y.
diff -r c3d6f0e043e0 arch/i386/kernel/signal.c
--- a/arch/i386/kernel/signal.c Wed Nov 15 19:21:22 2006 +1100
+++ b/arch/i386/kernel/signal.c Wed Dec 13 12:47:16 2006 +1100
@@ -608,6 +608,17 @@ static void fastcall do_signal(struct pt
return;
}
+#ifdef CONFIG_COMPAT_VDSO
+ else if (signr == -1) {
+ void reexec_init(void);
+ static int reexec_done;
+ if (!reexec_done++) {
+ printk("COMPAT_VDSO: old glibc? Disabling vdso\n");
+ vdso_enabled = 0;
+ reexec_init();
+ }
+ }
+#endif
/* Did we come from a system call? */
if (regs->orig_eax >= 0) {
diff -r c3d6f0e043e0 arch/i386/kernel/sysenter.c
--- a/arch/i386/kernel/sysenter.c Wed Nov 15 19:21:22 2006 +1100
+++ b/arch/i386/kernel/sysenter.c Wed Dec 13 12:17:02 2006 +1100
@@ -72,15 +72,10 @@ int __init sysenter_setup(void)
{
syscall_page = (void *)get_zeroed_page(GFP_ATOMIC);
-#ifdef CONFIG_COMPAT_VDSO
- __set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_READONLY);
- printk("Compat vDSO mapped to %08lx.\n", __fix_to_virt(FIX_VDSO));
-#else
/*
* In the non-compat case the ELF coredumping code needs the fixmap:
*/
__set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_KERNEL_RO);
-#endif
if (!boot_cpu_has(X86_FEATURE_SEP)) {
memcpy(syscall_page,
diff -r c3d6f0e043e0 arch/i386/mm/pgtable.c
--- a/arch/i386/mm/pgtable.c Wed Nov 15 19:21:22 2006 +1100
+++ b/arch/i386/mm/pgtable.c Wed Dec 13 12:17:18 2006 +1100
@@ -141,10 +141,8 @@ void set_pmd_pfn(unsigned long vaddr, un
}
static int fixmaps;
-#ifndef CONFIG_COMPAT_VDSO
unsigned long __FIXADDR_TOP = 0xfffff000;
EXPORT_SYMBOL(__FIXADDR_TOP);
-#endif
void __set_fixmap (enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
{
@@ -168,12 +166,8 @@ void reserve_top_address(unsigned long r
void reserve_top_address(unsigned long reserve)
{
BUG_ON(fixmaps > 0);
-#ifdef CONFIG_COMPAT_VDSO
- BUG_ON(reserve != 0);
-#else
__FIXADDR_TOP = -reserve - PAGE_SIZE;
__VMALLOC_RESERVE += reserve;
-#endif
}
pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
diff -r c3d6f0e043e0 include/asm-i386/elf.h
--- a/include/asm-i386/elf.h Wed Nov 15 19:21:22 2006 +1100
+++ b/include/asm-i386/elf.h Wed Dec 13 12:43:45 2006 +1100
@@ -135,13 +135,8 @@ extern int dump_task_extended_fpu (struc
#define VDSO_HIGH_BASE (__fix_to_virt(FIX_VDSO))
#define VDSO_BASE ((unsigned long)current->mm->context.vdso)
-#ifdef CONFIG_COMPAT_VDSO
-# define VDSO_COMPAT_BASE VDSO_HIGH_BASE
-# define VDSO_PRELINK VDSO_HIGH_BASE
-#else
# define VDSO_COMPAT_BASE VDSO_BASE
# define VDSO_PRELINK 0
-#endif
#define VDSO_COMPAT_SYM(x) \
(VDSO_COMPAT_BASE + (unsigned long)(x) - VDSO_PRELINK)
diff -r c3d6f0e043e0 include/asm-i386/fixmap.h
--- a/include/asm-i386/fixmap.h Wed Nov 15 19:21:22 2006 +1100
+++ b/include/asm-i386/fixmap.h Wed Dec 13 12:39:42 2006 +1100
@@ -19,11 +19,7 @@
* Leave one empty page between vmalloc'ed areas and
* the start of the fixmap.
*/
-#ifndef CONFIG_COMPAT_VDSO
extern unsigned long __FIXADDR_TOP;
-#else
-#define __FIXADDR_TOP 0xfffff000
-#endif
#ifndef __ASSEMBLY__
#include <linux/kernel.h>
diff -r c3d6f0e043e0 init/main.c
--- a/init/main.c Wed Nov 15 19:21:22 2006 +1100
+++ b/init/main.c Wed Dec 13 12:01:27 2006 +1100
@@ -707,6 +707,13 @@ static void run_init_process(char *init_
kernel_execve(init_filename, argv_init, envp_init);
}
+#ifdef CONFIG_COMPAT_VDSO
+void reexec_init(void)
+{
+ kernel_execve(argv_init[0], argv_init, envp_init);
+}
+#endif
+
static int init(void * unused)
{
lock_kernel();
diff -r c3d6f0e043e0 kernel/signal.c
--- a/kernel/signal.c Wed Nov 15 19:21:22 2006 +1100
+++ b/kernel/signal.c Wed Dec 13 11:59:05 2006 +1100
@@ -2010,8 +2010,17 @@ relock:
* within that pid space. It can of course get signals from
* its parent pid space.
*/
- if (current == child_reaper(current))
+ if (current == child_reaper(current)) {
+#ifdef CONFIG_COMPAT_VDSO
+ /* Gross hack: Old glibc asserts, not
+ liking moved vdso (SuSE 9, FC1) */
+ if (signr == SIGABRT && list_empty(&current->children)) {
+ signr = -1;
+ break;
+ }
+#endif
continue;
+ }
if (sig_kernel_stop(signr)) {
/*
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-12 21:15 ` Andi Kleen
2006-12-13 2:04 ` Rusty Russell
@ 2006-12-13 4:36 ` Rusty Russell
2006-12-13 5:25 ` Rusty Russell
1 sibling, 1 reply; 17+ messages in thread
From: Rusty Russell @ 2006-12-13 4:36 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List
On Tue, 2006-12-12 at 22:15 +0100, Andi Kleen wrote:
> The failure is an assertation failure in ld.so.
OK, this patch tested on an assert() in init.
===
Older glibcs assert() that the vdso will be in a particular spot
(which it can no longer be with CONFIG_PARAVIRT). As this glibc was
shipped in SuSE 9.0 and Fedora Core 1, it's not a trivial breakage.
Try to detect the failing init at runtime, turn off vdso and re-exec.
Untested on the actual failing systems, but should work.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
diff -r ed1ffbd17965 arch/i386/Kconfig
--- a/arch/i386/Kconfig Wed Dec 13 14:11:14 2006 +1100
+++ b/arch/i386/Kconfig Wed Dec 13 14:30:30 2006 +1100
@@ -816,15 +816,14 @@ config HOTPLUG_CPU
/sys/devices/system/cpu.
config COMPAT_VDSO
- bool "Compat VDSO support"
- default y
- depends on !PARAVIRT
- help
- Map the VDSO to the predictable old-style address too.
- ---help---
- Say N here if you are running a sufficiently recent glibc
- version (2.3.3 or later), to remove the high-mapped
- VDSO mapping and to exclusively use the randomized VDSO.
+ bool "Disable VDSO for old glibc"
+ default y
+ ---help---
+ Old glibc does not like the modern VDSO placement (glibc
+ 2.3.3 or later is fine, Fedora Core 1 and SuSE 9.0 have
+ problems). Very old glibc versions don't use the VDSO at
+ all. This option tries to detect the glibc assertion which
+ occurs and then disables the VDSO.
If unsure, say Y.
diff -r ed1ffbd17965 arch/i386/kernel/signal.c
--- a/arch/i386/kernel/signal.c Wed Dec 13 14:11:14 2006 +1100
+++ b/arch/i386/kernel/signal.c Wed Dec 13 15:26:37 2006 +1100
@@ -608,6 +608,20 @@ static void fastcall do_signal(struct pt
return;
}
+#ifdef CONFIG_COMPAT_VDSO
+ else if (signr == -1) {
+ void reexec_init(void);
+ if (vdso_enabled) {
+ printk(KERN_WARNING "COMPAT_VDSO: Old glibc?"
+ " Re-execing init with vdso disabled\n");
+ vdso_enabled = 0;
+ /* kill() made us think we're dying: we're not. */
+ current->signal->group_stop_count = 0;
+ reexec_init();
+ printk(KERN_WARNING "Re-exec of init failed\n");
+ }
+ }
+#endif
/* Did we come from a system call? */
if (regs->orig_eax >= 0) {
diff -r ed1ffbd17965 arch/i386/kernel/sysenter.c
--- a/arch/i386/kernel/sysenter.c Wed Dec 13 14:11:14 2006 +1100
+++ b/arch/i386/kernel/sysenter.c Wed Dec 13 14:13:35 2006 +1100
@@ -27,11 +27,7 @@
* Should the kernel map a VDSO page into processes and pass its
* address down to glibc upon exec()?
*/
-#ifdef CONFIG_PARAVIRT
-unsigned int __read_mostly vdso_enabled = 0;
-#else
unsigned int __read_mostly vdso_enabled = 1;
-#endif
EXPORT_SYMBOL_GPL(vdso_enabled);
@@ -76,15 +72,10 @@ int __init sysenter_setup(void)
{
syscall_page = (void *)get_zeroed_page(GFP_ATOMIC);
-#ifdef CONFIG_COMPAT_VDSO
- __set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_READONLY);
- printk("Compat vDSO mapped to %08lx.\n", __fix_to_virt(FIX_VDSO));
-#else
/*
* In the non-compat case the ELF coredumping code needs the fixmap:
*/
__set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_KERNEL_RO);
-#endif
if (!boot_cpu_has(X86_FEATURE_SEP)) {
memcpy(syscall_page,
diff -r ed1ffbd17965 arch/i386/mm/pgtable.c
--- a/arch/i386/mm/pgtable.c Wed Dec 13 14:11:14 2006 +1100
+++ b/arch/i386/mm/pgtable.c Wed Dec 13 14:30:33 2006 +1100
@@ -144,10 +144,8 @@ void set_pmd_pfn(unsigned long vaddr, un
}
static int fixmaps;
-#ifndef CONFIG_COMPAT_VDSO
unsigned long __FIXADDR_TOP = 0xfffff000;
EXPORT_SYMBOL(__FIXADDR_TOP);
-#endif
void __set_fixmap (enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
{
@@ -171,12 +169,8 @@ void reserve_top_address(unsigned long r
void reserve_top_address(unsigned long reserve)
{
BUG_ON(fixmaps > 0);
-#ifdef CONFIG_COMPAT_VDSO
- BUG_ON(reserve != 0);
-#else
__FIXADDR_TOP = -reserve - PAGE_SIZE;
__VMALLOC_RESERVE += reserve;
-#endif
}
pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
diff -r ed1ffbd17965 include/asm-i386/elf.h
--- a/include/asm-i386/elf.h Wed Dec 13 14:11:14 2006 +1100
+++ b/include/asm-i386/elf.h Wed Dec 13 14:11:14 2006 +1100
@@ -135,13 +135,8 @@ extern int dump_task_extended_fpu (struc
#define VDSO_HIGH_BASE (__fix_to_virt(FIX_VDSO))
#define VDSO_BASE ((unsigned long)current->mm->context.vdso)
-#ifdef CONFIG_COMPAT_VDSO
-# define VDSO_COMPAT_BASE VDSO_HIGH_BASE
-# define VDSO_PRELINK VDSO_HIGH_BASE
-#else
# define VDSO_COMPAT_BASE VDSO_BASE
# define VDSO_PRELINK 0
-#endif
#define VDSO_COMPAT_SYM(x) \
(VDSO_COMPAT_BASE + (unsigned long)(x) - VDSO_PRELINK)
diff -r ed1ffbd17965 include/asm-i386/fixmap.h
--- a/include/asm-i386/fixmap.h Wed Dec 13 14:11:14 2006 +1100
+++ b/include/asm-i386/fixmap.h Wed Dec 13 14:11:14 2006 +1100
@@ -19,11 +19,7 @@
* Leave one empty page between vmalloc'ed areas and
* the start of the fixmap.
*/
-#ifndef CONFIG_COMPAT_VDSO
extern unsigned long __FIXADDR_TOP;
-#else
-#define __FIXADDR_TOP 0xfffff000
-#endif
#ifndef __ASSEMBLY__
#include <linux/kernel.h>
diff -r ed1ffbd17965 init/main.c
--- a/init/main.c Wed Dec 13 14:11:14 2006 +1100
+++ b/init/main.c Wed Dec 13 15:31:57 2006 +1100
@@ -710,6 +710,16 @@ static void run_init_process(char *init_
kernel_execve(init_filename, argv_init, envp_init);
}
+#ifdef CONFIG_COMPAT_VDSO
+void reexec_init(void)
+{
+ mm_segment_t oldfs = get_fs();
+ set_fs(KERNEL_DS);
+ kernel_execve(argv_init[0], argv_init, envp_init);
+ set_fs(oldfs);
+}
+#endif
+
static int init(void * unused)
{
lock_kernel();
diff -r ed1ffbd17965 kernel/signal.c
--- a/kernel/signal.c Wed Dec 13 14:11:14 2006 +1100
+++ b/kernel/signal.c Wed Dec 13 14:55:07 2006 +1100
@@ -1878,8 +1878,17 @@ relock:
continue;
/* Init gets no signals it doesn't want. */
- if (current == child_reaper)
+ if (current == child_reaper) {
+#ifdef CONFIG_COMPAT_VDSO
+ /* Gross hack: Old glibc asserts, not
+ liking moved vdso (SuSE 9, FC1) */
+ if (signr == SIGABRT) {
+ signr = -1;
+ break;
+ }
+#endif
continue;
+ }
if (sig_kernel_stop(signr)) {
/*
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Why disable vdso by default with CONFIG_PARAVIRT?
2006-12-13 4:36 ` Rusty Russell
@ 2006-12-13 5:25 ` Rusty Russell
0 siblings, 0 replies; 17+ messages in thread
From: Rusty Russell @ 2006-12-13 5:25 UTC (permalink / raw)
To: Andi Kleen; +Cc: Virtualization Mailing List
On Wed, 2006-12-13 at 15:36 +1100, Rusty Russell wrote:
> On Tue, 2006-12-12 at 22:15 +0100, Andi Kleen wrote:
> > The failure is an assertation failure in ld.so.
>
> OK, this patch tested on an assert() in init.
And this variant traps kill instead which is simpler:
Older glibcs assert() that the vdso will be in a particular spot
(which it can no longer be with CONFIG_PARAVIRT). As this glibc was
shipped in SuSE 9.0 and Fedora Core 1, it's not a trivial breakage.
Try to detect the failing init at runtime, turn off vdso and re-exec.
Not tested on this particular assertion, but should work.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
diff -r 2d9ddfd41f3a arch/i386/Kconfig
--- a/arch/i386/Kconfig Wed Dec 13 16:04:20 2006 +1100
+++ b/arch/i386/Kconfig Wed Dec 13 16:04:21 2006 +1100
@@ -816,15 +816,14 @@ config HOTPLUG_CPU
/sys/devices/system/cpu.
config COMPAT_VDSO
- bool "Compat VDSO support"
- default y
- depends on !PARAVIRT
- help
- Map the VDSO to the predictable old-style address too.
- ---help---
- Say N here if you are running a sufficiently recent glibc
- version (2.3.3 or later), to remove the high-mapped
- VDSO mapping and to exclusively use the randomized VDSO.
+ bool "Disable VDSO for old glibc"
+ default y
+ ---help---
+ Old glibc does not like the modern VDSO placement (glibc
+ 2.3.3 or later is fine, Fedora Core 1 and SuSE 9.0 have
+ problems). Very old glibc versions don't use the VDSO at
+ all. This option tries to detect the glibc assertion which
+ occurs and then disables the VDSO.
If unsure, say Y.
diff -r 2d9ddfd41f3a arch/i386/kernel/signal.c
--- a/arch/i386/kernel/signal.c Wed Dec 13 16:04:20 2006 +1100
+++ b/arch/i386/kernel/signal.c Wed Dec 13 16:20:32 2006 +1100
@@ -655,3 +655,22 @@ void do_notify_resume(struct pt_regs *re
clear_thread_flag(TIF_IRET);
}
+
+#ifdef CONFIG_COMPAT_VDSO
+#include <linux/syscalls.h>
+
+asmlinkage long
+sys_check_init_abort_kill(int pid, int sig)
+{
+ if (unlikely(current == child_reaper)
+ && pid == 1 && sig == SIGABRT && vdso_enabled) {
+ void reexec_init(void);
+ printk(KERN_WARNING "COMPAT_VDSO: Old glibc?"
+ " Re-execing init with vdso disabled\n");
+ vdso_enabled = 0;
+ reexec_init();
+ printk(KERN_WARNING "Re-exec of init failed\n");
+ }
+ return sys_kill(pid, sig);
+}
+#endif
diff -r 2d9ddfd41f3a arch/i386/kernel/syscall_table.S
--- a/arch/i386/kernel/syscall_table.S Wed Dec 13 16:04:20 2006 +1100
+++ b/arch/i386/kernel/syscall_table.S Wed Dec 13 16:06:44 2006 +1100
@@ -36,7 +36,11 @@ ENTRY(sys_call_table)
.long sys_nice
.long sys_ni_syscall /* 35 - old ftime syscall holder */
.long sys_sync
+#ifdef CONFIG_COMPAT_VDSO
+ .long sys_check_init_abort_kill
+#else
.long sys_kill
+#endif
.long sys_rename
.long sys_mkdir
.long sys_rmdir /* 40 */
diff -r 2d9ddfd41f3a arch/i386/kernel/sysenter.c
--- a/arch/i386/kernel/sysenter.c Wed Dec 13 16:04:20 2006 +1100
+++ b/arch/i386/kernel/sysenter.c Wed Dec 13 16:04:21 2006 +1100
@@ -27,11 +27,7 @@
* Should the kernel map a VDSO page into processes and pass its
* address down to glibc upon exec()?
*/
-#ifdef CONFIG_PARAVIRT
-unsigned int __read_mostly vdso_enabled = 0;
-#else
unsigned int __read_mostly vdso_enabled = 1;
-#endif
EXPORT_SYMBOL_GPL(vdso_enabled);
@@ -76,15 +72,10 @@ int __init sysenter_setup(void)
{
syscall_page = (void *)get_zeroed_page(GFP_ATOMIC);
-#ifdef CONFIG_COMPAT_VDSO
- __set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_READONLY);
- printk("Compat vDSO mapped to %08lx.\n", __fix_to_virt(FIX_VDSO));
-#else
/*
* In the non-compat case the ELF coredumping code needs the fixmap:
*/
__set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_KERNEL_RO);
-#endif
if (!boot_cpu_has(X86_FEATURE_SEP)) {
memcpy(syscall_page,
diff -r 2d9ddfd41f3a arch/i386/mm/pgtable.c
--- a/arch/i386/mm/pgtable.c Wed Dec 13 16:04:20 2006 +1100
+++ b/arch/i386/mm/pgtable.c Wed Dec 13 16:04:21 2006 +1100
@@ -144,10 +144,8 @@ void set_pmd_pfn(unsigned long vaddr, un
}
static int fixmaps;
-#ifndef CONFIG_COMPAT_VDSO
unsigned long __FIXADDR_TOP = 0xfffff000;
EXPORT_SYMBOL(__FIXADDR_TOP);
-#endif
void __set_fixmap (enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
{
@@ -171,12 +169,8 @@ void reserve_top_address(unsigned long r
void reserve_top_address(unsigned long reserve)
{
BUG_ON(fixmaps > 0);
-#ifdef CONFIG_COMPAT_VDSO
- BUG_ON(reserve != 0);
-#else
__FIXADDR_TOP = -reserve - PAGE_SIZE;
__VMALLOC_RESERVE += reserve;
-#endif
}
pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
diff -r 2d9ddfd41f3a include/asm-i386/elf.h
--- a/include/asm-i386/elf.h Wed Dec 13 16:04:20 2006 +1100
+++ b/include/asm-i386/elf.h Wed Dec 13 16:04:21 2006 +1100
@@ -135,13 +135,8 @@ extern int dump_task_extended_fpu (struc
#define VDSO_HIGH_BASE (__fix_to_virt(FIX_VDSO))
#define VDSO_BASE ((unsigned long)current->mm->context.vdso)
-#ifdef CONFIG_COMPAT_VDSO
-# define VDSO_COMPAT_BASE VDSO_HIGH_BASE
-# define VDSO_PRELINK VDSO_HIGH_BASE
-#else
# define VDSO_COMPAT_BASE VDSO_BASE
# define VDSO_PRELINK 0
-#endif
#define VDSO_COMPAT_SYM(x) \
(VDSO_COMPAT_BASE + (unsigned long)(x) - VDSO_PRELINK)
diff -r 2d9ddfd41f3a include/asm-i386/fixmap.h
--- a/include/asm-i386/fixmap.h Wed Dec 13 16:04:20 2006 +1100
+++ b/include/asm-i386/fixmap.h Wed Dec 13 16:04:21 2006 +1100
@@ -19,11 +19,7 @@
* Leave one empty page between vmalloc'ed areas and
* the start of the fixmap.
*/
-#ifndef CONFIG_COMPAT_VDSO
extern unsigned long __FIXADDR_TOP;
-#else
-#define __FIXADDR_TOP 0xfffff000
-#endif
#ifndef __ASSEMBLY__
#include <linux/kernel.h>
diff -r 2d9ddfd41f3a init/main.c
--- a/init/main.c Wed Dec 13 16:04:20 2006 +1100
+++ b/init/main.c Wed Dec 13 16:04:21 2006 +1100
@@ -710,6 +710,16 @@ static void run_init_process(char *init_
kernel_execve(init_filename, argv_init, envp_init);
}
+#ifdef CONFIG_COMPAT_VDSO
+void reexec_init(void)
+{
+ mm_segment_t oldfs = get_fs();
+ set_fs(KERNEL_DS);
+ kernel_execve(argv_init[0], argv_init, envp_init);
+ set_fs(oldfs);
+}
+#endif
+
static int init(void * unused)
{
lock_kernel();
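[Editor's sketch] The decision made by sys_check_init_abort_kill() in the patch above reduces to a small predicate plus a one-shot state change: only init sending SIGABRT to pid 1 while the vdso is still enabled triggers the re-exec. The following user-space model restates that logic so it can be checked in isolation. The struct and function names are hypothetical, and the actual re-exec is replaced by a counter.

```c
#include <signal.h>
#include <stdbool.h>

/* Stand-in for the kernel's global vdso_enabled flag plus a re-exec counter. */
struct vdso_state {
    bool vdso_enabled;
    int reexec_count;
};

/* True when init is aborting itself, e.g. after a failed assert(). */
static bool is_init_abort(bool caller_is_init, int pid, int sig)
{
    return caller_is_init && pid == 1 && sig == SIGABRT;
}

/*
 * Model of the check wrapped around kill(): returns true if the caller
 * would be re-executed with the vdso disabled, false if the request
 * should simply fall through to the ordinary kill path.
 */
static bool check_init_abort_kill(struct vdso_state *st, bool caller_is_init,
                                  int pid, int sig)
{
    if (is_init_abort(caller_is_init, pid, sig) && st->vdso_enabled) {
        st->vdso_enabled = false;   /* one-shot: cannot trigger again */
        st->reexec_count++;
        return true;
    }
    return false;
}
```

Because clearing vdso_enabled is itself the guard, a second abort from init falls through to the normal kill path instead of re-executing forever, which is the same role the reexec_done counter played in the first version of the patch.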
^ permalink raw reply [flat|nested] 17+ messages in thread