* [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Andy Lutomirski @ 2015-07-08 1:25 UTC
To: x86, linux-kernel
Cc: Oleg Nesterov, Kees Cook, Arjan van de Ven, Peter Zijlstra,
Borislav Petkov, Linus Torvalds, Andy Lutomirski

VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
in use. The code is a big undocumented mess, it's a real PITA to
test, and it looks like a big chunk of vm86_32.c is dead code. It
also plays awful games with the entry asm.

No one should be using it anyway. Use DOSBOX or KVM instead.

Mark it BROKEN. I want to remove some (obviously incorrect) exit
asm that it depends on, and I don't want to figure out how to run
severely obsolete programs just to test something that no one uses
for anything other than exploits anyway.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

I find it implausible that vm86_32.c isn't full of root holes. It's
also full of hilariously ugly code, it does terrible things to the
kernel stack, and its interaction with the syscall slowpath is
blatantly incorrect.

It really shouldn't have any users, anyway. It doesn't (and can't!)
work on 64-bit kernels, and the only program that even knows how it
works appears to be DOSEMU. DOSEMU doesn't even need it for most
programs (it uses modify_ldt instead if possible), and DOSBOX and
KVM are better choices anyway.

I think that even DOSEMU might be able to emulate vm86 (by emulating
instruction-by-instruction) if the vm86 syscall isn't there.

Want to be terrified? Read copy_vm86_regs_from_user. Or
mark_screen_rdonly. Or return_to_32bit. Or VM86_REQUEST_IRQ.

What do you all think? This code is a maintenance disaster, and I'd
love to see it go. This would be a nice first step.

This patch is intended for tip/x86/asm. The 32-bit part of my big
cleanup will interfere with vm86, and, while I think I fixed it up
right, I'd rather not expose everyone to the high probability of
crazy security bugs in this mess.

 arch/x86/Kconfig | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index aa94fd014fa2..080228bdbcda 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -997,8 +997,8 @@ config X86_THERMAL_VECTOR
 	depends on X86_MCE_INTEL
 
 config VM86
-	bool "Enable VM86 support" if EXPERT
-	default y
+	bool "Enable VM86 support" if BROKEN
+	default n
 	depends on X86_32
 	---help---
 	  This option is required by programs like DOSEMU to run
@@ -1006,6 +1006,12 @@ config VM86
 	  be needed by software like XFree86 to initialize some video
 	  cards via BIOS. Disabling this option saves about 6K.
 
+	  Linux's vm86 support is poorly maintained, essentially never
+	  tested by upstream kernel developers, has quite a few known
+	  bugs, and is probably full of security holes. The only thing
+	  that appears to use it is DOSEMU, and DOSBOX and KVM are
+	  better options these days. Don't enable it.
+
 config X86_16BIT
 	bool "Enable support for 16-bit segments" if EXPERT
 	default y
--
2.4.3

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Arjan van de Ven @ 2015-07-08 2:33 UTC
To: Andy Lutomirski, x86, linux-kernel
Cc: Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov,
Linus Torvalds

On 7/7/2015 6:25 PM, Andy Lutomirski wrote:
> VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
> in use. The code is a big undocumented mess, it's a real PITA to
> test, and it looks like a big chunk of vm86_32.c is dead code. It
> also plays awful games with the entry asm.
>
> No one should be using it anyway. Use DOSBOX or KVM instead.
>
> Mark it BROKEN. I want to remove some (obviously incorrect) exit
> asm that it depends on, and I don't want to figure out how to run
> severely obsolete programs just to test something that no one uses
> for anything other than exploits anyway.
>

while it is never great to deprecate features, in this case I am not sure
there is another choice unless someone steps up to seriously revamp this code.
(and look at it from a PREEMPT, NO_HZ etc etc angle)

if this patch would not be acceptable, at minimum we need some sort of "off
by default unless the sysadmin flips a sysfs thing", which is really just a
huge hack.

so for me this is
Acked-by: Arjan van de Ven <arjan@linux.intel.com>

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Thomas Gleixner @ 2015-07-08 14:00 UTC
To: Arjan van de Ven
Cc: Andy Lutomirski, x86, LKML, Oleg Nesterov, Kees Cook,
Peter Zijlstra, Borislav Petkov, Linus Torvalds

On Tue, 7 Jul 2015, Arjan van de Ven wrote:
> On 7/7/2015 6:25 PM, Andy Lutomirski wrote:
> > VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
> > in use. The code is a big undocumented mess, it's a real PITA to
> > test, and it looks like a big chunk of vm86_32.c is dead code. It
> > also plays awful games with the entry asm.
> >
> > No one should be using it anyway. Use DOSBOX or KVM instead.
> >
> > Mark it BROKEN. I want to remove some (obviously incorrect) exit
> > asm that it depends on, and I don't want to figure out how to run
> > severely obsolete programs just to test something that no one uses
> > for anything other than exploits anyway.
> >
>
> while it is never great to deprecate features, in this case I am not sure
> there is another choice unless someone steps up to seriously revamp this code.
> (and look at it from a PREEMPT, NO_HZ etc etc angle)

Aside of being broken in so many aspects it's even more obsolete than
386 support, we should just remove it right away.

Thanks,

tglx

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Ingo Molnar @ 2015-07-08 14:04 UTC
To: Thomas Gleixner
Cc: Arjan van de Ven, Andy Lutomirski, x86, LKML, Oleg Nesterov,
Kees Cook, Peter Zijlstra, Borislav Petkov, Linus Torvalds

* Thomas Gleixner <tglx@linutronix.de> wrote:

> On Tue, 7 Jul 2015, Arjan van de Ven wrote:
>
> > On 7/7/2015 6:25 PM, Andy Lutomirski wrote:
> > > VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
> > > in use. The code is a big undocumented mess, it's a real PITA to
> > > test, and it looks like a big chunk of vm86_32.c is dead code. It
> > > also plays awful games with the entry asm.
> > >
> > > No one should be using it anyway. Use DOSBOX or KVM instead.
> > >
> > > Mark it BROKEN. I want to remove some (obviously incorrect) exit
> > > asm that it depends on, and I don't want to figure out how to run
> > > severely obsolete programs just to test something that no one uses
> > > for anything other than exploits anyway.
> > >
> >
> > while it is never great to deprecate features, in this case I am not sure
> > there is another choice unless someone steps up to seriously revamp this code.
> > (and look at it from a PREEMPT, NO_HZ etc etc angle)
>
> Aside of being broken in so many aspects it's even more obsolete than
> 386 support, we should just remove it right away.

Yes - marking it BROKEN essentially makes it impossible to build it without
changing the kernel source, so the next patch(es) could remove it.

But the 'marking BROKEN' patch will be much easier to backport, so I'd like
to keep it separate.

Thanks,

Ingo
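
For readers unfamiliar with the mechanism Ingo describes, here is a minimal Kconfig sketch of why gating a prompt on BROKEN keeps an option out of normal builds. It assumes BROKEN is a promptless bool that nothing selects, which is how I understand mainline init/Kconfig to define it; EXAMPLE_LEGACY is a hypothetical symbol used only for illustration and is not part of this patch.

```kconfig
# Assumption: mirrors the mainline definition of BROKEN. With no prompt,
# no default and no "select BROKEN" anywhere, it cannot be set to y from
# menuconfig or from a .config file.
config BROKEN
	bool

# Hypothetical symbol, for illustration only.
config EXAMPLE_LEGACY
	bool "Example legacy feature" if BROKEN
	default n
```

Because the prompt is only visible when BROKEN=y, EXAMPLE_LEGACY should always fall back to its default of n unless someone edits the Kconfig source. That is the same shape as the VM86 hunk in the patch above.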

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Pavel Machek @ 2015-07-09 9:03 UTC
To: Thomas Gleixner
Cc: Arjan van de Ven, Andy Lutomirski, x86, LKML, Oleg Nesterov,
Kees Cook, Peter Zijlstra, Borislav Petkov, Linus Torvalds

On Wed 2015-07-08 16:00:48, Thomas Gleixner wrote:
> On Tue, 7 Jul 2015, Arjan van de Ven wrote:
>
> > On 7/7/2015 6:25 PM, Andy Lutomirski wrote:
> > > VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
> > > in use. The code is a big undocumented mess, it's a real PITA to
> > > test, and it looks like a big chunk of vm86_32.c is dead code. It
> > > also plays awful games with the entry asm.
> > >
> > > No one should be using it anyway. Use DOSBOX or KVM instead.
> > >
> > > Mark it BROKEN. I want to remove some (obviously incorrect) exit
> > > asm that it depends on, and I don't want to figure out how to run
> > > severely obsolete programs just to test something that no one uses
> > > for anything other than exploits anyway.
> > >
> >
> > while it is never great to deprecate features, in this case I am not sure
> > there is another choice unless someone steps up to seriously revamp this code.
> > (and look at it from a PREEMPT, NO_HZ etc etc angle)
>
> Aside of being broken in so many aspects it's even more obsolete than
> 386 support, we should just remove it right away.

Bad news for you:

vbetool-0.5/lrmi.c:#include <asm/vm86.h>
vbetool-0.5/lrmi.c:#include <sys/vm86.h>
vbetool-0.5/lrmi.c:#include <machine/vm86.h>
vbetool-0.5/lrmi.c: struct vm86_struct vm;
vbetool-0.5/lrmi.c: struct vm86_init_args init;
...
vbetool-0.5/lrmi.c:lrmi_vm86(struct vm86_struct *vm)
vbetool-0.5/lrmi.c:#define lrmi_vm86 vm86
vbetool-0.5/lrmi.c: fputs("vm86() failed\n", stderr);
vbetool-0.5/lrmi.c:run_vm86(void)
vbetool-0.5/lrmi.c: vret = lrmi_vm86(&context.vm);
vbetool-0.5/lrmi.c:vm86_callback(int sig, int code, struct sigcontext *sc)
vbetool-0.5/lrmi.c:vm86_callback(int sig, int code, struct sigcontext *sc)
vbetool-0.5/lrmi.c:run_vm86(void)
vbetool-0.5/lrmi.c: fprintf(stderr, "run_vm86: callback already installed\n");

vbetool depends on it, and s2ram depends on vbetool. When we get
proper kernel drivers, this one will be solved, but it is not "more
obsolete than 386".

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Andy Lutomirski @ 2015-07-09 17:57 UTC
To: Pavel Machek
Cc: Thomas Gleixner, Kees Cook, Borislav Petkov, Arjan van de Ven,
Oleg Nesterov, LKML, X86 ML, Linus Torvalds, Peter Zijlstra

On Jul 9, 2015 2:03 AM, "Pavel Machek" <pavel@ucw.cz> wrote:
>
> On Wed 2015-07-08 16:00:48, Thomas Gleixner wrote:
> > On Tue, 7 Jul 2015, Arjan van de Ven wrote:
> >
> > > On 7/7/2015 6:25 PM, Andy Lutomirski wrote:
> > > > VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
> > > > in use. The code is a big undocumented mess, it's a real PITA to
> > > > test, and it looks like a big chunk of vm86_32.c is dead code. It
> > > > also plays awful games with the entry asm.
> > > >
> > > > No one should be using it anyway. Use DOSBOX or KVM instead.
> > > >
> > > > Mark it BROKEN. I want to remove some (obviously incorrect) exit
> > > > asm that it depends on, and I don't want to figure out how to run
> > > > severely obsolete programs just to test something that no one uses
> > > > for anything other than exploits anyway.
> > > >
> > >
> > > while it is never great to deprecate features, in this case I am not sure
> > > there is another choice unless someone steps up to seriously revamp this code.
> > > (and look at it from a PREEMPT, NO_HZ etc etc angle)
> >
> > Aside of being broken in so many aspects it's even more obsolete than
> > 386 support, we should just remove it right away.
>
> Bad news for you:
>
> vbetool-0.5/lrmi.c:#include <asm/vm86.h>
> vbetool-0.5/lrmi.c:#include <sys/vm86.h>
> vbetool-0.5/lrmi.c:#include <machine/vm86.h>
> vbetool-0.5/lrmi.c: struct vm86_struct vm;
> vbetool-0.5/lrmi.c: struct vm86_init_args init;
> ...
> vbetool-0.5/lrmi.c:lrmi_vm86(struct vm86_struct *vm)
> vbetool-0.5/lrmi.c:#define lrmi_vm86 vm86
> vbetool-0.5/lrmi.c: fputs("vm86() failed\n", stderr);
> vbetool-0.5/lrmi.c:run_vm86(void)
> vbetool-0.5/lrmi.c: vret = lrmi_vm86(&context.vm);
> vbetool-0.5/lrmi.c:vm86_callback(int sig, int code, struct sigcontext
> *sc)
> vbetool-0.5/lrmi.c:vm86_callback(int sig, int code, struct sigcontext
> *sc)
> vbetool-0.5/lrmi.c:run_vm86(void)
> vbetool-0.5/lrmi.c: fprintf(stderr, "run_vm86: callback
> already installed\n");
>
> vbetool depends on it, and s2ram depends on vbetool. When we get
> proper kernel drivers, this one will be solved, but it is not "more
> obsolete than 386".
>

vbetool has an x86 emulator. As far as I know, it's been there for a
long time, and I'd be surprised if it doesn't work on CONFIG_VM86=n
kernels. That being said, the code is kind of tangled and it's not
quite clear to me what's going on.

See: http://www.codon.org.uk/~mjg59/libx86/

Perhaps we should instead move CONFIG_VM86 out of EXPERT, default it
to n, and suggest that everyone running a reasonably modern distro
(2006 and up?) turn it off.

--Andy

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Kees Cook @ 2015-07-09 18:03 UTC
To: Andy Lutomirski
Cc: Pavel Machek, Thomas Gleixner, Borislav Petkov, Arjan van de Ven,
Oleg Nesterov, LKML, X86 ML, Linus Torvalds, Peter Zijlstra,
Matthew Garrett

On Thu, Jul 9, 2015 at 10:57 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Jul 9, 2015 2:03 AM, "Pavel Machek" <pavel@ucw.cz> wrote:
>>
>> On Wed 2015-07-08 16:00:48, Thomas Gleixner wrote:
>> > On Tue, 7 Jul 2015, Arjan van de Ven wrote:
>> >
>> > > On 7/7/2015 6:25 PM, Andy Lutomirski wrote:
>> > > > VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
>> > > > in use. The code is a big undocumented mess, it's a real PITA to
>> > > > test, and it looks like a big chunk of vm86_32.c is dead code. It
>> > > > also plays awful games with the entry asm.
>> > > >
>> > > > No one should be using it anyway. Use DOSBOX or KVM instead.
>> > > >
>> > > > Mark it BROKEN. I want to remove some (obviously incorrect) exit
>> > > > asm that it depends on, and I don't want to figure out how to run
>> > > > severely obsolete programs just to test something that no one uses
>> > > > for anything other than exploits anyway.
>> > > >
>> > >
>> > > while it is never great to deprecate features, in this case I am not sure
>> > > there is another choice unless someone steps up to seriously revamp this code.
>> > > (and look at it from a PREEMPT, NO_HZ etc etc angle)
>> >
>> > Aside of being broken in so many aspects it's even more obsolete than
>> > 386 support, we should just remove it right away.
>>
>> Bad news for you:
>>
>> vbetool-0.5/lrmi.c:#include <asm/vm86.h>
>> vbetool-0.5/lrmi.c:#include <sys/vm86.h>
>> vbetool-0.5/lrmi.c:#include <machine/vm86.h>
>> vbetool-0.5/lrmi.c: struct vm86_struct vm;
>> vbetool-0.5/lrmi.c: struct vm86_init_args init;
>> ...
>> vbetool-0.5/lrmi.c:lrmi_vm86(struct vm86_struct *vm)
>> vbetool-0.5/lrmi.c:#define lrmi_vm86 vm86
>> vbetool-0.5/lrmi.c: fputs("vm86() failed\n", stderr);
>> vbetool-0.5/lrmi.c:run_vm86(void)
>> vbetool-0.5/lrmi.c: vret = lrmi_vm86(&context.vm);
>> vbetool-0.5/lrmi.c:vm86_callback(int sig, int code, struct sigcontext
>> *sc)
>> vbetool-0.5/lrmi.c:vm86_callback(int sig, int code, struct sigcontext
>> *sc)
>> vbetool-0.5/lrmi.c:run_vm86(void)
>> vbetool-0.5/lrmi.c: fprintf(stderr, "run_vm86: callback
>> already installed\n");
>>
>> vbetool depends on it, and s2ram depends on vbetool. When we get
>> proper kernel drivers, this one will be solved, but it is not "more
>> obsolete than 386".
>>
>
> vbetool has an x86 emulator. As far as I know, it's been there for a
> long time, and I'd be surprised if it doesn't work on CONFIG_VM86=n
> kernels. That being said, the code is kind of tangled and it's not
> quite clear to me what's going on.
>
> See: http://www.codon.org.uk/~mjg59/libx86/
>
> Perhaps we should instead move CONFIG_VM86 out of EXPERT, default it
> to n, and suggest that everyone running a reasonably modern distro
> (2006 and up?) turn it off.

That seems like a good idea to me.

-Kees

--
Kees Cook
Chrome OS Security

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Linus Torvalds @ 2015-07-09 18:30 UTC
To: Andy Lutomirski
Cc: Pavel Machek, Thomas Gleixner, Kees Cook, Borislav Petkov,
Arjan van de Ven, Oleg Nesterov, LKML, X86 ML, Peter Zijlstra

On Thu, Jul 9, 2015 at 10:57 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> Perhaps we should instead move CONFIG_VM86 out of EXPERT, default it
> to n, and suggest that everyone running a reasonably modern distro
> (2006 and up?) turn it off.

Ack. Changing the default and trying to deprecate it sounds like a
good plan to me.

Linus
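
As a rough illustration of the alternative Andy floats and Linus acks here (drop the EXPERT condition, default to n), the Kconfig entry might look something like the sketch below. This is a hypothetical reading of the proposal, not the patch that was actually applied, and the help text is only a paraphrase of the wording in the patch at the top of the thread.

```kconfig
# Hypothetical sketch of "move CONFIG_VM86 out of EXPERT, default it to n".
config VM86
	bool "Enable VM86 support"
	default n
	depends on X86_32
	---help---
	  This option is required by programs like DOSEMU to run
	  16-bit real mode legacy code on x86 processors.

	  Illustrative wording only: most users of a reasonably modern
	  distro should be able to leave this off.
```

Unlike the BROKEN variant, this keeps the option visible and selectable to everyone on 32-bit builds, so existing users could still turn it on without editing the kernel source.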

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Linus Torvalds @ 2015-07-08 16:59 UTC
To: Arjan van de Ven
Cc: Andy Lutomirski, the arch/x86 maintainers,
Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra,
Borislav Petkov

On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
>
> if this patch would not be acceptable, at minimum we need some sort of "off
> by default
> unless the sysadmin flips a sysfs thing", which is really just a huge hack.

The only thing that matters is whether people use this or not.

If people use vm86 mode, we can't just disable it. It's that simple.
"It's poorly maintained" isn't an argument for removal. Only "nobody
cares" works as an argument for that.

My suspicion is that people still do use vm86 mode, but who knows..
Quite frankly, rather than disable it, I'd much rather see people who
modify low-level x86 code (yes, that means you, Luto) *test* it. If
you aren't willing to test the modifications you make, I don't think
those modifications should be merged, regardless of how nice a cleanup
they are.

Linus

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Andy Lutomirski @ 2015-07-08 17:30 UTC
To: Linus Torvalds
Cc: Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers,
Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra,
Borislav Petkov

On Wed, Jul 8, 2015 at 9:59 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
>>
>> if this patch would not be acceptable, at minimum we need some sort of "off
>> by default
>> unless the sysadmin flips a sysfs thing", which is really just a huge hack.
>
> The only thing that matters is whether people use this or not.
>

I think that the world contains precisely two programs that use the
vm86 syscalls. One is dosemu, and one is a test case I wrote. (There
are probably some exploits written by other people that I don't know
about. Certainly Spender has been patching vm86 for long enough that
he must have an exploit or two up his sleeve.)

As far as I can tell (and I'll try to test this better for real later
this week), dosemu already knows how to emulate real mode if vm86 is
unavailable. So it's unclear that turning off the vm86 syscalls
actually breaks anything whatsoever.

On the other hand, sys_vm86 fails if the syscall slow path is in use.
That means that quite a few Fedora versions (auditing), anything with
ptrace, seccomp (before 3.16 IIRC), and anything with context tracking
is probably actually *improved* by turning off the vm86 syscalls even
for dosemu users. And apparently Ubuntu has had CONFIG_VM86 disabled
forever.

IOW, vm86 really is broken.

> If people use vm86 mode, we can't just disable it. It's that simple.
> "It's poorly maintained" isn't an argument for removal. Only "nobody
> cares" works as an argument for that.
>
> My suspicion is that people still do use vm86 mode, but who knows..
> Quite frankly, rather than disable it, I'd much rather see people who
> modify low-level x86 code (yes, that means you, Luto) *test* it. If
> you aren't willing to test the modifications you make, I don't think
> those modifications should be merged, regardless of how nice a cleanup
> they are.

I tried to test it. As far as I know, my changes in -tip have no
effect on vm86, and the changes I'm planning on sending this week will
make it work better.

I still think that Linux users should have it configured out or
deleted altogether. Especially people who care at all about security.

It's easy to try the easy case (run from tools/testing/selftests/x86)
-- this is v4.2-rc1, but most recent versions should be identical:

$ ./entry_from_vm86_32
[RUN] #BR from vm86 mode
[OK] Exited vm86 mode due to #BR
[RUN] SYSENTER from vm86 mode
[OK] Exited vm86 mode due to unhandled GP fault

$ strace -e vm86 ./entry_from_vm86_32
[RUN] #BR from vm86 mode
vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS (Function not implemented)
[OK] Exited vm86 mode due to type 0, arg 0
[RUN] SYSENTER from vm86 mode
vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS (Function not implemented)
[OK] Exited vm86 mode due to type 0, arg 0

It only says "[OK]" because my test case isn't careful enough. That's
a failure. I suspect it was a much worse failure a couple versions ago
before my ENOSYS-reworking patch went in.

Replace "-e vm86" with "-e write" and be puzzled. The failure mode is
really pretty bad.

This only tests easy stuff. The integration between vm86 and fault
handling is truly awful and I don't even know how to approach testing
it. I'd probably have to run twenty or thirty old real-mode games to
even exercise those code paths.

I'll try to confirm later this week that dosemu can really handle real
mode without sys_vm86.

--Andy

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Andy Lutomirski @ 2015-07-08 17:49 UTC
To: Linus Torvalds
Cc: Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers,
Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra,
Borislav Petkov

On Wed, Jul 8, 2015 at 10:30 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> I'll try to confirm later this week that dosemu can really handle real
> mode without sys_vm86.

I don't know how to tell whether something is trying to use real mode,
but I can play this just fine in DOSEMU on my 64-bit laptop:

http://dosgames.com/dl.php?filename=http://www.dosgames.com/files/alleycat.zip

which suggests that it works just fine. There is most certainly no
working sys_vm86 on my laptop.

--Andy

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Linus Torvalds @ 2015-07-08 17:55 UTC
To: Andy Lutomirski
Cc: Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers,
Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra,
Borislav Petkov

On Wed, Jul 8, 2015 at 10:49 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> I don't know how to tell whether something is trying to use real mode,
> but I can play this just fine in DOSEMU on my 64-bit laptop:

So a 64-bit distro obviously will never have used vm86 mode - it
doesn't work there. Never has. There's no sane way to get to vm86 mode
from long mode, that's just how the 64-bit extensions worked.

(64-bit hardware obviously does support vm86 mode, but you have to
play games with mixing long mode and CPL0 32-bit protected mode to get
there, and we never did that).

It's the 32-bit distros I would worry about. The ones that may have
well disabled emulation, because they have vm86 mode enabled.

Linus

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Andy Lutomirski @ 2015-07-08 18:47 UTC
To: Linus Torvalds
Cc: Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers,
Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra,
Borislav Petkov

On Wed, Jul 8, 2015 at 10:55 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Jul 8, 2015 at 10:49 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>>
>> I don't know how to tell whether something is trying to use real mode,
>> but I can play this just fine in DOSEMU on my 64-bit laptop:
>
> So a 64-bit distro obviously will never have used vm86 mode - it
> doesn't work there. Never has. There's no sane way to get to vm86 mode
> from long mode, that's just how the 64-bit extensions worked.
>
> (64-bit hardware obviously does support vm86 mode, but you have to
> play games with mixing long mode and CPL0 32-bit protected mode to get
> there, and we never did that).

Eww. My sanity hurts just thinking about that.

We used to switch in and out of long mode for EFI mixed mode support,
but that's gone now, since it produced lovely triple faults if perf
was running. As far as I can tell, those triple faults are basically
unfixable without disabling all NMI sources across long mode switches,
and 32-bit EFI works just fine (i.e. as well as it ever did) in CPL0
compat mode.

So exiting long mode to enter v8086 mode is nuts. Entering v8086 mode
via VMX would be *much* better, but just not implementing it would be
better still.

>
> It's the 32-bit distros I would worry about. The ones that may have
> well disabled emulation, because they have vm86 mode enabled.
>

Fedora doesn't package dosemu at all, and Ubuntu turns off CONFIG_VM86
AFAIK. RPMFusion does package dosemu.

Dosemu has a --disable-cpuemu configure option. A quick check suggests
that neither RPMFusion, Gentoo, nor Arch sets that option (why would
they?).

So maybe there's a couple people with home-built --disable-cpuemu
DOSEMU versions on 32-bit kernels who have syscall auditing and
context tracking off. It's even plausible that some nonzero number of
them use new kernels, but I'd be kind of surprised.

Weighed against the fact that sys_vm86 under ptrace is probably a
minor security bug* in some circumstances, I don't think the case for
preserving vm86 support looks all that good.

OTOH, if someone were to actually complain, that would be a different
story. That's why I suggested marking it BROKEN instead of deleting it
outright.

* I'm planning on fixing that particular issue regardless on whether
CONFIG_VM86 is marked BROKEN.**

** I don't know enough about the mm innards to know whether
vm86_32.c's mark_screen_rdonly is a security bug, but poking at PTEs
belonging to user addresses without even trying to see what VMAs back
them doesn't look like a good thing... And I have no clue how to fix
that without an ABI break, even if that particular ABI break might not
affect dosemu.

--Andy

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Kees Cook @ 2015-07-08 18:53 UTC
To: Andy Lutomirski
Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski,
the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov,
Peter Zijlstra, Borislav Petkov

On Wed, Jul 8, 2015 at 11:47 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> Fedora doesn't package dosemu at all, and Ubuntu turns off CONFIG_VM86
> AFAIK. RPMFusion does package dosemu.

Just for reference, here's the config on latest Ubuntu:

http://kernel.ubuntu.com/git/ubuntu/ubuntu-vivid.git/tree/debian.master/config/config.common.ubuntu#n8204

Also Debian enables it:

http://anonscm.debian.org/viewvc/kernel/dists/trunk/linux/debian/config/kernelarch-x86/config-arch-32?view=markup

:(

-Kees

--
Kees Cook
Chrome OS Security

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Kees Cook @ 2015-07-08 18:48 UTC
To: Linus Torvalds
Cc: Andy Lutomirski, Arjan van de Ven, Andy Lutomirski,
the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov,
Peter Zijlstra, Borislav Petkov

On Wed, Jul 8, 2015 at 10:55 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Jul 8, 2015 at 10:49 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>>
>> I don't know how to tell whether something is trying to use real mode,
>> but I can play this just fine in DOSEMU on my 64-bit laptop:
>
> So a 64-bit distro obviously will never have used vm86 mode - it
> doesn't work there. Never has. There's no sane way to get to vm86 mode
> from long mode, that's just how the 64-bit extensions worked.
>
> (64-bit hardware obviously does support vm86 mode, but you have to
> play games with mixing long mode and CPL0 32-bit protected mode to get
> there, and we never did that).
>
> It's the 32-bit distros I would worry about. The ones that may have
> well disabled emulation, because they have vm86 mode enabled.

Speaking as the dosemu maintainer in Debian and Ubuntu, I can confirm
what Andy mentioned: dosemu will kick over to emulation if SYS_vm86
and SYS_vm86old fail. The other area I remember that used vm86 mode
was non-KMS Xorg drivers and anything using svgalib that tried to do
video card BIOS initialization.

Also, Andy, I think you weren't looking at i386 builds of Ubuntu.
Current Ubuntu, and 12.04 ("Precise") LTS (supported until 2017), and
14.04 LTS (until 2019) releases all have CONFIG_VM86.

-Kees

--
Kees Cook
Chrome OS Security
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-08 18:48 ` Kees Cook @ 2015-07-08 19:04 ` Andy Lutomirski 0 siblings, 0 replies; 41+ messages in thread From: Andy Lutomirski @ 2015-07-08 19:04 UTC (permalink / raw) To: Kees Cook Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Peter Zijlstra, Borislav Petkov On Wed, Jul 8, 2015 at 11:48 AM, Kees Cook <keescook@chromium.org> wrote: > On Wed, Jul 8, 2015 at 10:55 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> On Wed, Jul 8, 2015 at 10:49 AM, Andy Lutomirski <luto@amacapital.net> wrote: >>> >>> I don't know how to tell whether something is trying to use real mode, >>> but I can play this just fine in DOSEMU on my 64-bit laptop: >> >> So a 64-bit distro obviously will never have used vm86 mode - it >> doesn't work there. Never has. There's no sane way to get to vm86 mode >> from long mode, that's just how the 64-bit extensions worked. >> >> (64-bit hardware obviously does support vm86 mode, but you have to >> play games with mixing long mode and CPL0 32-bit protected mode to get >> there, and we never did that). >> >> It's the 32-bit distros I would worry about. The ones that may have >> well disabled emulation, because they have vm86 mode enabled. > > Speaking as the dosemu maintainer in Debian and Ubuntu, I can confirm > what Andy mentioned: dosemu will kick over to emulation if SYS_vm86 > and SYS_vm86old fail. The other area I remember that used vm86 mode > was non-KMS Xorg drivers and anything using svgalib that tried to do > video card BIOS initialization. Adam Jackson said on the Fedora list that everything uses x86emu these days. And haven't modern kernels already dropped most of the UMS support already? > > Also, Andy, I think you weren't looking at i386 builds of Ubuntu. > Current Ubuntu, and 12.04 ("Precise") LTS (supported until 2017), and > 14.04 LTS (until 2019) releases all have CONFIG_VM86. Hmm. 
I was going off something someone said on IRC. Apparently I should have double-checked.

If you have a test system easily available, can you see what happens if you try to do:

$ sudo auditctl -e 1
$ sudo auditctl -D  # just in case you had a "-a task,never" rule installed
$ dosemu

on a system with CONFIG_VM86=y? I bet it fails. Maybe it gets lucky due to the bogus vm86 asm code managing to explode with eax=-ENOSYS, triggering a fallback to emulation.

--Andy

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-08 17:55 ` Linus Torvalds 2015-07-08 18:47 ` Andy Lutomirski 2015-07-08 18:48 ` Kees Cook @ 2015-07-08 18:54 ` Austin S Hemmelgarn 2 siblings, 0 replies; 41+ messages in thread From: Austin S Hemmelgarn @ 2015-07-08 18:54 UTC (permalink / raw) To: Linus Torvalds, Andy Lutomirski Cc: Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov [-- Attachment #1: Type: text/plain, Size: 1478 bytes --] On 2015-07-08 13:55, Linus Torvalds wrote: > On Wed, Jul 8, 2015 at 10:49 AM, Andy Lutomirski <luto@amacapital.net> wrote: >> >> I don't know how to tell whether something is trying to use real mode, >> but I can play this just fine in DOSEMU on my 64-bit laptop: > > So a 64-bit distro obviously will never have used vm86 mode - it > doesn't work there. Never has. There's no sane way to get to vm86 mode > from long mode, that's just how the 64-bit extensions worked. > > (64-bit hardware obviously does support vm86 mode, but you have to > play games with mixing long mode and CPL0 32-bit protected mode to get > there, and we never did that). > > It's the 32-bit distros I would worry about. The ones that may have > well disabled emulation, because they have vm86 mode enabled. > Other than the enterprise distros (which _probably_ don't even have dosemu packages, and I'm 99% certain would have VM86 enabled only for 'backwards compatibility'), I highly doubt that there are any modern ones that have real-mode emulation disabled in dosemu, there's just too high of a chance of a security minded user building their own kernel with VM86 disabled (or they just have it disabled anyway in the distro kernel, Ubuntu does this, and I'm pretty sure that Debian and Fedora do also). FWIW, there's no easy way to disable such emulation on Gentoo (it is possible, it just requires some significant configuration file hacking for portage). 
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-08 17:30 ` Andy Lutomirski 2015-07-08 17:49 ` Andy Lutomirski @ 2015-07-08 19:05 ` Brian Gerst 2015-07-08 19:14 ` Andy Lutomirski 2015-07-10 14:12 ` Eric W. Biederman 2 siblings, 1 reply; 41+ messages in thread From: Brian Gerst @ 2015-07-08 19:05 UTC (permalink / raw) To: Andy Lutomirski Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Wed, Jul 8, 2015 at 1:30 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Wed, Jul 8, 2015 at 9:59 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@linux.intel.com> wrote: >>> >>> if this patch would not be acceptable, at minimum we need some sort of "off >>> by default >>> unless the sysadmin flips a sysfs thing", which is really just a huge hack. >> >> The only thing that matters is whether people use this or not. >> > > I think that the world contains precisely two programs that use the > vm86 syscalls. One is dosemu, and one is a test case I wrote. (There > are probably some exploits written by other people that I don't know > about. Certainly Spender has been patching vm86 for long enough that > he must have an exploit or two up his sleeve.) > > As far as I can tell (and I'll try to test this better for real later > this week), dosemu already knows how to emulate real mode if vm86 is > unavailable. So it's unclear that turning off the vm86 syscalls > actually breaks anything whatsoever. > > On the other hand, sys_vm86 fails if the syscall slow path is in use. > That means that quite a few Fedora versions (auditing), anything with > ptrace, seccomp (before 3.16 IIRC), and anything with context tracking > is probably actually *improved* by turning off the vm86 syscalls even > for dosemu users. > > And apparently Ubuntu has had CONFIG_VM86 disabled forever. 
> > IOW, vm86 really is broken. > >> If people use vm86 mode, we can't just disable it. It's that simple. >> "It's poorly maintained" isn't an argument for removal. Only "nobody >> cares" works as an argument for that. >> >> My suspicion is that people still do use vm86 mode, but who knows.. >> Quite frankly, rather than disable it, I'd much rather see people who >> modify low-level x86 code (yes, that means you, Luto) *test* it. If >> you aren't willign to test the modifications you make, I don't think >> those modifications should be merged, regardless of how nice a cleanup >> they are. > > I tried to test it. As far as I know, my changes in -tip have no > effect on vm86, and the changes I'm planning on sending this week will > make it work better. I still thing that Linux users should have it > configured out or deleted altogether. Especially people who care at > all about security. > > It's easy to try the easy case (run from tools/testing/selftests/x86) > -- this is v4.2-rc1, but most recent versions should be identical: > > $ ./entry_from_vm86_32 > [RUN] #BR from vm86 mode > [OK] Exited vm86 mode due to #BR > [RUN] SYSENTER from vm86 mode > [OK] Exited vm86 mode due to unhandled GP fault > > $ strace -e vm86 ./entry_from_vm86_32 > [RUN] #BR from vm86 mode > vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS > (Function not implemented) > [OK] Exited vm86 mode due to type 0, arg 0 > [RUN] SYSENTER from vm86 mode > vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS > (Function not implemented) > [OK] Exited vm86 mode due to type 0, arg 0 > > It only says "[OK]" because my test case isn't careful enough. That's > a failure. I suspect it was a much worse failure a couple versions > ago before my ENOSYS-reworking patch went in. > > Replace "-e vm86" with "-e write" and be puzzled. The failure mode is > really pretty bad. > > This only tests easy stuff. 
The integration between vm86 and fault > handling is truly awful and I don't even know how to approach testing > it. I'd probably have to run twenty or thirty old real-mode games to > even exercise those code paths. > > I'll try to confirm later this week that dosemu can really handle real > mode without sys_vm86. None of these issues are unfixable. As I said before, many of them can be resolved if vm86 is changed to use the normal syscall/exception exit paths. Give me a few days to finish off that patch set. -- Brian Gerst ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-08 19:05 ` Brian Gerst @ 2015-07-08 19:14 ` Andy Lutomirski 2015-07-08 19:39 ` Brian Gerst 0 siblings, 1 reply; 41+ messages in thread From: Andy Lutomirski @ 2015-07-08 19:14 UTC (permalink / raw) To: Brian Gerst Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Wed, Jul 8, 2015 at 12:05 PM, Brian Gerst <brgerst@gmail.com> wrote: > On Wed, Jul 8, 2015 at 1:30 PM, Andy Lutomirski <luto@amacapital.net> wrote: >> On Wed, Jul 8, 2015 at 9:59 AM, Linus Torvalds >> <torvalds@linux-foundation.org> wrote: >>> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@linux.intel.com> wrote: >>>> >>>> if this patch would not be acceptable, at minimum we need some sort of "off >>>> by default >>>> unless the sysadmin flips a sysfs thing", which is really just a huge hack. >>> >>> The only thing that matters is whether people use this or not. >>> >> >> I think that the world contains precisely two programs that use the >> vm86 syscalls. One is dosemu, and one is a test case I wrote. (There >> are probably some exploits written by other people that I don't know >> about. Certainly Spender has been patching vm86 for long enough that >> he must have an exploit or two up his sleeve.) >> >> As far as I can tell (and I'll try to test this better for real later >> this week), dosemu already knows how to emulate real mode if vm86 is >> unavailable. So it's unclear that turning off the vm86 syscalls >> actually breaks anything whatsoever. >> >> On the other hand, sys_vm86 fails if the syscall slow path is in use. >> That means that quite a few Fedora versions (auditing), anything with >> ptrace, seccomp (before 3.16 IIRC), and anything with context tracking >> is probably actually *improved* by turning off the vm86 syscalls even >> for dosemu users. 
>> >> And apparently Ubuntu has had CONFIG_VM86 disabled forever. >> >> IOW, vm86 really is broken. >> >>> If people use vm86 mode, we can't just disable it. It's that simple. >>> "It's poorly maintained" isn't an argument for removal. Only "nobody >>> cares" works as an argument for that. >>> >>> My suspicion is that people still do use vm86 mode, but who knows.. >>> Quite frankly, rather than disable it, I'd much rather see people who >>> modify low-level x86 code (yes, that means you, Luto) *test* it. If >>> you aren't willign to test the modifications you make, I don't think >>> those modifications should be merged, regardless of how nice a cleanup >>> they are. >> >> I tried to test it. As far as I know, my changes in -tip have no >> effect on vm86, and the changes I'm planning on sending this week will >> make it work better. I still thing that Linux users should have it >> configured out or deleted altogether. Especially people who care at >> all about security. >> >> It's easy to try the easy case (run from tools/testing/selftests/x86) >> -- this is v4.2-rc1, but most recent versions should be identical: >> >> $ ./entry_from_vm86_32 >> [RUN] #BR from vm86 mode >> [OK] Exited vm86 mode due to #BR >> [RUN] SYSENTER from vm86 mode >> [OK] Exited vm86 mode due to unhandled GP fault >> >> $ strace -e vm86 ./entry_from_vm86_32 >> [RUN] #BR from vm86 mode >> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS >> (Function not implemented) >> [OK] Exited vm86 mode due to type 0, arg 0 >> [RUN] SYSENTER from vm86 mode >> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS >> (Function not implemented) >> [OK] Exited vm86 mode due to type 0, arg 0 >> >> It only says "[OK]" because my test case isn't careful enough. That's >> a failure. I suspect it was a much worse failure a couple versions >> ago before my ENOSYS-reworking patch went in. >> >> Replace "-e vm86" with "-e write" and be puzzled. The failure mode is >> really pretty bad. 
>> >> This only tests easy stuff. The integration between vm86 and fault >> handling is truly awful and I don't even know how to approach testing >> it. I'd probably have to run twenty or thirty old real-mode games to >> even exercise those code paths. >> >> I'll try to confirm later this week that dosemu can really handle real >> mode without sys_vm86. > > None of these issues are unfixable. As I said before, many of them > can be resolved if vm86 is changed to use the normal syscall/exception > exit paths. Give me a few days to finish off that patch set. > I look forward to it. However: I imagine that, if you do this, you may need to be quite careful about an x86_32-ism. Currently, if you have a pt_regs pointer for the current entry and user_mode(regs) returns true, then regs == current_pt_regs(). If you let user mode run with EFLAGS.VM set with the normal tss.sp0, then this will no longer be true, as the extra-long entry-from-v8086 frame will shift pt_regs by a few bytes. I don't know whether this matters, but I can imagine it causing do_signal to explode. *shudder* Anyway, I'll send out my 32-bit cleanups for review soon. If it conflicts with your changes, it'll be easy to fix up. --Andy ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-08 19:14 ` Andy Lutomirski @ 2015-07-08 19:39 ` Brian Gerst 2015-07-08 19:59 ` Andy Lutomirski 0 siblings, 1 reply; 41+ messages in thread From: Brian Gerst @ 2015-07-08 19:39 UTC (permalink / raw) To: Andy Lutomirski Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Wed, Jul 8, 2015 at 3:14 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Wed, Jul 8, 2015 at 12:05 PM, Brian Gerst <brgerst@gmail.com> wrote: >> On Wed, Jul 8, 2015 at 1:30 PM, Andy Lutomirski <luto@amacapital.net> wrote: >>> On Wed, Jul 8, 2015 at 9:59 AM, Linus Torvalds >>> <torvalds@linux-foundation.org> wrote: >>>> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@linux.intel.com> wrote: >>>>> >>>>> if this patch would not be acceptable, at minimum we need some sort of "off >>>>> by default >>>>> unless the sysadmin flips a sysfs thing", which is really just a huge hack. >>>> >>>> The only thing that matters is whether people use this or not. >>>> >>> >>> I think that the world contains precisely two programs that use the >>> vm86 syscalls. One is dosemu, and one is a test case I wrote. (There >>> are probably some exploits written by other people that I don't know >>> about. Certainly Spender has been patching vm86 for long enough that >>> he must have an exploit or two up his sleeve.) >>> >>> As far as I can tell (and I'll try to test this better for real later >>> this week), dosemu already knows how to emulate real mode if vm86 is >>> unavailable. So it's unclear that turning off the vm86 syscalls >>> actually breaks anything whatsoever. >>> >>> On the other hand, sys_vm86 fails if the syscall slow path is in use. 
>>> That means that quite a few Fedora versions (auditing), anything with >>> ptrace, seccomp (before 3.16 IIRC), and anything with context tracking >>> is probably actually *improved* by turning off the vm86 syscalls even >>> for dosemu users. >>> >>> And apparently Ubuntu has had CONFIG_VM86 disabled forever. >>> >>> IOW, vm86 really is broken. >>> >>>> If people use vm86 mode, we can't just disable it. It's that simple. >>>> "It's poorly maintained" isn't an argument for removal. Only "nobody >>>> cares" works as an argument for that. >>>> >>>> My suspicion is that people still do use vm86 mode, but who knows.. >>>> Quite frankly, rather than disable it, I'd much rather see people who >>>> modify low-level x86 code (yes, that means you, Luto) *test* it. If >>>> you aren't willign to test the modifications you make, I don't think >>>> those modifications should be merged, regardless of how nice a cleanup >>>> they are. >>> >>> I tried to test it. As far as I know, my changes in -tip have no >>> effect on vm86, and the changes I'm planning on sending this week will >>> make it work better. I still thing that Linux users should have it >>> configured out or deleted altogether. Especially people who care at >>> all about security. 
>>> >>> It's easy to try the easy case (run from tools/testing/selftests/x86) >>> -- this is v4.2-rc1, but most recent versions should be identical: >>> >>> $ ./entry_from_vm86_32 >>> [RUN] #BR from vm86 mode >>> [OK] Exited vm86 mode due to #BR >>> [RUN] SYSENTER from vm86 mode >>> [OK] Exited vm86 mode due to unhandled GP fault >>> >>> $ strace -e vm86 ./entry_from_vm86_32 >>> [RUN] #BR from vm86 mode >>> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS >>> (Function not implemented) >>> [OK] Exited vm86 mode due to type 0, arg 0 >>> [RUN] SYSENTER from vm86 mode >>> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS >>> (Function not implemented) >>> [OK] Exited vm86 mode due to type 0, arg 0 >>> >>> It only says "[OK]" because my test case isn't careful enough. That's >>> a failure. I suspect it was a much worse failure a couple versions >>> ago before my ENOSYS-reworking patch went in. >>> >>> Replace "-e vm86" with "-e write" and be puzzled. The failure mode is >>> really pretty bad. >>> >>> This only tests easy stuff. The integration between vm86 and fault >>> handling is truly awful and I don't even know how to approach testing >>> it. I'd probably have to run twenty or thirty old real-mode games to >>> even exercise those code paths. >>> >>> I'll try to confirm later this week that dosemu can really handle real >>> mode without sys_vm86. >> >> None of these issues are unfixable. As I said before, many of them >> can be resolved if vm86 is changed to use the normal syscall/exception >> exit paths. Give me a few days to finish off that patch set. >> > > I look forward to it. > > However: I imagine that, if you do this, you may need to be quite > careful about an x86_32-ism. Currently, if you have a pt_regs pointer > for the current entry and user_mode(regs) returns true, then regs == > current_pt_regs(). 
If you let user mode run with EFLAGS.VM set with > the normal tss.sp0, then this will no longer be true, as the > extra-long entry-from-v8086 frame will shift pt_regs by a few bytes. > I don't know whether this matters, but I can imagine it causing > do_signal to explode. *shudder* I am aware that pt_regs is in a fixed location on the stack. What I plan to do is increase the padding at the top of the stack if VM86 is configured, to reserve space for the extra segment registers. Then it will move tss.sp0 up 16 bytes when entering vm86 mode so that the longer IRET frame is in the right place. -- Brian Gerst ^ permalink raw reply [flat|nested] 41+ messages in thread
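The 16-byte figure in Brian's plan follows from the hardware frame layout: an entry from ring-3 protected mode pushes five 32-bit words (SS, ESP, EFLAGS, CS, EIP), while an entry from virtual-8086 mode first pushes four more (GS, FS, DS, ES). The sketch below only works through that arithmetic. The struct and function names are illustrative and are not the kernel's; the point is that raising sp0 by the size difference makes both frames start at the same address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hardware-pushed frame for an entry from protected-mode user code,
 * lowest address first. */
struct iret_frame_user {
    uint32_t eip, cs, eflags, esp, ss;
};

/* Hardware-pushed frame for an entry from virtual-8086 mode:
 * the same five words plus four data-segment selectors above them. */
struct iret_frame_v8086 {
    uint32_t eip, cs, eflags, esp, ss;
    uint32_t es, ds, fs, gs;
};

/* The stack grows down, so a frame pushed at sp0 starts size bytes below it. */
uintptr_t frame_base(uintptr_t sp0, size_t frame_size)
{
    return sp0 - frame_size;
}

/* sp0 to use while running vm86 code so the longer frame starts at
 * the same address as a normal one (illustrative helper). */
uintptr_t vm86_sp0(uintptr_t normal_sp0)
{
    return normal_sp0 + (sizeof(struct iret_frame_v8086) -
                         sizeof(struct iret_frame_user));
}
```

With a normal sp0 of S, the user frame starts at S - 20; with sp0 raised to S + 16, the 36-byte v8086 frame also starts at S - 20, so anything laid out below the hardware frame stays where the rest of the kernel expects it.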
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-08 19:39 ` Brian Gerst @ 2015-07-08 19:59 ` Andy Lutomirski 2015-07-09 5:52 ` Ingo Molnar 0 siblings, 1 reply; 41+ messages in thread From: Andy Lutomirski @ 2015-07-08 19:59 UTC (permalink / raw) To: Brian Gerst Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Wed, Jul 8, 2015 at 12:39 PM, Brian Gerst <brgerst@gmail.com> wrote: > On Wed, Jul 8, 2015 at 3:14 PM, Andy Lutomirski <luto@amacapital.net> wrote: >> On Wed, Jul 8, 2015 at 12:05 PM, Brian Gerst <brgerst@gmail.com> wrote: >>> On Wed, Jul 8, 2015 at 1:30 PM, Andy Lutomirski <luto@amacapital.net> wrote: >>>> On Wed, Jul 8, 2015 at 9:59 AM, Linus Torvalds >>>> <torvalds@linux-foundation.org> wrote: >>>>> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@linux.intel.com> wrote: >>>>>> >>>>>> if this patch would not be acceptable, at minimum we need some sort of "off >>>>>> by default >>>>>> unless the sysadmin flips a sysfs thing", which is really just a huge hack. >>>>> >>>>> The only thing that matters is whether people use this or not. >>>>> >>>> >>>> I think that the world contains precisely two programs that use the >>>> vm86 syscalls. One is dosemu, and one is a test case I wrote. (There >>>> are probably some exploits written by other people that I don't know >>>> about. Certainly Spender has been patching vm86 for long enough that >>>> he must have an exploit or two up his sleeve.) >>>> >>>> As far as I can tell (and I'll try to test this better for real later >>>> this week), dosemu already knows how to emulate real mode if vm86 is >>>> unavailable. So it's unclear that turning off the vm86 syscalls >>>> actually breaks anything whatsoever. >>>> >>>> On the other hand, sys_vm86 fails if the syscall slow path is in use. 
>>>> That means that quite a few Fedora versions (auditing), anything with >>>> ptrace, seccomp (before 3.16 IIRC), and anything with context tracking >>>> is probably actually *improved* by turning off the vm86 syscalls even >>>> for dosemu users. >>>> >>>> And apparently Ubuntu has had CONFIG_VM86 disabled forever. >>>> >>>> IOW, vm86 really is broken. >>>> >>>>> If people use vm86 mode, we can't just disable it. It's that simple. >>>>> "It's poorly maintained" isn't an argument for removal. Only "nobody >>>>> cares" works as an argument for that. >>>>> >>>>> My suspicion is that people still do use vm86 mode, but who knows.. >>>>> Quite frankly, rather than disable it, I'd much rather see people who >>>>> modify low-level x86 code (yes, that means you, Luto) *test* it. If >>>>> you aren't willign to test the modifications you make, I don't think >>>>> those modifications should be merged, regardless of how nice a cleanup >>>>> they are. >>>> >>>> I tried to test it. As far as I know, my changes in -tip have no >>>> effect on vm86, and the changes I'm planning on sending this week will >>>> make it work better. I still thing that Linux users should have it >>>> configured out or deleted altogether. Especially people who care at >>>> all about security. 
>>>> >>>> It's easy to try the easy case (run from tools/testing/selftests/x86) >>>> -- this is v4.2-rc1, but most recent versions should be identical: >>>> >>>> $ ./entry_from_vm86_32 >>>> [RUN] #BR from vm86 mode >>>> [OK] Exited vm86 mode due to #BR >>>> [RUN] SYSENTER from vm86 mode >>>> [OK] Exited vm86 mode due to unhandled GP fault >>>> >>>> $ strace -e vm86 ./entry_from_vm86_32 >>>> [RUN] #BR from vm86 mode >>>> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS >>>> (Function not implemented) >>>> [OK] Exited vm86 mode due to type 0, arg 0 >>>> [RUN] SYSENTER from vm86 mode >>>> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS >>>> (Function not implemented) >>>> [OK] Exited vm86 mode due to type 0, arg 0 >>>> >>>> It only says "[OK]" because my test case isn't careful enough. That's >>>> a failure. I suspect it was a much worse failure a couple versions >>>> ago before my ENOSYS-reworking patch went in. >>>> >>>> Replace "-e vm86" with "-e write" and be puzzled. The failure mode is >>>> really pretty bad. >>>> >>>> This only tests easy stuff. The integration between vm86 and fault >>>> handling is truly awful and I don't even know how to approach testing >>>> it. I'd probably have to run twenty or thirty old real-mode games to >>>> even exercise those code paths. >>>> >>>> I'll try to confirm later this week that dosemu can really handle real >>>> mode without sys_vm86. >>> >>> None of these issues are unfixable. As I said before, many of them >>> can be resolved if vm86 is changed to use the normal syscall/exception >>> exit paths. Give me a few days to finish off that patch set. >>> >> >> I look forward to it. >> >> However: I imagine that, if you do this, you may need to be quite >> careful about an x86_32-ism. Currently, if you have a pt_regs pointer >> for the current entry and user_mode(regs) returns true, then regs == >> current_pt_regs(). 
If you let user mode run with EFLAGS.VM set with >> the normal tss.sp0, then this will no longer be true, as the >> extra-long entry-from-v8086 frame will shift pt_regs by a few bytes. >> I don't know whether this matters, but I can imagine it causing >> do_signal to explode. *shudder* > > I am aware that pt_regs is in a fixed location on the stack. What I > plan to do is increase the padding at the top of the stack if VM86 is > configured, to reserve space for the extra segment registers. Then it > will move tss.sp0 up 16 bytes when entering vm86 mode so that the > longer IRET frame is in the right place. >

Hmm, should work.

I wonder if the right way to do this is to set a TIF_VM86 flag and do the fixups in enter_from_user_mode and prepare_return_to_usermode. See the patches I just sent (and tip/x86/asm, which they apply to).

Without something like that, we'll be in the awkward position of having some of the selectors (DS, ES, FS, and GS) in both the normal pt_regs slot and in the extended hardware frame during execution of normal vm86-unaware kernel code. If, on the other hand, we copied the selectors across in enter_from_user_mode and prepare_return_from_usermode, then pt_regs would work normally even for tasks that are running in v8086 mode.

regs->flags & X86_EFLAGS_VM will be true regardless, so all of the asm that decides to invoke those helpers should work fine.

--Andy

^ permalink raw reply [flat|nested] 41+ messages in thread
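The selector-copying idea in Andy's message can be pictured with the sketch below. It is a guess at the shape of such fixups, not code from any posted patch: struct soft_regs stands in for the relevant part of pt_regs, struct v8086_hw_tail for the four extra words of the hardware frame, and both helper names are hypothetical. Only the EFLAGS.VM bit position (bit 17) is taken from the architecture.

```c
#include <stdint.h>

#define EFLAGS_VM 0x00020000u   /* EFLAGS bit 17: virtual-8086 mode */

/* Stand-in for the part of pt_regs that matters here (hypothetical). */
struct soft_regs {
    uint32_t ds, es, fs, gs;
    uint32_t flags;
};

/* The four selectors the CPU pushes above SS on entry from v8086 mode. */
struct v8086_hw_tail {
    uint32_t es, ds, fs, gs;
};

/* On entry from user mode: make the software-visible regs authoritative. */
void vm86_selectors_on_entry(struct soft_regs *regs,
                             const struct v8086_hw_tail *hw)
{
    if (!(regs->flags & EFLAGS_VM))
        return;                 /* ordinary task: nothing to copy */
    regs->ds = hw->ds;
    regs->es = hw->es;
    regs->fs = hw->fs;
    regs->gs = hw->gs;
}

/* Before returning to user mode: write any changes back for IRET. */
void vm86_selectors_on_exit(const struct soft_regs *regs,
                            struct v8086_hw_tail *hw)
{
    if (!(regs->flags & EFLAGS_VM))
        return;
    hw->ds = regs->ds;
    hw->es = regs->es;
    hw->fs = regs->fs;
    hw->gs = regs->gs;
}
```

Between the two calls, vm86-unaware code would see a single copy of each selector in the usual place, which is the property the message is after.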
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-08 19:59 ` Andy Lutomirski @ 2015-07-09 5:52 ` Ingo Molnar 2015-07-09 5:59 ` Ingo Molnar 0 siblings, 1 reply; 41+ messages in thread From: Ingo Molnar @ 2015-07-09 5:52 UTC (permalink / raw) To: Andy Lutomirski Cc: Brian Gerst, Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov * Andy Lutomirski <luto@amacapital.net> wrote: > >> I look forward to it. > >> > >> However: I imagine that, if you do this, you may need to be quite careful > >> about an x86_32-ism. Currently, if you have a pt_regs pointer for the > >> current entry and user_mode(regs) returns true, then regs == > >> current_pt_regs(). If you let user mode run with EFLAGS.VM set with the > >> normal tss.sp0, then this will no longer be true, as the extra-long > >> entry-from-v8086 frame will shift pt_regs by a few bytes. I don't know > >> whether this matters, but I can imagine it causing do_signal to explode. > >> *shudder* > > > > I am aware that pt_regs is in a fixed location on the stack. What I plan to > > do is increase the padding at the top of the stack if VM86 is configured, to > > reserve space for the extra segment registers. Then it will move tss.sp0 up > > 16 bytes when entering vm86 mode so that the longer IRET frame is in the right > > place. > > > > Hmm, should work. > > I wonder if the right way to do this is to set a TIF_VM86 flag and do the fixups > in enter_from_user_mode and prepare_return_to_usermode. See the patches I just > sent (and tip/x88/asm, which they apply to). > > Without something like that, we'll be in the awkward position of having some of > the selectors (DS, ES, FS, and GS) in both the normal pt_regs slot and in the > extended hardware frame during execution of normal vm86-unaware kernel code. 
> If, on the other hand, we copied the selectors across in enter_from_user_mode > and prepare_return_from_usermode, then pt_regs would work normally even for > tasks that are running in v8086 mode. > > regs->flags & X86_EFLAGS_VM will be true regardless, so all of the asm that > decides to invoke those helpers should work fine.

Btw., has anyone considered an entirely different approach: using KVM's instruction emulator to emulate vm86 16-bit code execution? Basically the vm86 system call would be kept compatible, but fully emulated: the CPU never enters true 16-bit mode, it just iterates pt_regs as if it had.

This approach has four main advantages:

 - we could remove the fragile vm86 code from the entry code

 - it might even be faster for certain workloads than faulting in and out all the time and using an ancient, fragile hardware mode of the CPU. (For example it could detect the VGA screen write patterns and accelerate them.)

 - it could be made to work on 64-bit as well, FWIW

 - it would provide another angle of testing for the KVM emulator

Hm?

Thanks,

Ingo

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-09 5:52 ` Ingo Molnar @ 2015-07-09 5:59 ` Ingo Molnar 2015-07-09 18:33 ` Andy Lutomirski 0 siblings, 1 reply; 41+ messages in thread From: Ingo Molnar @ 2015-07-09 5:59 UTC (permalink / raw) To: Andy Lutomirski Cc: Brian Gerst, Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov * Ingo Molnar <mingo@kernel.org> wrote: > > Without something like that, we'll be in the awkward position of having some > > of the selectors (DS, ES, FS, and GS) in both the normal pt_regs slot and in > > the extended hardware frame during execution of normal vm86-unaware kernel > > code. If, on the other hand, we copied the selectors across in > > enter_from_user_mode and prepare_return_from_usermode, then pt_regs would work > > normally even for tasks that are running in v8086 mode. > > > > regs->flags & X86_EFLAGS_VM will be true regardless, so all of the asm that > > decides to invoke those helpers should work fine. > > Btw., has anyone considered an entirely different approach: using KVM's > instruction emulator to emulate vm86 16-bit code execution? Basically the vm86 > system call would be kept compatible, but fully emulated, the CPU never enters > true 16-bit mode, just iterates pt_regs as if it had. > > This approach has four main advantages: > > - we could remove the fragile vm86 code from the entry code > > - it might even be faster for certain workloads than faulting in and out all > the time and using ancient, fragile hardware mode of the CPU. (For example it > could detect the VGA screen write patterns and accelerate them.) 
> > - it could be made to work on 64-bit as well, FWIIW > > - it would provide another angle of testing for the KVM emulator So there's a fifth advantage as well that I think needs to be stressed: - it's an _obviously_ much more secure design, as we only iterate user-space pt_regs and never truly touch any security relevant CPU state. The whole nested pt_regs and different hw frame entry complications would go away entirely. All CPU semantics would not be just assumed implicitly, but would be very much present in the CPU emulator and would be reviewable. Thanks, Ingo ^ permalink raw reply [flat|nested] 41+ messages in thread
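To make "just iterates pt_regs" concrete, here is a toy interpreter loop for real-mode code. It is a sketch under heavy assumptions and has nothing to do with KVM's actual emulator: it handles only five opcodes (NOP, INC AX, MOV AX,imm16, INT n, HLT), does not model flags, and wraps addresses at 1 MiB as an 8086 would. All names (rm_regs, rm_run and so on) are invented for the example. The point is that guest state lives in an ordinary structure and the CPU never leaves its normal mode.

```c
#include <stddef.h>
#include <stdint.h>

#define RM_MEM_SIZE (1u << 20)          /* 1 MiB of guest memory */

struct rm_regs { uint16_t ax, cs, ip; };    /* tiny register subset */

enum rm_exit { RM_EXIT_HLT, RM_EXIT_INT, RM_EXIT_UNHANDLED, RM_EXIT_BUDGET };

/* Real-mode address translation: segment * 16 + offset, wrapped at 1 MiB. */
static uint32_t rm_linear(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & (RM_MEM_SIZE - 1);
}

/* Fetch one byte at CS:IP and advance IP (16-bit wraparound). */
static uint8_t rm_fetch8(const uint8_t *mem, struct rm_regs *r)
{
    uint8_t b = mem[rm_linear(r->cs, r->ip)];
    r->ip = (uint16_t)(r->ip + 1);
    return b;
}

/* Fetch a little-endian 16-bit immediate. */
static uint16_t rm_fetch16(const uint8_t *mem, struct rm_regs *r)
{
    uint16_t lo = rm_fetch8(mem, r);
    uint16_t hi = rm_fetch8(mem, r);
    return (uint16_t)(lo | (hi << 8));
}

/*
 * Run at most max_insns instructions. Software interrupts are not
 * handled here; they are reported to the caller through *int_no.
 */
enum rm_exit rm_run(const uint8_t *mem, struct rm_regs *r,
                    size_t max_insns, uint8_t *int_no)
{
    for (size_t i = 0; i < max_insns; i++) {
        uint8_t op = rm_fetch8(mem, r);
        switch (op) {
        case 0x90:                              /* NOP */
            break;
        case 0x40:                              /* INC AX (flags ignored) */
            r->ax = (uint16_t)(r->ax + 1);
            break;
        case 0xB8:                              /* MOV AX, imm16 */
            r->ax = rm_fetch16(mem, r);
            break;
        case 0xCD:                              /* INT n */
            *int_no = rm_fetch8(mem, r);
            return RM_EXIT_INT;
        case 0xF4:                              /* HLT */
            return RM_EXIT_HLT;
        default:
            r->ip = (uint16_t)(r->ip - 1);      /* point IP back at the opcode */
            return RM_EXIT_UNHANDLED;
        }
    }
    return RM_EXIT_BUDGET;
}
```

Because every instruction's effect is spelled out in the switch, the semantics are reviewable in the way Ingo argues for; the cost, as the replies note, is that the full instruction set and the vm86plus behaviour would all have to be written down.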
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-09 5:59 ` Ingo Molnar @ 2015-07-09 18:33 ` Andy Lutomirski 2015-07-10 11:16 ` Paolo Bonzini 0 siblings, 1 reply; 41+ messages in thread From: Andy Lutomirski @ 2015-07-09 18:33 UTC (permalink / raw) To: Ingo Molnar, Paolo Bonzini Cc: Brian Gerst, Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Wed, Jul 8, 2015 at 10:59 PM, Ingo Molnar <mingo@kernel.org> wrote: > > * Ingo Molnar <mingo@kernel.org> wrote: > >> > Without something like that, we'll be in the awkward position of having some >> > of the selectors (DS, ES, FS, and GS) in both the normal pt_regs slot and in >> > the extended hardware frame during execution of normal vm86-unaware kernel >> > code. If, on the other hand, we copied the selectors across in >> > enter_from_user_mode and prepare_return_from_usermode, then pt_regs would work >> > normally even for tasks that are running in v8086 mode. >> > >> > regs->flags & X86_EFLAGS_VM will be true regardless, so all of the asm that >> > decides to invoke those helpers should work fine. >> >> Btw., has anyone considered an entirely different approach: using KVM's >> instruction emulator to emulate vm86 16-bit code execution? Basically the vm86 >> system call would be kept compatible, but fully emulated, the CPU never enters >> true 16-bit mode, just iterates pt_regs as if it had. >> >> This approach has four main advantages: >> >> - we could remove the fragile vm86 code from the entry code >> >> - it might even be faster for certain workloads than faulting in and out all >> the time and using ancient, fragile hardware mode of the CPU. (For example it >> could detect the VGA screen write patterns and accelerate them.) 
>> >> - it could be made to work on 64-bit as well, FWIIW >> >> - it would provide another angle of testing for the KVM emulator > > So there's a fifth advantage as well that I think needs to be stressed: > > - it's an _obviously_ much more secure design, as we only iterate user-space > pt_regs and never truly touch any security relevant CPU state. The whole > nested pt_regs and different hw frame entry complications would go away > entirely. All CPU semantics would not be just assumed implicitly, but would > be very much present in the CPU emulator and would be reviewable. > Hmm. If we did this, I think I'd prefer a slightly more general approach. First teach KVM to support a mode in which it's purely an emulator (Paolo: how hard is this? It would also make testing the emulator much easier). Then re-implement vm86 on top of that. The big downside of that, or of writing a more ad-hoc emulator, is understanding what the semantics of all the weird vm86plus stuff is supposed to be in the first place. It's completely undocumented and it's not at all obvious what it's all supposed to do. This sounds like a fairly large project. I think I'd rather get all the distros to turn vm86 off and let it slowly die in a dark corner. After all, dosemu and vbetool both already contain emulators that seem to work, and dosbox (which is, by all reports, better than dosemu) never used vm86 in the first place. --Andy ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-09 18:33 ` Andy Lutomirski @ 2015-07-10 11:16 ` Paolo Bonzini 2015-07-10 14:13 ` Ingo Molnar 2015-07-10 14:39 ` Andy Lutomirski 0 siblings, 2 replies; 41+ messages in thread From: Paolo Bonzini @ 2015-07-10 11:16 UTC (permalink / raw) To: Andy Lutomirski, Ingo Molnar Cc: Brian Gerst, Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On 09/07/2015 20:33, Andy Lutomirski wrote: > On Wed, Jul 8, 2015 at 10:59 PM, Ingo Molnar <mingo@kernel.org> wrote: >> >> * Ingo Molnar <mingo@kernel.org> wrote: >> >>>> Without something like that, we'll be in the awkward position of having some >>>> of the selectors (DS, ES, FS, and GS) in both the normal pt_regs slot and in >>>> the extended hardware frame during execution of normal vm86-unaware kernel >>>> code. If, on the other hand, we copied the selectors across in >>>> enter_from_user_mode and prepare_return_from_usermode, then pt_regs would work >>>> normally even for tasks that are running in v8086 mode. >>>> >>>> regs->flags & X86_EFLAGS_VM will be true regardless, so all of the asm that >>>> decides to invoke those helpers should work fine. >>> >>> Btw., has anyone considered an entirely different approach: using KVM's >>> instruction emulator to emulate vm86 16-bit code execution? Basically the vm86 >>> system call would be kept compatible, but fully emulated, the CPU never enters >>> true 16-bit mode, just iterates pt_regs as if it had. >>> >>> This approach has four main advantages: >>> >>> - we could remove the fragile vm86 code from the entry code >>> >>> - it might even be faster for certain workloads than faulting in and out all >>> the time and using ancient, fragile hardware mode of the CPU. (For example it >>> could detect the VGA screen write patterns and accelerate them.) 
>>> >>> - it could be made to work on 64-bit as well, FWIIW >>> >>> - it would provide another angle of testing for the KVM emulator >> >> So there's a fifth advantage as well that I think needs to be stressed: >> >> - it's an _obviously_ much more secure design, as we only iterate user-space >> pt_regs and never truly touch any security relevant CPU state. The whole >> nested pt_regs and different hw frame entry complications would go away >> entirely. All CPU semantics would not be just assumed implicitly, but would >> be very much present in the CPU emulator and would be reviewable. >> > > Hmm. > > If we did this, I think I'd prefer a slightly more general approach. > First teach KVM to support a mode in which it's purely an emulator > (Paolo: how hard is this? It would also make testing the emulator > much easier). This isn't hard, at least for Intel: make emulation_required() return true always (and fix the fallout). However, it's not necessary. The emulator is designed to be independent from the rest of KVM. At some point I think Avi was testing it in userspace (or planning to do so). So you would just move it from arch/x86/kvm to arch/x86/emulate. The obvious downside is that the emulator isn't really designed for speed. In KVM it's currently 1000-1500 times slower than the real thing. Even if you modified it to remove the KVM overhead (vm86 is just running ring 3 code; no interrupts and no pagetables to walk), it probably would take 300-500 cycles to execute one instruction. But it's doable. > The big downside of that, or of writing a more ad-hoc emulator, is > understanding what the semantics of all the weird vm86plus stuff is > supposed to be in the first place. Do you mean VIF/VIP and the other vm86 mode extensions? Or is vm86plus something in Linux? Paolo > It's completely undocumented and > it's not at all obvious what it's all supposed to do. This sounds > like a fairly large project. 
> > I think I'd rather get all the distros to turn vm86 off and let it > slowly die in a dark corner. After all, dosemu and vbetool both > already contain emulators that seem to work, and dosbox (which is, by > all reports, better than dosemu) never used vm86 in the first place. > > --Andy > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 11:16 ` Paolo Bonzini @ 2015-07-10 14:13 ` Ingo Molnar 2015-07-10 14:24 ` Paolo Bonzini 2015-07-10 14:39 ` Andy Lutomirski 1 sibling, 1 reply; 41+ messages in thread From: Ingo Molnar @ 2015-07-10 14:13 UTC (permalink / raw) To: Paolo Bonzini Cc: Andy Lutomirski, Brian Gerst, Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov * Paolo Bonzini <pbonzini@redhat.com> wrote: > > Hmm. > > > > If we did this, I think I'd prefer a slightly more general approach. First > > teach KVM to support a mode in which it's purely an emulator (Paolo: how hard > > is this? It would also make testing the emulator much easier). > > This isn't hard, at least for Intel: make emulation_required() return true > always (and fix the fallout). However, it's not necessary. The emulator is > designed to be independent from the rest of KVM. At some point I think Avi was > testing it in userspace (or planning to do so). So you would just move it from > arch/x86/kvm to arch/x86/emulate. Very nice! > The obvious downside is that the emulator isn't really designed for speed. > > In KVM it's currently 1000-1500 times slower than the real thing. Even if you > modified it to remove the KVM overhead (vm86 is just running ring 3 code; no > interrupts and no pagetables to walk), it probably would take 300-500 cycles to > execute one instruction. This needs to be tested, but I wouldn't expect it to be a big issue: - if anyone cares they can improve its performance - or worst case they can upgrade their tool to something newer which will use user-space emulation of 16-bit code anyway ... - Furthermore I suspect with vm86 we'd trap out of vm86 mode rather often - and a single trap can take thousands of cycles. So I suspect the effective slowdown depends on the workload. 
- In the absolute worst case it will perform like a really old CPU. Thanks, Ingo ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 14:13 ` Ingo Molnar @ 2015-07-10 14:24 ` Paolo Bonzini 0 siblings, 0 replies; 41+ messages in thread From: Paolo Bonzini @ 2015-07-10 14:24 UTC (permalink / raw) To: Ingo Molnar Cc: Andy Lutomirski, Brian Gerst, Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On 10/07/2015 16:13, Ingo Molnar wrote: > > This isn't hard, at least for Intel: make emulation_required() return true > > always (and fix the fallout). However, it's not necessary. The emulator is > > designed to be independent from the rest of KVM. At some point I think Avi was > > testing it in userspace (or planning to do so). So you would just move it from > > arch/x86/kvm to arch/x86/emulate. > > Very nice! Thanks. :) Mostly on behalf of the former maintainers---and the Xen folks too, the emulator has its roots there. So, the starting point for hooking into the emulator is struct x86_emulate_ops (in asm/kvm_emulate.h) and the function that calls into it in KVM is x86_emulate_instruction. You can look there to see how the emulator can be used. If it doesn't compile straight away in userspace, I'll gladly accept patches. There are parts of emulation that are actually done (for simplicity and laziness) in x86_emulate_instruction rather than emulate.c, most notably hardware debugging support, but these aren't really needed for an initial prototype of vm86. A lot of the stuff in x86_emulate_instruction isn't necessary for vm86 and can be WARN()ed away, because for example IN/OUT always cause a #GP in vm86 mode. Paolo ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 11:16 ` Paolo Bonzini 2015-07-10 14:13 ` Ingo Molnar @ 2015-07-10 14:39 ` Andy Lutomirski 1 sibling, 0 replies; 41+ messages in thread From: Andy Lutomirski @ 2015-07-10 14:39 UTC (permalink / raw) To: Paolo Bonzini Cc: Ingo Molnar, Brian Gerst, Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 4:16 AM, Paolo Bonzini <pbonzini@redhat.com> wrote: > > > On 09/07/2015 20:33, Andy Lutomirski wrote: >> On Wed, Jul 8, 2015 at 10:59 PM, Ingo Molnar <mingo@kernel.org> wrote: > >> The big downside of that, or of writing a more ad-hoc emulator, is >> understanding what the semantics of all the weird vm86plus stuff is >> supposed to be in the first place. > > Do you mean VIF/VIP and the other vm86 mode extensions? Or is vm86plus > something in Linux? Something in Linux written for DOSEMU's benefit. I don't really understand what it encompasses. Oddly, Linux doesn't use the virtual mode extensions. Instead, it emulates them (but probably not very well). So STI manipulates a fake VIF flag and checks a fake VIP flag. There's also a huge hack involving 0xA0000. --Andy ^ permalink raw reply [flat|nested] 41+ messages in thread
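The "fake VIF / fake VIP" behaviour described above can be sketched roughly as follows. This is only an illustration of the general idea, not the kernel's code: the struct and the function names are hypothetical, and the only architectural facts assumed are the EFLAGS bit positions of VIF (bit 19) and VIP (bit 20).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions for the virtual interrupt flags. */
#define EFLAGS_VIF 0x00080000u /* bit 19: virtual interrupt flag    */
#define EFLAGS_VIP 0x00100000u /* bit 20: virtual interrupt pending */

/* Hypothetical software copy of the guest's flags (an assumption for this
 * sketch; it is not a structure taken from vm86_32.c). */
struct guest_flags {
    uint32_t veflags;
};

/* Emulated CLI: only the software VIF bit is cleared. */
static void emulate_cli(struct guest_flags *g)
{
    g->veflags &= ~EFLAGS_VIF;
}

/* Emulated STI: set the software VIF bit, then report whether a virtual
 * interrupt is pending, in which case the caller would leave the guest
 * and let the user-space monitor deliver it. */
static bool emulate_sti(struct guest_flags *g)
{
    g->veflags |= EFLAGS_VIF;
    return (g->veflags & EFLAGS_VIP) != 0;
}

/* Small usage walk-through. */
static void sti_demo(void)
{
    struct guest_flags g = { 0 };

    assert(!emulate_sti(&g));            /* nothing pending: keep running  */
    assert(g.veflags & EFLAGS_VIF);      /* virtual interrupts now enabled */

    emulate_cli(&g);
    assert(!(g.veflags & EFLAGS_VIF));

    g.veflags |= EFLAGS_VIP;             /* monitor marks an interrupt pending */
    assert(emulate_sti(&g));             /* STI now forces an exit to the monitor */
}
```

The point of the sketch is that the real interrupt flag is never touched; the guest only ever sees and changes the software copy.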
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-08 17:30 ` Andy Lutomirski 2015-07-08 17:49 ` Andy Lutomirski 2015-07-08 19:05 ` Brian Gerst @ 2015-07-10 14:12 ` Eric W. Biederman 2015-07-10 14:37 ` Andy Lutomirski 2 siblings, 1 reply; 41+ messages in thread From: Eric W. Biederman @ 2015-07-10 14:12 UTC (permalink / raw) To: Andy Lutomirski Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov Andy Lutomirski <luto@amacapital.net> writes: > On Wed, Jul 8, 2015 at 9:59 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@linux.intel.com> wrote: >>> >>> if this patch would not be acceptable, at minimum we need some sort of "off >>> by default >>> unless the sysadmin flips a sysfs thing", which is really just a huge hack. >> >> The only thing that matters is whether people use this or not. >> > > I think that the world contains precisely two programs that use the > vm86 syscalls. One is dosemu, and one is a test case I wrote. Wine used to also call vm86. > As far as I can tell (and I'll try to test this better for real later > this week), dosemu already knows how to emulate real mode if vm86 is > unavailable. So it's unclear that turning off the vm86 syscalls > actually breaks anything whatsoever. Yes. This happened after 64bit kernels became common years ago, as the lack of vm86 on 64bit nearly killed the dosemu project. > On the other hand, sys_vm86 fails if the syscall slow path is in use. > That means that quite a few Fedora versions (auditing), anything with > ptrace, seccomp (before 3.16 IIRC), and anything with context tracking > is probably actually *improved* by turning off the vm86 syscalls even > for dosemu users. Is there any chance that vm86 is sufficiently badly broken before this that we can conclude vm86 is not in use? 
It would really simplify this discussion if we could point to code rot and say that it is clear that no one has been testing this code path for ages, and that the code can't possibly work the way it is now. That would just let us remove vm86. > It only says "[OK]" because my test case isn't careful enough. That's > a failure. I suspect it was a much worse failure a couple versions > ago before my ENOSYS-reworking patch went in. > > I'll try to confirm later this week that dosemu can really handle real > mode without sys_vm86. I have not looked in ages but certainly on 64bit dosemu can. As someone else pointed out dosemu maps the zero page so that may also be a point where vm86 support gets broken. Eric ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 14:12 ` Eric W. Biederman @ 2015-07-10 14:37 ` Andy Lutomirski 2015-07-10 16:35 ` Linus Torvalds 0 siblings, 1 reply; 41+ messages in thread From: Andy Lutomirski @ 2015-07-10 14:37 UTC (permalink / raw) To: Eric W. Biederman Cc: Linus Torvalds, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 7:12 AM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Andy Lutomirski <luto@amacapital.net> writes: > >> On the other hand, sys_vm86 fails if the syscall slow path is in use. >> That means that quite a few Fedora versions (auditing), anything with >> ptrace, seccomp (before 3.16 IIRC), and anything with context tracking >> is probably actually *improved* by turning off the vm86 syscalls even >> for dosemu users. > > Is there any chance that vm86 is sufficiently badly broken before this > that we can conclude vm86 is not in use? It would really simplify this > discussion if we could point to code rot and say that it is clear that > no one has been testing this code path for ages, and that the code can't > possibly work the way it is now. That would just let us remove vm86. > Having just written a pile of tests for it, I don't think so, as long as none of the syscall slow path stuff is in use :( >> It only says "[OK]" because my test case isn't careful enough. That's >> a failure. I suspect it was a much worse failure a couple versions >> ago before my ENOSYS-reworking patch went in. >> >> I'll try to confirm later this week that dosemu can really handle real >> mode without sys_vm86. > > I have not looked in ages but certainly on 64bit dosemu can. > > As someone else pointed out dosemu maps the zero page so that may also > be a point where vm86 support gets broken. Right. 
And someone pointed out that vbetool sometimes needs access to virtual (or emulated virtual) addresses above 3GB, and vm86 can't do that. --Andy ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 14:37 ` Andy Lutomirski @ 2015-07-10 16:35 ` Linus Torvalds 2015-07-10 16:44 ` Andy Lutomirski 0 siblings, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2015-07-10 16:35 UTC (permalink / raw) To: Andy Lutomirski Cc: Eric W. Biederman, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 7:37 AM, Andy Lutomirski <luto@amacapital.net> wrote: > > Having just written a pile of tests for it, I don't think so, as long as none > of the syscall slow path stuff is in use :( It seems that you are thinking that people actually use vm86 mode as a real Linux mode, and do system calls from it etc. I'm sure that has happened in some crazy situation (people doing some random pseudo-BIOS etc), but it's not the common situation at all. The common situation is that you enter vm86 mode with vm86(), and that you exit it due to one of the (many) unhandled situations or a signal or whatever. Yeah, we handle a few sad instructions directly, but most vm86 exits just return to user mode. The system call paths just aren't an issue in reality, because they just aren't used. And I'm personally violently against Ingo's idea of emulating this with an instruction emulator. Hell no. That's what user mode does, and it's fine there. In the kernel, we either support the hardware vm86 mode, or we phase it out because we can show that nobody uses it any more. None of that "let's emulate it in software" crud. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 16:35 ` Linus Torvalds @ 2015-07-10 16:44 ` Andy Lutomirski 2015-07-10 17:04 ` Linus Torvalds 0 siblings, 1 reply; 41+ messages in thread From: Andy Lutomirski @ 2015-07-10 16:44 UTC (permalink / raw) To: Linus Torvalds Cc: Eric W. Biederman, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 9:35 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, Jul 10, 2015 at 7:37 AM, Andy Lutomirski <luto@amacapital.net> wrote: >> >> Having just written a pile of tests for it, I don't think so, as long as none >> of the syscall slow path stuff is in use :( > > It seems that you are thinking that people actually use vm86 mode as a > real Linux mode, and do system calls from it etc. Nope. > > The common situation is that you enter vm86 mode with vm86(), and that > you exit it due to one of the (many) unhandled situations or a signal > or whatever. Yeah,we handle a few sad instructions directly, but most > vm86 exits just return to user mode. > > The system call paths just aren't an issue in reality, because they > just aren't used. > That's not what I mean. I'm referring to the vm86 syscall itself. If you have a ti flag that causes the slow exit path to be used, then you call vm86. vm86 sets up the ludicrous double stack frame that it uses and jumps back to the exit asm. The exit asm then branches off to the slow path, hits the notifysig_v86 kludge, calls save_v86_state, tears down its double stack frame, and keeps meandering back through the exit asm. We finally IRET right back to protected mode, and the code that userspace was trying to execute in v8086 mode never actually runs. That code looked fishy when I first read it, and it is, indeed, entirely incorrect. So the vm86 syscall itself is broken if the slow path is in use. Fortunately, you can't do a syscall inside vm86.
If you could, I think it would be a disaster, because the double stack means that the syscall would run in a completely bogus context. --Andy ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 16:44 ` Andy Lutomirski @ 2015-07-10 17:04 ` Linus Torvalds 2015-07-10 17:13 ` Andy Lutomirski 0 siblings, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2015-07-10 17:04 UTC (permalink / raw) To: Andy Lutomirski Cc: Eric W. Biederman, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 9:44 AM, Andy Lutomirski <luto@amacapital.net> wrote: > > That's not what I mean. I'm referring to the vm86 syscall itself. If > you have a ti flag that causes the slow exit path to be used, then you > call vm86. vm86 sets up the ludicrous double stack frame that it uses > and jumps back to the exit asm. The exit asm then branches off to the > slow path, hits the notifysig_v86 kludge, calls save_v86_state, tears > down its double stack frame, and keeps meandering back through the > exit asm. We finally IRET right back to protected mode, and the code > that userspace was trying to execute in v8086 mode never actually > runs. So? Yes, we exit vm86 mode if anything odd happens. That's very much part of the whole vm86() model. If the kernel needs to do anything, it saves off the vm86 state and returns to regular 32-bit mode. That's how it's designed to be. What's your point? The user mode "vm86 hypervisor" will call vm86() in a loop. Always has. Always will. And yes, that can mean that you never execute even a single instruction in vm86 mode, if one of the "we have other work to do" flags are set. Maybe a signal came in. Maybe just a delayed work happened. Maybe it has nothing to do with user space, and we *could* have returned to vm86 mode, but the thing is, that code sequence is _designed_ that way - it's very much minimizing the impact of vm86 mode. Pretty much the *only* thing we ever do with the vm86 stack still active is reschedule. 
Pretty much *any* other context change issue will get rid of the vm86 mode in kernel space, saving back the state to user space so that user space can try again. And it was done that way to minimize the vm86 impact on the rest of the kernel. Basically there's a few hooks in a couple of traps that say "ok, let's handle this case for vm86 mode", and there's the "let's reschedule without exiting the user vm86 state", but the code is designed so that we'll just say "screw it, the user can restart, we'll go back to normal 32-bit code because something else than just plain returning to vm86 mode happened". vm86() mode is not some kind of "run this DOS program to completion". It's exactly like a (very stupid) vmx mode. There are exit conditions, and while many of them are about the code it executes, equally many of them are "oh, we may have some event that cannot be handled in vm86 mode like a signal happened" etc. So yes, if the thread work flags are set, we never enter vm86 mode. BUT THAT'S EXACTLY WHAT SHOULD HAPPEN. It worries me that you think these kinds of fundamental issues are completely broken. No, I wouldn't be surprised at all if there is actual breakage, just because vm86 mode clearly gets very little testing, but the things you have pointed out as "broken" really haven't been as far as I can tell. And yes, if you enable system call auditing, and you actually audit the vm86 mode system call, that probably causes an exit condition, which means that you can't actually run vm86 mode and make progress if you audit that system call. Big f*cking deal. People who enable system call auditing break many more important things (eg basic performance) that that isn't even an argument. Do you really think that people who wanted to run DOS games at hardware speeds wanted to _audit_ those games? No. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread
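The "call vm86() in a loop" model described above can be sketched as a small simulation. Everything here is an assumption made for illustration: `enter_guest()` is a stand-in for the real vm86() call, the exit reasons are hypothetical and are not the kernel's actual return codes, and the guest is just a scripted list of exits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exit reasons for this sketch (not the kernel's VM86_* codes). */
enum guest_exit {
    EXIT_PENDING_WORK, /* kernel had other work, e.g. a signal; guest made no progress */
    EXIT_UNHANDLED,    /* guest did something the kernel does not handle itself        */
    EXIT_HALT          /* guest is finished (exists only in this simulation)           */
};

/* A fake guest that simply replays a scripted sequence of exits. */
struct fake_guest {
    const enum guest_exit *script;
    size_t len;
    size_t pos;
};

/* Stand-in for the vm86() call: returns the next scripted exit reason. */
static enum guest_exit enter_guest(struct fake_guest *g)
{
    if (g->pos >= g->len)
        return EXIT_HALT;
    return g->script[g->pos++];
}

struct monitor_stats {
    unsigned entries;  /* how many times the monitor entered the guest  */
    unsigned retries;  /* exits where the only action was to re-enter   */
    unsigned emulated; /* exits the monitor would handle in user space  */
};

/* The user-mode monitor: enter, look at why we came back, enter again. */
static struct monitor_stats run_monitor(struct fake_guest *g)
{
    struct monitor_stats s = { 0, 0, 0 };

    for (;;) {
        enum guest_exit why = enter_guest(g);
        s.entries++;

        switch (why) {
        case EXIT_PENDING_WORK:
            s.retries++;   /* nothing to do: just try again */
            break;
        case EXIT_UNHANDLED:
            s.emulated++;  /* a real monitor would emulate here, then re-enter */
            break;
        case EXIT_HALT:
            return s;
        }
    }
}

/* Small usage walk-through. */
static void monitor_demo(void)
{
    static const enum guest_exit script[] = {
        EXIT_PENDING_WORK, EXIT_UNHANDLED, EXIT_PENDING_WORK
    };
    struct fake_guest g = { script, 3, 0 };
    struct monitor_stats s = run_monitor(&g);

    assert(s.entries == 4);  /* three scripted exits plus the final halt */
    assert(s.retries == 2);
    assert(s.emulated == 1);
}
```

In this model an entry that returns without running any guest code costs only one more trip around the loop. A condition that is set on every entry (the strace case raised elsewhere in the thread) would correspond to a script that never stops returning `EXIT_PENDING_WORK`.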
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 17:04 ` Linus Torvalds @ 2015-07-10 17:13 ` Andy Lutomirski 2015-07-10 17:39 ` Linus Torvalds 0 siblings, 1 reply; 41+ messages in thread From: Andy Lutomirski @ 2015-07-10 17:13 UTC (permalink / raw) To: Linus Torvalds Cc: Eric W. Biederman, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 10:04 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, Jul 10, 2015 at 9:44 AM, Andy Lutomirski <luto@amacapital.net> wrote: >> >> That's not what I mean. I'm referring to the vm86 syscall itself. If >> you have a ti flag that causes the slow exit path to be used, then you >> call vm86. vm86 sets up the ludicrous double stack frame that it uses >> and jumps back to the exit asm. The exit asm then branches off to the >> slow path, hits the notifysig_v86 kludge, calls save_v86_state, tears >> down its double stack frame, and keeps meandering back through the >> exit asm. We finally IRET right back to protected mode, and the code >> that userspace was trying to execute in v8086 mode never actually >> runs. > > So? > > So yes, if the thread work flags are set, we never enter vm86 mode. > BUT THAT'S EXACTLY WHAT SHOULD HAPPEN. > > It worries me that you think these kinds of fundamental issues are > completely broken. > The problem is that it's *every* event. That includes things that happen literally every time, like strace. (NOHZ_FULL would count, too, if it worked at all on 32-bit kernels.) Try it: vm86 will make zero progress if you run it under strace. It will also execute the trace hooks the wrong number of times, so strace gets very confused. If someone does something daft like using a systrace-style sandbox, it probably breaks the sandbox.
> > And yes, if you enable system call auditing, and you actually audit > the vm86 mode system call, that probably causes an exit condition, > which means that you can't actually run vm86 mode and make progress if > you audit that system call. Big f*cking deal. People who enable system > call auditing break many more important things (eg basic performance) > that that isn't even an argument. Do you really think that people who > wanted to run DOS games at hardware speeds wanted to _audit_ those > games? No. Not at all. It does, however, mean that Fedora/RHEL users (who use auditing by default in most cases, sigh) have a decent chance of having had a non-working vm86 syscall for a long time. This makes me think that there really aren't many vm86 users out there, since we'd have heard about the breakage. Note that audit is very special, though, since it has its own asm path. It might actually work, but I haven't tested it. In any event, we're quibbling about the wording of the kconfig text here. Both Brian and I have patches that fix the ptrace problem, so it's likely to be a nonissue in 4.3 regardless. --Andy ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 17:13 ` Andy Lutomirski @ 2015-07-10 17:39 ` Linus Torvalds 2015-07-10 17:58 ` Andy Lutomirski ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Linus Torvalds @ 2015-07-10 17:39 UTC (permalink / raw) To: Andy Lutomirski Cc: Eric W. Biederman, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 10:13 AM, Andy Lutomirski <luto@amacapital.net> wrote: > > The problem is that it's *every* event. That includes this that > happen literally every time like strace. (NOHZ_FULL would count, too, > if it worked at all on 32-bit kernels.) But things like strace and auditing etc has probably never worked in the first place. So yeah, I can well imagine that vm86 isn't universally useful. And maybe it's been effectively broken in halfway modern distributions due to their insane use of auditing - which is wonderful, because it's just a stronger argument for disabling it by default. But what I'd worry about is regressions - people who actually want to upgrade kernels, and had an old machine and had an old distro, and just want to keep that working. They aren't interested in running strace on their old DOS game, or on their X server that uses it to run the video BIOS. They just want it to work. And it doesn't look "completely broken" to me for that. Put another way: I think vm86 is very much "legacy". Nobody cares about it in modern environments. That's not what we should even worry about. We shouldn't worry about new users, and we _should_ try to discourage it. But I think we should keep it working for the cases it used to work before. So no marking it "BROKEN". No calling it names just because it doesn't work in insane situations that nobody cares about. 
It's a legacy thing, and it probably has very few users, but I'm getting the vibe that you want to remove it or hate it just because it might not work in situations that simply don't make sense in the first place, and that it was never used for anyway. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 17:39 ` Linus Torvalds @ 2015-07-10 17:58 ` Andy Lutomirski 2015-07-10 18:00 ` Al Viro 2015-07-11 9:18 ` Ingo Molnar 2 siblings, 0 replies; 41+ messages in thread From: Andy Lutomirski @ 2015-07-10 17:58 UTC (permalink / raw) To: Linus Torvalds Cc: Eric W. Biederman, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 10:39 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > So no marking it "BROKEN". No calling it names just because it doesn't > work in insane situations that nobody cares about. It's a legacy > thing, and it probably has very few users, but I'm getting the vibe > that you want to remove it or hate it just because it might not work > in situations that simply don't make sense in the first place, and > that it was never used for anyway. Oh, right, I didn't realize this was still the v1 thread. v3 no longer calls it BROKEN. That being said, if vm86 actually has feelings, then I'm worried :) --Andy ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 17:39 ` Linus Torvalds 2015-07-10 17:58 ` Andy Lutomirski @ 2015-07-10 18:00 ` Al Viro 2015-07-11 9:18 ` Ingo Molnar 2 siblings, 0 replies; 41+ messages in thread From: Al Viro @ 2015-07-10 18:00 UTC (permalink / raw) To: Linus Torvalds Cc: Andy Lutomirski, Eric W. Biederman, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov On Fri, Jul 10, 2015 at 10:39:02AM -0700, Linus Torvalds wrote: > But things like strace and auditing etc has probably never worked in > the first place. > > So yeah, I can well imagine that vm86 isn't universally useful. And > maybe it's been effectively broken in halfway modern distributions due > to their insane use of auditing - which is wonderful, because it's > just a stronger argument for disabling it by default. ITYM "both"... ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN 2015-07-10 17:39 ` Linus Torvalds 2015-07-10 17:58 ` Andy Lutomirski 2015-07-10 18:00 ` Al Viro @ 2015-07-11 9:18 ` Ingo Molnar 2 siblings, 0 replies; 41+ messages in thread From: Ingo Molnar @ 2015-07-11 9:18 UTC (permalink / raw) To: Linus Torvalds Cc: Andy Lutomirski, Eric W. Biederman, Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra, Borislav Petkov * Linus Torvalds <torvalds@linux-foundation.org> wrote: > [...] > > So no marking it "BROKEN". No calling it names just because it doesn't work in > insane situations that nobody cares about. It's a legacy thing, and it probably > has very few users, but I'm getting the vibe that you want to remove it or hate > it just because it might not work in situations that simply don't make sense in > the first place, and that it was never used for anyway. So just to make it clear that we are on the same page: I voiced a number of bad ideas in this thread that got you (rightfully) worried. Those bad ideas are all off the table: - We won't mark VM86 as BROKEN (which effectively disables it permanently) - We won't do SW emulation either. The current plans with the vm86 ABI are the following: - We change the name to VM86_LEGACY and mark it default n to flush out people/distros who had it enabled for no good reason. Anyone who builds a new kernel for an old kernel and needs it for old hardware or DOS games can still enable it, and v86 will continue to work to the best of our abilities. (in fact it will work better, now that we are gradually making the x86 entry code more maintainable.) - We enhance the help text so that people who enable it make an informed choice. - We apply Brian's and Andy's various fixes and cleanups to fix all known vm86 bugs and to make it more maintainable. Agreed? 
Btw., what do you think about one more measure to make vm86 more configurable, and
to allow the locking down of the default some more:

 - Introduce a sysctl that globally disables/enables the sys_vm86 and sys_vm86old
   syscalls by default for non-privileged users, i.e. something like:

	static int __read_mostly sysctl_x86_vm86_paranoia = 1;

	...

	switch (sysctl_x86_vm86_paranoia) {
	case 0:
		/* Not paranoid at all: allow everyone vm86 access: */
		break;
	case 1:
		/* Somewhat paranoid: only allow privileged users vm86 access: */
		if (!capable(CAP_SYS_ADMIN))
			return -EPERM;
		break;
	case 2:
	default:
		/* Very paranoid, turn off the syscall: */
		return -EPERM;
	}

Note that with this we also introduce the '2' setting: users in such a distro could
still disable vm86 globally, as if it had been turned off in the kernel config.

Thanks,

Ingo

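[Editor's sketch, not part of the original message.] The three-level policy
proposed above can be tried out in user space. The block below is a minimal
illustration only: the function name vm86_access_check(), the enum, and the
is_privileged flag are hypothetical stand-ins. In the kernel the level would
come from the proposed sysctl and the privilege test would be
capable(CAP_SYS_ADMIN), neither of which is modelled here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names for the three proposed sysctl levels. */
enum vm86_paranoia {
	VM86_ALLOW_ALL       = 0,	/* everyone may use vm86 */
	VM86_PRIVILEGED_ONLY = 1,	/* only privileged callers may use vm86 */
	VM86_DISABLED        = 2,	/* syscall is turned off for everyone */
};

/*
 * User-space model of the proposed check. Returns 0 if the caller would
 * be allowed to enter vm86 mode, -EPERM otherwise. is_privileged stands
 * in for the kernel's capable(CAP_SYS_ADMIN) test (an assumption made
 * for this sketch).
 */
int vm86_access_check(int paranoia, bool is_privileged)
{
	switch (paranoia) {
	case VM86_ALLOW_ALL:
		return 0;
	case VM86_PRIVILEGED_ONLY:
		return is_privileged ? 0 : -EPERM;
	case VM86_DISABLED:
	default:
		/* Unknown levels fail closed, like the 'default:' case above. */
		return -EPERM;
	}
}
```

Treating every unrecognised value like level 2 mirrors the 'case 2: default:'
fall-through in the snippet above, so a mistyped sysctl value would disable
the syscall rather than open it up.
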
* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
2015-07-08 16:59 ` Linus Torvalds
2015-07-08 17:30 ` Andy Lutomirski
@ 2015-07-08 19:13 ` Ingo Molnar
1 sibling, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2015-07-08 19:13 UTC (permalink / raw)
To: Linus Torvalds
Cc: Arjan van de Ven, Andy Lutomirski, the arch/x86 maintainers,
Linux Kernel Mailing List, Oleg Nesterov, Kees Cook, Peter Zijlstra,
Borislav Petkov

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> >
> > if this patch would not be acceptable, at minimum we need some sort of "off by
> > default unless the sysadmin flips a sysfs thing", which is really just a huge
> > hack.
>
> The only thing that matters is whether people use this or not.
>
> If people use vm86 mode, we can't just disable it. It's that simple. "It's
> poorly maintained" isn't an argument for removal. Only "nobody cares" works as
> an argument for that.
>
> My suspicion is that people still do use vm86 mode, but who knows.. Quite
> frankly, rather than disable it, I'd much rather see people who modify low-level
> x86 code (yes, that means you, Luto) *test* it. If you aren't willign to test
> the modifications you make, I don't think those modifications should be merged,
> regardless of how nice a cleanup they are.

The dosemu case might just work due to emulation (assuming emulation is equivalent
or better than vm86 mode), but if Xorg still uses vm86 on old systems to run the
Video-BIOS, with no fallback code available, then I doubt we can remove it.

In any case it's a lot less clear-cut than I initially thought, so I've removed
the patch until it's determined whether it's still used by anything.

Thanks,

Ingo

* [tip:x86/asm] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
2015-07-08 1:25 [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN Andy Lutomirski
2015-07-08 2:33 ` Arjan van de Ven
@ 2015-07-08 9:45 ` tip-bot for Andy Lutomirski
2015-07-08 15:32 ` [PATCH] " Brian Gerst
2 siblings, 0 replies; 41+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-08 9:45 UTC (permalink / raw)
To: linux-tip-commits
Cc: brgerst, oleg, hpa, torvalds, keescook, bp, dvlasenk, peterz,
luto, arjan, tglx, luto, linux-kernel, mingo

Commit-ID: 0b02e20767a3b4d843d2c58cf031d9e31f60e39d
Gitweb: http://git.kernel.org/tip/0b02e20767a3b4d843d2c58cf031d9e31f60e39d
Author: Andy Lutomirski <luto@kernel.org>
AuthorDate: Tue, 7 Jul 2015 18:25:56 -0700
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 8 Jul 2015 11:04:45 +0200

x86/kconfig/32: Mark CONFIG_VM86 as BROKEN

VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
in use. The code is a big undocumented mess, it's a real PITA to
test, and it looks like a big chunk of vm86_32.c is dead code. It
also plays awful games with the entry asm.

No one should be using it anyway. Use DOSBOX or KVM instead.

Mark it BROKEN. I want to remove some (obviously incorrect) exit
asm that it depends on, and I don't want to figure out how to run
severely obsolete programs just to test something that no one uses
for anything other than exploits anyway.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: <stable@vger.kernel.org> # Backport it as far back as possible
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/23d4709cee2fe92c32d41b99c7a3c1823725925a.1436312944.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/Kconfig | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index aa94fd0..a7648f9b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -997,8 +997,8 @@ config X86_THERMAL_VECTOR
 	depends on X86_MCE_INTEL
 
 config VM86
-	bool "Enable VM86 support" if EXPERT
-	default y
+	bool "Enable VM86 support" if BROKEN
+	default n
 	depends on X86_32
 	---help---
 	  This option is required by programs like DOSEMU to run
@@ -1006,6 +1006,12 @@ config VM86
 	  be needed by software like XFree86 to initialize some video
 	  cards via BIOS. Disabling this option saves about 6K.
 
+	  Linux's VM86 support is poorly maintained, essentially never
+	  tested by upstream kernel developers, has quite a few known
+	  bugs, and is probably full of security holes. The only thing
+	  that appears to use it is DOSEMU, and DOSBOX and KVM are
+	  better options these days. Don't enable it.
+
 config X86_16BIT
 	bool "Enable support for 16-bit segments" if EXPERT
 	default y

* Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
2015-07-08 1:25 [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN Andy Lutomirski
2015-07-08 2:33 ` Arjan van de Ven
2015-07-08 9:45 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
@ 2015-07-08 15:32 ` Brian Gerst
2 siblings, 0 replies; 41+ messages in thread
From: Brian Gerst @ 2015-07-08 15:32 UTC (permalink / raw)
To: Andy Lutomirski
Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Oleg Nesterov,
Kees Cook, Arjan van de Ven, Peter Zijlstra, Borislav Petkov,
Linus Torvalds

On Tue, Jul 7, 2015 at 9:25 PM, Andy Lutomirski <luto@kernel.org> wrote:
> VM86 is entirely broken if ptrace, syscall auditing, or NOHZ_FULL is
> in use. The code is a big undocumented mess, it's a real PITA to
> test, and it looks like a big chunk of vm86_32.c is dead code. It
> also plays awful games with the entry asm.
>
> No one should be using it anyway. Use DOSBOX or KVM instead.
>
> Mark it BROKEN. I want to remove some (obviously incorrect) exit
> asm that it depends on, and I don't want to figure out how to run
> severely obsolete programs just to test something that no one uses
> for anything other than exploits anyway.
>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>
> I find it implausible that vm86_32.c isn't full or root holes. It's
> also full of hilariously ugly code, it does terrible things to the
> kernel stack, and its interaction with the syscall slowpath is
> blatantly incorrect.
>
> It really shouldn't have any users, anyway. It doesn't (and can't!)
> work on 64-bit kernels, and the only program that even knows how it
> works appears to be DOSEMU. DOSEMU doesn't even need it for most
> programs (it uses modify_ldt instead if possible), and DOSBOX and
> KVM are better choices anyway.
>
> I think that even DOSEMU might be able to emulate vm86 (by emulating
> instruction-by-instruction) if the vm86 syscall isn't there.
>
> Want to be terrified? Read copy_vm86_regs_from_user. Or
> mark_screen_rdonly. Or return_to_32bit. Or VM86_REQUEST_IRQ.
>
> What do you all think? This code is a maintenance disaster, and I'd
> love to see it go. This would be a nice first step.
>
> This patch is intended for tip/x86/asm. The 32-bit part of my big
> cleanup will interfere with vm86, and, while I think I fixed it up
> right, I'd rather not expose everyone to the high probability of
> crazy security bugs in this mess.

I have been working on some patches to fix the ugly hacks vm86 uses
and make it more easily maintainable. The general idea is to make it
use the regular pt_regs area and save the 32-bit regs and other data
off-stack. That would allow a normal kernel exit route instead of
jumping directly into the exit asm code. It should also allow ptrace
to work with a few tweaks.

One other place to check for usage is Wine. I recall there being some
DOS compatibility stuff in there.

--
Brian Gerst
