* [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) @ 2013-10-22 2:33 Josh Triplett 2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Josh Triplett @ 2013-10-22 2:33 UTC (permalink / raw) To: linux-kernel, virtualization, x86, xen-devel Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Raghavendra K T, Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris This patch series makes it possible to compile out the iopl and ioperm system calls, which allow privileged processes to request permission to directly poke I/O ports from userspace. Nothing on a modern Linux system uses these calls anymore, and anything new should be using /dev/port instead, or better yet writing a driver. Copying the bloat-o-meter stats from the final patch: 32-bit bloat-o-meter: add/remove: 0/3 grow/shrink: 0/10 up/down: 0/-17681 (-17681) function old new delta cpu_init 676 668 -8 ioperm_active 18 7 -11 init_task 1296 1284 -12 exit_thread 179 91 -88 ioperm_get 103 10 -93 __switch_to_xtra 254 161 -93 sys_iopl 127 - -127 SyS_iopl 127 - -127 copy_thread 606 446 -160 vt_ioctl 4127 3919 -208 sys_ioperm 370 - -370 init_tss 8576 384 -8192 doublefault_tss 8576 384 -8192 64-bit bloat-o-meter: add/remove: 0/4 grow/shrink: 2/9 up/down: 45/-9764 (-9719) function old new delta cpu_init 958 995 +37 arch_align_stack 78 86 +8 perf_event_exit_task 525 517 -8 ioperm_active 17 8 -9 init_task 1968 1944 -24 stub_iopl 81 - -81 ioperm_get 111 11 -100 __switch_to_xtra 281 164 -117 exit_thread 212 92 -120 vt_ioctl 4432 4304 -128 sys_iopl 137 - -137 SyS_iopl 137 - -137 copy_thread 694 520 -174 sys_ioperm 473 - -473 init_tss 8896 640 -8256 Josh Triplett (3): x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling x86: tss: Eliminate fragile calculation of TSS segment limit x86: Support compiling out userspace I/O (iopl and ioperm) arch/x86/Kconfig | 10 ++++ arch/x86/include/asm/desc.h | 11 +---- arch/x86/include/asm/paravirt.h | 2 + arch/x86/include/asm/paravirt_types.h | 2 + arch/x86/include/asm/processor.h | 54 +++++++++++++++++--- arch/x86/include/asm/syscalls.h | 3 ++ arch/x86/kernel/Makefile | 3 +- arch/x86/kernel/cpu/common.c | 12 +---- arch/x86/kernel/entry_64.S | 9 ++-- arch/x86/kernel/paravirt.c | 2 + arch/x86/kernel/process-io.h | 93 +++++++++++++++++++++++++++++++++++ arch/x86/kernel/process.c | 34 ++----------- arch/x86/kernel/process_32.c | 40 ++++----------- arch/x86/kernel/process_64.c | 26 ++-------- arch/x86/kernel/ptrace.c | 8 +++ arch/x86/xen/enlighten.c | 4 ++ drivers/tty/vt/vt_ioctl.c | 2 +- kernel/sys_ni.c | 5 ++ 18 files changed, 206 insertions(+), 114 deletions(-) create mode 100644 arch/x86/kernel/process-io.h -- 1.8.4.rc3 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling 2013-10-22 2:33 [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett @ 2013-10-22 2:34 ` Josh Triplett 2013-10-30 22:21 ` Kees Cook 2013-10-31 20:01 ` Alexander van Heukelum 2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett 2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett 2 siblings, 2 replies; 17+ messages in thread From: Josh Triplett @ 2013-10-22 2:34 UTC (permalink / raw) To: linux-kernel, virtualization, x86, xen-devel Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Raghavendra K T, Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris The 32-bit and 64-bit versions of copy_thread have functionally identical handling for copying the I/O bitmap, modulo differences in error handling. Clean up the error paths in both by moving the copy of the I/O bitmap to the end, to eliminate the need to free it if subsequent copy steps fail; move the resulting identical code to a static inline in a common header. Signed-off-by: Josh Triplett <josh@joshtriplett.org> --- arch/x86/kernel/process-io.h | 22 ++++++++++++++++++++++ arch/x86/kernel/process_32.c | 29 ++++++++--------------------- arch/x86/kernel/process_64.c | 26 +++++--------------------- 3 files changed, 35 insertions(+), 42 deletions(-) create mode 100644 arch/x86/kernel/process-io.h diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h new file mode 100644 index 0000000..d884444 --- /dev/null +++ b/arch/x86/kernel/process-io.h @@ -0,0 +1,22 @@ +#ifndef _X86_KERNEL_PROCESS_IO_H +#define _X86_KERNEL_PROCESS_IO_H + +static inline int copy_io_bitmap(struct task_struct *me, + struct task_struct *p) +{ + if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) { + p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr, + IO_BITMAP_BYTES, GFP_KERNEL); + if (!p->thread.io_bitmap_ptr) { + p->thread.io_bitmap_max = 0; + return -ENOMEM; + } + set_tsk_thread_flag(p, TIF_IO_BITMAP); + } else { + p->thread.io_bitmap_ptr = NULL; + } + + return 0; +} + +#endif /* _X86_KERNEL_PROCESS_IO_H */ diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c index 884f98f..e86028b 100644 --- a/arch/x86/kernel/process_32.c +++ b/arch/x86/kernel/process_32.c @@ -56,6 +56,8 @@ #include <asm/debugreg.h> #include <asm/switch_to.h> +#include "process-io.h" + asmlinkage void ret_from_fork(void) __asm__("ret_from_fork"); asmlinkage void ret_from_kernel_thread(void) __asm__("ret_from_kernel_thread"); @@ -135,7 +137,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, { struct pt_regs *childregs = task_pt_regs(p); struct task_struct *tsk; - int err; p->thread.sp = (unsigned long) childregs; p->thread.sp0 = (unsigned long) (childregs+1); @@ -167,36 +168,22 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, task_user_gs(p) = get_user_gs(current_pt_regs()); p->fpu_counter = 0; - p->thread.io_bitmap_ptr = NULL; tsk = current; - err = -ENOMEM; memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps)); - if (unlikely(test_tsk_thread_flag(tsk, TIF_IO_BITMAP))) { - p->thread.io_bitmap_ptr = kmemdup(tsk->thread.io_bitmap_ptr, - IO_BITMAP_BYTES, GFP_KERNEL); - if (!p->thread.io_bitmap_ptr) { - p->thread.io_bitmap_max = 0; - return -ENOMEM; - } - set_tsk_thread_flag(p, TIF_IO_BITMAP); - } - - err = 0; - /* * Set a new TLS for the child thread? */ - if (clone_flags & CLONE_SETTLS) + if (clone_flags & CLONE_SETTLS) { + int err; err = do_set_thread_area(p, -1, (struct user_desc __user *)childregs->si, 0); - - if (err && p->thread.io_bitmap_ptr) { - kfree(p->thread.io_bitmap_ptr); - p->thread.io_bitmap_max = 0; + if(err) + return err; } - return err; + + return copy_io_bitmap(tsk, p); } void diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index bb1dc51..6e680ad 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -50,6 +50,8 @@ #include <asm/debugreg.h> #include <asm/switch_to.h> +#include "process-io.h" + asmlinkage extern void ret_from_fork(void); asmlinkage DEFINE_PER_CPU(unsigned long, old_rsp); @@ -154,7 +156,6 @@ static inline u32 read_32bit_tls(struct task_struct *t, int tls) int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg, struct task_struct *p) { - int err; struct pt_regs *childregs; struct task_struct *me = current; @@ -164,7 +165,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, p->thread.usersp = me->thread.usersp; set_tsk_thread_flag(p, TIF_FORK); p->fpu_counter = 0; - p->thread.io_bitmap_ptr = NULL; savesegment(gs, p->thread.gsindex); p->thread.gs = p->thread.gsindex ? 0 : me->thread.gs; @@ -192,23 +192,13 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, if (sp) childregs->sp = sp; - err = -ENOMEM; memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps)); - if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) { - p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr, - IO_BITMAP_BYTES, GFP_KERNEL); - if (!p->thread.io_bitmap_ptr) { - p->thread.io_bitmap_max = 0; - return -ENOMEM; - } - set_tsk_thread_flag(p, TIF_IO_BITMAP); - } - /* * Set a new TLS for the child thread? */ if (clone_flags & CLONE_SETTLS) { + int err; #ifdef CONFIG_IA32_EMULATION if (test_thread_flag(TIF_IA32)) err = do_set_thread_area(p, -1, @@ -217,16 +207,10 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, #endif err = do_arch_prctl(p, ARCH_SET_FS, childregs->r8); if (err) - goto out; - } - err = 0; -out: - if (err && p->thread.io_bitmap_ptr) { - kfree(p->thread.io_bitmap_ptr); - p->thread.io_bitmap_max = 0; + return err; } - return err; + return copy_io_bitmap(me, p); } static void -- 1.8.4.rc3 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling 2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett @ 2013-10-30 22:21 ` Kees Cook 2013-10-31 20:01 ` Alexander van Heukelum 1 sibling, 0 replies; 17+ messages in thread From: Kees Cook @ 2013-10-30 22:21 UTC (permalink / raw) To: Josh Triplett Cc: Alexander van Heukelum, Jeremy Fitzhardinge, x86@kernel.org, Len Brown, Frederic Weisbecker, H. Peter Anvin, virtualization@lists.linux-foundation.org, Paul Gortmaker, Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov, Fenghua Yu <feng> On Mon, Oct 21, 2013 at 7:34 PM, Josh Triplett <josh@joshtriplett.org> wrote: > The 32-bit and 64-bit versions of copy_thread have functionally > identical handling for copying the I/O bitmap, modulo differences in > error handling. Clean up the error paths in both by moving the copy of > the I/O bitmap to the end, to eliminate the need to free it if > subsequent copy steps fail; move the resulting identical code to a > static inline in a common header. > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> This seems quite sensible to me. Acked-by: Kees Cook <keescook@chromium.org> -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling 2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett 2013-10-30 22:21 ` Kees Cook @ 2013-10-31 20:01 ` Alexander van Heukelum 2013-11-01 16:33 ` Josh Triplett 1 sibling, 1 reply; 17+ messages in thread From: Alexander van Heukelum @ 2013-10-31 20:01 UTC (permalink / raw) To: Josh Triplett, linux-kernel, virtualization, x86, xen-devel Cc: Jeremy Fitzhardinge, Raghavendra K T, Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris Metcalf, Chris Wright Hi Josh, Since you added me to the CC, I had a look... Comments inline On Tue, Oct 22, 2013, at 3:34, Josh Triplett wrote: > The 32-bit and 64-bit versions of copy_thread have functionally > identical handling for copying the I/O bitmap, modulo differences in > error handling. Clean up the error paths in both by moving the copy of > the I/O bitmap to the end, to eliminate the need to free it if > subsequent copy steps fail; move the resulting identical code to a > static inline in a common header. > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> > --- > arch/x86/kernel/process-io.h | 22 ++++++++++++++++++++++ I don't particularly like this new file. It's also not in a proper place. > arch/x86/kernel/process_32.c | 29 ++++++++--------------------- > arch/x86/kernel/process_64.c | 26 +++++--------------------- Yay, this I like :). However, unification is best done in a separate step. First make the two parts equal in one or two patches. Then, in a separate patch, do the mechanical unification. > 3 files changed, 35 insertions(+), 42 deletions(-) > create mode 100644 arch/x86/kernel/process-io.h > > diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h > new file mode 100644 > index 0000000..d884444 > --- /dev/null > +++ b/arch/x86/kernel/process-io.h > @@ -0,0 +1,22 @@ > +#ifndef _X86_KERNEL_PROCESS_IO_H > +#define _X86_KERNEL_PROCESS_IO_H > + > +static inline int copy_io_bitmap(struct task_struct *me, > + struct task_struct *p) > +{ > + if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) { > + p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr, > + IO_BITMAP_BYTES, GFP_KERNEL); > + if (!p->thread.io_bitmap_ptr) { > + p->thread.io_bitmap_max = 0; > + return -ENOMEM; > + } > + set_tsk_thread_flag(p, TIF_IO_BITMAP); > + } else { > + p->thread.io_bitmap_ptr = NULL; > + } > + > + return 0; > +} > + I really don't like the .h-file like this. Could you try avoiding it? I think it's possible to unify the copy_thread function and in the end move it to process.c instead. The arch-specific parts can be split out in static functions guarded by CONFIG_X86_64 (cooking up good names is the challenge here ;) ). ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling 2013-10-31 20:01 ` Alexander van Heukelum @ 2013-11-01 16:33 ` Josh Triplett 0 siblings, 0 replies; 17+ messages in thread From: Josh Triplett @ 2013-11-01 16:33 UTC (permalink / raw) To: Alexander van Heukelum Cc: Jeremy Fitzhardinge, x86, Len Brown, Frederic Weisbecker, H. Peter Anvin, virtualization, Paul Gortmaker, Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross On Thu, Oct 31, 2013 at 09:01:30PM +0100, Alexander van Heukelum wrote: > On Tue, Oct 22, 2013, at 3:34, Josh Triplett wrote: > > The 32-bit and 64-bit versions of copy_thread have functionally > > identical handling for copying the I/O bitmap, modulo differences in > > error handling. Clean up the error paths in both by moving the copy of > > the I/O bitmap to the end, to eliminate the need to free it if > > subsequent copy steps fail; move the resulting identical code to a > > static inline in a common header. > > > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> > > --- > > arch/x86/kernel/process-io.h | 22 ++++++++++++++++++++++ > > I don't particularly like this new file. It's also not in a proper place. It's a private header, and private headers are normally put in the directories with the source code, not in the include directories. The need for it becomes more obvious later in the patch series. > > arch/x86/kernel/process_32.c | 29 ++++++++--------------------- > > arch/x86/kernel/process_64.c | 26 +++++--------------------- > > Yay, this I like :). However, unification is best done in a separate step. First > make the two parts equal in one or two patches. Then, in a separate patch, > do the mechanical unification. I could certainly split the first patch into two, sure. > > 3 files changed, 35 insertions(+), 42 deletions(-) > > create mode 100644 arch/x86/kernel/process-io.h > > > > diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h > > new file mode 100644 > > index 0000000..d884444 > > --- /dev/null > > +++ b/arch/x86/kernel/process-io.h > > @@ -0,0 +1,22 @@ > > +#ifndef _X86_KERNEL_PROCESS_IO_H > > +#define _X86_KERNEL_PROCESS_IO_H > > + > > +static inline int copy_io_bitmap(struct task_struct *me, > > + struct task_struct *p) > > +{ > > + if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) { > > + p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr, > > + IO_BITMAP_BYTES, GFP_KERNEL); > > + if (!p->thread.io_bitmap_ptr) { > > + p->thread.io_bitmap_max = 0; > > + return -ENOMEM; > > + } > > + set_tsk_thread_flag(p, TIF_IO_BITMAP); > > + } else { > > + p->thread.io_bitmap_ptr = NULL; > > + } > > + > > + return 0; > > +} > > + > > I really don't like the .h-file like this. Could you try avoiding it? I think > it's possible to unify the copy_thread function and in the end move it > to process.c instead. The arch-specific parts can be split out in static > functions guarded by CONFIG_X86_64 (cooking up good names is > the challenge here ;) ). It's theoretically possible to completely unify copy_thread by adding several more functions like this to abstract the low-level details; however, complete unification wasn't the goal of this patch series. I unified and pulled out this portion because it's also the portion I need to *remove* when compiling with !CONFIG_X86_IOPORT. - Josh Triplett ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit 2013-10-22 2:33 [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett 2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett @ 2013-10-22 2:34 ` Josh Triplett 2013-10-30 22:22 ` Kees Cook 2013-10-31 20:02 ` Alexander van Heukelum 2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett 2 siblings, 2 replies; 17+ messages in thread From: Josh Triplett @ 2013-10-22 2:34 UTC (permalink / raw) To: linux-kernel, virtualization, x86, xen-devel Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Raghavendra K T, Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris __set_tss_desc has a complex calculation of the TSS segment limit, duplicating the quirky details of the I/O bitmap array length, and requiring a complex comment to explain. Replace that calculation with a simpler one based on the offsetof the "stack" field that follows the array. That then removes the last use of IO_BITMAP_OFFSET, so delete it. Signed-off-by: Josh Triplett <josh@joshtriplett.org> --- arch/x86/include/asm/desc.h | 11 +---------- arch/x86/include/asm/processor.h | 3 ++- 2 files changed, 3 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h index b90e5df..17ac92f 100644 --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr) struct desc_struct *d = get_cpu_gdt_table(cpu); tss_desc tss; - /* - * sizeof(unsigned long) coming from an extra "long" at the end - * of the iobitmap. See tss_struct definition in processor.h - * - * -1? seg base+limit should be pointing to the address of the - * last valid byte - */ - set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, - IO_BITMAP_OFFSET + IO_BITMAP_BYTES + - sizeof(unsigned long) - 1); + set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, TSS_LIMIT); write_gdt_entry(d, entry, &tss, DESC_TSS); } diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 987c75e..03d3003 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -259,9 +259,10 @@ struct x86_hw_tss { #define IO_BITMAP_BITS 65536 #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8) #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long)) -#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap) #define INVALID_IO_BITMAP_OFFSET 0x8000 +#define TSS_LIMIT (offsetof(struct tss_struct, stack) - 1) + struct tss_struct { /* * The hardware state: -- 1.8.4.rc3 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit 2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett @ 2013-10-30 22:22 ` Kees Cook 2013-10-30 22:53 ` H. Peter Anvin 2013-10-31 11:12 ` Josh Triplett 2013-10-31 20:02 ` Alexander van Heukelum 1 sibling, 2 replies; 17+ messages in thread From: Kees Cook @ 2013-10-30 22:22 UTC (permalink / raw) To: Josh Triplett Cc: Alexander van Heukelum, Jeremy Fitzhardinge, x86@kernel.org, Len Brown, Frederic Weisbecker, H. Peter Anvin, virtualization@lists.linux-foundation.org, Paul Gortmaker, Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov, Fenghua Yu <feng> On Mon, Oct 21, 2013 at 7:34 PM, Josh Triplett <josh@joshtriplett.org> wrote: > __set_tss_desc has a complex calculation of the TSS segment limit, > duplicating the quirky details of the I/O bitmap array length, and > requiring a complex comment to explain. Replace that calculation with a > simpler one based on the offsetof the "stack" field that follows the > array. > > That then removes the last use of IO_BITMAP_OFFSET, so delete it. > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> > --- > arch/x86/include/asm/desc.h | 11 +---------- > arch/x86/include/asm/processor.h | 3 ++- > 2 files changed, 3 insertions(+), 11 deletions(-) > > diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h > index b90e5df..17ac92f 100644 > --- a/arch/x86/include/asm/desc.h > +++ b/arch/x86/include/asm/desc.h > @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr) > struct desc_struct *d = get_cpu_gdt_table(cpu); > tss_desc tss; > > - /* > - * sizeof(unsigned long) coming from an extra "long" at the end > - * of the iobitmap. See tss_struct definition in processor.h > - * > - * -1? seg base+limit should be pointing to the address of the > - * last valid byte I think it might be better to keep at least a minimal comment near the TSS_LIMIT declaration, just to explain the "-1" part, which is not entirely obvious from just reading the code. -Kees > - */ > - set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, > - IO_BITMAP_OFFSET + IO_BITMAP_BYTES + > - sizeof(unsigned long) - 1); > + set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, TSS_LIMIT); > write_gdt_entry(d, entry, &tss, DESC_TSS); > } > > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h > index 987c75e..03d3003 100644 > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -259,9 +259,10 @@ struct x86_hw_tss { > #define IO_BITMAP_BITS 65536 > #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8) > #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long)) > -#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap) > #define INVALID_IO_BITMAP_OFFSET 0x8000 > > +#define TSS_LIMIT (offsetof(struct tss_struct, stack) - 1) > + > struct tss_struct { > /* > * The hardware state: > -- > 1.8.4.rc3 > -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit 2013-10-30 22:22 ` Kees Cook @ 2013-10-30 22:53 ` H. Peter Anvin 2013-10-31 11:17 ` Josh Triplett 2013-10-31 11:12 ` Josh Triplett 1 sibling, 1 reply; 17+ messages in thread From: H. Peter Anvin @ 2013-10-30 22:53 UTC (permalink / raw) To: Kees Cook, Josh Triplett Cc: Alexander van Heukelum, Jeremy Fitzhardinge, x86@kernel.org, Len Brown, Frederic Weisbecker, virtualization@lists.linux-foundation.org, Paul Gortmaker, Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov, Fenghua Yu, Konrad On 10/30/2013 03:22 PM, Kees Cook wrote: >> >> - /* >> - * sizeof(unsigned long) coming from an extra "long" at the end >> - * of the iobitmap. See tss_struct definition in processor.h >> - * >> - * -1? seg base+limit should be pointing to the address of the >> - * last valid byte > > I think it might be better to keep at least a minimal comment near the > TSS_LIMIT declaration, just to explain the "-1" part, which is not > entirely obvious from just reading the code. > Agreed, although it doesn't need to be an unsigned long at all... the CPU will only ever access one extra byte past the end. -hpa ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit 2013-10-30 22:53 ` H. Peter Anvin @ 2013-10-31 11:17 ` Josh Triplett 0 siblings, 0 replies; 17+ messages in thread From: Josh Triplett @ 2013-10-31 11:17 UTC (permalink / raw) To: H. Peter Anvin Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Daniel Lezcano, Len Brown, Frederic Weisbecker, virtualization@lists.linux-foundation.org, Paul Gortmaker, Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, x86@kernel.org, Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov, Fenghua Yu, Kees Cook <kees> On Wed, Oct 30, 2013 at 03:53:11PM -0700, H. Peter Anvin wrote: > On 10/30/2013 03:22 PM, Kees Cook wrote: > >> > >> - /* > >> - * sizeof(unsigned long) coming from an extra "long" at the end > >> - * of the iobitmap. See tss_struct definition in processor.h > >> - * > >> - * -1? seg base+limit should be pointing to the address of the > >> - * last valid byte > > > > I think it might be better to keep at least a minimal comment near the > > TSS_LIMIT declaration, just to explain the "-1" part, which is not > > entirely obvious from just reading the code. > > > > Agreed, although it doesn't need to be an unsigned long at all... the > CPU will only ever access one extra byte past the end. True, but the thing immediately following the iobitmap is a stack, which needs aligning, so the array does need to contain a full additional unsigned long even if the CPU only accesses a byte of it. In any case, that isn't the reason for the -1, just the reason for the sizeof(unsigned long) mentioned in the comment above, which goes away now that TSS_LIMIT uses the offset of the *following* field rather than recalculating the size of the iobitmap. - Josh Triplett ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit 2013-10-30 22:22 ` Kees Cook 2013-10-30 22:53 ` H. Peter Anvin @ 2013-10-31 11:12 ` Josh Triplett 1 sibling, 0 replies; 17+ messages in thread From: Josh Triplett @ 2013-10-31 11:12 UTC (permalink / raw) To: Kees Cook Cc: Alexander van Heukelum, Jeremy Fitzhardinge, x86@kernel.org, Len Brown, Frederic Weisbecker, H. Peter Anvin, virtualization@lists.linux-foundation.org, Paul Gortmaker, Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov, Fenghua Yu <feng> On Wed, Oct 30, 2013 at 03:22:33PM -0700, Kees Cook wrote: > On Mon, Oct 21, 2013 at 7:34 PM, Josh Triplett <josh@joshtriplett.org> wrote: > > __set_tss_desc has a complex calculation of the TSS segment limit, > > duplicating the quirky details of the I/O bitmap array length, and > > requiring a complex comment to explain. Replace that calculation with a > > simpler one based on the offsetof the "stack" field that follows the > > array. > > > > That then removes the last use of IO_BITMAP_OFFSET, so delete it. > > > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> > > --- > > arch/x86/include/asm/desc.h | 11 +---------- > > arch/x86/include/asm/processor.h | 3 ++- > > 2 files changed, 3 insertions(+), 11 deletions(-) > > > > diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h > > index b90e5df..17ac92f 100644 > > --- a/arch/x86/include/asm/desc.h > > +++ b/arch/x86/include/asm/desc.h > > @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr) > > struct desc_struct *d = get_cpu_gdt_table(cpu); > > tss_desc tss; > > > > - /* > > - * sizeof(unsigned long) coming from an extra "long" at the end > > - * of the iobitmap. See tss_struct definition in processor.h > > - * > > - * -1? seg base+limit should be pointing to the address of the > > - * last valid byte > > I think it might be better to keep at least a minimal comment near the > TSS_LIMIT declaration, just to explain the "-1" part, which is not > entirely obvious from just reading the code. Fair enough; I've added an appropriate comment next to TSS_LIMIT, and I'll include that in PATCHv2, which I'll send out as soon as I see any feedback on patch 3/3. - Josh Triplett ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit 2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett 2013-10-30 22:22 ` Kees Cook @ 2013-10-31 20:02 ` Alexander van Heukelum 2013-11-01 16:40 ` Josh Triplett 1 sibling, 1 reply; 17+ messages in thread From: Alexander van Heukelum @ 2013-10-31 20:02 UTC (permalink / raw) To: Josh Triplett, linux-kernel, virtualization, x86, xen-devel Cc: Jeremy Fitzhardinge, Raghavendra K T, Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris Metcalf, Chris Wright Hi Josh, On Tue, Oct 22, 2013, at 3:34, Josh Triplett wrote: > __set_tss_desc has a complex calculation of the TSS segment limit, > duplicating the quirky details of the I/O bitmap array length, and > requiring a complex comment to explain. Replace that calculation with a > simpler one based on the offsetof the "stack" field that follows the > array. > > That then removes the last use of IO_BITMAP_OFFSET, so delete it. > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> > --- > arch/x86/include/asm/desc.h | 11 +---------- > arch/x86/include/asm/processor.h | 3 ++- > 2 files changed, 3 insertions(+), 11 deletions(-) > > diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h > index b90e5df..17ac92f 100644 > --- a/arch/x86/include/asm/desc.h > +++ b/arch/x86/include/asm/desc.h > @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr) > struct desc_struct *d = get_cpu_gdt_table(cpu); > tss_desc tss; > > - /* > - * sizeof(unsigned long) coming from an extra "long" at the end > - * of the iobitmap. See tss_struct definition in processor.h > - * > - * -1? seg base+limit should be pointing to the address of the > - * last valid byte > - */ > - set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, > - IO_BITMAP_OFFSET + IO_BITMAP_BYTES + > - sizeof(unsigned long) - 1); > + set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, TSS_LIMIT); > write_gdt_entry(d, entry, &tss, DESC_TSS); > } > > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h > index 987c75e..03d3003 100644 > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -259,9 +259,10 @@ struct x86_hw_tss { > #define IO_BITMAP_BITS 65536 > #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8) > #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long)) > -#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap) > #define INVALID_IO_BITMAP_OFFSET 0x8000 > > +#define TSS_LIMIT (offsetof(struct tss_struct, stack) - 1) > + I wonder if the 'stack' in tss_struct has any meaning anymore. Can it be removed? If so, the offsetof can be changed to sizeof(struct tss_struct), which looks more natural to me. Even so, I think this is already an improvement. Acked-by: Alexander van Heukelum <heukelum@fastmail.fm> > struct tss_struct { > /* > * The hardware state: > -- > 1.8.4.rc3 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit 2013-10-31 20:02 ` Alexander van Heukelum @ 2013-11-01 16:40 ` Josh Triplett 0 siblings, 0 replies; 17+ messages in thread From: Josh Triplett @ 2013-11-01 16:40 UTC (permalink / raw) To: Alexander van Heukelum Cc: Jeremy Fitzhardinge, x86, Len Brown, Frederic Weisbecker, H. Peter Anvin, virtualization, Paul Gortmaker, Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross On Thu, Oct 31, 2013 at 09:02:14PM +0100, Alexander van Heukelum wrote: > Hi Josh, > > On Tue, Oct 22, 2013, at 3:34, Josh Triplett wrote: > > __set_tss_desc has a complex calculation of the TSS segment limit, > > duplicating the quirky details of the I/O bitmap array length, and > > requiring a complex comment to explain. Replace that calculation with a > > simpler one based on the offsetof the "stack" field that follows the > > array. > > > > That then removes the last use of IO_BITMAP_OFFSET, so delete it. > > > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> > > --- > > arch/x86/include/asm/desc.h | 11 +---------- > > arch/x86/include/asm/processor.h | 3 ++- > > 2 files changed, 3 insertions(+), 11 deletions(-) > > > > diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h > > index b90e5df..17ac92f 100644 > > --- a/arch/x86/include/asm/desc.h > > +++ b/arch/x86/include/asm/desc.h > > @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr) > > struct desc_struct *d = get_cpu_gdt_table(cpu); > > tss_desc tss; > > > > - /* > > - * sizeof(unsigned long) coming from an extra "long" at the end > > - * of the iobitmap. See tss_struct definition in processor.h > > - * > > - * -1? seg base+limit should be pointing to the address of the > > - * last valid byte > > - */ > > - set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, > > - IO_BITMAP_OFFSET + IO_BITMAP_BYTES + > > - sizeof(unsigned long) - 1); > > + set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, TSS_LIMIT); > > write_gdt_entry(d, entry, &tss, DESC_TSS); > > } > > > > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h > > index 987c75e..03d3003 100644 > > --- a/arch/x86/include/asm/processor.h > > +++ b/arch/x86/include/asm/processor.h > > @@ -259,9 +259,10 @@ struct x86_hw_tss { > > #define IO_BITMAP_BITS 65536 > > #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8) > > #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long)) > > -#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap) > > #define INVALID_IO_BITMAP_OFFSET 0x8000 > > > > +#define TSS_LIMIT (offsetof(struct tss_struct, stack) - 1) > > + > > I wonder if the 'stack' in tss_struct has any meaning anymore. Can it be removed? > If so, the offsetof can be changed to sizeof(struct tss_struct), which looks more > natural to me. That might be possible, if nothing references the stack. The obvious references would be easy to search for, but I'd be concerned that something might reference the stack in some non-obvious way, like "right after the TSS base+limit", which a search wouldn't necessarily turn up. In any case, that change could easily occur in another patch on top of this one. > Even so, I think this is already an improvement. > > Acked-by: Alexander van Heukelum <heukelum@fastmail.fm> Thanks. - Josh Triplett ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) 2013-10-22 2:33 [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett 2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett 2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett @ 2013-10-22 2:35 ` Josh Triplett 2013-10-26 3:17 ` Stephen Hemminger 2013-10-31 20:04 ` Alexander van Heukelum 2 siblings, 2 replies; 17+ messages in thread From: Josh Triplett @ 2013-10-22 2:35 UTC (permalink / raw) To: linux-kernel, virtualization, x86, xen-devel Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Raghavendra K T, Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris On the vast majority of modern systems, no processes will use the userspsace I/O syscalls, iopl and ioperm. Add a new config option, CONFIG_X86_IOPORT, to support configuring them out of the kernel entirely. Since these syscalls only exist to support rare legacy userspace programs, X86_IOPORT does not depend on EXPERT, though it does still default to y. In addition to saving a significant amount of space, this also reduces the size of several major kernel data structures, drops a bit of code from several task-related hot paths, and reduces the attack surface of the kernel. All of the task-related code dealing with userspace I/O permissions now lives in process-io.h as several new inline functions, which become no-ops when CONFIG_X86_IOPORT. bloat-o-meter shows a net reduction of 17681 bytes on 32-bit and 9719 bytes on 64-bit: 32-bit bloat-o-meter: add/remove: 0/3 grow/shrink: 0/10 up/down: 0/-17681 (-17681) function old new delta cpu_init 676 668 -8 ioperm_active 18 7 -11 init_task 1296 1284 -12 exit_thread 179 91 -88 ioperm_get 103 10 -93 __switch_to_xtra 254 161 -93 sys_iopl 127 - -127 SyS_iopl 127 - -127 copy_thread 606 446 -160 vt_ioctl 4127 3919 -208 sys_ioperm 370 - -370 init_tss 8576 384 -8192 doublefault_tss 8576 384 -8192 64-bit bloat-o-meter: add/remove: 0/4 grow/shrink: 2/9 up/down: 45/-9764 (-9719) function old new delta cpu_init 958 995 +37 arch_align_stack 78 86 +8 perf_event_exit_task 525 517 -8 ioperm_active 17 8 -9 init_task 1968 1944 -24 stub_iopl 81 - -81 ioperm_get 111 11 -100 __switch_to_xtra 281 164 -117 exit_thread 212 92 -120 vt_ioctl 4432 4304 -128 sys_iopl 137 - -137 SyS_iopl 137 - -137 copy_thread 694 520 -174 sys_ioperm 473 - -473 init_tss 8896 640 -8256 Signed-off-by: Josh Triplett <josh@joshtriplett.org> --- arch/x86/Kconfig | 10 +++++ arch/x86/include/asm/paravirt.h | 2 + arch/x86/include/asm/paravirt_types.h | 2 + arch/x86/include/asm/processor.h | 51 +++++++++++++++++++++---- arch/x86/include/asm/syscalls.h | 3 ++ arch/x86/kernel/Makefile | 3 +- arch/x86/kernel/cpu/common.c | 12 +----- arch/x86/kernel/entry_64.S | 9 +++-- arch/x86/kernel/paravirt.c | 2 + arch/x86/kernel/process-io.h | 71 +++++++++++++++++++++++++++++++++++ arch/x86/kernel/process.c | 34 ++--------------- arch/x86/kernel/process_32.c | 11 +----- arch/x86/kernel/ptrace.c | 8 ++++ arch/x86/xen/enlighten.c | 4 ++ drivers/tty/vt/vt_ioctl.c | 2 +- kernel/sys_ni.c | 5 +++ 16 files changed, 168 insertions(+), 61 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index e241a19..d5b1e68 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -976,6 +976,16 @@ config VM86 XFree86 to initialize some video cards via BIOS. Disabling this option saves about 6k. +config X86_IOPORT + bool "iopl and ioperm system calls" + default y + ---help--- + This option enables the iopl and ioperm system calls, which allow + privileged userspace processes to directly access I/O ports. This + is used by some legacy software to drive hardware directly from + userspace rather than via a proper kernel driver. Unless you intend + to run such software, you can safely say N here. + config TOSHIBA tristate "Toshiba Laptop support" depends on X86_32 diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 401f350..2e106d5 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -299,10 +299,12 @@ static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g) { PVOP_VCALL3(pv_cpu_ops.write_idt_entry, dt, entry, g); } +#ifdef CONFIG_X86_IOPORT static inline void set_iopl_mask(unsigned mask) { PVOP_VCALL1(pv_cpu_ops.set_iopl_mask, mask); } +#endif /* CONFIG_X86_IOPORT */ /* The paravirtualized I/O functions */ static inline void slow_down_io(void) diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index aab8f67..bc5bfeb 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -142,7 +142,9 @@ struct pv_cpu_ops { void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t); +#ifdef CONFIG_X86_IOPORT void (*set_iopl_mask)(unsigned mask); +#endif /* CONFIG_X86_IOPORT */ void (*wbinvd)(void); void (*io_delay)(void); diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 03d3003..48f6fbe 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -256,7 +256,11 @@ struct x86_hw_tss { /* * IO-bitmap sizes: */ +#ifdef CONFIG_X86_IOPORT #define IO_BITMAP_BITS 65536 +#else +#define IO_BITMAP_BITS 0 +#endif #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8) #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long)) #define INVALID_IO_BITMAP_OFFSET 0x8000 @@ -269,6 +273,7 @@ struct tss_struct { */ struct x86_hw_tss x86_tss; +#ifdef CONFIG_X86_IOPORT /* * The extra 1 is there because the CPU will access an * additional byte beyond the end of the IO permission @@ -276,6 +281,7 @@ struct tss_struct { * be within the limit. */ unsigned long io_bitmap[IO_BITMAP_LONGS + 1]; +#endif /* CONFIG_X86_IOPORT */ /* * .. and then another 0x100 bytes for the emergency kernel stack: @@ -286,6 +292,24 @@ struct tss_struct { DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss); +static inline void init_tss_io(struct tss_struct *t) +{ +#ifdef CONFIG_X86_IOPORT + int i; + + t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); + + /* + * <= is required because the CPU will access up to + * 8 bits beyond the end of the IO permission bitmap. + */ + for (i = 0; i <= IO_BITMAP_LONGS; i++) + t->io_bitmap[i] = ~0UL; +#else + t->x86_tss.io_bitmap_base = INVALID_IO_BITMAP_OFFSET; +#endif +} + /* * Save the original ist values for checking stack pointers during debugging */ @@ -484,13 +508,16 @@ struct thread_struct { unsigned int saved_fs; unsigned int saved_gs; #endif +#ifdef CONFIG_X86_IOPORT /* IO permissions: */ unsigned long *io_bitmap_ptr; unsigned long iopl; /* Max allowed port in the bitmap, in bytes: */ unsigned io_bitmap_max; +#endif /* CONFIG_X86_IOPORT */ }; +#ifdef CONFIG_X86_IOPORT /* * Set IOPL bits in EFLAGS from given mask */ @@ -509,6 +536,7 @@ static inline void native_set_iopl_mask(unsigned mask) : "i" (~X86_EFLAGS_IOPL), "r" (mask)); #endif } +#endif /* CONFIG_X86_IOPORT */ static inline void native_load_sp0(struct tss_struct *tss, struct thread_struct *thread) @@ -828,12 +856,8 @@ static inline void spin_lock_prefetch(const void *x) #define STACK_TOP TASK_SIZE #define STACK_TOP_MAX STACK_TOP -#define INIT_THREAD { \ - .sp0 = sizeof(init_stack) + (long)&init_stack, \ - .vm86_info = NULL, \ - .sysenter_cs = __KERNEL_CS, \ - .io_bitmap_ptr = NULL, \ -} +#ifdef CONFIG_X86_IOPORT +#define INIT_THREAD_IO .io_bitmap_ptr = NULL, /* * Note that the .io_bitmap member must be extra-big. This is because @@ -841,6 +865,19 @@ static inline void spin_lock_prefetch(const void *x) * permission bitmap. The extra byte must be all 1 bits, and must * be within the limit. */ +#define INIT_TSS_IO .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, +#else +#define INIT_THREAD_IO +#define INIT_TSS_IO +#endif + +#define INIT_THREAD { \ + .sp0 = sizeof(init_stack) + (long)&init_stack, \ + .vm86_info = NULL, \ + .sysenter_cs = __KERNEL_CS, \ + INIT_THREAD_IO \ +} + #define INIT_TSS { \ .x86_tss = { \ .sp0 = sizeof(init_stack) + (long)&init_stack, \ @@ -848,7 +885,7 @@ static inline void spin_lock_prefetch(const void *x) .ss1 = __KERNEL_CS, \ .io_bitmap_base = INVALID_IO_BITMAP_OFFSET, \ }, \ - .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, \ + INIT_TSS_IO \ } extern unsigned long thread_saved_pc(struct task_struct *tsk); diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h index 592a6a6..4db79ea 100644 --- a/arch/x86/include/asm/syscalls.h +++ b/arch/x86/include/asm/syscalls.h @@ -16,9 +16,12 @@ #include <linux/types.h> /* Common in X86_32 and X86_64 */ + +#ifdef CONFIG_X86_IOPORT /* kernel/ioport.c */ asmlinkage long sys_ioperm(unsigned long, unsigned long, int); asmlinkage long sys_iopl(unsigned int); +#endif /* CONFIG_X86_IOPORT */ /* kernel/ldt.c */ asmlinkage int sys_modify_ldt(int, void __user *, unsigned long); diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index a5408b9..9644737 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -20,7 +20,8 @@ CFLAGS_irq.o := -I$(src)/../include/asm/trace obj-y := process_$(BITS).o signal.o entry_$(BITS).o obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o -obj-y += time.o ioport.o ldt.o dumpstack.o nmi.o +obj-y += time.o ldt.o dumpstack.o nmi.o +obj-$(CONFIG_X86_IOPORT) += ioport.o obj-y += setup.o x86_init.o i8259.o irqinit.o jump_label.o obj-$(CONFIG_IRQ_WORK) += irq_work.o obj-y += probe_roms.o diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 2793d1f..01235ef 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1223,7 +1223,6 @@ void cpu_init(void) struct tss_struct *t; unsigned long v; int cpu; - int i; /* * Load microcode on this cpu if a valid microcode is available. @@ -1285,14 +1284,7 @@ void cpu_init(void) } } - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); - - /* - * <= is required because the CPU will access up to - * 8 bits beyond the end of the IO permission bitmap. - */ - for (i = 0; i <= IO_BITMAP_LONGS; i++) - t->io_bitmap[i] = ~0UL; + init_tss_io(t); atomic_inc(&init_mm.mm_count); me->active_mm = &init_mm; @@ -1351,7 +1343,7 @@ void cpu_init(void) load_TR_desc(); load_LDT(&init_mm.context); - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); + init_tss_io(t); #ifdef CONFIG_DOUBLEFAULT /* Set up doublefault TSS pointer in the GDT */ diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S index 1b69951..eb4f5ba 100644 --- a/arch/x86/kernel/entry_64.S +++ b/arch/x86/kernel/entry_64.S @@ -844,6 +844,11 @@ ENTRY(stub_\func) END(stub_\func) .endm + FORK_LIKE clone + FORK_LIKE fork + FORK_LIKE vfork + +#ifdef CONFIG_X86_IOPORT .macro FIXED_FRAME label,func ENTRY(\label) CFI_STARTPROC @@ -856,10 +861,8 @@ ENTRY(\label) END(\label) .endm - FORK_LIKE clone - FORK_LIKE fork - FORK_LIKE vfork FIXED_FRAME stub_iopl, sys_iopl +#endif /* CONFIG_X86_IOPORT */ ENTRY(ptregscall_common) DEFAULT_FRAME 1 8 /* offset 8: return address */ diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 1b10af8..46da3d9 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -382,7 +382,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = { .iret = native_iret, .swapgs = native_swapgs, +#ifdef CONFIG_X86_IOPORT .set_iopl_mask = native_set_iopl_mask, +#endif /* CONFIG_X86_IOPORT */ .io_delay = native_io_delay, .start_context_switch = paravirt_nop, diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h index d884444..3e773fa 100644 --- a/arch/x86/kernel/process-io.h +++ b/arch/x86/kernel/process-io.h @@ -1,9 +1,17 @@ #ifndef _X86_KERNEL_PROCESS_IO_H #define _X86_KERNEL_PROCESS_IO_H +static inline void clear_thread_io_bitmap(struct task_struct *p) +{ +#ifdef CONFIG_X86_IOPORT + p->thread.io_bitmap_ptr = NULL; +#endif /* CONFIG_X86_IOPORT */ +} + static inline int copy_io_bitmap(struct task_struct *me, struct task_struct *p) { +#ifdef CONFIG_X86_IOPORT if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) { p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr, IO_BITMAP_BYTES, GFP_KERNEL); @@ -15,8 +23,71 @@ static inline int copy_io_bitmap(struct task_struct *me, } else { p->thread.io_bitmap_ptr = NULL; } +#endif /* CONFIG_X86_IOPORT */ return 0; } +static inline void exit_thread_io(struct task_struct *me) +{ +#ifdef CONFIG_X86_IOPORT + struct thread_struct *t = &me->thread; + unsigned long *bp = t->io_bitmap_ptr; + + if (bp) { + struct tss_struct *tss = &per_cpu(init_tss, get_cpu()); + + t->io_bitmap_ptr = NULL; + clear_thread_flag(TIF_IO_BITMAP); + /* + * Careful, clear this in the TSS too: + */ + memset(tss->io_bitmap, 0xff, t->io_bitmap_max); + t->io_bitmap_max = 0; + put_cpu(); + kfree(bp); + } +#endif /* CONFIG_X86_IOPORT */ +} + +static inline void switch_iopl_mask(struct thread_struct *prev, + struct thread_struct *next) +{ +#ifdef CONFIG_X86_IOPORT + /* + * Restore IOPL if needed. In normal use, the flags restore + * in the switch assembly will handle this. But if the kernel + * is running virtualized at a non-zero CPL, the popf will + * not restore flags, so it must be done in a separate step. + */ + if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) + set_iopl_mask(next->iopl); +#endif /* CONFIG_X86_IOPORT */ +} + +static inline void switch_io_bitmap(struct tss_struct *tss, + struct task_struct *prev_p, + struct task_struct *next_p) +{ +#ifdef CONFIG_X86_IOPORT + struct thread_struct *prev, *next; + prev = &prev_p->thread; + next = &next_p->thread; + + if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) { + /* + * Copy the relevant range of the IO bitmap. + * Normally this is 128 bytes or less: + */ + memcpy(tss->io_bitmap, next->io_bitmap_ptr, + max(prev->io_bitmap_max, next->io_bitmap_max)); + } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) { + /* + * Clear any possible leftover bits: + */ + memset(tss->io_bitmap, 0xff, prev->io_bitmap_max); + } +#endif /* CONFIG_X86_IOPORT */ +} + #endif /* _X86_KERNEL_PROCESS_IO_H */ diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index c83516b..2df5b00 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -29,6 +29,8 @@ #include <asm/debugreg.h> #include <asm/nmi.h> +#include "process-io.h" + /* * per-CPU TSS segments. Threads are completely 'soft' on Linux, * no more per-task TSS's. The TSS size is kept cacheline-aligned @@ -101,23 +103,7 @@ void arch_task_cache_init(void) void exit_thread(void) { struct task_struct *me = current; - struct thread_struct *t = &me->thread; - unsigned long *bp = t->io_bitmap_ptr; - - if (bp) { - struct tss_struct *tss = &per_cpu(init_tss, get_cpu()); - - t->io_bitmap_ptr = NULL; - clear_thread_flag(TIF_IO_BITMAP); - /* - * Careful, clear this in the TSS too: - */ - memset(tss->io_bitmap, 0xff, t->io_bitmap_max); - t->io_bitmap_max = 0; - put_cpu(); - kfree(bp); - } - + exit_thread_io(me); drop_fpu(me); } @@ -222,19 +208,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, hard_enable_TSC(); } - if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) { - /* - * Copy the relevant range of the IO bitmap. - * Normally this is 128 bytes or less: - */ - memcpy(tss->io_bitmap, next->io_bitmap_ptr, - max(prev->io_bitmap_max, next->io_bitmap_max)); - } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) { - /* - * Clear any possible leftover bits: - */ - memset(tss->io_bitmap, 0xff, prev->io_bitmap_max); - } + switch_io_bitmap(tss, prev_p, next_p); propagate_user_return_notify(prev_p, next_p); } diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c index e86028b..dc780cb 100644 --- a/arch/x86/kernel/process_32.c +++ b/arch/x86/kernel/process_32.c @@ -155,7 +155,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, childregs->cs = __KERNEL_CS | get_kernel_rpl(); childregs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED; p->fpu_counter = 0; - p->thread.io_bitmap_ptr = NULL; + clear_thread_io_bitmap(p); memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps)); return 0; } @@ -269,14 +269,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) */ load_TLS(next, cpu); - /* - * Restore IOPL if needed. In normal use, the flags restore - * in the switch assembly will handle this. But if the kernel - * is running virtualized at a non-zero CPL, the popf will - * not restore flags, so it must be done in a separate step. - */ - if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) - set_iopl_mask(next->iopl); + switch_iopl_mask(prev, next); /* * Now maybe handle debug registers and/or IO bitmaps diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 7461f50..f071bfa 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -785,7 +785,11 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n, static int ioperm_active(struct task_struct *target, const struct user_regset *regset) { +#ifdef CONFIG_X86_IOPORT return target->thread.io_bitmap_max / regset->size; +#else + return 0; +#endif } static int ioperm_get(struct task_struct *target, @@ -793,12 +797,16 @@ static int ioperm_get(struct task_struct *target, unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf) { +#ifdef CONFIG_X86_IOPORT if (!target->thread.io_bitmap_ptr) return -ENXIO; return user_regset_copyout(&pos, &count, &kbuf, &ubuf, target->thread.io_bitmap_ptr, 0, IO_BITMAP_BYTES); +#else + return -ENXIO; +#endif } /* diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index fa6ade7..f485d92 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -911,6 +911,7 @@ static void xen_load_sp0(struct tss_struct *tss, xen_mc_issue(PARAVIRT_LAZY_CPU); } +#ifdef CONFIG_X86_IOPORT static void xen_set_iopl_mask(unsigned mask) { struct physdev_set_iopl set_iopl; @@ -919,6 +920,7 @@ static void xen_set_iopl_mask(unsigned mask) set_iopl.iopl = (mask == 0) ? 1 : (mask >> 12) & 3; HYPERVISOR_physdev_op(PHYSDEVOP_set_iopl, &set_iopl); } +#endif /* CONFIG_X86_IOPORT */ static void xen_io_delay(void) { @@ -1277,7 +1279,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = { .write_idt_entry = xen_write_idt_entry, .load_sp0 = xen_load_sp0, +#ifdef CONFIG_X86_IOPORT .set_iopl_mask = xen_set_iopl_mask, +#endif /* CONFIG_X86_IOPORT */ .io_delay = xen_io_delay, /* Xen takes care of %gs when switching to usermode for us */ diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c index 2bd78e2..7c7d405 100644 --- a/drivers/tty/vt/vt_ioctl.c +++ b/drivers/tty/vt/vt_ioctl.c @@ -410,7 +410,7 @@ int vt_ioctl(struct tty_struct *tty, * * XXX: you should never use these, just call ioperm directly.. */ -#ifdef CONFIG_X86 +#ifdef CONFIG_X86_IOPORT case KDADDIO: case KDDELIO: /* diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index 7078052..e7f4a35 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -209,3 +209,8 @@ cond_syscall(compat_sys_open_by_handle_at); /* compare kernel pointers */ cond_syscall(sys_kcmp); + +/* userspace I/O port access */ +cond_syscall(stub_iopl); +cond_syscall(sys_iopl); +cond_syscall(sys_ioperm); -- 1.8.4.rc3 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) 2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett @ 2013-10-26 3:17 ` Stephen Hemminger 2013-10-26 4:30 ` Kees Cook 2013-10-31 20:04 ` Alexander van Heukelum 1 sibling, 1 reply; 17+ messages in thread From: Stephen Hemminger @ 2013-10-26 3:17 UTC (permalink / raw) To: Josh Triplett Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Daniel Lezcano, Fenghua Yu, Frederic Weisbecker, H. Peter Anvin, x86, virtualization, Paul Gortmaker, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Boris Ostrovsky, Jesper Nilsson, Andi Kleen, Raghavendra K T, Ingo Molnar, xen-devel, Borislav Petkov, Len Brown, Kees Cook, Konrad Rzeszutek Wilk I/O from userspace is used to implement usermode virtio driver(s). This has been done independently by Intel, Brocade/Vyatta, and 6Wind. Sorry, it has to stay. On Mon, Oct 21, 2013 at 7:35 PM, Josh Triplett <josh@joshtriplett.org> wrote: > On the vast majority of modern systems, no processes will use the > userspsace I/O syscalls, iopl and ioperm. Add a new config option, > CONFIG_X86_IOPORT, to support configuring them out of the kernel > entirely. Since these syscalls only exist to support rare legacy > userspace programs, X86_IOPORT does not depend on EXPERT, though it does > still default to y. > > In addition to saving a significant amount of space, this also reduces > the size of several major kernel data structures, drops a bit of code > from several task-related hot paths, and reduces the attack surface of > the kernel. > > All of the task-related code dealing with userspace I/O permissions now > lives in process-io.h as several new inline functions, which become > no-ops when CONFIG_X86_IOPORT. > > bloat-o-meter shows a net reduction of 17681 bytes on 32-bit and 9719 > bytes on 64-bit: > > 32-bit bloat-o-meter: > add/remove: 0/3 grow/shrink: 0/10 up/down: 0/-17681 (-17681) > function old new delta > cpu_init 676 668 -8 > ioperm_active 18 7 -11 > init_task 1296 1284 -12 > exit_thread 179 91 -88 > ioperm_get 103 10 -93 > __switch_to_xtra 254 161 -93 > sys_iopl 127 - -127 > SyS_iopl 127 - -127 > copy_thread 606 446 -160 > vt_ioctl 4127 3919 -208 > sys_ioperm 370 - -370 > init_tss 8576 384 -8192 > doublefault_tss 8576 384 -8192 > > 64-bit bloat-o-meter: > add/remove: 0/4 grow/shrink: 2/9 up/down: 45/-9764 (-9719) > function old new delta > cpu_init 958 995 +37 > arch_align_stack 78 86 +8 > perf_event_exit_task 525 517 -8 > ioperm_active 17 8 -9 > init_task 1968 1944 -24 > stub_iopl 81 - -81 > ioperm_get 111 11 -100 > __switch_to_xtra 281 164 -117 > exit_thread 212 92 -120 > vt_ioctl 4432 4304 -128 > sys_iopl 137 - -137 > SyS_iopl 137 - -137 > copy_thread 694 520 -174 > sys_ioperm 473 - -473 > init_tss 8896 640 -8256 > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> > --- > arch/x86/Kconfig | 10 +++++ > arch/x86/include/asm/paravirt.h | 2 + > arch/x86/include/asm/paravirt_types.h | 2 + > arch/x86/include/asm/processor.h | 51 +++++++++++++++++++++---- > arch/x86/include/asm/syscalls.h | 3 ++ > arch/x86/kernel/Makefile | 3 +- > arch/x86/kernel/cpu/common.c | 12 +----- > arch/x86/kernel/entry_64.S | 9 +++-- > arch/x86/kernel/paravirt.c | 2 + > arch/x86/kernel/process-io.h | 71 +++++++++++++++++++++++++++++++++++ > arch/x86/kernel/process.c | 34 ++--------------- > arch/x86/kernel/process_32.c | 11 +----- > arch/x86/kernel/ptrace.c | 8 ++++ > arch/x86/xen/enlighten.c | 4 ++ > drivers/tty/vt/vt_ioctl.c | 2 +- > kernel/sys_ni.c | 5 +++ > 16 files changed, 168 insertions(+), 61 deletions(-) > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index e241a19..d5b1e68 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -976,6 +976,16 @@ config VM86 > XFree86 to initialize some video cards via BIOS. Disabling this > option saves about 6k. > > +config X86_IOPORT > + bool "iopl and ioperm system calls" > + default y > + ---help--- > + This option enables the iopl and ioperm system calls, which allow > + privileged userspace processes to directly access I/O ports. This > + is used by some legacy software to drive hardware directly from > + userspace rather than via a proper kernel driver. Unless you intend > + to run such software, you can safely say N here. > + > config TOSHIBA > tristate "Toshiba Laptop support" > depends on X86_32 > diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h > index 401f350..2e106d5 100644 > --- a/arch/x86/include/asm/paravirt.h > +++ b/arch/x86/include/asm/paravirt.h > @@ -299,10 +299,12 @@ static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g) > { > PVOP_VCALL3(pv_cpu_ops.write_idt_entry, dt, entry, g); > } > +#ifdef CONFIG_X86_IOPORT > static inline void set_iopl_mask(unsigned mask) > { > PVOP_VCALL1(pv_cpu_ops.set_iopl_mask, mask); > } > +#endif /* CONFIG_X86_IOPORT */ > > /* The paravirtualized I/O functions */ > static inline void slow_down_io(void) > diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h > index aab8f67..bc5bfeb 100644 > --- a/arch/x86/include/asm/paravirt_types.h > +++ b/arch/x86/include/asm/paravirt_types.h > @@ -142,7 +142,9 @@ struct pv_cpu_ops { > > void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t); > > +#ifdef CONFIG_X86_IOPORT > void (*set_iopl_mask)(unsigned mask); > +#endif /* CONFIG_X86_IOPORT */ > > void (*wbinvd)(void); > void (*io_delay)(void); > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h > index 03d3003..48f6fbe 100644 > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -256,7 +256,11 @@ struct x86_hw_tss { > /* > * IO-bitmap sizes: > */ > +#ifdef CONFIG_X86_IOPORT > #define IO_BITMAP_BITS 65536 > +#else > +#define IO_BITMAP_BITS 0 > +#endif > #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8) > #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long)) > #define INVALID_IO_BITMAP_OFFSET 0x8000 > @@ -269,6 +273,7 @@ struct tss_struct { > */ > struct x86_hw_tss x86_tss; > > +#ifdef CONFIG_X86_IOPORT > /* > * The extra 1 is there because the CPU will access an > * additional byte beyond the end of the IO permission > @@ -276,6 +281,7 @@ struct tss_struct { > * be within the limit. > */ > unsigned long io_bitmap[IO_BITMAP_LONGS + 1]; > +#endif /* CONFIG_X86_IOPORT */ > > /* > * .. and then another 0x100 bytes for the emergency kernel stack: > @@ -286,6 +292,24 @@ struct tss_struct { > > DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss); > > +static inline void init_tss_io(struct tss_struct *t) > +{ > +#ifdef CONFIG_X86_IOPORT > + int i; > + > + t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); > + > + /* > + * <= is required because the CPU will access up to > + * 8 bits beyond the end of the IO permission bitmap. > + */ > + for (i = 0; i <= IO_BITMAP_LONGS; i++) > + t->io_bitmap[i] = ~0UL; > +#else > + t->x86_tss.io_bitmap_base = INVALID_IO_BITMAP_OFFSET; > +#endif > +} > + > /* > * Save the original ist values for checking stack pointers during debugging > */ > @@ -484,13 +508,16 @@ struct thread_struct { > unsigned int saved_fs; > unsigned int saved_gs; > #endif > +#ifdef CONFIG_X86_IOPORT > /* IO permissions: */ > unsigned long *io_bitmap_ptr; > unsigned long iopl; > /* Max allowed port in the bitmap, in bytes: */ > unsigned io_bitmap_max; > +#endif /* CONFIG_X86_IOPORT */ > }; > > +#ifdef CONFIG_X86_IOPORT > /* > * Set IOPL bits in EFLAGS from given mask > */ > @@ -509,6 +536,7 @@ static inline void native_set_iopl_mask(unsigned mask) > : "i" (~X86_EFLAGS_IOPL), "r" (mask)); > #endif > } > +#endif /* CONFIG_X86_IOPORT */ > > static inline void > native_load_sp0(struct tss_struct *tss, struct thread_struct *thread) > @@ -828,12 +856,8 @@ static inline void spin_lock_prefetch(const void *x) > #define STACK_TOP TASK_SIZE > #define STACK_TOP_MAX STACK_TOP > > -#define INIT_THREAD { \ > - .sp0 = sizeof(init_stack) + (long)&init_stack, \ > - .vm86_info = NULL, \ > - .sysenter_cs = __KERNEL_CS, \ > - .io_bitmap_ptr = NULL, \ > -} > +#ifdef CONFIG_X86_IOPORT > +#define INIT_THREAD_IO .io_bitmap_ptr = NULL, > > /* > * Note that the .io_bitmap member must be extra-big. This is because > @@ -841,6 +865,19 @@ static inline void spin_lock_prefetch(const void *x) > * permission bitmap. The extra byte must be all 1 bits, and must > * be within the limit. > */ > +#define INIT_TSS_IO .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, > +#else > +#define INIT_THREAD_IO > +#define INIT_TSS_IO > +#endif > + > +#define INIT_THREAD { \ > + .sp0 = sizeof(init_stack) + (long)&init_stack, \ > + .vm86_info = NULL, \ > + .sysenter_cs = __KERNEL_CS, \ > + INIT_THREAD_IO \ > +} > + > #define INIT_TSS { \ > .x86_tss = { \ > .sp0 = sizeof(init_stack) + (long)&init_stack, \ > @@ -848,7 +885,7 @@ static inline void spin_lock_prefetch(const void *x) > .ss1 = __KERNEL_CS, \ > .io_bitmap_base = INVALID_IO_BITMAP_OFFSET, \ > }, \ > - .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, \ > + INIT_TSS_IO \ > } > > extern unsigned long thread_saved_pc(struct task_struct *tsk); > diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h > index 592a6a6..4db79ea 100644 > --- a/arch/x86/include/asm/syscalls.h > +++ b/arch/x86/include/asm/syscalls.h > @@ -16,9 +16,12 @@ > #include <linux/types.h> > > /* Common in X86_32 and X86_64 */ > + > +#ifdef CONFIG_X86_IOPORT > /* kernel/ioport.c */ > asmlinkage long sys_ioperm(unsigned long, unsigned long, int); > asmlinkage long sys_iopl(unsigned int); > +#endif /* CONFIG_X86_IOPORT */ > > /* kernel/ldt.c */ > asmlinkage int sys_modify_ldt(int, void __user *, unsigned long); > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile > index a5408b9..9644737 100644 > --- a/arch/x86/kernel/Makefile > +++ b/arch/x86/kernel/Makefile > @@ -20,7 +20,8 @@ CFLAGS_irq.o := -I$(src)/../include/asm/trace > > obj-y := process_$(BITS).o signal.o entry_$(BITS).o > obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o > -obj-y += time.o ioport.o ldt.o dumpstack.o nmi.o > +obj-y += time.o ldt.o dumpstack.o nmi.o > +obj-$(CONFIG_X86_IOPORT) += ioport.o > obj-y += setup.o x86_init.o i8259.o irqinit.o jump_label.o > obj-$(CONFIG_IRQ_WORK) += irq_work.o > obj-y += probe_roms.o > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > index 2793d1f..01235ef 100644 > --- a/arch/x86/kernel/cpu/common.c > +++ b/arch/x86/kernel/cpu/common.c > @@ -1223,7 +1223,6 @@ void cpu_init(void) > struct tss_struct *t; > unsigned long v; > int cpu; > - int i; > > /* > * Load microcode on this cpu if a valid microcode is available. > @@ -1285,14 +1284,7 @@ void cpu_init(void) > } > } > > - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); > - > - /* > - * <= is required because the CPU will access up to > - * 8 bits beyond the end of the IO permission bitmap. > - */ > - for (i = 0; i <= IO_BITMAP_LONGS; i++) > - t->io_bitmap[i] = ~0UL; > + init_tss_io(t); > > atomic_inc(&init_mm.mm_count); > me->active_mm = &init_mm; > @@ -1351,7 +1343,7 @@ void cpu_init(void) > load_TR_desc(); > load_LDT(&init_mm.context); > > - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); > + init_tss_io(t); > > #ifdef CONFIG_DOUBLEFAULT > /* Set up doublefault TSS pointer in the GDT */ > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > index 1b69951..eb4f5ba 100644 > --- a/arch/x86/kernel/entry_64.S > +++ b/arch/x86/kernel/entry_64.S > @@ -844,6 +844,11 @@ ENTRY(stub_\func) > END(stub_\func) > .endm > > + FORK_LIKE clone > + FORK_LIKE fork > + FORK_LIKE vfork > + > +#ifdef CONFIG_X86_IOPORT > .macro FIXED_FRAME label,func > ENTRY(\label) > CFI_STARTPROC > @@ -856,10 +861,8 @@ ENTRY(\label) > END(\label) > .endm > > - FORK_LIKE clone > - FORK_LIKE fork > - FORK_LIKE vfork > FIXED_FRAME stub_iopl, sys_iopl > +#endif /* CONFIG_X86_IOPORT */ > > ENTRY(ptregscall_common) > DEFAULT_FRAME 1 8 /* offset 8: return address */ > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c > index 1b10af8..46da3d9 100644 > --- a/arch/x86/kernel/paravirt.c > +++ b/arch/x86/kernel/paravirt.c > @@ -382,7 +382,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = { > .iret = native_iret, > .swapgs = native_swapgs, > > +#ifdef CONFIG_X86_IOPORT > .set_iopl_mask = native_set_iopl_mask, > +#endif /* CONFIG_X86_IOPORT */ > .io_delay = native_io_delay, > > .start_context_switch = paravirt_nop, > diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h > index d884444..3e773fa 100644 > --- a/arch/x86/kernel/process-io.h > +++ b/arch/x86/kernel/process-io.h > @@ -1,9 +1,17 @@ > #ifndef _X86_KERNEL_PROCESS_IO_H > #define _X86_KERNEL_PROCESS_IO_H > > +static inline void clear_thread_io_bitmap(struct task_struct *p) > +{ > +#ifdef CONFIG_X86_IOPORT > + p->thread.io_bitmap_ptr = NULL; > +#endif /* CONFIG_X86_IOPORT */ > +} > + > static inline int copy_io_bitmap(struct task_struct *me, > struct task_struct *p) > { > +#ifdef CONFIG_X86_IOPORT > if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) { > p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr, > IO_BITMAP_BYTES, GFP_KERNEL); > @@ -15,8 +23,71 @@ static inline int copy_io_bitmap(struct task_struct *me, > } else { > p->thread.io_bitmap_ptr = NULL; > } > +#endif /* CONFIG_X86_IOPORT */ > > return 0; > } > > +static inline void exit_thread_io(struct task_struct *me) > +{ > +#ifdef CONFIG_X86_IOPORT > + struct thread_struct *t = &me->thread; > + unsigned long *bp = t->io_bitmap_ptr; > + > + if (bp) { > + struct tss_struct *tss = &per_cpu(init_tss, get_cpu()); > + > + t->io_bitmap_ptr = NULL; > + clear_thread_flag(TIF_IO_BITMAP); > + /* > + * Careful, clear this in the TSS too: > + */ > + memset(tss->io_bitmap, 0xff, t->io_bitmap_max); > + t->io_bitmap_max = 0; > + put_cpu(); > + kfree(bp); > + } > +#endif /* CONFIG_X86_IOPORT */ > +} > + > +static inline void switch_iopl_mask(struct thread_struct *prev, > + struct thread_struct *next) > +{ > +#ifdef CONFIG_X86_IOPORT > + /* > + * Restore IOPL if needed. In normal use, the flags restore > + * in the switch assembly will handle this. But if the kernel > + * is running virtualized at a non-zero CPL, the popf will > + * not restore flags, so it must be done in a separate step. > + */ > + if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) > + set_iopl_mask(next->iopl); > +#endif /* CONFIG_X86_IOPORT */ > +} > + > +static inline void switch_io_bitmap(struct tss_struct *tss, > + struct task_struct *prev_p, > + struct task_struct *next_p) > +{ > +#ifdef CONFIG_X86_IOPORT > + struct thread_struct *prev, *next; > + prev = &prev_p->thread; > + next = &next_p->thread; > + > + if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) { > + /* > + * Copy the relevant range of the IO bitmap. > + * Normally this is 128 bytes or less: > + */ > + memcpy(tss->io_bitmap, next->io_bitmap_ptr, > + max(prev->io_bitmap_max, next->io_bitmap_max)); > + } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) { > + /* > + * Clear any possible leftover bits: > + */ > + memset(tss->io_bitmap, 0xff, prev->io_bitmap_max); > + } > +#endif /* CONFIG_X86_IOPORT */ > +} > + > #endif /* _X86_KERNEL_PROCESS_IO_H */ > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c > index c83516b..2df5b00 100644 > --- a/arch/x86/kernel/process.c > +++ b/arch/x86/kernel/process.c > @@ -29,6 +29,8 @@ > #include <asm/debugreg.h> > #include <asm/nmi.h> > > +#include "process-io.h" > + > /* > * per-CPU TSS segments. Threads are completely 'soft' on Linux, > * no more per-task TSS's. The TSS size is kept cacheline-aligned > @@ -101,23 +103,7 @@ void arch_task_cache_init(void) > void exit_thread(void) > { > struct task_struct *me = current; > - struct thread_struct *t = &me->thread; > - unsigned long *bp = t->io_bitmap_ptr; > - > - if (bp) { > - struct tss_struct *tss = &per_cpu(init_tss, get_cpu()); > - > - t->io_bitmap_ptr = NULL; > - clear_thread_flag(TIF_IO_BITMAP); > - /* > - * Careful, clear this in the TSS too: > - */ > - memset(tss->io_bitmap, 0xff, t->io_bitmap_max); > - t->io_bitmap_max = 0; > - put_cpu(); > - kfree(bp); > - } > - > + exit_thread_io(me); > drop_fpu(me); > } > > @@ -222,19 +208,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, > hard_enable_TSC(); > } > > - if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) { > - /* > - * Copy the relevant range of the IO bitmap. > - * Normally this is 128 bytes or less: > - */ > - memcpy(tss->io_bitmap, next->io_bitmap_ptr, > - max(prev->io_bitmap_max, next->io_bitmap_max)); > - } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) { > - /* > - * Clear any possible leftover bits: > - */ > - memset(tss->io_bitmap, 0xff, prev->io_bitmap_max); > - } > + switch_io_bitmap(tss, prev_p, next_p); > propagate_user_return_notify(prev_p, next_p); > } > > diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c > index e86028b..dc780cb 100644 > --- a/arch/x86/kernel/process_32.c > +++ b/arch/x86/kernel/process_32.c > @@ -155,7 +155,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, > childregs->cs = __KERNEL_CS | get_kernel_rpl(); > childregs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED; > p->fpu_counter = 0; > - p->thread.io_bitmap_ptr = NULL; > + clear_thread_io_bitmap(p); > memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps)); > return 0; > } > @@ -269,14 +269,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) > */ > load_TLS(next, cpu); > > - /* > - * Restore IOPL if needed. In normal use, the flags restore > - * in the switch assembly will handle this. But if the kernel > - * is running virtualized at a non-zero CPL, the popf will > - * not restore flags, so it must be done in a separate step. > - */ > - if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) > - set_iopl_mask(next->iopl); > + switch_iopl_mask(prev, next); > > /* > * Now maybe handle debug registers and/or IO bitmaps > diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c > index 7461f50..f071bfa 100644 > --- a/arch/x86/kernel/ptrace.c > +++ b/arch/x86/kernel/ptrace.c > @@ -785,7 +785,11 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n, > static int ioperm_active(struct task_struct *target, > const struct user_regset *regset) > { > +#ifdef CONFIG_X86_IOPORT > return target->thread.io_bitmap_max / regset->size; > +#else > + return 0; > +#endif > } > > static int ioperm_get(struct task_struct *target, > @@ -793,12 +797,16 @@ static int ioperm_get(struct task_struct *target, > unsigned int pos, unsigned int count, > void *kbuf, void __user *ubuf) > { > +#ifdef CONFIG_X86_IOPORT > if (!target->thread.io_bitmap_ptr) > return -ENXIO; > > return user_regset_copyout(&pos, &count, &kbuf, &ubuf, > target->thread.io_bitmap_ptr, > 0, IO_BITMAP_BYTES); > +#else > + return -ENXIO; > +#endif > } > > /* > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index fa6ade7..f485d92 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -911,6 +911,7 @@ static void xen_load_sp0(struct tss_struct *tss, > xen_mc_issue(PARAVIRT_LAZY_CPU); > } > > +#ifdef CONFIG_X86_IOPORT > static void xen_set_iopl_mask(unsigned mask) > { > struct physdev_set_iopl set_iopl; > @@ -919,6 +920,7 @@ static void xen_set_iopl_mask(unsigned mask) > set_iopl.iopl = (mask == 0) ? 1 : (mask >> 12) & 3; > HYPERVISOR_physdev_op(PHYSDEVOP_set_iopl, &set_iopl); > } > +#endif /* CONFIG_X86_IOPORT */ > > static void xen_io_delay(void) > { > @@ -1277,7 +1279,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = { > .write_idt_entry = xen_write_idt_entry, > .load_sp0 = xen_load_sp0, > > +#ifdef CONFIG_X86_IOPORT > .set_iopl_mask = xen_set_iopl_mask, > +#endif /* CONFIG_X86_IOPORT */ > .io_delay = xen_io_delay, > > /* Xen takes care of %gs when switching to usermode for us */ > diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c > index 2bd78e2..7c7d405 100644 > --- a/drivers/tty/vt/vt_ioctl.c > +++ b/drivers/tty/vt/vt_ioctl.c > @@ -410,7 +410,7 @@ int vt_ioctl(struct tty_struct *tty, > * > * XXX: you should never use these, just call ioperm directly.. > */ > -#ifdef CONFIG_X86 > +#ifdef CONFIG_X86_IOPORT > case KDADDIO: > case KDDELIO: > /* > diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c > index 7078052..e7f4a35 100644 > --- a/kernel/sys_ni.c > +++ b/kernel/sys_ni.c > @@ -209,3 +209,8 @@ cond_syscall(compat_sys_open_by_handle_at); > > /* compare kernel pointers */ > cond_syscall(sys_kcmp); > + > +/* userspace I/O port access */ > +cond_syscall(stub_iopl); > +cond_syscall(sys_iopl); > +cond_syscall(sys_ioperm); > -- > 1.8.4.rc3 > > _______________________________________________ > Virtualization mailing list > Virtualization@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) 2013-10-26 3:17 ` Stephen Hemminger @ 2013-10-26 4:30 ` Kees Cook 0 siblings, 0 replies; 17+ messages in thread From: Kees Cook @ 2013-10-26 4:30 UTC (permalink / raw) To: Stephen Hemminger Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Daniel Lezcano, Fenghua Yu, Frederic Weisbecker, David Herrmann, x86@kernel.org, virtualization@lists.linux-foundation.org, Paul Gortmaker, H. Peter Anvin, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Boris Ostrovsky, Jesper Nilsson, Andi Kleen, Raghavendra K T, Ingo Molnar, xen-devel, Borislav Petkov, Len Brown, Konrad On Sat, Oct 26, 2013 at 4:17 AM, Stephen Hemminger <stephen@networkplumber.org> wrote: > I/O from userspace is used to implement usermode virtio driver(s). > This has been done independently by Intel, Brocade/Vyatta, and 6Wind. > Sorry, it has to stay. This isn't about removing it, it's about putting it behind a default=y CONFIG. At the very least, the cleanups and 32/64 merging look sensible. As for the CONFIG, I think it'd be useful to have that, since some more specialized users may not want IO ports at all (some folks see IO ports as dangerous since they are a potential way to make a jump from uid-0 to ring-0). -Kees -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) 2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett 2013-10-26 3:17 ` Stephen Hemminger @ 2013-10-31 20:04 ` Alexander van Heukelum 2013-11-01 17:19 ` Josh Triplett 1 sibling, 1 reply; 17+ messages in thread From: Alexander van Heukelum @ 2013-10-31 20:04 UTC (permalink / raw) To: Josh Triplett, linux-kernel, virtualization, x86, xen-devel Cc: Jeremy Fitzhardinge, Raghavendra K T, Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris Metcalf, Chris Wright Hi Josh, On Tue, Oct 22, 2013, at 3:35, Josh Triplett wrote: > On the vast majority of modern systems, no processes will use the > userspsace I/O syscalls, iopl and ioperm. Add a new config option, > CONFIG_X86_IOPORT, to support configuring them out of the kernel > entirely. Since these syscalls only exist to support rare legacy > userspace programs, X86_IOPORT does not depend on EXPERT, though it does > still default to y. > > In addition to saving a significant amount of space, this also reduces > the size of several major kernel data structures, drops a bit of code > from several task-related hot paths, and reduces the attack surface of > the kernel. > > All of the task-related code dealing with userspace I/O permissions now > lives in process-io.h as several new inline functions, which become > no-ops when CONFIG_X86_IOPORT. > > bloat-o-meter shows a net reduction of 17681 bytes on 32-bit and 9719 > bytes on 64-bit: > > 32-bit bloat-o-meter: > add/remove: 0/3 grow/shrink: 0/10 up/down: 0/-17681 (-17681) > function old new delta > cpu_init 676 668 -8 > ioperm_active 18 7 -11 > init_task 1296 1284 -12 > exit_thread 179 91 -88 > ioperm_get 103 10 -93 > __switch_to_xtra 254 161 -93 > sys_iopl 127 - -127 > SyS_iopl 127 - -127 These are not really different ;) > copy_thread 606 446 -160 > vt_ioctl 4127 3919 -208 > sys_ioperm 370 - -370 > init_tss 8576 384 -8192 > doublefault_tss 8576 384 -8192 > > 64-bit bloat-o-meter: > add/remove: 0/4 grow/shrink: 2/9 up/down: 45/-9764 (-9719) > function old new delta > cpu_init 958 995 +37 > arch_align_stack 78 86 +8 > perf_event_exit_task 525 517 -8 > ioperm_active 17 8 -9 > init_task 1968 1944 -24 > stub_iopl 81 - -81 > ioperm_get 111 11 -100 > __switch_to_xtra 281 164 -117 > exit_thread 212 92 -120 > vt_ioctl 4432 4304 -128 > sys_iopl 137 - -137 > SyS_iopl 137 - -137 > copy_thread 694 520 -174 > sys_ioperm 473 - -473 > init_tss 8896 640 -8256 > > Signed-off-by: Josh Triplett <josh@joshtriplett.org> > --- > arch/x86/Kconfig | 10 +++++ > arch/x86/include/asm/paravirt.h | 2 + > arch/x86/include/asm/paravirt_types.h | 2 + > arch/x86/include/asm/processor.h | 51 +++++++++++++++++++++---- > arch/x86/include/asm/syscalls.h | 3 ++ > arch/x86/kernel/Makefile | 3 +- > arch/x86/kernel/cpu/common.c | 12 +----- > arch/x86/kernel/entry_64.S | 9 +++-- > arch/x86/kernel/paravirt.c | 2 + > arch/x86/kernel/process-io.h | 71 +++++++++++++++++++++++++++++++++++ > arch/x86/kernel/process.c | 34 ++--------------- > arch/x86/kernel/process_32.c | 11 +----- > arch/x86/kernel/ptrace.c | 8 ++++ > arch/x86/xen/enlighten.c | 4 ++ > drivers/tty/vt/vt_ioctl.c | 2 +- > kernel/sys_ni.c | 5 +++ > 16 files changed, 168 insertions(+), 61 deletions(-) > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index e241a19..d5b1e68 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -976,6 +976,16 @@ config VM86 > XFree86 to initialize some video cards via BIOS. Disabling this > option saves about 6k. > > +config X86_IOPORT > + bool "iopl and ioperm system calls" > + default y > + ---help--- > + This option enables the iopl and ioperm system calls, which allow > + privileged userspace processes to directly access I/O ports. This > + is used by some legacy software to drive hardware directly from > + userspace rather than via a proper kernel driver. Unless you intend > + to run such software, you can safely say N here. > + I think this entry should be under General setup / Configure standard kernel features (expert users). Remove references to "legacy" and "proper driver". > config TOSHIBA > tristate "Toshiba Laptop support" > depends on X86_32 > diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h > index 401f350..2e106d5 100644 > --- a/arch/x86/include/asm/paravirt.h > +++ b/arch/x86/include/asm/paravirt.h > @@ -299,10 +299,12 @@ static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g) > { > PVOP_VCALL3(pv_cpu_ops.write_idt_entry, dt, entry, g); > } > +#ifdef CONFIG_X86_IOPORT > static inline void set_iopl_mask(unsigned mask) > { > PVOP_VCALL1(pv_cpu_ops.set_iopl_mask, mask); > } > +#endif /* CONFIG_X86_IOPORT */ > > /* The paravirtualized I/O functions */ > static inline void slow_down_io(void) > diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h > index aab8f67..bc5bfeb 100644 > --- a/arch/x86/include/asm/paravirt_types.h > +++ b/arch/x86/include/asm/paravirt_types.h > @@ -142,7 +142,9 @@ struct pv_cpu_ops { > > void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t); > > +#ifdef CONFIG_X86_IOPORT > void (*set_iopl_mask)(unsigned mask); > +#endif /* CONFIG_X86_IOPORT */ > > void (*wbinvd)(void); > void (*io_delay)(void); > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h > index 03d3003..48f6fbe 100644 > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -256,7 +256,11 @@ struct x86_hw_tss { > /* > * IO-bitmap sizes: > */ > +#ifdef CONFIG_X86_IOPORT > #define IO_BITMAP_BITS 65536 > +#else > +#define IO_BITMAP_BITS 0 > +#endif > #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8) > #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long)) > #define INVALID_IO_BITMAP_OFFSET 0x8000 > @@ -269,6 +273,7 @@ struct tss_struct { > */ > struct x86_hw_tss x86_tss; > > +#ifdef CONFIG_X86_IOPORT > /* > * The extra 1 is there because the CPU will access an > * additional byte beyond the end of the IO permission > @@ -276,6 +281,7 @@ struct tss_struct { > * be within the limit. > */ > unsigned long io_bitmap[IO_BITMAP_LONGS + 1]; > +#endif /* CONFIG_X86_IOPORT */ > > /* > * .. and then another 0x100 bytes for the emergency kernel stack: > @@ -286,6 +292,24 @@ struct tss_struct { > > DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss); > > +static inline void init_tss_io(struct tss_struct *t) > +{ > +#ifdef CONFIG_X86_IOPORT > + int i; > + > + t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); > + > + /* > + * <= is required because the CPU will access up to > + * 8 bits beyond the end of the IO permission bitmap. > + */ > + for (i = 0; i <= IO_BITMAP_LONGS; i++) > + t->io_bitmap[i] = ~0UL; > +#else > + t->x86_tss.io_bitmap_base = INVALID_IO_BITMAP_OFFSET; > +#endif > +} > + > /* > * Save the original ist values for checking stack pointers during debugging > */ > @@ -484,13 +508,16 @@ struct thread_struct { > unsigned int saved_fs; > unsigned int saved_gs; > #endif > +#ifdef CONFIG_X86_IOPORT > /* IO permissions: */ > unsigned long *io_bitmap_ptr; > unsigned long iopl; > /* Max allowed port in the bitmap, in bytes: */ > unsigned io_bitmap_max; > +#endif /* CONFIG_X86_IOPORT */ > }; > > +#ifdef CONFIG_X86_IOPORT > /* > * Set IOPL bits in EFLAGS from given mask > */ > @@ -509,6 +536,7 @@ static inline void native_set_iopl_mask(unsigned mask) > : "i" (~X86_EFLAGS_IOPL), "r" (mask)); > #endif > } > +#endif /* CONFIG_X86_IOPORT */ > > static inline void > native_load_sp0(struct tss_struct *tss, struct thread_struct *thread) > @@ -828,12 +856,8 @@ static inline void spin_lock_prefetch(const void *x) > #define STACK_TOP TASK_SIZE > #define STACK_TOP_MAX STACK_TOP > > -#define INIT_THREAD { \ > - .sp0 = sizeof(init_stack) + (long)&init_stack, \ > - .vm86_info = NULL, \ > - .sysenter_cs = __KERNEL_CS, \ > - .io_bitmap_ptr = NULL, \ > -} > +#ifdef CONFIG_X86_IOPORT > +#define INIT_THREAD_IO .io_bitmap_ptr = NULL, > > /* > * Note that the .io_bitmap member must be extra-big. This is because > @@ -841,6 +865,19 @@ static inline void spin_lock_prefetch(const void *x) > * permission bitmap. The extra byte must be all 1 bits, and must > * be within the limit. > */ > +#define INIT_TSS_IO .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, > +#else > +#define INIT_THREAD_IO > +#define INIT_TSS_IO > +#endif > + > +#define INIT_THREAD { \ > + .sp0 = sizeof(init_stack) + (long)&init_stack, \ > + .vm86_info = NULL, \ > + .sysenter_cs = __KERNEL_CS, \ > + INIT_THREAD_IO \ > +} > + > #define INIT_TSS { \ > .x86_tss = { \ > .sp0 = sizeof(init_stack) + (long)&init_stack, \ > @@ -848,7 +885,7 @@ static inline void spin_lock_prefetch(const void *x) > .ss1 = __KERNEL_CS, \ > .io_bitmap_base = INVALID_IO_BITMAP_OFFSET, \ > }, \ > - .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, \ > + INIT_TSS_IO \ > } > > extern unsigned long thread_saved_pc(struct task_struct *tsk); > diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h > index 592a6a6..4db79ea 100644 > --- a/arch/x86/include/asm/syscalls.h > +++ b/arch/x86/include/asm/syscalls.h > @@ -16,9 +16,12 @@ > #include <linux/types.h> > > /* Common in X86_32 and X86_64 */ > + > +#ifdef CONFIG_X86_IOPORT > /* kernel/ioport.c */ > asmlinkage long sys_ioperm(unsigned long, unsigned long, int); > asmlinkage long sys_iopl(unsigned int); > +#endif /* CONFIG_X86_IOPORT */ > > /* kernel/ldt.c */ > asmlinkage int sys_modify_ldt(int, void __user *, unsigned long); > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile > index a5408b9..9644737 100644 > --- a/arch/x86/kernel/Makefile > +++ b/arch/x86/kernel/Makefile > @@ -20,7 +20,8 @@ CFLAGS_irq.o := -I$(src)/../include/asm/trace > > obj-y := process_$(BITS).o signal.o entry_$(BITS).o > obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o > -obj-y += time.o ioport.o ldt.o dumpstack.o nmi.o > +obj-y += time.o ldt.o dumpstack.o nmi.o > +obj-$(CONFIG_X86_IOPORT) += ioport.o > obj-y += setup.o x86_init.o i8259.o irqinit.o jump_label.o > obj-$(CONFIG_IRQ_WORK) += irq_work.o > obj-y += probe_roms.o > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > index 2793d1f..01235ef 100644 > --- a/arch/x86/kernel/cpu/common.c > +++ b/arch/x86/kernel/cpu/common.c > @@ -1223,7 +1223,6 @@ void cpu_init(void) > struct tss_struct *t; > unsigned long v; > int cpu; > - int i; > > /* > * Load microcode on this cpu if a valid microcode is available. > @@ -1285,14 +1284,7 @@ void cpu_init(void) > } > } > > - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); > - > - /* > - * <= is required because the CPU will access up to > - * 8 bits beyond the end of the IO permission bitmap. > - */ > - for (i = 0; i <= IO_BITMAP_LONGS; i++) > - t->io_bitmap[i] = ~0UL; > + init_tss_io(t); > > atomic_inc(&init_mm.mm_count); > me->active_mm = &init_mm; > @@ -1351,7 +1343,7 @@ void cpu_init(void) > load_TR_desc(); > load_LDT(&init_mm.context); > > - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); > + init_tss_io(t); This patch is too big. I think it would all look nicer if you added ioport.c in one patch, and then convert the users in a separate patch? > #ifdef CONFIG_DOUBLEFAULT > /* Set up doublefault TSS pointer in the GDT */ > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > index 1b69951..eb4f5ba 100644 > --- a/arch/x86/kernel/entry_64.S > +++ b/arch/x86/kernel/entry_64.S > @@ -844,6 +844,11 @@ ENTRY(stub_\func) > END(stub_\func) > .endm > > + FORK_LIKE clone > + FORK_LIKE fork > + FORK_LIKE vfork > + > +#ifdef CONFIG_X86_IOPORT > .macro FIXED_FRAME label,func > ENTRY(\label) > CFI_STARTPROC > @@ -856,10 +861,8 @@ ENTRY(\label) > END(\label) > .endm > > - FORK_LIKE clone > - FORK_LIKE fork > - FORK_LIKE vfork > FIXED_FRAME stub_iopl, sys_iopl > +#endif /* CONFIG_X86_IOPORT */ > > ENTRY(ptregscall_common) > DEFAULT_FRAME 1 8 /* offset 8: return address */ > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c > index 1b10af8..46da3d9 100644 > --- a/arch/x86/kernel/paravirt.c > +++ b/arch/x86/kernel/paravirt.c > @@ -382,7 +382,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = { > .iret = native_iret, > .swapgs = native_swapgs, > > +#ifdef CONFIG_X86_IOPORT > .set_iopl_mask = native_set_iopl_mask, > +#endif /* CONFIG_X86_IOPORT */ > .io_delay = native_io_delay, > > .start_context_switch = paravirt_nop, > diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h > index d884444..3e773fa 100644 > --- a/arch/x86/kernel/process-io.h > +++ b/arch/x86/kernel/process-io.h > @@ -1,9 +1,17 @@ > #ifndef _X86_KERNEL_PROCESS_IO_H > #define _X86_KERNEL_PROCESS_IO_H > > +static inline void clear_thread_io_bitmap(struct task_struct *p) > +{ > +#ifdef CONFIG_X86_IOPORT > + p->thread.io_bitmap_ptr = NULL; > +#endif /* CONFIG_X86_IOPORT */ > +} > + This is thought of as ugly... Instead, do something like #ifndef CONFIG_X86_IOPORT static inline void clear_thread_io_bitmap(struct task_struct *p) {} static inline int copy_io_bitmap(struct task_struct *me, struct task_struct *p) {return 0} ... etc... #else static inline void clear_thread_io_bitmap(struct task_struct *p) { p->thread.io_bitmap_ptr = NULL; } ... etc... #endif > static inline int copy_io_bitmap(struct task_struct *me, > struct task_struct *p) > { > +#ifdef CONFIG_X86_IOPORT > if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) { > p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr, > IO_BITMAP_BYTES, GFP_KERNEL); > @@ -15,8 +23,71 @@ static inline int copy_io_bitmap(struct task_struct *me, > } else { > p->thread.io_bitmap_ptr = NULL; > } > +#endif /* CONFIG_X86_IOPORT */ > > return 0; > } > > +static inline void exit_thread_io(struct task_struct *me) > +{ > +#ifdef CONFIG_X86_IOPORT > + struct thread_struct *t = &me->thread; > + unsigned long *bp = t->io_bitmap_ptr; > + > + if (bp) { > + struct tss_struct *tss = &per_cpu(init_tss, get_cpu()); > + > + t->io_bitmap_ptr = NULL; > + clear_thread_flag(TIF_IO_BITMAP); > + /* > + * Careful, clear this in the TSS too: > + */ > + memset(tss->io_bitmap, 0xff, t->io_bitmap_max); > + t->io_bitmap_max = 0; > + put_cpu(); > + kfree(bp); > + } > +#endif /* CONFIG_X86_IOPORT */ > +} > + > +static inline void switch_iopl_mask(struct thread_struct *prev, > + struct thread_struct *next) > +{ > +#ifdef CONFIG_X86_IOPORT > + /* > + * Restore IOPL if needed. In normal use, the flags restore > + * in the switch assembly will handle this. But if the kernel > + * is running virtualized at a non-zero CPL, the popf will > + * not restore flags, so it must be done in a separate step. > + */ > + if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) > + set_iopl_mask(next->iopl); > +#endif /* CONFIG_X86_IOPORT */ > +} > + > +static inline void switch_io_bitmap(struct tss_struct *tss, > + struct task_struct *prev_p, > + struct task_struct *next_p) > +{ > +#ifdef CONFIG_X86_IOPORT > + struct thread_struct *prev, *next; > + prev = &prev_p->thread; > + next = &next_p->thread; > + > + if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) { > + /* > + * Copy the relevant range of the IO bitmap. > + * Normally this is 128 bytes or less: > + */ > + memcpy(tss->io_bitmap, next->io_bitmap_ptr, > + max(prev->io_bitmap_max, next->io_bitmap_max)); > + } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) { > + /* > + * Clear any possible leftover bits: > + */ > + memset(tss->io_bitmap, 0xff, prev->io_bitmap_max); > + } > +#endif /* CONFIG_X86_IOPORT */ > +} > + > #endif /* _X86_KERNEL_PROCESS_IO_H */ > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c > index c83516b..2df5b00 100644 > --- a/arch/x86/kernel/process.c > +++ b/arch/x86/kernel/process.c > @@ -29,6 +29,8 @@ > #include <asm/debugreg.h> > #include <asm/nmi.h> > > +#include "process-io.h" > + > /* > * per-CPU TSS segments. Threads are completely 'soft' on Linux, > * no more per-task TSS's. The TSS size is kept cacheline-aligned > @@ -101,23 +103,7 @@ void arch_task_cache_init(void) > void exit_thread(void) > { > struct task_struct *me = current; > - struct thread_struct *t = &me->thread; > - unsigned long *bp = t->io_bitmap_ptr; > - > - if (bp) { > - struct tss_struct *tss = &per_cpu(init_tss, get_cpu()); > - > - t->io_bitmap_ptr = NULL; > - clear_thread_flag(TIF_IO_BITMAP); > - /* > - * Careful, clear this in the TSS too: > - */ > - memset(tss->io_bitmap, 0xff, t->io_bitmap_max); > - t->io_bitmap_max = 0; > - put_cpu(); > - kfree(bp); > - } > - > + exit_thread_io(me); > drop_fpu(me); > } > > @@ -222,19 +208,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, > hard_enable_TSC(); > } > > - if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) { > - /* > - * Copy the relevant range of the IO bitmap. > - * Normally this is 128 bytes or less: > - */ > - memcpy(tss->io_bitmap, next->io_bitmap_ptr, > - max(prev->io_bitmap_max, next->io_bitmap_max)); > - } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) { > - /* > - * Clear any possible leftover bits: > - */ > - memset(tss->io_bitmap, 0xff, prev->io_bitmap_max); > - } > + switch_io_bitmap(tss, prev_p, next_p); > propagate_user_return_notify(prev_p, next_p); > } > > diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c > index e86028b..dc780cb 100644 > --- a/arch/x86/kernel/process_32.c > +++ b/arch/x86/kernel/process_32.c > @@ -155,7 +155,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, > childregs->cs = __KERNEL_CS | get_kernel_rpl(); > childregs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED; > p->fpu_counter = 0; > - p->thread.io_bitmap_ptr = NULL; > + clear_thread_io_bitmap(p); > memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps)); > return 0; > } > @@ -269,14 +269,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) > */ > load_TLS(next, cpu); > > - /* > - * Restore IOPL if needed. In normal use, the flags restore > - * in the switch assembly will handle this. But if the kernel > - * is running virtualized at a non-zero CPL, the popf will > - * not restore flags, so it must be done in a separate step. > - */ > - if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) > - set_iopl_mask(next->iopl); > + switch_iopl_mask(prev, next); > > /* > * Now maybe handle debug registers and/or IO bitmaps If copy_thread would be in process.c instead, the .h-file would be unnecessary, right? > diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c > index 7461f50..f071bfa 100644 > --- a/arch/x86/kernel/ptrace.c > +++ b/arch/x86/kernel/ptrace.c > @@ -785,7 +785,11 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n, > static int ioperm_active(struct task_struct *target, > const struct user_regset *regset) > { > +#ifdef CONFIG_X86_IOPORT > return target->thread.io_bitmap_max / regset->size; > +#else > + return 0; > +#endif > } > > static int ioperm_get(struct task_struct *target, > @@ -793,12 +797,16 @@ static int ioperm_get(struct task_struct *target, > unsigned int pos, unsigned int count, > void *kbuf, void __user *ubuf) > { > +#ifdef CONFIG_X86_IOPORT > if (!target->thread.io_bitmap_ptr) > return -ENXIO; > > return user_regset_copyout(&pos, &count, &kbuf, &ubuf, > target->thread.io_bitmap_ptr, > 0, IO_BITMAP_BYTES); > +#else > + return -ENXIO; > +#endif > } > > /* > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index fa6ade7..f485d92 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -911,6 +911,7 @@ static void xen_load_sp0(struct tss_struct *tss, > xen_mc_issue(PARAVIRT_LAZY_CPU); > } > > +#ifdef CONFIG_X86_IOPORT > static void xen_set_iopl_mask(unsigned mask) > { > struct physdev_set_iopl set_iopl; > @@ -919,6 +920,7 @@ static void xen_set_iopl_mask(unsigned mask) > set_iopl.iopl = (mask == 0) ? 1 : (mask >> 12) & 3; > HYPERVISOR_physdev_op(PHYSDEVOP_set_iopl, &set_iopl); > } > +#endif /* CONFIG_X86_IOPORT */ > > static void xen_io_delay(void) > { > @@ -1277,7 +1279,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = { > .write_idt_entry = xen_write_idt_entry, > .load_sp0 = xen_load_sp0, > > +#ifdef CONFIG_X86_IOPORT > .set_iopl_mask = xen_set_iopl_mask, > +#endif /* CONFIG_X86_IOPORT */ > .io_delay = xen_io_delay, > > /* Xen takes care of %gs when switching to usermode for us */ > diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c > index 2bd78e2..7c7d405 100644 > --- a/drivers/tty/vt/vt_ioctl.c > +++ b/drivers/tty/vt/vt_ioctl.c > @@ -410,7 +410,7 @@ int vt_ioctl(struct tty_struct *tty, > * > * XXX: you should never use these, just call ioperm directly.. > */ > -#ifdef CONFIG_X86 > +#ifdef CONFIG_X86_IOPORT > case KDADDIO: > case KDDELIO: > /* > diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c > index 7078052..e7f4a35 100644 > --- a/kernel/sys_ni.c > +++ b/kernel/sys_ni.c > @@ -209,3 +209,8 @@ cond_syscall(compat_sys_open_by_handle_at); > > /* compare kernel pointers */ > cond_syscall(sys_kcmp); > + > +/* userspace I/O port access */ > +cond_syscall(stub_iopl); > +cond_syscall(sys_iopl); > +cond_syscall(sys_ioperm); > -- > 1.8.4.rc3 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) 2013-10-31 20:04 ` Alexander van Heukelum @ 2013-11-01 17:19 ` Josh Triplett 0 siblings, 0 replies; 17+ messages in thread From: Josh Triplett @ 2013-11-01 17:19 UTC (permalink / raw) To: Alexander van Heukelum Cc: Jeremy Fitzhardinge, x86, Len Brown, Frederic Weisbecker, H. Peter Anvin, virtualization, Paul Gortmaker, Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov, Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross On Thu, Oct 31, 2013 at 09:04:42PM +0100, Alexander van Heukelum wrote: > On Tue, Oct 22, 2013, at 3:35, Josh Triplett wrote: > > --- a/arch/x86/Kconfig > > +++ b/arch/x86/Kconfig > > @@ -976,6 +976,16 @@ config VM86 > > XFree86 to initialize some video cards via BIOS. Disabling this > > option saves about 6k. > > > > +config X86_IOPORT > > + bool "iopl and ioperm system calls" > > + default y > > + ---help--- > > + This option enables the iopl and ioperm system calls, which allow > > + privileged userspace processes to directly access I/O ports. This > > + is used by some legacy software to drive hardware directly from > > + userspace rather than via a proper kernel driver. Unless you intend > > + to run such software, you can safely say N here. > > + > > I think this entry should be under General setup / Configure standard kernel > features (expert users). It's entirely x86-specific, and it's similar to VM86 just above it; it belongs on the x86-specific menu. > Remove references to "legacy" and "proper driver". Hardware drivers belong in the kernel, and anything using iopl or ioperm won't run on x86-64, which argues rather strongly for its obsolescence. There's also /dev/port for cleaner access to ports from userspace. However, I've rephrased this for v2. > > --- a/arch/x86/kernel/cpu/common.c > > +++ b/arch/x86/kernel/cpu/common.c > > @@ -1223,7 +1223,6 @@ void cpu_init(void) > > struct tss_struct *t; > > unsigned long v; > > int cpu; > > - int i; > > > > /* > > * Load microcode on this cpu if a valid microcode is available. > > @@ -1285,14 +1284,7 @@ void cpu_init(void) > > } > > } > > > > - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); > > - > > - /* > > - * <= is required because the CPU will access up to > > - * 8 bits beyond the end of the IO permission bitmap. > > - */ > > - for (i = 0; i <= IO_BITMAP_LONGS; i++) > > - t->io_bitmap[i] = ~0UL; > > + init_tss_io(t); > > > > atomic_inc(&init_mm.mm_count); > > me->active_mm = &init_mm; > > @@ -1351,7 +1343,7 @@ void cpu_init(void) > > load_TR_desc(); > > load_LDT(&init_mm.context); > > > > - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap); > > + init_tss_io(t); > > This patch is too big. I think it would all look nicer if you added ioport.c in > one patch, and then convert the users in a separate patch? I'm not sure what you mean by "added ioport.c"; ioport.c already exists, and this patch just makes it optional. > > --- a/arch/x86/kernel/process-io.h > > +++ b/arch/x86/kernel/process-io.h > > @@ -1,9 +1,17 @@ > > #ifndef _X86_KERNEL_PROCESS_IO_H > > #define _X86_KERNEL_PROCESS_IO_H > > > > +static inline void clear_thread_io_bitmap(struct task_struct *p) > > +{ > > +#ifdef CONFIG_X86_IOPORT > > + p->thread.io_bitmap_ptr = NULL; > > +#endif /* CONFIG_X86_IOPORT */ > > +} > > + > > This is thought of as ugly... Instead, do something like > > #ifndef CONFIG_X86_IOPORT > > static inline void clear_thread_io_bitmap(struct task_struct *p) {} > static inline int copy_io_bitmap(struct task_struct *me, struct task_struct *p) {return 0} > ... etc... > > #else > > static inline void clear_thread_io_bitmap(struct task_struct *p) > { > p->thread.io_bitmap_ptr = NULL; > } > ... etc... > > #endif In .c files, any ifdefs at all are ugly; in .h files, both styles are quite common, and in particular the style I used is common when omitting the entire body of the function. It has the advantage of keeping a common function signature rather than duplicating it. > > --- a/arch/x86/kernel/process_32.c > > +++ b/arch/x86/kernel/process_32.c > > @@ -155,7 +155,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, > > childregs->cs = __KERNEL_CS | get_kernel_rpl(); > > childregs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED; > > p->fpu_counter = 0; > > - p->thread.io_bitmap_ptr = NULL; > > + clear_thread_io_bitmap(p); > > memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps)); > > return 0; > > } > > @@ -269,14 +269,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) > > */ > > load_TLS(next, cpu); > > > > - /* > > - * Restore IOPL if needed. In normal use, the flags restore > > - * in the switch assembly will handle this. But if the kernel > > - * is running virtualized at a non-zero CPL, the popf will > > - * not restore flags, so it must be done in a separate step. > > - */ > > - if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) > > - set_iopl_mask(next->iopl); > > + switch_iopl_mask(prev, next); > > > > /* > > * Now maybe handle debug registers and/or IO bitmaps > > If copy_thread would be in process.c instead, the .h-file would be unnecessary, > right? No, there are calls to the new functions in the .h file in several other places. - Josh Triplett ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2013-11-01 17:19 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-10-22 2:33 [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett 2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett 2013-10-30 22:21 ` Kees Cook 2013-10-31 20:01 ` Alexander van Heukelum 2013-11-01 16:33 ` Josh Triplett 2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett 2013-10-30 22:22 ` Kees Cook 2013-10-30 22:53 ` H. Peter Anvin 2013-10-31 11:17 ` Josh Triplett 2013-10-31 11:12 ` Josh Triplett 2013-10-31 20:02 ` Alexander van Heukelum 2013-11-01 16:40 ` Josh Triplett 2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett 2013-10-26 3:17 ` Stephen Hemminger 2013-10-26 4:30 ` Kees Cook 2013-10-31 20:04 ` Alexander van Heukelum 2013-11-01 17:19 ` Josh Triplett
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).