* [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm)
@ 2013-10-22 2:33 Josh Triplett
2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Josh Triplett @ 2013-10-22 2:33 UTC (permalink / raw)
To: linux-kernel, virtualization, x86, xen-devel
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Raghavendra K T,
Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu,
Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris
This patch series makes it possible to compile out the iopl and ioperm system
calls, which allow privileged processes to request permission to directly poke
I/O ports from userspace.
Nothing on a modern Linux system uses these calls anymore, and anything new
should be using /dev/port instead, or better yet writing a driver.
Copying the bloat-o-meter stats from the final patch:
32-bit bloat-o-meter:
add/remove: 0/3 grow/shrink: 0/10 up/down: 0/-17681 (-17681)
function old new delta
cpu_init 676 668 -8
ioperm_active 18 7 -11
init_task 1296 1284 -12
exit_thread 179 91 -88
ioperm_get 103 10 -93
__switch_to_xtra 254 161 -93
sys_iopl 127 - -127
SyS_iopl 127 - -127
copy_thread 606 446 -160
vt_ioctl 4127 3919 -208
sys_ioperm 370 - -370
init_tss 8576 384 -8192
doublefault_tss 8576 384 -8192
64-bit bloat-o-meter:
add/remove: 0/4 grow/shrink: 2/9 up/down: 45/-9764 (-9719)
function old new delta
cpu_init 958 995 +37
arch_align_stack 78 86 +8
perf_event_exit_task 525 517 -8
ioperm_active 17 8 -9
init_task 1968 1944 -24
stub_iopl 81 - -81
ioperm_get 111 11 -100
__switch_to_xtra 281 164 -117
exit_thread 212 92 -120
vt_ioctl 4432 4304 -128
sys_iopl 137 - -137
SyS_iopl 137 - -137
copy_thread 694 520 -174
sys_ioperm 473 - -473
init_tss 8896 640 -8256
Josh Triplett (3):
x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling
x86: tss: Eliminate fragile calculation of TSS segment limit
x86: Support compiling out userspace I/O (iopl and ioperm)
arch/x86/Kconfig | 10 ++++
arch/x86/include/asm/desc.h | 11 +----
arch/x86/include/asm/paravirt.h | 2 +
arch/x86/include/asm/paravirt_types.h | 2 +
arch/x86/include/asm/processor.h | 54 +++++++++++++++++---
arch/x86/include/asm/syscalls.h | 3 ++
arch/x86/kernel/Makefile | 3 +-
arch/x86/kernel/cpu/common.c | 12 +----
arch/x86/kernel/entry_64.S | 9 ++--
arch/x86/kernel/paravirt.c | 2 +
arch/x86/kernel/process-io.h | 93 +++++++++++++++++++++++++++++++++++
arch/x86/kernel/process.c | 34 ++-----------
arch/x86/kernel/process_32.c | 40 ++++-----------
arch/x86/kernel/process_64.c | 26 ++--------
arch/x86/kernel/ptrace.c | 8 +++
arch/x86/xen/enlighten.c | 4 ++
drivers/tty/vt/vt_ioctl.c | 2 +-
kernel/sys_ni.c | 5 ++
18 files changed, 206 insertions(+), 114 deletions(-)
create mode 100644 arch/x86/kernel/process-io.h
--
1.8.4.rc3
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling
2013-10-22 2:33 [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
@ 2013-10-22 2:34 ` Josh Triplett
2013-10-30 22:21 ` Kees Cook
2013-10-31 20:01 ` Alexander van Heukelum
2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett
2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
2 siblings, 2 replies; 17+ messages in thread
From: Josh Triplett @ 2013-10-22 2:34 UTC (permalink / raw)
To: linux-kernel, virtualization, x86, xen-devel
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Raghavendra K T,
Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu,
Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris
The 32-bit and 64-bit versions of copy_thread have functionally
identical handling for copying the I/O bitmap, modulo differences in
error handling. Clean up the error paths in both by moving the copy of
the I/O bitmap to the end, to eliminate the need to free it if
subsequent copy steps fail; move the resulting identical code to a
static inline in a common header.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---
arch/x86/kernel/process-io.h | 22 ++++++++++++++++++++++
arch/x86/kernel/process_32.c | 29 ++++++++---------------------
arch/x86/kernel/process_64.c | 26 +++++---------------------
3 files changed, 35 insertions(+), 42 deletions(-)
create mode 100644 arch/x86/kernel/process-io.h
diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h
new file mode 100644
index 0000000..d884444
--- /dev/null
+++ b/arch/x86/kernel/process-io.h
@@ -0,0 +1,22 @@
+#ifndef _X86_KERNEL_PROCESS_IO_H
+#define _X86_KERNEL_PROCESS_IO_H
+
+static inline int copy_io_bitmap(struct task_struct *me,
+ struct task_struct *p)
+{
+ if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) {
+ p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr,
+ IO_BITMAP_BYTES, GFP_KERNEL);
+ if (!p->thread.io_bitmap_ptr) {
+ p->thread.io_bitmap_max = 0;
+ return -ENOMEM;
+ }
+ set_tsk_thread_flag(p, TIF_IO_BITMAP);
+ } else {
+ p->thread.io_bitmap_ptr = NULL;
+ }
+
+ return 0;
+}
+
+#endif /* _X86_KERNEL_PROCESS_IO_H */
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 884f98f..e86028b 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -56,6 +56,8 @@
#include <asm/debugreg.h>
#include <asm/switch_to.h>
+#include "process-io.h"
+
asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
asmlinkage void ret_from_kernel_thread(void) __asm__("ret_from_kernel_thread");
@@ -135,7 +137,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
{
struct pt_regs *childregs = task_pt_regs(p);
struct task_struct *tsk;
- int err;
p->thread.sp = (unsigned long) childregs;
p->thread.sp0 = (unsigned long) (childregs+1);
@@ -167,36 +168,22 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
task_user_gs(p) = get_user_gs(current_pt_regs());
p->fpu_counter = 0;
- p->thread.io_bitmap_ptr = NULL;
tsk = current;
- err = -ENOMEM;
memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
- if (unlikely(test_tsk_thread_flag(tsk, TIF_IO_BITMAP))) {
- p->thread.io_bitmap_ptr = kmemdup(tsk->thread.io_bitmap_ptr,
- IO_BITMAP_BYTES, GFP_KERNEL);
- if (!p->thread.io_bitmap_ptr) {
- p->thread.io_bitmap_max = 0;
- return -ENOMEM;
- }
- set_tsk_thread_flag(p, TIF_IO_BITMAP);
- }
-
- err = 0;
-
/*
* Set a new TLS for the child thread?
*/
- if (clone_flags & CLONE_SETTLS)
+ if (clone_flags & CLONE_SETTLS) {
+ int err;
err = do_set_thread_area(p, -1,
(struct user_desc __user *)childregs->si, 0);
-
- if (err && p->thread.io_bitmap_ptr) {
- kfree(p->thread.io_bitmap_ptr);
- p->thread.io_bitmap_max = 0;
+ if(err)
+ return err;
}
- return err;
+
+ return copy_io_bitmap(tsk, p);
}
void
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index bb1dc51..6e680ad 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -50,6 +50,8 @@
#include <asm/debugreg.h>
#include <asm/switch_to.h>
+#include "process-io.h"
+
asmlinkage extern void ret_from_fork(void);
asmlinkage DEFINE_PER_CPU(unsigned long, old_rsp);
@@ -154,7 +156,6 @@ static inline u32 read_32bit_tls(struct task_struct *t, int tls)
int copy_thread(unsigned long clone_flags, unsigned long sp,
unsigned long arg, struct task_struct *p)
{
- int err;
struct pt_regs *childregs;
struct task_struct *me = current;
@@ -164,7 +165,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
p->thread.usersp = me->thread.usersp;
set_tsk_thread_flag(p, TIF_FORK);
p->fpu_counter = 0;
- p->thread.io_bitmap_ptr = NULL;
savesegment(gs, p->thread.gsindex);
p->thread.gs = p->thread.gsindex ? 0 : me->thread.gs;
@@ -192,23 +192,13 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
if (sp)
childregs->sp = sp;
- err = -ENOMEM;
memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
- if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) {
- p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr,
- IO_BITMAP_BYTES, GFP_KERNEL);
- if (!p->thread.io_bitmap_ptr) {
- p->thread.io_bitmap_max = 0;
- return -ENOMEM;
- }
- set_tsk_thread_flag(p, TIF_IO_BITMAP);
- }
-
/*
* Set a new TLS for the child thread?
*/
if (clone_flags & CLONE_SETTLS) {
+ int err;
#ifdef CONFIG_IA32_EMULATION
if (test_thread_flag(TIF_IA32))
err = do_set_thread_area(p, -1,
@@ -217,16 +207,10 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
#endif
err = do_arch_prctl(p, ARCH_SET_FS, childregs->r8);
if (err)
- goto out;
- }
- err = 0;
-out:
- if (err && p->thread.io_bitmap_ptr) {
- kfree(p->thread.io_bitmap_ptr);
- p->thread.io_bitmap_max = 0;
+ return err;
}
- return err;
+ return copy_io_bitmap(me, p);
}
static void
--
1.8.4.rc3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit
2013-10-22 2:33 [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett
@ 2013-10-22 2:34 ` Josh Triplett
2013-10-30 22:22 ` Kees Cook
2013-10-31 20:02 ` Alexander van Heukelum
2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
2 siblings, 2 replies; 17+ messages in thread
From: Josh Triplett @ 2013-10-22 2:34 UTC (permalink / raw)
To: linux-kernel, virtualization, x86, xen-devel
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Raghavendra K T,
Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu,
Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris
__set_tss_desc has a complex calculation of the TSS segment limit,
duplicating the quirky details of the I/O bitmap array length, and
requiring a complex comment to explain. Replace that calculation with a
simpler one based on the offsetof the "stack" field that follows the
array.
That then removes the last use of IO_BITMAP_OFFSET, so delete it.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---
arch/x86/include/asm/desc.h | 11 +----------
arch/x86/include/asm/processor.h | 3 ++-
2 files changed, 3 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index b90e5df..17ac92f 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr)
struct desc_struct *d = get_cpu_gdt_table(cpu);
tss_desc tss;
- /*
- * sizeof(unsigned long) coming from an extra "long" at the end
- * of the iobitmap. See tss_struct definition in processor.h
- *
- * -1? seg base+limit should be pointing to the address of the
- * last valid byte
- */
- set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS,
- IO_BITMAP_OFFSET + IO_BITMAP_BYTES +
- sizeof(unsigned long) - 1);
+ set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, TSS_LIMIT);
write_gdt_entry(d, entry, &tss, DESC_TSS);
}
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 987c75e..03d3003 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -259,9 +259,10 @@ struct x86_hw_tss {
#define IO_BITMAP_BITS 65536
#define IO_BITMAP_BYTES (IO_BITMAP_BITS/8)
#define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long))
-#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap)
#define INVALID_IO_BITMAP_OFFSET 0x8000
+#define TSS_LIMIT (offsetof(struct tss_struct, stack) - 1)
+
struct tss_struct {
/*
* The hardware state:
--
1.8.4.rc3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm)
2013-10-22 2:33 [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett
2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett
@ 2013-10-22 2:35 ` Josh Triplett
2013-10-26 3:17 ` Stephen Hemminger
2013-10-31 20:04 ` Alexander van Heukelum
2 siblings, 2 replies; 17+ messages in thread
From: Josh Triplett @ 2013-10-22 2:35 UTC (permalink / raw)
To: linux-kernel, virtualization, x86, xen-devel
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Raghavendra K T,
Len Brown, Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu,
Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris
On the vast majority of modern systems, no processes will use the
userspsace I/O syscalls, iopl and ioperm. Add a new config option,
CONFIG_X86_IOPORT, to support configuring them out of the kernel
entirely. Since these syscalls only exist to support rare legacy
userspace programs, X86_IOPORT does not depend on EXPERT, though it does
still default to y.
In addition to saving a significant amount of space, this also reduces
the size of several major kernel data structures, drops a bit of code
from several task-related hot paths, and reduces the attack surface of
the kernel.
All of the task-related code dealing with userspace I/O permissions now
lives in process-io.h as several new inline functions, which become
no-ops when CONFIG_X86_IOPORT.
bloat-o-meter shows a net reduction of 17681 bytes on 32-bit and 9719
bytes on 64-bit:
32-bit bloat-o-meter:
add/remove: 0/3 grow/shrink: 0/10 up/down: 0/-17681 (-17681)
function old new delta
cpu_init 676 668 -8
ioperm_active 18 7 -11
init_task 1296 1284 -12
exit_thread 179 91 -88
ioperm_get 103 10 -93
__switch_to_xtra 254 161 -93
sys_iopl 127 - -127
SyS_iopl 127 - -127
copy_thread 606 446 -160
vt_ioctl 4127 3919 -208
sys_ioperm 370 - -370
init_tss 8576 384 -8192
doublefault_tss 8576 384 -8192
64-bit bloat-o-meter:
add/remove: 0/4 grow/shrink: 2/9 up/down: 45/-9764 (-9719)
function old new delta
cpu_init 958 995 +37
arch_align_stack 78 86 +8
perf_event_exit_task 525 517 -8
ioperm_active 17 8 -9
init_task 1968 1944 -24
stub_iopl 81 - -81
ioperm_get 111 11 -100
__switch_to_xtra 281 164 -117
exit_thread 212 92 -120
vt_ioctl 4432 4304 -128
sys_iopl 137 - -137
SyS_iopl 137 - -137
copy_thread 694 520 -174
sys_ioperm 473 - -473
init_tss 8896 640 -8256
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---
arch/x86/Kconfig | 10 +++++
arch/x86/include/asm/paravirt.h | 2 +
arch/x86/include/asm/paravirt_types.h | 2 +
arch/x86/include/asm/processor.h | 51 +++++++++++++++++++++----
arch/x86/include/asm/syscalls.h | 3 ++
arch/x86/kernel/Makefile | 3 +-
arch/x86/kernel/cpu/common.c | 12 +-----
arch/x86/kernel/entry_64.S | 9 +++--
arch/x86/kernel/paravirt.c | 2 +
arch/x86/kernel/process-io.h | 71 +++++++++++++++++++++++++++++++++++
arch/x86/kernel/process.c | 34 ++---------------
arch/x86/kernel/process_32.c | 11 +-----
arch/x86/kernel/ptrace.c | 8 ++++
arch/x86/xen/enlighten.c | 4 ++
drivers/tty/vt/vt_ioctl.c | 2 +-
kernel/sys_ni.c | 5 +++
16 files changed, 168 insertions(+), 61 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e241a19..d5b1e68 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -976,6 +976,16 @@ config VM86
XFree86 to initialize some video cards via BIOS. Disabling this
option saves about 6k.
+config X86_IOPORT
+ bool "iopl and ioperm system calls"
+ default y
+ ---help---
+ This option enables the iopl and ioperm system calls, which allow
+ privileged userspace processes to directly access I/O ports. This
+ is used by some legacy software to drive hardware directly from
+ userspace rather than via a proper kernel driver. Unless you intend
+ to run such software, you can safely say N here.
+
config TOSHIBA
tristate "Toshiba Laptop support"
depends on X86_32
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 401f350..2e106d5 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -299,10 +299,12 @@ static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g)
{
PVOP_VCALL3(pv_cpu_ops.write_idt_entry, dt, entry, g);
}
+#ifdef CONFIG_X86_IOPORT
static inline void set_iopl_mask(unsigned mask)
{
PVOP_VCALL1(pv_cpu_ops.set_iopl_mask, mask);
}
+#endif /* CONFIG_X86_IOPORT */
/* The paravirtualized I/O functions */
static inline void slow_down_io(void)
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index aab8f67..bc5bfeb 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -142,7 +142,9 @@ struct pv_cpu_ops {
void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t);
+#ifdef CONFIG_X86_IOPORT
void (*set_iopl_mask)(unsigned mask);
+#endif /* CONFIG_X86_IOPORT */
void (*wbinvd)(void);
void (*io_delay)(void);
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 03d3003..48f6fbe 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -256,7 +256,11 @@ struct x86_hw_tss {
/*
* IO-bitmap sizes:
*/
+#ifdef CONFIG_X86_IOPORT
#define IO_BITMAP_BITS 65536
+#else
+#define IO_BITMAP_BITS 0
+#endif
#define IO_BITMAP_BYTES (IO_BITMAP_BITS/8)
#define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long))
#define INVALID_IO_BITMAP_OFFSET 0x8000
@@ -269,6 +273,7 @@ struct tss_struct {
*/
struct x86_hw_tss x86_tss;
+#ifdef CONFIG_X86_IOPORT
/*
* The extra 1 is there because the CPU will access an
* additional byte beyond the end of the IO permission
@@ -276,6 +281,7 @@ struct tss_struct {
* be within the limit.
*/
unsigned long io_bitmap[IO_BITMAP_LONGS + 1];
+#endif /* CONFIG_X86_IOPORT */
/*
* .. and then another 0x100 bytes for the emergency kernel stack:
@@ -286,6 +292,24 @@ struct tss_struct {
DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss);
+static inline void init_tss_io(struct tss_struct *t)
+{
+#ifdef CONFIG_X86_IOPORT
+ int i;
+
+ t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
+
+ /*
+ * <= is required because the CPU will access up to
+ * 8 bits beyond the end of the IO permission bitmap.
+ */
+ for (i = 0; i <= IO_BITMAP_LONGS; i++)
+ t->io_bitmap[i] = ~0UL;
+#else
+ t->x86_tss.io_bitmap_base = INVALID_IO_BITMAP_OFFSET;
+#endif
+}
+
/*
* Save the original ist values for checking stack pointers during debugging
*/
@@ -484,13 +508,16 @@ struct thread_struct {
unsigned int saved_fs;
unsigned int saved_gs;
#endif
+#ifdef CONFIG_X86_IOPORT
/* IO permissions: */
unsigned long *io_bitmap_ptr;
unsigned long iopl;
/* Max allowed port in the bitmap, in bytes: */
unsigned io_bitmap_max;
+#endif /* CONFIG_X86_IOPORT */
};
+#ifdef CONFIG_X86_IOPORT
/*
* Set IOPL bits in EFLAGS from given mask
*/
@@ -509,6 +536,7 @@ static inline void native_set_iopl_mask(unsigned mask)
: "i" (~X86_EFLAGS_IOPL), "r" (mask));
#endif
}
+#endif /* CONFIG_X86_IOPORT */
static inline void
native_load_sp0(struct tss_struct *tss, struct thread_struct *thread)
@@ -828,12 +856,8 @@ static inline void spin_lock_prefetch(const void *x)
#define STACK_TOP TASK_SIZE
#define STACK_TOP_MAX STACK_TOP
-#define INIT_THREAD { \
- .sp0 = sizeof(init_stack) + (long)&init_stack, \
- .vm86_info = NULL, \
- .sysenter_cs = __KERNEL_CS, \
- .io_bitmap_ptr = NULL, \
-}
+#ifdef CONFIG_X86_IOPORT
+#define INIT_THREAD_IO .io_bitmap_ptr = NULL,
/*
* Note that the .io_bitmap member must be extra-big. This is because
@@ -841,6 +865,19 @@ static inline void spin_lock_prefetch(const void *x)
* permission bitmap. The extra byte must be all 1 bits, and must
* be within the limit.
*/
+#define INIT_TSS_IO .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 },
+#else
+#define INIT_THREAD_IO
+#define INIT_TSS_IO
+#endif
+
+#define INIT_THREAD { \
+ .sp0 = sizeof(init_stack) + (long)&init_stack, \
+ .vm86_info = NULL, \
+ .sysenter_cs = __KERNEL_CS, \
+ INIT_THREAD_IO \
+}
+
#define INIT_TSS { \
.x86_tss = { \
.sp0 = sizeof(init_stack) + (long)&init_stack, \
@@ -848,7 +885,7 @@ static inline void spin_lock_prefetch(const void *x)
.ss1 = __KERNEL_CS, \
.io_bitmap_base = INVALID_IO_BITMAP_OFFSET, \
}, \
- .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, \
+ INIT_TSS_IO \
}
extern unsigned long thread_saved_pc(struct task_struct *tsk);
diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 592a6a6..4db79ea 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -16,9 +16,12 @@
#include <linux/types.h>
/* Common in X86_32 and X86_64 */
+
+#ifdef CONFIG_X86_IOPORT
/* kernel/ioport.c */
asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
asmlinkage long sys_iopl(unsigned int);
+#endif /* CONFIG_X86_IOPORT */
/* kernel/ldt.c */
asmlinkage int sys_modify_ldt(int, void __user *, unsigned long);
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index a5408b9..9644737 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -20,7 +20,8 @@ CFLAGS_irq.o := -I$(src)/../include/asm/trace
obj-y := process_$(BITS).o signal.o entry_$(BITS).o
obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
-obj-y += time.o ioport.o ldt.o dumpstack.o nmi.o
+obj-y += time.o ldt.o dumpstack.o nmi.o
+obj-$(CONFIG_X86_IOPORT) += ioport.o
obj-y += setup.o x86_init.o i8259.o irqinit.o jump_label.o
obj-$(CONFIG_IRQ_WORK) += irq_work.o
obj-y += probe_roms.o
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2793d1f..01235ef 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1223,7 +1223,6 @@ void cpu_init(void)
struct tss_struct *t;
unsigned long v;
int cpu;
- int i;
/*
* Load microcode on this cpu if a valid microcode is available.
@@ -1285,14 +1284,7 @@ void cpu_init(void)
}
}
- t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
-
- /*
- * <= is required because the CPU will access up to
- * 8 bits beyond the end of the IO permission bitmap.
- */
- for (i = 0; i <= IO_BITMAP_LONGS; i++)
- t->io_bitmap[i] = ~0UL;
+ init_tss_io(t);
atomic_inc(&init_mm.mm_count);
me->active_mm = &init_mm;
@@ -1351,7 +1343,7 @@ void cpu_init(void)
load_TR_desc();
load_LDT(&init_mm.context);
- t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
+ init_tss_io(t);
#ifdef CONFIG_DOUBLEFAULT
/* Set up doublefault TSS pointer in the GDT */
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1b69951..eb4f5ba 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -844,6 +844,11 @@ ENTRY(stub_\func)
END(stub_\func)
.endm
+ FORK_LIKE clone
+ FORK_LIKE fork
+ FORK_LIKE vfork
+
+#ifdef CONFIG_X86_IOPORT
.macro FIXED_FRAME label,func
ENTRY(\label)
CFI_STARTPROC
@@ -856,10 +861,8 @@ ENTRY(\label)
END(\label)
.endm
- FORK_LIKE clone
- FORK_LIKE fork
- FORK_LIKE vfork
FIXED_FRAME stub_iopl, sys_iopl
+#endif /* CONFIG_X86_IOPORT */
ENTRY(ptregscall_common)
DEFAULT_FRAME 1 8 /* offset 8: return address */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1b10af8..46da3d9 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -382,7 +382,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
.iret = native_iret,
.swapgs = native_swapgs,
+#ifdef CONFIG_X86_IOPORT
.set_iopl_mask = native_set_iopl_mask,
+#endif /* CONFIG_X86_IOPORT */
.io_delay = native_io_delay,
.start_context_switch = paravirt_nop,
diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h
index d884444..3e773fa 100644
--- a/arch/x86/kernel/process-io.h
+++ b/arch/x86/kernel/process-io.h
@@ -1,9 +1,17 @@
#ifndef _X86_KERNEL_PROCESS_IO_H
#define _X86_KERNEL_PROCESS_IO_H
+static inline void clear_thread_io_bitmap(struct task_struct *p)
+{
+#ifdef CONFIG_X86_IOPORT
+ p->thread.io_bitmap_ptr = NULL;
+#endif /* CONFIG_X86_IOPORT */
+}
+
static inline int copy_io_bitmap(struct task_struct *me,
struct task_struct *p)
{
+#ifdef CONFIG_X86_IOPORT
if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) {
p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr,
IO_BITMAP_BYTES, GFP_KERNEL);
@@ -15,8 +23,71 @@ static inline int copy_io_bitmap(struct task_struct *me,
} else {
p->thread.io_bitmap_ptr = NULL;
}
+#endif /* CONFIG_X86_IOPORT */
return 0;
}
+static inline void exit_thread_io(struct task_struct *me)
+{
+#ifdef CONFIG_X86_IOPORT
+ struct thread_struct *t = &me->thread;
+ unsigned long *bp = t->io_bitmap_ptr;
+
+ if (bp) {
+ struct tss_struct *tss = &per_cpu(init_tss, get_cpu());
+
+ t->io_bitmap_ptr = NULL;
+ clear_thread_flag(TIF_IO_BITMAP);
+ /*
+ * Careful, clear this in the TSS too:
+ */
+ memset(tss->io_bitmap, 0xff, t->io_bitmap_max);
+ t->io_bitmap_max = 0;
+ put_cpu();
+ kfree(bp);
+ }
+#endif /* CONFIG_X86_IOPORT */
+}
+
+static inline void switch_iopl_mask(struct thread_struct *prev,
+ struct thread_struct *next)
+{
+#ifdef CONFIG_X86_IOPORT
+ /*
+ * Restore IOPL if needed. In normal use, the flags restore
+ * in the switch assembly will handle this. But if the kernel
+ * is running virtualized at a non-zero CPL, the popf will
+ * not restore flags, so it must be done in a separate step.
+ */
+ if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
+ set_iopl_mask(next->iopl);
+#endif /* CONFIG_X86_IOPORT */
+}
+
+static inline void switch_io_bitmap(struct tss_struct *tss,
+ struct task_struct *prev_p,
+ struct task_struct *next_p)
+{
+#ifdef CONFIG_X86_IOPORT
+ struct thread_struct *prev, *next;
+ prev = &prev_p->thread;
+ next = &next_p->thread;
+
+ if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
+ /*
+ * Copy the relevant range of the IO bitmap.
+ * Normally this is 128 bytes or less:
+ */
+ memcpy(tss->io_bitmap, next->io_bitmap_ptr,
+ max(prev->io_bitmap_max, next->io_bitmap_max));
+ } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) {
+ /*
+ * Clear any possible leftover bits:
+ */
+ memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
+ }
+#endif /* CONFIG_X86_IOPORT */
+}
+
#endif /* _X86_KERNEL_PROCESS_IO_H */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index c83516b..2df5b00 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -29,6 +29,8 @@
#include <asm/debugreg.h>
#include <asm/nmi.h>
+#include "process-io.h"
+
/*
* per-CPU TSS segments. Threads are completely 'soft' on Linux,
* no more per-task TSS's. The TSS size is kept cacheline-aligned
@@ -101,23 +103,7 @@ void arch_task_cache_init(void)
void exit_thread(void)
{
struct task_struct *me = current;
- struct thread_struct *t = &me->thread;
- unsigned long *bp = t->io_bitmap_ptr;
-
- if (bp) {
- struct tss_struct *tss = &per_cpu(init_tss, get_cpu());
-
- t->io_bitmap_ptr = NULL;
- clear_thread_flag(TIF_IO_BITMAP);
- /*
- * Careful, clear this in the TSS too:
- */
- memset(tss->io_bitmap, 0xff, t->io_bitmap_max);
- t->io_bitmap_max = 0;
- put_cpu();
- kfree(bp);
- }
-
+ exit_thread_io(me);
drop_fpu(me);
}
@@ -222,19 +208,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
hard_enable_TSC();
}
- if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
- /*
- * Copy the relevant range of the IO bitmap.
- * Normally this is 128 bytes or less:
- */
- memcpy(tss->io_bitmap, next->io_bitmap_ptr,
- max(prev->io_bitmap_max, next->io_bitmap_max));
- } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) {
- /*
- * Clear any possible leftover bits:
- */
- memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
- }
+ switch_io_bitmap(tss, prev_p, next_p);
propagate_user_return_notify(prev_p, next_p);
}
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index e86028b..dc780cb 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -155,7 +155,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
childregs->cs = __KERNEL_CS | get_kernel_rpl();
childregs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED;
p->fpu_counter = 0;
- p->thread.io_bitmap_ptr = NULL;
+ clear_thread_io_bitmap(p);
memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
return 0;
}
@@ -269,14 +269,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
*/
load_TLS(next, cpu);
- /*
- * Restore IOPL if needed. In normal use, the flags restore
- * in the switch assembly will handle this. But if the kernel
- * is running virtualized at a non-zero CPL, the popf will
- * not restore flags, so it must be done in a separate step.
- */
- if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
- set_iopl_mask(next->iopl);
+ switch_iopl_mask(prev, next);
/*
* Now maybe handle debug registers and/or IO bitmaps
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 7461f50..f071bfa 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -785,7 +785,11 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n,
static int ioperm_active(struct task_struct *target,
const struct user_regset *regset)
{
+#ifdef CONFIG_X86_IOPORT
return target->thread.io_bitmap_max / regset->size;
+#else
+ return 0;
+#endif
}
static int ioperm_get(struct task_struct *target,
@@ -793,12 +797,16 @@ static int ioperm_get(struct task_struct *target,
unsigned int pos, unsigned int count,
void *kbuf, void __user *ubuf)
{
+#ifdef CONFIG_X86_IOPORT
if (!target->thread.io_bitmap_ptr)
return -ENXIO;
return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
target->thread.io_bitmap_ptr,
0, IO_BITMAP_BYTES);
+#else
+ return -ENXIO;
+#endif
}
/*
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index fa6ade7..f485d92 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -911,6 +911,7 @@ static void xen_load_sp0(struct tss_struct *tss,
xen_mc_issue(PARAVIRT_LAZY_CPU);
}
+#ifdef CONFIG_X86_IOPORT
static void xen_set_iopl_mask(unsigned mask)
{
struct physdev_set_iopl set_iopl;
@@ -919,6 +920,7 @@ static void xen_set_iopl_mask(unsigned mask)
set_iopl.iopl = (mask == 0) ? 1 : (mask >> 12) & 3;
HYPERVISOR_physdev_op(PHYSDEVOP_set_iopl, &set_iopl);
}
+#endif /* CONFIG_X86_IOPORT */
static void xen_io_delay(void)
{
@@ -1277,7 +1279,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
.write_idt_entry = xen_write_idt_entry,
.load_sp0 = xen_load_sp0,
+#ifdef CONFIG_X86_IOPORT
.set_iopl_mask = xen_set_iopl_mask,
+#endif /* CONFIG_X86_IOPORT */
.io_delay = xen_io_delay,
/* Xen takes care of %gs when switching to usermode for us */
diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
index 2bd78e2..7c7d405 100644
--- a/drivers/tty/vt/vt_ioctl.c
+++ b/drivers/tty/vt/vt_ioctl.c
@@ -410,7 +410,7 @@ int vt_ioctl(struct tty_struct *tty,
*
* XXX: you should never use these, just call ioperm directly..
*/
-#ifdef CONFIG_X86
+#ifdef CONFIG_X86_IOPORT
case KDADDIO:
case KDDELIO:
/*
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 7078052..e7f4a35 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -209,3 +209,8 @@ cond_syscall(compat_sys_open_by_handle_at);
/* compare kernel pointers */
cond_syscall(sys_kcmp);
+
+/* userspace I/O port access */
+cond_syscall(stub_iopl);
+cond_syscall(sys_iopl);
+cond_syscall(sys_ioperm);
--
1.8.4.rc3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm)
2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
@ 2013-10-26 3:17 ` Stephen Hemminger
2013-10-26 4:30 ` Kees Cook
2013-10-31 20:04 ` Alexander van Heukelum
1 sibling, 1 reply; 17+ messages in thread
From: Stephen Hemminger @ 2013-10-26 3:17 UTC (permalink / raw)
To: Josh Triplett
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Daniel Lezcano,
Fenghua Yu, Frederic Weisbecker, H. Peter Anvin, x86,
virtualization, Paul Gortmaker, David Herrmann, Masami Hiramatsu,
Seiji Aguchi, Jiri Slaby, Boris Ostrovsky, Jesper Nilsson,
Andi Kleen, Raghavendra K T, Ingo Molnar, xen-devel,
Borislav Petkov, Len Brown, Kees Cook, Konrad Rzeszutek Wilk
I/O from userspace is used to implement usermode virtio driver(s).
This has been done independently by Intel, Brocade/Vyatta, and 6Wind.
Sorry, it has to stay.
On Mon, Oct 21, 2013 at 7:35 PM, Josh Triplett <josh@joshtriplett.org> wrote:
> On the vast majority of modern systems, no processes will use the
> userspsace I/O syscalls, iopl and ioperm. Add a new config option,
> CONFIG_X86_IOPORT, to support configuring them out of the kernel
> entirely. Since these syscalls only exist to support rare legacy
> userspace programs, X86_IOPORT does not depend on EXPERT, though it does
> still default to y.
>
> In addition to saving a significant amount of space, this also reduces
> the size of several major kernel data structures, drops a bit of code
> from several task-related hot paths, and reduces the attack surface of
> the kernel.
>
> All of the task-related code dealing with userspace I/O permissions now
> lives in process-io.h as several new inline functions, which become
> no-ops when CONFIG_X86_IOPORT.
>
> bloat-o-meter shows a net reduction of 17681 bytes on 32-bit and 9719
> bytes on 64-bit:
>
> 32-bit bloat-o-meter:
> add/remove: 0/3 grow/shrink: 0/10 up/down: 0/-17681 (-17681)
> function old new delta
> cpu_init 676 668 -8
> ioperm_active 18 7 -11
> init_task 1296 1284 -12
> exit_thread 179 91 -88
> ioperm_get 103 10 -93
> __switch_to_xtra 254 161 -93
> sys_iopl 127 - -127
> SyS_iopl 127 - -127
> copy_thread 606 446 -160
> vt_ioctl 4127 3919 -208
> sys_ioperm 370 - -370
> init_tss 8576 384 -8192
> doublefault_tss 8576 384 -8192
>
> 64-bit bloat-o-meter:
> add/remove: 0/4 grow/shrink: 2/9 up/down: 45/-9764 (-9719)
> function old new delta
> cpu_init 958 995 +37
> arch_align_stack 78 86 +8
> perf_event_exit_task 525 517 -8
> ioperm_active 17 8 -9
> init_task 1968 1944 -24
> stub_iopl 81 - -81
> ioperm_get 111 11 -100
> __switch_to_xtra 281 164 -117
> exit_thread 212 92 -120
> vt_ioctl 4432 4304 -128
> sys_iopl 137 - -137
> SyS_iopl 137 - -137
> copy_thread 694 520 -174
> sys_ioperm 473 - -473
> init_tss 8896 640 -8256
>
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> ---
> arch/x86/Kconfig | 10 +++++
> arch/x86/include/asm/paravirt.h | 2 +
> arch/x86/include/asm/paravirt_types.h | 2 +
> arch/x86/include/asm/processor.h | 51 +++++++++++++++++++++----
> arch/x86/include/asm/syscalls.h | 3 ++
> arch/x86/kernel/Makefile | 3 +-
> arch/x86/kernel/cpu/common.c | 12 +-----
> arch/x86/kernel/entry_64.S | 9 +++--
> arch/x86/kernel/paravirt.c | 2 +
> arch/x86/kernel/process-io.h | 71 +++++++++++++++++++++++++++++++++++
> arch/x86/kernel/process.c | 34 ++---------------
> arch/x86/kernel/process_32.c | 11 +-----
> arch/x86/kernel/ptrace.c | 8 ++++
> arch/x86/xen/enlighten.c | 4 ++
> drivers/tty/vt/vt_ioctl.c | 2 +-
> kernel/sys_ni.c | 5 +++
> 16 files changed, 168 insertions(+), 61 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index e241a19..d5b1e68 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -976,6 +976,16 @@ config VM86
> XFree86 to initialize some video cards via BIOS. Disabling this
> option saves about 6k.
>
> +config X86_IOPORT
> + bool "iopl and ioperm system calls"
> + default y
> + ---help---
> + This option enables the iopl and ioperm system calls, which allow
> + privileged userspace processes to directly access I/O ports. This
> + is used by some legacy software to drive hardware directly from
> + userspace rather than via a proper kernel driver. Unless you intend
> + to run such software, you can safely say N here.
> +
> config TOSHIBA
> tristate "Toshiba Laptop support"
> depends on X86_32
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index 401f350..2e106d5 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -299,10 +299,12 @@ static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g)
> {
> PVOP_VCALL3(pv_cpu_ops.write_idt_entry, dt, entry, g);
> }
> +#ifdef CONFIG_X86_IOPORT
> static inline void set_iopl_mask(unsigned mask)
> {
> PVOP_VCALL1(pv_cpu_ops.set_iopl_mask, mask);
> }
> +#endif /* CONFIG_X86_IOPORT */
>
> /* The paravirtualized I/O functions */
> static inline void slow_down_io(void)
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index aab8f67..bc5bfeb 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -142,7 +142,9 @@ struct pv_cpu_ops {
>
> void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t);
>
> +#ifdef CONFIG_X86_IOPORT
> void (*set_iopl_mask)(unsigned mask);
> +#endif /* CONFIG_X86_IOPORT */
>
> void (*wbinvd)(void);
> void (*io_delay)(void);
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 03d3003..48f6fbe 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -256,7 +256,11 @@ struct x86_hw_tss {
> /*
> * IO-bitmap sizes:
> */
> +#ifdef CONFIG_X86_IOPORT
> #define IO_BITMAP_BITS 65536
> +#else
> +#define IO_BITMAP_BITS 0
> +#endif
> #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8)
> #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long))
> #define INVALID_IO_BITMAP_OFFSET 0x8000
> @@ -269,6 +273,7 @@ struct tss_struct {
> */
> struct x86_hw_tss x86_tss;
>
> +#ifdef CONFIG_X86_IOPORT
> /*
> * The extra 1 is there because the CPU will access an
> * additional byte beyond the end of the IO permission
> @@ -276,6 +281,7 @@ struct tss_struct {
> * be within the limit.
> */
> unsigned long io_bitmap[IO_BITMAP_LONGS + 1];
> +#endif /* CONFIG_X86_IOPORT */
>
> /*
> * .. and then another 0x100 bytes for the emergency kernel stack:
> @@ -286,6 +292,24 @@ struct tss_struct {
>
> DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss);
>
> +static inline void init_tss_io(struct tss_struct *t)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + int i;
> +
> + t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
> +
> + /*
> + * <= is required because the CPU will access up to
> + * 8 bits beyond the end of the IO permission bitmap.
> + */
> + for (i = 0; i <= IO_BITMAP_LONGS; i++)
> + t->io_bitmap[i] = ~0UL;
> +#else
> + t->x86_tss.io_bitmap_base = INVALID_IO_BITMAP_OFFSET;
> +#endif
> +}
> +
> /*
> * Save the original ist values for checking stack pointers during debugging
> */
> @@ -484,13 +508,16 @@ struct thread_struct {
> unsigned int saved_fs;
> unsigned int saved_gs;
> #endif
> +#ifdef CONFIG_X86_IOPORT
> /* IO permissions: */
> unsigned long *io_bitmap_ptr;
> unsigned long iopl;
> /* Max allowed port in the bitmap, in bytes: */
> unsigned io_bitmap_max;
> +#endif /* CONFIG_X86_IOPORT */
> };
>
> +#ifdef CONFIG_X86_IOPORT
> /*
> * Set IOPL bits in EFLAGS from given mask
> */
> @@ -509,6 +536,7 @@ static inline void native_set_iopl_mask(unsigned mask)
> : "i" (~X86_EFLAGS_IOPL), "r" (mask));
> #endif
> }
> +#endif /* CONFIG_X86_IOPORT */
>
> static inline void
> native_load_sp0(struct tss_struct *tss, struct thread_struct *thread)
> @@ -828,12 +856,8 @@ static inline void spin_lock_prefetch(const void *x)
> #define STACK_TOP TASK_SIZE
> #define STACK_TOP_MAX STACK_TOP
>
> -#define INIT_THREAD { \
> - .sp0 = sizeof(init_stack) + (long)&init_stack, \
> - .vm86_info = NULL, \
> - .sysenter_cs = __KERNEL_CS, \
> - .io_bitmap_ptr = NULL, \
> -}
> +#ifdef CONFIG_X86_IOPORT
> +#define INIT_THREAD_IO .io_bitmap_ptr = NULL,
>
> /*
> * Note that the .io_bitmap member must be extra-big. This is because
> @@ -841,6 +865,19 @@ static inline void spin_lock_prefetch(const void *x)
> * permission bitmap. The extra byte must be all 1 bits, and must
> * be within the limit.
> */
> +#define INIT_TSS_IO .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 },
> +#else
> +#define INIT_THREAD_IO
> +#define INIT_TSS_IO
> +#endif
> +
> +#define INIT_THREAD { \
> + .sp0 = sizeof(init_stack) + (long)&init_stack, \
> + .vm86_info = NULL, \
> + .sysenter_cs = __KERNEL_CS, \
> + INIT_THREAD_IO \
> +}
> +
> #define INIT_TSS { \
> .x86_tss = { \
> .sp0 = sizeof(init_stack) + (long)&init_stack, \
> @@ -848,7 +885,7 @@ static inline void spin_lock_prefetch(const void *x)
> .ss1 = __KERNEL_CS, \
> .io_bitmap_base = INVALID_IO_BITMAP_OFFSET, \
> }, \
> - .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, \
> + INIT_TSS_IO \
> }
>
> extern unsigned long thread_saved_pc(struct task_struct *tsk);
> diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
> index 592a6a6..4db79ea 100644
> --- a/arch/x86/include/asm/syscalls.h
> +++ b/arch/x86/include/asm/syscalls.h
> @@ -16,9 +16,12 @@
> #include <linux/types.h>
>
> /* Common in X86_32 and X86_64 */
> +
> +#ifdef CONFIG_X86_IOPORT
> /* kernel/ioport.c */
> asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
> asmlinkage long sys_iopl(unsigned int);
> +#endif /* CONFIG_X86_IOPORT */
>
> /* kernel/ldt.c */
> asmlinkage int sys_modify_ldt(int, void __user *, unsigned long);
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index a5408b9..9644737 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -20,7 +20,8 @@ CFLAGS_irq.o := -I$(src)/../include/asm/trace
>
> obj-y := process_$(BITS).o signal.o entry_$(BITS).o
> obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
> -obj-y += time.o ioport.o ldt.o dumpstack.o nmi.o
> +obj-y += time.o ldt.o dumpstack.o nmi.o
> +obj-$(CONFIG_X86_IOPORT) += ioport.o
> obj-y += setup.o x86_init.o i8259.o irqinit.o jump_label.o
> obj-$(CONFIG_IRQ_WORK) += irq_work.o
> obj-y += probe_roms.o
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 2793d1f..01235ef 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1223,7 +1223,6 @@ void cpu_init(void)
> struct tss_struct *t;
> unsigned long v;
> int cpu;
> - int i;
>
> /*
> * Load microcode on this cpu if a valid microcode is available.
> @@ -1285,14 +1284,7 @@ void cpu_init(void)
> }
> }
>
> - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
> -
> - /*
> - * <= is required because the CPU will access up to
> - * 8 bits beyond the end of the IO permission bitmap.
> - */
> - for (i = 0; i <= IO_BITMAP_LONGS; i++)
> - t->io_bitmap[i] = ~0UL;
> + init_tss_io(t);
>
> atomic_inc(&init_mm.mm_count);
> me->active_mm = &init_mm;
> @@ -1351,7 +1343,7 @@ void cpu_init(void)
> load_TR_desc();
> load_LDT(&init_mm.context);
>
> - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
> + init_tss_io(t);
>
> #ifdef CONFIG_DOUBLEFAULT
> /* Set up doublefault TSS pointer in the GDT */
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 1b69951..eb4f5ba 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -844,6 +844,11 @@ ENTRY(stub_\func)
> END(stub_\func)
> .endm
>
> + FORK_LIKE clone
> + FORK_LIKE fork
> + FORK_LIKE vfork
> +
> +#ifdef CONFIG_X86_IOPORT
> .macro FIXED_FRAME label,func
> ENTRY(\label)
> CFI_STARTPROC
> @@ -856,10 +861,8 @@ ENTRY(\label)
> END(\label)
> .endm
>
> - FORK_LIKE clone
> - FORK_LIKE fork
> - FORK_LIKE vfork
> FIXED_FRAME stub_iopl, sys_iopl
> +#endif /* CONFIG_X86_IOPORT */
>
> ENTRY(ptregscall_common)
> DEFAULT_FRAME 1 8 /* offset 8: return address */
> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index 1b10af8..46da3d9 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -382,7 +382,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
> .iret = native_iret,
> .swapgs = native_swapgs,
>
> +#ifdef CONFIG_X86_IOPORT
> .set_iopl_mask = native_set_iopl_mask,
> +#endif /* CONFIG_X86_IOPORT */
> .io_delay = native_io_delay,
>
> .start_context_switch = paravirt_nop,
> diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h
> index d884444..3e773fa 100644
> --- a/arch/x86/kernel/process-io.h
> +++ b/arch/x86/kernel/process-io.h
> @@ -1,9 +1,17 @@
> #ifndef _X86_KERNEL_PROCESS_IO_H
> #define _X86_KERNEL_PROCESS_IO_H
>
> +static inline void clear_thread_io_bitmap(struct task_struct *p)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + p->thread.io_bitmap_ptr = NULL;
> +#endif /* CONFIG_X86_IOPORT */
> +}
> +
> static inline int copy_io_bitmap(struct task_struct *me,
> struct task_struct *p)
> {
> +#ifdef CONFIG_X86_IOPORT
> if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) {
> p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr,
> IO_BITMAP_BYTES, GFP_KERNEL);
> @@ -15,8 +23,71 @@ static inline int copy_io_bitmap(struct task_struct *me,
> } else {
> p->thread.io_bitmap_ptr = NULL;
> }
> +#endif /* CONFIG_X86_IOPORT */
>
> return 0;
> }
>
> +static inline void exit_thread_io(struct task_struct *me)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + struct thread_struct *t = &me->thread;
> + unsigned long *bp = t->io_bitmap_ptr;
> +
> + if (bp) {
> + struct tss_struct *tss = &per_cpu(init_tss, get_cpu());
> +
> + t->io_bitmap_ptr = NULL;
> + clear_thread_flag(TIF_IO_BITMAP);
> + /*
> + * Careful, clear this in the TSS too:
> + */
> + memset(tss->io_bitmap, 0xff, t->io_bitmap_max);
> + t->io_bitmap_max = 0;
> + put_cpu();
> + kfree(bp);
> + }
> +#endif /* CONFIG_X86_IOPORT */
> +}
> +
> +static inline void switch_iopl_mask(struct thread_struct *prev,
> + struct thread_struct *next)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + /*
> + * Restore IOPL if needed. In normal use, the flags restore
> + * in the switch assembly will handle this. But if the kernel
> + * is running virtualized at a non-zero CPL, the popf will
> + * not restore flags, so it must be done in a separate step.
> + */
> + if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
> + set_iopl_mask(next->iopl);
> +#endif /* CONFIG_X86_IOPORT */
> +}
> +
> +static inline void switch_io_bitmap(struct tss_struct *tss,
> + struct task_struct *prev_p,
> + struct task_struct *next_p)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + struct thread_struct *prev, *next;
> + prev = &prev_p->thread;
> + next = &next_p->thread;
> +
> + if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
> + /*
> + * Copy the relevant range of the IO bitmap.
> + * Normally this is 128 bytes or less:
> + */
> + memcpy(tss->io_bitmap, next->io_bitmap_ptr,
> + max(prev->io_bitmap_max, next->io_bitmap_max));
> + } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) {
> + /*
> + * Clear any possible leftover bits:
> + */
> + memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
> + }
> +#endif /* CONFIG_X86_IOPORT */
> +}
> +
> #endif /* _X86_KERNEL_PROCESS_IO_H */
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index c83516b..2df5b00 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -29,6 +29,8 @@
> #include <asm/debugreg.h>
> #include <asm/nmi.h>
>
> +#include "process-io.h"
> +
> /*
> * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> * no more per-task TSS's. The TSS size is kept cacheline-aligned
> @@ -101,23 +103,7 @@ void arch_task_cache_init(void)
> void exit_thread(void)
> {
> struct task_struct *me = current;
> - struct thread_struct *t = &me->thread;
> - unsigned long *bp = t->io_bitmap_ptr;
> -
> - if (bp) {
> - struct tss_struct *tss = &per_cpu(init_tss, get_cpu());
> -
> - t->io_bitmap_ptr = NULL;
> - clear_thread_flag(TIF_IO_BITMAP);
> - /*
> - * Careful, clear this in the TSS too:
> - */
> - memset(tss->io_bitmap, 0xff, t->io_bitmap_max);
> - t->io_bitmap_max = 0;
> - put_cpu();
> - kfree(bp);
> - }
> -
> + exit_thread_io(me);
> drop_fpu(me);
> }
>
> @@ -222,19 +208,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
> hard_enable_TSC();
> }
>
> - if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
> - /*
> - * Copy the relevant range of the IO bitmap.
> - * Normally this is 128 bytes or less:
> - */
> - memcpy(tss->io_bitmap, next->io_bitmap_ptr,
> - max(prev->io_bitmap_max, next->io_bitmap_max));
> - } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) {
> - /*
> - * Clear any possible leftover bits:
> - */
> - memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
> - }
> + switch_io_bitmap(tss, prev_p, next_p);
> propagate_user_return_notify(prev_p, next_p);
> }
>
> diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
> index e86028b..dc780cb 100644
> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -155,7 +155,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
> childregs->cs = __KERNEL_CS | get_kernel_rpl();
> childregs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED;
> p->fpu_counter = 0;
> - p->thread.io_bitmap_ptr = NULL;
> + clear_thread_io_bitmap(p);
> memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
> return 0;
> }
> @@ -269,14 +269,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
> */
> load_TLS(next, cpu);
>
> - /*
> - * Restore IOPL if needed. In normal use, the flags restore
> - * in the switch assembly will handle this. But if the kernel
> - * is running virtualized at a non-zero CPL, the popf will
> - * not restore flags, so it must be done in a separate step.
> - */
> - if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
> - set_iopl_mask(next->iopl);
> + switch_iopl_mask(prev, next);
>
> /*
> * Now maybe handle debug registers and/or IO bitmaps
> diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
> index 7461f50..f071bfa 100644
> --- a/arch/x86/kernel/ptrace.c
> +++ b/arch/x86/kernel/ptrace.c
> @@ -785,7 +785,11 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n,
> static int ioperm_active(struct task_struct *target,
> const struct user_regset *regset)
> {
> +#ifdef CONFIG_X86_IOPORT
> return target->thread.io_bitmap_max / regset->size;
> +#else
> + return 0;
> +#endif
> }
>
> static int ioperm_get(struct task_struct *target,
> @@ -793,12 +797,16 @@ static int ioperm_get(struct task_struct *target,
> unsigned int pos, unsigned int count,
> void *kbuf, void __user *ubuf)
> {
> +#ifdef CONFIG_X86_IOPORT
> if (!target->thread.io_bitmap_ptr)
> return -ENXIO;
>
> return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
> target->thread.io_bitmap_ptr,
> 0, IO_BITMAP_BYTES);
> +#else
> + return -ENXIO;
> +#endif
> }
>
> /*
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index fa6ade7..f485d92 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -911,6 +911,7 @@ static void xen_load_sp0(struct tss_struct *tss,
> xen_mc_issue(PARAVIRT_LAZY_CPU);
> }
>
> +#ifdef CONFIG_X86_IOPORT
> static void xen_set_iopl_mask(unsigned mask)
> {
> struct physdev_set_iopl set_iopl;
> @@ -919,6 +920,7 @@ static void xen_set_iopl_mask(unsigned mask)
> set_iopl.iopl = (mask == 0) ? 1 : (mask >> 12) & 3;
> HYPERVISOR_physdev_op(PHYSDEVOP_set_iopl, &set_iopl);
> }
> +#endif /* CONFIG_X86_IOPORT */
>
> static void xen_io_delay(void)
> {
> @@ -1277,7 +1279,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> .write_idt_entry = xen_write_idt_entry,
> .load_sp0 = xen_load_sp0,
>
> +#ifdef CONFIG_X86_IOPORT
> .set_iopl_mask = xen_set_iopl_mask,
> +#endif /* CONFIG_X86_IOPORT */
> .io_delay = xen_io_delay,
>
> /* Xen takes care of %gs when switching to usermode for us */
> diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
> index 2bd78e2..7c7d405 100644
> --- a/drivers/tty/vt/vt_ioctl.c
> +++ b/drivers/tty/vt/vt_ioctl.c
> @@ -410,7 +410,7 @@ int vt_ioctl(struct tty_struct *tty,
> *
> * XXX: you should never use these, just call ioperm directly..
> */
> -#ifdef CONFIG_X86
> +#ifdef CONFIG_X86_IOPORT
> case KDADDIO:
> case KDDELIO:
> /*
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index 7078052..e7f4a35 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -209,3 +209,8 @@ cond_syscall(compat_sys_open_by_handle_at);
>
> /* compare kernel pointers */
> cond_syscall(sys_kcmp);
> +
> +/* userspace I/O port access */
> +cond_syscall(stub_iopl);
> +cond_syscall(sys_iopl);
> +cond_syscall(sys_ioperm);
> --
> 1.8.4.rc3
>
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm)
2013-10-26 3:17 ` Stephen Hemminger
@ 2013-10-26 4:30 ` Kees Cook
0 siblings, 0 replies; 17+ messages in thread
From: Kees Cook @ 2013-10-26 4:30 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Daniel Lezcano,
Fenghua Yu, Frederic Weisbecker, David Herrmann, x86@kernel.org,
virtualization@lists.linux-foundation.org, Paul Gortmaker,
H. Peter Anvin, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Boris Ostrovsky, Jesper Nilsson, Andi Kleen, Raghavendra K T,
Ingo Molnar, xen-devel, Borislav Petkov, Len Brown, Konrad
On Sat, Oct 26, 2013 at 4:17 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> I/O from userspace is used to implement usermode virtio driver(s).
> This has been done independently by Intel, Brocade/Vyatta, and 6Wind.
> Sorry, it has to stay.
This isn't about removing it, it's about putting it behind a default=y
CONFIG. At the very least, the cleanups and 32/64 merging look
sensible.
As for the CONFIG, I think it'd be useful to have that, since some
more specialized users may not want IO ports at all (some folks see IO
ports as dangerous since they are a potential way to make a jump from
uid-0 to ring-0).
-Kees
--
Kees Cook
Chrome OS Security
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling
2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett
@ 2013-10-30 22:21 ` Kees Cook
2013-10-31 20:01 ` Alexander van Heukelum
1 sibling, 0 replies; 17+ messages in thread
From: Kees Cook @ 2013-10-30 22:21 UTC (permalink / raw)
To: Josh Triplett
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, x86@kernel.org,
Len Brown, Frederic Weisbecker, H. Peter Anvin,
virtualization@lists.linux-foundation.org, Paul Gortmaker,
Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi,
Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen,
Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel,
Borislav Petkov, Fenghua Yu <feng>
On Mon, Oct 21, 2013 at 7:34 PM, Josh Triplett <josh@joshtriplett.org> wrote:
> The 32-bit and 64-bit versions of copy_thread have functionally
> identical handling for copying the I/O bitmap, modulo differences in
> error handling. Clean up the error paths in both by moving the copy of
> the I/O bitmap to the end, to eliminate the need to free it if
> subsequent copy steps fail; move the resulting identical code to a
> static inline in a common header.
>
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
This seems quite sensible to me.
Acked-by: Kees Cook <keescook@chromium.org>
--
Kees Cook
Chrome OS Security
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit
2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett
@ 2013-10-30 22:22 ` Kees Cook
2013-10-30 22:53 ` H. Peter Anvin
2013-10-31 11:12 ` Josh Triplett
2013-10-31 20:02 ` Alexander van Heukelum
1 sibling, 2 replies; 17+ messages in thread
From: Kees Cook @ 2013-10-30 22:22 UTC (permalink / raw)
To: Josh Triplett
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, x86@kernel.org,
Len Brown, Frederic Weisbecker, H. Peter Anvin,
virtualization@lists.linux-foundation.org, Paul Gortmaker,
Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi,
Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen,
Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel,
Borislav Petkov, Fenghua Yu <feng>
On Mon, Oct 21, 2013 at 7:34 PM, Josh Triplett <josh@joshtriplett.org> wrote:
> __set_tss_desc has a complex calculation of the TSS segment limit,
> duplicating the quirky details of the I/O bitmap array length, and
> requiring a complex comment to explain. Replace that calculation with a
> simpler one based on the offsetof the "stack" field that follows the
> array.
>
> That then removes the last use of IO_BITMAP_OFFSET, so delete it.
>
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> ---
> arch/x86/include/asm/desc.h | 11 +----------
> arch/x86/include/asm/processor.h | 3 ++-
> 2 files changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
> index b90e5df..17ac92f 100644
> --- a/arch/x86/include/asm/desc.h
> +++ b/arch/x86/include/asm/desc.h
> @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr)
> struct desc_struct *d = get_cpu_gdt_table(cpu);
> tss_desc tss;
>
> - /*
> - * sizeof(unsigned long) coming from an extra "long" at the end
> - * of the iobitmap. See tss_struct definition in processor.h
> - *
> - * -1? seg base+limit should be pointing to the address of the
> - * last valid byte
I think it might be better to keep at least a minimal comment near the
TSS_LIMIT declaration, just to explain the "-1" part, which is not
entirely obvious from just reading the code.
-Kees
> - */
> - set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS,
> - IO_BITMAP_OFFSET + IO_BITMAP_BYTES +
> - sizeof(unsigned long) - 1);
> + set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, TSS_LIMIT);
> write_gdt_entry(d, entry, &tss, DESC_TSS);
> }
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 987c75e..03d3003 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -259,9 +259,10 @@ struct x86_hw_tss {
> #define IO_BITMAP_BITS 65536
> #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8)
> #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long))
> -#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap)
> #define INVALID_IO_BITMAP_OFFSET 0x8000
>
> +#define TSS_LIMIT (offsetof(struct tss_struct, stack) - 1)
> +
> struct tss_struct {
> /*
> * The hardware state:
> --
> 1.8.4.rc3
>
--
Kees Cook
Chrome OS Security
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit
2013-10-30 22:22 ` Kees Cook
@ 2013-10-30 22:53 ` H. Peter Anvin
2013-10-31 11:17 ` Josh Triplett
2013-10-31 11:12 ` Josh Triplett
1 sibling, 1 reply; 17+ messages in thread
From: H. Peter Anvin @ 2013-10-30 22:53 UTC (permalink / raw)
To: Kees Cook, Josh Triplett
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, x86@kernel.org,
Len Brown, Frederic Weisbecker,
virtualization@lists.linux-foundation.org, Paul Gortmaker,
Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi,
Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen,
Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel,
Borislav Petkov, Fenghua Yu, Konrad
On 10/30/2013 03:22 PM, Kees Cook wrote:
>>
>> - /*
>> - * sizeof(unsigned long) coming from an extra "long" at the end
>> - * of the iobitmap. See tss_struct definition in processor.h
>> - *
>> - * -1? seg base+limit should be pointing to the address of the
>> - * last valid byte
>
> I think it might be better to keep at least a minimal comment near the
> TSS_LIMIT declaration, just to explain the "-1" part, which is not
> entirely obvious from just reading the code.
>
Agreed, although it doesn't need to be an unsigned long at all... the
CPU will only ever access one extra byte past the end.
-hpa
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit
2013-10-30 22:22 ` Kees Cook
2013-10-30 22:53 ` H. Peter Anvin
@ 2013-10-31 11:12 ` Josh Triplett
1 sibling, 0 replies; 17+ messages in thread
From: Josh Triplett @ 2013-10-31 11:12 UTC (permalink / raw)
To: Kees Cook
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, x86@kernel.org,
Len Brown, Frederic Weisbecker, H. Peter Anvin,
virtualization@lists.linux-foundation.org, Paul Gortmaker,
Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi,
Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen,
Daniel Lezcano, Ingo Molnar, Steven Rostedt, xen-devel,
Borislav Petkov, Fenghua Yu <feng>
On Wed, Oct 30, 2013 at 03:22:33PM -0700, Kees Cook wrote:
> On Mon, Oct 21, 2013 at 7:34 PM, Josh Triplett <josh@joshtriplett.org> wrote:
> > __set_tss_desc has a complex calculation of the TSS segment limit,
> > duplicating the quirky details of the I/O bitmap array length, and
> > requiring a complex comment to explain. Replace that calculation with a
> > simpler one based on the offsetof the "stack" field that follows the
> > array.
> >
> > That then removes the last use of IO_BITMAP_OFFSET, so delete it.
> >
> > Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> > ---
> > arch/x86/include/asm/desc.h | 11 +----------
> > arch/x86/include/asm/processor.h | 3 ++-
> > 2 files changed, 3 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
> > index b90e5df..17ac92f 100644
> > --- a/arch/x86/include/asm/desc.h
> > +++ b/arch/x86/include/asm/desc.h
> > @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr)
> > struct desc_struct *d = get_cpu_gdt_table(cpu);
> > tss_desc tss;
> >
> > - /*
> > - * sizeof(unsigned long) coming from an extra "long" at the end
> > - * of the iobitmap. See tss_struct definition in processor.h
> > - *
> > - * -1? seg base+limit should be pointing to the address of the
> > - * last valid byte
>
> I think it might be better to keep at least a minimal comment near the
> TSS_LIMIT declaration, just to explain the "-1" part, which is not
> entirely obvious from just reading the code.
Fair enough; I've added an appropriate comment next to TSS_LIMIT, and
I'll include that in PATCHv2, which I'll send out as soon as I see
any feedback on patch 3/3.
- Josh Triplett
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit
2013-10-30 22:53 ` H. Peter Anvin
@ 2013-10-31 11:17 ` Josh Triplett
0 siblings, 0 replies; 17+ messages in thread
From: Josh Triplett @ 2013-10-31 11:17 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Alexander van Heukelum, Jeremy Fitzhardinge, Daniel Lezcano,
Len Brown, Frederic Weisbecker,
virtualization@lists.linux-foundation.org, Paul Gortmaker,
Raghavendra K T, David Herrmann, Masami Hiramatsu, Seiji Aguchi,
Jiri Slaby, Alok Kataria, Jesper Nilsson, Andi Kleen,
x86@kernel.org, Ingo Molnar, Steven Rostedt, xen-devel,
Borislav Petkov, Fenghua Yu, Kees Cook <kees>
On Wed, Oct 30, 2013 at 03:53:11PM -0700, H. Peter Anvin wrote:
> On 10/30/2013 03:22 PM, Kees Cook wrote:
> >>
> >> - /*
> >> - * sizeof(unsigned long) coming from an extra "long" at the end
> >> - * of the iobitmap. See tss_struct definition in processor.h
> >> - *
> >> - * -1? seg base+limit should be pointing to the address of the
> >> - * last valid byte
> >
> > I think it might be better to keep at least a minimal comment near the
> > TSS_LIMIT declaration, just to explain the "-1" part, which is not
> > entirely obvious from just reading the code.
> >
>
> Agreed, although it doesn't need to be an unsigned long at all... the
> CPU will only ever access one extra byte past the end.
True, but the thing immediately following the iobitmap is a stack, which
needs aligning, so the array does need to contain a full additional
unsigned long even if the CPU only accesses a byte of it. In any case,
that isn't the reason for the -1, just the reason for the
sizeof(unsigned long) mentioned in the comment above, which goes away
now that TSS_LIMIT uses the offset of the *following* field rather than
recalculating the size of the iobitmap.
- Josh Triplett
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling
2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett
2013-10-30 22:21 ` Kees Cook
@ 2013-10-31 20:01 ` Alexander van Heukelum
2013-11-01 16:33 ` Josh Triplett
1 sibling, 1 reply; 17+ messages in thread
From: Alexander van Heukelum @ 2013-10-31 20:01 UTC (permalink / raw)
To: Josh Triplett, linux-kernel, virtualization, x86, xen-devel
Cc: Jeremy Fitzhardinge, Raghavendra K T, Len Brown,
Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu,
Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris Metcalf,
Chris Wright
Hi Josh,
Since you added me to the CC, I had a look... Comments inline
On Tue, Oct 22, 2013, at 3:34, Josh Triplett wrote:
> The 32-bit and 64-bit versions of copy_thread have functionally
> identical handling for copying the I/O bitmap, modulo differences in
> error handling. Clean up the error paths in both by moving the copy of
> the I/O bitmap to the end, to eliminate the need to free it if
> subsequent copy steps fail; move the resulting identical code to a
> static inline in a common header.
>
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> ---
> arch/x86/kernel/process-io.h | 22 ++++++++++++++++++++++
I don't particularly like this new file. It's also not in a proper place.
> arch/x86/kernel/process_32.c | 29 ++++++++---------------------
> arch/x86/kernel/process_64.c | 26 +++++---------------------
Yay, this I like :). However, unification is best done in a separate step. First
make the two parts equal in one or two patches. Then, in a separate patch,
do the mechanical unification.
> 3 files changed, 35 insertions(+), 42 deletions(-)
> create mode 100644 arch/x86/kernel/process-io.h
>
> diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h
> new file mode 100644
> index 0000000..d884444
> --- /dev/null
> +++ b/arch/x86/kernel/process-io.h
> @@ -0,0 +1,22 @@
> +#ifndef _X86_KERNEL_PROCESS_IO_H
> +#define _X86_KERNEL_PROCESS_IO_H
> +
> +static inline int copy_io_bitmap(struct task_struct *me,
> + struct task_struct *p)
> +{
> + if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) {
> + p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr,
> + IO_BITMAP_BYTES, GFP_KERNEL);
> + if (!p->thread.io_bitmap_ptr) {
> + p->thread.io_bitmap_max = 0;
> + return -ENOMEM;
> + }
> + set_tsk_thread_flag(p, TIF_IO_BITMAP);
> + } else {
> + p->thread.io_bitmap_ptr = NULL;
> + }
> +
> + return 0;
> +}
> +
I really don't like the .h-file like this. Could you try avoiding it? I think
it's possible to unify the copy_thread function and in the end move it
to process.c instead. The arch-specific parts can be split out in static
functions guarded by CONFIG_X86_64 (cooking up good names is
the challenge here ;) ).
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit
2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett
2013-10-30 22:22 ` Kees Cook
@ 2013-10-31 20:02 ` Alexander van Heukelum
2013-11-01 16:40 ` Josh Triplett
1 sibling, 1 reply; 17+ messages in thread
From: Alexander van Heukelum @ 2013-10-31 20:02 UTC (permalink / raw)
To: Josh Triplett, linux-kernel, virtualization, x86, xen-devel
Cc: Jeremy Fitzhardinge, Raghavendra K T, Len Brown,
Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu,
Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris Metcalf,
Chris Wright
Hi Josh,
On Tue, Oct 22, 2013, at 3:34, Josh Triplett wrote:
> __set_tss_desc has a complex calculation of the TSS segment limit,
> duplicating the quirky details of the I/O bitmap array length, and
> requiring a complex comment to explain. Replace that calculation with a
> simpler one based on the offsetof the "stack" field that follows the
> array.
>
> That then removes the last use of IO_BITMAP_OFFSET, so delete it.
>
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> ---
> arch/x86/include/asm/desc.h | 11 +----------
> arch/x86/include/asm/processor.h | 3 ++-
> 2 files changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
> index b90e5df..17ac92f 100644
> --- a/arch/x86/include/asm/desc.h
> +++ b/arch/x86/include/asm/desc.h
> @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr)
> struct desc_struct *d = get_cpu_gdt_table(cpu);
> tss_desc tss;
>
> - /*
> - * sizeof(unsigned long) coming from an extra "long" at the end
> - * of the iobitmap. See tss_struct definition in processor.h
> - *
> - * -1? seg base+limit should be pointing to the address of the
> - * last valid byte
> - */
> - set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS,
> - IO_BITMAP_OFFSET + IO_BITMAP_BYTES +
> - sizeof(unsigned long) - 1);
> + set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, TSS_LIMIT);
> write_gdt_entry(d, entry, &tss, DESC_TSS);
> }
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 987c75e..03d3003 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -259,9 +259,10 @@ struct x86_hw_tss {
> #define IO_BITMAP_BITS 65536
> #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8)
> #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long))
> -#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap)
> #define INVALID_IO_BITMAP_OFFSET 0x8000
>
> +#define TSS_LIMIT (offsetof(struct tss_struct, stack) - 1)
> +
I wonder if the 'stack' in tss_struct has any meaning anymore. Can it be removed?
If so, the offsetof can be changed to sizeof(struct tss_struct), which looks more
natural to me.
Even so, I think this is already an improvement.
Acked-by: Alexander van Heukelum <heukelum@fastmail.fm>
> struct tss_struct {
> /*
> * The hardware state:
> --
> 1.8.4.rc3
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm)
2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
2013-10-26 3:17 ` Stephen Hemminger
@ 2013-10-31 20:04 ` Alexander van Heukelum
2013-11-01 17:19 ` Josh Triplett
1 sibling, 1 reply; 17+ messages in thread
From: Alexander van Heukelum @ 2013-10-31 20:04 UTC (permalink / raw)
To: Josh Triplett, linux-kernel, virtualization, x86, xen-devel
Cc: Jeremy Fitzhardinge, Raghavendra K T, Len Brown,
Frederic Weisbecker, H. Peter Anvin, Paul Gortmaker,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, Borislav Petkov, Fenghua Yu,
Kees Cook, Konrad Rzeszutek Wilk, Ross Lagerwall, Chris Metcalf,
Chris Wright
Hi Josh,
On Tue, Oct 22, 2013, at 3:35, Josh Triplett wrote:
> On the vast majority of modern systems, no processes will use the
> userspsace I/O syscalls, iopl and ioperm. Add a new config option,
> CONFIG_X86_IOPORT, to support configuring them out of the kernel
> entirely. Since these syscalls only exist to support rare legacy
> userspace programs, X86_IOPORT does not depend on EXPERT, though it does
> still default to y.
>
> In addition to saving a significant amount of space, this also reduces
> the size of several major kernel data structures, drops a bit of code
> from several task-related hot paths, and reduces the attack surface of
> the kernel.
>
> All of the task-related code dealing with userspace I/O permissions now
> lives in process-io.h as several new inline functions, which become
> no-ops when CONFIG_X86_IOPORT.
>
> bloat-o-meter shows a net reduction of 17681 bytes on 32-bit and 9719
> bytes on 64-bit:
>
> 32-bit bloat-o-meter:
> add/remove: 0/3 grow/shrink: 0/10 up/down: 0/-17681 (-17681)
> function old new delta
> cpu_init 676 668 -8
> ioperm_active 18 7 -11
> init_task 1296 1284 -12
> exit_thread 179 91 -88
> ioperm_get 103 10 -93
> __switch_to_xtra 254 161 -93
> sys_iopl 127 - -127
> SyS_iopl 127 - -127
These are not really different ;)
> copy_thread 606 446 -160
> vt_ioctl 4127 3919 -208
> sys_ioperm 370 - -370
> init_tss 8576 384 -8192
> doublefault_tss 8576 384 -8192
>
> 64-bit bloat-o-meter:
> add/remove: 0/4 grow/shrink: 2/9 up/down: 45/-9764 (-9719)
> function old new delta
> cpu_init 958 995 +37
> arch_align_stack 78 86 +8
> perf_event_exit_task 525 517 -8
> ioperm_active 17 8 -9
> init_task 1968 1944 -24
> stub_iopl 81 - -81
> ioperm_get 111 11 -100
> __switch_to_xtra 281 164 -117
> exit_thread 212 92 -120
> vt_ioctl 4432 4304 -128
> sys_iopl 137 - -137
> SyS_iopl 137 - -137
> copy_thread 694 520 -174
> sys_ioperm 473 - -473
> init_tss 8896 640 -8256
>
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> ---
> arch/x86/Kconfig | 10 +++++
> arch/x86/include/asm/paravirt.h | 2 +
> arch/x86/include/asm/paravirt_types.h | 2 +
> arch/x86/include/asm/processor.h | 51 +++++++++++++++++++++----
> arch/x86/include/asm/syscalls.h | 3 ++
> arch/x86/kernel/Makefile | 3 +-
> arch/x86/kernel/cpu/common.c | 12 +-----
> arch/x86/kernel/entry_64.S | 9 +++--
> arch/x86/kernel/paravirt.c | 2 +
> arch/x86/kernel/process-io.h | 71 +++++++++++++++++++++++++++++++++++
> arch/x86/kernel/process.c | 34 ++---------------
> arch/x86/kernel/process_32.c | 11 +-----
> arch/x86/kernel/ptrace.c | 8 ++++
> arch/x86/xen/enlighten.c | 4 ++
> drivers/tty/vt/vt_ioctl.c | 2 +-
> kernel/sys_ni.c | 5 +++
> 16 files changed, 168 insertions(+), 61 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index e241a19..d5b1e68 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -976,6 +976,16 @@ config VM86
> XFree86 to initialize some video cards via BIOS. Disabling this
> option saves about 6k.
>
> +config X86_IOPORT
> + bool "iopl and ioperm system calls"
> + default y
> + ---help---
> + This option enables the iopl and ioperm system calls, which allow
> + privileged userspace processes to directly access I/O ports. This
> + is used by some legacy software to drive hardware directly from
> + userspace rather than via a proper kernel driver. Unless you intend
> + to run such software, you can safely say N here.
> +
I think this entry should be under General setup / Configure standard kernel
features (expert users). Remove references to "legacy" and "proper driver".
> config TOSHIBA
> tristate "Toshiba Laptop support"
> depends on X86_32
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index 401f350..2e106d5 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -299,10 +299,12 @@ static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g)
> {
> PVOP_VCALL3(pv_cpu_ops.write_idt_entry, dt, entry, g);
> }
> +#ifdef CONFIG_X86_IOPORT
> static inline void set_iopl_mask(unsigned mask)
> {
> PVOP_VCALL1(pv_cpu_ops.set_iopl_mask, mask);
> }
> +#endif /* CONFIG_X86_IOPORT */
>
> /* The paravirtualized I/O functions */
> static inline void slow_down_io(void)
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index aab8f67..bc5bfeb 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -142,7 +142,9 @@ struct pv_cpu_ops {
>
> void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t);
>
> +#ifdef CONFIG_X86_IOPORT
> void (*set_iopl_mask)(unsigned mask);
> +#endif /* CONFIG_X86_IOPORT */
>
> void (*wbinvd)(void);
> void (*io_delay)(void);
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 03d3003..48f6fbe 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -256,7 +256,11 @@ struct x86_hw_tss {
> /*
> * IO-bitmap sizes:
> */
> +#ifdef CONFIG_X86_IOPORT
> #define IO_BITMAP_BITS 65536
> +#else
> +#define IO_BITMAP_BITS 0
> +#endif
> #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8)
> #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long))
> #define INVALID_IO_BITMAP_OFFSET 0x8000
> @@ -269,6 +273,7 @@ struct tss_struct {
> */
> struct x86_hw_tss x86_tss;
>
> +#ifdef CONFIG_X86_IOPORT
> /*
> * The extra 1 is there because the CPU will access an
> * additional byte beyond the end of the IO permission
> @@ -276,6 +281,7 @@ struct tss_struct {
> * be within the limit.
> */
> unsigned long io_bitmap[IO_BITMAP_LONGS + 1];
> +#endif /* CONFIG_X86_IOPORT */
>
> /*
> * .. and then another 0x100 bytes for the emergency kernel stack:
> @@ -286,6 +292,24 @@ struct tss_struct {
>
> DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss);
>
> +static inline void init_tss_io(struct tss_struct *t)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + int i;
> +
> + t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
> +
> + /*
> + * <= is required because the CPU will access up to
> + * 8 bits beyond the end of the IO permission bitmap.
> + */
> + for (i = 0; i <= IO_BITMAP_LONGS; i++)
> + t->io_bitmap[i] = ~0UL;
> +#else
> + t->x86_tss.io_bitmap_base = INVALID_IO_BITMAP_OFFSET;
> +#endif
> +}
> +
> /*
> * Save the original ist values for checking stack pointers during debugging
> */
> @@ -484,13 +508,16 @@ struct thread_struct {
> unsigned int saved_fs;
> unsigned int saved_gs;
> #endif
> +#ifdef CONFIG_X86_IOPORT
> /* IO permissions: */
> unsigned long *io_bitmap_ptr;
> unsigned long iopl;
> /* Max allowed port in the bitmap, in bytes: */
> unsigned io_bitmap_max;
> +#endif /* CONFIG_X86_IOPORT */
> };
>
> +#ifdef CONFIG_X86_IOPORT
> /*
> * Set IOPL bits in EFLAGS from given mask
> */
> @@ -509,6 +536,7 @@ static inline void native_set_iopl_mask(unsigned mask)
> : "i" (~X86_EFLAGS_IOPL), "r" (mask));
> #endif
> }
> +#endif /* CONFIG_X86_IOPORT */
>
> static inline void
> native_load_sp0(struct tss_struct *tss, struct thread_struct *thread)
> @@ -828,12 +856,8 @@ static inline void spin_lock_prefetch(const void *x)
> #define STACK_TOP TASK_SIZE
> #define STACK_TOP_MAX STACK_TOP
>
> -#define INIT_THREAD { \
> - .sp0 = sizeof(init_stack) + (long)&init_stack, \
> - .vm86_info = NULL, \
> - .sysenter_cs = __KERNEL_CS, \
> - .io_bitmap_ptr = NULL, \
> -}
> +#ifdef CONFIG_X86_IOPORT
> +#define INIT_THREAD_IO .io_bitmap_ptr = NULL,
>
> /*
> * Note that the .io_bitmap member must be extra-big. This is because
> @@ -841,6 +865,19 @@ static inline void spin_lock_prefetch(const void *x)
> * permission bitmap. The extra byte must be all 1 bits, and must
> * be within the limit.
> */
> +#define INIT_TSS_IO .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 },
> +#else
> +#define INIT_THREAD_IO
> +#define INIT_TSS_IO
> +#endif
> +
> +#define INIT_THREAD { \
> + .sp0 = sizeof(init_stack) + (long)&init_stack, \
> + .vm86_info = NULL, \
> + .sysenter_cs = __KERNEL_CS, \
> + INIT_THREAD_IO \
> +}
> +
> #define INIT_TSS { \
> .x86_tss = { \
> .sp0 = sizeof(init_stack) + (long)&init_stack, \
> @@ -848,7 +885,7 @@ static inline void spin_lock_prefetch(const void *x)
> .ss1 = __KERNEL_CS, \
> .io_bitmap_base = INVALID_IO_BITMAP_OFFSET, \
> }, \
> - .io_bitmap = { [0 ... IO_BITMAP_LONGS] = ~0 }, \
> + INIT_TSS_IO \
> }
>
> extern unsigned long thread_saved_pc(struct task_struct *tsk);
> diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
> index 592a6a6..4db79ea 100644
> --- a/arch/x86/include/asm/syscalls.h
> +++ b/arch/x86/include/asm/syscalls.h
> @@ -16,9 +16,12 @@
> #include <linux/types.h>
>
> /* Common in X86_32 and X86_64 */
> +
> +#ifdef CONFIG_X86_IOPORT
> /* kernel/ioport.c */
> asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
> asmlinkage long sys_iopl(unsigned int);
> +#endif /* CONFIG_X86_IOPORT */
>
> /* kernel/ldt.c */
> asmlinkage int sys_modify_ldt(int, void __user *, unsigned long);
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index a5408b9..9644737 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -20,7 +20,8 @@ CFLAGS_irq.o := -I$(src)/../include/asm/trace
>
> obj-y := process_$(BITS).o signal.o entry_$(BITS).o
> obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
> -obj-y += time.o ioport.o ldt.o dumpstack.o nmi.o
> +obj-y += time.o ldt.o dumpstack.o nmi.o
> +obj-$(CONFIG_X86_IOPORT) += ioport.o
> obj-y += setup.o x86_init.o i8259.o irqinit.o jump_label.o
> obj-$(CONFIG_IRQ_WORK) += irq_work.o
> obj-y += probe_roms.o
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 2793d1f..01235ef 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1223,7 +1223,6 @@ void cpu_init(void)
> struct tss_struct *t;
> unsigned long v;
> int cpu;
> - int i;
>
> /*
> * Load microcode on this cpu if a valid microcode is available.
> @@ -1285,14 +1284,7 @@ void cpu_init(void)
> }
> }
>
> - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
> -
> - /*
> - * <= is required because the CPU will access up to
> - * 8 bits beyond the end of the IO permission bitmap.
> - */
> - for (i = 0; i <= IO_BITMAP_LONGS; i++)
> - t->io_bitmap[i] = ~0UL;
> + init_tss_io(t);
>
> atomic_inc(&init_mm.mm_count);
> me->active_mm = &init_mm;
> @@ -1351,7 +1343,7 @@ void cpu_init(void)
> load_TR_desc();
> load_LDT(&init_mm.context);
>
> - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
> + init_tss_io(t);
This patch is too big. I think it would all look nicer if you added ioport.c in
one patch, and then convert the users in a separate patch?
> #ifdef CONFIG_DOUBLEFAULT
> /* Set up doublefault TSS pointer in the GDT */
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 1b69951..eb4f5ba 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -844,6 +844,11 @@ ENTRY(stub_\func)
> END(stub_\func)
> .endm
>
> + FORK_LIKE clone
> + FORK_LIKE fork
> + FORK_LIKE vfork
> +
> +#ifdef CONFIG_X86_IOPORT
> .macro FIXED_FRAME label,func
> ENTRY(\label)
> CFI_STARTPROC
> @@ -856,10 +861,8 @@ ENTRY(\label)
> END(\label)
> .endm
>
> - FORK_LIKE clone
> - FORK_LIKE fork
> - FORK_LIKE vfork
> FIXED_FRAME stub_iopl, sys_iopl
> +#endif /* CONFIG_X86_IOPORT */
>
> ENTRY(ptregscall_common)
> DEFAULT_FRAME 1 8 /* offset 8: return address */
> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index 1b10af8..46da3d9 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -382,7 +382,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
> .iret = native_iret,
> .swapgs = native_swapgs,
>
> +#ifdef CONFIG_X86_IOPORT
> .set_iopl_mask = native_set_iopl_mask,
> +#endif /* CONFIG_X86_IOPORT */
> .io_delay = native_io_delay,
>
> .start_context_switch = paravirt_nop,
> diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h
> index d884444..3e773fa 100644
> --- a/arch/x86/kernel/process-io.h
> +++ b/arch/x86/kernel/process-io.h
> @@ -1,9 +1,17 @@
> #ifndef _X86_KERNEL_PROCESS_IO_H
> #define _X86_KERNEL_PROCESS_IO_H
>
> +static inline void clear_thread_io_bitmap(struct task_struct *p)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + p->thread.io_bitmap_ptr = NULL;
> +#endif /* CONFIG_X86_IOPORT */
> +}
> +
This is thought of as ugly... Instead, do something like
#ifndef CONFIG_X86_IOPORT
static inline void clear_thread_io_bitmap(struct task_struct *p) {}
static inline int copy_io_bitmap(struct task_struct *me, struct task_struct *p) {return 0}
... etc...
#else
static inline void clear_thread_io_bitmap(struct task_struct *p)
{
p->thread.io_bitmap_ptr = NULL;
}
... etc...
#endif
> static inline int copy_io_bitmap(struct task_struct *me,
> struct task_struct *p)
> {
> +#ifdef CONFIG_X86_IOPORT
> if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) {
> p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr,
> IO_BITMAP_BYTES, GFP_KERNEL);
> @@ -15,8 +23,71 @@ static inline int copy_io_bitmap(struct task_struct *me,
> } else {
> p->thread.io_bitmap_ptr = NULL;
> }
> +#endif /* CONFIG_X86_IOPORT */
>
> return 0;
> }
>
> +static inline void exit_thread_io(struct task_struct *me)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + struct thread_struct *t = &me->thread;
> + unsigned long *bp = t->io_bitmap_ptr;
> +
> + if (bp) {
> + struct tss_struct *tss = &per_cpu(init_tss, get_cpu());
> +
> + t->io_bitmap_ptr = NULL;
> + clear_thread_flag(TIF_IO_BITMAP);
> + /*
> + * Careful, clear this in the TSS too:
> + */
> + memset(tss->io_bitmap, 0xff, t->io_bitmap_max);
> + t->io_bitmap_max = 0;
> + put_cpu();
> + kfree(bp);
> + }
> +#endif /* CONFIG_X86_IOPORT */
> +}
> +
> +static inline void switch_iopl_mask(struct thread_struct *prev,
> + struct thread_struct *next)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + /*
> + * Restore IOPL if needed. In normal use, the flags restore
> + * in the switch assembly will handle this. But if the kernel
> + * is running virtualized at a non-zero CPL, the popf will
> + * not restore flags, so it must be done in a separate step.
> + */
> + if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
> + set_iopl_mask(next->iopl);
> +#endif /* CONFIG_X86_IOPORT */
> +}
> +
> +static inline void switch_io_bitmap(struct tss_struct *tss,
> + struct task_struct *prev_p,
> + struct task_struct *next_p)
> +{
> +#ifdef CONFIG_X86_IOPORT
> + struct thread_struct *prev, *next;
> + prev = &prev_p->thread;
> + next = &next_p->thread;
> +
> + if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
> + /*
> + * Copy the relevant range of the IO bitmap.
> + * Normally this is 128 bytes or less:
> + */
> + memcpy(tss->io_bitmap, next->io_bitmap_ptr,
> + max(prev->io_bitmap_max, next->io_bitmap_max));
> + } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) {
> + /*
> + * Clear any possible leftover bits:
> + */
> + memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
> + }
> +#endif /* CONFIG_X86_IOPORT */
> +}
> +
> #endif /* _X86_KERNEL_PROCESS_IO_H */
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index c83516b..2df5b00 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -29,6 +29,8 @@
> #include <asm/debugreg.h>
> #include <asm/nmi.h>
>
> +#include "process-io.h"
> +
> /*
> * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> * no more per-task TSS's. The TSS size is kept cacheline-aligned
> @@ -101,23 +103,7 @@ void arch_task_cache_init(void)
> void exit_thread(void)
> {
> struct task_struct *me = current;
> - struct thread_struct *t = &me->thread;
> - unsigned long *bp = t->io_bitmap_ptr;
> -
> - if (bp) {
> - struct tss_struct *tss = &per_cpu(init_tss, get_cpu());
> -
> - t->io_bitmap_ptr = NULL;
> - clear_thread_flag(TIF_IO_BITMAP);
> - /*
> - * Careful, clear this in the TSS too:
> - */
> - memset(tss->io_bitmap, 0xff, t->io_bitmap_max);
> - t->io_bitmap_max = 0;
> - put_cpu();
> - kfree(bp);
> - }
> -
> + exit_thread_io(me);
> drop_fpu(me);
> }
>
> @@ -222,19 +208,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
> hard_enable_TSC();
> }
>
> - if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
> - /*
> - * Copy the relevant range of the IO bitmap.
> - * Normally this is 128 bytes or less:
> - */
> - memcpy(tss->io_bitmap, next->io_bitmap_ptr,
> - max(prev->io_bitmap_max, next->io_bitmap_max));
> - } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) {
> - /*
> - * Clear any possible leftover bits:
> - */
> - memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
> - }
> + switch_io_bitmap(tss, prev_p, next_p);
> propagate_user_return_notify(prev_p, next_p);
> }
>
> diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
> index e86028b..dc780cb 100644
> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -155,7 +155,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
> childregs->cs = __KERNEL_CS | get_kernel_rpl();
> childregs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED;
> p->fpu_counter = 0;
> - p->thread.io_bitmap_ptr = NULL;
> + clear_thread_io_bitmap(p);
> memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
> return 0;
> }
> @@ -269,14 +269,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
> */
> load_TLS(next, cpu);
>
> - /*
> - * Restore IOPL if needed. In normal use, the flags restore
> - * in the switch assembly will handle this. But if the kernel
> - * is running virtualized at a non-zero CPL, the popf will
> - * not restore flags, so it must be done in a separate step.
> - */
> - if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
> - set_iopl_mask(next->iopl);
> + switch_iopl_mask(prev, next);
>
> /*
> * Now maybe handle debug registers and/or IO bitmaps
If copy_thread would be in process.c instead, the .h-file would be unnecessary,
right?
> diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
> index 7461f50..f071bfa 100644
> --- a/arch/x86/kernel/ptrace.c
> +++ b/arch/x86/kernel/ptrace.c
> @@ -785,7 +785,11 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n,
> static int ioperm_active(struct task_struct *target,
> const struct user_regset *regset)
> {
> +#ifdef CONFIG_X86_IOPORT
> return target->thread.io_bitmap_max / regset->size;
> +#else
> + return 0;
> +#endif
> }
>
> static int ioperm_get(struct task_struct *target,
> @@ -793,12 +797,16 @@ static int ioperm_get(struct task_struct *target,
> unsigned int pos, unsigned int count,
> void *kbuf, void __user *ubuf)
> {
> +#ifdef CONFIG_X86_IOPORT
> if (!target->thread.io_bitmap_ptr)
> return -ENXIO;
>
> return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
> target->thread.io_bitmap_ptr,
> 0, IO_BITMAP_BYTES);
> +#else
> + return -ENXIO;
> +#endif
> }
>
> /*
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index fa6ade7..f485d92 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -911,6 +911,7 @@ static void xen_load_sp0(struct tss_struct *tss,
> xen_mc_issue(PARAVIRT_LAZY_CPU);
> }
>
> +#ifdef CONFIG_X86_IOPORT
> static void xen_set_iopl_mask(unsigned mask)
> {
> struct physdev_set_iopl set_iopl;
> @@ -919,6 +920,7 @@ static void xen_set_iopl_mask(unsigned mask)
> set_iopl.iopl = (mask == 0) ? 1 : (mask >> 12) & 3;
> HYPERVISOR_physdev_op(PHYSDEVOP_set_iopl, &set_iopl);
> }
> +#endif /* CONFIG_X86_IOPORT */
>
> static void xen_io_delay(void)
> {
> @@ -1277,7 +1279,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> .write_idt_entry = xen_write_idt_entry,
> .load_sp0 = xen_load_sp0,
>
> +#ifdef CONFIG_X86_IOPORT
> .set_iopl_mask = xen_set_iopl_mask,
> +#endif /* CONFIG_X86_IOPORT */
> .io_delay = xen_io_delay,
>
> /* Xen takes care of %gs when switching to usermode for us */
> diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
> index 2bd78e2..7c7d405 100644
> --- a/drivers/tty/vt/vt_ioctl.c
> +++ b/drivers/tty/vt/vt_ioctl.c
> @@ -410,7 +410,7 @@ int vt_ioctl(struct tty_struct *tty,
> *
> * XXX: you should never use these, just call ioperm directly..
> */
> -#ifdef CONFIG_X86
> +#ifdef CONFIG_X86_IOPORT
> case KDADDIO:
> case KDDELIO:
> /*
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index 7078052..e7f4a35 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -209,3 +209,8 @@ cond_syscall(compat_sys_open_by_handle_at);
>
> /* compare kernel pointers */
> cond_syscall(sys_kcmp);
> +
> +/* userspace I/O port access */
> +cond_syscall(stub_iopl);
> +cond_syscall(sys_iopl);
> +cond_syscall(sys_ioperm);
> --
> 1.8.4.rc3
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling
2013-10-31 20:01 ` Alexander van Heukelum
@ 2013-11-01 16:33 ` Josh Triplett
0 siblings, 0 replies; 17+ messages in thread
From: Josh Triplett @ 2013-11-01 16:33 UTC (permalink / raw)
To: Alexander van Heukelum
Cc: Jeremy Fitzhardinge, x86, Len Brown, Frederic Weisbecker,
H. Peter Anvin, virtualization, Paul Gortmaker, Raghavendra K T,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov,
Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross
On Thu, Oct 31, 2013 at 09:01:30PM +0100, Alexander van Heukelum wrote:
> On Tue, Oct 22, 2013, at 3:34, Josh Triplett wrote:
> > The 32-bit and 64-bit versions of copy_thread have functionally
> > identical handling for copying the I/O bitmap, modulo differences in
> > error handling. Clean up the error paths in both by moving the copy of
> > the I/O bitmap to the end, to eliminate the need to free it if
> > subsequent copy steps fail; move the resulting identical code to a
> > static inline in a common header.
> >
> > Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> > ---
> > arch/x86/kernel/process-io.h | 22 ++++++++++++++++++++++
>
> I don't particularly like this new file. It's also not in a proper place.
It's a private header, and private headers are normally put in the
directories with the source code, not in the include directories.
The need for it becomes more obvious later in the patch series.
> > arch/x86/kernel/process_32.c | 29 ++++++++---------------------
> > arch/x86/kernel/process_64.c | 26 +++++---------------------
>
> Yay, this I like :). However, unification is best done in a separate step. First
> make the two parts equal in one or two patches. Then, in a separate patch,
> do the mechanical unification.
I could certainly split the first patch into two, sure.
> > 3 files changed, 35 insertions(+), 42 deletions(-)
> > create mode 100644 arch/x86/kernel/process-io.h
> >
> > diff --git a/arch/x86/kernel/process-io.h b/arch/x86/kernel/process-io.h
> > new file mode 100644
> > index 0000000..d884444
> > --- /dev/null
> > +++ b/arch/x86/kernel/process-io.h
> > @@ -0,0 +1,22 @@
> > +#ifndef _X86_KERNEL_PROCESS_IO_H
> > +#define _X86_KERNEL_PROCESS_IO_H
> > +
> > +static inline int copy_io_bitmap(struct task_struct *me,
> > + struct task_struct *p)
> > +{
> > + if (unlikely(test_tsk_thread_flag(me, TIF_IO_BITMAP))) {
> > + p->thread.io_bitmap_ptr = kmemdup(me->thread.io_bitmap_ptr,
> > + IO_BITMAP_BYTES, GFP_KERNEL);
> > + if (!p->thread.io_bitmap_ptr) {
> > + p->thread.io_bitmap_max = 0;
> > + return -ENOMEM;
> > + }
> > + set_tsk_thread_flag(p, TIF_IO_BITMAP);
> > + } else {
> > + p->thread.io_bitmap_ptr = NULL;
> > + }
> > +
> > + return 0;
> > +}
> > +
>
> I really don't like the .h-file like this. Could you try avoiding it? I think
> it's possible to unify the copy_thread function and in the end move it
> to process.c instead. The arch-specific parts can be split out in static
> functions guarded by CONFIG_X86_64 (cooking up good names is
> the challenge here ;) ).
It's theoretically possible to completely unify copy_thread by adding
several more functions like this to abstract the low-level details;
however, complete unification wasn't the goal of this patch series. I
unified and pulled out this portion because it's also the portion I need
to *remove* when compiling with !CONFIG_X86_IOPORT.
- Josh Triplett
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit
2013-10-31 20:02 ` Alexander van Heukelum
@ 2013-11-01 16:40 ` Josh Triplett
0 siblings, 0 replies; 17+ messages in thread
From: Josh Triplett @ 2013-11-01 16:40 UTC (permalink / raw)
To: Alexander van Heukelum
Cc: Jeremy Fitzhardinge, x86, Len Brown, Frederic Weisbecker,
H. Peter Anvin, virtualization, Paul Gortmaker, Raghavendra K T,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov,
Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross
On Thu, Oct 31, 2013 at 09:02:14PM +0100, Alexander van Heukelum wrote:
> Hi Josh,
>
> On Tue, Oct 22, 2013, at 3:34, Josh Triplett wrote:
> > __set_tss_desc has a complex calculation of the TSS segment limit,
> > duplicating the quirky details of the I/O bitmap array length, and
> > requiring a complex comment to explain. Replace that calculation with a
> > simpler one based on the offsetof the "stack" field that follows the
> > array.
> >
> > That then removes the last use of IO_BITMAP_OFFSET, so delete it.
> >
> > Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> > ---
> > arch/x86/include/asm/desc.h | 11 +----------
> > arch/x86/include/asm/processor.h | 3 ++-
> > 2 files changed, 3 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
> > index b90e5df..17ac92f 100644
> > --- a/arch/x86/include/asm/desc.h
> > +++ b/arch/x86/include/asm/desc.h
> > @@ -177,16 +177,7 @@ static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr)
> > struct desc_struct *d = get_cpu_gdt_table(cpu);
> > tss_desc tss;
> >
> > - /*
> > - * sizeof(unsigned long) coming from an extra "long" at the end
> > - * of the iobitmap. See tss_struct definition in processor.h
> > - *
> > - * -1? seg base+limit should be pointing to the address of the
> > - * last valid byte
> > - */
> > - set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS,
> > - IO_BITMAP_OFFSET + IO_BITMAP_BYTES +
> > - sizeof(unsigned long) - 1);
> > + set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, TSS_LIMIT);
> > write_gdt_entry(d, entry, &tss, DESC_TSS);
> > }
> >
> > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > index 987c75e..03d3003 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -259,9 +259,10 @@ struct x86_hw_tss {
> > #define IO_BITMAP_BITS 65536
> > #define IO_BITMAP_BYTES (IO_BITMAP_BITS/8)
> > #define IO_BITMAP_LONGS (IO_BITMAP_BYTES/sizeof(long))
> > -#define IO_BITMAP_OFFSET offsetof(struct tss_struct, io_bitmap)
> > #define INVALID_IO_BITMAP_OFFSET 0x8000
> >
> > +#define TSS_LIMIT (offsetof(struct tss_struct, stack) - 1)
> > +
>
> I wonder if the 'stack' in tss_struct has any meaning anymore. Can it be removed?
> If so, the offsetof can be changed to sizeof(struct tss_struct), which looks more
> natural to me.
That might be possible, if nothing references the stack. The obvious
references would be easy to search for, but I'd be concerned that
something might reference the stack in some non-obvious way, like "right
after the TSS base+limit", which a search wouldn't necessarily turn up.
In any case, that change could easily occur in another patch on top of
this one.
> Even so, I think this is already an improvement.
>
> Acked-by: Alexander van Heukelum <heukelum@fastmail.fm>
Thanks.
- Josh Triplett
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm)
2013-10-31 20:04 ` Alexander van Heukelum
@ 2013-11-01 17:19 ` Josh Triplett
0 siblings, 0 replies; 17+ messages in thread
From: Josh Triplett @ 2013-11-01 17:19 UTC (permalink / raw)
To: Alexander van Heukelum
Cc: Jeremy Fitzhardinge, x86, Len Brown, Frederic Weisbecker,
H. Peter Anvin, virtualization, Paul Gortmaker, Raghavendra K T,
David Herrmann, Masami Hiramatsu, Seiji Aguchi, Jiri Slaby,
Alok Kataria, Jesper Nilsson, Andi Kleen, Daniel Lezcano,
Ingo Molnar, Steven Rostedt, xen-devel, Borislav Petkov,
Fenghua Yu, Kees Cook, Konrad Rzeszutek Wilk, Ross
On Thu, Oct 31, 2013 at 09:04:42PM +0100, Alexander van Heukelum wrote:
> On Tue, Oct 22, 2013, at 3:35, Josh Triplett wrote:
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -976,6 +976,16 @@ config VM86
> > XFree86 to initialize some video cards via BIOS. Disabling this
> > option saves about 6k.
> >
> > +config X86_IOPORT
> > + bool "iopl and ioperm system calls"
> > + default y
> > + ---help---
> > + This option enables the iopl and ioperm system calls, which allow
> > + privileged userspace processes to directly access I/O ports. This
> > + is used by some legacy software to drive hardware directly from
> > + userspace rather than via a proper kernel driver. Unless you intend
> > + to run such software, you can safely say N here.
> > +
>
> I think this entry should be under General setup / Configure standard kernel
> features (expert users).
It's entirely x86-specific, and it's similar to VM86 just above it; it
belongs on the x86-specific menu.
> Remove references to "legacy" and "proper driver".
Hardware drivers belong in the kernel, and anything using iopl or ioperm
won't run on x86-64, which argues rather strongly for its obsolescence.
There's also /dev/port for cleaner access to ports from userspace.
However, I've rephrased this for v2.
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -1223,7 +1223,6 @@ void cpu_init(void)
> > struct tss_struct *t;
> > unsigned long v;
> > int cpu;
> > - int i;
> >
> > /*
> > * Load microcode on this cpu if a valid microcode is available.
> > @@ -1285,14 +1284,7 @@ void cpu_init(void)
> > }
> > }
> >
> > - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
> > -
> > - /*
> > - * <= is required because the CPU will access up to
> > - * 8 bits beyond the end of the IO permission bitmap.
> > - */
> > - for (i = 0; i <= IO_BITMAP_LONGS; i++)
> > - t->io_bitmap[i] = ~0UL;
> > + init_tss_io(t);
> >
> > atomic_inc(&init_mm.mm_count);
> > me->active_mm = &init_mm;
> > @@ -1351,7 +1343,7 @@ void cpu_init(void)
> > load_TR_desc();
> > load_LDT(&init_mm.context);
> >
> > - t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
> > + init_tss_io(t);
>
> This patch is too big. I think it would all look nicer if you added ioport.c in
> one patch, and then convert the users in a separate patch?
I'm not sure what you mean by "added ioport.c"; ioport.c already exists,
and this patch just makes it optional.
> > --- a/arch/x86/kernel/process-io.h
> > +++ b/arch/x86/kernel/process-io.h
> > @@ -1,9 +1,17 @@
> > #ifndef _X86_KERNEL_PROCESS_IO_H
> > #define _X86_KERNEL_PROCESS_IO_H
> >
> > +static inline void clear_thread_io_bitmap(struct task_struct *p)
> > +{
> > +#ifdef CONFIG_X86_IOPORT
> > + p->thread.io_bitmap_ptr = NULL;
> > +#endif /* CONFIG_X86_IOPORT */
> > +}
> > +
>
> This is thought of as ugly... Instead, do something like
>
> #ifndef CONFIG_X86_IOPORT
>
> static inline void clear_thread_io_bitmap(struct task_struct *p) {}
> static inline int copy_io_bitmap(struct task_struct *me, struct task_struct *p) {return 0}
> ... etc...
>
> #else
>
> static inline void clear_thread_io_bitmap(struct task_struct *p)
> {
> p->thread.io_bitmap_ptr = NULL;
> }
> ... etc...
>
> #endif
In .c files, any ifdefs at all are ugly; in .h files, both styles are
quite common, and in particular the style I used is common when omitting
the entire body of the function. It has the advantage of keeping a
common function signature rather than duplicating it.
> > --- a/arch/x86/kernel/process_32.c
> > +++ b/arch/x86/kernel/process_32.c
> > @@ -155,7 +155,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
> > childregs->cs = __KERNEL_CS | get_kernel_rpl();
> > childregs->flags = X86_EFLAGS_IF | X86_EFLAGS_FIXED;
> > p->fpu_counter = 0;
> > - p->thread.io_bitmap_ptr = NULL;
> > + clear_thread_io_bitmap(p);
> > memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
> > return 0;
> > }
> > @@ -269,14 +269,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
> > */
> > load_TLS(next, cpu);
> >
> > - /*
> > - * Restore IOPL if needed. In normal use, the flags restore
> > - * in the switch assembly will handle this. But if the kernel
> > - * is running virtualized at a non-zero CPL, the popf will
> > - * not restore flags, so it must be done in a separate step.
> > - */
> > - if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
> > - set_iopl_mask(next->iopl);
> > + switch_iopl_mask(prev, next);
> >
> > /*
> > * Now maybe handle debug registers and/or IO bitmaps
>
> If copy_thread would be in process.c instead, the .h-file would be unnecessary,
> right?
No, there are calls to the new functions in the .h file in several other
places.
- Josh Triplett
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2013-11-01 17:19 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-22 2:33 [PATCH 0/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
2013-10-22 2:34 ` [PATCH 1/3] x86: process: Unify 32-bit and 64-bit copy_thread I/O bitmap handling Josh Triplett
2013-10-30 22:21 ` Kees Cook
2013-10-31 20:01 ` Alexander van Heukelum
2013-11-01 16:33 ` Josh Triplett
2013-10-22 2:34 ` [PATCH 2/3] x86: tss: Eliminate fragile calculation of TSS segment limit Josh Triplett
2013-10-30 22:22 ` Kees Cook
2013-10-30 22:53 ` H. Peter Anvin
2013-10-31 11:17 ` Josh Triplett
2013-10-31 11:12 ` Josh Triplett
2013-10-31 20:02 ` Alexander van Heukelum
2013-11-01 16:40 ` Josh Triplett
2013-10-22 2:35 ` [PATCH 3/3] x86: Support compiling out userspace I/O (iopl and ioperm) Josh Triplett
2013-10-26 3:17 ` Stephen Hemminger
2013-10-26 4:30 ` Kees Cook
2013-10-31 20:04 ` Alexander van Heukelum
2013-11-01 17:19 ` Josh Triplett
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).