* [PATCH 000/109] remove in-kernel calls to syscalls
@ 2018-03-29 11:22 Dominik Brodowski
2018-03-29 11:22 ` [PATCH 004/109] kexec: call do_kexec_load() in compat syscall directly Dominik Brodowski
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Dominik Brodowski @ 2018-03-29 11:22 UTC
To: linux-kernel
Cc: Peter Zijlstra, Amir Goldstein, linux-mm, H . Peter Anvin,
tautschn, Ingo Molnar, linux-arch, linux-s390, Andi Kleen,
user-mode-linux-devel, x86, hmclauchlan, Christoph Hellwig,
Jiri Slaby, Darren Hart, Jaswinder Singh, arnd, Jeff Dike, viro,
Thomas Gleixner, netdev, kexec, Luis R . Rodriguez,
Eric W . Biederman, linux-fsdevel, Andrew Morton, torvalds,
David S . Miller
[ While most parts of this patch set have been sent out already at least
once, I send out *all* patches to lkml once again as this whole series
touches several different subsystems in sensitive areas. ]
System calls are interaction points between userspace and the kernel.
Therefore, system call functions such as sys_xyzzy() or compat_sys_xyzzy()
should only be called from userspace via the syscall table, but not from
elsewhere in the kernel.
At least on 64-bit x86, it will likely be a hard requirement from v4.17
onwards to not call system call functions in the kernel: It is better to
use a different calling convention for system calls there, where
struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands
processing over to the actual syscall function. This means that only those
parameters which are actually needed for a specific syscall are passed on
during syscall entry, instead of filling in six CPU registers with random
user space content all the time (which may cause serious trouble down the
call chain).[*]
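The calling convention described above can be sketched in user space as
follows. This is only an illustration under stated assumptions: struct
demo_regs stands in for struct pt_regs, and demo_do_xyzzy() and
demo_sys_xyzzy() are hypothetical names, not the actual x86 implementation.

```c
/* Hypothetical stand-in for struct pt_regs: the six argument
 * registers as they were saved at syscall entry.
 */
struct demo_regs {
	unsigned long arg[6];
};

/* The actual syscall body receives only the two parameters it needs. */
static long demo_do_xyzzy(unsigned long fd, unsigned long len)
{
	return (long)(fd + len);
}

/* The wrapper is the only function the syscall table would refer to.
 * It decodes the saved registers on the fly and never forwards the
 * four unused (and possibly random) register values.
 */
long demo_sys_xyzzy(const struct demo_regs *regs)
{
	return demo_do_xyzzy(regs->arg[0], regs->arg[1]);
}
```

With such a signature, kernel code holding a plain argument list has no
sensible way to call demo_sys_xyzzy() any more, which is why in-kernel
callers need to be converted first.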
Moreover, rules on how data may be accessed may differ between kernel data
and user data. This is another reason why calling sys_xyzzy() is
generally a bad idea, and -- at most -- acceptable in arch-specific code.
This patchset removes all in-kernel calls to syscall functions, with the
exception of those in arch/. On top of this, it cleans up the three places
where many syscalls are referenced or prototyped, namely
kernel/sys_ni.c, include/linux/syscalls.h and include/linux/compat.h.
Patches 1 to 101 have been sent out earlier, namely
- part 1 ( http://lkml.kernel.org/r/20180315190529.20943-1-linux@dominikbrodowski.net )
- part 2 ( http://lkml.kernel.org/r/20180316170614.5392-1-linux@dominikbrodowski.net )
- part 3 ( http://lkml.kernel.org/r/20180322090059.19361-1-linux@dominikbrodowski.net ).
Changes since these earlier versions are:
- I have added a lot more documentation and improved the commit messages,
namely to explain the naming convention and the rationale of these
patches.
- ACKs/Reviewed-by (thanks!) were added.
- I have shuffled the patches around to group them systematically:
First goes a patch which defines the goal and explains the rationale:
syscalls: define and explain goal to not call syscalls in the kernel
A few codepaths can trivially be converted to existing in-kernel interfaces:
kernel: use kernel_wait4() instead of sys_wait4()
kernel: open-code sys_rt_sigpending() in sys_sigpending()
kexec: call do_kexec_load() in compat syscall directly
mm: use do_futex() instead of sys_futex() in mm_release()
x86: use _do_fork() in compat_sys_x86_clone()
x86: remove compat_sys_x86_waitpid()
Then follow many patches which only affect specific subsystems each, and
replace sys_*() with internal helpers named __sys_*() or do_sys_*(). Let's
start with net/:
net: socket: add __sys_recvfrom() helper; remove in-kernel call to syscall
net: socket: add __sys_sendto() helper; remove in-kernel call to syscall
net: socket: add __sys_accept4() helper; remove in-kernel call to syscall
net: socket: add __sys_socket() helper; remove in-kernel call to syscall
net: socket: add __sys_bind() helper; remove in-kernel call to syscall
net: socket: add __sys_connect() helper; remove in-kernel call to syscall
net: socket: add __sys_listen() helper; remove in-kernel call to syscall
net: socket: add __sys_getsockname() helper; remove in-kernel call to syscall
net: socket: add __sys_getpeername() helper; remove in-kernel call to syscall
net: socket: add __sys_socketpair() helper; remove in-kernel call to syscall
net: socket: add __sys_shutdown() helper; remove in-kernel call to syscall
net: socket: add __sys_setsockopt() helper; remove in-kernel call to syscall
net: socket: add __sys_getsockopt() helper; remove in-kernel call to syscall
net: socket: add do_sys_recvmmsg() helper; remove in-kernel call to syscall
net: socket: move check for forbid_cmsg_compat to __sys_...msg()
net: socket: replace calls to sys_send() with __sys_sendto()
net: socket: replace call to sys_recv() with __sys_recvfrom()
net: socket: add __compat_sys_recvfrom() helper; remove in-kernel call to compat syscall
net: socket: add __compat_sys_setsockopt() helper; remove in-kernel call to compat syscall
net: socket: add __compat_sys_getsockopt() helper; remove in-kernel call to compat syscall
net: socket: add __compat_sys_recvmmsg() helper; remove in-kernel call to compat syscall
net: socket: add __compat_sys_...msg() helpers; remove in-kernel calls to compat syscalls
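The helper pattern used by the net/ patches above can be sketched in user
space like this. All names are hypothetical: helper_listen() plays the role
of a __sys_*() helper, demo_sys_listen() the role of the syscall function,
and demo_multiplexer() the role of an in-kernel caller; the return values
are demo-only.

```c
#include <errno.h>

#define DEMO_CALL_LISTEN 4	/* hypothetical multiplexer opcode */

/* Plays the role of a __sys_*() helper: plain arguments, and fine to
 * call from elsewhere in the kernel.
 */
int helper_listen(int fd, int backlog)
{
	if (fd < 0)
		return -EBADF;
	/* Demo-only behaviour: clamp the backlog and report the value used. */
	return backlog < 0 ? 0 : backlog;
}

/* Plays the role of the syscall function: a thin stub which is meant
 * to be reached via the syscall table only.
 */
long demo_sys_listen(int fd, int backlog)
{
	return helper_listen(fd, backlog);
}

/* Plays the role of an in-kernel caller, such as a multiplexing
 * syscall: it calls the helper, not demo_sys_listen().
 */
long demo_multiplexer(int call, int fd, int backlog)
{
	switch (call) {
	case DEMO_CALL_LISTEN:
		return helper_listen(fd, backlog);
	default:
		return -EINVAL;
	}
}
```

The syscall stub and the in-kernel caller share one implementation, so the
stub's calling convention can change later without touching the caller.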
The changes in ipc/ are limited to this specific subsystem. The wrappers are
named ksys_*() to denote that these functions are meant as a drop-in replacement
for the syscalls.
ipc: add semtimedop syscall/compat_syscall wrappers
ipc: add semget syscall wrapper
ipc: add semctl syscall/compat_syscall wrappers
ipc: add msgget syscall wrapper
ipc: add shmget syscall wrapper
ipc: add shmdt syscall wrapper
ipc: add shmctl syscall/compat_syscall wrappers
ipc: add msgctl syscall/compat_syscall wrappers
ipc: add msgrcv syscall/compat_syscall wrappers
ipc: add msgsnd syscall/compat_syscall wrappers
A few mindless conversions in kernel/ and mm/:
kernel: add do_getpgid() helper; remove internal call to sys_getpgid()
kernel: add do_compat_sigaltstack() helper; remove in-kernel call to compat syscall
kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c
sched: add do_sched_yield() helper; remove in-kernel call to sched_yield()
mm: add kernel_migrate_pages() helper, move compat syscall to mm/mempolicy.c
mm: add kernel_move_pages() helper, move compat syscall to mm/migrate.c
mm: add kernel_mbind() helper; remove in-kernel call to syscall
mm: add kernel_[sg]et_mempolicy() helpers; remove in-kernel calls to syscalls
Then, let's handle those instances internal to fs/ which call syscalls:
fs: add do_readlinkat() helper; remove internal call to sys_readlinkat()
fs: add do_pipe2() helper; remove internal call to sys_pipe2()
fs: add do_renameat2() helper; remove internal call to sys_renameat2()
fs: add do_futimesat() helper; remove internal call to sys_futimesat()
fs: add do_epoll_*() helpers; remove internal calls to sys_epoll_*()
fs: add do_signalfd4() helper; remove internal calls to sys_signalfd4()
fs: add do_eventfd() helper; remove internal call to sys_eventfd()
fs: add do_lookup_dcookie() helper; remove in-kernel call to syscall
fs: add do_vmsplice() helper; remove in-kernel call to syscall
fs: add kern_select() helper; remove in-kernel call to sys_select()
fs: add do_compat_fcntl64() helper; remove in-kernel call to compat syscall
fs: add do_compat_select() helper; remove in-kernel call to compat syscall
fs: add do_compat_signalfd4() helper; remove in-kernel call to compat syscall
fs: add do_compat_futimesat() helper; remove in-kernel call to compat syscall
inotify: add do_inotify_init() helper; remove in-kernel call to syscall
fanotify: add do_fanotify_mark() helper; remove in-kernel call to syscall
fs/quota: add kernel_quotactl() helper; remove in-kernel call to syscall
fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl()
Several fs- and some mm-related syscalls are called in initramfs, initrd and
init, devtmpfs, and pm code. While at least many of these instances should be
converted to use proper in-kernel VFS interfaces in future, convert them
mindlessly to ksys_*() helpers or wrappers for now.
fs: add ksys_mount() helper; remove in-kernel calls to sys_mount()
fs: add ksys_umount() helper; remove in-kernel call to sys_umount()
fs: add ksys_dup{,3}() helper; remove in-kernel calls to sys_dup{,3}()
fs: add ksys_chroot() helper; remove-in kernel calls to sys_chroot()
fs: add ksys_write() helper; remove in-kernel calls to sys_write()
fs: add ksys_chdir() helper; remove in-kernel calls to sys_chdir()
fs: add ksys_unlink() wrapper; remove in-kernel calls to sys_unlink()
hostfs: rename do_rmdir() to hostfs_do_rmdir()
fs: add ksys_rmdir() wrapper; remove in-kernel calls to sys_rmdir()
fs: add do_mkdirat() helper and ksys_mkdir() wrapper; remove in-kernel calls to syscall
fs: add do_symlinkat() helper and ksys_symlink() wrapper; remove in-kernel calls to syscall
fs: add do_mknodat() helper and ksys_mknod() wrapper; remove in-kernel calls to syscall
fs: add do_linkat() helper and ksys_link() wrapper; remove in-kernel calls to syscall
fs: add ksys_fchmod() and do_fchmodat() helpers and ksys_chmod() wrapper; remove in-kernel calls to syscall
fs: add do_faccessat() helper and ksys_access() wrapper; remove in-kernel calls to syscall
fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers
fs: add ksys_ftruncate() wrapper; remove in-kernel calls to sys_ftruncate()
fs: add ksys_close() wrapper; remove in-kernel calls to sys_close()
fs: add ksys_open() wrapper; remove in-kernel calls to sys_open()
fs: add ksys_getdents64() helper; remove in-kernel calls to sys_getdents64()
fs: add ksys_ioctl() helper; remove in-kernel calls to sys_ioctl()
fs: add ksys_lseek() helper; remove in-kernel calls to sys_lseek()
fs: add ksys_read() helper; remove in-kernel calls to sys_read()
fs: add ksys_sync() helper; remove in-kernel calls to sys_sync()
kernel: add ksys_unshare() helper; remove in-kernel calls to sys_unshare()
kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid()
To reach the goal to get rid of all in-kernel calls to syscalls for x86, we
need to handle a few further syscalls called from compat syscalls in x86 and
(mostly) from other architectures. Those could be made generic making use of
Al Viro's macro trickery. For v4.17, I'd suggest to keep it simple:
fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscall
fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate()
fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls
fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate()
mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64()
mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff()
mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead()
x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm()
Then, throw in two fixes for x86:
x86: fix sys_sigreturn() return type to be long, not unsigned long
x86/sigreturn: use SYSCALL_DEFINE0 (by Michael Tautschnig)
... and clean up the three places where many syscalls are referenced or
prototyped (kernel/sys_ni.c, include/linux/syscalls.h and
include/linux/compat.h):
kexec: move sys_kexec_load() prototype to syscalls.h
syscalls: sort syscall prototypes in include/linux/syscalls.h
net: remove compat_sys_*() prototypes from net/compat.h
syscalls: sort syscall prototypes in include/linux/compat.h
syscalls/x86: auto-create compat_sys_*() prototypes
kernel/sys_ni: sort cond_syscall() entries
kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitions
Last but not least, add a patch by Howard McLauchlan to whitelist all syscalls
for error injection:
bpf: whitelist all syscalls for error injection (by Howard McLauchlan)
The whole series is available at
https://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux.git syscalls-next
and I intend to push this upstream early in the v4.17-rc1 cycle.
Thanks,
Dominik
Documentation/process/adding-syscalls.rst | 34 +-
arch/alpha/kernel/osf_sys.c | 2 +-
arch/arm/kernel/sys_arm.c | 2 +-
arch/arm64/kernel/sys.c | 2 +-
arch/ia64/kernel/sys_ia64.c | 4 +-
arch/m68k/kernel/sys_m68k.c | 2 +-
arch/microblaze/kernel/sys_microblaze.c | 6 +-
arch/mips/kernel/linux32.c | 22 +-
arch/mips/kernel/syscall.c | 6 +-
arch/parisc/kernel/sys_parisc.c | 30 +-
arch/powerpc/kernel/sys_ppc32.c | 18 +-
arch/powerpc/kernel/syscalls.c | 6 +-
arch/riscv/kernel/sys_riscv.c | 4 +-
arch/s390/kernel/compat_linux.c | 37 +-
arch/s390/kernel/sys_s390.c | 2 +-
arch/sh/kernel/sys_sh.c | 4 +-
arch/sh/kernel/sys_sh32.c | 12 +-
arch/sparc/kernel/setup_32.c | 2 +-
arch/sparc/kernel/sys_sparc32.c | 26 +-
arch/sparc/kernel/sys_sparc_32.c | 6 +-
arch/sparc/kernel/sys_sparc_64.c | 2 +-
arch/um/kernel/syscall.c | 2 +-
arch/x86/entry/syscalls/syscall_32.tbl | 4 +-
arch/x86/ia32/ia32_signal.c | 1 -
arch/x86/ia32/sys_ia32.c | 50 +-
arch/x86/include/asm/sys_ia32.h | 67 --
arch/x86/include/asm/syscalls.h | 3 +-
arch/x86/kernel/ioport.c | 7 +-
arch/x86/kernel/signal.c | 5 +-
arch/x86/kernel/sys_x86_64.c | 2 +-
arch/xtensa/kernel/syscall.c | 2 +-
drivers/base/devtmpfs.c | 11 +-
drivers/tty/sysrq.c | 2 +-
drivers/tty/vt/vt_ioctl.c | 6 +-
fs/autofs4/dev-ioctl.c | 2 +-
fs/binfmt_misc.c | 2 +-
fs/dcookies.c | 11 +-
fs/eventfd.c | 9 +-
fs/eventpoll.c | 23 +-
fs/fcntl.c | 12 +-
fs/file.c | 17 +-
fs/hostfs/hostfs.h | 2 +-
fs/hostfs/hostfs_kern.c | 2 +-
fs/hostfs/hostfs_user.c | 2 +-
fs/internal.h | 14 +
fs/ioctl.c | 7 +-
fs/namei.c | 61 +-
fs/namespace.c | 19 +-
fs/notify/fanotify/fanotify_user.c | 14 +-
fs/notify/inotify/inotify_user.c | 9 +-
fs/open.c | 77 +-
fs/pipe.c | 9 +-
fs/quota/compat.c | 13 +-
fs/quota/quota.c | 10 +-
fs/read_write.c | 45 +-
fs/readdir.c | 11 +-
fs/select.c | 29 +-
fs/signalfd.c | 31 +-
fs/splice.c | 12 +-
fs/stat.c | 12 +-
fs/sync.c | 19 +-
fs/utimes.c | 25 +-
include/linux/compat.h | 644 ++++++------
include/linux/futex.h | 13 +-
include/linux/kexec.h | 4 -
include/linux/quotaops.h | 3 +
include/linux/socket.h | 37 +-
include/linux/syscalls.h | 1511 +++++++++++++++++------------
include/net/compat.h | 11 -
init/do_mounts.c | 26 +-
init/do_mounts.h | 4 +-
init/do_mounts_initrd.c | 42 +-
init/do_mounts_md.c | 29 +-
init/do_mounts_rd.c | 40 +-
init/initramfs.c | 52 +-
init/main.c | 9 +-
init/noinitramfs.c | 6 +-
ipc/msg.c | 60 +-
ipc/sem.c | 44 +-
ipc/shm.c | 28 +-
ipc/syscall.c | 58 +-
ipc/util.h | 31 +
kernel/compat.c | 55 --
kernel/exit.c | 2 +-
kernel/fork.c | 11 +-
kernel/kexec.c | 52 +-
kernel/pid_namespace.c | 6 +-
kernel/power/hibernate.c | 2 +-
kernel/power/suspend.c | 2 +-
kernel/power/user.c | 2 +-
kernel/sched/core.c | 8 +-
kernel/signal.c | 29 +-
kernel/sys.c | 74 +-
kernel/sys_ni.c | 617 +++++++-----
kernel/uid16.c | 25 +-
kernel/uid16.h | 14 +
kernel/umh.c | 4 +-
mm/fadvise.c | 10 +-
mm/mempolicy.c | 92 +-
mm/migrate.c | 39 +-
mm/mmap.c | 17 +-
mm/nommu.c | 17 +-
mm/readahead.c | 7 +-
net/compat.c | 136 ++-
net/socket.c | 234 +++--
105 files changed, 3129 insertions(+), 1868 deletions(-)
delete mode 100644 arch/x86/include/asm/sys_ia32.h
create mode 100644 kernel/uid16.h
[*] An early, not-yet-ready version and partly untested (i386, x32) of the
patches required to implement this on top of this series is available at
https://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux.git syscalls-WIP
--
2.16.3
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* [PATCH 004/109] kexec: call do_kexec_load() in compat syscall directly
2018-03-29 11:22 [PATCH 000/109] remove in-kernel calls to syscalls Dominik Brodowski
@ 2018-03-29 11:22 ` Dominik Brodowski
2018-03-29 11:24 ` [PATCH 102/109] kexec: move sys_kexec_load() prototype to syscalls.h Dominik Brodowski
2018-03-29 14:20 ` [PATCH 000/109] remove in-kernel calls to syscalls Matthew Wilcox
2 siblings, 0 replies; 7+ messages in thread
From: Dominik Brodowski @ 2018-03-29 11:22 UTC
To: linux-kernel; +Cc: linux-arch, arnd, kexec, viro, torvalds, Eric Biederman
do_kexec_load() can be called directly by compat_sys_kexec_load() as long
as the same parameter checks are completed which are currently handled
(also) by sys_kexec_load(). Therefore, move those to kexec_load_check(),
call that newly introduced helper function from both sys_kexec_load() and
compat_sys_kexec_load(), and duplicate the remaining code from
sys_kexec_load() in compat_sys_kexec_load().
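As a condensed user-space sketch of this split (not the kernel code itself):
the DEMO_* constants and demo_*() functions are hypothetical stand-ins, the
privilege check and the mutex handling are omitted, and the actual load is
reduced to returning 0.

```c
#include <errno.h>

/* Hypothetical demo values standing in for the KEXEC_* constants. */
#define DEMO_ARCH_MASK		0xffff0000UL
#define DEMO_ARCH_DEFAULT	0x00000000UL
#define DEMO_ARCH_NATIVE	0x003e0000UL
#define DEMO_FLAGS		0x00000003UL	/* permitted non-arch flag bits */
#define DEMO_SEGMENT_MAX	16UL

/* Checks shared by both entry points (privilege check omitted here). */
int demo_load_check(unsigned long nr_segments, unsigned long flags)
{
	/* Reject unknown flag bits outside of the architecture field. */
	if ((flags & DEMO_FLAGS) != (flags & ~DEMO_ARCH_MASK))
		return -EINVAL;

	/* Artificial cap on the number of segments. */
	if (nr_segments > DEMO_SEGMENT_MAX)
		return -EINVAL;

	return 0;
}

/* Native entry point: accepts the native or the default architecture. */
long demo_native_load(unsigned long nr_segments, unsigned long flags)
{
	int result = demo_load_check(nr_segments, flags);

	if (result)
		return result;

	if ((flags & DEMO_ARCH_MASK) != DEMO_ARCH_NATIVE &&
	    (flags & DEMO_ARCH_MASK) != DEMO_ARCH_DEFAULT)
		return -EINVAL;

	return 0;	/* the real code takes the mutex and loads here */
}

/* Compat entry point: same shared checks, but the default
 * architecture is refused.
 */
long demo_compat_load(unsigned long nr_segments, unsigned long flags)
{
	int result = demo_load_check(nr_segments, flags);

	if (result)
		return result;

	if ((flags & DEMO_ARCH_MASK) == DEMO_ARCH_DEFAULT)
		return -EINVAL;

	return 0;	/* the real code converts the segments, then loads */
}
```

Both entry points run the common checks first and differ only in their
architecture test, so neither needs to call the other.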
This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
kernel/kexec.c | 52 +++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 39 insertions(+), 13 deletions(-)
diff --git a/kernel/kexec.c b/kernel/kexec.c
index e62ec4dc6620..aed8fb2564b3 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -192,11 +192,9 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments,
* that to happen you need to do that yourself.
*/
-SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
- struct kexec_segment __user *, segments, unsigned long, flags)
+static inline int kexec_load_check(unsigned long nr_segments,
+ unsigned long flags)
{
- int result;
-
/* We only trust the superuser with rebooting the system. */
if (!capable(CAP_SYS_BOOT) || kexec_load_disabled)
return -EPERM;
@@ -208,17 +206,29 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
if ((flags & KEXEC_FLAGS) != (flags & ~KEXEC_ARCH_MASK))
return -EINVAL;
- /* Verify we are on the appropriate architecture */
- if (((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH) &&
- ((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH_DEFAULT))
- return -EINVAL;
-
/* Put an artificial cap on the number
* of segments passed to kexec_load.
*/
if (nr_segments > KEXEC_SEGMENT_MAX)
return -EINVAL;
+ return 0;
+}
+
+SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
+ struct kexec_segment __user *, segments, unsigned long, flags)
+{
+ int result;
+
+ result = kexec_load_check(nr_segments, flags);
+ if (result)
+ return result;
+
+ /* Verify we are on the appropriate architecture */
+ if (((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH) &&
+ ((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH_DEFAULT))
+ return -EINVAL;
+
/* Because we write directly to the reserved memory
* region when loading crash kernels we need a mutex here to
* prevent multiple crash kernels from attempting to load
@@ -247,15 +257,16 @@ COMPAT_SYSCALL_DEFINE4(kexec_load, compat_ulong_t, entry,
struct kexec_segment out, __user *ksegments;
unsigned long i, result;
+ result = kexec_load_check(nr_segments, flags);
+ if (result)
+ return result;
+
/* Don't allow clients that don't understand the native
* architecture to do anything.
*/
if ((flags & KEXEC_ARCH_MASK) == KEXEC_ARCH_DEFAULT)
return -EINVAL;
- if (nr_segments > KEXEC_SEGMENT_MAX)
- return -EINVAL;
-
ksegments = compat_alloc_user_space(nr_segments * sizeof(out));
for (i = 0; i < nr_segments; i++) {
result = copy_from_user(&in, &segments[i], sizeof(in));
@@ -272,6 +283,21 @@ COMPAT_SYSCALL_DEFINE4(kexec_load, compat_ulong_t, entry,
return -EFAULT;
}
- return sys_kexec_load(entry, nr_segments, ksegments, flags);
+ /* Because we write directly to the reserved memory
+ * region when loading crash kernels we need a mutex here to
+ * prevent multiple crash kernels from attempting to load
+ * simultaneously, and to prevent a crash kernel from loading
+ * over the top of a in use crash kernel.
+ *
+ * KISS: always take the mutex.
+ */
+ if (!mutex_trylock(&kexec_mutex))
+ return -EBUSY;
+
+ result = do_kexec_load(entry, nr_segments, ksegments, flags);
+
+ mutex_unlock(&kexec_mutex);
+
+ return result;
}
#endif
--
2.16.3
* [PATCH 102/109] kexec: move sys_kexec_load() prototype to syscalls.h
2018-03-29 11:22 [PATCH 000/109] remove in-kernel calls to syscalls Dominik Brodowski
2018-03-29 11:22 ` [PATCH 004/109] kexec: call do_kexec_load() in compat syscall directly Dominik Brodowski
@ 2018-03-29 11:24 ` Dominik Brodowski
2018-03-29 14:20 ` [PATCH 000/109] remove in-kernel calls to syscalls Matthew Wilcox
2 siblings, 0 replies; 7+ messages in thread
From: Dominik Brodowski @ 2018-03-29 11:24 UTC
To: linux-kernel; +Cc: linux-arch, arnd, kexec, viro, torvalds, Eric Biederman
As the syscall function should only be called from the system call table
but not from elsewhere in the kernel, move the prototype for
sys_kexec_load() to include/linux/syscalls.h.
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
include/linux/kexec.h | 4 ----
include/linux/syscalls.h | 4 ++++
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index f16f6ceb3875..0ebcbeb21056 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -223,10 +223,6 @@ struct kimage {
extern void machine_kexec(struct kimage *image);
extern int machine_kexec_prepare(struct kimage *image);
extern void machine_kexec_cleanup(struct kimage *image);
-extern asmlinkage long sys_kexec_load(unsigned long entry,
- unsigned long nr_segments,
- struct kexec_segment __user *segments,
- unsigned long flags);
extern int kernel_kexec(void);
extern struct page *kimage_alloc_control_pages(struct kimage *image,
unsigned int order);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 815fbdd9cca1..8330f046541e 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -936,6 +936,10 @@ asmlinkage long sys_pkey_mprotect(unsigned long start, size_t len,
unsigned long prot, int pkey);
asmlinkage long sys_pkey_alloc(unsigned long flags, unsigned long init_val);
asmlinkage long sys_pkey_free(int pkey);
+asmlinkage long sys_kexec_load(unsigned long entry,
+ unsigned long nr_segments,
+ struct kexec_segment __user *segments,
+ unsigned long flags);
asmlinkage long sys_statx(int dfd, const char __user *path, unsigned flags,
unsigned mask, struct statx __user *buffer);
--
2.16.3
* Re: [PATCH 000/109] remove in-kernel calls to syscalls
2018-03-29 11:22 [PATCH 000/109] remove in-kernel calls to syscalls Dominik Brodowski
2018-03-29 11:22 ` [PATCH 004/109] kexec: call do_kexec_load() in compat syscall directly Dominik Brodowski
2018-03-29 11:24 ` [PATCH 102/109] kexec: move sys_kexec_load() prototype to syscalls.h Dominik Brodowski
@ 2018-03-29 14:20 ` Matthew Wilcox
2018-03-29 14:42 ` Dominik Brodowski
2 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2018-03-29 14:20 UTC
To: Dominik Brodowski
Cc: Peter Zijlstra, Amir Goldstein, linux-mm, H . Peter Anvin,
tautschn, Ingo Molnar, linux-arch, linux-s390, Andi Kleen,
user-mode-linux-devel, x86, hmclauchlan, Christoph Hellwig,
Jiri Slaby, Darren Hart, Jaswinder Singh, arnd, Jeff Dike, viro,
Thomas Gleixner, netdev, kexec, linux-kernel, Luis R . Rodriguez,
Eric W . Biederman, linux-fsdevel, Andrew Morton, torvalds,
David S . Miller
On Thu, Mar 29, 2018 at 01:22:37PM +0200, Dominik Brodowski wrote:
> At least on 64-bit x86, it will likely be a hard requirement from v4.17
> onwards to not call system call functions in the kernel: It is better to
> use a different calling convention for system calls there, where
> struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands
> processing over to the actual syscall function. This means that only those
> parameters which are actually needed for a specific syscall are passed on
> during syscall entry, instead of filling in six CPU registers with random
> user space content all the time (which may cause serious trouble down the
> call chain).[*]
How do we stop new ones from springing up? Some kind of linker trick
like was used to, er, "dissuade" people from using gets()?
* Re: [PATCH 000/109] remove in-kernel calls to syscalls
2018-03-29 14:20 ` [PATCH 000/109] remove in-kernel calls to syscalls Matthew Wilcox
@ 2018-03-29 14:42 ` Dominik Brodowski
2018-03-29 14:46 ` David Laight
0 siblings, 1 reply; 7+ messages in thread
From: Dominik Brodowski @ 2018-03-29 14:42 UTC
To: Matthew Wilcox
Cc: Peter Zijlstra, Amir Goldstein, linux-mm, H . Peter Anvin,
tautschn, Ingo Molnar, linux-arch, linux-s390, Andi Kleen,
user-mode-linux-devel, x86, hmclauchlan, Christoph Hellwig,
Jiri Slaby, Darren Hart, Jaswinder Singh, arnd, Jeff Dike, viro,
Thomas Gleixner, netdev, kexec, linux-kernel, Luis R . Rodriguez,
Eric W . Biederman, linux-fsdevel, Andrew Morton, torvalds,
David S . Miller
On Thu, Mar 29, 2018 at 07:20:27AM -0700, Matthew Wilcox wrote:
> On Thu, Mar 29, 2018 at 01:22:37PM +0200, Dominik Brodowski wrote:
> > At least on 64-bit x86, it will likely be a hard requirement from v4.17
> > onwards to not call system call functions in the kernel: It is better to
> > use a different calling convention for system calls there, where
> > struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands
> > processing over to the actual syscall function. This means that only those
> > parameters which are actually needed for a specific syscall are passed on
> > during syscall entry, instead of filling in six CPU registers with random
> > user space content all the time (which may cause serious trouble down the
> > call chain).[*]
>
> How do we stop new ones from springing up? Some kind of linker trick
> like was used to, er, "dissuade" people from using gets()?
Once the patches which modify the syscall calling convention are merged,
it won't compile on 64-bit x86, but bark loudly. That should frighten anyone.
Meow.
Thanks,
Dominik
* RE: [PATCH 000/109] remove in-kernel calls to syscalls
2018-03-29 14:42 ` Dominik Brodowski
@ 2018-03-29 14:46 ` David Laight
2018-03-29 14:55 ` Dominik Brodowski
0 siblings, 1 reply; 7+ messages in thread
From: David Laight @ 2018-03-29 14:46 UTC
To: 'Dominik Brodowski', Matthew Wilcox
Cc: Peter Zijlstra, Amir Goldstein, linux-mm@kvack.org,
H . Peter Anvin, tautschn@amazon.co.uk, Ingo Molnar,
linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
Andi Kleen, user-mode-linux-devel@lists.sourceforge.net,
x86@kernel.org, hmclauchlan@fb.com, Christoph Hellwig, Jiri Slaby,
Darren Hart, Jaswinder Singh, arnd@arndb.de, Jeff Dike,
viro@ZenIV.linux.org.uk, Thomas Gleixner, netdev@vger.kernel.org,
kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
Luis R . Rodriguez, Eric W . Biederman,
linux-fsdevel@vger.kernel.org, Andrew Morton,
torvalds@linux-foundation.org, David S . Miller
From: Dominik Brodowski
> Sent: 29 March 2018 15:42
> On Thu, Mar 29, 2018 at 07:20:27AM -0700, Matthew Wilcox wrote:
> > On Thu, Mar 29, 2018 at 01:22:37PM +0200, Dominik Brodowski wrote:
> > > At least on 64-bit x86, it will likely be a hard requirement from v4.17
> > > onwards to not call system call functions in the kernel: It is better to
> > > use a different calling convention for system calls there, where
> > > struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands
> > > processing over to the actual syscall function. This means that only those
> > > parameters which are actually needed for a specific syscall are passed on
> > > during syscall entry, instead of filling in six CPU registers with random
> > > user space content all the time (which may cause serious trouble down the
> > > call chain).[*]
> >
> > How do we stop new ones from springing up? Some kind of linker trick
> > like was used to, er, "dissuade" people from using gets()?
>
> Once the patches which modify the syscall calling convention are merged,
> it won't compile on 64-bit x86, but bark loudly. That should frighten anyone.
> Meow.
Should be pretty easy to ensure the prototypes aren't in any normal header.
Renaming the global symbols (to not match the function name) will make it
much harder to call them as well.
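A rough user-space sketch of the renaming idea, assuming a hypothetical
macro DEMO_SYSCALL_DEFINE1() and a hypothetical demo_tbl_sys_ prefix (the
actual naming depends on the final design of the patchset):

```c
/* Hypothetical macro: emits the syscall body under a mangled symbol
 * name, so that no function called sys_xyzzy() exists at all.
 */
#define DEMO_SYSCALL_DEFINE1(name, type, arg) \
	long demo_tbl_sys_##name(type arg)

DEMO_SYSCALL_DEFINE1(xyzzy, int, value)
{
	return value * 2;
}

/* Only the syscall table refers to the mangled name. */
typedef long (*demo_syscall_fn)(int);

const demo_syscall_fn demo_syscall_table[] = {
	demo_tbl_sys_xyzzy,
};
```

Any leftover call to sys_xyzzy() would then fail to build, as neither a
prototype nor a symbol of that name is available.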
David
* Re: [PATCH 000/109] remove in-kernel calls to syscalls
2018-03-29 14:46 ` David Laight
@ 2018-03-29 14:55 ` Dominik Brodowski
0 siblings, 0 replies; 7+ messages in thread
From: Dominik Brodowski @ 2018-03-29 14:55 UTC
To: David Laight
Cc: Peter Zijlstra, Amir Goldstein, linux-mm@kvack.org,
H . Peter Anvin, tautschn@amazon.co.uk, Ingo Molnar,
linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
Andi Kleen, user-mode-linux-devel@lists.sourceforge.net,
x86@kernel.org, Matthew Wilcox, hmclauchlan@fb.com,
Christoph Hellwig, Jiri Slaby, Darren Hart, Jaswinder Singh,
arnd@arndb.de, Jeff Dike, viro@ZenIV.linux.org.uk,
Thomas Gleixner, netdev@vger.kernel.org,
kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
Luis R . Rodriguez, Eric W . Biederman,
linux-fsdevel@vger.kernel.org, Andrew Morton,
torvalds@linux-foundation.org, David S . Miller
On Thu, Mar 29, 2018 at 02:46:44PM +0000, David Laight wrote:
> From: Dominik Brodowski
> > Sent: 29 March 2018 15:42
> > On Thu, Mar 29, 2018 at 07:20:27AM -0700, Matthew Wilcox wrote:
> > > On Thu, Mar 29, 2018 at 01:22:37PM +0200, Dominik Brodowski wrote:
> > > > At least on 64-bit x86, it will likely be a hard requirement from v4.17
> > > > onwards to not call system call functions in the kernel: It is better to
> > > > use a different calling convention for system calls there, where
> > > > struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands
> > > > processing over to the actual syscall function. This means that only those
> > > > parameters which are actually needed for a specific syscall are passed on
> > > > during syscall entry, instead of filling in six CPU registers with random
> > > > user space content all the time (which may cause serious trouble down the
> > > > call chain).[*]
> > >
> > > How do we stop new ones from springing up? Some kind of linker trick
> > > like was used to, er, "dissuade" people from using gets()?
> >
> > Once the patches which modify the syscall calling convention are merged,
> > it won't compile on 64-bit x86, but bark loudly. That should frighten anyone.
> > Meow.
>
> Should be pretty easy to ensure the prototypes aren't in any normal header.
That's exactly why the compile will fail.
> Renaming the global symbols (to not match the function name) will make it
> much harder to call them as well.
That still depends on the exact design of the patchset, which is still under
review.
Thanks,
Dominik