* [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
From: Arnd Bergmann @ 2006-08-20 17:13 UTC
To: Björn Steinbrink
Cc: Russell King, Andrew Morton, rusty, linux-kernel, linux-arch

On Sunday 20 August 2006 15:47, Björn Steinbrink wrote:
> How is execve() supposed to use the local errno? The kernel syscall
> macro only "creates" a function, so you still need a global errno for
> that code, don't you?

Right, I got confused by the macro referencing it. As an alternative, you
can have a static errno variable in the source file that defines
kernel_execve.

> And I (because I'm clueless ;) wonder about the calls to set_fs(), why
> do we need them? The current code does not seem to do them. Or is there
> something special about kernel_execve that I'm missing? cscope and grep
> didn't tell anything and Google had only a few useless results for
> kernel_execve.

You need to do set_fs if you want to pass a kernel pointer to a function
expecting a __user pointer (like 'char __user *argv[]'). I guess every
place in the kernel where we do call execve is actually running in a
set_fs(KERNEL_DS) environment already, and anything else would not make
much sense. Maybe adding a small check in there to make sure we're really
running in kernel space is better, then.

---

It turned out that most of the architectures that already implement their
own execve() call, instead of using the _syscall3 function for it, end up
passing the return value of sys_execve down instead of setting errno.

The patch below converts those functions to a new kernel_execve variant
and provides a lib/execve.c file with an alternative implementation for
the architectures that are using the traditional __KERNEL_SYSCALLS__
mechanism for it.
It also removes the kernel syscalls implementation on the architectures
that no longer need it.

The architectures that this patch doesn't touch should ideally introduce
their own kernel_execve() function to get rid of __KERNEL_SYSCALLS__ as
well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
Tested-on: i386
Compiled-on: i386, powerpc

 arch/alpha/Kconfig              |    3 +
 arch/alpha/kernel/entry.S       |   10 ++--
 arch/arm/Kconfig                |    3 +
 arch/arm/kernel/sys_arm.c       |    4 -
 arch/arm26/Kconfig              |    3 +
 arch/arm26/kernel/sys_arm.c     |    2
 arch/ia64/Kconfig               |    3 +
 arch/ia64/kernel/entry.S        |    4 -
 arch/parisc/Kconfig             |    3 +
 arch/parisc/kernel/process.c    |    9 +++-
 arch/powerpc/Kconfig            |    3 +
 arch/powerpc/kernel/misc_32.S   |    2
 arch/powerpc/kernel/misc_64.S   |    2
 arch/sparc64/kernel/power.c     |    5 --
 arch/um/Kconfig                 |    3 +
 arch/um/kernel/syscall.c        |   13 ++++++
 arch/x86_64/Kconfig             |    3 +
 arch/x86_64/kernel/entry.S      |    4 -
 drivers/sbus/char/bbc_envctrl.c |    5 --
 drivers/sbus/char/envctrl.c     |    5 --
 include/asm-alpha/unistd.h      |   69 ----------------------------------
 include/asm-arm/unistd.h        |   24 -----------
 include/asm-arm26/unistd.h      |   24 -----------
 include/asm-ia64/unistd.h       |   72 -----------------------------------
 include/asm-parisc/unistd.h     |   79 ---------------------------------------
 include/asm-powerpc/unistd.h    |    7 ---
 include/asm-um/unistd.h         |   27 -------------
 include/asm-x86_64/unistd.h     |   81 ----------------------------------------
 include/linux/syscalls.h        |    2
 init/do_mounts_initrd.c         |    3 -
 init/main.c                     |    4 -
 kernel/kmod.c                   |    5 --
 lib/Makefile                    |    4 +
 lib/execve.c                    |   19 +++++++++
 34 files changed, 92 insertions(+), 417 deletions(-)

Index: linux-cg/init/do_mounts_initrd.c
===================================================================
--- linux-cg.orig/init/do_mounts_initrd.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/init/do_mounts_initrd.c	2006-08-20 19:06:00.000000000 +0200
@@ -1,4 +1,3 @@
-#define __KERNEL_SYSCALLS__
 #include <linux/unistd.h>
 #include <linux/kernel.h>
 #include <linux/fs.h>
@@ -35,7 +34,7 @@
 	(void) sys_open("/dev/console",O_RDWR,0);
 	(void) sys_dup(0);
 	(void) sys_dup(0);
-	return execve(shell, argv, envp_init);
+	return kernel_execve(shell, argv, envp_init);
 }
 
 static void __init handle_initrd(void)
Index: linux-cg/kernel/kmod.c
===================================================================
--- linux-cg.orig/kernel/kmod.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/kernel/kmod.c	2006-08-20 19:06:00.000000000 +0200
@@ -18,8 +18,6 @@
 	call_usermodehelper wait flag, and remove exec_usermodehelper.
 	Rusty Russell <rusty@rustcorp.com.au> Jan 2003
 */
-#define __KERNEL_SYSCALLS__
-
 #include <linux/module.h>
 #include <linux/sched.h>
 #include <linux/syscalls.h>
@@ -150,7 +148,8 @@
 
 	retval = -EPERM;
 	if (current->fs->root)
-		retval = execve(sub_info->path, sub_info->argv,sub_info->envp);
+		retval = kernel_execve(sub_info->path,
+				sub_info->argv, sub_info->envp);
 
 	/* Exec failed? */
 	sub_info->retval = retval;
Index: linux-cg/lib/Makefile
===================================================================
--- linux-cg.orig/lib/Makefile	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/lib/Makefile	2006-08-20 19:06:00.000000000 +0200
@@ -33,6 +33,10 @@
   lib-y += dec_and_lock.o
 endif
 
+ifneq ($(CONFIG_HAVE_KERNEL_EXECVE),y)
+  lib-y += execve.o
+endif
+
 obj-$(CONFIG_CRC_CCITT) += crc-ccitt.o
 obj-$(CONFIG_CRC16) += crc16.o
 obj-$(CONFIG_CRC32) += crc32.o
Index: linux-cg/lib/execve.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-cg/lib/execve.c	2006-08-20 19:06:00.000000000 +0200
@@ -0,0 +1,19 @@
+#include <asm/bug.h>
+#include <asm/uaccess.h>
+
+#define __KERNEL_SYSCALLS__
+static int errno;
+#include <asm/unistd.h>
+
+int kernel_execve(const char *filename, char *const argv[], char *const envp[])
+{
+	mm_segment_t fs = get_fs();
+	int ret;
+
+	WARN_ON(segment_eq(fs, USER_DS));
+	ret = execve(filename, (char **)argv, (char **)envp);
+	if (ret)
+		ret = errno;
+
+	return ret;
+}
Index: linux-cg/init/main.c
===================================================================
--- linux-cg.orig/init/main.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/init/main.c	2006-08-20 19:06:00.000000000 +0200
@@ -9,8 +9,6 @@
  * Simplified starting of init: Michael A. Griffith <grif@acm.org>
  */
 
-#define __KERNEL_SYSCALLS__
-
 #include <linux/types.h>
 #include <linux/module.h>
 #include <linux/proc_fs.h>
@@ -679,7 +677,7 @@
 static void run_init_process(char *init_filename)
 {
 	argv_init[0] = init_filename;
-	execve(init_filename, argv_init, envp_init);
+	kernel_execve(init_filename, argv_init, envp_init);
 }
 
 static int init(void * unused)
Index: linux-cg/arch/sparc64/kernel/power.c
===================================================================
--- linux-cg.orig/arch/sparc64/kernel/power.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/sparc64/kernel/power.c	2006-08-20 19:06:00.000000000 +0200
@@ -4,8 +4,6 @@
  * Copyright (C) 1999 David S. Miller (davem@redhat.com)
  */
 
-#define __KERNEL_SYSCALLS__
-
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/init.h>
@@ -14,6 +12,7 @@
 #include <linux/delay.h>
 #include <linux/interrupt.h>
 #include <linux/pm.h>
+#include <linux/syscalls.h>
 
 #include <asm/system.h>
 #include <asm/auxio.h>
@@ -98,7 +97,7 @@
 
 		/* Ok, down we go... */
 		button_pressed = 0;
-		if (execve("/sbin/shutdown", argv, envp) < 0) {
+		if (kernel_execve("/sbin/shutdown", argv, envp) < 0) {
 			printk("powerd: shutdown execution failed\n");
 			add_wait_queue(&powerd_wait, &wait);
 			goto again;
Index: linux-cg/arch/x86_64/kernel/entry.S
===================================================================
--- linux-cg.orig/arch/x86_64/kernel/entry.S	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/x86_64/kernel/entry.S	2006-08-20 19:06:00.000000000 +0200
@@ -1000,7 +1000,7 @@
  * do_sys_execve asm fallback arguments:
  *	rdi: name, rsi: argv, rdx: envp, fake frame on the stack
  */
-ENTRY(execve)
+ENTRY(kernel_execve)
 	CFI_STARTPROC
 	FAKE_STACK_FRAME $0
 	SAVE_ALL
@@ -1013,7 +1013,7 @@
 	UNFAKE_STACK_FRAME
 	ret
 	CFI_ENDPROC
-ENDPROC(execve)
+ENDPROC(kernel_execve)
 
 KPROBE_ENTRY(page_fault)
 	errorentry do_page_fault
Index: linux-cg/drivers/sbus/char/bbc_envctrl.c
===================================================================
--- linux-cg.orig/drivers/sbus/char/bbc_envctrl.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/drivers/sbus/char/bbc_envctrl.c	2006-08-20 19:06:00.000000000 +0200
@@ -4,9 +4,6 @@
  * Copyright (C) 2001 David S. Miller (davem@redhat.com)
  */
 
-#define __KERNEL_SYSCALLS__
-static int errno;
-
 #include <linux/kernel.h>
 #include <linux/kthread.h>
 #include <linux/sched.h>
@@ -200,7 +197,7 @@
 	printk(KERN_CRIT "kenvctrld: Shutting down the system now.\n");
 
 	shutting_down = 1;
-	if (execve("/sbin/shutdown", argv, envp) < 0)
+	if (kernel_execve("/sbin/shutdown", argv, envp) < 0)
 		printk(KERN_CRIT "envctrl: shutdown execution failed\n");
 }
 
Index: linux-cg/drivers/sbus/char/envctrl.c
===================================================================
--- linux-cg.orig/drivers/sbus/char/envctrl.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/drivers/sbus/char/envctrl.c	2006-08-20 19:06:00.000000000 +0200
@@ -19,9 +19,6 @@
  *              Daniele Bellucci <bellucda@tiscali.it>
  */
 
-#define __KERNEL_SYSCALLS__
-static int errno;
-
 #include <linux/module.h>
 #include <linux/sched.h>
 #include <linux/kthread.h>
@@ -982,7 +979,7 @@
 
 	inprog = 1;
 	printk(KERN_CRIT "kenvctrld: WARNING: Shutting down the system now.\n");
-	if (0 > execve("/sbin/shutdown", argv, envp)) {
+	if (0 > kernel_execve("/sbin/shutdown", argv, envp)) {
 		printk(KERN_CRIT "kenvctrld: WARNING: system shutdown failed!\n");
 		inprog = 0; /* unlikely to succeed, but we could try again */
 	}
Index: linux-cg/arch/x86_64/Kconfig
===================================================================
--- linux-cg.orig/arch/x86_64/Kconfig	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/x86_64/Kconfig	2006-08-20 19:07:12.000000000 +0200
@@ -61,6 +61,9 @@
 	bool
 	default y
 
+config HAVE_KERNEL_EXECVE
+	def_bool y
+
 config X86_CMPXCHG
 	bool
 	default y
Index: linux-cg/include/asm-x86_64/unistd.h
===================================================================
--- linux-cg.orig/include/asm-x86_64/unistd.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/asm-x86_64/unistd.h	2006-08-20 19:06:00.000000000 +0200
@@ -661,8 +661,6 @@
 #define __ARCH_WANT_SYS_TIME
 #define __ARCH_WANT_COMPAT_SYS_TIME
 
-#ifndef __KERNEL_SYSCALLS__
-
 #define __syscall "syscall"
 
 #define _syscall0(type,name) \
@@ -744,85 +742,6 @@
 __syscall_return(type,__res); \
 }
 
-#else /* __KERNEL_SYSCALLS__ */
-
-#include <linux/syscalls.h>
-#include <asm/ptrace.h>
-
-/*
- * we need this inline - forking from kernel space will result
- * in NO COPY ON WRITE (!!!), until an execve is executed. This
- * is no problem, but for the stack. This is handled by not letting
- * main() use the stack at all after fork(). Thus, no function
- * calls - which means inline code for fork too, as otherwise we
- * would use the stack upon exit from 'fork()'.
- *
- * Actually only pause and fork are needed inline, so that there
- * won't be any messing with the stack from main(), but we define
- * some others too.
- */
-#define __NR__exit __NR_exit
-
-static inline pid_t setsid(void)
-{
-	return sys_setsid();
-}
-
-static inline ssize_t write(unsigned int fd, char * buf, size_t count)
-{
-	return sys_write(fd, buf, count);
-}
-
-static inline ssize_t read(unsigned int fd, char * buf, size_t count)
-{
-	return sys_read(fd, buf, count);
-}
-
-static inline off_t lseek(unsigned int fd, off_t offset, unsigned int origin)
-{
-	return sys_lseek(fd, offset, origin);
-}
-
-static inline long dup(unsigned int fd)
-{
-	return sys_dup(fd);
-}
-
-/* implemented in asm in arch/x86_64/kernel/entry.S */
-extern int execve(const char *, char * const *, char * const *);
-
-static inline long open(const char * filename, int flags, int mode)
-{
-	return sys_open(filename, flags, mode);
-}
-
-static inline long close(unsigned int fd)
-{
-	return sys_close(fd);
-}
-
-static inline pid_t waitpid(int pid, int * wait_stat, int flags)
-{
-	return sys_wait4(pid, wait_stat, flags, NULL);
-}
-
-extern long sys_mmap(unsigned long addr, unsigned long len,
-			unsigned long prot, unsigned long flags,
-			unsigned long fd, unsigned long off);
-
-extern int sys_modify_ldt(int func, void *ptr, unsigned long bytecount);
-
-asmlinkage long sys_execve(char *name, char **argv, char **envp,
-			struct pt_regs regs);
-asmlinkage long sys_clone(unsigned long clone_flags, unsigned long newsp,
-			void *parent_tid, void *child_tid,
-			struct pt_regs regs);
-asmlinkage long sys_fork(struct pt_regs regs);
-asmlinkage long sys_vfork(struct pt_regs regs);
-asmlinkage long sys_pipe(int *fildes);
-
-#endif /* __KERNEL_SYSCALLS__ */
-
 #ifndef __ASSEMBLY__
 
 #include <linux/linkage.h>
Index: linux-cg/arch/alpha/kernel/entry.S
===================================================================
--- linux-cg.orig/arch/alpha/kernel/entry.S	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/alpha/kernel/entry.S	2006-08-20 19:06:00.000000000 +0200
@@ -655,12 +655,12 @@
 .end kernel_thread
 
 /*
- * execve(path, argv, envp)
+ * kernel_execve(path, argv, envp)
  */
 	.align 4
-	.globl execve
-	.ent execve
-execve:
+	.globl kernel_execve
+	.ent kernel_execve
+kernel_execve:
 	/* We can be called from a module. */
 	ldgp $gp, 0($27)
 	lda $sp, -(32+SIZEOF_PT_REGS+8)($sp)
@@ -704,7 +704,7 @@
 
 1:	lda $sp, 32+SIZEOF_PT_REGS+8($sp)
 	ret
-.end execve
+.end kernel_execve
 
 \f
 /*
Index: linux-cg/arch/arm/kernel/sys_arm.c
===================================================================
--- linux-cg.orig/arch/arm/kernel/sys_arm.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/arm/kernel/sys_arm.c	2006-08-20 19:06:00.000000000 +0200
@@ -279,7 +279,7 @@
 	return error;
 }
 
-long execve(const char *filename, char **argv, char **envp)
+int kernel_execve(const char *filename, char *const argv[], char *const envp[])
 {
 	struct pt_regs regs;
 	int ret;
@@ -317,7 +317,7 @@
  out:
 	return ret;
 }
-EXPORT_SYMBOL(execve);
+EXPORT_SYMBOL(kernel_execve);
 
 /*
  * Since loff_t is a 64 bit type we avoid a lot of ABI hastle
Index: linux-cg/arch/arm26/kernel/sys_arm.c
===================================================================
--- linux-cg.orig/arch/arm26/kernel/sys_arm.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/arm26/kernel/sys_arm.c	2006-08-20 19:08:32.000000000 +0200
@@ -283,7 +283,7 @@
 }
 
 /* FIXME - see if this is correct for arm26 */
-long execve(const char *filename, char **argv, char **envp)
+int kernel_execve(const char *filename, char *const argv[], char *const envp[])
 {
 	struct pt_regs regs;
 	int ret;
@@ -320,4 +320,4 @@
 	return ret;
 }
 
-EXPORT_SYMBOL(execve);
+EXPORT_SYMBOL(kernel_execve);
Index: linux-cg/arch/powerpc/kernel/misc_32.S
===================================================================
--- linux-cg.orig/arch/powerpc/kernel/misc_32.S	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/powerpc/kernel/misc_32.S	2006-08-20 19:06:00.000000000 +0200
@@ -843,7 +843,7 @@
 	addi r1,r1,16
 	blr
 
-_GLOBAL(execve)
+_GLOBAL(kernel_execve)
 	li r0,__NR_execve
 	sc
 	bnslr
Index: linux-cg/arch/powerpc/kernel/misc_64.S
===================================================================
--- linux-cg.orig/arch/powerpc/kernel/misc_64.S	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/powerpc/kernel/misc_64.S	2006-08-20 19:06:00.000000000 +0200
@@ -556,7 +556,7 @@
 
 #endif /* CONFIG_ALTIVEC */
 
-_GLOBAL(execve)
+_GLOBAL(kernel_execve)
 	li r0,__NR_execve
 	sc
 	bnslr
Index: linux-cg/arch/um/Kconfig
===================================================================
--- linux-cg.orig/arch/um/Kconfig	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/um/Kconfig	2006-08-20 19:06:00.000000000 +0200
@@ -29,6 +29,9 @@
 	bool
 	default y
 
+config HAVE_KERNEL_EXECVE
+	def_bool y
+
 # Used in kernel/irq/manage.c and include/linux/irq.h
 config IRQ_RELEASE_METHOD
 	bool
Index: linux-cg/include/asm-alpha/unistd.h
===================================================================
--- linux-cg.orig/include/asm-alpha/unistd.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/asm-alpha/unistd.h	2006-08-20 19:06:00.000000000 +0200
@@ -580,75 +580,6 @@
 #define __ARCH_WANT_SYS_OLDUMOUNT
 #define __ARCH_WANT_SYS_SIGPENDING
 
-#ifdef __KERNEL_SYSCALLS__
-
-#include <linux/compiler.h>
-#include <linux/types.h>
-#include <linux/string.h>
-#include <linux/signal.h>
-#include <linux/syscalls.h>
-#include <asm/ptrace.h>
-
-static inline long open(const char * name, int mode, int flags)
-{
-	return sys_open(name, mode, flags);
-}
-
-static inline long dup(int fd)
-{
-	return sys_dup(fd);
-}
-
-static inline long close(int fd)
-{
-	return sys_close(fd);
-}
-
-static inline off_t lseek(int fd, off_t off, int whence)
-{
-	return sys_lseek(fd, off, whence);
-}
-
-static inline void _exit(int value)
-{
-	sys_exit(value);
-}
-
-#define exit(x) _exit(x)
-
-static inline long write(int fd, const char * buf, size_t nr)
-{
-	return sys_write(fd, buf, nr);
-}
-
-static inline long read(int fd, char * buf, size_t nr)
-{
-	return sys_read(fd, buf, nr);
-}
-
-extern int execve(char *, char **, char **);
-
-static inline long setsid(void)
-{
-	return sys_setsid();
-}
-
-static inline pid_t waitpid(int pid, int * wait_stat, int flags)
-{
-	return sys_wait4(pid, wait_stat, flags, NULL);
-}
-
-asmlinkage int sys_execve(char *ufilename, char **argv, char **envp,
-			unsigned long a3, unsigned long a4, unsigned long a5,
-			struct pt_regs regs);
-asmlinkage long sys_rt_sigaction(int sig,
-				const struct sigaction __user *act,
-				struct sigaction __user *oact,
-				size_t sigsetsize,
-				void *restorer);
-
-#endif /* __KERNEL_SYSCALLS__ */
-
 /* "Conditional" syscalls. What we want is
 
 	__attribute__((weak,alias("sys_ni_syscall")))
Index: linux-cg/include/asm-arm/unistd.h
===================================================================
--- linux-cg.orig/include/asm-arm/unistd.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/asm-arm/unistd.h	2006-08-20 19:06:00.000000000 +0200
@@ -548,30 +548,6 @@
 #define __ARCH_WANT_SYS_SOCKETCALL
 #endif
 
-#ifdef __KERNEL_SYSCALLS__
-
-#include <linux/compiler.h>
-#include <linux/types.h>
-#include <linux/syscalls.h>
-
-extern long execve(const char *file, char **argv, char **envp);
-
-struct pt_regs;
-asmlinkage int sys_execve(char *filenamei, char **argv, char **envp,
-			struct pt_regs *regs);
-asmlinkage int sys_clone(unsigned long clone_flags, unsigned long newsp,
-			struct pt_regs *regs);
-asmlinkage int sys_fork(struct pt_regs *regs);
-asmlinkage int sys_vfork(struct pt_regs *regs);
-asmlinkage int sys_pipe(unsigned long *fildes);
-struct sigaction;
-asmlinkage long sys_rt_sigaction(int sig,
-				const struct sigaction __user *act,
-				struct sigaction __user *oact,
-				size_t sigsetsize);
-
-#endif /* __KERNEL_SYSCALLS__ */
-
 /*
  * "Conditional" syscalls
  *
Index: linux-cg/include/asm-arm26/unistd.h
===================================================================
--- linux-cg.orig/include/asm-arm26/unistd.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/asm-arm26/unistd.h	2006-08-20 19:06:00.000000000 +0200
@@ -463,30 +463,6 @@
 #define __ARCH_WANT_SYS_SIGPROCMASK
 #define __ARCH_WANT_SYS_RT_SIGACTION
 
-#ifdef __KERNEL_SYSCALLS__
-
-#include <linux/compiler.h>
-#include <linux/types.h>
-#include <linux/syscalls.h>
-
-extern long execve(const char *file, char **argv, char **envp);
-
-struct pt_regs;
-asmlinkage int sys_execve(char *filenamei, char **argv, char **envp,
-			struct pt_regs *regs);
-asmlinkage int sys_clone(unsigned long clone_flags, unsigned long newsp,
-			struct pt_regs *regs);
-asmlinkage int sys_fork(struct pt_regs *regs);
-asmlinkage int sys_vfork(struct pt_regs *regs);
-asmlinkage int sys_pipe(unsigned long *fildes);
-struct sigaction;
-asmlinkage long sys_rt_sigaction(int sig,
-				const struct sigaction __user *act,
-				struct sigaction __user *oact,
-				size_t sigsetsize);
-
-#endif /* __KERNEL_SYSCALLS__ */
-
 /*
  * "Conditional" syscalls
  *
Index: linux-cg/include/asm-parisc/unistd.h
===================================================================
--- linux-cg.orig/include/asm-parisc/unistd.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/asm-parisc/unistd.h	2006-08-20 19:06:00.000000000 +0200
@@ -959,85 +959,6 @@
 	return K_INLINE_SYSCALL(name, 6, arg1, arg2, arg3, arg4, arg5, arg6); \
 }
 
-#ifdef __KERNEL_SYSCALLS__
-
-#include <asm/current.h>
-#include <linux/compiler.h>
-#include <linux/types.h>
-#include <linux/syscalls.h>
-
-static inline pid_t setsid(void)
-{
-	return sys_setsid();
-}
-
-static inline int write(int fd, const char *buf, off_t count)
-{
-	return sys_write(fd, buf, count);
-}
-
-static inline int read(int fd, char *buf, off_t count)
-{
-	return sys_read(fd, buf, count);
-}
-
-static inline off_t lseek(int fd, off_t offset, int count)
-{
-	return sys_lseek(fd, offset, count);
-}
-
-static inline int dup(int fd)
-{
-	return sys_dup(fd);
-}
-
-static inline int execve(char *filename, char * argv [],
-	char * envp[])
-{
-	extern int __execve(char *, char **, char **, struct task_struct *);
-	return __execve(filename, argv, envp, current);
-}
-
-static inline int open(const char *file, int flag, int mode)
-{
-	return sys_open(file, flag, mode);
-}
-
-static inline int close(int fd)
-{
-	return sys_close(fd);
-}
-
-static inline void _exit(int exitcode)
-{
-	sys_exit(exitcode);
-}
-
-static inline pid_t waitpid(pid_t pid, int *wait_stat, int options)
-{
-	return sys_wait4(pid, wait_stat, options, NULL);
-}
-
-asmlinkage unsigned long sys_mmap(unsigned long addr, unsigned long len,
-				unsigned long prot, unsigned long flags,
-				unsigned long fd, unsigned long offset);
-asmlinkage unsigned long sys_mmap2(unsigned long addr, unsigned long len,
-				unsigned long prot, unsigned long flags,
-				unsigned long fd, unsigned long pgoff);
-struct pt_regs;
-asmlinkage int sys_execve(struct pt_regs *regs);
-int sys_clone(unsigned long clone_flags, unsigned long usp,
-		struct pt_regs *regs);
-int sys_vfork(struct pt_regs *regs);
-int sys_pipe(int *fildes);
-struct sigaction;
-asmlinkage long sys_rt_sigaction(int sig,
-				const struct sigaction __user *act,
-				struct sigaction __user *oact,
-				size_t sigsetsize);
-
-#endif /* __KERNEL_SYSCALLS__ */
-
 #endif /* __ASSEMBLY__ */
 
 #undef STR
Index: linux-cg/include/asm-powerpc/unistd.h
===================================================================
--- linux-cg.orig/include/asm-powerpc/unistd.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/asm-powerpc/unistd.h	2006-08-20 19:06:00.000000000 +0200
@@ -479,13 +479,6 @@
 #endif
 
 /*
- * System call prototypes.
- */
-#ifdef __KERNEL_SYSCALLS__
-extern int execve(const char *file, char **argv, char **envp);
-#endif /* __KERNEL_SYSCALLS__ */
-
-/*
  * "Conditional" syscalls
  *
  * What we want is __attribute__((weak,alias("sys_ni_syscall"))),
Index: linux-cg/include/asm-um/unistd.h
===================================================================
--- linux-cg.orig/include/asm-um/unistd.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/asm-um/unistd.h	2006-08-20 19:06:00.000000000 +0200
@@ -37,33 +37,6 @@
 #define __ARCH_WANT_SYS_RT_SIGSUSPEND
 #endif
 
-#ifdef __KERNEL_SYSCALLS__
-
-#include <linux/compiler.h>
-#include <linux/types.h>
-
-static inline int execve(const char *filename, char *const argv[],
-			 char *const envp[])
-{
-	mm_segment_t fs;
-	int ret;
-
-	fs = get_fs();
-	set_fs(KERNEL_DS);
-	ret = um_execve(filename, argv, envp);
-	set_fs(fs);
-
-	if (ret >= 0)
-		return ret;
-
-	errno = -(long)ret;
-	return -1;
-}
-
-int sys_execve(char *file, char **argv, char **env);
-
-#endif /* __KERNEL_SYSCALLS__ */
-
 #undef __KERNEL_SYSCALLS__
 #include "asm/arch/unistd.h"
 
Index: linux-cg/include/linux/syscalls.h
===================================================================
--- linux-cg.orig/include/linux/syscalls.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/linux/syscalls.h	2006-08-20 19:06:00.000000000 +0200
@@ -597,4 +597,6 @@
 asmlinkage long sys_set_robust_list(struct robust_list_head __user *head,
 				    size_t len);
 
+int kernel_execve(const char *filename, char *const argv[], char *const envp[]);
+
 #endif
Index: linux-cg/arch/ia64/kernel/entry.S
===================================================================
--- linux-cg.orig/arch/ia64/kernel/entry.S	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/ia64/kernel/entry.S	2006-08-20 19:06:00.000000000 +0200
@@ -492,11 +492,11 @@
 	br.ret.sptk.many rp
 END(prefetch_stack)
 
-GLOBAL_ENTRY(execve)
+GLOBAL_ENTRY(kernel_execve)
 	mov r15=__NR_execve			// put syscall number in place
 	break __BREAK_SYSCALL
 	br.ret.sptk.many rp
-END(execve)
+END(kernel_execve)
 
 GLOBAL_ENTRY(clone)
 	mov r15=__NR_clone			// put syscall number in place
Index: linux-cg/arch/parisc/kernel/process.c
===================================================================
--- linux-cg.orig/arch/parisc/kernel/process.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/parisc/kernel/process.c	2006-08-20 19:06:00.000000000 +0200
@@ -368,7 +368,14 @@
 	return error;
 }
 
-unsigned long
+extern int __execve(const char *filename, char *const argv[],
+		char *const envp[], struct task_struct *task);
+int kernel_execve(const char *filename, char *const argv[], char *const envp[])
+{
+	return __execve(filename, argv, envp, current);
+}
+
+unsigned long
 get_wchan(struct task_struct *p)
 {
 	struct unwind_frame_info info;
Index: linux-cg/arch/um/kernel/syscall.c
===================================================================
--- linux-cg.orig/arch/um/kernel/syscall.c	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/arch/um/kernel/syscall.c	2006-08-20 19:06:00.000000000 +0200
@@ -164,3 +164,16 @@
 	spin_unlock(&syscall_lock);
 	return(ret);
 }
+
+int kernel_execve(const char *filename, char *const argv[], char *const envp[])
+{
+	mm_segment_t fs;
+	int ret;
+
+	fs = get_fs();
+	set_fs(KERNEL_DS);
+	ret = um_execve(filename, argv, envp);
+	set_fs(fs);
+
+	return ret;
+}
Index: linux-cg/include/asm-ia64/unistd.h
===================================================================
--- linux-cg.orig/include/asm-ia64/unistd.h	2006-08-20 19:05:53.000000000 +0200
+++ linux-cg/include/asm-ia64/unistd.h	2006-08-20 19:06:00.000000000 +0200
@@ -319,78 +319,6 @@
 
 extern long __ia64_syscall (long a0, long a1, long a2, long a3, long a4, long nr);
 
-#ifdef __KERNEL_SYSCALLS__
-
-#include <linux/compiler.h>
-#include <linux/string.h>
-#include <linux/signal.h>
-#include <asm/ptrace.h>
-#include <linux/stringify.h>
-#include <linux/syscalls.h>
-
-static inline long
-open (const char * name, int mode, int flags)
-{
-	return sys_open(name, mode, flags);
-}
-
-static inline long
-dup (int fd)
-{
-	return sys_dup(fd);
-}
-
-static inline long
-close (int fd)
-{
-	return sys_close(fd);
-}
-
-static inline off_t
-lseek (int fd, off_t off, int whence)
-{
-	return sys_lseek(fd, off, whence);
-}
-
-static inline void
-_exit (int value)
-{
-	sys_exit(value);
-}
-
-#define exit(x) _exit(x)
-
-static inline long
-write (int fd, const char * buf, size_t nr)
-{
-	return sys_write(fd, buf, nr);
-}
-
-static inline long
-read (int fd, char * buf, size_t nr)
-{
-	return sys_read(fd, buf, nr);
-}
-
-
-static inline long
-setsid (void)
-{
-	return sys_setsid();
-}
-
-static inline pid_t
-waitpid (int pid, int * wait_stat, int flags)
-{
-	return sys_wait4(pid, wait_stat, flags, NULL);
-}
-
-
-extern int execve (const char *filename, char *const av[], char *const ep[]);
-extern pid_t clone (unsigned long flags, void *sp);
-
-#endif /* __KERNEL_SYSCALLS__ */
-
 asmlinkage unsigned long sys_mmap(
 				unsigned long addr, unsigned long len,
 				int prot, int flags,
Index: linux-cg/arch/alpha/Kconfig
===================================================================
--- linux-cg.orig/arch/alpha/Kconfig	2006-08-20 19:06:01.000000000 +0200
+++ linux-cg/arch/alpha/Kconfig	2006-08-20 19:06:02.000000000 +0200
@@ -524,6 +524,9 @@
 	depends on SMP
 	default y
 
+config HAVE_KERNEL_EXECVE
+	def_bool y
+
 config NR_CPUS
 	int "Maximum number of CPUs (2-64)"
 	range 2 64
Index: linux-cg/arch/arm/Kconfig
===================================================================
--- linux-cg.orig/arch/arm/Kconfig	2006-08-20 19:06:01.000000000 +0200
+++ linux-cg/arch/arm/Kconfig	2006-08-20 19:06:02.000000000 +0200
@@ -77,6 +77,9 @@
 config GENERIC_BUST_SPINLOCK
 	bool
 
+config HAVE_KERNEL_EXECVE
+	def_bool y
+
 config ARCH_MAY_HAVE_PC_FDC
 	bool
 
Index: linux-cg/arch/arm26/Kconfig
===================================================================
--- linux-cg.orig/arch/arm26/Kconfig	2006-08-20 19:06:01.000000000 +0200
+++ linux-cg/arch/arm26/Kconfig	2006-08-20 19:06:02.000000000 +0200
@@ -52,6 +52,9 @@
 config GENERIC_BUST_SPINLOCK
 	bool
 
+config HAVE_KERNEL_EXECVE
+	def_bool y
+
 config GENERIC_ISA_DMA
 	bool
 
Index: linux-cg/arch/ia64/Kconfig
===================================================================
--- linux-cg.orig/arch/ia64/Kconfig	2006-08-20 19:06:01.000000000 +0200
+++ linux-cg/arch/ia64/Kconfig	2006-08-20 19:06:02.000000000 +0200
@@ -54,6 +54,9 @@
 	bool
 	default y
 
+config HAVE_KERNEL_EXECVE
+	def_bool y
+
 config GENERIC_IOMAP
 	bool
 	default y
Index: linux-cg/arch/parisc/Kconfig
===================================================================
--- linux-cg.orig/arch/parisc/Kconfig	2006-08-20 19:06:01.000000000 +0200
+++ linux-cg/arch/parisc/Kconfig	2006-08-20 19:06:02.000000000 +0200
@@ -37,6 +37,9 @@
 	bool
 	default y
 
+config HAVE_KERNEL_EXECVE
+	def_bool y
+
 config TIME_LOW_RES
 	bool
 	depends on SMP
Index: linux-cg/arch/powerpc/Kconfig
===================================================================
--- linux-cg.orig/arch/powerpc/Kconfig	2006-08-20 19:06:01.000000000 +0200
+++ linux-cg/arch/powerpc/Kconfig	2006-08-20 19:06:02.000000000 +0200
@@ -53,6 +53,9 @@
 	bool
 	default y
 
+config HAVE_KERNEL_EXECVE
+	def_bool y
+
 config PPC
 	bool
 	default y
Index: linux-cg/arch/alpha/kernel/alpha_ksyms.c
===================================================================
--- linux-cg.orig/arch/alpha/kernel/alpha_ksyms.c	2006-08-20 19:09:47.000000000 +0200
+++ linux-cg/arch/alpha/kernel/alpha_ksyms.c	2006-08-20 19:09:48.000000000 +0200
@@ -116,7 +116,7 @@
 EXPORT_SYMBOL(sys_exit);
 EXPORT_SYMBOL(sys_write);
 EXPORT_SYMBOL(sys_lseek);
-EXPORT_SYMBOL(execve);
+EXPORT_SYMBOL(kernel_execve);
 EXPORT_SYMBOL(sys_setsid);
 EXPORT_SYMBOL(sys_wait4);
 
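
The set_fs() explanation in the message above can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not
kernel code: every model_* name, MODEL_TASK_SIZE and the sample addresses
are made up for illustration, and the real address-limit check differs per
architecture. It mirrors the get_fs() / set_fs(KERNEL_DS) / restore
sequence that the arch/um kernel_execve() in the patch uses.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical split: "user" addresses lie below MODEL_TASK_SIZE. */
#define MODEL_TASK_SIZE ((uintptr_t)0xC0000000u)

/* Stand-in for mm_segment_t: the highest address a "syscall" may touch. */
typedef struct { uintptr_t limit; } model_segment_t;

#define MODEL_KERNEL_DS ((model_segment_t){ UINTPTR_MAX })

/* Current address limit; starts out as the user limit. */
static model_segment_t model_addr_limit = { MODEL_TASK_SIZE };

static model_segment_t model_get_fs(void)
{
	return model_addr_limit;
}

static void model_set_fs(model_segment_t fs)
{
	model_addr_limit = fs;
}

/* Range check in the spirit of access_ok(): the buffer must fit below the limit. */
static int model_access_ok(uintptr_t addr, uintptr_t size)
{
	return size <= model_addr_limit.limit &&
	       addr <= model_addr_limit.limit - size;
}

/* Stand-in for a syscall body that expects a __user pointer. */
static int model_sys_execve(uintptr_t filename_addr)
{
	if (!model_access_ok(filename_addr, 1))
		return -EFAULT;
	return 0;
}

/* Widen the limit, call the syscall body, then restore the old limit. */
static int model_kernel_execve(uintptr_t filename_addr)
{
	model_segment_t old = model_get_fs();
	int ret;

	model_set_fs(MODEL_KERNEL_DS);
	ret = model_sys_execve(filename_addr);
	model_set_fs(old);

	return ret;
}
```

In this model a "kernel" address such as 0xC0100000 is rejected with
-EFAULT when passed straight to model_sys_execve(), and accepted through
model_kernel_execve(), which leaves the limit as it found it.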
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-20 17:13 ` [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ Arnd Bergmann
@ 2006-08-20 17:36 ` Chase Venters
  2006-08-20 18:25 ` Andrew Morton
  2006-08-20 19:31 ` Arnd Bergmann
  2006-08-21 0:36 ` Paul Mackerras
  1 sibling, 2 replies; 22+ messages in thread
From: Chase Venters @ 2006-08-20 17:36 UTC
To: Arnd Bergmann
Cc: Björn Steinbrink, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch

On Sunday 20 August 2006 12:13, Arnd Bergmann wrote:
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-cg/lib/execve.c 2006-08-20 19:06:00.000000000 +0200
> @@ -0,0 +1,19 @@
> +#include <asm/bug.h>
> +#include <asm/uaccess.h>
> +
> +#define __KERNEL_SYSCALLS__
> +static int errno;
> +#include <asm/unistd.h>
> +
> +int kernel_execve(const char *filename, char *const argv[], char *const envp[])
> +{
> +	mm_segment_t fs = get_fs();
> +	int ret;
> +
> +	WARN_ON(segment_eq(fs, USER_DS));
> +	ret = execve(filename, (char **)argv, (char **)envp);
> +	if (ret)
> +		ret = errno;
> +
> +	return ret;
> +}

I noticed this global errno in lib/errno.c a while ago and was wondering what
the right way to clean it up is. From what I remember, no one actually uses
errno in the kernel (unless it's an "errno" they've defined locally). The only
other place errno gets used is by all of the syscall macros.

Unless there's some TLS kernel magic that I've totally missed, using errno in
this manner is totally unsafe anyway. So I would NAK the above because your
kernel_execve() function gives an unsafe errno value significance it should
not have by turning it into a return value. (As an aside, shouldn't that have
read [ ret = -errno; ] anyway?)

Unless 'errno' has some significant reason to live on in the kernel, I think
it would be better to kill it and write kernel syscall macros that don't muck
with it.

Thanks,
Chase
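To make the concern about a shared errno concrete, here is a minimal user-space sketch. It is not kernel code: `shared_errno` and `fake_syscall()` are hypothetical names, and the interleaving of two tasks is written out by hand instead of relying on real concurrency.

```c
#include <errno.h>

/* Hypothetical model: one errno slot shared by every caller. */
static int shared_errno;

/* Mimics the old macro convention: return -1 and store the code in the shared slot. */
static int fake_syscall(int error_code)
{
	if (error_code) {
		shared_errno = error_code;
		return -1;
	}
	return 0;
}

/* Hand-written interleaving: task B fails after task A, before A reads the slot. */
static int errno_seen_by_task_a(void)
{
	int ret_a = fake_syscall(ENOENT);	/* task A fails with ENOENT */
	int ret_b = fake_syscall(EACCES);	/* task B overwrites the shared slot */

	(void)ret_b;
	return ret_a ? shared_errno : 0;	/* task A reads task B's code */
}
```

In this model task A failed with ENOENT but reads EACCES, because nothing ties the shared slot to the caller that set it. Passing the error code back in the return value avoids the shared state altogether.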
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 17:36 ` Chase Venters @ 2006-08-20 18:25 ` Andrew Morton 2006-08-20 18:32 ` Chase Venters 2006-08-20 19:31 ` Arnd Bergmann 1 sibling, 1 reply; 22+ messages in thread From: Andrew Morton @ 2006-08-20 18:25 UTC (permalink / raw) To: Chase Venters Cc: Arnd Bergmann, Björn Steinbrink, Russell King, rusty, linux-kernel, linux-arch On Sun, 20 Aug 2006 12:36:49 -0500 Chase Venters <chase.venters@clientec.com> wrote: > Unless 'errno' has some significant reason to live on in the kernel, I think > it would be better to kill it and write kernel syscall macros that don't muck > with it. We have been working in that direction. It's certainly something we'd like to kill off. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 18:25 ` Andrew Morton @ 2006-08-20 18:32 ` Chase Venters 2006-08-20 19:45 ` Björn Steinbrink 0 siblings, 1 reply; 22+ messages in thread From: Chase Venters @ 2006-08-20 18:32 UTC (permalink / raw) To: Andrew Morton Cc: Arnd Bergmann, Björn Steinbrink, Russell King, rusty, linux-kernel, linux-arch On Sunday 20 August 2006 13:25, Andrew Morton wrote: > On Sun, 20 Aug 2006 12:36:49 -0500 > > Chase Venters <chase.venters@clientec.com> wrote: > > Unless 'errno' has some significant reason to live on in the kernel, I > > think it would be better to kill it and write kernel syscall macros that > > don't muck with it. > > We have been working in that direction. It's certainly something we'd like > to kill off. Perhaps Arnd's patch is a good step in that direction then. A secondary suggestion is to put a big comment there that explains "Yes, we know this is ugly, it's going to die soon." I'd also consider going so far as just returning -1 if we failed, since we can't quite trust errno anyway. Thanks, Chase ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 18:32 ` Chase Venters @ 2006-08-20 19:45 ` Björn Steinbrink 2006-08-20 19:50 ` Arjan van de Ven 0 siblings, 1 reply; 22+ messages in thread From: Björn Steinbrink @ 2006-08-20 19:45 UTC (permalink / raw) To: Chase Venters Cc: Andrew Morton, Arnd Bergmann, Russell King, rusty, linux-kernel, linux-arch On 2006.08.20 13:32:39 -0500, Chase Venters wrote: > On Sunday 20 August 2006 13:25, Andrew Morton wrote: > > On Sun, 20 Aug 2006 12:36:49 -0500 > > > > Chase Venters <chase.venters@clientec.com> wrote: > > > Unless 'errno' has some significant reason to live on in the kernel, I > > > think it would be better to kill it and write kernel syscall macros that > > > don't muck with it. > > > > We have been working in that direction. It's certainly something we'd like > > to kill off. > > Perhaps Arnd's patch is a good step in that direction then. A secondary > suggestion is to put a big comment there that explains "Yes, we know this is > ugly, it's going to die soon." > > I'd also consider going so far as just returning -1 if we failed, since we > can't quite trust errno anyway. Could we rename __syscall_return to IS_SYS_ERR (or whatever) and force kernel syscall users to do the check? That way we could eliminate errno and still provide the real error code to the code using the syscall. Björn ^ permalink raw reply [flat|nested] 22+ messages in thread
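The check suggested above could look roughly like the following. This is only a sketch: `is_sys_err()`, `fake_sys_open()` and `SKETCH_MAX_ERRNO` are made-up names, and the 4095 threshold is an assumption borrowed from the kernel's `IS_ERR_VALUE()` convention, not something proposed in this thread.

```c
#include <errno.h>

/* Assumption: error codes are the values -1 .. -4095, as with IS_ERR_VALUE(). */
#define SKETCH_MAX_ERRNO 4095

/* Non-zero if ret falls into the range reserved for negative error codes. */
static int is_sys_err(long ret)
{
	return (unsigned long)ret >= (unsigned long)-SKETCH_MAX_ERRNO;
}

/* Hypothetical syscall stand-in: returns the raw value, no errno involved. */
static long fake_sys_open(int fail)
{
	return fail ? -ENOENT : 3;
}
```

A caller would then test `is_sys_err(ret)` and use `-ret` as the error code, so no errno variable is needed and the real error still reaches the caller.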
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 19:45 ` Björn Steinbrink @ 2006-08-20 19:50 ` Arjan van de Ven 2006-08-20 20:11 ` Björn Steinbrink 0 siblings, 1 reply; 22+ messages in thread From: Arjan van de Ven @ 2006-08-20 19:50 UTC (permalink / raw) To: Björn Steinbrink Cc: Chase Venters, Andrew Morton, Arnd Bergmann, Russell King, rusty, linux-kernel, linux-arch \ > Could we rename __syscall_return to IS_SYS_ERR (or whatever) and force > kernel syscall users to do the check? That way we could eliminate errno s/users/user/ .. there's one left that should die out soon ;) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 19:50 ` Arjan van de Ven @ 2006-08-20 20:11 ` Björn Steinbrink 2006-08-20 20:20 ` Arjan van de Ven 2006-08-20 20:33 ` Arnd Bergmann 0 siblings, 2 replies; 22+ messages in thread From: Björn Steinbrink @ 2006-08-20 20:11 UTC (permalink / raw) To: Arjan van de Ven Cc: Chase Venters, Andrew Morton, Arnd Bergmann, Russell King, rusty, linux-kernel, linux-arch On 2006.08.20 21:50:46 +0200, Arjan van de Ven wrote: > \ > > Could we rename __syscall_return to IS_SYS_ERR (or whatever) and force > > kernel syscall users to do the check? That way we could eliminate errno > > s/users/user/ .. there's one left that should die out soon ;) > Only one in unistd.h, but throughout the kernel there are quite a few unless I'm missing something here: doener@atjola:~/src/kernel/linux-2.6$ grep \ _syscall * -R | \ > grep -v define\\\|undef\\\|clobber | wc -l 116 Are these just going to be replaced by calls to sys_whatever? Björn ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 20:11 ` Björn Steinbrink @ 2006-08-20 20:20 ` Arjan van de Ven 2006-08-20 20:36 ` Björn Steinbrink 2006-08-20 20:33 ` Arnd Bergmann 1 sibling, 1 reply; 22+ messages in thread From: Arjan van de Ven @ 2006-08-20 20:20 UTC (permalink / raw) To: Björn Steinbrink Cc: Chase Venters, Andrew Morton, Arnd Bergmann, Russell King, rusty, linux-kernel, linux-arch On Sun, 2006-08-20 at 22:11 +0200, Björn Steinbrink wrote: > On 2006.08.20 21:50:46 +0200, Arjan van de Ven wrote: > > \ > > > Could we rename __syscall_return to IS_SYS_ERR (or whatever) and force > > > kernel syscall users to do the check? That way we could eliminate errno > > > > s/users/user/ .. there's one left that should die out soon ;) > > > > Only one in unistd.h, but throughout the kernel there are quite a few > unless I'm missing something here: > doener@atjola:~/src/kernel/linux-2.6$ grep \ _syscall * -R | \ > > grep -v define\\\|undef\\\|clobber | wc -l > 116 > > Are these just going to be replaced by calls to sys_whatever? they're not the users of this, they're the definitions... ;) -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 20:20 ` Arjan van de Ven @ 2006-08-20 20:36 ` Björn Steinbrink 2006-08-20 20:40 ` Arjan van de Ven 2006-08-21 1:55 ` Jeff Dike 0 siblings, 2 replies; 22+ messages in thread From: Björn Steinbrink @ 2006-08-20 20:36 UTC (permalink / raw) To: Arjan van de Ven Cc: Chase Venters, Andrew Morton, Arnd Bergmann, Russell King, rusty, linux-kernel, linux-arch On 2006.08.20 22:20:28 +0200, Arjan van de Ven wrote: > On Sun, 2006-08-20 at 22:11 +0200, Björn Steinbrink wrote: > > On 2006.08.20 21:50:46 +0200, Arjan van de Ven wrote: > > > \ > > > > Could we rename __syscall_return to IS_SYS_ERR (or whatever) and force > > > > kernel syscall users to do the check? That way we could eliminate errno > > > > > > s/users/user/ .. there's one left that should die out soon ;) > > > > > > > Only one in unistd.h, but throughout the kernel there are quite a few > > unless I'm missing something here: > > doener@atjola:~/src/kernel/linux-2.6$ grep \ _syscall * -R | \ > > > grep -v define\\\|undef\\\|clobber | wc -l > > 116 > > > > Are these just going to be replaced by calls to sys_whatever? > > they're not the users of this, they're the definitions... ;) Well, I assume that if some code defines a syscall, it will actually use it. Of course I meant to ask if the users of those definitions are going to just call sys_whatever. For example check_host_supports_tls in arch/um/os-Linux/sys-i386/tls.c which even uses the global errno (although in that case the whole else part could probably be just removed). Björn ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 20:36 ` Björn Steinbrink @ 2006-08-20 20:40 ` Arjan van de Ven 2006-08-21 1:55 ` Jeff Dike 1 sibling, 0 replies; 22+ messages in thread From: Arjan van de Ven @ 2006-08-20 20:40 UTC (permalink / raw) To: Björn Steinbrink Cc: Chase Venters, Andrew Morton, Arnd Bergmann, Russell King, rusty, linux-kernel, linux-arch On Sun, 2006-08-20 at 22:36 +0200, Björn Steinbrink wrote: > On 2006.08.20 22:20:28 +0200, Arjan van de Ven wrote: > > On Sun, 2006-08-20 at 22:11 +0200, Björn Steinbrink wrote: > > > On 2006.08.20 21:50:46 +0200, Arjan van de Ven wrote: > > > > \ > > > > > Could we rename __syscall_return to IS_SYS_ERR (or whatever) and force > > > > > kernel syscall users to do the check? That way we could eliminate errno > > > > > > > > s/users/user/ .. there's one left that should die out soon ;) > > > > > > > > > > Only one in unistd.h, but throughout the kernel there are quite a few > > > unless I'm missing something here: > > > doener@atjola:~/src/kernel/linux-2.6$ grep \ _syscall * -R | \ > > > > grep -v define\\\|undef\\\|clobber | wc -l > > > 116 > > > > > > Are these just going to be replaced by calls to sys_whatever? > > > > they're not the users of this, they're the definitions... ;) > > Well, I assume that if some code defines a syscall, it will actually use > it. Of course I meant to ask if the users of those definitions are going > to just call sys_whatever. > For example check_host_supports_tls in arch/um/os-Linux/sys-i386/tls.c > which even uses the global errno (although in that case the whole > else part could probably be just removed). um uses glibc, and is thus special.. lets ignore that ;) (really, it's an entire different beast in this regard) -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-20 20:36 ` Björn Steinbrink
  2006-08-20 20:40 ` Arjan van de Ven
@ 2006-08-21 1:55 ` Jeff Dike
  1 sibling, 0 replies; 22+ messages in thread
From: Jeff Dike @ 2006-08-21 1:55 UTC
To: Björn Steinbrink, Arjan van de Ven, Chase Venters, Andrew Morton, Arnd Bergmann, Russell King, rusty, linux-kernel, linux-arch

On Sun, Aug 20, 2006 at 10:36:04PM +0200, Björn Steinbrink wrote:
> For example check_host_supports_tls in arch/um/os-Linux/sys-i386/tls.c
> which even uses the global errno (although in that case the whole
> else part could probably be just removed).

UML is different. It uses errno extensively (as it must) on the glibc
side of things. On the kernel side, there are no uses of errno that I'm
aware of.

				Jeff
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 20:11 ` Björn Steinbrink 2006-08-20 20:20 ` Arjan van de Ven @ 2006-08-20 20:33 ` Arnd Bergmann 1 sibling, 0 replies; 22+ messages in thread From: Arnd Bergmann @ 2006-08-20 20:33 UTC (permalink / raw) To: Björn Steinbrink Cc: Arjan van de Ven, Chase Venters, Andrew Morton, Russell King, rusty, linux-kernel, linux-arch On Sunday 20 August 2006 22:11, Björn Steinbrink wrote: > Only one in unistd.h, but throughout the kernel there are quite a few > unless I'm missing something here: > doener@atjola:~/src/kernel/linux-2.6$ grep \ _syscall * -R | \ > > grep -v define\\\|undef\\\|clobber | wc -l > 116 there are only a few direct calls that managed to sneak in after we removed them all some time ago: | arch/sh64/kernel/process.c: _syscall0(int, getpid) | arch/sh64/kernel/process.c: _syscall1(int, getpgid, int, pid) | arch/sh64/kernel/process.c:static __inline__ _syscall2(int,clone,unsigned long,flags,unsigned long,newsp) | arch/sh64/kernel/process.c:static __inline__ _syscall1(int,exit,int,ret) These should be replaced with calls to sys_*, or whatever the other architectures do in order to implement the respective functions. 
| arch/um/os-Linux/sys-i386/tls.c:static _syscall1(int, get_thread_area, user_desc_t *, u_info); | arch/um/os-Linux/process.c:inline _syscall0(pid_t, getpid) | arch/um/os-Linux/tls.c:static _syscall1(int, get_thread_area, user_desc_t *, u_info); | arch/um/os-Linux/tls.c:static _syscall1(int, set_thread_area, user_desc_t *, u_info); | arch/um/sys-i386/unmap.c:static inline _syscall2(int,munmap,void *,start,size_t,len) | arch/um/sys-i386/unmap.c:static inline _syscall6(void *,mmap2,void *,addr,size_t,len,int,prot,int,flags,int,fd,off_t,offset) | arch/um/sys-x86_64/unmap.c:static inline _syscall2(int,munmap,void *,start,size_t,len) | arch/um/sys-x86_64/unmap.c:static inline _syscall6(void *,mmap,void *,addr,size_t,len,int,prot,int,flags,int,fd,off_t,offset) UML is special, there may be a good reason to use them, if they are not actually kernel syscalls, but instead calls to the host OS. Arnd <>< ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-20 17:36 ` Chase Venters
  2006-08-20 18:25 ` Andrew Morton
@ 2006-08-20 19:31 ` Arnd Bergmann
  1 sibling, 0 replies; 22+ messages in thread
From: Arnd Bergmann @ 2006-08-20 19:31 UTC
To: Chase Venters
Cc: Björn Steinbrink, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch

On Sunday 20 August 2006 19:36, Chase Venters wrote:
> Unless there's some TLS kernel magic that I've totally missed, using errno in
> this manner is totally unsafe anyway. So I would NAK the above because your
> kernel_execve() function gives an unsafe errno value significance it should
> not have by turning it into a return value.

It has always resulted in an unsafe errno value; my patch just fixes it on a
few architectures and makes it safe there. Note that we never even noticed
execve returning -1 on some architectures and -errno on others, and if execve
succeeds, errno is never assigned anyway.

> (As an aside, shouldn't that have read [ ret = -errno; ] anyway?)

Right, thanks for pointing this out.

> Unless 'errno' has some significant reason to live on in the kernel, I think
> it would be better to kill it and write kernel syscall macros that don't muck
> with it.

The direction in which this patch goes is to kill off kernel syscalls
entirely. The main problem there is that kernel_execve needs an
architecture-specific implementation (calling sys_execve does the wrong
thing), so doing it all in one step would require knowing how to do it on all
20 architectures. Once the execve kernel syscall is gone, errno can die with
it.

	Arnd <><
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-20 17:13 ` [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ Arnd Bergmann 2006-08-20 17:36 ` Chase Venters @ 2006-08-21 0:36 ` Paul Mackerras 2006-08-21 15:12 ` Arnd Bergmann 1 sibling, 1 reply; 22+ messages in thread From: Paul Mackerras @ 2006-08-21 0:36 UTC (permalink / raw) To: Arnd Bergmann Cc: Björn Steinbrink, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch Arnd Bergmann writes: > Iit turned out most of the architectures that already implement > their own execve() call instead of using the _syscall3 function > for it end up passing the return value of sys_execve down, > instead of setting errno. I really don't like having an "errno" variable in the kernel. What if two processes are doing an execve concurrently? Anyway, your patch returns the (positive) errno value here: > + WARN_ON(segment_eq(fs, USER_DS)); > + ret = execve(filename, (char **)argv, (char **)envp); > + if (ret) > + ret = errno; > + > + return ret; but here we are testing for a negative value to mean error: > - if (execve("/sbin/shutdown", argv, envp) < 0) { > + if (kernel_execve("/sbin/shutdown", argv, envp) < 0) { Paul. ^ permalink raw reply [flat|nested] 22+ messages in thread
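The mismatch pointed out above can be reproduced in a small user-space sketch. All names here (`fake_execve()`, `kernel_execve_posted()`, `kernel_execve_fixed()`, `caller_sees_failure()`) are hypothetical stand-ins; the errno value is passed through a pointer only so the example stays self-contained.

```c
#include <errno.h>

/* Stand-in for the macro-generated execve(): returns -1 and stores the code. */
static int fake_execve(int *err)
{
	*err = ENOENT;
	return -1;
}

/* Mirrors the posted lib/execve.c logic: hands back the positive errno value. */
static int kernel_execve_posted(void)
{
	int err = 0;
	int ret = fake_execve(&err);

	if (ret)
		ret = err;
	return ret;
}

/* Same logic with the sign corrected, so failures are negative. */
static int kernel_execve_fixed(void)
{
	int err = 0;
	int ret = fake_execve(&err);

	if (ret)
		ret = -err;
	return ret;
}

/* The style of check used by the "/sbin/shutdown" call site. */
static int caller_sees_failure(int ret)
{
	return ret < 0;
}
```

With the posted variant the `< 0` test never fires even though the call failed; with `-err` the same test works unchanged.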
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-21 0:36 ` Paul Mackerras
@ 2006-08-21 15:12 ` Arnd Bergmann
  2006-08-21 15:17 ` Russell King
  2006-08-22 7:29 ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 22+ messages in thread
From: Arnd Bergmann @ 2006-08-21 15:12 UTC
To: Paul Mackerras
Cc: Björn Steinbrink, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch

On Monday 21 August 2006 02:36, Paul Mackerras wrote:
> > Iit turned out most of the architectures that already implement
> > their own execve() call instead of using the _syscall3 function
> > for it end up passing the return value of sys_execve down,
> > instead of setting errno.
>
> I really don't like having an "errno" variable in the kernel. What if
> two processes are doing an execve concurrently?

The point is that we have two different schemes in the kernel that
conflict:

alpha, arm{,26}, ia64, parisc, powerpc and x86_64 pass the error
code from execve, all others pass -1 and set the global errno.

So the caller does not really have a chance to get the correct
error value at all. Bjoern's first patch changed one caller
from looking at the return value to looking at errno in case
of an error, which shifts the problem to other architectures.

My patch makes the errno variable local to execve, which slightly
helps, and makes it easier to get it completely right by doing
the same as powerpc or parisc.

Now, we could do something truly evil involving a nested function, like

#include <asm/bug.h>
#include <asm/uaccess.h>

#define __KERNEL_SYSCALLS__
#include <linux/unistd.h>

int kernel_execve(const char *filename, char *const argv[], char *const envp[])
{
	mm_segment_t fs = get_fs();
	int errno;
	int ret;
	_syscall3(int,execve,const char *,file,char *const*,argv,char *const*,envp)

	WARN_ON(segment_eq(fs, USER_DS));
	ret = execve(filename, argv, envp);
	if (ret)
		ret = -errno;

	return ret;
}

That would solve the problem of races on the errno variable, but set
a bad example to other hackers.

> Anyway, your patch returns the (positive) errno value here:
>
> > + WARN_ON(segment_eq(fs, USER_DS));
> > + ret = execve(filename, (char **)argv, (char **)envp);
> > + if (ret)
> > + ret = errno;
> > +
> > + return ret;
>
> but here we are testing for a negative value to mean error:
>
> > - if (execve("/sbin/shutdown", argv, envp) < 0) {
> > + if (kernel_execve("/sbin/shutdown", argv, envp) < 0) {

Yes, Chase Venters already noticed that bug. It obviously needs to
be 'ret = -errno;'.

	Arnd <><
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-21 15:12 ` Arnd Bergmann
@ 2006-08-21 15:17 ` Russell King
  2006-08-22 7:29 ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 22+ messages in thread
From: Russell King @ 2006-08-21 15:17 UTC
To: Arnd Bergmann
Cc: Paul Mackerras, Björn Steinbrink, Andrew Morton, rusty, linux-kernel, linux-arch

On Mon, Aug 21, 2006 at 05:12:17PM +0200, Arnd Bergmann wrote:
> On Monday 21 August 2006 02:36, Paul Mackerras wrote:
> > > Iit turned out most of the architectures that already implement
> > > their own execve() call instead of using the _syscall3 function
> > > for it end up passing the return value of sys_execve down,
> > > instead of setting errno.
> >
> > I really don't like having an "errno" variable in the kernel. What if
> > two processes are doing an execve concurrently?
>
> The point is that we have two different schemes in the kernel that
> conflict:
>
> alpha, arm{,26}, ia64, parisc, powerpc and x86_64 pass the error
> code from execve, all others pass -1 and set the global errno.

Indeed, and rather than fixing execve() for one set of architectures
and by doing that breaking the other set, the point of this change is
to fix _all_ architectures in the most expedient way.

At a later date, those architectures who are using the global errno
can have that _separate_ bug fixed.

Let's fix one bug at a time. Especially as this probably needs to go
in to -rc.

Arnd - thanks for taking this on.

-- 
Russell King
 Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
 maintainer of: 2.6 Serial core
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-21 15:12 ` Arnd Bergmann 2006-08-21 15:17 ` Russell King @ 2006-08-22 7:29 ` Benjamin Herrenschmidt 2006-08-22 8:00 ` Björn Steinbrink 1 sibling, 1 reply; 22+ messages in thread From: Benjamin Herrenschmidt @ 2006-08-22 7:29 UTC (permalink / raw) To: Arnd Bergmann Cc: Paul Mackerras, Björn Steinbrink, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch On Mon, 2006-08-21 at 17:12 +0200, Arnd Bergmann wrote: > On Monday 21 August 2006 02:36, Paul Mackerras wrote: > > > Iit turned out most of the architectures that already implement > > > their own execve() call instead of using the _syscall3 function > > > for it end up passing the return value of sys_execve down, > > > instead of setting errno. > > > > I really don't like having an "errno" variable in the kernel. What if > > two processes are doing an execve concurrently? > > The point is that we have two different schemes in the kernel that > conflict: > > alpha, arm{,26}, ia64, parisc, powerpc and x86_64 pass the error > code from execve, all others pass -1 and set the global errno. All other need to be fixed then... having an errno is just plain wrong. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__ 2006-08-22 7:29 ` Benjamin Herrenschmidt @ 2006-08-22 8:00 ` Björn Steinbrink 2006-08-22 10:06 ` Arnd Bergmann 0 siblings, 1 reply; 22+ messages in thread From: Björn Steinbrink @ 2006-08-22 8:00 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Arnd Bergmann, Paul Mackerras, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch On 2006.08.22 17:29:02 +1000, Benjamin Herrenschmidt wrote: > On Mon, 2006-08-21 at 17:12 +0200, Arnd Bergmann wrote: > > On Monday 21 August 2006 02:36, Paul Mackerras wrote: > > > > Iit turned out most of the architectures that already implement > > > > their own execve() call instead of using the _syscall3 function > > > > for it end up passing the return value of sys_execve down, > > > > instead of setting errno. > > > > > > I really don't like having an "errno" variable in the kernel. What if > > > two processes are doing an execve concurrently? > > > > The point is that we have two different schemes in the kernel that > > conflict: > > > > alpha, arm{,26}, ia64, parisc, powerpc and x86_64 pass the error > > code from execve, all others pass -1 and set the global errno. > > All other need to be fixed then... having an errno is just plain wrong. I'm working on a patch loosely based on Arnd's that changes the in-kernel syscall macros to directly return the error codes. Once kernel_execve is implemented for each arch, only um should remain as a user and I found only two calls there that care about the exact non-zero return value, both are simple to adapt. That should allow to get rid of errno completely. If someone knows a reason why this is destined to fail (maybe syscalls returning char?!), please let me know before I waste too much time on it ;) Björn ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-22 8:00 ` Björn Steinbrink
@ 2006-08-22 10:06 ` Arnd Bergmann
  2006-08-22 13:39 ` Jeff Dike
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2006-08-22 10:06 UTC
To: Björn Steinbrink
Cc: Benjamin Herrenschmidt, Paul Mackerras, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch

On Tuesday 22 August 2006 10:00, Björn Steinbrink wrote:
> I'm working on a patch loosely based on Arnd's that changes the
> in-kernel syscall macros to directly return the error codes.

I think that is still going in the wrong direction. Traditionally,
the macros in unistd.h were meant for user space, but we're now
discouraging that strongly (i.e. they are inside of #ifdef __KERNEL__).

The only in-kernel users of the _syscall macros used to be the
__KERNEL_SYSCALLS__ ones that we're trying to kill. The logical
consequence should be that we remove the _syscall macros entirely,
for all architectures.

UML can be converted to use the syscall function provided by libc
in order to call the host OS.

	Arnd <><
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-22 10:06 ` Arnd Bergmann
@ 2006-08-22 13:39 ` Jeff Dike
  2006-08-22 15:13 ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Dike @ 2006-08-22 13:39 UTC
To: Arnd Bergmann
Cc: Björn Steinbrink, Benjamin Herrenschmidt, Paul Mackerras, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch

On Tue, Aug 22, 2006 at 12:06:59PM +0200, Arnd Bergmann wrote:
> UML can be converted to use the syscall function provided by libc
> in order to call the host OS.

You're contemplating changing UML to do, e.g.
	syscall(NR_write, fd, buf, len)
instead of the current
	write(fd, buf,len)
?

That hardly seems like an improvement and it seems fairly unnecessary.

				Jeff
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-22 13:39 ` Jeff Dike
@ 2006-08-22 15:13 ` Arnd Bergmann
  2006-08-22 15:37 ` Jeff Dike
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2006-08-22 15:13 UTC
To: Jeff Dike
Cc: Björn Steinbrink, Benjamin Herrenschmidt, Paul Mackerras, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch

On Tuesday 22 August 2006 15:39, Jeff Dike wrote:
> You're contemplating changing UML to do, e.g.
>	syscall(NR_write, fd, buf, len)
> instead of the current
>	write(fd, buf,len)
> ?
>
> That hardly seems like an improvement and it seems fairly unnecessary.
>

No, that's not what I was referring to. I was thinking of the calls:

arch/um/os-Linux/process.c:inline _syscall0(pid_t, getpid)
arch/um/os-Linux/sys-i386/tls.c:static _syscall1(int, get_thread_area, user_desc_t *, u_info);
arch/um/os-Linux/tls.c:static _syscall1(int, get_thread_area, user_desc_t *, u_info);
arch/um/os-Linux/tls.c:static _syscall1(int, set_thread_area, user_desc_t *, u_info);
arch/um/sys-i386/unmap.c:static inline _syscall2(int,munmap,void *,start,size_t,len)
arch/um/sys-i386/unmap.c:static inline _syscall6(void *,mmap2,void *,addr,size_t,len,int,prot,int,flags,int,fd,off_t,offset)
arch/um/sys-x86_64/unmap.c:static inline _syscall2(int,munmap,void *,start,size_t,len)
arch/um/sys-x86_64/unmap.c:static inline _syscall6(void *,mmap,void *,addr,size_t,len,int,prot,int,flags,int,fd,off_t,offset)

Are these for calling the host OS or calling the UML kernel?
If they are for the host, they can be implemented using syscall(),
otherwise by calling the sys_* functions directly.

	Arnd <><
* Re: [PATCH] introduce kernel_execve function to replace __KERNEL_SYSCALLS__
  2006-08-22 15:13 ` Arnd Bergmann
@ 2006-08-22 15:37 ` Jeff Dike
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff Dike @ 2006-08-22 15:37 UTC
To: Arnd Bergmann
Cc: Björn Steinbrink, Benjamin Herrenschmidt, Paul Mackerras, Russell King, Andrew Morton, rusty, linux-kernel, linux-arch

On Tue, Aug 22, 2006 at 05:13:39PM +0200, Arnd Bergmann wrote:
> No, that's not what I was referring to. I was thinking of the calls:
>
> arch/um/os-Linux/process.c:inline _syscall0(pid_t, getpid)
> arch/um/os-Linux/sys-i386/tls.c:static _syscall1(int, get_thread_area, user_desc_t *, u_info);
> arch/um/os-Linux/tls.c:static _syscall1(int, get_thread_area, user_desc_t *, u_info);
> arch/um/os-Linux/tls.c:static _syscall1(int, set_thread_area, user_desc_t *, u_info);
> arch/um/sys-i386/unmap.c:static inline _syscall2(int,munmap,void *,start,size_t,len)
> arch/um/sys-i386/unmap.c:static inline _syscall6(void *,mmap2,void *,addr,size_t,len,int,prot,int,flags,int,fd,off_t,offset)
> arch/um/sys-x86_64/unmap.c:static inline _syscall2(int,munmap,void *,start,size_t,len)
> arch/um/sys-x86_64/unmap.c:static inline _syscall6(void *,mmap,void *,addr,size_t,len,int,prot,int,flags,int,fd,off_t,offset)
>
> Are these for calling the host OS or calling the UML kernel?
> If they are for the host, they can be implemented using syscall(),
> otherwise by calling the sys_* functions directly.

OK, these are all calling the host, and using syscall() instead sounds
reasonable.

				Jeff
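As an illustration of the conversion agreed on above, the `_syscall0(pid_t, getpid)` case could be written with libc's `syscall()` roughly as follows. This is a user-space sketch for a Linux/glibc host, not the actual UML change; `host_getpid()` is a hypothetical name.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>

/*
 * Calls the host's getpid system call directly through libc's syscall(),
 * taking the place of a function generated by _syscall0(pid_t, getpid).
 */
static pid_t host_getpid(void)
{
	return (pid_t)syscall(SYS_getpid);
}
```

Note that `syscall()` itself reports failure as -1 with the code in libc's errno, which is fine on the host side because that errno belongs to glibc and not to the kernel.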