* + syscallsx86-implement-execveat-system-call.patch added to -mm tree
@ 2014-11-12 22:08 akpm
0 siblings, 0 replies; 3+ messages in thread
From: akpm @ 2014-11-12 22:08 UTC (permalink / raw)
To: drysdale, arnd, dalias, ebiederm, hch, hpa, keescook, luto,
meredydd, mingo, mtk.manpages, shuahkh, tglx, viro, mm-commits
The patch titled
Subject: syscalls,x86: implement execveat() system call
has been added to the -mm tree. Its filename is
syscallsx86-implement-execveat-system-call.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/syscallsx86-implement-execveat-system-call.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/syscallsx86-implement-execveat-system-call.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: David Drysdale <drysdale@google.com>
Subject: syscalls,x86: implement execveat() system call
This patchset adds execveat(2) for x86, and is derived from Meredydd
Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528).
The primary aim of adding an execveat syscall is to allow an
implementation of fexecve(3) that does not rely on the /proc filesystem,
at least for executables (rather than scripts). The current glibc version
of fexecve(3) is implemented via /proc, which causes problems in sandboxed
or otherwise restricted environments.
Given the desire for a /proc-free fexecve() implementation, HPA suggested
(https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be
an appropriate generalization.
Also, having a new syscall means that it can take a flags argument without
back-compatibility concerns. The current implementation just defines the
AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be
added in future -- for example, flags for new namespaces (as suggested at
https://lkml.org/lkml/2006/7/11/474).
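As a rough illustration of the flag handling described above, here is a minimal userspace sketch. It is not the kernel code: the SK_LOOKUP_* constants and the function name execveat_lookup_flags() are hypothetical stand-ins for the kernel-internal LOOKUP_FOLLOW/LOOKUP_EMPTY bits, and the logic simply mirrors the checks that do_open_execat() makes in the fs/exec.c hunk of this patch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>   /* AT_SYMLINK_NOFOLLOW, AT_EMPTY_PATH */

/* Fallbacks for older headers; values match the Linux UAPI definitions. */
#ifndef AT_SYMLINK_NOFOLLOW
#define AT_SYMLINK_NOFOLLOW 0x100
#endif
#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif

/* Hypothetical stand-ins for the kernel-internal LOOKUP_* bits. */
#define SK_LOOKUP_FOLLOW 0x1u
#define SK_LOOKUP_EMPTY  0x2u

/*
 * Translate execveat() flags into lookup flags.
 * Returns 0 on success, or -EINVAL if an unsupported flag is set.
 */
static int execveat_lookup_flags(int flags, unsigned int *lookup)
{
	unsigned int l = SK_LOOKUP_FOLLOW;	/* symlinks are followed by default */

	if (flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH))
		return -EINVAL;
	if (flags & AT_SYMLINK_NOFOLLOW)
		l &= ~SK_LOOKUP_FOLLOW;
	if (flags & AT_EMPTY_PATH)
		l |= SK_LOOKUP_EMPTY;

	*lookup = l;
	return 0;
}
```

Rejecting unknown bits up front is what leaves room for adding further flags later without changing the behaviour of existing callers.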
Related history:
- https://lkml.org/lkml/2006/12/27/123 is an example of someone
realizing that fexecve() is likely to fail in a chroot environment.
- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered
documenting the /proc requirement of fexecve(3) in its manpage, to
"prevent other people from wasting their time".
- https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a
problem where a process that did setuid() could not fexecve()
because it no longer had access to /proc/self/fd; this has since
been fixed.
This patch (of 2):
Add a new execveat(2) system call. execveat() is to execve() as openat()
is to open(): it takes a file descriptor that refers to a directory, and
resolves the filename relative to that.
In addition, if the filename is empty and AT_EMPTY_PATH is specified,
execveat() executes the file to which the file descriptor refers. This
replicates the functionality of fexecve(), which is a system call in other
UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/<fd>" (and
so relies on /proc being mounted).
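To make the AT_EMPTY_PATH case concrete, a /proc-free fexecve() could look roughly like the sketch below. This is an illustration under assumptions, not glibc's implementation: the helper name fexecve_via_execveat() is hypothetical, and the code assumes the C library headers expose SYS_execveat (it falls back to ENOSYS where they do not).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>        /* AT_EMPTY_PATH */
#include <sys/syscall.h>  /* SYS_execveat, where the headers define it */
#include <unistd.h>       /* syscall() */

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000	/* value from the Linux UAPI headers */
#endif

/*
 * Hypothetical helper: execute the file that fd refers to, without
 * going through /proc/self/fd/<fd>.
 * Like execve(), it returns only on failure: -1 with errno set.
 */
static int fexecve_via_execveat(int fd, char *const argv[], char *const envp[])
{
#ifdef SYS_execveat
	/* Empty pathname + AT_EMPTY_PATH: exec the file behind fd itself. */
	return (int)syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
#else
	(void)fd; (void)argv; (void)envp;
	errno = ENOSYS;
	return -1;
#endif
}
```

A caller would open the binary (without O_CLOEXEC if it may be a script), build argv/envp as for execve(), and treat any return from the helper as an error.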
The filename fed to the executed program as argv[0] (or the name of the
script fed to a script interpreter) will be of the form "/dev/fd/<fd>"
(for an empty filename) or "/dev/fd/<fd>/<filename>", effectively
reflecting how the executable was found. This does however mean that
execution of a script in a /proc-less environment won't work; also, script
execution via an O_CLOEXEC file descriptor fails (as the file will not be
accessible after exec).
Only x86-64, i386 and x32 ABIs are supported in this patch.
Based on patches by Meredydd Luff.
Signed-off-by: David Drysdale <drysdale@google.com>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rich Felker <dalias@aerifal.cx>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86/ia32/audit.c | 1
arch/x86/ia32/ia32entry.S | 1
arch/x86/kernel/audit_64.c | 1
arch/x86/kernel/entry_64.S | 28 +++++++
arch/x86/syscalls/syscall_32.tbl | 1
arch/x86/syscalls/syscall_64.tbl | 2
arch/x86/um/sys_call_table_64.c | 1
fs/binfmt_em86.c | 4 +
fs/binfmt_misc.c | 4 +
fs/binfmt_script.c | 10 ++
fs/exec.c | 110 ++++++++++++++++++++++++----
fs/namei.c | 2
include/linux/binfmts.h | 4 +
include/linux/compat.h | 3
include/linux/fs.h | 1
include/linux/sched.h | 4 +
include/linux/syscalls.h | 4 +
include/uapi/asm-generic/unistd.h | 4 -
kernel/sys_ni.c | 3
lib/audit.c | 3
20 files changed, 176 insertions(+), 15 deletions(-)
diff -puN arch/x86/ia32/audit.c~syscallsx86-implement-execveat-system-call arch/x86/ia32/audit.c
--- a/arch/x86/ia32/audit.c~syscallsx86-implement-execveat-system-call
+++ a/arch/x86/ia32/audit.c
@@ -35,6 +35,7 @@ int ia32_classify_syscall(unsigned sysca
case __NR_socketcall:
return 4;
case __NR_execve:
+ case __NR_execveat:
return 5;
default:
return 1;
diff -puN arch/x86/ia32/ia32entry.S~syscallsx86-implement-execveat-system-call arch/x86/ia32/ia32entry.S
--- a/arch/x86/ia32/ia32entry.S~syscallsx86-implement-execveat-system-call
+++ a/arch/x86/ia32/ia32entry.S
@@ -480,6 +480,7 @@ GLOBAL(\label)
PTREGSCALL stub32_rt_sigreturn, sys32_rt_sigreturn
PTREGSCALL stub32_sigreturn, sys32_sigreturn
PTREGSCALL stub32_execve, compat_sys_execve
+ PTREGSCALL stub32_execveat, compat_sys_execveat
PTREGSCALL stub32_fork, sys_fork
PTREGSCALL stub32_vfork, sys_vfork
diff -puN arch/x86/kernel/audit_64.c~syscallsx86-implement-execveat-system-call arch/x86/kernel/audit_64.c
--- a/arch/x86/kernel/audit_64.c~syscallsx86-implement-execveat-system-call
+++ a/arch/x86/kernel/audit_64.c
@@ -50,6 +50,7 @@ int audit_classify_syscall(int abi, unsi
case __NR_openat:
return 3;
case __NR_execve:
+ case __NR_execveat:
return 5;
default:
return 0;
diff -puN arch/x86/kernel/entry_64.S~syscallsx86-implement-execveat-system-call arch/x86/kernel/entry_64.S
--- a/arch/x86/kernel/entry_64.S~syscallsx86-implement-execveat-system-call
+++ a/arch/x86/kernel/entry_64.S
@@ -652,6 +652,20 @@ ENTRY(stub_execve)
CFI_ENDPROC
END(stub_execve)
+ENTRY(stub_execveat)
+ CFI_STARTPROC
+ addq $8, %rsp
+ PARTIAL_FRAME 0
+ SAVE_REST
+ FIXUP_TOP_OF_STACK %r11
+ call sys_execveat
+ RESTORE_TOP_OF_STACK %r11
+ movq %rax,RAX(%rsp)
+ RESTORE_REST
+ jmp int_ret_from_sys_call
+ CFI_ENDPROC
+END(stub_execveat)
+
/*
* sigreturn is special because it needs to restore all registers on return.
* This cannot be done with SYSRET, so use the IRET return path instead.
@@ -697,6 +711,20 @@ ENTRY(stub_x32_execve)
CFI_ENDPROC
END(stub_x32_execve)
+ENTRY(stub_x32_execveat)
+ CFI_STARTPROC
+ addq $8, %rsp
+ PARTIAL_FRAME 0
+ SAVE_REST
+ FIXUP_TOP_OF_STACK %r11
+ call compat_sys_execveat
+ RESTORE_TOP_OF_STACK %r11
+ movq %rax,RAX(%rsp)
+ RESTORE_REST
+ jmp int_ret_from_sys_call
+ CFI_ENDPROC
+END(stub_x32_execveat)
+
#endif
/*
diff -puN arch/x86/syscalls/syscall_32.tbl~syscallsx86-implement-execveat-system-call arch/x86/syscalls/syscall_32.tbl
--- a/arch/x86/syscalls/syscall_32.tbl~syscallsx86-implement-execveat-system-call
+++ a/arch/x86/syscalls/syscall_32.tbl
@@ -364,3 +364,4 @@
355 i386 getrandom sys_getrandom
356 i386 memfd_create sys_memfd_create
357 i386 bpf sys_bpf
+358 i386 execveat sys_execveat stub32_execveat
diff -puN arch/x86/syscalls/syscall_64.tbl~syscallsx86-implement-execveat-system-call arch/x86/syscalls/syscall_64.tbl
--- a/arch/x86/syscalls/syscall_64.tbl~syscallsx86-implement-execveat-system-call
+++ a/arch/x86/syscalls/syscall_64.tbl
@@ -328,6 +328,7 @@
319 common memfd_create sys_memfd_create
320 common kexec_file_load sys_kexec_file_load
321 common bpf sys_bpf
+322 64 execveat stub_execveat
#
# x32-specific system call numbers start at 512 to avoid cache impact
@@ -366,3 +367,4 @@
542 x32 getsockopt compat_sys_getsockopt
543 x32 io_setup compat_sys_io_setup
544 x32 io_submit compat_sys_io_submit
+545 x32 execveat stub_x32_execveat
diff -puN arch/x86/um/sys_call_table_64.c~syscallsx86-implement-execveat-system-call arch/x86/um/sys_call_table_64.c
--- a/arch/x86/um/sys_call_table_64.c~syscallsx86-implement-execveat-system-call
+++ a/arch/x86/um/sys_call_table_64.c
@@ -31,6 +31,7 @@
#define stub_fork sys_fork
#define stub_vfork sys_vfork
#define stub_execve sys_execve
+#define stub_execveat sys_execveat
#define stub_rt_sigreturn sys_rt_sigreturn
#define __SYSCALL_COMMON(nr, sym, compat) __SYSCALL_64(nr, sym, compat)
diff -puN fs/binfmt_em86.c~syscallsx86-implement-execveat-system-call fs/binfmt_em86.c
--- a/fs/binfmt_em86.c~syscallsx86-implement-execveat-system-call
+++ a/fs/binfmt_em86.c
@@ -42,6 +42,10 @@ static int load_em86(struct linux_binprm
return -ENOEXEC;
}
+ /* Need to be able to load the file after exec */
+ if (bprm->interp_flags & BINPRM_FLAGS_PATH_INACCESSIBLE)
+ return -ENOENT;
+
allow_write_access(bprm->file);
fput(bprm->file);
bprm->file = NULL;
diff -puN fs/binfmt_misc.c~syscallsx86-implement-execveat-system-call fs/binfmt_misc.c
--- a/fs/binfmt_misc.c~syscallsx86-implement-execveat-system-call
+++ a/fs/binfmt_misc.c
@@ -144,6 +144,10 @@ static int load_misc_binary(struct linux
if (!fmt)
goto ret;
+ /* Need to be able to load the file after exec */
+ if (bprm->interp_flags & BINPRM_FLAGS_PATH_INACCESSIBLE)
+ return -ENOENT;
+
if (!(fmt->flags & MISC_FMT_PRESERVE_ARGV0)) {
retval = remove_arg_zero(bprm);
if (retval)
diff -puN fs/binfmt_script.c~syscallsx86-implement-execveat-system-call fs/binfmt_script.c
--- a/fs/binfmt_script.c~syscallsx86-implement-execveat-system-call
+++ a/fs/binfmt_script.c
@@ -24,6 +24,16 @@ static int load_script(struct linux_binp
if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!'))
return -ENOEXEC;
+
+ /*
+ * If the script filename will be inaccessible after exec, typically
+ * because it is a "/dev/fd/<fd>/.." path against an O_CLOEXEC fd, give
+ * up now (on the assumption that the interpreter will want to load
+ * this file).
+ */
+ if (bprm->interp_flags & BINPRM_FLAGS_PATH_INACCESSIBLE)
+ return -ENOENT;
+
/*
* This section does the #! interpretation.
* Sorta complicated, but hopefully it will work. -TYT
diff -puN fs/exec.c~syscallsx86-implement-execveat-system-call fs/exec.c
--- a/fs/exec.c~syscallsx86-implement-execveat-system-call
+++ a/fs/exec.c
@@ -747,18 +747,25 @@ EXPORT_SYMBOL(setup_arg_pages);
#endif /* CONFIG_MMU */
-static struct file *do_open_exec(struct filename *name)
+static struct file *do_open_execat(int fd, struct filename *name, int flags)
{
struct file *file;
int err;
- static const struct open_flags open_exec_flags = {
+ struct open_flags open_exec_flags = {
.open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
.acc_mode = MAY_EXEC | MAY_OPEN,
.intent = LOOKUP_OPEN,
.lookup_flags = LOOKUP_FOLLOW,
};
- file = do_filp_open(AT_FDCWD, name, &open_exec_flags);
+ if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
+ return ERR_PTR(-EINVAL);
+ if (flags & AT_SYMLINK_NOFOLLOW)
+ open_exec_flags.lookup_flags &= ~LOOKUP_FOLLOW;
+ if (flags & AT_EMPTY_PATH)
+ open_exec_flags.lookup_flags |= LOOKUP_EMPTY;
+
+ file = do_filp_open(fd, name, &open_exec_flags);
if (IS_ERR(file))
goto out;
@@ -769,12 +776,13 @@ static struct file *do_open_exec(struct
if (file->f_path.mnt->mnt_flags & MNT_NOEXEC)
goto exit;
- fsnotify_open(file);
-
err = deny_write_access(file);
if (err)
goto exit;
+ if (name->name[0] != '\0')
+ fsnotify_open(file);
+
out:
return file;
@@ -786,7 +794,7 @@ exit:
struct file *open_exec(const char *name)
{
struct filename tmp = { .name = name };
- return do_open_exec(&tmp);
+ return do_open_execat(AT_FDCWD, &tmp, 0);
}
EXPORT_SYMBOL(open_exec);
@@ -1427,10 +1435,12 @@ static int exec_binprm(struct linux_binp
/*
* sys_execve() executes a new program.
*/
-static int do_execve_common(struct filename *filename,
- struct user_arg_ptr argv,
- struct user_arg_ptr envp)
+static int do_execveat_common(int fd, struct filename *filename,
+ struct user_arg_ptr argv,
+ struct user_arg_ptr envp,
+ int flags)
{
+ char *pathbuf = NULL;
struct linux_binprm *bprm;
struct file *file;
struct files_struct *displaced;
@@ -1471,7 +1481,7 @@ static int do_execve_common(struct filen
check_unsafe_exec(bprm);
current->in_execve = 1;
- file = do_open_exec(filename);
+ file = do_open_execat(fd, filename, flags);
retval = PTR_ERR(file);
if (IS_ERR(file))
goto out_unmark;
@@ -1479,7 +1489,26 @@ static int do_execve_common(struct filen
sched_exec();
bprm->file = file;
- bprm->filename = bprm->interp = filename->name;
+ if (fd == AT_FDCWD || filename->name[0] == '/') {
+ bprm->filename = filename->name;
+ } else {
+ if (filename->name[0] == '\0')
+ pathbuf = kasprintf(GFP_TEMPORARY, "/dev/fd/%d", fd);
+ else
+ pathbuf = kasprintf(GFP_TEMPORARY, "/dev/fd/%d/%s",
+ fd, filename->name);
+ if (!pathbuf) {
+ retval = -ENOMEM;
+ goto out_unmark;
+ }
+ /* Record that a name derived from an O_CLOEXEC fd will be
+ * inaccessible after exec. Relies on having exclusive access to
+ * current->files (due to unshare_files above). */
+ if (close_on_exec(fd, current->files->fdt))
+ bprm->interp_flags |= BINPRM_FLAGS_PATH_INACCESSIBLE;
+ bprm->filename = pathbuf;
+ }
+ bprm->interp = bprm->filename;
retval = bprm_mm_init(bprm);
if (retval)
@@ -1537,6 +1566,7 @@ out_unmark:
out_free:
free_bprm(bprm);
+ kfree(pathbuf);
out_files:
if (displaced)
@@ -1552,7 +1582,18 @@ int do_execve(struct filename *filename,
{
struct user_arg_ptr argv = { .ptr.native = __argv };
struct user_arg_ptr envp = { .ptr.native = __envp };
- return do_execve_common(filename, argv, envp);
+ return do_execveat_common(AT_FDCWD, filename, argv, envp, 0);
+}
+
+int do_execveat(int fd, struct filename *filename,
+ const char __user *const __user *__argv,
+ const char __user *const __user *__envp,
+ int flags)
+{
+ struct user_arg_ptr argv = { .ptr.native = __argv };
+ struct user_arg_ptr envp = { .ptr.native = __envp };
+
+ return do_execveat_common(fd, filename, argv, envp, flags);
}
#ifdef CONFIG_COMPAT
@@ -1568,7 +1609,23 @@ static int compat_do_execve(struct filen
.is_compat = true,
.ptr.compat = __envp,
};
- return do_execve_common(filename, argv, envp);
+ return do_execveat_common(AT_FDCWD, filename, argv, envp, 0);
+}
+
+static int compat_do_execveat(int fd, struct filename *filename,
+ const compat_uptr_t __user *__argv,
+ const compat_uptr_t __user *__envp,
+ int flags)
+{
+ struct user_arg_ptr argv = {
+ .is_compat = true,
+ .ptr.compat = __argv,
+ };
+ struct user_arg_ptr envp = {
+ .is_compat = true,
+ .ptr.compat = __envp,
+ };
+ return do_execveat_common(fd, filename, argv, envp, flags);
}
#endif
@@ -1608,6 +1665,20 @@ SYSCALL_DEFINE3(execve,
{
return do_execve(getname(filename), argv, envp);
}
+
+SYSCALL_DEFINE5(execveat,
+ int, fd, const char __user *, filename,
+ const char __user *const __user *, argv,
+ const char __user *const __user *, envp,
+ int, flags)
+{
+ int lookup_flags = (flags & AT_EMPTY_PATH) ? LOOKUP_EMPTY : 0;
+
+ return do_execveat(fd,
+ getname_flags(filename, lookup_flags, NULL),
+ argv, envp, flags);
+}
+
#ifdef CONFIG_COMPAT
COMPAT_SYSCALL_DEFINE3(execve, const char __user *, filename,
const compat_uptr_t __user *, argv,
@@ -1615,4 +1686,17 @@ COMPAT_SYSCALL_DEFINE3(execve, const cha
{
return compat_do_execve(getname(filename), argv, envp);
}
+
+COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
+ const char __user *, filename,
+ const compat_uptr_t __user *, argv,
+ const compat_uptr_t __user *, envp,
+ int, flags)
+{
+ int lookup_flags = (flags & AT_EMPTY_PATH) ? LOOKUP_EMPTY : 0;
+
+ return compat_do_execveat(fd,
+ getname_flags(filename, lookup_flags, NULL),
+ argv, envp, flags);
+}
#endif
diff -puN fs/namei.c~syscallsx86-implement-execveat-system-call fs/namei.c
--- a/fs/namei.c~syscallsx86-implement-execveat-system-call
+++ a/fs/namei.c
@@ -130,7 +130,7 @@ void final_putname(struct filename *name
#define EMBEDDED_NAME_MAX (PATH_MAX - sizeof(struct filename))
-static struct filename *
+struct filename *
getname_flags(const char __user *filename, int flags, int *empty)
{
struct filename *result, *err;
diff -puN include/linux/binfmts.h~syscallsx86-implement-execveat-system-call include/linux/binfmts.h
--- a/include/linux/binfmts.h~syscallsx86-implement-execveat-system-call
+++ a/include/linux/binfmts.h
@@ -53,6 +53,10 @@ struct linux_binprm {
#define BINPRM_FLAGS_EXECFD_BIT 1
#define BINPRM_FLAGS_EXECFD (1 << BINPRM_FLAGS_EXECFD_BIT)
+/* filename of the binary will be inaccessible after exec */
+#define BINPRM_FLAGS_PATH_INACCESSIBLE_BIT 2
+#define BINPRM_FLAGS_PATH_INACCESSIBLE (1 << BINPRM_FLAGS_PATH_INACCESSIBLE_BIT)
+
/* Function parameter for binfmt->coredump */
struct coredump_params {
const siginfo_t *siginfo;
diff -puN include/linux/compat.h~syscallsx86-implement-execveat-system-call include/linux/compat.h
--- a/include/linux/compat.h~syscallsx86-implement-execveat-system-call
+++ a/include/linux/compat.h
@@ -357,6 +357,9 @@ asmlinkage long compat_sys_lseek(unsigne
asmlinkage long compat_sys_execve(const char __user *filename, const compat_uptr_t __user *argv,
const compat_uptr_t __user *envp);
+asmlinkage long compat_sys_execveat(int dfd, const char __user *filename,
+ const compat_uptr_t __user *argv,
+ const compat_uptr_t __user *envp, int flags);
asmlinkage long compat_sys_select(int n, compat_ulong_t __user *inp,
compat_ulong_t __user *outp, compat_ulong_t __user *exp,
diff -puN include/linux/fs.h~syscallsx86-implement-execveat-system-call include/linux/fs.h
--- a/include/linux/fs.h~syscallsx86-implement-execveat-system-call
+++ a/include/linux/fs.h
@@ -2093,6 +2093,7 @@ extern int vfs_open(const struct path *,
extern struct file * dentry_open(const struct path *, int, const struct cred *);
extern int filp_close(struct file *, fl_owner_t id);
+extern struct filename *getname_flags(const char __user *, int, int *);
extern struct filename *getname(const char __user *);
extern struct filename *getname_kernel(const char *);
diff -puN include/linux/sched.h~syscallsx86-implement-execveat-system-call include/linux/sched.h
--- a/include/linux/sched.h~syscallsx86-implement-execveat-system-call
+++ a/include/linux/sched.h
@@ -2441,6 +2441,10 @@ extern void do_group_exit(int);
extern int do_execve(struct filename *,
const char __user * const __user *,
const char __user * const __user *);
+extern int do_execveat(int, struct filename *,
+ const char __user * const __user *,
+ const char __user * const __user *,
+ int);
extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *);
struct task_struct *fork_idle(int);
extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
diff -puN include/linux/syscalls.h~syscallsx86-implement-execveat-system-call include/linux/syscalls.h
--- a/include/linux/syscalls.h~syscallsx86-implement-execveat-system-call
+++ a/include/linux/syscalls.h
@@ -877,4 +877,8 @@ asmlinkage long sys_seccomp(unsigned int
asmlinkage long sys_getrandom(char __user *buf, size_t count,
unsigned int flags);
asmlinkage long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size);
+asmlinkage long sys_execveat(int dfd, const char __user *filename,
+ const char __user *const __user *argv,
+ const char __user *const __user *envp, int flags);
+
#endif
diff -puN include/uapi/asm-generic/unistd.h~syscallsx86-implement-execveat-system-call include/uapi/asm-generic/unistd.h
--- a/include/uapi/asm-generic/unistd.h~syscallsx86-implement-execveat-system-call
+++ a/include/uapi/asm-generic/unistd.h
@@ -707,9 +707,11 @@ __SYSCALL(__NR_getrandom, sys_getrandom)
__SYSCALL(__NR_memfd_create, sys_memfd_create)
#define __NR_bpf 280
__SYSCALL(__NR_bpf, sys_bpf)
+#define __NR_execveat 281
+__SC_COMP(__NR_execveat, sys_execveat, compat_sys_execveat)
#undef __NR_syscalls
-#define __NR_syscalls 281
+#define __NR_syscalls 282
/*
* All syscalls below here should go away really,
diff -puN kernel/sys_ni.c~syscallsx86-implement-execveat-system-call kernel/sys_ni.c
--- a/kernel/sys_ni.c~syscallsx86-implement-execveat-system-call
+++ a/kernel/sys_ni.c
@@ -224,3 +224,6 @@ cond_syscall(sys_seccomp);
/* access BPF programs and maps */
cond_syscall(sys_bpf);
+
+/* execveat */
+cond_syscall(sys_execveat);
diff -puN lib/audit.c~syscallsx86-implement-execveat-system-call lib/audit.c
--- a/lib/audit.c~syscallsx86-implement-execveat-system-call
+++ a/lib/audit.c
@@ -54,6 +54,9 @@ int audit_classify_syscall(int abi, unsi
case __NR_socketcall:
return 4;
#endif
+#ifdef __NR_execveat
+ case __NR_execveat:
+#endif
case __NR_execve:
return 5;
default:
_
Patches currently in -mm which might be from drysdale@google.com are
syscallsx86-implement-execveat-system-call.patch
syscallsx86-implement-execveat-system-call-fix.patch
syscallsx86-add-selftest-for-execveat2.patch
* Re: + syscallsx86-implement-execveat-system-call.patch added to -mm tree
@ 2014-11-14 0:11 Oleg Nesterov
2014-11-14 14:55 ` David Drysdale
0 siblings, 1 reply; 3+ messages in thread
From: Oleg Nesterov @ 2014-11-14 0:11 UTC (permalink / raw)
To: David Drysdale, Andrew Morton
Cc: Meredydd Luff, Shuah Khan, Eric W. Biederman, Andy Lutomirski,
Alexander Viro, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
Kees Cook, Arnd Bergmann, Rich Felker, Christoph Hellwig,
Michael Kerrisk, linux-kernel
> @@ -1479,7 +1489,26 @@ static int do_execve_common(struct filen
>
> bprm->file = file;
> - bprm->filename = bprm->interp = filename->name;
> + if (fd == AT_FDCWD || filename->name[0] == '/') {
> + bprm->filename = filename->name;
> + } else {
> + if (filename->name[0] == '\0')
> + pathbuf = kasprintf(GFP_TEMPORARY, "/dev/fd/%d", fd);
> + else
> + pathbuf = kasprintf(GFP_TEMPORARY, "/dev/fd/%d/%s",
> + fd, filename->name);
> + if (!pathbuf) {
> + retval = -ENOMEM;
> + goto out_unmark;
> + }
> + /* Record that a name derived from an O_CLOEXEC fd will be
> + * inaccessible after exec. Relies on having exclusive access to
> + * current->files (due to unshare_files above). */
> + if (close_on_exec(fd, current->files->fdt))
> + bprm->interp_flags |= BINPRM_FLAGS_PATH_INACCESSIBLE;
> + bprm->filename = pathbuf;
> + }
> + bprm->interp = bprm->filename;
Not sure I understand this patch, will try to read later...
Just one question, don't we leak pathbuf if exec() succeeds?
OTOH, if it fails,
> out_free:
> free_bprm(bprm);
> + kfree(pathbuf);
Is it correct if we fail after bprm_change_interp() was called? It seems
that we can free interp == pathbuf twice?
Oleg.
* Re: + syscallsx86-implement-execveat-system-call.patch added to -mm tree
2014-11-14 0:11 + syscallsx86-implement-execveat-system-call.patch added to -mm tree Oleg Nesterov
@ 2014-11-14 14:55 ` David Drysdale
0 siblings, 0 replies; 3+ messages in thread
From: David Drysdale @ 2014-11-14 14:55 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Meredydd Luff, Shuah Khan, Eric W. Biederman,
Andy Lutomirski, Alexander Viro, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, Kees Cook, Arnd Bergmann, Rich Felker,
Christoph Hellwig, Michael Kerrisk, linux-kernel@vger.kernel.org
On Fri, Nov 14, 2014 at 12:11 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> @@ -1479,7 +1489,26 @@ static int do_execve_common(struct filen
>>
>> bprm->file = file;
>> - bprm->filename = bprm->interp = filename->name;
>> + if (fd == AT_FDCWD || filename->name[0] == '/') {
>> + bprm->filename = filename->name;
>> + } else {
>> + if (filename->name[0] == '\0')
>> + pathbuf = kasprintf(GFP_TEMPORARY, "/dev/fd/%d", fd);
>> + else
>> + pathbuf = kasprintf(GFP_TEMPORARY, "/dev/fd/%d/%s",
>> + fd, filename->name);
>> + if (!pathbuf) {
>> + retval = -ENOMEM;
>> + goto out_unmark;
>> + }
>> + /* Record that a name derived from an O_CLOEXEC fd will be
>> + * inaccessible after exec. Relies on having exclusive access to
>> + * current->files (due to unshare_files above). */
>> + if (close_on_exec(fd, current->files->fdt))
>> + bprm->interp_flags |= BINPRM_FLAGS_PATH_INACCESSIBLE;
>> + bprm->filename = pathbuf;
>> + }
>> + bprm->interp = bprm->filename;
>
> Not sure I understand this patch, will try to read later...
>
> Just one question, don't we leak pathbuf if exec() succeeds?
Doh, yes. I was sure I'd run this through kmemleak too, although
the evidence in front of me now clearly implies I didn't ...
> OTOH, if it fails,
>
>> out_free:
>> free_bprm(bprm);
>> + kfree(pathbuf);
>
> Is it correct if we fail after bprm_change_interp() was called? It seems
> that we can free interp == pathbuf twice?
I think this is OK -- bprm_change_interp() changes bprm->interp to point to a
newly kstrdup'ed string, but leaves bprm->filename as pathbuf. The former
then gets freed in free_bprm() (because it differs from filename == pathbuf),
and pathbuf is freed on the line afterwards.
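To spell that ownership argument out, here is a small userspace model. It is a sketch under the assumption that bprm_change_interp() and free_bprm() behave as described above; struct bprm_model and the model_* functions are made-up names, and malloc()/free() stand in for kstrdup()/kfree().

```c
#include <stdlib.h>
#include <string.h>

/* Just the two pointers that matter for this argument. */
struct bprm_model {
	const char *filename;	/* borrowed: points at pathbuf */
	const char *interp;	/* owned only when != filename */
};

/* Models bprm_change_interp(): interp becomes a fresh private copy. */
static int model_change_interp(struct bprm_model *b, const char *interp)
{
	size_t n = strlen(interp) + 1;
	char *copy = malloc(n);

	if (!copy)
		return -1;
	memcpy(copy, interp, n);

	/* Drop a previous private copy, but never the borrowed filename. */
	if (b->interp != b->filename)
		free((void *)b->interp);
	b->interp = copy;
	return 0;
}

/* Models the interp handling in free_bprm(); returns 1 if it freed interp. */
static int model_free_bprm(struct bprm_model *b)
{
	int freed = 0;

	if (b->interp != b->filename) {
		free((void *)b->interp);
		freed = 1;
	}
	b->interp = NULL;
	return freed;
}
```

In this model the copy made by model_change_interp() is released by model_free_bprm(), and the caller then frees pathbuf exactly once, so neither buffer is freed twice.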
> Oleg.
>