* [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case
@ 2024-10-01 13:49 Tycho Andersen
2024-10-01 13:49 ` [PATCH v3 2/2] selftests/exec: add a test to enforce execveat()'s comm Tycho Andersen
2024-10-01 18:42 ` [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case Aleksa Sarai
0 siblings, 2 replies; 7+ messages in thread
From: Tycho Andersen @ 2024-10-01 13:49 UTC (permalink / raw)
To: Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman,
Kees Cook
Cc: linux-fsdevel, linux-mm, linux-kernel, linux-kselftest,
Tycho Andersen, Tycho Andersen, Zbigniew Jędrzejewski-Szmek,
Aleksa Sarai
From: Tycho Andersen <tandersen@netflix.com>
Zbigniew mentioned at Linux Plumber's that systemd is interested in
switching to execveat() for service execution, but can't, because the
contents of /proc/pid/comm are the file descriptor which was used,
instead of the path to the binary. This makes the output of tools like
top and ps useless, especially in a world where most fds are opened
CLOEXEC so the number is truly meaningless.
Change exec path to fix up /proc/pid/comm in the case where we have
allocated one of these synthetic paths in bprm_init(). This way the actual
exec machinery is unchanged, but cosmetically the comm looks reasonable to
admins investigating things.
Signed-off-by: Tycho Andersen <tandersen@netflix.com>
Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
CC: Aleksa Sarai <cyphar@cyphar.com>
Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
---
v2: * drop the flag, everyone :)
* change the rendered value to f_path.dentry->d_name.name instead of
argv[0], Eric
v3: * fix up subject line, Eric
---
fs/exec.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/exec.c b/fs/exec.c
index dad402d55681..9520359a8dcc 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm)
set_dumpable(current->mm, SUID_DUMP_USER);
perf_event_exec();
- __set_task_comm(me, kbasename(bprm->filename), true);
+
+ /*
+ * If fdpath was set, execveat() made up a path that will
+ * probably not be useful to admins running ps or similar.
+ * Let's fix it up to be something reasonable.
+ */
+ if (bprm->fdpath) {
+ BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
+ __set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);
+ } else {
+ __set_task_comm(me, kbasename(bprm->filename), true);
+ }
/* An exec changes our domain. We are no longer part of the thread
group */
base-commit: baeb9a7d8b60b021d907127509c44507539c15e5
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH v3 2/2] selftests/exec: add a test to enforce execveat()'s comm 2024-10-01 13:49 [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case Tycho Andersen @ 2024-10-01 13:49 ` Tycho Andersen 2024-10-01 15:14 ` Shuah Khan 2024-10-01 18:42 ` [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case Aleksa Sarai 1 sibling, 1 reply; 7+ messages in thread From: Tycho Andersen @ 2024-10-01 13:49 UTC (permalink / raw) To: Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook Cc: linux-fsdevel, linux-mm, linux-kernel, linux-kselftest, Tycho Andersen, Tycho Andersen From: Tycho Andersen <tandersen@netflix.com> We want to ensure that /proc/self/comm stays useful for execveat() callers. Signed-off-by: Tycho Andersen <tandersen@netflix.com> --- tools/testing/selftests/exec/execveat.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/tools/testing/selftests/exec/execveat.c b/tools/testing/selftests/exec/execveat.c index 071e03532cba..091029f4ca9b 100644 --- a/tools/testing/selftests/exec/execveat.c +++ b/tools/testing/selftests/exec/execveat.c @@ -419,6 +419,9 @@ int main(int argc, char **argv) if (argc >= 2) { /* If we are invoked with an argument, don't run tests. */ const char *in_test = getenv("IN_TEST"); + /* TASK_COMM_LEN == 16 */ + char buf[32]; + int fd; if (verbose) { ksft_print_msg("invoked with:\n"); @@ -432,6 +435,28 @@ int main(int argc, char **argv) return 1; } + fd = open("/proc/self/comm", O_RDONLY); + if (fd < 0) { + perror("open comm"); + return 1; + } + + if (read(fd, buf, sizeof(buf)) < 0) { + close(fd); + perror("read comm"); + return 1; + } + close(fd); + + /* + * /proc/self/comm should fail to convert to an integer, i.e. + * atoi() should return 0. + */ + if (atoi(buf) != 0) { + ksft_print_msg("bad /proc/self/comm: %s", buf); + return 1; + } + /* Use the final argument as an exit code. */ rc = atoi(argv[argc - 1]); exit(rc); -- 2.34.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v3 2/2] selftests/exec: add a test to enforce execveat()'s comm 2024-10-01 13:49 ` [PATCH v3 2/2] selftests/exec: add a test to enforce execveat()'s comm Tycho Andersen @ 2024-10-01 15:14 ` Shuah Khan 0 siblings, 0 replies; 7+ messages in thread From: Shuah Khan @ 2024-10-01 15:14 UTC (permalink / raw) To: Tycho Andersen, Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook Cc: linux-fsdevel, linux-mm, linux-kernel, linux-kselftest, Tycho Andersen, Shuah Khan On 10/1/24 07:49, Tycho Andersen wrote: > From: Tycho Andersen <tandersen@netflix.com> > > We want to ensure that /proc/self/comm stays useful for execveat() callers. This commit message is vague? What does staying useful mean? Elaborate on the staying useful and the tests added to ensure. Add test results as well. > > Signed-off-by: Tycho Andersen <tandersen@netflix.com> > --- > tools/testing/selftests/exec/execveat.c | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > diff --git a/tools/testing/selftests/exec/execveat.c b/tools/testing/selftests/exec/execveat.c > index 071e03532cba..091029f4ca9b 100644 > --- a/tools/testing/selftests/exec/execveat.c > +++ b/tools/testing/selftests/exec/execveat.c > @@ -419,6 +419,9 @@ int main(int argc, char **argv) > if (argc >= 2) { > /* If we are invoked with an argument, don't run tests. */ > const char *in_test = getenv("IN_TEST"); > + /* TASK_COMM_LEN == 16 */ > + char buf[32]; > + int fd; > > if (verbose) { > ksft_print_msg("invoked with:\n"); > @@ -432,6 +435,28 @@ int main(int argc, char **argv) > return 1; > } > > + fd = open("/proc/self/comm", O_RDONLY); > + if (fd < 0) { > + perror("open comm"); The existing code in this file uses ksft_perror() - please keep the new code consistent with the existing code. > + return 1; > + } > + > + if (read(fd, buf, sizeof(buf)) < 0) { > + close(fd); > + perror("read comm"); Same comment as above. > + return 1; > + } > + close(fd); > + > + /* > + * /proc/self/comm should fail to convert to an integer, i.e. > + * atoi() should return 0. > + */ > + if (atoi(buf) != 0) { > + ksft_print_msg("bad /proc/self/comm: %s", buf); > + return 1; > + } > + > /* Use the final argument as an exit code. */ > rc = atoi(argv[argc - 1]); > exit(rc); thanks, -- Shuah ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case 2024-10-01 13:49 [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case Tycho Andersen 2024-10-01 13:49 ` [PATCH v3 2/2] selftests/exec: add a test to enforce execveat()'s comm Tycho Andersen @ 2024-10-01 18:42 ` Aleksa Sarai 2024-10-01 18:48 ` Aleksa Sarai 2024-10-02 13:45 ` Zbigniew Jędrzejewski-Szmek 1 sibling, 2 replies; 7+ messages in thread From: Aleksa Sarai @ 2024-10-01 18:42 UTC (permalink / raw) To: Tycho Andersen Cc: Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook, linux-fsdevel, linux-mm, linux-kernel, linux-kselftest, Tycho Andersen, Zbigniew Jędrzejewski-Szmek [-- Attachment #1: Type: text/plain, Size: 2596 bytes --] On 2024-10-01, Tycho Andersen <tycho@tycho.pizza> wrote: > From: Tycho Andersen <tandersen@netflix.com> > > Zbigniew mentioned at Linux Plumber's that systemd is interested in > switching to execveat() for service execution, but can't, because the > contents of /proc/pid/comm are the file descriptor which was used, > instead of the path to the binary. This makes the output of tools like > top and ps useless, especially in a world where most fds are opened > CLOEXEC so the number is truly meaningless. > > Change exec path to fix up /proc/pid/comm in the case where we have > allocated one of these synthetic paths in bprm_init(). This way the actual > exec machinery is unchanged, but cosmetically the comm looks reasonable to > admins investigating things. While I still think the argv[0] solution was semantically nicer, it seems this is enough to fix the systemd problem for most cases and so we can revisit the argv[0] discussion in another 10 years. :D Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> > Signed-off-by: Tycho Andersen <tandersen@netflix.com> > Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> > CC: Aleksa Sarai <cyphar@cyphar.com> > Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec > --- > v2: * drop the flag, everyone :) > * change the rendered value to f_path.dentry->d_name.name instead of > argv[0], Eric > v3: * fix up subject line, Eric > --- > fs/exec.c | 13 ++++++++++++- > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/fs/exec.c b/fs/exec.c > index dad402d55681..9520359a8dcc 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm) > set_dumpable(current->mm, SUID_DUMP_USER); > > perf_event_exec(); > - __set_task_comm(me, kbasename(bprm->filename), true); > + > + /* > + * If fdpath was set, execveat() made up a path that will > + * probably not be useful to admins running ps or similar. > + * Let's fix it up to be something reasonable. > + */ > + if (bprm->fdpath) { > + BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN); > + __set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true); > + } else { > + __set_task_comm(me, kbasename(bprm->filename), true); > + } > > /* An exec changes our domain. We are no longer part of the thread > group */ > > base-commit: baeb9a7d8b60b021d907127509c44507539c15e5 > -- > 2.34.1 > -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/> [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case 2024-10-01 18:42 ` [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case Aleksa Sarai @ 2024-10-01 18:48 ` Aleksa Sarai 2024-10-02 13:45 ` Zbigniew Jędrzejewski-Szmek 1 sibling, 0 replies; 7+ messages in thread From: Aleksa Sarai @ 2024-10-01 18:48 UTC (permalink / raw) To: Tycho Andersen Cc: Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook, linux-fsdevel, linux-mm, linux-kernel, linux-kselftest, Tycho Andersen, Zbigniew Jędrzejewski-Szmek [-- Attachment #1: Type: text/plain, Size: 3101 bytes --] On 2024-10-01, Aleksa Sarai <cyphar@cyphar.com> wrote: > On 2024-10-01, Tycho Andersen <tycho@tycho.pizza> wrote: > > From: Tycho Andersen <tandersen@netflix.com> > > > > Zbigniew mentioned at Linux Plumber's that systemd is interested in > > switching to execveat() for service execution, but can't, because the > > contents of /proc/pid/comm are the file descriptor which was used, > > instead of the path to the binary. This makes the output of tools like > > top and ps useless, especially in a world where most fds are opened > > CLOEXEC so the number is truly meaningless. > > > > Change exec path to fix up /proc/pid/comm in the case where we have > > allocated one of these synthetic paths in bprm_init(). This way the actual > > exec machinery is unchanged, but cosmetically the comm looks reasonable to > > admins investigating things. > > While I still think the argv[0] solution was semantically nicer, it > seems this is enough to fix the systemd problem for most cases and so we > can revisit the argv[0] discussion in another 10 years. :D Of course, this assumes the busybox problem I mentioned really is not an issue. But at least this option is "less wrong" than using the fd number. I suspect we will eventually need the argv[0] thing. > Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> > > > Signed-off-by: Tycho Andersen <tandersen@netflix.com> > > Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> > > CC: Aleksa Sarai <cyphar@cyphar.com> > > Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec > > --- > > v2: * drop the flag, everyone :) > > * change the rendered value to f_path.dentry->d_name.name instead of > > argv[0], Eric > > v3: * fix up subject line, Eric > > --- > > fs/exec.c | 13 ++++++++++++- > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/fs/exec.c b/fs/exec.c > > index dad402d55681..9520359a8dcc 100644 > > --- a/fs/exec.c > > +++ b/fs/exec.c > > @@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm) > > set_dumpable(current->mm, SUID_DUMP_USER); > > > > perf_event_exec(); > > - __set_task_comm(me, kbasename(bprm->filename), true); > > + > > + /* > > + * If fdpath was set, execveat() made up a path that will > > + * probably not be useful to admins running ps or similar. > > + * Let's fix it up to be something reasonable. > > + */ > > + if (bprm->fdpath) { > > + BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN); > > + __set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true); > > + } else { > > + __set_task_comm(me, kbasename(bprm->filename), true); > > + } > > > > /* An exec changes our domain. We are no longer part of the thread > > group */ > > > > base-commit: baeb9a7d8b60b021d907127509c44507539c15e5 > > -- > > 2.34.1 > > > > -- > Aleksa Sarai > Senior Software Engineer (Containers) > SUSE Linux GmbH > <https://www.cyphar.com/> -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/> [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case 2024-10-01 18:42 ` [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case Aleksa Sarai 2024-10-01 18:48 ` Aleksa Sarai @ 2024-10-02 13:45 ` Zbigniew Jędrzejewski-Szmek 2024-10-02 14:09 ` Tycho Andersen 1 sibling, 1 reply; 7+ messages in thread From: Zbigniew Jędrzejewski-Szmek @ 2024-10-02 13:45 UTC (permalink / raw) To: Tycho Andersen Cc: Aleksa Sarai, Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook, linux-fsdevel, linux-mm, linux-kernel, linux-kselftest, Tycho Andersen On Tue, Oct 01, 2024 at 08:42:56PM +0200, Aleksa Sarai wrote: > On 2024-10-01, Tycho Andersen <tycho@tycho.pizza> wrote: > > From: Tycho Andersen <tandersen@netflix.com> > > > > Zbigniew mentioned at Linux Plumber's that systemd is interested in > > switching to execveat() for service execution, but can't, because the > > contents of /proc/pid/comm are the file descriptor which was used, > > instead of the path to the binary. This makes the output of tools like > > top and ps useless, especially in a world where most fds are opened > > CLOEXEC so the number is truly meaningless. > > > > Change exec path to fix up /proc/pid/comm in the case where we have > > allocated one of these synthetic paths in bprm_init(). This way the actual > > exec machinery is unchanged, but cosmetically the comm looks reasonable to > > admins investigating things. > > While I still think the argv[0] solution was semantically nicer, it > seems this is enough to fix the systemd problem for most cases and so we > can revisit the argv[0] discussion in another 10 years. :D Hi Tycho and everyone else, First, thank you so much for picking this up! Second, sorry for being late with a reply… Third, I tested the kernel with the patch and with systemd with the fexecve option enabled, and it all works as expected. Unfortunately, I don't think that the approach with f_path.dentry->d_name.name can be used :( As discussed previously, there are various places where symlinks are used. Alternatives and busybox were raised. Some additional examples: systemd (e.g. /usr/lib/systemd/systemd-udevd symlinks to /usr/bin/udevadm), yum-builddep, yum-config-manager, yumdownloader, yum-groups-manager all symlink to a multicall dnf-utils binary, and so on. The question is whether "pgrep" and similar tools not getting the expected name is a problem, and I think that the answer is, unfortunately, "yes". Users will notice this. As a distro maintainer, I would be _very_ wary of flipping this on in systemd, because there certainly are scripts and other tools that use that logic to check if things are running on the system. systemd uses cgroups and doesn't care about COMM at all, and I expect that many other modern tools won't either, but we have to take into account the long tail of local admin scripts and older tools. To avoid regressions and complaints, we really want an API that replaces the current execve invocations with an fd-based approach but doesn't change how things otherwise look. Arguably, the current patch would work great for 99% of cases, but that's not enough. (In particular, with rust being used more often for low level tools, and the binaries being large because of the static linking, I expect multicall binaries to become even more common. E.g. uutils-coreutils that might become the default coreutils implementation in a few years. It uses a multi-call binary and symlinks. Having 'coreutils' instead of 'sleep' as COMM is not good.) Please consider going back to the approach with argv[0]. Zbyszek > > v2: * drop the flag, everyone :) > > * change the rendered value to f_path.dentry->d_name.name instead of > > argv[0], Eric > > + __set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true); ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case 2024-10-02 13:45 ` Zbigniew Jędrzejewski-Szmek @ 2024-10-02 14:09 ` Tycho Andersen 0 siblings, 0 replies; 7+ messages in thread From: Tycho Andersen @ 2024-10-02 14:09 UTC (permalink / raw) To: Zbigniew Jędrzejewski-Szmek Cc: Aleksa Sarai, Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook, linux-fsdevel, linux-mm, linux-kernel, linux-kselftest, Tycho Andersen On Wed, Oct 02, 2024 at 01:45:15PM +0000, Zbigniew Jędrzejewski-Szmek wrote: > On Tue, Oct 01, 2024 at 08:42:56PM +0200, Aleksa Sarai wrote: > > On 2024-10-01, Tycho Andersen <tycho@tycho.pizza> wrote: > > > From: Tycho Andersen <tandersen@netflix.com> > > > > > > Zbigniew mentioned at Linux Plumber's that systemd is interested in > > > switching to execveat() for service execution, but can't, because the > > > contents of /proc/pid/comm are the file descriptor which was used, > > > instead of the path to the binary. This makes the output of tools like > > > top and ps useless, especially in a world where most fds are opened > > > CLOEXEC so the number is truly meaningless. > > > > > > Change exec path to fix up /proc/pid/comm in the case where we have > > > allocated one of these synthetic paths in bprm_init(). This way the actual > > > exec machinery is unchanged, but cosmetically the comm looks reasonable to > > > admins investigating things. > > > > While I still think the argv[0] solution was semantically nicer, it > > seems this is enough to fix the systemd problem for most cases and so we > > can revisit the argv[0] discussion in another 10 years. :D > ... > Unfortunately, I don't think that the approach with > f_path.dentry->d_name.name can be used :( hmm. Somehow earlier I had managed to convince myself that this gives the right answer for symlinks too (instead of the original kbasename(__d_path(file->f_path, root, buf, buflen)), but now upon retesting it doesn't. So I agree, seems like the argv[0] hack is needed unfortunately. Tycho ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-10-02 14:09 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-10-01 13:49 [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case Tycho Andersen 2024-10-01 13:49 ` [PATCH v3 2/2] selftests/exec: add a test to enforce execveat()'s comm Tycho Andersen 2024-10-01 15:14 ` Shuah Khan 2024-10-01 18:42 ` [PATCH v3 1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case Aleksa Sarai 2024-10-01 18:48 ` Aleksa Sarai 2024-10-02 13:45 ` Zbigniew Jędrzejewski-Szmek 2024-10-02 14:09 ` Tycho Andersen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).