All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kees Cook <kees@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Zbigniew Jędrzejewski-Szmek" <zbyszek@in.waw.pl>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	linux-kernel@vger.kernel.org,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Christophe JAILLET" <christophe.jaillet@wanadoo.fr>,
	"Dan Carpenter" <dan.carpenter@linaro.org>,
	"Nir Lichtman" <nir@lichtman.org>,
	syzbot+03e1af5c332f7e0eb84b@syzkaller.appspotmail.com,
	"Tycho Andersen" <tandersen@netflix.com>,
	"Vegard Nossum" <vegard.nossum@oracle.com>
Subject: Re: [GIT PULL] execve updates for v6.13-rc1
Date: Thu, 21 Nov 2024 14:06:00 -0800	[thread overview]
Message-ID: <202411211302.08EEE6D395@keescook> (raw)
In-Reply-To: <CAHk-=wgfX4dvvKo8PrPZj76Z2ULMMK2RvaF+O7QhLnwOSBYdhQ@mail.gmail.com>

On Thu, Nov 21, 2024 at 11:23:46AM -0800, Linus Torvalds wrote:
> I'm done with this discussion that apparently was brought on by people
> not knowing what the hell they were doing.

This is disrespectful. If you're frustrated you can just say so. I'm
certainly frustrated.

> For user space, comm[] is basically the fallback for when cmdline
> fails for some reason (for example, /proc/*/cmdline will be empty for
> kworkers, but there are other situations too)
> 
> The reason? comm[] has *always* been much too limited for 'ps' output. ALWAYS.
> 
> Yes, you can literally *force* ps to not do that (eg "ps -eo comm")
> but if you do that, you get the very limited comm[] output that nobody
> has ever wanted ps to give exactly because it's so limited.

I think I finally figured out why you keep saying this. I think you mean
to imply "ps -e" (or similar), not "ps". Asking for more process details
("ps a", "ps -f", "ps -e", etc) uses cmdline. Without options that turn
on more details, the default is comm. If you run just "ps", it shows comm:

$ strace ps 2>&1 | grep /cmdline | wc -l
0

If you run with detail options it shows cmdline:

$ strace ps a 2>&1 | grep /cmdline | wc -l
1266
$ strace ps -f 2>&1 | grep /cmdline | wc -l
1266

This is procps-ng found on all Debian and Ubuntu systems. AFAICT
procps-ng is used on Fedora too.

Note I'm not saying comm is GOOD or anything. I'm just saying that it
IS regularly visible.

> And yes, 'top' will give comm[] output because it's so much faster.

Exactly. By default, top and ps both show comm, which in the vast
majority of cases is identical to argv[0]. I don't understand why this
is causing such angst -- it's just the observable facts: many things in
userspace expose comm and many also expose cmdline. Having them be
mismatched due to fexecve() is the whole issue.

Nobody blinks at:

    $ ps
        PID TTY          TIME CMD
    4125309 pts/1    00:00:47 bash
    4171960 pts/1    00:00:00 bash
    4171962 pts/1    00:00:00 vim
    4171997 pts/1    00:00:00 ps

vs

    $ ps -f
    UID          PID    PPID  C STIME TTY          TIME CMD
    kees     4125309    3947  0 Jul11 pts/1    00:00:47 -bash
    kees     4171960 4125309  0 13:30 pts/1    00:00:00 -bash
    kees     4171962 4171960  0 13:30 pts/1    00:00:00 vim
    kees     4172004 4125309  0 13:30 pts/1    00:00:00 ps -f

But if fexecve() were used now, "ps" would show:

    $ ps
        PID TTY          TIME CMD
    4125309 pts/1    00:00:47 3
    4171960 pts/1    00:00:00 3
    4171962 pts/1    00:00:00 3
    4171997 pts/1    00:00:00 3

This is obviously horrible.

Using f_path, we'd get close, but symlink destinations (or weird stuff
like "memfd:name-here") are shown:

    $ realpath $(which vim)
    /usr/bin/vim.basic

    $ ps
        PID TTY          TIME CMD
    4125309 pts/1    00:00:47 bash
    4171960 pts/1    00:00:00 bash
    4171962 pts/1    00:00:00 vim.basic
    4171997 pts/1    00:00:00 ps

Using argv[0], we'd get the original output:

    $ ps
        PID TTY          TIME CMD
    4125309 pts/1    00:00:47 bash
    4171960 pts/1    00:00:00 bash
    4171962 pts/1    00:00:00 vim
    4171997 pts/1    00:00:00 ps

IMO, switching to fexecve() shouldn't regress this basic piece of
information.

Now, I think we have three choices:

1) Leave it as-is. (comm is useless)

2) Use argv[0]. (comm matches what would show with execve() in most cases)

3) Use f_path (comm exposes f_path dentry name, which matches
   basename(readlink(/proc/*/exe)), but doesn't always match what execve()
   would show).

I think everyone agrees "1" should go away.

So it comes down to trying to stay looking more like execve()'s comm, or
more like /proc/*/exe's value.

Since comm is mutable anyway, I feel like the "friendlier" default for
userspace would be option 2.

If you still conclude differently, I guess the discussion is over, and
we go with 3:

diff --git a/fs/exec.c b/fs/exec.c
index e0435b31a811..8688bbbaf4af 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1347,7 +1347,21 @@ int begin_new_exec(struct linux_binprm * bprm)
 		set_dumpable(current->mm, SUID_DUMP_USER);
 
 	perf_event_exec();
-	__set_task_comm(me, kbasename(bprm->filename), true);
+
+	/*
+	 * If fdpath was set, alloc_bprm() made up a path that will
+	 * probably not be useful to admins running ps or similar.
+	 * Let's fix it up to be something reasonable.
+	 */
+	if (bprm->fdpath) {
+		rcu_read_lock();
+		__set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name),
+				true);
+		rcu_read_unlock();
+	}
+	else {
+		__set_task_comm(me, kbasename(bprm->filename), true);
+	}
 
 	/* An exec changes our domain. We are no longer part of the thread
 	   group */


I've minimally tested this.

-Kees

-- 
Kees Cook

  parent reply	other threads:[~2024-11-21 22:06 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-19 17:00 [GIT PULL] execve updates for v6.13-rc1 Kees Cook
2024-11-20 22:32 ` Linus Torvalds
2024-11-20 22:33   ` Linus Torvalds
2024-11-20 22:50     ` Linus Torvalds
2024-11-21  2:36       ` Al Viro
2024-11-21  2:41         ` Al Viro
2024-11-21  2:51         ` Linus Torvalds
2024-11-21  0:54   ` Eric W. Biederman
2024-11-21  2:23     ` Linus Torvalds
2024-11-21  2:29       ` Kees Cook
2024-11-21  2:45       ` Linus Torvalds
2024-11-21 17:22       ` Zbigniew Jędrzejewski-Szmek
2024-11-21 17:28         ` Linus Torvalds
2024-11-21 17:47           ` Linus Torvalds
2024-11-21 18:00           ` Zbigniew Jędrzejewski-Szmek
2024-11-21 18:02             ` Linus Torvalds
2024-11-21 18:47               ` Zbigniew Jędrzejewski-Szmek
2024-11-21 18:50               ` Kees Cook
2024-11-21 19:23                 ` Linus Torvalds
2024-11-21 21:31                   ` Zbigniew Jędrzejewski-Szmek
2024-11-21 21:48                     ` Linus Torvalds
2024-11-21 22:24                       ` Zbigniew Jędrzejewski-Szmek
2024-11-21 22:06                   ` Kees Cook [this message]
2024-11-21 22:38                     ` Linus Torvalds
     [not found]         ` <87zflrsw1c.fsf@email.froward.int.ebiederm.org>
2024-11-22  7:47           ` Zbigniew Jędrzejewski-Szmek
2024-11-22 10:21             ` Harald Arnesen
     [not found]             ` <87frnjqqh6.fsf@email.froward.int.ebiederm.org>
2024-11-24 15:21               ` Zbigniew Jędrzejewski-Szmek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202411211302.08EEE6D395@keescook \
    --to=kees@kernel.org \
    --cc=christophe.jaillet@wanadoo.fr \
    --cc=dan.carpenter@linaro.org \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nir@lichtman.org \
    --cc=syzbot+03e1af5c332f7e0eb84b@syzkaller.appspotmail.com \
    --cc=tandersen@netflix.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vegard.nossum@oracle.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=zbyszek@in.waw.pl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.