From: Andrew Morton <akpm@linux-foundation.org>
To: Denys Vlasenko <vda.linux@googlemail.com>
Cc: linux-kernel@vger.kernel.org, vapier@gentoo.org
Subject: Re: [PATCH] allow execve'ing "/proc/self/exe" even if /proc is not mounted
Date: Wed, 24 Jun 2009 16:21:25 -0700
Message-ID: <20090624162125.a3a9b2c4.akpm@linux-foundation.org>
In-Reply-To: <1158166a0906241600w5f7f4ffcm49d9c849f0c27f72@mail.gmail.com>
On Thu, 25 Jun 2009 01:00:56 +0200
Denys Vlasenko <vda.linux@googlemail.com> wrote:
> In some circumstances a running process needs to re-execute
> its image.
>
> Among other useful cases, it is _crucial_ for NOMMU arches.
>
> They need it to perform daemonization. The classic sequence
> of "fork, parent dies, child continues" can't be used
> due to the lack of fork on NOMMU; instead we have to do
> "vfork, child re-execs itself (with a flag to not daemonize)
> and therefore unblocks the parent, parent dies".
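The vfork-and-re-exec sequence quoted above might look roughly like this in userspace. This is only a minimal sketch, not code from the patch: the flag name `--no-daemonize` and the helper names `is_reexeced_child()` and `daemonize_by_reexec()` are hypothetical, and the child would still need its own setsid() and similar steps after the re-exec.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical flag: tells the re-executed image it is already the child. */
#define NO_DAEMONIZE_FLAG "--no-daemonize"

/* Returns 1 if argv carries the flag, i.e. we are the re-executed child. */
static int is_reexeced_child(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
        if (strcmp(argv[i], NO_DAEMONIZE_FLAG) == 0)
            return 1;
    return 0;
}

/*
 * "vfork, child re-execs itself, parent dies".
 * Returns 0 in the re-executed child, -1 on failure.
 * The original parent does not return from this function on success.
 */
static int daemonize_by_reexec(int argc, char **argv)
{
    if (is_reexeced_child(argc, argv))
        return 0;

    /* Build the child's argv before vfork: original args + flag + NULL. */
    char **new_argv = calloc((size_t)argc + 2, sizeof(*new_argv));
    if (!new_argv)
        return -1;
    memcpy(new_argv, argv, (size_t)argc * sizeof(*argv));
    new_argv[argc] = (char *)NO_DAEMONIZE_FLAG;
    new_argv[argc + 1] = NULL;

    pid_t pid = vfork();
    if (pid < 0) {
        free(new_argv);
        return -1;
    }
    if (pid == 0) {
        /* Child: only exec or _exit is safe after vfork. */
        execv("/proc/self/exe", new_argv);
        _exit(127); /* exec failed, e.g. because /proc is not mounted */
    }

    /* Parent: resumes once the child has exec'ed or exited, then dies. */
    _exit(0);
}
```

A program would call `daemonize_by_reexec(argc, argv)` early in `main()`. The `execv()` call is exactly the step that fails with ENOENT when /proc is not mounted.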
>
> Another crucial use case on NOMMU is POSIX shell support.
> Imagine a shell command of the form "func1 | func2 | func3".
> This can be implemented on NOMMU by vforking thrice,
> re-executing the shell in every child in the form
> "<shell> -c 'body of funcN'", and letting the parent wait and collect
> exit codes and such. As far as I can see, it's the only way
> to implement it correctly on NOMMU.
>
> The program may re-execute itself by name if it knows the name,
> but in general it cannot be sure of it: the binary may be renamed,
> or even deleted, while it is running.
>
> A more elegant way is to execute /proc/self/exe.
> This works just fine as long as /proc is mounted.
>
> But it breaks if /proc isn't mounted, and this can happen in real-world
> usage. For example, when a shell is invoked very early in initrd/initramfs.
Why can't userspace mount /proc before doing the daemonization?
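A rough sketch of that userspace alternative, for reference. It assumes the caller has the privilege to mount filesystems and that a /proc directory can be created; the helper name `ensure_proc_mounted()` is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Make sure procfs is available at /proc before any re-exec.
 * Returns 0 if /proc/self is reachable (already mounted, or mounted here),
 * -1 with errno set otherwise (for example, when not running as root).
 */
static int ensure_proc_mounted(void)
{
    if (access("/proc/self", F_OK) == 0)
        return 0; /* already mounted */

    /* The mount point may be missing in a bare chroot or initramfs. */
    if (mkdir("/proc", 0555) != 0 && errno != EEXIST)
        return -1;

    /* Equivalent of: mount -t proc proc /proc */
    if (mount("proc", "/proc", "proc", 0, NULL) != 0)
        return -1;

    return access("/proc/self", F_OK) == 0 ? 0 : -1;
}
```

Calling this before the vfork would make execve of /proc/self/exe work without any kernel change, provided mounting is permitted in that environment.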
> With this patch, it is possible to execute /proc/self/exe
> even if /proc is not mounted. In the below example,
> ./sh is a static shell binary:
>
> # chroot . ./sh
> / # echo $0
> ./sh
> / # . /proc/self/exe
> hush: /proc/self/exe: No such file or directory
> / # /proc/self/exe <==========
> / # echo $0
> /proc/self/exe
> / # exit
> / # exit
> #
>
> On an unpatched kernel, the command marked with <=== would fail.
>
> How the patch does it: when the execve syscall discovers that opening the
> binary image fails, a small bit of added code special-cases the
> "/proc/self/exe" string. If the binary name is *exactly* that string, and
> the error is ENOENT or EACCES, then exec will still succeed, using the
> current binary's image.
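The condition described above can be modeled in userspace as a small predicate. This is only an illustrative sketch of the check, not the kernel code itself; the function name `wants_current_exe_fallback()` is hypothetical, and `open_err` stands for the negative errno that the patch reads via PTR_ERR().

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/*
 * Userspace model of the patch's special case: fall back to the current
 * binary's image only when the open failed with ENOENT or EACCES and the
 * requested name is exactly "/proc/self/exe".
 */
static bool wants_current_exe_fallback(long open_err, const char *name)
{
    if (open_err != -ENOENT && open_err != -EACCES)
        return false;
    return strcmp(name, "/proc/self/exe") == 0;
}
```

Because the comparison is an exact string match, equivalent spellings such as "/proc/self/exe/" or "/proc//self/exe" do not trigger the fallback.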
>
> Please apply.
>
>
> diff -urp ../linux-2.6.30.org/fs/exec.c linux-2.6.30/fs/exec.c
> --- ../linux-2.6.30.org/fs/exec.c 2009-06-10 05:05:27.000000000 +0200
> +++ linux-2.6.30/fs/exec.c 2009-06-25 00:20:13.000000000 +0200
> @@ -652,9 +652,25 @@ struct file *open_exec(const char *name)
>          file = do_filp_open(AT_FDCWD, name,
>                                  O_LARGEFILE | O_RDONLY | FMODE_EXEC, 0,
>                                  MAY_EXEC | MAY_OPEN);
> -        if (IS_ERR(file))
> -                goto out;
> +        if (IS_ERR(file)) {
> +                if ((PTR_ERR(file) == -ENOENT || PTR_ERR(file) == -EACCES)
> +                 && strcmp(name, "/proc/self/exe") == 0
> +                ) {
> +                        struct file *sv = file;
> +                        struct mm_struct *mm;
>
> +                        mm = get_task_mm(current);
> +                        if (!mm)
> +                                goto out;
> +                        file = get_mm_exe_file(mm);
> +                        mmput(mm);
> +                        if (file)
> +                                goto ok;
> +                        file = sv;
> +                }
> +                goto out;
> +        }
> +ok:
>          err = -EACCES;
>          if (!S_ISREG(file->f_path.dentry->d_inode->i_mode))
>                  goto exit;

Oh geeze.  Hard-coded "/proc/self/exe" in the middle of the core exec
code?  You're a brave man.

Relatively minor observations:

- The code layout is weird

- This hack should be hidden in a separate function, not splattered
  all over the middle of open_exec().

- That function should be documented in a way which will permit
  readers to understand why it exists.

But don't do any of that yet.  This will be an unpopular patch and I
fear for its future ;)

Thread overview: 16+ messages
2009-06-24 23:00 [PATCH] allow execve'ing "/proc/self/exe" even if /proc is not mounted Denys Vlasenko
2009-06-24 23:21 ` Andrew Morton [this message]
2009-06-24 23:49 ` Denys Vlasenko
2009-06-24 14:51 ` Pavel Machek
2009-06-25 0:26 ` Andrew Morton
2009-06-26 8:06 ` Florian Weimer
2009-06-24 23:58 ` Al Viro
2009-06-25 0:07 ` Mike Frysinger
2009-06-26 23:18 ` Denys Vlasenko
2009-06-25 8:10 ` Alan Cox
2009-06-26 8:00 ` Denys Vlasenko
2009-06-26 13:26 ` Mike Frysinger
2009-06-26 22:55 ` Denys Vlasenko
2009-06-28 19:31 ` Mike Frysinger
2009-06-25 18:02 ` Eric W. Biederman
2009-06-25 18:16 ` Mike Frysinger