From: Andrew Morton <akpm@linux-foundation.org>
To: Denys Vlasenko <vda.linux@googlemail.com>
Cc: linux-kernel@vger.kernel.org, vapier@gentoo.org
Subject: Re: [PATCH] allow execve'ing "/proc/self/exe" even if /proc is not mounted
Date: Wed, 24 Jun 2009 16:21:25 -0700
Message-ID: <20090624162125.a3a9b2c4.akpm@linux-foundation.org>
In-Reply-To: <1158166a0906241600w5f7f4ffcm49d9c849f0c27f72@mail.gmail.com>

On Thu, 25 Jun 2009 01:00:56 +0200
Denys Vlasenko <vda.linux@googlemail.com> wrote:

> In some circumstances a running process needs to re-execute
> its image.
> 
> Among other useful cases, it is _crucial_ for NOMMU arches.
> 
> They need it to perform daemonization. The classic sequence
> of "fork, parent dies, child continues" can't be used
> due to the lack of fork on NOMMU, and instead we have to do
> "vfork, child re-execs itself (with a flag to not daemonize)
> and therefore unblocks parent, parent dies".
> 
> Another crucial use case on NOMMU is POSIX shell support.
> Imagine a shell command of the form "func1 | func2 | func3".
> This can be implemented on NOMMU by vforking thrice,
> re-executing the shell in every child in the form
> "<shell> -c 'body of funcN'", and letting parent wait and collect
> exitcodes and such. As far as I can see, it's the only way
> to implement it correctly on NOMMU.
> 
> The program may re-execute itself by name if it knows the name,
> but in general we cannot be sure of it. The binary may be renamed,
> or even deleted, while it is running.
> 
> A more elegant way is to execute /proc/self/exe.
> This works just fine as long as /proc is mounted.
> 
> But it breaks if /proc isn't mounted, and this can happen in real-world
> usage. For example, when a shell is invoked very early in initrd/initramfs.

Why can't userspace mount /proc before doing the daemonization?
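
For illustration only, a minimal userspace sketch of that alternative:
check whether /proc/self/exe is reachable and, if not, try to mount
procfs before re-executing. The helper name ensure_proc_mounted() is
hypothetical and appears nowhere in the patch; mount(2) needs
CAP_SYS_ADMIN, so this assumes a privileged caller such as an early
initramfs shell.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): make sure /proc/self/exe can
 * be used before re-executing. Returns 0 if the link is reachable, -1
 * otherwise (errno is left as set by the failing call).
 */
int ensure_proc_mounted(void)
{
	/* Fast path: /proc is already mounted. */
	if (access("/proc/self/exe", F_OK) == 0)
		return 0;

	/* The mount point may be missing in a bare initramfs; EEXIST is fine. */
	if (mkdir("/proc", 0555) != 0 && errno != EEXIST)
		return -1;

	/* Requires CAP_SYS_ADMIN; unprivileged callers get EPERM. */
	if (mount("proc", "/proc", "proc", 0, NULL) != 0)
		return -1;

	/* Re-check so that 0 always means the link is really there. */
	return access("/proc/self/exe", F_OK) == 0 ? 0 : -1;
}
```

A caller would run this once at startup and, on success, pass
"/proc/self/exe" to execv() in the vfork child. It does not help an
unprivileged process, since the mount() call fails there.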

> With this patch, it is possible to execute /proc/self/exe
> even if /proc is not mounted. In the below example,
> ./sh is a static shell binary:
> 
> # chroot . ./sh
> / # echo $0
> ./sh
> / # . /proc/self/exe
> hush: /proc/self/exe: No such file or directory
> / # /proc/self/exe   <==========
> / # echo $0
> /proc/self/exe
> / # exit
> / # exit
> #
> 
> On an unpatched kernel, the command marked with <=== would fail.
> 
> How the patch does it: when the execve syscall finds that opening the binary
> image fails, a small bit of added code special-cases the "/proc/self/exe"
> string. If the binary name is *exactly* that string, and if the error is ENOENT
> or EACCES, then exec will still succeed, using the current binary's image.
> 
> Please apply.
> 
>
> diff -urp ../linux-2.6.30.org/fs/exec.c linux-2.6.30/fs/exec.c
> --- ../linux-2.6.30.org/fs/exec.c	2009-06-10 05:05:27.000000000 +0200
> +++ linux-2.6.30/fs/exec.c	2009-06-25 00:20:13.000000000 +0200
> @@ -652,9 +652,25 @@ struct file *open_exec(const char *name)
>  	file = do_filp_open(AT_FDCWD, name,
>  				O_LARGEFILE | O_RDONLY | FMODE_EXEC, 0,
>  				MAY_EXEC | MAY_OPEN);
> -	if (IS_ERR(file))
> -		goto out;
> +	if (IS_ERR(file)) {
> +		if ((PTR_ERR(file) == -ENOENT || PTR_ERR(file) == -EACCES)
> +		 && strcmp(name, "/proc/self/exe") == 0
> +		) {
> +			struct file *sv = file;
> +			struct mm_struct *mm;
>  
> +			mm = get_task_mm(current);
> +			if (!mm)
> +				goto out;
> +			file = get_mm_exe_file(mm);
> +			mmput(mm);
> +			if (file)
> +				goto ok;
> +			file = sv;
> +		}
> +		goto out;
> +	}
> +ok:
>  	err = -EACCES;
>  	if (!S_ISREG(file->f_path.dentry->d_inode->i_mode))
>  		goto exit;

Oh geeze.  Hard-coded "/proc/self/exe" in the middle of the core exec
code?  You're a brave man.

Relatively minor observations:

- The code layout is weird

- This hack should be hidden in a separate function, not splattered
  all over the middle of open_exec().

- That function should be documented in a way which will permit
  readers to understand why it exists.
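
As a rough illustration of the last two points, the condition the quoted
patch open-codes in open_exec() can be isolated in one small, commented
predicate. This is a userspace model only, not kernel code, and the name
wants_self_exe_fallback() is hypothetical; the kernel side would still
need get_task_mm()/get_mm_exe_file() as in the quoted diff.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/*
 * Userspace model of the patch's special case (hypothetical name).
 *
 * name:     path handed to exec
 * open_err: negative errno returned by the normal open attempt
 *
 * Returns true only when the open failed with ENOENT or EACCES and the
 * path is exactly "/proc/self/exe", i.e. the one case where the patch
 * falls back to the current process's own executable.
 */
bool wants_self_exe_fallback(const char *name, int open_err)
{
	if (open_err != -ENOENT && open_err != -EACCES)
		return false;
	return strcmp(name, "/proc/self/exe") == 0;
}
```

Keeping the check in one place means the comment explaining why the
special case exists sits next to the only code that implements it.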


But don't do any of that yet.  This will be an unpopular patch and I
fear for its future ;)

