* [PATCH] make execve(NULL) re-execute current binary
@ 2009-06-29 22:03 Denys Vlasenko
  2009-06-29 22:27 ` Alan Cox
  0 siblings, 1 reply; 4+ messages in thread
From: Denys Vlasenko @ 2009-06-29 22:03 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Al Viro, Andrew Morton, Mike Frysinger

[-- Attachment #1: Type: text/plain, Size: 2061 bytes --]

Hi Al, Andrew, folks,

This is version 2 of the re-execution patch.

I replaced the hardcoded "/proc/self/exe" with an execve(NULL)
extension in the hope that this is considered less ugly.
I also tried to format the code according to Andrew's wishes.

Handling execve(NULL) requires adding a bit of code
to per-architecture sys_execve().
In the attached patch, it is done only on x86.
If this patch is ACKed in principle,
the final version will do it for all architectures.

Description follows.

=========================================================

In some circumstances a running process needs to re-execute
its own image.

Among other useful cases, it is _crucial_ for NOMMU arches.

They need it to perform daemonization. The classic sequence
of "fork, parent dies, child continues" can't be used
due to the lack of fork on NOMMU, and instead we have to do
"vfork, child re-execs itself (with a flag to not daemonize)
and thereby unblocks the parent, parent dies".
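
A minimal userspace sketch of that vfork-based sequence, for illustration
only. It assumes /proc is mounted and re-executes via /proc/self/exe, which
is exactly the dependency this patch tries to remove. The marker argument
"--no-daemonize" and all function names are hypothetical, not taken from any
existing program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical marker: tells the re-executed child not to daemonize again. */
static char no_daemonize_flag[] = "--no-daemonize";

/* Return 1 if the marker is already present in argv, 0 otherwise. */
int already_daemonized(int argc, char **argv)
{
	int i;

	for (i = 1; i < argc; i++)
		if (strcmp(argv[i], no_daemonize_flag) == 0)
			return 1;
	return 0;
}

/*
 * Copy argv and append the marker. Returns a malloc'ed, NULL-terminated
 * array (the strings themselves are shared, not copied), or NULL.
 */
char **argv_with_flag(int argc, char **argv)
{
	char **new_argv;

	assert(argc >= 0 && argv[argc] == NULL);
	new_argv = malloc((argc + 2) * sizeof(*new_argv));
	if (!new_argv)
		return NULL;
	memcpy(new_argv, argv, argc * sizeof(*new_argv));
	new_argv[argc] = no_daemonize_flag;
	new_argv[argc + 1] = NULL;
	return new_argv;
}

/* "vfork, child re-execs itself, parent dies". Returns only in the daemon. */
void daemonize_by_reexec(int argc, char **argv)
{
	char **child_argv;
	pid_t pid;

	if (already_daemonized(argc, argv))
		return;		/* we are the re-executed child */

	/* Build the child's argv before vfork(): the child may only exec or _exit. */
	child_argv = argv_with_flag(argc, argv);
	if (!child_argv)
		exit(1);

	pid = vfork();
	if (pid < 0) {
		perror("vfork");
		exit(1);
	}
	if (pid == 0) {
		execv("/proc/self/exe", child_argv);	/* needs /proc mounted */
		_exit(127);
	}
	exit(0);	/* parent resumes once the child has exec'ed, then dies */
}
```

A program would call daemonize_by_reexec(argc, argv) early in main(); the
call returns only in the re-executed child. A real daemon would also call
setsid() and redirect its standard descriptors, which is omitted here.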

Another crucial use case on NOMMU is POSIX shell support.
Imagine a shell command of the form "func1 | func2 | func3".
This can be implemented on NOMMU by vforking three times,
re-executing the shell in every child in the form
"<shell> -c 'body of funcN'", and letting the parent wait and collect
exit codes and such. As far as I can see, it's the only way
to implement it correctly on NOMMU.
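
A rough sketch of one such pipeline stage, under stated assumptions: the
function names are hypothetical, the pipe plumbing between the stages is
left out, and the caller passes the path of the shell binary (a real shell
would pass its own image, for example "/proc/self/exe").

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start one stage as "<shell> -c '<body>'" using vfork + exec.
 * Returns the child's pid, or -1 if vfork() failed.
 */
pid_t spawn_stage(const char *shell_path, const char *body)
{
	char *const child_argv[] = {
		(char *)shell_path, (char *)"-c", (char *)body, NULL
	};
	pid_t pid = vfork();

	if (pid == 0) {
		/* Child: only exec or _exit is safe after vfork(). */
		execv(shell_path, child_argv);
		_exit(127);
	}
	return pid;
}

/* Wait for a stage; returns its exit code, or -1 if it did not exit normally. */
int wait_stage(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent would call spawn_stage() once per function in the pipeline and
then wait_stage() on each pid to collect the exit codes.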

The program may re-execute itself by name if it knows the name,
but in general it cannot be sure of it. The binary may be renamed,
or even deleted, while it is running.

A more elegant way is to execute /proc/self/exe.
This works just fine as long as /proc is mounted.

But it breaks if /proc isn't mounted, and this can happen in real-world
usage. For example, when a shell is invoked very early in initrd/initramfs,
or when the program is in a chroot jail.
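
For comparison, a sketch of what userspace can do without kernel help: try
a list of candidate paths in order, starting with /proc/self/exe and falling
back to a name the program believes is its own. The function name and the
example paths are assumptions for illustration. Every fallback path is only
a guess, for the reasons given above.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Try to exec each path in the NULL-terminated list, in order.
 * Returns only if every execv() failed: -1, with errno set by the
 * last attempt (or ENOENT if the list was empty).
 */
int reexec_first_available(const char *const *paths, char *const argv[])
{
	int saved_errno = ENOENT;

	for (; *paths != NULL; paths++) {
		execv(*paths, argv);	/* does not return on success */
		saved_errno = errno;
	}
	errno = saved_errno;
	return -1;
}
```

A caller might pass { "/proc/self/exe", "/bin/myprog", NULL }, where
"/bin/myprog" is a hypothetical install path. If /proc is not mounted and
the binary has been renamed or deleted, every attempt fails.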

With this patch, it is possible to re-execute the current binary
even if /proc is not mounted. This is done by calling execve()
with a NULL pointer as the first parameter instead of the filename to exec.

Please comment.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
--
vda

[-- Attachment #2: linux-2.6.30_proc_self_exe_v2.patch --]
[-- Type: text/x-diff, Size: 2786 bytes --]

diff -urp ../linux-2.6.30.org/arch/x86/kernel/process_32.c linux-2.6.30-1/arch/x86/kernel/process_32.c
--- ../linux-2.6.30.org/arch/x86/kernel/process_32.c	2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30-1/arch/x86/kernel/process_32.c	2009-06-29 22:28:38.000000000 +0200
@@ -453,6 +453,15 @@ int sys_execve(struct pt_regs *regs)
 	int error;
 	char *filename;
 
+	if (regs->bx == 0) {
+		/* execme */
+		error = do_execve(NULL,
+			(char __user * __user *) regs->cx,
+			(char __user * __user *) regs->dx,
+			regs);
+		goto out;
+	}
+
 	filename = getname((char __user *) regs->bx);
 	error = PTR_ERR(filename);
 	if (IS_ERR(filename))
@@ -461,12 +470,12 @@ int sys_execve(struct pt_regs *regs)
 			(char __user * __user *) regs->cx,
 			(char __user * __user *) regs->dx,
 			regs);
+	putname(filename);
+out:
 	if (error == 0) {
 		/* Make sure we don't return using sysenter.. */
 		set_thread_flag(TIF_IRET);
 	}
-	putname(filename);
-out:
 	return error;
 }
 
diff -urp ../linux-2.6.30.org/arch/x86/kernel/process_64.c linux-2.6.30-1/arch/x86/kernel/process_64.c
--- ../linux-2.6.30.org/arch/x86/kernel/process_64.c	2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30-1/arch/x86/kernel/process_64.c	2009-06-29 22:28:56.000000000 +0200
@@ -504,6 +504,9 @@ long sys_execve(char __user *name, char 
 	long error;
 	char *filename;
 
+	if (name == NULL)
+		return do_execve(NULL, argv, envp, regs);
+
 	filename = getname(name);
 	error = PTR_ERR(filename);
 	if (IS_ERR(filename))
diff -urp ../linux-2.6.30.org/fs/exec.c linux-2.6.30-1/fs/exec.c
--- ../linux-2.6.30.org/fs/exec.c	2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.30-1/fs/exec.c	2009-06-29 22:29:44.000000000 +0200
@@ -644,14 +644,39 @@ EXPORT_SYMBOL(setup_arg_pages);
 
 #endif /* CONFIG_MMU */
 
+static struct file *open_self(void)
+{
+	struct file *file;
+	struct mm_struct *mm;
+
+	mm = get_task_mm(current);
+	file = NULL;
+	if (mm) {
+		file = get_mm_exe_file(mm);
+		mmput(mm);
+	}
+	if (!file)
+		file = ERR_PTR(-ENOENT);
+	return file;
+}
+
 struct file *open_exec(const char *name)
 {
 	struct file *file;
 	int err;
 
-	file = do_filp_open(AT_FDCWD, name,
+	if (name == NULL) {
+		/*
+		 * execve(NULL) execs the binary of the current process.
+		 * Unlike execve("/proc/self/exe"), it does not require
+		 * mounted /proc.
+		 */
+		file = open_self();
+	} else {
+		file = do_filp_open(AT_FDCWD, name,
 				O_LARGEFILE | O_RDONLY | FMODE_EXEC, 0,
 				MAY_EXEC | MAY_OPEN);
+	}
 	if (IS_ERR(file))
 		goto out;
 
@@ -1291,8 +1316,8 @@ int do_execve(char * filename,
 	sched_exec();
 
 	bprm->file = file;
-	bprm->filename = filename;
-	bprm->interp = filename;
+	bprm->filename = filename ? filename : current->comm;
+	bprm->interp = bprm->filename;
 
 	retval = bprm_mm_init(bprm);
 	if (retval)

* Re: [PATCH] make execve(NULL) re-execute current binary
  2009-06-29 22:03 [PATCH] make execve(NULL) re-execute current binary Denys Vlasenko
@ 2009-06-29 22:27 ` Alan Cox
  2009-06-30 12:10   ` Denys Vlasenko
  0 siblings, 1 reply; 4+ messages in thread
From: Alan Cox @ 2009-06-29 22:27 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linux Kernel Mailing List, Al Viro, Andrew Morton, Mike Frysinger

On Tue, 30 Jun 2009 00:03:39 +0200
Denys Vlasenko <vda.linux@googlemail.com> wrote:

> Hi Al, Andrew, folks,
> 
> This is a version 2 of re-execution patch.
> 
> I replaced hardcoded "/proc/self/exe" with execve(NULL)

So you add hacks to sys_execve, which means hacks on every system that
doesn't need it and also undefined behaviour if you use the feature when
it isn't present.

This makes no sense.

> extension in the hopes that this is considered less ugly.
> Also I tried to format code according to Andrew's wishes.
> 
> Handling execve(NULL) requires adding a bit of code
> to per-architecture sys_execve().

Please implement sys_reexec() as was suggested before. Not only does it
allow this stuff to stay out of the exec() call without risking mashing
up the API and changing the ABI in odd corner cases, it also means it's
actually going to be usable by apps, because a reexec() is going to give
you -ENOSYS on older platforms, not randomness.

> In the attached patch, it is done only on x86.
> If this patch will be ACKed in principle,
> the final version will do it for all architectures.

NAK in principle.

Alan

* Re: [PATCH] make execve(NULL) re-execute current binary
  2009-06-29 22:27 ` Alan Cox
@ 2009-06-30 12:10   ` Denys Vlasenko
  2009-06-30 12:33     ` Alan Cox
  0 siblings, 1 reply; 4+ messages in thread
From: Denys Vlasenko @ 2009-06-30 12:10 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linux Kernel Mailing List, Al Viro, Andrew Morton, Mike Frysinger

On Tue, Jun 30, 2009 at 12:27 AM, Alan Cox<alan@lxorguk.ukuu.org.uk> wrote:
> On Tue, 30 Jun 2009 00:03:39 +0200
> Denys Vlasenko <vda.linux@googlemail.com> wrote:
>
>> Hi Al, Andrew, folks,
>>
>> This is a version 2 of re-execution patch.
>>
>> I replaced hardcoded "/proc/self/exe" with execve(NULL)
>
> So you add hacks to sys_execve, which means hacks on every system that
> doesn't need it

Yes.

> and also undefined behaviour if you use the feature when
> it isn't present.

The behavior of execve(NULL) is not undefined. It fails with EFAULT.
The idea was that an application would do:

/* Try traditional way */
execv("/proc/self/exe", argv);

/* Try Linux specific extension (only new kernels have it) */
execv(NULL, argv);

/* Give up */
printf("Can't re-exec myself\n");
exit(1);

which is safe on older kernels as well.

> Please implement sys_reexec() as was suggested before.

Ok.

--
vda

* Re: [PATCH] make execve(NULL) re-execute current binary
  2009-06-30 12:10   ` Denys Vlasenko
@ 2009-06-30 12:33     ` Alan Cox
  0 siblings, 0 replies; 4+ messages in thread
From: Alan Cox @ 2009-06-30 12:33 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linux Kernel Mailing List, Al Viro, Andrew Morton, Mike Frysinger

> The behavior of execve(NULL) is not undefined. It returns EFAULT.
> The idea was that application will do:

That depends on what you have mapped at address zero and whether your
platform supports EFAULT reporting for address zero. It isn't a reliable
assumption.

Alan
