Re: [PATCH v7 2/3] pid: Introduce pidfd_getfd syscall

public inbox for linux-api@vger.kernel.org
 help / color / mirror / Atom feed

From: Christian Brauner <christian.brauner@ubuntu.com>
To: Sargun Dhillon <sargun@sargun.me>
Cc: linux-kernel@vger.kernel.org,
	containers@lists.linux-foundation.org, linux-api@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, tycho@tycho.ws, jannh@google.com,
	cyphar@cyphar.com, oleg@redhat.com, luto@amacapital.net,
	viro@zeniv.linux.org.uk, gpascutto@mozilla.com,
	ealvarez@mozilla.com, fweimer@redhat.com, jld@mozilla.com,
	arnd@arndb.de
Subject: Re: [PATCH v7 2/3] pid: Introduce pidfd_getfd syscall
Date: Sat, 28 Dec 2019 11:11:52 +0100	[thread overview]
Message-ID: <20191228100944.kh22bofbr5oe2kvk@wittgenstein> (raw)
In-Reply-To: <20191226180334.GA29409@ircssh-2.c.rugged-nimbus-611.internal>

On Thu, Dec 26, 2019 at 06:03:36PM +0000, Sargun Dhillon wrote:
> This syscall allows for the retrieval of file descriptors from other
> processes, based on their pidfd. This is possible using ptrace, and
> injection of parasitic code to inject code which leverages SCM_RIGHTS
> to move file descriptors between a tracee and a tracer. Unfortunately,
> ptrace comes with a high cost of requiring the process to be stopped,
> and breaks debuggers. This does not require stopping the process under
> manipulation.
> 
> One reason to use this is to allow sandboxers to take actions on file
> descriptors on the behalf of another process. For example, this can be
> combined with seccomp-bpf's user notification to do on-demand fd
> extraction and take privileged actions. One such privileged action
> is binding a socket to a privileged port.
> 
> This also adds the syscall to all architectures at the same time.
> 
> /* prototype */
>   /* flags is currently reserved and should be set to 0 */
>   int sys_pidfd_getfd(int pidfd, int fd, unsigned int flags);
> 
> /* testing */
> Ran self-test suite on x86_64

Fyi, I'm likely going to rewrite/add parts of/to this once I apply.

A few comments below.

> diff --git a/kernel/pid.c b/kernel/pid.c
> index 2278e249141d..4a551f947869 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -578,3 +578,106 @@ void __init pid_idr_init(void)
>  	init_pid_ns.pid_cachep = KMEM_CACHE(pid,
>  			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
>  }
> +
> +static struct file *__pidfd_fget(struct task_struct *task, int fd)
> +{
> +	struct file *file;
> +	int ret;
> +
> +	ret = mutex_lock_killable(&task->signal->cred_guard_mutex);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	if (!ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS)) {
> +		file = ERR_PTR(-EPERM);
> +		goto out;
> +	}
> +
> +	file = fget_task(task, fd);
> +	if (!file)
> +		file = ERR_PTR(-EBADF);
> +
> +out:
> +	mutex_unlock(&task->signal->cred_guard_mutex);
> +	return file;
> +}

Looking at this code now a bit closer, ptrace_may_access() and
fget_task() both take task_lock(task) so this currently does:

task_lock();
/* check access */
task_unlock();

task_lock();
/* get fd */
task_unlock();

which doesn't seem great.

I would prefer if we could do:
task_lock();
/* check access */
/* get fd */
task_unlock();

But ptrace_may_access() doesn't export an unlocked variant so _shrug_.

But we can write this a little cleaner without the goto as:

static struct file *__pidfd_fget(struct task_struct *task, int fd)
{
	struct file *file;
	int ret;

	ret = mutex_lock_killable(&task->signal->cred_guard_mutex);
	if (ret)
		return ERR_PTR(ret);

	if (ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS))
		file = fget_task(task, fd);
	else
		file = ERR_PTR(-EPERM);
	mutex_unlock(&task->signal->cred_guard_mutex);

	return file ?: ERR_PTR(-EBADF);
}

If you don't like the ?: just do:

if (!file)
	return ERR_PTR(-EBADF);

return file;

though I prefer the shorter ?: syntax which is perfect for shortcutting
returns.

> +
> +static int pidfd_getfd(struct pid *pid, int fd)
> +{
> +	struct task_struct *task;
> +	struct file *file;
> +	int ret, retfd;
> +
> +	task = get_pid_task(pid, PIDTYPE_PID);
> +	if (!task)
> +		return -ESRCH;
> +
> +	file = __pidfd_fget(task, fd);
> +	put_task_struct(task);
> +	if (IS_ERR(file))
> +		return PTR_ERR(file);
> +
> +	retfd = get_unused_fd_flags(O_CLOEXEC);
> +	if (retfd < 0) {
> +		ret = retfd;
> +		goto out;
> +	}
> +
> +	/*
> +	 * security_file_receive must come last since it may have side effects
> +	 * and cannot be reversed.
> +	 */
> +	ret = security_file_receive(file);

So I don't understand the comment here. Can you explain what the side
effects are?
security_file_receive() is called in two places: net/core/scm.c and
net/compat.c. In both places it is called _before_ get_unused_fd_flags()
so I don't know what's special here that would prevent us from doing the
same. If there's no actual reason, please rewrite this functions as:

static int pidfd_getfd(struct pid *pid, int fd)
{
	int ret;
	struct task_struct *task;
	struct file *file;

	task = get_pid_task(pid, PIDTYPE_PID);
	if (!task)
		return -ESRCH;

	file = __pidfd_fget(task, fd);
	put_task_struct(task);
	if (IS_ERR(file))
		return PTR_ERR(file);

	ret = security_file_receive(file);
	if (ret) {
		fput(file);
		return ret;
	}

	ret = get_unused_fd_flags(O_CLOEXEC);
	if (ret < 0)
		fput(file);
	else
		fd_install(ret, file);

	return ret;
}

next prev parent reply	other threads:[~2019-12-28 10:11 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-26 18:03 [PATCH v7 2/3] pid: Introduce pidfd_getfd syscall Sargun Dhillon
2019-12-26 22:20 ` kbuild test robot
2019-12-27  1:35   ` Sargun Dhillon
2019-12-28 10:11 ` Christian Brauner [this message]
2019-12-28 13:03   ` Sargun Dhillon
2019-12-29 17:32     ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191228100944.kh22bofbr5oe2kvk@wittgenstein \
    --to=christian.brauner@ubuntu.com \
    --cc=arnd@arndb.de \
    --cc=containers@lists.linux-foundation.org \
    --cc=cyphar@cyphar.com \
    --cc=ealvarez@mozilla.com \
    --cc=fweimer@redhat.com \
    --cc=gpascutto@mozilla.com \
    --cc=jannh@google.com \
    --cc=jld@mozilla.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=oleg@redhat.com \
    --cc=sargun@sargun.me \
    --cc=tycho@tycho.ws \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox