From: Jeff Layton <jlayton@redhat.com>
To: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Greg KH <gregkh@linuxfoundation.org>,
<linux-kernel@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
<linux-nfs@vger.kernel.org>, <devel@openvz.org>,
<ebiederm@xmission.com>, <oleg@redhat.com>,
<bfields@fieldses.org>, <bharrosh@panasas.com>
Subject: Re: call_usermodehelper in containers
Date: Tue, 12 Nov 2013 08:30:43 -0500
Message-ID: <20131112083043.0ab78e67@tlielax.poochiereds.net>
In-Reply-To: <528226EC.4050701@parallels.com>

On Tue, 12 Nov 2013 17:02:36 +0400
Stanislav Kinsbursky <skinsbursky@parallels.com> wrote:
> 12.11.2013 15:12, Jeff Layton wrote:
> > On Mon, 11 Nov 2013 16:47:03 -0800
> > Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> >> On Mon, Nov 11, 2013 at 07:18:25AM -0500, Jeff Layton wrote:
> >>> We have a bit of a problem wrt upcalls that use call_usermodehelper
> >>> with containers and I'd like to bring this to some sort of resolution...
> >>>
> >>> A particularly problematic case (though there are others) is the
> >>> nfsdcltrack upcall. It basically uses call_usermodehelper to run a
> >>> program in userland to track some information on stable storage for
> >>> nfsd.
> >>
> >> I thought the discussion at the kernel summit about this issue was:
> >> - don't do this.
> >> - don't do it.
> >> - if you really need to do this, fix nfsd
> >>
> >
> > Sorry, I couldn't make the kernel summit so I missed that discussion. I
> > guess LWN didn't cover it?
> >
> > In any case, I guess then that we'll either have to come up with some
> > way to fix nfsd here, or simply ensure that nfsd can never be started
> > unless root in the container has a full set of capabilities.
> >
> > One sort of Rube Goldberg possibility to fix nfsd is:
> >
> > - when we start nfsd in a container, fork off an extra kernel thread
> > that just sits idle. That thread would need to be a descendant of the
> > userland process that started nfsd, so we'd need to create it with
> > kernel_thread().
> >
> > - Have the kernel just start up the UMH program in the init_ns mount
> > namespace as it currently does, but also pass the pid of the idle
> > kernel thread to the UMH upcall.
> >
> > - The program will then use /proc/<pid>/root and /proc/<pid>/ns/* to set
> > itself up for doing things properly.
> >
> > Note that with this mechanism we can't actually run a different binary
> > per container, but that's probably fine for most purposes.
> >
>
> Hmmm... Why can't we? We can go a bit further with the userspace idea.
>
> We use UMH for a very limited number of user programs. Two, actually:
> 1) /sbin/nfs_cache_getent
> 2) /sbin/nfsdcltrack
>

No, the kernel uses them for a lot more than that. Pretty much all of
the keys API upcalls use it. See all of the callers of
call_usermodehelper. All of them are running user binaries out of the
kernel, and almost all of them are certainly broken wrt containers.

> If we convert them into proxies that use /proc/<pid>/root and /proc/<pid>/ns/*, this will allow us to look up the right binary.
> The only limitation here is the presence of these "proxy" binaries on the "host".
>

Suppose I spawn my own container as a user, using all of this spiffy
new user namespace stuff. Then I make the kernel use
call_usermodehelper to call the upcall in the init_ns, and then trick
it into running my new "escape_from_namespace" program with "real" root
privileges.

I don't think we can reasonably assume that having the kernel exec an
arbitrary binary inside of a container is safe. Doing so inside of the
init_ns is marginally more safe, but only marginally so...

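To make the proxy idea concrete, here is a rough userspace sketch of what
such a helper might do once it has been handed a pid. It is only an
illustration under assumptions: the function names (ns_paths,
container_binary, enter_container, run_upcall) are hypothetical, the pid is
assumed to arrive as an argument from the upcall, and the list of namespaces
joined is a guess rather than anything settled in this thread. The only
real interfaces used are setns(2), chroot(2) and the /proc/<pid>/ns/* and
/proc/<pid>/root entries.

```python
import ctypes
import os

# Namespaces the helper joins (an assumed selection); "mnt" is joined last.
NS_NAMES = ("ipc", "uts", "net", "mnt")


def ns_paths(pid, names=NS_NAMES):
    """Return the /proc/<pid>/ns/* entries for the given pid."""
    return [f"/proc/{pid}/ns/{name}" for name in names]


def container_binary(pid, binary):
    """Path of `binary` as seen from the host through /proc/<pid>/root."""
    return f"/proc/{pid}/root/{binary.lstrip('/')}"


def enter_container(pid):
    """Join pid's namespaces and chroot into its root (needs privileges)."""
    libc = ctypes.CDLL(None, use_errno=True)
    # Open everything up front: once the mount namespace is joined, the
    # host's /proc may no longer be reachable.
    root_fd = os.open(f"/proc/{pid}/root", os.O_RDONLY | os.O_DIRECTORY)
    ns_fds = [os.open(path, os.O_RDONLY) for path in ns_paths(pid)]
    try:
        for fd in ns_fds:
            if libc.setns(fd, 0) != 0:  # 0 = accept any namespace type
                err = ctypes.get_errno()
                raise OSError(err, os.strerror(err))
        os.fchdir(root_fd)
        os.chroot(".")
    finally:
        for fd in ns_fds + [root_fd]:
            os.close(fd)


def run_upcall(pid, argv):
    """Enter the container, then exec the real upcall program inside it."""
    enter_container(pid)
    os.execv(argv[0], argv)
```

Note that whatever run_upcall() ends up exec'ing is chosen by the
container's own filesystem, which is exactly the concern described above.
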
> And we don't need any significant changes in the kernel.
>
> BTW, Jeff, could you remind me, please, why exactly we need to use UMH to run the binary?
> What are these capabilities that force us to do so?
>

Nothing _forces_ us to do so, but upcalls are very difficult to handle,
and UMH has a lot of advantages over a long-running daemon launched by
userland.

Originally, I created the nfsdcltrack upcall as a running daemon called
nfsdcld, and the kernel used rpc_pipefs to communicate with it.
Everyone hated it because no one likes to have to run daemons for
infrequently used upcalls. It's a pain for users to ensure that it's
running and it's a pain to handle when it isn't. So, I was encouraged
to turn that instead into a UMH upcall.

But leaving that aside, this problem is a lot larger than just nfsd. We
have a *lot* of UMH upcalls in the kernel, so this problem is more
general than just "fixing" nfsd's.

--
Jeff Layton <jlayton@redhat.com>