From: Ian Kent <ikent@redhat.com>
To: Jeff Layton <jeff.layton@primarydata.com>
Cc: "Benjamin Coddington" <bcodding@redhat.com>,
"David Howells" <dhowells@redhat.com>,
"David Härdeman" <david@hardeman.nu>,
linux-nfs@vger.kernel.org, SteveD@redhat.com
Subject: Re: [PATCH 00/19] gssd improvements
Date: Thu, 11 Dec 2014 20:55:42 +0800
Message-ID: <1418302542.2513.14.camel@pluto.fritz.box>
In-Reply-To: <20141211064537.540e2e12@tlielax.poochiereds.net>
On Thu, 2014-12-11 at 06:45 -0500, Jeff Layton wrote:
> On Thu, 11 Dec 2014 11:21:21 +0800
> Ian Kent <ikent@redhat.com> wrote:
>
> > On Wed, 2014-12-10 at 20:54 -0500, Benjamin Coddington wrote:
> > >
> > > On Thu, 11 Dec 2014, Ian Kent wrote:
> > >
> > > > On Wed, 2014-12-10 at 18:21 -0500, Benjamin Coddington wrote:
> > > > > On Wed, 10 Dec 2014, David Howells wrote:
> > > > >
> > > > > > Jeff Layton <jeff.layton@primarydata.com> wrote:
> > > > > >
> > > > > > > > This thread might be interesting:
> > > > > > > > https://lkml.org/lkml/2014/11/24/885
> > > > > > > >
> > > > > > >
> > > > > > > Nice. I wasn't aware that Ian was working on this. I'll take a look.
> > > > > >
> > > > > > I'm not sure what the current state of this is. There was some discussion
> > > > > > over how best to determine which container we need to run in - and it's
> > > > > > complicated by the fact that the mounter may run in a different container to
> > > > > > the program that triggered the mount due to mountpoint propagation.
> > > > > >
> > > > > > David
> > > > >
> > > > > The specific problem of how to run /sbin/request-key in the caller's
> > > > > "container" for idmap and gssd (and other friends) became more generally a
> > > > > problem of how to solve the namespace (or more generally again, "context")
> > > > > problem for some users of kmod's call_usermodehelper. The nice thing about
> > > > > call_usermodehelper is that you don't have to do a lot of work to set up a
> > > > > process to get something done in userspace -- however it is sounding more
> > > > > like we do need to work hard to set up context for some users.
> > > > >
> > > > > The userspace work needs to be done within a context that currently exists
> > > > > or once existed, so the questions are where do we get that context and how
> > > > > do we keep it around until we need it?
> > > > >
> > > > > I think there's agreement that the setup of that context should be basically
> > > > > what's done in fork() for consistency and future work. So we get LSM and
> > > > > cgroups, etc.. in addition to namespaces.
> > > >
> > > > And that's when the usermode helper init function is called, just before
> > > > the exec, so I think that's the place it needs to be done.
> > > >
> > > > >
> > > > > There are two suggested approaches:
> > > > >
> > > > > 1) Anytime we think we're going to later need to upcall with a context we
> > > > > fork and keep a thread around to do that work. For NFS, that would look
> > > > > like forking a thread for every mount at mount time. The user of this API
> > > > > would be responsible for creating/maintaining the thread and passing it
> > > > > along for work.
> > > >
> > > > Yeah, I don't think that's workable for large numbers of mounts and I
> > > > don't think it's really necessary.
> > > >
> > > > >
> > > > > 2) Specify that a usermodehelper should attempt to use a context rather than
> > > > > the default root context. The context used would be taken from the "init"
> > > > > process of the current pid_namespace. Either that init_process itself could
> > > > > be asked to fork/execve or when the pid_namespace is created a separate
> > > > > helper thread is reserved.
> > > >
> > > > I think this is doable using open()/setns() in a similar way to
> > > > nsenter(1). We can worry about simplifying it once we have a viable
> > > > approach to work from.
> > > >
> > > > The reality is that now user mode helpers are executed within the root
> > > > context of init so I can't see why we can't use the context of init of
> > > > the container for this.
> > > >
> > > > Modifying that along the way with a "struct cred" is probably a good
> > > > idea although it isn't done now for user mode callbacks. The "struct
> > > > cred" of the root init process surely isn't what needs to be used when
> > > > executing in a container so something needs to be done. If we duplicate
> > > > the same behaviour we have now for execution outside of a container then
> > > > we'd use the "struct cred" of the container init process so maybe we do
> > > > know where to get the cred, not sure about that though.
> > >
> > > I'm not following you entirely here. Do you mean that the helper should
> > > probably have the container init's cred stripped off or sanitized?
> >
> > LOL, that's a good question.
> >
> > What I think I'm saying is that, when the usermode helper is run we
> > don't want to use root init's credentials but some other credentials
> > relevant to the container, possibly the credentials of the mounter or
> > nfsd process credentials or the container init credentials.
> >
> > In any case they will need to be set to something different and
> > appropriate. I'm not sure how to do that just yet.
> >
>
> Yes, I think we might need to step back and consider that we have a
> number of different use cases here, most of which are currently not
> well served.
Indeed yes. The response to the initial post of the patches for this
was what I expected, so I am stepping back. ;)
>
> For instance: module loading clearly needs to be done in the "context"
> of the canonical root init process. That's what call_usermodehelper was
> originally used for so we need to keep that ability intact.
Not sure that's an issue, since the original call_usermodehelper() will
be left intact and people will need to make a conscious decision to
call what is, so far, call_usermodehelper_ns() to exec within a
container. At least that's the plan.
>
> OTOH, keyring upcalls probably ought to be done in the context of the
> task that triggered them. Certainly we ought to be spawning them with
> the credentials associated with the keyring.
Yes, but I'm not really there yet so I can't make sensible comments
about it.
>
> Today, those tasks not only run in the namespaces, etc of the root init
> process, but also with root's creds. That's unnecessary and seems
> wrong. I think it's something that ought to be changed (though doing so
> will likely be painful as we'll need to change the upcall programs to
> handle that).
One thing I believe is that user space programs shouldn't know, or need
to know, that they are running within a container. I believe this should
have been part of the namespace implementation from the start.
The creds issue is what I'm trying to understand now. Since I've not had
to concern myself with these before, I'm a bit at sea. It may prove not
doable, but then maybe not.
>
> There are also other questions:
>
> How should we go about spawning the binary given that we might want to
> have it run in a different mount namespace? There are at least two
> options:
If anything, the response to the initial post of these patches showed
that we can't just consider the mount namespace; we need to consider the
whole process environment.
>
> 1) change the mount namespace first and then exec the binary (in effect
> run the binary with the given path from inside the container). This is
> possibly a security hole if an attacker can trick the kernel into
> running a different binary than intended by manipulating namespaces.
I believe this has to be the way it's done, after sub-process creation
and before the exec, in the user mode helper runner.
>
> ...or...
>
> 2) find and exec the binary and then change the namespaces afterward.
> This has some potential problems if the program does something like
> try to dlopen libraries after setns(). You could end up with a mismatch
> if the container holds a different set of binaries from the one in the
> root container.
We really shouldn't need to change the user space binaries; I'd like to
try to avoid that if at all possible.
When I've referred to setns() here I'm thinking of an in-kernel
equivalent, not the user space setns() syscall, and that wasn't clear,
sorry.
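For illustration only, the ordering discussed above (create the
sub-process, switch namespaces, then exec) can be sketched from user
space in roughly the way nsenter(1) works. This is not the in-kernel
mechanism and not code from the patches: ns_path() and
run_helper_in_ns() are hypothetical names, the MAX_NS limit is
arbitrary, and credentials, cgroups and LSM context are not handled at
all. Joining another process's namespaces normally needs CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_NS 8	/* arbitrary limit for this sketch */

/* Build "/proc/<pid>/ns/<name>"; returns 0, or -1 if buf is too small. */
static int ns_path(char *buf, size_t len, pid_t pid, const char *name)
{
	int n = snprintf(buf, len, "/proc/%ld/ns/%s", (long)pid, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: run argv[0] in the namespaces of target_pid.
 * names lists the namespaces to join, e.g. { "net", "mnt" }.
 * Returns the helper's exit status, or -1 on failure.
 */
static int run_helper_in_ns(pid_t target_pid, const char *const names[],
			    size_t n_names, char *const argv[])
{
	int fds[MAX_NS];
	int status;
	pid_t child;

	if (n_names > MAX_NS)
		return -1;

	child = fork();			/* 1. sub-process creation */
	if (child < 0)
		return -1;

	if (child == 0) {
		/* Open every namespace file first: /proc may look
		 * different once the mount namespace has changed. */
		for (size_t i = 0; i < n_names; i++) {
			char path[64];

			if (ns_path(path, sizeof(path), target_pid, names[i]) < 0)
				_exit(126);
			fds[i] = open(path, O_RDONLY | O_CLOEXEC);
			if (fds[i] < 0)
				_exit(126);
		}
		/* 2. switch namespaces, in the order the caller gave */
		for (size_t i = 0; i < n_names; i++) {
			if (setns(fds[i], 0) < 0)
				_exit(126);
			close(fds[i]);
		}
		/* 3. exec: argv[0] is resolved in the current mount namespace */
		execv(argv[0], argv);
		_exit(127);
	}

	if (waitpid(child, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With an empty names list this is just fork() plus execv(); with
{ "mnt" } the binary path is looked up inside the target's mount
namespace, which is option 1) above.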
Ian