Re: [RFC][PATCH] Improve NFS use of network and mount namespaces

From: Matt Helsley <matthltc@us.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Matt Helsley <matthltc@us.ibm.com>,
	Containers <containers@lists.osdl.org>,
	linux-nfs@vger.kernel.org
Subject: Re: [RFC][PATCH] Improve NFS use of network and mount namespaces
Date: Tue, 12 May 2009 18:05:45 -0700
Message-ID: <20090513010545.GG3912@us.ibm.com>
In-Reply-To: <m1fxf97tvt.fsf@fess.ebiederm.org>

On Tue, May 12, 2009 at 05:01:58PM -0700, Eric W. Biederman wrote:
> Matt Helsley <matthltc@us.ibm.com> writes:
> 
> > Sun RPC currently opens sockets from the initial network namespace making it
> > impossible to restrict which NFS servers a container may interact with.
> >
> > For example, the NFS server at 10.0.0.3 reachable from the initial namespace
> > will always be used even if an entirely different server with the address
> > 10.0.0.3 is reachable from a container's network namespace. Hence network
> > namespaces cannot be used to restrict the network access of a container as long
> > as the RPC code opens sockets using the initial network namespace. This is
> > in stark contrast to other protocols like HTTP where the sockets are created in
> > their proper namespaces because kernel threads are not used to open sockets for
> > client network IO.
> >
> > We may plausibly end up with namespaces created by:
> > I) The administrator may mount 10.0.0.3:/export_foo from init's
> > container, clone the mount namespace, and unmount from the original
> > mount namespace.
> >
> > II) The administrator may start a task which clones the mount namespace
> > before mounting 10.0.0.3:/export_foo.
> >
> > Proposed Solution:
> >
> > The network namespace of the task that did the mount best defines which server
> > the "administrator", whether in a container or not, expects to work with.
> > When the mount is done inside a container then that is the network namespace 
> > to use. When the mount is done prior to creating the container then that's the 
> > namespace that should be used.
> >
> > This allows system administrators to isolate network traffic generated by NFS
> > clients by mounting after creating a container. If partial isolation is desired
> > then the administrator may mount before creating a container with a new network
> > namespace. In each case the RPC packets would originate from a consistent
> > namespace.
> >
> > One way to ensure consistent namespace usage would be to hold a reference to
> > the original network namespace as long as the mount exists. This naturally 
> > suggests storing the network namespace reference in the NFS superblock. 
> > However, it may be better to store it with the RPC transport itself since
> > it is directly responsible for (re)opening the sockets.
> >
> > This patch adds a reference to the network namespace to the RPC
> > transport. When the NFS export is mounted the network namespace of
> > the current task establishes which namespace to reference. That
> > reference is stored in the RPC transport and used to open sockets
> > whenever a new socket is required.
> 
> Matt.  This may be the basis of something and the problem is real.
> However it is clear you have missed a lot of details.

Well crap. I did not ignore all of the RPC services I noticed while
reading the NFS/RPC code. But based on the responses from Chuck, you,
and Trond, I clearly fucked up when I thought I had properly understood
how the RPC code works with the services that support NFS.

I figured that since RPC was the core of these services it would be a
good place to start trying to address the problem. It looked like the
RPC transport was a good place to deal with all of these services since
it's responsible for (re)opening the sockets needed to perform RPC IO.
But apparently the transport is not shared the way I thought it was :/..
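
For reference, the idea from the patch description (the transport pins the
mounting task's network namespace and reuses it on every reconnect) can be
modeled in userspace roughly as below. This is only a sketch under stated
assumptions: every model_* name is hypothetical and none of this is the
actual sunrpc code. As noted above, the real transports are apparently not
shared the way this model assumes.

```c
#include <stddef.h>

/* Userspace stand-in for a network namespace with a reference count. */
struct model_net {
	int id;
	int refcount;
};

/* Stand-in for an RPC transport that pins the namespace it was created in. */
struct model_xprt {
	struct model_net *net;
};

/* Take a reference on a namespace (hypothetical helper). */
static struct model_net *model_get_net(struct model_net *net)
{
	net->refcount++;
	return net;
}

/* Drop a reference on a namespace (hypothetical helper). */
static void model_put_net(struct model_net *net)
{
	net->refcount--;
}

/* At mount time: remember the namespace of the task doing the mount. */
static void model_xprt_init(struct model_xprt *xprt,
			    struct model_net *mounter_net)
{
	xprt->net = model_get_net(mounter_net);
}

/*
 * On (re)connect: choose the namespace for the new socket. caller_net is
 * the namespace of whatever task or kernel thread runs the reconnect; it
 * is deliberately ignored in favor of the pinned namespace.
 */
static struct model_net *model_xprt_socket_net(const struct model_xprt *xprt,
					       const struct model_net *caller_net)
{
	(void)caller_net;
	return xprt->net;
}

/* At unmount time: release the pinned namespace. */
static void model_xprt_destroy(struct model_xprt *xprt)
{
	model_put_net(xprt->net);
	xprt->net = NULL;
}
```

The point of the model is only that the namespace choice is made once, at
mount time, and not by whichever thread happens to reopen the socket.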

> So could you first address this problem in nfs_get_sb by 
> denying the mount if we are not in the initial network namespace.
> 
> I.e.
> 
> if (current->nsproxy->net_ns != &init_net)
> 	return -EINVAL;
> 
> That should be a lot simpler to get right and at least give reliable
> and predictable semantics.

Yes, that seems like a reasonable preventive measure for now.
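
The check quoted above can be exercised outside the kernel with a small
userspace mock, sketched below. The struct definitions are cut-down
stand-ins for the kernel types, and nfs_check_mount_netns(), host_task, and
ct_task are hypothetical names used only for illustration. Where exactly
the test would sit inside nfs_get_sb is not shown.

```c
#include <errno.h>

/* Minimal userspace stand-ins; only the fields the check touches. */
struct net {
	int id;
};

struct nsproxy {
	struct net *net_ns;
};

struct task {
	struct nsproxy *nsproxy;
};

/* Stand-in for the initial network namespace. */
static struct net init_net = { 0 };

/* A second namespace, as a container would have. */
static struct net container_net = { 1 };

static struct nsproxy host_nsproxy = { &init_net };
static struct nsproxy ct_nsproxy = { &container_net };

/* One task in the initial namespace, one inside the container. */
static struct task host_task = { &host_nsproxy };
static struct task ct_task = { &ct_nsproxy };

/*
 * Hypothetical helper mirroring the suggested guard: refuse the mount
 * unless the mounting task lives in the initial network namespace.
 */
static int nfs_check_mount_netns(const struct task *tsk)
{
	if (tsk->nsproxy->net_ns != &init_net)
		return -EINVAL;
	return 0;
}
```

With this mock, nfs_check_mount_netns(&host_task) returns 0 and
nfs_check_mount_netns(&ct_task) returns -EINVAL, so a mount attempted from
inside a container fails instead of silently using the wrong namespace.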

	-Matt

Thread overview: 9+ messages
2009-05-12 21:51 [RFC][PATCH] Improve NFS use of network and mount namespaces Matt Helsley
2009-05-12 22:18 ` Chuck Lever
2009-05-12 23:46 ` Trond Myklebust
2009-05-13  0:04   ` Eric W. Biederman
2009-05-13  0:13     ` Trond Myklebust
2009-05-13  0:44       ` Matt Helsley
2009-05-13  1:11       ` Eric W. Biederman
2009-05-13  0:01 ` Eric W. Biederman
2009-05-13  1:05   ` Matt Helsley [this message]