From: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
	Containers <containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [RFC][PATCH] Improve NFS use of network and mount namespaces
Date: Tue, 12 May 2009 18:05:45 -0700
Message-ID: <20090513010545.GG3912@us.ibm.com>
In-Reply-To: <m1fxf97tvt.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>

On Tue, May 12, 2009 at 05:01:58PM -0700, Eric W. Biederman wrote:
> Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> writes:
> 
> > Sun RPC currently opens sockets from the initial network namespace making it
> > impossible to restrict which NFS servers a container may interact with.
> >
> > For example, the NFS server at 10.0.0.3 reachable from the initial namespace
> > will always be used even if an entirely different server with the address
> > 10.0.0.3 is reachable from a container's network namespace. Hence network
> > namespaces cannot be used to restrict the network access of a container as long
> > as the RPC code opens sockets using the initial network namespace. This is
> > in stark contrast to other protocols like HTTP where the sockets are created in
> > their proper namespaces because kernel threads are not used to open sockets for
> > client network IO.
> >
> > We may plausibly end up with namespaces created by:
> > I) The administrator may mount 10.0.0.3:/export_foo from init's
> > container, clone the mount namespace, and unmount from the original
> > mount namespace.
> >
> > II) The administrator may start a task which clones the mount namespace
> > before mounting 10.0.0.3:/export_foo.
> >
> > Proposed Solution:
> >
> > The network namespace of the task that did the mount best defines which server
> > the "administrator", whether in a container or not, expects to work with.
> > When the mount is done inside a container then that is the network namespace 
> > to use. When the mount is done prior to creating the container then that's the 
> > namespace that should be used.
> >
> > This allows system administrators to isolate network traffic generated by NFS
> > clients by mounting after creating a container. If partial isolation is desired
> > then the administrator may mount before creating a container with a new network
> > namespace. In each case the RPC packets would originate from a consistent
> > namespace.
> >
> > One way to ensure consistent namespace usage would be to hold a reference to
> > the original network namespace as long as the mount exists. This naturally 
> > suggests storing the network namespace reference in the NFS superblock. 
> > However, it may be better to store it with the RPC transport itself since
> > it is directly responsible for (re)opening the sockets.
> >
> > This patch adds a reference to the network namespace to the RPC
> > transport. When the NFS export is mounted the network namespace of
> > the current task establishes which namespace to reference. That
> > reference is stored in the RPC transport and used to open sockets
> > whenever a new socket is required.
> 
> Matt.  This may be the basis of something and the problem is real.
> However it is clear you have missed a lot of details.

Well, crap. I did not ignore all of the RPC services I noticed while
reading the NFS/RPC code, but judging by the responses from Chuck, you,
and Trond, I clearly fucked up in thinking I had properly understood
how the RPC code works with the services that support NFS.

I figured that since RPC was the core of these services it would be a
good place to start trying to address the problem. It looked like the
RPC transport was a good place to deal with all of these services since
it's responsible for (re)opening the sockets needed to perform RPC IO.
But apparently the transport is not shared the way I thought it was :/
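
To make the idea concrete, here is a minimal userspace sketch of a
transport that pins the mounting task's network namespace and reuses it
on every (re)connect. Every name in it (model_net, model_xprt, and the
helper functions) is a hypothetical stand-in, not the kernel's struct net
or struct rpc_xprt, and it only models the reference counting rather than
the real sunrpc code:

```c
#include <stddef.h>

/* Hypothetical stand-in for a network namespace (not the kernel's struct net). */
struct model_net {
	int refcount;
	const char *name;
};

/* Hypothetical stand-in for an RPC transport. */
struct model_xprt {
	struct model_net *net;	/* namespace captured when the mount was made */
};

/* Hypothetical stand-in for a socket; it only records its namespace. */
struct model_sock {
	const struct model_net *net;
};

/* Take a reference so the namespace outlives the task that mounted. */
static struct model_net *model_get_net(struct model_net *net)
{
	net->refcount++;
	return net;
}

/* Drop a reference taken with model_get_net(). */
static void model_put_net(struct model_net *net)
{
	net->refcount--;
}

/* At mount time: remember the namespace of the task doing the mount. */
static void model_xprt_init(struct model_xprt *xprt, struct model_net *mounter_net)
{
	xprt->net = model_get_net(mounter_net);
}

/* On every (re)connect: open the socket in the remembered namespace. */
static struct model_sock model_xprt_connect(const struct model_xprt *xprt)
{
	struct model_sock sock = { .net = xprt->net };
	return sock;
}

/* At transport teardown: release the namespace reference. */
static void model_xprt_destroy(struct model_xprt *xprt)
{
	model_put_net(xprt->net);
	xprt->net = NULL;
}
```

The sketch covers a single transport only. It says nothing about the
other services that support NFS, which is where my understanding fell
short.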

> So could you first address this problem in nfs_get_sb by 
> denying the mount if we are not in the initial network namespace.
> 
> I.e.
> 
> if (current->nsproxy->net_ns != &init_net)
> 	return -EINVAL;
> 
> That should be a lot simpler to get right and at least give reliable
> and predictable semantics.

Yes, that seems like a reasonable preventive measure for now.
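
As a rough illustration of the semantics of that check, here is a
userspace model. The sketch_* types and sketch_nfs_mount_check() are
hypothetical stand-ins for current, nsproxy, init_net, and the check in
nfs_get_sb; this is not kernel code:

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel structures named in the check. */
struct sketch_net {
	int id;
};

struct sketch_nsproxy {
	struct sketch_net *net_ns;
};

struct sketch_task {
	struct sketch_nsproxy *nsproxy;
};

/* Models the initial network namespace (init_net in the kernel). */
static struct sketch_net sketch_init_net = { 0 };

/*
 * Models the suggested check: refuse the mount unless the mounting
 * task is in the initial network namespace.
 */
static int sketch_nfs_mount_check(const struct sketch_task *task)
{
	if (task->nsproxy->net_ns != &sketch_init_net)
		return -EINVAL;
	return 0;
}
```

A task whose nsproxy points at sketch_init_net gets 0; a task in any
other namespace gets -EINVAL, so the mount fails predictably instead of
silently using the wrong namespace.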

	-Matt

