* nfsd and containers
@ 2009-01-04 2:54 J. Bruce Fields
From: J. Bruce Fields @ 2009-01-04 2:54 UTC
To: containers-qjLDD68F18O7TbgM5vRIOg
Does anyone have any ideas about how the kernel's nfsd should interact
(if at all) with network namespaces?
I'm initially interested because I've been experimenting with modifying
the server to allow it to present different exported filesystems
depending on which ip address it's accessed through. One way to do that
might be by modifying the kernel to behave as though there's a separate
nfsd service per network namespace; then we'd need little or no
modification of the userspace support daemons (statd, the portmapper,
etc.)--just start multiple instances of them in separate network
namespaces and teach the kernel to route requests to them to the
corresponding loopback interface. (That would work at least for daemons
that communicate with the kernel exclusively using rpc over loopback.
We could perhaps do something similar with the various /proc and nfsctl
interfaces.)
I'm also curious more generally whether anyone's thought about how nfsd
should behave in the presence of containers.
(Also, I take it the sysfs problem described in
http://lwn.net/Articles/295587/ is still unsolved?)
--b.
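The per-namespace idea in the message above can be illustrated with a small userspace model. This is only a sketch of the lookup it implies (destination address, then owning network namespace, then that namespace's export list), not kernel code and not taken from any patch. The names `NetNamespace` and `NfsdRouter`, the addresses, and the export paths are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NetNamespace:
    """Hypothetical stand-in for a network namespace with its own nfsd state."""
    name: str
    addresses: List[str]                               # IP addresses owned by this namespace
    exports: List[str] = field(default_factory=list)   # paths exported from this namespace


class NfsdRouter:
    """Toy dispatcher: chooses an export list from the address a request arrived on."""

    def __init__(self, namespaces: List[NetNamespace]) -> None:
        self._by_addr: Dict[str, NetNamespace] = {}
        for ns in namespaces:
            for addr in ns.addresses:
                # An address belongs to exactly one namespace in this model.
                if addr in self._by_addr:
                    raise ValueError(f"{addr} is claimed by two namespaces")
                self._by_addr[addr] = ns

    def namespace_for(self, dest_addr: str) -> Optional[NetNamespace]:
        """Return the namespace owning dest_addr, or None if nobody owns it."""
        return self._by_addr.get(dest_addr)

    def exports_for(self, dest_addr: str) -> List[str]:
        """Return the exports visible through dest_addr (empty if unknown)."""
        ns = self.namespace_for(dest_addr)
        return list(ns.exports) if ns is not None else []


# Illustrative data only (documentation address range, made-up paths).
router = NfsdRouter([
    NetNamespace("ns-a", ["192.0.2.10"], ["/srv/projects"]),
    NetNamespace("ns-b", ["192.0.2.20"], ["/srv/scratch", "/srv/home"]),
])
print(router.exports_for("192.0.2.10"))
print(router.exports_for("192.0.2.20"))
print(router.exports_for("192.0.2.99"))
```

In this model a client sees a different export list depending only on which server address it contacted, which is the behaviour the message is after. How the kernel would route RPC upcalls to the per-namespace daemons is not modelled here.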
* Re: nfsd and containers
@ 2009-01-05 16:40 Serge E. Hallyn
From: Serge E. Hallyn @ 2009-01-05 16:40 UTC
To: J. Bruce Fields; +Cc: containers-qjLDD68F18O7TbgM5vRIOg
Quoting J. Bruce Fields (bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org):
> Does anyone have any ideas about how the kernel's nfsd should interact
> (if at all) with network namespaces?
>
> I'm initially interested because I've been experimenting with modifying
> the server to allow it to present different exported filesystems
> depending on which ip address it's accessed through. One way to do that
> might be by modifying the kernel to behave as though there's a separate
> nfsd service per network namespace; then we'd need little or no
> modification of the userspace support daemons (statd, the portmapper,
> etc.)--just start multiple instances of them in separate network
> namespaces and teach the kernel to route requests to them to the
> corresponding loopback interface. (That would work at least for daemons
> that communicate with the kernel exclusively using rpc over loopback.
> We could perhaps do something similar with the various /proc and nfsctl
> interfaces.)
>
> I'm also curious more generally whether anyone's thought about how nfsd
> should behave in the presence of containers.

I suspect Eric has had more detailed thoughts than I so I'm waiting to
see his response. Matt sent a patchset to deal with sunrpc/nfs/uts
namespaces which I haven't yet had a chance to look at, so he might also
have some good comments at this point.

> (Also, I take it the sysfs problem described in
> http://lwn.net/Articles/295587/ is still unsolved?)

Sysfs tagging is not yet implemented, but 2.6.29 is supposed to have the
netdev hack-fix which simply doesn't show /sys/class/net entries for
tasks which are not in the initial network namespace. That allows
network namespaces to be used. Checking gitweb real quick, yes,
CONFIG_NET_NS is an option in Linus' current git tree.
-serge
* Re: nfsd and containers
@ 2009-01-05 22:55 Matt Helsley
From: Matt Helsley @ 2009-01-05 22:55 UTC
To: Serge E. Hallyn; +Cc: J. Bruce Fields, containers-qjLDD68F18O7TbgM5vRIOg
On Mon, 2009-01-05 at 10:40 -0600, Serge E. Hallyn wrote:
> Quoting J. Bruce Fields (bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org):
> > Does anyone have any ideas about how the kernel's nfsd should interact
> > (if at all) with network namespaces?
> >
> > I'm initially interested because I've been experimenting with modifying
> > the server to allow it to present different exported filesystems
> > depending on which ip address it's accessed through. One way to do that
> > might be by modifying the kernel to behave as though there's a separate
> > nfsd service per network namespace; then we'd need little or no
> > modification of the userspace support daemons (statd, the portmapper,
> > etc.)--just start multiple instances of them in separate network
> > namespaces and teach the kernel to route requests to them to the
> > corresponding loopback interface. (That would work at least for daemons
> > that communicate with the kernel exclusively using rpc over loopback.
> > We could perhaps do something similar with the various /proc and nfsctl
> > interfaces.)

This sounds good. It is somewhat related to UTS namespaces because the
hostname reported from the UTS namespace and the DNS name might not
match. I haven't thoroughly explored all the combinations but I suspect
the use of network namespaces could play a part in that depending on
what choices the administrator(s) make.

> > I'm also curious more generally whether anyone's thought about how nfsd
> > should behave in the presence of containers.

I have only thought about how nfsd should see clients in different UTS
and mount namespaces. The conclusion I came to was NFS should use
whatever name was used with the original mount. So if we mounted an NFS
export and then create a container that uses that mount then it should
use the hostname of the original container. However if the child
container then does another NFS mount then the child's hostname ought to
be used for the new mount.

> I suspect Eric has had more detailed thoughts than I so I'm waiting
> to see his response. Matt sent a patchset to deal with sunrpc/nfs/uts
> namespaces which I haven't yet had a chance to look at, so he might
> also have some good comments at this point.

Seems there's some confusion -- that patchset went out privately during
the holidays. Now is a good time to repost it publicly though. I'll cc
folks on this thread. I'm also planning on cc'ing nfs-devel to get their
thoughts.

Cheers,
-Matt Helsley
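The naming rule proposed in the message above (an NFS mount keeps the hostname that was in effect when it was made, and a child container's own hostname applies only to mounts the child makes itself) can be sketched as a small model. This is an illustration of the rule only, under the assumption that the name is captured once at mount time. `Container`, `NfsMount`, the hostnames, and the export strings are hypothetical and are not taken from the patchset mentioned in the thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class NfsMount:
    """An NFS mount together with the client name captured when it was made."""
    export: str
    client_name: str


@dataclass
class Container:
    """Hypothetical container: a UTS hostname plus the mounts visible to it."""
    hostname: str
    mounts: List[NfsMount] = field(default_factory=list)

    def mount_nfs(self, export: str) -> NfsMount:
        # The name is fixed at mount time from this container's hostname.
        mount = NfsMount(export, self.hostname)
        self.mounts.append(mount)
        return mount

    def spawn_child(self, hostname: str) -> "Container":
        # The child gets its own hostname but inherits existing mounts
        # unchanged, so those keep the name of the container that made them.
        return Container(hostname, list(self.mounts))


parent = Container("host-a")
parent.mount_nfs("server:/export/data")

child = parent.spawn_child("guest-1")
child.mount_nfs("server:/export/scratch")

for m in child.mounts:
    print(m.export, "->", m.client_name)
```

The inherited mount still reports the parent's name while the child's new mount reports the child's, and the parent's own mount list is unaffected by what the child does.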
* Re: nfsd and containers
@ 2009-01-06 15:41 J. Bruce Fields
From: J. Bruce Fields @ 2009-01-06 15:41 UTC
To: Matt Helsley; +Cc: containers-qjLDD68F18O7TbgM5vRIOg
On Mon, Jan 05, 2009 at 02:55:18PM -0800, Matt Helsley wrote:
> On Mon, 2009-01-05 at 10:40 -0600, Serge E. Hallyn wrote:
> > Quoting J. Bruce Fields (bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org):
> > > Does anyone have any ideas about how the kernel's nfsd should interact
> > > (if at all) with network namespaces?
> > >
> > > I'm initially interested because I've been experimenting with modifying
> > > the server to allow it to present different exported filesystems
> > > depending on which ip address it's accessed through. One way to do that
> > > might be by modifying the kernel to behave as though there's a separate
> > > nfsd service per network namespace; then we'd need little or no
> > > modification of the userspace support daemons (statd, the portmapper,
> > > etc.)--just start multiple instances of them in separate network
> > > namespaces and teach the kernel to route requests to them to the
> > > corresponding loopback interface. (That would work at least for daemons
> > > that communicate with the kernel exclusively using rpc over loopback.
> > > We could perhaps do something similar with the various /proc and nfsctl
> > > interfaces.)
> >
> This sounds good. It is somewhat related to UTS namespaces because the
> hostname reported from the UTS namespace and the DNS name might not
> match. I haven't thoroughly explored all the combinations but I suspect
> the use of network namespaces could play a part in that depending on
> what choices the administrator(s) make.
>
> > > I'm also curious more generally whether anyone's thought about how nfsd
> > > should behave in the presence of containers.
>
> I have only thought about how nfsd should see clients in different UTS
> and mount namespaces. The conclusion I came to was NFS should use
> whatever name was used with the original mount. So if we mounted an NFS
> export and then create a container that uses that mount then it should
> use the hostname of the original container. However if the child
> container then does another NFS mount then the child's hostname ought to
> be used for the new mount.

I'm interested in what needs to be done on the server side rather than
on the client side.

The server is perhaps less of a natural fit for the containers project
since, unlike the rest of the kernel, it doesn't exist to perform
services on behalf of local processes--it's more like a full-blown
application that happens to run inside the kernel.

Still, it might make sense to use network namespaces to implement
something like Apache's ip-based virtual hosts, by presenting the
processes in each network namespace with the illusion that they control
their own private nfs server.

On the other hand, requiring administrators to set up multiple network
namespaces just to configure this kind of nfs service may be cumbersome.
--b.
* nfsd and containers
@ 2016-02-06 0:19 Kjetil Jørgensen
From: Kjetil Jørgensen @ 2016-02-06 0:19 UTC
To: linux-nfs
Hi,

In an effort to fit everything into the same mold, we're trying to run
the in-kernel NFS server inside Docker containers. It works great, with
one exception: an unclean shutdown of the container leaves behind the
knfsd threads, which hold on to references to, e.g., the mount namespace
in which the exported filesystem lives.

We've done some patching of Docker, so we use Ceph RBD devices, mounted
into the Docker container, and a veth pair for networking. The "init"
process in the container's pid namespace has a notion of graceful
shutdown, where it echoes 0 into /proc/fs/nfsd/threads. When the
container's init process gets an untrappable signal, the kernel threads,
not really being part of the pid namespace, are left behind. The knfsd
threads hold on to references to the mount namespace, which leaves the
filesystem mounted. Yes, we can signal the kernel nfsd threads from the
outside, which does let them terminate, but it's not ideal.

A simple-ish test case:
  unshare -n -p -m -f --mount-proc -- /usr/sbin/rpc.nfsd

Wishful thinking: the kernel nfsd threads that were spawned by rpc.nfsd
go away with the pid namespace.
Actual outcome: the kernel nfsd threads stick around until signalled.

The actual question(s):
- Am I missing something?
- Is this folly, and should it be abandoned post haste? (In essence, go
  find a userspace NFS implementation.)
- In the case where this is folly but we still decide to plow ahead, is
  there any way I can determine which namespaces the kernel nfsd threads
  hold references to? (That would make signalling the "right" nfsd
  threads easier, as I lost my reference to the correct /proc/fs/nfsd
  along with the process I had in the corresponding pid namespace.)

Cheers,
--
Kjetil Joergensen <kjetil@medallia.com>
Medallia Inc
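The graceful-shutdown step described in the message above, where the container's init writes 0 to /proc/fs/nfsd/threads, might look roughly like the following. This is a minimal sketch and not the poster's actual init process: the function names are hypothetical, and the only detail taken from the message is the control file path and the value written. It also shows why the approach cannot cover the failure case in question, since a handler can only be installed for signals that can be caught.

```python
import signal
from pathlib import Path

# Control file named in the message; writing 0 asks the kernel to stop nfsd threads.
NFSD_THREADS = Path("/proc/fs/nfsd/threads")


def stop_nfsd(threads_file: Path = NFSD_THREADS) -> None:
    """Write 0 to the nfsd thread-count file (equivalent to `echo 0 > file`)."""
    threads_file.write_text("0\n")


def install_graceful_shutdown(threads_file: Path = NFSD_THREADS) -> None:
    """Stop nfsd before exiting when a catchable termination signal arrives."""

    def handler(signum, frame):
        stop_nfsd(threads_file)
        raise SystemExit(0)

    # Only catchable signals can be handled. SIGKILL never reaches this code,
    # which is the unclean-shutdown case that leaves the threads behind.
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, handler)
```

An init process would call `install_graceful_shutdown()` once at startup, from its main thread, and then wait on its children. The `threads_file` parameter exists only so the logic can be exercised against an ordinary file instead of the real control file.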