* LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-01 2:12 Toerless Eckert
From: Toerless Eckert @ 2011-11-01 2:12 UTC
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
I am trying to understand if (and if so how) i can use LXC (or any
other comparable lightweight container option) to effectively
run applications on a linux system with two separate IP interfaces
as if they each had only access to a single IP interface.

Eg:
  eth0 with address and default-router learned by DHCP
    eg: address 10.1.1.2/24, default-router 10.1.1.254
    DNS prefix and DNS domain name for eth0 of course also learned by DHCP.

  eth1 with address and default-router learned by DHCP
    eg: address 10.2.1.a/24, default-router 10.2.1.254
    DNS prefix and DNS domain name for eth1 of course also learned by DHCP.

(no need for overlapping addresses).

So, i configure LXC accordingly (how...) for one eth0container, and one
eth1container. All processes running in eth0container will have all their
traffic use only eth0, all the ones in eth1container will only use eth1.

If this works, i'd love to get a pointer to an example config. The
ones i could find on the web looked as if they were using bridging
to attach multiple containers to ultimately the same single IP subnet
with the same default router (and thereby the same DNS prefix and DNS servers).

I can't see how LXC can make my case work without some additional kernel
support, because when either process1 or process2 opens, let's say, a
client socket and just calls connect(), then (AFAIK) the default linux routing
logic takes place, which would (AFAIK) first figure out where to route the
destination to (eth0 or eth1) and then pick the local IP address of that
interface as the socket's local IP address. And i don't understand how
LXC would make this decision process dependent on which container the process
is running in.

I guess one can create additional routing tables, one for each container,
and then use the fwmark on all sockets to have them use that container-specific
routing table, but it's not clear to me whether/how that is really
done in LXC.

Thanks a lot!
    Toerless
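
The fwmark idea in the last paragraph can be sketched with iproute2 policy routing. This is only a minimal sketch, not something LXC is known to do: the mark values (1, 2) and table numbers (101, 102) are arbitrary assumptions, the interface names and gateways are taken from the example above, and the hypothetical helper only builds the `ip` command lines (actually running them needs root).

```python
# Sketch only: build the iproute2 commands for per-mark routing tables.
# Mark values and table ids are illustrative assumptions, not LXC defaults.

def policy_routing_commands(mark, table, dev, gateway):
    """Return argv lists that route packets carrying `mark` out via `dev`."""
    return [
        # Packets carrying this fwmark consult the dedicated table...
        ["ip", "rule", "add", "fwmark", str(mark), "table", str(table)],
        # ...whose only default route leaves through the chosen interface.
        ["ip", "route", "add", "default", "via", gateway,
         "dev", dev, "table", str(table)],
    ]

plan = (policy_routing_commands(1, 101, "eth0", "10.1.1.254")
        + policy_routing_commands(2, 102, "eth1", "10.2.1.254"))

for argv in plan:
    print(" ".join(argv))
```

Each socket (or packet, via netfilter) still has to be marked for these rules to match, which is the part the question above leaves open.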
* Re: LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-01 3:19 Eric W. Biederman
From: Eric W. Biederman @ 2011-11-01 3:19 UTC
To: Toerless Eckert
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Toerless Eckert <Toerless.Eckert-vrlraubKdiR4tiELkoLHDcSSVFg4/55HhC4ANOJQIlc@public.gmane.org> writes:

> I am trying to understand if (and if so how) i can use LXC (or any
> other comparable lightweight container option) to effectively
> run applications on a linux system with two separate IP interfaces
> as if they each had only access to a single IP interface.
>
> Eg:
>   eth0 with address and default-router learned by DHCP
>     eg: address 10.1.1.2/24, default-router 10.1.1.254
>     DNS prefix and DNS domain name for eth0 of course also learned by DHCP.
>
>   eth1 with address and default-router learned by DHCP
>     eg: address 10.2.1.a/24, default-router 10.2.1.254
>     DNS prefix and DNS domain name for eth1 of course also learned by DHCP.
>
> (no need for overlapping addresses).

That sounds like L2 level isolation.

    ip link set eth1 netns XXXX

will let you move a network device to a chosen network namespace.

That is the easy, trivial case. Most people don't have multiple
physical interfaces, so tricky things have to happen.

Does that sound like what you are looking for?

Eric
* Re: LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-01 4:32 Toerless Eckert
From: Toerless Eckert @ 2011-11-01 4:32 UTC
To: Eric W. Biederman
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Toerless Eckert

Thanks, Eric

How do i configure eg: an LXC container to use a specific network name space XXXX ?

Also: if an app within some LXC container does a socket() and then a
bind(..INADDR_ANY...), how does the kernel know which subset of IP interfaces
it should bind to ? Does the process context have a network name space ?

And how do i create per namespace routing tables ?

Example or pointer to docs would be great. Or just walk me through the rough
outline of my use case...:

- create container e0procs, configure just the physical eth0 interface into it ??
  - without assigning an IP address ?
  - run a dhcp daemon from within container e0procs and that
    will correctly get ip address/mask and default route configured in a
    routing table solely used by container e0procs ?
  - container e0procs DHCPd will also populate containerized /etc/resolv.conf with
    eth0 domain prefix/DNS-servers...

- same approach for container c1procs, configure phys eth1 interface into it,
  start DHCP daemon inside it, get routing table and DNS
  for container c1procs from it.

Is that it ? If not, then how. If yes, then what type of routing table would
i actually see outside of the containers ? And back to the original question,
would socket(), bind(INADDR_ANY) from inside the containers work correctly ?

Thanks
    Toerless
* Re: LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-01 12:20 Eric W. Biederman
From: Eric W. Biederman @ 2011-11-01 12:20 UTC
To: Toerless Eckert
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Toerless Eckert <Toerless.Eckert-vrlraubKdiR4tiELkoLHDcSSVFg4/55HhC4ANOJQIlc@public.gmane.org> writes:

> Thanks, Eric
>
> How do i configure eg: an LXC container to use a specific network name space XXXX ?
>
> Also: if an app within some LXC container does a socket() and then a
> bind(..INADDR_ANY...), how does the kernel know which subset of IP interfaces
> it should bind to ? Does the process context have a network name space ?

The network namespace.

> And how do i create per namespace routing tables ?

Just like normal. From inside the network namespace you set up your
routing tables.

> Example or pointer to docs would be great. Or just walk me through the rough
> outline of my use case...:
>
> - create container e0procs, configure just the physical eth0 interface into it ??
>   - without assigning an IP address ?
>   - run a dhcp daemon from within container e0procs and that
>     will correctly get ip address/mask and default route configured in a
>     routing table solely used by container e0procs ?
>   - container e0procs DHCPd will also populate containerized /etc/resolv.conf with
>     eth0 domain prefix/DNS-servers...
>
> - same approach for container c1procs, configure phys eth1 interface into it,
>   start DHCP daemon inside it, get routing table and DNS
>   for container c1procs from it.
>
> Is that it ? If not, then how. If yes, then what type of routing table would
> i actually see outside of the containers ? And back to the original question,
> would socket(), bind(INADDR_ANY) from inside the containers work correctly ?

Yes. bind(INADDR_ANY) works correctly inside a network namespace.

A network namespace is, from an application perspective, like having a
separate copy of the networking stack.

Eric
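
To illustrate the point about bind(INADDR_ANY): the application code is the same inside and outside a network namespace, and only the set of interfaces behind the wildcard address differs. A minimal sketch in Python, where the function name `bind_any` is made up for illustration:

```python
import socket

def bind_any(port=0):
    """Bind a TCP socket to INADDR_ANY; port 0 lets the kernel pick a port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # "0.0.0.0" is INADDR_ANY: every IPv4 address visible to this process,
    # which means the addresses of the network namespace it is running in.
    sock.bind(("0.0.0.0", port))
    return sock

sock = bind_any()
addr, port = sock.getsockname()
print(addr, port)
sock.close()
```

Run unchanged inside a namespace that only holds eth0, the same wildcard bind would cover eth0's addresses (plus that namespace's loopback) and nothing else.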
* Re: LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-01 15:26 Toerless Eckert
From: Toerless Eckert @ 2011-11-01 15:26 UTC
To: Eric W. Biederman
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Toerless Eckert

Thanks for replying,

Sorry for asking what probably are a lot of naive questions, my excuse is
that the documentation is somewhat scattered/incomplete ? ;-))

I am trying to figure out how to minimize the virtualization to just the network
name space and instantiate it in a lightweight fashion that can easily
be retrofitted into some existing system.

What i would like to have is some simple program like "run-ns XXXX <program> <args>"
that would run <program> <args> within namespace XXXX.

So i was looking for some system call like set_ns(XXXX), but it seems there
is no API like that. Instead i guess i would need to have a "server" process
with pid XXXX that does an unshare(CLONE_NEWNET) and then listens for requests
to fork client programs, and run-ns would need to send a request to that XXXX
process to fork off <program> <args> and make sure that it can transfer all
the pre-existing context of run-ns like pid/gid(s), cwd, environment, and i don't
even know all the other context a linux process has these days. And then of course
communicate exit status of <program> back from XXXX to run-ns.

Meaning: it's great to have something like network name spaces, but without
some setns(XXXX) system call, it's really difficult to use these network name
spaces outside of a concept like LXC - which is a shame, because otherwise
the network name space would exactly be what i am looking for.

I guess i will have to look how much of an isolated network behavior i can get
by using fwmarks. Alas, there is no process-level fwmark context, but it has to be
set via setsockopt(SO_MARK) AFAIK, so one would need some LD_PRELOAD library or the
like to use it. *sigh* ;-))

Cheers
    Toerless
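
For reference, the per-socket marking mentioned above looks roughly like this. It is a sketch of the single setsockopt(SO_MARK) call only, not of an LD_PRELOAD shim (which would have to be written in C and wrap socket() for every program); the helper name `mark_socket` is hypothetical, and the fallback value 36 is the constant from the generic Linux socket headers, assumed here in case the Python build does not export it.

```python
import socket

# SO_MARK is Linux-specific. Fall back to 36 (asm-generic value) if this
# Python build does not expose the constant; that fallback is an assumption.
SO_MARK = getattr(socket, "SO_MARK", 36)

def mark_socket(sock, mark):
    """Tag all packets sent on `sock` with the given fwmark."""
    # The kernel requires CAP_NET_ADMIN for this option; without it the
    # call fails with PermissionError.
    sock.setsockopt(socket.SOL_SOCKET, SO_MARK, mark)
    return sock

# Typical use (as root):
#   s = mark_socket(socket.socket(socket.AF_INET, socket.SOCK_STREAM), 1)
#   s.connect(("10.1.1.254", 80))
```

Because the mark lives on the socket and not on the process, every program would need this call or a wrapper around it, which is the inconvenience described above.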
* Re: LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-01 15:55 Daniel Lezcano
From: Daniel Lezcano @ 2011-11-01 15:55 UTC
To: Toerless Eckert
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eric W. Biederman

On 11/01/2011 04:26 PM, Toerless Eckert wrote:

> Thanks for replying,
>
> Sorry for asking what probably are a lot of naive questions, my excuse is
> that the documentation is somewhat scattered/incomplete ? ;-))
>
> I am trying to figure out how to minimize the virtualization to just the network
> name space and instantiate it in a lightweight fashion that can easily
> be retrofitted into some existing system.
>
> What i would like to have is some simple program like "run-ns XXXX <program> <args>"
> that would run <program> <args> within namespace XXXX.

Did you look at the lxc-execute command ?

http://lxc.sourceforge.net/man/lxc.html

the "Quick Start" section, third line.

  -- Daniel
* Re: LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-01 17:17 Eric W. Biederman
From: Eric W. Biederman @ 2011-11-01 17:17 UTC
To: Toerless Eckert
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Toerless Eckert <Toerless.Eckert-vrlraubKdiR4tiELkoLHDcSSVFg4/55HhC4ANOJQIlc@public.gmane.org> writes:

> Thanks for replying,
>
> Sorry for asking what probably are a lot of naive questions, my excuse is
> that the documentation is somewhat scattered/incomplete ? ;-))
>
> I am trying to figure out how to minimize the virtualization to just the network
> name space and instantiate it in a lightweight fashion that can easily
> be retrofitted into some existing system.
>
> What i would like to have is some simple program like "run-ns XXXX <program> <args>"
> that would run <program> <args> within namespace XXXX.
>
> So i was looking for some system call like set_ns(XXXX), but it seems there
> is no API like that. Instead i guess i would need to have a "server" process
> with pid XXXX that does an unshare(CLONE_NEWNET) and then listens for requests
> to fork client programs, and run-ns would need to send a request to that XXXX
> process to fork off <program> <args> and make sure that it can transfer all
> the pre-existing context of run-ns like pid/gid(s), cwd, environment, and i don't
> even know all the other context a linux process has these days. And then of course
> communicate exit status of <program> back from XXXX to run-ns.
>
> Meaning: it's great to have something like network name spaces, but without
> some setns(XXXX) system call, it's really difficult to use these network name
> spaces outside of a concept like LXC - which is a shame, because otherwise
> the network name space would exactly be what i am looking for.

Definitely old docs.

    ip netns add
    ip netns delete
    ip netns exec

And yes, there is a setns system call.

If you don't have that, you have old bits. All of that should be merged
and documented.

Eric
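
Put together with the earlier `ip link set ... netns` hint, the use case from this thread could look roughly like the following. This is a hedged sketch: the namespace names, the use of `dhclient` as the DHCP client, and the helper `netns_setup_commands` are assumptions for illustration, and the helper only builds the command lines (running them requires root and an iproute2 with `ip netns` support).

```python
# Sketch only: the command sequence for giving one namespace sole use of
# one physical interface. Names and the DHCP client are assumptions.

def netns_setup_commands(name, dev, dhcp_client=("dhclient",)):
    """Return argv lists that move `dev` into namespace `name` and run DHCP."""
    return [
        # Create the named namespace.
        ["ip", "netns", "add", name],
        # Move the physical device out of the initial namespace into it.
        ["ip", "link", "set", dev, "netns", name],
        # Inside the namespace: bring up loopback and the device itself.
        ["ip", "netns", "exec", name, "ip", "link", "set", "lo", "up"],
        ["ip", "netns", "exec", name, "ip", "link", "set", dev, "up"],
        # The DHCP client then installs address and default route in the
        # namespace's own routing table.
        ["ip", "netns", "exec", name, *dhcp_client, dev],
    ]

for ns, dev in (("e0procs", "eth0"), ("e1procs", "eth1")):
    for argv in netns_setup_commands(ns, dev):
        print(" ".join(argv))
```

For per-namespace DNS, ip-netns(8) describes a convention where files under /etc/netns/NAME/ are bind-mounted over their /etc counterparts by `ip netns exec`; whether a given DHCP client writes resolv.conf there is a separate matter and is not shown here.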
* Re: LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-02 19:51 Toerless Eckert
From: Toerless Eckert @ 2011-11-02 19:51 UTC
To: Eric W. Biederman
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Toerless Eckert

Cool. Although i would claim my bits are "current", and your bits
are "bleeding edge". Just found the iproute2 package that supports this
on my gentoo by getting the latest cvs version only... ;-)

The biggest issue seems to be that setns() is only in 3.0 and later linux kernels
as far as i can see. Have to check whether that's a possible version on the
systems where i need it.

But at least this is technically cool and makes these network name spaces
much more flexibly usable (eg: inside and outside of LXC).

Cheers
    Toerless
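
With setns() available, the "run-ns XXXX <program> <args>" tool asked for earlier no longer needs a server process. The following is a minimal sketch under stated assumptions: it calls the C library's setns() through ctypes, assumes the namespace was created with `ip netns add` (which pins it under /var/run/netns), and the names `netns_path` and `run_in_netns` are invented for this example. It needs a kernel and libc that provide setns, plus root privileges.

```python
import ctypes
import os

CLONE_NEWNET = 0x40000000  # network-namespace flag from <linux/sched.h>

def netns_path(name):
    """Path where `ip netns add NAME` pins the namespace (assumed layout)."""
    return os.path.join("/var/run/netns", name)

def run_in_netns(name, argv):
    """Join network namespace `name`, then replace this process with argv."""
    libc = ctypes.CDLL(None, use_errno=True)
    fd = os.open(netns_path(name), os.O_RDONLY)
    try:
        # Switch only the network namespace of the calling process.
        if libc.setns(fd, CLONE_NEWNET) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(fd)
    # cwd, environment, uid/gid and exit status carry over without extra
    # work, because the same process simply execs the target program.
    os.execvp(argv[0], argv)

# Typical use (as root): run_in_netns("e0procs", ["ping", "-c", "1", "10.1.1.254"])
```

This is roughly the core of what `ip netns exec` does, minus its handling of per-namespace files under /etc.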
* Re: LXC L3 network isolation, yes/no ?, how ?
@ 2011-11-02 20:11 Renato Westphal
From: Renato Westphal @ 2011-11-02 20:11 UTC
To: Toerless Eckert
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eric W. Biederman

2011/11/2 Toerless Eckert <Toerless.Eckert-jNDFPZUTrfT6U6xlzOR6HsSSVFg4/55HhC4ANOJQIlc@public.gmane.org>:

> Cool. Although i would claim my bits are "current", and your bits
> are "bleeding edge". Just found the iproute2 package that supports this
> on my gentoo by getting the latest cvs version only... ;-)
>
> The biggest issue seems to be that setns() is only in 3.0 linux kernels
> as far as i can see. Have to check whether that's a possible version on the
> systems where i need it.

Backporting the setns syscall and related stuff to older linux kernels
is straightforward. I backported it to the 2.6.35.13 release and
everything is working fine. If you are interested, let me know.

-- 
Renato Westphal