From: Herbert Poetzl <herbert@13thfloor.at>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Sam Vilain <sam@vilain.net>, Kirill Korotaev <dev@sw.ru>,
Daniel Lezcano <dlezcano@fr.ibm.com>,
Andrey Savochkin <saw@swsoft.com>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
serue@us.ibm.com, haveblue@us.ibm.com, clg@fr.ibm.com,
Andrew Morton <akpm@osdl.org>,
devel@openvz.org, viro@ftp.linux.org.uk,
Alexey Kuznetsov <alexey@sw.ru>
Subject: Re: [patch 2/6] [Network namespace] Network device sharing by view
Date: Wed, 28 Jun 2006 19:18:37 +0200 [thread overview]
Message-ID: <20060628171836.GD6440@MAIL.13thfloor.at> (raw)
In-Reply-To: <m11wt9gulz.fsf@ebiederm.dsl.xmission.com>
On Wed, Jun 28, 2006 at 09:36:40AM -0600, Eric W. Biederman wrote:
> Herbert Poetzl <herbert@13thfloor.at> writes:
>
> > On Wed, Jun 28, 2006 at 06:31:05PM +1200, Sam Vilain wrote:
> >> Eric W. Biederman wrote:
> >> > Have a few more network interfaces for a layer 2 solution
> >> > is fundamental. Believing without proof and after arguments
> >> > to the contrary that you have not contradicted that a layer 2
> >> > solution is inherently slower is non-productive. Arguing
> >> > that a layer 2 only solution most prove itself on guest to guest
> >> > communication is also non-productive.
> >> >
> >>
> >> Yes, it does break what some people consider to be a sanity
> >> condition when you don't have loopback anymore within a guest. I
> >> once experimented with using 127.* addresses for per-guest loopback
> >> devices with vserver to fix this, but that couldn't work without
> >> fixing glibc to not make assumptions deep in the bowels of the
> >> resolver. I logged a fault with gnu.org and you can guess where it
> >> went :-).
> >
> > this is what the lo* patches address, by providing
> > the required loopback isolation and providing lo
> > inside a guest (i.e. it looks and feels like a
> > normal system, except that you cannot modify the
> > interfaces from inside)
>
> Ok. This is new. How do you talk between guests now?
> Before those patches it was through IP addresses on the loopback
> interface as I recall.
no, that was probably your assumption, the IPs are
assigned (in a perfectly normal way) to the interfaces
(e.g. eth0 carries some IPs for guest A and B, eth1
carries others for guest C). the way the linux network
stack works, local addresses (i.e. those of A,B and C)
will automatically communicate via loopback (as they
are local) while outbound traffic will use the proper
interface (nothing is changed here)
the difference in the lo patches is, that we allow to
use the 'localhost' ip range (127.x.x.x) by isolating
traffic (in this range) on the loopback interface
(which typically allows to have 127.0.0.1 and lo
visible inside a guest)
> >> > With a guest with 4 IPs
> >> > 10.0.0.1 192.168.0.1 172.16.0.1 127.0.0.1
> >> > How do you make INADDR_ANY work with just filtering at bind time?
> >> >
> >>
> >> It used to just bind to the first one. Don't know if it still does.
> >
> > no, it _alway_ binds to INADDR_ANY and checks
> > against other sockets (in the same context)
> > comparing the lists of assigned IPs (the subset)
> >
> > so all checks happen at bind/connect time and
> > always against the set of IPs, only exception is
> > a performance optimization we do for single IP
> > guests (where INADDR_ANY gets rewritten to the
> > single IP)
>
> What is the mechanism there?
>
> My rough extrapolation says this mechanism causes problems when
> migrating between machines.
that might be, as we do not consider migration such
important as other folks do :)
> In particular it sounds like only one process can bind to *:80, even
> if it is only allowed to accept connections from a subset of those
> IPs.
no, guest A,B and C can all bind to *:80 and coexist
quite fine, given that they do not have any IP in
the intersection of their subsets (which is checked
at bind time)
> So if on another machine I bound something to *:80 and only allowed to
> use a different set of IPs and then attempted to migrate it, the
> migration would fail because I could not restart the application,
> with all of it's layer 3 resources.
actually I do not see why, unless the destination
has a conflict on the ip subset, in which case you
would end up with a migrated, but not working guest :)
> To be clear I assume when I migrate I always take my IP address or
> addresses with me.
that's fine, the only requirement would be that the
host has a superset of the IP addresses used by the
guests ...
HTC,
Herbert
> Eric
next prev parent reply other threads:[~2006-06-28 17:18 UTC|newest]
Thread overview: 113+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-06-09 21:02 [RFC] [patch 0/6] [Network namespace] introduction dlezcano
2006-06-09 21:02 ` [RFC] [patch 1/6] [Network namespace] Network namespace structure dlezcano
2006-06-09 21:02 ` [RFC] [patch 2/6] [Network namespace] Network device sharing by view dlezcano
2006-06-11 10:18 ` Andrew Morton
2006-06-18 18:53 ` Al Viro
2006-06-26 9:47 ` Andrey Savochkin
2006-06-26 13:02 ` Herbert Poetzl
2006-06-26 14:05 ` Eric W. Biederman
2006-06-26 14:08 ` Andrey Savochkin
2006-06-26 18:28 ` Herbert Poetzl
2006-06-26 18:59 ` Eric W. Biederman
2006-06-26 14:56 ` Daniel Lezcano
2006-06-26 15:21 ` Eric W. Biederman
2006-06-26 15:27 ` Andrey Savochkin
2006-06-26 15:49 ` Daniel Lezcano
2006-06-26 16:40 ` Eric W. Biederman
2006-06-26 18:36 ` Herbert Poetzl
2006-06-26 19:35 ` Eric W. Biederman
2006-06-26 20:02 ` Herbert Poetzl
2006-06-26 20:37 ` Eric W. Biederman
2006-06-26 21:26 ` Herbert Poetzl
2006-06-26 21:59 ` Ben Greear
2006-06-26 22:11 ` Eric W. Biederman
2006-06-27 9:09 ` Andrey Savochkin
2006-06-27 15:48 ` Herbert Poetzl
2006-06-27 16:19 ` Andrey Savochkin
2006-06-27 16:40 ` Eric W. Biederman
2006-06-26 22:13 ` Ben Greear
2006-06-26 22:54 ` Herbert Poetzl
2006-06-26 23:08 ` Ben Greear
2006-06-27 16:07 ` Ben Greear
2006-06-27 22:48 ` Herbert Poetzl
2006-06-27 9:11 ` Andrey Savochkin
2006-06-27 9:34 ` Daniel Lezcano
2006-06-27 9:38 ` Andrey Savochkin
2006-06-27 11:21 ` Daniel Lezcano
2006-06-27 11:52 ` Eric W. Biederman
2006-06-27 16:02 ` Herbert Poetzl
2006-06-27 16:47 ` Eric W. Biederman
2006-06-27 17:19 ` Ben Greear
2006-06-27 22:52 ` Herbert Poetzl
2006-06-27 23:12 ` Dave Hansen
2006-06-27 23:42 ` Alexey Kuznetsov
2006-06-28 3:38 ` Eric W. Biederman
2006-06-28 13:36 ` Herbert Poetzl
2006-06-28 13:53 ` jamal
2006-06-28 14:19 ` Andrey Savochkin
2006-06-28 16:17 ` jamal
2006-06-28 16:58 ` Andrey Savochkin
2006-06-28 17:17 ` Eric W. Biederman
2006-06-28 17:04 ` Herbert Poetzl
2006-06-28 14:39 ` Eric W. Biederman
2006-06-30 1:41 ` Sam Vilain
2006-06-29 21:07 ` Sam Vilain
2006-06-29 22:14 ` strict isolation of net interfaces Cedric Le Goater
2006-06-30 2:39 ` Serge E. Hallyn
2006-06-30 2:49 ` Sam Vilain
2006-07-03 14:53 ` Andrey Savochkin
2006-07-04 3:00 ` Sam Vilain
2006-07-04 12:29 ` Daniel Lezcano
2006-07-04 13:13 ` Sam Vilain
2006-07-04 13:19 ` Daniel Lezcano
2006-06-30 8:56 ` Cedric Le Goater
2006-07-03 13:36 ` Herbert Poetzl
2006-06-30 12:23 ` Daniel Lezcano
2006-06-30 14:20 ` Eric W. Biederman
2006-06-30 15:22 ` Daniel Lezcano
2006-06-30 17:58 ` Eric W. Biederman
2006-06-30 16:14 ` Serge E. Hallyn
2006-06-30 17:41 ` Eric W. Biederman
2006-06-30 18:09 ` Eric W. Biederman
2006-06-30 0:15 ` [patch 2/6] [Network namespace] Network device sharing by view jamal
2006-06-30 3:35 ` Herbert Poetzl
2006-06-30 7:45 ` Andrey Savochkin
2006-06-30 13:50 ` jamal
2006-06-30 15:01 ` Andrey Savochkin
2006-06-30 18:22 ` Eric W. Biederman
2006-06-30 21:51 ` jamal
2006-07-01 0:50 ` Eric W. Biederman
2006-06-28 14:21 ` Eric W. Biederman
2006-06-28 14:51 ` Eric W. Biederman
2006-06-27 16:49 ` Alexey Kuznetsov
2006-06-27 11:55 ` Andrey Savochkin
2006-06-27 9:54 ` Kirill Korotaev
2006-06-27 16:09 ` Herbert Poetzl
2006-06-27 16:29 ` Eric W. Biederman
2006-06-27 23:07 ` Herbert Poetzl
2006-06-28 4:07 ` Eric W. Biederman
2006-06-28 6:31 ` Sam Vilain
2006-06-28 14:15 ` Herbert Poetzl
2006-06-28 15:36 ` Eric W. Biederman
2006-06-28 17:18 ` Herbert Poetzl [this message]
2006-06-28 10:14 ` Cedric Le Goater
2006-06-28 14:11 ` Herbert Poetzl
2006-06-28 16:10 ` Eric W. Biederman
2006-07-06 9:45 ` Routing tables (Re: [patch 2/6] [Network namespace] Network device sharing by view) Kari Hurtta
2006-06-09 21:02 ` [RFC] [patch 3/6] [Network namespace] Network devices isolation dlezcano
2006-06-18 18:57 ` Al Viro
2006-06-09 21:02 ` [RFC] [patch 4/6] [Network namespace] Network inet " dlezcano
2006-06-09 21:02 ` [RFC] [patch 5/6] [Network namespace] ipv4 isolation dlezcano
2006-06-10 0:23 ` James Morris
2006-06-10 0:27 ` Rick Jones
2006-06-10 0:47 ` James Morris
2006-06-09 21:02 ` [RFC] [patch 6/6] [Network namespace] Network namespace debugfs dlezcano
2006-06-10 7:16 ` [RFC] [patch 0/6] [Network namespace] introduction Kari Hurtta
2006-06-16 4:23 ` Eric W. Biederman
2006-06-16 9:06 ` Daniel Lezcano
2006-06-16 9:22 ` Eric W. Biederman
2006-06-18 18:47 ` Al Viro
2006-06-20 21:21 ` Daniel Lezcano
2006-06-20 21:25 ` Al Viro
2006-06-20 22:45 ` Daniel Lezcano
2006-06-26 23:38 ` Patrick McHardy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060628171836.GD6440@MAIL.13thfloor.at \
--to=herbert@13thfloor.at \
--cc=akpm@osdl.org \
--cc=alexey@sw.ru \
--cc=clg@fr.ibm.com \
--cc=dev@sw.ru \
--cc=devel@openvz.org \
--cc=dlezcano@fr.ibm.com \
--cc=ebiederm@xmission.com \
--cc=haveblue@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=sam@vilain.net \
--cc=saw@swsoft.com \
--cc=serue@us.ibm.com \
--cc=viro@ftp.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).