From: Daniel Lezcano <dlezcano@fr.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: hadi@cyberus.ca, Dmitry Mishin <dim@openvz.org>,
	Stephen Hemminger <shemminger@osdl.org>,
	netdev@vger.kernel.org,
	Linux Containers <containers@lists.osdl.org>
Subject: Re: Network virtualization/isolation
Date: Tue, 28 Nov 2006 15:15:26 +0100
Message-ID: <456C447E.5090703@fr.ibm.com>
In-Reply-To: <m1y7pzoptr.fsf@ebiederm.dsl.xmission.com>

Eric W. Biederman wrote:

[ snip ]

>>
>> The packets arrive at the real device and go through the routing
>> engine. From this point, the route used is enough to know which
>> container the traffic can go to and which subset of sockets is
>> assigned to that container.
> 
> Note this has potentially the highest overhead of them all because
> this is the only approach in which it is mandatory to inspect the
> network packets to see which container they are in.

If the container is part of the route information, then when you use
the route you already have the destination container with it. I don't
see the overhead here.
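
To make the point concrete, here is a toy Python model (not kernel code, and not the actual L3 patch): a hypothetical route table whose entries each carry a container tag, so the lookup that has to happen anyway returns the owner with no separate packet inspection. The `ROUTES` table, the addresses, and the `lookup` helper are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical route table: each entry pairs a destination prefix with
# the container that owns it. Names and addresses are illustrative only.
ROUTES = [
    (ipaddress.ip_network("10.0.0.1/32"), "container-a"),
    (ipaddress.ip_network("10.0.0.2/32"), "container-b"),
    (ipaddress.ip_network("0.0.0.0/0"), "host"),
]


def lookup(dst):
    """Longest-prefix match; the matched route already names the container."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, owner) for net, owner in ROUTES if addr in net]
    # The default route guarantees at least one match in this toy table.
    _, owner = max(matches, key=lambda m: m[0].prefixlen)
    return owner
```

In this model the container falls out of the same lookup that selects the route, which is the sense in which no extra per-packet step is needed.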

> 
> My real problem with this approach besides seriously complicating
> the administration by not delegating it is that you lose enormous
> amounts of power.

I don't understand why you say administration is more complicated.
unshare -> ifconfig

1 container = 1 IP

[ snip ]

> So you have two columns that you rate these things that I disagree
> with, and you left out what the implications are for code maintenance.
> 
> 1) Network setup.
>    Past a certain point both bind filtering and Daniel's L3 use a new
>    paradigm for managing the network code and become nearly impossible for
>    system administrators to understand.  The classic one is routing packets
>    between machines over the loopback interface by accident. Huh?

What is this new paradigm you are talking about?

> 
> The L2. Network setup is simply the cost of setting up a multiple
> machine network.  This is more complicated but it is well understood
> and well documented today.  Plus for the common cases it is easy to
> get a tool to automate this for you.  When you get a complicated
> network this wins hands down because the existing tools work and
> you don't have to retrain your sysadmins to understand what is
> happening.

unshare -> (guest) add mac address
           (host)  add mac address
           (guest) set ip address
           (host)  set ip address
           (host)  set up bridge

1 container = 2 net devices (root + guest), 2 IPs, 2 mac addresses, 1 bridge.
100 containers = 200 net devices, 200 IPs, 200 mac addresses, 1 bridge.
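
The scaling of the two setups above can be tallied with a small Python sketch. It assumes the per-container figures given in this message (L3: one IP per container; L2: a host/guest device pair per container sharing a single bridge); the function names are invented for illustration.

```python
def l3_resources(containers):
    """L3 isolation as described here: one IP address per container."""
    return {"ips": containers}


def l2_resources(containers):
    """L2 isolation as described here: a root + guest device pair per
    container, each with its own IP and MAC, all attached to one bridge."""
    return {
        "net_devices": 2 * containers,
        "ips": 2 * containers,
        "mac_addresses": 2 * containers,
        "bridges": 1 if containers else 0,
    }
```

Calling `l2_resources(100)` reproduces the 200 / 200 / 200 / 1 figures quoted above, against 100 IPs for `l3_resources(100)`.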

> 
> 2) Runtime Overhead.
> 
> Your analysis is confused. Bind/Accept filter is much cheaper than
> doing a per packet evaluation in the route cache of which container
> it belongs to.  Among other things Bind/Accept filtering allows all
> of the global variables in the network stack to remain global and
> only touches a slow path.  So it is both very simple and very cheap.
> 
> Next in line comes L2 using real network devices, and Daniel's
> L3 thing.  Because there are multiple instances of the networking data
> structures we have an extra pointer indirection.

There is no extra networking data structure instantiation in
Daniel's L3.
> 
> Finally we get L2 with an extra network stack traversal, because
> we either need the full power of netfilter and traffic shaping
> gating access to what a node is doing or we simply don't have
> enough real network interfaces.  I assert that we can optimize
> the lack of network interfaces away by optimizing the drivers
> once this becomes an interesting case.
> 
> 3) Long Term Code Maintenance Overhead.
> 
> - A pure L2 implementation.  There is a big one time cost of
>   changing all of the variable accesses.  Once that transition
>   is complete things just work.  All code is shared so there
>   is no real overhead.
> 
> - Bind/Connect/Accept filtering.  There are so few places in
>   the code this is easy to maintain without sharing code with
>   everyone else.

For isolation too? Can we build network migration on top of that?

> 
> - Daniel's L3.  A big mass of special purpose code with peculiar
>   semantics that no one else in the network stack cares about
>   but is right in the middle of the code.

Thanks Eric for all your comments.

   -- Daniel
