From: Stephen Hemminger <shemminger@osdl.org>
To: ebiederm@xmission.com (Eric W. Biederman)
Cc: "Caitlin Bestler" <caitlinb@broadcom.com>,
"Kir Kolyshkin" <kir@openvz.org>,
devel@openvz.org, "Andrey Savochkin" <saw@sw.ru>,
alexey@sw.ru, "Linux Containers" <containers@lists.osdl.org>,
netdev@vger.kernel.org, sam@vilain.net
Subject: Re: [RFC] network namespaces
Date: Wed, 6 Sep 2006 17:53:12 -0700 [thread overview]
Message-ID: <20060906175312.7b4d0ddb@localhost.localdomain> (raw)
In-Reply-To: <m1irk0tw5d.fsf@ebiederm.dsl.xmission.com>
On Wed, 06 Sep 2006 17:25:50 -0600
ebiederm@xmission.com (Eric W. Biederman) wrote:
> "Caitlin Bestler" <caitlinb@broadcom.com> writes:
>
> > ebiederm@xmission.com wrote:
> >
> >>
> >>> Finally, as I understand both network isolation and network
> >>> virtualization (both level2 and level3) can happily co-exist. We do
> >>> have several filesystems in kernel. Let's have several network
> >>> virtualization approaches, and let a user choose. Is that makes
> >>> sense?
> >>
> >> If there are not compelling arguments for using both ways of
> >> doing it is silly to merge both, as it is more maintenance overhead.
> >>
> >
> > My reading is that full virtualization (Xen, etc.) calls for
> > implementing
> > L2 switching between the partitions and the physical NIC(s).
> >
> > The tradeoffs between L2 and L3 switching are indeed complex, but
> > there are two implications of doing L2 switching between partitions:
> >
> > 1) Do we really want to ask device drivers to support L2 switching for
> > partitions and something *different* for containers?
>
> No.
>
> > 2) Do we really want any single packet to traverse an L2 switch (for
> > the partition-style virtualization layer) and then an L3 switch
> > (for the container-style layer)?
>
> In general what has been done with layer 3 is to simply filter which
> processes can use which IP addresses and it all happens at socket
> creation time. So it is very cheap, and it can be done purely
> in the network layer without any driver intervention.
>
> Basically think of what is happening at layer 3 as an extremely light-weight
> version of traffic filtering.
>
> > The full virtualization solution calls for virtual NICs with distinct
> > MAC addresses. Is there any reason why this same solution cannot work
> > for containers (just creating more than one VNIC for the partition,
> > and then assigning each VNIC to a container?)
>
> The VNIC approach is the fundamental idea with the layer two networking
> and if we can push the work down into the device driver it so different
> destination macs show up a in different packet queues it should be
> as fast as a normal networking stack.
>
> Implementing VNICs so far is the only piece of containers that has
> come close to device drivers, and we can likely do it without device
> driver support (but with more cost). Basically this optimization
> is a subset of the Grand Unified Lookup idea.
>
> I think we can do a mergeable implementation without noticeable cost without
> when not using containers without having to resort to a grand unified lookup
> but I may be wrong.
>
> Eric
The problem with VNIC's is it won't work for all devices (without lots of
work), and for many device's it requires putting the device in promiscuous
mode. It also plays havoc with network access control devices.
next prev parent reply other threads:[~2006-09-07 0:53 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-08-15 14:20 [RFC] network namespaces Andrey Savochkin
2006-08-15 14:48 ` [PATCH 1/9] network namespaces: core and device list Andrey Savochkin
2006-08-16 14:46 ` Dave Hansen
2006-08-16 16:45 ` Stephen Hemminger
2006-08-15 14:48 ` [PATCH 2/9] network namespaces: IPv4 routing Andrey Savochkin
2006-08-15 14:48 ` [PATCH 3/9] network namespaces: playing and debugging Andrey Savochkin
2006-08-16 16:46 ` Stephen Hemminger
2006-08-16 17:22 ` Eric W. Biederman
2006-08-17 6:28 ` Andrey Savochkin
2006-08-17 8:30 ` Kirill Korotaev
2006-08-15 14:48 ` [PATCH 4/9] network namespaces: socket hashes Andrey Savochkin
2006-09-18 15:12 ` Daniel Lezcano
2006-09-20 16:32 ` Andrey Savochkin
2006-09-21 12:34 ` Daniel Lezcano
2006-08-15 14:48 ` [PATCH 5/9] network namespaces: async socket operations Andrey Savochkin
2006-09-22 15:33 ` Daniel Lezcano
2006-09-23 13:16 ` Andrey Savochkin
2006-08-15 14:48 ` [PATCH 6/9] allow proc_dir_entries to have destructor Andrey Savochkin
2006-08-15 14:48 ` [PATCH 7/9] net_device seq_file Andrey Savochkin
2006-08-15 14:48 ` [PATCH 8/9] network namespaces: device to pass packets between namespaces Andrey Savochkin
2006-08-15 14:48 ` [PATCH 9/9] network namespaces: playing with pass-through device Andrey Savochkin
2006-08-16 11:53 ` [RFC] network namespaces Serge E. Hallyn
2006-08-16 15:12 ` Alexey Kuznetsov
2006-08-16 17:35 ` Eric W. Biederman
2006-08-17 8:29 ` Kirill Korotaev
2006-09-05 13:34 ` Daniel Lezcano
2006-09-05 14:45 ` Eric W. Biederman
2006-09-05 15:32 ` Daniel Lezcano
2006-09-05 16:53 ` Herbert Poetzl
2006-09-05 18:27 ` Eric W. Biederman
2006-09-06 14:52 ` Kirill Korotaev
2006-09-06 15:09 ` [Devel] " Kir Kolyshkin
2006-09-06 9:10 ` Daniel Lezcano
2006-09-06 16:56 ` Herbert Poetzl
2006-09-06 17:37 ` [Devel] " Kir Kolyshkin
2006-09-06 18:34 ` Eric W. Biederman
2006-09-06 18:58 ` Kir Kolyshkin
2006-09-06 20:53 ` Cedric Le Goater
2006-09-06 23:06 ` Caitlin Bestler
2006-09-06 23:25 ` Eric W. Biederman
2006-09-07 0:53 ` Stephen Hemminger [this message]
2006-09-07 5:11 ` Eric W. Biederman
2006-09-07 8:25 ` Daniel Lezcano
2006-09-07 18:29 ` Eric W. Biederman
2006-09-08 6:02 ` Herbert Poetzl
2006-09-07 16:23 ` [Devel] " Kirill Korotaev
2006-09-07 17:27 ` Herbert Poetzl
2006-09-07 19:50 ` Eric W. Biederman
2006-09-08 13:10 ` Dmitry Mishin
2006-09-08 18:11 ` Herbert Poetzl
2006-09-09 7:57 ` Dmitry Mishin
2006-09-10 2:47 ` Herbert Poetzl
2006-09-10 3:41 ` Eric W. Biederman
2006-09-10 8:11 ` Dmitry Mishin
2006-09-10 11:48 ` Eric W. Biederman
2006-09-10 19:19 ` [Devel] " Herbert Poetzl
2006-09-10 7:45 ` Dmitry Mishin
2006-09-10 19:22 ` Herbert Poetzl
2006-09-12 3:26 ` Eric W. Biederman
2006-09-11 14:40 ` [Devel] " Daniel Lezcano
2006-09-11 14:57 ` Herbert Poetzl
2006-09-11 15:04 ` Daniel Lezcano
2006-09-11 15:10 ` Dmitry Mishin
2006-09-12 3:28 ` Eric W. Biederman
2006-09-12 7:38 ` Dmitry Mishin
2006-09-06 21:44 ` [Devel] " Daniel Lezcano
2006-09-06 17:58 ` Eric W. Biederman
2006-09-05 15:47 ` Kirill Korotaev
2006-09-05 17:09 ` Eric W. Biederman
2006-09-06 20:25 ` Cedric Le Goater
2006-09-06 20:40 ` Eric W. Biederman
2006-10-04 9:40 ` Daniel Lezcano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060906175312.7b4d0ddb@localhost.localdomain \
--to=shemminger@osdl.org \
--cc=alexey@sw.ru \
--cc=caitlinb@broadcom.com \
--cc=containers@lists.osdl.org \
--cc=devel@openvz.org \
--cc=ebiederm@xmission.com \
--cc=kir@openvz.org \
--cc=netdev@vger.kernel.org \
--cc=sam@vilain.net \
--cc=saw@sw.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).