netdev.vger.kernel.org archive mirror
* Network virtualization/isolation
@ 2006-10-25 15:51 Daniel Lezcano
  2006-10-23 20:01 ` Stephen Hemminger
  0 siblings, 1 reply; 68+ messages in thread
From: Daniel Lezcano @ 2006-10-25 15:51 UTC (permalink / raw)
  To: shemminger; +Cc: netdev

Hi Stephen,

Currently the work to enable containers in the kernel is making good
progress. The ipc, pid, utsname and filesystem system resources are
isolated/virtualized, relying on the namespace concept.

But network virtualization/isolation is still missing. Two approaches
are proposed: doing the isolation at layer 2 and at layer 3.

The first one instantiates a network device per namespace and adds a
peer network device to the "root namespace"; all the routing resources
are relative to the namespace. This work is done by Andrey Savochkin
from the OpenVZ project.
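
To make the layer 2 idea concrete, here is a toy model of a paired device, written as a minimal sketch and not taken from the OpenVZ patches. The names `Namespace`, `Device` and `create_pair` are hypothetical and exist only for illustration: each namespace owns its devices and a private routing table, and a frame sent on one end of the pair shows up on the peer in the root namespace.

```python
from typing import Dict, List, Optional, Tuple


class Namespace:
    """Toy namespace: owns its own devices and its own routing table."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.devices: Dict[str, "Device"] = {}
        self.routes: List[str] = []  # private to this namespace


class Device:
    """Toy network device that belongs to exactly one namespace."""

    def __init__(self, name: str, namespace: Namespace) -> None:
        self.name = name
        self.namespace = namespace
        self.peer: Optional["Device"] = None
        self.received: List[bytes] = []

    def transmit(self, frame: bytes) -> None:
        # A frame sent on one end of the pair is received on the other end.
        if self.peer is not None:
            self.peer.received.append(frame)


def create_pair(root: Namespace, child: Namespace,
                inner_name: str, outer_name: str) -> Tuple[Device, Device]:
    """Create a device in `child` and its peer in the root namespace."""
    inner = Device(inner_name, child)
    outer = Device(outer_name, root)
    inner.peer, outer.peer = outer, inner
    child.devices[inner_name] = inner
    root.devices[outer_name] = outer
    return inner, outer
```

In this model the root namespace sees only the peer device, so any forwarding between the container and the physical network has to go through the root namespace's own stack, which is where the extra cost of this approach would come from.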

The second relies on the routes and associates a network namespace
pointer with each route. When traffic is incoming, the packet follows
an input route and retrieves the associated network namespace. When
traffic is outgoing, the packet, identified by the network namespace it
is coming from, follows only the routes matching the same network
namespace. This work is done by me.
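
As a rough illustration of the layer 3 behaviour, the sketch below models a single shared route table in which every route carries a namespace tag. It is a toy model under stated assumptions, not the actual patch: `Route`, `RouteTable`, `input_namespace` and `output_route` are hypothetical names, and a plain string stands in for the namespace pointer.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Route:
    prefix: ipaddress.IPv4Network
    device: str
    netns: str  # stands in for the namespace pointer stored in the route


class RouteTable:
    """One shared table; every route is tagged with a namespace."""

    def __init__(self) -> None:
        self.routes: List[Route] = []

    def add(self, prefix: str, device: str, netns: str) -> None:
        self.routes.append(Route(ipaddress.ip_network(prefix), device, netns))

    @staticmethod
    def _longest_match(addr, candidates: Iterable[Route]) -> Optional[Route]:
        matches = [r for r in candidates if addr in r.prefix]
        return max(matches, key=lambda r: r.prefix.prefixlen, default=None)

    def input_namespace(self, dst: str) -> Optional[str]:
        # Incoming: the matching input route tells us the namespace.
        route = self._longest_match(ipaddress.ip_address(dst), self.routes)
        return route.netns if route else None

    def output_route(self, dst: str, netns: str) -> Optional[Route]:
        # Outgoing: only routes tagged with the sender's namespace are visible.
        own = [r for r in self.routes if r.netns == netns]
        return self._longest_match(ipaddress.ip_address(dst), own)
```

With a local route `10.0.1.5/32` tagged `c1`, `input_namespace("10.0.1.5")` would return `c1`, while an outgoing lookup from `c2` never sees the routes that belong to `c1`.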

IMHO, we need both approaches: layer 2 to be able to bring *very*
strong isolation for system containers, at a performance cost, and
layer 3 to be able to have good isolation for lightweight containers or
application containers when performance is more important.

Do you have any suggestions? What is your point of view on that?

Thanks in advance.

   -- Daniel

* RE: Network virtualization/isolation
@ 2006-11-25 16:35 Leonid Grossman
  2006-11-25 19:26 ` Eric W. Biederman
  0 siblings, 1 reply; 68+ messages in thread
From: Leonid Grossman @ 2006-11-25 16:35 UTC (permalink / raw)
  To: Eric W. Biederman, hadi
  Cc: Daniel Lezcano, Dmitry Mishin, Stephen Hemminger, netdev,
	Linux Containers

 

> -----Original Message-----
> From: netdev-owner@vger.kernel.org 
> [mailto:netdev-owner@vger.kernel.org] On Behalf Of Eric W. Biederman

> Then the question is how do we reduce the overhead when we 
> don't have enough physical network interfaces to go around.  
> My feeling is that we could push the work to the network 
> adapters and allow single physical network adapters to 
> support multiple network interfaces, each with a different 
> link-layer address.  At which point the overhead is nearly 
> nothing and newer network adapters may start implementing 
> enough filtering in hardware to do all of the work for us.

Correct, to a degree.
There will always be a limit on the number of physical "channels" that
a NIC can support, while keeping these channels fully independent and
protected at the hw level.
So, you will probably still need to implement the sw path, with the
assumption that some containers (that care about performance) will get
a separate NIC interface and avoid the overhead, and other containers
will have to use the sw path.
There are some multi-channel NICs shipping today so it would be
possible to see the overhead between the two options (I suspect it
will be quite noticeable), but for a general idea about what work
could be pushed down to network adapters in the near future you can
look at the pcisig.com I/O Virtualization Workgroup.
Once the single root I/O Virtualization spec is completed, it is
likely to be supported by several NIC vendors to provide the multiple
network interfaces on a single NIC that you are looking for.


* RE: Network virtualization/isolation
@ 2006-11-25 22:17 Leonid Grossman
  2006-11-25 23:16 ` Eric W. Biederman
  0 siblings, 1 reply; 68+ messages in thread
From: Leonid Grossman @ 2006-11-25 22:17 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: hadi, Daniel Lezcano, Dmitry Mishin, Stephen Hemminger, netdev,
	Linux Containers

 

> -----Original Message-----
> From: Eric W. Biederman [mailto:ebiederm@xmission.com] 
> Sent: Saturday, November 25, 2006 11:27 AM
> To: Leonid Grossman
> Cc: hadi@cyberus.ca; Daniel Lezcano; Dmitry Mishin; Stephen 
> Hemminger; netdev@vger.kernel.org; Linux Containers
> Subject: Re: Network virtualization/isolation
> 
> "Leonid Grossman" <Leonid.Grossman@neterion.com> writes:
> 
> >  
> >
> >> -----Original Message-----
> >> From: netdev-owner@vger.kernel.org
> >> [mailto:netdev-owner@vger.kernel.org] On Behalf Of Eric W. 
> Biederman
> >
> >> Then the question is how do we reduce the overhead when we 
> don't have 
> >> enough physical network interfaces to go around.
> >> My feeling is that we could push the work to the network 
> adapters and 
> >> allow single physical network adapters to support multiple network 
> >> interfaces, each with a different link-layer address.  At 
> which point 
> >> the overhead is nearly nothing and newer network adapters 
> may start 
> >> implementing enough filtering in hardware to do all of the 
> work for 
> >> us.
> >
> > Correct, to a degree. 
> > There will be always a limit on the number of physical 
> "channels" that 
> > a NIC can support, while keeping these channels fully 
> independent and 
> > protected at the hw level.
> > So, you will probably still need to implement the sw path, with the 
> > assumption that some containers (that care about 
> performance) will get 
> > a separate NIC interface and avoid the overhead, and other 
> containers 
> > will have to use the sw path.
> > There are some multi-channel NICs shipping today so it would be 
> > possible to see the overhead between the two options (I suspect it 
> > will be quite noticeable), but for a general idea about what work 
> > could be pushed down to network adapters in the near future you can 
> > look at the pcisig.com I/O Virtualization Workgroup.
> > Once the single root I/O Virtualization spec is completed, it is 
> > likely to be supported by several NIC vendors to provide multiple 
> > network interfaces on a single NIC that you are looking for.
> 
> Pushing it all of the way into the hardware is an
> optimization that, while great, is likely not necessary.
> Simply doing a table lookup by link-level address and
> selecting between several network interfaces is enough to
> ensure we only traverse the network stack once.
>
> To keep overhead down in the container case I don't need the
> hardware support to be so good you can do kernel bypass and
> still trust that everything is safe.  I simply need a fast
> link-level address to container mapping.  We already look at
> the link-level address on every packet received so that
> should not generate any extra cache misses.
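
For illustration only, the table lookup described in the quoted paragraph can be modelled as a dictionary keyed by the destination link-level address. This is a minimal sketch, not code from any posted patch; `MacDemux`, `register`, `classify` and `parse_mac` are hypothetical names, and it assumes plain Ethernet frames whose first six bytes are the destination MAC.

```python
from typing import Dict, Optional

ETH_ALEN = 6   # length of an Ethernet address in bytes
ETH_HLEN = 14  # destination MAC + source MAC + EtherType


def parse_mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6-byte form."""
    return bytes(int(part, 16) for part in text.split(":"))


class MacDemux:
    """Maps a destination link-level address to a container name."""

    def __init__(self) -> None:
        self._table: Dict[bytes, str] = {}

    def register(self, mac: str, container: str) -> None:
        self._table[parse_mac(mac)] = container

    def classify(self, frame: bytes) -> Optional[str]:
        # Frames shorter than an Ethernet header cannot be classified.
        if len(frame) < ETH_HLEN:
            return None
        # The destination address is the first field of the header.
        return self._table.get(frame[:ETH_ALEN])
```

A frame whose destination address is not in the table yields `None`, which in this model would mean it is not handed to any container.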

I did not mean kernel bypass, just L2 hw channels that for 
all practical purposes act as separate NICs - 
different MAC addresses, no blocking, independent reset, etc.

> 
> In the worst case I might need someone to go as far as the 
> Grand Unified Lookup to remove all of the overheads.  Except 
> for distributing the work load more evenly across the machine 
> with separate interrupts and the like I see no need for 
> separate hardware channels to make things go fast for my needs.
> 
> Despite the title of this thread there is no virtualization 
> or emulation of the hardware involved.  Just enhancements to 
> the existing hardware abstractions.

Right, I was just trying to say that IOV support (likely, from multiple
vendors since virtualization is expected to be widely used) would
provide an option to export multiple independent L2 interfaces from a
single NIC - even if only a subset of IOV functionality would be used
in this case.

> 
> Eric
> 



end of thread, other threads:[~2006-12-10  2:21 UTC | newest]

Thread overview: 68+ messages
2006-10-25 15:51 Network virtualization/isolation Daniel Lezcano
2006-10-23 20:01 ` Stephen Hemminger
2006-10-26  9:44   ` Daniel Lezcano
2006-10-26 15:56     ` Stephen Hemminger
2006-10-26 22:16       ` Daniel Lezcano
2006-10-27  7:34       ` Dmitry Mishin
2006-10-27  9:10         ` Daniel Lezcano
2006-11-01 14:35           ` jamal
2006-11-01 16:13             ` Daniel Lezcano
2006-11-14 15:17             ` Daniel Lezcano
2006-11-14 18:12               ` James Morris
2006-11-15  9:56                 ` Daniel Lezcano
2006-11-22 12:00               ` Daniel Lezcano
2006-11-25  9:09               ` Eric W. Biederman
2006-11-28 14:15                 ` Daniel Lezcano
2006-11-28 16:51                   ` Eric W. Biederman
2006-11-28 17:37                     ` Herbert Poetzl
2006-11-28 20:26                     ` Daniel Lezcano
2006-11-28 21:50                       ` Eric W. Biederman
2006-11-29  5:54                         ` Herbert Poetzl
2006-11-29 20:21                         ` Brian Haley
2006-11-29 22:10                           ` [Devel] " Daniel Lezcano
2006-11-30 16:15                             ` Vlad Yasevich
2006-11-30 16:38                               ` Daniel Lezcano
2006-11-30 17:24                                 ` Herbert Poetzl
2006-12-03 12:26                             ` jamal
2006-12-03 14:13                               ` jamal
2006-12-03 16:00                                 ` Eric W. Biederman
2006-12-04 15:19                                   ` Dmitry Mishin
2006-12-04 15:45                                     ` Eric W. Biederman
2006-12-04 16:43                                     ` Herbert Poetzl
2006-12-04 16:58                                       ` Eric W. Biederman
2006-12-04 17:02                                       ` Dmitry Mishin
2006-12-04 17:19                                         ` Herbert Poetzl
2006-12-04 17:41                                         ` Daniel Lezcano
2006-12-04 12:15                                 ` Eric W. Biederman
2006-12-04 13:44                                   ` jamal
2006-12-04 15:35                                     ` Eric W. Biederman
2006-12-04 16:00                                       ` Dmitry Mishin
2006-12-04 16:52                                         ` Eric W. Biederman
2006-12-06 11:54                                           ` [Devel] " Kirill Korotaev
2006-12-06 18:30                                             ` Herbert Poetzl
2006-12-08 19:57                                               ` Eric W. Biederman
2006-12-09  3:50                                                 ` Herbert Poetzl
2006-12-09  6:13                                                   ` Andrew Morton
2006-12-09  6:35                                                     ` Herbert Poetzl
2006-12-09 21:18                                                       ` Dmitry Mishin
2006-12-09 22:34                                                       ` Kir Kolyshkin
2006-12-10  2:21                                                         ` Herbert Poetzl
2006-12-09  8:07                                                   ` Eric W. Biederman
2006-12-09 11:27                                                   ` Tomasz Torcz
2006-12-09 19:04                                                     ` Herbert Poetzl
2006-12-03 16:37                               ` Herbert Poetzl
2006-12-03 16:58                                 ` jamal
2006-12-04 10:18                               ` Daniel Lezcano
2006-12-04 13:22                                 ` jamal
2006-12-02 11:29                         ` Kari Hurtta
2006-12-02 11:49                           ` Kari Hurtta
2006-11-29  5:58                       ` Herbert Poetzl
2006-11-25  8:21             ` Eric W. Biederman
2006-11-26 18:34               ` Herbert Poetzl
2006-11-26 19:41                 ` Ben Greear
2006-11-26 20:52                 ` Eric W. Biederman
2006-11-25  8:27       ` Eric W. Biederman
  -- strict thread matches above, loose matches on Subject: below --
2006-11-25 16:35 Leonid Grossman
2006-11-25 19:26 ` Eric W. Biederman
2006-11-25 22:17 Leonid Grossman
2006-11-25 23:16 ` Eric W. Biederman
