Netdev List

* Re: Open vSwitch Design
From: jamal @ 2011-11-24 22:30 UTC (permalink / raw)
  To: Jesse Gross
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, Chris Wright, Herbert Xu,
	Eric Dumazet, netdev, John Fastabend, Stephen Hemminger,
	David Miller
In-Reply-To: <CAEP_g=_2L1xFWtDXh_6YyXz1Mt9TR3zvjLzix+SpO6yzeOLsSQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Jesse,

I am going to try and respond to your comments below.

On Thu, 2011-11-24 at 12:10 -0800, Jesse Gross wrote:

> 
>  * Switching infrastructure:  As the name implies, Open vSwitch is
> intended to be a network switch, focused on
> virtualization/OpenFlow/software defined networking.  This means that
> what we are modeling is not actually a collection of flows but a
> switch which contains a group of related ports, a software virtual
> device, etc.  The switch model is used in a variety of places, such as
> to measure traffic that actually flows through it in order to
> implement monitoring and sampling protocols.

Can you explain why you couldnt use the current bridge code (likely with
some mods)? I can see you want to isolate the VMs via the virtual ports;
maybe even vlans on the virtual ports - the current bridge code should
be able to handle that.

>  * Flow lookup:  Although used to implement OpenFlow, the kernel flow
> table does not actually directly contain OpenFlow flows.  This is
> because OpenFlow tables can contain wildcards, multiple pipeline
> stages, etc. and we did not want to push that complexity into the
> kernel fast path (nor tie it to a specific version of OpenFlow).
> Instead an exact match flow table is populated on-demand from
> userspace based on the more complex rules stored there.  Although it
> might seem limiting, this design has allowed significant new
> functionality to be added without modifications to the kernel or
> performance impact.

This can be achieved easily with zero changes to the kernel code.
You need to have default filters that redirect flows to user space
when you fail to match.

>  * Packet execution:  Once a flow is matched it can be output,
> enqueued to a particular qdisc, etc.  Some of these operations are
> specific to Open vSwitch, such as sampling, whereas others we leverage
> existing infrastructure (including tc for QoS) by simply marking the
> packet for further processing.

The tc classifier-action-qdisc infrastructure handles this.
The sampler needs a new action defined.

>  * Userspace interfaces:  One of the difficulties of having a
> specialized, exact match flow lookup engine is maintaining
> compatibility across differing kernel/userspace versions.  This
> compatibility shows up heavily in the userspace interfaces and is
> achieved by passing the kernel's version of the flow along with packet
> information.  This allows userspace to install appropriate flows even
> if its interpretation of a packet differs from the kernel's without
> version checks or maintaining multiple implementations of the flow
> extraction code in the kernel.

I didnt quiet follow - are we talking about backward/forward
compatibility?

> It's obviously possible to put this code anywhere, whether it is an
> independent module, in the bridge, or tc.  Regardless, however, it's
> largely new code that is geared towards this particular model so it
> seems better not to add to the complexity of existing components if at
> all possible.

I am still not seeing how this could not be done without the
infrastructure that exists. Granted, the user space brains - thats where
everything else resides - but you are not pushing that i think.

cheers,
jamal

^ permalink raw reply