netdev.vger.kernel.org archive mirror
From: jamal <hadi-fAAogVwAN2Kw5LPnMra/2Q@public.gmane.org>
To: Stephen Hemminger <shemminger-ZtmgI6mnKB3QT0dZR+AlfA@public.gmane.org>
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org,
	Chris Wright <chrisw-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Herbert Xu
	<herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>,
	Eric Dumazet
	<eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	netdev <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	John Fastabend
	<john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Subject: Re: Open vSwitch Design
Date: Fri, 25 Nov 2011 06:24:36 -0500
Message-ID: <1322220276.1908.75.camel@mojatatu>
In-Reply-To: <20111124212021.2ae2fb7f-QE31Isp8l5DVJhW05BI4jyWSNWFUUkiGXqFh9Ls21Oc@public.gmane.org>

On Thu, 2011-11-24 at 21:20 -0800, Stephen Hemminger wrote:
> On Thu, 24 Nov 2011 17:30:33 -0500
> jamal <hadi-fAAogVwAN2Kw5LPnMra/2Q@public.gmane.org> wrote:
> 
 
> > Can you explain why you couldn't use the current bridge code (likely with
> > some mods)? I can see you want to isolate the VMs via the virtual ports;
> > maybe even VLANs on the virtual ports - the current bridge code should
> > be able to handle that.
> 
> The way openvswitch works is that the flow table is populated
> by user space. The kernel bridge works completely differently (it learns
> about MAC addresses). 
> 

Most hardware bridges out there support several modes:
You can have learning in the hardware or defer it to the user/control
plane by setting some flags. You can have broadcasting done in hardware
or defer it to user space.
The mods I was thinking of would bring the Linux bridge to the same
behavior. You then need to allow netlink updates of the bridge MAC
table from user space. There may be weaknesses in the current bridging
code in relation to VLANs that may need to be addressed.
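
[To make the modes above concrete, here is a minimal sketch in Python. It
is a toy model, not the Linux bridge code: the Bridge class, the learning
and flooding flags, fdb_add() and the punted list are all hypothetical
names, and fdb_add() merely stands in for a netlink update of the MAC
table from user space.]

```python
from dataclasses import dataclass, field


@dataclass
class Bridge:
    """Toy bridge with settable learning and flooding (hypothetical model)."""
    ports: set
    learning: bool = True    # learn source MACs in the data path
    flooding: bool = True    # flood frames with an unknown destination
    fdb: dict = field(default_factory=dict)      # MAC -> port
    punted: list = field(default_factory=list)   # frames deferred to user space

    def fdb_add(self, mac, port):
        # Stand-in for a user-space (netlink) update of the MAC table.
        self.fdb[mac] = port

    def receive(self, in_port, src, dst):
        """Return the list of ports the frame is forwarded to."""
        if self.learning:
            self.fdb[src] = in_port
        out = self.fdb.get(dst)
        if out is not None:
            # Known destination: forward, unless it sits on the ingress port.
            return [] if out == in_port else [out]
        if self.flooding:
            return sorted(self.ports - {in_port})
        # Unknown destination and flooding disabled: defer to the control plane.
        self.punted.append((in_port, src, dst))
        return []
```

[With both flags on, this behaves like a classic learning bridge. With
both off, nothing is forwarded until user space calls fdb_add(), which is
the "controller populates the table" behavior.]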

[But my concern was not so much the bridge - because changes are needed
in that case; it is the "match, actionlist" that is already in place
that got to me.]

> Actually, this is what puts me off on the current implementation.
> I would prefer that the kernel implementation was just a software
> implementation of a hardware OpenFlow switch. That way it would
> be transparent whether the control plane in user space was talking to
> the kernel or to hardware.

Or alternatively, allow the bridge code to support the different modes.
Learning and broadcasting modes both need to be settable.
Then you have an interesting capability in the kernel that meets the
requirements of an OpenFlow switch (+ anyone who wants to do policy
control in user space with their favorite standard).

> > The tc classifier-action-qdisc infrastructure handles this.
> > The sampler needs a new action defined.
> 
> There are too many damn layers in the software path already.

I think what they are doing in the separation of control and data
is reasonable. The policy and control are in user space. The fastpath
is in the kernel, and it may live in a variety of spots (some ARP entry
here, some L3 entry there, a couple of match-action items, etc.).
The brains that understand what the different pieces mean in
aggregation, in terms of a service, are in user space.

> 
> The problem is that there are two flow classifiers, one in OpenVswitch
> in the kernel, and the other in the user space flow manager. I think the
> issue is that the two have different code.

I see. I can understand having a simple classifier in the kernel and
a more complex "consulting" classifier sitting in user space which
updates the kernel on how to deal with subsequent packets of the flow.
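
[A rough sketch of that split, in Python. This is an illustration only,
not the Open vSwitch datapath or its API: FastPath, policy() and the
(src, dst, dport) flow key are hypothetical. The fast path does an
exact-match lookup; on a miss it consults the slow path once and caches
the answer for later packets of the same flow.]

```python
class FastPath:
    """Exact-match flow cache standing in for the in-kernel classifier."""

    def __init__(self, slow_path):
        self.flows = {}             # flow key -> action list
        self.slow_path = slow_path  # callable standing in for user space
        self.misses = 0

    def handle(self, key):
        actions = self.flows.get(key)
        if actions is None:
            # Miss: consult the slow path and install the result.
            self.misses += 1
            actions = self.slow_path(key)
            self.flows[key] = actions
        return actions


def policy(key):
    """Hypothetical user-space policy with a wildcard-style rule."""
    src, dst, dport = key
    if dport == 22:
        return ["drop"]
    return ["output:2"]
```

[Only the first packet of a flow pays for the trip to user space; the
richer matching logic never has to live in the fast path.]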

> Is the kernel/userspace API for OpenVswitch nailed down and documented
> well enough that alternative control plane software could be built?

They do have a generic netlink interface. I would prefer the netlink
interface that already exists in the kernel (which would have worked
had they used the infrastructure already in place).


cheers,
jamal
