From: Mike Lovell <mike@dev-zero.net>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] net: RFC New Socket-Based, Switched Network Backend (QDES)
Date: Mon, 26 Nov 2012 10:19:16 -0700
Message-ID: <50B3A494.3050002@dev-zero.net>
In-Reply-To: <CAJSP0QWw4MVCjzO+7VhTE8vjftmWzVWxucY=2Wj77b2vJ7y+aA@mail.gmail.com>

On 11/24/2012 08:21 AM, Stefan Hajnoczi wrote:
> On Mon, Jun 25, 2012 at 7:42 AM, Mike Lovell <mike@dev-zero.net> wrote:
>> This is what I've been calling QDES or QEMU Distributed Ethernet Switch. I
>> first had the idea when I was playing with the udp and mcast socket network
>> backends while exploring how to build a VM infrastructure. I liked the idea of
>> using the sockets backends cause it doesn't require escalated permissions to
>> configure and run as well as the ability to talk over IP networks.
> Hi Mike,
> I was just reading the VXLAN spec and Linux code when I realized this
> is similar to your QDES approach:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d342894c5d2f8c7df194c793ec4059656e09ca31
> http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02
>
> If you're still hacking on QDES you may be interested.
>
> VXLAN is a VLAN mechanism that gets around the 12-bit 802.1Q tag size.
>   In large deployments it may be necessary to have more than 4096
> VLANs, this is where VXLAN comes in.
>
> It's a tiny header with VXLAN Network ID that encapsulates Ethernet inside UDP:
>
> [Outer Ethernet][IP][UDP] [VXLAN] [Inner Ethernet][...]
>
> UDP is used as follows:
> 1. If the host has already learnt an Inner MAC -> Outer IP mapping,
> then it transmits a unicast UDP packet.
> 2. Otherwise it transmits a multicast UDP packet.
>
> That means all hosts join a multicast group - this enables broadcast
> similar to what you've done in your patches.
>
> Typically traffic from a VM on Host A to another VM on Host B will use
> unicast UDP because the Inner MAC -> Outer IP mapping has been learnt.
>
> I'm not sure if it makes sense to implement VXLAN in QEMU because the
> multicast UDP socket uses a well-known port.  I guess that means
> multiple QEMUs running on the same host cannot use VXLAN unless they
> bind to unique IP addresses.  At that point we lose the advantage of a
> pure userspace implementation and might as well use the kernel
> implementation (or OpenVSwitch) with tap devices.
>
> Anyway, it's still interesting and maybe there's a way to solve this.
>
> Stefan
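
The unicast-or-multicast rule quoted above can be illustrated with a small learning table. This is only a sketch under assumptions: the class name `ForwardingTable`, the endpoint tuples, and the group address and port are hypothetical and come from neither the QDES patch nor the kernel VXLAN code.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (outer IP, UDP port); hypothetical representation


class ForwardingTable:
    """Maps inner MAC addresses to the outer endpoint they were last seen behind."""

    def __init__(self, mcast_group: Endpoint) -> None:
        self.mcast_group = mcast_group
        self.mac_to_endpoint: Dict[bytes, Endpoint] = {}

    def learn(self, inner_frame: bytes, sender: Endpoint) -> None:
        # Bytes 6..11 of an Ethernet frame hold the source MAC.
        src_mac = inner_frame[6:12]
        self.mac_to_endpoint[src_mac] = sender

    def destination(self, inner_frame: bytes) -> Endpoint:
        # Bytes 0..5 hold the destination MAC. A learnt mapping gives a
        # unicast target; anything unknown falls back to the multicast group.
        dst_mac = inner_frame[0:6]
        return self.mac_to_endpoint.get(dst_mac, self.mcast_group)
```

A broadcast destination MAC is never learnt as a source, so such frames always fall through to the multicast group in this sketch.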

the VXLAN spec gave me some inspiration to write the original patch i 
submitted. unfortunately i made a silly decision of using my own header 
format and should have used the VXLAN one. but i believe just changing 
that would make this compatible with VXLAN.
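
For reference, the VXLAN header is 8 bytes: an 8-bit flags field with the VNI-valid bit (0x08) set, 24 reserved bits, a 24-bit VXLAN Network ID, and 8 more reserved bits. A minimal sketch of packing and unpacking it follows; the function names are hypothetical and this is not the header code from the QDES patch.

```python
import struct
from typing import Tuple

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid ID
VXLAN_HEADER_LEN = 8


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prefix an inner Ethernet frame with a VXLAN header (the UDP payload)."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(payload: bytes) -> Tuple[int, bytes]:
    """Split a UDP payload into (VNI, inner Ethernet frame)."""
    if len(payload) < VXLAN_HEADER_LEN:
        raise ValueError("payload shorter than a VXLAN header")
    word1, word2 = struct.unpack("!II", payload[:VXLAN_HEADER_LEN])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word2 >> 8, payload[VXLAN_HEADER_LEN:]
```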

i do still want to do more work on this, such as converting it to be
compatible with VXLAN. there have also been a lot of other changes to
the network subsystem that i would need to update the patch for. i've
been rather busy the past few months with a work project and told myself
i have to finish that before i can go back to this. i also was waiting
to see if the churn in the network subsystem would calm down so i could
make all the changes i need there at once. hopefully around the new year
i'll have time to look at it. since i originally sent the patch to the
list, a few people have asked me about it so i think there is some
interest in it.

i think it does still make sense to implement it in QEMU. there isn't a 
problem with multiple processes using the same multicast address. the 
net_socket_mcast_create function in socket.c already sets the 
IP_MULTICAST_LOOP option which makes it so packets get looped back and 
also delivered to processes on the same host. that is why there is a 
check in qdes_receive to see if the sender is the localAddr and drop it 
if it is. the big advantage i see to implementing VXLAN inside QEMU is 
that it can be done without any escalated privileges and without 
reconfiguring the host's network configuration.
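
A rough sketch of that arrangement using plain sockets follows. It is not the code from socket.c or qdes_receive; the function names are hypothetical, and it assumes each process knows the (address, port) pair its own datagrams are sent from.

```python
import socket
import struct
from typing import Tuple

Endpoint = Tuple[str, int]  # (IP address, UDP port)


def open_mcast_socket(group: str, port: int) -> socket.socket:
    """Join a multicast group in a way several processes on one host can share."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several processes to bind the same group port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Join the group on the default interface (0.0.0.0).
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    # Loop outgoing multicast back so other local processes receive it too.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
    return sock


def is_own_packet(sender: Endpoint, local_addr: Endpoint) -> bool:
    """True when a looped-back datagram came from this process and should be dropped."""
    return sender == local_addr
```

Two processes on the same host share an IP address but send from different source ports, so in this sketch each one drops only its own looped-back packets and still receives the other's.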

mike
