qemu-devel.nongnu.org archive mirror
From: Mike Lovell <mike@dev-zero.net>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] net: RFC New Socket-Based, Switched Network Backend (QDES)
Date: Mon, 26 Nov 2012 10:19:16 -0700
Message-ID: <50B3A494.3050002@dev-zero.net>
In-Reply-To: <CAJSP0QWw4MVCjzO+7VhTE8vjftmWzVWxucY=2Wj77b2vJ7y+aA@mail.gmail.com>

On 11/24/2012 08:21 AM, Stefan Hajnoczi wrote:
> On Mon, Jun 25, 2012 at 7:42 AM, Mike Lovell <mike@dev-zero.net> wrote:
>> This is what I've been calling QDES or QEMU Distributed Ethernet Switch. I
>> first had the idea when I was playing with the udp and mcast socket network
>> backends while exploring how to build a VM infrastructure. I liked the idea of
>> using the sockets backends cause it doesn't require escalated permissions to
>> configure and run as well as the ability to talk over IP networks.
> Hi Mike,
> I was just reading the VXLAN spec and Linux code when I realized this
> is similar to your QDES approach:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d342894c5d2f8c7df194c793ec4059656e09ca31
> http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02
>
> If you're still hacking on QDES you may be interested.
>
> VXLAN is a VLAN mechanism that gets around the 12-bit 802.1Q tag size.
>   In large deployments it may be necessary to have more than 4096
> VLANs, this is where VXLAN comes in.
>
> It's a tiny header with VXLAN Network ID that encapsulates Ethernet inside UDP:
>
> [Outer Ethernet][IP][UDP] [VXLAN] [Inner Ethernet][...]
>
> UDP is used as follows:
> 1. If the host has already learnt an Inner MAC -> Outer IP mapping,
> then it transmits a unicast UDP packet.
> 2. Otherwise it transmits a multicast UDP packet.
>
> That means all hosts join a multicast group - this enables broadcast
> similar to what you've done in your patches.
>
> Typically traffic from a VM on Host A to another VM on Host B will use
> unicast UDP because the Inner MAC -> Outer IP mapping has been learnt.
>
> I'm not sure if it makes sense to implement VXLAN in QEMU because the
> multicast UDP socket uses a well-known port.  I guess that means
> multiple QEMUs running on the same host cannot use VXLAN unless they
> bind to unique IP addresses.  At that point we lose the advantage of a
> pure userspace implementation and might as well use the kernel
> implementation (or OpenVSwitch) with tap devices.
>
> Anyway, it's still interesting and maybe there's a way to solve this.
>
> Stefan

the VXLAN spec gave me some inspiration to write the original patch i
submitted. unfortunately i made the silly decision to use my own header
format when i should have used the VXLAN one. i believe just changing
that would make this compatible with VXLAN.
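
A minimal sketch of what switching to the VXLAN header could involve, assuming the 8-byte layout from the VXLAN draft: one flags byte with the "I" bit (0x08) set, three reserved bytes, a 24-bit VXLAN Network ID, and one reserved byte. The function and macro names are hypothetical and are not taken from the QDES patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VXLAN_HDR_LEN        8
#define VXLAN_FLAG_VNI_VALID 0x08   /* the "I" flag: the VNI field is valid */

/* Hypothetical helper: write an 8-byte VXLAN header carrying a 24-bit VNI. */
void vxlan_write_header(uint8_t *buf, uint32_t vni)
{
    assert(buf != NULL);
    buf[0] = VXLAN_FLAG_VNI_VALID;
    buf[1] = buf[2] = buf[3] = 0;       /* reserved */
    buf[4] = (vni >> 16) & 0xff;        /* VNI, network byte order */
    buf[5] = (vni >> 8) & 0xff;
    buf[6] = vni & 0xff;
    buf[7] = 0;                         /* reserved */
}

/* Hypothetical helper: parse a VXLAN header.
 * Returns 0 and stores the VNI on success, or -1 if the buffer is
 * too short or the "I" flag is not set. */
int vxlan_read_header(const uint8_t *buf, size_t len, uint32_t *vni)
{
    assert(buf != NULL && vni != NULL);
    if (len < VXLAN_HDR_LEN || !(buf[0] & VXLAN_FLAG_VNI_VALID)) {
        return -1;
    }
    *vni = ((uint32_t)buf[4] << 16) | ((uint32_t)buf[5] << 8) | buf[6];
    return 0;
}
```

The inner Ethernet frame would follow directly after these 8 bytes in the UDP payload.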

i do still want to do more work on this, such as converting it to be
compatible with VXLAN. there have also been a lot of other changes to
the network subsystem that i would need to update the patch for. i've
been rather busy the past few months with a work project and told myself
i have to finish that before i can go back to this. i was also waiting
to see if the churn in the network subsystem would calm down so i could
make all the changes i need there at once. hopefully around the new year
i'll have time to look at it. since i originally sent the patch to the
list, a few people have asked me about it, so i think there is some
interest in it.

i think it does still make sense to implement it in QEMU. there isn't a
problem with multiple processes using the same multicast address. the
net_socket_mcast_create function in socket.c already sets the
IP_MULTICAST_LOOP option, which makes it so packets get looped back and
also delivered to processes on the same host. that is why there is a
check in qdes_receive to see if the sender is localAddr and drop the
packet if it is. the big advantage i see to implementing VXLAN inside
QEMU is that it can be done without any escalated privileges and without
reconfiguring the host's network configuration.

mike
