From: "Michael S. Tsirkin" <mst@redhat.com>
To: Simon Horman <horms@verge.net.au>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
	virtualization@lists.linux-foundation.org,
	Jesse Gross <jesse@nicira.com>,
	dev@openvswitch.org, virtualization@lists.osdl.org,
	netdev@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: Flow Control and Port Mirroring Revisited
Date: Thu, 6 Jan 2011 12:27:55 +0200
Message-ID: <20110106102755.GC12142@redhat.com>
In-Reply-To: <20110106093312.GA1564@verge.net.au>

On Thu, Jan 06, 2011 at 06:33:12PM +0900, Simon Horman wrote:
> Hi,
> 
> Back in October I reported that I noticed a problem whereby flow control
> breaks down when openvswitch is configured to mirror a port[1].

Apropos UDP flow control: see
http://www.spinics.net/lists/netdev/msg150806.html
for some of the problems it introduces.
Unfortunately UDP does not have built-in flow control.
At some level it is just conceptually broken:
it is not present in physical networks, so why should
we try to emulate it in a virtual network?


Specifically, when you do:
# netperf -c -4 -t UDP_STREAM -H 172.17.60.218 -l 30 -- -m 1472
you are asking: what happens if I push data faster than it can be received?
But why is that an interesting question?
Ask instead 'what is the maximum rate at which I can send data with X%
packet loss' or 'what is the packet loss at rate Y Gb/s'.  netperf has
-b and -w flags for this.  It needs to be built
with --enable-intervals=yes for them to work.

If you pose the questions this way, the problem of what paces the
flow simply goes away.
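
For example, something along these lines (the flag values here are
illustrative only; -b sets the burst size and -w the inter-burst
interval, and both require a netperf built with --enable-intervals=yes):

# netperf -c -4 -t UDP_STREAM -H 172.17.60.218 -l 30 -b 100 -w 10 -- -m 1472

This offers load at a configured rate instead of blasting as fast as the
sender can, so the reported loss becomes a property of that rate.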

> 
> I have (finally) looked into this further and the problem appears to relate
> to cloning of skbs, as Jesse Gross originally suspected.
> 
> More specifically, in do_execute_actions[2], for the first n-1 times that
> an skb needs to be transmitted it is cloned first, and for the final
> transmission the original skb is used.
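
To make that pattern concrete, here is a rough sketch of the logic being
described (illustrative only: next_action(), last_action() and do_output()
are placeholders, not the actual datapath/actions.c code):

	for (a = actions; a < end; a = next_action(a)) {
		struct sk_buff *out;

		/* Clone for the first n-1 outputs, consume the
		 * original skb on the final one. */
		if (!last_action(a))
			out = skb_clone(skb, GFP_ATOMIC);
		else
			out = skb;

		if (out)
			do_output(dp, out, a->port);
	}
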
> 
> In the case that there is only one action, which is the normal case, the
> original skb will be used. But in the case of mirroring the cloning comes
> into effect. And in my case the cloned skb seems to go to the (slow)
> eth1 interface while the original skb goes to the (fast) dummy0 interface
> that I set up to be a mirror. The result is that dummy0 "paces" the flow,
> and it's a cracking pace at that.
> 
> As an experiment I hacked do_execute_actions() to use the original skb
> for the first action instead of the last one.  In my case the result was
> that eth1 "paces" the flow, and things work reasonably nicely.
> 
> Well, sort of. Things work well for non-GSO skbs but extremely poorly for
> GSO skbs, where only 3 packets (yes, 3, not 3%) end up at the remote host
> running netserver. I'm unsure why, but I digress.
> 
> It seems to me that my hack illustrates the point that the flow ends up
> being "paced" by one interface. However, I think what would be desirable
> is for the flow to be "paced" by the slowest link. Unfortunately I'm
> unsure how to achieve that.

What if you have multiple UDP sockets with different targets
in the guest?

> One idea that I had was to skb_get() the original skb each time it is
> cloned - that is easy enough. But unfortunately it seems to me that this
> approach would require some sort of callback mechanism in kfree_skb() so
> that the cloned skbs can kfree_skb() the original skb.
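
A rough illustration of that idea (hypothetical and untested; do_output()
is a placeholder, and the missing piece is exactly the callback mentioned
above, since kfree_skb() provides no such notification today):

	struct sk_buff *clone = skb_clone(skb, GFP_ATOMIC);

	if (clone) {
		/* Pin the original for as long as this clone is in
		 * flight; skb_get() just takes another reference. */
		skb_get(skb);
		do_output(dp, clone, a->port);
		/* When the clone is eventually freed, something would
		 * have to call kfree_skb(skb) to drop the reference
		 * taken above - and that is the hook that kfree_skb()
		 * does not offer. */
	}
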
> 
> Ideas would be greatly appreciated.
> 
> [1] http://openvswitch.org/pipermail/dev_openvswitch.org/2010-October/003806.html
> [2] http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=blob;f=datapath/actions.c;h=5e16143ca402f7da0ee8fc18ee5eb16c3b7598e6;hb=HEAD


Thread overview: 40+ messages
2011-01-06  9:33 Flow Control and Port Mirroring Revisited Simon Horman
2011-01-06 10:22 ` Eric Dumazet
2011-01-06 12:44   ` Simon Horman
2011-01-06 13:28     ` Eric Dumazet
2011-01-06 22:01       ` Simon Horman
2011-01-06 22:38     ` Jesse Gross
2011-01-07  1:23       ` Simon Horman
2011-01-10  9:31         ` Simon Horman
2011-01-13  6:47           ` Simon Horman
2011-01-13 15:45             ` Jesse Gross
2011-01-13 23:41               ` Simon Horman
2011-01-14  4:58                 ` Michael S. Tsirkin
2011-01-14  6:35                   ` Simon Horman
2011-01-14  6:54                     ` Michael S. Tsirkin
2011-01-16 22:37                       ` Simon Horman
2011-01-16 23:56                         ` Rusty Russell
2011-01-17 10:38                           ` Michael S. Tsirkin
2011-01-17 10:26                         ` Michael S. Tsirkin
2011-01-18 19:41                           ` Rick Jones
2011-01-18 20:13                             ` Michael S. Tsirkin
2011-01-18 21:28                               ` Rick Jones
2011-01-19  9:11                               ` Simon Horman
2011-01-20  8:38                             ` Simon Horman
2011-01-21  2:30                               ` Rick Jones
2011-01-21  9:59                               ` Michael S. Tsirkin
2011-01-21 18:04                                 ` Rick Jones
2011-01-21 23:11                                 ` Simon Horman
2011-01-22 21:57                                   ` Michael S. Tsirkin
2011-01-23  6:38                                     ` Simon Horman
2011-01-23 10:39                                       ` Michael S. Tsirkin
2011-01-23 13:53                                         ` Simon Horman
2011-01-24 18:27                                         ` Rick Jones
2011-01-24 18:36                                           ` Michael S. Tsirkin
2011-01-24 19:01                                             ` Rick Jones
2011-01-24 19:42                                               ` Michael S. Tsirkin
2011-01-06 10:27 ` Michael S. Tsirkin [this message]
2011-01-06 11:30   ` Simon Horman
2011-01-06 12:07     ` Michael S. Tsirkin
2011-01-06 12:29       ` Simon Horman
2011-01-06 12:47         ` Michael S. Tsirkin
