From: "Michael S. Tsirkin" <mst@redhat.com>
To: Simon Horman <horms@verge.net.au>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
	virtualization@lists.linux-foundation.org,
	Jesse Gross <jesse@nicira.com>,
	dev@openvswitch.org, virtualization@lists.osdl.org,
	netdev@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: Flow Control and Port Mirroring Revisited
Date: Thu, 6 Jan 2011 12:27:55 +0200
Message-ID: <20110106102755.GC12142@redhat.com>
In-Reply-To: <20110106093312.GA1564@verge.net.au>

On Thu, Jan 06, 2011 at 06:33:12PM +0900, Simon Horman wrote:
> Hi,
> 
> Back in October I reported that I noticed a problem whereby flow control
> breaks down when openvswitch is configured to mirror a port[1].

Regarding UDP flow control: see
http://www.spinics.net/lists/netdev/msg150806.html
for some of the problems it introduces.
Unfortunately UDP does not have built-in flow control.
At some level the idea is just conceptually broken:
UDP flow control is not present in physical networks, so why should
we try to emulate it in a virtual network?


Specifically, when you do:
# netperf -c -4 -t UDP_STREAM -H 172.17.60.218 -l 30 -- -m 1472
you are asking: what happens if I push data faster than it can be received?
But why is this an interesting question?
Ask instead 'what is the maximum rate at which I can send data with X% packet
loss' or 'what is the packet loss at a rate of Y Gb/s'. netperf has
-b and -w flags for this. netperf needs to be configured
with --enable-intervals=yes for them to work.

If you pose the questions this way the problem of pacing
the execution just goes away.
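
As a rough illustration of posing the question as "loss at rate Y", the
sketch below works out the offered payload rate for a paced sender. It
assumes that -b gives the number of sends per burst and -w the wait between
bursts in milliseconds, which is my reading of the netperf manual. The
function names are hypothetical and are not part of netperf. Header overhead
and timer granularity are ignored, so treat the numbers as estimates only.

```python
import math


def offered_rate_bps(msg_bytes, burst, wait_ms):
    """Payload bits per second when `burst` sends of `msg_bytes` bytes
    are issued once every `wait_ms` milliseconds (assumed -b / -w meaning)."""
    if wait_ms <= 0:
        raise ValueError("wait_ms must be positive")
    return burst * msg_bytes * 8 * 1000 / wait_ms


def burst_for_rate(target_bps, msg_bytes, wait_ms):
    """Smallest burst size whose offered payload rate reaches target_bps."""
    return math.ceil(target_bps * wait_ms / (msg_bytes * 8 * 1000))


def loss_fraction(sent, received):
    """Fraction of sent messages that never reached the receiver."""
    return (sent - received) / sent if sent else 0.0


if __name__ == "__main__":
    # 1472-byte messages as in the command above; burst of 10 every 1 ms.
    rate = offered_rate_bps(1472, 10, 1)
    print(f"{rate / 1e6:.2f} Mb/s offered")  # prints "117.76 Mb/s offered"
```

With these assumptions, reaching about 1 Gb/s of payload with 1472-byte
messages and a 1 ms interval needs a burst of 85 sends. Comparing the send
and receive counts that netperf reports then gives the loss at that rate.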

> 
> I have (finally) looked into this further and the problem appears to relate
> to cloning of skbs, as Jesse Gross originally suspected.
> 
> More specifically, in do_execute_actions[2] the first n-1 times that an skb
> needs to be transmitted it is cloned first and the final time the original
> skb is used.
> 
> In the case that there is only one action, which is the normal case, then
> the original skb will be used. But in the case of mirroring the cloning
> comes into effect. And in my case the cloned skb seems to go to the (slow)
> eth1 interface while the original skb goes to the (fast) dummy0 interface
> that I set up to be a mirror. The result is that dummy0 "paces" the flow,
> and it's a cracking pace at that.
> 
> As an experiment I hacked do_execute_actions() to use the original skb
> for the first action instead of the last one.  In my case the result was
> that eth1 "paces" the flow, and things work reasonably nicely.
> 
> Well, sort of. Things work well for non-GSO skbs but extremely poorly for
> GSO skbs where only 3 (yes 3, not 3%) end up at the remote host running
> netserver. I'm unsure why, but I digress.
> 
> It seems to me that my hack illustrates the point that the flow ends up
> being "paced" by one interface. However I think that what would be
> desirable is that the flow is "paced" by the slowest link. Unfortunately
> I'm unsure how to achieve that.

What if you have multiple UDP sockets with different targets
in the guest?

> One idea that I had was to skb_get() the original skb each time it is
> cloned - that is easy enough. But unfortunately it seems to me that
> approach would require some sort of callback mechanism in kfree_skb() so
> that the cloned skbs can kfree_skb() the original skb.
> 
> Ideas would be greatly appreciated.
> 
> [1] http://openvswitch.org/pipermail/dev_openvswitch.org/2010-October/003806.html
> [2] http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=blob;f=datapath/actions.c;h=5e16143ca402f7da0ee8fc18ee5eb16c3b7598e6;hb=HEAD
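
The pacing behaviour described in the quoted text can be illustrated with a
toy model. This is not the openvswitch code. It assumes that the sender only
gets its buffer space back when the original skb is freed, and that each port
frees its skb after a fixed drain time. The port order, the drain times and
all function names are hypothetical.

```python
def execute_actions(ports, original_first=False):
    """Return (port, kind) pairs: n-1 clones plus exactly one original.

    By default the original goes to the last port, as described for
    do_execute_actions(); original_first=True models the experimental hack.
    """
    if not ports:
        raise ValueError("at least one output port is required")
    idx = 0 if original_first else len(ports) - 1
    return [(p, "original" if i == idx else "clone")
            for i, p in enumerate(ports)]


def release_time(ports, drain_time, original_first=False, refcounted=False):
    """Time at which the sender is credited for the packet.

    Toy assumption: only freeing the original credits the sender. With
    refcounted=True the original is held until every clone is freed, which
    models the skb_get() idea from the quoted text.
    """
    plan = execute_actions(ports, original_first)
    if refcounted:
        return max(drain_time[p] for p, _ in plan)
    return next(drain_time[p] for p, kind in plan if kind == "original")


if __name__ == "__main__":
    ports = ["eth1", "dummy0"]                # slow port, then fast mirror
    drain = {"eth1": 10.0, "dummy0": 0.1}     # arbitrary time units
    print(release_time(ports, drain))                       # paced by dummy0
    print(release_time(ports, drain, original_first=True))  # paced by eth1
    print(release_time(ports, drain, refcounted=True))      # paced by slowest
```

In this model the default ordering is paced by dummy0, the hack is paced by
eth1, and only the refcounted variant is paced by the slowest port regardless
of action order.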

