netdev.vger.kernel.org archive mirror
* Re: [Qemu-devel] [PATCH] net: add raw backend  - some performance measurements
       [not found]                 ` <4A656824.7070100@Voltaire.com>
@ 2009-07-21  7:25                   ` Herbert Xu
  2009-07-21 10:17                     ` Or Gerlitz
  2009-07-21 10:27                     ` Michael S. Tsirkin
  0 siblings, 2 replies; 7+ messages in thread
From: Herbert Xu @ 2009-07-21  7:25 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
	Michael S. Tsirkin, Mark McLoughlin, Dor Laor, netdev

On Tue, Jul 21, 2009 at 10:03:00AM +0300, Or Gerlitz wrote:
> 
> okay, when setting net.bridge.bridge-nf-call-iptables to zero, the VM TX /
> tap+bridge packet rate climbs from 170K to 195K, but it is still well below
> the 240K rate achieved by the raw mode --> we now have a clear sign of the
> performance gain this approach provides.

I find this hard to believe: the bridge sans netfilter does a
single lookup based on the MAC address and then just passes the
packet to the underlying driver.

Can you do an oprofile run to see if something else is chewing
up CPU time under the guise of bridging?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
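
For reference, the gaps between the three TX packet rates quoted above (170K with tap+bridge and netfilter on, 195K with netfilter off, 240K in raw mode) can be put in relative terms. The helper below is only an illustrative sketch; the name `pct_gain` is made up for this note and does not come from the thread or from any library.

```c
#include <assert.h>

/*
 * Relative gain of `to` over `from`, in percent.
 * Hypothetical helper, used only to put the quoted rates in proportion.
 */
static double pct_gain(double from, double to)
{
    return (to - from) / from * 100.0;
}
```

With the quoted figures, `pct_gain(170e3, 195e3)` is roughly 15%, `pct_gain(195e3, 240e3)` is roughly 23%, and `pct_gain(170e3, 240e3)` is roughly 41%, so disabling netfilter on the bridge recovers only part of the difference to raw mode.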


* Re: [Qemu-devel] [PATCH] net: add raw backend  - some performance measurements
  2009-07-21  7:25                   ` [Qemu-devel] [PATCH] net: add raw backend - some performance measurements Herbert Xu
@ 2009-07-21 10:17                     ` Or Gerlitz
  2009-07-21 10:27                     ` Michael S. Tsirkin
  1 sibling, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2009-07-21 10:17 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
	Michael S. Tsirkin, Mark McLoughlin, Dor Laor, netdev

Herbert Xu wrote:
> I find this hard to believe: the bridge sans netfilter does a single lookup based
> on the MAC address and then just passes the packet to the underlying driver.
> Can you do an oprofile run to see if something else is chewing
> up CPU time under the guise of bridging?

okay, here are the top time consumers for the three VM TX modes. The bridge code
is nowhere near the top... I'll send you the complete oprofile logs.

Or.

VM TX with the raw mode -->

samples  %        image name               app name                 symbol name
697453   25.2468  kvm-intel.ko             kvm_intel                vmx_vcpu_run
105024    3.8017  vmlinux                  vmlinux                  _raw_spin_lock
95443     3.4549  igb.ko                   igb                      igb_xmit_frame_adv
68617     2.4838  vmlinux                  vmlinux                  __slab_free
68168     2.4676  qemu-system-x86_64       qemu-system-x86_64       cpu_physical_memory_rw
56272     2.0370  vmlinux                  vmlinux                  tg_shares_up
48573     1.7583  igb.ko                   igb                      igb_clean_tx_irq
46128     1.6698  libc-2.5.so              libc-2.5.so              memcpy
44371     1.6062  vmlinux                  vmlinux                  kmem_cache_alloc
41485     1.5017  vmlinux                  vmlinux                  __alloc_skb
38719     1.4016  qemu-system-x86_64       qemu-system-x86_64       phys_page_find_alloc
38016     1.3761  vmlinux                  vmlinux                  copy_user_generic_string
37690     1.3643  qemu-system-x86_64       qemu-system-x86_64       qemu_get_ram_ptr
34321     1.2424  vmlinux                  vmlinux                  dev_kfree_skb_irq
34313     1.2421  vmlinux                  vmlinux                  __kmalloc_track_caller
28726     1.0398  vmlinux                  vmlinux                  sock_alloc_send_pskb
25195     0.9120  vmlinux                  vmlinux                  kfree
24790     0.8974  vmlinux                  vmlinux                  __slab_alloc
23406     0.8473  vmlinux                  vmlinux                  dev_queue_xmit

VM TX with the tap/bridge+netfilter OFF mode -->

samples  %        image name               app name                 symbol name
447119   21.5219  kvm-intel.ko             kvm_intel                vmx_vcpu_run
70774     3.4067  igb.ko                   igb                      igb_xmit_frame_adv
66324     3.1925  vmlinux                  vmlinux                  _raw_spin_lock
53817     2.5905  vmlinux                  vmlinux                  __slab_free
47494     2.2861  vmlinux                  vmlinux                  tg_shares_up
47213     2.2726  qemu-system-x86_64       qemu-system-x86_64       cpu_physical_memory_rw
40364     1.9429  igb.ko                   igb                      igb_clean_tx_irq
39545     1.9035  vmlinux                  vmlinux                  kmem_cache_alloc
36027     1.7341  libc-2.5.so              libc-2.5.so              memcpy
34945     1.6821  vmlinux                  vmlinux                  __alloc_skb
29747     1.4319  vmlinux                  vmlinux                  dev_kfree_skb_irq
29145     1.4029  vmlinux                  vmlinux                  __kmalloc_track_caller
28680     1.3805  vmlinux                  vmlinux                  copy_user_generic_string
26251     1.2636  qemu-system-x86_64       qemu-system-x86_64       phys_page_find_alloc
25123     1.2093  qemu-system-x86_64       qemu-system-x86_64       qemu_get_ram_ptr
23231     1.1182  vmlinux                  vmlinux                  eth_type_trans
22356     1.0761  vmlinux                  vmlinux                  sock_alloc_send_pskb
22108     1.0642  vmlinux                  vmlinux                  __slab_alloc
21288     1.0247  vmlinux                  vmlinux                  kfree

VM TX with the tap/bridge+netfilter ON mode -->

samples  %        image name               app name                 symbol name
319271   21.1411  kvm-intel.ko             kvm_intel                vmx_vcpu_run
46559     3.0830  vmlinux                  vmlinux                  _raw_spin_lock
39703     2.6290  vmlinux                  vmlinux                  tg_shares_up
35773     2.3688  vmlinux                  vmlinux                  __slab_free
35045     2.3206  qemu-system-x86_64       qemu-system-x86_64       cpu_physical_memory_rw
32612     2.1595  igb.ko                   igb                      igb_xmit_frame_adv
31779     2.1043  vmlinux                  vmlinux                  kmem_cache_alloc
29134     1.9292  libc-2.5.so              libc-2.5.so              memcpy
23031     1.5250  vmlinux                  vmlinux                  copy_user_generic_string
19713     1.3053  vmlinux                  vmlinux                  __kmalloc_track_caller
19303     1.2782  qemu-system-x86_64       qemu-system-x86_64       phys_page_find_alloc
19038     1.2606  vmlinux                  vmlinux                  __alloc_skb
18559     1.2289  vmlinux                  vmlinux                  kfree
18460     1.2224  qemu-system-x86_64       qemu-system-x86_64       qemu_get_ram_ptr
18409     1.2190  vmlinux                  vmlinux                  eth_type_trans
17828     1.1805  igb.ko                   igb                      igb_clean_tx_irq
17622     1.1669  igb.ko                   igb                      igb_poll
17303     1.1457  vmlinux                  vmlinux                  __slab_alloc
17033     1.1279  vmlinux                  vmlinux                  dev_kfree_skb_irq


* Re: [Qemu-devel] [PATCH] net: add raw backend  - some performance measurements
  2009-07-21  7:25                   ` [Qemu-devel] [PATCH] net: add raw backend - some performance measurements Herbert Xu
  2009-07-21 10:17                     ` Or Gerlitz
@ 2009-07-21 10:27                     ` Michael S. Tsirkin
  2009-07-21 11:05                       ` Or Gerlitz
  1 sibling, 1 reply; 7+ messages in thread
From: Michael S. Tsirkin @ 2009-07-21 10:27 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Or Gerlitz, Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
	Mark McLoughlin, Dor Laor, netdev

On Tue, Jul 21, 2009 at 03:25:46PM +0800, Herbert Xu wrote:
> On Tue, Jul 21, 2009 at 10:03:00AM +0300, Or Gerlitz wrote:
> > 
> > okay, when setting net.bridge.bridge-nf-call-iptables to zero, the VM TX /
> > tap+bridge packet rate climbs from 170K to 195K, but it is still well below
> > the 240K rate achieved by the raw mode --> we now have a clear sign of the
> > performance gain this approach provides.
> 
> I find this hard to believe: the bridge sans netfilter does a
> single lookup based on the MAC address and then just passes the
> packet to the underlying driver.

One advantage that raw sockets have over tap+bridge, is that they do not
do their own TX buffering, but use the TX queue for the device directly.
With raw sockets, send will block or fail if the TX queue for device is
full. With tap+bridge, the buffer in tap has to fill up instead, which
is not the same. I'm not sure this is the issue here, but could be: the
benchmark is UDP, isn't it?


> Can you do an oprofile run to see if something else is chewing
> up CPU time under the guise of bridging?
> 
> Thanks,
> -- 
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [Qemu-devel] [PATCH] net: add raw backend  - some performance measurements
  2009-07-21 10:27                     ` Michael S. Tsirkin
@ 2009-07-21 11:05                       ` Or Gerlitz
  2009-07-21 12:01                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2009-07-21 11:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Herbert Xu, Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
	Mark McLoughlin, Dor Laor, netdev

Michael S. Tsirkin wrote:
> With raw sockets, send will block or fail if the TX queue for device is
> full. With tap+bridge, the buffer in tap has to fill up instead, which
> is not the same. I'm not sure this is the issue here, but could be: the
> benchmark is UDP, isn't it?

Michael, 

What/where is this tap buffer? We're talking about VM TX, so looking at
tun_get_user() I see a call to skb_copy_datagram_from_iovec() to copy from
the user buffer to an skb, then a call to netif_rx_ni(), and that's it...
As for your question: indeed UDP, the VM runs netperf/UDP_STREAM.

Or.



* Re: [Qemu-devel] [PATCH] net: add raw backend  - some performance measurements
  2009-07-21 11:05                       ` Or Gerlitz
@ 2009-07-21 12:01                         ` Michael S. Tsirkin
  2009-07-21 12:14                           ` Herbert Xu
  0 siblings, 1 reply; 7+ messages in thread
From: Michael S. Tsirkin @ 2009-07-21 12:01 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Herbert Xu, Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
	Mark McLoughlin, Dor Laor, netdev

On Tue, Jul 21, 2009 at 02:05:32PM +0300, Or Gerlitz wrote:
> Michael S. Tsirkin wrote:
> > With raw sockets, send will block or fail if the TX queue for device is
> > full. With tap+bridge, the buffer in tap has to fill up instead, which
> > is not the same. I'm not sure this is the issue here, but could be: the
> > benchmark is UDP, isn't it?
> 
> Michael, 
> 
> What/where is this tap buffer?
> We're talking about VM TX, so looking at tun_get_user() I see a call to
> skb_copy_datagram_from_iovec() to copy from the user buffer to an skb,
> then a call to netif_rx_ni(), and that's it... As for your question:
> indeed UDP, the VM runs netperf/UDP_STREAM.
> 
> Or.

Queue is not the right word, sorry.

I was referring to the fact that, when the bridge floods a packet to
multiple interfaces, it clones the skb and frees the original, which
breaks the send buffer accounting in tun and might let you overrun the
tx queue in one of the devices.  This does not usually happen with raw
sockets.  This is the code in question:

                        if (prev != NULL) {
                                struct sk_buff *skb2;

                                if ((skb2 = skb_clone(skb, GFP_ATOMIC)) == NULL) {
                                        br->dev->stats.tx_dropped++;
                                        kfree_skb(skb);
                                        return;
                                }

                                __packet_hook(prev, skb2);
                        }

the thing to check then would be that some kind of misconfiguration
does not cause the bridge to flood your packets to multiple interfaces.

-- 
MST
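
The accounting problem described above can be illustrated with a userspace toy model. This is a sketch under stated assumptions, not kernel code: the `toy_sock` and `toy_skb` types and all `toy_*` functions are invented for this note, and they only mimic the idea that a buffer is charged to its sender on allocation, that a clone is not charged to anyone, and that freeing the original returns the charge. As I understand it, the real kernel does this through the socket's write-memory counter and an skb destructor; those details are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a sending socket (hypothetical, not a kernel type). */
struct toy_sock {
    int wmem_alloc;  /* bytes currently charged to the sender */
    int sndbuf;      /* send-buffer limit in bytes */
};

/* Toy stand-in for a packet buffer (hypothetical, not a kernel type). */
struct toy_skb {
    struct toy_sock *owner;  /* NULL when no sender is charged for it */
    int len;
};

/* Allocate a buffer and charge it to the sender; NULL if over the limit. */
static struct toy_skb *toy_alloc_send(struct toy_sock *sk, int len)
{
    struct toy_skb *skb;

    if (sk->wmem_alloc + len > sk->sndbuf)
        return NULL;  /* a real sender would block or see an error here */
    skb = malloc(sizeof(*skb));
    if (skb == NULL)
        return NULL;
    skb->owner = sk;
    skb->len = len;
    sk->wmem_alloc += len;
    return skb;
}

/* Clone a buffer; in this model the clone is not charged to anyone. */
static struct toy_skb *toy_clone(const struct toy_skb *skb)
{
    struct toy_skb *copy = malloc(sizeof(*copy));

    if (copy == NULL)
        return NULL;
    copy->owner = NULL;
    copy->len = skb->len;
    return copy;
}

/* Free a buffer and return its charge to the owner, if it has one. */
static void toy_free(struct toy_skb *skb)
{
    if (skb->owner != NULL)
        skb->owner->wmem_alloc -= skb->len;
    free(skb);
}
```

In this model, once the original is cloned and freed, `wmem_alloc` drops back to zero while the clone is still outstanding, so the sender is allowed to allocate again even though the earlier packet has not left the device. With a raw socket and no cloning, the charge would stay in place until the buffer itself is freed.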


* Re: [Qemu-devel] [PATCH] net: add raw backend  - some performance measurements
  2009-07-21 12:01                         ` Michael S. Tsirkin
@ 2009-07-21 12:14                           ` Herbert Xu
  2009-07-21 13:41                             ` Or Gerlitz
  0 siblings, 1 reply; 7+ messages in thread
From: Herbert Xu @ 2009-07-21 12:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
	Mark McLoughlin, Dor Laor, netdev

On Tue, Jul 21, 2009 at 03:01:42PM +0300, Michael S. Tsirkin wrote:
>
> the thing to check then would be that some kind of misconfiguration
> does not cause the bridge to flood your packets to multiple interfaces.

Right, we should make sure that the interfaces are not in promiscuous
mode.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [Qemu-devel] [PATCH] net: add raw backend  - some performance measurements
  2009-07-21 12:14                           ` Herbert Xu
@ 2009-07-21 13:41                             ` Or Gerlitz
  0 siblings, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2009-07-21 13:41 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Michael S. Tsirkin, Jamie Lokier, Anthony Liguori, qemu-devel,
	Jan Kiszka, Mark McLoughlin, Dor Laor, netdev

Herbert Xu wrote:
> On Tue, Jul 21, 2009 at 03:01:42PM +0300, Michael S. Tsirkin wrote:

>> the thing to check then would be that some kind of misconfiguration
>> does not cause the bridge to flood your packets to multiple interfaces.

> Right, we should make sure that the interfaces are not in promiscuous mode

Michael, Herbert, 

First, I don't see how flooding can happen in my setup: I have only two interfaces on
the bridge (see below), a tap and a NIC (vlan), and the bridge will never attempt to forward
a packet through the port it was received on. Second, the bridge always sets all interfaces
attached to it to promiscuous mode (see the call to dev_set_promiscuity() from br_add_if()),
but this doesn't mean it floods; it does MAC learning...

Or.

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.0030485f9977       no              eth1.4009
                                                        tap0

The VM MAC is de:ab:be:01:01:09 and the remote node MAC is 00:30:48:65:a6:2b; you
can see that these two MACs were learned by the bridge, and hence no flooding is expected.

# brctl showmacs br0
port no mac addr                is local?       ageing timer
  1     00:30:48:5f:99:77       yes                0.00
  1     00:30:48:65:a6:2b       no                12.50
  2     06:f5:76:64:a0:d4       yes                0.00
  2     de:ab:be:01:01:09       no                 0.00
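
The learn-then-forward argument can be sketched as a toy forwarding table. This is a minimal illustration, not the bridge's implementation: the `toy_fdb` type, the `toy_learn` and `toy_lookup` functions and the `TOY_FLOOD` constant are invented for this note, and a flat array stands in for whatever lookup structure the real bridge uses. The point is only that a destination MAC already in the table resolves to a single port, while an unknown destination is the case that would be flooded.

```c
#include <assert.h>
#include <string.h>

#define TOY_FDB_MAX 16
#define TOY_FLOOD   (-1)  /* "destination unknown, would flood" */

/* One learned address and the port it was seen on (hypothetical type). */
struct toy_fdb_entry {
    unsigned char mac[6];
    int port;
};

/* Toy forwarding database: a flat array searched linearly. */
struct toy_fdb {
    struct toy_fdb_entry e[TOY_FDB_MAX];
    int n;
};

/* Record that `mac` was seen as a source address on `port`. */
static void toy_learn(struct toy_fdb *fdb, const unsigned char mac[6], int port)
{
    int i;

    for (i = 0; i < fdb->n; i++) {
        if (memcmp(fdb->e[i].mac, mac, 6) == 0) {
            fdb->e[i].port = port;  /* known address: refresh its port */
            return;
        }
    }
    if (fdb->n < TOY_FDB_MAX) {
        memcpy(fdb->e[fdb->n].mac, mac, 6);
        fdb->e[fdb->n].port = port;
        fdb->n++;
    }
}

/* Return the port for a destination MAC, or TOY_FLOOD if it is unknown. */
static int toy_lookup(const struct toy_fdb *fdb, const unsigned char mac[6])
{
    int i;

    for (i = 0; i < fdb->n; i++) {
        if (memcmp(fdb->e[i].mac, mac, 6) == 0)
            return fdb->e[i].port;
    }
    return TOY_FLOOD;
}
```

Fed with the two non-local entries from the `brctl showmacs` output above (00:30:48:65:a6:2b on port 1 and de:ab:be:01:01:09 on port 2), `toy_lookup` returns a single port for either address, which matches the expectation that traffic between the VM and the remote node is forwarded rather than flooded.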


Thread overview: 7+ messages
     [not found] <20090701162115.GA4555@shareable.org>
     [not found] ` <4A4CA747.1050509@Voltaire.com>
     [not found]   ` <20090703023911.GD938@shareable.org>
     [not found]     ` <4A534EC4.5030209@voltaire.com>
     [not found]       ` <20090707145739.GB14392@shareable.org>
     [not found]         ` <4A54B0F1.3070201@voltaire.com>
     [not found]           ` <20090715203806.GF3056@shareable.org>
     [not found]             ` <4A647B72.5090404@Voltaire.com>
     [not found]               ` <20090720155308.GA9327@gondor.apana.org.au>
     [not found]                 ` <4A656824.7070100@Voltaire.com>
2009-07-21  7:25                   ` [Qemu-devel] [PATCH] net: add raw backend - some performance measurements Herbert Xu
2009-07-21 10:17                     ` Or Gerlitz
2009-07-21 10:27                     ` Michael S. Tsirkin
2009-07-21 11:05                       ` Or Gerlitz
2009-07-21 12:01                         ` Michael S. Tsirkin
2009-07-21 12:14                           ` Herbert Xu
2009-07-21 13:41                             ` Or Gerlitz
