* Re: [Qemu-devel] [PATCH] net: add raw backend - some performance measurements

From: Herbert Xu @ 2009-07-21  7:25 UTC
To: Or Gerlitz
Cc: Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
    Michael S. Tsirkin, Mark McLoughlin, Dor Laor, netdev

On Tue, Jul 21, 2009 at 10:03:00AM +0300, Or Gerlitz wrote:
>
> okay, when setting net.bridge.bridge-nf-call-iptables to zero, the VM TX /
> tap+bridge packet rate climbs from 170K to 195K, but it is still well below
> the 240K rate achieved by the raw mode --> we now have a clear sign of the
> performance gain this approach provides.

I find this hard to believe.  The bridge sans netfilter does a single
lookup based on the MAC address and then just passes the packet to the
underlying driver.

Can you do an oprofile run to see if something else is chewing up CPU
time under the guise of bridging?

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [Qemu-devel] [PATCH] net: add raw backend - some performance measurements

From: Or Gerlitz @ 2009-07-21 10:17 UTC
To: Herbert Xu
Cc: Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
    Michael S. Tsirkin, Mark McLoughlin, Dor Laor, netdev

Herbert Xu wrote:
> I find this hard to believe.  The bridge sans netfilter does a single
> lookup based on the MAC address and then just passes the packet to the
> underlying driver.
> Can you do an oprofile run to see if something else is chewing
> up CPU time under the guise of bridging?

okay, here are the top twenty time consumers for each of the three VM TX
modes; the bridge code does not show up anywhere near the top... I'll
send you the complete oprofile logs.

Or.

VM TX with the raw mode -->

samples  %        image name          app name            symbol name
697453   25.2468  kvm-intel.ko        kvm_intel           vmx_vcpu_run
105024    3.8017  vmlinux             vmlinux             _raw_spin_lock
95443     3.4549  igb.ko              igb                 igb_xmit_frame_adv
68617     2.4838  vmlinux             vmlinux             __slab_free
68168     2.4676  qemu-system-x86_64  qemu-system-x86_64  cpu_physical_memory_rw
56272     2.0370  vmlinux             vmlinux             tg_shares_up
48573     1.7583  igb.ko              igb                 igb_clean_tx_irq
46128     1.6698  libc-2.5.so         libc-2.5.so         memcpy
44371     1.6062  vmlinux             vmlinux             kmem_cache_alloc
41485     1.5017  vmlinux             vmlinux             __alloc_skb
38719     1.4016  qemu-system-x86_64  qemu-system-x86_64  phys_page_find_alloc
38016     1.3761  vmlinux             vmlinux             copy_user_generic_string
37690     1.3643  qemu-system-x86_64  qemu-system-x86_64  qemu_get_ram_ptr
34321     1.2424  vmlinux             vmlinux             dev_kfree_skb_irq
34313     1.2421  vmlinux             vmlinux             __kmalloc_track_caller
28726     1.0398  vmlinux             vmlinux             sock_alloc_send_pskb
25195     0.9120  vmlinux             vmlinux             kfree
24790     0.8974  vmlinux             vmlinux             __slab_alloc
23406     0.8473  vmlinux             vmlinux             dev_queue_xmit

VM TX with the tap/bridge+netfilter OFF mode -->

samples  %        image name          app name            symbol name
447119   21.5219  kvm-intel.ko        kvm_intel           vmx_vcpu_run
70774     3.4067  igb.ko              igb                 igb_xmit_frame_adv
66324     3.1925  vmlinux             vmlinux             _raw_spin_lock
53817     2.5905  vmlinux             vmlinux             __slab_free
47494     2.2861  vmlinux             vmlinux             tg_shares_up
47213     2.2726  qemu-system-x86_64  qemu-system-x86_64  cpu_physical_memory_rw
40364     1.9429  igb.ko              igb                 igb_clean_tx_irq
39545     1.9035  vmlinux             vmlinux             kmem_cache_alloc
36027     1.7341  libc-2.5.so         libc-2.5.so         memcpy
34945     1.6821  vmlinux             vmlinux             __alloc_skb
29747     1.4319  vmlinux             vmlinux             dev_kfree_skb_irq
29145     1.4029  vmlinux             vmlinux             __kmalloc_track_caller
28680     1.3805  vmlinux             vmlinux             copy_user_generic_string
26251     1.2636  qemu-system-x86_64  qemu-system-x86_64  phys_page_find_alloc
25123     1.2093  qemu-system-x86_64  qemu-system-x86_64  qemu_get_ram_ptr
23231     1.1182  vmlinux             vmlinux             eth_type_trans
22356     1.0761  vmlinux             vmlinux             sock_alloc_send_pskb
22108     1.0642  vmlinux             vmlinux             __slab_alloc
21288     1.0247  vmlinux             vmlinux             kfree

VM TX with the tap/bridge+netfilter ON mode -->

samples  %        image name          app name            symbol name
319271   21.1411  kvm-intel.ko        kvm_intel           vmx_vcpu_run
46559     3.0830  vmlinux             vmlinux             _raw_spin_lock
39703     2.6290  vmlinux             vmlinux             tg_shares_up
35773     2.3688  vmlinux             vmlinux             __slab_free
35045     2.3206  qemu-system-x86_64  qemu-system-x86_64  cpu_physical_memory_rw
32612     2.1595  igb.ko              igb                 igb_xmit_frame_adv
31779     2.1043  vmlinux             vmlinux             kmem_cache_alloc
29134     1.9292  libc-2.5.so         libc-2.5.so         memcpy
23031     1.5250  vmlinux             vmlinux             copy_user_generic_string
19713     1.3053  vmlinux             vmlinux             __kmalloc_track_caller
19303     1.2782  qemu-system-x86_64  qemu-system-x86_64  phys_page_find_alloc
19038     1.2606  vmlinux             vmlinux             __alloc_skb
18559     1.2289  vmlinux             vmlinux             kfree
18460     1.2224  qemu-system-x86_64  qemu-system-x86_64  qemu_get_ram_ptr
18409     1.2190  vmlinux             vmlinux             eth_type_trans
17828     1.1805  igb.ko              igb                 igb_clean_tx_irq
17622     1.1669  igb.ko              igb                 igb_poll
17303     1.1457  vmlinux             vmlinux             __slab_alloc
17033     1.1279  vmlinux             vmlinux             dev_kfree_skb_irq

* Re: [Qemu-devel] [PATCH] net: add raw backend - some performance measurements

From: Michael S. Tsirkin @ 2009-07-21 10:27 UTC
To: Herbert Xu
Cc: Or Gerlitz, Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
    Mark McLoughlin, Dor Laor, netdev

On Tue, Jul 21, 2009 at 03:25:46PM +0800, Herbert Xu wrote:
> On Tue, Jul 21, 2009 at 10:03:00AM +0300, Or Gerlitz wrote:
> >
> > okay, when setting net.bridge.bridge-nf-call-iptables to zero, the VM TX /
> > tap+bridge packet rate climbs from 170K to 195K, but it is still well below
> > the 240K rate achieved by the raw mode --> we now have a clear sign of the
> > performance gain this approach provides.
>
> I find this hard to believe.  The bridge sans netfilter does a
> single lookup based on the MAC address and then just passes the
> packet to the underlying driver.

One advantage that raw sockets have over tap+bridge is that they do not
do their own TX buffering but use the TX queue of the device directly.
With raw sockets, send will block or fail if the TX queue of the device
is full.  With tap+bridge, the buffer in tap has to fill up instead,
which is not the same.  I'm not sure this is the issue here, but it
could be: the benchmark is UDP, isn't it?

> Can you do an oprofile run to see if something else is chewing
> up CPU time under the guise of bridging?
>
> Thanks,

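For reference, below is a minimal userspace sketch of the raw-socket TX
semantics Michael describes: an AF_PACKET socket bound straight to the
NIC, where a send() is charged against the device TX queue and blocks
(or fails) when the queue fills up.  The interface name ("eth1") and
the dummy frame are placeholders; this illustrates the socket API, not
the actual qemu raw backend code.

/* Raw-socket TX sketch: needs root; interface name is a placeholder. */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>
#include <net/if.h>
#include <arpa/inet.h>

int main(void)
{
    /* AF_PACKET/SOCK_RAW carries whole ethernet frames, as a VM TX path needs. */
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_ll sll = {
        .sll_family   = AF_PACKET,
        .sll_protocol = htons(ETH_P_ALL),
        .sll_ifindex  = if_nametoindex("eth1"),   /* placeholder NIC */
    };
    if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
        perror("bind");
        return 1;
    }

    unsigned char frame[ETH_ZLEN] = { 0 };        /* dummy frame */
    /* On a blocking socket this send() back-pressures the sender once
     * the device TX queue is full: no intermediate buffer in between. */
    if (send(fd, frame, sizeof(frame), 0) < 0)
        perror("send");

    close(fd);
    return 0;
}
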
* Re: [Qemu-devel] [PATCH] net: add raw backend - some performance measurements

From: Or Gerlitz @ 2009-07-21 11:05 UTC
To: Michael S. Tsirkin
Cc: Herbert Xu, Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
    Mark McLoughlin, Dor Laor, netdev

Michael S. Tsirkin wrote:
> With raw sockets, send will block or fail if the TX queue of the device
> is full.  With tap+bridge, the buffer in tap has to fill up instead,
> which is not the same.  I'm not sure this is the issue here, but it
> could be: the benchmark is UDP, isn't it?

Michael,

What/where is this tap buffer?  We're talking about VM TX, so looking at
tun_get_user() I see a call to skb_copy_datagram_from_iovec() to copy
from the user buffer to an skb, then a call to netif_rx_ni(), and that's
it...  As for your question, indeed UDP; the VM runs netperf/UDP_STREAM.

Or.

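For comparison with the raw-socket sketch above, here is a minimal
sketch of the userspace side of the tap TX path under discussion: each
write() on the tap fd lands in tun_get_user(), which copies the buffer
into an skb and hands it to the stack via netif_rx_ni().  The device
name is a placeholder and the error handling is deliberately thin.

/* Tap TX sketch: needs root; device name is a placeholder. */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <linux/if_tun.h>

int main(void)
{
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;          /* ethernet frames, no extra header */
    strncpy(ifr.ifr_name, "tap0", IFNAMSIZ - 1);  /* placeholder name */
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) { perror("TUNSETIFF"); return 1; }

    unsigned char frame[60] = { 0 };              /* dummy frame */
    /* One guest TX == one write(); tun copies it into an skb and
     * injects it into the stack with netif_rx_ni(). */
    if (write(fd, frame, sizeof(frame)) < 0)
        perror("write");

    close(fd);
    return 0;
}
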
* Re: [Qemu-devel] [PATCH] net: add raw backend - some performance measurements

From: Michael S. Tsirkin @ 2009-07-21 12:01 UTC
To: Or Gerlitz
Cc: Herbert Xu, Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
    Mark McLoughlin, Dor Laor, netdev

On Tue, Jul 21, 2009 at 02:05:32PM +0300, Or Gerlitz wrote:
> Michael,
>
> What/where is this tap buffer?  We're talking about VM TX, so looking at
> tun_get_user() I see a call to skb_copy_datagram_from_iovec() to copy
> from the user buffer to an skb, then a call to netif_rx_ni(), and that's
> it...  As for your question, indeed UDP; the VM runs netperf/UDP_STREAM.
>
> Or.

Queue is not the right word, sorry.  I was referring to the fact that,
when the bridge floods a packet to multiple interfaces, it clones the
skb and frees the original, which breaks the send buffer accounting in
tun and might let you overrun the TX queue in one of the devices.  This
does not usually happen with raw sockets.  This is the code in question:

	if (prev != NULL) {
		struct sk_buff *skb2;

		if ((skb2 = skb_clone(skb, GFP_ATOMIC)) == NULL) {
			br->dev->stats.tx_dropped++;
			kfree_skb(skb);
			return;
		}

		__packet_hook(prev, skb2);
	}

The thing to check, then, would be that some kind of misconfiguration
does not cause the bridge to flood your packets to multiple interfaces.

--
MST

* Re: [Qemu-devel] [PATCH] net: add raw backend - some performance measurements

From: Herbert Xu @ 2009-07-21 12:14 UTC
To: Michael S. Tsirkin
Cc: Or Gerlitz, Jamie Lokier, Anthony Liguori, qemu-devel, Jan Kiszka,
    Mark McLoughlin, Dor Laor, netdev

On Tue, Jul 21, 2009 at 03:01:42PM +0300, Michael S. Tsirkin wrote:
>
> The thing to check, then, would be that some kind of misconfiguration
> does not cause the bridge to flood your packets to multiple interfaces.

Right, we should make sure that the interfaces are not in promiscuous
mode.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [Qemu-devel] [PATCH] net: add raw backend - some performance measurements

From: Or Gerlitz @ 2009-07-21 13:41 UTC
To: Herbert Xu
Cc: Michael S. Tsirkin, Jamie Lokier, Anthony Liguori, qemu-devel,
    Jan Kiszka, Mark McLoughlin, Dor Laor, netdev

Herbert Xu wrote:
> On Tue, Jul 21, 2009 at 03:01:42PM +0300, Michael S. Tsirkin wrote:
> > The thing to check, then, would be that some kind of misconfiguration
> > does not cause the bridge to flood your packets to multiple interfaces.
>
> Right, we should make sure that the interfaces are not in promiscuous
> mode.

Michael, Herbert,

First, I don't see how flooding can happen in my setup: I have only two
interfaces on the bridge (see below), a tap and a NIC (vlan), and the
bridge never attempts to forward a packet through the port it was
received on.

Second, the bridge always sets all interfaces attached to it to
promiscuous mode (see the call to dev_set_promiscuity() from
br_add_if()), but this doesn't mean it applies flooding; it does MAC
learning...

Or.

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.0030485f9977       no              eth1.4009
                                                        tap0

The VM MAC is de:ab:be:01:01:09 and the remote node MAC is
00:30:48:65:a6:2b; you can see below that both MACs were learned by the
bridge, hence no flooding is expected.

# brctl showmacs br0
port no mac addr                is local?       ageing timer
  1     00:30:48:5f:99:77       yes                0.00
  1     00:30:48:65:a6:2b       no                12.50
  2     06:f5:76:64:a0:d4       yes                0.00
  2     de:ab:be:01:01:09       no                 0.00

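As a footnote to the promiscuous-mode question, here is a small sketch
for querying an interface's flags from userspace; the interface name is
a placeholder.  Note the caveat: SIOCGIFFLAGS reports the
userspace-requested flag, so kernel-internal promiscuity such as the
bridge's own dev_set_promiscuity() call may not show up here.

/* Promiscuous-mode check sketch: interface name is a placeholder. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(void)
{
    /* Any socket will do as an ioctl handle. */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "tap0", IFNAMSIZ - 1);  /* placeholder */
    if (ioctl(fd, SIOCGIFFLAGS, &ifr) < 0) { perror("SIOCGIFFLAGS"); return 1; }

    printf("%s is %sin promiscuous mode\n", ifr.ifr_name,
           (ifr.ifr_flags & IFF_PROMISC) ? "" : "not ");

    close(fd);
    return 0;
}
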