From: Eric Dumazet <eric.dumazet@gmail.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: netdev@vger.kernel.org,
virtualization@lists.linux-foundation.org,
Divy Le Ray <divy@chelsio.com>, Roland Dreier <rolandd@cisco.com>,
Pavel Emelianov <xemul@openvz.org>,
Dan Williams <dcbw@redhat.com>,
libertas-dev@lists.infradead.org
Subject: Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit
Date: Wed, 03 Jun 2009 23:02:53 +0200
Message-ID: <4A26E4FD.5010405@gmail.com>
In-Reply-To: <200906012157.29465.rusty@rustcorp.com.au>

Rusty Russell wrote:
> On Sat, 30 May 2009 12:41:00 am Eric Dumazet wrote:
>> Rusty Russell wrote:
>>> DaveM points out that there are advantages to doing it generally (it's
>>> more likely to be on same CPU than after xmit), and I couldn't find
>>> any new starvation issues in simple benchmarking here.
>> If really no starvations are possible at all, I really wonder why some
>> guys added memory accounting to UDP flows. Maybe they don't run "simple
>> benchmarks" but real apps? :)
>
> Well, without any accounting at all you could use quite a lot of memory as
> there are many places packets can be queued.
>
>> For TCP, I agree your patch is a huge benefit, since it's paced by remote
>> ACKs and window control
>
> I doubt that. There'll be some cache friendliness, but I'm not sure it'll be
> measurable, let alone "huge". It's the win to drivers which don't have a
> timely and batching tx free mechanism which I aim for.

At 250,000 packets/second on a Gigabit link, this is huge, I can tell you.
(250,000 incoming packets and 250,000 outgoing packets per second, 700 Mbit/s)

According to this oprofile on CPU0 (dedicated to softirqs on one bnx2 eth adapter),
sock_wfree() is number 2 in the profile, because it touches three cache lines per
socket and per transmitted packet in the TX completion handler.

Also, taking a reference on the socket for each xmit packet in flight is very
expensive, since it slows down the receiver in __udp4_lib_lookup(). Several CPUs
are fighting for the sk->refcnt cache line.

CPU: Core 2, speed 3000.24 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  cum. samples  %        cum. %   symbol name
21215    21215         11.8847  11.8847  bnx2_poll_work
17239    38454         9.6573   21.5420  sock_wfree << effect of udp memory accounting >>
14817    53271         8.3005   29.8425  __slab_free
14635    67906         8.1986   38.0411  __udp4_lib_lookup
11425    79331         6.4003   44.4414  __alloc_skb
9710     89041         5.4396   49.8810  __slab_alloc
8095     97136         4.5348   54.4158  __udp4_lib_rcv
7831     104967        4.3869   58.8027  sock_def_write_space
7586     112553        4.2497   63.0524  ip_rcv
7518     120071        4.2116   67.2640  skb_dma_unmap
6711     126782        3.7595   71.0235  netif_receive_skb
6272     133054        3.5136   74.5371  udp_queue_rcv_skb
5262     138316        2.9478   77.4849  skb_release_data
5023     143339        2.8139   80.2988  __kmalloc_track_caller
4070     147409        2.2800   82.5788  kmem_cache_alloc
3216     150625        1.8016   84.3804  ipt_do_table
2576     153201        1.4431   85.8235  skb_queue_tail
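
To make the per-packet cost above concrete, here is a minimal userspace sketch of
the pattern I am describing. It is NOT the kernel code: struct model_sock,
struct model_skb and the model_*() helpers are hypothetical names invented for
this illustration, and C11 atomics stand in for the kernel's atomic_t. The
write-space wakeup that also shows up in the profile is left out. The point is
only that every transmitted packet does two atomic operations on the socket at
xmit time and two more at TX completion, and that one of those counters is the
same one the receive lookup increments from another CPU.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; not the real kernel layout. */
struct model_sock {
	atomic_int refcnt;     /* models sk->refcnt */
	atomic_int wmem_alloc; /* models the socket's send memory accounting */
};

/* Hypothetical stand-in for struct sk_buff. */
struct model_skb {
	struct model_sock *sk;
	int truesize;          /* bytes charged to the socket for this packet */
};

/* Xmit path: charge the packet to the socket and pin the socket. */
static void model_set_owner_w(struct model_skb *skb, struct model_sock *sk)
{
	skb->sk = sk;
	atomic_fetch_add(&sk->refcnt, 1);                 /* one reference per packet in flight */
	atomic_fetch_add(&sk->wmem_alloc, skb->truesize); /* memory accounting */
}

/* TX completion path: uncharge the packet and drop the reference. */
static void model_wfree(struct model_skb *skb)
{
	struct model_sock *sk = skb->sk;

	atomic_fetch_sub(&sk->wmem_alloc, skb->truesize);
	atomic_fetch_sub(&sk->refcnt, 1);
	skb->sk = NULL;
}

/* Receive path: the lookup also takes a reference on the same counter,
 * so it competes with the two functions above for that cache line. */
static struct model_sock *model_lookup(struct model_sock *sk)
{
	atomic_fetch_add(&sk->refcnt, 1);
	return sk;
}
```

With the socket starting at one reference, one call to model_set_owner_w()
followed by model_wfree() leaves both counters where they started, but has
dirtied the socket's cache lines twice along the way, once on each path.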