From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org, caitlinb@broadcom.com, kelly@au1.ibm.com,
rusty@rustcorp.com.au
Subject: Re: Initial benchmarks of some VJ ideas [mmap memcpy vs copy_to_user].
Date: Thu, 11 May 2006 20:18:15 +0400 [thread overview]
Message-ID: <20060511161815.GA623@2ka.mipt.ru> (raw)
In-Reply-To: <20060511083031.GA12712@2ka.mipt.ru>
On Thu, May 11, 2006 at 12:30:32PM +0400, Evgeniy Polyakov (johnpol@2ka.mipt.ru) wrote:
> On Thu, May 11, 2006 at 12:07:21AM -0700, David S. Miller (davem@davemloft.net) wrote:
> > You can test with single stream, but then you are only testing
> > in-cache case. Try several thousand sockets and real load from many
> > unique source systems, it becomes interesting then.
>
> I can test system with large number of streams, but unfortunately only
> from small number of different src/dst ip addresses, so I can not
> benchmark route lookup performance in layered design.
I've run it with 200 UDP sockets in the receive path, fed by two load
generator machines with 100 clients each.
There are no copies of skb->data in recvmsg().
Since I only have a 1Gb link I'm unable to provide each client with high
bandwidth, so they send 4k chunks.
Performance dropped by half, down to 55 MB/sec, and CPU usage increased
noticeably (slowly drifting from 12% to 8%, compared to 2% with one socket),
but I believe this is not a cache effect; rather it is due to the greatly
increased number of syscalls per second.
Here is the profile result:
1463625 78.0003 poll_idle
19171 1.0217 _spin_lock_irqsave
15887 0.8467 _read_lock
14712 0.7840 kfree
13370 0.7125 ip_frag_queue
11896 0.6340 delay_pmtmr
11811 0.6294 _spin_lock
11723 0.6247 csum_partial
11399 0.6075 ip_frag_destroy
11063 0.5896 serial_in
10533 0.5613 skb_release_data
10524 0.5609 ip_route_input
10319 0.5499 __alloc_skb
9903 0.5278 ip_defrag
9889 0.5270 _read_unlock
9536 0.5082 _write_unlock
8639 0.4604 _write_lock
7557 0.4027 netif_receive_skb
6748 0.3596 ip_frag_intern
6534 0.3482 preempt_schedule
6220 0.3315 __kmalloc
6005 0.3200 schedule
5924 0.3157 irq_entries_start
5823 0.3103 _spin_unlock_irqrestore
5678 0.3026 ip_rcv
5410 0.2883 __kfree_skb
5056 0.2694 kmem_cache_alloc
5014 0.2672 kfree_skb
4900 0.2611 eth_type_trans
4067 0.2167 kmem_cache_free
3532 0.1882 udp_recvmsg
3531 0.1882 ip_frag_reasm
3331 0.1775 _read_lock_irqsave
3327 0.1773 ipq_kill
3304 0.1761 udp_v4_lookup_longway
I'm going to resurrect the zero-copy sniffer project [1] and create a special
socket option which would allow inserting the pages containing
skb->data into the process VMA using VM remapping tricks. Unfortunately this
requires TLB flushing, and there will probably be no significant
performance/CPU gain, if any, but I think it is the only way to provide
zero-copy receive access for hardware which does not support header split.
The other idea, which I will try if I understood you correctly, is to create a
unified cache. I think some interesting results can be obtained from the
following approach: in the softirq we do not process skb->data at all, but only
read the src/dst/sport/dport/protocol numbers (this touches at most two cache
lines; if the packet is not a fast-path one, e.g. IPsec, it can be processed as
usual) and create an "initial" cache based on that data. The skb is then queued
into that "initial" cache entry, and recvmsg() later processes that entry in
process context.
Back to the drawing board...
Thanks for the discussion.
1. zero-copy sniffer
http://tservice.net.ru/~s0mbre/old/?section=projects&item=af_tlb
--
Evgeniy Polyakov
Thread overview: 11+ messages
2006-05-08 12:24 Initial benchmarks of some VJ ideas [mmap memcpy vs copy_to_user] Evgeniy Polyakov
2006-05-08 19:51 ` Evgeniy Polyakov
2006-05-08 20:15 ` David S. Miller
2006-05-10 19:58 ` David S. Miller
2006-05-11 6:40 ` Evgeniy Polyakov
2006-05-11 7:07 ` David S. Miller
2006-05-11 8:30 ` Evgeniy Polyakov
2006-05-11 16:18 ` Evgeniy Polyakov [this message]
2006-05-11 18:54 ` David S. Miller
2006-05-11 19:30 ` Rick Jones
2006-05-12 7:54 ` Evgeniy Polyakov