From: starlight@binnacle.cx
To: chetan loke <loke.chetan@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
linux-kernel@vger.kernel.org, netdev <netdev@vger.kernel.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Christoph Lameter <cl@gentwo.org>, Willy Tarreau <w@1wt.eu>,
Ingo Molnar <mingo@elte.hu>,
Stephen Hemminger <stephen.hemminger@vyatta.com>,
Benjamin LaHaise <bcrl@kvack.org>, Joe Perches <joe@perches.com>,
lokechetan@gmail.com, Con Kolivas <conman@kolivas.org>,
Serge Belyshev <belyshev@depni.sinp.msu.ru>
Subject: Re: big picture UDP/IP performance question re 2.6.18 -> 2.6.32
Date: Fri, 07 Oct 2011 14:37:47 -0400
Message-ID: <6.2.5.6.2.20111007143050.039bd578@binnacle.cx>
In-Reply-To: <CAAsGZS4s1wTWW1j7FRUWW9jqpPUVF3Q46AMa7+njvE1ckX0Snw@mail.gmail.com>
At 02:09 PM 10/7/2011 -0400, chetan loke wrote:
>I'm a little confused. Seems like there are
>conflicting goals. If you want to bypass the
>kernel-protocol-stack then you have the following
>options: a) kernel af_packet. This is where we
>would get a chance to test all the kernel features
>etc.
Perhaps I haven't been sufficiently clear.
The "packet socket" mode I refer to in the
earlier post was using AF/PF_PACKET mode sockets
as in
socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
Have run it in both normal and memory mapped
modes. MMAP mode is a slight bit more expensive
due to the cache pressure from the additional
copy. On the 6174 MMAP seems to be a smidgen
better in certain tests, but in the end both
read() and mapped approaches are effectively
identical on performance--and generally match
the cost of UDP sockets almost exactly.
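For anyone following along, the two receive modes compared above can be sketched roughly as below. This is a minimal illustration, not the code used in the tests: the helper names (open_packet_socket, read_one_frame, map_rx_ring) and the ring sizes are assumptions made up for the example, and error handling is kept to the bare minimum.

```c
#include <arpa/inet.h>        /* htons */
#include <errno.h>
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* sockaddr_ll, tpacket_req, PACKET_RX_RING */
#include <net/if.h>           /* if_nametoindex */
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a raw packet socket bound to one interface (hypothetical helper).
 * Returns the fd, or -1 with errno set (e.g. EPERM without CAP_NET_RAW). */
int open_packet_socket(const char *ifname)
{
    unsigned int ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1;                       /* unknown interface */

    int fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

    struct sockaddr_ll sll;
    memset(&sll, 0, sizeof(sll));
    sll.sll_family   = AF_PACKET;
    sll.sll_protocol = htons(ETH_P_ALL);
    sll.sll_ifindex  = (int)ifindex;

    if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* "Normal" mode: one system call per frame, copied into buf. */
ssize_t read_one_frame(int fd, void *buf, size_t len)
{
    return recv(fd, buf, len, 0);
}

/* Memory-mapped mode: ask the kernel for an RX ring and map it.
 * The kernel copies each frame into the ring; user space then reads
 * frames in place. Sizes are illustrative assumptions, not tuned values.
 * Fills *req and returns the mapping, or MAP_FAILED on error. */
void *map_rx_ring(int fd, struct tpacket_req *req)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return MAP_FAILED;

    memset(req, 0, sizeof(*req));
    req->tp_block_size = (unsigned int)page;  /* multiple of the page size */
    req->tp_frame_size = 2048;                /* multiple of TPACKET_ALIGNMENT */
    req->tp_block_nr   = 64;
    req->tp_frame_nr   = req->tp_block_nr *
                         (req->tp_block_size / req->tp_frame_size);

    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req)) < 0)
        return MAP_FAILED;

    size_t len = (size_t)req->tp_block_size * req->tp_block_nr;
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

A caller would open the socket on the receive interface (root or CAP_NET_RAW is required), then either loop on read_one_frame() or call map_rx_ring() once and walk the frames in the mapping, waiting with poll() when the ring is empty.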
>b) Use non-commodity(?) NICs (from vendors
>you mentioned): where it might have some on-board
>memory(cushion) and so it can absorb the spikes
>and can also smoothen out too many
>PCI-transactions for bursty (and small payload -
>as in 64 byte traffic). But wait, when you use the
>libs provided by these vendors, then their
>driver(especially the Rx path) is not so much
>working in inline mode as NIC drivers in case a)
>above. This driver with a special Rx-path purely
>exists for managing your mmap'd queues. So
>of course it's going to be faster than the
>traditional inline drivers. In this partial-inline
>mode, the adapter might i) batch the packets and
>ii) send a single notification to the
>host-side. With that single event you are now
>processing 1+ packets.
Kernel bypass is probably the best answer for
what we do. Problem has been lack of maturity
in their driver software. Looks like it's reaching
a point where they cover our use case. As mentioned
earlier, Solarflare could not match the Intel
82599 + ixgbe for this app last year. Was a
disaster. Myricom is focused on UDP (better
for us), but only just added multi-core IRQ
doorbell wakeups in recent months. Previously
one had to accept all IRQs on a single core or
poll, neither of which works for us.
>You got it. In case of tilera there are two modes:
>tile-cpu in device mode: beats most of the
>non-COTS NICs. It runs linux on the adapter
>side. Imagine having the flexibility/power to
>program the ASIC using your favorite OS. It's
>orgasmic. So go for it! tile-cpu in host-mode:
>Yes, it could be a game changer.
We almost went for the 1st gen Tile64 outboard
NIC approach, but were concerned about whether
they would survive--still are. Intel has
crushed more than a few competitors along
the way. If Google or Facebook buys into the
Tile-Gx it becomes a safe choice overnight.