public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Jeff V. Merkey" <jmerkey@devicelogics.com>
To: Andi Kleen <ak@suse.de>
Cc: ltd@cisco.com, linux-kernel@vger.kernel.org
Subject: Re: Linux 2.6.9 pktgen module causes INIT process respawning   and sickness
Date: Tue, 23 Nov 2004 14:57:16 -0700	[thread overview]
Message-ID: <41A3B23C.2080406@devicelogics.com> (raw)
In-Reply-To: <p73k6sc1epi.fsf@bragg.suse.de>


Andi,

For network forensics and analysis, it is almost a requirement if you 
are using Linux. The bus speeds on these systems
also support 450 MB/S throughput for disk and network I/O. I agree it's 
not that interesting if you are
deploying file servers that are remote attached on PPPoE and PPP as a 
network server or workstation, given
that NFS and userspace servers like SAMBA are predominant in Linux as 
file service. High performance real time
network analysis is a different story. High performance I/O file service 
and storage are also
interesting and I can see folks wanting it.

I guess I have a hard time understanding the following statement,

" ... perhaps [supporting 10 GbE and 1GbE for high performance beyond 
remote internet access ] is not that interesting ... "

Hope it's not too wet in Germany this time of year. I am heading back to 
Stolberg and Heinsberg
to show off our new baby boy born Oct 11, 2004 to his O-ma and O-O-ma (I 
guess this is how you spell this)
end of January (I hope). I might be even make it to Nurnberg while I'm 
at it. :-)

Implementation of this with skb's would not be trivial. M$ in their 
network drivers did this sort of circular list of pages
structure per adapter for receives and use it "pinned" to some of their 
proprietary drivers in W2K and would use their
version of an skb as a "pointer" of sorts that could dynamically assign 
a filled page from this list as a "receive" then perform
the user space copy from the page and release it back to the adapter. 
This allowed them to fill the ring buffers with static
addresses and copy into user space as fast as they could allocate 
control blocks.

For linux, I would guess the easiest way to do this same sort of thing 
would be to allocate a page per ring buffer
entry, pin the entries, and use allocated skb buffers to point into the 
buffer long enough to copy out the data. This would
**HELP** currently but not fix the problem completely, but the approach 
would allow linux to easily move to a table driven
method since it would switch from a ring of pinned pages to tables of 
pinned pages that could be swapped in and out.

We would need to logically detach the memory from the skb and make the 
skb a pointer block into the skb->data
area of the list. M$ does something similiar to what I described. It 
does make the whole skb_clone thing
a lot more complicated but for those apps that need to "hold" skb's 
which is infrequent for most cases,
someone could just call skb_clone() when they needed a private sopy of 
and skb->data pair.

Jeff

Andi Kleen wrote:

>"Jeff V. Merkey" <jmerkey@devicelogics.com> writes:
>  
>
>>I can sustain full line rate gigabit on two adapters at the tsame time
>>with a 12 CLK interpacket gap time and 0 dropped packets at 64
>>byte sizes from a Smartbits to Linux provided the adapter ring buffer
>>is loaded with static addresses. This demonstrates that it is
>>possible to sustain 64 byte packet rates at full line rate with
>>current DMA architectures on 400 Mhz buses with Linux.
>>(which means it will handle any network loading scenario). The
>>bottleneck from my measurements appears to be the
>>overhead of serializing writes to the adapter ring buffer IO
>>memory. The current drivers also perform interrupt
>>coalescing very well with Linux. What's needed is a method for
>>submission of ring buffer entries that can be sent in large
>>scatter gather listings rather than one at a time. Ring buffers
>>    
>>
>
>Batching would also decrease locking overhead on the Linux side (less
>spinlocks taken)
>
>We do it already for TCP using TSO for upto 64K packets when
>the hardware supports it. There were some ideas some time back
>to do it also for routing and other protocols - basically passing 
>lists of skbs to hard_start_xmit instead of always single ones - 
>but nobody implemented it so far.
>
>It was one entry in the "ideas to speed up the network stack" 
>list i posted some time back.
>
>With TSO working fine it doesn't seem to be that pressing.
>
>One problem with the TSO implementation is that TSO only works for a
>single connection. If you have hundreds that chatter in small packets
>it won't help batching that up. Problem is that batching data from
>separate sockets up would need more global lists and add possible SMP
>scalability problems from more locks and more shared state. This 
>is a real concern on Linux now - 512 CPU machines are really unforgiving.
>
>However in practice it doesn't seem to be that big a problem because
>it's extremly unlikely that you'll sustain even a gigabit ethernet
>with such a multi process load. It has far more non network CPU
>overhead than a simple packet generator or pktgen.
>
>So overall I agree with Lincoln that the small packet case is not
>that interesting except perhaps for DOS testing.
>
>-Andi
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>
>  
>


  reply	other threads:[~2004-11-23 21:45 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <419E6B44.8050505@devicelogics.com.suse.lists.linux.kernel>
     [not found] ` <5.1.0.14.2.20041122144144.04e3d9f0@171.71.163.14.suse.lists.linux.kernel>
     [not found]   ` <5.1.0.14.2.20041123094109.04003720@171.71.163.14.suse.lists.linux.kernel>
     [not found]     ` <41A2862A.2000602@devicelogics.com.suse.lists.linux.kernel>
2004-11-23 20:40       ` Linux 2.6.9 pktgen module causes INIT process respawning and sickness Andi Kleen
2004-11-23 21:57         ` Jeff V. Merkey [this message]
2004-11-23 22:27           ` Andi Kleen
2004-11-23 22:54             ` Jeff V. Merkey
2004-11-25  2:17               ` Lincoln Dale
2004-11-22 18:30 Jeff V. Merkey
  -- strict thread matches above, loose matches on Subject: below --
2004-11-19 21:53 Jeff V. Merkey
2004-11-19 22:06 ` Jeff V. Merkey
2004-11-22  3:44   ` Lincoln Dale
2004-11-22 17:06     ` Jeff V. Merkey
2004-11-22 22:50       ` Lincoln Dale
2004-11-23  0:36         ` Jeff V. Merkey
2004-11-23  1:06           ` Jeff V. Merkey
2004-11-23  1:25             ` Lincoln Dale
2004-11-23  1:23           ` Lincoln Dale
2004-11-22 17:19   ` Martin Josefsson
2004-11-22 18:33     ` Jeff V. Merkey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=41A3B23C.2080406@devicelogics.com \
    --to=jmerkey@devicelogics.com \
    --cc=ak@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ltd@cisco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox