From: Oliver Dain <omd1@cornell.edu>
To: Gianni Tedesco <gianni@scaramanga.co.uk>, odain2@mindspring.com
Cc: linux-kernel@vger.kernel.org
Subject: Re: CONFIG_PACKET_MMAP revisited
Date: Thu, 6 Nov 2003 09:13:41 -0500
Message-ID: <200311060913.41719.omd1@cornell.edu>
In-Reply-To: <1068116914.6144.1410.camel@lemsip>

On Thursday November 6 2003 6:08 am, Gianni Tedesco wrote:
> On Wed, 2003-10-29 at 05:09, odain2@mindspring.com wrote:
> > I believe that in normal operation each packet
> > (or with NICs that do interrupt coalescing, every n packets) causes an
> > interrupt which causes a context switch, the kernel then copies the data
> > from the DMA buffer to the shared buffer and does a RETI.  That's fairly
> > expensive.
>
> The cost of handling that interrupt and doing an iret is unavoidable
> (ignoring NAPI). The main point you are missing with the ring buffer is
> that if packets come in at a fast enough rate, the usermode task never
> context switches, because there is always data in the ring buffer, so it
> loops in usermode forever.

It seems to me that it can't loop in user mode forever.  There is no way to
get data into user mode (the ring buffer) without going through the kernel.
My understanding is that the NIC doesn't transfer directly to the user mode
ring buffer, but rather to a different DMA buffer.  The kernel must copy it
from the DMA buffer to the ring buffer.  Thus once the user mode app has
processed all the data in the ring buffer, the kernel _must_ get involved to
get more data to user space.  Currently the data gets there because the NIC
produces an interrupt for each packet (or for every few packets), and when the
kernel handles these the data is copied to user space.  Then, as you point
out, the cost of the RETI can't be avoided.
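
To make the handoff concrete, here is a minimal user-space simulation of the
ownership idea, not the real PACKET_MMAP code.  The names (struct ring,
kernel_deliver, user_drain, SLOT_KERNEL, SLOT_USER) are made up for this
sketch; the two slot states only mirror the idea behind TP_STATUS_KERNEL and
TP_STATUS_USER.  The point it illustrates: once user_drain() has emptied the
ring it returns 0, and nothing more arrives until the "kernel" side runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot states, defined locally for this sketch. */
enum slot_owner { SLOT_KERNEL = 0, SLOT_USER = 1 };

#define RING_SLOTS 4

struct ring {
    enum slot_owner owner[RING_SLOTS];
    int payload[RING_SLOTS];   /* stand-in for packet data */
    size_t head;               /* next slot the "kernel" fills */
    size_t tail;               /* next slot user space reads */
};

/* Stand-in for the kernel copying one packet from its DMA buffer into
 * the shared ring.  Returns 1 on success, 0 if the ring is full. */
static int kernel_deliver(struct ring *r, int packet)
{
    assert(r != NULL);
    if (r->owner[r->head] != SLOT_KERNEL)
        return 0;                          /* user has not freed this slot */
    r->payload[r->head] = packet;
    r->owner[r->head] = SLOT_USER;         /* hand the slot to user space */
    r->head = (r->head + 1) % RING_SLOTS;
    return 1;
}

/* User side: consume every ready slot without entering the kernel.
 * Returns the number of packets consumed; 0 means the ring is empty
 * and only the kernel side can make further progress. */
static int user_drain(struct ring *r, long *sum)
{
    int consumed = 0;
    assert(r != NULL && sum != NULL);
    while (r->owner[r->tail] == SLOT_USER) {
        *sum += r->payload[r->tail];       /* "process" the packet */
        r->owner[r->tail] = SLOT_KERNEL;   /* give the slot back */
        r->tail = (r->tail + 1) % RING_SLOTS;
        consumed++;
    }
    return consumed;
}
```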

NAPI tries to solve this problem.  I don't know much about NAPI, but as I
understand it, the idea is this: the cost of the RETIs and context switches
(which occur on each interrupt) can be reduced if the NIC doesn't produce an
interrupt for every packet but instead does interrupt coalescing.  This only
goes so far, though.  If too many packets are coalesced, the data copied by
the kernel will no longer fit in the L1 cache and we'll pay the price of
moving it there twice (once when the kernel copies the data from main memory
to the ring buffer and once when the user mode application reads it out of
the ring).  Latency may also become a problem.  Beyond that, we've still got
a context switch every time the user mode application has processed
everything in the ring buffer (and perhaps more often), and we're still
paying the price of copying data from the DMA buffer to the ring.
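
The tradeoff can be put in rough numbers.  The helpers below are only
back-of-the-envelope arithmetic with hypothetical names; the packet size and
cache size you feed them are assumptions, not measurements.  Coalescing n
packets per interrupt divides the interrupt rate by n, but the batch the
kernel copies grows to n times the packet size, which at some point exceeds
the L1 data cache.

```c
#include <assert.h>
#include <stddef.h>

/* Interrupts per second if the NIC raises one interrupt for every
 * `coalesce` packets (rounded up). */
static unsigned long interrupts_per_sec(unsigned long packets_per_sec,
                                        unsigned long coalesce)
{
    assert(coalesce > 0);
    return (packets_per_sec + coalesce - 1) / coalesce;
}

/* Returns 1 if one coalesced batch fits in a cache of `cache_bytes`,
 * 0 otherwise.  Ignores everything else competing for the cache. */
static int batch_fits_in_cache(unsigned long coalesce, size_t packet_bytes,
                               size_t cache_bytes)
{
    return coalesce * packet_bytes <= cache_bytes;
}
```

For example, assuming 1500-byte packets and a 16 KiB L1 data cache, a batch
of 8 packets (12000 bytes) fits while a batch of 16 (24000 bytes) does not.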

However, if the NIC could transfer the data directly to user space it wouldn't 
need to cause an interrupt and the cost of the RETI and the context switch is 
avoided.  The user mode app really could process forever without sleeping at 
that point.

> The problem could be the packets are coming in just too slow to allow
> the ring buffer to work properly and causing the application to sleep on
> poll(2) every time. This would kill performance at pathological packet
> rates I guess.
>
> You could work around this by spinning for a few thousand spins before
> calling poll(2) (or even indefinitely for that matter, and allow the
> kernel to preempt you if need be).
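
The spin-then-poll workaround suggested above might look roughly like the
sketch below.  The function name, the result codes, and the convention that a
nonzero status word means "data ready" are all assumptions made for
illustration; a real PACKET_MMAP reader would check the status field of the
next frame header in the ring and would likely need appropriate memory
barriers.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum wait_result {
    WAIT_ERROR   = -1,  /* poll(2) failed */
    WAIT_SPUN    = 0,   /* data showed up while spinning, no syscall made */
    WAIT_POLLED  = 1,   /* had to sleep in poll(2), which reported an event */
    WAIT_TIMEOUT = 2    /* poll(2) timed out */
};

/* Busy-wait up to max_spins times for *slot_status to become nonzero,
 * then fall back to sleeping in poll(2) on fd. */
static enum wait_result wait_for_packet(const volatile unsigned long *slot_status,
                                        int fd, long max_spins, int timeout_ms)
{
    struct pollfd pfd;
    int rc;
    long i;

    assert(slot_status != NULL);

    /* Fast path: stay in user mode while packets keep arriving. */
    for (i = 0; i < max_spins; i++) {
        if (*slot_status != 0)
            return WAIT_SPUN;
    }

    /* Slow path: give up the CPU until the kernel has something for us. */
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return WAIT_ERROR;
    if (rc == 0)
        return WAIT_TIMEOUT;
    return WAIT_POLLED;
}
```

How large max_spins should be depends on the packet rate and on how much CPU
one is willing to burn; the sketch does not try to pick a value.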


