From: David Daney <ddaney@caviumnetworks.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: ralf@linux-mips.org, linux-mips@linux-mips.org,
	netdev@vger.kernel.org, gregkh@suse.de
Subject: Re: [PATCH 4/4] Staging: Octeon: Free transmit SKBs in a timely manner.
Date: Mon, 15 Feb 2010 14:05:22 -0800
Message-ID: <4B79C522.4040405@caviumnetworks.com>
In-Reply-To: <1266268271.2859.22.camel@edumazet-laptop>

On 02/15/2010 01:11 PM, Eric Dumazet wrote:
> On Monday 15 February 2010 at 12:41 -0800, David Daney wrote:
>> On 02/15/2010 12:27 PM, Eric Dumazet wrote:
>>> On Monday 15 February 2010 at 12:13 -0800, David Daney wrote:
>>>> If we wait for the once-per-second cleanup to free transmit SKBs,
>>>> sockets with small transmit buffer sizes might spend most of their
>>>> time blocked waiting for the cleanup.
>>>>
>>>> Normally we do a cleanup for each transmitted packet. We add a
>>>> watchdog type timer so that we also schedule a timeout for 150uS after
>>>> a packet is transmitted. The watchdog is reset for each transmitted
>>>> packet, so for high packet rates, it never expires. At these high
>>>> rates, the cleanups are done for each packet so the extra watchdog
>>>> initiated cleanups are not needed.
>>>
>>> s/needed/fired/
>>>
>>
>> or perhaps s/are not needed/are neither needed nor fired/
>>
>>> Hmm, but re-arming a timer for each transmitted packet must have a cost?
>>>
>>
>> The cost is fairly low (less than 10 processor clock cycles). We didn't
>> add this for amusement; people actually do things like only send UDP
>> packets from userspace. Since we can fill the transmit queue faster
>> than it is emptied, the socket transmit buffer is quickly consumed. If
>> we don't free the SKBs in short order, the transmitting process gets to
>> take a long sleep (until our previous once-per-second cleanup task was
>> run).
>
> I understand this, but traditionally, NIC drivers don't use a timer, but a
> 'TX complete' interrupt, that usually fires a few us after packet
> submission at Gigabit speed.
>
Indeed. Lacking this type of interrupt, the watchdog seemed the best
short-term solution.

I am investigating the possibility of feeding TX complete notifications
back through the RX path, where it is possible to generate interrupts.
The drawback to this is that it takes a lot more CPU cycles and adds
cache pressure.
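For reference, the re-armed watchdog described in the patch text can be modelled in a few lines of userspace Python. This is only a sketch of the timing behaviour, not the driver code: `simulate` and `WATCHDOG_US` are made-up names, and times are plain microsecond integers rather than anything tied to a real timer API.

```python
WATCHDOG_US = 150  # timeout taken from the patch description


def simulate(tx_times_us, watchdog_us=WATCHDOG_US):
    """Return the times (in microseconds) at which the watchdog would fire.

    Every transmit re-arms the watchdog to expire watchdog_us later, so it
    only fires when no further packet is sent before the deadline.
    """
    fires = []
    deadline = None
    for t in sorted(tx_times_us):
        # The previous deadline passed before this transmit: it fired.
        if deadline is not None and t >= deadline:
            fires.append(deadline)
        # Transmitting a packet (re-)arms the watchdog.
        deadline = t + watchdog_us
    # The watchdog armed by the last packet is never reset, so it fires.
    if deadline is not None:
        fires.append(deadline)
    return fires


# Hypothetical traffic patterns, chosen only for illustration.
busy = simulate(range(0, 1000, 10))   # one packet every 10 us
sparse = simulate([0, 1000, 2000])    # one packet every 1000 us
```

In this model the busy sender only sees a single expiry, 150 us after its last packet, while the sparse sender gets one expiry per packet.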
> A fast program could try to send X small UDP packets in less than 150
> us, X being greater than the size of your TX ring.

My TX queue (it is not a ring) size can be made arbitrarily large
(currently 1000). 64 bytes * 1000 packets * 10 bits/byte / 1e9
bits/sec == 640uS. My watchdog will fire after less than 1/4 of the
queue capacity is freed.
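The arithmetic above can be checked with a short script. It is a sketch under the same assumptions as the estimate (64-byte packets, a 1000-entry queue, 10 bits on the wire per byte); `drain_time_us` is a made-up helper name, and the 10 Gbit figure is simply the same formula applied to the faster link.

```python
def drain_time_us(packets, bytes_per_packet, bits_per_byte, link_bps):
    """Time in microseconds for the link to drain a full queue."""
    bits = packets * bytes_per_packet * bits_per_byte
    return bits * 1_000_000 / link_bps


QUEUE_LEN = 1000   # current queue size mentioned above
WATCHDOG_US = 150  # watchdog timeout from the patch description

t_1g = drain_time_us(QUEUE_LEN, 64, 10, 1_000_000_000)    # 1 Gbit/s port
t_10g = drain_time_us(QUEUE_LEN, 64, 10, 10_000_000_000)  # 10 Gbit/s port

# Fraction of a full queue sent on the 1 Gbit/s port before the watchdog fires.
fraction_1g = WATCHDOG_US / t_1g

print(f"1 Gbit/s:  {t_1g:.0f} us to drain, {fraction_1g:.3f} sent at expiry")
print(f"10 Gbit/s: {t_10g:.0f} us to drain")
```

Under these assumptions the 10 Gbit/s port drains a full queue in less time than the watchdog interval, which is where the burstiness discussed below comes from.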
>
> So your patch makes the window smaller, but it still is there (at
> physical layer, we'll see a burst of packets, a ~100us delay, then a
> second burst)
>
With this patch, there will be no burstiness using default socket buffer
sizes and packets of arbitrary size on a standard 1gig port.

On the 10gig ports there is the possibility for burstiness as you aptly
explain. However, in practice it would be difficult to arrange things
to achieve sufficiently high packet rates, so we can live with it like this.
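To put a rough number on "sufficiently high packet rates": taking Eric's condition literally (more packets than the queue holds, submitted within one 150 us watchdog interval), the sender would need to sustain roughly the rate computed below. This is a back-of-the-envelope sketch; `min_rate_pps` is a made-up name, and it ignores socket buffer limits and everything else in the transmit path.

```python
def min_rate_pps(queue_len, window_us):
    """Packets per second needed to submit queue_len packets in window_us."""
    return queue_len * 1_000_000 / window_us


# 1000-entry queue and 150 us watchdog interval, as discussed above.
rate = min_rate_pps(1000, 150)
print(f"about {rate / 1e6:.2f} million packets per second")
```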
David Daney