From: Ben Greear <greearb@candelatech.com>
To: Ben Hutchings <bhutchings@solarflare.com>
Cc: David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, linux-net-drivers@solarflare.com
Subject: Re: [PATCH net 1/2] tcp: Limit number of segments generated by GSO per skb
Date: Mon, 30 Jul 2012 14:00:25 -0700
Message-ID: <5016F5E9.7010704@candelatech.com>
In-Reply-To: <1343677270.2667.31.camel@bwh-desktop.uk.solarflarecom.com>

On 07/30/2012 12:41 PM, Ben Hutchings wrote:
> On Mon, 2012-07-30 at 10:23 -0700, Ben Greear wrote:
>> On 07/30/2012 10:16 AM, Ben Hutchings wrote:
>>> A peer (or local user) may cause TCP to use a nominal MSS of as little
>>> as 88 (actual MSS of 76 with timestamps).  Given that we have a
>>> sufficiently prodigious local sender and the peer ACKs quickly enough,
>>> it is nevertheless possible to grow the window for such a connection
>>> to the point that we will try to send just under 64K at once.  This
>>> results in a single skb that expands to 861 segments.
>>>
>>> In some drivers with TSO support, such an skb will require hundreds of
>>> DMA descriptors; a substantial fraction of a TX ring or even more than
>>> a full ring.  The TX queue selected for the skb may stall and trigger
>>> the TX watchdog repeatedly (since the problem skb will be retried
>>> after the TX reset).  This particularly affects sfc, for which the
>>> issue is designated as CVE-2012-3412.  However it may be that some
>>> hardware or firmware also fails to handle such an extreme TSO request
>>> correctly.
>>>
>>> Therefore, limit the number of segments per skb to 100.  This should
>>> make no difference to behaviour unless the actual MSS is less than
>>> about 700.
>>
>> Please do not do this... or at least allow overrides.  We love
>> the trick of setting a very small MSS and making the NICs generate
>> huge numbers of small TCP frames with efficient user-space
>> logic.  We use this for stateful TCP load testing when high
>> numbers of TCP packets per second are desired.
>
> Please test whether this actually makes a difference - my suspicion is
> that 100 segments per skb is easily enough to prevent the host being a
> bottleneck.

Any CPU I can save I can use for other tasks.  If we can use the
NIC's offload features to segment packets, then we get a near-linear
increase in packets per second by adding NICs, at least up to whatever
the total bandwidth of the system is.

If you want to have the OS default to a safe value, that is
fine by me, but please give us a tunable so that we can get
the old behaviour.

It's always possible I'm not the only one using this,
and I think it would be considered bad form to break
existing features and provide no work-around.

Thanks,
Ben

>
>> Intel NICs, including 10G, work just fine with minimal MSS
>> in this scenario.
>
> I'll leave this to the Intel maintainers to answer.
>
> Ben.
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
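
The segment counts quoted from the patch description can be checked with a
rough sketch. This is a simplified model, not the kernel's logic: it assumes
a 65535-byte maximum IP datagram, 20-byte IPv4 and TCP headers, and a 12-byte
timestamp option (the difference between the nominal MSS of 88 and the actual
MSS of 76), and that only full-sized segments are counted. The function names
and the optional max_segs parameter are hypothetical, chosen only for
illustration.

```python
# Simplified model of how many segments one GSO skb expands to.
# Assumptions (not stated in the thread): IPv4, no IP options, TCP
# timestamps enabled, and the skb filled up to the largest IP datagram.
IP_MAX_LEN = 65535   # largest IPv4 datagram, in bytes
IPV4_HDR = 20        # IPv4 header without options
TCP_HDR = 20         # TCP header without options
TS_OPT = 12          # TCP timestamp option, padded

MAX_PAYLOAD = IP_MAX_LEN - IPV4_HDR - TCP_HDR - TS_OPT


def actual_mss(nominal_mss):
    """Payload bytes per segment once the timestamp option is subtracted."""
    return nominal_mss - TS_OPT


def segments_per_skb(mss, max_segs=None):
    """Full-sized segments in one maximal skb, optionally capped."""
    segs = MAX_PAYLOAD // mss
    if max_segs is not None:
        segs = min(segs, max_segs)
    return segs


def min_unaffected_mss(max_segs):
    """Smallest MSS for which a cap of max_segs never takes effect."""
    return MAX_PAYLOAD // (max_segs + 1) + 1


if __name__ == "__main__":
    mss = actual_mss(88)
    print("actual MSS:", mss)
    print("segments, no cap:", segments_per_skb(mss))
    print("segments, cap 100:", segments_per_skb(mss, max_segs=100))
    print("cap 100 is inactive from MSS:", min_unaffected_mss(100))
```

Under these assumptions the model gives 861 segments for an actual MSS of 76,
matching the figure in the patch description, and a cap of 100 stops mattering
at an MSS in the mid-600s, consistent with the stated "less than about 700".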
