* MTU and TCP transmit offload.
From: Ben Greear @ 2011-09-21 21:06 UTC
To: netdev
We saw something interesting while doing some testing
on 3.0.4.
We configured 2 Ethernet NICs with standard 1500 MTU, and added
a mac-vlan on each, with MTU of 300. The goal was to generate as
many ~300 byte TCP packets as possible, for load testing purposes.
We configured our tool to open sockets on the mac-vlans and send/receive
TCP (IPv4) traffic.
This actually seems to work quite nicely, allowing user-space to
do large writes (24k in our case), and it appears to put lots of
small packets on the wire. We still need to sniff with an external
system to verify this, but the packets-per-second counters look good.
Evidently this all works because macvlans know that the NIC
can do TSO, and the '300' MTU is passed in the big packet
given to the NIC.
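As a rough back-of-the-envelope check of what one such write should turn
into on the wire, here is a small sketch. It assumes "24k" means 24 KiB,
IPv4 and TCP headers of 20 bytes each with no options unless stated, and
the function name is made up for illustration only.

```python
def tso_segments(write_bytes, mtu, ip_header=20, tcp_header=20, tcp_options=0):
    """Return the TCP payload size of each packet a single write is cut into.

    This is plain arithmetic, not a model of any particular NIC or driver.
    """
    payload = mtu - ip_header - tcp_header - tcp_options
    if payload <= 0:
        raise ValueError("MTU too small to carry any TCP payload")
    full, rest = divmod(write_bytes, payload)
    # 'full' packets carry a whole payload; a short tail packet carries the rest.
    return [payload] * full + ([rest] if rest else [])


if __name__ == "__main__":
    sizes = tso_segments(24 * 1024, mtu=300)
    print(len(sizes), sizes[0], sizes[-1])   # 95 260 136
    # With 12 bytes of TCP timestamp option per segment (an assumption):
    sizes_ts = tso_segments(24 * 1024, mtu=300, tcp_options=12)
    print(len(sizes_ts), sizes_ts[0], sizes_ts[-1])   # 100 248 24
```

So a single 24 KiB write would be expected to produce on the order of 95 to
100 packets at a 300-byte MTU, which is the kind of number to compare
against the packets-per-second counters.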
This got me thinking...at least for my purposes, it would be
nice to have a per-socket 'MTU' setting. The idea is that
you could ask the NIC to do the TSO at whatever 'mtu' you
wanted, without having to resort to mac-vlans with artificially
small MTU.
So, is there any interest in supporting such a socket option?
I can't think of any use besides TCP traffic load testing, but
perhaps someone else can think of one? Or, is load-testing
enough?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: MTU and TCP transmit offload.
From: Rick Jones @ 2011-09-21 22:11 UTC
To: Ben Greear; +Cc: netdev
On 09/21/2011 02:06 PM, Ben Greear wrote:
> We saw something interesting while doing some testing
> on 3.0.4.
>
> We configured 2 Ethernet NICs with standard 1500 MTU, and added
> a mac-vlan on each, with MTU of 300. The goal was to generate as
> many ~300 byte TCP packets as possible, for load testing purposes.
> We configured our tool to open sockets on the mac-vlans and send/receive
> TCP (IPv4) traffic.
Presumably one could instead set static PathMTU entries in the routing
tables and accomplish the same thing as you did with the mac-vlans?
> This actually seems to work quite nicely, allowing user-space to
> do large writes (24k in our case), and it appears to put lots of
> small packets on the wire. We still need to sniff with an external
> system to verify this, but the packets-per-second counters look good.
>
> Evidently this all works because macvlans know that the NIC
> can do TSO, and the '300' MTU is passed in the big packet
> given to the NIC.
>
> This got me thinking...at least for my purposes, it would be
> nice to have a per-socket 'MTU' setting. The idea is that
> you could ask the NIC to do the TSO at whatever 'mtu' you
> wanted, without having to resort to mac-vlans with artificially
> small MTU.
>
> So, is there any interest in supporting such a socket option?
>
> I can't think of any use besides TCP traffic load testing, but
> perhaps someone else can think of one? Or, is load-testing
> enough?
Isn't that covered by setsockopt() support for TCP_MAXSEG? With TSO
what gets passed to the NIC isn't the MTU, but the connection's MSS,
derived (in part at least) from the MTU of the egress interface. If one
had made a setsockopt(TCP_MAXSEG) call prior to the connect() or
listen() call, presumably that would have influenced the MSS exchange at
connection establishment.
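For illustration, a minimal sketch of that call order using Python's
socket module on Linux. The helper names are hypothetical, the loopback
listener exists only to complete a handshake, and the 20-byte IPv4 and TCP
header sizes assume no options.

```python
import socket

IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu):
    """MSS that corresponds to a given IPv4 MTU (headers without options)."""
    return mtu - IPV4_HEADER - TCP_HEADER


def tcp_socket_with_mss(mss):
    """Create a TCP socket and cap its MSS before connect() or listen()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, mss)
    return s


def negotiated_mss_on_loopback(mss):
    """Connect to a throwaway loopback listener and read back the MSS in use."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = tcp_socket_with_mss(mss)
    try:
        listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick one
        listener.listen(1)
        client.connect(listener.getsockname())
        peer, _ = listener.accept()
        peer.close()
        return client.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    wanted = mss_for_mtu(300)             # 260 for a 300-byte MTU
    print("requested", wanted, "in use", negotiated_mss_on_loopback(wanted))
```

The value read back after the handshake may be somewhat smaller than the
one requested, for example when TCP timestamps take up option space in
each segment, so it is best treated as an upper bound.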
rick jones
* Re: MTU and TCP transmit offload.
From: Ben Greear @ 2011-09-21 22:16 UTC
To: Rick Jones; +Cc: netdev
On 09/21/2011 03:11 PM, Rick Jones wrote:
> On 09/21/2011 02:06 PM, Ben Greear wrote:
>> We saw something interesting while doing some testing
>> on 3.0.4.
>>
>> We configured 2 Ethernet NICs with standard 1500 MTU, and added
>> a mac-vlan on each, with MTU of 300. The goal was to generate as
>> many ~300 byte TCP packets as possible, for load testing purposes.
>> We configured our tool to open sockets on the mac-vlans and send/receive
>> TCP (IPv4) traffic.
>
> Presumably one could instead set static PathMTU entries in the routing tables and accomplish the same thing as you did with the mac-vlans?
>
>> This actually seems to work quite nicely, allowing user-space to
>> do large writes (24k in our case), and it appears to put lots of
>> small packets on the wire. We still need to sniff with an external
>> system to verify this, but the packets-per-second counters look good.
>>
>> Evidently this all works because macvlans know that the NIC
>> can do TSO, and the '300' MTU is passed in the big packet
>> given to the NIC.
>>
>> This got me thinking...at least for my purposes, it would be
>> nice to have a per-socket 'MTU' setting. The idea is that
>> you could ask the NIC to do the TSO at whatever 'mtu' you
>> wanted, without having to resort to mac-vlans with artificially
>> small MTU.
>>
>> So, is there any interest in supporting such a socket option?
>>
>> I can't think of any use besides TCP traffic load testing, but
>> perhaps someone else can think of one? Or, is load-testing
>> enough?
>
> Isn't that covered by setsockopt() support for TCP_MAXSEG? With TSO
> what gets passed to the NIC isn't the MTU, but the connection's MSS,
> derived (in part at least) from the MTU of the egress interface. If one
> had made a setsockopt(TCP_MAXSEG) call prior to the connect() or
> listen() call, presumably that would have influenced the MSS exchange at
> connection establishment.
Ohh, that looks promising!
I'll give that a try.
Thanks,
Ben
>
> rick jones
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: MTU and TCP transmit offload.
From: Ben Greear @ 2011-09-21 23:42 UTC
To: Rick Jones; +Cc: netdev
On 09/21/2011 03:16 PM, Ben Greear wrote:
> On 09/21/2011 03:11 PM, Rick Jones wrote:
>> On 09/21/2011 02:06 PM, Ben Greear wrote:
>>> We saw something interesting while doing some testing
>>> on 3.0.4.
>>>
>>> We configured 2 Ethernet NICs with standard 1500 MTU, and added
>>> a mac-vlan on each, with MTU of 300. The goal was to generate as
>>> many ~300 byte TCP packets as possible, for load testing purposes.
>>> We configured our tool to open sockets on the mac-vlans and send/receive
>>> TCP (IPv4) traffic.
>>
>> Presumably one could instead set static PathMTU entries in the routing tables and accomplish the same thing as you did with the mac-vlans?
>>
>>> This actually seems to work quite nicely, allowing user-space to
>>> do large writes (24k in our case), and it appears to put lots of
>>> small packets on the wire. We still need to sniff with an external
>>> system to verify this, but the packets-per-second counters look good.
>>>
>>> Evidently this all works because macvlans know that the NIC
>>> can do TSO, and the '300' MTU is passed in the big packet
>>> given to the NIC.
>>>
>>> This got me thinking...at least for my purposes, it would be
>>> nice to have a per-socket 'MTU' setting. The idea is that
>>> you could ask the NIC to do the TSO at whatever 'mtu' you
>>> wanted, without having to resort to mac-vlans with artificially
>>> small MTU.
>>>
>>> So, is there any interest in supporting such a socket option?
>>>
>>> I can't think of any use besides TCP traffic load testing, but
>>> perhaps someone else can think of one? Or, is load-testing
>>> enough?
>>
>> Isn't that covered by setsockopt() support for TCP_MAXSEG? With TSO
>> what gets passed to the NIC isn't the MTU, but the connection's MSS,
>> derived (in part at least) from the MTU of the egress interface. If one
>> had made a setsockopt(TCP_MAXSEG) call prior to the connect() or
>> listen() call, presumably that would have influenced the MSS exchange at
>> connection establishment.
>
> Ohh, that looks promising!
>
> I'll give that a try.
This works like a charm. I'm so glad I don't need to hack
a new sockopt!
Thanks,
Ben
>
> Thanks,
> Ben
>
>>
>> rick jones
>
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: MTU and TCP transmit offload.
From: Rick Jones @ 2011-09-21 23:58 UTC
To: Ben Greear; +Cc: netdev
>>> Isn't that covered by setsockopt() support for TCP_MAXSEG? With
>>> TSO what gets passed to the NIC isn't the MTU, but the
>>> connection's MSS, derived (in part at least) from the MTU of the
>>> egress interface. If one had made a setsockopt(TCP_MAXSEG) call
>>> prior to the connect() or listen() call, presumably that would
>>> have influenced the MSS exchange at connection establishment.
>>
>> Ohh, that looks promising!
>>
>> I'll give that a try.
>
> This works like a charm. I'm so glad I don't need to hack
> a new sockopt!
Glad it works for you.
happy benchmarking,
rick jones
One of these days I'll have to decide whether to add that as an option to
netperf, which presently only does the getsockopt(TCP_MAXSEG). Folks
with a strong preference one way or t'other should feel free to contact
me on the side or via netperf-talk at netperf.org.