From: Claudiu Manoil <claudiu.manoil@freescale.com>
To: Tomas Hruby <thruby@gmail.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: <netdev@vger.kernel.org>, "David S. Miller" <davem@davemloft.net>
Subject: Re: [RFC net-next 0/4] gianfar: Use separate NAPI for Tx confirmation processing
Date: Mon, 13 Aug 2012 19:23:09 +0300
Message-ID: <502929ED.2050703@freescale.com>
In-Reply-To: <5023D21E.1000008@freescale.com>
On 08/09/2012 06:07 PM, Claudiu Manoil wrote:
> On 8/9/2012 2:06 AM, Tomas Hruby wrote:
>> On Wed, Aug 8, 2012 at 9:44 AM, Eric Dumazet <eric.dumazet@gmail.com>
>> wrote:
>>> On Wed, 2012-08-08 at 12:24 -0400, Paul Gortmaker wrote:
>>>> [[RFC net-next 0/4] gianfar: Use separate NAPI for Tx confirmation
>>>> processing] On 08/08/2012 (Wed 15:26) Claudiu Manoil wrote:
>>>>
>>>>> Hi all,
>>>>> This set of patches basically splits the existing napi poll
>>>>> routine into
>>>>> two separate napi functions, one for Rx processing (triggered by
>>>>> frame
>>>>> receive interrupts only) and one for the Tx confirmation path
>>>>> processing
>>>>> (triggered by Tx confirmation interrupts only). The polling
>>>>> algorithm
>>>>> behind remains much the same.
>>>>>
>>>>> Important throughput improvements have been noted on low power
>>>>> boards with
>>>>> this set of changes.
>>>>> For instance, for the following netperf test:
>>>>> netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
>>>>> yields a throughput gain from oscillating ~500-~700 Mbps to steady
>>>>> ~940 Mbps,
>>>>> (if the Rx/Tx paths are processed on different cores), w/ no
>>>>> increase in CPU%,
>>>>> on a p1020rdb - 2 core machine featuring etsec2.0 (Multi-Queue
>>>>> Multi-Group
>>>>> driver mode).
>>>>
>>>> It would be interesting to know more about what was causing that large
>>>> an oscillation -- presumably you will have it reappear once one core
>>>> becomes 100% utilized. Also, any thoughts on how the change will
>>>> change
>>>> performance on an older low power single core gianfar system (e.g.
>>>> 83xx)?
>>>
>>> I also was wondering if this low performance could be caused by BQL
>>>
>>> Since the TCP stack is driven by incoming ACKs, a NAPI run could have to
>>> handle 10 TCP acks in a row, and resulting xmits could hit BQL and
>>> transit on qdisc (because the NAPI handler won't handle TX completions in
>>> the
>>> middle of RX handler)
>>
>> Does disabling BQL help? Is the BQL limit stable? To what value is it
>> set? I would be very much interested in more data if the issue is BQL
>> related.
>>
>> .
>>
>
> I agree that more tests should be run to investigate why gianfar under-
> performs on the low power p1020rdb platform, and BQL seems to be
> a good starting point (thanks for the hint). What I can say now is that
> the issue is not apparent on p2020rdb, for instance, which is a more
> powerful platform: the CPUs - 1200 MHz instead of 800 MHz; twice the
> size of L2 cache (512 KB), greater bus (CCB) frequency ... On this
> board (p2020rdb) the netperf test reaches 940Mbps both w/ and w/o these
> patches.
>
> For a single core system I'm not expecting any performance degradation,
> simply because I don't see why the proposed napi poll implementation
> would be slower than the existing one. I'll do some measurements on a
> p1010rdb too (single core, CPU:800 MHz) and get back to you with the
> results.
>
Hi all,
Please find below the netperf measurements performed on a p1010rdb machine
(single core, low power). Three kernel images were used:
1) Linux version 3.5.0-20970-gaae06bf -- net-next commit aae06bf
2) Linux version 3.5.0-20974-g2920464 -- commit aae06bf + Tx NAPI patches
3) Linux version 3.5.0-20970-gaae06bf-dirty -- commit aae06bf +
CONFIG_BQL set to 'n'
The results show that, with *Image 1)*, adjusting tcp_limit_output_bytes
brings no substantial improvement: the throughput stays in the
580-60x Mbps range. Changing the coalescing settings from the defaults
(rx coalescing off, tx-usecs: 10, tx-frames: 16) to:
"ethtool -C eth1 rx-frames 22 tx-frames 22 rx-usecs 32 tx-usecs 32"
yields a throughput of ~710 Mbps.
For *Image 2)*, using the default tcp_limit_output_bytes value (131072)
- "tweaking" tcp_limit_output_bytes does not improve the throughput here
either - we get the following numbers:
* default coalescing settings: ~650 Mbps
* rx-frames 22 tx-frames 22 rx-usecs 32 tx-usecs 32: ~860-880 Mbps
For *Image 3)*, with BQL disabled (CONFIG_BQL = n), there is *no*
relevant performance improvement compared to Image 1).
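For reference, BQL touches a driver only through the byte accounting done
on its Tx path, so CONFIG_BQL=n simply compiles that throttling out. A
generic sketch of that accounting (illustrative my_* names, not the
gianfar code) would be:

/*
 * Generic BQL accounting sketch (illustrative only; the my_* names are
 * placeholders, not gianfar symbols).
 */
#include <linux/netdevice.h>
#include <linux/skbuff.h>

static netdev_tx_t my_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct netdev_queue *txq =
		netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

	/* ... map the skb and post it to the hardware Tx ring ... */

	/* BQL: account for the bytes just handed to the hardware. */
	netdev_tx_sent_queue(txq, skb->len);

	return NETDEV_TX_OK;
}

static void my_clean_tx_ring(struct net_device *dev, unsigned int qidx)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev, qidx);
	unsigned int pkts = 0, bytes = 0;

	/* ... walk the completed Tx descriptors, free the skbs,
	 *     accumulating pkts and bytes ... */

	/* BQL: report what the hardware has actually completed. */
	netdev_tx_completed_queue(txq, pkts, bytes);
}

With CONFIG_BQL=n both netdev_tx_*_queue() calls become no-ops, which is
what the Image 3) numbers measure.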
(Note: for all measurements, the Rx and Tx BD ring sizes have been set
to 64, for best performance.)
So I really tend to believe that the performance degradation comes
primarily from the driver, and that the NAPI poll processing is an
important source of it. The proposed patches show a substantial
improvement, especially on SMP systems where Tx and Rx processing may
run in parallel (see the sketch below).
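To make the idea concrete, below is a minimal sketch of the split-NAPI
scheme; the my_* names are placeholders standing in for the actual
gianfar symbols, so treat it as an illustration of the structure rather
than the patch itself:

/*
 * Split-NAPI sketch (illustrative only): one NAPI context is scheduled
 * from the Rx interrupt, a second one from the Tx confirmation
 * interrupt, so the two paths are polled independently.
 */
#include <linux/netdevice.h>

/* Ring-processing helpers, implemented elsewhere in the driver. */
static int my_clean_rx_ring(struct napi_struct *napi, int budget);
static void my_clean_tx_ring(struct napi_struct *napi);
static void my_enable_rx_interrupts(struct napi_struct *napi);
static void my_enable_tx_interrupts(struct napi_struct *napi);

static int my_poll_rx(struct napi_struct *napi, int budget)
{
	/* Rx processing only; no Tx confirmation work in this context. */
	int work_done = my_clean_rx_ring(napi, budget);

	if (work_done < budget) {
		napi_complete(napi);
		my_enable_rx_interrupts(napi);
	}
	return work_done;
}

static int my_poll_tx(struct napi_struct *napi, int budget)
{
	/* Tx confirmation only: free completed descriptors and skbs. */
	my_clean_tx_ring(napi);
	napi_complete(napi);
	my_enable_tx_interrupts(napi);
	return 0;
}

/* At probe time, two NAPI contexts are registered instead of one. */
static void my_setup_napi(struct net_device *dev,
			  struct napi_struct *napi_rx,
			  struct napi_struct *napi_tx)
{
	netif_napi_add(dev, napi_rx, my_poll_rx, 64);
	netif_napi_add(dev, napi_tx, my_poll_tx, 2);
}

The Rx ISR then schedules napi_rx and the Tx confirmation ISR schedules
napi_tx (each after masking its own interrupt source), which is what
lets the two softirq contexts run on different CPUs on an SMP system.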
What do you think?
Is it ok to proceed by re-spinning the patches? Do you recommend
additional measurements?
Regards,
Claudiu
//=Image 1)================
root@p1010rdb:~# cat /proc/version
Linux version 3.5.0-20970-gaae06bf [...]
root@p1010rdb:~# zcat /proc/config.gz | grep BQL
CONFIG_BQL=y
root@p1010rdb:~# cat /proc/sys/net/ipv4/tcp_limit_output_bytes
131072
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 580.76 99.95 11.76 14.099 1.659
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 598.21 99.95 10.91 13.687 1.493
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 583.04 99.95 11.25 14.043 1.581
root@p1010rdb:~# cat /proc/sys/net/ipv4/tcp_limit_output_bytes
65536
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 604.29 99.95 11.15 13.550 1.512
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 603.52 99.50 12.57 13.506 1.706
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 596.18 99.95 12.81 13.734 1.760
root@p1010rdb:~# cat /proc/sys/net/ipv4/tcp_limit_output_bytes
32768
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 582.32 99.95 12.96 14.061 1.824
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 583.79 99.95 11.19 14.026 1.571
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 584.16 99.95 11.36 14.016 1.592
root@p1010rdb:~# ethtool -C eth1 rx-frames 22 tx-frames 22 rx-usecs 32
tx-usecs 32
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 708.77 99.85 13.32 11.541 1.540
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 710.50 99.95 12.46 11.524 1.437
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 709.95 99.95 14.15 11.533 1.633
//=Image 2)================
root@p1010rdb:~# cat /proc/version
Linux version 3.5.0-20974-g2920464 [...]
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 652.60 99.95 13.05 12.547 1.638
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 657.47 99.95 11.81 12.454 1.471
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 655.77 99.95 11.80 12.486 1.474
root@p1010rdb:~# ethtool -C eth1 rx-frames 22 rx-usecs 32 tx-frames 22
tx-usecs 32
root@p1010rdb:~# cat /proc/sys/net/ipv4/tcp_limit_output_bytes
131072
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.01 882.42 99.20 18.06 9.209 1.676
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 867.02 99.75 16.21 9.425 1.531
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.01 874.29 99.85 15.25 9.356 1.429
//=Image 3)================
Linux version 3.5.0-20970-gaae06bf-dirty [...] //CONFIG_BQL = n
root@p1010rdb:~# cat /proc/version
Linux version 3.5.0-20970-gaae06bf-dirty
(b08782@zro04-ws574.ea.freescale.net) (gcc version 4.6.2 (GCC) ) #3 Mon
Aug 13 13:58:25 EEST 2012
root@p1010rdb:~# zcat /proc/config.gz | grep BQL
# CONFIG_BQL is not set
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 595.08 99.95 12.51 13.759 1.722
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 593.95 99.95 10.96 13.785 1.511
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 595.30 99.90 11.11 13.747 1.528
root@p1010rdb:~# ethtool -C eth1 rx-frames 22 rx-usecs 32 tx-frames 22
tx-usecs 32
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 710.46 99.95 12.46 11.525 1.437
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 714.27 99.95 14.05 11.463 1.611
root@p1010rdb:~# netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1
(192.168.10.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 1500 20.00 717.69 99.95 12.56 11.409 1.433
.