From: Eric Dumazet <eric.dumazet@gmail.com>
To: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>,
netdev@vger.kernel.org
Cc: edumazet@google.com, hkallweit1@gmail.com
Subject: Re: r8169: Performance regression and latency instability
Date: Fri, 16 Aug 2019 14:35:06 +0200
Message-ID: <217e3fa9-7782-08c7-1f2b-8dabacaa83f9@gmail.com>
In-Reply-To: <72898d5b-9424-0bcd-3d8a-fc2e2dd0dbf1@intra2net.com>

On 8/16/19 2:09 PM, Juliana Rodrigueiro wrote:
> Greetings!
>
> During migration from kernel 3.14 to 4.19, we noticed a regression in network performance. Under the exact same circumstances, the standard deviation of the latency is more than double what it was before on the Realtek RTL8111/8168B (10ec:8168) using the r8169 driver.
>
> Kernel 3.14:
> # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
> 313.37
>
> Kernel 4.19:
> # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
> 632.96
>
> In contrast, we noticed small improvements in performance with other non-Realtek network cards (igb, tg3), which suggested a possible driver-related bug.
>
> However, after bisecting the code, I ended up with the following patch, which was introduced in kernel 4.17 and modifies net/ipv4:
>
> commit 0a6b2a1dc2a2105f178255fe495eb914b09cb37a
> Author: Eric Dumazet <edumazet@google.com>
> Date: Mon Feb 19 11:56:47 2018 -0800
>
> tcp: switch to GSO being always on
>
> Could you please help me clarify: should GSO always be on for my device? Or does it just affect TCP? According to ethtool it is always off, and "ethtool -K eth0 gso on" has no effect unless I switch SG on.
>
> # ethtool -k eth0
> Offload parameters for eth0:
> Cannot get device udp large send offload settings: Operation not supported
> rx-checksumming: on
> tx-checksumming: off
> scatter-gather: off
> tcp-segmentation-offload: off
> udp-fragmentation-offload: off
> generic-segmentation-offload: off
> generic-receive-offload: on
> large-receive-offload: off
>
> I validated that reverting "tcp: switch to GSO being always on" successfully brings back the better performance for the r8169 driver.
>
> I'm sure that reverting that commit is not the optimal solution, so I would like to kindly ask for help to shed some light on this issue.

Hi Juliana,

I am sure that any commit to the TCP stack can show a regression for some particular
combination of packet sizes, MTU size, NIC, and measured metric.

Basically, if your NIC does not support SG and TSO, there is a possibility
that the changes we made to enter the era of 100Gbit and 200Gbit NICs might
hurt a bit.

Lack of SG means that the lower stack might have to allocate memory
to perform the segmentation, and this might be slow (or even fail) under memory pressure.
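
To make the cost concrete, here is a minimal user-space sketch of what segmenting into linear buffers implies. It is a model only, not the kernel's segmentation code: `struct segment`, `segment_linear()` and `free_segments()` are hypothetical names, and the MSS of 1448 bytes used in the note below is an assumption (MTU 1500 with TCP timestamps).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one segment held in a single linear buffer. */
struct segment {
    unsigned char *data;
    size_t len;
};

/* Free the first `count` segment buffers and the array itself. */
static void free_segments(struct segment *segs, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(segs[i].data);
    free(segs);
}

/*
 * Split `payload` into segments of at most `mss` bytes. Without
 * scatter-gather each segment needs its own contiguous buffer, so every
 * segment costs one allocation and one copy. Returns the number of
 * segments, or -1 on bad arguments or allocation failure.
 */
static long segment_linear(const unsigned char *payload, size_t len,
                           size_t mss, struct segment **out)
{
    if (mss == 0 || len == 0 || out == NULL)
        return -1;

    size_t count = (len + mss - 1) / mss;   /* ceil(len / mss) */
    struct segment *segs = calloc(count, sizeof(*segs));
    if (segs == NULL)
        return -1;

    for (size_t i = 0; i < count; i++) {
        size_t off = i * mss;
        size_t n = (len - off < mss) ? len - off : mss;

        segs[i].data = malloc(n);           /* one allocation per segment */
        if (segs[i].data == NULL) {
            free_segments(segs, i);         /* undo partial work on failure */
            return -1;
        }
        memcpy(segs[i].data, payload + off, n);  /* one copy per segment */
        segs[i].len = n;
    }

    *out = segs;
    return (long)count;
}
```

Under those assumptions, a single 65536-byte send (the "-m 64K" in the netperf command) becomes 46 segments: 45 full ones plus a final 376-byte one, so 46 allocations that can each be slow or fail.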

I have no idea why you can even turn on SG if it is turned off by default.

Please give us more information about the NIC:

  ethtool -i eth0 ; ifconfig eth0

Also try a more recent ethtool; yours seems to be pretty old.

I also see this relevant commit, though I have no idea why SG would have any relation to TSO:

commit a7eb6a4f2560d5ae64bfac98d79d11378ca2de6c
Author: Holger Hoffstätte <holger@applied-asynchrony.com>
Date:   Fri Aug 9 00:02:40 2019 +0200

    r8169: fix performance issue on RTL8168evl

    Disabling TSO but leaving SG active results in a significant
    performance drop. Therefore disable also SG on RTL8168evl.
    This restores the original performance.

    Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO")
    Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index b2a275d8504c..912bd41eaa1b 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -6898,9 +6898,9 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	/* RTL8168e-vl has a HW issue with TSO */
 	if (tp->mac_version == RTL_GIGA_MAC_VER_34) {
-		dev->vlan_features &= ~NETIF_F_ALL_TSO;
-		dev->hw_features &= ~NETIF_F_ALL_TSO;
-		dev->features &= ~NETIF_F_ALL_TSO;
+		dev->vlan_features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
+		dev->hw_features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
+		dev->features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
 	}
 
 	dev->hw_features |= NETIF_F_RXALL;
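
The masking in the hunk above can be tried outside the kernel with a small stand-alone sketch. Everything prefixed `SKETCH_` or `sketch_` is hypothetical: the bit positions are made up for illustration, and the real NETIF_F_ALL_TSO covers more TSO variants than the two modelled here.

```c
#include <stdint.h>

/* Hypothetical bit positions; these are NOT the kernel's NETIF_F_* values. */
#define SKETCH_F_SG       (UINT64_C(1) << 0)
#define SKETCH_F_TSO      (UINT64_C(1) << 1)
#define SKETCH_F_TSO6     (UINT64_C(1) << 2)
#define SKETCH_F_RXALL    (UINT64_C(1) << 3)
/* Simplified stand-in for NETIF_F_ALL_TSO. */
#define SKETCH_F_ALL_TSO  (SKETCH_F_TSO | SKETCH_F_TSO6)

/* Hypothetical stand-in for the three feature sets touched by the patch. */
struct sketch_netdev {
    uint64_t features;       /* currently active features */
    uint64_t hw_features;    /* features the user may toggle */
    uint64_t vlan_features;  /* features inherited by VLAN devices */
};

/* Clear TSO and SG from all three sets, leaving every other bit untouched. */
static void sketch_disable_tso_and_sg(struct sketch_netdev *dev)
{
    const uint64_t mask = SKETCH_F_ALL_TSO | SKETCH_F_SG;

    dev->vlan_features &= ~mask;
    dev->hw_features   &= ~mask;
    dev->features      &= ~mask;
}
```

Because the bits are also cleared from `hw_features`, the set of features a user is allowed to change, a device patched this way should no longer accept "ethtool -K eth0 sg on".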