From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Brian King <brking@linux.vnet.ibm.com>,
Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>,
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
Jonathan Maxwell <jmaxwell37@gmail.com>,
David Dai <zdai@us.ibm.com>,
Thomas Falcon <tlfalcon@linux.vnet.ibm.com>,
"David S. Miller" <davem@davemloft.net>,
Sumit Semwal <sumit.semwal@linaro.org>
Subject: [PATCH 4.4 18/18] ibmveth: set correct gso_size and gso_type
Date: Sun, 16 Apr 2017 10:02:41 +0200 [thread overview]
Message-ID: <20170416080213.642331960@linuxfoundation.org> (raw)
In-Reply-To: <20170416080212.713469777@linuxfoundation.org>
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
commit 7b5967389f5a8dfb9d32843830f5e2717e20995d upstream.
This patch is based on an earlier one submitted
by Jon Maxwell with the following commit message:
"We recently encountered a bug where a few customers using ibmveth on the
same LPAR hit an issue where a TCP session hung when large receive was
enabled. Closer analysis revealed that the session was stuck because the
one side was advertising a zero window repeatedly.
We narrowed this down to the fact the ibmveth driver did not set gso_size
which is translated by TCP into the MSS later up the stack. The MSS is
used to calculate the TCP window size and as that was abnormally large,
it was calculating a zero window, even although the sockets receive buffer
was completely empty."
We rely on the Virtual I/O Server partition in a pseries
environment to provide the MSS through the TCP header checksum
field. The stipulation is that users should not disable checksum
offloading if rx packet aggregation is enabled through VIOS.
Some firmware offerings provide the MSS in the RX buffer.
This is signalled by a bit in the RX queue descriptor.
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Reviewed-by: David Dai <zdai@us.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/ethernet/ibm/ibmveth.c | 65 +++++++++++++++++++++++++++++++++++--
drivers/net/ethernet/ibm/ibmveth.h | 1
2 files changed, 64 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -58,7 +58,7 @@ static struct kobj_type ktype_veth_pool;
static const char ibmveth_driver_name[] = "ibmveth";
static const char ibmveth_driver_string[] = "IBM Power Virtual Ethernet Driver";
-#define ibmveth_driver_version "1.05"
+#define ibmveth_driver_version "1.06"
MODULE_AUTHOR("Santiago Leon <santil@linux.vnet.ibm.com>");
MODULE_DESCRIPTION("IBM Power Virtual Ethernet Driver");
@@ -137,6 +137,11 @@ static inline int ibmveth_rxq_frame_offs
return ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_OFF_MASK;
}
+static inline int ibmveth_rxq_large_packet(struct ibmveth_adapter *adapter)
+{
+ return ibmveth_rxq_flags(adapter) & IBMVETH_RXQ_LRG_PKT;
+}
+
static inline int ibmveth_rxq_frame_length(struct ibmveth_adapter *adapter)
{
return be32_to_cpu(adapter->rx_queue.queue_addr[adapter->rx_queue.index].length);
@@ -1172,6 +1177,45 @@ map_failed:
goto retry_bounce;
}
+static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
+{
+ int offset = 0;
+
+ /* only TCP packets will be aggregated */
+ if (skb->protocol == htons(ETH_P_IP)) {
+ struct iphdr *iph = (struct iphdr *)skb->data;
+
+ if (iph->protocol == IPPROTO_TCP) {
+ offset = iph->ihl * 4;
+ skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+ } else {
+ return;
+ }
+ } else if (skb->protocol == htons(ETH_P_IPV6)) {
+ struct ipv6hdr *iph6 = (struct ipv6hdr *)skb->data;
+
+ if (iph6->nexthdr == IPPROTO_TCP) {
+ offset = sizeof(struct ipv6hdr);
+ skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+ } else {
+ return;
+ }
+ } else {
+ return;
+ }
+ /* if mss is not set through Large Packet bit/mss in rx buffer,
+ * expect that the mss will be written to the tcp header checksum.
+ */
+ if (lrg_pkt) {
+ skb_shinfo(skb)->gso_size = mss;
+ } else if (offset) {
+ struct tcphdr *tcph = (struct tcphdr *)(skb->data + offset);
+
+ skb_shinfo(skb)->gso_size = ntohs(tcph->check);
+ tcph->check = 0;
+ }
+}
+
static int ibmveth_poll(struct napi_struct *napi, int budget)
{
struct ibmveth_adapter *adapter =
@@ -1180,6 +1224,7 @@ static int ibmveth_poll(struct napi_stru
int frames_processed = 0;
unsigned long lpar_rc;
struct iphdr *iph;
+ u16 mss = 0;
restart_poll:
while (frames_processed < budget) {
@@ -1197,9 +1242,21 @@ restart_poll:
int length = ibmveth_rxq_frame_length(adapter);
int offset = ibmveth_rxq_frame_offset(adapter);
int csum_good = ibmveth_rxq_csum_good(adapter);
+ int lrg_pkt = ibmveth_rxq_large_packet(adapter);
skb = ibmveth_rxq_get_buffer(adapter);
+ /* if the large packet bit is set in the rx queue
+ * descriptor, the mss will be written by PHYP eight
+ * bytes from the start of the rx buffer, which is
+ * skb->data at this stage
+ */
+ if (lrg_pkt) {
+ __be64 *rxmss = (__be64 *)(skb->data + 8);
+
+ mss = (u16)be64_to_cpu(*rxmss);
+ }
+
new_skb = NULL;
if (length < rx_copybreak)
new_skb = netdev_alloc_skb(netdev, length);
@@ -1233,11 +1290,15 @@ restart_poll:
if (iph->check == 0xffff) {
iph->check = 0;
iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
- adapter->rx_large_packets++;
}
}
}
+ if (length > netdev->mtu + ETH_HLEN) {
+ ibmveth_rx_mss_helper(skb, mss, lrg_pkt);
+ adapter->rx_large_packets++;
+ }
+
napi_gro_receive(napi, skb); /* send it up */
netdev->stats.rx_packets++;
--- a/drivers/net/ethernet/ibm/ibmveth.h
+++ b/drivers/net/ethernet/ibm/ibmveth.h
@@ -209,6 +209,7 @@ struct ibmveth_rx_q_entry {
#define IBMVETH_RXQ_TOGGLE 0x80000000
#define IBMVETH_RXQ_TOGGLE_SHIFT 31
#define IBMVETH_RXQ_VALID 0x40000000
+#define IBMVETH_RXQ_LRG_PKT 0x04000000
#define IBMVETH_RXQ_NO_CSUM 0x02000000
#define IBMVETH_RXQ_CSUM_GOOD 0x01000000
#define IBMVETH_RXQ_OFF_MASK 0x0000FFFF
next prev parent reply other threads:[~2017-04-16 8:03 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-16 8:02 [PATCH 4.4 00/18] 4.4.62-stable review Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 02/18] drm/i915: Stop using RP_DOWN_EI on Baytrail Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 03/18] usb: dwc3: gadget: delay unmap of bounced requests Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 05/18] MIPS: Introduce irq_stack Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 06/18] MIPS: Stack unwinding while on IRQ stack Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 07/18] MIPS: Only change $28 to thread_info if coming from user mode Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 08/18] MIPS: Switch to the irq_stack in interrupts Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 09/18] MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 10/18] MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 12/18] net/packet: fix overflow in check for priv area size Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 13/18] blk-mq: Avoid memory reclaim when remapping queues Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 14/18] usb: hub: Wait for connection to be reestablished after port reset Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 15/18] net/mlx4_en: Fix bad WQE issue Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 16/18] net/mlx4_core: Fix racy CQ (Completion Queue) free Greg Kroah-Hartman
2017-04-16 8:02 ` [PATCH 4.4 17/18] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions Greg Kroah-Hartman
2017-04-16 8:02 ` Greg Kroah-Hartman [this message]
2017-04-16 21:28 ` [PATCH 4.4 00/18] 4.4.62-stable review Guenter Roeck
2017-04-17 18:19 ` Shuah Khan
[not found] ` <58f36a37.cc9bdf0a.86c97.bb8a@mx.google.com>
2017-04-17 18:48 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170416080213.642331960@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=brking@linux.vnet.ibm.com \
--cc=davem@davemloft.net \
--cc=jmaxwell37@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=marcelo.leitner@gmail.com \
--cc=pradeeps@linux.vnet.ibm.com \
--cc=stable@vger.kernel.org \
--cc=sumit.semwal@linaro.org \
--cc=tlfalcon@linux.vnet.ibm.com \
--cc=zdai@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).