netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] bnx2x: fix GRO parameters
@ 2013-01-17 13:26 Yuval Mintz
  2013-01-17 14:33 ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Yuval Mintz @ 2013-01-17 13:26 UTC (permalink / raw)
  To: davem, eric.dumazet, netdev; +Cc: eilong, ariele, Yuval Mintz

bnx2x does an internal GRO pass but doesn't provide gso_segs, thus
breaking qdisc_pkt_len_init() in case ingress qdisc is used.

We store gso_segs in NAPI_GRO_CB(skb)->count, where tcp_gro_complete()
expects to find the number of aggregated segments.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
---
This is Eric Dumazet's patch with slight semantic modifications.
Eric - Do you want to put your ack or signoff on this patch?

Dave - Please apply this to 'net-next' afterwards.

Thanks,
Yuval Mintz
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 47 ++++++++++++-------------
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 18fc26e..49810a0 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -439,31 +439,34 @@ static void bnx2x_tpa_start(struct bnx2x_fastpath *fp, u16 queue,
  */
 #define TPA_TSTAMP_OPT_LEN	12
 /**
- * bnx2x_set_lro_mss - calculate the approximate value of the MSS
+ * bnx2x_set_gro_params - compute GRO values
  *
- * @bp:			driver handle
+ * @skb:		packet skb
  * @parsing_flags:	parsing flags from the START CQE
  * @len_on_bd:		total length of the first packet for the
  *			aggregation.
+ * @pkt_len:		length of all segments
  *
  * Approximate value of the MSS for this aggregation calculated using
  * the first packet of it.
+ * Compute number of aggregated segments, and gso_type
  */
-static u16 bnx2x_set_lro_mss(struct bnx2x *bp, u16 parsing_flags,
-			     u16 len_on_bd)
+static void bnx2x_set_gro_params(struct sk_buff *skb, u16 parsing_flags,
+				 u16 len_on_bd, unsigned int pkt_len)
 {
-	/*
-	 * TPA arrgregation won't have either IP options or TCP options
+	/* TPA aggregation won't have either IP options or TCP options
 	 * other than timestamp or IPv6 extension headers.
 	 */
 	u16 hdrs_len = ETH_HLEN + sizeof(struct tcphdr);
 
 	if (GET_FLAG(parsing_flags, PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
-	    PRS_FLAG_OVERETH_IPV6)
+	    PRS_FLAG_OVERETH_IPV6) {
 		hdrs_len += sizeof(struct ipv6hdr);
-	else /* IPv4 */
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+	} else {
 		hdrs_len += sizeof(struct iphdr);
-
+		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+	}
 
 	/* Check if there was a TCP timestamp, if there is it's will
 	 * always be 12 bytes length: nop nop kind length echo val.
@@ -473,7 +476,13 @@ static u16 bnx2x_set_lro_mss(struct bnx2x *bp, u16 parsing_flags,
 	if (parsing_flags & PARSING_FLAGS_TIME_STAMP_EXIST_FLAG)
 		hdrs_len += TPA_TSTAMP_OPT_LEN;
 
-	return len_on_bd - hdrs_len;
+	skb_shinfo(skb)->gso_size = len_on_bd - hdrs_len;
+
+	/* tcp_gro_complete() will copy NAPI_GRO_CB(skb)->count
+	 * to skb_shinfo(skb)->gso_segs
+	 */
+	NAPI_GRO_CB(skb)->count = DIV_ROUND_UP(pkt_len - hdrs_len,
+					       skb_shinfo(skb)->gso_size);
 }
 
 static int bnx2x_alloc_rx_sge(struct bnx2x *bp,
@@ -527,19 +536,9 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 	}
 
 	/* This is needed in order to enable forwarding support */
-	if (frag_size) {
-		skb_shinfo(skb)->gso_size = bnx2x_set_lro_mss(bp,
-					tpa_info->parsing_flags, len_on_bd);
-
-		/* set for GRO */
-		if (fp->mode == TPA_MODE_GRO && skb_shinfo(skb)->gso_size)
-			skb_shinfo(skb)->gso_type =
-			    (GET_FLAG(tpa_info->parsing_flags,
-				      PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
-						PRS_FLAG_OVERETH_IPV6) ?
-				SKB_GSO_TCPV6 : SKB_GSO_TCPV4;
-	}
-
+	if (frag_size)
+		bnx2x_set_gro_params(skb, tpa_info->parsing_flags, len_on_bd,
+				     le16_to_cpu(cqe->pkt_len));
 
 #ifdef BNX2X_STOP_ON_ERROR
 	if (pages > min_t(u32, 8, MAX_SKB_FRAGS)*SGE_PAGE_SIZE*PAGES_PER_SGE) {
@@ -651,7 +650,7 @@ static void bnx2x_gro_receive(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 			       struct sk_buff *skb)
 {
 #ifdef CONFIG_INET
-	if (fp->mode == TPA_MODE_GRO && skb_shinfo(skb)->gso_size) {
+	if (skb_shinfo(skb)->gso_size) {
 		skb_set_network_header(skb, 0);
 		switch (be16_to_cpu(skb->protocol)) {
 		case ETH_P_IP:
-- 
1.8.1.227.g44fe835

^ permalink raw reply related	[flat|nested] 9+ messages in thread
* Re: [PATCH net-next 09/10] bnx2x: Added FW GRO bridging support
@ 2013-01-15  7:28 Yuval Mintz
  2013-01-15 14:39 ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Yuval Mintz @ 2013-01-15  7:28 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev, eilong, ariele

On 01/14/2013 08:44 PM, Eric Dumazet wrote:
> On Mon, 2013-01-14 at 09:22 -0800, Eric Dumazet wrote:
> 
> What is the value of gso_segs ?
> 
> The reason I am pointing this out is the recent change in commit
> 1def9238d4aa2146924994aa4b7dc861f03b9362
> (net_sched: more precise pkt_len computation)
> 
> bnx2x not setting gso_segs means that qdisc accounting on ingress is
> completely wrong.

Hi Eric,

First I just want to state that you're totally correct about the gso_segs -
bnx2x is not setting it correctly (it's currently totally omitted), and so
it would incorrectly affect the accounting.

However, notice this behaviour was not introduced in this patch -
Since 621b4d6 bnx2x is using FW GRO, overriding the kernel's GRO implementation:
As the bnx2x driver is supplied with the aggregated packet from its FW,
the bnx2x passes a value in the `gso_size' field of its skb, causing
`skb_is_gso' to return `true'.
This will cause the aggregated skb to override the GRO machinations
(in `dev_gro_receive'), overriding all calls to `gro_receive' and thus also
the call to `skb_gro_receive' which whould have incremented `gso_segs'.

This patch actually tries to correct said behaviour, obviously failing
with the gso_segs but still improving the current state of bnx2x GRO
in bridging scenarios.

>> This looks weird to me. This should be called by GRO stack only.

I think this is the main question - we could try and implement this
inside the network-core itself, but as said behaviour is unique to the
bnx2x driver (correct me if I'm wrong, but I'm unaware of any other
driver which does GRO without the kernel GRO implementation), the
solution is specially tailored for the bnx2x driver.

We could either:
  1. Continue with this patch, later sending a patch correcting gso_segs,
     as this is not a new issue.
  2. Send a V2 patch-series which will also set gso_segs correctly.
  3. Send a V2 patch-series which omits this patch, and later send an RFC
     for some kernel implementation which fixes the issue.

Your thoughts on this matter will be greatly appreciated.

Thanks,
Yuval

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-01-17 19:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-17 13:26 [PATCH net-next] bnx2x: fix GRO parameters Yuval Mintz
2013-01-17 14:33 ` Eric Dumazet
2013-01-17 19:56   ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2013-01-15  7:28 [PATCH net-next 09/10] bnx2x: Added FW GRO bridging support Yuval Mintz
2013-01-15 14:39 ` Eric Dumazet
2013-01-15 14:02   ` Yuval Mintz
2013-01-15 20:09     ` David Miller
2013-01-16  4:56       ` Eric Dumazet
2013-01-16  5:37         ` [PATCH net-next] bnx2x: fix GRO parameters Eric Dumazet
2013-01-16  7:01           ` Yuval Mintz
2013-01-16 15:29             ` Eric Dumazet
2013-01-16 14:38               ` Yuval Mintz
2013-01-16 16:50                 ` Eric Dumazet
2013-01-17  7:16                   ` Yuval Mintz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).