* [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO
@ 2026-03-03 19:55 Joshua Washington
  2026-03-03 19:55 ` [PATCH net-next 1/4] gve: Advertise NETIF_F_GRO_HW instead of NETIF_F_LRO Joshua Washington
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Joshua Washington @ 2026-03-03 19:55 UTC (permalink / raw)
  To: netdev
  Cc: Joshua Washington, Harshitha Ramamurthy, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Willem de Bruijn, Jordan Rhee,
	Ankit Garg, John Fraker, Ziwei Xiao, Matt Olson,
	Praveen Kaligineedi, Tim Hostetler, linux-kernel, bpf

From: Ankit Garg <nktgrg@google.com>

The DQO device has always performed HW GRO, not LRO. This series updates
the advertised feature bit and improves the RX path for HW GRO packets.
It sets gso_segs correctly so the software stack can continue coalescing,
and pulls network headers into the skb linear space to avoid multiple
small memory copies when header-split is disabled.

We also enable HW GRO by default on supported devices.

Ankit Garg (4):
  gve: Advertise NETIF_F_GRO_HW instead of NETIF_F_LRO
  gve: fix SW coalescing when hw-GRO is used
  gve: pull network headers into skb linear part
  gve: Enable hw-gro by default if device supported
--
2.53.0.473.g4a7958ca14-goog



* [PATCH net-next 1/4] gve: Advertise NETIF_F_GRO_HW instead of NETIF_F_LRO
  2026-03-03 19:55 [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO Joshua Washington
@ 2026-03-03 19:55 ` Joshua Washington
  2026-03-03 19:55 ` [PATCH net-next 2/4] gve: fix SW coalescing when hw-GRO is used Joshua Washington
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Joshua Washington @ 2026-03-03 19:55 UTC (permalink / raw)
  To: netdev
  Cc: Joshua Washington, Harshitha Ramamurthy, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Willem de Bruijn, Jordan Rhee,
	Ankit Garg, John Fraker, Ziwei Xiao, Matt Olson,
	Praveen Kaligineedi, Tim Hostetler, linux-kernel, bpf

From: Ankit Garg <nktgrg@google.com>

The device behind the DQO format has always coalesced packets according
to the stricter hardware GRO spec, even though the feature was
advertised as LRO.

Update the advertised capability to match the device behavior.

Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
---
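Not part of the commit: a minimal userspace sketch of the
toggle-and-reject pattern that gve_set_features() applies in the hunk
below. The names model_set_features and MODEL_F_GRO_HW are hypothetical,
and the sketch assumes that the driver's revert_features path restores
the previous feature mask, which this diff does not show.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit standing in for NETIF_F_GRO_HW. */
#define MODEL_F_GRO_HW (1u << 0)

/*
 * Apply the requested HW-GRO state to *active.
 * Enabling HW-GRO while an XDP program is attached is rejected and the
 * previous mask is restored (assumed behavior of the revert path).
 */
static int model_set_features(uint32_t *active, uint32_t wanted,
			      bool xdp_attached)
{
	uint32_t orig = *active;

	if ((orig & MODEL_F_GRO_HW) != (wanted & MODEL_F_GRO_HW)) {
		/* The bit differs, so flip it. */
		*active ^= MODEL_F_GRO_HW;
		if (xdp_attached && (*active & MODEL_F_GRO_HW)) {
			*active = orig;
			return -EOPNOTSUPP;
		}
	}
	return 0;
}
```

Disabling HW-GRO is always allowed in this model; only the transition to
"on" with XDP attached fails.
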
 drivers/net/ethernet/google/gve/gve_adminq.c |  6 +++---
 drivers/net/ethernet/google/gve/gve_main.c   | 15 ++++++++-------
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve_adminq.c b/drivers/net/ethernet/google/gve/gve_adminq.c
index b72cc0fa..873672f6 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.c
+++ b/drivers/net/ethernet/google/gve/gve_adminq.c
@@ -791,7 +791,7 @@ static void gve_adminq_get_create_rx_queue_cmd(struct gve_priv *priv,
 		cmd->create_rx_queue.rx_buff_ring_size =
 			cpu_to_be16(priv->rx_desc_cnt);
 		cmd->create_rx_queue.enable_rsc =
-			!!(priv->dev->features & NETIF_F_LRO);
+			!!(priv->dev->features & NETIF_F_GRO_HW);
 		if (priv->header_split_enabled)
 			cmd->create_rx_queue.header_buffer_size =
 				cpu_to_be16(priv->header_buf_size);
@@ -1127,9 +1127,9 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 
 	gve_set_default_rss_sizes(priv);
 
-	/* DQO supports LRO. */
+	/* DQO supports HW-GRO. */
 	if (!gve_is_gqi(priv))
-		priv->dev->hw_features |= NETIF_F_LRO;
+		priv->dev->hw_features |= NETIF_F_GRO_HW;
 
 	priv->max_registered_pages =
 				be64_to_cpu(descriptor->max_registered_pages);
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index 0ee864b0..ee963c98 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -1718,9 +1718,9 @@ static int gve_verify_xdp_configuration(struct net_device *dev,
 	struct gve_priv *priv = netdev_priv(dev);
 	u16 max_xdp_mtu;
 
-	if (dev->features & NETIF_F_LRO) {
+	if (dev->features & NETIF_F_GRO_HW) {
 		NL_SET_ERR_MSG_MOD(extack,
-				   "XDP is not supported when LRO is on.");
+				   "XDP is not supported when HW-GRO is on.");
 		return -EOPNOTSUPP;
 	}
 
@@ -2137,12 +2137,13 @@ static int gve_set_features(struct net_device *netdev,
 
 	gve_get_curr_alloc_cfgs(priv, &tx_alloc_cfg, &rx_alloc_cfg);
 
-	if ((netdev->features & NETIF_F_LRO) != (features & NETIF_F_LRO)) {
-		netdev->features ^= NETIF_F_LRO;
-		if (priv->xdp_prog && (netdev->features & NETIF_F_LRO)) {
+	if ((netdev->features & NETIF_F_GRO_HW) !=
+	    (features & NETIF_F_GRO_HW)) {
+		netdev->features ^= NETIF_F_GRO_HW;
+		if (priv->xdp_prog && (netdev->features & NETIF_F_GRO_HW)) {
 			netdev_warn(netdev,
-				    "XDP is not supported when LRO is on.\n");
-			err =  -EOPNOTSUPP;
+				    "HW-GRO is not supported when XDP is on.\n");
+			err = -EOPNOTSUPP;
 			goto revert_features;
 		}
 		if (netif_running(netdev)) {
-- 
2.53.0.473.g4a7958ca14-goog



* [PATCH net-next 2/4] gve: fix SW coalescing when hw-GRO is used
  2026-03-03 19:55 [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO Joshua Washington
  2026-03-03 19:55 ` [PATCH net-next 1/4] gve: Advertise NETIF_F_GRO_HW instead of NETIF_F_LRO Joshua Washington
@ 2026-03-03 19:55 ` Joshua Washington
  2026-03-03 19:55 ` [PATCH net-next 3/4] gve: pull network headers into skb linear part Joshua Washington
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Joshua Washington @ 2026-03-03 19:55 UTC (permalink / raw)
  To: netdev
  Cc: Joshua Washington, Harshitha Ramamurthy, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Willem de Bruijn, Jordan Rhee,
	Ankit Garg, John Fraker, Ziwei Xiao, Matt Olson,
	Praveen Kaligineedi, Tim Hostetler, linux-kernel, bpf

From: Ankit Garg <nktgrg@google.com>

Leaving gso_segs unpopulated on hardware GRO packets prevents further
coalescing by the software stack: the kernel's GRO logic marks the SKB
for flush because the expected length of all segments doesn't match the
actual payload length.

Setting gso_segs correctly results in significantly more segments being
coalesced as measured by the result of dev_gro_receive().

gso_segs is derived from the payload length. When header-split is
enabled, the payload is in the non-linear portion of the skb. When
header-split is disabled, we have to parse the headers to determine the
payload length.

Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
---
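Not part of the commit: a minimal userspace sketch of the segment count
derivation described in the commit message above. model_gso_segs is a
hypothetical helper, and the header length is passed in rather than
parsed; the driver obtains it from skb->data_len or eth_get_headlen().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same rounding as the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Number of segments in a coalesced packet of total_len bytes whose
 * headers occupy hdr_len bytes, given the per-segment payload size
 * reported by the device. Returns 0 when seg_len is 0, mirroring the
 * early return in the hunk below.
 */
static unsigned int model_gso_segs(size_t total_len, size_t hdr_len,
				   uint16_t seg_len)
{
	/* Headers can never be longer than the packet containing them. */
	assert(hdr_len <= total_len);

	if (seg_len == 0)
		return 0;

	return (unsigned int)DIV_ROUND_UP(total_len - hdr_len,
					  (size_t)seg_len);
}
```

For example, with 54 bytes of Ethernet, IPv4 and TCP headers (no
options) and 4344 bytes of payload, a segment size of 1448 gives three
segments; one extra payload byte gives four.
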
 drivers/net/ethernet/google/gve/gve_rx_dqo.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
index 63a96106..5ba893e5 100644
--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
@@ -945,11 +945,16 @@ static int gve_rx_complete_rsc(struct sk_buff *skb,
 			       struct gve_ptype ptype)
 {
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
+	int rsc_segments, rsc_seg_len, hdr_len;
 
-	/* Only TCP is supported right now. */
+	/* HW-GRO only coalesces TCP. */
 	if (ptype.l4_type != GVE_L4_TYPE_TCP)
 		return -EINVAL;
 
+	rsc_seg_len = le16_to_cpu(desc->rsc_seg_len);
+	if (!rsc_seg_len)
+		return 0;
+
 	switch (ptype.l3_type) {
 	case GVE_L3_TYPE_IPV4:
 		shinfo->gso_type = SKB_GSO_TCPV4;
@@ -961,7 +966,21 @@ static int gve_rx_complete_rsc(struct sk_buff *skb,
 		return -EINVAL;
 	}
 
-	shinfo->gso_size = le16_to_cpu(desc->rsc_seg_len);
+	if (skb_headlen(skb)) {
+		/* With header-split, payload is in the non-linear part */
+		rsc_segments = DIV_ROUND_UP(skb->data_len, rsc_seg_len);
+	} else {
+		/* HW-GRO packets are guaranteed to have complete TCP/IP
+		 * headers in frag[0] when header-split is not enabled.
+		 */
+		hdr_len = eth_get_headlen(skb->dev,
+					  skb_frag_address(&shinfo->frags[0]),
+					  skb_frag_size(&shinfo->frags[0]));
+		rsc_segments = DIV_ROUND_UP(skb->len - hdr_len, rsc_seg_len);
+	}
+	shinfo->gso_size = rsc_seg_len;
+	shinfo->gso_segs = rsc_segments;
+
 	return 0;
 }
 
-- 
2.53.0.473.g4a7958ca14-goog



* [PATCH net-next 3/4] gve: pull network headers into skb linear part
  2026-03-03 19:55 [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO Joshua Washington
  2026-03-03 19:55 ` [PATCH net-next 1/4] gve: Advertise NETIF_F_GRO_HW instead of NETIF_F_LRO Joshua Washington
  2026-03-03 19:55 ` [PATCH net-next 2/4] gve: fix SW coalescing when hw-GRO is used Joshua Washington
@ 2026-03-03 19:55 ` Joshua Washington
  2026-03-03 19:55 ` [PATCH net-next 4/4] gve: Enable hw-gro by default if device supported Joshua Washington
  2026-03-05 15:00 ` [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO patchwork-bot+netdevbpf
  4 siblings, 0 replies; 6+ messages in thread
From: Joshua Washington @ 2026-03-03 19:55 UTC (permalink / raw)
  To: netdev
  Cc: Joshua Washington, Harshitha Ramamurthy, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Willem de Bruijn, Jordan Rhee,
	Ankit Garg, John Fraker, Ziwei Xiao, Matt Olson,
	Praveen Kaligineedi, Tim Hostetler, linux-kernel, bpf

From: Ankit Garg <nktgrg@google.com>

Currently, in DQO mode with hw-gro enabled, the entire received packet
is placed into skb fragments when header-split is disabled. This leaves
the skb linear part empty, forcing the networking stack to do multiple
small memory copies to access the Ethernet, IP and TCP headers.

This patch adds a single memcpy to put all headers into the linear
portion before the packet reaches the SW GRO stack, eliminating the
multiple smaller memcpy calls.

Additionally, the criterion for calling napi_gro_frags() is updated.
Since the skb linear part is now populated, skb_headlen() can no longer
identify these packets. We instead check whether the SKB is the cached
NAPI scratchpad (napi->skb) to ensure we continue using the
zero-allocation path.

Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
---
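Not part of the commit: a minimal userspace sketch of the bookkeeping
behind the header pull in the hunk below. struct model_pkt, struct
model_frag and model_pull_headers are hypothetical stand-ins for the skb,
its first fragment and the open-coded sequence in gve_rx_complete_rsc();
the 128-byte linear area is an arbitrary size chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the first page fragment of the packet. */
struct model_frag {
	const unsigned char *page;	/* start of the backing buffer */
	size_t off;			/* offset of the first valid byte */
	size_t size;			/* number of valid bytes */
};

/* Stand-in for the skb length accounting. */
struct model_pkt {
	unsigned char linear[128];	/* linear (header) area */
	size_t headlen;			/* bytes used in linear[] */
	size_t data_len;		/* bytes held in fragments */
	size_t len;			/* headlen + data_len */
	struct model_frag frag0;
};

/*
 * Move the first hdr_len bytes of frag0 into the linear area with one
 * memcpy(), then advance the fragment past them. The total length of
 * the packet does not change.
 */
static void model_pull_headers(struct model_pkt *p, size_t hdr_len)
{
	/* The fragment must keep at least one payload byte. */
	assert(hdr_len < p->frag0.size);
	assert(p->headlen + hdr_len <= sizeof(p->linear));

	memcpy(p->linear + p->headlen, p->frag0.page + p->frag0.off,
	       hdr_len);
	p->frag0.off += hdr_len;
	p->frag0.size -= hdr_len;
	p->data_len -= hdr_len;
	p->headlen += hdr_len;
}
```

The first assertion plays the role of the DEBUG_NET_WARN_ON_ONCE() check
in the patch: the pull must never leave the fragment empty.
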
 drivers/net/ethernet/google/gve/gve_rx_dqo.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
index 5ba893e5..ac44f50d 100644
--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
@@ -946,6 +946,8 @@ static int gve_rx_complete_rsc(struct sk_buff *skb,
 {
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
 	int rsc_segments, rsc_seg_len, hdr_len;
+	skb_frag_t *frag;
+	void *va;
 
 	/* HW-GRO only coalesces TCP. */
 	if (ptype.l4_type != GVE_L4_TYPE_TCP)
@@ -973,10 +975,20 @@ static int gve_rx_complete_rsc(struct sk_buff *skb,
 		/* HW-GRO packets are guaranteed to have complete TCP/IP
 		 * headers in frag[0] when header-split is not enabled.
 		 */
-		hdr_len = eth_get_headlen(skb->dev,
-					  skb_frag_address(&shinfo->frags[0]),
-					  skb_frag_size(&shinfo->frags[0]));
+		frag = &skb_shinfo(skb)->frags[0];
+		va = skb_frag_address(frag);
+		hdr_len =
+			eth_get_headlen(skb->dev, va, skb_frag_size(frag));
 		rsc_segments = DIV_ROUND_UP(skb->len - hdr_len, rsc_seg_len);
+		skb_copy_to_linear_data(skb, va, hdr_len);
+		skb_frag_size_sub(frag, hdr_len);
+		/* Verify we didn't empty the fragment completely as that could
+		 * otherwise lead to page leaks.
+		 */
+		DEBUG_NET_WARN_ON_ONCE(!skb_frag_size(frag));
+		skb_frag_off_add(frag, hdr_len);
+		skb->data_len -= hdr_len;
+		skb->tail += hdr_len;
 	}
 	shinfo->gso_size = rsc_seg_len;
 	shinfo->gso_segs = rsc_segments;
@@ -1013,7 +1025,7 @@ static int gve_rx_complete_skb(struct gve_rx_ring *rx, struct napi_struct *napi,
 			return err;
 	}
 
-	if (skb_headlen(rx->ctx.skb_head) == 0)
+	if (rx->ctx.skb_head == napi->skb)
 		napi_gro_frags(napi);
 	else
 		napi_gro_receive(napi, rx->ctx.skb_head);
-- 
2.53.0.473.g4a7958ca14-goog



* [PATCH net-next 4/4] gve: Enable hw-gro by default if device supported
  2026-03-03 19:55 [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO Joshua Washington
                   ` (2 preceding siblings ...)
  2026-03-03 19:55 ` [PATCH net-next 3/4] gve: pull network headers into skb linear part Joshua Washington
@ 2026-03-03 19:55 ` Joshua Washington
  2026-03-05 15:00 ` [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO patchwork-bot+netdevbpf
  4 siblings, 0 replies; 6+ messages in thread
From: Joshua Washington @ 2026-03-03 19:55 UTC (permalink / raw)
  To: netdev
  Cc: Joshua Washington, Harshitha Ramamurthy, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Willem de Bruijn, Jordan Rhee,
	Ankit Garg, John Fraker, Ziwei Xiao, Matt Olson,
	Praveen Kaligineedi, Tim Hostetler, linux-kernel, bpf

From: Ankit Garg <nktgrg@google.com>

Change the driver's default behavior to enable hw-gro whenever the
device supports it.

Performance observations:
- We observed a ~10% improvement in RX single-stream throughput across
  various MTU sizes.
- No change in TCP_RR/TCP_CRR latencies.

Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/ethernet/google/gve/gve_adminq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/google/gve/gve_adminq.c b/drivers/net/ethernet/google/gve/gve_adminq.c
index 873672f6..6ce8345e 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.c
+++ b/drivers/net/ethernet/google/gve/gve_adminq.c
@@ -1128,8 +1128,10 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 	gve_set_default_rss_sizes(priv);
 
 	/* DQO supports HW-GRO. */
-	if (!gve_is_gqi(priv))
+	if (gve_is_dqo(priv)) {
 		priv->dev->hw_features |= NETIF_F_GRO_HW;
+		priv->dev->features |= NETIF_F_GRO_HW;
+	}
 
 	priv->max_registered_pages =
 				be64_to_cpu(descriptor->max_registered_pages);
-- 
2.53.0.473.g4a7958ca14-goog



* Re: [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO
  2026-03-03 19:55 [PATCH net-next 0/4] gve: optimize and enable HW GRO for DQO Joshua Washington
                   ` (3 preceding siblings ...)
  2026-03-03 19:55 ` [PATCH net-next 4/4] gve: Enable hw-gro by default if device supported Joshua Washington
@ 2026-03-05 15:00 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-03-05 15:00 UTC (permalink / raw)
  To: Joshua Washington
  Cc: netdev, hramamurthy, andrew+netdev, davem, edumazet, kuba, pabeni,
	ast, daniel, hawk, john.fastabend, sdf, willemb, jordanrhee,
	nktgrg, jfraker, ziweixiao, maolson, pkaligineedi, thostet,
	linux-kernel, bpf

Hello:

This series was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Tue,  3 Mar 2026 11:55:45 -0800 you wrote:
> From: Ankit Garg <nktgrg@google.com>
> 
> The DQO device has always performed HW GRO, not LRO. This series updates
> the feature bit and modifies the RX path to enhance support. It sets
> gso_segs correctly so the software stack can continue coalescing, and
> pulls network headers into the skb linear space to avoid multiple small
> memory copies when header-split is disabled.
> 
> [...]

Here is the summary with links:
  - [net-next,1/4] gve: Advertise NETIF_F_GRO_HW instead of NETIF_F_LRO
    https://git.kernel.org/netdev/net-next/c/e637c244b954
  - [net-next,2/4] gve: fix SW coalescing when hw-GRO is used
    https://git.kernel.org/netdev/net-next/c/ea4c1176871f
  - [net-next,3/4] gve: pull network headers into skb linear part
    https://git.kernel.org/netdev/net-next/c/0c7025fd24db
  - [net-next,4/4] gve: Enable hw-gro by default if device supported
    https://git.kernel.org/netdev/net-next/c/3c398063ef01

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


