* [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Edward Cree @ 2018-11-14 18:07 UTC
To: linux-net-drivers, davem; +Cc: netdev, eric.dumazet
This series listifies part of GRO processing, in a manner which allows those
packets which are not GROed (i.e. for which dev_gro_receive returns
GRO_NORMAL) to be passed on to the listified regular receive path.
dev_gro_receive() itself is not listified, nor is the per-protocol GRO
callback, since GRO's need to hold packets on lists under napi->gro_hash
makes keeping the packets on other lists awkward, and since the GRO control
block state of held skbs can refer only to one 'new' skb at a time.
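From a driver's point of view, usage ends up looking roughly like the
following (a simplified sketch only: foo_poll(), foo_process_rx_events() and
foo_adjust_irq_moderation() are made-up names standing in for the driver's own
code, and patch #2 has the real sfc conversion):

    static int foo_poll(struct napi_struct *napi, int budget)
    {
            struct list_head gro_list;
            int spent, kept;

            INIT_LIST_HEAD(&gro_list);

            /* foo_process_rx_events() stands in for the driver's event
             * processing.  For each GRO-able packet it fills napi->skb (just
             * as it would before calling napi_gro_frags()) and then queues it:
             *
             *      skb = napi_frags_skb(napi);
             *      if (skb)
             *              list_add_tail(&skb->list, &gro_list);
             */
            spent = foo_process_rx_events(napi, budget, &gro_list);

            /* Hand the whole batch to GRO at the end of the poll.  Packets
             * that GRO won't coalesce (GRO_NORMAL) fall through to the
             * listified receive path (patch #4); the return value counts
             * successful receives and can feed interrupt moderation
             * (patch #3).
             */
            kept = napi_gro_receive_list(napi, &gro_list);
            foo_adjust_irq_moderation(napi, kept);

            return spent;
    }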
Performance figures with this series were collected on a back-to-back pair of
Solarflare SFN8522-R2 NICs using 120-second NetPerf tests. In the stats, the
sample size n for old and new code is 6 runs each; p is from a Welch t-test.
Tests were run with GRO both enabled and disabled, the latter simulating
uncoalesceable packets (e.g. due to IP or TCP options). Payload_size in all
tests was 8000 bytes. BW tests use 4 streams; RR tests use 100.
TCP Stream, GRO on:
net-next: 9.415 Gb/s (line rate); 190% total rxcpu
after #4: 9.415 Gb/s; 192% total rxcpu
p_bw = 0.155; p_cpu = 0.382
TCP Stream, GRO off:
net-next: 5.625 Gb/s
after #4: 6.551 Gb/s
16.5% faster; p < 0.001
TCP RR, GRO on:
net-next: 837.6 us
after #4: 840.0 us
0.3% slower; p = 0.229
TCP RR, GRO off:
net-next: 867.6 us
after #4: 860.1 us
0.9% faster; p = 0.064
UDP Stream (GRO off):
net-next: 7.808 Gb/s
after #4: 7.848 Gb/s
0.5% slower; p = 0.144
Conclusion:
* TCP b/w is 16.5% faster for traffic which cannot be coalesced by GRO.
* TCP latency might be slightly improved in the same case, but it's not
quite statistically significant.
* Both see no statistically significant change in performance with GRO
active.
* UDP throughput might be slightly slowed (probably by patch #3) but it's
not statistically significant. Note that drivers which (unlike sfc) pass
UDP traffic to GRO will probably see gains here as this gives them access
to bundling.
Change history:
v3: Rebased on latest net-next. Re-ran performance tests and added TCP_RR
tests at suggestion of Eric Dumazet. Expanded changelog of patch #3.
v2: Rebased on latest net-next. Removed RFC tags. Otherwise unchanged
owing to lack of comments on v1.
Edward Cree (4):
net: introduce list entry point for GRO
sfc: use batched receive for GRO
net: make listified RX functions return number of good packets
net/core: handle GRO_NORMAL skbs as a list in napi_gro_receive_list
drivers/net/ethernet/sfc/efx.c | 11 +++-
drivers/net/ethernet/sfc/net_driver.h | 1 +
drivers/net/ethernet/sfc/rx.c | 16 +++++-
include/linux/netdevice.h | 6 +-
include/net/ip.h | 4 +-
include/net/ipv6.h | 4 +-
net/core/dev.c | 104 ++++++++++++++++++++++++++--------
net/ipv4/ip_input.c | 39 ++++++++-----
net/ipv6/ip6_input.c | 37 +++++++-----
9 files changed, 157 insertions(+), 65 deletions(-)
* [PATCH v3 net-next 1/4] net: introduce list entry point for GRO
From: Edward Cree @ 2018-11-14 18:09 UTC
To: linux-net-drivers, davem; +Cc: netdev, eric.dumazet
Add a new entry point, napi_gro_receive_list(), which takes a list of SKBs,
passes each one through dev_gro_receive(), and returns the number that were
successfully received. Also export napi_frags_skb() so that drivers using
the napi_gro_frags() interface can prepare their SKBs properly for submitting
on such a list.
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
include/linux/netdevice.h | 2 ++
net/core/dev.c | 28 +++++++++++++++++++++++++++-
2 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 487fa5e0e165..2cef1d0fb2b1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3589,8 +3589,10 @@ int netif_receive_skb(struct sk_buff *skb);
int netif_receive_skb_core(struct sk_buff *skb);
void netif_receive_skb_list(struct list_head *head);
gro_result_t napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb);
+int napi_gro_receive_list(struct napi_struct *napi, struct list_head *head);
void napi_gro_flush(struct napi_struct *napi, bool flush_old);
struct sk_buff *napi_get_frags(struct napi_struct *napi);
+struct sk_buff *napi_frags_skb(struct napi_struct *napi);
gro_result_t napi_gro_frags(struct napi_struct *napi);
struct packet_offload *gro_find_receive_by_type(__be16 type);
struct packet_offload *gro_find_complete_by_type(__be16 type);
diff --git a/net/core/dev.c b/net/core/dev.c
index bf7e0a471186..8f0fb56170b3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5645,6 +5645,31 @@ gro_result_t napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
}
EXPORT_SYMBOL(napi_gro_receive);
+/* Returns the number of SKBs on the list successfully received */
+int napi_gro_receive_list(struct napi_struct *napi, struct list_head *head)
+{
+ struct sk_buff *skb, *next;
+ gro_result_t result;
+ int kept = 0;
+
+ list_for_each_entry(skb, head, list) {
+ skb_mark_napi_id(skb, napi);
+ trace_napi_gro_receive_entry(skb);
+ skb_gro_reset_offset(skb);
+ }
+
+ list_for_each_entry_safe(skb, next, head, list) {
+ list_del(&skb->list);
+ skb->next = NULL;
+ result = dev_gro_receive(napi, skb);
+ result = napi_skb_finish(result, skb);
+ if (result != GRO_DROP)
+ kept++;
+ }
+ return kept;
+}
+EXPORT_SYMBOL(napi_gro_receive_list);
+
static void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb)
{
if (unlikely(skb->pfmemalloc)) {
@@ -5716,7 +5741,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi,
* Drivers could call both napi_gro_frags() and napi_gro_receive()
* We copy ethernet header into skb->data to have a common layout.
*/
-static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
+struct sk_buff *napi_frags_skb(struct napi_struct *napi)
{
struct sk_buff *skb = napi->skb;
const struct ethhdr *eth;
@@ -5752,6 +5777,7 @@ static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
return skb;
}
+EXPORT_SYMBOL(napi_frags_skb);
gro_result_t napi_gro_frags(struct napi_struct *napi)
{
* [PATCH v3 net-next 2/4] sfc: use batched receive for GRO
From: Edward Cree @ 2018-11-14 18:09 UTC
To: linux-net-drivers, davem; +Cc: netdev, eric.dumazet
Instead of calling napi_gro_frags() once per packet, accumulate the prepared
SKBs on a per-channel list and hand the whole batch to napi_gro_receive_list()
at the end of the NAPI poll, feeding the returned count into the interrupt
moderation score.
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
drivers/net/ethernet/sfc/efx.c | 11 +++++++++--
drivers/net/ethernet/sfc/net_driver.h | 1 +
drivers/net/ethernet/sfc/rx.c | 16 +++++++++++++---
3 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 98fe7e762e17..dbe4a70b36b0 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -263,9 +263,9 @@ static int efx_check_disabled(struct efx_nic *efx)
*/
static int efx_process_channel(struct efx_channel *channel, int budget)
{
+ struct list_head rx_list, gro_list;
struct efx_tx_queue *tx_queue;
- struct list_head rx_list;
- int spent;
+ int spent, gro_count;
if (unlikely(!channel->enabled))
return 0;
@@ -275,6 +275,10 @@ static int efx_process_channel(struct efx_channel *channel, int budget)
INIT_LIST_HEAD(&rx_list);
channel->rx_list = &rx_list;
+ EFX_WARN_ON_PARANOID(channel->gro_list != NULL);
+ INIT_LIST_HEAD(&gro_list);
+ channel->gro_list = &gro_list;
+
efx_for_each_channel_tx_queue(tx_queue, channel) {
tx_queue->pkts_compl = 0;
tx_queue->bytes_compl = 0;
@@ -300,6 +304,9 @@ static int efx_process_channel(struct efx_channel *channel, int budget)
/* Receive any packets we queued up */
netif_receive_skb_list(channel->rx_list);
channel->rx_list = NULL;
+ gro_count = napi_gro_receive_list(&channel->napi_str, channel->gro_list);
+ channel->irq_mod_score += gro_count * 2;
+ channel->gro_list = NULL;
return spent;
}
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 961b92979640..72addac7a84a 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -502,6 +502,7 @@ struct efx_channel {
unsigned int rx_pkt_index;
struct list_head *rx_list;
+ struct list_head *gro_list;
struct efx_rx_queue rx_queue;
struct efx_tx_queue tx_queue[EFX_TXQ_TYPES];
diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index 396ff01298cd..0534a54048c6 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -453,9 +453,19 @@ efx_rx_packet_gro(struct efx_channel *channel, struct efx_rx_buffer *rx_buf,
skb_record_rx_queue(skb, channel->rx_queue.core_index);
- gro_result = napi_gro_frags(napi);
- if (gro_result != GRO_DROP)
- channel->irq_mod_score += 2;
+ /* Pass the packet up */
+ if (channel->gro_list != NULL) {
+ /* Clear napi->skb and prepare skb for GRO */
+ skb = napi_frags_skb(napi);
+ if (skb)
+ /* Add to list, will pass up later */
+ list_add_tail(&skb->list, channel->gro_list);
+ } else {
+ /* No list, so pass it up now */
+ gro_result = napi_gro_frags(napi);
+ if (gro_result != GRO_DROP)
+ channel->irq_mod_score += 2;
+ }
}
/* Allocate and construct an SKB around page fragments */
* [PATCH v3 net-next 3/4] net: make listified RX functions return number of good packets
From: Edward Cree @ 2018-11-14 18:09 UTC
To: linux-net-drivers, davem; +Cc: netdev, eric.dumazet
'Good' packets are defined as skbs for which netif_receive_skb() would
have returned %NET_RX_SUCCESS. Thus, drivers can use this number for
adaptive interrupt moderation where they previously reacted to the
return code from netif_receive_skb().
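For example (illustrative only; 'channel' here is sfc-style per-channel state,
and the scaling just mirrors what patch #2 does with the return value of
napi_gro_receive_list()):

    kept = netif_receive_skb_list(channel->rx_list);
    channel->irq_mod_score += kept * 2;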
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
include/linux/netdevice.h | 4 +--
include/net/ip.h | 4 +--
include/net/ipv6.h | 4 +--
net/core/dev.c | 63 +++++++++++++++++++++++++++++------------------
net/ipv4/ip_input.c | 39 ++++++++++++++++++-----------
net/ipv6/ip6_input.c | 37 +++++++++++++++++-----------
6 files changed, 92 insertions(+), 59 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2cef1d0fb2b1..76b98386a5dd 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2357,7 +2357,7 @@ struct packet_type {
struct net_device *,
struct packet_type *,
struct net_device *);
- void (*list_func) (struct list_head *,
+ int (*list_func) (struct list_head *,
struct packet_type *,
struct net_device *);
bool (*id_match)(struct packet_type *ptype,
@@ -3587,7 +3587,7 @@ int netif_rx(struct sk_buff *skb);
int netif_rx_ni(struct sk_buff *skb);
int netif_receive_skb(struct sk_buff *skb);
int netif_receive_skb_core(struct sk_buff *skb);
-void netif_receive_skb_list(struct list_head *head);
+int netif_receive_skb_list(struct list_head *head);
gro_result_t napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb);
int napi_gro_receive_list(struct napi_struct *napi, struct list_head *head);
void napi_gro_flush(struct napi_struct *napi, bool flush_old);
diff --git a/include/net/ip.h b/include/net/ip.h
index 8866bfce6121..33ab464f7a09 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -152,8 +152,8 @@ int ip_build_and_send_pkt(struct sk_buff *skb, const struct sock *sk,
struct ip_options_rcu *opt);
int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
struct net_device *orig_dev);
-void ip_list_rcv(struct list_head *head, struct packet_type *pt,
- struct net_device *orig_dev);
+int ip_list_rcv(struct list_head *head, struct packet_type *pt,
+ struct net_device *orig_dev);
int ip_local_deliver(struct sk_buff *skb);
void ip_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int proto);
int ip_mr_input(struct sk_buff *skb);
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index daf80863d3a5..e25920829a94 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -914,8 +914,8 @@ static inline __be32 flowi6_get_flowlabel(const struct flowi6 *fl6)
int ipv6_rcv(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt, struct net_device *orig_dev);
-void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
- struct net_device *orig_dev);
+int ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
+ struct net_device *orig_dev);
int ip6_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index 8f0fb56170b3..35427167f6fb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4969,24 +4969,27 @@ int netif_receive_skb_core(struct sk_buff *skb)
}
EXPORT_SYMBOL(netif_receive_skb_core);
-static inline void __netif_receive_skb_list_ptype(struct list_head *head,
- struct packet_type *pt_prev,
- struct net_device *orig_dev)
+static inline int __netif_receive_skb_list_ptype(struct list_head *head,
+ struct packet_type *pt_prev,
+ struct net_device *orig_dev)
{
struct sk_buff *skb, *next;
+ int kept = 0;
if (!pt_prev)
- return;
+ return 0;
if (list_empty(head))
- return;
+ return 0;
if (pt_prev->list_func != NULL)
- pt_prev->list_func(head, pt_prev, orig_dev);
+ kept = pt_prev->list_func(head, pt_prev, orig_dev);
else
list_for_each_entry_safe(skb, next, head, list)
- pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
+ if (pt_prev->func(skb, skb->dev, pt_prev, orig_dev) == NET_RX_SUCCESS)
+ kept++;
+ return kept;
}
-static void __netif_receive_skb_list_core(struct list_head *head, bool pfmemalloc)
+static int __netif_receive_skb_list_core(struct list_head *head, bool pfmemalloc)
{
/* Fast-path assumptions:
* - There is no RX handler.
@@ -5003,6 +5006,7 @@ static void __netif_receive_skb_list_core(struct list_head *head, bool pfmemallo
struct net_device *od_curr = NULL;
struct list_head sublist;
struct sk_buff *skb, *next;
+ int kept = 0, ret;
INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
@@ -5010,12 +5014,15 @@ static void __netif_receive_skb_list_core(struct list_head *head, bool pfmemallo
struct packet_type *pt_prev = NULL;
list_del(&skb->list);
- __netif_receive_skb_core(skb, pfmemalloc, &pt_prev);
- if (!pt_prev)
+ ret = __netif_receive_skb_core(skb, pfmemalloc, &pt_prev);
+ if (!pt_prev) {
+ if (ret == NET_RX_SUCCESS)
+ kept++;
continue;
+ }
if (pt_curr != pt_prev || od_curr != orig_dev) {
/* dispatch old sublist */
- __netif_receive_skb_list_ptype(&sublist, pt_curr, od_curr);
+ kept += __netif_receive_skb_list_ptype(&sublist, pt_curr, od_curr);
/* start new sublist */
INIT_LIST_HEAD(&sublist);
pt_curr = pt_prev;
@@ -5025,7 +5032,8 @@ static void __netif_receive_skb_list_core(struct list_head *head, bool pfmemallo
}
/* dispatch final sublist */
- __netif_receive_skb_list_ptype(&sublist, pt_curr, od_curr);
+ kept += __netif_receive_skb_list_ptype(&sublist, pt_curr, od_curr);
+ return kept;
}
static int __netif_receive_skb(struct sk_buff *skb)
@@ -5053,11 +5061,12 @@ static int __netif_receive_skb(struct sk_buff *skb)
return ret;
}
-static void __netif_receive_skb_list(struct list_head *head)
+static int __netif_receive_skb_list(struct list_head *head)
{
unsigned long noreclaim_flag = 0;
struct sk_buff *skb, *next;
bool pfmemalloc = false; /* Is current sublist PF_MEMALLOC? */
+ int kept = 0;
list_for_each_entry_safe(skb, next, head, list) {
if ((sk_memalloc_socks() && skb_pfmemalloc(skb)) != pfmemalloc) {
@@ -5066,7 +5075,7 @@ static void __netif_receive_skb_list(struct list_head *head)
/* Handle the previous sublist */
list_cut_before(&sublist, head, &skb->list);
if (!list_empty(&sublist))
- __netif_receive_skb_list_core(&sublist, pfmemalloc);
+ kept += __netif_receive_skb_list_core(&sublist, pfmemalloc);
pfmemalloc = !pfmemalloc;
/* See comments in __netif_receive_skb */
if (pfmemalloc)
@@ -5077,10 +5086,11 @@ static void __netif_receive_skb_list(struct list_head *head)
}
/* Handle the remaining sublist */
if (!list_empty(head))
- __netif_receive_skb_list_core(head, pfmemalloc);
+ kept += __netif_receive_skb_list_core(head, pfmemalloc);
/* Restore pflags */
if (pfmemalloc)
memalloc_noreclaim_restore(noreclaim_flag);
+ return kept;
}
static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
@@ -5156,17 +5166,20 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
return ret;
}
-static void netif_receive_skb_list_internal(struct list_head *head)
+static int netif_receive_skb_list_internal(struct list_head *head)
{
struct bpf_prog *xdp_prog = NULL;
struct sk_buff *skb, *next;
struct list_head sublist;
+ int kept = 0;
INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
net_timestamp_check(netdev_tstamp_prequeue, skb);
list_del(&skb->list);
- if (!skb_defer_rx_timestamp(skb))
+ if (skb_defer_rx_timestamp(skb))
+ kept++;
+ else
list_add_tail(&skb->list, &sublist);
}
list_splice_init(&sublist, head);
@@ -5196,13 +5209,15 @@ static void netif_receive_skb_list_internal(struct list_head *head)
if (cpu >= 0) {
/* Will be handled, remove from list */
list_del(&skb->list);
- enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
+ if (enqueue_to_backlog(skb, cpu, &rflow->last_qtail) == NET_RX_SUCCESS)
+ kept++;
}
}
}
#endif
- __netif_receive_skb_list(head);
+ kept += __netif_receive_skb_list(head);
rcu_read_unlock();
+ return kept;
}
/**
@@ -5232,21 +5247,21 @@ EXPORT_SYMBOL(netif_receive_skb);
* netif_receive_skb_list - process many receive buffers from network
* @head: list of skbs to process.
*
- * Since return value of netif_receive_skb() is normally ignored, and
- * wouldn't be meaningful for a list, this function returns void.
+ * Returns the number of skbs for which netif_receive_skb() would have
+ * returned %NET_RX_SUCCESS.
*
* This function may only be called from softirq context and interrupts
* should be enabled.
*/
-void netif_receive_skb_list(struct list_head *head)
+int netif_receive_skb_list(struct list_head *head)
{
struct sk_buff *skb;
if (list_empty(head))
- return;
+ return 0;
list_for_each_entry(skb, head, list)
trace_netif_receive_skb_list_entry(skb);
- netif_receive_skb_list_internal(head);
+ return netif_receive_skb_list_internal(head);
}
EXPORT_SYMBOL(netif_receive_skb_list);
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 72250b4e466d..fa4eb82f9e55 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -525,22 +525,26 @@ int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
ip_rcv_finish);
}
-static void ip_sublist_rcv_finish(struct list_head *head)
+static int ip_sublist_rcv_finish(struct list_head *head)
{
struct sk_buff *skb, *next;
+ int kept = 0;
list_for_each_entry_safe(skb, next, head, list) {
skb_list_del_init(skb);
- dst_input(skb);
+ if (dst_input(skb) == NET_RX_SUCCESS)
+ kept++;
}
+ return kept;
}
-static void ip_list_rcv_finish(struct net *net, struct sock *sk,
- struct list_head *head)
+static int ip_list_rcv_finish(struct net *net, struct sock *sk,
+ struct list_head *head)
{
struct dst_entry *curr_dst = NULL;
struct sk_buff *skb, *next;
struct list_head sublist;
+ int kept = 0;
INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
@@ -551,8 +555,10 @@ static void ip_list_rcv_finish(struct net *net, struct sock *sk,
* skb to its handler for processing
*/
skb = l3mdev_ip_rcv(skb);
- if (!skb)
+ if (!skb) {
+ kept++;
continue;
+ }
if (ip_rcv_finish_core(net, sk, skb) == NET_RX_DROP)
continue;
@@ -560,7 +566,7 @@ static void ip_list_rcv_finish(struct net *net, struct sock *sk,
if (curr_dst != dst) {
/* dispatch old sublist */
if (!list_empty(&sublist))
- ip_sublist_rcv_finish(&sublist);
+ kept += ip_sublist_rcv_finish(&sublist);
/* start new sublist */
INIT_LIST_HEAD(&sublist);
curr_dst = dst;
@@ -568,25 +574,27 @@ static void ip_list_rcv_finish(struct net *net, struct sock *sk,
list_add_tail(&skb->list, &sublist);
}
/* dispatch final sublist */
- ip_sublist_rcv_finish(&sublist);
+ kept += ip_sublist_rcv_finish(&sublist);
+ return kept;
}
-static void ip_sublist_rcv(struct list_head *head, struct net_device *dev,
- struct net *net)
+static int ip_sublist_rcv(struct list_head *head, struct net_device *dev,
+ struct net *net)
{
NF_HOOK_LIST(NFPROTO_IPV4, NF_INET_PRE_ROUTING, net, NULL,
head, dev, NULL, ip_rcv_finish);
- ip_list_rcv_finish(net, NULL, head);
+ return ip_list_rcv_finish(net, NULL, head);
}
-/* Receive a list of IP packets */
-void ip_list_rcv(struct list_head *head, struct packet_type *pt,
- struct net_device *orig_dev)
+/* Receive a list of IP packets; return number of successful receives */
+int ip_list_rcv(struct list_head *head, struct packet_type *pt,
+ struct net_device *orig_dev)
{
struct net_device *curr_dev = NULL;
struct net *curr_net = NULL;
struct sk_buff *skb, *next;
struct list_head sublist;
+ int kept = 0;
INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
@@ -601,7 +609,7 @@ void ip_list_rcv(struct list_head *head, struct packet_type *pt,
if (curr_dev != dev || curr_net != net) {
/* dispatch old sublist */
if (!list_empty(&sublist))
- ip_sublist_rcv(&sublist, curr_dev, curr_net);
+ kept += ip_sublist_rcv(&sublist, curr_dev, curr_net);
/* start new sublist */
INIT_LIST_HEAD(&sublist);
curr_dev = dev;
@@ -610,5 +618,6 @@ void ip_list_rcv(struct list_head *head, struct packet_type *pt,
list_add_tail(&skb->list, &sublist);
}
/* dispatch final sublist */
- ip_sublist_rcv(&sublist, curr_dev, curr_net);
+ kept += ip_sublist_rcv(&sublist, curr_dev, curr_net);
+ return kept;
}
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 3c06cc9e9b79..6e013672e1de 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -76,20 +76,24 @@ int ip6_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
return dst_input(skb);
}
-static void ip6_sublist_rcv_finish(struct list_head *head)
+static int ip6_sublist_rcv_finish(struct list_head *head)
{
struct sk_buff *skb, *next;
+ int kept = 0;
list_for_each_entry_safe(skb, next, head, list)
- dst_input(skb);
+ if (dst_input(skb) == NET_RX_SUCCESS)
+ kept++;
+ return kept;
}
-static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
- struct list_head *head)
+static int ip6_list_rcv_finish(struct net *net, struct sock *sk,
+ struct list_head *head)
{
struct dst_entry *curr_dst = NULL;
struct sk_buff *skb, *next;
struct list_head sublist;
+ int kept = 0;
INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
@@ -100,14 +104,16 @@ static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
* skb to its handler for processing
*/
skb = l3mdev_ip6_rcv(skb);
- if (!skb)
+ if (!skb) {
+ kept++;
continue;
+ }
ip6_rcv_finish_core(net, sk, skb);
dst = skb_dst(skb);
if (curr_dst != dst) {
/* dispatch old sublist */
if (!list_empty(&sublist))
- ip6_sublist_rcv_finish(&sublist);
+ kept += ip6_sublist_rcv_finish(&sublist);
/* start new sublist */
INIT_LIST_HEAD(&sublist);
curr_dst = dst;
@@ -115,7 +121,8 @@ static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
list_add_tail(&skb->list, &sublist);
}
/* dispatch final sublist */
- ip6_sublist_rcv_finish(&sublist);
+ kept += ip6_sublist_rcv_finish(&sublist);
+ return kept;
}
static struct sk_buff *ip6_rcv_core(struct sk_buff *skb, struct net_device *dev,
@@ -274,22 +281,23 @@ int ipv6_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt
ip6_rcv_finish);
}
-static void ip6_sublist_rcv(struct list_head *head, struct net_device *dev,
- struct net *net)
+static int ip6_sublist_rcv(struct list_head *head, struct net_device *dev,
+ struct net *net)
{
NF_HOOK_LIST(NFPROTO_IPV6, NF_INET_PRE_ROUTING, net, NULL,
head, dev, NULL, ip6_rcv_finish);
- ip6_list_rcv_finish(net, NULL, head);
+ return ip6_list_rcv_finish(net, NULL, head);
}
/* Receive a list of IPv6 packets */
-void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
- struct net_device *orig_dev)
+int ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
+ struct net_device *orig_dev)
{
struct net_device *curr_dev = NULL;
struct net *curr_net = NULL;
struct sk_buff *skb, *next;
struct list_head sublist;
+ int kept = 0;
INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
@@ -304,7 +312,7 @@ void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
if (curr_dev != dev || curr_net != net) {
/* dispatch old sublist */
if (!list_empty(&sublist))
- ip6_sublist_rcv(&sublist, curr_dev, curr_net);
+ kept += ip6_sublist_rcv(&sublist, curr_dev, curr_net);
/* start new sublist */
INIT_LIST_HEAD(&sublist);
curr_dev = dev;
@@ -313,7 +321,8 @@ void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
list_add_tail(&skb->list, &sublist);
}
/* dispatch final sublist */
- ip6_sublist_rcv(&sublist, curr_dev, curr_net);
+ kept += ip6_sublist_rcv(&sublist, curr_dev, curr_net);
+ return kept;
}
/*
* [PATCH v3 net-next 4/4] net/core: handle GRO_NORMAL skbs as a list in napi_gro_receive_list
From: Edward Cree @ 2018-11-14 18:10 UTC
To: linux-net-drivers, davem; +Cc: netdev, eric.dumazet
Allows GRO-using drivers to get the benefits of batching for non-GROable
traffic.
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
net/core/dev.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 35427167f6fb..65bfe28fbc81 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5664,6 +5664,7 @@ EXPORT_SYMBOL(napi_gro_receive);
int napi_gro_receive_list(struct napi_struct *napi, struct list_head *head)
{
struct sk_buff *skb, *next;
+ struct list_head sublist;
gro_result_t result;
int kept = 0;
@@ -5673,14 +5674,26 @@ int napi_gro_receive_list(struct napi_struct *napi, struct list_head *head)
skb_gro_reset_offset(skb);
}
+ INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
list_del(&skb->list);
skb->next = NULL;
result = dev_gro_receive(napi, skb);
- result = napi_skb_finish(result, skb);
- if (result != GRO_DROP)
- kept++;
+ if (result == GRO_NORMAL) {
+ list_add_tail(&skb->list, &sublist);
+ continue;
+ } else {
+ if (!list_empty(&sublist)) {
+ /* Handle the GRO_NORMAL skbs to prevent OoO */
+ kept += netif_receive_skb_list_internal(&sublist);
+ INIT_LIST_HEAD(&sublist);
+ }
+ result = napi_skb_finish(result, skb);
+ if (result != GRO_DROP)
+ kept++;
+ }
}
+ kept += netif_receive_skb_list_internal(&sublist);
return kept;
}
EXPORT_SYMBOL(napi_gro_receive_list);
* Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Eric Dumazet @ 2018-11-15 7:22 UTC
To: Edward Cree, linux-net-drivers, davem; +Cc: netdev
On 11/14/2018 10:07 AM, Edward Cree wrote:
>
> Conclusion:
> * TCP b/w is 16.5% faster for traffic which cannot be coalesced by GRO.
>
But only for traffic that actually was a perfect GRO candidate, right?
Now what happens if all the packets you are batching are hitting different TCP sockets?
(DDoS attack patterns)
By the time we build a list of 64 packets, the first packets in the list won't be in the
L1 cache any more (32 KB, 8-way associative typically), and we will probably have cache thrashing.
* Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Edward Cree @ 2018-11-15 18:43 UTC
To: Eric Dumazet, linux-net-drivers, davem; +Cc: netdev
On 15/11/18 07:22, Eric Dumazet wrote:
> On 11/14/2018 10:07 AM, Edward Cree wrote:
>> Conclusion:
>> * TCP b/w is 16.5% faster for traffic which cannot be coalesced by GRO.
> But only for traffic that actually was a perfect GRO candidate, right?
>
> Now what happens if all the packets you are batching are hitting different TCP sockets?
The batch is already split up by the time it hits TCP sockets; batching
currently only goes as far as ip_sublist_rcv_finish() which calls
dst_input(skb) in a loop. So as long as the packets are all for the
same dst IP, we should get all of this gain.
If the packets have different dst IP addresses then we split the batch
slightly earlier, in ip_list_rcv_finish(), but that won't make very
much difference; I expect we'll still get most of this gain. There is
a lot of the stack (layer 2 stuff, taps, etc.) that we still traverse
as a batch.
> By the time we build a list of 64 packets, the first packets in the list won't be in the
> L1 cache any more (32 KB, 8-way associative typically), and we will probably have cache thrashing.
Most of the packet isn't touched and thus won't be brought into cache.
Only the headers of each packet (worst-case, let's say 256 bytes) will
be touched during batch processing; at 64 packets that's 16 kB. And not
all at once: by the time we touch the later cachelines of a packet we'll
be done with the earlier ones, except maybe in cases where GRO decides
very late that it can't coalesce.
And since the alternative is thrashing of the I$ cache, I don't think
there's an a priori argument that this will hurt — and my tests seem
to indicate that it's OK and that we gain more from better I$ usage
than we lose from worse D$ usage patterns.
If you think there are cases in which the latter will dominate, please
suggest some tests that will embody them; I'm happy to keep running
experiments. Also, you could come up with an analogue of patch #2 for
whatever HW you have (it shouldn't be difficult), allowing you to run
your own tests (e.g. if you have bigger/more powerful test rigs than
I have access to ;-)
-Ed
* Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Eric Dumazet @ 2018-11-15 20:08 UTC
To: Edward Cree, linux-net-drivers, davem; +Cc: netdev
On 11/15/2018 10:43 AM, Edward Cree wrote:
> Most of the packet isn't touched and thus won't be brought into cache.
> Only the headers of each packet (worst-case, let's say 256 bytes) will
> be touched during batch processing; at 64 packets that's 16 kB.
You assume perfect use of the caches, but part of the cache has collisions.
I am alarmed by the complexity added, for example in GRO, considering
that we also added GRO for UDP.
I dunno, can you show us for example if a reassembly workload can benefit
from all this stuff?
Paolo Abeni will surely be interested to know if we can get a 20% increase for
these IP defrag workloads.
If you present numbers for traffic that GRO handles just fine, it does not
really make sense, unless maybe your plan is to remove GRO completely?
We have observed at Google a steady increase in CPU cycles spent on TCP_RR
with the latest kernels. The gap is now about 20% versus kernels from two years
ago, and I have not yet been able to find a single faulty commit. It seems we
add one little overhead after another, and every patch author is convinced he
is doing the right thing.
With multi-queue NICs, the vast majority of napi->poll() invocations handle
only one packet. Unfortunately we cannot really increase interrupt mitigation
(ethtool -c) on the NIC without sacrificing latency.
* Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Edward Cree @ 2018-11-15 21:45 UTC
To: Eric Dumazet, linux-net-drivers, davem; +Cc: netdev
On 15/11/18 20:08, Eric Dumazet wrote:
> On 11/15/2018 10:43 AM, Edward Cree wrote:
>
>> Most of the packet isn't touched and thus won't be brought into cache.
>> Only the headers of each packet (worst-case, let's say 256 bytes) will
>> be touched during batch processing; at 64 packets that's 16 kB.
> You assume perfect use of the caches, but part of the cache has collisions.
I assume nothing; that's why I'm running lots of tests & benchmarks.
Remember that gains from batching are not only in the I$; the D$ is
also going to be used for things like route lookups and netfilter
progs, and locality for those is improved by batching.
It might be possible to use PMCs to get hard numbers on how I$ and D$
hit & eviction rates change, though I don't know how useful that would be.
> I am alarmed by the complexity added, for example in GRO, considering
> that we also added GRO for UDP.
This series doesn't really add complexity _in_ GRO; it's more a piece
on the outside that calls the GRO machinery slightly differently.
Drivers which just call the existing non-list-based entry points won't
even see any of this code.
> I dunno, can you show us for example if a reassembly workload can benefit
> from all this stuff?
Sure, I can try a UDP test with payload_size > MTU. (I can't think of a
way to force interleaving of fragments from different packets, though.)
> If you present numbers for traffic that GRO handles just fine, it does not
> really make sense, unless maybe your plan is to remove GRO completely?
That's just the easiest thing to test. It's rather harder to set up
tests that use e.g. IP options that GRO will baulk at. It's also not too
easy to create traffic with the kind of flow interleaving that DDoS
scenarios would present, as that requires something like a many-to-one
rig with a switch, and I don't have enough lab machines for such a test.
I'm not planning to remove GRO. GRO is faster than batched receive.
Batched receive, however, works equally well for all traffic whether it's
GRO-able or not.
Thus both are worth having. This patch series is about using batched
receive for packets that GRO looks at and says "no thanks".
> We have observed at Google a steady increase in CPU cycles spent on TCP_RR
> with the latest kernels. The gap is now about 20% versus kernels from two years
> ago, and I have not yet been able to find a single faulty commit. It seems we
> add one little overhead after another, and every patch author is convinced he
> is doing the right thing.
>
> With multi-queue NICs, the vast majority of napi->poll() invocations handle
> only one packet. Unfortunately we cannot really increase interrupt mitigation
> (ethtool -c) on the NIC without sacrificing latency.
At one point when I was working on the original batching patches, I tried
making them skip batching if poll() hadn't used up the entire NAPI budget
(as a signal that we're not BW-constrained), but it didn't seem to yield
any benefit. However, I could try it again, or try checking the list
length and handling packets singly if it's less than some threshold...?
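Concretely, I'm imagining something like this in napi_gro_receive_list()
(untested sketch only; the threshold of 8 is plucked from the air, and the
skb-preparation loop and the out-of-order flush from patch #4 are omitted
for brevity):

    struct sk_buff *skb, *next;
    struct list_head sublist;
    gro_result_t result;
    int n = 0, kept = 0;
    bool batch;

    INIT_LIST_HEAD(&sublist);
    /* Count the batch; if it's short, don't bother listifying */
    list_for_each_entry(skb, head, list)
            n++;
    batch = n >= 8; /* arbitrary threshold, would need tuning */

    list_for_each_entry_safe(skb, next, head, list) {
            list_del(&skb->list);
            skb->next = NULL;
            result = dev_gro_receive(napi, skb);
            if (batch && result == GRO_NORMAL) {
                    /* defer; will go up via netif_receive_skb_list_internal() */
                    list_add_tail(&skb->list, &sublist);
                    continue;
            }
            /* short list (or non-normal result): handle it singly, as
             * napi_gro_receive() would
             */
            result = napi_skb_finish(result, skb);
            if (result != GRO_DROP)
                    kept++;
    }
    kept += netif_receive_skb_list_internal(&sublist);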
If napi->poll() is only handling one packet, surely GRO can't do anything
useful either? (AIUI at the end of the poll the GRO lists get flushed.)
Is it maybe a sign that you're just spreading over too many queues??
-Ed
* Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Eric Dumazet @ 2018-11-15 22:01 UTC
To: Edward Cree, Eric Dumazet, linux-net-drivers, davem; +Cc: netdev
On 11/15/2018 01:45 PM, Edward Cree wrote:
>
> If napi->poll() is only handling one packet, surely GRO can't do anything
> useful either? (AIUI at the end of the poll the GRO lists get flushed.)
That is my point.
Adding yet another layer will bring no gain, just more wasted CPU cycles.
In fact I know many people disabling GRO in some cases because it adds ~5% penalty
for traffic that is not aggregated.
> Is it maybe a sign that you're just spreading over too many queues??
Not really. You also want to be able to receive more traffic if the need comes.
Most NICs share the same IRQ for a TX/RX queue pair, and you might have an imbalance between TX and RX load.
* Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Edward Cree @ 2018-11-15 22:37 UTC
To: linux-net-drivers, davem; +Cc: netdev, eric.dumazet
Some corrections as it looks like I didn't proofread this carefully enough
before sending it...
On 14/11/18 18:07, Edward Cree wrote:
> Payload_size in all tests was 8000 bytes.
This was for TCP tests; the UDP test used 1-byte payloads.
> UDP Stream (GRO off):
> net-next: 7.808 Gb/s
> after #4: 7.848 Gb/s
These numbers were Mb/s, not Gb/s.
> 0.5% slower; p = 0.144
And of course the 'after' state was 0.5% _faster_.
> * UDP throughput might be slightly slowed (probably by patch #3) but it's
> not statistically significant.
Ditto here: UDP has not been slowed.
-Ed
* Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Edward Cree @ 2018-11-15 22:45 UTC
To: Eric Dumazet, linux-net-drivers, davem; +Cc: netdev
On 15/11/18 22:01, Eric Dumazet wrote:
> On 11/15/2018 01:45 PM, Edward Cree wrote:
>> If napi->poll() is only handling one packet, surely GRO can't do anything
>> useful either? (AIUI at the end of the poll the GRO lists get flushed.)
> That is my point.
>
> Adding yet another layer will bring no gain, just more wasted CPU cycles.
>
> In fact I know many people disabling GRO in some cases because it adds ~5% penalty
> for traffic that is not aggregated.
Does there maybe need to be an (ethtool -K) option to disable batch receive,
then, for this kind of user?
>> Is it maybe a sign that you're just spreading over too many queues??
> Not really. You also want to be able to receive more traffic if the need comes.
Oh, I see: this is about using less CPU when not maxed out, rather than
increasing the maximum performance.
I did see a 6% RX CPU usage increase in the "TCP RR, GRO on" test (before =
188.7%, after = 200%; Welch p < 0.001, Cohen's d = 6.2). I'll try adding a "skip
batching for short lists" check and retest to see if that improves matters.
* Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path
From: Eric Dumazet @ 2018-11-15 23:08 UTC
To: Edward Cree, Eric Dumazet, linux-net-drivers, davem; +Cc: netdev
On 11/15/2018 02:45 PM, Edward Cree wrote:
> On 15/11/18 22:01, Eric Dumazet wrote:
>> On 11/15/2018 01:45 PM, Edward Cree wrote:
>>> If napi->poll() is only handling one packet, surely GRO can't do anything
>>> useful either? (AIUI at the end of the poll the GRO lists get flushed.)
>> That is my point.
>>
>> Adding yet another layer will bring no gain, just more wasted CPU cycles.
>>
>> In fact I know many people disabling GRO in some cases because it adds ~5% penalty
>> for traffic that is not aggregated.
> Does there maybe need to be an (ethtool -K) option to disable batch receive,
> then, for this kind of user?
I do not want to hold up your patches, only to remind us that we add a lot of
features and stuff that might help only in some cases.
Another example is the IP early demux for UDP packets, which is clearly
a waste of time when the receiving socket is not a connected socket.