netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/2] macvlan: optimize receive path
@ 2014-10-10  3:13 Jason Baron
  2014-10-10  3:13 ` [PATCH net-next 1/2] macvlan: pass 'bool' type to macvlan_count_rx() Jason Baron
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Jason Baron @ 2014-10-10  3:13 UTC (permalink / raw)
  To: davem; +Cc: eric.dumazet, stephen, vyasevich, kaber, netdev

Hi,

So after porting this optimization to net-next, I found that the netperf
TCP_RR results regress right at the peak transactions/sec. That is, as I
increase the number of threads via the first argument to super_netperf,
transactions/sec keep increasing, peak, and then start decreasing. It is
right at that peak that I see a small regression with this patch (see
results in patch 2/2).

Without the patch, the ksoftirqd threads are the top cpu-consuming threads
on the system, since the extra netif_rx() queues more softirq work, whereas
with the patch the ksoftirqd threads fall below all of the 'netserver'
threads in cpu usage. So there appears to be some interaction between this
patch and how softirqs are serviced at the peak. I think the test results
still support this approach, but I wanted to be clear about my findings.

Thanks,

-Jason

Jason Baron (2):
  macvlan: pass 'bool' type to macvlan_count_rx()
  macvlan: optimize the receive path

 drivers/net/macvlan.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

-- 
1.8.2.rc2


* [PATCH net-next 1/2] macvlan: pass 'bool' type to macvlan_count_rx()
  2014-10-10  3:13 [PATCH net-next 0/2] macvlan: optimize receive path Jason Baron
@ 2014-10-10  3:13 ` Jason Baron
  2014-10-10  3:13 ` [PATCH net-next 2/2] macvlan: optimize the receive path Jason Baron
  2014-10-10 19:10 ` [PATCH net-next 0/2] macvlan: optimize " David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Jason Baron @ 2014-10-10  3:13 UTC (permalink / raw)
  To: davem; +Cc: eric.dumazet, stephen, vyasevich, kaber, netdev

Pass last argument to macvlan_count_rx() as the correct bool type.

Signed-off-by: Jason Baron <jbaron@akamai.com>
---
 drivers/net/macvlan.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 38b4fae..c7c58af 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -260,7 +260,7 @@ static void macvlan_broadcast(struct sk_buff *skb,
 					mode == MACVLAN_MODE_BRIDGE) ?:
 				      netif_rx_ni(nskb);
 			macvlan_count_rx(vlan, skb->len + ETH_HLEN,
-					 err == NET_RX_SUCCESS, 1);
+					 err == NET_RX_SUCCESS, true);
 		}
 	}
 }
@@ -379,7 +379,7 @@ static void macvlan_forward_source_one(struct sk_buff *skb,
 	nskb->pkt_type = PACKET_HOST;
 
 	ret = netif_rx(nskb);
-	macvlan_count_rx(vlan, len, ret == NET_RX_SUCCESS, 0);
+	macvlan_count_rx(vlan, len, ret == NET_RX_SUCCESS, false);
 }
 
 static void macvlan_forward_source(struct sk_buff *skb,
@@ -457,7 +457,7 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 	ret = netif_rx(skb);
 
 out:
-	macvlan_count_rx(vlan, len, ret == NET_RX_SUCCESS, 0);
+	macvlan_count_rx(vlan, len, ret == NET_RX_SUCCESS, false);
 	return RX_HANDLER_CONSUMED;
 }
 
-- 
1.8.2.rc2


* [PATCH net-next 2/2] macvlan: optimize the receive path
  2014-10-10  3:13 [PATCH net-next 0/2] macvlan: optimize receive path Jason Baron
  2014-10-10  3:13 ` [PATCH net-next 1/2] macvlan: pass 'bool' type to macvlan_count_rx() Jason Baron
@ 2014-10-10  3:13 ` Jason Baron
  2014-10-10 19:10 ` [PATCH net-next 0/2] macvlan: optimize " David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Jason Baron @ 2014-10-10  3:13 UTC (permalink / raw)
  To: davem; +Cc: eric.dumazet, stephen, vyasevich, kaber, netdev

The netif_rx() call on the fast path of macvlan_handle_frame() appears to
be there to throttle incoming packets. However, proper throttling is
already in place for all possible ingress paths, so the call is redundant.
If packets arrive from the physical NIC, they have already been throttled
by this point. Otherwise, if they come via macvlan_queue_xmit(), it either
calls dev_forward_skb(), which ends up calling netif_rx_internal(), or, in
the broadcast case, we throttle via macvlan_broadcast_enqueue().

The test results below are from a remote host to an lxc instance running
macvlan. Once transactions/sec stop increasing, cpu idle time has dropped
to 0. Results are from a quad-core Intel E3-1270 V2 @ 3.50GHz box with a
bnx2x 10G card.

for i in {10,100,200,300,400,500};
do super_netperf $i -H $ip -t TCP_RR; done
Average of 5 runs.

trans/sec 		 trans/sec
(3.17-rc7-net-next)      (3.17-rc7-net-next + this patch)
----------               ----------
208101                   211534 (+1.6%)
839493                   850162 (+1.3%)
845071                   844053 (-.12%)
816330                   819623 (+.4%)
778700                   789938 (+1.4%)
735984                   754408 (+2.5%)

Signed-off-by: Jason Baron <jbaron@akamai.com>
---
 drivers/net/macvlan.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index c7c58af..29b3bb4 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -407,7 +407,8 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 	const struct macvlan_dev *src;
 	struct net_device *dev;
 	unsigned int len = 0;
-	int ret = NET_RX_DROP;
+	int ret;
+	rx_handler_result_t handle_res;
 
 	port = macvlan_port_get_rcu(skb->dev);
 	if (is_multicast_ether_addr(eth->h_dest)) {
@@ -423,6 +424,7 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 			vlan = src;
 			ret = macvlan_broadcast_one(skb, vlan, eth, 0) ?:
 			      netif_rx(skb);
+			handle_res = RX_HANDLER_CONSUMED;
 			goto out;
 		}
 
@@ -448,17 +450,20 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 	}
 	len = skb->len + ETH_HLEN;
 	skb = skb_share_check(skb, GFP_ATOMIC);
-	if (!skb)
+	if (!skb) {
+		ret = NET_RX_DROP;
+		handle_res = RX_HANDLER_CONSUMED;
 		goto out;
+	}
 
 	skb->dev = dev;
 	skb->pkt_type = PACKET_HOST;
 
-	ret = netif_rx(skb);
-
+	ret = NET_RX_SUCCESS;
+	handle_res = RX_HANDLER_ANOTHER;
 out:
 	macvlan_count_rx(vlan, len, ret == NET_RX_SUCCESS, false);
-	return RX_HANDLER_CONSUMED;
+	return handle_res;
 }
 
 static int macvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
-- 
1.8.2.rc2


* Re: [PATCH net-next 0/2] macvlan: optimize receive path
  2014-10-10  3:13 [PATCH net-next 0/2] macvlan: optimize receive path Jason Baron
  2014-10-10  3:13 ` [PATCH net-next 1/2] macvlan: pass 'bool' type to macvlan_count_rx() Jason Baron
  2014-10-10  3:13 ` [PATCH net-next 2/2] macvlan: optimize the receive path Jason Baron
@ 2014-10-10 19:10 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2014-10-10 19:10 UTC (permalink / raw)
  To: jbaron; +Cc: eric.dumazet, stephen, vyasevich, kaber, netdev

From: Jason Baron <jbaron@akamai.com>
Date: Fri, 10 Oct 2014 03:13:24 +0000 (GMT)

> So after porting this optimization to net-next, I found that the netperf
> TCP_RR results regress right at the peak transactions/sec. That is, as I
> increase the number of threads via the first argument to super_netperf,
> transactions/sec keep increasing, peak, and then start decreasing. It is
> right at that peak that I see a small regression with this patch (see
> results in patch 2/2).
> 
> Without the patch, the ksoftirqd threads are the top cpu-consuming threads
> on the system, since the extra netif_rx() queues more softirq work, whereas
> with the patch the ksoftirqd threads fall below all of the 'netserver'
> threads in cpu usage. So there appears to be some interaction between this
> patch and how softirqs are serviced at the peak. I think the test results
> still support this approach, but I wanted to be clear about my findings.

I think this is definitely the right thing to do, applied, thanks!

