* [PATCH v3 net-next 0/1] ip_tunnel: Create percpu gro_cell
@ 2015-01-16 18:10 Martin KaFai Lau
2015-01-16 18:11 ` [PATCH v3 net-next 1/1] " Martin KaFai Lau
From: Martin KaFai Lau @ 2015-01-16 18:10 UTC
To: netdev; +Cc: Eric Dumazet, kernel-team
In the ipip tunnel, skb->queue_mapping is lost in ipip_rcv(), so all
skbs are queued to the same cell->napi_skbs and gro_cell_poll is
pinned to one core under load. In production traffic, we also see
severe rx_dropped on the tunl interface, probably because of this
limit: skb_queue_len(&cell->napi_skbs) > netdev_max_backlog.
This patch allocates the gro_cells with alloc_percpu(struct gro_cell)
and schedules gro_cell_poll to process the skbs on the same core that
received them.
Changes from v1:
Eric Dumazet pointed out that ____cacheline_aligned_in_smp is no longer needed.
Changes from v2:
Dropped the one-item-struct cleanup patch per comment.
Setup:
VIP_PREFIX=9.9.9.9/32
REMOTE_REAL_IP=10.228.95.75
if [ "$1" = "encap" ]
then
    # Encapsulating host
    sudo ip tunnel add mode ipip remote ${REMOTE_REAL_IP}
    sudo ip link set dev ipip0 up
    sudo ip route add dev ipip0 ${VIP_PREFIX}
else
    # Decapsulating host
    sudo ip tunnel add mode ipip
    sudo ip link set dev tunl0 up
    sudo ip addr add dev lo ${VIP_PREFIX}
    sudo sysctl -a | grep '\.rp_filter' | awk '{print $1;}' | \
        xargs -n1 -I{} sudo sysctl {}=0
fi
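The sysctl pipeline on the decapsulating host collects every *.rp_filter key and clears it, so the reverse-path filter does not drop packets addressed to the VIP. A dry run of the same grep/awk extraction against sample sysctl output (the key names here are just examples):

```shell
printf 'net.ipv4.conf.all.rp_filter = 1\nnet.ipv4.conf.tunl0.rp_filter = 1\n' |
    grep '\.rp_filter' | awk '{print $1;}'
```

This prints only the key names, which the real script then feeds through xargs to `sysctl {}=0`.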
Before:
[root@DECAP ~]# netserver -p 8888
[root@ENCAP ~]# super_netperf 200 -t TCP_RR -H 9.9.9.9 -p 8888 \
-l 30 -- -d 0x6 -m 8k,64k -s 1M -S 1M
332215
[root@DECAP ~]# perf probe -a gro_cell_poll
[root@DECAP ~]# perf stat -I 1000 -a -A -e probe:gro_cell_poll
117.258518273 CPU0 0 probe:gro_cell_poll
117.258518273 CPU1 0 probe:gro_cell_poll
117.258518273 CPU2 0 probe:gro_cell_poll
117.258518273 CPU3 0 probe:gro_cell_poll
117.258518273 CPU4 0 probe:gro_cell_poll
117.258518273 CPU5 0 probe:gro_cell_poll
117.258518273 CPU6 0 probe:gro_cell_poll
117.258518273 CPU7 0 probe:gro_cell_poll
117.258518273 CPU8 0 probe:gro_cell_poll
117.258518273 CPU9 0 probe:gro_cell_poll
117.258518273 CPU10 0 probe:gro_cell_poll
117.258518273 CPU11 0 probe:gro_cell_poll
117.258518273 CPU12 0 probe:gro_cell_poll
117.258518273 CPU13 0 probe:gro_cell_poll
117.258518273 CPU14 0 probe:gro_cell_poll
117.258518273 CPU15 4,882 probe:gro_cell_poll
117.258518273 CPU16 0 probe:gro_cell_poll
117.258518273 CPU17 0 probe:gro_cell_poll
117.258518273 CPU18 0 probe:gro_cell_poll
117.258518273 CPU19 0 probe:gro_cell_poll
117.258518273 CPU20 0 probe:gro_cell_poll
117.258518273 CPU21 0 probe:gro_cell_poll
117.258518273 CPU22 0 probe:gro_cell_poll
117.258518273 CPU23 0 probe:gro_cell_poll
117.258518273 CPU24 0 probe:gro_cell_poll
117.258518273 CPU25 0 probe:gro_cell_poll
117.258518273 CPU26 0 probe:gro_cell_poll
117.258518273 CPU27 0 probe:gro_cell_poll
117.258518273 CPU28 0 probe:gro_cell_poll
117.258518273 CPU29 0 probe:gro_cell_poll
117.258518273 CPU30 0 probe:gro_cell_poll
117.258518273 CPU31 0 probe:gro_cell_poll
117.258518273 CPU32 0 probe:gro_cell_poll
117.258518273 CPU33 0 probe:gro_cell_poll
117.258518273 CPU34 0 probe:gro_cell_poll
117.258518273 CPU35 0 probe:gro_cell_poll
117.258518273 CPU36 0 probe:gro_cell_poll
117.258518273 CPU37 0 probe:gro_cell_poll
117.258518273 CPU38 0 probe:gro_cell_poll
117.258518273 CPU39 0 probe:gro_cell_poll
After:
[root@DECAP ~]# netserver -p 8888
[root@ENCAP ~]# super_netperf 200 -t TCP_RR -H 9.9.9.9 -p 8888 \
-l 30 -- -d 0x6 -m 8k,64k -s 1M -S 1M
877530
[root@DECAP ~]# perf probe -a gro_cell_poll
[root@DECAP ~]# perf stat -I 1000 -a -A -e probe:gro_cell_poll
40.085714389 CPU0 13,607 probe:gro_cell_poll
40.085714389 CPU1 13,188 probe:gro_cell_poll
40.085714389 CPU2 12,913 probe:gro_cell_poll
40.085714389 CPU3 12,790 probe:gro_cell_poll
40.085714389 CPU4 13,395 probe:gro_cell_poll
40.085714389 CPU5 13,121 probe:gro_cell_poll
40.085714389 CPU6 11,083 probe:gro_cell_poll
40.085714389 CPU7 12,945 probe:gro_cell_poll
40.085714389 CPU8 13,704 probe:gro_cell_poll
40.085714389 CPU9 13,514 probe:gro_cell_poll
40.085714389 CPU10 0 probe:gro_cell_poll
40.085714389 CPU11 0 probe:gro_cell_poll
40.085714389 CPU12 0 probe:gro_cell_poll
40.085714389 CPU13 0 probe:gro_cell_poll
40.085714389 CPU14 0 probe:gro_cell_poll
40.085714389 CPU15 0 probe:gro_cell_poll
40.085714389 CPU16 0 probe:gro_cell_poll
40.085714389 CPU17 0 probe:gro_cell_poll
40.085714389 CPU18 0 probe:gro_cell_poll
40.085714389 CPU19 0 probe:gro_cell_poll
40.085714389 CPU20 10,402 probe:gro_cell_poll
40.085714389 CPU21 12,312 probe:gro_cell_poll
40.085714389 CPU22 11,913 probe:gro_cell_poll
40.085714389 CPU23 12,964 probe:gro_cell_poll
40.085714389 CPU24 13,727 probe:gro_cell_poll
40.085714389 CPU25 12,943 probe:gro_cell_poll
40.085714389 CPU26 13,558 probe:gro_cell_poll
40.085714389 CPU27 12,676 probe:gro_cell_poll
40.085714389 CPU28 13,754 probe:gro_cell_poll
40.085714389 CPU29 13,379 probe:gro_cell_poll
40.085714389 CPU30 0 probe:gro_cell_poll
40.085714389 CPU31 0 probe:gro_cell_poll
40.085714389 CPU32 0 probe:gro_cell_poll
40.085714389 CPU33 0 probe:gro_cell_poll
40.085714389 CPU34 0 probe:gro_cell_poll
40.085714389 CPU35 0 probe:gro_cell_poll
40.085714389 CPU36 0 probe:gro_cell_poll
40.085714389 CPU37 0 probe:gro_cell_poll
40.085714389 CPU38 0 probe:gro_cell_poll
40.085714389 CPU39 0 probe:gro_cell_poll
* [PATCH v3 net-next 1/1] ip_tunnel: Create percpu gro_cell
2015-01-16 18:10 [PATCH v3 net-next 0/1] ip_tunnel: Create percpu gro_cell Martin KaFai Lau
@ 2015-01-16 18:11 ` Martin KaFai Lau
2015-01-16 19:38 ` Eric Dumazet
2015-01-18 6:57 ` David Miller
From: Martin KaFai Lau @ 2015-01-16 18:11 UTC
To: netdev; +Cc: Eric Dumazet, kernel-team
In the ipip tunnel, skb->queue_mapping is lost in ipip_rcv(), so all
skbs are queued to the same cell->napi_skbs and gro_cell_poll is
pinned to one core under load. In production traffic, we also see
severe rx_dropped on the tunl interface, probably because of this
limit: skb_queue_len(&cell->napi_skbs) > netdev_max_backlog.
This patch allocates the gro_cells with alloc_percpu(struct gro_cell)
and schedules gro_cell_poll to process the skbs on the same core that
received them.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
include/net/gro_cells.h | 29 ++++++++++++-----------------
1 file changed, 12 insertions(+), 17 deletions(-)
diff --git a/include/net/gro_cells.h b/include/net/gro_cells.h
index 734d9b5..0f712c0 100644
--- a/include/net/gro_cells.h
+++ b/include/net/gro_cells.h
@@ -8,25 +8,23 @@
struct gro_cell {
struct sk_buff_head napi_skbs;
struct napi_struct napi;
-} ____cacheline_aligned_in_smp;
+};
struct gro_cells {
- unsigned int gro_cells_mask;
- struct gro_cell *cells;
+ struct gro_cell __percpu *cells;
};
static inline void gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb)
{
- struct gro_cell *cell = gcells->cells;
+ struct gro_cell *cell;
struct net_device *dev = skb->dev;
- if (!cell || skb_cloned(skb) || !(dev->features & NETIF_F_GRO)) {
+ if (!gcells->cells || skb_cloned(skb) || !(dev->features & NETIF_F_GRO)) {
netif_rx(skb);
return;
}
- if (skb_rx_queue_recorded(skb))
- cell += skb_get_rx_queue(skb) & gcells->gro_cells_mask;
+ cell = this_cpu_ptr(gcells->cells);
if (skb_queue_len(&cell->napi_skbs) > netdev_max_backlog) {
atomic_long_inc(&dev->rx_dropped);
@@ -72,15 +70,12 @@ static inline int gro_cells_init(struct gro_cells *gcells, struct net_device *de
{
int i;
- gcells->gro_cells_mask = roundup_pow_of_two(netif_get_num_default_rss_queues()) - 1;
- gcells->cells = kcalloc(gcells->gro_cells_mask + 1,
- sizeof(struct gro_cell),
- GFP_KERNEL);
+ gcells->cells = alloc_percpu(struct gro_cell);
if (!gcells->cells)
return -ENOMEM;
- for (i = 0; i <= gcells->gro_cells_mask; i++) {
- struct gro_cell *cell = gcells->cells + i;
+ for_each_possible_cpu(i) {
+ struct gro_cell *cell = per_cpu_ptr(gcells->cells, i);
skb_queue_head_init(&cell->napi_skbs);
netif_napi_add(dev, &cell->napi, gro_cell_poll, 64);
@@ -91,16 +86,16 @@ static inline int gro_cells_init(struct gro_cells *gcells, struct net_device *de
static inline void gro_cells_destroy(struct gro_cells *gcells)
{
- struct gro_cell *cell = gcells->cells;
int i;
- if (!cell)
+ if (!gcells->cells)
return;
- for (i = 0; i <= gcells->gro_cells_mask; i++,cell++) {
+ for_each_possible_cpu(i) {
+ struct gro_cell *cell = per_cpu_ptr(gcells->cells, i);
netif_napi_del(&cell->napi);
skb_queue_purge(&cell->napi_skbs);
}
- kfree(gcells->cells);
+ free_percpu(gcells->cells);
gcells->cells = NULL;
}
--
1.8.1
* Re: [PATCH v3 net-next 1/1] ip_tunnel: Create percpu gro_cell
2015-01-16 18:11 ` [PATCH v3 net-next 1/1] " Martin KaFai Lau
@ 2015-01-16 19:38 ` Eric Dumazet
2015-01-18 6:57 ` David Miller
From: Eric Dumazet @ 2015-01-16 19:38 UTC
To: Martin KaFai Lau; +Cc: netdev, kernel-team
On Fri, 2015-01-16 at 10:11 -0800, Martin KaFai Lau wrote:
> In the ipip tunnel, skb->queue_mapping is lost in ipip_rcv(), so all
> skbs are queued to the same cell->napi_skbs and gro_cell_poll is
> pinned to one core under load. In production traffic, we also see
> severe rx_dropped on the tunl interface, probably because of this
> limit: skb_queue_len(&cell->napi_skbs) > netdev_max_backlog.
>
> This patch allocates the gro_cells with alloc_percpu(struct gro_cell)
> and schedules gro_cell_poll to process the skbs on the same core that
> received them.
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---
> include/net/gro_cells.h | 29 ++++++++++++-----------------
> 1 file changed, 12 insertions(+), 17 deletions(-)
Acked-by: Eric Dumazet <edumazet@google.com>
Thanks
* Re: [PATCH v3 net-next 1/1] ip_tunnel: Create percpu gro_cell
2015-01-16 18:11 ` [PATCH v3 net-next 1/1] " Martin KaFai Lau
2015-01-16 19:38 ` Eric Dumazet
@ 2015-01-18 6:57 ` David Miller
From: David Miller @ 2015-01-18 6:57 UTC
To: kafai; +Cc: netdev, eric.dumazet, kernel-team
From: Martin KaFai Lau <kafai@fb.com>
Date: Fri, 16 Jan 2015 10:11:00 -0800
> In the ipip tunnel, skb->queue_mapping is lost in ipip_rcv(), so all
> skbs are queued to the same cell->napi_skbs and gro_cell_poll is
> pinned to one core under load. In production traffic, we also see
> severe rx_dropped on the tunl interface, probably because of this
> limit: skb_queue_len(&cell->napi_skbs) > netdev_max_backlog.
>
> This patch allocates the gro_cells with alloc_percpu(struct gro_cell)
> and schedules gro_cell_poll to process the skbs on the same core that
> received them.
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Applied, thanks.