* [RFC] net: distribute vxlan tunneled traffic across multiple TXQs
@ 2013-12-17 8:40 Sathya Perla
From: Sathya Perla @ 2013-12-17 8:40 UTC (permalink / raw)
To: netdev
TX traffic is distributed across multiple TXQs using skb->sk->sk_hash.
For vxlan skbs, the reference to the original socket (skb->sk) is replaced
with vxlan-sk. Because of this all tunneled traffic ends up only on one TXQ.

This patch uses the skb->rxhash field to carry the original sk->sk_hash
value so that it can be used by the netdev layer to pick a TXQ. If this
approach is agreeable, then we can rename skb->rxhash to skb->hash so
that it can be used in both the RX and TX paths.

But after a TXQ is picked based on skb->rxhash for tunneled traffic, its
index cannot be recorded in the original socket, as that socket's
reference is no longer available in the skb. So the TXQ index would need
to be computed (from skb->rxhash) for each skb. Any ideas on how this
can be avoided?
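
For reference, the per-skb cost this implies is small; a minimal
user-space sketch of scaling a 32-bit hash onto the TXQ range with a
multiply-shift (illustrative only, not the in-kernel helper, which also
remixes the hash before scaling):

#include <stdint.h>

/* Map a 32-bit flow hash to a queue index in [0, num_txqs) without a modulo. */
static uint16_t hash_to_txq(uint32_t hash, uint16_t num_txqs)
{
	return (uint16_t)(((uint64_t)hash * num_txqs) >> 32);
}
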
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
drivers/net/vxlan.c | 2 ++
net/core/flow_dissector.c | 6 ++++--
net/ipv4/ip_tunnel_core.c | 1 -
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 58f6a0c..f4e4a83 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1572,6 +1572,8 @@ int vxlan_xmit_skb(struct vxlan_sock *vs,
 	uh->len = htons(skb->len);
 	uh->check = 0;
 
+	if (skb->sk && skb->sk->sk_hash)
+		skb->rxhash = skb->sk->sk_hash;
 	vxlan_set_owner(vs->sock->sk, skb);
 
 	err = handle_offloads(skb);
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index d6ef173..5a5ae5a 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -260,7 +260,9 @@ u16 __skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb,
 		qcount = dev->tc_to_txq[tc].count;
 	}
 
-	if (skb->sk && skb->sk->sk_hash)
+	if (skb->encapsulation && skb->rxhash)
+		hash = skb->rxhash;
+	else if (skb->sk && skb->sk->sk_hash)
 		hash = skb->sk->sk_hash;
 	else
 		hash = (__force u16) skb->protocol;
@@ -383,7 +385,7 @@ u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb)
 		if (new_index < 0)
 			new_index = skb_tx_hash(dev, skb);
 
-		if (queue_index != new_index && sk &&
+		if (queue_index != new_index && sk && !skb->encapsulation &&
 		    rcu_access_pointer(sk->sk_dst_cache))
 			sk_tx_queue_set(sk, new_index);
 
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 42ffbc8..183313b 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -56,7 +56,6 @@ int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
 	skb_scrub_packet(skb, xnet);
 
-	skb->rxhash = 0;
 	skb_dst_set(skb, &rt->dst);
 	memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
--
1.7.1
* Re: [RFC] net: distribute vxlan tunneled traffic across multiple TXQs
@ 2013-12-17 16:45 Eric Dumazet
From: Eric Dumazet @ 2013-12-17 16:45 UTC (permalink / raw)
To: Sathya Perla; +Cc: netdev
On Tue, 2013-12-17 at 14:10 +0530, Sathya Perla wrote:
> TX traffic is distributed across multiple TXQs using skb->sk->sk_hash.
> For vxlan skbs, the reference to the original socket (skb->sk) is replaced
> with vxlan-sk. Because of this all tunneled traffic ends up only on one TXQ.
>
> This patch uses the skb->rxhash field to carry the original sk->sk_hash
> value so that it can be used by the netdev layer to pick a TXQ. If this
> approach is agreeable, then we can rename skb->rxhash to skb->hash so
> that it can be used in both the RX and TX paths.
>
> But after a TXQ is picked based on skb->rxhash for tunneled traffic, its
> index cannot be recorded in the original socket, as that socket's
> reference is no longer available in the skb. So the TXQ index would need
> to be computed (from skb->rxhash) for each skb. Any ideas on how this
> can be avoided?
The real question is: why does vxlan need to set an skb destructor?
skb_orphan(skb) breaks TCP Small Queues and the FQ/pacing packet scheduler,
plus other things...
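
For context, skb_orphan() is roughly the following (a simplified sketch
of the include/linux/skbuff.h helper, trimmed for illustration): the
destructor that charged the skb to the sending socket runs immediately,
so TSQ/FQ lose sight of the packet long before it reaches the wire.

static inline void skb_orphan(struct sk_buff *skb)
{
	if (skb->destructor) {
		skb->destructor(skb);	/* e.g. tcp_wfree(): releases the sender's in-flight budget now */
		skb->destructor = NULL;
		skb->sk = NULL;		/* skb no longer points at the original socket */
	}
}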
* RE: [RFC] net: distribute vxlan tunneled traffic across multiple TXQs
@ 2013-12-19 7:43 Sathya Perla
From: Sathya Perla @ 2013-12-19 7:43 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev@vger.kernel.org
> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Tuesday, December 17, 2013 10:15 PM
> To: Sathya Perla
> Cc: netdev@vger.kernel.org
> Subject: Re: [RFC] net: distribute vxlan tunneled traffic across multiple TXQs
>
> On Tue, 2013-12-17 at 14:10 +0530, Sathya Perla wrote:
> > TX traffic is distributed across multiple TXQs using skb->sk->sk_hash.
> > For vxlan skbs, the reference to the original socket (skb->sk) is replaced
> > with vxlan-sk. Because of this all tunneled traffic ends up only on one TXQ.
> >
> > This patch uses the skb->rxhash field to carry the original sk->sk_hash
> > value so that it can be used by the netdev layer to pick a TXQ. If this
> > approach is agreeable, then we can rename skb->rxhash to skb->hash so
> > that it can be used in both the RX and TX paths.
> >
> > But after a TXQ is picked based on skb->rxhash for tunneled traffic, its
> > index cannot be recorded in the original socket, as that socket's
> > reference is no longer available in the skb. So the TXQ index would need
> > to be computed (from skb->rxhash) for each skb. Any ideas on how this
> > can be avoided?
>
> The real question is: why does vxlan need to set an skb destructor?
The need for a vxlan skb destructor is not apparent to me.
The code just bumps up vxlan-sk->refcnt and does nothing else.
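Roughly this (reconstructed from drivers/net/vxlan.c for illustration,
may not be verbatim):

static void vxlan_sock_put(struct sk_buff *skb)
{
	sock_put(skb->sk);		/* only drops the vxlan socket refcount */
}

static void vxlan_set_owner(struct sock *sk, struct sk_buff *skb)
{
	sock_hold(sk);			/* take a reference on the vxlan socket */
	skb_orphan(skb);		/* runs the original socket's destructor, clears skb->sk */
	skb->sk = sk;			/* skb now appears to be owned by the vxlan socket */
	skb->destructor = vxlan_sock_put;
}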
>
> skb_orphan(skb) breaks TCP Small Queues and the FQ/pacing packet scheduler,
> plus other things...
It also seems to violate the TCP wmem accounting of the original socket.
I'll test a patch removing the vxlan destructor and post it for comments.
thanks,
-Sathya