From: Matteo Croce <mcroce@redhat.com>
To: netdev@vger.kernel.org
Cc: Jay Vosburgh <j.vosburgh@gmail.com>,
Veaceslav Falico <vfalico@gmail.com>,
Andy Gospodarek <andy@greyhouse.net>,
"David S. Miller" <davem@davemloft.net>,
Stanislav Fomichev <sdf@google.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Song Liu <songliubraving@fb.com>,
Alexei Starovoitov <ast@kernel.org>,
Paul Blakey <paulb@mellanox.com>,
linux-kernel@vger.kernel.org
Subject: [PATCH net-next 4/4] bonding: balance ICMP echoes in layer3+4 mode
Date: Mon, 21 Oct 2019 22:09:48 +0200
Message-ID: <20191021200948.23775-5-mcroce@redhat.com>
In-Reply-To: <20191021200948.23775-1-mcroce@redhat.com>
The bonding driver uses the L4 ports to balance flows between slaves.
As ICMP has no ports, all ICMP packets are sent through the same
device:
# tcpdump -qltnni veth0 ip |sed 's/^/0: /' &
# tcpdump -qltnni veth1 ip |sed 's/^/1: /' &
# ping -qc1 192.168.0.2
1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 315, seq 1, length 64
1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 315, seq 1, length 64
# ping -qc1 192.168.0.2
1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 316, seq 1, length 64
1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 316, seq 1, length 64
# ping -qc1 192.168.0.2
1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 317, seq 1, length 64
1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 317, seq 1, length 64
However, some ICMP packets, such as echo requests and replies, carry an
Identifier field that is used to match packets within a session. Use
this value in the hash function to balance these packets between bond
slaves:
# ping -qc1 192.168.0.2
0: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 303, seq 1, length 64
0: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 303, seq 1, length 64
# ping -qc1 192.168.0.2
1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 304, seq 1, length 64
1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 304, seq 1, length 64
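As a rough illustration of where the Identifier lives, here is a minimal
userspace sketch, not the kernel dissector. It assumes a raw ICMPv4 header
laid out as in RFC 792 (type, code, checksum, identifier, sequence) and only
knows the echo and timestamp types. The names sketch_icmp_has_id() and
sketch_icmp_id() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ICMPv4 message types that carry an Identifier field (RFC 792). */
enum {
	SKETCH_ICMP_ECHOREPLY      = 0,
	SKETCH_ICMP_ECHO           = 8,
	SKETCH_ICMP_TIMESTAMP      = 13,
	SKETCH_ICMP_TIMESTAMPREPLY = 14,
};

/* Hypothetical helper: does this ICMPv4 type have an Identifier? */
bool sketch_icmp_has_id(uint8_t type)
{
	switch (type) {
	case SKETCH_ICMP_ECHO:
	case SKETCH_ICMP_ECHOREPLY:
	case SKETCH_ICMP_TIMESTAMP:
	case SKETCH_ICMP_TIMESTAMPREPLY:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical helper: return the Identifier in host byte order, or 0
 * when the header is too short or the type has no Identifier.
 * Header layout: type(1) code(1) checksum(2) identifier(2) sequence(2).
 */
uint16_t sketch_icmp_id(const uint8_t *hdr, size_t len)
{
	assert(hdr != NULL);

	if (len < 8 || !sketch_icmp_has_id(hdr[0]))
		return 0;

	return (uint16_t)(hdr[4] << 8 | hdr[5]);
}
```

For an echo request whose identifier bytes are 0x01 0x2f, sketch_icmp_id()
returns 303. For a destination unreachable message (type 3) it returns 0,
so such a packet would still fall back to the port-less hash.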
Signed-off-by: Matteo Croce <mcroce@redhat.com>
---
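For readers who want to experiment with the idea outside the kernel, the
following is a simplified userspace model of the hash selection in this
patch, written under several assumptions: struct sketch_flow and both
functions are hypothetical stand-ins, addresses are plain host-order
integers, the ICMP word is packed explicitly instead of with memcpy() (so
the result does not depend on endianness), and only the folding step visible
in the hunk is modeled. The final modulo by the slave count is likewise an
assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few flow key fields used by the hash. */
struct sketch_flow {
	uint32_t src;        /* IPv4 source address, host order here */
	uint32_t dst;        /* IPv4 destination address, host order here */
	uint32_t ports;      /* packed L4 ports, 0 when the protocol has none */
	uint8_t  icmp_type;
	uint8_t  icmp_code;
	uint16_t icmp_id;    /* 0 means "no identifier found" */
};

/* Seed the hash from the ICMP fields when an id is present, else from ports. */
uint32_t sketch_xmit_hash(const struct sketch_flow *f)
{
	uint32_t hash;

	if (f->icmp_id)
		hash = (uint32_t)f->icmp_id << 16 |
		       (uint32_t)f->icmp_code << 8 |
		       f->icmp_type;
	else
		hash = f->ports;

	hash ^= f->dst ^ f->src;
	hash ^= hash >> 16;	/* the only folding step shown in the hunk */

	return hash;
}

/* Assumed slave selection: hash modulo the number of slaves. */
unsigned int sketch_pick_slave(const struct sketch_flow *f,
			       unsigned int slave_cnt)
{
	assert(slave_cnt > 0);

	return sketch_xmit_hash(f) % slave_cnt;
}
```

With src 0xC0A80001, dst 0xC0A80002, an echo request (type 8, code 0) and
two slaves, this model selects slave 0 for id 303 and slave 1 for id 304,
while a flow with neither ports nor id always hashes to the same value. It
also shows a property of the check on flow.icmp.id: an echo with
identifier 0 is indistinguishable from a packet without one and takes the
ports branch.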
drivers/net/bonding/bond_main.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 21d8fcc83c9c..83afb03f4d07 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3267,6 +3267,8 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
 		return skb_flow_dissect_flow_keys(skb, fk, 0);
 
 	fk->ports.ports = 0;
+	fk->icmp.icmp = 0;
+	fk->icmp.id = 0;
 	noff = skb_network_offset(skb);
 	if (skb->protocol == htons(ETH_P_IP)) {
 		if (unlikely(!pskb_may_pull(skb, noff + sizeof(*iph))))
@@ -3286,8 +3288,14 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
 	} else {
 		return false;
 	}
-	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 && proto >= 0)
-		fk->ports.ports = skb_flow_get_ports(skb, noff, proto);
+	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 && proto >= 0) {
+		if (proto == IPPROTO_ICMP || proto == IPPROTO_ICMPV6)
+			skb_flow_get_icmp_tci(skb, &fk->icmp, skb->data,
+					      skb_transport_offset(skb),
+					      skb_headlen(skb));
+		else
+			fk->ports.ports = skb_flow_get_ports(skb, noff, proto);
+	}
 
 	return true;
 }
@@ -3314,10 +3322,14 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
 		return bond_eth_hash(skb);
 
 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER23 ||
-	    bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23)
+	    bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23) {
 		hash = bond_eth_hash(skb);
-	else
-		hash = (__force u32)flow.ports.ports;
+	} else {
+		if (flow.icmp.id)
+			memcpy(&hash, &flow.icmp, sizeof(hash));
+		else
+			memcpy(&hash, &flow.ports.ports, sizeof(hash));
+	}
 	hash ^= (__force u32)flow_get_u32_dst(&flow) ^
 		(__force u32)flow_get_u32_src(&flow);
 	hash ^= (hash >> 16);
--
2.21.0
Thread overview: 18+ messages
2019-10-21 20:09 [PATCH net-next 0/4] ICMP flow improvements Matteo Croce
2019-10-21 20:09 ` [PATCH net-next 1/4] flow_dissector: add meaningful comments Matteo Croce
2019-10-23 9:57 ` Simon Horman
2019-10-21 20:09 ` [PATCH net-next 2/4] flow_dissector: skip the ICMP dissector for non ICMP packets Matteo Croce
2019-10-23 9:57 ` Simon Horman
2019-10-21 20:09 ` [PATCH net-next 3/4] flow_dissector: extract more ICMP information Matteo Croce
2019-10-23 10:00 ` Simon Horman
2019-10-23 10:53 ` Matteo Croce
2019-10-23 17:55 ` Simon Horman
2019-10-25 0:27 ` Matteo Croce
2019-10-25 6:28 ` Simon Horman
2019-10-25 18:24 ` Matteo Croce
2019-10-26 7:55 ` Simon Horman
2019-10-21 20:09 ` Matteo Croce [this message]
2019-10-23 10:01 ` [PATCH net-next 4/4] bonding: balance ICMP echoes in layer3+4 mode Simon Horman
2019-10-23 16:58 ` Matteo Croce
2019-10-23 18:00 ` Simon Horman
2019-10-24 22:05 ` [PATCH net-next 0/4] ICMP flow improvements David Miller