* [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
@ 2014-01-16 2:05 Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 1/6] bonding: always update last_arp_rx on arp recieve Veaceslav Falico
` (7 more replies)
0 siblings, 8 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-16 2:05 UTC (permalink / raw)
To: netdev; +Cc: Jay Vosburgh, Andy Gospodarek, David S. Miller, Veaceslav Falico
Hi,
Currently, if arp_validate is off (0), slave_last_rx() returns the
slave->dev->last_rx, which is always updated on *any* packet received by
slave, and not only arps. This means that, if the validation of arps is
off, we're treating *any* incoming packet as a proof of slave being up, and
not only arps.
This might seem logical at the first glance, however it can cause a lot of
troubles and false-positives, one example would be:
The arp_ip_target is NOT accessible, however someone in the broadcast domain
spams with any broadcast traffic. This way bonding will be tricked that the
slave is still up (as in - can access arp_ip_target), while it's not.
The documentation for arp_validate also states that *ARPs* will (not) be
validated if it's on/off, and that the arp monitoring works on arps as
traffic generators.
Also, the net_device->last_rx is already used in a lot of drivers (even
though the comment states to NOT do it :)), and it's also ugly to modify it
from bonding.
So, to fix this, remove the last_rx from bonding, *always* call
bond_arp_rcv() in slave's rx_handler (bond_handle_frame), and if we spot an
arp there - update the slave->last_arp_rx - and use it instead of
net_device->last_rx. Finally, rename slave_last_rx() to slave_last_arp_rx()
to reflect the changes.
As the changes touch really sensitive parts, I've tried to split them as
much as possible, for easier debugging/bisecting.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
drivers/net/bonding/bond_main.c | 18 ++++++++----------
drivers/net/bonding/bond_options.c | 12 ++----------
drivers/net/bonding/bonding.h | 16 ++++++----------
include/linux/netdevice.h | 8 +-------
4 files changed, 17 insertions(+), 37 deletions(-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next 1/6] bonding: always update last_arp_rx on arp recieve
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
@ 2014-01-16 2:05 ` Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 2/6] bonding: always set recv_probe to bond_arp_rcv in arp monitor Veaceslav Falico
` (6 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-16 2:05 UTC (permalink / raw)
To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
Currently we're updating the last_arp_rx only when we've validate the
packet, however afterwards we use it as 'ANY last arp received', but not
only validated ARPs.
Fix this by updating it even if the slave doesn't require arp validation.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
drivers/net/bonding/bond_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f2fe6cb..cfb37af 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2294,8 +2294,10 @@ int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond,
read_lock(&bond->lock);
- if (!slave_do_arp_validate(bond, slave))
+ if (!slave_do_arp_validate(bond, slave)) {
+ slave->last_arp_rx = jiffies;
goto out_unlock;
+ }
alen = arp_hdr_len(bond->dev);
--
1.8.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 2/6] bonding: always set recv_probe to bond_arp_rcv in arp monitor
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 1/6] bonding: always update last_arp_rx on arp recieve Veaceslav Falico
@ 2014-01-16 2:05 ` Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 3/6] bonding: use last_arp_rx in slave_last_rx() Veaceslav Falico
` (5 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-16 2:05 UTC (permalink / raw)
To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
Currently we only set bond_arp_rcv() if we're using arp_validate, however
this makes us skip updating last_arp_rx if we're not validating incoming
ARPs - thus, if arp_validate is off, last_arp_rx will never be updated.
Fix this by always setting up recv_probe = bond_arp_rcv, even if we're not
using arp_validate.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
drivers/net/bonding/bond_main.c | 3 +--
drivers/net/bonding/bond_options.c | 12 ++----------
2 files changed, 3 insertions(+), 12 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index cfb37af..1e0f21a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3062,8 +3062,7 @@ static int bond_open(struct net_device *bond_dev)
if (bond->params.arp_interval) { /* arp interval, in milliseconds. */
queue_delayed_work(bond->wq, &bond->arp_work, 0);
- if (bond->params.arp_validate)
- bond->recv_probe = bond_arp_rcv;
+ bond->recv_probe = bond_arp_rcv;
}
if (bond->params.mode == BOND_MODE_8023AD) {
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 945a666..1bab20e 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -283,13 +283,11 @@ int bond_option_arp_interval_set(struct bonding *bond, int arp_interval)
* is called.
*/
if (!arp_interval) {
- if (bond->params.arp_validate)
- bond->recv_probe = NULL;
+ bond->recv_probe = NULL;
cancel_delayed_work_sync(&bond->arp_work);
} else {
/* arp_validate can be set only in active-backup mode */
- if (bond->params.arp_validate)
- bond->recv_probe = bond_arp_rcv;
+ bond->recv_probe = bond_arp_rcv;
cancel_delayed_work_sync(&bond->mii_work);
queue_delayed_work(bond->wq, &bond->arp_work, 0);
}
@@ -446,12 +444,6 @@ int bond_option_arp_validate_set(struct bonding *bond, int arp_validate)
bond->dev->name, arp_validate_tbl[arp_validate].modename,
arp_validate);
- if (bond->dev->flags & IFF_UP) {
- if (!arp_validate)
- bond->recv_probe = NULL;
- else if (bond->params.arp_interval)
- bond->recv_probe = bond_arp_rcv;
- }
bond->params.arp_validate = arp_validate;
return 0;
--
1.8.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 3/6] bonding: use last_arp_rx in slave_last_rx()
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 1/6] bonding: always update last_arp_rx on arp recieve Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 2/6] bonding: always set recv_probe to bond_arp_rcv in arp monitor Veaceslav Falico
@ 2014-01-16 2:05 ` Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 4/6] bonding: rename slave_last_rx() to slave_last_arp_rx() Veaceslav Falico
` (4 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-16 2:05 UTC (permalink / raw)
To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
Now that last_arp_rx really has the last time we've received any (validated or
not) ARP, we can use it in slave_last_rx() instead of slave->dev->last_rx.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
drivers/net/bonding/bonding.h | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 955dc48..1af8c1f 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -344,14 +344,10 @@ static inline unsigned long slave_oldest_target_arp_rx(struct bonding *bond,
static inline unsigned long slave_last_rx(struct bonding *bond,
struct slave *slave)
{
- if (slave_do_arp_validate(bond, slave)) {
- if (bond->params.arp_all_targets == BOND_ARP_TARGETS_ALL)
- return slave_oldest_target_arp_rx(bond, slave);
- else
- return slave->last_arp_rx;
- }
-
- return slave->dev->last_rx;
+ if (bond->params.arp_all_targets == BOND_ARP_TARGETS_ALL)
+ return slave_oldest_target_arp_rx(bond, slave);
+
+ return slave->last_arp_rx;
}
#ifdef CONFIG_NET_POLL_CONTROLLER
--
1.8.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 4/6] bonding: rename slave_last_rx() to slave_last_arp_rx()
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
` (2 preceding siblings ...)
2014-01-16 2:05 ` [PATCH net-next 3/6] bonding: use last_arp_rx in slave_last_rx() Veaceslav Falico
@ 2014-01-16 2:05 ` Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 5/6] bonding: use last_arp_rx in bond_loadbalance_arp_mon() Veaceslav Falico
` (3 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-16 2:05 UTC (permalink / raw)
To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
To make the code more readable.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
drivers/net/bonding/bond_main.c | 4 ++--
drivers/net/bonding/bonding.h | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 1e0f21a..399e691 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2347,7 +2347,7 @@ int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond,
if (bond_is_active_slave(slave))
bond_validate_arp(bond, slave, sip, tip);
else if (bond->curr_active_slave &&
- time_after(slave_last_rx(bond, bond->curr_active_slave),
+ time_after(slave_last_arp_rx(bond, bond->curr_active_slave),
bond->curr_active_slave->jiffies))
bond_validate_arp(bond, slave, tip, sip);
@@ -2504,7 +2504,7 @@ static int bond_ab_arp_inspect(struct bonding *bond)
bond_for_each_slave_rcu(bond, slave, iter) {
slave->new_link = BOND_LINK_NOCHANGE;
- last_rx = slave_last_rx(bond, slave);
+ last_rx = slave_last_arp_rx(bond, slave);
if (slave->link != BOND_LINK_UP) {
if (bond_time_in_interval(bond, last_rx, 1)) {
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 1af8c1f..99126b2 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -341,8 +341,8 @@ static inline unsigned long slave_oldest_target_arp_rx(struct bonding *bond,
return ret;
}
-static inline unsigned long slave_last_rx(struct bonding *bond,
- struct slave *slave)
+static inline unsigned long slave_last_arp_rx(struct bonding *bond,
+ struct slave *slave)
{
if (bond->params.arp_all_targets == BOND_ARP_TARGETS_ALL)
return slave_oldest_target_arp_rx(bond, slave);
--
1.8.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 5/6] bonding: use last_arp_rx in bond_loadbalance_arp_mon()
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
` (3 preceding siblings ...)
2014-01-16 2:05 ` [PATCH net-next 4/6] bonding: rename slave_last_rx() to slave_last_arp_rx() Veaceslav Falico
@ 2014-01-16 2:05 ` Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 6/6] bonding: remove useless updating of slave->dev->last_rx Veaceslav Falico
` (2 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-16 2:05 UTC (permalink / raw)
To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
Now that last_arp_rx correctly show the last time we've received an ARP, we
can use it safely instead of slave->dev->last_rx.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
drivers/net/bonding/bond_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 399e691..f5ac7e0 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2406,7 +2406,7 @@ static void bond_loadbalance_arp_mon(struct work_struct *work)
if (slave->link != BOND_LINK_UP) {
if (bond_time_in_interval(bond, trans_start, 1) &&
- bond_time_in_interval(bond, slave->dev->last_rx, 1)) {
+ bond_time_in_interval(bond, slave->last_arp_rx, 1)) {
slave->link = BOND_LINK_UP;
bond_set_active_slave(slave);
@@ -2435,7 +2435,7 @@ static void bond_loadbalance_arp_mon(struct work_struct *work)
* if we don't know our ip yet
*/
if (!bond_time_in_interval(bond, trans_start, 2) ||
- !bond_time_in_interval(bond, slave->dev->last_rx, 2)) {
+ !bond_time_in_interval(bond, slave->last_arp_rx, 2)) {
slave->link = BOND_LINK_DOWN;
bond_set_backup_slave(slave);
--
1.8.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 6/6] bonding: remove useless updating of slave->dev->last_rx
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
` (4 preceding siblings ...)
2014-01-16 2:05 ` [PATCH net-next 5/6] bonding: use last_arp_rx in bond_loadbalance_arp_mon() Veaceslav Falico
@ 2014-01-16 2:05 ` Veaceslav Falico
2014-01-16 5:09 ` [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Jay Vosburgh
2014-01-16 5:53 ` David Miller
7 siblings, 0 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-16 2:05 UTC (permalink / raw)
To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek, David S. Miller
Now that all the logic is handled via last_arp_rx, we don't need to use
last_rx.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
drivers/net/bonding/bond_main.c | 3 ---
include/linux/netdevice.h | 8 +-------
2 files changed, 1 insertion(+), 10 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f5ac7e0..f9e5512 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1163,9 +1163,6 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
slave = bond_slave_get_rcu(skb->dev);
bond = slave->bond;
- if (bond->params.arp_interval)
- slave->dev->last_rx = jiffies;
-
recv_probe = ACCESS_ONCE(bond->recv_probe);
if (recv_probe) {
ret = recv_probe(skb, bond, slave);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 30f6513..2016f00 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1299,13 +1299,7 @@ struct net_device {
/*
* Cache lines mostly used on receive path (including eth_type_trans())
*/
- unsigned long last_rx; /* Time of last Rx
- * This should not be set in
- * drivers, unless really needed,
- * because network stack (bonding)
- * use it if/when necessary, to
- * avoid dirtying this cache line.
- */
+ unsigned long last_rx; /* Time of last Rx */
/* Interface address info used in eth_type_trans() */
unsigned char *dev_addr; /* hw address, (before bcast
--
1.8.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
` (5 preceding siblings ...)
2014-01-16 2:05 ` [PATCH net-next 6/6] bonding: remove useless updating of slave->dev->last_rx Veaceslav Falico
@ 2014-01-16 5:09 ` Jay Vosburgh
2014-01-16 6:01 ` David Miller
2014-01-16 8:41 ` Veaceslav Falico
2014-01-16 5:53 ` David Miller
7 siblings, 2 replies; 15+ messages in thread
From: Jay Vosburgh @ 2014-01-16 5:09 UTC (permalink / raw)
To: Veaceslav Falico; +Cc: netdev, Andy Gospodarek, David S. Miller
Veaceslav Falico <vfalico@redhat.com> wrote:
>Currently, if arp_validate is off (0), slave_last_rx() returns the
>slave->dev->last_rx, which is always updated on *any* packet received by
>slave, and not only arps. This means that, if the validation of arps is
>off, we're treating *any* incoming packet as a proof of slave being up, and
>not only arps.
The "any incoming packet" part is intentional.
>This might seem logical at the first glance, however it can cause a lot of
>troubles and false-positives, one example would be:
>
>The arp_ip_target is NOT accessible, however someone in the broadcast domain
>spams with any broadcast traffic. This way bonding will be tricked that the
>slave is still up (as in - can access arp_ip_target), while it's not.
This type of situation is why arp_validate was added.
The specific situation was when multiple hosts using bonding
with the ARP monitor were set up behind a common gateway (in the same
Ethernet broadcast domain). The arp_ip_target is unreachable for
whatever reason. In that case, the various bonding instances on the
different hosts will each issue broadcast ARP requests, and (in the
absence of arp_validate) those requests would trick the other bonds into
believing that they are up.
I don't think this patch set will resolve that problem, since
you explicitly permit any incoming ARP to count.
>The documentation for arp_validate also states that *ARPs* will (not) be
>validated if it's on/off, and that the arp monitoring works on arps as
>traffic generators.
I wrote most of that text in the documentation, and the intent
was not to imply that only ARPs should count for "up-ness" even without
arp_validate enabled. The intent was to distinguish it from
"non-validate," in which any incoming traffic counted for "up-ness."
The main reason for preserving the non-validate behavior (any
traffic counts) is for the loadbalance (xor and rr) modes. In those
modes, the switch decides which slave receives the incoming traffic, and
so it's to our advantage to permit any incoming traffic to count for
"up-ness." The arp_validate option is not allowed in these modes
because it won't work.
With these changes, I suspect that the loadbalance ARP monitor
will be less reliable with these changes (granted that it's already a
bit dodgy in its dependence on the switch to hit all slaves with
incoming packets regularly). Particularly if the switch ports are
configured into an Etherchannel ("static link aggregation") group, in
which case only one slave will receive any given frame (broadcast /
multicast traffic will not be duplicated across all slaves).
I'm not sure that this change (the "only count ARPs even without
arp_validate" bit) won't break existing configurations. Did you test
the -rr and -xor modes with ARP monitor after your changes (with and
without configuring a channel group on the switch ports)?
>Also, the net_device->last_rx is already used in a lot of drivers (even
>though the comment states to NOT do it :)), and it's also ugly to modify it
>from bonding.
I didn't check, but I suspect those are mostly leftovers from
the distant past, when the drivers were expected to update last_rx, or
perhaps drivers using it for their own purposes.
I don't really see an issue in decoupling bonding from the
net_device->last_rx; it's pretty much the same thing that was done for
trans_start some time ago.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
>So, to fix this, remove the last_rx from bonding, *always* call
>bond_arp_rcv() in slave's rx_handler (bond_handle_frame), and if we spot an
>arp there - update the slave->last_arp_rx - and use it instead of
>net_device->last_rx. Finally, rename slave_last_rx() to slave_last_arp_rx()
>to reflect the changes.
>
>As the changes touch really sensitive parts, I've tried to split them as
>much as possible, for easier debugging/bisecting.
>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>CC: "David S. Miller" <davem@davemloft.net>
>CC: netdev@vger.kernel.org
>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>
>---
> drivers/net/bonding/bond_main.c | 18 ++++++++----------
> drivers/net/bonding/bond_options.c | 12 ++----------
> drivers/net/bonding/bonding.h | 16 ++++++----------
> include/linux/netdevice.h | 8 +-------
> 4 files changed, 17 insertions(+), 37 deletions(-)
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
` (6 preceding siblings ...)
2014-01-16 5:09 ` [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Jay Vosburgh
@ 2014-01-16 5:53 ` David Miller
7 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2014-01-16 5:53 UTC (permalink / raw)
To: vfalico; +Cc: netdev, fubar, andy
From: Veaceslav Falico <vfalico@redhat.com>
Date: Thu, 16 Jan 2014 03:05:10 +0100
> Currently, if arp_validate is off (0), slave_last_rx() returns the
> slave->dev->last_rx, which is always updated on *any* packet received by
> slave, and not only arps. This means that, if the validation of arps is
> off, we're treating *any* incoming packet as a proof of slave being up, and
> not only arps.
>
> This might seem logical at the first glance, however it can cause a lot of
> troubles and false-positives, one example would be:
>
> The arp_ip_target is NOT accessible, however someone in the broadcast domain
> spams with any broadcast traffic. This way bonding will be tricked that the
> slave is still up (as in - can access arp_ip_target), while it's not.
>
> The documentation for arp_validate also states that *ARPs* will (not) be
> validated if it's on/off, and that the arp monitoring works on arps as
> traffic generators.
>
> Also, the net_device->last_rx is already used in a lot of drivers (even
> though the comment states to NOT do it :)), and it's also ugly to modify it
> from bonding.
>
> So, to fix this, remove the last_rx from bonding, *always* call
> bond_arp_rcv() in slave's rx_handler (bond_handle_frame), and if we spot an
> arp there - update the slave->last_arp_rx - and use it instead of
> net_device->last_rx. Finally, rename slave_last_rx() to slave_last_arp_rx()
> to reflect the changes.
>
> As the changes touch really sensitive parts, I've tried to split them as
> much as possible, for easier debugging/bisecting.
>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
This series looks really good, applied, thanks Veaceslav.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
2014-01-16 5:09 ` [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Jay Vosburgh
@ 2014-01-16 6:01 ` David Miller
2014-01-17 8:02 ` Veaceslav Falico
2014-01-16 8:41 ` Veaceslav Falico
1 sibling, 1 reply; 15+ messages in thread
From: David Miller @ 2014-01-16 6:01 UTC (permalink / raw)
To: fubar; +Cc: vfalico, netdev, andy
From: Jay Vosburgh <fubar@us.ibm.com>
Date: Wed, 15 Jan 2014 21:09:57 -0800
> The main reason for preserving the non-validate behavior (any
> traffic counts) is for the loadbalance (xor and rr) modes. In those
> modes, the switch decides which slave receives the incoming traffic, and
> so it's to our advantage to permit any incoming traffic to count for
> "up-ness." The arp_validate option is not allowed in these modes
> because it won't work.
>
> With these changes, I suspect that the loadbalance ARP monitor
> will be less reliable with these changes (granted that it's already a
> bit dodgy in its dependence on the switch to hit all slaves with
> incoming packets regularly). Particularly if the switch ports are
> configured into an Etherchannel ("static link aggregation") group, in
> which case only one slave will receive any given frame (broadcast /
> multicast traffic will not be duplicated across all slaves).
>
> I'm not sure that this change (the "only count ARPs even without
> arp_validate" bit) won't break existing configurations. Did you test
> the -rr and -xor modes with ARP monitor after your changes (with and
> without configuring a channel group on the switch ports)?
Sorry Jay, I only read this just now. I won't push these changes
out until you've had some time to discuss them.
To my untrained eye they looked rather straightforward :-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
2014-01-16 5:09 ` [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Jay Vosburgh
2014-01-16 6:01 ` David Miller
@ 2014-01-16 8:41 ` Veaceslav Falico
2014-01-16 22:38 ` Jay Vosburgh
1 sibling, 1 reply; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-16 8:41 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev, Andy Gospodarek, David S. Miller
On Wed, Jan 15, 2014 at 09:09:57PM -0800, Jay Vosburgh wrote:
>Veaceslav Falico <vfalico@redhat.com> wrote:
>
>>Currently, if arp_validate is off (0), slave_last_rx() returns the
>>slave->dev->last_rx, which is always updated on *any* packet received by
>>slave, and not only arps. This means that, if the validation of arps is
>>off, we're treating *any* incoming packet as a proof of slave being up, and
>>not only arps.
>
> The "any incoming packet" part is intentional.
>
>>This might seem logical at the first glance, however it can cause a lot of
>>troubles and false-positives, one example would be:
>>
>>The arp_ip_target is NOT accessible, however someone in the broadcast domain
>>spams with any broadcast traffic. This way bonding will be tricked that the
>>slave is still up (as in - can access arp_ip_target), while it's not.
>
> This type of situation is why arp_validate was added.
>
> The specific situation was when multiple hosts using bonding
>with the ARP monitor were set up behind a common gateway (in the same
>Ethernet broadcast domain). The arp_ip_target is unreachable for
>whatever reason. In that case, the various bonding instances on the
>different hosts will each issue broadcast ARP requests, and (in the
>absence of arp_validate) those requests would trick the other bonds into
>believing that they are up.
>
> I don't think this patch set will resolve that problem, since
>you explicitly permit any incoming ARP to count.
I've said it was tricky at first glance :).
For this situation the arp_validate is *indeed* the cure. I'm not disabling
(or even working with) how arp_validate=1/2 works, I'm working with
arp_validate == 0.
Before the patchset (with arp_validate == 0 ):
*Any* packet (arp and non-arp) will signal us that the slave is up -
because we use slave->dev->last_rx (updated on *every* incoming packet in
bond_handle_frame).
After the patchset (with arp_validate == 0 ):
*ONLY ARP* packets signal us that the slave is up - because we use
slave->last_arp_rx that is updated every time we see an ARP packet.
The way the modes work with arp_validate > 0 don't change :), as we're
updating slave->last_arp_rx the old way in this case - after validation.
So that the scenario you've described still works flawlessly, and now
already we won't be tricked by some weird broadcast traffic even with
arp_validate == 0.
>
>>The documentation for arp_validate also states that *ARPs* will (not) be
>>validated if it's on/off, and that the arp monitoring works on arps as
>>traffic generators.
>
> I wrote most of that text in the documentation, and the intent
>was not to imply that only ARPs should count for "up-ness" even without
>arp_validate enabled. The intent was to distinguish it from
>"non-validate," in which any incoming traffic counted for "up-ness."
>
> The main reason for preserving the non-validate behavior (any
>traffic counts) is for the loadbalance (xor and rr) modes. In those
>modes, the switch decides which slave receives the incoming traffic, and
>so it's to our advantage to permit any incoming traffic to count for
>"up-ness." The arp_validate option is not allowed in these modes
>because it won't work.
>
> With these changes, I suspect that the loadbalance ARP monitor
>will be less reliable with these changes (granted that it's already a
>bit dodgy in its dependence on the switch to hit all slaves with
>incoming packets regularly). Particularly if the switch ports are
>configured into an Etherchannel ("static link aggregation") group, in
>which case only one slave will receive any given frame (broadcast /
>multicast traffic will not be duplicated across all slaves).
The non-AB modes also gave me a headache, however after thinking a bit I've
decided to change them also (mainly, it's the change of
arp_loadbalance_mon function).
The usual usage, however, is to generate traffic via arps. If we don't see
arp replies - this means that arp_ip_target is down, and thus the slave is
down.
>
> I'm not sure that this change (the "only count ARPs even without
>arp_validate" bit) won't break existing configurations. Did you test
>the -rr and -xor modes with ARP monitor after your changes (with and
>without configuring a channel group on the switch ports)?
Sure, all works fine, afaics. Obviously, these were basic tests, and bugs
might exist.
The only possible scenario of breakage for someone, from my POV, is:
1) arp monitor is used with loadbalance mode
2) arp_ip_targets are set but _any_ arp replies are never received
3) the user relies on that every slave will receive at least one packet per
arp_interval
This use case:
1) contradicts with documentation
2) contradicts with logic (arp monitor, arp ip targets etc. are used
without, actually, meaning something)
3) is really unstable
In this case, indeed, it won't work. Two points, though:
1) It shouldn't work in the first place, is unstable etc.
2) Can be easily fixed by the following oneliner (though I *really* woudn't
like to do it, as it's useless and dangerous). Basically, for non-ab mode
we set last_arp_rx for every packet. Again, I wouldn't like to do it, but
if you know any use case scenario for this usage (no working arps but
continuous receive traffic) - I can send it as v2 or 7/6 patch (whatever
suits David better, as he already applied it but didn't push).
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 0f613ae..87358e5 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2286,6 +2286,11 @@ int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond,
__be32 sip, tip;
int alen;
+ if (!USES_PRIMARY(bond->params.mode)) {
+ slave->last_arp_rx = jiffies;
+ return RX_HANDLER_ANOTHER;
+ }
+
if (skb->protocol != __cpu_to_be16(ETH_P_ARP))
return RX_HANDLER_ANOTHER;
>
>>Also, the net_device->last_rx is already used in a lot of drivers (even
>>though the comment states to NOT do it :)), and it's also ugly to modify it
>>from bonding.
>
> I didn't check, but I suspect those are mostly leftovers from
>the distant past, when the drivers were expected to update last_rx, or
>perhaps drivers using it for their own purposes.
It's really a mix. Somebody just updates them, somebody uses it for their
own purposes etc.
>
> I don't really see an issue in decoupling bonding from the
>net_device->last_rx; it's pretty much the same thing that was done for
>trans_start some time ago.
trans_start removal is also in queue :). Though still needs some
polishing...
>
> -J
>
>---
> -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
>
>
>>So, to fix this, remove the last_rx from bonding, *always* call
>>bond_arp_rcv() in slave's rx_handler (bond_handle_frame), and if we spot an
>>arp there - update the slave->last_arp_rx - and use it instead of
>>net_device->last_rx. Finally, rename slave_last_rx() to slave_last_arp_rx()
>>to reflect the changes.
>>
>>As the changes touch really sensitive parts, I've tried to split them as
>>much as possible, for easier debugging/bisecting.
>>
>>CC: Jay Vosburgh <fubar@us.ibm.com>
>>CC: Andy Gospodarek <andy@greyhouse.net>
>>CC: "David S. Miller" <davem@davemloft.net>
>>CC: netdev@vger.kernel.org
>>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>>
>>---
>> drivers/net/bonding/bond_main.c | 18 ++++++++----------
>> drivers/net/bonding/bond_options.c | 12 ++----------
>> drivers/net/bonding/bonding.h | 16 ++++++----------
>> include/linux/netdevice.h | 8 +-------
>> 4 files changed, 17 insertions(+), 37 deletions(-)
>>
>
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
2014-01-16 8:41 ` Veaceslav Falico
@ 2014-01-16 22:38 ` Jay Vosburgh
2014-01-17 6:57 ` Veaceslav Falico
0 siblings, 1 reply; 15+ messages in thread
From: Jay Vosburgh @ 2014-01-16 22:38 UTC (permalink / raw)
To: Veaceslav Falico; +Cc: netdev, Andy Gospodarek, David S. Miller
Veaceslav Falico <vfalico@redhat.com> wrote:
>On Wed, Jan 15, 2014 at 09:09:57PM -0800, Jay Vosburgh wrote:
>>Veaceslav Falico <vfalico@redhat.com> wrote:
>>
>>>Currently, if arp_validate is off (0), slave_last_rx() returns the
>>>slave->dev->last_rx, which is always updated on *any* packet received by
>>>slave, and not only arps. This means that, if the validation of arps is
>>>off, we're treating *any* incoming packet as a proof of slave being up, and
>>>not only arps.
>>
>> The "any incoming packet" part is intentional.
>>
>>>This might seem logical at the first glance, however it can cause a lot of
>>>troubles and false-positives, one example would be:
>>>
>>>The arp_ip_target is NOT accessible, however someone in the broadcast domain
>>>spams with any broadcast traffic. This way bonding will be tricked that the
>>>slave is still up (as in - can access arp_ip_target), while it's not.
>>
>> This type of situation is why arp_validate was added.
>>
>> The specific situation was when multiple hosts using bonding
>>with the ARP monitor were set up behind a common gateway (in the same
>>Ethernet broadcast domain). The arp_ip_target is unreachable for
>>whatever reason. In that case, the various bonding instances on the
>>different hosts will each issue broadcast ARP requests, and (in the
>>absence of arp_validate) those requests would trick the other bonds into
>>believing that they are up.
>>
>> I don't think this patch set will resolve that problem, since
>>you explicitly permit any incoming ARP to count.
>
>I've said it was tricky at first glance :).
>
>For this situation the arp_validate is *indeed* the cure. I'm not disabling
>(or even working with) how arp_validate=1/2 works, I'm working with
>arp_validate == 0.
>
>Before the patchset (with arp_validate == 0 ):
>
>*Any* packet (arp and non-arp) will signal us that the slave is up -
>because we use slave->dev->last_rx (updated on *every* incoming packet in
>bond_handle_frame).
>
>After the patchset (with arp_validate == 0 ):
>
>*ONLY ARP* packets signal us that the slave is up - because we use
>slave->last_arp_rx that is updated every time we see an ARP packet.
But that's effectively making arp_validate the only setting if
the host is on a quiet network segment without much other host ARP
traffic. This may not be an issue for the active-backup mode, but the
load balance modes will have serious issues (more on that below).
Actually, thinking about it, for active-backup, if a backup
slave is in a different Ethernet broadcast domain than the active slave,
your change will likely break those configurations. With arp_validate
enabled or your change applied, the backup slaves in active-backup
depend on the ARP request broadcasts to be forwarded by the switch to
the backup slave. Currently, without arp_validate, that dependency is
not there; the backup slave can receive any traffic (although in most
configurations, the ARP broadcast is what it will receive).
That would be a legal (if very odd) bonding configuration. I'm
not aware of anybody seting things up that way.
>The way the modes work with arp_validate > 0 don't change :), as we're
>updating slave->last_arp_rx the old way in this case - after validation.
>
>So that the scenario you've described still works flawlessly, and now
>already we won't be tricked by some weird broadcast traffic even with
>arp_validate == 0.
So why not just enable arp_validate and get the same effect?
>>>The documentation for arp_validate also states that *ARPs* will (not) be
>>>validated if it's on/off, and that the arp monitoring works on arps as
>>>traffic generators.
>>
>> I wrote most of that text in the documentation, and the intent
>>was not to imply that only ARPs should count for "up-ness" even without
>>arp_validate enabled. The intent was to distinguish it from
>>"non-validate," in which any incoming traffic counted for "up-ness."
>>
>> The main reason for preserving the non-validate behavior (any
>>traffic counts) is for the loadbalance (xor and rr) modes. In those
>>modes, the switch decides which slave receives the incoming traffic, and
>>so it's to our advantage to permit any incoming traffic to count for
>>"up-ness." The arp_validate option is not allowed in these modes
>>because it won't work.
>>
>> With these changes, I suspect that the loadbalance ARP monitor
>>will be less reliable with these changes (granted that it's already a
>>bit dodgy in its dependence on the switch to hit all slaves with
>>incoming packets regularly). Particularly if the switch ports are
>>configured into an Etherchannel ("static link aggregation") group, in
>>which case only one slave will receive any given frame (broadcast /
>>multicast traffic will not be duplicated across all slaves).
>
>The non-AB modes also gave me a headache, however after thinking a bit I've
>decided to change them also (mainly, it's the change of
>arp_loadbalance_mon function).
I had that same headache when I was implementing the
arp_validate stuff. But I disagree with changing this.
>The usual usage, however, is to generate traffic via arps. If we don't see
>arp replies - this means that arp_ip_target is down, and thus the slave is
>down.
The issue with the loadbalance (-xor and -rr) modes is that the
incoming ARPs will be balanced by the switch, and won't be delivered to
all slaves, or at least not at a rate such that each slave sees an ARP
every arp_interval or so.
Currently, the loadbalance modes will work with the ARP monitor
as long as sufficient traffic volume (of any kind) flows across the
slaves; with your change, that will no longer be true.
In this case, the ARP monitor currently is essentially using
"slave can send and receive packets" as the test for availability.
>> I'm not sure that this change (the "only count ARPs even without
>>arp_validate" bit) won't break existing configurations. Did you test
>>the -rr and -xor modes with ARP monitor after your changes (with and
>>without configuring a channel group on the switch ports)?
>
>Sure, all works fine, afaics. Obviously, these were basic tests, and bugs
>might exist.
I set up bonding to run some tests. I used three slaves,
connected to a switch with those ports set for Etherchannel, balancing
according to source IP (Cisco port-channel load-balance src-ip). For
non-IP packets (like ARP), this will balance by source MAC. I set
arp_interval to 1000 (1 second), and used one arp_ip_target.
Without your patches, I can get all slaves to stay up with a
sufficient traffic load (ping, netperf, etc). This is dependent upon
the switch distributing the packets egressing the switch to all ports of
the Etherchannel. This is not by any means an ideal situation, but has
been the state for some time now.
With your patches, one slave stays up (it is receiving the ARP
replies from the arp_ip_target). Regular traffic of any kind does not
keep the second and third slaves up. In my opinion, this is a
regression from previous behavior.
How did you test this such that all slaves stayed up? My
suspicion is that you did not configure the switch ports for
Etherchannel (static link aggregation), and thus all broadcast ARP
requests were flooded to all slaves. This would mark all slaves up, but
not have any real reliance on ARP replies coming from the arp_ip_target
(i.e., if there were no ARP replies, the slaves would still remain up
due to the broadcast ARPs generated by the bond itself).
>The only possible scenario of breakage for someone, from my POV, is:
>
>1) arp monitor is used with loadbalance mode
>2) arp_ip_targets are set but _any_ arp replies are never received
>3) the user relies on that every slave will receive at least one packet per
>arp_interval
For 2, above, the issue is that, after your changes, each slave
must receive an ARP during each arp_interval.
>This use case:
>
>1) contradicts with documentation
>2) contradicts with logic (arp monitor, arp ip targets etc. are used
>without, actually, meaning something)
>3) is really unstable
On 1, I'll grant that the documentation is ambiguous, but the
acceptance of any incoming packet for the non-validate ARP monitor is
functioning as designed. At worst, this means the documentation should
be updated to be clearer, not that the code is functioning incorrectly
because it can be shown to contradict a reading of the documentation.
Heck, looking at the bonding.txt documentation, it still says
that device drivers have to update ->last_rx and ->trans_start, which
isn't true. So, yah, the documentation needs some work.
On 2, sure they mean something; (without arp_validate or this
change) the ARP request / replies are one source, but not the only
source, of traffic to determine slave state.
On 3, that is, indeed, the case, and this is mentioned in the
documentation somewhere (that a continuous flow of traffic is necessary
to maintain correct up/down status for the slaves). The whole point of
accepting any incoming frame (not just ARPs) is to permit this very
scenario to function at maximum efficiency, even if that's not 100%
reliable in all cases.
>In this case, indeed, it won't work. Two points, though:
>
>1) It shouldn't work in the first place, is unstable etc.
>2) Can be easily fixed by the following oneliner (though I *really* woudn't
>like to do it, as it's useless and dangerous). Basically, for non-ab mode
>we set last_arp_rx for every packet. Again, I wouldn't like to do it, but
>if you know any use case scenario for this usage (no working arps but
>continuous receive traffic) - I can send it as v2 or 7/6 patch (whatever
>suits David better, as he already applied it but didn't push).
It's not useless and dangerous, it's preserving the intended
behavior for backwards compatibility with existing configurations.
Further, the current behavior is what allows ARP monitor for the
loadbalance modes (-xor and -rr) to work at all.
The use case is any -xor or -rr bond using ARP monitor against a
switch configured for Etherchannel (static link aggregation) on the
bonded ports. In those cases, the incoming ARP replies from an ARP
target will be delivered to only one slave (because switches generally
do load balancing by hash), and the other slaves will not see any
incoming ARP traffic from the arp_target. Even if the switch did round
robin, the slaves still won't see enough ARPs coming in to reliably keep
the slaves up, even if there is lots of regular traffic.
The fact that "accept only ARP" does not work at all for the
loadbalance modes is the reason that arp_validate is not permitted for
those modes.
I think the bottom line here is pretty simple:
Using the ARP monitor with the loadbalance modes is not a common
configuration in my experience, and making it work is tricky. However,
anyone using it today will be relying on the current behavior, which we
therefore must not change.
-J
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 0f613ae..87358e5 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2286,6 +2286,11 @@ int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond,
> __be32 sip, tip;
> int alen;
> + if (!USES_PRIMARY(bond->params.mode)) {
>+ slave->last_arp_rx = jiffies;
>+ return RX_HANDLER_ANOTHER;
>+ }
>+
> if (skb->protocol != __cpu_to_be16(ETH_P_ARP))
> return RX_HANDLER_ANOTHER;
>
>
>>
>>>Also, the net_device->last_rx is already used in a lot of drivers (even
>>>though the comment states to NOT do it :)), and it's also ugly to modify it
>>>from bonding.
>>
>> I didn't check, but I suspect those are mostly leftovers from
>>the distant past, when the drivers were expected to update last_rx, or
>>perhaps drivers using it for their own purposes.
>
>It's really a mix. Somebody just updates them, somebody uses it for their
>own purposes etc.
>
>>
>> I don't really see an issue in decoupling bonding from the
>>net_device->last_rx; it's pretty much the same thing that was done for
>>trans_start some time ago.
>
>trans_start removal is also in queue :). Though still needs some
>polishing...
>
>>
>> -J
>>
>>---
>> -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
>>
>>
>>>So, to fix this, remove the last_rx from bonding, *always* call
>>>bond_arp_rcv() in slave's rx_handler (bond_handle_frame), and if we spot an
>>>arp there - update the slave->last_arp_rx - and use it instead of
>>>net_device->last_rx. Finally, rename slave_last_rx() to slave_last_arp_rx()
>>>to reflect the changes.
>>>
>>>As the changes touch really sensitive parts, I've tried to split them as
>>>much as possible, for easier debugging/bisecting.
>>>
>>>CC: Jay Vosburgh <fubar@us.ibm.com>
>>>CC: Andy Gospodarek <andy@greyhouse.net>
>>>CC: "David S. Miller" <davem@davemloft.net>
>>>CC: netdev@vger.kernel.org
>>>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>>>
>>>---
>>> drivers/net/bonding/bond_main.c | 18 ++++++++----------
>>> drivers/net/bonding/bond_options.c | 12 ++----------
>>> drivers/net/bonding/bonding.h | 16 ++++++----------
>>> include/linux/netdevice.h | 8 +-------
>>> 4 files changed, 17 insertions(+), 37 deletions(-)
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
2014-01-16 22:38 ` Jay Vosburgh
@ 2014-01-17 6:57 ` Veaceslav Falico
2014-01-17 17:07 ` Veaceslav Falico
0 siblings, 1 reply; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-17 6:57 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev, Andy Gospodarek, David S. Miller
On Thu, Jan 16, 2014 at 02:38:59PM -0800, Jay Vosburgh wrote:
...snip...
> I think the bottom line here is pretty simple:
>
> Using the ARP monitor with the loadbalance modes is not a common
>configuration in my experience, and making it work is tricky. However,
>anyone using it today will be relying on the current behavior, which we
>therefore must not change.
Yep, agreed. It might be against the documentation, these use cases might
be weird/illogical - but they (kind of) work, and we both agree that this
change might break them, so it's definitely a no go.
OTOH, I'd still like to help people who have some kind of broadcast traffic
(STP, CDP, some routing etc.) running over network and keeping their slaves
up (and those that cannot/don't want to use arp_validate=3).
What do you think about this*? It's on top of this series, extends
arp_validate to (not) filter out ARPs on not-validated slaves and permits
it to be used in non-AB mode (also, we don't need that bond->lock, we're
always under RCU).
*:
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index a4d925e..7223ef4 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -279,19 +279,45 @@ arp_validate
none or 0
- No validation is performed. This is the default.
+ No validation is performed. This is the default. Any arriving
+ traffic (arp or non-arp) is considered a proof that the slave
+ is up.
active or 1
- Validation is performed only for the active slave.
+ Validation is performed only for the active slave. Only ARPs
+ that arrive from any arp_ip_target are considered legit. The
+ backup slave still does no validation (as if arp_validate=0).
backup or 2
- Validation is performed only for backup slaves.
+ Validation is performed only for backup slaves. Only ARPs
+ that arrive from any arp_ip_target are considered legit. The
+ active slave still has no validation (as if arp_validate=0).
all or 3
- Validation is performed for all slaves.
+ Validation is performed for all slaves. Only ARPs
+ that arrive from any arp_ip_target are considered legit.
+
+ arp or 4
+
+ Any arp packet is accepted as a proof that any slave is up,
+ but no IP-based validation is made.
+
+ active_arp or 5
+
+ Validation is performed only for the active slave. Only ARPs
+ that arrive from any arp_ip_target are considered legit. The
+ backup slave validates only arp packets, but doesn't check the
+ source (as if arp_validate=4).
+
+ backup_any or 6
+
+ Validation is performed only for backup slaves. Only ARPs
+ that arrive from any arp_ip_target are considered legit. The
+ active slave validates only arp packets, but doesn't check the
+ source (as if arp_validate=4).
For the active slave, the validation checks ARP replies to
confirm that they were generated by an arp_ip_target. Since
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 0f613ae..2ef1d5a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -246,6 +246,9 @@ const struct bond_parm_tbl arp_validate_tbl[] = {
{ "active", BOND_ARP_VALIDATE_ACTIVE},
{ "backup", BOND_ARP_VALIDATE_BACKUP},
{ "all", BOND_ARP_VALIDATE_ALL},
+{ "arp", BOND_ARP_VALIDATE_ARP},
+{ "active_arp", BOND_ARP_VALIDATE_ACTIVE_ARP},
+{ "backup_arp", BOND_ARP_VALIDATE_BACKUP_ARP},
{ NULL, -1},
};
@@ -2284,16 +2287,15 @@ int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond,
struct arphdr *arp = (struct arphdr *)skb->data;
unsigned char *arp_ptr;
__be32 sip, tip;
- int alen;
-
- if (skb->protocol != __cpu_to_be16(ETH_P_ARP))
- return RX_HANDLER_ANOTHER;
-
- read_lock(&bond->lock);
+ int alen, is_arp = skb->protocol == __cpu_to_be16(ETH_P_ARP);
if (!slave_do_arp_validate(bond, slave)) {
- slave->last_arp_rx = jiffies;
- goto out_unlock;
+ if ((slave_do_arp_validate_only(bond, slave) && is_arp) ||
+ !slave_do_arp_validate_only(bond, slave))
+ slave->last_arp_rx = jiffies;
+ return RX_HANDLER_ANOTHER;
+ } else if (!is_arp) {
+ return RX_HANDLER_ANOTHER;
}
alen = arp_hdr_len(bond->dev);
@@ -2349,7 +2351,6 @@ int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond,
bond_validate_arp(bond, slave, tip, sip);
out_unlock:
- read_unlock(&bond->lock);
if (arp != (struct arphdr *)skb->data)
kfree(arp);
return RX_HANDLER_ANOTHER;
@@ -4181,10 +4182,6 @@ static int bond_check_params(struct bond_params *params)
}
if (arp_validate) {
- if (bond_mode != BOND_MODE_ACTIVEBACKUP) {
- pr_err("arp_validate only supported in active-backup mode\n");
- return -EINVAL;
- }
if (!arp_interval) {
pr_err("arp_validate requires arp_interval\n");
return -EINVAL;
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 1bab20e..9d6d231 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -434,12 +434,6 @@ int bond_option_arp_validate_set(struct bonding *bond, int arp_validate)
return -EINVAL;
}
- if (bond->params.mode != BOND_MODE_ACTIVEBACKUP) {
- pr_err("%s: arp_validate only supported in active-backup mode.\n",
- bond->dev->name);
- return -EINVAL;
- }
-
pr_info("%s: setting arp_validate to %s (%d).\n",
bond->dev->name, arp_validate_tbl[arp_validate].modename,
arp_validate);
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 9f07af1..19eb023 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -319,6 +319,11 @@ static inline bool bond_is_active_slave(struct slave *slave)
#define BOND_ARP_VALIDATE_BACKUP (1 << BOND_STATE_BACKUP)
#define BOND_ARP_VALIDATE_ALL (BOND_ARP_VALIDATE_ACTIVE | \
BOND_ARP_VALIDATE_BACKUP)
+#define BOND_ARP_VALIDATE_ARP (BOND_ARP_VALIDATE_ALL + 1) /* бля... */
+#define BOND_ARP_VALIDATE_ACTIVE_ARP (BOND_ARP_VALIDATE_ACTIVE | \
+ BOND_ARP_VALIDATE_ARP)
+#define BOND_ARP_VALIDATE_BACKUP_ARP (BOND_ARP_VALIDATE_BACKUP | \
+ BOND_ARP_VALIDATE_ARP)
static inline int slave_do_arp_validate(struct bonding *bond,
struct slave *slave)
@@ -326,6 +331,12 @@ static inline int slave_do_arp_validate(struct bonding *bond,
return bond->params.arp_validate & (1 << bond_slave_state(slave));
}
+static inline int slave_do_arp_validate_only(struct bonding *bond,
+ struct slave *slave)
+{
+ return bond->params.arp_validate & BOND_ARP_VALIDATE_ARP;
+}
+
/* Get the oldest arp which we've received on this slave for bond's
* arp_targets.
*/
>
> -J
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
2014-01-16 6:01 ` David Miller
@ 2014-01-17 8:02 ` Veaceslav Falico
0 siblings, 0 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-17 8:02 UTC (permalink / raw)
To: David Miller; +Cc: fubar, netdev, andy
On Wed, Jan 15, 2014 at 10:01:32PM -0800, David Miller wrote:
>From: Jay Vosburgh <fubar@us.ibm.com>
>Date: Wed, 15 Jan 2014 21:09:57 -0800
>
>> The main reason for preserving the non-validate behavior (any
>> traffic counts) is for the loadbalance (xor and rr) modes. In those
>> modes, the switch decides which slave receives the incoming traffic, and
>> so it's to our advantage to permit any incoming traffic to count for
>> "up-ness." The arp_validate option is not allowed in these modes
>> because it won't work.
>>
>> With these changes, I suspect that the loadbalance ARP monitor
>> will be less reliable with these changes (granted that it's already a
>> bit dodgy in its dependence on the switch to hit all slaves with
>> incoming packets regularly). Particularly if the switch ports are
>> configured into an Etherchannel ("static link aggregation") group, in
>> which case only one slave will receive any given frame (broadcast /
>> multicast traffic will not be duplicated across all slaves).
>>
>> I'm not sure that this change (the "only count ARPs even without
>> arp_validate" bit) won't break existing configurations. Did you test
>> the -rr and -xor modes with ARP monitor after your changes (with and
>> without configuring a channel group on the switch ports)?
>
>Sorry Jay, I only read this just now. I won't push these changes
>out until you've had some time to discuss them.
Sorry David, this change indeed might break some configurations (leaving
aside the question if they're legit or not...). So, please, drop them, I'll
discuss with Jay and send a v2.
Thanks a lot, and sorry for the noise.
>
>To my untrained eye they looked rather straightforward :-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used
2014-01-17 6:57 ` Veaceslav Falico
@ 2014-01-17 17:07 ` Veaceslav Falico
0 siblings, 0 replies; 15+ messages in thread
From: Veaceslav Falico @ 2014-01-17 17:07 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev, Andy Gospodarek, David S. Miller
On Fri, Jan 17, 2014 at 07:57:18AM +0100, Veaceslav Falico wrote:
...snip...
>What do you think about this*? It's on top of this series, extends
>arp_validate to (not) filter out ARPs on not-validated slaves and permits
>it to be used in non-AB mode (also, we don't need that bond->lock, we're
>always under RCU).
Hi Jay,
In the meanwhile I've integrated this approach (adding new arp_validate
options) and sent v2.
The new version, when used with arp_validate=0/1/2/3 (old values) behaves
exactly the same, while adding 3 new options - only unvalidated ARPs,
validated ARPs on active and unvalidated ARPs on slave and vice versa
(there's no analogue for ALL_ARP cause it doesn't make much sense - as the
validated ARPs for both slaves are already ARPs :) ).
Hope that helps, and sorry that I didn't do my homework in the first time.
Thank you!
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2014-01-17 17:10 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-16 2:05 [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 1/6] bonding: always update last_arp_rx on arp recieve Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 2/6] bonding: always set recv_probe to bond_arp_rcv in arp monitor Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 3/6] bonding: use last_arp_rx in slave_last_rx() Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 4/6] bonding: rename slave_last_rx() to slave_last_arp_rx() Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 5/6] bonding: use last_arp_rx in bond_loadbalance_arp_mon() Veaceslav Falico
2014-01-16 2:05 ` [PATCH net-next 6/6] bonding: remove useless updating of slave->dev->last_rx Veaceslav Falico
2014-01-16 5:09 ` [PATCH net-next 0/6] bonding: only rely on arp packets if arp monitor is used Jay Vosburgh
2014-01-16 6:01 ` David Miller
2014-01-17 8:02 ` Veaceslav Falico
2014-01-16 8:41 ` Veaceslav Falico
2014-01-16 22:38 ` Jay Vosburgh
2014-01-17 6:57 ` Veaceslav Falico
2014-01-17 17:07 ` Veaceslav Falico
2014-01-16 5:53 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).