netdev.vger.kernel.org archive mirror
* [PATCH v2 net] bonding: don't use stale speed and duplex information
@ 2016-02-08 20:10 Jay Vosburgh
From: Jay Vosburgh @ 2016-02-08 20:10 UTC (permalink / raw)
  To: netdev, Tantilov, Emil S, zhuyj
  Cc: Veaceslav Falico, dingtianhong, Andy Gospodarek, David S. Miller


	There is presently a race condition between the bonding periodic
link monitor and the updating of a slave's speed and duplex.  The former
runs periodically, and the latter runs in response to a driver calling
netif_carrier_on.

	It is possible for the periodic monitor to run between the
driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
event that causes bonding to update the slave's speed and duplex.  This
manifests most notably as a report that a slave is up and "0 Mbps full
duplex" after enslavement, but in principle the monitor could report an
incorrect speed and duplex after any link up event if the device comes
up with a different speed or duplex than it had before.  This affects
the 802.3ad aggregator selection, as the speed and duplex are selection
criteria.

	This is fixed by updating the speed and duplex in the periodic
monitor, prior to using that information.

	This was done historically in bonding, but the call to
bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
don't call update_speed_duplex() under spinlocks"), as it might sleep
under lock.  Later, the locking was changed to only hold RTNL, and so
after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
under spinlocks") this call is again safe.

Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>

---

v2: Correct Veaceslav's email address

Note: The "Fixes" commit is the commit that makes this operation safe
again, not the commit that originally introduced the race.  I don't see
any simple way to resolve this bug between these two commits.
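
For illustration only, the race described in the commit message can be
modelled in user space roughly as follows.  This is a sketch, not kernel
code: the model_* names and both structs are hypothetical, and the
model assumes the cached speed is 0 until it is first refreshed.  It
only shows why refreshing the cached values before the link is marked
up avoids the "0 Mbps" report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_dev {
	bool carrier;		/* set once the driver has signalled carrier on */
	int speed_mbps;		/* speed the hardware actually negotiated */
	bool full_duplex;	/* duplex the hardware actually negotiated */
};

struct model_slave {
	const struct model_dev *dev;
	int speed_mbps;		/* bonding's cached copy, assumed 0 until refreshed */
	bool full_duplex;	/* bonding's cached copy */
	bool link_up;
};

/* Stand-in for bond_update_speed_duplex(): refresh the cached values. */
static void model_update_speed_duplex(struct model_slave *slave)
{
	assert(slave != NULL && slave->dev != NULL);
	slave->speed_mbps = slave->dev->speed_mbps;
	slave->full_duplex = slave->dev->full_duplex;
}

/*
 * Stand-in for the BOND_LINK_UP case of the periodic monitor's commit
 * step.  Returns the speed that would be reported for the slave.
 * refresh_first == false models the monitor running before the
 * NETDEV_CHANGE event has updated the cache; true models this patch.
 */
static int model_miimon_commit_up(struct model_slave *slave, bool refresh_first)
{
	if (refresh_first)
		model_update_speed_duplex(slave);
	slave->link_up = true;
	return slave->speed_mbps;
}
```

In this model, a freshly enslaved device that has negotiated 1000 Mbps
is reported at 0 Mbps when model_miimon_commit_up() is called with
refresh_first == false, and at 1000 Mbps when it is called with
refresh_first == true.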

 drivers/net/bonding/bond_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b560558884..cabaeb61333d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2127,6 +2127,7 @@ static void bond_miimon_commit(struct bonding *bond)
 			continue;
 
 		case BOND_LINK_UP:
+			bond_update_speed_duplex(slave);
 			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_NOW);
 			slave->last_link_up = jiffies;
-- 
1.9.1


Thread overview: 10+ messages
2016-02-08 20:10 [PATCH v2 net] bonding: don't use stale speed and duplex information Jay Vosburgh
2016-02-14  2:36 ` Ding Tianhong
2016-02-16 20:14 ` David Miller
2016-02-18 20:25   ` Jay Vosburgh
2016-02-18 20:27     ` David Miller
2016-02-25  8:35 ` zhuyj
2016-02-25 13:33   ` Jay Vosburgh
2016-02-26  2:21     ` zhuyj
2016-02-29  5:39       ` Jay Vosburgh
2016-02-29  6:41         ` zhuyj
