From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Nikolay Aleksandrov <nikolay@redhat.com>,
Jay Vosburgh <fubar@us.ibm.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [ 54/68] bonding: fix miimon and arp_interval delayed work race conditions
Date: Tue, 2 Apr 2013 15:13:43 -0700 [thread overview]
Message-ID: <20130402221336.237705124@linuxfoundation.org> (raw)
In-Reply-To: <20130402221329.915209206@linuxfoundation.org>
3.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: "nikolay@redhat.com" <nikolay@redhat.com>
[ Upstream commit fbb0c41b814d497c656fc7be9e35456f139cb2fb ]
First I would give three observations which will be used later.
Observation 1: if (delayed_work_pending(wq)) cancel_delayed_work(wq)
This usage is wrong because the pending bit is cleared just before the
work's fn is executed and if the function re-arms itself we might end up
with the work still running. It's safe to call cancel_delayed_work_sync()
even if the work is not queued at all.
Observation 2: Use of INIT_DELAYED_WORK()
Work needs to be initialized only once prior to (de/en)queueing.
Observation 3: IFF_UP is set only after ndo_open is called
Related race conditions:
1. Race between bonding_store_miimon() and bonding_store_arp_interval()
Because of Obs.1 we can end up having both works enqueued.
2. Multiple races with INIT_DELAYED_WORK()
Since the works are not protected by anything between INIT_DELAYED_WORK()
and calls to (en/de)queue it is possible for races between the following
functions:
(races are also possible between the calls to INIT_DELAYED_WORK()
and workqueue code)
bonding_store_miimon() - bonding_store_arp_interval(), bond_close(),
bond_open(), enqueued functions
bonding_store_arp_interval() - bonding_store_miimon(), bond_close(),
bond_open(), enqueued functions
3. By Obs.1 we need to change bond_cancel_all()
Bugs 1 and 2 are fixed by moving all work initializations in bond_open
which by Obs. 2 and Obs. 3 and the fact that we make sure that all works
are cancelled in bond_close(), is guaranteed not to have any work
enqueued.
Also RTNL lock is now acquired in bonding_store_miimon/arp_interval so
they can't race with bond_close and bond_open. The opposing work is
cancelled only if the IFF_UP flag is set and it is cancelled
unconditionally. The opposing work is already cancelled if the interface
is down so no need to cancel it again. This way we don't need new
synchronizations for the bonding workqueue. These bugs (and fixes) are
tied together and belong in the same patch.
Note: I have left 1 line intentionally over 80 characters (84) because I
didn't like how it looks broken down. If you'd prefer it otherwise,
then simply break it.
v2: Make description text < 75 columns
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/bonding/bond_main.c | 88 +++++++++++----------------------------
drivers/net/bonding/bond_sysfs.c | 34 ++++-----------
2 files changed, 36 insertions(+), 86 deletions(-)
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3398,6 +3398,28 @@ static int bond_xmit_hash_policy_l2(stru
/*-------------------------- Device entry points ----------------------------*/
+static void bond_work_init_all(struct bonding *bond)
+{
+ INIT_DELAYED_WORK(&bond->mcast_work,
+ bond_resend_igmp_join_requests_delayed);
+ INIT_DELAYED_WORK(&bond->alb_work, bond_alb_monitor);
+ INIT_DELAYED_WORK(&bond->mii_work, bond_mii_monitor);
+ if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
+ INIT_DELAYED_WORK(&bond->arp_work, bond_activebackup_arp_mon);
+ else
+ INIT_DELAYED_WORK(&bond->arp_work, bond_loadbalance_arp_mon);
+ INIT_DELAYED_WORK(&bond->ad_work, bond_3ad_state_machine_handler);
+}
+
+static void bond_work_cancel_all(struct bonding *bond)
+{
+ cancel_delayed_work_sync(&bond->mii_work);
+ cancel_delayed_work_sync(&bond->arp_work);
+ cancel_delayed_work_sync(&bond->alb_work);
+ cancel_delayed_work_sync(&bond->ad_work);
+ cancel_delayed_work_sync(&bond->mcast_work);
+}
+
static int bond_open(struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
@@ -3420,41 +3442,27 @@ static int bond_open(struct net_device *
}
read_unlock(&bond->lock);
- INIT_DELAYED_WORK(&bond->mcast_work, bond_resend_igmp_join_requests_delayed);
+ bond_work_init_all(bond);
if (bond_is_lb(bond)) {
/* bond_alb_initialize must be called before the timer
* is started.
*/
- if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB))) {
- /* something went wrong - fail the open operation */
+ if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB)))
return -ENOMEM;
- }
-
- INIT_DELAYED_WORK(&bond->alb_work, bond_alb_monitor);
queue_delayed_work(bond->wq, &bond->alb_work, 0);
}
- if (bond->params.miimon) { /* link check interval, in milliseconds. */
- INIT_DELAYED_WORK(&bond->mii_work, bond_mii_monitor);
+ if (bond->params.miimon) /* link check interval, in milliseconds. */
queue_delayed_work(bond->wq, &bond->mii_work, 0);
- }
if (bond->params.arp_interval) { /* arp interval, in milliseconds. */
- if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
- INIT_DELAYED_WORK(&bond->arp_work,
- bond_activebackup_arp_mon);
- else
- INIT_DELAYED_WORK(&bond->arp_work,
- bond_loadbalance_arp_mon);
-
queue_delayed_work(bond->wq, &bond->arp_work, 0);
if (bond->params.arp_validate)
bond->recv_probe = bond_arp_rcv;
}
if (bond->params.mode == BOND_MODE_8023AD) {
- INIT_DELAYED_WORK(&bond->ad_work, bond_3ad_state_machine_handler);
queue_delayed_work(bond->wq, &bond->ad_work, 0);
/* register to receive LACPDUs */
bond->recv_probe = bond_3ad_lacpdu_recv;
@@ -3469,34 +3477,10 @@ static int bond_close(struct net_device
struct bonding *bond = netdev_priv(bond_dev);
write_lock_bh(&bond->lock);
-
bond->send_peer_notif = 0;
-
write_unlock_bh(&bond->lock);
- if (bond->params.miimon) { /* link check interval, in milliseconds. */
- cancel_delayed_work_sync(&bond->mii_work);
- }
-
- if (bond->params.arp_interval) { /* arp interval, in milliseconds. */
- cancel_delayed_work_sync(&bond->arp_work);
- }
-
- switch (bond->params.mode) {
- case BOND_MODE_8023AD:
- cancel_delayed_work_sync(&bond->ad_work);
- break;
- case BOND_MODE_TLB:
- case BOND_MODE_ALB:
- cancel_delayed_work_sync(&bond->alb_work);
- break;
- default:
- break;
- }
-
- if (delayed_work_pending(&bond->mcast_work))
- cancel_delayed_work_sync(&bond->mcast_work);
-
+ bond_work_cancel_all(bond);
if (bond_is_lb(bond)) {
/* Must be called only after all
* slaves have been released
@@ -4375,26 +4359,6 @@ static void bond_setup(struct net_device
bond_dev->features |= bond_dev->hw_features;
}
-static void bond_work_cancel_all(struct bonding *bond)
-{
- if (bond->params.miimon && delayed_work_pending(&bond->mii_work))
- cancel_delayed_work_sync(&bond->mii_work);
-
- if (bond->params.arp_interval && delayed_work_pending(&bond->arp_work))
- cancel_delayed_work_sync(&bond->arp_work);
-
- if (bond->params.mode == BOND_MODE_ALB &&
- delayed_work_pending(&bond->alb_work))
- cancel_delayed_work_sync(&bond->alb_work);
-
- if (bond->params.mode == BOND_MODE_8023AD &&
- delayed_work_pending(&bond->ad_work))
- cancel_delayed_work_sync(&bond->ad_work);
-
- if (delayed_work_pending(&bond->mcast_work))
- cancel_delayed_work_sync(&bond->mcast_work);
-}
-
/*
* Destroy a bonding device.
* Must be under rtnl_lock when this function is called.
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -518,6 +518,8 @@ static ssize_t bonding_store_arp_interva
int new_value, ret = count;
struct bonding *bond = to_bond(d);
+ if (!rtnl_trylock())
+ return restart_syscall();
if (sscanf(buf, "%d", &new_value) != 1) {
pr_err("%s: no arp_interval value specified.\n",
bond->dev->name);
@@ -544,10 +546,6 @@ static ssize_t bonding_store_arp_interva
pr_info("%s: ARP monitoring cannot be used with MII monitoring. %s Disabling MII monitoring.\n",
bond->dev->name, bond->dev->name);
bond->params.miimon = 0;
- if (delayed_work_pending(&bond->mii_work)) {
- cancel_delayed_work(&bond->mii_work);
- flush_workqueue(bond->wq);
- }
}
if (!bond->params.arp_targets[0]) {
pr_info("%s: ARP monitoring has been set up, but no ARP targets have been specified.\n",
@@ -559,19 +557,12 @@ static ssize_t bonding_store_arp_interva
* timer will get fired off when the open function
* is called.
*/
- if (!delayed_work_pending(&bond->arp_work)) {
- if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
- INIT_DELAYED_WORK(&bond->arp_work,
- bond_activebackup_arp_mon);
- else
- INIT_DELAYED_WORK(&bond->arp_work,
- bond_loadbalance_arp_mon);
-
- queue_delayed_work(bond->wq, &bond->arp_work, 0);
- }
+ cancel_delayed_work_sync(&bond->mii_work);
+ queue_delayed_work(bond->wq, &bond->arp_work, 0);
}
out:
+ rtnl_unlock();
return ret;
}
static DEVICE_ATTR(arp_interval, S_IRUGO | S_IWUSR,
@@ -967,6 +958,8 @@ static ssize_t bonding_store_miimon(stru
int new_value, ret = count;
struct bonding *bond = to_bond(d);
+ if (!rtnl_trylock())
+ return restart_syscall();
if (sscanf(buf, "%d", &new_value) != 1) {
pr_err("%s: no miimon value specified.\n",
bond->dev->name);
@@ -998,10 +991,6 @@ static ssize_t bonding_store_miimon(stru
bond->params.arp_validate =
BOND_ARP_VALIDATE_NONE;
}
- if (delayed_work_pending(&bond->arp_work)) {
- cancel_delayed_work(&bond->arp_work);
- flush_workqueue(bond->wq);
- }
}
if (bond->dev->flags & IFF_UP) {
@@ -1010,15 +999,12 @@ static ssize_t bonding_store_miimon(stru
* timer will get fired off when the open function
* is called.
*/
- if (!delayed_work_pending(&bond->mii_work)) {
- INIT_DELAYED_WORK(&bond->mii_work,
- bond_mii_monitor);
- queue_delayed_work(bond->wq,
- &bond->mii_work, 0);
- }
+ cancel_delayed_work_sync(&bond->arp_work);
+ queue_delayed_work(bond->wq, &bond->mii_work, 0);
}
}
out:
+ rtnl_unlock();
return ret;
}
static DEVICE_ATTR(miimon, S_IRUGO | S_IWUSR,
next prev parent reply other threads:[~2013-04-02 22:13 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-02 22:12 [ 00/68] 3.4.39-stable review Greg Kroah-Hartman
2013-04-02 22:12 ` [ 01/68] signal: Define __ARCH_HAS_SA_RESTORER so we know whether to clear sa_restorer Greg Kroah-Hartman
2013-04-02 22:12 ` [ 02/68] kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER Greg Kroah-Hartman
2013-04-02 22:12 ` [ 03/68] SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked Greg Kroah-Hartman
2013-04-02 22:12 ` [ 04/68] tile: expect new initramfs name from hypervisor file system Greg Kroah-Hartman
2013-04-02 22:12 ` [ 05/68] Bluetooth: Fix not closing SCO sockets in the BT_CONNECT2 state Greg Kroah-Hartman
2013-04-02 22:12 ` [ 06/68] Bluetooth: Add support for Dell[QCA 0cf3:0036] Greg Kroah-Hartman
2013-04-02 22:12 ` [ 07/68] Bluetooth: Add support for Dell[QCA 0cf3:817a] Greg Kroah-Hartman
2013-04-02 22:12 ` [ 08/68] staging: comedi: s626: fix continuous acquisition Greg Kroah-Hartman
2013-04-02 22:12 ` [ 09/68] sysfs: fix race between readdir and lseek Greg Kroah-Hartman
2013-04-02 22:12 ` [ 10/68] sysfs: handle failure path correctly for readdir() Greg Kroah-Hartman
2013-04-02 22:13 ` [ 11/68] can: sja1000: fix define conflict on SH Greg Kroah-Hartman
2013-04-02 22:13 ` [ 12/68] ath9k_hw: revert chainmask to user configuration after calibration Greg Kroah-Hartman
2013-04-02 22:13 ` [ 13/68] HID: usbhid: quirk for Realtek Multi-card reader Greg Kroah-Hartman
2013-04-02 22:13 ` [ 14/68] rtlwifi: usb: add missing freeing of skbuff Greg Kroah-Hartman
2013-04-02 22:13 ` [ 15/68] b43: N-PHY: increase initial value of "mind" in RSSI calibration Greg Kroah-Hartman
2013-04-02 22:13 ` [ 16/68] b43: A fix for DMA transmission sequence errors Greg Kroah-Hartman
2013-04-02 22:13 ` [ 17/68] b43: N-PHY: use more bits for offset in RSSI calibration Greg Kroah-Hartman
2013-04-02 22:13 ` [ 18/68] tg3: fix length overflow in VPD firmware parsing Greg Kroah-Hartman
2013-04-02 22:13 ` [ 19/68] iommu/amd: Make sure dma_ops are set for hotplug devices Greg Kroah-Hartman
2013-04-02 22:13 ` [ 20/68] xen/blkback: correctly respond to unknown, non-native requests Greg Kroah-Hartman
2013-04-02 22:13 ` [ 21/68] xen-blkback: fix dispatch_rw_block_io() error path Greg Kroah-Hartman
2013-04-02 22:13 ` [ 22/68] tty: atmel_serial_probe(): index of atmel_ports[] fix Greg Kroah-Hartman
2013-04-02 22:13 ` [ 23/68] usb: ftdi_sio: Add support for Mitsubishi FX-USB-AW/-BD Greg Kroah-Hartman
2013-04-02 22:13 ` [ 24/68] vt: synchronize_rcu() under spinlock is not nice Greg Kroah-Hartman
2013-04-02 22:13 ` [ 25/68] mwifiex: cancel cmd timer and free curr_cmd in shutdown process Greg Kroah-Hartman
2013-04-08 18:01 ` Bing Zhao
2013-04-02 22:13 ` [ 26/68] pnfs-block: removing DM device maybe cause oops when call dev_remove Greg Kroah-Hartman
2013-04-02 22:13 ` [ 27/68] net/irda: add missing error path release_sock call Greg Kroah-Hartman
2013-04-02 22:13 ` [ 28/68] usb: xhci: Fix TRB transfer length macro used for Event TRB Greg Kroah-Hartman
2013-04-02 22:13 ` [ 29/68] Btrfs: fix race between mmap writes and compression Greg Kroah-Hartman
2013-04-02 22:13 ` [ 30/68] Btrfs: limit the global reserve to 512mb Greg Kroah-Hartman
2013-04-02 22:13 ` [ 31/68] Btrfs: dont drop path when printing out tree errors in scrub Greg Kroah-Hartman
2013-04-02 22:13 ` [ 32/68] usb: gadget: udc-core: fix a regression during gadget driver unbinding Greg Kroah-Hartman
2013-04-02 22:13 ` [ 33/68] loop: prevent bdev freeing while device in use Greg Kroah-Hartman
2013-04-02 22:13 ` [ 34/68] ARM: cns3xxx: fix mapping of private memory region Greg Kroah-Hartman
2013-04-02 22:13 ` [ 35/68] nfsd4: reject "negative" acl lengths Greg Kroah-Hartman
2013-04-02 22:13 ` [ 36/68] drm/i915: Dont clobber crtc->fb when queue_flip fails Greg Kroah-Hartman
2013-04-02 22:13 ` [ 37/68] Btrfs: fix space leak when we fail to reserve metadata space Greg Kroah-Hartman
2013-04-02 22:13 ` [ 38/68] efivars: explicitly calculate length of VariableName Greg Kroah-Hartman
2013-04-02 22:13 ` [ 39/68] efivars: Handle duplicate names from get_next_variable() Greg Kroah-Hartman
2013-04-02 22:13 ` [ 40/68] ext4: convert number of blocks to clusters properly Greg Kroah-Hartman
2013-04-02 22:13 ` [ 41/68] ext4: use atomic64_t for the per-flexbg free_clusters count Greg Kroah-Hartman
2013-04-02 22:13 ` [ 42/68] tracing: Protect tracer flags with trace_types_lock Greg Kroah-Hartman
2013-04-02 22:13 ` [ 43/68] tracing: Prevent buffer overwrite disabled for latency tracers Greg Kroah-Hartman
2013-04-02 22:13 ` [ 44/68] net: remove a WARN_ON() in net_enable_timestamp() Greg Kroah-Hartman
2013-04-02 22:13 ` [ 45/68] sky2: Receive Overflows not counted Greg Kroah-Hartman
2013-04-02 22:13 ` [ 46/68] sky2: Threshold for Pause Packet is set wrong Greg Kroah-Hartman
2013-04-02 22:13 ` [ 47/68] tcp: preserve ACK clocking in TSO Greg Kroah-Hartman
2013-04-02 22:13 ` [ 48/68] tcp: undo spurious timeout after SACK reneging Greg Kroah-Hartman
2013-04-02 22:13 ` [ 49/68] 8021q: fix a potential use-after-free Greg Kroah-Hartman
2013-04-02 22:13 ` [ 50/68] thermal: shorten too long mcast group name Greg Kroah-Hartman
2013-04-02 22:13 ` [ 51/68] unix: fix a race condition in unix_release() Greg Kroah-Hartman
2013-04-02 22:13 ` [ 52/68] af_unix: dont send SCM_CREDENTIAL when dest socket is NULL Greg Kroah-Hartman
2013-04-02 22:13 ` [ 53/68] bonding: remove already created master sysfs link on failure Greg Kroah-Hartman
2013-04-02 22:13 ` Greg Kroah-Hartman [this message]
2013-04-02 22:13 ` [ 55/68] bonding: fix disabling of arp_interval and miimon Greg Kroah-Hartman
2013-04-02 22:13 ` [ 56/68] drivers: net: ethernet: davinci_emac: use netif_wake_queue() while restarting tx queue Greg Kroah-Hartman
2013-04-02 22:13 ` [ 57/68] drivers: net: ethernet: cpsw: " Greg Kroah-Hartman
2013-04-02 22:13 ` [ 58/68] net: fix *_DIAG_MAX constants Greg Kroah-Hartman
2013-04-02 22:13 ` [ 59/68] aoe: reserve enough headroom on skbs Greg Kroah-Hartman
2013-04-02 22:13 ` [ 60/68] atl1e: drop pci-msi support because of packet corruption Greg Kroah-Hartman
2013-04-02 22:13 ` [ 61/68] DM9000B: driver initialization upgrade Greg Kroah-Hartman
2013-04-02 22:13 ` [ 62/68] ipv6: dont accept multicast traffic with scope 0 Greg Kroah-Hartman
2013-04-02 22:13 ` [ 63/68] ipv6: fix bad free of addrconf_init_net Greg Kroah-Hartman
2013-04-02 22:13 ` [ 64/68] ipv6: dont accept node local multicast traffic from the wire Greg Kroah-Hartman
2013-04-02 22:13 ` [ 65/68] ks8851: Fix interpretation of rxlen field Greg Kroah-Hartman
2013-04-02 22:13 ` [ 66/68] net: add a synchronize_net() in netdev_rx_handler_unregister() Greg Kroah-Hartman
2013-04-02 22:13 ` [ 67/68] pch_gbe: fix ip_summed checksum reporting on rx Greg Kroah-Hartman
2013-04-02 22:13 ` [ 68/68] smsc75xx: fix jumbo frame support Greg Kroah-Hartman
2013-04-03 15:19 ` [ 00/68] 3.4.39-stable review Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130402221336.237705124@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=fubar@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=nikolay@redhat.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).