From: Sasha Levin <Alexander.Levin@microsoft.com>
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"stable@vger.kernel.org" <stable@vger.kernel.org>
Cc: Feras Daoud <ferasda@mellanox.com>,
Leon Romanovsky <leon@kernel.org>,
Doug Ledford <dledford@redhat.com>,
Sasha Levin <Alexander.Levin@microsoft.com>
Subject: [PATCH AUTOSEL for 4.4 036/101] IB/ipoib: Fix deadlock between ipoib_stop and mcast join flow
Date: Thu, 8 Mar 2018 05:01:39 +0000
Message-ID: <20180308050023.8548-36-alexander.levin@microsoft.com>
In-Reply-To: <20180308050023.8548-1-alexander.levin@microsoft.com>

From: Feras Daoud <ferasda@mellanox.com>

[ Upstream commit 3e31a490e01a6e67cbe9f6e1df2f3ff0fbf48972 ]

ipoib_stop is called with rtnl_lock held. The flow then clears
the IPOIB_FLAG_ADMIN_UP and IPOIB_FLAG_OPER_UP flags and waits
for the mcast completion if IPOIB_MCAST_FLAG_BUSY is set.

Meanwhile, the multicast join task initializes the mcast
completion, sets IPOIB_MCAST_FLAG_BUSY and calls
ipoib_mcast_join. If the IPOIB_FLAG_OPER_UP flag is not set, this
call returns -EINVAL without completing the mcast completion,
which leads to a deadlock.

    ipoib_stop                          |
        |                               |
    clear_bit(IPOIB_FLAG_ADMIN_UP)      |
        |                               |
    Context Switch                      |
        |                       ipoib_mcast_join_task
        |                               |
        |                       spin_lock_irq(lock)
        |                               |
        |                       init_completion(mcast)
        |                               |
        |                       set_bit(IPOIB_MCAST_FLAG_BUSY)
        |                               |
        |                       Context Switch
        |                               |
    clear_bit(IPOIB_FLAG_OPER_UP)       |
        |                               |
    spin_lock_irqsave(lock)             |
        |                               |
    Context Switch                      |
        |                       ipoib_mcast_join
        |                       return (-EINVAL)
        |                               |
        |                       spin_unlock_irq(lock)
        |                               |
        |                       Context Switch
        |                               |
    ipoib_mcast_dev_flush               |
    wait_for_completion(mcast)          |

ipoib_stop will wait for the mcast completion forever and will
not release the rtnl_lock. As a result, a panic occurs with the
following trace:

[13441.639268] Call Trace:
[13441.640150] [<ffffffff8168b579>] schedule+0x29/0x70
[13441.641038] [<ffffffff81688fc9>] schedule_timeout+0x239/0x2d0
[13441.641914] [<ffffffff810bc017>] ? complete+0x47/0x50
[13441.642765] [<ffffffff810a690d>] ? flush_workqueue_prep_pwqs+0x16d/0x200
[13441.643580] [<ffffffff8168b956>] wait_for_completion+0x116/0x170
[13441.644434] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[13441.645293] [<ffffffffa05af170>] ipoib_mcast_dev_flush+0x150/0x190 [ib_ipoib]
[13441.646159] [<ffffffffa05ac967>] ipoib_ib_dev_down+0x37/0x60 [ib_ipoib]
[13441.647013] [<ffffffffa05a4805>] ipoib_stop+0x75/0x150 [ib_ipoib]
Fixes: 08bc327629cb ("IB/ipoib: fix for rare multicast join race condition")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
---
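Note (not part of the commit message): the ordering problem described
above can be illustrated with a small userspace model. This is only a
sketch under stated assumptions. struct model_mcast, join_old(),
join_new() and stop_must_wait() are hypothetical names invented for
the illustration and are not kernel code; the two booleans stand in
for IPOIB_MCAST_FLAG_BUSY and the state of mcast->done.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: every name here is hypothetical. */
struct model_mcast {
	bool busy;      /* stands in for IPOIB_MCAST_FLAG_BUSY */
	bool completed; /* stands in for complete(&mcast->done) having run */
};

/* Old ordering: the group is marked busy before the state check. */
static int join_old(struct model_mcast *m, bool oper_up)
{
	assert(m != NULL);
	m->completed = false;   /* init_completion() */
	m->busy = true;         /* set_bit(IPOIB_MCAST_FLAG_BUSY) */
	if (!oper_up)
		return -EINVAL; /* busy stays set, nobody will complete */
	return 0;
}

/* New ordering: check the device state first, only then mark busy. */
static int join_new(struct model_mcast *m, bool oper_up)
{
	assert(m != NULL);
	if (!oper_up)
		return -EINVAL; /* nothing was marked, nothing to wait for */
	m->completed = false;
	m->busy = true;
	return 0;
}

/* The stop path waits only when busy is set and no completion has run. */
static bool stop_must_wait(const struct model_mcast *m)
{
	return m->busy && !m->completed;
}
```

With oper_up == false, join_old() leaves the model in a state where
stop_must_wait() is true although no completer exists, which is the
hang in the diagram. join_new() returns before touching the flags, so
the stop path has nothing to wait for.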
drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 5580ab0b5781..682a69daac5d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -473,6 +473,9 @@ static int ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
 	    !test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
 		return -EINVAL;
 
+	init_completion(&mcast->done);
+	set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
+
 	ipoib_dbg_mcast(priv, "joining MGID %pI6\n", mcast->mcmember.mgid.raw);
 
 	rec.mgid = mcast->mcmember.mgid;
@@ -631,8 +634,6 @@ void ipoib_mcast_join_task(struct work_struct *work)
 			if (mcast->backoff == 1 ||
 			    time_after_eq(jiffies, mcast->delay_until)) {
 				/* Found the next unjoined group */
-				init_completion(&mcast->done);
-				set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
 				if (ipoib_mcast_join(dev, mcast)) {
 					spin_unlock_irq(&priv->lock);
 					return;
@@ -652,11 +653,9 @@ void ipoib_mcast_join_task(struct work_struct *work)
 		queue_delayed_work(priv->wq, &priv->mcast_task,
 				   delay_until - jiffies);
 	}
-	if (mcast) {
-		init_completion(&mcast->done);
-		set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
+	if (mcast)
 		ipoib_mcast_join(dev, mcast);
-	}
+
 	spin_unlock_irq(&priv->lock);
 }
 
--
2.14.1