From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Yishai Hadas <yishaih@mellanox.com>,
	Artemy Kovalyov <artemyko@mellanox.com>,
	Leon Romanovsky <leonro@mellanox.com>,
	Jason Gunthorpe <jgg@mellanox.com>,
	Sasha Levin <sashal@kernel.org>,
	linux-rdma@vger.kernel.org
Subject: [PATCH AUTOSEL 5.2 23/44] IB/mlx5: Fix implicit MR release flow
Date: Tue, 20 Aug 2019 09:40:07 -0400
Message-ID: <20190820134028.10829-23-sashal@kernel.org>
In-Reply-To: <20190820134028.10829-1-sashal@kernel.org>

From: Yishai Hadas <yishaih@mellanox.com>

[ Upstream commit f591822c3cf314442819486f45ff7dc1f690e0c0 ]

Once an implicit MR is being released by ib_umem_notifier_release(), its
leaves are marked as "dying".

However, when dereg_mr()->mlx5_ib_free_implicit_mr()->mr_leaf_free() is
called, it skips running the mr_leaf_free_action (i.e. umem_odp->work)
for leaves that are already marked as "dying".

As a result, ib_umem_release() is never called for those leaves and
their MRs are leaked as well.

When an application exits or is killed without calling dereg_mr, we might
hit the above flow.
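
The leak described above can be sketched as a small userspace model. This
is only an illustration under stated assumptions: struct leaf_model and
the *_model/simulate_* functions are hypothetical names and are not the
kernel's data structures or functions. The model only mirrors the
described behaviour, where the free path skips any leaf that the release
notifier has already marked as "dying".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LEAVES 4

/* Hypothetical stand-in for one leaf of an implicit MR (not a kernel type). */
struct leaf_model {
	bool dying; /* set by the release notifier before the fix */
	bool freed; /* true once the free action has run for this leaf */
};

/* Models the release notifier; before the fix it marks every leaf dying. */
static void notifier_release_model(struct leaf_model *leaves, size_t n,
				   bool mark_dying)
{
	for (size_t i = 0; i < n; i++) {
		if (mark_dying)
			leaves[i].dying = true;
	}
}

/* Models the free path: a leaf already marked dying is skipped. */
static void leaf_free_model(struct leaf_model *leaf)
{
	if (leaf->dying)
		return;
	leaf->dying = true;
	leaf->freed = true; /* stands in for running the free action */
}

/* Returns how many leaves were never freed after notifier + dereg. */
size_t simulate_leaked_leaves(bool notifier_marks_dying)
{
	struct leaf_model leaves[NUM_LEAVES] = { { false, false } };
	size_t leaked = 0;

	notifier_release_model(leaves, NUM_LEAVES, notifier_marks_dying);
	for (size_t i = 0; i < NUM_LEAVES; i++) {
		leaf_free_model(&leaves[i]);
		if (!leaves[i].freed)
			leaked++;
	}
	assert(leaked <= NUM_LEAVES);
	return leaked;
}
```

In this model, simulate_leaked_leaves(true) corresponds to the behaviour
before the fix (every leaf is skipped), and simulate_leaked_leaves(false)
to the notifier no longer setting the mark.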

This fatal scenario is reported by a WARN_ON() in
mlx5_ib_dealloc_ucontext(), because ibcontext->per_mm_list is not empty.
The call trace can be seen below.

Originally, the "dying" mark set in ib_umem_notifier_release() was
introduced to prevent pagefault_mr() from returning a success response
once this happened. However, today we already have the completion
mechanism, so the mark is no longer needed in those flows. Even if a
success response is returned, the firmware will not find the pages and
an error will be returned on the following call, as a released mm will
cause ib_umem_odp_map_dma_pages() to permanently fail mmget_not_zero().
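
The reasoning above can be illustrated with a simplified userspace model of
the -EAGAIN handling in pagefault_mr(). It is a sketch only: the function
names, the "waited" out-parameter, and the -EINVAL value used for the
failed retry are assumptions made for illustration and are not kernel
code.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the -EAGAIN handling before the fix: a non-implicit MR that is
 * marked dying skips the wait and the error becomes -EFAULT.
 */
int eagain_before_fix(bool implicit, bool dying, bool *waited)
{
	if (implicit || !dying) {
		*waited = true;  /* stands in for waiting on the completion */
		return -EAGAIN;
	}
	*waited = false;
	return -EFAULT;          /* "The MR is being killed, kill the QP" */
}

/* Model after the fix: always wait on the completion, keep -EAGAIN. */
int eagain_after_fix(bool *waited)
{
	*waited = true;
	return -EAGAIN;
}

/*
 * Model of the following call: once the mm has been released, taking a
 * reference to it fails, so mapping pages fails on every retry.
 * The concrete errno here is chosen for illustration only.
 */
int retry_map_pages_model(bool mm_alive)
{
	return mm_alive ? 0 : -EINVAL;
}
```

The point of the model is that dropping the "dying" check does not let a
fault succeed against a released mm: the retry still ends in an error.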

Fix the above issue by dropping the "dying" mark from the above flows.
The other flows that use "dying" still need it for their
synchronization purposes.

   WARNING: CPU: 1 PID: 7218 at
   drivers/infiniband/hw/mlx5/main.c:2004
		  mlx5_ib_dealloc_ucontext+0x84/0x90 [mlx5_ib]
   CPU: 1 PID: 7218 Comm: ibv_rc_pingpong Tainted: G     E
	       5.2.0-rc6+ #13
   Call Trace:
   uverbs_destroy_ufile_hw+0xb5/0x120 [ib_uverbs]
   ib_uverbs_close+0x1f/0x80 [ib_uverbs]
   __fput+0xbe/0x250
   task_work_run+0x88/0xa0
   do_exit+0x2cb/0xc30
   ? __fput+0x14b/0x250
   do_group_exit+0x39/0xb0
   get_signal+0x191/0x920
   ? _raw_spin_unlock_bh+0xa/0x20
   ? inet_csk_accept+0x229/0x2f0
   do_signal+0x36/0x5e0
   ? put_unused_fd+0x5b/0x70
   ? __sys_accept4+0x1a6/0x1e0
   ? inet_hash+0x35/0x40
   ? release_sock+0x43/0x90
   ? _raw_spin_unlock_bh+0xa/0x20
   ? inet_listen+0x9f/0x120
   exit_to_usermode_loop+0x5c/0xc6
   do_syscall_64+0x182/0x1b0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support")
Link: https://lore.kernel.org/r/20190805083010.21777-1-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/core/umem_odp.c |  4 ----
 drivers/infiniband/hw/mlx5/odp.c   | 24 +++++++++---------------
 2 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index e4b13a32692a9..5e5f7dd82c50d 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -114,10 +114,6 @@ static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
 	 * prevent any further fault handling on this MR.
 	 */
 	ib_umem_notifier_start_account(umem_odp);
-	umem_odp->dying = 1;
-	/* Make sure that the fact the umem is dying is out before we release
-	 * all pending page faults. */
-	smp_wmb();
 	complete_all(&umem_odp->notifier_completion);
 	umem->context->invalidate_range(umem_odp, ib_umem_start(umem),
 					ib_umem_end(umem));
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 91507a2e92900..d711f3e31fa74 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -581,7 +581,6 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 			u32 flags)
 {
 	int npages = 0, current_seq, page_shift, ret, np;
-	bool implicit = false;
 	struct ib_umem_odp *odp_mr = to_ib_umem_odp(mr->umem);
 	bool downgrade = flags & MLX5_PF_FLAGS_DOWNGRADE;
 	bool prefetch = flags & MLX5_PF_FLAGS_PREFETCH;
@@ -596,7 +595,6 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 		if (IS_ERR(odp))
 			return PTR_ERR(odp);
 		mr = odp->private;
-		implicit = true;
 	} else {
 		odp = odp_mr;
 	}
@@ -684,19 +682,15 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 
 out:
 	if (ret == -EAGAIN) {
-		if (implicit || !odp->dying) {
-			unsigned long timeout =
-				msecs_to_jiffies(MMU_NOTIFIER_TIMEOUT);
-
-			if (!wait_for_completion_timeout(
-					&odp->notifier_completion,
-					timeout)) {
-				mlx5_ib_warn(dev, "timeout waiting for mmu notifier. seq %d against %d. notifiers_count=%d\n",
-					     current_seq, odp->notifiers_seq, odp->notifiers_count);
-			}
-		} else {
-			/* The MR is being killed, kill the QP as well. */
-			ret = -EFAULT;
+		unsigned long timeout = msecs_to_jiffies(MMU_NOTIFIER_TIMEOUT);
+
+		if (!wait_for_completion_timeout(&odp->notifier_completion,
+						 timeout)) {
+			mlx5_ib_warn(
+				dev,
+				"timeout waiting for mmu notifier. seq %d against %d. notifiers_count=%d\n",
+				current_seq, odp->notifiers_seq,
+				odp->notifiers_count);
 		}
 	}
 
-- 
2.20.1
