public inbox for linux-kernel@vger.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: NeilBrown <neilb@suse.de>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Sasha Levin <sashal@kernel.org>,
	linux-nfs@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 5.6 34/50] SUNRPC: defer slow parts of rpc_free_client() to a workqueue.
Date: Thu,  7 May 2020 10:27:10 -0400
Message-ID: <20200507142726.25751-34-sashal@kernel.org>
In-Reply-To: <20200507142726.25751-1-sashal@kernel.org>

From: NeilBrown <neilb@suse.de>

[ Upstream commit 7c4310ff56422ea43418305d22bbc5fe19150ec4 ]

The rpciod workqueue is on the write-out path for freeing dirty memory,
so it is important that it never block waiting for memory to be
allocated - this can lead to a deadlock.

rpc_execute() - which is often called by an rpciod work item - calls
rpc_task_release_client() which can lead to rpc_free_client().

rpc_free_client() makes two calls which could potentially block waiting
for memory allocation.

rpc_clnt_debugfs_unregister() calls into debugfs and will block while
any of the debugfs files are being accessed.  In particular it can block
while any of the 'open' methods are being called and all of these use
malloc for one thing or another.  So this can deadlock if the memory
allocation waits for NFS to complete some writes via rpciod.

rpc_clnt_remove_pipedir() can take the inode_lock() and while it isn't
obvious that memory allocations can happen while the lock is held, it is
safer to assume they might and to not let rpciod call
rpc_clnt_remove_pipedir().

So this patch moves these two calls (together with the final kfree() and
rpciod_down()) into a work-item to be run from the system work-queue.
rpciod can continue its important work, and the final stages of the free
can happen whenever they happen.

I have seen this deadlock on a 4.12-based kernel where debugfs used
synchronize_srcu() when removing objects.  synchronize_srcu() requires a
workqueue and there were no free worker threads and none could be
allocated.  While debugfs no longer uses SRCU, I believe the deadlock
is still possible.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/sunrpc/clnt.h |  8 +++++++-
 net/sunrpc/clnt.c           | 21 +++++++++++++++++----
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index ca7e108248e21..7bd124e06b36f 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -71,7 +71,13 @@ struct rpc_clnt {
 #if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
 	struct dentry		*cl_debugfs;	/* debugfs directory */
 #endif
-	struct rpc_xprt_iter	cl_xpi;
+	/* cl_work is only needed after cl_xpi is no longer used,
+	 * and they are of similar size
+	 */
+	union {
+		struct rpc_xprt_iter	cl_xpi;
+		struct work_struct	cl_work;
+	};
 	const struct cred	*cl_cred;
 };
 
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 7324b21f923e6..a2c215a6980d8 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -880,6 +880,20 @@ EXPORT_SYMBOL_GPL(rpc_shutdown_client);
 /*
  * Free an RPC client
  */
+static void rpc_free_client_work(struct work_struct *work)
+{
+	struct rpc_clnt *clnt = container_of(work, struct rpc_clnt, cl_work);
+
+	/* These might block on processes that might allocate memory,
+	 * so they cannot be called in rpciod, so they are handled separately
+	 * here.
+	 */
+	rpc_clnt_debugfs_unregister(clnt);
+	rpc_clnt_remove_pipedir(clnt);
+
+	kfree(clnt);
+	rpciod_down();
+}
 static struct rpc_clnt *
 rpc_free_client(struct rpc_clnt *clnt)
 {
@@ -890,17 +904,16 @@ rpc_free_client(struct rpc_clnt *clnt)
 			rcu_dereference(clnt->cl_xprt)->servername);
 	if (clnt->cl_parent != clnt)
 		parent = clnt->cl_parent;
-	rpc_clnt_debugfs_unregister(clnt);
-	rpc_clnt_remove_pipedir(clnt);
 	rpc_unregister_client(clnt);
 	rpc_free_iostats(clnt->cl_metrics);
 	clnt->cl_metrics = NULL;
 	xprt_put(rcu_dereference_raw(clnt->cl_xprt));
 	xprt_iter_destroy(&clnt->cl_xpi);
-	rpciod_down();
 	put_cred(clnt->cl_cred);
 	rpc_free_clid(clnt);
-	kfree(clnt);
+
+	INIT_WORK(&clnt->cl_work, rpc_free_client_work);
+	schedule_work(&clnt->cl_work);
 	return parent;
 }
 
-- 
2.20.1
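
For readers unfamiliar with the pattern, here is a minimal userspace sketch
of what the patch does: the fast teardown runs in the caller, and the
potentially blocking steps plus the final free are pushed to a work item
whose storage shares a union with a member that is no longer needed.
Everything in it is a hypothetical stand-in, not kernel API: struct work and
queue_work_item() play the role of work_struct and schedule_work(),
run_pending_work() plays the role of a system workqueue worker, and struct
client is a cut-down rpc_clnt. It assumes a single thread and C11 anonymous
unions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(): recover the
 * enclosing object from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of struct work_struct. */
struct work {
	void (*func)(struct work *work);
	struct work *next;
};

/* Hypothetical single-threaded FIFO standing in for the system workqueue. */
static struct work *pending_head;
static struct work **pending_tail = &pending_head;

/* Stand-in for INIT_WORK() followed by schedule_work(). */
static void queue_work_item(struct work *w, void (*func)(struct work *))
{
	w->func = func;
	w->next = NULL;
	*pending_tail = w;
	pending_tail = &w->next;
}

/* Run everything queued so far; returns the number of items run. */
static int run_pending_work(void)
{
	int n = 0;

	while (pending_head) {
		struct work *w = pending_head;

		pending_head = w->next;
		if (!pending_head)
			pending_tail = &pending_head;
		w->func(w);	/* func may free w; do not touch it afterwards */
		n++;
	}
	return n;
}

struct iter { int cursor; };	/* stand-in for struct rpc_xprt_iter */

struct client {
	int id;
	union {
		struct iter xpi;	/* used while the client is live */
		struct work work;	/* reused once xpi is destroyed */
	};
};

static int slow_cleanups_run;	/* counts deferred teardowns, for illustration */
static int last_freed_id;

/* Deferred half: the blocking steps and the final free live here. */
static void client_free_work(struct work *work)
{
	struct client *c = container_of(work, struct client, work);

	slow_cleanups_run++;	/* blocking teardown would go here */
	last_freed_id = c->id;
	free(c);
}

static struct client *client_alloc(int id)
{
	struct client *c = calloc(1, sizeof(*c));

	if (c)
		c->id = id;
	return c;
}

/* Fast half: never blocks, only hands the object to the queue. */
static void client_free(struct client *c)
{
	assert(c != NULL);
	c->xpi.cursor = 0;	/* last use of xpi; its storage is now free */
	queue_work_item(&c->work, client_free_work);
}
```

A caller would invoke client_free() from the latency-sensitive context and
let a separate worker call run_pending_work() later. The object must not be
touched by the caller after client_free() returns, which mirrors why the
patch moves kfree() into the work function.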

