From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Joe Korty <joe.korty@concurrent-rt.com>,
Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH 5.4 23/60] NFSD: Repair misuse of sv_lock in 5.10.16-rt30.
Date: Mon, 22 Mar 2021 13:28:11 +0100 [thread overview]
Message-ID: <20210322121923.143455870@linuxfoundation.org> (raw)
In-Reply-To: <20210322121922.372583154@linuxfoundation.org>
From: Joe Korty <joe.korty@concurrent-rt.com>
commit c7de87ff9dac5f396f62d584f3908f80ddc0e07b upstream.
[ This problem is in mainline, but only rt has the chops to be
able to detect it. ]
Lockdep reports a circular lock dependency between serv->sv_lock and
softirq_ctl.lock on system shutdown, when using a kernel built with
CONFIG_PREEMPT_RT=y, and a nfs mount exists.
This is due to the definition of spin_lock_bh on rt:
local_bh_disable();
rt_spin_lock(lock);
which forces a softirq_ctl.lock -> serv->sv_lock dependency. This is
not a problem as long as _every_ lock of serv->sv_lock is a:
spin_lock_bh(&serv->sv_lock);
but there is one of the form:
spin_lock(&serv->sv_lock);
This is what is causing the circular dependency splat. The spin_lock()
grabs the lock without first grabbing softirq_ctl.lock via local_bh_disable.
If later on in the critical region, someone does a local_bh_disable, we
get a serv->sv_lock -> softirq_ctrl.lock dependency established. Deadlock.
Fix is to make serv->sv_lock be locked with spin_lock_bh everywhere, no
exceptions.
[ OK ] Stopped target NFS client services.
Stopping Logout off all iSCSI sessions on shutdown...
Stopping NFS server and services...
[ 109.442380]
[ 109.442385] ======================================================
[ 109.442386] WARNING: possible circular locking dependency detected
[ 109.442387] 5.10.16-rt30 #1 Not tainted
[ 109.442389] ------------------------------------------------------
[ 109.442390] nfsd/1032 is trying to acquire lock:
[ 109.442392] ffff994237617f60 ((softirq_ctrl.lock).lock){+.+.}-{2:2}, at: __local_bh_disable_ip+0xd9/0x270
[ 109.442405]
[ 109.442405] but task is already holding lock:
[ 109.442406] ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
[ 109.442415]
[ 109.442415] which lock already depends on the new lock.
[ 109.442415]
[ 109.442416]
[ 109.442416] the existing dependency chain (in reverse order) is:
[ 109.442417]
[ 109.442417] -> #1 (&serv->sv_lock){+.+.}-{0:0}:
[ 109.442421] rt_spin_lock+0x2b/0xc0
[ 109.442428] svc_add_new_perm_xprt+0x42/0xa0
[ 109.442430] svc_addsock+0x135/0x220
[ 109.442434] write_ports+0x4b3/0x620
[ 109.442438] nfsctl_transaction_write+0x45/0x80
[ 109.442440] vfs_write+0xff/0x420
[ 109.442444] ksys_write+0x4f/0xc0
[ 109.442446] do_syscall_64+0x33/0x40
[ 109.442450] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 109.442454]
[ 109.442454] -> #0 ((softirq_ctrl.lock).lock){+.+.}-{2:2}:
[ 109.442457] __lock_acquire+0x1264/0x20b0
[ 109.442463] lock_acquire+0xc2/0x400
[ 109.442466] rt_spin_lock+0x2b/0xc0
[ 109.442469] __local_bh_disable_ip+0xd9/0x270
[ 109.442471] svc_xprt_do_enqueue+0xc0/0x4d0
[ 109.442474] svc_close_list+0x60/0x90
[ 109.442476] svc_close_net+0x49/0x1a0
[ 109.442478] svc_shutdown_net+0x12/0x40
[ 109.442480] nfsd_destroy+0xc5/0x180
[ 109.442482] nfsd+0x1bc/0x270
[ 109.442483] kthread+0x194/0x1b0
[ 109.442487] ret_from_fork+0x22/0x30
[ 109.442492]
[ 109.442492] other info that might help us debug this:
[ 109.442492]
[ 109.442493] Possible unsafe locking scenario:
[ 109.442493]
[ 109.442493] CPU0 CPU1
[ 109.442494] ---- ----
[ 109.442495] lock(&serv->sv_lock);
[ 109.442496] lock((softirq_ctrl.lock).lock);
[ 109.442498] lock(&serv->sv_lock);
[ 109.442499] lock((softirq_ctrl.lock).lock);
[ 109.442501]
[ 109.442501] *** DEADLOCK ***
[ 109.442501]
[ 109.442501] 3 locks held by nfsd/1032:
[ 109.442503] #0: ffffffff93b49258 (nfsd_mutex){+.+.}-{3:3}, at: nfsd+0x19a/0x270
[ 109.442508] #1: ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
[ 109.442512] #2: ffffffff93a81b20 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
[ 109.442518]
[ 109.442518] stack backtrace:
[ 109.442519] CPU: 0 PID: 1032 Comm: nfsd Not tainted 5.10.16-rt30 #1
[ 109.442522] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015
[ 109.442524] Call Trace:
[ 109.442527] dump_stack+0x77/0x97
[ 109.442533] check_noncircular+0xdc/0xf0
[ 109.442546] __lock_acquire+0x1264/0x20b0
[ 109.442553] lock_acquire+0xc2/0x400
[ 109.442564] rt_spin_lock+0x2b/0xc0
[ 109.442570] __local_bh_disable_ip+0xd9/0x270
[ 109.442573] svc_xprt_do_enqueue+0xc0/0x4d0
[ 109.442577] svc_close_list+0x60/0x90
[ 109.442581] svc_close_net+0x49/0x1a0
[ 109.442585] svc_shutdown_net+0x12/0x40
[ 109.442588] nfsd_destroy+0xc5/0x180
[ 109.442590] nfsd+0x1bc/0x270
[ 109.442595] kthread+0x194/0x1b0
[ 109.442600] ret_from_fork+0x22/0x30
[ 109.518225] nfsd: last server has exited, flushing export cache
[ OK ] Stopped NFSv4 ID-name mapping service.
[ OK ] Stopped GSSAPI Proxy Daemon.
[ OK ] Stopped NFS Mount Daemon.
[ OK ] Stopped NFS status monitor for NFSv2/3 locking..
Fixes: 719f8bcc883e ("svcrpc: fix xpt_list traversal locking on shutdown")
Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/sunrpc/svc_xprt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -1072,7 +1072,7 @@ static int svc_close_list(struct svc_ser
struct svc_xprt *xprt;
int ret = 0;
- spin_lock(&serv->sv_lock);
+ spin_lock_bh(&serv->sv_lock);
list_for_each_entry(xprt, xprt_list, xpt_list) {
if (xprt->xpt_net != net)
continue;
@@ -1080,7 +1080,7 @@ static int svc_close_list(struct svc_ser
set_bit(XPT_CLOSE, &xprt->xpt_flags);
svc_xprt_enqueue(xprt);
}
- spin_unlock(&serv->sv_lock);
+ spin_unlock_bh(&serv->sv_lock);
return ret;
}
next prev parent reply other threads:[~2021-03-22 12:49 UTC|newest]
Thread overview: 66+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-22 12:27 [PATCH 5.4 00/60] 5.4.108-rc1 review Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 01/60] ASoC: ak4458: Add MODULE_DEVICE_TABLE Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 02/60] ASoC: ak5558: " Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 03/60] ALSA: dice: fix null pointer dereference when node is disconnected Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 04/60] ALSA: hda/realtek: apply pin quirk for XiaomiNotebook Pro Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 05/60] ALSA: hda: generic: Fix the micmute led init state Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 06/60] ALSA: hda/realtek: Apply headset-mic quirks for Xiaomi Redmibook Air Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 07/60] Revert "PM: runtime: Update device status before letting suppliers suspend" Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 08/60] s390/vtime: fix increased steal time accounting Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 09/60] ARM: 9030/1: entry: omit FP emulation for UND exceptions taken in kernel mode Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 10/60] ARM: 9044/1: vfp: use undef hook for VFP support detection Greg Kroah-Hartman
2021-03-22 12:27 ` [PATCH 5.4 11/60] btrfs: fix race when cloning extent buffer during rewind of an old root Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 12/60] btrfs: fix slab cache flags for free space tree bitmap Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 13/60] ASoC: fsl_ssi: Fix TDM slot setup for I2S mode Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 14/60] ASoC: SOF: Intel: unregister DMIC device on probe error Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 15/60] ASoC: SOF: intel: fix wrong poll bits in dsp power down Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 16/60] ASoC: simple-card-utils: Do not handle device clock Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 17/60] afs: Stop listxattr() from listing "afs.*" attributes Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 18/60] nvme: fix Write Zeroes limitations Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 19/60] nvme-tcp: fix possible hang when failing to set io queues Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 20/60] nvme-tcp: fix a NULL deref when receiving a 0-length r2t PDU Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 21/60] nvmet: dont check iosqes,iocqes for discovery controllers Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 22/60] nfsd: Dont keep looking up unhashed files in the nfsd file cache Greg Kroah-Hartman
2021-03-22 12:28 ` Greg Kroah-Hartman [this message]
2021-03-22 12:28 ` [PATCH 5.4 24/60] svcrdma: disable timeouts on rdma backchannel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 25/60] vfio: IOMMU_API should be selected Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 26/60] sunrpc: fix refcount leak for rpc auth modules Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 27/60] net/qrtr: fix __netdev_alloc_skb call Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 28/60] kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL again Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 29/60] riscv: Correct SPARSEMEM configuration Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 30/60] scsi: lpfc: Fix some error codes in debugfs Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 31/60] scsi: myrs: Fix a double free in myrs_cleanup() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 32/60] counter: stm32-timer-cnt: Report count function when SLAVE_MODE_DISABLED Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 33/60] nvme-rdma: fix possible hang when failing to set io queues Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 34/60] usb-storage: Add quirk to defeat Kindles automatic unload Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 35/60] usbip: Fix incorrect double assignment to udc->ud.tcp_rx Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 36/60] USB: replace hardcode maximum usb string length by definition Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 37/60] usb: gadget: configfs: Fix KASAN use-after-free Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 38/60] usb: typec: tcpm: Invoke power_supply_changed for tcpm-source-psy- Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 39/60] iio:adc:stm32-adc: Add HAS_IOMEM dependency Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 40/60] iio:adc:qcom-spmi-vadc: add default scale to LR_MUX2_BAT_ID channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 41/60] iio: adis16400: Fix an error code in adis16400_initial_setup() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 42/60] iio: gyro: mpu3050: Fix error handling in mpu3050_trigger_handler Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 43/60] iio: adc: ad7949: fix wrong ADC result due to incorrect bit mask Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 44/60] iio: hid-sensor-humidity: Fix alignment issue of timestamp channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 45/60] iio: hid-sensor-prox: Fix scale not correct issue Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 46/60] iio: hid-sensor-temperature: Fix issues of timestamp channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 47/60] counter: stm32-timer-cnt: fix ceiling write max value Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 48/60] PCI: rpadlpar: Fix potential drc_name corruption in store functions Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 49/60] perf/x86/intel: Fix a crash caused by zero PEBS status Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 50/60] x86/ioapic: Ignore IRQ2 again Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 51/60] kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 52/60] x86: Move TS_COMPAT back to asm/thread_info.h Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 53/60] x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 54/60] ext4: find old entry again if failed to rename whiteout Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 55/60] ext4: do not try to set xattr into ea_inode if value is empty Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 56/60] ext4: fix potential error in ext4_do_update_inode Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 57/60] efi: use 32-bit alignment for efi_guid_t literals Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 58/60] firmware/efi: Fix a use after bug in efi_mem_reserve_persistent Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 59/60] genirq: Disable interrupts for force threaded handlers Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 5.4 60/60] x86/apic/of: Fix CPU devicetree-node lookups Greg Kroah-Hartman
2021-03-22 14:35 ` [PATCH 5.4 00/60] 5.4.108-rc1 review Jon Hunter
2021-03-22 18:01 ` Florian Fainelli
2021-03-22 21:53 ` Guenter Roeck
2021-03-23 0:51 ` Samuel Zou
2021-03-23 10:20 ` Naresh Kamboju
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210322121923.143455870@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=chuck.lever@oracle.com \
--cc=joe.korty@concurrent-rt.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).