stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Joe Korty <joe.korty@concurrent-rt.com>,
	Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH 4.19 15/43] NFSD: Repair misuse of sv_lock in 5.10.16-rt30.
Date: Mon, 22 Mar 2021 13:28:29 +0100	[thread overview]
Message-ID: <20210322121920.428427760@linuxfoundation.org> (raw)
In-Reply-To: <20210322121919.936671417@linuxfoundation.org>

From: Joe Korty <joe.korty@concurrent-rt.com>

commit c7de87ff9dac5f396f62d584f3908f80ddc0e07b upstream.

[ This problem is in mainline, but only rt has the chops to be
able to detect it. ]

Lockdep reports a circular lock dependency between serv->sv_lock and
softirq_ctl.lock on system shutdown, when using a kernel built with
CONFIG_PREEMPT_RT=y, and a nfs mount exists.

This is due to the definition of spin_lock_bh on rt:

	local_bh_disable();
	rt_spin_lock(lock);

which forces a softirq_ctl.lock -> serv->sv_lock dependency.  This is
not a problem as long as _every_ lock of serv->sv_lock is a:

	spin_lock_bh(&serv->sv_lock);

but there is one of the form:

	spin_lock(&serv->sv_lock);

This is what is causing the circular dependency splat.  The spin_lock()
grabs the lock without first grabbing softirq_ctl.lock via local_bh_disable.
If later on in the critical region,  someone does a local_bh_disable, we
get a serv->sv_lock -> softirq_ctrl.lock dependency established.  Deadlock.

Fix is to make serv->sv_lock be locked with spin_lock_bh everywhere, no
exceptions.

[  OK  ] Stopped target NFS client services.
         Stopping Logout off all iSCSI sessions on shutdown...
         Stopping NFS server and services...
[  109.442380]
[  109.442385] ======================================================
[  109.442386] WARNING: possible circular locking dependency detected
[  109.442387] 5.10.16-rt30 #1 Not tainted
[  109.442389] ------------------------------------------------------
[  109.442390] nfsd/1032 is trying to acquire lock:
[  109.442392] ffff994237617f60 ((softirq_ctrl.lock).lock){+.+.}-{2:2}, at: __local_bh_disable_ip+0xd9/0x270
[  109.442405]
[  109.442405] but task is already holding lock:
[  109.442406] ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
[  109.442415]
[  109.442415] which lock already depends on the new lock.
[  109.442415]
[  109.442416]
[  109.442416] the existing dependency chain (in reverse order) is:
[  109.442417]
[  109.442417] -> #1 (&serv->sv_lock){+.+.}-{0:0}:
[  109.442421]        rt_spin_lock+0x2b/0xc0
[  109.442428]        svc_add_new_perm_xprt+0x42/0xa0
[  109.442430]        svc_addsock+0x135/0x220
[  109.442434]        write_ports+0x4b3/0x620
[  109.442438]        nfsctl_transaction_write+0x45/0x80
[  109.442440]        vfs_write+0xff/0x420
[  109.442444]        ksys_write+0x4f/0xc0
[  109.442446]        do_syscall_64+0x33/0x40
[  109.442450]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  109.442454]
[  109.442454] -> #0 ((softirq_ctrl.lock).lock){+.+.}-{2:2}:
[  109.442457]        __lock_acquire+0x1264/0x20b0
[  109.442463]        lock_acquire+0xc2/0x400
[  109.442466]        rt_spin_lock+0x2b/0xc0
[  109.442469]        __local_bh_disable_ip+0xd9/0x270
[  109.442471]        svc_xprt_do_enqueue+0xc0/0x4d0
[  109.442474]        svc_close_list+0x60/0x90
[  109.442476]        svc_close_net+0x49/0x1a0
[  109.442478]        svc_shutdown_net+0x12/0x40
[  109.442480]        nfsd_destroy+0xc5/0x180
[  109.442482]        nfsd+0x1bc/0x270
[  109.442483]        kthread+0x194/0x1b0
[  109.442487]        ret_from_fork+0x22/0x30
[  109.442492]
[  109.442492] other info that might help us debug this:
[  109.442492]
[  109.442493]  Possible unsafe locking scenario:
[  109.442493]
[  109.442493]        CPU0                    CPU1
[  109.442494]        ----                    ----
[  109.442495]   lock(&serv->sv_lock);
[  109.442496]                                lock((softirq_ctrl.lock).lock);
[  109.442498]                                lock(&serv->sv_lock);
[  109.442499]   lock((softirq_ctrl.lock).lock);
[  109.442501]
[  109.442501]  *** DEADLOCK ***
[  109.442501]
[  109.442501] 3 locks held by nfsd/1032:
[  109.442503]  #0: ffffffff93b49258 (nfsd_mutex){+.+.}-{3:3}, at: nfsd+0x19a/0x270
[  109.442508]  #1: ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
[  109.442512]  #2: ffffffff93a81b20 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
[  109.442518]
[  109.442518] stack backtrace:
[  109.442519] CPU: 0 PID: 1032 Comm: nfsd Not tainted 5.10.16-rt30 #1
[  109.442522] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015
[  109.442524] Call Trace:
[  109.442527]  dump_stack+0x77/0x97
[  109.442533]  check_noncircular+0xdc/0xf0
[  109.442546]  __lock_acquire+0x1264/0x20b0
[  109.442553]  lock_acquire+0xc2/0x400
[  109.442564]  rt_spin_lock+0x2b/0xc0
[  109.442570]  __local_bh_disable_ip+0xd9/0x270
[  109.442573]  svc_xprt_do_enqueue+0xc0/0x4d0
[  109.442577]  svc_close_list+0x60/0x90
[  109.442581]  svc_close_net+0x49/0x1a0
[  109.442585]  svc_shutdown_net+0x12/0x40
[  109.442588]  nfsd_destroy+0xc5/0x180
[  109.442590]  nfsd+0x1bc/0x270
[  109.442595]  kthread+0x194/0x1b0
[  109.442600]  ret_from_fork+0x22/0x30
[  109.518225] nfsd: last server has exited, flushing export cache
[  OK  ] Stopped NFSv4 ID-name mapping service.
[  OK  ] Stopped GSSAPI Proxy Daemon.
[  OK  ] Stopped NFS Mount Daemon.
[  OK  ] Stopped NFS status monitor for NFSv2/3 locking..

Fixes: 719f8bcc883e ("svcrpc: fix xpt_list traversal locking on shutdown")
Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sunrpc/svc_xprt.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -1052,7 +1052,7 @@ static int svc_close_list(struct svc_ser
 	struct svc_xprt *xprt;
 	int ret = 0;
 
-	spin_lock(&serv->sv_lock);
+	spin_lock_bh(&serv->sv_lock);
 	list_for_each_entry(xprt, xprt_list, xpt_list) {
 		if (xprt->xpt_net != net)
 			continue;
@@ -1060,7 +1060,7 @@ static int svc_close_list(struct svc_ser
 		set_bit(XPT_CLOSE, &xprt->xpt_flags);
 		svc_xprt_enqueue(xprt);
 	}
-	spin_unlock(&serv->sv_lock);
+	spin_unlock_bh(&serv->sv_lock);
 	return ret;
 }
 



  parent reply	other threads:[~2021-03-22 12:53 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-22 12:28 [PATCH 4.19 00/43] 4.19.183-rc1 review Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 01/43] ASoC: ak4458: Add MODULE_DEVICE_TABLE Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 02/43] ASoC: ak5558: " Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 03/43] ALSA: hda: generic: Fix the micmute led init state Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 04/43] Revert "PM: runtime: Update device status before letting suppliers suspend" Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 05/43] vmlinux.lds.h: Create section for protection against instrumentation Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 06/43] lkdtm: dont move ctors to .rodata Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 07/43] perf tools: Use %define api.pure full instead of %pure-parser Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 08/43] tools build feature: Check if get_current_dir_name() is available Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 09/43] tools build feature: Check if eventfd() " Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 10/43] tools build: Check if gettid() is available before providing helper Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 11/43] btrfs: fix race when cloning extent buffer during rewind of an old root Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 12/43] btrfs: fix slab cache flags for free space tree bitmap Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 13/43] ASoC: fsl_ssi: Fix TDM slot setup for I2S mode Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 14/43] nvmet: dont check iosqes,iocqes for discovery controllers Greg Kroah-Hartman
2021-03-22 12:28 ` Greg Kroah-Hartman [this message]
2021-03-22 12:28 ` [PATCH 4.19 16/43] svcrdma: disable timeouts on rdma backchannel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 17/43] sunrpc: fix refcount leak for rpc auth modules Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 18/43] net/qrtr: fix __netdev_alloc_skb call Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 19/43] scsi: lpfc: Fix some error codes in debugfs Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 20/43] nvme-rdma: fix possible hang when failing to set io queues Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 21/43] powerpc: Force inlining of cpu_has_feature() to avoid build failure Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 22/43] usb-storage: Add quirk to defeat Kindles automatic unload Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 23/43] usbip: Fix incorrect double assignment to udc->ud.tcp_rx Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 24/43] USB: replace hardcode maximum usb string length by definition Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 25/43] usb: gadget: configfs: Fix KASAN use-after-free Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 26/43] iio:adc:stm32-adc: Add HAS_IOMEM dependency Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 27/43] iio:adc:qcom-spmi-vadc: add default scale to LR_MUX2_BAT_ID channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 28/43] iio: adis16400: Fix an error code in adis16400_initial_setup() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 29/43] iio: gyro: mpu3050: Fix error handling in mpu3050_trigger_handler Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 30/43] iio: hid-sensor-humidity: Fix alignment issue of timestamp channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 31/43] iio: hid-sensor-prox: Fix scale not correct issue Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 32/43] iio: hid-sensor-temperature: Fix issues of timestamp channel Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 33/43] PCI: rpadlpar: Fix potential drc_name corruption in store functions Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 34/43] perf/x86/intel: Fix a crash caused by zero PEBS status Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 35/43] x86/ioapic: Ignore IRQ2 again Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 36/43] kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 37/43] x86: Move TS_COMPAT back to asm/thread_info.h Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 38/43] x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall() Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 39/43] ext4: find old entry again if failed to rename whiteout Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 40/43] ext4: do not try to set xattr into ea_inode if value is empty Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 41/43] ext4: fix potential error in ext4_do_update_inode Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 42/43] genirq: Disable interrupts for force threaded handlers Greg Kroah-Hartman
2021-03-22 12:28 ` [PATCH 4.19 43/43] x86/apic/of: Fix CPU devicetree-node lookups Greg Kroah-Hartman
2021-03-22 14:35 ` [PATCH 4.19 00/43] 4.19.183-rc1 review Jon Hunter
2021-03-22 20:14 ` Pavel Machek
2021-03-22 21:53 ` Guenter Roeck
2021-03-23  7:17 ` Samuel Zou
2021-03-23 10:31 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210322121920.428427760@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=chuck.lever@oracle.com \
    --cc=joe.korty@concurrent-rt.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).