All of lore.kernel.org
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@redhat.com>
To: Joe Korty <joe.korty@concurrent-rt.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Sebastian Siewior <bigeasy@linutronix.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Chuck Lever <chuck.lever@oracle.com>,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH] Repair misuse of sv_lock in 5.10.16-rt30.
Date: Fri, 26 Feb 2021 10:00:36 -0500	[thread overview]
Message-ID: <YDkNFKmzb7rbumYf@pick.fieldses.org> (raw)
In-Reply-To: <20210226143820.GA49043@zipoli.concurrent-rt.com>

Adding Chuck, linux-nfs.

Makes sense to me.--b.

On Fri, Feb 26, 2021 at 09:38:20AM -0500, Joe Korty wrote:
> Repair misuse of sv_lock in 5.10.16-rt30.
> 
> [ This problem is in mainline, but only rt has the chops to be
> able to detect it. ]
> 
> Lockdep reports a circular lock dependency between serv->sv_lock and
> softirq_ctl.lock on system shutdown, when using a kernel built with
> CONFIG_PREEMPT_RT=y, and a nfs mount exists.
> 
> This is due to the definition of spin_lock_bh on rt:
> 
> 	local_bh_disable();
> 	rt_spin_lock(lock);
> 
> which forces a softirq_ctl.lock -> serv->sv_lock dependency.  This is
> not a problem as long as _every_ lock of serv->sv_lock is a:
> 
> 	spin_lock_bh(&serv->sv_lock);
> 
> but there is one of the form:
> 
> 	spin_lock(&serv->sv_lock);
> 
> This is what is causing the circular dependency splat.  The spin_lock()
> grabs the lock without first grabbing softirq_ctl.lock via local_bh_disable.
> If later on in the critical region,  someone does a local_bh_disable, we
> get a serv->sv_lock -> softirq_ctrl.lock dependency established.  Deadlock.
> 
> Fix is to make serv->sv_lock be locked with spin_lock_bh everywhere, no
> exceptions.
> 
> Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
> 
> 
> 
> 
> [  OK  ] Stopped target NFS client services.
>          Stopping Logout off all iSCSI sessions on shutdown...
>          Stopping NFS server and services...
> [  109.442380] 
> [  109.442385] ======================================================
> [  109.442386] WARNING: possible circular locking dependency detected
> [  109.442387] 5.10.16-rt30 #1 Not tainted
> [  109.442389] ------------------------------------------------------
> [  109.442390] nfsd/1032 is trying to acquire lock:
> [  109.442392] ffff994237617f60 ((softirq_ctrl.lock).lock){+.+.}-{2:2}, at: __local_bh_disable_ip+0xd9/0x270
> [  109.442405] 
> [  109.442405] but task is already holding lock:
> [  109.442406] ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
> [  109.442415] 
> [  109.442415] which lock already depends on the new lock.
> [  109.442415] 
> [  109.442416] 
> [  109.442416] the existing dependency chain (in reverse order) is:
> [  109.442417] 
> [  109.442417] -> #1 (&serv->sv_lock){+.+.}-{0:0}:
> [  109.442421]        rt_spin_lock+0x2b/0xc0
> [  109.442428]        svc_add_new_perm_xprt+0x42/0xa0
> [  109.442430]        svc_addsock+0x135/0x220
> [  109.442434]        write_ports+0x4b3/0x620
> [  109.442438]        nfsctl_transaction_write+0x45/0x80
> [  109.442440]        vfs_write+0xff/0x420
> [  109.442444]        ksys_write+0x4f/0xc0
> [  109.442446]        do_syscall_64+0x33/0x40
> [  109.442450]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  109.442454] 
> [  109.442454] -> #0 ((softirq_ctrl.lock).lock){+.+.}-{2:2}:
> [  109.442457]        __lock_acquire+0x1264/0x20b0
> [  109.442463]        lock_acquire+0xc2/0x400
> [  109.442466]        rt_spin_lock+0x2b/0xc0
> [  109.442469]        __local_bh_disable_ip+0xd9/0x270
> [  109.442471]        svc_xprt_do_enqueue+0xc0/0x4d0
> [  109.442474]        svc_close_list+0x60/0x90
> [  109.442476]        svc_close_net+0x49/0x1a0
> [  109.442478]        svc_shutdown_net+0x12/0x40
> [  109.442480]        nfsd_destroy+0xc5/0x180
> [  109.442482]        nfsd+0x1bc/0x270
> [  109.442483]        kthread+0x194/0x1b0
> [  109.442487]        ret_from_fork+0x22/0x30
> [  109.442492] 
> [  109.442492] other info that might help us debug this:
> [  109.442492] 
> [  109.442493]  Possible unsafe locking scenario:
> [  109.442493] 
> [  109.442493]        CPU0                    CPU1
> [  109.442494]        ----                    ----
> [  109.442495]   lock(&serv->sv_lock);
> [  109.442496]                                lock((softirq_ctrl.lock).lock);
> [  109.442498]                                lock(&serv->sv_lock);
> [  109.442499]   lock((softirq_ctrl.lock).lock);
> [  109.442501] 
> [  109.442501]  *** DEADLOCK ***
> [  109.442501] 
> [  109.442501] 3 locks held by nfsd/1032:
> [  109.442503]  #0: ffffffff93b49258 (nfsd_mutex){+.+.}-{3:3}, at: nfsd+0x19a/0x270
> [  109.442508]  #1: ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
> [  109.442512]  #2: ffffffff93a81b20 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
> [  109.442518] 
> [  109.442518] stack backtrace:
> [  109.442519] CPU: 0 PID: 1032 Comm: nfsd Not tainted 5.10.16-rt30 #1
> [  109.442522] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015
> [  109.442524] Call Trace:
> [  109.442527]  dump_stack+0x77/0x97
> [  109.442533]  check_noncircular+0xdc/0xf0
> [  109.442546]  __lock_acquire+0x1264/0x20b0
> [  109.442553]  lock_acquire+0xc2/0x400
> [  109.442564]  rt_spin_lock+0x2b/0xc0
> [  109.442570]  __local_bh_disable_ip+0xd9/0x270
> [  109.442573]  svc_xprt_do_enqueue+0xc0/0x4d0
> [  109.442577]  svc_close_list+0x60/0x90
> [  109.442581]  svc_close_net+0x49/0x1a0
> [  109.442585]  svc_shutdown_net+0x12/0x40
> [  109.442588]  nfsd_destroy+0xc5/0x180
> [  109.442590]  nfsd+0x1bc/0x270
> [  109.442595]  kthread+0x194/0x1b0
> [  109.442600]  ret_from_fork+0x22/0x30
> [  109.518225] nfsd: last server has exited, flushing export cache
> [  OK  ] Stopped NFSv4 ID-name mapping service.
> [  OK  ] Stopped GSSAPI Proxy Daemon.
> [  OK  ] Stopped NFS Mount Daemon.
> [  OK  ] Stopped NFS status monitor for NFSv2/3 locking..
> Index: b/net/sunrpc/svc_xprt.c
> ===================================================================
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -1062,7 +1062,7 @@ static int svc_close_list(struct svc_ser
>  	struct svc_xprt *xprt;
>  	int ret = 0;
>  
> -	spin_lock(&serv->sv_lock);
> +	spin_lock_bh(&serv->sv_lock);
>  	list_for_each_entry(xprt, xprt_list, xpt_list) {
>  		if (xprt->xpt_net != net)
>  			continue;
> @@ -1070,7 +1070,7 @@ static int svc_close_list(struct svc_ser
>  		set_bit(XPT_CLOSE, &xprt->xpt_flags);
>  		svc_xprt_enqueue(xprt);
>  	}
> -	spin_unlock(&serv->sv_lock);
> +	spin_unlock_bh(&serv->sv_lock);
>  	return ret;
>  }
>  
> 


  reply	other threads:[~2021-02-26 15:02 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-26 14:38 [PATCH] Repair misuse of sv_lock in 5.10.16-rt30 Joe Korty
2021-02-26 15:00 ` J. Bruce Fields [this message]
2021-02-26 15:15   ` Chuck Lever
2021-02-26 15:19     ` Joe Korty
2021-02-26 15:21       ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YDkNFKmzb7rbumYf@pick.fieldses.org \
    --to=bfields@redhat.com \
    --cc=bigeasy@linutronix.de \
    --cc=chuck.lever@oracle.com \
    --cc=joe.korty@concurrent-rt.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.