Re: [PATCH 4/3] locking/lockdep: Improve the deadlock scenario print for sync and read lock

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Boqun Feng <boqun.feng@gmail.com>
To: Waiman Long <longman@redhat.com>
Cc: linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
	kvm@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	seanjc@google.com, Joel Fernandes <joel@joelfernandes.org>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Luczaj <mhal@rbox.co>
Subject: Re: [PATCH 4/3] locking/lockdep: Improve the deadlock scenario print for sync and read lock
Date: Mon, 16 Jan 2023 14:35:33 -0800	[thread overview]
Message-ID: <Y8XRNdrW3t1jvpMF@boqun-archlinux> (raw)
In-Reply-To: <221e35b8-88f5-5fc5-6961-6a8ce060a97b@redhat.com>

On Mon, Jan 16, 2023 at 05:21:09PM -0500, Waiman Long wrote:
> On 1/13/23 18:57, Boqun Feng wrote:
> > Lock scenario print is always a weak spot of lockdep splats. Improvement
> > can be made if we rework the dependency search and the error printing.
> > 
> > However without touching the graph search, we can improve a little for
> > the circular deadlock case, since we have the to-be-added lock
> > dependency, and know whether these two locks are read/write/sync.
> > 
> > In order to know whether a held_lock is sync or not, a bit was
> > "stolen" from ->references, which reduce our limit for the same lock
> > class nesting from 2^12 to 2^11, and it should still be good enough.
> > 
> > Besides, since we now have bit in held_lock for sync, we don't need the
> > "hardirqoffs being 1" trick, and also we can avoid the __lock_release()
> > if we jump out of __lock_acquire() before the held_lock stored.
> > 
> > With these changes, a deadlock case evolved with read lock and sync gets
> > a better print-out from:
> > 
> > 	[...]  Possible unsafe locking scenario:
> > 	[...]
> > 	[...]        CPU0                    CPU1
> > 	[...]        ----                    ----
> > 	[...]   lock(srcuA);
> > 	[...]                                lock(srcuB);
> > 	[...]                                lock(srcuA);
> > 	[...]   lock(srcuB);
> > 
> > to
> > 
> > 	[...]  Possible unsafe locking scenario:
> > 	[...]
> > 	[...]        CPU0                    CPU1
> > 	[...]        ----                    ----
> > 	[...]   rlock(srcuA);
> > 	[...]                                lock(srcuB);
> > 	[...]                                lock(srcuA);
> > 	[...]   sync(srcuB);
> > 
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > ---
> >   include/linux/lockdep.h  |  3 ++-
> >   kernel/locking/lockdep.c | 48 ++++++++++++++++++++++++++--------------
> >   2 files changed, 34 insertions(+), 17 deletions(-)
> > 
> > diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> > index ba09df6a0872..febd7ecc225c 100644
> > --- a/include/linux/lockdep.h
> > +++ b/include/linux/lockdep.h
> > @@ -134,7 +134,8 @@ struct held_lock {
> >   	unsigned int read:2;        /* see lock_acquire() comment */
> >   	unsigned int check:1;       /* see lock_acquire() comment */
> >   	unsigned int hardirqs_off:1;
> > -	unsigned int references:12;					/* 32 bits */
> > +	unsigned int sync:1;
> > +	unsigned int references:11;					/* 32 bits */
> >   	unsigned int pin_count;
> >   };
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index cffa026a765f..4031d87f6829 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -1880,6 +1880,8 @@ print_circular_lock_scenario(struct held_lock *src,
> >   	struct lock_class *source = hlock_class(src);
> >   	struct lock_class *target = hlock_class(tgt);
> >   	struct lock_class *parent = prt->class;
> > +	int src_read = src->read;
> > +	int tgt_read = tgt->read;
> >   	/*
> >   	 * A direct locking problem where unsafe_class lock is taken
> > @@ -1907,7 +1909,10 @@ print_circular_lock_scenario(struct held_lock *src,
> >   	printk(" Possible unsafe locking scenario:\n\n");
> >   	printk("       CPU0                    CPU1\n");
> >   	printk("       ----                    ----\n");
> > -	printk("  lock(");
> > +	if (tgt_read != 0)
> > +		printk("  rlock(");
> > +	else
> > +		printk("  lock(");
> >   	__print_lock_name(target);
> >   	printk(KERN_CONT ");\n");
> >   	printk("                               lock(");
> > @@ -1916,7 +1921,12 @@ print_circular_lock_scenario(struct held_lock *src,
> >   	printk("                               lock(");
> >   	__print_lock_name(target);
> >   	printk(KERN_CONT ");\n");
> > -	printk("  lock(");
> > +	if (src_read != 0)
> > +		printk("  rlock(");
> > +	else if (src->sync)
> > +		printk("  sync(");
> > +	else
> > +		printk("  lock(");
> >   	__print_lock_name(source);
> >   	printk(KERN_CONT ");\n");
> >   	printk("\n *** DEADLOCK ***\n\n");
> 
> src can be sync() but not the target. Is there a reason why that is the
> case?
> 

The functions annotated by sync() don't create real critical sections,
so no lock dependency can be created from a sync(), for example:

	synchronize_srcu(A);
	mutex_lock(B);

no dependency from A to B. In the scenario case, if we see a dependency
target -> source, the target cannot be a lock_sync(). I will add better
documentation later.

> 
> > @@ -4530,7 +4540,13 @@ mark_usage(struct task_struct *curr, struct held_lock *hlock, int check)
> >   					return 0;
> >   		}
> >   	}
> > -	if (!hlock->hardirqs_off) {
> > +
> > +	/*
> > +	 * For lock_sync(), don't mark the ENABLED usage, since lock_sync()
> > +	 * creates no critical section and no extra dependency can be introduced
> > +	 * by interrupts
> > +	 */
> > +	if (!hlock->hardirqs_off && !hlock->sync) {
> >   		if (hlock->read) {
> >   			if (!mark_lock(curr, hlock,
> >   					LOCK_ENABLED_HARDIRQ_READ))
> > @@ -4909,7 +4925,7 @@ static int __lock_is_held(const struct lockdep_map *lock, int read);
> >   static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> >   			  int trylock, int read, int check, int hardirqs_off,
> >   			  struct lockdep_map *nest_lock, unsigned long ip,
> > -			  int references, int pin_count)
> > +			  int references, int pin_count, int sync)
> >   {
> >   	struct task_struct *curr = current;
> >   	struct lock_class *class = NULL;
> > @@ -4960,7 +4976,8 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> >   	class_idx = class - lock_classes;
> > -	if (depth) { /* we're holding locks */
> > +	if (depth && !sync) {
> > +		/* we're holding locks and the new held lock is not a sync */
> >   		hlock = curr->held_locks + depth - 1;
> >   		if (hlock->class_idx == class_idx && nest_lock) {
> >   			if (!references)
> > @@ -4994,6 +5011,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> >   	hlock->trylock = trylock;
> >   	hlock->read = read;
> >   	hlock->check = check;
> > +	hlock->sync = !!sync;
> >   	hlock->hardirqs_off = !!hardirqs_off;
> >   	hlock->references = references;
> >   #ifdef CONFIG_LOCK_STAT
> > @@ -5055,6 +5073,10 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> >   	if (!validate_chain(curr, hlock, chain_head, chain_key))
> >   		return 0;
> > +	/* For lock_sync(), we are done here since no actual critical section */
> > +	if (hlock->sync)
> > +		return 1;
> > +
> >   	curr->curr_chain_key = chain_key;
> >   	curr->lockdep_depth++;
> >   	check_chain_key(curr);
> 
> Even with sync, there is still a corresponding lock_acquire() and
> lock_release(), you can't exit here without increasing lockdep_depth. That
> can cause underflow.
> 

I actually remove the __lock_release() in lock_sync() in this patch, so
I think it's OK. But I must admit the whole submission is to give David
something to see whether the output is an improvement, so I probably
should separate the output changes and the lock_sync() internall into
two patches (and the later can also be folded into the introduction
patch).

Regards,
Boqun

> Cheers,
> Longman
>

next prev parent reply	other threads:[~2023-01-16 22:35 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-13  6:59 [PATCH 0/3] Detect SRCU related deadlocks Boqun Feng
2023-01-13  6:59 ` [PATCH 1/3] locking/lockdep: Introduce lock_sync() Boqun Feng
2023-01-16 21:56   ` Waiman Long
2023-01-13  6:59 ` [PATCH 2/3] rcu: Equip sleepable RCU with lockdep dependency graph checks Boqun Feng
2023-01-13 11:29   ` Paul E. McKenney
2023-01-13 18:05     ` Boqun Feng
2023-01-13 19:11       ` Paul E. McKenney
2023-01-16 17:36         ` Paolo Bonzini
2023-01-16 17:54           ` Boqun Feng
2023-01-16 18:56             ` Paul E. McKenney
2023-01-13 13:03   ` Hillf Danton
2023-01-13 17:58     ` Boqun Feng
2023-01-13 23:58       ` Hillf Danton
2023-01-14  0:17         ` Boqun Feng
2023-01-14  7:18           ` Hillf Danton
2023-01-14  7:32             ` Boqun Feng
2023-01-14 10:26               ` Hillf Danton
2023-01-15  0:18                 ` Boqun Feng
2023-01-16 22:01   ` Waiman Long
2023-01-13  6:59 ` [PATCH 3/3] WIP: locking/lockdep: selftests: Add selftests for SRCU Boqun Feng
2023-01-13 12:46 ` [PATCH 0/3] KVM: Make use of SRCU deadlock detection support David Woodhouse
2023-01-13 12:46   ` [PATCH 1/3] KVM: Show lockdep the kvm->mutex vs. kvm->srcu ordering rule David Woodhouse
2023-01-13 12:46   ` [PATCH 2/3] KVM: selftests: Use enum for test numbers in xen_shinfo_test David Woodhouse
2023-01-13 17:13     ` David Woodhouse
2023-01-13 12:46   ` [PATCH 3/3] KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test David Woodhouse
2023-02-04  2:32     ` Sean Christopherson
2023-02-04  2:34   ` [PATCH 0/3] KVM: Make use of SRCU deadlock detection support Sean Christopherson
2023-01-13 23:57 ` [PATCH 4/3] locking/lockdep: Improve the deadlock scenario print for sync and read lock Boqun Feng
2023-01-16 22:21   ` Waiman Long
2023-01-16 22:35     ` Boqun Feng [this message]
2023-01-17  1:36       ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y8XRNdrW3t1jvpMF@boqun-archlinux \
    --to=boqun.feng@gmail.com \
    --cc=dwmw2@infradead.org \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhal@rbox.co \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=seanjc@google.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.