linux-wireless.vger.kernel.org archive mirror
* Re: deadlock in synchronize_srcu() in debugfs?
       [not found] ` <1490282991.2766.7.camel@sipsolutions.net>
@ 2017-03-24  8:56   ` Johannes Berg
  2017-03-24  9:24     ` Johannes Berg
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Berg @ 2017-03-24  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nicolai Stange, Paul E. McKenney, gregkh, sharon.dvir,
	Peter Zijlstra, Ingo Molnar, linux-wireless

On Thu, 2017-03-23 at 16:29 +0100, Johannes Berg wrote:
> Isn't it possible for the following to happen?
> 
> CPU1					CPU2
> 
> mutex_lock(&M);
> 					full_proxy_xyz();
> 					srcu_read_lock(&debugfs_srcu);
> 					real_fops->xyz();
> 					mutex_lock(&M);
> debugfs_remove(F);
> synchronize_srcu(&debugfs_srcu);


So I'm pretty sure that this can happen. I'm not convinced that it's
happening here, but still.

I tried to make lockdep flag it, but the only way I could get it to
flag it was to do this:

--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -235,7 +235,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 	preempt_disable();
 	retval = __srcu_read_lock(sp);
 	preempt_enable();
-	rcu_lock_acquire(&(sp)->dep_map);
+	lock_map_acquire(&(sp)->dep_map);
 	return retval;
 }
 
@@ -249,7 +249,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
 	__releases(sp)
 {
-	rcu_lock_release(&(sp)->dep_map);
+	lock_map_release(&(sp)->dep_map);
 	__srcu_read_unlock(sp, idx);
 }
 
diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index ef3bcfb15b39..0f9e542ca3f2 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -395,6 +395,9 @@ static void __synchronize_srcu(struct srcu_struct *sp, int trycount)
 			 lock_is_held(&rcu_sched_lock_map),
 			 "Illegal synchronize_srcu() in same-type SRCU (or in RCU) read-side critical section");
 
+	lock_map_acquire(&sp->dep_map);
+	lock_map_release(&sp->dep_map);
+
 	might_sleep();
 	init_completion(&rcu.completion);
 

The lock_map_acquire() in srcu_read_lock() is really not desired
though, since it would make read-side recursion get flagged as a
deadlock. If I change it to lock_map_acquire_read(), though, the
problem doesn't get flagged for some reason; I thought it should.


Regardless though, I don't see a way to solve this problem for debugfs.
We have a ton of debugfs files in net/mac80211/debugfs.c that need to
acquire e.g. the RTNL (or other locks), and I'm not sure we can easily
avoid removing the debugfs files under the RTNL, since we get all our
configuration callbacks with the RTNL already held...

Need to think about that, but perhaps there's some other solution?

johannes

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: deadlock in synchronize_srcu() in debugfs?
  2017-03-24  8:56   ` deadlock in synchronize_srcu() in debugfs? Johannes Berg
@ 2017-03-24  9:24     ` Johannes Berg
  2017-03-24 17:45       ` Paul E. McKenney
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Berg @ 2017-03-24  9:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nicolai Stange, Paul E. McKenney, gregkh, sharon.dvir,
	Peter Zijlstra, Ingo Molnar, linux-wireless

Hi,

On Fri, 2017-03-24 at 09:56 +0100, Johannes Berg wrote:
> On Thu, 2017-03-23 at 16:29 +0100, Johannes Berg wrote:
> > Isn't it possible for the following to happen?
> > 
> > CPU1					CPU2
> > 
> > mutex_lock(&M); // acquires mutex
> > 					full_proxy_xyz();
> > 					srcu_read_lock(&debugfs_srcu);
> > 					real_fops->xyz();
> > 					mutex_lock(&M); // waiting for mutex
> > debugfs_remove(F);
> > synchronize_srcu(&debugfs_srcu);

> So I'm pretty sure that this can happen. I'm not convinced that it's
> happening here, but still.

I'm a bit confused, in that SRCU, of course, doesn't wait until all the
readers are done - that'd be a regular reader/writer lock or something.

However, it does (have to) wait until all the currently active read-
side sections have terminated, which still leads to a deadlock in the
example above, I think?

In his 2006 LWN article Paul wrote:

    The designer of a given subsystem is responsible for: (1) ensuring
    that SRCU read-side sleeping is bounded and (2) limiting the amount
    of memory waiting for synchronize_srcu(). [1]

In the case of debugfs files acquiring locks, (1) can't really be
guaranteed, especially if those locks can be held while doing
synchronize_srcu() [via debugfs_remove], so I still think the lockdep
annotation needs to be changed to at least have some annotation at
synchronize_srcu() time so we can detect this.

Now, I still suspect there's some other bug here in the case that I'm
seeing, because I don't actually see the "mutex_lock(&M); // waiting"
piece in the traces. I'll need to run this with some tracing on Monday
when the test guys are back from the weekend.

I'm also not sure how I can possibly fix this in debugfs in mac80211
and friends, but that's perhaps a different story. Clearly, this
debugfs patch is a good thing - the code will likely have had use-
after-free problems in this situation without it. But flagging the
potential deadlocks would make it a lot easier to find them.

johannes


* Re: deadlock in synchronize_srcu() in debugfs?
  2017-03-24  9:24     ` Johannes Berg
@ 2017-03-24 17:45       ` Paul E. McKenney
  2017-03-24 18:51         ` Johannes Berg
  0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2017-03-24 17:45 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Nicolai Stange, gregkh, sharon.dvir, Peter Zijlstra,
	Ingo Molnar, linux-wireless

On Fri, Mar 24, 2017 at 10:24:46AM +0100, Johannes Berg wrote:
> Hi,
> 
> On Fri, 2017-03-24 at 09:56 +0100, Johannes Berg wrote:
> > On Thu, 2017-03-23 at 16:29 +0100, Johannes Berg wrote:
> > > Isn't it possible for the following to happen?
> > > 
> > > CPU1					CPU2
> > > 
> > > mutex_lock(&M); // acquires mutex
> > > 					full_proxy_xyz();
> > > 					srcu_read_lock(&debugfs_srcu);
> > > 					real_fops->xyz();
> > > 					mutex_lock(&M); // waiting for mutex
> > > debugfs_remove(F);
> > > synchronize_srcu(&debugfs_srcu);
> 
> > So I'm pretty sure that this can happen. I'm not convinced that it's
> > happening here, but still.
> 
> I'm a bit confused, in that SRCU, of course, doesn't wait until all the
> readers are done - that'd be a regular reader/writer lock or something.

Agreed, synchronize_srcu() does not have to wait for new readers
(as a reader/writer lock would), but it -does- have to wait for
pre-existing readers, like the one shown in your example above.

> However, it does (have to) wait until all the currently active read-
> side sections have terminated, which still leads to a deadlock in the
> example above, I think?

Yes.  CPU2 has a pre-existing reader that CPU1's synchronize_srcu()
must wait for.  But CPU2's reader cannot end until CPU1 releases
its lock, which it cannot do until after CPU2's reader ends.  Thus,
as you say, deadlock.

The rule is that if you are within any kind of RCU read-side critical
section, you cannot directly or indirectly wait for a grace period from
that same RCU flavor.

> In his 2006 LWN article Paul wrote:
> 
>     The designer of a given subsystem is responsible for: (1) ensuring
>     that SRCU read-side sleeping is bounded and (2) limiting the amount
>     of memory waiting for synchronize_srcu(). [1]
> 
> In the case of debugfs files acquiring locks, (1) can't really be
> guaranteed, especially if those locks can be held while doing
> synchronize_srcu() [via debugfs_remove], so I still think the lockdep
> annotation needs to be changed to at least have some annotation at
> synchronize_srcu() time so we can detect this.

That would be very nice!

There are some challenges, though.  This is OK:

	CPU1				CPU2
	i = srcu_read_lock(&mysrcu);	mutex_lock(&my_lock);
	mutex_lock(&my_lock);		i = srcu_read_lock(&mysrcu);
	srcu_read_unlock(&mysrcu, i);	mutex_unlock(&my_lock);
	mutex_unlock(&my_lock);		srcu_read_unlock(&mysrcu, i);

	CPU3
	synchronize_srcu(&mysrcu);

This could be a deadlock for reader-writer locking, but not for SRCU.

This is also OK:

	CPU1				CPU2
	i = srcu_read_lock(&mysrcu);	mutex_lock(&my_lock);
	mutex_lock(&my_lock);		synchronize_srcu(&yoursrcu);
	srcu_read_unlock(&mysrcu, i);	mutex_unlock(&my_lock);
	mutex_unlock(&my_lock);

Here CPU1's read-side critical sections are for mysrcu, which is
independent of CPU2's grace period for yoursrcu.

So you could flag any lockdep cycle that contained a reader and a
synchronous grace period for the same flavor of RCU, where for SRCU the
identity of the srcu_struct structure is part of the flavor.

> Now, I still suspect there's some other bug here in the case that I'm
> seeing, because I don't actually see the "mutex_lock(&M); // waiting"
> piece in the traces. I'll need to run this with some tracing on Monday
> when the test guys are back from the weekend.
> 
> I'm also not sure how I can possibly fix this in debugfs in mac80211
> and friends, but that's perhaps a different story. Clearly, this
> debugfs patch is a good thing - the code will likely have had use-
> after-free problems in this situation without it. But flagging the
> potential deadlocks would make it a lot easier to find them.

No argument here!

							Thanx, Paul


* Re: deadlock in synchronize_srcu() in debugfs?
  2017-03-24 17:45       ` Paul E. McKenney
@ 2017-03-24 18:51         ` Johannes Berg
  2017-03-24 19:33           ` Paul E. McKenney
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Berg @ 2017-03-24 18:51 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, Nicolai Stange, gregkh, sharon.dvir, Peter Zijlstra,
	Ingo Molnar, linux-wireless


> Yes.  CPU2 has a pre-existing reader that CPU1's synchronize_srcu()
> must wait for.  But CPU2's reader cannot end until CPU1 releases
> its lock, which it cannot do until after CPU2's reader ends.  Thus,
> as you say, deadlock.
> 
> The rule is that if you are within any kind of RCU read-side critical
> section, you cannot directly or indirectly wait for a grace period
> from that same RCU flavor.

Right. This is indirect then, in a way.

> There are some challenges, though.  This is OK:
> 
>	CPU1				CPU2
>	i = srcu_read_lock(&mysrcu);	mutex_lock(&my_lock);
>	mutex_lock(&my_lock);		i = srcu_read_lock(&mysrcu);
>	srcu_read_unlock(&mysrcu, i);	mutex_unlock(&my_lock);
>	mutex_unlock(&my_lock);		srcu_read_unlock(&mysrcu, i);
> 
> 	CPU3
> 	synchronize_srcu(&mysrcu);
> 
> This could be a deadlock for reader-writer locking, but not for SRCU.

Hmm, yes, that's a good point. If srcu_read_lock() was read_lock, and
synchronize_srcu() was write_lock(), then the write_lock() could stop
CPU2's read_lock() from acquiring the lock, and thus cause a deadlock.

However, I'm not convinced that lockdep handles reader/writer locks
correctly to start with, right now, since it *didn't* actually trigger
any warnings when I annotated SRCU as a reader/writer lock.

> This is also OK:
>	CPU1				CPU2
>	i = srcu_read_lock(&mysrcu);	mutex_lock(&my_lock);
>	mutex_lock(&my_lock);		synchronize_srcu(&yoursrcu);
>	srcu_read_unlock(&mysrcu, i);	mutex_unlock(&my_lock);
>	mutex_unlock(&my_lock);
> 
> Here CPU1's read-side critical sections are for mysrcu, which is
> independent of CPU2's grace period for yoursrcu.

Right, but that's already covered by having a separate lockdep_map for
each SRCU subsystem (mysrcu, yoursrcu).

> So you could flag any lockdep cycle that contained a reader and a
> synchronous grace period for the same flavor of RCU, where for SRCU
> the identity of the srcu_struct structure is part of the flavor.

Right. Basically, I think SRCU should be like a reader/writer lock
(perhaps fixed to work right). The only difference seems to be the
scenario you outlined above (first of the two)?

Actually, given the scenario above, for lockdep purposes the
reader/writer lock is actually the same as a recursive lock, I guess?

You outlined a scenario in which the reader gets blocked due to a
writer (CPU3 doing a write_lock()) so the reader can still participate
in a deadlock cycle since it can - without any other locks being held
by CPU3 that participate - cause a deadlock between CPU1 and CPU2 here.
For lockdep then, even seeing the CPU1 and CPU2 scenarios should be
sufficient to flag a deadlock (*).

This part then isn't true for SRCU, because there forward progress will
still be made. So for SRCU, the "reader" side really needs to be
connected with a "writer" side to form a deadlock cycle, unlike for a
reader/writer lock.

johannes

(*) technically only after checking that write_lock() is ever used, but
... seems reasonable enough to assume that it will be used, since why
would anyone ever use a reader/writer lock if there are only readers?
That's a no-op.


* Re: deadlock in synchronize_srcu() in debugfs?
  2017-03-24 18:51         ` Johannes Berg
@ 2017-03-24 19:33           ` Paul E. McKenney
  2017-03-24 20:20             ` Paul E. McKenney
  0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2017-03-24 19:33 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Nicolai Stange, gregkh, sharon.dvir, Peter Zijlstra,
	Ingo Molnar, linux-wireless

On Fri, Mar 24, 2017 at 07:51:47PM +0100, Johannes Berg wrote:
> 
> > Yes.  CPU2 has a pre-existing reader that CPU1's synchronize_srcu()
> > must wait for.  But CPU2's reader cannot end until CPU1 releases
> > its lock, which it cannot do until after CPU2's reader ends.  Thus,
> > as you say, deadlock.
> > 
> > The rule is that if you are within any kind of RCU read-side critical
> > section, you cannot directly or indirectly wait for a grace period
> > from that same RCU flavor.
> 
> Right. This is indirect then, in a way.

Agreed, in a way.  ;-)

> > There are some challenges, though.  This is OK:
> > 
> >	CPU1				CPU2
> >	i = srcu_read_lock(&mysrcu);	mutex_lock(&my_lock);
> >	mutex_lock(&my_lock);		i = srcu_read_lock(&mysrcu);
> >	srcu_read_unlock(&mysrcu, i);	mutex_unlock(&my_lock);
> >	mutex_unlock(&my_lock);		srcu_read_unlock(&mysrcu, i);
> > 
> > 	CPU3
> > 	synchronize_srcu(&mysrcu);
> > 
> > This could be a deadlock for reader-writer locking, but not for SRCU.
> 
> Hmm, yes, that's a good point. If srcu_read_lock() was read_lock, and
> synchronize_srcu() was write_lock(), then the write_lock() could stop
> CPU2's read_lock() from acquiring the lock, and thus cause a deadlock.

Yes.

> However, I'm not convinced that lockdep handles reader/writer locks
> correctly to start with, right now, since it *didn't* actually trigger
> any warnings when I annotated SRCU as a reader/writer lock.

I haven't looked into lockdep enough to know either way.

> > This is also OK:
> >	CPU1				CPU2
> >	i = srcu_read_lock(&mysrcu);	mutex_lock(&my_lock);
> >	mutex_lock(&my_lock);		synchronize_srcu(&yoursrcu);
> >	srcu_read_unlock(&mysrcu, i);	mutex_unlock(&my_lock);
> >	mutex_unlock(&my_lock);
> > 
> > Here CPU1's read-side critical sections are for mysrcu, which is
> > independent of CPU2's grace period for yoursrcu.
> 
> Right, but that's already covered by having a separate lockdep_map for
> each SRCU subsystem (mysrcu, yoursrcu).

I hope so, but haven't proved that this would work in all possible cases.

> > So you could flag any lockdep cycle that contained a reader and a
> > synchronous grace period for the same flavor of RCU, where for SRCU
> > the identity of the srcu_struct structure is part of the flavor.
> 
> Right. Basically, I think SRCU should be like a reader/writer lock
> (perhaps fixed to work right). The only difference seems to be the
> scenario you outlined above (first of the two)?
> 
> Actually, given the scenario above, for lockdep purposes the
> reader/writer lock is actually the same as a recursive lock, I guess?

Except that a recursive reader/writer lock can still have deadlocks
involving the outermost reader that would not be deadlocks for the
equivalent SRCU scenarios.

> You outlined a scenario in which the reader gets blocked due to a
> writer (CPU3 doing a write_lock()) so the reader can still participate
> in a deadlock cycle since it can - without any other locks being held
> by CPU3 that participate - cause a deadlock between CPU1 and CPU2 here.
> For lockdep then, even seeing the CPU1 and CPU2 scenarios should be
> sufficient to flag a deadlock (*).

Might this be one of the reasons why lockdep has problems with
reader-writer locks?

> This part then isn't true for SRCU, because there forward progress will
> still be made. So for SRCU, the "reader" side really needs to be
> connected with a "writer" side to form a deadlock cycle, unlike for a
> reader/writer lock.

Yes, for SRCU, srcu_read_lock() itself never blocks, so it never
participates directly in a deadlock cycle.  It has to be the case
that something within the SRCU read-side critical section blocks
and takes its place in the deadlock cycle.

Then again, if you didn't have something blocking within your SRCU
read-side critical section, why would you be using SRCU instead of
just plain RCU?  ;-)

> johannes
> 
> (*) technically only after checking that write_lock() is ever used, but
> ... seems reasonable enough to assume that it will be used, since why
> would anyone ever use a reader/writer lock if there are only readers?
> That's a no-op.

Makes sense to me!  The only reasons I can come up with are things like
shutting lockdep up when it wants a given lock read-held or some such.

							Thanx, Paul


* Re: deadlock in synchronize_srcu() in debugfs?
  2017-03-24 19:33           ` Paul E. McKenney
@ 2017-03-24 20:20             ` Paul E. McKenney
  2017-03-27 11:18               ` Johannes Berg
  0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2017-03-24 20:20 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-kernel, Nicolai Stange, gregkh, sharon.dvir, Peter Zijlstra,
	Ingo Molnar, linux-wireless

On Fri, Mar 24, 2017 at 12:33:22PM -0700, Paul E. McKenney wrote:
> On Fri, Mar 24, 2017 at 07:51:47PM +0100, Johannes Berg wrote:
> > 
> > > Yes.  CPU2 has a pre-existing reader that CPU1's synchronize_srcu()
> > > must wait for.  But CPU2's reader cannot end until CPU1 releases
> > > its lock, which it cannot do until after CPU2's reader ends.  Thus,
> > > as you say, deadlock.
> > > 
> > > The rule is that if you are within any kind of RCU read-side critical
> > > section, you cannot directly or indirectly wait for a grace period
> > > from that same RCU flavor.
> > 
> > Right. This is indirect then, in a way.
> 
> Agreed, in a way.  ;-)
> 
> > > There are some challenges, though.  This is OK:
> > > 
> > >	CPU1				CPU2
> > >	i = srcu_read_lock(&mysrcu);	mutex_lock(&my_lock);
> > >	mutex_lock(&my_lock);		i = srcu_read_lock(&mysrcu);
> > >	srcu_read_unlock(&mysrcu, i);	mutex_unlock(&my_lock);
> > >	mutex_unlock(&my_lock);		srcu_read_unlock(&mysrcu, i);
> > > 
> > > 	CPU3
> > > 	synchronize_srcu(&mysrcu);
> > > 
> > > This could be a deadlock for reader-writer locking, but not for SRCU.
> > 
> > Hmm, yes, that's a good point. If srcu_read_lock() was read_lock, and
> > synchronize_srcu() was write_lock(), then the write_lock() could stop
> > CPU2's read_lock() from acquiring the lock, and thus cause a deadlock.
> 
> Yes.
> 
> > However, I'm not convinced that lockdep handles reader/writer locks
> > correctly to start with, right now, since it *didn't* actually trigger
> > any warnings when I annotated SRCU as a reader/writer lock.
> 
> I haven't looked into lockdep enough to know either way.
> 
> > > This is also OK:
> > >	CPU1				CPU2
> > >	i = srcu_read_lock(&mysrcu);	mutex_lock(&my_lock);
> > >	mutex_lock(&my_lock);		synchronize_srcu(&yoursrcu);
> > >	srcu_read_unlock(&mysrcu, i);	mutex_unlock(&my_lock);
> > >	mutex_unlock(&my_lock);
> > > 
> > > Here CPU1's read-side critical sections are for mysrcu, which is
> > > independent of CPU2's grace period for yoursrcu.
> > 
> > Right, but that's already covered by having a separate lockdep_map for
> > each SRCU subsystem (mysrcu, yoursrcu).
> 
> I hope so, but haven't proved that this would work in all possible cases.
> 
> > > So you could flag any lockdep cycle that contained a reader and a
> > > synchronous grace period for the same flavor of RCU, where for SRCU
> > > the identity of the srcu_struct structure is part of the flavor.
> > 
> > Right. Basically, I think SRCU should be like a reader/writer lock
> > (perhaps fixed to work right). The only difference seems to be the
> > scenario you outlined above (first of the two)?
> > 
> > Actually, given the scenario above, for lockdep purposes the
> > reader/writer lock is actually the same as a recursive lock, I guess?
> 
> Except that a recursive reader/writer lock can still have deadlocks
> involving the outermost reader that would not be deadlocks for the
> equivalent SRCU scenarios.
> 
> > You outlined a scenario in which the reader gets blocked due to a
> > writer (CPU3 doing a write_lock()) so the reader can still participate
> > in a deadlock cycle since it can - without any other locks being held
> > by CPU3 that participate - cause a deadlock between CPU1 and CPU2 here.
> > For lockdep then, even seeing the CPU1 and CPU2 scenarios should be
> > sufficient to flag a deadlock (*).
> 
> Might this be one of the reasons why lockdep has problems with
> reader-writer locks?
> 
> > This part then isn't true for SRCU, because there forward progress will
> > still be made. So for SRCU, the "reader" side really needs to be
> > connected with a "writer" side to form a deadlock cycle, unlike for a
> > reader/writer lock.
> 
> Yes, for SRCU, srcu_read_lock() itself never blocks, so it never
> participates directly in a deadlock cycle.  It has to be the case
> that something within the SRCU read-side critical section blocks
> and takes its place in the deadlock cycle.
> 
> Then again, if you didn't have something blocking within your SRCU
> read-side critical section, why would you be using SRCU instead of
> just plain RCU?  ;-)
> 
> > johannes
> > 
> > (*) technically only after checking that write_lock() is ever used, but
> > ... seems reasonable enough to assume that it will be used, since why
> > would anyone ever use a reader/writer lock if there are only readers?
> > That's a no-op.
> 
> Makes sense to me!  The only reasons I can come up with are things like
> shutting lockdep up when it wants a given lock read-held or some such.

And I cannot resist adding this one:

	CPU 1				CPU 2
	i = srcu_read_lock(&s1);	mutex_lock(&l1);
	mutex_lock(&l1);		synchronize_srcu(&s2);
	mutex_unlock(&l1);		mutex_unlock(&l1);
	srcu_read_unlock(&s1, i);

	CPU 3				CPU 4
	i = srcu_read_lock(&s2);	mutex_lock(&l2);
	mutex_lock(&l2);		synchronize_srcu(&s1);
	mutex_unlock(&l2);		mutex_unlock(&l2);
	srcu_read_unlock(&s2, i);

Removing the SRCU statements from any of these CPUs would break the
deadlock.  This can be easily extended to a deadlock cycle involving
any number of srcu_struct structures.

But this would still be a cycle involving an srcu_read_lock() and a
synchronize_srcu() on the same srcu_struct, which is reassuring.

							Thanx, Paul


* Re: deadlock in synchronize_srcu() in debugfs?
  2017-03-24 20:20             ` Paul E. McKenney
@ 2017-03-27 11:18               ` Johannes Berg
  0 siblings, 0 replies; 7+ messages in thread
From: Johannes Berg @ 2017-03-27 11:18 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, Nicolai Stange, gregkh, sharon.dvir, Peter Zijlstra,
	Ingo Molnar, linux-wireless

On Fri, 2017-03-24 at 13:20 -0700, Paul E. McKenney wrote:
> 
> And I cannot resist adding this one:
> 
> 	CPU 1				CPU 2
> 	i = srcu_read_lock(&s1);	mutex_lock(&l1);
> 	mutex_lock(&l1);		synchronize_srcu(&s2);
> 	mutex_unlock(&l1);		mutex_unlock(&l1);
> 	srcu_read_unlock(&s1, i);
> 
> 	CPU 3				CPU 4
> 	i = srcu_read_lock(&s2);	mutex_lock(&l2);
> 	mutex_lock(&l2);		synchronize_srcu(&s1);
> 	mutex_unlock(&l2);		mutex_unlock(&l2);
> 	srcu_read_unlock(&s2, i);
> 
> Removing the SRCU statements from any of these CPUs would break the
> deadlock.  This can be easily extended to a deadlock cycle involving
> any number of srcu_struct structures.
> 
> But this would still be a cycle involving an srcu_read_lock() and a
> synchronize_srcu() on the same srcu_struct, which is reassuring.

Right, you can cycle this indefinitely. lockdep has some kind of
maximum chain length I think. :)

johannes


end of thread, other threads:[~2017-03-27 11:18 UTC | newest]
