All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Qu Wenruo <quwenruo@cn.fujitsu.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	xfs@oss.sgi.com, linux-mm@kvack.org,
	Michal Hocko <mhocko@kernel.org>, Ingo Molnar <mingo@kernel.org>
Subject: Re: Xfs lockdep warning with for-dave-for-4.6 branch
Date: Thu, 19 May 2016 10:11:46 +0200	[thread overview]
Message-ID: <20160519081146.GS3193@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <20160517223549.GV26977@dastard>

On Wed, May 18, 2016 at 08:35:49AM +1000, Dave Chinner wrote:
> On Tue, May 17, 2016 at 04:49:12PM +0200, Peter Zijlstra wrote:
> > On Tue, May 17, 2016 at 09:10:56AM +1000, Dave Chinner wrote:
> > 
> > > The reason we don't have lock clases for the ilock is that we aren't
> > > supposed to call memory reclaim with that lock held in exclusive
> > > mode. This is because reclaim can run transactions, and that may
> > > need to flush dirty inodes to make progress. Flushing dirty inode
> > > requires taking the ilock in shared mode.
> > > 
> > > In the code path that was reported, we hold the ilock in /shared/
> > > mode with no transaction context (we are doing a read-only
> > > operation). This means we can run transactions in memory reclaim
> > > because a) we can't deadlock on the inode we hold locks on, and b)
> > > transaction reservations will be able to make progress as we don't
> > > hold any locks it can block on.
> > 
> > Just to clarify; I read the above as that we cannot block on recursive
> > shared locks, is this correct?
> > 
> > Because we can in fact block on down_read()+down_read() just fine, so if
> > you're assuming that, then something's busted.
> 
> The transaction reservation path will run down_read_trylock() on the
> inode, not down_read(). Hence if there are no pending writers, it
> will happily take the lock twice and make progress, otherwise it
> will skip the inode and there's no deadlock.  If there's a pending
> writer, then we have another context that is already in a
> transaction context and has already pushed the item, hence it is
> only in the scope of the current push because IO hasn't completed
> yet and removed it from the list.
> 
> > Otherwise, I'm not quite reading it right, which is, given the
> > complexity of that stuff, entirely possible.
> 
> There's a maze of dark, grue-filled twisty passages here...

OK; I might need a bit more again.

So now the code does something like:

	down_read(&i_lock);		-- lockdep marks lock as held
	kmalloc(GFP_KERNEL);		-- lockdep marks held locks as ENABLED_RECLAIM_FS
	  --> reclaim()
	     down_read_trylock(&i_lock); -- lockdep does _NOT_ mark as USED_IN_RECLAIM_FS

Right?

My 'problem' is that lockdep doesn't consider a trylock for the USED_IN
annotation, so the i_lock class will only get the ENABLED tag but not
get the USED_IN tag, and therefore _should_ not trigger the inversion.


So what exactly is triggering the inversion?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Qu Wenruo <quwenruo@cn.fujitsu.com>,
	xfs@oss.sgi.com, linux-mm@kvack.org,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: Xfs lockdep warning with for-dave-for-4.6 branch
Date: Thu, 19 May 2016 10:11:46 +0200	[thread overview]
Message-ID: <20160519081146.GS3193@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <20160517223549.GV26977@dastard>

On Wed, May 18, 2016 at 08:35:49AM +1000, Dave Chinner wrote:
> On Tue, May 17, 2016 at 04:49:12PM +0200, Peter Zijlstra wrote:
> > On Tue, May 17, 2016 at 09:10:56AM +1000, Dave Chinner wrote:
> > 
> > > The reason we don't have lock clases for the ilock is that we aren't
> > > supposed to call memory reclaim with that lock held in exclusive
> > > mode. This is because reclaim can run transactions, and that may
> > > need to flush dirty inodes to make progress. Flushing dirty inode
> > > requires taking the ilock in shared mode.
> > > 
> > > In the code path that was reported, we hold the ilock in /shared/
> > > mode with no transaction context (we are doing a read-only
> > > operation). This means we can run transactions in memory reclaim
> > > because a) we can't deadlock on the inode we hold locks on, and b)
> > > transaction reservations will be able to make progress as we don't
> > > hold any locks it can block on.
> > 
> > Just to clarify; I read the above as that we cannot block on recursive
> > shared locks, is this correct?
> > 
> > Because we can in fact block on down_read()+down_read() just fine, so if
> > you're assuming that, then something's busted.
> 
> The transaction reservation path will run down_read_trylock() on the
> inode, not down_read(). Hence if there are no pending writers, it
> will happily take the lock twice and make progress, otherwise it
> will skip the inode and there's no deadlock.  If there's a pending
> writer, then we have another context that is already in a
> transaction context and has already pushed the item, hence it is
> only in the scope of the current push because IO hasn't completed
> yet and removed it from the list.
> 
> > Otherwise, I'm not quite reading it right, which is, given the
> > complexity of that stuff, entirely possible.
> 
> There's a maze of dark, grue-filled twisty passages here...

OK; I might need a bit more again.

So now the code does something like:

	down_read(&i_lock);		-- lockdep marks lock as held
	kmalloc(GFP_KERNEL);		-- lockdep marks held locks as ENABLED_RECLAIM_FS
	  --> reclaim()
	     down_read_trylock(&i_lock); -- lockdep does _NOT_ mark as USED_IN_RECLAIM_FS

Right?

My 'problem' is that lockdep doesn't consider a trylock for the USED_IN
annotation, so the i_lock class will only get the ENABLED tag but not
get the USED_IN tag, and therefore _should_ not trigger the inversion.


So what exactly is triggering the inversion?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2016-05-19  8:11 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-12  5:53 Xfs lockdep warning with for-dave-for-4.6 branch Qu Wenruo
2016-05-12  5:57 ` Darrick J. Wong
2016-05-12  8:03   ` Dave Chinner
2016-05-13 16:03     ` Michal Hocko
2016-05-13 16:03       ` Michal Hocko
2016-05-16 10:41       ` Peter Zijlstra
2016-05-16 10:41         ` Peter Zijlstra
2016-05-16 13:05         ` Michal Hocko
2016-05-16 13:05           ` Michal Hocko
2016-05-16 13:25           ` Peter Zijlstra
2016-05-16 13:25             ` Peter Zijlstra
2016-05-16 23:10             ` Dave Chinner
2016-05-16 23:10               ` Dave Chinner
2016-05-17 14:49               ` Peter Zijlstra
2016-05-17 14:49                 ` Peter Zijlstra
2016-05-17 22:35                 ` Dave Chinner
2016-05-17 22:35                   ` Dave Chinner
2016-05-18  7:20                   ` Peter Zijlstra
2016-05-18  7:20                     ` Peter Zijlstra
2016-05-18  8:25                     ` Michal Hocko
2016-05-18  8:25                       ` Michal Hocko
2016-05-18  9:49                       ` Peter Zijlstra
2016-05-18  9:49                         ` Peter Zijlstra
2016-05-18 11:31                         ` Michal Hocko
2016-05-18 11:31                           ` Michal Hocko
2016-05-19  8:11                   ` Peter Zijlstra [this message]
2016-05-19  8:11                     ` Peter Zijlstra
2016-05-20  0:17                     ` Dave Chinner
2016-05-20  0:17                       ` Dave Chinner
2016-06-01 13:17                       ` Michal Hocko
2016-06-01 13:17                         ` Michal Hocko
2016-06-01 18:16                         ` Peter Zijlstra
2016-06-01 18:16                           ` Peter Zijlstra
2016-06-02 14:50                           ` Michal Hocko
2016-06-02 14:50                             ` Michal Hocko
2016-06-02 15:11                             ` Peter Zijlstra
2016-06-02 15:11                               ` Peter Zijlstra
2016-06-02 15:46                               ` Michal Hocko
2016-06-02 15:46                                 ` Michal Hocko
2016-06-02 23:22                                 ` Dave Chinner
2016-06-02 23:22                                   ` Dave Chinner
2016-06-06 12:20                                   ` Michal Hocko
2016-06-06 12:20                                     ` Michal Hocko
2016-06-15  7:21                                     ` Dave Chinner
2016-06-15  7:21                                       ` Dave Chinner
2016-06-21 14:26                                       ` Michal Hocko
2016-06-21 14:26                                         ` Michal Hocko
2016-06-22  1:03                                         ` Dave Chinner
2016-06-22  1:03                                           ` Dave Chinner
2016-06-22 12:38                                           ` Michal Hocko
2016-06-22 12:38                                             ` Michal Hocko
2016-06-22 22:58                                             ` Dave Chinner
2016-06-22 22:58                                               ` Dave Chinner
2016-06-23 11:35                                               ` Michal Hocko
2016-06-23 11:35                                                 ` Michal Hocko
2016-10-06 13:04                           ` Michal Hocko
2016-10-06 13:04                             ` Michal Hocko
2016-10-17 13:49                             ` Michal Hocko
2016-10-17 13:49                               ` Michal Hocko
2016-10-19  0:33                             ` Dave Chinner
2016-10-19  0:33                               ` Dave Chinner
2016-10-19  5:30                               ` Dave Chinner
2016-10-19  5:30                                 ` Dave Chinner
2016-10-19  8:33                             ` Peter Zijlstra
2016-10-19  8:33                               ` Peter Zijlstra
2016-10-19 12:06                               ` Michal Hocko
2016-10-19 12:06                                 ` Michal Hocko
2016-10-19 21:49                                 ` Dave Chinner
2016-10-19 21:49                                   ` Dave Chinner
2016-10-20  7:15                                   ` Michal Hocko
2016-10-20  7:15                                     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160519081146.GS3193@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mingo@kernel.org \
    --cc=quwenruo@cn.fujitsu.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.