linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Add updated DAX locking to ext2
@ 2015-10-09 22:02 Ross Zwisler
  2015-10-09 22:02 ` [PATCH 1/2] dax: dax_pfn_mkwrite() truncate race check Ross Zwisler
  0 siblings, 1 reply; 2+ messages in thread
From: Ross Zwisler @ 2015-10-09 22:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, Alexander Viro, Jan Kara, Matthew Wilcox,
	linux-ext4, linux-fsdevel, Andrew Morton, Dan Williams,
	Dave Chinner, linux-nvdimm, Matthew Wilcox, Andreas Dilger

The first patch in this series is a somewhat related bug fix.  The second patch
adds new locking to ext2 to isolate DAX faults (page faults, PMD faults, page
mkwrite and pfn mkwrite) from ext2 operations that modify a given inode's data
block allocations.

In my first attempt at this fix I attempted to expand the protection offered by
ext2_inode_info->truncate_mutex so that we took truncate_mutex before our DAX
fault paths were entered.  The DAX fault handlers would then call a version of
ext2_get_block() that assumed truncate_mutex was already held.

This *almost* works, but it turns out that it introduces a lock ordering
inversion between truncate_mutex and a given page's page_lock.  With my series
as it is, the lock ordering looks like this:

mmap_sem (MM)
  ext2_inode_info->dax_sem
    sb_start_pagefault (vfs, freeze - taken in DAX)
      address_space->i_mmap_rwsem or page_lock (mutually exclusive in DAX)
        ext2_inode_info->truncate_mutex

With truncate_mutex being taken before the page fault handler, as in my initial
attempt, our lock ordering is instead:

mmap_sem (MM)
  ext2_inode_info->truncate_mutex
    sb_start_pagefault (vfs, freeze - taken in DAX)
      address_space->i_mmap_rwsem or page_lock (mutually exclusive in DAX)

This is fine for DAX, but it turns out that you can't actually take
truncate_mutex before a page_lock because some operations call ext2_get_block()
while already holding page_lock from calls higher in the stack.  This means
that whatever locks are taken in ext2_get_block() must be below page_lock in
the locking order, else you can deadlock.  To solve this I had to introduce a
new lock that would sit above page_lock in the locking order.  This is the new
"dax_sem" semaphore.

Folks with ext2 experience, have I missed any other cases that need to be
protected by dax_sem?  I took a hard look at all the cases in XFS that are
protected by i_mmaplock and convinced myself that truncate was the only one of
them present in ext2.

I've tested this series using xfstests with DAX both enabled and disabled.
Lockdep was on and working, and didn't have any complaints.

Ross Zwisler (2):
  dax: dax_pfn_mkwrite() truncate race check
  ext2: Add locking for DAX faults

 fs/dax.c        | 13 +++++++++++--
 fs/ext2/ext2.h  |  1 +
 fs/ext2/file.c  | 51 +++++++++++++++++++++++++++++++++++++++++++++++----
 fs/ext2/inode.c |  9 +++++++++
 fs/ext2/super.c |  1 +
 5 files changed, 69 insertions(+), 6 deletions(-)

-- 
2.1.0


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2015-10-09 22:02 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-09 22:02 [PATCH 0/2] Add updated DAX locking to ext2 Ross Zwisler
2015-10-09 22:02 ` [PATCH 1/2] dax: dax_pfn_mkwrite() truncate race check Ross Zwisler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).