From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Jan Kara <jack@suse.com>
Cc: linux-ext4@vger.kernel.org,
	Dan Williams <dan.j.williams@intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 1/6] ext4: Fix races between page faults and hole punching
Date: Wed, 14 Oct 2015 21:00:59 -0600
Message-ID: <20151015030059.GB31087@linux.intel.com>
In-Reply-To: <1444822227-29984-2-git-send-email-jack@suse.com>

On Wed, Oct 14, 2015 at 01:30:22PM +0200, Jan Kara wrote:
> Currently, page faults and hole punching are completely unsynchronized.
> This can result in a page fault faulting a page into a range that we
> are punching after truncate_pagecache_range() has been called, and thus
> we can end up with a page mapped to disk blocks that will shortly be
> freed. Filesystem corruption will soon follow. Note that the same
> race is avoided for truncate by checking the page fault offset against
> i_size, but there isn't a similar mechanism available for punching holes.
> 
> Fix the problem by creating a new rw semaphore, i_mmap_sem, in the inode
> and grabbing it for writing over truncate and hole punching and for
> reading over page faults. We cannot easily use i_data_sem for this since
> it ranks below transaction start, and we need something ranking above it
> so that it can be held over the whole truncate / hole punching operation.
> 
> Signed-off-by: Jan Kara <jack@suse.com>
> ---
>  fs/ext4/ext4.h  | 10 +++++++++
>  fs/ext4/file.c  | 66 +++++++++++++++++++++++++++++++++++++++++++++++++--------
>  fs/ext4/inode.c | 27 +++++++++++++++++++----
>  fs/ext4/super.c |  1 +
>  4 files changed, 91 insertions(+), 13 deletions(-)

I wonder if there are a few other operations in ext4_fallocate() that
we may need to protect in addition to ext4_punch_hole()?

Do ext4_collapse_range(), ext4_insert_range() and maybe even ext4_zero_range()
need protection?

For what it's worth, the rest of the locking looks good to me.  The lock
ordering is the same as with ext2 and XFS, and all the DAX fault handlers look
correct to me.