All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <matthew@wil.cx>
To: linux-fsdevel@vger.kernel.org
Subject: hole-punch vs fault
Date: Wed, 27 Nov 2013 06:48:31 -0700	[thread overview]
Message-ID: <20131127134831.GI24288@parisc-linux.org> (raw)


I'm trying to figure out what protects us in this scenario:

Thread A mmaps a file and reads from it.
Thread B punches holes in that file.

It seems to me that thread A can end up with references to pages that are
in the page cache that point to deallocated blocks (which the filesystem
might subsequently allocate to a different file).

Thread B calls ext4_punch_hole()
Thread B takes ->i_mutex
Thread B calls truncate_pagecache_range()
Thread B calls unmap_mapping_range() which removes the PTE from Thread A's
  page tables.
Thread B calls truncate_inode_pages_range() which removes the pages from the
  page cache.
Thread A calls filemap_fault().  It calls find_get_page(), but it's not there.
  It calls do_sync_mmap_readahead() then find_get_page() again, still not
  there.  So it calls page_cache_read() which calls ext4_readpage which calls
  mpage_readpage(), which calls ext4_get_block() which finds the blocks
  in question, nowhere taking the i_mutex.
Now Thread B gets back from its summer holiday, takes i_data_sem and calls
  ext4_es_remove_extent() and ext4_ext_remove_space() /
  ext4_free_hole_blocks() before releasing i_data_sem and i_mutex.
Thread A happily dirties the page it has.
Thread C extends a file, which gets the victim block allocated.
vmscan causes the page to be written back, since it's dirty and on the LRU lists
Bang, thread C has a corrupted file.

This doesn't happen with the normal truncate path since i_size is written
before calling unmap_mapping_range, and i_size is checked after filemap_fault
gets the page lock.

How should we solve this?  I see two realistic options.  One is an rwsem
in the inode or address_space taken for read by the fault path and for
write by the holepunch path.  It's kind of icky because each filesystem
will need to add support for it.

The second is to put some kind of generation counter in the inode or
address_space that is incremented on every deallocation.  The fault path
would read the generation counter before calling find_get_page() and then
check it hasn't changed after getting the page lock (goto retry_find).  I
like that one a little more since we can fix it all in common code, and I
think it'll be lower overhead in the fault path.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

             reply	other threads:[~2013-11-27 13:56 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-27 13:48 Matthew Wilcox [this message]
2013-11-27 22:19 ` hole-punch vs fault Jan Kara
2013-11-28  2:33   ` Matthew Wilcox
2013-11-28  3:30     ` Matthew Wilcox
2013-11-28  4:22     ` Dave Chinner
2013-11-28  4:44     ` Matthew Wilcox
2013-11-28 12:24       ` Matthew Wilcox
2013-11-28 22:12       ` Dave Chinner
2013-11-29 13:11         ` Matthew Wilcox
2013-12-01 21:52           ` Dave Chinner
2013-12-02  8:33             ` Jan Kara
2013-12-02 15:58               ` Matthew Wilcox
2013-12-02 20:11                 ` Jan Kara
2013-12-02 20:13                 ` Matthew Wilcox
2013-12-02 23:13                   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131127134831.GI24288@parisc-linux.org \
    --to=matthew@wil.cx \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.