From: Jan Kara <jack@suse.cz>
To: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>,
Allison Henderson <achender@linux.vnet.ibm.com>,
Christoph Hellwig <hch@infradead.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: inode_dio_wait and hole-punch?
Date: Tue, 15 May 2012 12:18:48 +0200 [thread overview]
Message-ID: <20120515101848.GC22359@quack.suse.cz> (raw)
In-Reply-To: <alpine.LSU.2.00.1205141751150.1960@eggly.anvils>
On Mon 14-05-12 18:13:36, Hugh Dickins wrote:
> On Tue, 15 May 2012, Jan Kara wrote:
> > On Sun 13-05-12 14:26:40, Hugh Dickins wrote:
> > > In looking through hole-punching implementations, I got the impression
> > > that perhaps ext4 and ocfs2 might need to inode_dio_wait() before they
> > > truncate pagecache. And that might require ext4 to hold i_mutex there?
> > Not necessarily before truncating page cache AFAICS
>
> Yes, DIO would hold reference to pages, so truncating cache should be safe.
>
> > but certainly before we go and remove blocks from the file.
>
> Yes, that seems more like it.
>
> > In case of ext4 doing that before
> > ext4_flush_completed_IO() call would look most appropriate I'd think. Good
> > spotting! Will you send a patch?
>
> No, Jan, I'm afraid of doing more harm than good to ext4 here -
> I'd much rather trust you to take that risk with my data!
Oh, I am flattered. I can corrupt a few more blocks for you in reward ;).
> In particular, it's not clear to me whether ext4 needs to take i_mutex
> there, so inode_dio_wait() semantics are complete. And it's not clear
> to me precisely where it should take i_mutex if so: maybe it actually
> needs to take i_mutex across the whole function, I do not know.
>
> Notice (I sent Allison reminder of it yesterday) that ext4_ext_punch_hole()
> is already wrong in its "if (offset >= inode->i_size) return 0" at the
> start, which makes fallocate PUNCH_HOLE unable to remove blocks added
> earlier by fallocate KEEP_SIZE beyond eof.
>
> But I know nothing of ext4 block allocation rules, so that's something
> also where a "fix" from me would be more likely to do harm than good.
Hum, indeed the locking around ext4_ext_punch_hole() looks a bit
insufficient. Even i_mutex need not be enough because of races with page
faults and writeback. That filemap_write_and_wait_range() call is just a
feeble attempt to avoid those. i_data_sem we take further down is enough to
stop all these races but currently I'm not clear what part of the code
before we take that lock can be just discarded, what part is correct and
what part is buggy and needs better locking. I'll take a deeper look...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
next prev parent reply other threads:[~2012-05-15 10:19 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-13 21:26 inode_dio_wait and hole-punch? Hugh Dickins
2012-05-14 22:42 ` Jan Kara
2012-05-15 1:13 ` Hugh Dickins
2012-05-15 10:18 ` Jan Kara [this message]
2012-05-15 6:59 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120515101848.GC22359@quack.suse.cz \
--to=jack@suse.cz \
--cc=achender@linux.vnet.ibm.com \
--cc=hch@infradead.org \
--cc=hughd@google.com \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).