* inode_dio_wait and hole-punch?
@ 2012-05-13 21:26 Hugh Dickins
2012-05-14 22:42 ` Jan Kara
2012-05-15 6:59 ` Christoph Hellwig
0 siblings, 2 replies; 5+ messages in thread
From: Hugh Dickins @ 2012-05-13 21:26 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-fsdevel
In looking through hole-punching implementations, I got the impression
that perhaps ext4 and ocfs2 might need to inode_dio_wait() before they
truncate pagecache. And that might require ext4 to hold i_mutex there?
Hugh
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: inode_dio_wait and hole-punch?
2012-05-13 21:26 inode_dio_wait and hole-punch? Hugh Dickins
@ 2012-05-14 22:42 ` Jan Kara
2012-05-15 1:13 ` Hugh Dickins
2012-05-15 6:59 ` Christoph Hellwig
1 sibling, 1 reply; 5+ messages in thread
From: Jan Kara @ 2012-05-14 22:42 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Christoph Hellwig, linux-fsdevel
On Sun 13-05-12 14:26:40, Hugh Dickins wrote:
> In looking through hole-punching implementations, I got the impression
> that perhaps ext4 and ocfs2 might need to inode_dio_wait() before they
> truncate pagecache. And that might require ext4 to hold i_mutex there?
Not necessarily before truncating page cache AFAICS but certainly before
we go and remove blocks from the file. In case of ext4 doing that before
ext4_flush_completed_IO() call would look most appropriate I'd think. Good
spotting! Will you send a patch?
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: inode_dio_wait and hole-punch?
2012-05-14 22:42 ` Jan Kara
@ 2012-05-15 1:13 ` Hugh Dickins
2012-05-15 10:18 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2012-05-15 1:13 UTC (permalink / raw)
To: Jan Kara; +Cc: Allison Henderson, Christoph Hellwig, linux-fsdevel
On Tue, 15 May 2012, Jan Kara wrote:
> On Sun 13-05-12 14:26:40, Hugh Dickins wrote:
> > In looking through hole-punching implementations, I got the impression
> > that perhaps ext4 and ocfs2 might need to inode_dio_wait() before they
> > truncate pagecache. And that might require ext4 to hold i_mutex there?
> Not necessarily before truncating page cache AFAICS
Yes, DIO would hold reference to pages, so truncating cache should be safe.
> but certainly before we go and remove blocks from the file.
Yes, that seems more like it.
> In case of ext4 doing that before
> ext4_flush_completed_IO() call would look most appropriate I'd think. Good
> spotting! Will you send a patch?
No, Jan, I'm afraid of doing more harm than good to ext4 here -
I'd much rather trust you to take that risk with my data!
In particular, it's not clear to me whether ext4 needs to take i_mutex
there, so inode_dio_wait() semantics are complete. And it's not clear
to me precisely where it should take i_mutex if so: maybe it actually
needs to take i_mutex across the whole function, I do not know.
Notice (I sent Allison reminder of it yesterday) that ext4_ext_punch_hole()
is already wrong in its "if (offset >= inode->i_size) return 0" at the
start, which makes fallocate PUNCH_HOLE unable to remove blocks added
earlier by fallocate KEEP_SIZE beyond eof.
But I know nothing of ext4 block allocation rules, so that's something
also where a "fix" from me would be more likely to do harm than good.
Thanks a lot for taking these on, if you will.
Hugh
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: inode_dio_wait and hole-punch?
2012-05-13 21:26 inode_dio_wait and hole-punch? Hugh Dickins
2012-05-14 22:42 ` Jan Kara
@ 2012-05-15 6:59 ` Christoph Hellwig
1 sibling, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2012-05-15 6:59 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Christoph Hellwig, linux-fsdevel
On Sun, May 13, 2012 at 02:26:40PM -0700, Hugh Dickins wrote:
> In looking through hole-punching implementations, I got the impression
> that perhaps ext4 and ocfs2 might need to inode_dio_wait() before they
> truncate pagecache. And that might require ext4 to hold i_mutex there?
Anything freeing blocks should be protected by inode_dio_wait, and any
call to it needs i_mutex or equivalent protection against starting new
direct I/O.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: inode_dio_wait and hole-punch?
2012-05-15 1:13 ` Hugh Dickins
@ 2012-05-15 10:18 ` Jan Kara
0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2012-05-15 10:18 UTC (permalink / raw)
To: Hugh Dickins
Cc: Jan Kara, Allison Henderson, Christoph Hellwig, linux-fsdevel
On Mon 14-05-12 18:13:36, Hugh Dickins wrote:
> On Tue, 15 May 2012, Jan Kara wrote:
> > On Sun 13-05-12 14:26:40, Hugh Dickins wrote:
> > > In looking through hole-punching implementations, I got the impression
> > > that perhaps ext4 and ocfs2 might need to inode_dio_wait() before they
> > > truncate pagecache. And that might require ext4 to hold i_mutex there?
> > Not necessarily before truncating page cache AFAICS
>
> Yes, DIO would hold reference to pages, so truncating cache should be safe.
>
> > but certainly before we go and remove blocks from the file.
>
> Yes, that seems more like it.
>
> > In case of ext4 doing that before
> > ext4_flush_completed_IO() call would look most appropriate I'd think. Good
> > spotting! Will you send a patch?
>
> No, Jan, I'm afraid of doing more harm than good to ext4 here -
> I'd much rather trust you to take that risk with my data!
Oh, I am flattered. I can corrupt a few more blocks for you in reward ;).
> In particular, it's not clear to me whether ext4 needs to take i_mutex
> there, so inode_dio_wait() semantics are complete. And it's not clear
> to me precisely where it should take i_mutex if so: maybe it actually
> needs to take i_mutex across the whole function, I do not know.
>
> Notice (I sent Allison reminder of it yesterday) that ext4_ext_punch_hole()
> is already wrong in its "if (offset >= inode->i_size) return 0" at the
> start, which makes fallocate PUNCH_HOLE unable to remove blocks added
> earlier by fallocate KEEP_SIZE beyond eof.
>
> But I know nothing of ext4 block allocation rules, so that's something
> also where a "fix" from me would be more likely to do harm than good.
Hum, indeed the locking around ext4_ext_punch_hole() looks a bit
insufficient. Even i_mutex need not be enough because of races with page
faults and writeback. That filemap_write_and_wait_range() call is just a
feeble attempt to avoid those. i_data_sem we take further down is enough to
stop all these races but currently I'm not clear what part of the code
before we take that lock can be just discarded, what part is correct and
what part is buggy and needs better locking. I'll take a deeper look...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-05-15 10:19 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-13 21:26 inode_dio_wait and hole-punch? Hugh Dickins
2012-05-14 22:42 ` Jan Kara
2012-05-15 1:13 ` Hugh Dickins
2012-05-15 10:18 ` Jan Kara
2012-05-15 6:59 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).