From: Jan Kara <jack@suse.cz>
To: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: linux-ext4@vger.kernel.org, tytso@mit.edu, jack@suse.cz
Subject: Re: [PATCH] ext4: fix ext4_flush_completed_IO wait semantics
Date: Thu, 4 Oct 2012 12:11:06 +0200
Message-ID: <20121004101106.GC4641@quack.suse.cz>
In-Reply-To: <1349289807-18811-1-git-send-email-dmonakhov@openvz.org>
On Wed 03-10-12 22:43:27, Dmitry Monakhov wrote:
> BUG #1) All places where we call ext4_flush_completed_IO are broken
> because buffered io and DIO/AIO go through three stages:
> 1) submitted io,
> 2) completed io (in i_completed_io_list), conversion pending
> 3) finished io (conversion done)
> And by calling ext4_flush_completed_IO we will flush only
> requests which were in stage (2), which is wrong because:
> 1) punch_hole and truncate _must_ wait for all outstanding unwritten io
>    regardless of its state.
> 2) fsync and nolock_dio_read should also wait because there is
>    a time window between end_page_writeback() and ext4_add_complete_io()
> As a result, integrity fsync is broken in case of a buffered write
> to a fallocated region:
> fsync                                      blkdev_completion
>  ->filemap_write_and_wait_range
>                                            ->ext4_end_bio
>                                              ->end_page_writeback
>  <-- filemap_write_and_wait_range return
>  ->ext4_flush_completed_IO
>    sees empty i_completed_io_list but pending
>    conversion still exists
>                                              ->ext4_add_complete_io
>
> BUG #2) Race window becomes wider due to 'ext4: completed_io locking cleanup V4'
>
> This patch makes the following changes:
> 1) ext4_flush_completed_io() now first tries to flush completed io and then
>    waits for any outstanding unwritten io via ext4_unwritten_wait()
> 2) Rename function to a more appropriate name.
> 3) Assert that all callers of ext4_flush_unwritten_io hold i_mutex to
>    prevent endless wait
>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
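
To make the window from BUG #1 concrete, here is a minimal userspace sketch
of the three stages. It is only a model, not ext4 code: struct io_model and
the model_*() helpers are hypothetical names, and it assumes that i_unwritten
counts an IO from submission until its conversion has finished, as the
description above implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one inode's unwritten-extent IO state. */
struct io_model {
	int unwritten;	/* stand-in for i_unwritten: submitted, not yet converted */
	int completed;	/* stand-in for the length of i_completed_io_list */
};

/* Stage 1: IO is submitted to an unwritten (fallocated) region. */
void model_submit(struct io_model *m)
{
	m->unwritten++;
}

/* Stage 1 -> 2: the completion path queues the IO for conversion. */
void model_add_complete_io(struct io_model *m)
{
	m->completed++;
}

/* Stage 2 -> 3: convert everything queued; returns the number converted. */
int model_flush_completed(struct io_model *m)
{
	int n = m->completed;

	assert(m->unwritten >= n);	/* cannot convert more than was submitted */
	m->completed = 0;
	m->unwritten -= n;
	return n;
}

/* Old semantics: an empty completed list is taken to mean "nothing to do". */
bool model_old_check_done(const struct io_model *m)
{
	return m->completed == 0;
}

/* Counter-based semantics: done only when no unwritten IO is outstanding. */
bool model_all_converted(const struct io_model *m)
{
	return m->unwritten == 0;
}
```

After model_submit() and before model_add_complete_io(), the old check
reports "done" although one conversion is still outstanding; the
counter-based check does not.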

This patch looks good except for:

> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
> index 8d849da..37cd5a4 100644
> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -807,9 +807,11 @@ ssize_t ext4_ind_direct_IO(int rw, struct kiocb *iocb,
>
>  retry:
>  	if (rw == READ && ext4_should_dioread_nolock(inode)) {
> -		if (unlikely(!list_empty(&ei->i_completed_io_list)))
> -			ext4_flush_completed_IO(inode);
> -
> +		if (unlikely(!atomic_read(&EXT4_I(inode)->i_unwritten))) {

This condition seems to be inverted: with the leading '!', the flush runs
only when i_unwritten is zero, i.e. when there is nothing to wait for.

> +			mutex_lock(&inode->i_mutex);
> +			ext4_flush_unwritten_io(inode);
> +			mutex_unlock(&inode->i_mutex);
> +		}
>  		/*
>  		 * Nolock dioread optimization may be dynamically disabled
>  		 * via ext4_inode_block_unlocked_dio(). Check inode's state
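
To spell out the inversion, here is a small userspace sketch that contrasts
the check as posted with the check without the negation. It uses C11
<stdatomic.h> in place of the kernel's atomic_t; struct inode_model and both
helper names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the i_unwritten counter in ext4_inode_info. */
struct inode_model {
	atomic_int i_unwritten;
};

/* Check as posted: true only when no unwritten IO is outstanding. */
bool should_flush_as_posted(struct inode_model *inode)
{
	return !atomic_load(&inode->i_unwritten);
}

/* Check without the '!': true when there is something to wait for. */
bool should_flush_fixed(struct inode_model *inode)
{
	int n = atomic_load(&inode->i_unwritten);

	assert(n >= 0);	/* the counter should never go negative */
	return n != 0;
}
```

With a counter of 0 the posted check takes the lock and flushes for nothing;
with a counter of 2 it skips the flush, which is exactly the case the read
path needs to wait for.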
After fixing that, you can add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR