From: Jan Kara <jack@suse.cz>
To: Ted Tso <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org,
"HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>,
Jan Kara <jack@suse.cz>,
stable@vger.kernel.org
Subject: Re: [PATCH 1/2] ext4: Fix data exposure after a crash
Date: Fri, 19 Feb 2016 19:44:30 +0100
Message-ID: <20160219184430.GA15651@quack.suse.cz>
In-Reply-To: <1452507830-8574-1-git-send-email-jack@suse.cz>
Hi Ted,
It seems this patch (and the following cleanup) got missed. Can you please
merge them? Thanks!
Honza
On Mon 11-01-16 11:23:49, Jan Kara wrote:
> Huang has reported that in his powerfail testing he is seeing stale
> block contents in some of the recently allocated blocks although he
> mounts ext4 in data=ordered mode. After some investigation I have found
> out that indeed when delayed allocation is used, we don't add the inode
> to the transaction's list of inodes needing flushing before commit.
> Originally we were doing that but commit f3b59291a69d removed the logic
> with a flawed argument that it is not needed.
>
> The problem is that although for delayed allocated blocks we write their
> contents immediately after allocating them, there is no guarantee that
> the IO scheduler or the device doesn't reorder things, and thus the
> transaction allocating blocks and attaching them to the inode can reach
> stable storage before the actual block contents. In fact, whenever we
> attach freshly allocated blocks to an inode using a written extent, we
> should add the inode to the transaction's ordered inode list to make
> sure we properly wait for the block contents to be written before
> committing the transaction. So that is what we do in this patch. This
> also handles other cases where stale data exposure was possible, such as
> filling a hole via mmap in data=ordered,nodelalloc mode.
>
> The only exception to the above rule is extending direct IO writes, where
> blkdev_direct_IO() waits for IO to complete before increasing i_size and
> thus stale data exposure is not possible. For now we don't complicate
> the code with optimizing this special case since the overhead is pretty
> low. If this is observed to be a performance problem we can always
> handle it using a special flag to ext4_map_blocks().
>
> CC: stable@vger.kernel.org
> Fixes: f3b59291a69d0b734be1fc8be489fef2dd846d3d
> Reported-by: "HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
> Tested-by: "HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/ext4/inode.c | 23 ++++++++++++++---------
> 1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index ff2f3cd38522..b216a3eb41a8 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -682,6 +682,20 @@ out_sem:
>  		ret = check_block_validity(inode, map);
>  		if (ret != 0)
>  			return ret;
> +
> +		/*
> +		 * Inodes with freshly allocated blocks where contents will be
> +		 * visible after transaction commit must be on transaction's
> +		 * ordered data list.
> +		 */
> +		if (map->m_flags & EXT4_MAP_NEW &&
> +		    !(map->m_flags & EXT4_MAP_UNWRITTEN) &&
> +		    !(flags & EXT4_GET_BLOCKS_ZERO) &&
> +		    ext4_should_order_data(inode)) {
> +			ret = ext4_jbd2_file_inode(handle, inode);
> +			if (ret)
> +				return ret;
> +		}
>  	}
>  	return retval;
>  }
> @@ -1135,15 +1149,6 @@ static int ext4_write_end(struct file *file,
>  	int i_size_changed = 0;
>
>  	trace_ext4_write_end(inode, pos, len, copied);
> -	if (ext4_test_inode_state(inode, EXT4_STATE_ORDERED_MODE)) {
> -		ret = ext4_jbd2_file_inode(handle, inode);
> -		if (ret) {
> -			unlock_page(page);
> -			page_cache_release(page);
> -			goto errout;
> -		}
> -	}
> -
>  	if (ext4_has_inline_data(inode)) {
>  		ret = ext4_write_inline_data_end(inode, pos, len,
>  						 copied, page);
> --
> 2.6.2
>
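For readers who want to reason about the new condition outside the kernel
tree, here is a minimal userspace sketch of the check the first hunk adds.
It is only an illustration: the SKETCH_* flags, struct sketch_map and
sketch_needs_ordered_list() are hypothetical stand-ins invented for this
example (the flag values are not the kernel's), and the order_data argument
stands in for the result of ext4_should_order_data(inode).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for EXT4_MAP_*; the values are illustrative only. */
enum {
	SKETCH_MAP_NEW       = 1 << 0,	/* blocks were freshly allocated */
	SKETCH_MAP_UNWRITTEN = 1 << 1,	/* extent is marked unwritten */
};

/* Hypothetical stand-in for EXT4_GET_BLOCKS_ZERO. */
enum {
	SKETCH_GET_BLOCKS_ZERO = 1 << 0,	/* allocator zeroes the blocks */
};

/* Reduced model of the mapping result; only the flags matter here. */
struct sketch_map {
	unsigned int m_flags;
};

/*
 * Mirrors the condition from the patch: the inode has to go on the
 * transaction's ordered data list only when the blocks are new, are
 * attached as a written extent, are not zeroed by the allocator, and
 * the inode uses ordered data mode (order_data).
 */
static bool sketch_needs_ordered_list(const struct sketch_map *map,
				      unsigned int flags, bool order_data)
{
	assert(map != NULL);

	return (map->m_flags & SKETCH_MAP_NEW) &&
	       !(map->m_flags & SKETCH_MAP_UNWRITTEN) &&
	       !(flags & SKETCH_GET_BLOCKS_ZERO) &&
	       order_data;
}
```

In this model a new, written, non-zeroed mapping on an ordered-mode inode is
the only combination that returns true. Unwritten extents and zeroed blocks
cannot expose stale contents after the commit, so they are left out, which
matches the reasoning in the changelog above.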
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
Thread overview:
2016-01-11 10:23 [PATCH 1/2] ext4: Fix data exposure after a crash Jan Kara
2016-01-11 10:23 ` [PATCH 2/2] ext4: Remove EXT4_STATE_ORDERED_MODE Jan Kara
2016-02-19 18:44 ` Jan Kara [this message]