From: Nikolay Borisov <kernel@kyup.com>
To: Jan Kara <jack@suse.cz>, Ted Tso <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org, Jan Kara <jack@suse.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] ext4: Fix bh->b_state corruption
Date: Fri, 22 Jan 2016 09:08:55 +0200
Message-ID: <56A1D587.1020504@kyup.com>
In-Reply-To: <1452185721-32477-1-git-send-email-jack@suse.cz>

Ping on this one. It looks like it is going to miss the 4.5 merge window?

On 01/07/2016 06:55 PM, Jan Kara wrote:
> From: Jan Kara <jack@suse.com>
>
> ext4 can update bh->b_state non-atomically in _ext4_get_block() and
> ext4_da_get_block_prep(). Usually this is fine since bh is just
> temporary on-stack storage for mapping information, but in some cases it
> can be a live bh attached to a page. In such a case a non-atomic
> update of bh->b_state can race with an atomic update, which then gets
> lost. Usually when we are mapping a bh and thus updating bh->b_state
> non-atomically, nobody else touches the bh and so things work out fine,
> but there is one case to especially worry about: ext4_finish_bio() uses
> BH_Uptodate_Lock on the first bh in the page to synchronize handling of
> the PageWriteback state. So when blocksize < pagesize, we can be atomically
> modifying bh->b_state of a buffer that actually isn't under IO and thus
> can race e.g. with delalloc trying to map that buffer. The result is
> that we can mistakenly set / clear the BH_Uptodate_Lock bit, resulting in
> corruption of the PageWriteback state or a missed unlock of BH_Uptodate_Lock.
>
> Fix the problem by always updating bh->b_state bits atomically.
>
> CC: stable@vger.kernel.org
> Reported-by: Nikolay Borisov <kernel@kyup.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/ext4/inode.c | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index ea433a7f4bca..06bda0361e7c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -657,6 +657,34 @@ has_zeroout:
>  	return retval;
>  }
>  
> +/*
> + * Update EXT4_MAP_FLAGS in bh->b_state. For buffer heads attached to pages
> + * we have to be careful as someone else may be manipulating b_state as well.
> + */
> +static void ext4_update_bh_state(struct buffer_head *bh, unsigned long flags)
> +{
> +	unsigned long old_state;
> +	unsigned long new_state;
> +
> +	flags &= EXT4_MAP_FLAGS;
> +
> +	/* Dummy buffer_head? Set non-atomically. */
> +	if (!bh->b_page) {
> +		bh->b_state = (bh->b_state & ~EXT4_MAP_FLAGS) | flags;
> +		return;
> +	}
> +	/*
> +	 * Someone else may be modifying b_state. Be careful! This is ugly but
> +	 * once we get rid of using bh as a container for mapping information
> +	 * to pass to / from get_block functions, this can go away.
> +	 */
> +	do {
> +		old_state = READ_ONCE(bh->b_state);
> +		new_state = (old_state & ~EXT4_MAP_FLAGS) | flags;
> +	} while (unlikely(
> +		 cmpxchg(&bh->b_state, old_state, new_state) != old_state));
> +}
> +
>  /* Maximum number of blocks we map for direct IO at once. */
>  #define DIO_MAX_BLOCKS 4096
>
> @@ -693,7 +721,7 @@ static int _ext4_get_block(struct inode *inode, sector_t iblock,
>  		ext4_io_end_t *io_end = ext4_inode_aio(inode);
>  
>  		map_bh(bh, inode->i_sb, map.m_pblk);
> -		bh->b_state = (bh->b_state & ~EXT4_MAP_FLAGS) | map.m_flags;
> +		ext4_update_bh_state(bh, map.m_flags);
>  		if (IS_DAX(inode) && buffer_unwritten(bh)) {
>  			/*
>  			 * dgc: I suspect unwritten conversion on ext4+DAX is
> @@ -1669,7 +1697,7 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
>  		return ret;
>  
>  	map_bh(bh, inode->i_sb, map.m_pblk);
> -	bh->b_state = (bh->b_state & ~EXT4_MAP_FLAGS) | map.m_flags;
> +	ext4_update_bh_state(bh, map.m_flags);
>  
>  	if (buffer_unwritten(bh)) {
>  		/* A delayed write to unwritten bh should be marked
>
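
For anyone reading this in the archive, the lost update described in the
commit message, and the reason the compare-and-swap loop avoids it, can be
sketched in user space with C11 atomics. This is only an illustration under
stated assumptions, not the kernel code: MAP_FLAGS and LOCK_BIT are
hypothetical stand-ins for EXT4_MAP_FLAGS and the BH_Uptodate_Lock bit, the
helper names are made up, and the "race" is replayed deterministically in a
single thread by placing the competing write between the load and the store.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit layout for illustration; not the kernel's real values. */
#define MAP_FLAGS 0x0fUL        /* stand-in for EXT4_MAP_FLAGS */
#define LOCK_BIT  (1UL << 8)    /* stand-in for the BH_Uptodate_Lock bit */

/*
 * User-space analogue of the cmpxchg loop in the patch: replace only the
 * MAP_FLAGS bits and retry if anyone changed the word in the meantime.
 */
static void update_masked_atomic(_Atomic unsigned long *state,
				 unsigned long flags)
{
	unsigned long old_state = atomic_load(state);
	unsigned long new_state;

	flags &= MAP_FLAGS;
	do {
		new_state = (old_state & ~MAP_FLAGS) | flags;
		/* On failure old_state is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(state, &old_state, new_state));
}

/*
 * Replay of the buggy interleaving: the mapper reads the word, another
 * context sets LOCK_BIT atomically, then the mapper stores a value built
 * from its stale snapshot. Returns the final state word.
 */
static unsigned long replay_lost_update(void)
{
	_Atomic unsigned long state = 0;
	unsigned long snapshot = atomic_load(&state);	/* mapper: read */

	atomic_fetch_or(&state, LOCK_BIT);		/* other context */
	atomic_store(&state, (snapshot & ~MAP_FLAGS) | 0x3UL); /* stale write */
	return atomic_load(&state);
}

/*
 * Same interleaving with a compare-and-swap: the first attempt fails
 * because the word changed, and the retry preserves LOCK_BIT. The strong
 * variant is used here so the retry count is deterministic.
 */
static unsigned long replay_with_cas(int *retries)
{
	_Atomic unsigned long state = 0;
	unsigned long old_state = atomic_load(&state);	/* mapper: read */
	unsigned long new_state;

	*retries = 0;
	atomic_fetch_or(&state, LOCK_BIT);		/* other context */
	for (;;) {
		new_state = (old_state & ~MAP_FLAGS) | 0x3UL;
		if (atomic_compare_exchange_strong(&state, &old_state,
						   new_state))
			break;
		(*retries)++;
	}
	return atomic_load(&state);
}
```

In the first replay the final word has the mapping bits set but LOCK_BIT
cleared, which is the "missed unlock / corrupted PageWriteback" symptom in
miniature. In the second replay the compare-and-swap notices the concurrent
change, retries once, and both sets of bits survive.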