From: Dmitry Monakhov <dmonakhov@openvz.org>
To: Lukas Czerner <lczerner@redhat.com>, linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, Lukas Czerner <lczerner@redhat.com>
Subject: Re: [PATCH] ext4: Prevent race while waling extent tree
Date: Thu, 08 Nov 2012 16:01:17 +0400
Message-ID: <87liecs3qq.fsf@openvz.org>
In-Reply-To: <1352372929-18513-1-git-send-email-lczerner@redhat.com>

On Thu,  8 Nov 2012 12:08:49 +0100, Lukas Czerner <lczerner@redhat.com> wrote:
> Currently ext4_ext_walk_space() only takes i_data_sem for read when
> searching for the extent at given block with ext4_ext_find_extent().
> Then it drops the lock and the extent tree can be changed at will.
> However later on we're searching for the 'next' extent, but the extent
> tree might already have changed, so the information might not be
> accurate.
> 
> In fact we can hit BUG_ON(end <= start) if the extent got inserted into
> the tree after the one we found and before the block we were searching
> for. This has been reproduced by running xfstests 225 in loop on s390x
> architecture, but theoretically we could hit this on any other
> architecture as well, but probably not as often.
> 
> ext4_ext_walk_space() is currently only used from ext4_fiemap() and even
> if we do not hit the BUG_ON() fiemap might return scrambled information
> to the user.
> 
> Fix this by requiring ext4_ext_walk_space() to be called with i_data_sem
> held. By calling it from ext4_fiemap() we can only take the i_data_sem
> for read, but possibly other users might want to modify the extents so
> they will be able to take write lock.
Agreed as a short-term fix for the BUG_ON case, but Theodore suggested using a
seqlock approach: http://lists.openwall.net/linux-ext4/2011/10/26/25
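
For readers unfamiliar with the idea, here is a minimal userspace sketch of
the general read-retry pattern behind a seqlock. It is only an illustration
under stated assumptions: it uses C11 atomics rather than the kernel's
seqlock API, the names struct seq_extent, seq_extent_write() and
seq_extent_read() are hypothetical, and it does not reproduce whatever
design the linked message actually proposes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical snapshot of one extent: logical start block and length. */
struct extent_snapshot {
	uint32_t start;
	uint32_t len;
};

/* Hypothetical stand-in for shared extent data guarded by a sequence counter. */
struct seq_extent {
	atomic_uint seq;		/* odd while a writer is mid-update */
	_Atomic uint32_t start;
	_Atomic uint32_t len;
};

/*
 * Writer side. Writers are assumed to be serialized against each other
 * by some other lock; the counter only protects readers.
 */
static void seq_extent_write(struct seq_extent *e, uint32_t start, uint32_t len)
{
	unsigned int old = atomic_fetch_add(&e->seq, 1);	/* counter becomes odd */

	assert((old & 1u) == 0);	/* no other writer may be in progress */
	(void)old;			/* keep NDEBUG builds free of unused warnings */
	atomic_store(&e->start, start);
	atomic_store(&e->len, len);
	atomic_fetch_add(&e->seq, 1);	/* counter becomes even again */
}

/*
 * Reader side: take no lock, and retry whenever a writer was active
 * or the counter changed while the fields were being read.
 */
static struct extent_snapshot seq_extent_read(struct seq_extent *e)
{
	struct extent_snapshot snap;
	unsigned int begin;

	do {
		begin = atomic_load(&e->seq);
		snap.start = atomic_load(&e->start);
		snap.len = atomic_load(&e->len);
	} while ((begin & 1u) || atomic_load(&e->seq) != begin);

	return snap;
}
```

The point of the pattern is that readers never block writers: a reader that
raced with an update simply notices the changed counter and retries, instead
of acting on a half-updated view.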

> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
>  fs/ext4/extents.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 7011ac9..f1aca06 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -1959,6 +1959,11 @@ cleanup:
>  	return err;
>  }
>  
> +/*
> + * ext4_ext_walk_space() should be called with i_data_sem locked. If we're
> + * not modifying found extents, or extent tree in callback function, then
> + * read lock is ok.
> + */
>  static int ext4_ext_walk_space(struct inode *inode, ext4_lblk_t block,
>  			       ext4_lblk_t num, ext_prepare_callback func,
>  			       void *cbdata)
> @@ -1976,9 +1981,7 @@ static int ext4_ext_walk_space(struct inode *inode, ext4_lblk_t block,
>  	while (block < last && block != EXT_MAX_BLOCKS) {
>  		num = last - block;
>  		/* find extent for this block */
> -		down_read(&EXT4_I(inode)->i_data_sem);
>  		path = ext4_ext_find_extent(inode, block, path);
> -		up_read(&EXT4_I(inode)->i_data_sem);
>  		if (IS_ERR(path)) {
>  			err = PTR_ERR(path);
>  			path = NULL;
> @@ -5021,8 +5024,10 @@ int ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>  		 * Walk the extent tree gathering extent information.
>  		 * ext4_ext_fiemap_cb will push extents back to user.
>  		 */
> +		down_read(&EXT4_I(inode)->i_data_sem);
>  		error = ext4_ext_walk_space(inode, start_blk, len_blks,
>  					  ext4_ext_fiemap_cb, fieinfo);
> +		up_read(&EXT4_I(inode)->i_data_sem);
>  	}
>  
>  	return error;
> -- 
> 1.7.7.6
> 
