Re: [PATCH 3/8] fsfreeze: freeze_super and thaw_bdev don't play well together

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jan Kara <jack@suse.cz>
To: "Fernando Luis Vázquez Cao" <fernando_b1@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	Dave Chinner <dchinner@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 3/8] fsfreeze: freeze_super and thaw_bdev don't play well together
Date: Fri, 13 Jul 2012 15:45:37 +0200	[thread overview]
Message-ID: <20120713134537.GF20361@quack.suse.cz> (raw)
In-Reply-To: <1342083943.7033.12.camel@nexus.lab.ntt.co.jp>

On Thu 12-07-12 18:05:43, Fernando Luis Vázquez Cao wrote:
> Changes from Dave Chinner's version:
>  - Remove s_frozen check in freeze_super which is not needed now that it is
>    re-entrant.
>  - Decrement freeze counter if the freeze_fs callback fails.
> 
> ---
> 
> thaw_bdev() has re-entrancy guards to allow freezes to nest
> together. That is, it ensures that the filesystem is not thawed
> until the last thaw command is issued. This is needed to prevent the
> filesystem from being unfrozen while an existing freezer is still
> operating on the filesystem in a frozen state (e.g. dm-snapshot).
> 
> Currently, freeze_super() and thaw_super() bypasses these guards,
> and as a result manual freezing and unfreezing via the ioctl methods
> do not nest correctly. hence mixing userspace directed freezes with
> block device level freezes result in inconsistency due to premature
> thawing of the filesystem.
> 
> Move the re-enterency guards to the superblock and into freeze_super
> and thaw_super() so that userspace directed freezes nest correctly
> again. Caveat: Things work as expected as long as direct calls to
> thaw_super are always in response to a previous sb level freeze. In
> other words an unpaired call to thaw_super can still thaw a
> filesystem frozen using freeze_bdev (this issue could be addressed
> in a follow-up patch if deemed necessary).
> 
> This patch retains the bdev level mutex and counter to keep the
> "feature" that we can freeze a block device that does not have a
> filesystem mounted yet.
> 
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Jan Kara <jack@suse.cz>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
> ---
  Some mostly minor coments below. 

 
> diff -urNp vfs-orig/fs/block_dev.c vfs/fs/block_dev.c
> --- vfs-orig/fs/block_dev.c	2012-07-12 14:31:38.936631141 +0900
> +++ vfs/fs/block_dev.c	2012-07-12 15:03:57.032627014 +0900
> @@ -257,16 +257,18 @@ int fsync_bdev(struct block_device *bdev
>  EXPORT_SYMBOL(fsync_bdev);
>  
>  /**
> - * freeze_bdev  --  lock a filesystem and force it into a consistent state
> + * freeze_bdev  --  lock a block device
>   * @bdev:	blockdevice to lock
>   *
> - * If a superblock is found on this device, we take the s_umount semaphore
> - * on it to make sure nobody unmounts until the snapshot creation is done.
> - * The reference counter (bd_fsfreeze_count) guarantees that only the last
> - * unfreeze process can unfreeze the frozen filesystem actually when multiple
> - * freeze requests arrive simultaneously. It counts up in freeze_bdev() and
> - * count down in thaw_bdev(). When it becomes 0, thaw_bdev() will unfreeze
> - * actually.
> + * Locks the block device and, if present, the associated filesystem too.
> + *
> + * The reference counter (bd_fsfreeze_count) is used to implement the feature
> + * that allows one to freeze a block device that does not have a filesystem
> + * mounted yet. For filesystems using mount_bdev the kernel takes care of
> + * things by preventing the mount operation from succeeding if the underlying
> + * block device is frozen. Other filesystems should check this counter or risk
> + * a situation where a freeze_bdev user (e.g. dm snapshot) and mount race,
> + * which may lead to inconsistencies.
>   */
>  struct super_block *freeze_bdev(struct block_device *bdev)
>  {
> @@ -274,17 +276,7 @@ struct super_block *freeze_bdev(struct b
>  	int error = 0;
>  
>  	mutex_lock(&bdev->bd_fsfreeze_mutex);
> -	if (++bdev->bd_fsfreeze_count > 1) {
> -		/*
> -		 * We don't even need to grab a reference - the first call
> -		 * to freeze_bdev grab an active reference and only the last
> -		 * thaw_bdev drops it.
> -		 */
> -		sb = get_super(bdev);
> -		drop_super(sb);
> -		mutex_unlock(&bdev->bd_fsfreeze_mutex);
> -		return sb;
> -	}
> +	bdev->bd_fsfreeze_count++;
>  
>  	sb = get_active_super(bdev);
>  	if (!sb)
> @@ -297,30 +289,33 @@ struct super_block *freeze_bdev(struct b
>  		return ERR_PTR(error);
>  	}
>  	deactivate_super(sb);
> - out:
> +out:
>  	sync_blockdev(bdev);
>  	mutex_unlock(&bdev->bd_fsfreeze_mutex);
> -	return sb;	/* thaw_bdev releases s->s_umount */
> +	return sb;
>  }
>  EXPORT_SYMBOL(freeze_bdev);
>  
>  /**
> - * __thaw_bdev  -- unlock filesystem
> + * __thaw_bdev  -- unlock a block device
>   * @bdev:	blockdevice to unlock
>   * @sb:		associated superblock
>   * @emergency:	emergency thaw
>   *
> - * Unlocks the filesystem and marks it writeable again after freeze_bdev().
> + * Unlocks the block device and, if present, the associated filesystem too.
>   */
>  static int __thaw_bdev(struct block_device *bdev, struct super_block *sb, int emergency)
>  {
>  	int error = -EINVAL;
>  
>  	mutex_lock(&bdev->bd_fsfreeze_mutex);
> +
>  	if (!bdev->bd_fsfreeze_count)
>  		goto out;
>  
> -	if (--bdev->bd_fsfreeze_count > 0 || !sb) {
> +	bdev->bd_fsfreeze_count--;
> +
> +	if (!sb) {
>  		error = 0;
>  		goto out;
>  	}
> @@ -336,13 +331,6 @@ out:
>  	return error;
>  }
>  
> -/**
> - * thaw_bdev  -- unlock filesystem
> - * @bdev:	blockdevice to unlock
> - * @sb:		associated superblock
> - *
> - * Unlocks the filesystem and marks it writeable again after freeze_bdev().
> - */
>  int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>  {
>  	return __thaw_bdev(bdev, sb, 0);
  It's a bit confusing (for reviewer) to remove the documentation here when
you remove the whole function in a later patch...

...
> @@ -1233,29 +1240,33 @@ static int __thaw_super(struct super_blo
>  {
>  	int error = 0;
>  
> -	if (sb->s_frozen == SB_UNFROZEN) {
> +	mutex_lock(&sb->s_freeze_mutex);
> +	if (!sb->s_freeze_count) {
>  		error = -EINVAL;
> -		goto out;
> +		goto out_unlock;
>  	}
> +	sb->s_freeze_count = emergency ? 1 : sb->s_freeze_count;
  It would be cleaner to do this somewhere in do_thaw_one() and not here.
Also you won't have to pass the emergency parameter then... Also it might
be more logical for __thaw_super() to expect also s_freeze_mutex locked and
handle the locking in thaw_super().

								Honza
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

next prev parent reply	other threads:[~2012-07-13 13:45 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-12  8:57 [PATCH 0/8 v2] fsfreeze: miscellaneous fixes and cleanups Fernando Luis Vázquez Cao
2012-07-12  9:02 ` [PATCH 1/8] fsfreeze: Prevent emergency thaw from looping infinitely Fernando Luis Vázquez Cao
2012-07-13 12:59   ` Jan Kara
2012-07-17  5:13     ` Fernando Luis Vazquez Cao
2012-07-12  9:04 ` [PATCH 2/8] fsfreeze: emergency thaw will deadlock on s_umount Fernando Luis Vázquez Cao
2012-07-13 13:17   ` Jan Kara
2012-08-09  6:00     ` Fernando Luis Vazquez Cao
2012-09-13  6:23     ` Fernando Luis Vazquez Cao
2012-07-12  9:05 ` [PATCH 3/8] fsfreeze: freeze_super and thaw_bdev don't play well together Fernando Luis Vázquez Cao
2012-07-13 13:45   ` Jan Kara [this message]
2012-08-09  9:00     ` Fernando Luis Vazquez Cao
2012-07-12  9:07 ` [PATCH 4/8] fsfreeze: switch to using super methods where possible Fernando Luis Vázquez Cao
2012-07-13 13:50   ` Jan Kara
2012-07-12  9:09 ` [PATCH 5/8] fsfreeze: move emergency thaw code to fs/super.c Fernando Luis Vázquez Cao
2012-07-13 13:53   ` Jan Kara
2012-07-12  9:10 ` [PATCH 6/8] fsfreeze: add vfs ioctl to check freeze state Fernando Luis Vázquez Cao
2012-07-13 13:54   ` Jan Kara
2012-07-15 22:45     ` Dave Chinner
2012-09-13  6:19       ` Fernando Luis Vazquez Cao
2012-09-13  7:18         ` Dave Chinner
2012-09-13  8:19           ` Fernando Luis Vazquez Cao
2012-09-14  0:15             ` Dave Chinner
2012-09-14  1:46               ` Fernando Luis Vazquez Cao
2012-09-14  6:28                 ` Dave Chinner
2012-09-14  8:18                   ` Fernando Luis Vazquez Cao
2012-09-13  6:11     ` Fernando Luis Vazquez Cao
2012-07-12  9:11 ` [PATCH 7/8] fsfreeze: add block device " Fernando Luis Vázquez Cao
2012-07-12  9:12 ` [PATCH 8/8] fsfreeze: update Documentation/filesystems/Locking Fernando Luis Vázquez Cao
2012-07-13 14:11   ` Jan Kara
2012-07-17  1:42     ` Fernando Luis Vazquez Cao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120713134537.GF20361@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=dchinner@redhat.com \
    --cc=fernando_b1@lab.ntt.co.jp \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).