All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miao Xie <miaox@cn.fujitsu.com>
To: Filipe David Borba Manana <fdmanana@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4] Btrfs: fix sync fs to actually wait for all data to be persisted
Date: Tue, 24 Sep 2013 09:31:25 +0800	[thread overview]
Message-ID: <5240EB6D.101@cn.fujitsu.com> (raw)
In-Reply-To: <1379932511-16265-1-git-send-email-fdmanana@gmail.com>

On 	mon, 23 Sep 2013 11:35:11 +0100, Filipe David Borba Manana wrote:
> Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't
> wait for delayed work to finish before returning success to the
> caller. This change fixes this, ensuring that there's no data loss
> if a power failure happens right after fs sync returns success to
> the caller and before the next commit happens.
> 
> Steps to reproduce the data loss issue:
> 
> $ mkfs.btrfs -f /dev/sdb3
> $ mount /dev/sdb3 /mnt/btrfs
> $ perl -e '$d = ("\x41" x 6001); open($f,">","/mnt/btrfs/foobar"); print $f $d; close($f);' && btrfs fi sync /mnt/btrfs
> 
> Right after the btrfs fi sync command (a second or 2 for example), power
> off the machine and reboot it. The file will be empty, as it can be verified
> after mounting the filesystem and through btrfs-debug-tree:
> 
> $ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
>         item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
>                 location key (257 INODE_ITEM 0) type FILE
>                 namelen 6 datalen 0 name: foobar
>         item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
>                 inode generation 7 transid 7 size 0 block group 0 mode 100644 links 1
>         item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
>                 inode ref index 2 namelen 6 name: foobar
> checksum tree key (CSUM_TREE ROOT_ITEM 0)
> leaf 29429760 items 0 free space 3995 generation 7 owner 7
> fs uuid 6192815c-af2a-4b75-b3db-a959ffb6166e
> chunk uuid b529c44b-938c-4d3d-910a-013b4700bcae
> uuid tree key (UUID_TREE ROOT_ITEM 0)
> 
> After this patch, the data loss no longer happens after a power failure and
> btrfs-debug-tree shows:
> 
> $ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
> 	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
> 		location key (257 INODE_ITEM 0) type FILE
> 		namelen 6 datalen 0 name: foobar
> 	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
> 		inode generation 6 transid 6 size 6001 block group 0 mode 100644 links 1
> 	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
> 		inode ref index 2 namelen 6 name: foobar
> 	item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53
> 		extent data disk byte 12845056 nr 8192
> 		extent data offset 0 nr 8192 ram 8192
> 		extent compression 0
> checksum tree key (CSUM_TREE ROOT_ITEM 0)
> 
> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>

Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>

> ---
> 
> V2: Use writeback_inodes_sb() instead of btrfs_start_all_delalloc_inodes(), as
>     suggested by Miao Xie.
> V3: Use btrfs_start_all_delalloc_inodes() instead but outside btrfs_sync_fs(),
>     in the sync IOCTL handler. Using writeback_inodes_sb() is not very honest
>     because it doesn't guarantee inode data is persisted and we have no way
>     to know if persistence really happened or not, returning 0 (success) always.
>     Thanks Liu Bo for the suggestion.
> V4: Be even more honest in the sync IOCTL handler - don't always return success
>     regardless of the result of the btrfs_sync_fs() call.
> 
>  fs/btrfs/ioctl.c |   12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 9d46f60..385c58f 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -4557,9 +4557,15 @@ long btrfs_ioctl(struct file *file, unsigned int
>  		return btrfs_ioctl_logical_to_ino(root, argp);
>  	case BTRFS_IOC_SPACE_INFO:
>  		return btrfs_ioctl_space_info(root, argp);
> -	case BTRFS_IOC_SYNC:
> -		btrfs_sync_fs(file->f_dentry->d_sb, 1);
> -		return 0;
> +	case BTRFS_IOC_SYNC: {
> +		int ret;
> +
> +		ret = btrfs_start_all_delalloc_inodes(root->fs_info, 0);
> +		if (ret)
> +			return ret;
> +		ret = btrfs_sync_fs(file->f_dentry->d_sb, 1);
> +		return ret;
> +	}
>  	case BTRFS_IOC_START_SYNC:
>  		return btrfs_ioctl_start_sync(root, argp);
>  	case BTRFS_IOC_WAIT_SYNC:
> 


      reply	other threads:[~2013-09-24  1:30 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-22 20:55 [PATCH] Btrfs: fix sync fs to actually wait for all data to be persisted Filipe David Borba Manana
2013-09-23  1:30 ` Miao Xie
2013-09-23  9:11   ` Filipe David Manana
2013-09-23  9:51     ` Liu Bo
2013-09-23  9:23 ` [PATCH v2] " Filipe David Borba Manana
2013-09-23  9:53   ` Filipe David Manana
2013-09-23  9:59     ` Liu Bo
2013-09-23 10:06       ` Filipe David Manana
2013-09-23 10:28 ` [PATCH v3] " Filipe David Borba Manana
2013-09-23 10:35 ` [PATCH v4] " Filipe David Borba Manana
2013-09-24  1:31   ` Miao Xie [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5240EB6D.101@cn.fujitsu.com \
    --to=miaox@cn.fujitsu.com \
    --cc=fdmanana@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.